# Metering Telemetry Autobahn Concept

Status: Draft

Version: 3.1

Scope: `fm-rds-tx` runtime telemetry transport for live metering, browser UI, future composite spectrum, and snapshot fallback APIs

## 1. Summary

The current `GET /measurements` endpoint is already a useful and well-shaped snapshot API for composite/MPX metering. It should stay.

If the UI is expected to evolve toward:

- smoother live audio meters
- smoother MPX meters
- higher refresh rates
- future composite spectrum/analyzer views
- multiple concurrent telemetry consumers

then snapshot polling alone is no longer the ideal transport.

This document proposes a **live telemetry transport layer**, the “metering autobahn”, built around:

- a small telemetry broadcaster/hub inside the control plane
- WebSocket delivery for low-latency/high-rate live data
- a deliberately tiny WS-1 scope
- continued support for `GET /measurements` as a stable snapshot fallback
- explicit protection of the ingest / DSP / TX realtime path

The key design rules are:

- **streaming is added, not substituted**
- **`/measurements` remains a first-class snapshot endpoint**
- **metering must never be allowed to interfere with ingest / DSP / TX timing**

---

## 2. Core Principle

### `/measurements` stays

The existing snapshot endpoint must **not** be removed.

It remains valuable for:

- debugging
- curl/manual inspection
- API consumers that only want snapshots
- low-complexity integrations
- fallback behavior when WebSocket transport is unavailable

So the intended model is:

- **WebSocket for live streaming**
- **`GET /measurements` for stable snapshot access**

Not one replacing the other.

---

## 3. Goals

### Primary goals

- Provide a transport suitable for higher-rate live metering.
- Support future spectrum/analyzer-style UI features.
- Keep structured measurement semantics separate from transport concerns.
- Avoid forcing the browser to poll snapshots at increasingly high rates.
- Preserve `/measurements` as a stable snapshot API.
- Protect ingest / DSP / TX timing from telemetry, transport, browser, and control-plane behavior.

### Secondary goals

- Support multiple telemetry consumers.
- Handle slow clients safely.
- Avoid overloading the hot DSP path.
- Make future spectrum support possible without transport redesign.
- Ensure telemetry degrades by dropping metering data rather than slowing the realtime path.

---

## 4. Non-Goals

This concept is **not**:

- a replacement for `/runtime`
- a replacement for `/measurements`
- a raw-sample streaming design
- a browser-side FFT design
- a long-term telemetry database
- an excuse to build a huge generalized pub/sub system in WS-1
- a design where browser/UI/control-plane demand can push back into ingest / DSP / TX

---

## 5. Single Source of Truth

This is the most important semantic rule.

There must be exactly **one latest measurement snapshot truth** inside the runtime/control system.

That means:

- the generator/engine path produces the latest measurement snapshot
- `GET /measurements` exposes that snapshot
- WebSocket streams updates derived from that same snapshot source

WebSocket must **not** introduce a separate meter-calculation path.

Otherwise the system risks a future mismatch like:

- polling UI shows one value
- streaming UI shows another value
- both appear plausible
- nobody trusts either anymore

So the rule is:

- **same measurement source**
- **different delivery mechanisms**

Additional WS rule:

- **the `measurement.data` payload sent over WebSocket should be semantically identical to the `measurement` object returned by `GET /measurements`**
- transport envelope fields such as `type`, `ts`, and `seq` may differ, but the underlying measurement meaning must not drift

---

## 6. Realtime Safety Rule

This is the most important operational rule.

Ingest / DSP / TX timing owns the system.

Metering is valuable, but it is **not** allowed to compete with realtime work for correctness.
If the system must choose between:

- keeping ingest / DSP / TX on time
- or delivering every metering update

then metering loses.

The rule is:

- **realtime first**
- **metering is best-effort**
- **dropped telemetry is acceptable**
- **timing interference is not acceptable**

This means metering transport must be designed so that:

- slow clients cannot block producers
- control-plane activity cannot block producers
- JSON / HTTP / WebSocket work cannot occur on the realtime path
- telemetry backlog cannot cause unbounded memory growth
- the realtime path never waits for telemetry consumers

In short:

- **if anything must be sacrificed under load, sacrifice telemetry freshness/completeness, never ingest / DSP / TX timing**

This rule also applies to future spectrum support:

- spectrum is also best-effort
- future spectrum work must never degrade ingest / DSP / TX timing

---

## 7. One-Way Data Flow Rule

The data-flow direction must be explicit.

Allowed direction:

- **realtime path → published measurement snapshot → control-plane broadcaster → clients**

Forbidden direction:

- **client demand → control plane → realtime path “give me data now”**

This means:

- the realtime path produces telemetry only when it naturally completes work
- the control plane reads what the realtime side has already published
- browser refresh rate must not cause extra DSP work
- WebSocket clients do not “request the current meter” from the realtime path

The system should therefore behave as:

- one producer of measurement snapshots
- one non-RT transport layer that distributes already-produced snapshots
- zero transport-driven callbacks into the DSP hot path

---

## 8. Architectural Layers

## 8.1 Signal production layer

The generator / engine already produces semantically meaningful measurement snapshots.

That should remain the source of truth for metering data.

Later, spectrum production can be added in a similarly structured way.
Examples of produced data classes:

- measurement snapshots
- future spectrum frames
- optional future runtime event frames

## 8.2 Realtime-safe publication boundary

Between the realtime path and the control plane there must be a strict publication boundary.

Responsibilities:

- accept already-computed chunk-local measurement results
- publish them in a way that never blocks the producer
- allow overwrite/drop behavior under load
- prevent transport concerns from leaking into ingest / DSP / TX

This boundary is where realtime safety is enforced.

## 8.3 Telemetry transport layer

Introduce a small telemetry broadcaster/hub in the control plane.

Responsibilities in WS-1:

- accept published measurement snapshots from the non-blocking publication boundary
- fan them out to connected WebSocket clients
- apply bounded-queue/backpressure policy
- isolate transport logic from DSP/runtime logic

This transport layer should stay intentionally small at first.

## 8.4 Client/UI layer

The browser UI should consume:

- `GET /measurements` for initial/fallback snapshot state
- WebSocket for live updates when available

Rendering logic such as:

- smoothing
- peak hold
- decay
- short local history

should remain on the UI side.

---

## 9. Why a Broadcaster/Hub Is Still Useful

Even in a minimal WS-1 design, a broadcaster/hub is useful because it keeps transport logic out of:

- generator code
- engine code
- ad-hoc handler state

It allows:

- one producer → many consumers
- bounded queues per client
- clean control-plane ownership of transport

But for WS-1, this hub should be **small and boring**, not a grand infrastructure project.

---

## 10. Hot-Path Constraints

The realtime path must remain intentionally primitive.
Allowed on the realtime side:

- chunk-local accumulation into predeclared counters/fields
- simple arithmetic such as abs/square/max/counter updates
- one finalize step per chunk
- one non-blocking publication step per chunk

Forbidden on the realtime side:

- JSON encoding
- HTTP handling
- WebSocket writes
- logging in the hot path
- blocking channels
- contended locks shared with non-RT code
- dynamic queue growth
- per-sample heap allocation
- transport-driven callback logic

The model is:

- compute meters while already processing audio/composite data
- finalize once per chunk
- publish once per chunk
- leave all transport/rendering concerns outside the realtime path

---

## 11. Publication Strategy

The publication boundary must be non-blocking.

Acceptable implementation styles include:

### Option A: Atomic latest snapshot

- realtime side writes the latest completed snapshot into a preallocated slot or latest-value holder
- readers fetch the latest available completed value
- no backlog is preserved
- freshness is prioritized completely over completeness

### Option B: Tiny bounded SPSC-style queue/ring

- queue size intentionally tiny, typically `1` or `2`
- if full, older unsent snapshot is overwritten or discarded
- publisher never blocks
- reader sees the newest available completed value

For WS-1, either approach is acceptable as long as these rules hold:

- bounded memory only
- no producer blocking
- latest state wins

For WS-1, the preferred implementation bias is:

- **choose the simplest non-blocking latest-value publication model that satisfies the realtime safety rules**
- in practice this often means starting with an atomic/latest-snapshot publication model before introducing a more explicit tiny ring structure

The most important rule is not the exact primitive.

The most important rule is:

- **metering publication may drop or overwrite telemetry, but may not delay the producer**

---

## 12. WS-1 Scope: Keep It Brutally Small

This is a deliberate constraint.
WS-1 should include only:

- one endpoint: `GET /ws/telemetry`
- one message class: `measurement`
- one small broadcaster/hub
- one bounded queue per client
- one drop policy: drop old, keep newest
- UI snapshot bootstrap + WS live updates

WS-1 should **not** include:

- topic subscriptions
- bundle messages
- runtime-event multiplexing
- quality-level negotiation
- generalized telemetry protocol machinery
- speculative infrastructure for future categories

The goal of WS-1 is simple:

- make meters smoother
- establish the live telemetry path
- do not overengineer

---

## 13. Why WebSocket and Not Only SSE

For WS-1, the traffic is fundamentally server → browser.

That means **Server-Sent Events (SSE)** would also be a technically valid option and would be simpler in some respects.

However, WebSocket is still preferred here because it better matches the likely next steps:

- future multiple telemetry classes
- future spectrum delivery
- possible future interactive or negotiated telemetry behavior

So the decision is:

- **SSE would be sufficient for the narrowest first step**
- **WebSocket is preferred for forward compatibility**

This is a strategic choice, not a claim that basic metering strictly requires WebSocket.

---

## 14. Transport Model

## 14.1 Existing snapshot endpoint

- `GET /measurements`

Role:

- stable pull-based snapshot
- debugging
- fallback
- low-rate integrations

This endpoint should continue returning the latest measurement snapshot in structured JSON form.

## 14.2 New live endpoint

- proposed: `GET /ws/telemetry`

Role:

- push-based live measurement updates
- suitable for smoother meter motion
- future-ready for later telemetry expansion

On subscribe, the server should immediately send the latest known measurement snapshot if one exists, so the client becomes visually current without waiting for the next natural update.

---

## 15. Message Classes

For WS-1, the system should implement exactly one message class.
## 15.1 `measurement`

Carries the latest structured measurement snapshot.

Preferred rule:

- the `data` payload should match the `GET /measurements` snapshot shape as closely as possible
- transport envelope fields such as `type` may wrap the same underlying snapshot semantics, but WS should not invent a subtly different meter schema

Example:

```json
{
  "type": "measurement",
  "ts": "2026-04-13T07:00:53.842Z",
  "seq": 128,
  "data": {
    "sampleRateHz": 228000,
    "chunkSamples": 11400,
    "flags": {
      "stereoEnabled": true,
      "stereoMode": "DSB"
    },
    "lrPreEncodePostWatermark": {
      "lRms": 0.27,
      "rRms": 0.27,
      "lPeakAbs": 0.51,
      "rPeakAbs": 0.51
    },
    "compositeFinalPreIq": {
      "peakAbs": 0.63,
      "pilotInjectionEquivalentPercent": 9.0
    }
  }
}
```

### Not part of WS-1 yet

These are future classes, not current WS-1 deliverables:

- `spectrum`
- `runtime`
- bundles / multiplexed compound messages

---

## 16. Update Rates

### Measurement snapshots

- target: `10–20 Hz`
- enough for noticeably smoother meters than snapshot polling
- reasonable for WS-1

WS-1 does not need extreme rates yet.

The goal is not “as fast as possible”, but:

- smoother than polling
- stable under load
- simple to reason about

---

## 17. Backpressure Strategy

This is mandatory.

For metering, freshness matters more than completeness.

Preferred policy per client:

- bounded queue/channel
- if full:
  - discard older unsent frame(s)
  - keep the newest available state

In short:

- **latest state wins**

This is especially important because the browser `WebSocket` API provides no end-to-end backpressure: incoming messages are queued and delivered as events regardless of how quickly the page consumes them, so the server has to bound its own send queues.

Additional server safety rule:

- if a client remains persistently too slow, broken, or backpressured, the server may close that client connection rather than growing complexity or buffering to accommodate it

---

## 18. UI Consumption Model

The frontend should have two distinct layers.
### Transport layer

- connect WebSocket
- reconnect on disconnect
- parse `measurement` messages
- store latest live state
- fall back to `/measurements` when needed

### Rendering layer

- meter smoothing
- peak hold
- decay
- short local history

This keeps transport and presentation loosely coupled.

---

## 19. Fallback Behavior

Preferred UI behavior:

1. load snapshot from `GET /measurements`
2. render immediately from snapshot
3. connect WebSocket
4. if WS is healthy, prefer streamed updates
5. if WS drops, keep rendering last known state and resume snapshot fallback polling

This keeps the UI both:

- responsive when live transport is available
- robust when it is not

---

## 20. Future Direction (Explicitly Not WS-1)

These are valid later expansions, but they should not enlarge the first implementation unnecessarily:

- `spectrum` message type
- composite spectrum producer
- runtime/event stream integration
- quality levels / adaptive throttling
- topic or subscription semantics
- bundled telemetry frames

These can come later once WS-1 proves the transport path.

---

## 21. Proposed Internal Shape

The internal design only needs to support a small set of concepts for WS-1, such as:

- publish latest measurement snapshot
- subscribe client connection
- drop stale frames under backpressure

That can still be implemented with a small internal abstraction such as:

- `PublishMeasurement(snapshot)`
- `Subscribe()`

It does not need a giant generic telemetry framework yet.

---

## 22. Phased Implementation Plan

## Phase WS-1: Measurement streaming only

Deliverables:

- small telemetry broadcaster/hub in control plane
- `GET /ws/telemetry`
- only `measurement` messages
- bounded per-client queue
- drop-old / keep-newest policy
- browser UI loads snapshot first, then prefers WS live updates
- `/measurements` remains unchanged

## Phase WS-2: UI transport polish

Deliverables:

- reconnect handling
- clean fallback behavior
- meter update cadence tuning

## Phase WS-3: Spectrum support

Deliverables:

- server-side composite spectrum producer
- `spectrum` message type
- browser spectrum panel

## Phase WS-4: Advanced transport controls

Deliverables:

- optional adaptive throttling
- optional multiple telemetry classes
- optional richer live transport design

---

## 23. Open Questions

### Q1

Should WS-1 stream only `measurement`?

Current preference:

- yes
- keep it single-purpose

### Q2

Should the UI keep polling `/measurements` while WS is healthy?

Current preference:

- no continuous polling during healthy WS
- fallback polling only

### Q3

Should future spectrum run at the same cadence as measurements?

Current preference:

- no
- spectrum should likely be slower

### Q4

When should a more generic telemetry protocol exist?

Current preference:

- only after WS-1 proves useful
- do not front-load complexity

---

## 24. Recommended Next Step

Implement **Phase WS-1** in the smallest practical form.

Concrete steps:

1. add a small telemetry broadcaster in the control layer
2. define one WS message type: `measurement`
3. add `GET /ws/telemetry`
4. publish the latest measurement snapshot into that broadcaster
5. make the browser UI bootstrap from `/measurements`, then prefer WS
6. keep `/measurements` untouched as the snapshot fallback API

That gives the system the live metering transport backbone without turning WS-1 into a giant infrastructure project.
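The broadcaster from steps 1 and 4 could be sketched roughly as follows, assuming a Go implementation. Only the names `PublishMeasurement` and `Subscribe` come from this document; `Hub`, `Snapshot`, `NewHub`, `offer`, and the returned cancel function are illustrative. Because the hub takes a mutex, it is meant to be driven by a control-plane goroutine that reads already-published snapshots, never by the realtime path itself.

```go
package main

import (
	"fmt"
	"sync"
)

// Snapshot stands in for the measurement snapshot (placeholder fields).
type Snapshot struct {
	Seq     uint64
	PeakAbs float64
}

// Hub fans snapshots out to subscribers. It lives in the control plane only.
type Hub struct {
	mu      sync.Mutex
	latest  *Snapshot
	clients map[chan Snapshot]struct{}
}

func NewHub() *Hub {
	return &Hub{clients: make(map[chan Snapshot]struct{})}
}

// offer sends without blocking; if the queue is full it evicts the stale
// frame and retries, so the newest state wins. Callers hold h.mu, which
// makes them the only sender on ch.
func offer(ch chan Snapshot, s Snapshot) {
	for {
		select {
		case ch <- s:
			return
		default:
		}
		select {
		case <-ch: // drop the older unsent frame
		default:
		}
	}
}

// PublishMeasurement records the snapshot and offers it to every client.
func (h *Hub) PublishMeasurement(s Snapshot) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.latest = &s
	for ch := range h.clients {
		offer(ch, s)
	}
}

// Subscribe registers a queue of capacity 1, primed with the latest known
// snapshot if one exists, and returns a function that unregisters it.
func (h *Hub) Subscribe() (<-chan Snapshot, func()) {
	ch := make(chan Snapshot, 1)
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.latest != nil {
		ch <- *h.latest
	}
	h.clients[ch] = struct{}{}
	cancel := func() {
		h.mu.Lock()
		delete(h.clients, ch)
		h.mu.Unlock()
	}
	return ch, cancel
}

func main() {
	h := NewHub()
	h.PublishMeasurement(Snapshot{Seq: 1, PeakAbs: 0.60})
	ch, cancel := h.Subscribe()
	defer cancel()
	// The client is slow: two more frames arrive before it reads anything.
	h.PublishMeasurement(Snapshot{Seq: 2, PeakAbs: 0.61})
	h.PublishMeasurement(Snapshot{Seq: 3, PeakAbs: 0.63})
	fmt.Println((<-ch).Seq) // prints 3
}
```

A WebSocket handler would call `Subscribe()`, write each received snapshot to its connection, and call the cancel function on disconnect. A queue capacity of 1 is the simplest choice that still satisfies the drop-old, keep-newest policy.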