Jan
e8c5c2729b
watermark: replace time-domain PN with STFT-domain spread-spectrum (Kirovski & Malvar 2003)
Complete redesign of the audio watermark system. The previous time-domain
PN chip correlator failed over the FM air channel due to non-linear clock
drift between TX and RX sample clocks, destroying bit-boundary alignment
and producing ~50% BER (equivalent to random guessing).
The new system embeds the watermark in STFT magnitudes (frequency domain),
so detection no longer depends on sample-accurate clock alignment. Based on:
D. Kirovski, H.S. Malvar, "Spread-Spectrum Watermarking of Audio Signals"
IEEE Trans. Signal Processing, Vol. 51, No. 4, April 2003
Architecture:
Encoder (generator.go, two-pass):
Audio → Stages 1-3 (LPF, drive, clip, cleanup)
→ Decimate ÷19 → 12 kHz mono
→ STFT (512-point Hann, hop=256, one frame every 21.3 ms)
→ Magnitude × (1 ± 0.059) per PN chip (0.5 dB, multiplicative)
→ ISTFT → difference → ZOH ×19 → interpolation LPF@5.5kHz
→ add to L/R → Stages 4-6 (stereo encode, composite clip, pilot, RDS)
Decoder (wmdecode, key-free):
Recording → LPF@5.5kHz → decimate to 12 kHz
→ STFT (same parameters)
→ Cepstrum filter (DCT, zero first 8 coefficients → -6 dB carrier noise)
→ Cycle-offset search (6400 candidates, ~70s for 20-min recording)
→ PN correlation → 128 soft bit decisions
→ RS(16,8) Vandermonde decode → 64-bit fingerprint
Key properties:
- Multiplicative embedding: watermark scales with audio content.
Loud audio masks a proportionally stronger watermark; silence gets none.
No gate is needed (the old system required a silence gate to prevent
audibility).
- Block repetition R=5: each PN chip is repeated across 5 consecutive
STFT frames. Detection uses the center frame only, so it tolerates
±2 frames (±43 ms) of timing drift without any clock recovery.
- Fixed PN sequence: spreading code is public (seed "fmrtx-stft-pn-v1"),
enabling blind fingerprint extraction without knowing the license key.
Key identity is carried solely in the RS-encoded payload.
- PCC covert channel: PN partitioned into 128 subsets (one per data bit),
permuted across the watermark cycle for localized-damage resilience.
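The ±2 frame tolerance of block repetition follows from simple index
arithmetic, sketched below. The helper names are hypothetical, and the cycle
length of 6400 frames is inferred from the 6400 cycle-offset candidates
listed for the decoder rather than taken from the code.

```go
package main

import "fmt"

const (
	blockRep    = 5    // R: frames per repeated chip group
	cycleFrames = 6400 // inferred watermark cycle length in STFT frames
)

// groupOf returns the chip-group index of an STFT frame. The frame index
// wraps at the watermark cycle, so recordings longer than one cycle fold
// back onto the same groups.
func groupOf(frame int) int {
	f := ((frame % cycleFrames) + cycleFrames) % cycleFrames
	return f / blockRep
}

// centerFrame returns the frame a detector would sample for a group.
func centerFrame(group int) int {
	return group*blockRep + blockRep/2
}

func main() {
	g := 7
	for drift := -3; drift <= 3; drift++ {
		f := centerFrame(g) + drift
		fmt.Printf("drift %+d frames -> group %d (want %d)\n", drift, groupOf(f), g)
	}
}
```

A drift of up to two frames in either direction still lands inside the same
five-frame group; a third frame of drift crosses into the neighbouring group.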
Parameters:
WMRate: 12000 Hz (228000/19, 192000/16 — exact both sides)
FFT: 512-point, Hann window, hop=256 (50% overlap)
Sub-band: bins 9-213 (200-5000 Hz), 204 frequency chips/frame
Embedding: 0.5 dB (multiplicative, 0.00 dB RMS change on audio)
Spreading: 204 bins × 10 groups/bit = 2040 chips/bit (33 dB gain)
Block rep: R=5 (±2 frame drift tolerance)
Payload: RS(16,8) → 64 bit fingerprint (SHA-256 truncated)
WM cycle: 136.5 seconds
Decode margin: ~19 dB over noise floor at 0.5 dB embedding
Over-the-air results (PlutoSDR TX → SDRplay RX, 102.8 MHz):
BER: 0/128
Erasures: 0
avg|c|: 2260
Spectrum: clean, no artifacts above 6 kHz
Bug fixes included:
- RS decoder: replaced the Forney formula with a Vandermonde Gaussian
elimination solver. The Horner-method syndrome computation uses
C(x) = c[0]x^15 + ... + c[15], so byte position j maps to polynomial
power (15-j). Forney used α^j as the error locator instead of
α^(15-j), producing wrong correction magnitudes for all erasure
configurations.
- Generator decimation: anti-alias LPF was applied only to every 19th
sample instead of all composite-rate samples. IIR filter state was
updated only at 12 kHz effective rate, producing incorrect filtering.
Fixed: the LPF now processes every 228 kHz sample, and its output is then
decimated.
- ZOH spectral images: zero-order hold upsample (12k→228k) created
images around multiples of 12 kHz (12, 24, 36 kHz, ...), leaking into
the pilot/stereo-sub/RDS bands.
Fixed: separate interpolation LPF@5.5kHz on the upsample path.
- Detect() single-cycle bug: iterated over groups (one frame per group)
instead of all recording frames, so longer recordings did not improve
SNR. Fixed: iterate over all frames with modular wrapping for automatic
multi-cycle averaging.
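The position convention behind the RS decoder fix can be checked with a small
GF(256) sketch. Everything here is illustrative: the primitive polynomial
0x11d and the function names are assumptions, not taken from watermark.go.
It shows that a Horner-evaluated syndrome sees an error at byte position j
through the locator α^(15-j), not α^j.

```go
package main

import "fmt"

var (
	expTab [512]byte // expTab[i] = α^i, doubled to skip a modulo in gfMul
	logTab [256]int  // logTab[α^i] = i; logTab[0] is unused
)

func init() {
	x := 1
	for i := 0; i < 255; i++ {
		expTab[i] = byte(x)
		logTab[x] = i
		x <<= 1
		if x&0x100 != 0 {
			x ^= 0x11d // assumed primitive polynomial
		}
	}
	for i := 255; i < 512; i++ {
		expTab[i] = expTab[i-255]
	}
}

// gfMul multiplies two GF(256) elements via log/antilog tables.
func gfMul(a, b byte) byte {
	if a == 0 || b == 0 {
		return 0
	}
	return expTab[logTab[a]+logTab[b]]
}

// syndrome evaluates C(x) = c[0]x^(n-1) + ... + c[n-1] at x = α^i using
// Horner's method, the convention described in the commit message.
func syndrome(c []byte, i int) byte {
	x := expTab[i%255]
	var s byte
	for _, b := range c {
		s = gfMul(s, x) ^ b
	}
	return s
}

// locator returns the error locator for byte position j in an n-byte
// codeword under that convention: α^(n-1-j).
func locator(j, n int) byte {
	return expTab[n-1-j]
}

func main() {
	cw := make([]byte, 16) // all-zero word is a valid codeword
	cw[3] = 0x5a           // single error at byte position 3
	fmt.Printf("S1=%#02x  e*α^12=%#02x  e*α^3=%#02x\n",
		syndrome(cw, 1), gfMul(0x5a, locator(3, 16)), gfMul(0x5a, expTab[3]))
}
```

The first two printed values agree and the third does not, which is the
mismatch the old Forney code ran into.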
New files:
internal/dsp/fft.go Radix-2 Cooley-Tukey FFT/IFFT
internal/watermark/stft_watermark.go STFT embedder + detector
internal/watermark/stft_roundtrip_test.go
Changed files:
internal/watermark/watermark.go RS Vandermonde fix, accessor methods
internal/offline/generator.go Two-pass architecture, STFT overlay
cmd/wmdecode/main.go Complete rewrite, key-free extraction
cmd/fmrtx/main.go Remove old watermark log + import
Removed:
Old time-domain Embedder, gate, pulse-shaping LPF, PN sequence,
ChipRate/Level/RecordingRate constants, DiagnosticState
1 month ago