Jan Svabenik a8560630e7 Implement streaming recording redesign		1 month ago
build	Checkpoint current working SDR pipeline state	1 month ago
native	Checkpoint current working SDR pipeline state	1 month ago
README.md	docs: split CUDA build paths by platform	1 month ago
batch.go	Implement streaming recording redesign	1 month ago
batch_runner.go	Implement streaming recording redesign	1 month ago
batch_runner_other.go	Checkpoint current working SDR pipeline state	1 month ago
batch_runner_test.go	feat: parallelize mixed-bandwidth GPU batch demod	1 month ago
batch_runner_windows.go	Implement streaming recording redesign	1 month ago
doc.go	docs: add initial CUDA demod kernel source	1 month ago
errors.go	Introduce reusable gpudemod batch runner	1 month ago
gpudemod.go	Add GPU shift-filter-decimate path for signal extraction	1 month ago
gpudemod_cufft_test.go	build: wire CUDA demod package through nvcc and MSVC	1 month ago
gpudemod_stub.go	Add GPU shift-filter-decimate path for signal extraction	1 month ago
gpudemod_test.go	feat: prepare CUDA demod launch boundary	1 month ago
gpudemod_windows.go	fix: harden GPU demod state handling	1 month ago
kernels.cu	feat: add demod validation and GPU mode telemetry	1 month ago
validation.go	feat: wire CUDA freq-shift launcher	1 month ago
validation_extra.go	feat: add demod validation and GPU mode telemetry	1 month ago
validation_extra_test.go	feat: add demod validation and GPU mode telemetry	1 month ago
validation_runtime.go	Disable GPU validation by default in production	1 month ago
validation_test.go	feat: validate CUDA freq-shift output	1 month ago
windows_bridge.go	feat: parallelize mixed-bandwidth GPU batch demod	1 month ago

README.md

gpudemod

Phase 1 CUDA demod scaffolding.

Current state

- Standard Go builds (without the cufft build tag) compile gpudemod_stub.go (!cufft).
- Builds with the cufft tag allocate GPU buffers and cross the CGO/CUDA launch boundary.
- If the CUDA launch wrappers are not yet backed by compiled kernels, the code falls back to CPU DSP.
- The shifted IQ path is already wired, so a successful GPU freq-shift result can be copied back and reused immediately.
- Build orchestration is now OS-specific; see docs/build-cuda.md.
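
The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: ErrGPUUnavailable, shiftFunc, shiftWithFallback, stubGPU and cpuIdentity are all hypothetical names, and the choice to propagate any error other than "unavailable" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrGPUUnavailable is a hypothetical sentinel error for this sketch;
// the real package defines its own errors in errors.go.
var ErrGPUUnavailable = errors.New("gpudemod: GPU path unavailable")

// shiftFunc is an assumed signature for one frequency-shift stage.
type shiftFunc func(iq []complex64) ([]complex64, error)

// shiftWithFallback tries the GPU path first. If the GPU path reports that
// it is unavailable, the CPU path is used instead. Any other GPU error is
// returned to the caller (an assumption made for this sketch).
func shiftWithFallback(gpu, cpu shiftFunc, iq []complex64) (out []complex64, usedGPU bool, err error) {
	out, err = gpu(iq)
	if err == nil {
		return out, true, nil
	}
	if !errors.Is(err, ErrGPUUnavailable) {
		return nil, false, err
	}
	out, err = cpu(iq)
	return out, false, err
}

// stubGPU mimics a build in which no compiled kernels are linked.
func stubGPU(iq []complex64) ([]complex64, error) {
	return nil, ErrGPUUnavailable
}

// cpuIdentity stands in for the CPU DSP path; it only copies its input.
func cpuIdentity(iq []complex64) ([]complex64, error) {
	out := make([]complex64, len(iq))
	copy(out, iq)
	return out, nil
}

func main() {
	out, usedGPU, err := shiftWithFallback(stubGPU, cpuIdentity, []complex64{1, 2i})
	fmt.Println(len(out), usedGPU, err)
}
```

Reporting usedGPU alongside the result keeps the caller able to log which path actually ran, which is helpful while the GPU path is still optional.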

First real kernel

kernels.cu contains the first candidate implementation:

gpud_freq_shift_kernel

It is not yet compiled automatically, because the current machine has no CUDA compiler toolchain in PATH (nvcc not found).
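
For orientation, the per-sample operation that a frequency-shift kernel performs can be written as a CPU reference in Go. This is a sketch under assumptions: freqShiftRef is a hypothetical name, and the sign convention (a positive shiftHz rotates the phase counter-clockwise) may differ from what kernels.cu or dsp.FreqShift actually use.

```go
package main

import (
	"fmt"
	"math"
)

// freqShiftRef multiplies sample n by exp(j*2*pi*shiftHz*n/sampleRate).
// On a GPU, each thread would typically compute one such output sample.
// The sign convention here is an assumption made for this sketch.
func freqShiftRef(iq []complex64, shiftHz, sampleRate float64) []complex64 {
	out := make([]complex64, len(iq))
	step := 2 * math.Pi * shiftHz / sampleRate // phase advance per sample, in radians
	for n, s := range iq {
		sin, cos := math.Sincos(step * float64(n))
		out[n] = s * complex(float32(cos), float32(sin))
	}
	return out
}

func main() {
	// A constant input shifted by a quarter of the sample rate
	// rotates by 90 degrees from one sample to the next.
	in := []complex64{1, 1, 1, 1}
	for n, s := range freqShiftRef(in, 250, 1000) {
		fmt.Printf("n=%d %+.3f\n", n, s)
	}
}
```

Computing the phase from the sample index, rather than accumulating it sample by sample, keeps every output independent of the others, which is the property that makes the operation easy to parallelize.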

Next machine-side step

On a CUDA-capable dev machine with the toolchain installed:

1. Compile kernels.cu into an object file and archive it into a linkable library.
   - Helper script: tools/build-gpudemod-kernel.ps1
   - On Jan's Windows machine, the working kernel-build path currently relies on nvcc and MSVC cl.exe being in PATH.
2. Link gpudemod_kernels.lib into the cufft build.
3. Replace the stub body of gpud_launch_freq_shift(...) with the real kernel launch.
4. Validate the copied-back shifted IQ against dsp.FreqShift.
5. Only then move the next stage (FM discriminator) onto the GPU.
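
The validation step could look roughly like the comparison below. It is only a sketch: maxAbsDiff is a hypothetical helper, the tolerance of 1e-4 is an assumed value, and the package's own validation code (validation.go and related files) may work differently.

```go
package main

import (
	"fmt"
	"math"
	"math/cmplx"
)

// maxAbsDiff returns the largest per-sample magnitude of got[i] - want[i].
// It returns an error if the slices differ in length, and NaN as soon as
// any difference is NaN, so that a later tolerance check fails.
func maxAbsDiff(got, want []complex64) (float64, error) {
	if len(got) != len(want) {
		return 0, fmt.Errorf("length mismatch: got %d samples, want %d", len(got), len(want))
	}
	worst := 0.0
	for i := range got {
		d := cmplx.Abs(complex128(got[i] - want[i]))
		if math.IsNaN(d) {
			return d, nil
		}
		if d > worst {
			worst = d
		}
	}
	return worst, nil
}

func main() {
	cpu := []complex64{1, 2i, -1} // stand-in for the CPU reference output
	gpu := []complex64{1, 2i, -1} // stand-in for the copied-back GPU output
	worst, err := maxAbsDiff(gpu, cpu)
	if err != nil {
		fmt.Println("validation error:", err)
		return
	}
	const tol = 1e-4 // assumed tolerance, not taken from the package
	fmt.Printf("max abs diff %.3g, pass=%v\n", worst, worst <= tol)
}
```

A bit-exact comparison is unlikely to be appropriate here, since GPU and CPU sine/cosine implementations generally differ in their last bits; an absolute tolerance scaled to the input amplitude is one common choice.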

Why this is still useful

The runtime, buffer, recorder and fallback structure is already in place, so once kernel compilation is available, real acceleration can be inserted without another architecture rewrite.