Jan Svabenik e520d9b6f3 Fix Windows runtime DLL search for SDRplay and gpudemod		1 month ago
build	Add Windows gpudemod DLL build path	1 month ago
native	Add Windows gpudemod DLL build path	1 month ago
README.md	docs: split CUDA build paths by platform	1 month ago
doc.go	docs: add initial CUDA demod kernel source	1 month ago
gpudemod.go	Add Windows gpudemod DLL build path	1 month ago
gpudemod_cufft_test.go	build: wire CUDA demod package through nvcc and MSVC	1 month ago
gpudemod_stub.go	feat: add demod validation and GPU mode telemetry	1 month ago
gpudemod_test.go	feat: prepare CUDA demod launch boundary	1 month ago
gpudemod_windows.go	Fix Windows runtime DLL search for SDRplay and gpudemod	1 month ago
kernels.cu	feat: add demod validation and GPU mode telemetry	1 month ago
validation.go	feat: wire CUDA freq-shift launcher	1 month ago
validation_extra.go	feat: add demod validation and GPU mode telemetry	1 month ago
validation_extra_test.go	feat: add demod validation and GPU mode telemetry	1 month ago
validation_test.go	feat: validate CUDA freq-shift output	1 month ago

README.md

gpudemod

Phase 1 CUDA demod scaffolding.

Current state

- Standard Go builds (without the cufft build tag) compile gpudemod_stub.go, which is constrained by !cufft.
- Builds with the cufft tag allocate GPU buffers and cross the CGO/CUDA launch boundary.
- If the CUDA launch wrappers are not yet backed by compiled kernels, the code falls back to CPU DSP.
- The shifted IQ path is already wired, so a successful GPU freq-shift result can be copied back and reused immediately.
- Build orchestration is now OS-specific; see docs/build-cuda.md.
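
The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: `shiftFunc`, `shiftWithFallback`, and `errGPUUnavailable` are hypothetical names, and the real launch wrappers may report failure differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errGPUUnavailable is a hypothetical sentinel error standing in for
// "the launch wrapper is not backed by a compiled kernel yet".
var errGPUUnavailable = errors.New("gpu freq-shift unavailable")

// shiftFunc is the shape shared by the GPU and CPU paths in this sketch.
type shiftFunc func(iq []complex64) ([]complex64, error)

// shiftWithFallback tries the GPU path first and falls back to the CPU
// path on any GPU error. The bool reports whether the GPU result was used.
func shiftWithFallback(gpu, cpu shiftFunc, iq []complex64) ([]complex64, bool, error) {
	if gpu != nil {
		if out, err := gpu(iq); err == nil {
			return out, true, nil
		}
	}
	out, err := cpu(iq)
	return out, false, err
}

func main() {
	// A stubbed GPU path that always fails, as in a build without kernels.
	gpuStub := func(iq []complex64) ([]complex64, error) {
		return nil, errGPUUnavailable
	}
	// A placeholder CPU path that simply copies its input.
	cpuPath := func(iq []complex64) ([]complex64, error) {
		return append([]complex64(nil), iq...), nil
	}

	out, usedGPU, err := shiftWithFallback(gpuStub, cpuPath, []complex64{1, 1i})
	fmt.Println(len(out), usedGPU, err)
}
```

Returning a flag alongside the samples lets a caller record which path actually ran, which fits the GPU mode telemetry mentioned in the commit history.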

First real kernel

kernels.cu contains the first candidate implementation:

gpud_freq_shift_kernel

It is not yet compiled automatically, because the current machine has no CUDA compiler toolchain in PATH (nvcc not found).
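
For orientation, a frequency shift of this kind is usually computed by multiplying each IQ sample by a complex exponential. The Go sketch below shows that arithmetic on the CPU. It assumes the kernel follows this common formulation; `freqShiftRef` and its parameters are hypothetical and are not the signatures of `gpud_freq_shift_kernel` or `dsp.FreqShift`.

```go
package main

import (
	"fmt"
	"math"
)

// freqShiftRef multiplies sample n by exp(j*(phase + 2*pi*shiftHz*n/sampleRate)).
// It returns the shifted samples and the phase to start the next block with,
// so that consecutive blocks stay phase-continuous.
func freqShiftRef(iq []complex64, shiftHz, sampleRate, phase float64) ([]complex64, float64) {
	out := make([]complex64, len(iq))
	step := 2 * math.Pi * shiftHz / sampleRate // phase advance per sample
	for n, s := range iq {
		sin, cos := math.Sincos(phase + step*float64(n))
		out[n] = s * complex(float32(cos), float32(sin))
	}
	next := math.Mod(phase+step*float64(len(iq)), 2*math.Pi)
	return out, next
}

func main() {
	// Shifting a constant (DC) input by a quarter of the sample rate
	// rotates each successive sample by a further 90 degrees.
	out, next := freqShiftRef([]complex64{1, 1, 1, 1}, 250, 1000, 0)
	fmt.Println(out, next)
}
```

On a GPU each sample can be handled by its own thread, since sample n depends only on n and the starting phase.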

Next machine-side step

On a CUDA-capable dev machine with the toolchain installed:

1. Compile kernels.cu into an object file and archive it into a linkable library.
   - Helper script: tools/build-gpudemod-kernel.ps1
   - On Jan's Windows machine, the working kernel-build path currently relies on nvcc and MSVC cl.exe being in PATH.
2. Link gpudemod_kernels.lib into the cufft build.
3. Replace the stub body of gpud_launch_freq_shift(...) with the real kernel launch.
4. Validate the copied-back shifted IQ against dsp.FreqShift.
5. Only then move the next stage (FM discriminator) onto the GPU.
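
The validation step could compare the copied-back GPU output with the CPU reference sample by sample, accepting small float32 rounding differences. The sketch below is one possible shape for that check. `maxAbsError` is a hypothetical helper rather than the code in validation.go, and a suitable tolerance would have to be chosen empirically.

```go
package main

import (
	"fmt"
	"math/cmplx"
)

// maxAbsError returns the largest magnitude of got[i]-want[i].
// It returns an error if the two slices differ in length.
func maxAbsError(got, want []complex64) (float64, error) {
	if len(got) != len(want) {
		return 0, fmt.Errorf("length mismatch: got %d, want %d", len(got), len(want))
	}
	var worst float64
	for i := range got {
		if d := cmplx.Abs(complex128(got[i] - want[i])); d > worst {
			worst = d
		}
	}
	return worst, nil
}

func main() {
	// Stand-ins for the CPU reference and the copied-back GPU result.
	cpu := []complex64{1, 1i, -1, -1i}
	gpu := []complex64{1, 1i, -1, -1i}

	const tol = 1e-4 // assumed tolerance for this illustration only
	worst, err := maxAbsError(gpu, cpu)
	if err != nil {
		fmt.Println("validation failed:", err)
		return
	}
	fmt.Println("within tolerance:", worst <= tol)
}
```

Reporting the worst-case error instead of a bare pass/fail makes it easier to tell a rounding difference from a genuinely wrong kernel.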

Why this is still useful

The runtime/buffer/recorder/fallback structure is already in place, so once kernel compilation is available, real acceleration can be inserted without another architecture rewrite.