Web-based Winamp controller for CarPC - Go backend, mobile-first UI

//go:build windows

// Package viz captures the Windows audio loopback via WASAPI and emits
// FFT spectrum data for visualisation in the web frontend.
package viz

import (
	"context"
	"fmt"
	"log"
	"math"
	"math/cmplx"
	"runtime"
	"syscall"
	"time"
	"unsafe"

	"golang.org/x/sys/windows"
)

// NumBars is the number of frequency bars emitted per frame.
const NumBars = 64

const (
	fftN = 2048 // FFT window size (power of 2)

	// WASAPI
	audclntShareModeShared     = 0
	audclntStreamFlagsLoopback = 0x00020000
	audclntBufferFlagsSilent   = 0x2
	bufDuration                = 1_000_000 // 100 ms in 100-ns units

	// Wave format tags
	waveFormatPCM           = 1
	waveFormatFloat         = 3
	waveFormatExtensibleTag = 0xFFFE
)

// ── GUIDs ─────────────────────────────────────────────────────────────────────

var (
	clsidMMDeviceEnumerator = windows.GUID{
		Data1: 0xBCDE0395, Data2: 0xE52F, Data3: 0x467C,
		Data4: [8]byte{0x8E, 0x3D, 0xC4, 0x57, 0x92, 0x91, 0x69, 0x2E},
	}
	iidIMMDeviceEnumerator = windows.GUID{
		Data1: 0xA95664D2, Data2: 0x9614, Data3: 0x4F35,
		Data4: [8]byte{0xA7, 0x46, 0xDE, 0x8D, 0xB6, 0x36, 0x17, 0xE6},
	}
	iidIAudioClient = windows.GUID{
		Data1: 0x1CB9AD4C, Data2: 0xDBFA, Data3: 0x4c32,
		Data4: [8]byte{0xB1, 0x78, 0xC2, 0xF5, 0x68, 0xA7, 0x03, 0xB2},
	}
	iidIAudioCaptureClient = windows.GUID{
		Data1: 0xC8ADBD64, Data2: 0xE71E, Data3: 0x48a0,
		Data4: [8]byte{0xA4, 0xDE, 0x18, 0x5C, 0x39, 0x5C, 0xD3, 0x17},
	}
	subFormatFloat = windows.GUID{
		Data1: 0x00000003, Data2: 0x0000, Data3: 0x0010,
		Data4: [8]byte{0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71},
	}
)

// ── WAVEFORMAT structs ────────────────────────────────────────────────────────

type waveFormatEx struct {
	FormatTag      uint16
	Channels       uint16
	SamplesPerSec  uint32
	AvgBytesPerSec uint32
	BlockAlign     uint16
	BitsPerSample  uint16
	Size           uint16
}

// waveFormatExtensibleEx is a flat representation of WAVEFORMATEXTENSIBLE.
// We cannot embed waveFormatEx because Go pads the struct to 20 bytes
// (alignment of largest field = uint32), but the C layout is 18 bytes,
// so SubFormat would land at the wrong offset if we used struct embedding.
type waveFormatExtensibleEx struct {
	FormatTag      uint16
	Channels       uint16
	SamplesPerSec  uint32
	AvgBytesPerSec uint32
	BlockAlign     uint16
	BitsPerSample  uint16
	Size           uint16
	Samples        uint16 // wValidBitsPerSample / wSamplesPerBlock
	ChannelMask    uint32
	SubFormat      windows.GUID // 16 bytes → total 40 bytes, matches C layout
}
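
The padding argument in the comment above can be checked with unsafe.Sizeof and unsafe.Offsetof. The following is a standalone sketch, not part of the package: guid is a hypothetical stand-in for windows.GUID with the same field layout, and embedded is the struct shape the comment warns against. The sizes assume Go's usual 4-byte alignment for uint32.

```go
package main

import (
	"fmt"
	"unsafe"
)

// guid is a hypothetical stand-in for windows.GUID (same field layout).
type guid struct {
	Data1 uint32
	Data2 uint16
	Data3 uint16
	Data4 [8]byte
}

type waveFormatEx struct {
	FormatTag      uint16
	Channels       uint16
	SamplesPerSec  uint32
	AvgBytesPerSec uint32
	BlockAlign     uint16
	BitsPerSample  uint16
	Size           uint16
}

// embedded is the layout to avoid: the embedded header is padded to 20 bytes.
type embedded struct {
	waveFormatEx
	Samples     uint16
	ChannelMask uint32
	SubFormat   guid
}

// flat mirrors waveFormatExtensibleEx: every field is declared inline.
type flat struct {
	FormatTag      uint16
	Channels       uint16
	SamplesPerSec  uint32
	AvgBytesPerSec uint32
	BlockAlign     uint16
	BitsPerSample  uint16
	Size           uint16
	Samples        uint16
	ChannelMask    uint32
	SubFormat      guid
}

// headerSize is the Go size of the header struct, including tail padding.
func headerSize() uintptr { return unsafe.Sizeof(waveFormatEx{}) }

// embeddedSubFormatOffset is where SubFormat lands when the header is embedded.
func embeddedSubFormatOffset() uintptr {
	var e embedded
	return unsafe.Offsetof(e.SubFormat)
}

// flatSubFormatOffset is where SubFormat lands in the flat declaration.
func flatSubFormatOffset() uintptr {
	var f flat
	return unsafe.Offsetof(f.SubFormat)
}

// flatSize is the total size of the flat declaration.
func flatSize() uintptr { return unsafe.Sizeof(flat{}) }

func main() {
	fmt.Println("header size:", headerSize())
	fmt.Println("SubFormat offset, embedded:", embeddedSubFormatOffset())
	fmt.Println("SubFormat offset, flat:", flatSubFormatOffset())
	fmt.Println("flat size:", flatSize())
}
```

In the flat form SubFormat sits at offset 24 and the struct is 40 bytes, which is what WAVEFORMATEXTENSIBLE expects; the embedded form pushes SubFormat four bytes further.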
// ── DLL procs ─────────────────────────────────────────────────────────────────

var (
	ole32            = windows.NewLazySystemDLL("ole32.dll")
	coInitializeEx   = ole32.NewProc("CoInitializeEx")
	coUninitialize   = ole32.NewProc("CoUninitialize")
	coCreateInstance = ole32.NewProc("CoCreateInstance")
	coTaskMemFree    = ole32.NewProc("CoTaskMemFree")
)

// ── COM vtable helpers ────────────────────────────────────────────────────────

var ptrSize = unsafe.Sizeof(uintptr(0))

// procAt returns the address of method methodIdx in comObj's vtable.
func procAt(comObj uintptr, methodIdx int) uintptr {
	vtbl := *(*uintptr)(unsafe.Pointer(comObj))
	return *(*uintptr)(unsafe.Pointer(vtbl + uintptr(methodIdx)*ptrSize))
}

// comRelease calls IUnknown::Release (vtable index 2) on a non-nil COM pointer.
func comRelease(p uintptr) {
	if p != 0 {
		syscall.Syscall(procAt(p, 2), 1, p, 0, 0)
	}
}

// ── Capturer ──────────────────────────────────────────────────────────────────

// Capturer streams FFT spectrum bars from the system audio loopback.
type Capturer struct {
	// C receives slices of NumBars float32 values in [0.0, 1.0], one per
	// fftN-sample window (about 23 frames per second at 48 kHz).
	// Slow consumers cause frames to be dropped (non-blocking send).
	C chan []float32
}

// NewCapturer creates a Capturer ready to Start.
func NewCapturer() *Capturer {
	return &Capturer{C: make(chan []float32, 4)}
}

// Start runs the capture loop and blocks until ctx is cancelled or capture
// fails. Errors are logged, never fatal; on failure the channel simply
// stays empty.
func (c *Capturer) Start(ctx context.Context) {
	if err := c.run(ctx); err != nil {
		log.Printf("viz: %v", err)
	}
}

func (c *Capturer) run(ctx context.Context) error {
	// Pin this goroutine to its OS thread: COM is initialised per thread,
	// so every call below must run on the thread that called CoInitializeEx.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	hr, _, _ := coInitializeEx.Call(0, 0) // COINIT_MULTITHREADED
	if hr > 1 {                           // S_OK=0, S_FALSE=1 are both success
		return fmt.Errorf("CoInitializeEx: HRESULT 0x%08X", hr)
	}
	defer coUninitialize.Call()

	// ── IMMDeviceEnumerator ──────────────────────────────────────────────────
	// 0x17 is CLSCTX_ALL.
	var enumerator uintptr
	if hr, _, _ := coCreateInstance.Call(
		uintptr(unsafe.Pointer(&clsidMMDeviceEnumerator)), 0, 0x17,
		uintptr(unsafe.Pointer(&iidIMMDeviceEnumerator)),
		uintptr(unsafe.Pointer(&enumerator)),
	); hr != 0 {
		return fmt.Errorf("CoCreateInstance(MMDeviceEnumerator): 0x%08X", hr)
	}
	defer comRelease(enumerator)

	// ── Default render device ────────────────────────────────────────────────
	// GetDefaultAudioEndpoint(eRender, eConsole, &device): vtable index 4, 4 args
	var device uintptr
	if hr, _, _ := syscall.Syscall6(
		procAt(enumerator, 4), 4,
		enumerator, 0, 0, uintptr(unsafe.Pointer(&device)), 0, 0,
	); hr != 0 {
		return fmt.Errorf("GetDefaultAudioEndpoint: 0x%08X", hr)
	}
	defer comRelease(device)

	// ── IAudioClient ────────────────────────────────────────────────────────
	// IMMDevice::Activate(riid, clsCtx, pParams, &ppv): vtable index 3, 5 args
	var ac uintptr
	if hr, _, _ := syscall.Syscall6(
		procAt(device, 3), 5,
		device, uintptr(unsafe.Pointer(&iidIAudioClient)), 0x17, 0,
		uintptr(unsafe.Pointer(&ac)), 0,
	); hr != 0 {
		return fmt.Errorf("Activate(IAudioClient): 0x%08X", hr)
	}
	defer comRelease(ac)

	// ── Mix format ──────────────────────────────────────────────────────────
	var fmtPtr uintptr
	if hr, _, _ := syscall.Syscall(
		procAt(ac, 8), 2, // GetMixFormat
		ac, uintptr(unsafe.Pointer(&fmtPtr)), 0,
	); hr != 0 {
		return fmt.Errorf("GetMixFormat: 0x%08X", hr)
	}
	defer coTaskMemFree.Call(fmtPtr)

	wfx := (*waveFormatEx)(unsafe.Pointer(fmtPtr))
	sampleRate := int(wfx.SamplesPerSec)
	channels := int(wfx.Channels)
	isFloat := wfx.FormatTag == waveFormatFloat
	// WAVEFORMATEXTENSIBLE carries 22 extra bytes; the real sample type is in SubFormat.
	if wfx.FormatTag == waveFormatExtensibleTag && wfx.Size >= 22 {
		ext := (*waveFormatExtensibleEx)(unsafe.Pointer(fmtPtr))
		isFloat = ext.SubFormat == subFormatFloat
	}
	log.Printf("viz: loopback format %d Hz, %d ch, %d bit, float=%v",
		sampleRate, channels, wfx.BitsPerSample, isFloat)
	if !isFloat || wfx.BitsPerSample != 32 {
		// No "viz:" prefix here: Start already adds it when logging.
		return fmt.Errorf("unsupported format (need float32); got tag=%04X bits=%d",
			wfx.FormatTag, wfx.BitsPerSample)
	}

	// ── Initialize loopback ──────────────────────────────────────────────────
	// REFERENCE_TIME is 64-bit, so passing each duration as a single uintptr
	// assumes a 64-bit build.
	if hr, _, _ := syscall.Syscall9(
		procAt(ac, 3), 7, // IAudioClient::Initialize
		ac,
		audclntShareModeShared,
		audclntStreamFlagsLoopback,
		uintptr(bufDuration), 0, // hnsBufferDuration, hnsPeriodicity
		fmtPtr, 0, // pFormat, AudioSessionGuid
		0, 0,
	); hr != 0 {
		return fmt.Errorf("IAudioClient::Initialize: 0x%08X", hr)
	}

	// ── IAudioCaptureClient ──────────────────────────────────────────────────
	var acc uintptr
	if hr, _, _ := syscall.Syscall(
		procAt(ac, 14), 3, // GetService
		ac,
		uintptr(unsafe.Pointer(&iidIAudioCaptureClient)),
		uintptr(unsafe.Pointer(&acc)),
	); hr != 0 {
		return fmt.Errorf("GetService(IAudioCaptureClient): 0x%08X", hr)
	}
	defer comRelease(acc)

	// ── Start ────────────────────────────────────────────────────────────────
	if hr, _, _ := syscall.Syscall(procAt(ac, 10), 1, ac, 0, 0); hr != 0 {
		return fmt.Errorf("IAudioClient::Start: 0x%08X", hr)
	}
	defer syscall.Syscall(procAt(ac, 11), 1, ac, 0, 0) // Stop

	// ── Capture loop ─────────────────────────────────────────────────────────
	buf := make([]float64, 0, fftN*2)
	smooth := make([]float32, NumBars)
	tick := time.NewTicker(10 * time.Millisecond)
	defer tick.Stop()

	for {
		select {
		case <-ctx.Done():
			return nil
		case <-tick.C:
			buf = drainLoopback(acc, channels, buf)
			for len(buf) >= fftN {
				bars := spectrum(buf[:fftN], sampleRate, smooth)
				copy(smooth, bars)
				select {
				case c.C <- bars:
				default: // consumer is behind; drop this frame
				}
				// Shift the remainder to the front so the backing array is
				// reused instead of being re-allocated as the slice advances.
				buf = buf[:copy(buf, buf[fftN:])]
			}
		}
	}
}
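
The frame-dropping behaviour of the capture loop comes from a select with a default branch. A minimal standalone sketch of that pattern follows; trySend and produce are hypothetical helper names used only for illustration, and the channel capacity of 4 matches NewCapturer.

```go
package main

import "fmt"

// trySend delivers frame without blocking and reports whether it was accepted.
func trySend(ch chan<- []float32, frame []float32) bool {
	select {
	case ch <- frame:
		return true
	default:
		// Channel buffer is full: drop the frame rather than stall the producer.
		return false
	}
}

// produce attempts to send n frames and returns how many were dropped.
func produce(ch chan<- []float32, n int) int {
	dropped := 0
	for i := 0; i < n; i++ {
		if !trySend(ch, make([]float32, 64)) {
			dropped++
		}
	}
	return dropped
}

func main() {
	ch := make(chan []float32, 4) // same capacity as NewCapturer uses
	fmt.Println("dropped:", produce(ch, 10))
	fmt.Println("queued:", len(ch))
}
```

With nobody reading, only the first four frames fit in the buffer and the rest are discarded, so the audio thread never waits on a slow web client.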
// drainLoopback reads all pending audio frames into buf and returns it.
func drainLoopback(acc uintptr, channels int, buf []float64) []float64 {
	for {
		// GetNextPacketSize
		var packetFrames uint32
		if hr, _, _ := syscall.Syscall(
			procAt(acc, 5), 2,
			acc, uintptr(unsafe.Pointer(&packetFrames)), 0,
		); hr != 0 || packetFrames == 0 {
			break
		}

		// GetBuffer(ppData, &numFrames, &flags, NULL, NULL): 6 args with the receiver
		var dataPtr uintptr
		var numFrames uint32
		var flags uint32
		if hr, _, _ := syscall.Syscall6(
			procAt(acc, 3), 6,
			acc,
			uintptr(unsafe.Pointer(&dataPtr)),
			uintptr(unsafe.Pointer(&numFrames)),
			uintptr(unsafe.Pointer(&flags)),
			0, 0,
		); hr != 0 {
			break
		}

		if flags&audclntBufferFlagsSilent == 0 && dataPtr != 0 && numFrames > 0 {
			// Samples are interleaved float32; average the channels to mono.
			samples := unsafe.Slice((*float32)(unsafe.Pointer(dataPtr)), int(numFrames)*channels)
			for i := 0; i < int(numFrames); i++ {
				var mono float64
				for ch := 0; ch < channels; ch++ {
					mono += float64(samples[i*channels+ch])
				}
				buf = append(buf, mono/float64(channels))
			}
		}

		// ReleaseBuffer
		syscall.Syscall(procAt(acc, 4), 2, acc, uintptr(numFrames), 0)
	}
	return buf
}
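
The downmix step inside drainLoopback can be exercised without WASAPI. The sketch below is a standalone illustration: downmix is a hypothetical helper, not a function from this package, and it assumes the same interleaved float32 layout (frame i, channel ch at index i*channels+ch).

```go
package main

import "fmt"

// downmix averages interleaved multi-channel samples into one mono value per frame.
// Trailing samples that do not form a complete frame are ignored.
func downmix(samples []float32, channels int) []float64 {
	frames := len(samples) / channels
	mono := make([]float64, frames)
	for i := 0; i < frames; i++ {
		var sum float64
		for ch := 0; ch < channels; ch++ {
			sum += float64(samples[i*channels+ch])
		}
		mono[i] = sum / float64(channels)
	}
	return mono
}

func main() {
	// Three stereo frames: (L, R) pairs laid out back to back.
	stereo := []float32{1, 0, 0.5, 0.5, -1, 1}
	fmt.Println(downmix(stereo, 2))
}
```

Averaging rather than summing keeps the mono signal within the range of the input channels, so the later dB mapping does not need a per-channel-count correction.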
// ── Spectrum analysis ─────────────────────────────────────────────────────────

// spectrum applies a Hann window, runs the FFT, maps the result to NumBars
// log-spaced frequency bars, and applies fast-attack/slow-decay smoothing.
func spectrum(samples []float64, sampleRate int, prev []float32) []float32 {
	n := len(samples)

	// Hann window
	cx := make([]complex128, n)
	for i, s := range samples {
		w := 0.5 * (1 - math.Cos(2*math.Pi*float64(i)/float64(n-1)))
		cx[i] = complex(s*w, 0)
	}
	ditFFT(cx)

	// Magnitude of positive frequencies, normalised
	bins := make([]float64, n/2)
	scale := 2.0 / float64(n)
	for i := range bins {
		bins[i] = cmplx.Abs(cx[i]) * scale
	}

	// Log-spaced output bars: 40 Hz → 20 kHz
	const fMin, fMax = 40.0, 20_000.0
	freqRes := float64(sampleRate) / float64(n)
	bars := make([]float32, NumBars)
	for b := 0; b < NumBars; b++ {
		// Bar b covers [f, fNext). Dividing by NumBars (not NumBars-1) makes
		// the bars tile fMin..fMax, so the last bar is not an empty band at fMax.
		f := fMin * math.Pow(fMax/fMin, float64(b)/float64(NumBars))
		fNext := fMin * math.Pow(fMax/fMin, float64(b+1)/float64(NumBars))

		// Every bar averages at least one FFT bin.
		lo := clamp(int(f/freqRes), 0, len(bins)-1)
		hi := clamp(int(fNext/freqRes), lo+1, len(bins))
		var sum float64
		for i := lo; i < hi; i++ {
			sum += bins[i]
		}
		avg := sum / float64(hi-lo)

		// dB → [0, 1]: -80 dB maps to 0, 0 dB maps to 1
		dB := 20 * math.Log10(avg+1e-9)
		norm := float32((dB + 80) / 80)
		if norm < 0 {
			norm = 0
		}
		if norm > 1 {
			norm = 1
		}

		// Fast attack, slow decay: rise immediately, fall by 12% per frame,
		// but never drop below the current level.
		decayed := prev[b] * 0.88
		if norm > decayed {
			bars[b] = norm
		} else {
			bars[b] = decayed
		}
	}
	return bars
}
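
To see how log-spaced bars map onto FFT bins, the band edges can be computed on their own. This is a standalone sketch under stated assumptions: bandEdges and binIndex are hypothetical helpers, the bars are taken to be contiguous (numBars+1 edges with a constant frequency ratio), and 48 kHz is only an example sample rate.

```go
package main

import (
	"fmt"
	"math"
)

// bandEdges returns numBars+1 geometrically spaced frequencies from fMin to fMax.
// Bar b then covers the half-open range [edges[b], edges[b+1]).
func bandEdges(numBars int, fMin, fMax float64) []float64 {
	edges := make([]float64, numBars+1)
	for i := range edges {
		t := float64(i) / float64(numBars)
		edges[i] = fMin * math.Pow(fMax/fMin, t)
	}
	return edges
}

// binIndex maps a frequency to its FFT bin for the given sample rate and FFT size.
func binIndex(freq float64, sampleRate, fftSize int) int {
	freqRes := float64(sampleRate) / float64(fftSize)
	return int(freq / freqRes)
}

func main() {
	const sampleRate, fftSize = 48_000, 2048 // example values
	edges := bandEdges(64, 40, 20_000)
	for _, b := range []int{0, 31, 63} {
		fmt.Printf("bar %2d: %8.1f Hz to %8.1f Hz, bins %d to %d\n",
			b, edges[b], edges[b+1],
			binIndex(edges[b], sampleRate, fftSize),
			binIndex(edges[b+1], sampleRate, fftSize))
	}
}
```

At 48 kHz with a 2048-point FFT each bin is about 23.4 Hz wide, so the lowest bars fall inside a single bin while the highest bars average many bins. This is why spectrum forces every bar to span at least one bin.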
// clamp limits v to the inclusive range [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}
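
The level mapping and smoothing used in spectrum can also be looked at in isolation. The sketch below is standalone and illustrative: levelToUnit and smooth are hypothetical names, the 80 dB range and 0.88 decay factor are taken from spectrum, and smooth is written as a variant that never lets a bar fall below the current level.

```go
package main

import (
	"fmt"
	"math"
)

// levelToUnit maps a linear amplitude to [0, 1] over an 80 dB range:
// -80 dB and below give 0, 0 dB and above give 1.
func levelToUnit(amp float64) float32 {
	dB := 20 * math.Log10(amp+1e-9) // small offset avoids log10(0)
	norm := float32((dB + 80) / 80)
	if norm < 0 {
		return 0
	}
	if norm > 1 {
		return 1
	}
	return norm
}

// smooth rises to cur immediately and otherwise decays prev by the given factor,
// without dropping below cur.
func smooth(prev, cur, decay float32) float32 {
	decayed := prev * decay
	if cur > decayed {
		return cur
	}
	return decayed
}

func main() {
	// A bar hit by a single loud frame, followed by silence.
	level := float32(0)
	for i, amp := range []float64{1, 0, 0, 0, 0} {
		level = smooth(level, levelToUnit(amp), 0.88)
		fmt.Printf("frame %d: %.3f\n", i, level)
	}
}
```

With a factor of 0.88 a full-scale bar needs roughly five frames to fall to half height, which at about 23 frames per second is a little over 0.2 s.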
// ── Cooley-Tukey FFT ─────────────────────────────────────────────────────────

// ditFFT is an in-place, decimation-in-time FFT. len(x) must be a power of 2.
func ditFFT(x []complex128) {
	n := len(x)

	// Bit-reversal permutation
	j := 0
	for i := 1; i < n; i++ {
		bit := n >> 1
		for j&bit != 0 {
			j ^= bit
			bit >>= 1
		}
		j ^= bit
		if i < j {
			x[i], x[j] = x[j], x[i]
		}
	}

	// Butterfly stages
	for length := 2; length <= n; length <<= 1 {
		half := length >> 1
		wStep := cmplx.Exp(complex(0, -math.Pi/float64(half)))
		for i := 0; i < n; i += length {
			w := complex(1, 0)
			for k := 0; k < half; k++ {
				u := x[i+k]
				v := x[i+k+half] * w
				x[i+k] = u + v
				x[i+k+half] = u - v
				w *= wStep
			}
		}
	}
}