Hacker News·INFRA·2mo ago
Zero-copy GPU inference from WebAssembly on Apple Silicon
The CPU and GPU read and write the same physical bytes. End-to-end, it works.
A developer has demonstrated that WebAssembly modules can share memory directly with the GPU on Apple Silicon, eliminating the usual copy and serialization step between the sandbox and the accelerator. The technique chains three components: mmap to obtain page-aligned memory, Metal's bytesNoCopy buffer API to wrap that pointer without copying, and Wasmtime's custom memory allocator to use the same region as the Wasm linear memory. Because Apple Silicon uses unified memory, the Wasm guest and the GPU then read and write the same physical bytes, with no intermediate buffers. The developer measured no increase in resident set size (RSS) on the zero-copy path, compared with 16.78 MB on the copy path. This matters because it makes Wasm a viable control plane for stateful AI inference on Apple hardware, with near-zero overhead between the sandbox and the accelerator.
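The core idea, one page-aligned region seen through two views with no copy in between, can be sketched in a few lines. This is a minimal illustration in Python, not the developer's code: it performs only the mmap step for real, and the names `alloc_region`, `base_address`, `guest_view`, and `gpu_view` are hypothetical stand-ins for the Wasm linear memory and the Metal buffer. On real hardware the mapping's pointer would instead be handed to Metal's bytesNoCopy buffer constructor and to Wasmtime's custom memory hook.

```python
import ctypes
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE  # 16 KiB on Apple Silicon, 4 KiB on most x86 systems


def round_up_to_page(n_bytes: int) -> int:
    """Round a byte count up to a whole number of pages."""
    return -(-n_bytes // PAGE_SIZE) * PAGE_SIZE


def alloc_region(min_bytes: int) -> mmap.mmap:
    """Anonymous mapping; the OS returns page-aligned memory."""
    return mmap.mmap(-1, round_up_to_page(min_bytes))


def base_address(region: mmap.mmap) -> int:
    """Address of the mapping's first byte, read through a temporary view."""
    return ctypes.addressof(ctypes.c_char.from_buffer(region))


def demo(n_floats: int = 4):
    region = alloc_region(n_floats * 4)
    aligned = base_address(region) % PAGE_SIZE == 0

    # Hypothetical stand-ins (assumptions, not real Wasmtime or Metal objects):
    # guest_view plays the Wasm linear memory, gpu_view plays the GPU buffer.
    # Both are views over the same mapping; neither owns a copy.
    guest_view = memoryview(region)
    gpu_view = (ctypes.c_float * n_floats).from_buffer(region)

    # Copy path, for contrast: a staged snapshot that goes stale immediately.
    staged = bytes(region[: n_floats * 4])

    # The "guest" writes inputs; the "GPU" sees them without any transfer.
    inputs = [float(i) for i in range(n_floats)]
    struct.pack_into(f"={n_floats}f", guest_view, 0, *inputs)
    seen_by_gpu = list(gpu_view)

    # The "GPU" writes a result; the "guest" reads it from the same bytes.
    gpu_view[0] = 42.0
    seen_by_guest = struct.unpack_from("=f", guest_view, 0)[0]
    staged_first = struct.unpack_from("=f", staged, 0)[0]

    # Release both views before unmapping the region.
    del gpu_view
    guest_view.release()
    region.close()
    return aligned, seen_by_gpu, seen_by_guest, staged_first
```

Calling `demo()` returns whether the base address is page-aligned, the values the second view observed after the first view wrote, the value the first view read back after the second view wrote, and the first float of the stale staged copy. Page alignment is the property that matters for the real pipeline, since a no-copy GPU buffer has to wrap whole pages.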
#webassembly #apple-silicon #gpu-inference #wasmtime