A 10 million document RAG dataset occupies 31 GB of RAM at float32. turbovec fits it in just 4 GB, and now searches it faster than FAISS.

I just shipped a new release of turbovec: a Rust vector index with Python bindings, built on Google Research's TurboQuant algorithm. Data-oblivious 2-4 bit quantization that matches the Shannon lower bound on distortion: zero training, and no rebuilds when the corpus grows.

What's in the box:
→ Hand-written SIMD kernels: 12-20% faster than FAISS FastScan on ARM; match-or-beat on x86.
→ O(1) stable-id delete and save/load. The corpus is live and mutable, not a static snapshot.
→ Drop-in integrations for LangChain, LlamaIndex, and Haystack.
→ Published benchmarks (recall, speed, compression) at d=200/1536/3072, every number reproducible from the repo.

If you're building RAG where memory, latency, or privacy matters, give it a spin.

GitHub: https://lnkd.in/e5M4dVRk
Paper: https://lnkd.in/eHRmpYms

#RAG #VectorSearch #OpenSource #Rust #Python #LLM #Gemma4
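The headline figures imply embeddings of roughly 768 dimensions (my assumption; the post does not state a dimension). A quick back-of-envelope check of the 31 GB → 4 GB claim:

```python
# Back-of-envelope memory math for the figures in the post.
# ASSUMPTION (not stated in the post): 768-dim embeddings, a common size
# for sentence-embedding models.
N_DOCS = 10_000_000
DIM = 768

# float32 storage: 4 bytes per dimension.
float32_bytes = N_DOCS * DIM * 4
print(f"float32: {float32_bytes / 1e9:.1f} GB")  # ~30.7 GB, matching the quoted ~31 GB

# At 4 bits per dimension (the top of TurboQuant's 2-4 bit range):
quant_bytes = N_DOCS * DIM * 4 / 8
print(f"4-bit quantized: {quant_bytes / 1e9:.1f} GB")  # ~3.8 GB, within the quoted 4 GB
```

So the quoted 4 GB is consistent with ~4 bits per dimension plus a small amount of index overhead.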
Looks amazing, I love it.
I gotta check to see if it has native mps support because if so, I might be making the move from FAISS…
Amazing work. Thank you for sharing here
Great work 👏
This guy just can't stop building!
This is really cool, will try this. Thanks for sharing!
Amazing work !!!
Definitely gonna try it!!!
Man, times have moved on since I wrote fast sentence embeddings. Fantastic job, will give it a spin!
This looks very solid on the infra side. What we have been seeing is that once compression and latency are under control, the hard problems show up in ranking stability rather than raw recall, especially in dense regions where top-k becomes very sensitive to small perturbations. That part tends to matter more in production than the average metrics.
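The top-k sensitivity described in the comment above can be sketched in a few lines: in a dense region, where candidate scores are tightly packed, a small perturbation of the query (e.g. on the scale of quantization error) can reshuffle the top-k set. This is an illustrative toy, not turbovec code; all parameters below are arbitrary choices.

```python
import random

# Illustrative sketch: top-k overlap under a small query perturbation
# in a dense region, where all points cluster around one center and
# their similarity scores are nearly tied.
random.seed(0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k(query, points, k):
    # Indices of the k points with the highest inner-product score.
    return set(sorted(range(len(points)), key=lambda i: -dot(query, points[i]))[:k])

dim, n, k = 32, 1000, 10
center = [random.gauss(0, 1) for _ in range(dim)]
# Dense region: every point is the center plus small noise.
points = [[c + random.gauss(0, 0.05) for c in center] for _ in range(n)]

query = [c + random.gauss(0, 0.05) for c in center]
# Perturb the query with noise on the same scale as the data spread.
perturbed = [q + random.gauss(0, 0.05) for q in query]

overlap = len(top_k(query, points, k) & top_k(perturbed, points, k)) / k
print(f"top-{k} overlap after small perturbation: {overlap:.0%}")
```

With well-separated clusters the overlap stays near 100%; in a dense region like this one, rank order near the k-th position is fragile, which is why averaged recall can look fine while per-query results churn.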