turbovec: 10 Million Vectors in 4GB – Memory Efficient Vector Search
Vector search is eating the world, but the memory cost is painful.
If you’ve ever tried to run a large-scale semantic search or recommendation system locally, you know the pain. FAISS is great, but it can be a memory hog. That’s where turbovec comes in.
This tiny Rust library claims something wild: store 10 million 768‑dimensional vectors in just 4GB of RAM, while still searching faster than FAISS. And it’s not vaporware – the code is on GitHub, ready to try.
What It Does
turbovec is a vector similarity search library written in Rust. It lets you store and search through millions of high‑dimensional vectors (think embeddings from models like text‑embedding‑ada‑002 or all‑MiniLM‑L6‑v2) with drastically lower memory usage.
Instead of storing full float32 arrays for every vector, turbovec uses scalar quantization (SQ8) – converting each dimension from a 32‑bit float to an 8‑bit integer. This alone gives you a 4× reduction in memory. But here’s the clever part: it also supports product quantization (PQ) to pack vectors even tighter.
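To make the scalar quantization idea concrete, here is a minimal sketch in plain Python. It assumes a simple per-vector min/max scheme; the function names and the scheme itself are illustrative assumptions, not turbovec's actual implementation.

```python
# Minimal scalar-quantization (SQ8) sketch. Illustrative only: the per-vector
# min/max scheme and all names here are assumptions, not turbovec's code.

def sq8_encode(vec):
    """Map each float to an integer code in 0..255 using the vector's own range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vec)
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Reconstruct approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]

vec = [0.12, -0.53, 0.98, 0.0, -1.0]
codes, lo, scale = sq8_encode(vec)
approx = sq8_decode(codes, lo, scale)

# Each value now takes 1 byte instead of 4; rounding error is at most scale / 2.
max_err = max(abs(a - b) for a, b in zip(vec, approx))
print(len(codes), max_err <= scale / 2 + 1e-12)
```

The trade is visible in the last two lines: storage drops to one byte per dimension, and each reconstructed value is off by at most half a quantization step.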
The result? For a dataset of 10 million 768‑dimensional vectors:
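| Approach | Memory |
|----------|--------|
| Raw float32 | ~31 GB |
| turbovec (quantized) | ~4 GB |

One caveat on the numbers: at one byte per dimension, 10 million × 768 values come to about 7.7 GB, so the ~4 GB figure implies roughly 4 bits per dimension, which points to the tighter packing rather than SQ8 alone.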
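The memory figures are easy to sanity-check. The sketch below is pure arithmetic in decimal gigabytes; it ignores index metadata, per-vector scale factors, and allocator overhead, and the bit widths other than float32 are listed only for comparison.

```python
# Back-of-the-envelope memory footprint for 10 million 768-dimensional vectors.
# Pure arithmetic: ignores index metadata, per-vector scales, and allocator overhead.

NUM_VECTORS = 10_000_000
DIM = 768

def footprint_gb(bits_per_dim):
    """Storage in decimal gigabytes (1 GB = 10**9 bytes)."""
    return NUM_VECTORS * DIM * bits_per_dim / 8 / 1e9

for label, bits in [("float32", 32), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>8}: {footprint_gb(bits):6.2f} GB")
```

This works out to 30.72 GB for float32, 7.68 GB at 8 bits, and 3.84 GB at 4 bits per dimension.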
And it doesn’t just save memory; it’s also fast. The benchmark claims that brute-force search over these compressed representations is faster than FAISS.
Why It’s Cool
1. Real compression, real speed
Most quantization techniques trade accuracy for memory. turbovec quantizes the vectors but keeps the search itself simple: a brute-force scan over every stored vector, so no candidate is skipped the way it can be with an approximate index (the scores are still computed on quantized values, so they approximate the float32 scores). The trick? The 8‑bit dot product can be computed directly on the compressed data, and SIMD instructions make this fast. For a 10M‑vector dataset, a full scan reportedly takes under 200ms on a modern laptop.
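The brute-force scan over 8-bit codes can be sketched in a few lines of Python. This is an illustration of the general technique under stated assumptions (signed int8 codes, one shared scale, inner-product scoring); the names are hypothetical and a real implementation would use SIMD kernels rather than Python loops.

```python
# Sketch of exhaustive top-k search using integer dot products on 8-bit codes.
# Assumptions: signed int8 codes with a shared scale; names are illustrative only.
import heapq

def quantize_i8(vec, scale):
    """Round each float to a signed 8-bit integer code in -127..127."""
    return [max(-127, min(127, round(x / scale))) for x in vec]

def dot_i8(a, b):
    """Integer dot product; a SIMD kernel would process many lanes at once."""
    return sum(x * y for x, y in zip(a, b))

def search(db_codes, query_codes, k):
    """Score every stored vector (no index structure) and keep the k best."""
    scored = ((dot_i8(codes, query_codes), i) for i, codes in enumerate(db_codes))
    return heapq.nlargest(k, scored)

SCALE = 1 / 127  # assumes components lie in [-1, 1], e.g. normalized embeddings
db = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0], [-1.0, 0.0, 0.0]]
db_codes = [quantize_i8(v, SCALE) for v in db]
query_codes = quantize_i8([0.9, 0.1, 0.0], SCALE)

top = search(db_codes, query_codes, k=2)
print([i for _, i in top])
```

For this toy data the two best matches are vectors 0 and 2. Because every vector is scored, the only source of ranking error is the rounding in `quantize_i8`, not a skipped candidate.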
2. Designed for local use
turbovec isn’t a distributed system. It’s a single‑machine library. That means you can run it on a $1000 laptop with 16GB RAM, not a $50k server. Perfect for prototyping, edge devices, or anyone who wants to embed vector search into a desktop app.
3. Tiny binary, zero dependencies
The entire library compiles to a single .so/.a file with no external dependencies. No Python, no numpy, no FAISS driver. Just Rust + C API.
4. Works with your existing models
You don’t need to retrain anything. turbovec works with any float32 embeddings you already have – just convert them to the quantized format. The repo even includes a Python script to convert FAISS‑indexed vectors to turbovec format.
How to Try It
- Clone the repo
    git clone https://github.com/RyanCodrai/turbovec.git
    cd turbovec
- Build it
    cargo build --release
- Index your vectors
  The repo comes with a small example dataset. Or use the provided turbovec_index binary:
    ./target/release/turbovec_index --vectors your_embeddings.bin --dim 768 --out my_index.tv
- Search
    ./target/release/turbovec_search --index my_index.tv --query "your query vector" --topk 10
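The indexing command takes a vectors file plus a --dim flag, which suggests a flat binary dump of floats, but the exact layout is not stated here. The sketch below writes row-major little-endian float32 values with no header; that layout is an assumption, so check the repo for the format turbovec_index actually expects before relying on it.

```python
# Sketch: dump embeddings to a flat binary file of little-endian float32 values.
# ASSUMPTION: the layout (row-major, no header) is a guess;
# check the repo for the format turbovec_index actually expects.
import os
import struct
import tempfile

def write_flat_f32(path, vectors, dim):
    """Write vectors back to back, each as `dim` little-endian float32 values."""
    with open(path, "wb") as f:
        for vec in vectors:
            if len(vec) != dim:
                raise ValueError(f"expected {dim} dimensions, got {len(vec)}")
            f.write(struct.pack(f"<{dim}f", *vec))

vectors = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]  # toy data, dim=4
path = os.path.join(tempfile.mkdtemp(), "your_embeddings.bin")
write_flat_f32(path, vectors, dim=4)

# Sanity check: file size should be num_vectors * dim * 4 bytes.
size = os.path.getsize(path)
print(size)
```

The size check is a cheap way to catch a dimension mismatch before handing the file to an indexer.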
There’s also a Python binding in the python/ folder, if you’d rather not work with Rust directly:
import turbovec
index = turbovec.load_index("my_index.tv")
results = index.search(query_vector, k=10)
Final Thoughts
turbovec is not a drop‑in replacement for FAISS in every scenario. FAISS offers GPU support, advanced indexing (IVF, HNSW), and distributed search. But if your use case is:
- Memory‑constrained: running on a laptop, a Raspberry Pi, or an edge device.
- Exhaustive search preferred: you want every vector scored rather than an approximate nearest-neighbor index.
- Single‑machine scale: tens of millions of vectors, not billions.
Then turbovec is genuinely exciting. It’s the kind of tool that makes me want to go build something with it – maybe a local semantic search engine for all my notes, or a real‑time recommendation system for a side project.
The code is clean, the documentation is decent, and the author (@RyanCodrai) seems responsive to issues. Give it a star if you like what you see.
Based on the tweet from @githubprojects
Repository: https://github.com/RyanCodrai/turbovec