Daniel Ivanov

Generalist Engineer

Vector Search Doesn't Need the Cloud

Your data doesn't need someone else's server to be searchable.

Most vector search today ships your notes to a cloud service. Your journal entries, medical records, research notes - they leave your device to be embedded, indexed, and queried elsewhere. It works, but there's a dependency: your ability to search your own information requires someone else's infrastructure.

That feels backwards.

Local-First AI Infrastructure

There's a growing movement around local-first software - applications where your data lives on your device, works offline, and doesn't require server permission to function. CRDTs, sync engines, collaborative editors that work without central servers.

But AI features? Still cloud-first. Want semantic search? Call an API. Want vector similarity? Ship to a database service. Want embeddings? Send your text to a cloud provider.

I wanted to test whether this had to be true. Could you build AI-powered semantic search that follows local-first principles? Data on device, works offline, no server dependencies, optional sync?

Turns out: yes. Your browser can host a functional vector database. 10k vectors queried in ~8ms, nothing leaving your machine.

Why This Matters

Privacy by default. Your journal entries, medical records, research notes - they stay on your device. Not "encrypted in transit," not "we don't look at your data." They literally never leave.

The application works offline. On planes, in basements, when your internet dies. It doesn't degrade because there's no server to lose connection to.

No vendor lock-in. No API keys, no usage limits, no pricing changes, no service shutdowns. Your data, your search, your infrastructure.

And speed - network round-trips take 50-200ms. Local queries? Single-digit milliseconds. The fastest request is the one you don't make.

This aligns with what Ink & Switch calls the seven ideals of local-first software: no spinners, your work not trapped on one device, the network optional, seamless collaboration, longevity, privacy and security by default, and ultimate user ownership and control.

The Technical Reality

Here's the thing: IndexedDB wasn't designed for vector similarity search.

Vector databases use specialized indices (HNSW, IVF) and hardware acceleration (SIMD, GPU). IndexedDB gives you document storage and B-tree indices. Not the same thing.

Brute force doesn't scale. Comparing a query against all 10,000 stored vectors in JavaScript? Too slow for interactive search. You need to narrow the search space before computing similarity.
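
For scale, here's the baseline being rejected - exact cosine similarity computed against every stored vector. A minimal sketch, not the project's actual code:

// Exact cosine similarity: the measure used in the final ranking stage.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute force: score every stored vector on every query.
// Correct, but it touches all 10,000 records each time.
function bruteForceSearch(query: Float32Array, vectors: Float32Array[]): number[] {
  return vectors.map((v) => cosineSimilarity(query, v));
}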

I used two filtering layers. First, a magnitude pre-filter - similar vectors tend to have similar lengths, so filtering to magnitudes within a tolerance of the query's eliminates most candidates immediately.
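
In code, the idea is simple. A sketch, assuming each record stores its magnitude pre-computed at insert time (the record shape and tolerance value are illustrative):

interface StoredVector {
  id: string;
  values: Float32Array;
  magnitude: number; // sqrt of the sum of squares, computed once at insert
}

// Keep only vectors whose length is within ±tolerance of the query's.
// In IndexedDB this can run as a key range over an index on magnitude,
// so rejected records never need to be deserialized.
function magnitudeFilter(
  queryMagnitude: number,
  candidates: StoredVector[],
  tolerance = 0.15,
): StoredVector[] {
  const lo = queryMagnitude * (1 - tolerance);
  const hi = queryMagnitude * (1 + tolerance);
  return candidates.filter((v) => v.magnitude >= lo && v.magnitude <= hi);
}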

Second, LSH bucketing. Locality-Sensitive Hashing groups similar vectors together. Unlike normal hashing (where "cat" and "cats" scatter randomly), LSH keeps similar items in the same buckets. Project vectors onto random hyperplanes, generate binary hash, query matching buckets.
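
A sketch of that hashing step, assuming the hyperplanes are generated once and persisted with the store so inserts and queries hash consistently:

// Each random hyperplane contributes one bit: which side the vector falls on.
// Similar vectors agree on most bits, so they land in the same bucket.
function lshHash(vector: Float32Array, hyperplanes: Float32Array[]): string {
  let bits = "";
  for (const plane of hyperplanes) {
    let dot = 0;
    for (let i = 0; i < vector.length; i++) dot += vector[i] * plane[i];
    bits += dot >= 0 ? "1" : "0";
  }
  return bits; // e.g. "10110100" - the bucket key
}

function randomHyperplanes(count: number, dimensions: number): Float32Array[] {
  return Array.from({ length: count }, () => {
    const plane = new Float32Array(dimensions);
    for (let i = 0; i < dimensions; i++) plane[i] = Math.random() * 2 - 1;
    return plane;
  });
}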

This reduces 10,000 vectors to ~200 candidates. Then compute exact cosine similarity only on those. Fast enough for interactive search.
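
The final stage is ordinary exact scoring - a sketch, reusing cosineSimilarity and StoredVector from the snippets above:

// Score the surviving candidates exactly, keep the top K.
// At ~200 items, a full sort is cheap.
function rankCandidates(
  query: Float32Array,
  candidates: StoredVector[],
  k = 10,
): { id: string; score: number }[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.values) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}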

What I Built

A VectorStore that wraps IndexedDB and handles the complexity - storing vectors with pre-computed metadata, querying by different indices, caching frequent searches, extracting top-K results.
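
To make that concrete, the schema might look something like this - store and index names are illustrative, not the project's actual ones:

// One object store, with secondary indices on the two filter keys so
// queries can narrow candidates without scanning every record.
function openVectorDB(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open("vector-store", 1);
    request.onupgradeneeded = () => {
      const db = request.result;
      const store = db.createObjectStore("vectors", { keyPath: "id" });
      store.createIndex("magnitude", "magnitude"); // range scans for the pre-filter
      store.createIndex("lshBucket", "lshBucket"); // exact lookups per LSH bucket
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}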

For developers, a React hook that abstracts everything:

const { insertVectors, fetchVectors } = useVectorStoreHook()

You don't think about IndexedDB transactions or LSH parameters. Just insert and search.
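
A full round trip might look like this - embed() and the argument shapes are hypothetical, standing in for whatever embedding function and options the real API takes:

// Hypothetical usage inside a component; embed() and the option
// names are placeholders, not the hook's documented API.
const { insertVectors, fetchVectors } = useVectorStoreHook();

// Index: embed each note and store it locally.
await insertVectors(notes.map((n) => ({ id: n.id, values: embed(n.text) })));

// Search: embed the query, get the closest stored vectors back.
const results = await fetchVectors(embed("trips to the coast"), { topK: 10 });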

What Surprised Me

The browser is more capable than we assume. We're still treating it like a thin client when it's a legitimate compute platform. IndexedDB, Float32Array, Web APIs - the primitives are there.

JavaScript isn't the bottleneck. IndexedDB transactions cost 2-3ms per query. Vector math? Sub-millisecond. The architecture matters more than the language.

Local-first doesn't mean solo. You could extend this with background sync to a cloud vector DB. Local-first with optional collaboration. The data lives on your device but can sync when you want it to.

Developer experience compounds. Hiding IndexedDB's awkward transaction model behind a clean hook made the difference between "possible" and "practical." Good abstractions multiply adoption.

The Trade-offs

This isn't replacing cloud vector databases for production systems. It can't handle millions of vectors, doesn't support real-time multi-user updates, and gives approximate results.

But for personal applications? Privacy-sensitive data? Offline-first tools? It's exactly what you need.

The local-first philosophy accepts different trade-offs than cloud-first architecture. You optimize for user control and offline capability over massive scale and real-time collaboration. Both models are valid. They solve different problems.

Local-First AI Apps

This opens up new application patterns.

Personal semantic search. Index your notes, documents, bookmarks locally. Search without sending anything to a server. Works offline. Instant results.

Privacy-preserving AI features. Medical apps, therapy journals, financial tools - anything where data sensitivity matters. The AI works, but nothing leaves the device.

Offline-first knowledge bases. Documentation, research libraries, reference materials. Available everywhere, no connectivity required.

Prototypes and experiments. Build AI features without backend infrastructure. Deploy a static site, everything works client-side.

The browser becomes viable infrastructure for semantic search when you accept approximate results, pre-compute expensive operations, use proper indices, and cache aggressively.

Closing Thoughts

There are natural extensions here. You could add CRDTs for sync - multiple devices, eventual consistency, conflict resolution. Better indexing for larger datasets - PCA for dimensionality reduction, quantization for memory efficiency. Integration with local models running in the browser, hardware-accelerated, to generate embeddings on-device. A fully local AI stack.

But even as-is, this changes what's possible.

We've accepted that AI features require cloud infrastructure because that's how the first wave of AI products worked. But the browser is more capable than we give it credit for. The primitives exist. The performance is there. What's missing isn't technical capability - it's the assumption that local-first AI infrastructure is viable.

Vector search was cloud-only. Now it's not.


Check out the source code, try the React hook, or play with the demo.

If you're building local-first applications and need semantic search, this might help. If you extend it with sync or scale it further, I'd love to hear about it.