engineering

The Architecture Behind CVC: SQLite, CAS Blobs, and Chroma

Jai Kumar Meena | March 9, 2026 | 10 min read
Tags: Architecture, SQLite, ChromaDB, Storage


Why Three Tiers?

Each tier in CVC's storage engine serves a distinct purpose:

SQLite (Tier 1): The workhorse. Stores the commit graph (parent pointers, branch references), metadata (timestamps, commit types, modes), and configuration. SQLite was chosen for zero-configuration deployment, ACID guarantees, and single-file portability. A single cvc.db file holds your entire commit graph.
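To make the Tier 1 idea concrete, here is a minimal sketch of a commit graph in SQLite. The table names, column names, and helper functions are assumptions chosen for illustration; the post does not show CVC's actual schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not CVC's real layout.
SCHEMA = """
CREATE TABLE commits (
    hash        TEXT PRIMARY KEY,
    parent_hash TEXT REFERENCES commits(hash),
    commit_type TEXT NOT NULL,
    mode        TEXT,
    created_at  TEXT NOT NULL
);
CREATE TABLE branches (
    name      TEXT PRIMARY KEY,
    head_hash TEXT NOT NULL REFERENCES commits(hash)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the single-file database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_commit(conn, commit_hash, parent_hash, commit_type, branch):
    """Insert a commit and advance the branch pointer in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO commits VALUES (?, ?, ?, ?, datetime('now'))",
            (commit_hash, parent_hash, commit_type, None),
        )
        conn.execute(
            "INSERT INTO branches (name, head_hash) VALUES (?, ?) "
            "ON CONFLICT(name) DO UPDATE SET head_hash = excluded.head_hash",
            (branch, commit_hash),
        )


def history(conn, branch):
    """Follow parent pointers from the branch head back to the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(hash, parent_hash, depth) AS (
            SELECT c.hash, c.parent_hash, 0 FROM commits c
            JOIN branches b ON b.head_hash = c.hash
            WHERE b.name = ?
            UNION ALL
            SELECT c.hash, c.parent_hash, w.depth + 1 FROM commits c
            JOIN walk w ON c.hash = w.parent_hash
        )
        SELECT hash FROM walk ORDER BY depth
        """,
        (branch,),
    ).fetchall()
    return [row[0] for row in rows]
```

Wrapping both writes in one transaction is where the ACID guarantee pays off: a branch reference never points at a commit row that failed to land.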

CAS Blobs (Tier 2): The vault. Every context snapshot is compressed with Zstandard (20-50% compression ratio) and stored as a content-addressed blob: the filename is the hash of the content. This provides automatic deduplication, immutability, and O(1) lookup by hash, and integrity can be verified by rehashing a blob and comparing the result with its filename. Anchor commits store full snapshots; intermediate commits store deltas against the nearest anchor.
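The content-addressing idea can be sketched in a few lines. This is an illustration under stated assumptions, not CVC's implementation: the class name BlobStore is hypothetical, SHA-256 is assumed as the hash (the post does not name one), the hash is taken over the uncompressed content, and zlib stands in for Zstandard so the sketch runs on the standard library alone.

```python
import hashlib
import zlib
from pathlib import Path


class BlobStore:
    """Hypothetical content-addressed store: one file per blob, named by hash."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        # Assumption: the address is the SHA-256 of the uncompressed content.
        digest = hashlib.sha256(data).hexdigest()
        path = self.root / digest
        if not path.exists():  # identical content is only ever stored once
            tmp = path.with_suffix(".tmp")
            # zlib is a stand-in for Zstandard in this sketch.
            tmp.write_bytes(zlib.compress(data))
            tmp.replace(path)  # rename into place so readers never see a partial blob
        return digest

    def get(self, digest: str) -> bytes:
        data = zlib.decompress((self.root / digest).read_bytes())
        # Integrity check: rehash the content and compare with the filename.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed integrity check")
        return data
```

Because the name is derived from the content, writing the same snapshot twice is a no-op, and any corruption on disk shows up as a hash mismatch on read.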

ChromaDB (Tier 3): The brain. Optional vector embeddings of every commit enable semantic search. Instead of matching keywords, you search by meaning: ask "When did the AI figure out the caching strategy?" and Chroma finds the relevant commits even if those exact words never appeared.

Why This Design?

Separation of concerns. The graph (Tier 1) tells you what happened and when. The blobs (Tier 2) store the snapshot content with verifiable integrity. The embeddings (Tier 3) let you search by meaning. Any tier can be rebuilt from the others. The system degrades gracefully: even without Chroma, CVC still works, falling back to text search.
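Graceful degradation of this kind might look like the sketch below. Everything here is an assumption for illustration: the search function, the commit dictionaries with "hash" and "message" keys, and the optional semantic callable (which would wrap a Chroma query in a real system) are hypothetical names, not CVC's API.

```python
from typing import Callable, Optional, Sequence


def search(
    commits: Sequence[dict],
    query: str,
    semantic: Optional[Callable[[str], Sequence[str]]] = None,
) -> list:
    """Return matching commits, preferring the semantic index when it works.

    `semantic` is a hypothetical hook that maps a query to commit hashes,
    best match first. When it is missing or raises, fall back to text search.
    """
    if semantic is not None:
        try:
            ranked = list(semantic(query))
            by_hash = {c["hash"]: c for c in commits}
            return [by_hash[h] for h in ranked if h in by_hash]
        except Exception:
            pass  # index unavailable or broken: fall through to text search
    needle = query.lower()
    return [c for c in commits if needle in c["message"].lower()]
```

The vector index is treated purely as an accelerator: the caller gets results either way, and only the ranking quality changes when Tier 3 is absent.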

Git compatibility. The architecture mirrors Git's: SQLite ≈ .git/refs/, CAS blobs ≈ .git/objects/, Chroma ≈ a smart search index. Developers who understand Git understand CVC instantly.
