Inside the Vallor Context Layer: Architecture, Performance, and Tradeoffs
A deep technical walkthrough of how Vallor builds and serves the contract intelligence layer. With benchmarks, code samples, and the boring details that actually matter.
Every contract intelligence product claims to be fast. Most are not. This post walks through how the Vallor Context Layer is actually built, the benchmarks we hold ourselves to, and the tradeoffs we made along the way.
The numbers up front
Here is the p50 and p99 latency for the four hot paths in our system, measured over the last 30 days of production traffic across 11 enterprise tenants:
Architecture at a glance
The system has three layers, each with a different scaling model:
- Ingestion plane: Parses incoming contracts (DOCX, PDF, email attachments), extracts structure, and writes normalized entities to Postgres. Batched, eventually consistent, scales horizontally.
- Context layer: The hot path. Holds the embedding index, the entity graph, and the per-tenant clause library. Lives in memory with a Postgres fallback. Read-heavy, latency-sensitive.
- Agent plane: Stateless workers that call into the context layer to do review, redlining, and obligation extraction. Auto-scales on queue depth.
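To make "auto-scales on queue depth" concrete, here is a minimal sketch of one common way to turn a queue depth into a worker count. The policy fields, the function name, and the numbers are assumptions for illustration; the post does not describe the actual scaling rule.

```typescript
// Hypothetical scaling policy; none of these names or values come from Vallor.
interface ScalingPolicy {
  targetJobsPerWorker: number; // backlog each worker is expected to absorb
  minWorkers: number;          // floor, so an idle system stays warm
  maxWorkers: number;          // ceiling, to cap cost
}

// Pick a worker count proportional to the backlog, clamped to the policy bounds.
function desiredWorkers(queueDepth: number, policy: ScalingPolicy): number {
  const needed = Math.ceil(queueDepth / policy.targetJobsPerWorker);
  return Math.min(policy.maxWorkers, Math.max(policy.minWorkers, needed));
}

const examplePolicy: ScalingPolicy = {
  targetJobsPerWorker: 10,
  minWorkers: 2,
  maxWorkers: 50,
};

console.log(desiredWorkers(0, examplePolicy));     // 2 (floor applies)
console.log(desiredWorkers(135, examplePolicy));   // 14
console.log(desiredWorkers(10000, examplePolicy)); // 50 (ceiling applies)
```

Because the workers are stateless, a rule like this can add or remove them without any handoff of in-flight state.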
The hot path, in code
Every clause lookup goes through a single function. We've optimized this one function harder than anything else in the codebase:
async function lookupClause(
  tenantId: string,
  query: ClauseQuery,
): Promise<ClauseHit[]> {
  // The cache key is scoped to the tenant, never to the query alone.
  const cacheKey = hashQuery(tenantId, query);
  const cached = await contextCache.get(cacheKey);
  if (cached) return cached;

  // Run the embedding search and the graph traversal in parallel.
  const [embedHits, graphHits] = await Promise.all([
    embeddingIndex.search(tenantId, query.text, { k: 20 }),
    entityGraph.traverse(tenantId, query.entityIds, { depth: 2 }),
  ]);

  const ranked = rerank(embedHits, graphHits, query.context);
  // Short TTL (60 s, expressed in milliseconds) so updated clauses do not stay stale.
  await contextCache.set(cacheKey, ranked, { ttl: 60_000 });
  return ranked;
}
Three things matter here:
- The embedding search and graph traversal run in parallel. Sequencing them was the single biggest perf regression we ever shipped.
- The cache key is (tenantId, query), never just query. This sounds obvious. We still got it wrong once.
- The TTL is short (60s) because contract libraries get updated and stale clauses are worse than slow lookups.
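The post does not show hashQuery itself, so the following is only a sketch of how a tenant-scoped cache key could be built. The ClauseQuery shape and the choice of SHA-256 over a canonical JSON array are assumptions, not Vallor's implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical query shape; the real ClauseQuery type is not shown in the post.
interface ClauseQuery {
  text: string;
  entityIds: string[];
  context?: string;
}

// Build a cache key that always includes the tenant, so identical queries
// from different tenants can never share a cache entry.
function hashQuery(tenantId: string, query: ClauseQuery): string {
  // Serialize fields in a fixed order and sort entity IDs so that
  // logically equal queries map to the same key.
  const canonical = JSON.stringify([
    tenantId,
    query.text,
    [...query.entityIds].sort(),
    query.context ?? null,
  ]);
  return createHash("sha256").update(canonical).digest("hex");
}

const sample: ClauseQuery = {
  text: "limitation of liability",
  entityIds: ["e2", "e1"],
};

// Same query, different tenants: the keys differ.
console.log(hashQuery("tenant-a", sample) !== hashQuery("tenant-b", sample)); // true
```

Serializing as a JSON array rather than concatenating strings avoids accidental collisions where one field's suffix looks like another field's prefix.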
Tradeoffs we made
Every system is a series of tradeoffs. Here are ours, with our work shown:
Why Postgres and not a dedicated vector DB?
We started on Pinecone. It worked. But the operational cost of running two stateful systems (Postgres for entities, Pinecone for embeddings) became real around 50M vectors per tenant. Moving to pgvector with an HNSW index cut p99 by 40% because we removed a network hop and could JOIN embeddings with entity metadata in a single query.
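As a rough illustration of "JOIN embeddings with entity metadata in a single query", here is a sketch that builds a parameterized pgvector query. The table and column names (clause_embeddings, clauses, and so on) are hypothetical, and cosine distance is an assumed choice; only the HNSW index syntax and the `<=>` operator come from pgvector itself.

```typescript
// Hypothetical schema, for illustration only:
//   clauses(id, tenant_id, title)
//   clause_embeddings(clause_id, tenant_id, embedding vector)
const CREATE_HNSW_INDEX = `
  CREATE INDEX ON clause_embeddings
  USING hnsw (embedding vector_cosine_ops)`;

interface SqlQuery {
  text: string;
  values: unknown[];
}

// pgvector accepts vectors as a text literal such as '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Nearest-neighbor search and metadata lookup in one round trip.
// `<=>` is pgvector's cosine distance operator.
function nearestClausesQuery(
  tenantId: string,
  embedding: number[],
  k: number,
): SqlQuery {
  const text = `
    SELECT c.id, c.title, e.embedding <=> $2::vector AS distance
    FROM clause_embeddings e
    JOIN clauses c ON c.id = e.clause_id
    WHERE e.tenant_id = $1
    ORDER BY e.embedding <=> $2::vector
    LIMIT $3`;
  return { text, values: [tenantId, toVectorLiteral(embedding), k] };
}

const example = nearestClausesQuery("tenant-a", [0.1, 0.2, 0.3], 20);
console.log(example.values); // [ 'tenant-a', '[0.1,0.2,0.3]', 20 ]
```

The returned object has the text/values shape that common Postgres clients accept for parameterized queries, so the embedding never has to be interpolated into the SQL string.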
Would we make the same call at 500M vectors per tenant? Probably not. But that's a problem for next year.
Why is the context layer in-memory?
Because Postgres p99 reads under load were 80ms and we needed sub-10ms for the agent plane to feel responsive. We hold a hot subset (last 90 days of activity per tenant) in process memory, hydrate on cold start, and write through to Postgres on every mutation. The recovery story is a Postgres replay, which we test weekly.
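The hydrate-on-cold-start and write-through behavior described above can be sketched roughly as follows. The DurableStore interface, the class names, and the in-memory stand-in for Postgres are all assumptions made so the example runs on its own; they are not Vallor's code.

```typescript
// Hypothetical durable store; in the post this role is played by Postgres.
interface DurableStore<V> {
  loadRecent(sinceMs: number): Promise<Map<string, V>>;
  read(key: string): Promise<V | undefined>;
  write(key: string, value: V): Promise<void>;
}

class WriteThroughCache<V> {
  private hot = new Map<string, V>();
  private store: DurableStore<V>;

  constructor(store: DurableStore<V>) {
    this.store = store;
  }

  // Cold start: load only the hot subset (for example, the last 90 days).
  async hydrate(windowMs: number, now: number = Date.now()): Promise<void> {
    this.hot = await this.store.loadRecent(now - windowMs);
  }

  // Serve from memory; fall back to the durable store on a miss.
  async get(key: string): Promise<V | undefined> {
    const hit = this.hot.get(key);
    if (hit !== undefined) return hit;
    const value = await this.store.read(key);
    if (value !== undefined) this.hot.set(key, value);
    return value;
  }

  // Write through: persist first, then update memory, so memory never
  // holds a value the durable store has not accepted.
  async set(key: string, value: V): Promise<void> {
    await this.store.write(key, value);
    this.hot.set(key, value);
  }
}

// In-memory stand-in for the durable store, for demonstration only.
class InMemoryStore<V> implements DurableStore<V> {
  private rows = new Map<string, { value: V; updatedAt: number }>();

  async loadRecent(sinceMs: number): Promise<Map<string, V>> {
    const recent = new Map<string, V>();
    this.rows.forEach((row, key) => {
      if (row.updatedAt >= sinceMs) recent.set(key, row.value);
    });
    return recent;
  }

  async read(key: string): Promise<V | undefined> {
    return this.rows.get(key)?.value;
  }

  async write(key: string, value: V): Promise<void> {
    this.rows.set(key, { value, updatedAt: Date.now() });
  }
}
```

Persisting before updating memory is one possible ordering. It trades a slower write for the guarantee that a replay from the durable store can always rebuild what readers have already seen.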
Why not a multi-tenant single-process model?
Two reasons. Blast radius (a bad tenant query should not page another tenant's GC) and audit posture ("is tenant A's data ever in the same process as tenant B's data" is a question we get on every security review, and the answer we want to give is "no"). We pay for it in idle memory. Worth it.
What we got wrong
"The first version of this system tried to be too clever. The current version is boring. Boring is good."
Three things we'd do differently if we started over today:
- Pick the boring database first. We spent two months on a custom storage layer. We threw it away. Postgres was always the answer.
- Instrument before optimizing. Half of our early "optimizations" made the median case faster and the tail worse. We didn't notice until we shipped p99 dashboards.
- Treat the context layer as a product, not a service. Once we gave it a name and a roadmap, every other team had a clearer place to push features. Before that, it was "the backend" and nobody owned it.
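To illustrate the "faster median, worse tail" trap from the second point, here is a small sketch that computes p50 and p99 with the nearest-rank method. The latency samples are made up for the example and are not Vallor's measurements.

```typescript
// Nearest-rank percentile over a sample of latencies (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up latencies for illustration only.
const beforeChange = [10, 10, 11, 11, 12, 12, 13, 13, 14, 20];
const afterChange = [6, 6, 7, 7, 8, 8, 9, 9, 10, 45];

console.log(percentile(beforeChange, 50), percentile(beforeChange, 99)); // 12 20
console.log(percentile(afterChange, 50), percentile(afterChange, 99));   // 8 45
```

In this toy data the median improves from 12 ms to 8 ms while p99 more than doubles, which is exactly the kind of change a median-only dashboard would report as a win.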
What's next
We're shipping three things in Q3 that build on this foundation: streaming review responses (so the GC sees the first redline in 200ms instead of waiting for the whole pass), cross-tenant benchmarking behind a privacy-preserving aggregate, and a programmatic context API for customers who want to build on top of the layer directly.
If any of this is interesting and you want to see it in action, book a walkthrough. We'll show you the dashboards too.
FAQ
What is the fastest way to evaluate the Vallor Context Layer?
Start with one live workflow, one contract repository, and one measurable outcome. Vallor can connect to existing systems and produce first answers in minutes, which lets teams test value before a long rollout.
Does Vallor replace an existing CLM?
Not always. Vallor can sit on top of an existing CLM, ERP, storage drive, or email system. Some teams use it as the intelligence layer while keeping their current system of record.
How does Vallor keep answers audit-ready?
Every answer is grounded in source agreements and linked back to the clause, obligation, counterparty, or workflow record behind it. The goal is plain-English speed with enterprise evidence.
Who usually owns this work?
Procurement often owns the business case. Legal owns risk and redlines. Finance and sales operations join when obligations, rebates, renewals, or revenue contracts are in scope.
What data does Vallor need to start?
A contract folder, CLM export, ERP connection, or shared drive is enough for the first pass. Additional systems improve context, but they are not required to begin.
