Something interesting is happening in the contract technology market right now. Anthropic just embedded Claude directly into Microsoft Word. Microsoft Copilot is pitching AI-assisted contract review. In response, virtually all of the CLM vendors that spent the last decade focused on pre-signature workflows are now suddenly, and loudly, proclaiming they “do post-signature AI.”
Setting aside the question of why it took the market so long to move to this highest-ROI area of contract management, that’s a lot of new entrants making a lot of big claims.
I’ve been in this space for over 20 years. Early on, I ran multiple deployments that organized and analyzed hundreds of thousands of documents. I often say that at this point, I know an “unfortunate” amount about contract data. So when I say that things that look great in a demo or POC break down in the face of the complexity and scale of an F500 deployment, it’s because I have lived it.
I’ve seen this movie before. And I want to offer enterprise buyers a clear-eyed warning: what looks great in a demo video, POC or an RFP answer is not the same as what works at enterprise scale, on real contracts, where the decisions actually matter.
The Demo Looks Great. The Question Is What Happens Next.
Everyone wants to “chat” with their contracts. And that’s precisely what most vendors demo to showcase their AI. Simply “ask any question.”
But here’s the uncomfortable truth about AI and contract intelligence: AI is very good at filling in blanks. It sounds intelligent. It produces confident-looking answers. And in a controlled demo or a POC environment, where you’re showing the vendor a couple of your cleanest, most straightforward contracts? It can look like magic.
Unfortunately, enterprise contract portfolios are not clean. They include:
- Contracts drafted on third-party paper your templates have never seen
- Complex pricing tables and terms buried across multiple exhibits with amendments that override them
- M&A-inherited portfolios spanning legacy systems, different naming conventions, and incomplete document hierarchies
- Terms that require reading four different sections and piecing together the answer
- Junk from legacy systems, including duplicates, missing documents, drafts and irrelevant files
- Counterparty name changes due to M&A, none of which match your CRM
- Multiple contracts for the same counterparty
When you throw that at a contract AI model (the contracts that actually govern your commercial relationships), the accuracy picture changes dramatically.
The Four Tiers of Contract Intelligence Accuracy
Everyone wants the magic AI button. People ask, “Why can’t I just put this in Claude?” It sounds so easy. Luckily, over the last several months, people have been taking a closer look at the output.
At Pramata, we’ve developed a simple way of thinking about accuracy in contract intelligence — and it has profound implications for how you evaluate any vendor making claims in this space.
The question every buyer should be asking every vendor right now is simple: where do you actually land on this spectrum — not in your demo, but on my contracts, at my volume, across my data types?
Ask for accuracy metrics. Challenge them with complex relationships that include dozens of amendments and multiple contracts for a given counterparty. Include orders and commercial documents, not just MSAs. And make sure to include plenty of junk (duplicates, missing documents, drafts, etc.) in any test. Finally, when that’s complete, ask them about their largest deployments of the technology you are deploying.
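To make "ask for accuracy metrics" concrete, here is a minimal Python sketch of scoring a vendor's extractions field by field against a sample you have verified by hand. The function name, field names, and sample values are all hypothetical, and exact string matching is a simplifying assumption; a real evaluation would need normalization rules for dates, currencies, and free text.

```python
from collections import defaultdict


def accuracy_by_field(extracted, ground_truth):
    """Return {field: fraction correct} across every contract in the verified sample."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for contract_id, truth_fields in ground_truth.items():
        predicted_fields = extracted.get(contract_id, {})
        for field_name, truth_value in truth_fields.items():
            total[field_name] += 1
            # A missing extraction is scored as wrong, not skipped.
            if predicted_fields.get(field_name) == truth_value:
                correct[field_name] += 1
    return {name: correct[name] / total[name] for name in total}


# Hypothetical hand-verified sample (illustrative values only).
ground_truth = {
    "c1": {"renewal_date": "2026-01-31", "payment_terms": "Net 45"},
    "c2": {"renewal_date": "2025-09-30", "payment_terms": "Net 30"},
}

# Hypothetical vendor output for the same two contracts.
vendor_output = {
    "c1": {"renewal_date": "2026-01-31", "payment_terms": "Net 30"},
    "c2": {"renewal_date": "2025-09-30"},
}

scores = accuracy_by_field(vendor_output, ground_truth)
for field_name, score in sorted(scores.items()):
    print(f"{field_name}: {score:.0%}")
```

Reporting by field matters because a single blended accuracy number can hide a field, such as payment terms here, that is wrong most of the time.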
Why the Uncanny Valley Is So Hard to Escape
After two decades of building purpose-built contract intelligence for Fortune 500 companies, we understand why this is hard. Contracts are a unique data set. And unlike traditional vendors that view contracts as a series of individual documents (something that works for generating contracts before signature), we view them in aggregate, as a “commercial relationship” with a counterparty.
This complexity is why general-purpose AI models hit a ceiling on contract data:
- Complex pricing tables are nearly impossible for standard models to interpret correctly
- Pricing terms are often modified by other clauses — volume commits, escalators, carve-outs — that an AI can easily miss
- Long contracts overflow context windows, increasing hallucination risk
- The “right” answer often requires looking across multiple documents and amendments simultaneously
- Understanding which contract is active requires sorting through order of precedence — something most AI models simply don’t do
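The order-of-precedence point in the list above can be illustrated with a small Python sketch. It is not a description of any vendor's implementation. It assumes the simplest possible rule, that a later-effective document overrides an earlier one term by term, whereas real agreements often carry explicit precedence clauses that change this. The `Document` class, term names, and values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Document:
    doc_id: str
    effective: date
    terms: dict = field(default_factory=dict)


def resolve_terms(documents, as_of):
    """Return {term: (value, source doc_id)} in force on the `as_of` date.

    Assumption: later-effective documents override earlier ones, term by term.
    """
    resolved = {}
    in_force = [doc for doc in documents if doc.effective <= as_of]
    for doc in sorted(in_force, key=lambda d: d.effective):
        for term, value in doc.terms.items():
            # Keep the source document so each answer links back to evidence.
            resolved[term] = (value, doc.doc_id)
    return resolved


# Hypothetical relationship: one MSA plus two amendments, stored out of order.
relationship = [
    Document("MSA", date(2020, 1, 1), {"payment_terms": "Net 30", "price_cap": "3%"}),
    Document("Amendment 2", date(2023, 3, 1), {"price_cap": "5%"}),
    Document("Amendment 1", date(2021, 6, 1), {"payment_terms": "Net 45"}),
]

current = resolve_terms(relationship, as_of=date(2024, 1, 1))
for term, (value, source) in sorted(current.items()):
    print(f"{term}: {value} (from {source})")
```

Reading only the MSA would give Net 30 and a 3% cap, neither of which is in force under these assumptions; the answer depends on the whole document family and the date you ask about.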
This isn’t a prompt engineering problem. It’s a methodology problem. Reaching the >95% threshold requires a deep architecture focused on relationships, not just contracts. It requires built-in quality metrics and human-in-the-loop QA processes. And it requires a model that can be easily tuned and trained to meet your specific business requirements without training on your data.
Not a general-purpose assistant that happens to be good at reading documents.
What to Demand: The P.A.S.S. Standard
When evaluating any contract intelligence solution — whether from a CLM vendor’s new “AI-powered” module or a general-purpose AI tool — hold them to what we call the P.A.S.S. standard:
Predictable. Consistent, repeatable extraction with known accuracy rates by data type. Not “our AI is great.” Show me the numbers, by field, across a representative sample of my actual contracts.
Accurate. Extracted data verified against source documents, with evidence linked back to the specific clause. Because in commercial decisions — pricing disputes, renewal negotiations, billing reconciliation — “good enough” accuracy is never actually good enough.
Scalable. The same quality at 500,000 contracts as at 500. Including the messy post-M&A portfolios. Including the third-party paper. Not just the clean templates you put through the POC.
Secure. SOC 2 compliant, role-based access, zero exposure to LLM training data, robust API / MCPs. Security that’s built in, not bolted on after the fact.
If a vendor can’t speak to all four of those criteria with specifics, that’s your answer.
The Bottom Line for Buyers
The rush of new entrants into contract intelligence is genuinely exciting. More competition, more innovation, and more attention on a problem that deserves it.
But enterprise leaders — General Counsel, CFOs, heads of Procurement — need to be clear-eyed about what they’re buying.
- A Word plugin that reviews the contract in front of you is not contract intelligence.
- A CLM that’s spent 15 years on pre-signature workflows and just added an “AI insights” tab is not contract intelligence.
- A chatbot that surfaces system metadata but doesn’t produce it is not contract intelligence.
- A POC on your 50 cleanest contracts is not a proof of enterprise-grade accuracy.
Real contract intelligence—the kind that powers your agents, your workflows, and your commercial decisions at scale—requires methodology, experience, and a relentless focus on accuracy that you simply cannot shortcut.
We’ve been doing this for over 20 years. We know where the Uncanny Valley hides. And we know what it takes to get above it.
Ask hard questions. Demand the P.A.S.S. standard. Don’t let a great demo become an expensive mistake.
Pramata delivers contract intelligence built for the agentic enterprise — Predictable, Accurate, Scalable, and Secure. Learn more at pramata.com.