Open Source · MIT

Make LLMs check their own homework.

Recursive self-verification that catches hallucinations before your users do.

81 tests passing · 5 strategies · 3 providers · $0 to run



LLMs hallucinate.

They generate highly plausible but factually incorrect statements with complete confidence.

Varity makes them verify.

Varity puts the model through an isolated, recursive interrogation loop and flags claims whose verdicts are unstable from one pass to the next.

Decompose: Extract Claims
Self-Verify: Depth-N Loop
Cross-Check: Independent Node
Score: Compute VSS
Correct: Final Output
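
The five stages above can be sketched end to end with stubbed verifiers. This is a minimal illustration under stated assumptions, not Varity's actual internals: `ClaimReport`, `decompose`, `run_pipeline`, the naive sentence split, and the agreement-fraction score are all hypothetical stand-ins, and the "Correct" step here simply drops flagged claims instead of rewriting them.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical type: a verifier returns True when it judges the claim correct.
Verifier = Callable[[str], bool]


@dataclass
class ClaimReport:
    claim: str
    verdicts: List[bool] = field(default_factory=list)
    score: float = 0.0


def decompose(response: str) -> List[str]:
    """Decompose: naive split of a response into sentence-level claims."""
    return [s.strip() for s in response.split(".") if s.strip()]


def run_pipeline(
    response: str,
    self_verify: Verifier,
    cross_check: Verifier,
    depth: int = 3,
    threshold: float = 0.5,
) -> Tuple[List[ClaimReport], str]:
    reports = []
    for claim in decompose(response):
        # Self-Verify: query the same verifier `depth` times.
        verdicts = [self_verify(claim) for _ in range(depth)]
        # Cross-Check: add one verdict from an independent verifier.
        verdicts.append(cross_check(claim))
        # Score: fraction of "correct" verdicts (a stand-in for the real VSS).
        score = sum(verdicts) / len(verdicts)
        reports.append(ClaimReport(claim, verdicts, score))
    # Correct: keep only claims at or above the threshold (simplification).
    kept = [r.claim for r in reports if r.score >= threshold]
    corrected = ". ".join(kept) + ("." if kept else "")
    return reports, corrected


if __name__ == "__main__":
    stub = lambda claim: "1998" not in claim  # toy verifier, not an LLM call
    text = ("Water boils at 100 degrees Celsius at sea level. "
            "India got its independence in 1998.")
    reports, corrected = run_pipeline(text, stub, stub)
    for r in reports:
        print(f"{r.score:.2f}  {r.claim}")
    print(corrected)
```

In a real run the two verifier callables would wrap LLM calls; keeping them as plain functions makes the control flow testable without an API key.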

Benchmark Performance

Varity was recently benchmarked against a rigorous dataset of common AI hallucinations, historical misconceptions, and scientific myths.

Detection Accuracy: 100%
Average VSS Score: 100%
Avg Flagged Confidence: 57.2%

Recent Test Results

Statement | Expected | Varity Result
India got its independence in 1998. | hallucination | [OK] Flagged
The Great Wall of China can be visible from space. | hallucination | [OK] Flagged
Einstein won the Nobel Prize for Relativity. | hallucination | [OK] Flagged
Water boils at 100 degrees Celsius at sea level. | factual | [OK] Verified

* Tested via varity.checker.Varity on openai/gpt-4o-mini
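
One plausible reading of "Detection Accuracy" is the share of statements whose Varity result matches the expected label. The sketch below applies that reading to the four sample rows shown above. The mapping from labels to results and the `detection_accuracy` helper are assumptions for illustration, and these four rows are only the displayed sample, not the full benchmark set.

```python
from typing import List, Tuple

# (statement, expected label, Varity result) copied from the sample rows.
ROWS: List[Tuple[str, str, str]] = [
    ("India got its independence in 1998.", "hallucination", "Flagged"),
    ("The Great Wall of China can be visible from space.", "hallucination", "Flagged"),
    ("Einstein won the Nobel Prize for Relativity.", "hallucination", "Flagged"),
    ("Water boils at 100 degrees Celsius at sea level.", "factual", "Verified"),
]

# Assumed mapping: a hallucination should be flagged, a fact should be verified.
EXPECTED_RESULT = {"hallucination": "Flagged", "factual": "Verified"}


def detection_accuracy(rows: List[Tuple[str, str, str]]) -> float:
    """Fraction of rows where the result matches the expected label."""
    if not rows:
        raise ValueError("no rows to score")
    hits = sum(EXPECTED_RESULT[expected] == result for _, expected, result in rows)
    return hits / len(rows)


if __name__ == "__main__":
    print(f"{detection_accuracy(ROWS):.0%}")
```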

Python

from varity import Varity

v = Varity(provider="anthropic", api_key="your-key")

result = v.check(
    prompt="When was the Eiffel Tower built?",
    response="The Eiffel Tower was built in 1887."
)

# Shows the wrong date flagged via VSS flip detection
print(result.flagged_claims)

# Auto-corrected string strictly constrained to verified facts
print(result.corrected_response)  # "The Eiffel Tower was built in 1889."

JavaScript

import { Varity } from "varity";

const v = new Varity({ provider: "anthropic", apiKey: "your-key" });

// Same prompt and response as in the Python example
const prompt = "When was the Eiffel Tower built?";
const response = "The Eiffel Tower was built in 1887.";

const result = await v.check(prompt, response);

console.log(result.flaggedClaims);

CLI

varity check --prompt "..." --response "..." --provider anthropic --key sk-...
🔒 Your key stays in your browser. Never sent to any server.

Verdict Stability Spectrum

0.0 - 0.3 Hallucinated
0.3 - 0.5 Uncertain
0.5 - 0.7 Accurate
0.7 - 0.9 High
1.0

Claims whose verdicts keep flipping across recursion depths land at the low end of the spectrum. Claims that hold the same verdict at every depth land at the high end.
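
To make the flip idea concrete, here is a minimal sketch that turns a sequence of verdicts into a score in [0, 1] and maps it onto the labeled ranges. The formula (one minus the fraction of adjacent verdict flips) is an assumption chosen for illustration, not Varity's documented VSS computation. The function names are hypothetical, range boundaries are treated as half-open, and scores above 0.9, which the scale leaves unlabeled, are grouped with "High" here.

```python
from typing import List, Sequence, Tuple

# Upper bounds (exclusive) and labels taken from the spectrum above.
BANDS: Sequence[Tuple[float, str]] = (
    (0.3, "Hallucinated"),
    (0.5, "Uncertain"),
    (0.7, "Accurate"),
)


def stability_score(verdicts: List[bool]) -> float:
    """Assumed formula: 1 - (adjacent flips / possible flips)."""
    if len(verdicts) < 2:
        raise ValueError("need at least two verdicts to measure stability")
    flips = sum(a != b for a, b in zip(verdicts, verdicts[1:]))
    return 1.0 - flips / (len(verdicts) - 1)


def band(score: float) -> str:
    """Map a score to a label; 0.7 and above is treated as "High" (assumption)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be within [0, 1]")
    for upper, label in BANDS:
        if score < upper:
            return label
    return "High"


if __name__ == "__main__":
    for verdicts in ([True, True, True, True],
                     [True, True, False, False],
                     [True, False, True, False]):
        s = stability_score(verdicts)
        print(verdicts, round(s, 2), band(s))
```

Note that this toy score only measures consistency: a claim judged false at every depth is as stable as one judged true every time, so a real scorer would also need the verdict itself.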

Recursive Depth

Goes much deeper than surface semantic checks. Varity actively forces the LLM to verify its own verification until stability is achieved.
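
A rough sketch of "verify the verification until stable": each level receives the previous verdict and judges again, and recursion stops once the last few verdicts agree or a depth cap is reached. The `ask` callable, the `window` stopping rule, and `max_depth` are hypothetical choices for illustration; the source does not specify how Varity decides that stability has been achieved.

```python
from typing import Callable, List, Optional

# Hypothetical signature: ask(claim, previous_verdict, depth) -> verdict.
Ask = Callable[[str, Optional[bool], int], bool]


def verify_recursive(
    claim: str,
    ask: Ask,
    max_depth: int = 5,
    window: int = 2,
    history: Optional[List[bool]] = None,
) -> List[bool]:
    """Recurse until the last `window` verdicts agree or `max_depth` is hit."""
    history = [] if history is None else history
    previous = history[-1] if history else None
    history.append(ask(claim, previous, len(history) + 1))

    stable = len(history) >= window and len(set(history[-window:])) == 1
    if stable or len(history) >= max_depth:
        return history
    return verify_recursive(claim, ask, max_depth, window, history)


if __name__ == "__main__":
    steady = lambda claim, prev, depth: True             # never changes its mind
    flip_flop = lambda claim, prev, depth: depth % 2 == 1  # flips at every depth
    print(verify_recursive("claim A", steady))     # stops early
    print(verify_recursive("claim B", flip_flop))  # runs to max_depth
```

The returned verdict history is what a stability score would then be computed from: a short, uniform history signals a stable claim, while a history that runs to the cap while alternating signals an unstable one.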

Strict BYOK

Your key, your cost, your control. Works natively with the standard Claude, GPT, and Gemini API endpoints. No middleman routers.

Zero Cost Footprint

100% MIT Licensed. No proprietary SaaS servers to pay for. Running it against Gemini's free tier costs literally $0.

Install the core

pip install varity
npm install varity
brew install varity (Soon)