BENCHMARK ACTIVE [v3.0]
What is LegalChain?

LegalChain is the public benchmark intro site for Legal-10 and the main showcase for AGChain. AGChain is the benchmark authoring platform under development; LegalChain is where the benchmark, methodology, leaderboard, and pitch are published today.

"The transition from atomic prompts to stress-tests that evaluate multi-step, complex reasoning in chained, stateful conditions requires new standards."

Technical Baseline

AGChain provides the evaluation infrastructure; LegalChain is the public benchmark surface running on top of it. Unlike traditional benchmarks that test isolated questions, this stack evaluates 10-step chained reasoning in which an error at one step carries into the steps that follow, and it verifies citations against a sealed universe of 27,733 Supreme Court opinions and 378,938 extracted citation occurrences. A structural no-leak architecture and deterministic synthetic traps let it distinguish grounded legal reasoning from hallucination in high-stakes workflows.
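The chained, citation-checked evaluation described above can be illustrated with a minimal Python sketch. It is not the AGChain implementation: every name here (SEALED_UNIVERSE, StepResult, run_chain, the two toy steps) is hypothetical, the sealed universe is reduced to two example citation strings, and the chain has two steps where the benchmark uses ten. It shows only the general idea, which is that each step receives the answers of earlier steps, and any citation not found in the sealed set is flagged.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for the sealed citation universe. The benchmark
# describes 27,733 opinions; two example U.S. Reports citations are used here.
SEALED_UNIVERSE = frozenset({"347 U.S. 483", "384 U.S. 436"})


@dataclass
class StepResult:
    step: int
    answer: str
    citations: list[str]
    fabricated: list[str] = field(default_factory=list)

    @property
    def passed_gate(self) -> bool:
        # The step passes only if every citation was found in the sealed set.
        return not self.fabricated


# A step receives all prior answers and returns (answer, citations).
Step = Callable[[list[str]], tuple[str, list[str]]]


def run_chain(steps: list[Step]) -> list[StepResult]:
    state: list[str] = []
    results: list[StepResult] = []
    for i, step in enumerate(steps, start=1):
        answer, citations = step(state)
        fabricated = [c for c in citations if c not in SEALED_UNIVERSE]
        results.append(StepResult(i, answer, citations, fabricated))
        # The answer is carried forward as-is, so an early error stays in
        # the state that later steps build on.
        state.append(answer)
    return results


def step_one(state: list[str]) -> tuple[str, list[str]]:
    return "Segregated public schools violate equal protection.", ["347 U.S. 483"]


def step_two(state: list[str]) -> tuple[str, list[str]]:
    # Builds on step one and cites a location that is not in the sealed set.
    return f"Extending the prior holding: {state[-1]}", ["999 U.S. 999"]


results = run_chain([step_one, step_two])
for r in results:
    print(r.step, r.passed_gate, r.fabricated)
```

In this sketch the gate is a plain set-membership test, which keeps the check deterministic; how the real benchmark normalizes citation strings or scores a failed gate is not stated in the text above and is left out.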

First Chained Legal Benchmark
Deterministic Reference Pack
Shepard's as Relevance Oracle
Chain-Faithful Evaluation
Citation Integrity Gate
Structural No-Leak Architecture
Selection Manifest as Contract
Two-Layer Architecture

Top Performance

Model Composite S8 Integrity Latency Cost
Leaderboard preview
Open the full leaderboard to view results.
Leader: GPT-4o (91%)