Keep shipping AI features without billing shocks.
Catch runaway agent loops, auto-downshift models, and cap daily spend — before the bill hits.
Connect AdaptOrch MCP to your agent stack in minutes: shadow-mode first, budget caps, and automatic model downshift — no framework lock-in.
- #01 238 TESTS PASSING
- #02 2.1× NATIVE SPEEDUP
- #03 1.07× PAPER PARITY
- #04 13 AGENT ADAPTERS
- VENUE: ISAIA 2026
- PREPRINT: arXiv · 2602.16873
- AUTHOR: Yu Geun-Bin · DMAE
- LAB: EGG GI Lab
Six heads. One accountable router.
Each call earns a τ score, then routes to the cheapest head that can still verify the answer. The output carries a ledger — not just text.
- α₁ SYNTHESIZE: Plan-then-execute reasoning loops with cross-turn contradiction detection. Catch inconsistencies before they propagate.
- α₂ TOOL-USE: Schema-validated function calls. Budget gates enforce per-tool spend caps, per-tenant rate limits, and dry-run sandboxing.
- α₃ RETRIEVE: Hybrid BM25 + embedding retrieval with verifier pass. Recall stays above paper parity (1.07×) on your own corpora.
- α₄ GENERATE: Multi-model generation with downshift policy. High-τ tasks go to frontier, low-τ tasks get cheap-and-fast.
- α₅ CODE: Sandboxed execution with AST lint + unit-test gating. Contradictions are bugs; we catch them before you ship.
- α₆ VISION: Multimodal grounding with OCR + layout priors. Every claim is anchored to a bounding box, every bbox is verifiable.
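The downshift idea behind α₄ GENERATE can be sketched in a few lines. This is a minimal illustration, not AdaptOrch's actual API: the `ModelTier` type, the tier names, the prices, and the τ ceilings are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str             # hypothetical tier label
    cost_per_call: float  # illustrative price in USD
    max_tau: float        # highest tau score this tier is trusted with


# Assumed tiers for illustration only, not published AdaptOrch values.
TIERS = [
    ModelTier("cheap-and-fast", 0.001, 0.40),
    ModelTier("mid", 0.010, 0.75),
    ModelTier("frontier", 0.060, 1.00),
]


def pick_tier(tau: float, tiers: list[ModelTier] = TIERS) -> ModelTier:
    """Return the cheapest tier whose max_tau covers the given tau score."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    eligible = [t for t in tiers if tau <= t.max_tau]
    return min(eligible, key=lambda t: t.cost_per_call)
```

With these assumed ceilings, a call scored at τ = 0.2 lands on the cheap tier and a call scored at τ = 0.9 goes to frontier.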
Simple packaging.
Predictable AI operations.
Built for solo builders and startup teams, with a predictable overage policy.
† Provider LLM costs are BYOK/pass-through. Overage is predictable and scoped before enforce-mode rollout.
Find the waste before finance finds the invoice.
Move the sliders to estimate what shadow-mode can recover from retries, loops, and overpowered model calls before you flip enforce on.
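For readers who prefer arithmetic to sliders, the estimate can be approximated as below. This is a rough sketch under stated assumptions: the function name, its parameters, and the formula (retry and loop spend treated as fully recoverable, overpowered calls recovered only by the downshift saving) are illustrative and are not the calculator's actual model.

```python
def estimate_recoverable_spend(
    monthly_spend: float,
    retry_share: float,
    loop_share: float,
    overpowered_share: float,
    downshift_saving: float,
) -> float:
    """Estimate the monthly spend (same currency as monthly_spend) that could be recovered.

    retry_share, loop_share: fractions of spend lost to redundant retries
        and runaway loops (assumed fully recoverable in this sketch).
    overpowered_share: fraction of spend on calls a cheaper model could serve.
    downshift_saving: fraction saved on those calls after downshifting.
    """
    shares = (retry_share, loop_share, overpowered_share, downshift_saving)
    if any(not 0.0 <= s <= 1.0 for s in shares):
        raise ValueError("all shares must be in [0, 1]")
    if retry_share + loop_share + overpowered_share > 1.0:
        raise ValueError("waste shares cannot exceed total spend")
    recoverable = retry_share + loop_share + overpowered_share * downshift_saving
    return round(monthly_spend * recoverable, 2)
```

As a worked example, $2,000 of monthly spend with 5% retries, 3% loops, and 40% overpowered calls that become 50% cheaper after downshift gives 0.05 + 0.03 + 0.40 × 0.5 = 0.28, or about $560 recoverable.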
Watch one request move through the kernel.
Pick a workload. AdaptOrch classifies τ, routes to the right head, then explains why the call should shadow, enforce, or fail closed.
Answer from the policy corpus and cite the clause.
- CLASSIFY: τ 0.71
- ROUTE: RETRIEVE + VISION
- VERIFY: NLI + citation check
High τ and grounded citations: enforce with provenance export.
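The shadow / enforce / fail-closed decision in this walkthrough could look roughly like the sketch below. The `Action` enum, the `decide` function, and the 0.6 threshold are assumptions made for illustration; the source only states that high τ with grounded citations leads to enforce.

```python
from enum import Enum


class Action(Enum):
    SHADOW = "shadow"
    ENFORCE = "enforce"
    FAIL_CLOSED = "fail_closed"


def decide(
    tau: float,
    grounded: bool,
    contradiction: bool,
    enforce_threshold: float = 0.6,  # assumed cutoff, not a documented value
) -> tuple[Action, str]:
    """Pick an action for one call and return it with a short explanation."""
    if contradiction:
        return Action.FAIL_CLOSED, "verification found a contradiction"
    if tau >= enforce_threshold and grounded:
        return Action.ENFORCE, "high tau and grounded citations"
    return Action.SHADOW, "not enough evidence to enforce, observe only"
```

Under these assumptions, the policy-corpus request above (τ 0.71, citations grounded, no contradiction) resolves to `Action.ENFORCE`.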
From prompt to ledger in seven gates.
Input is normalized, scored, routed, dispatched, executed, verified, and returned with provenance. If a gate fails, the call stops before the user sees drift.
- 01 INGEST: Normalize message, extract intent and entity frame.
- 02 CLASSIFY: CSE-Lite scores τ ∈ [0, 1]. Cheap; deterministic; ~3 ms.
- 03 ROUTE: Map τ to the head with best cost-to-reliability profile.
- 04 DISPATCH: Shadow or enforce. Budget cap gate. Dry-run sandbox.
- 05 EXECUTE: Model, tool, retrieval, code, or vision, whichever head won.
- 06 AGGREGATE: Contradiction scan across turns. Fail-closed on mismatch.
- 07 RETURN: Verified output + provenance trail + cost ledger entry.
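The seven gates can be pictured as an ordered chain where any gate may stop the call. The sketch below is a toy model, assuming hypothetical names (`Call`, `GateFailure`, `run_gates`) and placeholder logic in every gate; in particular, the scoring line is a stand-in and has nothing to do with how CSE-Lite actually computes τ.

```python
from dataclasses import dataclass, field
from typing import Optional


class GateFailure(Exception):
    """Raised by a gate to stop the call before anything is returned."""


@dataclass
class Call:
    message: str
    tau: Optional[float] = None
    head: Optional[str] = None
    output: Optional[str] = None
    trail: list[str] = field(default_factory=list)  # gates passed so far


def ingest(call: Call) -> None:
    call.message = " ".join(call.message.split())  # normalize whitespace
    if not call.message:
        raise GateFailure("empty input")


def classify(call: Call) -> None:
    call.tau = min(len(call.message) / 100, 1.0)  # placeholder score, NOT CSE-Lite


def route(call: Call) -> None:
    call.head = "RETRIEVE" if call.tau >= 0.5 else "GENERATE"


def dispatch(call: Call) -> None:
    if call.head is None:  # budget and shadow/enforce checks would live here
        raise GateFailure("no head selected")


def execute(call: Call) -> None:
    call.output = f"[{call.head}] {call.message}"  # stand-in for real work


def aggregate(call: Call) -> None:
    if not call.output:  # a contradiction scan would live here
        raise GateFailure("nothing to verify")


def finish(call: Call) -> None:
    pass  # a real RETURN gate would attach provenance and a ledger entry


GATES = [
    ("INGEST", ingest), ("CLASSIFY", classify), ("ROUTE", route),
    ("DISPATCH", dispatch), ("EXECUTE", execute),
    ("AGGREGATE", aggregate), ("RETURN", finish),
]


def run_gates(call: Call) -> Call:
    """Run every gate in order; a GateFailure aborts the whole call."""
    for name, gate in GATES:
        gate(call)
        call.trail.append(name)
    return call
```

The `trail` list doubles as a minimal provenance record: if a gate raises, it shows exactly how far the call got.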
Drop in. Keep your stack.
AdaptOrch is a kernel, not a framework. Place it beside LangChain, CrewAI, AutoGen, or your own loop; keep your prompts, tools, and memory.
Python 3.11 · async-first kernel
Deterministic hot-path in pure CPython. Every request is typed, traced, and reproducible. 238 unit + integration tests gate every commit — no flake budget.
- 2.1× native speedup over framework equivalents
- 13 agent adapters (OpenAI · Anthropic · Groq · local)
- 1.07× paper parity on published benchmarks
- 100% Groq throughput success across 50k canary calls
Supabase · Upstash · Railway
Postgres is the system of record for every ledger, policy, and trace. Redis handles rate limits, dedup, and idempotency keys. Railway runs everything behind a single deploy command. No Kafka. No Kubernetes. No YAML tax.
- Row-level tenant isolation with enforced RLS
- Per-agent budget caps and burn alerts
- Shadow → enforce promotion with one switch
- Signed provenance chain per request
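A per-agent budget cap with a burn alert can be sketched as follows. This is an in-memory illustration only: the `AgentBudget` class, its field names, and the 80% alert ratio are assumptions, and a deployment following the stack described above would presumably keep these counters in Redis or Postgres rather than in process memory.

```python
from dataclasses import dataclass


@dataclass
class AgentBudget:
    daily_cap_usd: float
    alert_ratio: float = 0.8  # assumed burn-alert threshold
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> str:
        """Record one call's cost and return 'ok', 'alert', or 'blocked'."""
        if self.spent_usd + cost_usd > self.daily_cap_usd:
            return "blocked"  # the cap would be exceeded, so the call is refused
        self.spent_usd += cost_usd
        if self.spent_usd >= self.alert_ratio * self.daily_cap_usd:
            return "alert"  # still allowed, but burn is near the cap
        return "ok"
```

A blocked call leaves `spent_usd` unchanged, so the cap is a hard ceiling rather than a soft warning.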
React · Tailwind · shadcn
The dashboard is intentionally thin. It reads the same ledger your agents write — zero side effects, zero hidden mutations. Everything observable is also auditable and exportable.
- Recharts + React Flow for live topology and drift
- Role-scoped views (Builder / Team / SRE)
- One-click provenance CSV + signed JSONL export
- WCAG 2.2 AA · full keyboard · high-contrast paper
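One common way to make a JSONL export tamper-evident is to sign each line with an HMAC and chain it to the previous signature. The sketch below shows that general technique using Python's standard library; the function names and record layout are hypothetical, and nothing here describes AdaptOrch's actual export format or signing scheme.

```python
import hashlib
import hmac
import json


def _sign(record: dict, key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def export_signed_jsonl(entries: list[dict], key: bytes) -> str:
    """Serialize entries as JSONL, each line signed and linked to the previous one."""
    lines = []
    prev_sig = ""
    for entry in entries:
        record = {"entry": entry, "prev": prev_sig}
        sig = _sign(record, key)
        lines.append(json.dumps({**record, "sig": sig}, sort_keys=True))
        prev_sig = sig
    return "\n".join(lines)


def verify_signed_jsonl(text: str, key: bytes) -> bool:
    """Return True only if every signature and every chain link checks out."""
    prev_sig = ""
    for line in text.splitlines():
        row = json.loads(line)
        sig = row.pop("sig")
        if row["prev"] != prev_sig:
            return False  # a line was removed, reordered, or inserted
        if not hmac.compare_digest(sig, _sign(row, key)):
            return False  # the line's content was altered
        prev_sig = sig
    return True
```

Because each line commits to the signature before it, editing or dropping any earlier entry invalidates everything that follows.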
- DAY 01 · Drop-in shadow: Wrap your existing agent loop. Every call is mirrored through CSE-Lite in shadow mode. Zero changes to output. Full ledger population.
- DAY 07 · Contradiction report: Review mismatch list. Decide which heads to graduate to enforce. CI gates wire up. Team dashboard exposed.
- DAY 30 · Enforce + export: Budget caps active. Contradiction kills the call, not the user. Monthly provenance export becomes your paper trail.
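The Day 01 step, wrapping an existing loop without changing its output, can be illustrated with a plain decorator. This is a simplified synchronous sketch; the `shadow` decorator, the `record` observer, and the in-memory `ledger` list are hypothetical names and do not represent the AdaptOrch SDK, which the page describes as async-first.

```python
import functools
from typing import Any, Callable


def shadow(observer: Callable[[tuple, dict, Any], None]) -> Callable:
    """Mirror every call to an observer without changing the wrapped function's result."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            try:
                observer(args, kwargs, result)
            except Exception:
                pass  # shadow mode must never alter or break the original call
            return result
        return wrapper
    return decorator


ledger: list[dict] = []  # stand-in for a real ledger table


def record(args: tuple, kwargs: dict, result: Any) -> None:
    ledger.append({"args": args, "kwargs": kwargs, "result": result})


@shadow(record)
def my_agent_step(prompt: str) -> str:
    return prompt.upper()  # stand-in for an existing agent call
```

The key property is that an observer failure is swallowed: the caller always receives exactly what the unwrapped function would have returned.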
"Before AdaptOrch our retrieval recall and our generation output disagreed silently. Shadow mode lit up the contradictions in week one. We shipped enforce in week three. That is the whole story."
· Staff SRE, a fintech multi-agent team (anonymized)
Expected questions.
If something here is wrong, we want to know. Email us and we will publish the correction.
Q.01 Is this another agent framework?
No. AdaptOrch is a kernel — a thin deterministic verification and routing layer that sits next to your existing framework (LangChain / CrewAI / AutoGen / custom). It does not own your prompts, your tool schema, or your memory.
Q.02 What does "math, not vibes" actually mean?
Every dispatch is scored (τ ∈ [0,1]) and every verification is logged. When an answer contradicts a prior turn we fail closed and emit a structured diagnostic, not a vibe. That is auditable. That is math.
Q.03 Is my vendor choice locked in?
Never. AdaptOrch carries 13 adapters (OpenAI, Anthropic, Groq, Cohere, Fireworks, Together, local vLLM, Ollama, and more). Switch at the router level without touching prompts.
Q.04 How slow is the router?
CSE-Lite classification is ~3 ms on a shared CPU. Full kernel overhead per call is < 12 ms p95. Benchmarked 2.1× faster than framework-native agent loops.
Q.05 Do I have to rewrite prompts?
No. Your prompts stay your prompts. AdaptOrch observes input / output shapes and verifies consistency. You decide which heads to graduate from shadow to enforce, one at a time.
Q.06 What about data residency?
Self-host the kernel. The dashboard ships as a separate container. Ledgers live in your own Postgres. No telemetry leaves your perimeter unless you opt in.
Q.07 Does AdaptOrch support streaming?
Yes. Token-stream passthrough with inline verification. The kernel buffers only the minimum needed for contradiction detection; first-token latency is preserved.
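One way to picture passthrough with a bounded buffer is a generator that keeps only a sliding window of recent tokens and halts the stream if a check fails. This is a conceptual sketch, not the kernel's implementation: `verified_stream`, `StreamHalted`, the `check` callback, and the window size of 8 are all assumptions.

```python
from collections import deque
from typing import Callable, Iterable, Iterator


class StreamHalted(Exception):
    """Raised when inline verification fails partway through a stream."""


def verified_stream(
    tokens: Iterable[str],
    check: Callable[[str], bool],
    window: int = 8,  # assumed buffer size, in tokens
) -> Iterator[str]:
    """Yield tokens as they arrive, checking only a small sliding window of recent text."""
    recent: deque = deque(maxlen=window)
    for token in tokens:
        recent.append(token)
        if not check("".join(recent)):
            raise StreamHalted("verification failed mid-stream")
        yield token  # passed the check, forward immediately
```

Each token is forwarded as soon as its window passes the check, so nothing waits for the full response; memory use is bounded by the window size.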
Q.08 Is there a free tier?
Open-source/self-host use remains free. Hosted users can start on Starter at $0/month with 1,000 monthly calls, then upgrade to Pro at $39/month or Team at $149/month when enforce controls and shared quota are needed. Enterprise and on-premise deployments are scoped via direct contract only.
Q.09 How do contradictions get caught?
Three levels. (i) Structural: schema and type mismatches. (ii) Semantic: NLI over prior-turn claims. (iii) Behavioral: tool-call vs. final-answer divergence. Tunable per-tenant.
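The three levels can be sketched as three small checks. Everything here is illustrative: the function names are hypothetical, the behavioral check is a naive substring test, and `contradicts` is a caller-supplied stand-in for a real NLI model, which this sketch does not include.

```python
from typing import Callable


def structural_ok(payload: dict, schema: dict[str, type]) -> bool:
    """Level i: every schema field is present with the expected type."""
    return all(isinstance(payload.get(k), t) for k, t in schema.items())


def semantic_ok(
    prior_claims: list[str],
    answer: str,
    contradicts: Callable[[str, str], bool],  # stand-in for an NLI model
) -> bool:
    """Level ii: no prior-turn claim is contradicted by the answer."""
    return not any(contradicts(claim, answer) for claim in prior_claims)


def behavioral_ok(tool_result: str, final_answer: str) -> bool:
    """Level iii: the final answer must reflect what the tool returned (naive check)."""
    return tool_result in final_answer


def failed_levels(
    payload: dict,
    schema: dict[str, type],
    prior_claims: list[str],
    answer: str,
    tool_result: str,
    contradicts: Callable[[str, str], bool],
) -> list[str]:
    """Return the names of every level that failed, in order."""
    failures = []
    if not structural_ok(payload, schema):
        failures.append("structural")
    if not semantic_ok(prior_claims, answer, contradicts):
        failures.append("semantic")
    if not behavioral_ok(tool_result, answer):
        failures.append("behavioral")
    return failures
```

Returning the list of failed levels, rather than a single boolean, is what makes a structured diagnostic possible when a call fails closed.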
Q.10 Where do I report a bug?
github.com/dmae97/Adaptorch-MCP/issues. First-response SLA for Pro/Team is < 24h. For security disclosures, email ict03@rfems.com.
Ship agents with a kill switch.
Start in shadow. Promote only the heads that prove useful. Export the ledger when security, finance, or a customer asks why an agent acted.
- Starter at $0/month, no card required
- Drop-in next to LangChain / CrewAI / custom loops
- Self-host available — your Postgres, your rules