FIG. I · ARCHITECTURE arXiv : 2602.16873
RELIABILITY KERNEL v0.1.0

Keep shipping AI features without surprise bills.

Catch runaway agent loops, auto-downshift models, and cap daily spend — before the bill hits.

Connect AdaptOrch MCP to your agent stack in minutes: shadow-mode first, budget caps, and automatic model downshift — no framework lock-in.

  • #01 238 TESTS PASSING
  • #02 2.1× NATIVE SPEEDUP
  • #03 1.07× PAPER PARITY
  • #04 13 AGENT ADAPTERS
Six-head heterogeneous routing diagram with central ROUTER and six agent nodes: hexagon SYNTHESIZE, square TOOL-USE, triangle RETRIEVE, circle GENERATE, diamond CODE, pentagon VISION. Vermillion active path highlighted from router to top hexagon node.
FIG. I — SIX-HEAD HETEROGENEOUS ROUTING · ACTIVE DISPATCH α₁ → α₂ HIGHLIGHTED
  • VENUE · ISAIA 2026
  • PREPRINT · arXiv · 2602.16873
  • AUTHOR · Yu Geun-Bin · DMAE
  • LAB · EGG GI Lab
FIG. III · HETEROGENEOUS HEADS τ · 0.15 → 1.00

Six heads. One accountable router.

Each call earns a τ score, then routes to the cheapest head that can still verify the answer. The output carries a ledger — not just text.

  1. #01 · τ≈0.30

    α₁ · SYNTHESIZE

    Plan-then-execute reasoning loops with cross-turn contradiction detection. Catch inconsistencies before they propagate.

  2. #02 · τ≈0.45

    α₂ · TOOL-USE

    Schema-validated function calls. Budget gates enforce per-tool spend caps, per-tenant rate limits, and dry-run sandboxing.

  3. #03 · τ≈0.20

    α₃ · RETRIEVE

    Hybrid BM25 + embedding retrieval with verifier pass. Recall stays above paper parity (1.07×) on your own corpora.

  4. #04 · τ≈0.55

    α₄ · GENERATE

    Multi-model generation with downshift policy. High-τ tasks go to frontier, low-τ tasks get cheap-and-fast.

  5. #05 · τ≈0.70

    α₅ · CODE

    Sandboxed execution with AST lint + unit-test gating. Contradictions are bugs — we catch them before you ship.

  6. #06 · τ≈0.85

    α₆ · VISION

    Multimodal grounding with OCR + layout priors. Every claim is anchored to a bounding box, every bbox is verifiable.
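The rule described above (score τ, then pick the cheapest option that can still handle the task) might look roughly like the sketch below, shown for the downshift policy of the GENERATE head. The `Tier` type, the tier names, the `max_tau` ceilings and the relative costs are all illustrative assumptions; none of them come from AdaptOrch's published API or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_tau: float        # hardest task score this tier is trusted with (assumed)
    cost_per_call: float  # relative cost units (assumed)


# Hypothetical tiers, ordered only for readability.
TIERS = [
    Tier("cheap-and-fast", 0.40, 1.0),
    Tier("mid", 0.70, 4.0),
    Tier("frontier", 1.00, 15.0),
]


def pick_tier(tau: float, tiers=TIERS) -> Tier:
    """Return the cheapest tier whose ceiling covers the task score."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    eligible = [t for t in tiers if t.max_tau >= tau]
    return min(eligible, key=lambda t: t.cost_per_call)


print(pick_tier(0.2).name, pick_tier(0.55).name, pick_tier(0.85).name)
```

Filtering by capability first and minimizing cost second keeps the downshift decision a pure function of τ, which is what makes it easy to log and replay.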

FIG. IV · PRICING MATRIX cost ∝ τ · ADAPTIVE

Simple packaging.
Predictable AI operations.

Plans for solo builders and startup teams, with a predictable overage policy.

PLAN · POSITIONING · MONTHLY · INCLUDES · ACTION
01 STARTER · Proof-of-value diagnostics · $0/month · 1,000 calls monthly; Shadow mode only; Basic event log · Get started →
03 TEAM · Startup-scale operations · $149/month · 100,000 calls monthly; Shared quota governance; Reliability signal diagnostics; Audit-ready events · Choose Team →

† Provider LLM costs are BYOK/pass-through. Overage is predictable and scoped before enforce-mode rollout.

FIG. V · ROI · MONTE CARLO input → dispatch → savings

Find the waste before finance finds the invoice.

Move the sliders to estimate what shadow-mode can recover from retries, loops, and overpowered model calls before you flip enforce on.

#01 · MONTHLY SAVINGS $128 Conservative est. @ 25% waste + ops
#02 · RECOMMENDED PLAN Pro Lowest cost-to-reliability fit
#03 · PAYBACK PERIOD 10 days Avg. break-even horizon
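The arithmetic behind a savings and payback estimate of this kind can be sketched as a simple point estimate. This is not the Monte Carlo model the figure title refers to, and the function name, its parameters and the example inputs are assumptions chosen for illustration; they do not reproduce the figures shown above.

```python
def estimate_savings(monthly_llm_spend, waste_fraction, recovery_rate, plan_price):
    """Return (net monthly savings, payback period in days).

    monthly_llm_spend: provider bill per month, in dollars
    waste_fraction:    share of spend lost to retries, loops, oversized models
    recovery_rate:     share of that waste the kernel is assumed to recover
    plan_price:        monthly subscription price, in dollars
    """
    recovered = monthly_llm_spend * waste_fraction * recovery_rate
    net = recovered - plan_price
    # Payback: days of recovered spend needed to cover one month of the plan.
    payback_days = None if recovered <= 0 else 30 * plan_price / recovered
    return net, payback_days


net, days = estimate_savings(1000, 0.25, 1.0, 50)
print(net, days)
```

With a $1,000 monthly bill, 25% waste, full recovery and a hypothetical $50 plan, the sketch yields $200 net savings and a 6-day payback.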
FIG. V-B · REQUEST INSPECTOR classify → route → verify

Watch one request move through the kernel.

Pick a workload. AdaptOrch classifies τ, routes to the right head, then explains why the call should shadow, enforce, or fail closed.

INPUT Answer from the policy corpus and cite the clause.
  1. CLASSIFY · τ 0.71
  2. ROUTE · RETRIEVE + VISION
  3. VERIFY · NLI + citation check
VERDICT ENFORCE

High τ and grounded citations: enforce with provenance export.
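One way to picture the shadow / enforce / fail-closed decision is as a small pure function over the verification results. The sketch below is an assumption about how such a rule could be written: the `decide` function, the `enforce_threshold` of 0.6 and the ordering of the checks are hypothetical, not the kernel's documented policy.

```python
from enum import Enum


class Verdict(Enum):
    SHADOW = "shadow"
    ENFORCE = "enforce"
    FAIL_CLOSED = "fail_closed"


def decide(tau, citations_grounded, contradiction_found, enforce_threshold=0.6):
    """Map verification results to a verdict (illustrative rule only)."""
    if contradiction_found:
        # A detected contradiction always stops the call.
        return Verdict.FAIL_CLOSED
    if tau >= enforce_threshold and citations_grounded:
        return Verdict.ENFORCE
    # Anything else is only observed, never blocked.
    return Verdict.SHADOW


print(decide(0.71, citations_grounded=True, contradiction_found=False))
```

Under these assumed rules, the example request above (τ 0.71 with grounded citations) lands on ENFORCE.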

FIG. VI · PIPELINE seven phases · deterministic

From prompt to ledger in seven gates.

Input is normalized, scored, routed, dispatched, executed, verified, and returned with provenance. If a gate fails, the call stops before the user sees drift.

  1. 01

    INGEST

    Normalize message, extract intent and entity frame.

  2. 02

    CLASSIFY

    CSE-Lite scores τ ∈ [0, 1]. Cheap; deterministic; ~3ms.

  3. 03

    ROUTE

    Map τ to the head with best cost-to-reliability profile.

  4. 04

    DISPATCH

    Shadow or enforce. Budget cap gate. Dry-run sandbox.

  5. 05

    EXECUTE

    Model, tool, retrieval, code, vision — whichever head won.

  6. 06

    AGGREGATE

    Contradiction scan across turns. Fail-closed on mismatch.

  7. 07

    RETURN

    Verified output + provenance trail + cost ledger entry.
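The seven gates can be sketched as an ordered list of functions, where any gate may raise to stop the call. Everything inside the gates below is a stand-in: the word-count scorer is not CSE-Lite, the two-way head mapping is a toy, and the `Call`, `GateFailure` and `run` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


class GateFailure(Exception):
    """Raised by any gate to stop the call (fail closed)."""


@dataclass
class Call:
    prompt: str
    tau: Optional[float] = None
    head: Optional[str] = None
    output: Optional[str] = None
    ledger: list = field(default_factory=list)


def ingest(c):
    c.prompt = " ".join(c.prompt.split())  # normalize whitespace
    if not c.prompt:
        raise GateFailure("empty prompt")
    return c


def classify(c):
    c.tau = min(1.0, len(c.prompt.split()) / 20)  # placeholder scorer
    return c


def route(c):
    c.head = "GENERATE" if c.tau < 0.5 else "SYNTHESIZE"  # toy mapping
    return c


def dispatch(c):
    return c  # budget cap and shadow-vs-enforce checks would go here


def execute(c):
    c.output = f"[{c.head}] echo: {c.prompt}"  # stand-in for the real head
    return c


def aggregate(c):
    if c.output is None:
        raise GateFailure("no output to verify")
    return c


def finish(c):
    return c  # RETURN: the ledger already lists every gate that passed


GATES = [("INGEST", ingest), ("CLASSIFY", classify), ("ROUTE", route),
         ("DISPATCH", dispatch), ("EXECUTE", execute),
         ("AGGREGATE", aggregate), ("RETURN", finish)]


def run(prompt):
    call = Call(prompt)
    for name, gate in GATES:
        call = gate(call)  # a GateFailure here aborts the whole call
        call.ledger.append(name)
    return call


print(run("  summarize   the policy ").ledger)
```

Because the gates run in a fixed order and each one either returns or raises, a failed call never reaches RETURN and its partial ledger shows exactly where it stopped.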

FIG. VII · STACK zero lock-in · byo-model

Drop in. Keep your stack.

AdaptOrch is a kernel, not a framework. Place it beside LangChain, CrewAI, AutoGen, or your own loop; keep your prompts, tools, and memory.

#01 · RUNTIME

Python 3.11 · async-first kernel

Deterministic hot-path in pure CPython. Every request is typed, traced, and reproducible. 238 unit + integration tests gate every commit — no flake budget.

  • 2.1× native speedup over framework equivalents
  • 13 agent adapters (OpenAI · Anthropic · Groq · local)
  • 1.07× paper parity on published benchmarks
  • 100% Groq throughput success across 50k canary calls
#02 · CONTROL

Supabase · Upstash · Railway

Postgres is the system of record for every ledger, policy, and trace. Redis handles rate limits, dedup, and idempotency keys. Railway runs everything behind a single deploy command. No Kafka. No Kubernetes. No YAML tax.

  • Row-level tenant isolation with enforced RLS
  • Per-agent budget caps and burn alerts
  • Shadow → enforce promotion with one switch
  • Signed provenance chain per request
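A per-agent budget cap with a burn alert can be sketched as a small stateful gate. The text places rate-limit state in Redis; to stay self-contained, this sketch keeps it in memory. The `BudgetGate` class, its field names and the 80% alert threshold are assumptions.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a charge would push spend past the cap."""


@dataclass
class BudgetGate:
    daily_cap_usd: float
    alert_fraction: float = 0.8  # assumed burn-alert threshold
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record spend and return True once the burn alert should fire.

        Raises BudgetExceeded, without recording anything, if the charge
        would exceed the cap.
        """
        if self.spent_usd + cost_usd > self.daily_cap_usd:
            raise BudgetExceeded(
                f"cap {self.daily_cap_usd} would be exceeded by {cost_usd}"
            )
        self.spent_usd += cost_usd
        return self.spent_usd >= self.alert_fraction * self.daily_cap_usd


gate = BudgetGate(daily_cap_usd=10.0)
print(gate.charge(5.0), gate.charge(4.0))
```

Checking the cap before recording the spend means a rejected call leaves the counter untouched, so a retry with a cheaper model can still go through.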
#03 · SURFACE

React · Tailwind · shadcn

The dashboard is intentionally thin. It reads the same ledger your agents write — zero side effects, zero hidden mutations. Everything observable is also auditable and exportable.

  • Recharts + React Flow for live topology and drift
  • Role-scoped views (Builder / Team / SRE)
  • One-click provenance CSV + signed JSONL export
  • WCAG 2.2 AA · full keyboard · high-contrast paper
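A signed JSONL export can be sketched with an HMAC chain, where each line's signature also covers the previous signature so that edits or reordering are detectable. This uses Python's standard `hmac`, `hashlib` and `json` modules; the record layout (`entry`, `prev`, `sig`) and the function names are assumptions, not AdaptOrch's actual export format.

```python
import hashlib
import hmac
import json


def _canonical(entry) -> str:
    # Stable serialization so the same entry always signs the same bytes.
    return json.dumps(entry, sort_keys=True, separators=(",", ":"))


def sign_entries(entries, key: bytes):
    """Return JSONL lines, each signed over (previous signature + entry)."""
    prev, lines = "", []
    for entry in entries:
        msg = (prev + _canonical(entry)).encode()
        sig = hmac.new(key, msg, hashlib.sha256).hexdigest()
        lines.append(json.dumps({"entry": entry, "prev": prev, "sig": sig}))
        prev = sig
    return lines


def verify_lines(lines, key: bytes) -> bool:
    """Recompute the chain and compare signatures in constant time."""
    prev = ""
    for line in lines:
        rec = json.loads(line)
        msg = (prev + _canonical(rec["entry"])).encode()
        expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
        if rec["prev"] != prev or not hmac.compare_digest(rec["sig"], expected):
            return False
        prev = rec["sig"]
    return True


ledger = [{"call": 1, "cost": 0.01}, {"call": 2, "cost": 0.02}]
exported = sign_entries(ledger, b"demo-key")
print(verify_lines(exported, b"demo-key"))
```

An HMAC proves integrity only to holders of the shared key; an export meant for outside auditors would more likely use asymmetric signatures.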
FIG. VIII · 30-DAY ROLLOUT day 1 → day 30
  1. DAY 01

    Drop-in shadow

    Wrap your existing agent loop. Every call is mirrored through CSE-Lite in shadow mode. Zero changes to output. Full ledger population.

  2. DAY 07

    Contradiction report

    Review mismatch list. Decide which heads to graduate to enforce. CI gates wire up. Team dashboard exposed.

  3. DAY 30

    Enforce + export

    Budget caps active. Contradiction kills the call, not the user. Monthly provenance export becomes your paper trail.
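The Day 01 step, wrapping an existing agent loop so every call is mirrored without changing its output, can be sketched as a plain wrapper function. The `shadow` helper, the verifier signature and the ledger record shape are hypothetical and only illustrate the idea.

```python
def shadow(agent_fn, verifier, ledger):
    """Wrap agent_fn so each call is checked and logged, never altered."""

    def wrapped(prompt):
        output = agent_fn(prompt)  # the agent runs exactly as before
        try:
            ok = bool(verifier(prompt, output))
            ledger.append({"prompt": prompt, "ok": ok, "error": None})
        except Exception as exc:
            # In shadow mode a verifier failure is recorded, not raised.
            ledger.append({"prompt": prompt, "ok": None, "error": repr(exc)})
        return output  # output is returned unchanged

    return wrapped


ledger = []
agent = shadow(lambda p: p.upper(), lambda p, out: out.isupper(), ledger)
print(agent("hello"), ledger)
```

Swallowing verifier errors is what keeps shadow mode safe to deploy: the worst outcome is a ledger entry with an error, never a failed user request.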

EPIGRAPH § 0.0

"Before AdaptOrch our retrieval recall and our generation output disagreed silently. Shadow mode lit up the contradictions in week one. We shipped enforce in week three. That is the whole story."

Lee Jae-Min · Staff SRE, a fintech multi-agent team (anonymized)
FIG. IX · FAQ anticipated objections

Expected questions.

If something here is wrong, we want to know. Email us and we will publish the correction.

Q.01 Is this another agent framework?

No. AdaptOrch is a kernel — a thin deterministic verification and routing layer that sits next to your existing framework (LangChain / CrewAI / AutoGen / custom). It does not own your prompts, your tool schema, or your memory.

Q.02 What does "math, not vibes" actually mean?

Every dispatch is scored (τ ∈ [0,1]) and every verification is logged. When an answer contradicts a prior turn we fail closed and emit a structured diagnostic, not a vibe. That is auditable. That is math.

Q.03 Is my vendor choice locked in?

Never. AdaptOrch carries 13 adapters (OpenAI, Anthropic, Groq, Cohere, Fireworks, Together, local vLLM, Ollama, and more). Switch at the router level without touching prompts.

Q.04 How slow is the router?

CSE-Lite classification is ~3ms on a shared CPU. Full kernel overhead per call is < 12ms p95. Benchmarked 2.1× faster than framework-native agent loops.

Q.05 Do I have to rewrite prompts?

No. Your prompts stay your prompts. AdaptOrch observes input / output shapes and verifies consistency. You decide which heads to graduate from shadow to enforce, one at a time.

Q.06 What about data residency?

Self-host the kernel. The dashboard ships as a separate container. Ledgers live in your own Postgres. No telemetry leaves your perimeter unless you opt in.

Q.07 Does AdaptOrch support streaming?

Yes. Token-stream passthrough with inline verification. The kernel buffers only the minimum needed for contradiction detection; first-token latency is preserved.
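Token passthrough with a bounded verification buffer can be sketched as a generator. The `passthrough` function, the `violates` callback and the window size of 8 are assumptions; the real contradiction check is not specified here.

```python
from collections import deque


class StreamAborted(Exception):
    """Raised when the inline check rejects the stream."""


def passthrough(tokens, violates, window=8):
    """Yield tokens as they arrive, keeping only the last `window` tokens.

    `violates` receives a tuple of the most recent tokens and returns True
    when the stream should be stopped.
    """
    recent = deque(maxlen=window)
    for token in tokens:
        recent.append(token)
        if violates(tuple(recent)):
            # The offending token is never emitted.
            raise StreamAborted(f"stopped at token {token!r}")
        yield token


for tok in passthrough(["The", " limit", " is", " 10"], lambda w: False):
    print(tok, end="")
print()
```

Each token is checked and yielded in the same loop iteration, so the first token is delayed by only one check, and memory stays bounded by the window.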

Q.08 Is there a free tier?

Open-source/self-host use remains free. Hosted users can start on Starter at $0/month with 1,000 monthly calls, then upgrade to Pro at $39/month or Team at $149/month when enforce controls and shared quota are needed. Enterprise and on-premise deployments are scoped via direct contract only.

Q.09 How do contradictions get caught?

Three levels. (i) Structural: schema and type mismatches. (ii) Semantic: NLI over prior-turn claims. (iii) Behavioral: tool-call vs. final-answer divergence. Tunable per-tenant.
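The three levels can be sketched as three independent checks. The semantic level takes a pluggable `contradicts` callback standing in for an NLI model, and the behavioral level uses a naive substring test; both are simplifying assumptions, as are all function names.

```python
def structural_ok(payload: dict, schema: dict) -> bool:
    """(i) Every required field is present with the expected type."""
    return all(k in payload and isinstance(payload[k], t)
               for k, t in schema.items())


def semantic_ok(answer, prior_claims, contradicts) -> bool:
    """(ii) No prior-turn claim is contradicted by the new answer."""
    return not any(contradicts(claim, answer) for claim in prior_claims)


def behavioral_ok(tool_result: str, final_answer: str) -> bool:
    """(iii) The final answer actually carries the tool's result."""
    return tool_result in final_answer


def failed_levels(payload, schema, prior_claims, contradicts, tool_result):
    """Return the names of the levels that failed, in order."""
    if not structural_ok(payload, schema):
        return ["structural"]  # later checks need a well-formed payload
    answer = str(payload.get("answer", ""))
    failures = []
    if not semantic_ok(answer, prior_claims, contradicts):
        failures.append("semantic")
    if not behavioral_ok(tool_result, answer):
        failures.append("behavioral")
    return failures


# Toy stand-in for an NLI model: flags one hard-coded contradiction.
toy_nli = lambda claim, ans: claim == "limit is 10" and "limit is 20" in ans
schema = {"answer": str, "amount": int}
print(failed_levels({"answer": "limit is 20", "amount": 1},
                    schema, ["limit is 10"], toy_nli, "42"))
```

Returning the list of failed levels, rather than a single boolean, is what would let a per-tenant policy decide which levels block a call and which are only logged.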

Q.10 Where do I report a bug?

github.com/dmae97/Adaptorch-MCP/issues. First-response SLA for Pro/Team is < 24h. For security disclosures, email ict03@rfems.com.

FIG. X · DISPATCH τ = 1.00 · ACTIVE
TERMINAL STATE

Ship agents with a kill switch.

Start in shadow. Promote only the heads that prove useful. Export the ledger when security, finance, or a customer asks why an agent acted.

  • Starter at $0/month, no card required
  • Drop-in next to LangChain / CrewAI / custom loops
  • Self-host available — your Postgres, your rules