unsorry

Gate A Statement-Binding Red Team — Round 002 (2026-06-10)

Round 001 (gate-a-redteam-001.md) closed every soundness hole but recorded a deeper limitation: no Gate A layer bound a library theorem’s statement to its claimed canonical goal, so a vacuously-true or mis-stated theorem under a plausible name still passed (sound but meaningless). That gap is exactly what the statement-binding gate (Stage D, ADR-011 / SPEC-011-A, landed in #123 / #128) was built to close.

Round 002 attacks that gate directly. Three adversarial agents each opened a real PR that keeps the goal’s name and ships an honest-building library term whose type is not the goal’s statement, while pointing the index entry at the goal’s real canonical sha — the smuggle is the asymmetry between what the index claims and what the library proves. A fourth PR is an honest control. An independent adjudicator verified every PR’s CI state from gh pr checks / gh run view --log-failed. An attack is “blocked” iff gate-a is RED (cannot merge); the control must be GREEN; a survivor is any attack PR whose gate-a is green.

Verdict

The ADR-011 binding gate holds against every vector. 3/3 attacks blocked, 0 survivors; the honest control passes gate-a. Each attack is caught at the regenerated statement-binding obligation: Gate A lowers the goal’s canonical statement to a Lean obligation <name>_binding_check : <goal-∀-type> := <library-name> and type-checks it. Because the obligation demands the library term inhabit the goal’s type, a vacuous True, a specialised 0 ≤ 0, and an autoImplicit-vacuous ∀ {p} {n}, p → ¬p → 0 ≤ n all fail to type-check — the meaningfulness gap recorded in round 001 is now gated, not merely reviewed.

Per-vector results

Vector PR gate-a Caught by Evidence (the binding obligation rejected the smuggle)
vacuous-true #124 ❌ fail statement-binding obligation (built under --wfail) ✖ [575/591] Building Unsorry.NatMulOneThmBindingnat_mul_one_thm has type True but is expected to have type ∀ (n : Nat), n * 1 = n → Lean exited 1 (run 27300731227)
weakened-hyp #125 ❌ fail statement-binding obligation (built under --wfail) ✖ [595/600] Building Unsorry.NatZeroLeBindingnat_zero_le_thm has type 0 ≤ 0 (expected ∀ (n : Nat), 0 ≤ n) → build failed (run 27300741087)
autoimplicit-vacuity #127 ❌ fail statement-binding obligation (built under --wfail) ✖ [594/600] Building Unsorry.NatZeroLeBindingnat_zero_le_thm has type ?m.5 → ¬?m.5 → 0 ≤ ?m.6 (expected ∀ (n : Nat), 0 ≤ n) → build failed (run 27300765050)

Control:

Vector PR gate-a Result Evidence
control-correct #126 ✅ pass honest proof not false-rejected nat_mul_one_thm := Nat.mul_one n; binding obligation regenerated to ∀ (n : Nat), n * 1 = n and type-checked: ✔ [589/600] Built Unsorry.NatMulOneThmBinding; gate-a passed (run 27300751229, 5m30s)

All four PRs are OPEN, unmerged, autoMerge=null — verified, nothing merged or closed by the adjudicator.

Why the binding step is the load-bearing layer here

Every attack module built fine on its own (✔ Built Unsorry.NatMulOneThm, ✔ Built Unsorry.NatZeroLe) — these are honest, sound Lean terms; round-001’s sorry/axiom layers have nothing to bite on. The defense is the binding obligation, regenerated by tools.gate_a.check_statement_binding generate from the goal’s canonical AISP statement, then type-checked in the --wfail library build:

The autoimplicit-vacuity vector (#127) is the explicit replay of round 001’s PR-#64 survivor class. In round 001 it was only a soundness-clean vacuity that check_library_options had to catch as a forbidden option; in round 002 the binding obligation rejects it one step earlier on type-mismatch (?m → ¬?m → 0 ≤ ?m∀ (n : Nat), 0 ≤ n). check_library_options is sequenced after the --wfail build and so never ran (the build halted the job first); the agent verified locally that it would also exit 1 on the payload, so the defense is layered, not single-point.

Caveats (declared, not silently estimated)

Reproduce

The three attack PRs (#124 vacuous-true, #125 weakened-hyp, #127 autoimplicit-vacuity) and the control (#126 control-correct) are OPEN/unmerged, branches redteam2/* retained for audit. Each attack branch carries its tampered library/Unsorry/*.lean, an index entry under library/index/<goal-sha>.aisp claiming the goal’s canonical statement, and the flipped goals/*.aisp. Re-running gate-a on any attack branch regenerates the binding obligation from the goal and fails the --wfail build on the corresponding *Binding.lean.