Round 001 (gate-a-redteam-001.md) closed every soundness hole but recorded a deeper limitation: no Gate A layer bound a library theorem’s statement to its claimed canonical goal, so a vacuously-true or mis-stated theorem under a plausible name still passed (sound but meaningless). That gap is exactly what the statement-binding gate (Stage D, ADR-011 / SPEC-011-A, landed in #123 / #128) was built to close.
Round 002 attacks that gate directly. Three adversarial agents each opened a real PR that
keeps the goal’s name and ships an honest-building library term whose type is not the goal’s
statement, while pointing the index entry at the goal’s real canonical sha — the smuggle is the
asymmetry between what the index claims and what the library proves. A fourth PR is an honest
control. An independent adjudicator verified every PR’s CI state from gh pr checks /
gh run view --log-failed. An attack is “blocked” iff gate-a is RED (cannot merge); the
control must be GREEN; a survivor is any attack PR whose gate-a is green.
The ADR-011 binding gate holds against every vector. 3/3 attacks blocked, 0 survivors; the
honest control passes gate-a. Each attack is caught at the regenerated statement-binding
obligation: Gate A lowers the goal’s canonical statement to a Lean obligation
<name>_binding_check : <goal-∀-type> := <library-name> and type-checks it. Because the obligation
demands the library term inhabit the goal’s type, a vacuous True, a specialised 0 ≤ 0, and an
autoImplicit-vacuous ∀ {p} {n}, p → ¬p → 0 ≤ n all fail to type-check — the meaningfulness gap
recorded in round 001 is now gated, not merely reviewed.
| Vector | PR | gate-a | Caught by | Evidence (the binding obligation rejected the smuggle) |
|---|---|---|---|---|
| vacuous-true | #124 | ❌ fail | statement-binding obligation (built under --wfail) |
✖ [575/591] Building Unsorry.NatMulOneThmBinding → nat_mul_one_thm has type True but is expected to have type ∀ (n : Nat), n * 1 = n → Lean exited 1 (run 27300731227) |
| weakened-hyp | #125 | ❌ fail | statement-binding obligation (built under --wfail) |
✖ [595/600] Building Unsorry.NatZeroLeBinding → nat_zero_le_thm has type 0 ≤ 0 (expected ∀ (n : Nat), 0 ≤ n) → build failed (run 27300741087) |
| autoimplicit-vacuity | #127 | ❌ fail | statement-binding obligation (built under --wfail) |
✖ [594/600] Building Unsorry.NatZeroLeBinding → nat_zero_le_thm has type ?m.5 → ¬?m.5 → 0 ≤ ?m.6 (expected ∀ (n : Nat), 0 ≤ n) → build failed (run 27300765050) |
Control:
| Vector | PR | gate-a | Result | Evidence |
|---|---|---|---|---|
| control-correct | #126 | ✅ pass | honest proof not false-rejected | nat_mul_one_thm := Nat.mul_one n; binding obligation regenerated to ∀ (n : Nat), n * 1 = n and type-checked: ✔ [589/600] Built Unsorry.NatMulOneThmBinding; gate-a passed (run 27300751229, 5m30s) |
All four PRs are OPEN, unmerged, autoMerge=null — verified, nothing merged or closed by the adjudicator.
Every attack module built fine on its own (✔ Built Unsorry.NatMulOneThm,
✔ Built Unsorry.NatZeroLe) — these are honest, sound Lean terms; round-001’s sorry/axiom layers
have nothing to bite on. The defense is the binding obligation, regenerated by
tools.gate_a.check_statement_binding generate from the goal’s canonical AISP statement, then
type-checked in the --wfail library build:
<name>_binding_check : <goal-type> := <library-name> forces the merged term to
inhabit the goal’s type, not merely a type. True, 0 ≤ 0, and the autoImplicit
metavariable type ?m → ¬?m → 0 ≤ ?m are all distinct from ∀ (n : Nat), …, so elaboration
fails with a type mismatch on exactly the tampered module — every other *Binding.lean in all
three runs built cleanly, isolating the failure to the smuggle.*Binding.lean is committed on any attack branch (verified via git ls-tree). This is the
non-bypassable-by-construction property — an attacker cannot ship a forged-passing binding,
because Gate A overwrites it from the canonical goal before building.goal/sha linkage that round 001 called advisory is now load-bearing: the
binding generator keys off the goal the index entry claims, so claiming the goal’s real sha is
precisely what summons the obligation that catches the mismatched body.The autoimplicit-vacuity vector (#127) is the explicit replay of round 001’s PR-#64 survivor
class. In round 001 it was only a soundness-clean vacuity that check_library_options had to
catch as a forbidden option; in round 002 the binding obligation rejects it one step earlier on
type-mismatch (?m → ¬?m → 0 ≤ ?m ≠ ∀ (n : Nat), 0 ≤ n). check_library_options is sequenced
after the --wfail build and so never ran (the build halted the job first); the agent verified
locally that it would also exit 1 on the payload, so the defense is layered, not single-point.
goals/nat-mul-one-thm.aisp: status≡proved requires sha as 64 lowercase hex — the red-team task
flipped status to proved while leaving the goal record’s content-address sha≜∅, which Gate
B’s validator rejects (a correctly-merged proof, e.g. nat-mul-comm-thm.aisp, also carries the
sha). gate-b’s state does not change any verdict here: the authoritative result for every
vector is gate-a’s binding/build outcome, and the control’s gate-a (the gate under test) is
green.--wfail)”, because
the regenerated binding obligations are type-checked inside that build rather than in a separate
step. The cause of each failure is the binding obligation (*Binding.lean), as the per-module
trace shows. We classify all three as caught by the statement-binding gate.The three attack PRs (#124 vacuous-true, #125 weakened-hyp, #127 autoimplicit-vacuity) and the
control (#126 control-correct) are OPEN/unmerged, branches redteam2/* retained for audit. Each
attack branch carries its tampered library/Unsorry/*.lean, an index entry under
library/index/<goal-sha>.aisp claiming the goal’s canonical statement, and the flipped
goals/*.aisp. Re-running gate-a on any attack branch regenerates the binding obligation from the
goal and fails the --wfail build on the corresponding *Binding.lean.