Unfaithful claims: breaking 6 zkVMs

Himanshu Sheoran Valter Wik Himanshu Sheoran, Valter Wik #zkVM

A zkVM verifier should be faithful to one thing above all else: its public claims. Yet we found six systems where this guarantee breaks. Learn how a subtle ordering bug lets an attacker bypass the cryptography entirely and prove mathematically impossible statements.

Unfaithful claims: breaking 6 zkVMs

Introduction

A zkVM verifier should be faithful to one thing above all else: its public claims. If the claimed input/output statement is false, verification must fail.

We found six systems where this faithfulness breaks. Across Jolt, Nexus, Cairo-M, Ceno, Expander, and Binius64, public-claim data was not always bound into Fiat-Shamir transcripts before challenge generation. That subtle ordering bug turns statement values into attacker-controlled variables in later verification equations.

In this post, we demonstrate how to exploit these unbound variables to bypass the cryptography entirely and prove mathematically impossible statements, such as finding a counterexample to Fermat’s Last Theorem (see Challenges to try this out yourself). In a blockchain context, this could translate to receiving $1M out of thin air.


Jargon cheat sheet

The ZK ecosystem is particularly full of jargon and abbreviations, which may be off-putting to newcomers, so bookmark this section. Before we go deeper, here’s a one-liner for every term you’ll encounter:

  • Fiat-Shamir: Instead of a real verifier sending random challenges, hash everything so far to get “random” challenges. Makes proofs non-interactive.
  • Transcript: The running hash state. You “absorb” data into it, then “squeeze” out challenges.
  • Polynomial Commitment: Like a hash, but for polynomials. You commit to a polynomial, then later prove “my polynomial evaluates to 42 at point 7” without revealing the whole polynomial.
  • Sumcheck: A protocol to prove “this polynomial sums to H over all boolean inputs” without actually computing the exponentially many terms. Reduces to checking one random point.
  • MLE (Multilinear Extension): Turn a table of values into a polynomial. The polynomial equals the table on 0/1 inputs and smoothly interpolates elsewhere. Key property: evaluating it is a linear function of the table entries.
  • Lookup / LogUp: Prove “all my values appear in this table” by encoding membership as sums of fractions. If the sums match, the sets match (with high probability).
  • AIR: “Algebraic Intermediate Representation” — a way to write “valid execution trace” as polynomial equations. If the equations hold, the trace is valid.
  • STARK: Prove AIR constraints hold using commitments + random sampling + FRI. No trusted setup needed.
  • FRI: “Fast Reed-Solomon IOP” — proves a committed function is actually a low-degree polynomial, not arbitrary garbage that passes spot-checks.
  • OODS: “Out-of-Domain Sampling” — check the constraint polynomial at a random point outside the execution domain. Ties everything together.
  • GKR: Verify arithmetic circuits layer-by-layer using sumcheck. Reduces “check this huge circuit” to “check a few random evaluations.”
  • claimed_sum / opening_claim: Prover-supplied values that feed into verification equations. These are the usual suspects for binding bugs.

What are we even breaking?

What is a zkVM?

A zkVM proof claims that a program executed correctly on public inputs, producing the claimed public output, while hiding the full execution trace.

Formally, the verifier is convinced that there exists a valid trace TT such that:

  T  :  VM(P,X,T)Y\exists \; T \; : \; \mathsf{VM}(P, X, T) \to Y

where:

  • PP = program/circuit description (public)
  • XX = public input
  • YY = claimed public output
  • TT = private witness/trace (registers, memory history, intermediate values)

The verifier does not replay execution step by step. Instead, it checks algebraic constraints over committed polynomials.

Some systems in this post are verifiable-computing systems rather than full zero-knowledge systems, but the critical property is still soundness:

  • Completeness: honest execution verifies.
  • Soundness: false execution should not verify.

We are breaking soundness in all six systems.

Definition (Faithfulness)

A verifier is faithful when the public statement it accepts is exactly the statement bound into the proof. In this post, every bug lets a prover-controlled claim change after the transcript challenges are known while the proof still verifies.

In all six codebases, verification follows this abstract flow:

  1. Fix public statement data.
  2. Parse proof payload (commitments, reduction messages, openings).
  3. Rebuild Fiat-Shamir challenges from transcript state.
  4. Check constraint equations at sampled points.
  5. Check PCS/opening consistency.
  6. Accept only if all checks are jointly consistent.
Public statement data feeds both the prover's execution and the Fiat-Shamir transcript. The prover executes the program, builds a proof payload of commitments, reduction messages, and openings, and the verifier absorbs statement and proof elements in protocol order before squeezing challenges and running constraint, PCS, and global consistency checks to accept or reject.High-level prover and verifier flow in a zkVMPublic statement data feeds both the prover's execution and the Fiat-Shamir transcript. The prover executes the program, builds a proof payload of commitments, reduction messages, and openings, and the verifier absorbs statement and proof elements in protocol order before squeezing challenges and running constraint, PCS, and global consistency checks to accept or reject.

Verifier checks

Fiat-Shamir transcript reconstruction

Proof payload (sent to verifier)

Prover side

Public statement

Program PP

Input XX

Claimed output YY

Metadata
sizes, roots, domains

Execute P(X)P(X)

Private trace /
witness TT

Build proof

Commitments

Reduction messages
sumcheck / GKR / STARK rounds

Claimed evaluations /
openings

Absorb statement elements

Absorb proof elements
in protocol order

Squeeze challenges

Constraint checks
(at challenge points)

PCS / opening checks

Global consistency
checks

Accept / Reject

The non-negotiable invariant is transcript ordering: if a value affects a verifier equation, it must be absorbed before sampling the challenge that gates that equation. Violating this gives the prover an attacker-controlled degree of freedom.


The building blocks

Before we can understand the bugs, we need to understand the protocols they break. Each of these is a tool that zkVMs compose together.

The Fiat-Shamir transform

Interactive protocols1 require real-time communication. It involves the verifier sending random challenges, and the prover responding to them. This doesn’t work for blockchains (where you have no real-time verifier) or when you want anyone to verify your proof at a later point.

The solution is to replace the verifier’s randomness with a hash function. The prover “talks to themselves,” using the hash of everything so far as the challenge. If we use a cryptographic hash function, this should mean that the challenges are completely unpredictable.

In an interactive protocol the prover sends a commitment, the verifier replies with a random challenge, the prover responds, and this repeats before the verifier runs its final check.Interactive protocolIn an interactive protocol the prover sends a commitment, the verifier replies with a random challenge, the prover responds, and this repeats before the verifier runs its final check.VerifierProverVerifierProver
verify(C,r1,R1,r2,R2)
Commitment C
Random challenge r1
Challenge response R1
Random challenge r2
Challenge response R2
The prover absorbs each message into a transcript and squeezes challenges from it instead of talking to a live verifier, sends the proof, and the verifier replays the same absorb-and-squeeze steps against its own transcript before checking.Fiat-Shamir transformThe prover absorbs each message into a transcript and squeezes challenges from it instead of talking to a live verifier, sends the proof, and the verifier replays the same absorb-and-squeeze steps against its own transcript before checking.Verifier transcriptVerifierProver transcriptProverVerifier transcriptVerifierProver transcriptProver
verify(C,r1,R1,r2,R2)
Absorb commitment C
Squeeze challenge r1
Absorb challenge response R1
Squeeze challenge r2
Send proof (C,R1,R2)
Absorb commitment C
Squeeze challenge r1
Absorb challenge response R1
Squeeze challenge r2
Important

The hash (transcript) must include everything that affects verification BEFORE the challenges derived from it are used.

If some value VV affects a verification equation, but VV isn’t absorbed before the relevant challenge is squeezed, then the challenge is completely independent of VV. This means that the prover can “see” (compute in advance) the challenge before choosing VV, which may allow it to choose VV exactly so that the verification passes, even though it should not.

This is the bug class we found in all six systems.

The sumcheck protocol

The sumcheck protocol proves that a polynomial sums to a claimed value over the Boolean hypercube (all inputs in {0,1}n\{0,1\}^n), i.e. the claim:

H=x1{0,1}x2{0,1}xn{0,1}g(x1,x2,,xn)H = \sum_{x_1 \in \{0,1\}} \sum_{x_2 \in \{0,1\}} \cdots \sum_{x_n \in \{0,1\}} g(x_1, x_2, \ldots, x_n)

The naive approach would be for the verifier to compute all 2n2^n evaluations. This is exponentially expensive.

The sumcheck protocol is a clever interactive protocol that reduces the exponential number of polynomial evaluations to checking only one.

Round by round the prover sends a univariate polynomial, the verifier checks it against the running claim and replies with a random challenge, reducing the multivariate sum to a single evaluation that is checked with one opening proof.The sumcheck protocolRound by round the prover sends a univariate polynomial, the verifier checks it against the running claim and replies with a random challenge, reducing the multivariate sum to a single evaluation that is checked with one opening proof.VerifierProverVerifierProver
Claim H=x{0,1}ng(x)
over x=(x1,,xn)
Round 1
g1(X)=x2,,xng(X,x2,,xn)
Verify g1(0)+g1(1)=H
Round 2
g2(X)=x3,,xng(r1,X,x3,,xn)
Verify g2(0)+g2(1)=g1(r1)
Repeat for n rounds...
Final round
Verify g(r1,,rn)=gn(rn)
(single opening proof for a committed polynomial)
H
univariate g1(X)
Random challenge r1
univariate g2(X)
Random challenge r2

In each round, the prover must send a polynomial gi(X)g_i(X) such that gi(0)+gi(1)g_i(0) + g_i(1) equals the previous claim. If the prover is lying about the original sum HH, then they must lie about gig_i somewhere. But since the verifier picks a random rir_i, with overwhelming probability, the prover won’t then be able to match the evaluation of the original polynomial.

The compression trick

For degree-1 (multilinear) polynomials, gi(X)=a+bXg_i(X) = a + bX has only two coefficients. Since the verifier knows gi(0)+gi(1)=Hi1g_i(0) + g_i(1) = H_{i-1} (the previous claim), we have:

a+(a+b)=Hi1    b=Hi12aa + (a + b) = H_{i-1} \implies b = H_{i-1} - 2a

So the prover only sends a=gi(0)a = g_i(0), and the verifier recovers bb. This saves 50% on communication costs.

The next claim in the chain is:

Hi=gi(ri)=a+bri=a+(Hi12a)ri=a(12ri)+Hi1riH_i = g_i(r_i) = a + b \cdot r_i = a + (H_{i-1} - 2a) \cdot r_i = a(1 - 2r_i) + H_{i-1} \cdot r_i

This is linear in Hi1H_{i-1}! By induction, the final claim is linear in the original HH. If HH isn’t in the transcript, we can solve for it.

Intuition (Why linearity matters)

Once the challenge is fixed, the verifier has accidentally turned the claimed sum into a knob. If the equation is linear in that knob, the attacker does not need to break the commitment scheme; they solve for the value the verifier will accept.

Multilinear extensions (MLEs)

An MLE is just the polynomial view of a table over {0,1}n\{0,1\}^n: it matches the table on Boolean points and extends it to field points.

For this post, the only property you need is:

f~(r)=b{0,1}nf(b)eq(b,r)\tilde{f}(\vec{r}) = \sum_{\vec{b} \in \{0,1\}^n} f(\vec{b}) \cdot \text{eq}(\vec{b}, \vec{r})

At a fixed challenge point r\vec{r}, the coefficients eq(b,r)\text{eq}(\vec{b}, \vec{r}) are constants, so f~(r)\tilde{f}(\vec{r}) is linear in the table values f(b)f(\vec{b}).

That linearity is exactly why missing transcript binding is dangerous: if r\vec{r} is sampled before those values are bound, an attacker can reprogram values while preserving the same evaluated claim.

Lookup arguments (LogUp)

zkVMs need to check that values satisfy certain properties. For example:

  • Is this byte in range [0,255][0, 255]?
  • Does this opcode decode correctly?
  • Is this memory access consistent with previous accesses?

The naive approach: Add constraints for each check. Expensive.

The clever approach: Precompute a table of valid tuples. Prove that every value the program uses appears in the table. This is a multiset membership check.

LogUp (Logarithmic Derivative): Encode multiset membership as a sum of fractions.

If set AA should equal set BB as multisets:

aA1za=bB1zb\sum_{a \in A} \frac{1}{z - a} = \sum_{b \in B} \frac{1}{z - b}

for random challenge zz. If the multisets match, the sums are equal. If they differ, the sums differ with overwhelming probability.

In zkVMs: Different components emit and consume lookup tuples:

  • CPU emits: “I read value vv from address aa at time tt
  • Memory table consumes: “At time tt, address aa contained vv

If everything balances, the execution is consistent.

The claimed_sum: Each component computes its contribution to the LogUp sum:

claimed_sumi=j1zemitjk1zconsumek\text{claimed}\_\text{sum}_i = \sum_j \frac{1}{z - \text{emit}_j} - \sum_k \frac{1}{z - \text{consume}_k}

The global check: iclaimed_sumi=0\sum_i \text{claimed}\_\text{sum}_i = 0.

Why this is vulnerable: The claimed_sum values are prover-supplied. If they’re not in the transcript before challenges are derived, the prover can adjust them to make the sum zero for an invalid execution.


The universal attack pattern

Now we can describe the attack pattern that works on all six systems:

A four-step attack. Step 1 finds prover-controlled values and checks whether they are bound to the transcript before challenges; if not, they are candidates. Step 2 traces how a value V affects verification, writes the verification equation in terms of V, and determines whether it is linear or low-degree. Step 3 holds the transcript and challenges fixed, treats unbound values as variables, and collects the verifier equations. Step 4 sets up the system of equations, solves, and patches the proof, after which the forged proof verifies.The universal attack pattern for unbound valuesA four-step attack. Step 1 finds prover-controlled values and checks whether they are bound to the transcript before challenges; if not, they are candidates. Step 2 traces how a value V affects verification, writes the verification equation in terms of V, and determines whether it is linear or low-degree. Step 3 holds the transcript and challenges fixed, treats unbound values as variables, and collects the verifier equations. Step 4 sets up the system of equations, solves, and patches the proof, after which the forged proof verifies.

Step 4: Solve and forge

Set up system of equations

Solve

Patch proof with solution

Step 3: Derive constraints

Hold transcript / challenges fixed

Treat unbound values as variables

Collect the verifier equations
involving them

Step 2: Trace to equations

How does value VV affect
verification?

Write verification equation
in terms of VV

Determine: is equation
linear / low-degree in VV?

Step 1: Identify unbound values

Find prover-controlled values

Check: are they in transcript
before challenges?

If not: candidate for attack

Forged proof verifies!

When a value VV isn’t transcript-bound:

  1. Challenges are fixed (independent of VV)
  2. The verification equation has form: f(V)=targetf(V) = \text{target}
  3. If ff is linear: αV+β=target\alpha \cdot V + \beta = \text{target}
  4. Solve: V=(targetβ)/αV = (\text{target} - \beta) / \alpha
Proposition (Unbound linear claim)

If a prover-controlled value VV affects a verifier check as αV+β\alpha V + \beta, and the challenge defining α\alpha and β\beta was sampled before VV was bound, then whenever α0\alpha \ne 0 the prover can choose V=(targetβ)/αV = (\text{target} - \beta) / \alpha after seeing the transcript.

In the simplest linear case, forging reduces to solving a low-dimensional field equation, while other systems require small coupled systems.

For systems with multiple unbound values, we get a system of linear equations. Gaussian elimination solves it in O(n3)O(n^3) field operations. For non-linear constraints, we might need to use some more advanced techniques like resultants and Groebner bases.


The six broken systems

Six systems split by where the unbound value lives. In the proof payload for Jolt (opening_claims), Nexus (claimed_sum) and Ceno (out_evals); passed as a statement parameter for Cairo-M (public_data), Expander (public_input / claimed_v) and Binius64 (public witness).The six broken zkVM systemsSix systems split by where the unbound value lives. In the proof payload for Jolt (opening_claims), Nexus (claimed_sum) and Ceno (out_evals); passed as a statement parameter for Cairo-M (public_data), Expander (public_input / claimed_v) and Binius64 (public witness).

Unbound value passed as statement parameter

Cairo-M
public_data
rational system

Expander
public_input / claimed_v
linear

Binius64
public witness
linear

Unbound value lives in proof payload

Jolt
opening_claims
linear system

Nexus
claimed_sum
linear

Ceno
out_evals
linear + product

Now let’s see how this plays out in each system. We’ll go deep on the first one (Jolt) to establish the pattern, then focus on what’s unique about each subsequent system.


Jolt (a16z)

Jolt is a zkVM for RISC-V programs, built by a16z. It uses sumcheck extensively to verify execution constraints.

The proof structure:

JoltProof
JoltProof {
commitments: Vec<Commitment>, // Polynomial commitments to trace
opening_claims: Map<OpeningId, Claim>, // <- THE VULNERABLE VALUES
proofs: Map<Stage, SumcheckProof>, // Sumcheck and opening proofs
...
}

The verification flow:

The verifier receives the JoltProof and public I/O, sets up the transcript, runs the batched sumchecks, verifies the openings against the commitments, and accepts.Jolt verification flowThe verifier receives the JoltProof and public I/O, sets up the transcript, runs the batched sumchecks, verifies the openings against the commitments, and accepts.

Stage 5: Opening verification

Batch-verify polynomial evaluations

Check against commitments

Stages 2-4: Batched sumchecks

Derive batching coefficients

Compute BatchedClaim

Verify sumcheck rounds

Stage 1: Transcript setup

Initialize transcript

Absorb public I/O, trace length

Absorb commitments

Verifier receives

JoltProof

Public I/O

Accept

The bug: Each sumcheck instance provides an input_claim, which is the value the polynomial allegedly sums to over the Boolean hypercube. These claims come from opening_claims in the proof, but they were never absorbed into the transcript before the batching coefficients were derived.

Commitments, public I/O and round messages are absorbed to derive the batching coefficients and sumcheck challenges, but opening_claims is never absorbed, yet H_i from opening_claims feeds the BatchedClaim verification equation.Jolt sumcheck flow with unbound opening claimsCommitments, public I/O and round messages are absorbed to derive the batching coefficients and sumcheck challenges, but opening_claims is never absorbed, yet H_i from opening_claims feeds the BatchedClaim verification equation.

Verification equation

Challenge derivation

Transcript state

NOT ABSORBED

Commitments

Public I/O

opening_claims
(missing)

Round messages

αi\alpha_i
(batching coeffs)

sumcheck challenges

HiH_i from opening_claims

BatchedClaim = iαiHi\sum_i \alpha_i \cdot H_i

How sumcheck uses opening_claims:

In Jolt’s batched sumcheck, the verifier computes a target value BatchedClaim by taking a random linear combination of the individual claims HiH_i:

BatchedClaim=iαiHi\text{BatchedClaim} = \sum_i \alpha_i \cdot H_i

where αi\alpha_i are random coefficients derived from the transcript. Since opening_claims (containing HiH_i) were not in the transcript, the αi\alpha_i values are independent of HiH_i.

Why it’s linear:

Due to the compression optimization (prover omits one less coefficient per round), the final verification equation traces back through the rounds and becomes linear in the input claim HH:

Cfinal=aH+bC_{\text{final}} = a \cdot H + b

where a,ba, b are determined by the transcript (independent of HH). The verifier checks that Cfinal=expected_evalC_{\text{final}} = \text{expected}\_\text{eval} (from PCS opening), this becomes aH+b=expected_evala \cdot H + b = \text{expected}\_\text{eval}.

Because multiple claims are coupled across verification stages, the attacker may need to adjust a small set of claim values simultaneously to satisfy all affected constraints.

This can be exploited by solving a small linear system over a handful of unbound claim values so all affected checks pass simultaneously.

Status: Fixed on October 3, 2025 via PR #981


Nexus

Nexus is a zkVM built on the Stwo prover (from StarkWare). It uses STARKs with logup lookup arguments.

Nexus splits verification into components such as instruction execution, memory, registers, etc. Each component handles a subset of constraints.

Each component emits and consumes lookup tuples. The component’s claimed_sum summarizes its net contribution:

claimed_sumi=j1zproducedjk1zconsumedk\text{claimed}\_\text{sum}_i = \sum_j \frac{1}{z - \text{produced}_j} - \sum_k \frac{1}{z - \text{consumed}_k}

All claimed_sum values must sum to zero (everything produced is consumed).

All constraints are combined into a composition polynomial. The verifier then checks this polynomial at a random point outside the execution domain, known as an OODS (Out-of-Domain Sampling) test.

The proof structure:

NexusProof
NexusProof {
stark_proof: {
commitments: [Merkle roots of trace columns]
sampled_values: [polynomial evaluations]
fri_proof: [low-degree test proof]
}
claimed_sum: [FieldElement; NUM_COMPONENTS] // <- VULNERABLE
log_size: [component sizes]
}

The claimed_sum values are checked to be of correct length, that they sum to zero, and are used in the final composition polynomial. But at no point were they absorbed into the transcript.

associated_data, log_sizes and trace commitments derive the lookup elements, OODS point and composition coefficients, while claimed_sum is independent of the transcript yet feeds the verification checks.Nexus verification flow with unbound claimed_sumassociated_data, log_sizes and trace commitments derive the lookup elements, OODS point and composition coefficients, while claimed_sum is independent of the transcript yet feeds the verification checks.

Verification

Not in transcript

Derived challenges

Transcript state

Independent

associated_data

log_sizes

trace commitments

lookup_elements (zz)

OODS point

composition coeffs

claimed_sum

Check iclaimed_sumi=0\sum_i \mathrm{claimed\_sum}_i = 0

CP(oods)=expectedCP(\mathrm{oods}) = \mathrm{expected}

The OODS check computes the composition polynomial, which includes logup boundary constraints. These constraints are linear in claimed_sum:

The composition polynomial is a random linear combination of constraints:

C(x)=iαiconstrainti(x)C(x) = \sum_i \alpha_i \cdot \text{constraint}_i(x)

Since each constraint is linear in its claimed_sum, the overall composition polynomial is linear in all claimed_sum values.

The verifier checks C(oods_point)=expectedC(\text{oods}\_\text{point}) = \text{expected}.

With claimed_sum not in transcript, the composition polynomial becomes a linear function of the claimed_sum values. Combined with the constraint that claimed sums must sum to zero, this is a small linear system that is easily solvable.

Status: Fixed on October 24, 2025 via PR #503


Cairo-M (Kakarot Labs)

Cairo-M, built by Kakarot Labs, is an alternative proof system for the Cairo VM (used by Starknet).

Cairo-M is in many ways similar to Nexus. It uses logup to prove global statements about the execution.

The proof structure:

Cairo-M proof
Proof {
claim: ComponentSizes,
interaction_claim: LogupClaimsPerComponent,
public_data: { // <- VULNERABLE
initial_registers: { pc, fp },
final_registers: { pc, fp }, // <- forged
clock, // <- forged
initial_root,
final_root, // <- forged
public_memory: { program, input, output }, //output modified
},
stark_proof: [...],
}

The verification flow:

Prover setup elements are absorbed and lookup challenges drawn, but public_data is not yet in the transcript when it feeds the claimed_sum check, leaving it unbound.Cairo-M verification flowProver setup elements are absorbed and lookup challenges drawn, but public_data is not yet in the transcript when it feeds the claimed_sum check, leaving it unbound.

Verification flow

Not in transcript

Transcript state

Prover setup

challenges

mix claim

UNBOUND

Mix PCS config

Commit trace roots

Mix claim (component sizes)

Check proof-of-work

Absorb setup elements

Draw lookup challenges z,αz, \alpha

BUG: public_data
not yet in transcript

Check:
claimed_sum(relations,public_data)=0\mathrm{claimed\_sum}(\mathrm{relations}, \mathrm{public\_data}) = 0

Mix interaction_claim

Verify STARK proof

Lookup challenges are derived without public_data being mixed into the transcript.

The public_data (program I/O, boundary registers, memory roots) enters the lookup relations inside denominators through challenge-weighted encodings of tuples. Abstractly, the verifier checks a relation of the form:

L(public_data)+(other transcript_bound terms)=0,L=i1z+α,ti(public_data)+βL(\text{public}\_\text{data}) + \text{(other transcript}\_\text{bound terms)} = 0, \\ L = \sum_i \frac{1}{z + \langle \alpha, t_i(\text{public}\_\text{data})\rangle + \beta}

The global check is then that L(p)+(other terms)=0L(p) + \text{(other terms)} = 0.

With challenges fixed, this is a rational equation in public data. This is not linear, but still algebraically solvable.

Public-data coordinates participate in verification relations through extension-field arithmetic (including extension-valued public-memory entries), so the forged-parameter search is a coupled extension-field system.

Status: Fixed on October 31, 2025 via commit 92b6740


Ceno (Scroll)

Ceno is a zkVM by Scroll, using GKR with a tower sumcheck structure.

Ceno splits verification into chips, with one per opcode or lookup table. Each chip proves its constraints independently.

Many per-record values (reads, writes, lookups) are batched into a binary tree structure. Each layer folds pairs of values with random challenges; this is the tower sumcheck.

All read records must match all write records (plus initial/final state). This is checked via a multiset equality, this time using a product instead of logup:

ir_out_evalsi=jw_out_evalsj(state factors)\prod_i \text{r}\_\text{out}\_\text{evals}_i = \prod_j \text{w}\_\text{out}\_\text{evals}_j \cdot (\text{state factors})

The proof structure:

ZKVMChipProof
ZKVMChipProof {
r_out_evals: [[FieldElement]], // <- VULNERABLE
w_out_evals: [[FieldElement]], // <- VULNERABLE
lk_out_evals: [[FieldElement]], // <- VULNERABLE
tower_proof: [...],
gkr_iop_proof: [...],
}

r_out_evals, w_out_evals, and lk_out_evals are used to initialize the tower sumcheck claim, but they’re never absorbed into the transcript. This leaves us with two equations:

  1. GKR/Tower equation (linear in out_evals):

    The tower sumcheck claim is claim=jαjout_evalsj\text{claim} = \sum_j \alpha^j \cdot \text{out}\_\text{evals}_j.

    This check is linear in out_evals.

  2. rw-product consistency (bilinear in out_evals):

    ir_out_evalsi=jw_out_evalsj(state factors)\prod_i \text{r}\_\text{out}\_\text{evals}_i = \prod_j \text{w}\_\text{out}\_\text{evals}_j \cdot (\text{state factors})

    If we vary x0=r_out_evals[0][0]x_0 = {\text{r}\_\text{out}\_\text{evals}}[0][0] and x1=r_out_evals[0][1]x_1 = {\text{r}\_\text{out}\_\text{evals}}[0][1] we get the following constraint:

    x0x1(rest of product)=targetx_0 \cdot x_1 \cdot (\text{rest of product}) = \text{target}

    This is bilinear in (x0,x1)(x_0, x_1).

We have two unknowns (x0,x1)(x_0, x_1) and two equations, one linear and one bilinear:

  1. Linear (from GKR): a0x0+a1x1+c=0a_0 x_0 + a_1 x_1 + c = 0
  2. Bilinear (from multiset): kx0x1+d=0k \cdot x_0 \cdot x_1 + d = 0

Substitution reduces this to a quadratic in one variable, which is solvable with the quadratic formula.

Status: Fixed on March 5, 2026 via PR #1262 (original report: #1125)


Expander (Polyhedra)

Expander is a GKR-based proof system for arithmetic circuits.

The proof structure:

Proof (raw bytes, parsed in order):
- PCS commitment
- Sumcheck round polynomials (for each layer)
- Layer claims (claim_x, claim_y)
- PCS opening proofs
NOT in proof bytes (passed separately):
- public_input // statement data passed separately
- claimed_v // statement claim passed separately

In Expander’s circuit model, constant gates can reference public input values. During GKR verification, the eval_cst() evaluates the contribution of these gates at the sumcheck challenge point:

sum -= GKRVerifierHelper::eval_cst(&layer.const_, public_input, sp);

This evaluation is a linear combination of public input values, weighted by coefficients derived from the challenges stored in the verifier’s scratch pad (sp).

The vulnerability:

public_input is never absorbed into the transcript. The transcript is initialized from the PCS commitment and sumcheck round messages, but public inputs are passed separately to the verifier.

The transcript is built from proof bytes only; public_input and claimed_v are passed separately and never absorbed, yet they feed the GKR sumcheck check, which is linear in public_input.Expander verification flow with unbound public inputThe transcript is built from proof bytes only; public_input and claimed_v are passed separately and never absorbed, yet they feed the GKR sumcheck check, which is linear in public_input.

GKR sumcheck verification

Passed separately

Transcript

Proof (bytes)

NEVER ABSORBED

NEVER ABSORBED

PCS commitment

Sumcheck rounds

Layer claims

Opening proofs

Built from
proof bytes only

public_input

claimed_v

check uses
(public_input, claimed_v)

Linear in public_input

The eval_cst() function computes a linear combination:

eval_cst=ipublic_input[i]eq(i,r)\text{eval}\_\text{cst} = \sum_i {\text{public}\_\text{input}}[i] \cdot \text{eq}(i, \vec{r})

where r\vec{r} contains the challenges. Since challenges are derived before the statement data is bound, they are independent of public_input. This lets an attacker choose an arbitrary false statement (e.g., a forged output) and then solve the induced linear constraints for a modified public_input that makes the verifier’s check pass.

Status: Fixed on January 21, 2026 via commit 4a8c2be; claimed 500k bug bounty award pending.


Binius64

Binius64 is a proof system optimized for binary fields, designed to be efficient on 64-bit CPUs. Binius uses F2128\mathbb{F}_{2^{128}} (or variants thereof), where addition is XOR. This makes certain operations very fast.

One of Binius’s key features is its specialized protocols for bitwise operations. The Shift Protocol efficiently handles bit-shifts and rotations (essential for hash functions like SHA-256) without the massive overhead typical in other proof systems.

The vulnerability:

The verifier receives the public witness (program inputs/outputs) as a separate parameter:

pub fn verify<F, C>(
constraint_system: &ConstraintSystem,
public: &[Word], // <- NEVER ABSORBED
// ...
) -> Result<VerifyOutput<F>, Error>

In the shift protocol, challenges r_j and inout_eval_point are sampled before the public witness is bound.

The verification flow:

The first sumcheck and sampled challenges drive the verification equations, but public_input is not absorbed into the transcript before it feeds the public_eval MLE, leaving it unbound.Binius64 shift protocol verification flowThe first sumcheck and sampled challenges drive the verification equations, but public_input is not absorbed into the transcript before it feeds the public_eval MLE, leaving it unbound.

Verification flow

Not in transcript

Transcript state

rjr_j

inout_eval\mathrm{inout\_eval}

γ\gamma

batch_coeff\mathrm{batch\_coeff}

UNBOUND

First sumcheck
(produces γ,rj,rs\gamma, r_j, r_s)

Sample inout_eval_point

Sample batch_coeff

BUG: public_input
not yet absorbed

public_eval =
MLE(public_input,rj,inout_eval_point)\operatorname{MLE}(\mathrm{public\_input}, r_j, \mathrm{inout\_eval\_point})

sum = γ+batch_coeffpublic_eval\gamma + \mathrm{batch\_coeff} \cdot \mathrm{public\_eval}

Second sumcheck
(verifies batched sum)

During verification:

  1. Sumcheck produces challenge points r_j (bit indices) and r_s (shift indices)
  2. Verifier samples inout_eval_point from transcript
  3. Verifier computes public_eval = MLE(public, r_j, inout_eval_point) using the unbound public slice
  4. The public_eval feeds into subsequent verification equations

The MLE evaluation is linear in the public witness bits:

public_eval=w,bpublic[w][b]eq(b,rj)eq(w,inout_eval_point)\text{public}\_\text{eval} = \sum_{w,b} {\text{public}}[w][b] \cdot \text{eq}(b, r_j) \cdot \text{eq}(w, \text{inout}\_\text{eval}\_\text{point})

With challenges fixed (independent of public), an attacker can find an alternate witness public\text{public}' that produces the same evaluation. This is a single 128-bit linear constraint over hundreds of bits, yielding a single linear equation in a high-dimensional binary witness space, which is typically underconstrained and admits many alternative witnesses under common parameterizations.

Status: Fixed on December 29, 2025 via commit 86a515f


Why does this keep happening?

Given that we found the same bug class in six independent implementations, at some point we have to ask whether there is a systemic issue making this mistake so common.

Academic papers don’t specify Fiat-Shamir

Academic papers usually describe interactive protocols: “Prover sends CC. Verifier sends random rr. Prover sends RR.”

They often omit the necessary steps to make the protocol non-interactive: “Hash CC before sampling rr. Also hash the public statement. Also hash intermediate values that affect later equations.”

Security proofs thus also analyze the interactive protocols where binding is implicit. The responsibility of determining what to include in the transcript therefore falls on the implementor, which may not have a good understanding of the full protocol.

The hot potato problem

Modern zkVMs are modular:

Each layer assumes another binds the value. The zkVM layer passes the claim, the lookup and sumcheck layers assume it is transcript-bound, and the claim falls through so an unbound value reaches the verifier.The hot potato problem across modular zkVM layersEach layer assumes another binds the value. The zkVM layer passes the claim, the lookup and sumcheck layers assume it is transcript-bound, and the claim falls through so an unbound value reaches the verifier.

zkVM layer
passes claim/value

Lookup layer
assumes transcript-bound

Sumcheck layer
assumes transcript-bound

Claim falls through:
unbound value reaches verifier

It often happens that each layer assumes the previous/next layer handles the transcript binding for a value, so in the end it never happens.

Optimization pressure

Performance is existential for ZK. Since every hash operation has a cost, there is constant pressure to exclude values that are “probably fine” to leave out.

There are indeed cases when this can be done safely, but determining what is safe requires a full understanding of all protocols involved, and the decision to exclude something should be double and triple checked by experts.

Testing doesn’t catch adversarial inputs

Unit tests run the honest prover. Integration tests run the honest prover. Fuzzing only randomly perturbs values and has a very low probability of succeeding in fooling a verifier. Identifying Fiat-Shamir bugs requires thorough manual security analysis, and sometimes even that falls short.


How to find and fix these bugs

Prevention

Fiat-Shamir has long been a known source of soundness bugs, which has driven the development of primitives that make implementation less error-prone.

One such tool is to merge the proof and transcript, to force all values that are sent by the prover to be automatically absorbed into the transcript.

The prover holds a proof buffer which emulates the communication channel between prover and verifier. When a value is sent by the prover it is added to the proof buffer and automatically absorbed into the transcript. When the prover then needs to read a challenge from the verifier it simply squeezes from the current transcript.

This can then be done in reverse for the verifier. It gradually reads values from the proof buffer and can thus sync the transcript state and derive the same challenges.

Halo2 follows this pattern, and Binius is transcript-centric as well.

Caution

Even with a merged proof/transcript, statement data (e.g., public inputs) must still be absorbed before sampling any challenges that govern equations depending on them — and as Binius demonstrates, even transcript-centric systems can miss this.


Responsible disclosure timeline

System Reported Fixed Response Time
Jolt Sep 2025 Oct 3, 2025 <1 week
Nexus Oct 2025 Oct 24, 2025 <1 week
Cairo-M Oct 2025 Oct 31, 2025 <1 week
Ceno Nov 2025 Mar 5, 2026 ~4 months
Binius64 Dec 2025 Dec 29, 2025 <1 week
Expander Nov 2025 Jan 21, 2026? 3 months

All six teams were notified; responses ranged from immediate acknowledgement to delayed fix, and all reported issues have since been addressed.


Challenges

Do you think you have a good understanding of these bugs? We have prepared challenges to allow you to practice implementing two of these exploits. If you solve any of them, follow the instructions in the flag. The first 10 solvers will get a T-shirt.

Your goal is to find a counter example of Fermat’s Last Theorem, i.e. you know a,b,ca, b, c such that a3+b3=c3,a,b,c1a^3 + b^3 = c^3, a,b,c \ge 1. Good luck!

Jolt

See the handout for the setup running on the server. Submit your proof by connecting to jolt.chal.osec.io:8960

Nexus

See the handout for the setup running on the server. Submit your proof by connecting to nexus.chal.osec.io:8950

Now you should have enough margin to prove Fermat wrong.


Takeaways

We found critical soundness vulnerabilities in six separate zkVMs. All share the same root cause: prover-controlled values that affect verification equations were not bound to the Fiat-Shamir transcript before challenges were derived.

The fix in each case is trivial—one or two lines of code. But finding the bug requires understanding the full verification flow and asking: “What if the prover chose this value after seeing the challenges?”

For the ZK ecosystem: The Fiat-Shamir transform looks simple. Hash everything, derive challenges. In practice, “everything” is hard to specify when you have dozens of components, each with its own inputs and outputs, each expecting someone else to handle binding.

We found six instances by examining a handful of systems. How many more exist in the dozens of zkVMs, proof systems, and recursive verifiers deployed today?

For auditors: Draw the data flow. Trace the transcript. Check every prover-controlled value against when its relevant challenges are derived.

For builders: Treat the transcript as a sacred ledger. When in doubt, absorb it.

Footnotes

  1. The type most commonly described in literature.

Subscribe to the blog

New posts from the OtterSec team, straight to your inbox. One email per post, unsubscribe any time.