Formal verification -- a powerful tool for proving the correctness of your programs. How does it actually work? This blog post will provide practical tips to help you use the Move Prover to its fullest potential, as well as explore a real-world example of how we used formal verification to secure a smart contract.
At a high level, formal verification allows you to provide a specification for the program. This specification is then checked against symbolic inputs, allowing you to prove that your code follows the specification for all possible inputs.
Move Prover
The Move Prover is an automated tool that allows developers to formally verify smart contracts written in the Move programming language.
Move was primarily designed to facilitate automatic verification. Interestingly, the Move Prove operates on the Move bytecode itself, avoiding potential compiler bugs from interfering with prover correctness.
The architecture of the tool consists of multiple components as illustrated below.
First, the Move prover receives a Move source file (an input) that contains specifications of the intended behavior of the program. Those specifications are then extracted from the annotated source by the Move Parser. Consequently, the tool compiles the source code into Move bytecode which is verified and converted into a prover object model plus the specification system "blueprint".
The model is translated into an intermediate language, called Boogie. This Boogie code is then passed to the Boogie verification system which generates the input for the solver using a "verification condition generation". The verification condition (VC) is passed to an automated theorem prover (Z3).
Once the VC is passed to the Z3, the prover checks if the SMT formula is unsatisfiable. If so, it means that the specifications hold. Otherwise, a model that satisfies the conditions is generated and converted back into Boogie format in order to issue a diagnosis report. The diagnosis report is then reverted to a source-level error which parallels a standard compiler error.
Move Specification Language
Move MSL is a subset of the Move Language, which introduces support to statically describe the behavior about the correctness of a program with no implications on production.
To better understand how to use the MSL, we will use Pontem's U256 library, an open source Move library which implements support for U256 numbers, as a case study.
The U256 number is implemented as a struct which contains 4 u64 numbers.
struct U256 has copy, drop, store {
v0: u64,
v1: u64,
v2: u64,
v3: u64,
}
Now, let's consider the add(a: U256, b: U256): U256
function. In order to verify the correctness of such a function, it might be useful to verify some of the group axioms, for example: commutativity and associativity.
Specifications are declared in a specification block, which can be found in Move functions, as module member, or in a different file as a separate specification module.
For example, if your file is sources/u256.move
, you can put specifications in sources/u256.spec.move.
spec add { ... }
The specifications placed inside the specification blocks are considered Expressions.
Expressions
Let's go over some common expressions.
aborts_if
defines when the function can abort. This is especially useful in the context of smart contract development, where an abort would cause the entire transaction to rollback.
For example, the add
function aborts if and only if the U256 addition overflows. Let's put these words into an expression:
const P64: u128 = 0x10000000000000000;
spec fun value_of_U256(a: U256): num {
a.v0 +
a.v1 * P64 +
a.v2 * P64 * P64 +
a.v3 * P64 * P64 * P64
}
spec add {
aborts_if value_of_U256(a) + value_of_U256(b) >= P64 * P64 * P64 * P64;
}
We can observe in the snippet above, that we are allowed to call functions inside the spec block. However, the callee must either be an MSL function, or a pure Move function. A pure Move function can be defined as a function that does not modify the global state or use Move expression features unsupported by MSL.
A common pattern for aborts_if
is aborts_if false
, which lets you prove that a function will never abort.
spec critical_function {
aborts_if false;
}
Another type of expression that we can use is ensures
. As the name suggests, it ensures that a certain condition is true at the end of a function's execution.
In the case of the add
function, we want to ensure that the return value is the sum of the 2 parameters. Note that because MSL uses unbounded numbers, we're able to very cleanly express this property without worrying about overflows.
spec add {
aborts_if value_of_U256(a) + value_of_U256(b) >= P64 * P64 * P64 * P64;
ensures value_of_U256(result) == value_of_U256(a) + value_of_U256(b);
}
Note that because Move specification functions are written in MSL, the numbers are unbounded and we can define the expression without risk of overflow.
Let's try to prove the library with the specifications from above:
$ move prove
It outputs the following error information:
[...]
error: abort not covered by any of the `aborts_if` clauses
╭ spec add {
| aborts_if value_of_U256(a) + value_of_U256(b) >= P64 * P64 * P64 * P64;
| ensures value_of_U256(result) == value_of_U256(a) + value_of_U256(b);
| }
╰─────^
[...]
at ./sources/u256.move:316: add
enter loop, variable(s) carry, i, ret havocked and reassigned
carry = 54
i = 3792
ret = u256.U256{v0 = 26418, v1 = 27938, v2 = 6900, v3 = 1999}
at ./sources/u256.move:346: add
ABORTED
FAILURE proving 1 modules from package `u256` in 9.143s
{
"Error": "Move Prover failed: exiting with verification errors"
}
The prover is telling us that proving failed because the abort was not covered by our aborts_if
clauses. But there is no other abort situation that we have to cover, right?
If we keep reading the error output, we will encounter the somewhat cryptic message: ret havocked and reassigned
.
What does this mean?
By diving into the Move Prover source, we find a likely suspect. The prover attempts to prove all loops with induction!
More formally, it will translate the loop into two key steps, following the classic steps of a proof by induction
- Base Case: Asserting the loop invariant holds at the start of loop execution
- Inductive Step: Assume the invariant, execute the loop body, and assert that the invariant still holds
The loop prover will also havoc, or assign random values to, all variables written to inside the loop. Going back to the log message, this implies that the variables carry
, ret
and i
have been havocked, or assigned random values. This also explains why the input and output of add
makes no sense.
More concretely, the loop analysis translates into the following steps.
- Assert the loop invariant
- Havoc all modified variables
- Assume the loop invariant
- Assume the loop guard (the code inside the while condition)
- Run the loop body
- Assert the loop invariant
There are two approaches to dealing with loops.
The first would be to specify a loop invariant.
In order to specify the loop invariant, we need to use some special syntax, as we explored briefly in our previous post.
while ({
spec {
invariant len(amounts_times_coins) == i;
invariant i <= n_coins;
invariant forall j in 0..i: amounts_times_coins[j] == input[j] * n_coins;
};
(i < n_coins)
}) {
vector::push_back(
&mut amounts_times_coins,
(*vector::borrow(&input, (i as u64)) as u128) * (n_coins as u128)
);
i = i + 1;
};
In this case, the brackets specify the loop invariant for the while
loop. Note that because the loop invariant executes after the loop guard, so we need to account for an extra step with i <= n_coins
.
while ({
spec {
invariant len(amounts_times_coins) == i;
invariant i <= n_coins;
invariant forall j in 0..i: amounts_times_coins[j] == input[j] * n_coins;
};
(i < n_coins)
}) {
Loop invariants are often difficult to write, especially for nontrivial loop bodies.
The second solution to dealing with loops is to unroll the loop. This technique works in this particular situation because, as we can observe, the loop within the add
function will always iterate exactly 4 times:
/// Total words in `U256` (64 * 4 = 256).
const WORDS: u64 = 4;
[...]
let i = 0;
while (i < WORDS) {
let a1 = get(&a, i);
let b1 = get(&b, i);
[...]
Unrolling the function and running again the Move Prover will print out a "Success" message!
SUCCESS proving 1 modules from package `u256` in 9.685s
{
"Result": "Success"
}
For the Associative Property (a+(b+c) = (a+b)+c
) to be true, changing the grouping of addends should not change the sum. To verify this, we will first implement a function which simulates this property:
fun add_assoc_property(a: U256, b: U256, c: U256): bool {
let result_1 = add(b, c);
let result_11 = add(a, result_1);
let result_2 = add(a, b);
let result_22 = add(c, result_2);
let cmp = compare(&result_11, &result_22);
if ( cmp == EQUAL ) true else false
}
Lastly, we want to create a spec block which aborts if the sum overflows, and ensures that the result of the function is true:
spec add_assoc_property {
aborts_if (value_of_U256(a) + value_of_U256(b)) + value_of_U256(c) >= P64 * P64 * P64 * P64;
ensures result == true;
}
Running move prover with the new specifications, we can confirm that there are no verification errors:
SUCCESS proving 1 modules from package `u256` in 9.685s
{
"Result": "Success"
}
For a more complete document detailing Move Prover syntax, we recommend referring to spec-lang.md in the Move Repository.
Use Cases
Formal verification can prove that a smart contract satisfies the given requirements for all possible cases without even running the contract. The hard part is coming up with the specifications.
Here, we hope to explore some practical examples of possible verification ideas.
Error Conditions
Taking an example from std::fixed_point32
, it's often useful to explicitly define when a function might abort. For example, arithmetic operations with fixed point numbers should only error if they overflow.
spec schema MultiplyAbortsIf {
val: num;
multiplier: FixedPoint32;
aborts_if spec_multiply_u64(val, multiplier) > MAX_U64 with EMULTIPLICATION;
}
spec fun spec_multiply_u64(val: num, multiplier: FixedPoint32): num {
(val * multiplier.value) >> 32
}
Access Control Policies
Somewhat similar to error conditions, it's often useful to enforce explicit access control policies at the specification level.
For example, in std::offer
we are able to see that the function should abort if and only if there does not exist an offer, or the recipient is now allowed.
spec redeem {
/// Aborts if there is no offer under `offer_address` or if the account
/// cannot redeem the offer.
/// Ensures that the offered struct under `offer_address` is removed.
aborts_if !exists<Offer<Offered>>(offer_address);
aborts_if !is_allowed_recipient<Offered>(offer_address, signer::address_of(account));
ensures !exists<Offer<Offered>>(offer_address);
ensures result == old(global<Offer<Offered>>(offer_address).offered);
}
These access control specifications make it impossible to accidentally remove security critical access control policies later.
Complex Mathematical Formulae
Whether it's a decimal implementation or more complex data structures, it's often useful to verify that the expected output is always the output.
Proving that your fundamental data structures work exactly as intended will give you much more confidence in the remainder of your codebase.
For example, in our work with Laminar Markets, we provided recommendations for verifying their internal splay tree implementation against a simpler priority queue data structure.
Data Invariants
Formal verification provides the best environment to verify that certain variables
or resources
don't exceed the intended boundaries. Let's consider the struct from below. We can ensure that index
is never greater than 4 using a struct invariant
.
struct Type {
index: u64
}
spec Type {
invariant index < 4;
}
We were able to verify more complex properties in our recent audits for LayerZero and Aries Markets, but the details are left as an exercise to the reader.
Economic Invariants.
Proper economic invariants can require more creativity to come up with but can be extremely effective at securing your protocol.
For example, you should never be able to drain coins from a pool by adding and removing shares. In practice, you might implement this as a utility helper function.
// #[test] // TODO: cannot specify the test-only functions
fun no_free_money_theorem(coin1_in: u64, coin2_in: u64): (u64, u64) acquires Pool {
let share = add_liquidity(coin1_in, coin2_in);
remove_liquidity(share)
}
spec no_free_money_theorem {
pragma verify=false;
ensures result_1 <= coin1_in;
ensures result_2 <= coin2_in;
}
Some other ideas include
- Swapping through an AMM should never lead to a decrease in one side of the pool without also increasing the other side. In other words, no free money
- Lending protocols should always be fully collateralized after a series of deposit, borrow, and withdraw instructions.
- Orderbooks should never lose money after an order is placed and then canceled.
Closing Thoughts
In this post, we've explored how to properly utilize the Move Prover to verify critical invariants about your codebase.
In our upcoming posts, we will explore how to turn the Move Prover into a weapon for squashing security vulnerabilities by learning how to ask the right questions, so stay tuned!
We're passionate about formal verification and pushing the edge of what's possible in Move security. If you have any thoughts, or would like to explore an audit, feel free to reach out to me @notdeghost.