Function Values in the Move VM

AIP-112 - Function Values in the Move VM

Summary

Move 2 adds higher-order functions to the language. Function values can be constructed using partial function applications and lambda expressions, and passed around as values, as well as stored in vectors and structs. Under certain conditions, function values can also be persisted in storage, meaning in Move that they have the store ability. This new feature requires extensions to the Move VM which are described in this AIP.

Motivation

Higher-Order Programming

Higher-order functions are a well-established feature in programming. Nearly every modern programming language supports them, including lambda expressions which can capture values from the context. In Web3 development, builders are very familiar with using lambdas in their client code. Lambdas enable cleaner code with more declarative abstractions.

Beyond Inline Functions

In Move, inline functions already support a restricted form of function parameters, enabling ‘filter/map/reduce’ on vectors and other types. This feature is popular among Move developers. However, it has severe restrictions: since the inline function is expanded at the caller’s site, it cannot access any private features in its originating module, and therefore cannot work on the resource storage owned by that module.

Separate Storage from Logic

With full support of lambdas in Move, one will be able to create code like below, which is currently not possible in Move. In this example, the object module owns the global storage and the access policies for Data, but only the current module can work with Data:

module 0747::app {
    use 0x1::object;
    struct Data { .. }
    entry fun action(obj: object::Object<Data>) {
        object::with_mut(obj, |data: &mut Data| work(data));
        // shortcut: object::with_mut(obj, work)
    }
    fun work(data: &mut Data) { .. }
}

Here we separate storage management and application logic. This is not possible in Move today: inline functions cannot be used for this, and references to global storage cannot be returned by functions. Rather, one has to do something like let addr = obj.address_of(); work(&mut Data[addr]);, giving the object module little ability to manage the actual storage.

Storable Function Values

Another important application is storable function values, that is, functions which are stored in resources on chain. With storable function values, one can construct dispatch tables which delegate functionality to other contracts without the need for listing those contracts as explicit dependencies. Today, the Dispatchable Token Standard (AIP-73) is already an application of this feature; however, the current implementation is special-cased for function types.

Storable function values are particularly important for DeFi exchange apps. Those apps need to be able to deal with an unbounded number of assets which are not known upfront. With storable function values, one can register a new asset with custom processing logic in a contract which is already deployed on chain.

Out of Scope

The following features are out of scope, and may be added in later versions:

Capturing of references for lambdas. Values captured by a lambda need to be moved or copied into the closure representing the function value. Notice that this also implies that a lambda cannot modify any of the values in its context. This restriction might be lifted once references in structures are supported.
Dynamic linking. With the Dispatchable Token Standard, the feature of a FunctionInfo was introduced, which can be dynamically constructed from strings. This feature is not yet supported via this AIP, but is expected to be added on top of it later.

High-Level Overview

The VM is extended by a set of new instructions which allow constructing and executing closures. A closure is a new kind of value in the VM which consists of a reference to a function as well as a list of captured arguments. In order to execute a closure, the captured arguments are augmented by any additional provided arguments, and then the underlying function is executed.

CLOSURE(func, captured_arg)(provided_arg) === func(captured_arg, provided_arg)

Captured arguments don’t need to be consecutive; rather, the closure contains a bitmask specifying to which argument position the captured arguments belong. Details of the bitmask mechanism are described later.
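The bitmask mechanism can be sketched as follows. This is a minimal illustration, not the VM's implementation: the `ClosureMask` wrapper and its `compose` method are hypothetical stand-ins, and the sketch assumes that bit i of the mask refers to argument position i, counted from the least significant bit.

```rust
/// Hypothetical stand-in for the VM's closure mask: if bit i is set,
/// argument position i of the underlying function is captured.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ClosureMask(pub u64);

impl ClosureMask {
    /// Interleaves captured and provided arguments into the final argument
    /// list. Returns None if the argument counts do not fit the mask.
    pub fn compose<T>(self, captured: Vec<T>, provided: Vec<T>) -> Option<Vec<T>> {
        // One captured value is needed per set bit.
        if captured.len() != self.0.count_ones() as usize {
            return None;
        }
        let total = captured.len() + provided.len();
        // Every set bit must address a position inside the final list.
        if total > 64 || (total < 64 && (self.0 >> total) != 0) {
            return None;
        }
        let mut captured = captured.into_iter();
        let mut provided = provided.into_iter();
        let mut args = Vec::with_capacity(total);
        for i in 0..total {
            let next = if self.0 & (1u64 << i) != 0 {
                captured.next()
            } else {
                provided.next()
            };
            args.push(next?);
        }
        Some(args)
    }
}
```

With mask 0b01 the captured value becomes the first argument, and with mask 0b10 it becomes the second, which matches CLOSURE(func, captured_arg)(provided_arg) === func(captured_arg, provided_arg) for the first case.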

A storable closure is one which is based on an upgrade persistent (public) function, where all captured arguments are storable as well. Storable closures can be serialized and deserialized to/from storage. On deserialization of a closure, it is essential to avoid the need to load the code behind the closure; thus deserialized closures need to be resolved and linked against code lazily, the first time they are executed. This allows contracts to store e.g. a map of hundreds of closures, and execute only specific ones per given transaction, without the need for loading the modules of any non-participating closure functions.

Per upgrade persistency it is guaranteed that a closure can always be successfully resolved, avoiding any late binding errors. Furthermore, no type confusion for captured arguments can be introduced.

The reentrancy problem of dynamic dispatch, as it comes with function values, is solved as follows. A module is considered active if a function of that module is currently on the call stack. For any active module, storage owned by the module (that is, resources declared by the module) is locked. Accessing those resources via borrow_global or borrow_global_mut, as well as move_to and move_from, will lead to a runtime error. This allows reentrancy and working on references to resources of a module which are already obtained; however, it disallows borrowing storage again, which would lead to race conditions, and thereby keeps Move’s borrow semantics intact. For a detailed discussion, see later.

Impact

Function values are expected to have a significant impact on the expressiveness of Move. They enable scenarios as discussed in the Motivation section which go beyond what is possible today in Move, and specifically support DeFi apps with Move on Aptos.

Alternative Solutions

There are no real alternatives to function values. Workarounds exist for some use cases, but they are not complete. Not least for this reason, most modern programming languages support them.

As discussed, inline functions can avoid the need to support function values in the VM for filter/map/reduce scenarios. However, because they are expanded at the caller side, the visibility rules in Move make them rather restricted.
If a DeFi app wants to dispatch over multiple supported assets, it can create a large switch expression and do the dispatch ‘manually’. However, this requires linking all the participating assets into the DeFi app, which is not scalable. Moreover, the app will need to be redeployed whenever a new asset is added.
Another workaround is to use the dynamic script composer to link a DeFi app with the assets it supports. In this case, the app will generate a script on-the-fly each time it wants to work with a particular asset. While this is a better solution than statically linking the asset, it is less secure, as it requires the asset to provide fine-grained public APIs to enable this use case. Moreover, it shifts logic from the contract into the client, making the overall logic less transparent and ‘off-chain’.

Specification and Implementation Details

Language and Compiler Support

While the source language design is not part of this AIP, here are a few assumptions which are input for the VM design.

Lambda Lifting

The Move compiler is expected to perform lambda lifting. Given an expression as follows:

let x: u64
let f = |y| x + y

The compiler will introduce a new (private) function and generate a term to create a closure for representing the lambda:

let x: u64
let f = CLOSURE(lifted_fun, 0b01, [x]) // internal expression construct

fun lifted_fun(x: u64, y: u64): u64 { x + y }

Above, 0b01 is a bit mask specifying which arguments of a function are captured by a closure. (If the i-th bit is set, the i-th argument is captured.)

Notice that the function value created by the above lambda expression will not be storable as it is based on a private function.
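The effect of lambda lifting can be mimicked outside of Move. The following Rust sketch is only an analogy: the `Closure` record, its `captured_x` field, and `make_adder` are hypothetical names, and the record hard-codes the case of mask 0b01 (first argument captured).

```rust
/// Lifted form of the lambda `|y| x + y`: the free variable `x`
/// becomes an explicit first parameter.
fn lifted_fun(x: u64, y: u64) -> u64 {
    x + y
}

/// Hypothetical closure record: a function pointer plus the value
/// captured for argument position 0 (mask 0b01).
struct Closure {
    fun: fn(u64, u64) -> u64,
    captured_x: u64,
}

impl Closure {
    /// Calling the closure supplies the remaining argument `y`.
    fn call(&self, y: u64) -> u64 {
        (self.fun)(self.captured_x, y)
    }
}

/// Builds the closure the compiler would generate for `|y| x + y`.
fn make_adder(x: u64) -> Closure {
    Closure { fun: lifted_fun, captured_x: x }
}
```

Calling `make_adder(3).call(4)` evaluates `lifted_fun(3, 4)`, the same result a native lambda capturing `x = 3` would produce.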

Storable Function Values

The compiler detects when it can avoid lambda lifting and then uses the called function directly, in which case the overall function can become storable:

|x| pub_fun(x, _ )  ==> CLOSURE(pub_fun, 0b01, [x])
|y| pub_fun(_, y)   ==> CLOSURE(pub_fun, 0b10, [y])

public fun pub_fun(x: u64, y: u64): u64

Notice that the constructed function values above are storable because they are based on a public function, and the captured arguments are storable as well.

Only allowing public functions to be storable can create security risks, because it incentivizes developers to make functions publicly callable only to guarantee that they persist on upgrade. Hence, this AIP also adds a new attribute #[persistent] on function declarations, marking them upgrade persistent, meaning that on code upgrade they are treated similarly to public functions (their types cannot change, and they cannot be removed):

#[persistent]
fun private_storable_fun(...) { ... }

New Instructions

Three new instructions are added to the bytecode:

enum Bytecode {
  ..
  PackClosure(FunctionHandleIndex, ClosureMask),
  PackClosureGeneric(FunctionInstantiationIndex, ClosureMask),
  CallClosure(SignatureIndex),
}

PackClosure and PackClosureGeneric expect the captured arguments on the stack. CallClosure expects a closure on top of the stack, and underneath any additional provided arguments. The captured arguments in the closure are combined with the provided ones according to the ClosureMask to form the final argument list.

Notice that both the captured and the provided argument lists can be empty. That is, the special cases of a function without any captured arguments, and of an invocation without any additional provided arguments, are handled by the same instructions.
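The stack discipline of these instructions can be illustrated with a toy interpreter. This is a sketch under stated assumptions, not the VM's interpreter: the `Value` and `Function` types, the function table indexed by `usize`, and the helper `sub` are all hypothetical, and the mask is assumed to use bit i for argument position i.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Value {
    U64(u64),
    Closure { fun: usize, mask: u64, captured: Vec<Value> },
}

/// Hypothetical function table entry: arity plus a native body.
struct Function {
    arity: usize,
    body: fn(&[Value]) -> Value,
}

/// PackClosure: pops one captured argument per set mask bit and pushes a closure.
fn pack_closure(stack: &mut Vec<Value>, fun: usize, mask: u64) -> Option<()> {
    let n = mask.count_ones() as usize;
    if stack.len() < n {
        return None;
    }
    let captured = stack.split_off(stack.len() - n);
    stack.push(Value::Closure { fun, mask, captured });
    Some(())
}

/// CallClosure: pops the closure from the top, then the provided arguments
/// underneath it, merges both lists by the mask, and pushes the result.
fn call_closure(stack: &mut Vec<Value>, functions: &[Function]) -> Option<()> {
    let (fun, mask, captured) = match stack.pop()? {
        Value::Closure { fun, mask, captured } => (fun, mask, captured),
        _ => return None,
    };
    let function = functions.get(fun)?;
    let n_provided = function.arity.checked_sub(captured.len())?;
    if stack.len() < n_provided {
        return None;
    }
    let mut captured = captured.into_iter();
    let mut provided = stack.split_off(stack.len() - n_provided).into_iter();
    let mut args = Vec::with_capacity(function.arity);
    for i in 0..function.arity {
        let captured_here = i < 64 && mask & (1u64 << i) != 0;
        args.push(if captured_here { captured.next()? } else { provided.next()? });
    }
    stack.push((function.body)(&args));
    Some(())
}

/// Sample function body: subtraction, so that argument order is observable.
fn sub(args: &[Value]) -> Value {
    match args {
        [Value::U64(a), Value::U64(b)] => Value::U64(a - b),
        _ => panic!("ill-typed call; the bytecode verifier is assumed to rule this out"),
    }
}
```

Starting from a stack holding 3 and then 10, packing `sub` with mask 0b01 captures 10 as the first argument; the subsequent call consumes 3 as the provided second argument and leaves 7 on the stack.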

New Types

A variant for function types is added to the existing type representation — SignatureToken — in the binary format.

enum SignatureToken {
   ..
   Function(Vec<SignatureToken>, Vec<SignatureToken>, AbilitySet),
}

Function types are denoted, for example, as |u64,u32|u8 has store+copy+drop.

The type of PackClosure(fun, mask, captured) is derived as follows:

The argument types are the types of those arguments of fun which are not captured as described by mask. The result type is that of fun.
The abilities are the intersection of the abilities of all captured arguments, further intersected with store+copy+drop if the function is upgrade persistent, and with copy+drop if it is not.

This reflects that a function itself can be always copied and dropped, but only stored if it is guaranteed to be persistent.
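The ability rule can be written down as a small fold. The sketch below reads the rule as an intersection throughout, which is an interpretation of the text above; the bit-flag `AbilitySet` and the function `closure_abilities` are hypothetical and not the VM's types.

```rust
/// Hypothetical bit-flag stand-in for an ability set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AbilitySet(u8);

impl AbilitySet {
    pub const COPY: AbilitySet = AbilitySet(1);
    pub const DROP: AbilitySet = AbilitySet(2);
    pub const STORE: AbilitySet = AbilitySet(4);

    pub const fn union(self, other: AbilitySet) -> AbilitySet {
        AbilitySet(self.0 | other.0)
    }

    pub const fn intersect(self, other: AbilitySet) -> AbilitySet {
        AbilitySet(self.0 & other.0)
    }
}

/// Abilities of a closure: start from what the function itself permits
/// (copy+drop, plus store if it is upgrade persistent) and intersect with
/// the abilities of every captured argument.
pub fn closure_abilities(persistent: bool, captured: &[AbilitySet]) -> AbilitySet {
    let copy_drop = AbilitySet::COPY.union(AbilitySet::DROP);
    let base = if persistent {
        copy_drop.union(AbilitySet::STORE)
    } else {
        copy_drop
    };
    captured.iter().fold(base, |acc, a| acc.intersect(*a))
}
```

Under this reading, a persistent function capturing a value that lacks store yields a closure without store, and a non-persistent function never yields a storable closure regardless of what it captures.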

Type Compatibility

The notion of type compatibility is extended by assignability. A value of type |T1|R1 has A1 is assignable to a location declared with type |T2|R2 has A2 if T1 == T2, R1 == R2, and A1 >= A2, that is, the function value has at least the abilities expected by the location. For example, the value could be a closure on the stack, and the location a parameter of the function being called.

Notice that assignability is not covariant. If function types appear inside other types, for example as type arguments, they are compared with equality. This restriction might be lifted in later iterations; however, it should be noted that languages like Rust also do not support this.
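The assignability rule amounts to equality on argument and result types plus a superset test on abilities. A minimal sketch, assuming types are represented as plain strings and abilities as bit flags (both hypothetical encodings chosen for brevity):

```rust
/// Hypothetical ability flags for this sketch.
pub const COPY: u8 = 1;
pub const DROP: u8 = 2;
pub const STORE: u8 = 4;

/// Simplified function type: argument and result types as names.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FunctionType {
    pub args: Vec<&'static str>,
    pub results: Vec<&'static str>,
    pub abilities: u8,
}

/// A value is assignable to a location if argument and result types are
/// equal and the value has at least the abilities the location requires.
pub fn is_assignable(value: &FunctionType, location: &FunctionType) -> bool {
    value.args == location.args
        && value.results == location.results
        && location.abilities & !value.abilities == 0
}
```

A value of type |u64|u8 has store+copy+drop can thus be passed where |u64|u8 has copy+drop is expected, but not the other way around.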

Bytecode Verification and Runtime Verification

The bytecode verifier ensures statically that the arguments passed to the PackClosure instructions match the argument types declared for the function w.r.t. the provided bit mask.

For the CallClosure(SignatureIndex) instruction, the verifier ensures that the closure on the stack has the type as specified by the signature in the instruction. Suppose fun f(x: u64, y: u8): u32 and a closure CLOSURE(f, 0b10, y), with y:u8. Then the type of the function on stack is expected to be |u64|u32. Notice that the whole purpose of the signature index in CallClosure is to enable this verification; it is not needed for the execution semantics.
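Both checks rest on splitting a function's parameter list by the mask. The sketch below reproduces the example of f and mask 0b10 with types represented as strings; the helper names `split_params`, `check_pack_closure`, and `closure_type` are hypothetical, and abilities are left out.

```rust
/// Splits parameter types into (captured, remaining) according to `mask`,
/// where bit i refers to parameter position i.
pub fn split_params<'a>(params: &[&'a str], mask: u64) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut captured = Vec::new();
    let mut remaining = Vec::new();
    for (i, ty) in params.iter().enumerate() {
        if i < 64 && mask & (1u64 << i) != 0 {
            captured.push(*ty);
        } else {
            remaining.push(*ty);
        }
    }
    (captured, remaining)
}

/// PackClosure check: the types found on the stack must equal the types
/// of the parameters selected by the mask.
pub fn check_pack_closure(params: &[&str], mask: u64, stack_types: &[&str]) -> bool {
    split_params(params, mask).0.as_slice() == stack_types
}

/// Renders the function type of the resulting closure (abilities omitted).
pub fn closure_type(params: &[&str], result: &str, mask: u64) -> String {
    format!("|{}|{}", split_params(params, mask).1.join(","), result)
}
```

For parameters u64 and u8 with result u32, mask 0b10 requires a u8 on the stack when packing and yields the closure type |u64|u32, the type CallClosure would then expect.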

The same checks which are done statically by the bytecode verifier are again performed by the runtime checker, if paranoid mode is enabled.

Value Representation

There are the following requirements for value representation:

Closures constructed during execution need to execute fast. There should be minimal difference to calling a function directly.
When a closure is deserialized from storage (and serialized back), the module behind the closure should not need to be loaded. Rather, loading should not happen until the closure is executed. This enables dispatch tables in storage which may contain hundreds of closures.
Closures should be comparable and printable independent of whether they are resolved or not. This requires the captured arguments to be in deserialized form.

One possible representation of closure values can look as follows:

enum ValueImpl {
    ..
    Closure(LazyLoadedFunction, Vec<ValueImpl>)
}
struct LazyLoadedFunction(Rc<RefCell<LazyLoadedFunctionState>>);
enum LazyLoadedFunctionState {
    Unresolved {
        data: SerializedFunctionData, // See below
        ...
    },
    Resolved {
        fun: LoadedFunction,
        ...
    },
}

With this representation, a closure constructed during execution will be in the Resolved state, whereas a closure deserialized from storage will be in the Unresolved state. Notice that Rc<RefCell<_>> allows cloning a LazyLoadedFunction while sharing the resolution state.
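The sharing of resolution state between clones can be demonstrated with a reduced model. Everything here is a simplified stand-in: `LazyFunction`, `Loader`, the `Code` alias, and the sample function `double` are hypothetical, and the loader is a plain lookup table that counts how often it is consulted.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Stand-in for loaded code; the VM would hold a LoadedFunction here.
pub type Code = fn(u64) -> u64;

enum State {
    Unresolved { module: String, name: String },
    Resolved { code: Code },
}

/// Cloning shares the state cell, so resolving one clone resolves all.
#[derive(Clone)]
pub struct LazyFunction(Rc<RefCell<State>>);

/// Hypothetical loader which counts how often code had to be loaded.
#[derive(Default)]
pub struct Loader {
    table: HashMap<(String, String), Code>,
    pub loads: usize,
}

impl Loader {
    pub fn register(&mut self, module: &str, name: &str, code: Code) {
        self.table.insert((module.to_string(), name.to_string()), code);
    }

    fn load(&mut self, module: &str, name: &str) -> Option<Code> {
        self.loads += 1;
        self.table.get(&(module.to_string(), name.to_string())).copied()
    }
}

impl LazyFunction {
    pub fn unresolved(module: &str, name: &str) -> Self {
        let state = State::Unresolved { module: module.to_string(), name: name.to_string() };
        LazyFunction(Rc::new(RefCell::new(state)))
    }

    pub fn is_resolved(&self) -> bool {
        matches!(*self.0.borrow(), State::Resolved { .. })
    }

    /// Resolves on first use, then calls the code behind the function.
    pub fn call(&self, loader: &mut Loader, arg: u64) -> Option<u64> {
        let pending = match &*self.0.borrow() {
            State::Unresolved { module, name } => Some((module.clone(), name.clone())),
            State::Resolved { .. } => None,
        };
        if let Some((module, name)) = pending {
            let code = loader.load(&module, &name)?;
            *self.0.borrow_mut() = State::Resolved { code };
        }
        let code = match &*self.0.borrow() {
            State::Resolved { code } => *code,
            State::Unresolved { .. } => return None,
        };
        Some(code(arg))
    }
}

/// Sample function registered with the loader in the usage example.
pub fn double(x: u64) -> u64 {
    x * 2
}
```

After one clone has been called, every other clone reports `is_resolved()` and later calls do not touch the loader again, which is the property that keeps large dispatch tables cheap to deserialize.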

Serialization

A closure’s type does not describe how the values captured by the closure are serialized. In order to be able to serialize and deserialize the captured values, type information must be serialized as well. This leads to the following logical representation of a function in storage:

pub struct SerializedFunctionData {
    pub module_id: ModuleId,
    pub fun_id: Identifier,
    pub fun_inst: Vec<TypeTag>,
    pub mask: ClosureMask,
    // Allows to deserialize captured arguments without resolving function
    pub captured_layouts: Vec<MoveTypeLayout>,
}

The alternative to storing the type layout is to leave the captured arguments in serialized native form. However, this conflicts with the requirement to make unresolved closures comparable and printable.
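Why the stored layouts suffice to decode captured values without resolving the function can be seen in a reduced example. The sketch assumes a tiny layout language of three primitives with a fixed little-endian encoding; `Layout`, `Value`, and `decode_captured` are hypothetical and far simpler than MoveTypeLayout.

```rust
/// Hypothetical miniature of a type layout.
#[derive(Clone, Debug, PartialEq)]
pub enum Layout {
    U8,
    U64,
    Bool,
}

/// Decoded values, comparable and printable without any code loaded.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    U8(u8),
    U64(u64),
    Bool(bool),
}

/// Decodes captured arguments using only the layouts stored alongside them.
/// Returns None on truncated, malformed, or trailing input.
pub fn decode_captured(layouts: &[Layout], mut bytes: &[u8]) -> Option<Vec<Value>> {
    let mut values = Vec::with_capacity(layouts.len());
    for layout in layouts {
        let (value, rest) = match layout {
            Layout::U8 => {
                let (b, rest) = bytes.split_first()?;
                (Value::U8(*b), rest)
            }
            Layout::Bool => {
                let (b, rest) = bytes.split_first()?;
                match *b {
                    0 => (Value::Bool(false), rest),
                    1 => (Value::Bool(true), rest),
                    _ => return None,
                }
            }
            Layout::U64 => {
                if bytes.len() < 8 {
                    return None;
                }
                let (head, rest) = bytes.split_at(8);
                let mut buf = [0u8; 8];
                buf.copy_from_slice(head);
                (Value::U64(u64::from_le_bytes(buf)), rest)
            }
        };
        values.push(value);
        bytes = rest;
    }
    if bytes.is_empty() {
        Some(values)
    } else {
        None
    }
}
```

Because the decoder consults only `layouts` and the raw bytes, two unresolved closures can be compared or printed by comparing or printing their decoded captured values.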

Reentrancy Check

Reentrancy is considered a problem for two reasons:

Reentrancy is a well-known dangerous pattern via callbacks which are able to modify state shared between caller and callback without referential transparency. For example, a callback silently modifies a balance which is in an intermediate state.
There is also a problem inherent to the Move borrow semantics and references to global storage. Currently, a module executing a function which borrows a resource from the module cannot be re-entered, because the module usage relation in Move is acyclic. This fact is leveraged in Move’s borrow semantics in that the acquires check is localized to a module. However, with function values, this assumption no longer holds: modules can be re-entered.

This problem is addressed by a new form of reentrancy check in the runtime. In contrast to the existing one for Dispatchable Token Standard, this check allows re-entrance as long as no resources of the module are acquired.

Concretely, let [M1..Mn] be the stack of active modules which are visited at runtime. Then for any resource Mi::R, with i < n, acquiring the resource via move_from, borrow_global, or borrow_global_mut will lead to a runtime error.

How is the stack of active modules maintained?

For a call to a function in a different module, that target module is pushed on the stack.
For a direct call to a function in the same module, the stack is unchanged.
For an indirect call via a closure to a function in the same module, the target module is pushed on the stack. (Notice that this can lead to the same module appearing multiple times on the stack.) This reflects that on indirect calls, the acquires information is not known, thus those calls have to be treated like external calls.
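The three rules above, together with the access check on Mi::R for i < n, can be modeled with a plain vector. This is a sketch of the described behavior only: `ActiveModules`, `CallKind`, and the method names are hypothetical, and modules are identified by name.

```rust
/// How a call reaches its target.
pub enum CallKind {
    Direct,
    Closure,
}

/// Hypothetical model of the stack of active modules [M1..Mn].
#[derive(Default)]
pub struct ActiveModules {
    stack: Vec<String>,
}

impl ActiveModules {
    /// Applies the push rules. Returns true if the target was pushed, so
    /// that the matching return can pop it again.
    pub fn on_call(&mut self, target: &str, kind: CallKind) -> bool {
        let same_module = self.stack.last().map_or(false, |m| m == target);
        if same_module && matches!(kind, CallKind::Direct) {
            // Direct call within the same module: stack unchanged.
            false
        } else {
            // Cross-module call, or any call via a closure.
            self.stack.push(target.to_string());
            true
        }
    }

    pub fn on_return(&mut self, pushed: bool) {
        if pushed {
            self.stack.pop();
        }
    }

    /// Acquiring a resource declared by `owner` fails if `owner` occurs
    /// anywhere below the top of the stack (Mi with i < n).
    pub fn check_resource_access(&self, owner: &str) -> Result<(), String> {
        let below_top = &self.stack[..self.stack.len().saturating_sub(1)];
        if below_top.iter().any(|m| m == owner) {
            Err(format!("reentrancy: resources of `{}` are locked", owner))
        } else {
            Ok(())
        }
    }
}
```

If a module caller invokes another module callee, which then runs a closure pointing back into caller, the stack becomes [caller, callee, caller] and any attempt to acquire a resource of caller is rejected; once both calls return, access is allowed again.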

As an example, consider the following code:

module caller {
  use callee;
  struct R { count: u64 } has key;
  fun calling() acquires R {
     let r = &mut R[@addr];
     // This callback is OK, because `R` is not acquired
     callee::call_me(r, |x| do_something(x));
     // This callback will lead to reentrancy runtime error
     callee::call_me(r, |_| R[@addr].count += 1);
     r.count += 1
  }
  fun do_something(r: &mut R) { .. }
}

module callee {
  // Must be public so that `caller` can invoke it
  public fun call_me<T>(x: &mut T, action: |&mut T|) {
    action(x)
  }
}

Notice that this treatment not only addresses the Move borrow semantics problem (2) but also captures the methodological problem of reentrancy in (1). It is safe to allow a callback in a function like do_something because the state it can modify is referentially transparent via the &mut R parameter.

Module Lock for Reentrancy

The default reentrancy behavior as described in the last section disallows modification of resources owned by the re-entered module, but not by other modules. Here is an example of the classical reentrancy exploit created by this:

module account {
    ...
}
module caller {
  fun transfer(from: address, to: address, amount: u64, notify: |u64|) {
    account::deposit(to, amount);
    notify(amount);
    account::withdraw(from, amount);
  }
}

To prevent this pattern from happening (accidentally), the #[module_lock] attribute is introduced:

#[module_lock]
fun transfer(from: address, to: address, amount: u64, notify: |u64|)

All code executed within a function marked with this attribute is disallowed from re-entering the same module from an outside caller. Thus, the transfer function cannot be recursively called from the notify(amount) callback. This is stronger than, and entails, the default reentrancy check. Notice that the default reentrancy check is overridden for all transitively called code.

The #[module_lock] attribute is recommended for functions which can be re-entered via untrusted code. Linting rules will be required to guide usage of this attribute. Making #[module_lock] the default, however, would prevent many useful programming patterns with function values.

Notice that the #[module_lock] reentrancy behavior is the same as implemented for the Dispatchable Token Standard.
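One way to picture the module lock is as a per-module counter consulted on every cross-module call. The sketch below is an interpretation of the description above rather than the actual checker: `ModuleLocks` and its methods are hypothetical, and the sketch assumes that only calls arriving from a different module are rejected while a lock is held.

```rust
use std::collections::HashMap;

/// Hypothetical registry of modules locked by #[module_lock] functions
/// which are currently executing.
#[derive(Default)]
pub struct ModuleLocks {
    locked: HashMap<String, usize>,
}

impl ModuleLocks {
    /// Called when a #[module_lock] function of `module` starts executing.
    pub fn lock(&mut self, module: &str) {
        *self.locked.entry(module.to_string()).or_insert(0) += 1;
    }

    /// Called when that function returns.
    pub fn unlock(&mut self, module: &str) {
        if let Some(count) = self.locked.get_mut(module) {
            *count = count.saturating_sub(1);
        }
    }

    /// Rejects calls which enter a locked module from another module.
    pub fn check_call(&self, from: &str, to: &str) -> Result<(), String> {
        let is_locked = self.locked.get(to).map_or(false, |n| *n > 0);
        if from != to && is_locked {
            Err(format!("module `{}` is locked against reentrancy", to))
        } else {
            Ok(())
        }
    }
}
```

While transfer runs under the lock, the module caller may still call out to account, but a callback living in some other module cannot call back into caller until transfer has returned.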

Testing

This feature needs to be tested through the layers of the stack: code serialization, bytecode verification, value and closure serialization, and multiple levels of end-to-end tests.

Reference Implementation

The feature is code complete; the following PRs are involved:

🚀 #16007 [move][closures] Persistent functions and module lock +979/-193
🚀 #16000 [move 2.2][closures] Support function type wrappers in the language +813/-574
🚀 #15987 [move][closures] End-to-end tests and feature gating +518/-123
🚀 #15963 [move-vm][closures] New reentrancy checker +488/-40

🚀 #15917 [compiler][closures] Update compiler to most recent function value design +12511/-39088
🚀 #15680 [move-vm][closures] Interpreter +429/-63
🚀 #15670 [move-vm][closures] Type and Value representation and serialization +2279/-433
🚀 #15669 [move-cm][closures] Refactor: Move type conversions out of `Loader` into a trait +635/-521
🚀 #15668 [move-vm][closures] Refactor: move abilities and closure mask into core types +752/-622
🚀 #15667 [move-vm][closures] Types and opcodes for closures +898/-86
🚀 #15777 [move-vm] Increase bytecode VERSION_MAX to v8 +418/-399

Risks and Drawbacks

Function values are a powerful new feature in the Move VM, which enables a whole set of new paradigms. With the added expressiveness comes a new responsibility to avoid creating overly complex code. This can be addressed by good documentation and examples for the feature.

Security Considerations

This feature is a relatively isolated extension to the Move VM with few feature interactions. Existing tests and a rich set of new tests targeting the new feature specifically should be sufficient to ensure its correct functionality. The main focus of security analysis should be on ensuring that type confusion via captured parameters is excluded, and that the new code is robust and does not crash.

Future Potential

Possible future extensions include:

Enabling the capture of references (mutable or immutable). This will also allow modifying context parameters in closures. This feature would go hand-in-hand with the planned feature of allowing references inside of structs.
Supporting dynamic linking of function values.