From Stack Machine to Functional Machine: Step 3 (Higher Order Functions)
Environment
For illustrating our journey, we will use the Yul language. Yul compiles to both WebAssembly and EVM bytecode.
If you want to run the examples, it can be done with https://remix.ethereum.org:
- choose Yul as the compiled language, use the raw
calldata
input, check the return value using the debugger. - use the Yul+ plugin to compile, deploy and interact (you will need to comment out the
mslice
helper function)
The full code source can also be found at https://gist.github.com/loredanacirstea/fc1abd6345a17519455188d2e345f372
Prerequisites
Read the previous articles:
- From Stack Machine to Functional Machine: Step 1 — Recursive Apply
- From Stack Machine to Functional Machine: Step 2 — Currying
Higher-Order Functions (HOFs)
HOFs are functions that handle other functions as input and/or output arguments. HOFs make functional programming possible.
We are building Taylor: a function/graph-based language, for dType — a decentralized type system. You can find out more about dType and Taylor from our Solidity Summit presentation.
With Taylor, types are recursively applied functions, based on a suit of “native” functions, implemented in a stack-based language, like Yul or WebAssembly.
In order to create type definitions and casting functions, especially for various types of arrays, we needed HOFs such as map
, reduce
, curry
(which transforms a function with multiple arguments into a curried function, at runtime).
In this article, we are exploring how HOFs can be implemented in a stack-based language. We will use Yul, but a similar approach can also be used for WebAssembly. We will build upon the code that we created in the previous steps, using recursive apply and currying.
Map & Curry
A good use case example for currying functions is the map
function, which receives a function and an array as input arguments.
Given an array of integers, if we want to apply map
over the array and increase each element with 2
, we can use a curried sum
function, to make our code reusable:
const sumCurried = a => b => a + b
const sumPartial = sumCurried(2)const arr = [4, 7, 8, 2, 10]
const newarr = map(sumPartial, arr)// newarr: [6, 9, 10, 4, 12]
Let’s see how our map
function looks in Yul:
// map: function_signature, array
case 0xaaaaaaaa {
let internal_fsig, fsig_size := getfSig(input_ptr)
let array_length := mload(add(input_ptr, fsig_size))
let values_ptr := add(add(input_ptr, fsig_size), 32) // build returned array - add length
mstore(output_ptr, array_length) // internal_output_ptr is the memory pointer
// at which the next array element is stored
let internal_output_ptr := add(output_ptr, 32)
result_length := 32 for { let i:= 0 } lt(i, array_length) { i := add(i, 1) } {
// Execute function on array element
let interm_res_length := executeInternal(
internal_fsig,
values_ptr,
32,
internal_output_ptr,
virtual_fns
) // Move array values pointer & the resulting array pointer
// to the next element
values_ptr := add(values_ptr, 32)
internal_output_ptr := add(internal_output_ptr, interm_res_length) // Update final result length
result_length := add(result_length, interm_res_length)
}
}
To simplify the pattern, we only expect arrays with uint256
type elements. You will see how Taylor solves HOFs for any type in the second part of the article, but in the current case, using uint256
makes the code simpler and easier to understand.
You can look at the full source code at the bottom of the article, to understand how it works.
To test this code, we will use the following calldata
:
0xffffffffcccccccc0000000200000028000000c4bbbbbbbbeeeeeeee0000000000000000000000000000000000000000000000000000000000000002aaaaaaaa00000000000000000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000000000000000000000700000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000a
Broken down, the calldata
represents:
ffffffff - execute signature (main entry point)
cccccccc - recursive apply signature
00000002 - number of steps for recursive apply
00000028 - length in bytes for the first step
000000c4 - length in bytes for the second step
bbbbbbbb - first step: curry function signature
eeeeeeee - sum function signature 0000000000000000000000000000000000000000000000000000000000000002
- partially applied argument for sum: 2 aaaaaaaa - second step: map signature 0000000000000000000000000000000000000000000000000000000000000005
- array length 0000000000000000000000000000000000000000000000000000000000000004 0000000000000000000000000000000000000000000000000000000000000007 0000000000000000000000000000000000000000000000000000000000000008 0000000000000000000000000000000000000000000000000000000000000002 000000000000000000000000000000000000000000000000000000000000000a
- array values
The first step, processed by the recursiveApply
internal function is currying our sum
function, transforming it from const sum = (a, b) => a + b
to const sumCurried = a => b => a + b
. Internally, we are doing this by storing the sum
function signature in memory, along with the first a
argument - read about it in Step 2 — Currying article.
In the second step, map
receives our sumCurried
signature (the memory pointer where the partially applied function data exists) and our array and applies the curried function over each array element.
The final value is written at the given memory output pointer output_ptr
.
If you call the contract with the above calldata
, the result is:
0x000000000000000000000000000000000000000000000000000000000000000500000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000009000000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000000000000000000000c
Broken down, the result looks like this:
0000000000000000000000000000000000000000000000000000000000000005
- array length 0000000000000000000000000000000000000000000000000000000000000006 0000000000000000000000000000000000000000000000000000000000000009 000000000000000000000000000000000000000000000000000000000000000a 0000000000000000000000000000000000000000000000000000000000000004 000000000000000000000000000000000000000000000000000000000000000c
- new array values
The above is equivalent to our example:
const sumCurried = a => b => a + b
const sumPartial = sumCurried(2)const arr = [4, 7, 8, 2, 10]
const newarr = map(sumPartial, arr)// newarr: [6, 9, 10, 4, 12]
Reduce
The reduce
function takes in a function of the form (accumulator, currentValue) -> accumulator
, an array, and an initial value for the accumulator.
To simplify the pattern, we will only be using arrays of uint256
type elements and the accumulator will also be of uint256
type.
Given an array of integers, we want to apply reduce
over the array and calculate the sum of all elements:
const sum = (a, b) => a + bconst arr = [4, 7, 8, 2, 10]
const result = reduce(sum, arr, 0)// result: 31
Let’s see how our reduce
function looks in Yul:
// reduce: function_signature, array, accumulator (initial value)
case 0x99999999 {
let internal_fsig, fsig_size := getfSig(input_ptr)
let new_ptr := add(input_ptr, fsig_size) // The accumulator is treated as a uint256 for simplicity
let accumulator := mload(new_ptr)
new_ptr := add(new_ptr, 32) let array_length := mload(new_ptr)
let values_ptr := add(new_ptr, 32) for { let i:= 0 } lt(i, array_length) { i := add(i, 1) } {
// Store arguments in a temporary memory pointer, for simplicity
let temporary_input_ptr := 6000
mstore(temporary_input_ptr, accumulator)
mstore(add(temporary_input_ptr, 32), mload(values_ptr)) // Output memory pointer is 0x00
let interm_res_length := executeInternal(
internal_fsig,
temporary_input_ptr,
64,
0,
virtual_fns
) // Read result from the output pointer
accumulator := mload(0)
// Move array values pointer to the next element
values_ptr := add(values_ptr, 32)
}
mstore(output_ptr, accumulator)
result_length := 32
}
You can look at the full source code at the bottom of the article, to understand how it works.
To test this code, we will use the following calldata
:
0xffffffffcccccccc00000001000000e899999999eeeeeeee000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000000000000000000000700000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000a
Broken down, the calldata
represents:
ffffffff - execute signature (main entry point)
cccccccc - recursive apply signature
00000001 - number of steps for recursive apply
000000e8 - length in bytes for the first step
99999999 - first step: reduce function signature
eeeeeeee - sum signature (reduce function argument) 0000000000000000000000000000000000000000000000000000000000000000
- reduce initial accumulator value 0000000000000000000000000000000000000000000000000000000000000005
- array length 0000000000000000000000000000000000000000000000000000000000000004 0000000000000000000000000000000000000000000000000000000000000007 0000000000000000000000000000000000000000000000000000000000000008 0000000000000000000000000000000000000000000000000000000000000002 000000000000000000000000000000000000000000000000000000000000000a
- array values
There is only one step that the recursiveApply
internal function handles: the reduce
function is called with the signature for the internal sum
function, the initial accumulator value 0
and the array.
reduce
then passes the accumulator
value and an array element to sum
and the result becomes the new accumulator
.
The final value is written at the given memory output pointer output_ptr
.
If you call the contract with the above calldata
, the result is 31
:
0x000000000000000000000000000000000000000000000000000000000000001f
Applications in Taylor
We use the map
, curry
and reduce
functions in our Taylor on-chain graph interpreter smart contract, used for defining types. These functions are especially useful for defining array types and building casting functions, which iterate over array elements and cast them to a new type.
Taylor uses a special, typed encoding and decoding format, where values are always preceded by their type. This allows Taylor to do some interesting things:
- have runtime type checking
- build generic functions, that can handle multiple types at runtime (Taylor only works with memory pointers)
Our example from the first part of the article had some obvious limitations: we were using only uint256
type values in the arrays and accumulator. Even if we use a general ABI encoding (e.g. Ethereum ABI encoding), it would be hard to impossible to create a generic map
or reduce
function that would work on any type (especially with varying type sizes - e.g. uint8
).
But with Taylor, generic functions are possible and this opens the door towards generic HOFs, that do not need to be defined (and stored on-chain), separately, for each type.
Generic HOFs are the distinct feature of functional systems, that give them their intrinsic power.
Taylor Array Casting Example
To try out the following example, you can interact with the Taylor smart contract deployed on Ropsten, at 0x7D4150f492f93e2eDD7FC0Fc62c9193b322f75e5
:
- create a new `.js` file in https://remix-alpha.ethereum.org
- copy the following code:
- in the Remix console, execute
remix.exeCurrent()
orremix.execute(filepath_to_your_js_file)
- check out a more detailed explanation of the typed encoding format here: https://github.com/loredanacirstea/taylor/blob/master/SupportedTypes.md
The following is an example of a graph, that can be interpreted by Taylor, which casts an array to another array of the same length, but different array element types. Graphs in Taylor are more complex than in our example from the first part of the article — each graph step (function) can receive as input any of the initial inputs or variables produced by previous steps.
To be used in Taylor, we first store the graph, by calling the Taylor contract with this calldata
:
0xfffffffe00000005777777880300000026ee000003000000070000000e000000161100000300000411000003000002220000043333331c0000001f3333331a000000030203003333332800000002040533333332000000020601
Broken down, the calldata
represents:
fffffffe - store graph signature
00000005 - length in bytes for the type definition head
77777788 - this graph's signature
03 - steps count00000026 - length in bytes for hardcoded graph inputs
ee000003 - tuple of 3 elements follows
00000007 - additive sums of lengths for each tuple element
0000000e
00000016
11000003 - type uint24 for slice_size - 02
000004 - slice_size value
11000003 - type uint24 for index - 03
000002 - index value
22000004 - type bytes4 for cast signature
3333331c - cast function signature0000001f - length in bytes for graph steps
3333331a - selectraw signature
00000003 - how many inputs selectraw will receive
02 - index for finding slice_size in all graph-local variables
03 - index for finding index
00 - index for to_type
33333328 - curry signature
00000002
04 - index for cast signature
05 - index for selectraw signature
33333332 - map signature
00000002
06 - index for curry signature
01 - index for the array to cast from
To summarize: we have a graph with three steps (functions):
selectraw
- we need this to select thecast_to
type for an array element from the new array type. E.g. selecting11000020
(uint256
) from22000008 44000003 11000020
, where4400000311000020
meansuint256[3]
.curry
- curries thecast
function & partially applies it to theselectraw
result:11000020
(cast_to
type).map
- maps over thecast_from
array elements and applies the partially appliedcast
function on each of them. Aggregates the results in an array.
We have stored our graph. Now, we can execute it.
Given the array [2, 5, 4]
or type int32[3]
, we want to cast it to uint256[3]
. Given that all the array values are valid unsigned integers that fit into a uint256
, it should be possible to achieve this.
The calldata
for doing this using our array casting graph is:
0xffffffff77777788ee0000020000000c000000202200000844000003110000204400000312000004000000020000000500000004
Broken down, the calldata
represents:
ffffffff - execute function signature, the main entry point
77777788 - our array casting graph signature
ee000002 - tuple of 2 elements following
0000000c - additive sum of lengths in bytes for each tuple element
00000020
22000008 - type bytes8 for to_array
44000003 - to_array signature: uint256[3]
11000020
44000003 - actual from_array starts here, type int32[3]
12000004
00000002
00000005
00000004
If you call the Taylor contract with the above calldata
, the result is:
0xee000001000000684400000311000020000000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000000000000000004
Broken down, the result represents:
ee000001 - all inputs & outputs are wrapped in a tupple
00000068
44000003 - final array uint256[3]
11000020
0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000005
0000000000000000000000000000000000000000000000000000000000000004
- array values: [2, 5, 7], left-padded to fit uint256
Full Code
Originally published at https://github.com.