From Stack Machine to Functional Machine: Step 3 (Higher Order Functions)

Loredana Cirstea
8 min readMay 5, 2020

--

Read on Github.

Environment

For illustrating our journey, we will use the Yul language. Yul compiles to both WebAssembly and EVM bytecode.

If you want to run the examples, it can be done with https://remix.ethereum.org:

  • choose Yul as the compiled language, use the raw calldata input, check the return value using the debugger.
  • use the Yul+ plugin to compile, deploy and interact (you will need to comment out the mslice helper function)

The full code source can also be found at https://gist.github.com/loredanacirstea/fc1abd6345a17519455188d2e345f372

Prerequisites

Read the previous articles:

Higher-Order Functions (HOFs)

HOFs are functions that handle other functions as input and/or output arguments. HOFs make functional programming possible.

We are building Taylor: a function/graph-based language, for dType — a decentralized type system. You can find out more about dType and Taylor from our Solidity Summit presentation.

With Taylor, types are recursively applied functions, based on a suit of “native” functions, implemented in a stack-based language, like Yul or WebAssembly.

In order to create type definitions and casting functions, especially for various types of arrays, we needed HOFs such as map, reduce, curry (which transforms a function with multiple arguments into a curried function, at runtime).

In this article, we are exploring how HOFs can be implemented in a stack-based language. We will use Yul, but a similar approach can also be used for WebAssembly. We will build upon the code that we created in the previous steps, using recursive apply and currying.

Map & Curry

A good use case example for currying functions is the map function, which receives a function and an array as input arguments.

Given an array of integers, if we want to apply map over the array and increase each element with 2, we can use a curried sum function, to make our code reusable:

const sumCurried = a => b => a + b
const sumPartial = sumCurried(2)
const arr = [4, 7, 8, 2, 10]
const newarr = map(sumPartial, arr)
// newarr: [6, 9, 10, 4, 12]

Let’s see how our map function looks in Yul:

// map: function_signature, array
case 0xaaaaaaaa {
let internal_fsig, fsig_size := getfSig(input_ptr)
let array_length := mload(add(input_ptr, fsig_size))
let values_ptr := add(add(input_ptr, fsig_size), 32)
// build returned array - add length
mstore(output_ptr, array_length)
// internal_output_ptr is the memory pointer
// at which the next array element is stored
let internal_output_ptr := add(output_ptr, 32)
result_length := 32
for { let i:= 0 } lt(i, array_length) { i := add(i, 1) } {
// Execute function on array element
let interm_res_length := executeInternal(
internal_fsig,
values_ptr,
32,
internal_output_ptr,
virtual_fns
)
// Move array values pointer & the resulting array pointer
// to the next element
values_ptr := add(values_ptr, 32)
internal_output_ptr := add(internal_output_ptr, interm_res_length)
// Update final result length
result_length := add(result_length, interm_res_length)
}
}

To simplify the pattern, we only expect arrays with uint256 type elements. You will see how Taylor solves HOFs for any type in the second part of the article, but in the current case, using uint256 makes the code simpler and easier to understand.

You can look at the full source code at the bottom of the article, to understand how it works.

To test this code, we will use the following calldata:

0xffffffffcccccccc0000000200000028000000c4bbbbbbbbeeeeeeee0000000000000000000000000000000000000000000000000000000000000002aaaaaaaa00000000000000000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000000000000000000000700000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000a

Broken down, the calldata represents:

ffffffff - execute signature (main entry point)
cccccccc - recursive apply signature
00000002 - number of steps for recursive apply
00000028 - length in bytes for the first step
000000c4 - length in bytes for the second step
bbbbbbbb - first step: curry function signature
eeeeeeee - sum function signature 0000000000000000000000000000000000000000000000000000000000000002
- partially applied argument for sum: 2 aaaaaaaa - second step: map signature 0000000000000000000000000000000000000000000000000000000000000005
- array length 0000000000000000000000000000000000000000000000000000000000000004 0000000000000000000000000000000000000000000000000000000000000007 0000000000000000000000000000000000000000000000000000000000000008 0000000000000000000000000000000000000000000000000000000000000002 000000000000000000000000000000000000000000000000000000000000000a
- array values

The first step, processed by the recursiveApply internal function is currying our sum function, transforming it from const sum = (a, b) => a + b to const sumCurried = a => b => a + b. Internally, we are doing this by storing the sum function signature in memory, along with the first a argument - read about it in Step 2 — Currying article.

In the second step, map receives our sumCurried signature (the memory pointer where the partially applied function data exists) and our array and applies the curried function over each array element.

The final value is written at the given memory output pointer output_ptr.

If you call the contract with the above calldata, the result is:

0x000000000000000000000000000000000000000000000000000000000000000500000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000009000000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000000000000000000000c

Broken down, the result looks like this:

0000000000000000000000000000000000000000000000000000000000000005
- array length 0000000000000000000000000000000000000000000000000000000000000006 0000000000000000000000000000000000000000000000000000000000000009 000000000000000000000000000000000000000000000000000000000000000a 0000000000000000000000000000000000000000000000000000000000000004 000000000000000000000000000000000000000000000000000000000000000c
- new array values

The above is equivalent to our example:

const sumCurried = a => b => a + b
const sumPartial = sumCurried(2)
const arr = [4, 7, 8, 2, 10]
const newarr = map(sumPartial, arr)
// newarr: [6, 9, 10, 4, 12]

Reduce

The reduce function takes in a function of the form (accumulator, currentValue) -> accumulator, an array, and an initial value for the accumulator.

To simplify the pattern, we will only be using arrays of uint256 type elements and the accumulator will also be of uint256 type.

Given an array of integers, we want to apply reduce over the array and calculate the sum of all elements:

const sum = (a, b) => a + bconst arr = [4, 7, 8, 2, 10]
const result = reduce(sum, arr, 0)
// result: 31

Let’s see how our reduce function looks in Yul:

// reduce: function_signature, array, accumulator (initial value)
case 0x99999999 {
let internal_fsig, fsig_size := getfSig(input_ptr)
let new_ptr := add(input_ptr, fsig_size)
// The accumulator is treated as a uint256 for simplicity
let accumulator := mload(new_ptr)
new_ptr := add(new_ptr, 32)
let array_length := mload(new_ptr)
let values_ptr := add(new_ptr, 32)
for { let i:= 0 } lt(i, array_length) { i := add(i, 1) } {
// Store arguments in a temporary memory pointer, for simplicity
let temporary_input_ptr := 6000
mstore(temporary_input_ptr, accumulator)
mstore(add(temporary_input_ptr, 32), mload(values_ptr))
// Output memory pointer is 0x00
let interm_res_length := executeInternal(
internal_fsig,
temporary_input_ptr,
64,
0,
virtual_fns
)
// Read result from the output pointer
accumulator := mload(0)
// Move array values pointer to the next element
values_ptr := add(values_ptr, 32)
}
mstore(output_ptr, accumulator)
result_length := 32
}

You can look at the full source code at the bottom of the article, to understand how it works.

To test this code, we will use the following calldata:

0xffffffffcccccccc00000001000000e899999999eeeeeeee000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000000000000000000000700000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000a

Broken down, the calldata represents:

ffffffff - execute signature (main entry point)
cccccccc - recursive apply signature
00000001 - number of steps for recursive apply
000000e8 - length in bytes for the first step
99999999 - first step: reduce function signature
eeeeeeee - sum signature (reduce function argument) 0000000000000000000000000000000000000000000000000000000000000000
- reduce initial accumulator value 0000000000000000000000000000000000000000000000000000000000000005
- array length 0000000000000000000000000000000000000000000000000000000000000004 0000000000000000000000000000000000000000000000000000000000000007 0000000000000000000000000000000000000000000000000000000000000008 0000000000000000000000000000000000000000000000000000000000000002 000000000000000000000000000000000000000000000000000000000000000a
- array values

There is only one step that the recursiveApply internal function handles: the reduce function is called with the signature for the internal sum function, the initial accumulator value 0 and the array.

reduce then passes the accumulator value and an array element to sum and the result becomes the new accumulator.

The final value is written at the given memory output pointer output_ptr.

If you call the contract with the above calldata, the result is 31:

0x000000000000000000000000000000000000000000000000000000000000001f

Applications in Taylor

We use the map, curry and reduce functions in our Taylor on-chain graph interpreter smart contract, used for defining types. These functions are especially useful for defining array types and building casting functions, which iterate over array elements and cast them to a new type.

Taylor uses a special, typed encoding and decoding format, where values are always preceded by their type. This allows Taylor to do some interesting things:

  • have runtime type checking
  • build generic functions, that can handle multiple types at runtime (Taylor only works with memory pointers)

Our example from the first part of the article had some obvious limitations: we were using only uint256 type values in the arrays and accumulator. Even if we use a general ABI encoding (e.g. Ethereum ABI encoding), it would be hard to impossible to create a generic map or reduce function that would work on any type (especially with varying type sizes - e.g. uint8).

But with Taylor, generic functions are possible and this opens the door towards generic HOFs, that do not need to be defined (and stored on-chain), separately, for each type.

Generic HOFs are the distinct feature of functional systems, that give them their intrinsic power.

Taylor Array Casting Example

To try out the following example, you can interact with the Taylor smart contract deployed on Ropsten, at 0x7D4150f492f93e2eDD7FC0Fc62c9193b322f75e5:

The following is an example of a graph, that can be interpreted by Taylor, which casts an array to another array of the same length, but different array element types. Graphs in Taylor are more complex than in our example from the first part of the article — each graph step (function) can receive as input any of the initial inputs or variables produced by previous steps.

To be used in Taylor, we first store the graph, by calling the Taylor contract with this calldata:

0xfffffffe00000005777777880300000026ee000003000000070000000e000000161100000300000411000003000002220000043333331c0000001f3333331a000000030203003333332800000002040533333332000000020601

Broken down, the calldata represents:

fffffffe - store graph signature
00000005 - length in bytes for the type definition head
77777788 - this graph's signature
03 - steps count
00000026 - length in bytes for hardcoded graph inputs
ee000003 - tuple of 3 elements follows
00000007 - additive sums of lengths for each tuple element
0000000e
00000016
11000003 - type uint24 for slice_size - 02
000004 - slice_size value
11000003 - type uint24 for index - 03
000002 - index value
22000004 - type bytes4 for cast signature
3333331c - cast function signature
0000001f - length in bytes for graph steps
3333331a - selectraw signature
00000003 - how many inputs selectraw will receive
02 - index for finding slice_size in all graph-local variables
03 - index for finding index
00 - index for to_type
33333328 - curry signature
00000002
04 - index for cast signature
05 - index for selectraw signature
33333332 - map signature
00000002
06 - index for curry signature
01 - index for the array to cast from

To summarize: we have a graph with three steps (functions):

  • selectraw - we need this to select the cast_to type for an array element from the new array type. E.g. selecting 11000020 (uint256) from 22000008 44000003 11000020, where 4400000311000020 means uint256[3].
  • curry - curries the cast function & partially applies it to the selectraw result: 11000020 (cast_to type).
  • map - maps over the cast_from array elements and applies the partially applied cast function on each of them. Aggregates the results in an array.

We have stored our graph. Now, we can execute it.

Given the array [2, 5, 4] or type int32[3], we want to cast it to uint256[3]. Given that all the array values are valid unsigned integers that fit into a uint256, it should be possible to achieve this.

The calldata for doing this using our array casting graph is:

0xffffffff77777788ee0000020000000c000000202200000844000003110000204400000312000004000000020000000500000004

Broken down, the calldata represents:

ffffffff - execute function signature, the main entry point
77777788 - our array casting graph signature
ee000002 - tuple of 2 elements following
0000000c - additive sum of lengths in bytes for each tuple element
00000020
22000008 - type bytes8 for to_array
44000003 - to_array signature: uint256[3]
11000020
44000003 - actual from_array starts here, type int32[3]
12000004
00000002
00000005
00000004

If you call the Taylor contract with the above calldata, the result is:

0xee000001000000684400000311000020000000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000000000000000004

Broken down, the result represents:

ee000001 - all inputs & outputs are wrapped in a tupple
00000068
44000003 - final array uint256[3]
11000020
0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000005
0000000000000000000000000000000000000000000000000000000000000004
- array values: [2, 5, 7], left-padded to fit uint256

Full Code

Originally published at https://github.com.

--

--

Loredana Cirstea
Loredana Cirstea

No responses yet