Day 9 - Assembly

14th Ironman Contest ethereum smart contract solidity blockchain

ALu

2022-09-09 00:08:59

Assembly

Synchronization Link Tree

Intro.

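Honestly, writing this post was fairly painful: without understanding the EVM's architecture it is hard to get into assembly, yet assembly keeps showing up in the Advanced Patterns chapters. Please bear with it. Get through this one, and there are only about ten more to endure, hehe.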

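In my experience, learning assembly comes down to three key points:

Read through the basic assembly syntax once to get a rough idea of what is available
Learn from examples (OpenZeppelin has several efficiency-oriented utils)
Learn how the underlying architecture (the EVM) works along the way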

Something About EVM

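Let's touch on the EVM first. Geth's EVM implementation defines the full set of opcodes, so whenever you run into assembly you don't understand, look it up in the opcode list first!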

These opcodes allow the EVM to be Turing-complete. This means the EVM is able to compute (almost) anything, given enough resources. Because opcodes are 1 byte, there can only be a maximum of 256 (16²) opcodes.
Cited: Compiling SMART CONTRACTS into MACHINE CODE using Ethereum Virtual Machine(EVM)

Stack-manipulating opcodes (POP, PUSH, DUP, SWAP)
Arithmetic/comparison/bitwise opcodes (ADD, SUB, GT, LT, AND, OR)
Environmental opcodes (CALLER, CALLVALUE, NUMBER)
Memory-manipulating opcodes (MLOAD, MSTORE, MSTORE8, MSIZE)
Storage-manipulating opcodes (SLOAD, SSTORE)
Program counter related opcodes (JUMP, JUMPI, PC, JUMPDEST)
Halting opcodes (STOP, RETURN, REVERT, INVALID, SELFDESTRUCT)
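
To make the opcode categories above a little more concrete, here is a minimal sketch that reads three environmental opcodes from inline assembly. The contract name EnvProbe and the function name probe are made up for illustration; only the opcodes themselves come from the list above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract EnvProbe {
    // Returns the caller, the wei sent with the call and the current block
    // number, each read through the matching environmental opcode.
    function probe()
        external
        payable
        returns (address sender, uint256 value, uint256 blockNumber)
    {
        assembly {
            sender := caller()       // CALLER    (msg.sender in Solidity)
            value := callvalue()     // CALLVALUE (msg.value in Solidity)
            blockNumber := number()  // NUMBER    (block.number in Solidity)
        }
    }
}
```

Each opcode appears as a function-style call, which is how inline assembly exposes the instructions in the list.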

YUL

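We usually reach for assembly to get closer to the EVM. Solidity syntax is essentially a set of high-level statements wrapped up for us, so when we want more efficient or more fine-grained operations, assembly gets the job done. The language used for inline assembly in Solidity is called Yul (formerly JULIA or IULIA). Besides serving as inline assembly inside Solidity, Yul is also a standalone intermediate language that can be compiled to bytecode for different backends.

We can process Yul directly with solc, for example through its standalone assembly mode (solc --strict-assembly).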

Syntax

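Inline assembly blocks in Solidity cannot communicate with each other: variables declared inside one assembly { ... } block are local to that block. Inside inline assembly, assignment is written := instead of =, and variables are declared with the untyped let rather than a typed declaration. In the EVM, let first reserves a stack slot that will hold the value assigned to the variable.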

Also note that string literals are limited to 32 bytes (32 ASCII characters).

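Most EVM opcodes are available as built-in functions: bitwise operations, transaction- and contract-related calls, keccak256(), environment values (blockhash, coinbase), and storage and memory access. The stack itself is managed by the compiler through let and function calls.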
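
The syntax rules above can be sketched as follows. The contract name SyntaxDemo and the function demo are hypothetical; the example only shows let, :=, block-local scope and a string literal.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract SyntaxDemo {
    function demo() external pure returns (uint256 doubled, bytes32 label) {
        assembly {
            let x := 21           // `let` declares a local; `:=` assigns
            doubled := mul(x, 2)  // Solidity return variables are visible here
        }
        assembly {
            // `x` from the previous block does not exist in this block.
            // A string literal may be at most 32 bytes and is left-aligned.
            label := "hello yul"
        }
    }
}
```

Calling demo() returns 42 for doubled, and label holds the ASCII bytes of "hello yul" followed by zero padding.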

sload & mstore

sload(key) reads the storage slot numbered key and returns the 32-byte word stored there. For the detailed storage layout, see the Solidity documentation on the layout of state variables in storage.

assembly {
    let v := sload(0) // _value is at slot #0
}

mstore(offset, value) writes the 32-byte value to memory, starting at position offset.

assembly {
    let v := sload(0) // read from slot #0
    mstore(0x80, v) // store v at position 0x80 in memory
    return(0x80, 32) // v is 32 bytes (uint256)
}
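
As a self-contained version of the snippets above, the sketch below wraps sload in a small contract. The contract name SlotReader and both function names are assumptions made for this example; _value is assumed to be the first state variable, so it sits in slot 0.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract SlotReader {
    // First (and only) state variable, so it is stored in slot 0.
    uint256 private _value = 42;

    // Reads slot 0 by its literal slot number.
    function readSlot0() external view returns (uint256 v) {
        assembly {
            v := sload(0)
        }
    }

    // Same read, but lets the compiler supply the slot number
    // through the `.slot` suffix (available since Solidity 0.7).
    function readViaSlotSuffix() external view returns (uint256 v) {
        assembly {
            v := sload(_value.slot)
        }
    }
}
```

Both functions return 42. The .slot form is usually safer, because it keeps working when state variables are reordered.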

Operations

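Arithmetic is straightforward: each operator becomes a function-style opcode such as add, sub, mul or div. Keep in mind that these opcodes do not check for overflow.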

function addAssembly(uint x, uint y) public pure returns (uint) {
    assembly {
        // Add the two arguments with the ADD opcode
        let result := add(x, y)
        // Write the result to memory and return those 32 bytes directly
        mstore(0x0, result)
        return(0x0, 32)
    }
}

function addSolidity(uint x, uint y) public pure returns (uint) {
    return x + y;
}
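
One practical difference between the two versions is overflow handling. The sketch below assumes Solidity 0.8 or later, where the + operator is checked; the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract OverflowDemo {
    // The ADD opcode wraps around modulo 2^256 and never reverts.
    function addWrapping(uint256 x, uint256 y) external pure returns (uint256 r) {
        assembly {
            r := add(x, y)
        }
    }

    // In Solidity 0.8+, `+` reverts with a panic when the sum overflows.
    function addChecked(uint256 x, uint256 y) external pure returns (uint256) {
        return x + y;
    }
}
```

With x = type(uint256).max and y = 1, addWrapping returns 0 while addChecked reverts, so the assembly version is only a safe replacement when overflow is impossible or intended.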

Inline Assembly little example

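Let's use this example from the official Solidity documentation to introduce control flow, statements, functions and loops. Its main point is to gain efficiency by skipping the array bounds check.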

// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.4.16 <0.9.0;

library VectorSum {
    // This function is less efficient because the optimizer currently fails to
    // remove the bounds checks in array access.
    function sumSolidity(uint[] memory data) public pure returns (uint sum) {
        for (uint i = 0; i < data.length; ++i)
            sum += data[i];
    }

    // We know that we only access the array in bounds, so we can avoid the check.
    // 0x20 needs to be added to an array because the first slot contains the
    // array length.
    function sumAsm(uint[] memory data) public pure returns (uint sum) {
        for (uint i = 0; i < data.length; ++i) {
            assembly {
                sum := add(sum, mload(add(add(data, 0x20), mul(i, 0x20))))
            }
        }
    }

    // Same as above, but accomplish the entire code within inline assembly.
    function sumPureAsm(uint[] memory data) public pure returns (uint sum) {
        assembly {
            // Load the length (first 32 bytes)
            let len := mload(data)

            // Skip over the length field.
            //
            // Keep temporary variable so it can be incremented in place.
            //
            // NOTE: incrementing data would result in an unusable
            //       data variable after this assembly block
            let dataElementLocation := add(data, 0x20)

            // Iterate until the bound is not met.
            for
                { let end := add(dataElementLocation, mul(len, 0x20)) }
                lt(dataElementLocation, end)
                { dataElementLocation := add(dataElementLocation, 0x20) }
            {
                sum := add(sum, mload(dataElementLocation))
            }
        }
    }

    // Same as sumPureAsm, but with the loop wrapped in an assembly function.
    // Assembly functions must be defined inside an assembly block within a
    // Solidity function and cannot access variables declared outside of them,
    // so the array pointer is passed in as an argument.
    function sumAsmFunction(uint[] memory data) public pure returns (uint sum) {
        assembly {
            function sumWords(ptr) -> total {
                // Load the length (first 32 bytes)
                let len := mload(ptr)

                // Skip over the length field.
                let dataElementLocation := add(ptr, 0x20)

                // Iterate until the bound is not met.
                for
                    { let end := add(dataElementLocation, mul(len, 0x20)) }
                    lt(dataElementLocation, end)
                    { dataElementLocation := add(dataElementLocation, 0x20) }
                {
                    total := add(total, mload(dataElementLocation))
                }
            }

            sum := sumWords(data)
        }
    }
}

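The for loop works much like its Solidity counterpart, only written differently: the condition, for example, becomes lt(dataElementLocation, end), where lt means "less than". In the same way, an if (a < b) { ... } statement can be written as if lt(a, b) { x := sub(0, x) }.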

As for functions, an assembly function exists only within the scope in which it is declared, so it needs no visibility or state mutability specifier.
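
The control-flow pieces described above can be combined in a short sketch. The names ControlFlowDemo, maxAndDiff and larger are invented for this example. Note that assembly has no else, so switch is used where both branches are needed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract ControlFlowDemo {
    function maxAndDiff(uint256 a, uint256 b)
        external
        pure
        returns (uint256 hi, uint256 diff)
    {
        assembly {
            // An assembly function: no visibility or mutability specifier,
            // and it only sees its own parameters and locals.
            function larger(p, q) -> m {
                m := q
                if gt(p, q) { m := p }
            }

            hi := larger(a, b)

            // `switch` stands in for if/else.
            switch lt(a, b)
            case 1 { diff := sub(b, a) }
            default { diff := sub(a, b) }
        }
    }
}
```

Calling maxAndDiff(3, 10) returns (10, 7), and maxAndDiff(10, 3) returns the same pair.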

Advanced Usage

Dive in function selector

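We already know that a function selector lets us call a function dynamically, without relying on the compiled ABI we are used to. That means we can combine the function selector with the low-level call to make any dynamic call. Suppose we want Contract2.func() to call func() in Contract1: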

function func() public returns (uint32, uint32) {
    // Each element of a memory array occupies a full 32-byte word,
    // so ret spans 64 bytes.
    uint32[2] memory ret;

    address dest = address(contract1);

    bytes4 selector = contract1.func.selector;
    // Or: bytes4 selector = bytes4(keccak256("func(uint256,uint8)"));

    bytes memory data = abi.encodeWithSelector(selector, uint256(789), uint8(123));

    assembly {
        let success := call(
            gas(),         // forward all remaining gas to the callee
            dest,          // address of contract1
            0,             // msg.value
            add(data, 32), // the payload starts right after the 32-byte length field
            mload(data),   // the first 32 bytes of data hold its length
            ret,           // memory address to write the output to
            64             // output size: two ABI-encoded words of 32 bytes each
        )
        if iszero(success) {
            revert(0, 0)
        }
    }

    return (ret[0], ret[1]);
}
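
For comparison, here is a sketch of the same dynamic call written with Solidity's built-in low-level call and abi.decode instead of assembly. The body of Contract1.func is an assumption (the original does not show it); it simply echoes its arguments so the result is easy to check.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Assumed callee: the real Contract1 is not shown in this post.
contract Contract1 {
    function func(uint256 a, uint8 b) external pure returns (uint32, uint32) {
        return (uint32(a), uint32(b));
    }
}

contract Contract2 {
    Contract1 public contract1;

    constructor(Contract1 target) {
        contract1 = target;
    }

    function func() external returns (uint32, uint32) {
        // 4-byte selector followed by the ABI-encoded arguments
        bytes memory data = abi.encodeWithSelector(
            contract1.func.selector,
            uint256(789),
            uint8(123)
        );

        // Low-level call: returns a success flag and the raw return data
        (bool success, bytes memory ret) = address(contract1).call(data);
        require(success, "call failed");

        // The return data is two 32-byte words, one per uint32
        return abi.decode(ret, (uint32, uint32));
    }
}
```

With this Contract1, Contract2.func() returns (789, 123). The assembly version avoids the extra memory copy of the return data, at the cost of handling offsets and sizes by hand.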

Utils as Example

Here are two utilities used in practice for reference.

Merkle Proof

OpenZeppelin-Merkle Proof Hash

function _efficientHash(bytes32 a, bytes32 b) private pure returns (bytes32 value) {
    assembly {
        mstore(0x00, a)
        mstore(0x20, b)
        value := keccak256(0x00, 0x40)
    }
}
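
To see what the assembly version buys us, the sketch below places it next to the plain Solidity equivalent. The contract HashCompare and the functions plainHash and same are hypothetical; efficientHash repeats the body shown above. Memory from 0x00 to 0x3f is the scratch space Solidity reserves for hashing, which is why the function can write there without allocating memory.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract HashCompare {
    // Writes both words into the scratch space and hashes those 64 bytes.
    function efficientHash(bytes32 a, bytes32 b) public pure returns (bytes32 value) {
        assembly {
            mstore(0x00, a)
            mstore(0x20, b)
            value := keccak256(0x00, 0x40)
        }
    }

    // Plain Solidity: allocates a new 64-byte buffer before hashing.
    function plainHash(bytes32 a, bytes32 b) public pure returns (bytes32) {
        return keccak256(abi.encodePacked(a, b));
    }

    // Both functions hash the same 64 bytes, so the results always match.
    function same(bytes32 a, bytes32 b) external pure returns (bool) {
        return efficientHash(a, b) == plainHash(a, b);
    }
}
```

The two functions hash identical input, so same() returns true for any pair of values; the assembly version only skips the memory allocation.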