The Ethereum Virtual machine is kind of different than most other Virtual Machines out there. In my previous post I already explained how it’s used and described some of its characteristics.
The Ethereum Virtual Machine (EVM) is a simple but powerful, Turing complete 256bit Virtual Machine that allows anyone to execute arbitrary EVM Byte Code.The go-ethereum project contains two implementations of the EVM. A simple and straightforward byte-code VM and a more sophisticated JIT-VM. In this post I’m going to explain some of the differences between the two implementations and describe some of the characteristics of the JIT EVM and why it can be so much faster than the byte-code EVM.
Go-ethereum’s Byte Code Virtual Machine
The EVM’s internals are pretty simple; it has a single run loop which will attempt to execute the instruction at the current Program Counter (PC in short). Within this loop the Gas is calculated for each instruction, memory is expanded if necessary and executes the instruction if the preamble succeeds. This will continue on until the VM either finishes gracefully or returns with an error by throwing an exception (e.g. out-of-gas).for op = contract[pc] {
if !sufficientGas(op) {
return error("insufficient gas for op:", or)
}
switch op {
case ...:
/* execute */
case RETURN:
return memory[stack[-1], stack[-2]]
}
pc++
}
The EVM has another way to change the program-counter through something called jump-instructions (JUMP & JUMPI). Instead of letting the program-counter increment (pc++) the EVM can also jump to arbitrary positions in the contract code. The EVM knows two jump instructions, a normal jump that reads as “jump to position X” and a conditional jump that read as “jump to position X if condition Y is true”. When either such a jump occurs it must always land on a jump-destination. If the program lands on an instruction other than a jump destination the program fails — in other words, for a jump to be valid it must always be followed by a jump-destination instruction if the condition yielded true.
Prior to running any Ethereum program the EVM iterates over the code and finds all possible jump-destinations, it then puts them in a map that can be referenced by the program-counter to find them. Each and every time the EVM encounters a jump-instructions the jump validity is checked.
As you can see the executing code is relatively easy and simply interpreted by the byte-code VM, we may conclude even that through its sheer simplicity it’s actually pretty dumb.
Welcome JIT VM
The JIT-EVM takes a different approach to running EVM byte-code and is by definition initially slower than the byte-code VM. Before the VM can run any code it must first compile the byte-code in to components that can be understood by the JIT VM.The initialisation- and execution procedure is done in 3-steps:
- We check whether there’s a JIT program ready to be run using the hash of the code — H(C) is used as an identifier to identify the program;
- if a program was found we run the program and return the result;
- if no program was found we run the byte-code and we compile a JIT program in the background.
Initially I tried to check whether the JIT program had finished compiling and move the execution over to the JIT — this all happened during runtime in the same loop using Go’s atomic package — unfortunately it turned out to be slower than letting the byte-code VM run and use the JIT program for every sequential call after the compilation of the program had finished.
By compiling the byte-code in to logical pieces the JIT has the ability to analyse the code more precisely and optimise where and whenever necessary.
For example an incredible simple optimisation that I did was compiling several push operation in to a single instruction. Let’s take the CALL instruction; call requires 7 push instructions — i.e. gas, address, value, input-offset, input-size, return-offset and return-size — prior to executing it, and what I did instead of looping through these 7 instructions, executing them one by one, I’ve optimised this away by taking the 7 instructions and append the 7 values in to a single slice. Now, whenever the start of the 7 push instructions is executed, it instead executes the one optimised instruction by immediately appending the static slice to the VM stack. Now of course this only works for static values (i.e. push 0x10), but these are present in the code quite a lot.
I’ve also optimised the static jump instructions. Static jumps are jumps who always jump to the same position (i.e. push 0x1, jump) and never change under any circumstance. By determining which jumps are static we can pre-check whether a jump is valid and lies within the bounds of the contract and if so we create a new instructions that replaces both the push and jumpinstruction and is flagged as valid. This prevents the VM from having to do two instructions and it prevents it from having to check whether the jump is valid and doing an expensive hash-map lookup for valid jump position.
Next steps
Full stack and memory analysis would also fit nicely in this model where large chunks of code could fit in to single instructions. Further I’d like to add symbolic-execution and turn the JIT in to a proper JIT-VM. I think this would be a logical next step once programs get large enough to take advantage of these optimisations.
Conclusion
Our JIT-VM is a whole lot smarter than the byte-code VM, but is far from being completely done (if ever). There are many more clever tricks we could add with this structure, but simply aren’t realistic for the moment. The runtime is within the bounds of being “reasonable” speedy. May the need arise to further optimise the VM we have the tools to do so.