| |
| JIT Compiler Design |
| |
| |
| Copyright (C) 2006 Pekka Enberg |
| |
| This file is released under the GPLv2. |
| |
| |
| Introduction |
| ============ |
| |
| The compiler is divided into the following passes: control-flow graph |
| construction, bytecode parsing, instruction selection, and code |
| emission. The compiler analyzes the given bytecode sequence to find |
| basic blocks for constructing the control-flow graph. This pass is |
| done first to simplify parsing of bytecode branches. Bytecode |
| sequence is then parsed and converted to an expression tree. The tree |
| is given to the instruction selector to lower the IR to three-address |
| code. Code emission phase converts that sequence to machine code |
| which can be executed. |
| |
| Programs are compiled one method at a time. Invocation of a method is |
| replaced with an invocation of a special per-method JIT trampoline |
| that is responsible for compiling the actual target method upon first |
| invocation. |
| |
| |
| Intermediate Representations |
| ============================ |
| |
| The compiler has two intermediate representations: expression tree and |
| three-address code. The expression tree is constructed from bytecode |
| sequence of a method whereas three-address code is the result of |
| instruction selection. Three-address code is translated to executable |
| machine code. |
| |
| The JIT compiler operates on one method at a time called a compilation |
| unit. A compilation unit is made up of one or more basic blocks which |
| represent straight-line code. Each basic block has a list of one or |
| more statements that can either be standalone or operate on one or two |
| expression trees. |
| |
| The instruction selector emits three-address code for a compilation |
| unit from the expression tree. This intermediate representation is |
| essentially a sequence of instructions that mimic the native |
| instruction set. One notable exception is branch targets which are |
| represented as pointers to instructions. The pointers are converted |
| to real machine code targets with back-patching during code emission. |