

 				JIT Compiler Design


   Copyright (C) 2006 Pekka Enberg

   This file is released under the GPLv2.


 Introduction
 ============

 The compiler is divided into the following passes: control-flow graph
 construction, bytecode parsing, instruction selection, and code
 emission.  The compiler first analyzes the given bytecode sequence to
 find basic blocks and construct the control-flow graph.  This pass is
 done first to simplify parsing of bytecode branches.  The bytecode
 sequence is then parsed and converted to an expression tree.  The tree
 is given to the instruction selector, which lowers the IR to
 three-address code.  The code emission phase converts that sequence to
 executable machine code.
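
 As an illustration of the first pass, the sketch below marks basic
 block leaders in a simplified instruction stream.  The opcode set,
 struct toy_insn, and both function names are hypothetical and chosen
 for this example only; real bytecode is variable-length and has many
 more branch instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy opcodes, one instruction per array slot. */
enum toy_opcode { TOY_NOP, TOY_GOTO, TOY_IF, TOY_RETURN };

struct toy_insn {
	enum toy_opcode op;
	size_t target;		/* Branch target index; used by GOTO and IF. */
};

/*
 * Mark the first instruction of every basic block in leaders[].
 * A leader is the first instruction, any branch target, and any
 * instruction that follows a branch or a return.
 */
static void find_leaders(const struct toy_insn *code, size_t nr, bool *leaders)
{
	size_t i;

	for (i = 0; i < nr; i++)
		leaders[i] = false;

	if (nr > 0)
		leaders[0] = true;

	for (i = 0; i < nr; i++) {
		bool is_branch = code[i].op == TOY_GOTO || code[i].op == TOY_IF;

		if (is_branch) {
			assert(code[i].target < nr);
			leaders[code[i].target] = true;
		}
		if ((is_branch || code[i].op == TOY_RETURN) && i + 1 < nr)
			leaders[i + 1] = true;
	}
}

/* The number of basic blocks equals the number of leaders. */
static size_t count_basic_blocks(const bool *leaders, size_t nr)
{
	size_t i, count = 0;

	for (i = 0; i < nr; i++)
		if (leaders[i])
			count++;
	return count;
}
```

 Because every branch target is already a block boundary once this
 pass has run, a later parsing pass never has to split a block when it
 meets a branch.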

 Programs are compiled one method at a time.  Invocation of a method is
 replaced with an invocation of a special per-method JIT trampoline
 that is responsible for compiling the actual target method upon first
 invocation.
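
 The trampoline idea can be sketched in portable C with function
 pointers.  This is only a model under stated assumptions: a real JIT
 trampoline is generated machine code, and struct toy_method and all
 toy_* functions below are hypothetical names, not part of the
 compiler.

```c
#include <assert.h>
#include <stddef.h>

struct toy_method;

typedef int (*toy_entry_fn)(struct toy_method *method, int arg);

struct toy_method {
	toy_entry_fn entry;	/* Trampoline at first, compiled code later. */
	int nr_compiles;	/* How many times the method was compiled. */
};

/* Stand-in for the machine code a real JIT would generate. */
static int toy_compiled_code(struct toy_method *method, int arg)
{
	(void) method;
	return arg * 2;
}

/* Stand-in for the compiler: returns the compiled entry point. */
static toy_entry_fn toy_jit_compile(struct toy_method *method)
{
	method->nr_compiles++;
	return toy_compiled_code;
}

/* Per-method trampoline: compile on first call, then forward the call. */
static int toy_trampoline(struct toy_method *method, int arg)
{
	method->entry = toy_jit_compile(method);
	return method->entry(method, arg);
}

static void toy_method_init(struct toy_method *method)
{
	method->entry = toy_trampoline;
	method->nr_compiles = 0;
}

/* Callers always go through the entry pointer. */
static int toy_invoke(struct toy_method *method, int arg)
{
	assert(method->entry != NULL);
	return method->entry(method, arg);
}
```

 After toy_method_init(), the first toy_invoke() compiles the method
 and every later call reaches the compiled code directly, so the
 method is compiled at most once.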


 Intermediate Representations
 ============================

 The compiler has two intermediate representations: the expression tree
 and three-address code.  The expression tree is constructed from the
 bytecode sequence of a method, whereas three-address code is the
 result of instruction selection.  Three-address code is then
 translated to executable machine code.

 The JIT compiler operates on one method at a time; each method being
 compiled is called a compilation unit.  A compilation unit is made up
 of one or more basic blocks, which represent straight-line code.  Each
 basic block has a list of one or more statements, each of which is
 either standalone or operates on one or two expression trees.
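
 The containment described above might be modelled as follows.  All
 type, field, and function names are assumptions made for this sketch
 and are not taken from the compiler's sources; the three statement
 kinds merely stand for the zero, one, and two operand cases.

```c
#include <assert.h>
#include <stddef.h>

enum toy_expr_type { TOY_EXPR_VALUE, TOY_EXPR_LOCAL, TOY_EXPR_ADD };

struct toy_expr {
	enum toy_expr_type type;
	long value;			/* Constant, or local variable index. */
	struct toy_expr *left, *right;	/* Operands of ADD, otherwise NULL. */
};

enum toy_stmt_type { TOY_STMT_VOID_RETURN, TOY_STMT_RETURN, TOY_STMT_STORE };

struct toy_stmt {
	enum toy_stmt_type type;
	struct toy_expr *operands[2];	/* Unused slots are NULL. */
	struct toy_stmt *next;
};

struct toy_basic_block {
	struct toy_stmt *stmts;		/* Straight-line statement list. */
	struct toy_basic_block *next;
};

struct toy_compilation_unit {
	struct toy_basic_block *blocks;	/* One unit per method. */
};

/* Number of nodes in one expression tree. */
static size_t toy_expr_size(const struct toy_expr *expr)
{
	if (!expr)
		return 0;
	return 1 + toy_expr_size(expr->left) + toy_expr_size(expr->right);
}

/* Standalone statements have no operands; others have one or two. */
static size_t toy_stmt_nr_operands(const struct toy_stmt *stmt)
{
	switch (stmt->type) {
	case TOY_STMT_VOID_RETURN:	return 0;
	case TOY_STMT_RETURN:		return 1;
	case TOY_STMT_STORE:		return 2;
	}
	return 0;
}

/* Total number of expression nodes in a compilation unit. */
static size_t toy_cu_nr_exprs(const struct toy_compilation_unit *cu)
{
	const struct toy_basic_block *bb;
	const struct toy_stmt *stmt;
	size_t i, total = 0;

	for (bb = cu->blocks; bb; bb = bb->next)
		for (stmt = bb->stmts; stmt; stmt = stmt->next)
			for (i = 0; i < toy_stmt_nr_operands(stmt); i++)
				total += toy_expr_size(stmt->operands[i]);
	return total;
}
```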

 The instruction selector emits three-address code for a compilation
 unit from the expression tree.  This intermediate representation is
 essentially a sequence of instructions that mimic the native
 instruction set.  One notable exception is branch targets, which are
 represented as pointers to instructions.  The pointers are converted
 to real machine code targets by back-patching during code emission.
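
 A minimal sketch of back-patching follows, assuming a toy encoding in
 which every instruction is one opcode byte and a branch carries one
 extra byte holding a displacement relative to the next instruction.
 struct tac_insn and toy_emit() are hypothetical names, and this
 simplified variant patches all branches in a second step rather than
 as each target is emitted.

```c
#include <assert.h>
#include <stddef.h>

enum toy_op { TOY_OP_NOP = 0x90, TOY_OP_JMP = 0xeb };

struct tac_insn {
	enum toy_op op;
	struct tac_insn *branch_target;	/* IR-level target, NULL if none. */
	size_t offset;			/* Buffer offset, set during emission. */
};

/*
 * Emit insns[] into buf[] and return the number of bytes written.
 * Assumes every displacement fits in a signed 8-bit value.
 */
static size_t toy_emit(struct tac_insn *insns, size_t nr,
		       unsigned char *buf, size_t size)
{
	size_t pos = 0, i;

	/* Step 1: emit code, leaving a placeholder for branch operands. */
	for (i = 0; i < nr; i++) {
		insns[i].offset = pos;
		assert(pos < size);
		buf[pos++] = (unsigned char) insns[i].op;
		if (insns[i].branch_target) {
			assert(pos < size);
			buf[pos++] = 0;		/* Patched in step 2. */
		}
	}

	/* Step 2: back-patch now that every target offset is known. */
	for (i = 0; i < nr; i++) {
		struct tac_insn *target = insns[i].branch_target;
		size_t next;

		if (!target)
			continue;

		next = insns[i].offset + 2;	/* Insn after the branch. */
		buf[insns[i].offset + 1] =
			(unsigned char) (target->offset - next);
	}

	return pos;
}
```

 Keeping branch targets as pointers until this point means earlier
 passes never need to know instruction sizes or buffer offsets.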
