3.2.2. Just-In-Time Compilers
A Just-In-Time (JIT) compiler translates Java bytecodes into equivalent native machine instructions, as shown in Figure 2. This translation is performed at runtime, immediately after a method is invoked. Instead of interpreting the code for an invoked method, the JIT compiler translates the method’s bytecodes into a sequence of native machine instructions, which are executed in place of the bytecodes. The translated code is then cached so that the same method is not translated again on later invocations.
In most cases, executing native code generated by a JIT compiler is more efficient than interpreting the equivalent bytecodes, since an interpreter identifies and interprets a bytecode every time it is encountered during execution. JIT compilation, on the other hand, identifies and translates each instruction only once, the first time a method is invoked. In programs with large loops or recursive methods, the combined JIT compilation and native code execution time can be drastically lower than the time required by an interpreter. Additionally, a JIT compiler can speed up native code execution by optimizing the code it generates, as described in Section 3.2.1.
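The translate-on-first-invocation and caching behavior can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: the class ToyJit, the opcodes INC and DBL, and the use of Java lambdas as a stand-in for native code are all hypothetical and do not correspond to any real JVM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Toy model of JIT dispatch: "bytecode" is an array of opcode strings and
// "native code" is a Java lambda. All names here are illustrative.
class ToyJit {
    private final Map<String, IntUnaryOperator> codeCache = new HashMap<>();
    int translations = 0; // counts how often the translator actually runs

    // Stand-in for the translator: turns a method's opcodes into one callable.
    private IntUnaryOperator translate(String[] bytecodes) {
        translations++;
        IntUnaryOperator compiled = x -> x;
        for (String op : bytecodes) {
            IntUnaryOperator prev = compiled;
            switch (op) {
                case "INC": compiled = x -> prev.applyAsInt(x) + 1; break;
                case "DBL": compiled = x -> prev.applyAsInt(x) * 2; break;
                default: throw new IllegalArgumentException("unknown opcode: " + op);
            }
        }
        return compiled;
    }

    // Invocation path: translate on the first call, reuse the cached code afterwards.
    int invoke(String methodName, String[] bytecodes, int arg) {
        IntUnaryOperator code = codeCache.get(methodName);
        if (code == null) {
            code = translate(bytecodes);
            codeCache.put(methodName, code);
        }
        return code.applyAsInt(arg);
    }
}
```

Invoking the same method name twice runs the translator only once; the second call goes straight to the cached code.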
Since the total execution time of a Java program is a combination of compilation time and execution time, a JIT compiler needs to balance the time spent optimizing generated code against the time it saves by optimization. Code optimization is further limited by the scope of compilation. Since compilation occurs on demand for one class or one method at a time, it is difficult for a JIT compiler to perform nonlocal optimizations. Due to the time and scope restrictions of JIT compilation, JIT optimizations are generally simple ones that are expected to yield reasonably large performance gains compared to the optimization time.
Although JIT compilers are generally more efficient than interpreters, there are still advantages to using an interpreter. One advantage is that interpreters are better suited to debugging programs. Another is that a JIT compiler compiles an entire method at once, while an interpreter processes only the instructions that are actually executed. If only a small percentage of the bytecodes in a method are ever executed and the method is rarely invoked, the time spent on JIT compilation may never be recouped by the reduction in execution time.
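The trade-off between compilation time and execution time can be made concrete with a deliberately simplified cost model. The sketch below assumes a fixed cost per invocation in each mode and a one-time compilation cost. These assumptions, the class name JitCostModel, and all parameter names are introduced here for illustration only.

```java
// Simplified cost model (an assumption for illustration): every invocation
// costs a fixed amount, and compilation is a one-time cost.
class JitCostModel {
    // Total time if the method is interpreted on every invocation.
    static double interpretedTime(long invocations, double interpPerCall) {
        return invocations * interpPerCall;
    }

    // Total time if the method is compiled once and then run natively.
    static double jitTime(long invocations, double compileCost, double nativePerCall) {
        return compileCost + invocations * nativePerCall;
    }

    // Smallest invocation count at which JIT compilation is no slower than
    // interpretation; Long.MAX_VALUE when native code is not faster per call.
    static long breakEvenInvocations(double compileCost, double interpPerCall, double nativePerCall) {
        double savingPerCall = interpPerCall - nativePerCall;
        if (savingPerCall <= 0) return Long.MAX_VALUE;
        return (long) Math.ceil(compileCost / savingPerCall);
    }
}
```

Under this model, a method invoked fewer times than the break-even count never recoups its compilation cost, which is the situation described above for rarely executed methods.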
Current JIT compilers offer a variety of target platforms and features. The Symantec Cafe JIT is included in the Java 2 runtime environment for Windows 95/NT and Netscape Navigator [Symantec]. Microsoft includes a JIT with Internet Explorer for Windows 95/NT and Macintosh [Just-In-Time Compilation]. No published information is available about the implementation details of these JIT compilers.
IBM includes an optimizing JIT compiler in its IBM Developer Kit for Windows [Ishizaki et al. 1999; Suganuma et al. 2000]. This JIT compiler performs a variety of optimizations including method inlining, exception check elimination, common subexpression elimination, loop versioning, and code scheduling. The compiler begins by performing flow analysis to identify basic blocks and loop structure information for later optimizations. The bytecodes are transformed into an internal representation, called extended bytecodes, on which the optimizations are performed.
The optimizations begin with method inlining. Empty method calls originating from object constructors or small access methods are always inlined. To avoid code expansion (and thus, the resulting cache inefficiency), other method calls are inlined only if they are in program hotspots such as loops. Virtual method calls are handled by adding an explicit check to make sure that the inlined method is still valid. If the check fails, standard virtual method invocation is performed. If the referenced method changes frequently, the additional check and the space taken by the inlined code add overhead to the invocation. However, the referenced method is likely to remain the same in most instances, in which case the cost of a virtual method lookup and invocation is eliminated.
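The effect of a guarded inline on a virtual call can be approximated at the source level. The sketch below is not the IBM implementation: the classes Shape, Square, and Circle are hypothetical, and comparing the receiver's class is only one possible form of the validity check, chosen here as an assumption. A real JIT compiler would emit the guard in native code.

```java
// Source-level picture of guarded inlining. The classes and the guard are
// illustrative; a real JIT emits the check in native code.
class Shape {
    double area() { return 0.0; }
}

class Square extends Shape {
    final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }
}

class Circle extends Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override double area() { return Math.PI * radius * radius; }
}

class GuardedInlining {
    // What the programmer wrote: a virtual call inside a hot loop.
    static double totalAreaVirtual(Shape[] shapes) {
        double sum = 0.0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    // What a compiler might produce if Square.area() is the target it inlined.
    static double totalAreaGuarded(Shape[] shapes) {
        double sum = 0.0;
        for (Shape s : shapes) {
            if (s.getClass() == Square.class) {
                Square q = (Square) s;   // guard passed: inlined body, no dispatch
                sum += q.side * q.side;
            } else {
                sum += s.area();         // guard failed: standard virtual invocation
            }
        }
        return sum;
    }
}
```

Both methods compute the same result. The guarded version pays for one extra comparison per call and saves the virtual dispatch whenever the receiver is the expected class.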
Following inlining, the IBM JIT compiler performs general exception check elimination and common subexpression elimination based on program flow information. The number of array bound exception checks is further reduced using loop versioning. Loop versioning creates two versions of a target loop: a safe version with exception checking and an unsafe version without exception checking. Depending on the loop index range test at the entry point of the loop, either the safe or the unsafe version of the loop is executed.
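Loop versioning can be sketched at the source level as follows. Java source cannot switch off the JVM's own array bounds checks, so this example models them with an explicit helper. The class LoopVersioning, the helper checkedLoad, and the counter checksPerformed are assumptions made for illustration and are not taken from the IBM compiler.

```java
// Source-level picture of loop versioning. Java source cannot switch off the
// JVM's own bounds checks, so the checks are modelled by an explicit helper.
class LoopVersioning {
    static int checksPerformed = 0; // instrumentation for the example only

    private static int checkedLoad(int[] a, int i) {
        checksPerformed++;
        if (i < 0 || i >= a.length) throw new ArrayIndexOutOfBoundsException(i);
        return a[i];
    }

    static int sumRange(int[] a, int lo, int hi) {
        int sum = 0;
        if (lo >= 0 && hi <= a.length) {
            // Range test passed: every index in [lo, hi) is valid, so the
            // version without per-iteration checks is executed.
            for (int i = lo; i < hi; i++) sum += a[i];
        } else {
            // Range test failed: fall back to the fully checked version.
            for (int i = lo; i < hi; i++) sum += checkedLoad(a, i);
        }
        return sum;
    }
}
```

The single range test at loop entry replaces one check per iteration whenever the whole index range is known to be in bounds. The checked version preserves the exception behavior for the remaining cases.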
At this point, the IBM JIT compiler generates native x86 machine code based on the extended bytecode representation. Certain stack manipulation semantics are detected in the bytecode by matching bytecode sequences known to represent specific stack operations. Register allocation is applied by assigning registers to stack variables first and then to local variables based on usage counts. Register allocation and code generation are performed in the same pass to reduce compilation time. Finally, the generated native code is scheduled at the basic block level to fit the requirements of the underlying machine.
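The allocation order described above (stack variables first, then local variables by usage count) can be illustrated with a small sketch. It is a loose approximation under several assumptions: the class UsageCountAllocator and its method are hypothetical, the allocation runs as a separate step rather than in the same pass as code generation, and variables that do not receive a register are simply left out of the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UsageCountAllocator {
    // Returns variable -> register name. Variables that do not fit are absent
    // from the map (a real compiler would keep them in memory).
    static Map<String, String> allocate(List<String> stackVars,
                                        Map<String, Integer> localUseCounts,
                                        List<String> registers) {
        Map<String, String> assignment = new LinkedHashMap<>();
        int next = 0;
        // Stack variables get first pick of the registers.
        for (String v : stackVars) {
            if (next == registers.size()) return assignment;
            assignment.put(v, registers.get(next++));
        }
        // Remaining registers go to locals, most frequently used first.
        List<Map.Entry<String, Integer>> locals = new ArrayList<>(localUseCounts.entrySet());
        locals.sort((x, y) -> Integer.compare(y.getValue(), x.getValue()));
        for (Map.Entry<String, Integer> e : locals) {
            if (next == registers.size()) break;
            assignment.put(e.getKey(), registers.get(next++));
        }
        return assignment;
    }
}
```

With one stack variable, three registers, and locals used 2, 9, and 5 times, the two most frequently used locals receive the remaining registers and the least used local receives none.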
Intel includes a JIT compiler with the VTune optimization package for Java that interfaces with the Microsoft JVM [Adl Tabatabai et al. 1998]. The Intel JIT compiler performs optimizations and generates code in a single pass without generating a complete internal representation of the program. This approach speeds up native code generation while limiting the scope of the optimizations to extended basic blocks only. The Intel JIT compiler applies common subexpression elimination (CSE) within basic blocks, local and global register allocation, and limited exception optimizations.
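Common subexpression elimination within a basic block can be sketched on a small three-address form. The example below is a generic textbook-style version, not the Intel implementation. The classes LocalCse and Instr, the "copy" opcode, and the textual expression key are assumptions for illustration, and commutative operand orders (a + b versus b + a) are not recognized as equal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LocalCse {
    // One three-address instruction: dest = left op right ("copy" ignores right).
    static final class Instr {
        final String dest, op, left, right;
        Instr(String dest, String op, String left, String right) {
            this.dest = dest; this.op = op; this.left = left; this.right = right;
        }
        @Override public String toString() {
            return op.equals("copy") ? dest + " = " + left
                                     : dest + " = " + left + " " + op + " " + right;
        }
    }

    // Rewrites repeated computations inside one basic block as copies.
    static List<Instr> eliminate(List<Instr> block) {
        // Expression text -> earlier instruction whose dest still holds its value.
        Map<String, Instr> available = new HashMap<>();
        List<Instr> out = new ArrayList<>();
        for (Instr cur : block) {
            String key = cur.left + " " + cur.op + " " + cur.right;
            Instr earlier = cur.op.equals("copy") ? null : available.get(key);
            out.add(earlier == null ? cur : new Instr(cur.dest, "copy", earlier.dest, ""));
            // Writing cur.dest invalidates every expression that reads it or is held in it.
            available.values().removeIf(i ->
                i.dest.equals(cur.dest) || i.left.equals(cur.dest) || i.right.equals(cur.dest));
            boolean readsOwnDest = cur.left.equals(cur.dest) || cur.right.equals(cur.dest);
            if (earlier == null && !cur.op.equals("copy") && !readsOwnDest) {
                available.put(key, cur);
            }
        }
        return out;
    }
}
```

The second computation of an expression becomes a copy only while both operands and the variable holding the earlier result are unchanged. Any intervening assignment to one of them forces the expression to be recomputed.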