Adirect compiler translates either Java source code or byte codes into machine instructions that are directly executable on he designated target processor (refer to Figure 2). The main difference between a direct compiler and a JIT compiler is that the compiled code generated by a direct compiler is available for future executions of the Java program. In JIT compilation, since the translation is done at run-time, compilation speed requirements limit the type and range of code optimizations that can be applied. Direct compilation is done statically, however, and so can apply traditional time-consuming optimization techniques, such as data-flow analysis, interprocedural analysis, and so on [Aho et al. 1986], to improve the performance of the compiled code. However, because they provide static compilation, direct compilers cannot support dynamic class loading. Direct compilation also results in a loss of portability since the generated code can be executed only on the specific target processor. It should be noted that the portability is not completely lostif the original bytecodes or source code are still available, however. Caffeine[Hsieh et al. 1996] is a Java bytecode-to-native code compiler that generates optimized machine code for the X86 architecture. The compilation process involves several translation steps. First, it translates the bytecodes into an internal language representation, called Java IR, that is organized into functions and basic blocks. The Java IR is then converted to a machine-independent IR, calledLcode, using stack analysis, stackto-register mapping, and class hierarchy analysis. The IMPACT compiler [Chang et al. 1991] is then used to generate an optimized machine-independent IR by applying optimizations such as inlining, data-dependence and interclass analysis to improve instruction-level parallelism (ILP). Further, machine-specific optimizations, including peephole optimization, instruction scheduling, speculation, and register allocation, are applied to generate the optimized machine-specific IR. Finally, optimized machine code is generated from this machine-specific IR. Caffeine uses an enhanced memory model to reduce the overhead due to additional indirections specified in the standard Java memory model. It combines the Java class instance data block and the method table into one object block and thus requires only one level of indirection. Caffeine supports exception handling only through array bounds checking. It does not support garbage collection, threads, and the use of graphics libraries. The Native Executable Translation (NET) compiler [Hsieh et al. 1997] extends the Caffeine prototype by supporting garbage collection. It uses a mark-andsweep garbage collector which is invoked only when memory is full or reaches a predefined limit. Thus, NET eliminates the overhead due to garbage collection in smaller application programs that have low memory requirements. NET also supports threads and graphic libraries. The IBMHigh Performance Compiler for Java (HPCJ) [Seshadri 1997] is another optimizing native code compiler targeted for the AIX, OS/2, Windows95, and WindowsNT platforms. HPCJ takes both Java source code and bytecodes as its input. If the Java source code is used as an input, it invokes the AIX JDK’s Java source-to-bytecode compiler (javac) to produce the bytecodes. Java bytecodes are translated to an internal compiler intermediate language (IL) representation. The common back-end from IBM’s XL family of compilers for the RS/6000 is used to translate the IL code into an object module (.o file). The object module is then linked to other object modules from the Java application program and libraries to produce the executable machine code. The libraries in HPCJ implement garbage collection, Java APIs, and various runtime system routines to support object creation, threads, and exception handling. A common back-end for the native code generation allows HPCJ to apply various language-independent optimizations, such as instruction scheduling, common subexpression elimination, intramodular inlining, constant propagation, global register allocation, and so on. The bytecodeto-IL translator reduces the overhead for Java run-time checking by performing a simple bytecode simulation during basic block compilation to determine whether such checks can be eliminated. HPCJ reduces the overhead due to method indirection by emitting direct calls for instance methods that are known to be final or that are known to belong to a final class. It uses a conservative garbage collector since the common back-end does not provide any special support for garbage collection.