4. HIGH-PERFORMANCE JAVA EXECUTION TECHNIQUES
Direct compilers and bytecode-to-source translators can improve Java performance by generating optimized native code or intermediate language code. However, this high performance may come at the expense of portability and flexibility. JIT compilers, on the other hand, support both portability and flexibility, but they cannot achieve performance comparable to directly compiled code since only limited-scope optimizations can be performed. A number of execution techniques have been developed that attempt to improve Java performance by optimizing the Java source code or the bytecodes so that portability is not lost. Some techniques apply dynamic optimizations to the JIT compilation approach to improve the quality of the compiled native machine code within the limited time available to a JIT compiler. Other techniques improve various features of the JVM, such as thread synchronization, remote method invocation (RMI), and garbage collection, that contribute to the overall performance of Java programs. In this section, we describe these high-performance Java execution techniques.
4.1. Bytecode Optimization
One technique for improving the execution time performance of Java programs is to optimize the bytecodes. The Briki compiler applies a number of optimizations to bytecodes to improve the execution time performance [Cierniak and Li 1997a,b]. The front-end of the offline mode of the Briki compiler [Cierniak and Li 1997a] reconstructs the high-level program structure of the original Java program from the bytecodes. This high-level view of the input bytecodes is stored in an intermediate representation called JavaIR. A number of optimizations can then be applied to this JavaIR. Finally, the optimized JavaIR is transformed back into Java source code and recompiled with a Java compiler.
Converting Java bytecodes to JavaIR is very expensive, however, requiring an order of magnitude more time than the time required to perform the optimizations themselves. In an on-line version of Briki [Cierniak and Li 1997b], the same optimizations are performed while recovering only as much of the structure information as needed, using faster analysis techniques than those used in traditional compilers. This on-line version of Briki is integrated with the Kaffe JIT compiler [Wilkinson, Kaffe v0.10.0], using Kaffe to generate an IR. The compiler analyzes and transforms the Kaffe IR into an optimized IR, which the Kaffe JIT back-end then translates into native code.
One optimization in Briki attempts to improve memory locality by data remapping. Briki groups similar fields within objects into consecutive memory locations, increasing the probability that fields will reside on the same cache line. However, the specific criterion used by the Briki compiler to determine which fields are similar has not been described [Cierniak and Li 1997a,b]. Briki also attempts to remap array dimensions when dimension remapping will improve memory access time. For example, in a two-dimensional array A, location A[i][j] might be remapped to A[j][i] if such a remapping is advantageous. However, this remapping poses a number of problems. First, Java arrays are not standard rectangular arrays but rather are implemented as arrays of arrays. Therefore, array elements in a given dimension are not guaranteed to be of the same size: A[i] might reference an array of size 2 while A[j] references an array of size 3, for instance. Since array remapping is valid only for rectangular arrays, Briki constrains remapping to cases where all dimensions of the array are accessed. Second, subscript expressions in arrays can have potential side effects. For instance, the calculation of a subscript may, as a side effect, assign a value to a variable. Remapping the array dimensions may cause problems since the remapping changes the evaluation order of these expressions. To remedy this problem, Briki evaluates subscript expressions in the correct order before the arrays are actually referenced.
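The effect of dimension remapping, and the rectangularity constraint that Java's array-of-arrays layout imposes on it, can be sketched as follows. This is an illustrative example, not Briki's actual implementation; the class and method names are our own.

```java
// Illustrative sketch (not Briki's implementation) of A[i][j] -> A[j][i]
// dimension remapping. A column-order traversal of the original array
// touches a different inner array on every read; after remapping, the
// same logical traversal reads consecutive elements of one inner array,
// which are more likely to share a cache line.
public class ArrayRemap {
    // Original access pattern: walks A column by column.
    static long sumByColumn(int[][] a) {
        long sum = 0;
        for (int j = 0; j < a[0].length; j++)
            for (int i = 0; i < a.length; i++)
                sum += a[i][j];
        return sum;
    }

    // Same logical sums over the remapped array: the inner loop now
    // scans one inner array sequentially.
    static long sumRemapped(int[][] t) {
        long sum = 0;
        for (int j = 0; j < t.length; j++)
            for (int i = 0; i < t[0].length; i++)
                sum += t[j][i];
        return sum;
    }

    // The remapping is valid only for rectangular arrays: every inner
    // array must have the same length, since Java does not guarantee it.
    static int[][] transpose(int[][] a) {
        int rows = a.length, cols = a[0].length;
        for (int[] row : a)
            if (row.length != cols)
                throw new IllegalArgumentException("ragged array: cannot remap");
        int[][] t = new int[cols][rows];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                t[j][i] = a[i][j];
        return t;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2, 3}, {4, 5, 6}};
        int[][] t = transpose(a);
        System.out.println(sumByColumn(a) == sumRemapped(t)); // true
    }
}
```

The explicit rectangularity check mirrors the constraint described above: a ragged array such as {{1, 2}, {3, 4, 5}} would be rejected rather than remapped incorrectly.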
The DashO Pro [DashO Pro] bytecode optimizer applies several optimization techniques to improve runtime performance and reduce the size of the Java executable. These techniques include shortening the names of all the classes, methods, and fields, removing all unused methods, fields, and constant-pool entries (both constants and instance variables), extracting classes from accessible third-party packages or libraries, and so on. DashO produces one file that contains only the class files needed for the Java program, resulting in smaller and faster Java bytecode files.
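The name-shortening step can be illustrated with a toy sketch. A real tool such as DashO rewrites names inside the class file's constant pool; here we only build the mapping from original identifiers to compact replacements, using hypothetical names of our own.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of identifier shortening: each original name is mapped to
// the next name in a compact sequence (a, b, ..., z, aa, ab, ...).
// Shorter names mean shorter UTF-8 entries in the constant pool, and
// hence a smaller class file.
public class ShortNames {
    // Returns the n-th name in the sequence a, b, ..., z, aa, ab, ...
    static String nth(int n) {
        StringBuilder sb = new StringBuilder();
        n++;
        while (n > 0) {
            n--;
            sb.insert(0, (char) ('a' + n % 26));
            n /= 26;
        }
        return sb.toString();
    }

    // Assigns each distinct name a fresh short replacement, in order.
    static Map<String, String> shorten(List<String> names) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String name : names)
            map.put(name, nth(map.size()));
        return map;
    }

    public static void main(String[] args) {
        System.out.println(shorten(List.of("computeTotal", "renderWidget")));
        // {computeTotal=a, renderWidget=b}
    }
}
```

In a real shrinker the mapping must of course exclude names that are referenced externally (e.g., public API entry points or names looked up via reflection), which the sketch ignores.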
Krintz et al. [1999] proposed a Java class file splitting and prefetching technique to reduce the transfer delay of bytecodes sent over the Internet. Class file splitting partitions a Java class file into separate hot and cold class files to eliminate transferring code in the cold class file that is very rarely (or never) used. Class file prefetching inserts special prefetch instructions into the Java class file to overlap bytecode transfer with execution. These optimizations use compile-time analysis and profiling to select code to split and to insert prefetch instructions. Experimental results showed that class file splitting was able to reduce startup time for Java programs by 10%, while prefetching combined with splitting reduced the overall bytecode transfer delay by 25% on average.
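The hot/cold splitting idea can be sketched at the source level. This is a hypothetical illustration of the principle rather than the transformation of Krintz et al., which operates on class files: a rarely executed path is moved into a separate class, so the JVM only loads (and, over a network, only transfers) that class when the path is actually taken.

```java
// Hypothetical sketch of hot/cold class splitting. The hot class
// contains only the frequently executed code; the cold error-handling
// path lives in a separate class that is loaded lazily on first use,
// deferring its transfer cost.
public class HotPath {
    // Hot code: runs on every call and ships in the main class file.
    public static int compute(int x) {
        if (x >= 0)
            return x * x;
        // Cold path: the JVM resolves and loads ColdPath only when
        // this branch first executes.
        return ColdPath.handleError(x);
    }
}

// In a split deployment this class would live in its own class file.
class ColdPath {
    static int handleError(int x) {
        System.err.println("negative input: " + x);
        return 0;
    }
}
```

If compute is only ever called with non-negative inputs, ColdPath's bytecodes are never transferred at all, which is exactly the saving the splitting technique targets.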
The Java application extractor Jax [Tip et al. 1999] applies several techniques, such as the elimination of redundant methods/fields and class hierarchy specialization [Tip and Sweeney 1997], to reduce the size of a Java class file archive (e.g., zip or jar files). Jax performs a whole-program analysis on a Java class file to determine the classes, methods, and fields that must be retained to preserve program behavior. It then removes the unnecessary components of the class files to reduce the archive size. Additionally, Jax applies a class hierarchy transformation that reduces the archive size by eliminating classes entirely or by merging adjacent classes in the hierarchy. It also replaces original class, method, and field names with shorter names. Once methods are removed from a class file, some entries in the constant pool may become redundant. Thus, when the archive is rewritten after all the transformations are applied, a new constant pool is created from scratch to contain only the classes, methods, fields, and constants that are referenced in the transformed class file. Using Jax on a number of Java archives resulted in a 13.4–90.2% reduction in size [Tip et al. 1999].
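The core of this kind of whole-program analysis is a reachability computation over the call graph: starting from the entry points, transitively mark every method that can be invoked, and anything left unmarked is a candidate for removal. The following is a minimal sketch of that computation under the simplifying assumption that the call graph is already available as a map (Jax builds it from real class files, and must also handle virtual dispatch and reflection); the method names are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Minimal sketch of the reachability analysis behind dead-method
// removal: a worklist traversal that marks every method transitively
// callable from the entry point.
public class Reachability {
    static Set<String> reachable(Map<String, List<String>> callGraph, String entry) {
        Set<String> live = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(entry);
        while (!work.isEmpty()) {
            String m = work.pop();
            if (live.add(m)) // first visit: enqueue its callees
                for (String callee : callGraph.getOrDefault(m, List.of()))
                    work.push(callee);
        }
        return live;
    }

    public static void main(String[] args) {
        // Hypothetical call graph: C.unused calls B.help, but nothing
        // reaches C.unused itself from the entry point.
        Map<String, List<String>> g = Map.of(
            "Main.main", List.of("A.run"),
            "A.run", List.of("B.help"),
            "C.unused", List.of("B.help"));
        Set<String> live = reachable(g, "Main.main");
        System.out.println(live.contains("C.unused")); // false: removable
    }
}
```

Note that C.unused is unreachable even though it calls a live method; liveness propagates from callers to callees, not the reverse, which is why its removal is safe.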