4.4.2. Remote Method Invocation
The Java Remote Method Invocation (RMI) [Remote Method Invocation Specification] mechanism enables distributed programming by allowing methods of remote Java objects to be invoked from other JVMs, possibly on different physical hosts. A Java program can invoke methods on a remote object once it obtains a reference to that object. The reference is obtained either by looking up the remote object in the bootstrap naming service provided by RMI (the RMI registry), or by receiving it as an argument or a return value of another remote call. RMI uses object serialization to marshal and unmarshal parameters and return values.
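The lookup-then-invoke pattern described above can be sketched in a single JVM. This is only an illustrative sketch: the `Adder` interface, the `AdderImpl` class, the binding name "adder", and the use of the loopback address are assumptions made for the demo, and the registry is queried in-process rather than from a separate client. The call on the returned stub still goes through the RMI transport.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RmiSketch {
    /** Hypothetical remote interface: remote methods must declare RemoteException. */
    public interface Adder extends Remote {
        int add(int a, int b) throws RemoteException;
    }

    /** Hypothetical server-side implementation of the remote interface. */
    static final class AdderImpl implements Adder {
        @Override
        public int add(int a, int b) {
            return a + b;
        }
    }

    /** Exports an Adder, binds it in a registry, looks it up, and invokes it via its stub. */
    static int addRemotely(int a, int b) {
        // Demo assumption: advertise the loopback address in stubs.
        // This only takes effect if set before the RMI runtime initializes.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Adder impl = new AdderImpl();
        Registry registry = null;
        try {
            // Port 0 lets the runtime pick any free port for the registry and the object.
            registry = LocateRegistry.createRegistry(0);
            Adder stub = (Adder) UnicastRemoteObject.exportObject(impl, 0);
            registry.rebind("adder", stub);

            // Obtain the remote reference by name, then invoke a method on it.
            Adder remote = (Adder) registry.lookup("adder");
            return remote.add(a, b);
        } catch (Exception e) {
            throw new IllegalStateException("RMI demo failed", e);
        } finally {
            unexportQuietly(impl);
            unexportQuietly(registry);
        }
    }

    /** Unexports an object so the RMI runtime does not keep the JVM alive. */
    private static void unexportQuietly(Remote obj) {
        if (obj == null) {
            return;
        }
        try {
            UnicastRemoteObject.unexportObject(obj, true);
        } catch (NoSuchObjectException ignored) {
            // The object was never exported; nothing to clean up.
        }
    }

    public static void main(String[] args) {
        System.out.println("2 + 3 computed through the stub = " + addRemotely(2, 3));
    }
}
```

In a real deployment the client would run in a different JVM and would typically obtain the registry with `LocateRegistry.getRegistry(host, port)` before calling `lookup`.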
The current Java RMI is designed to support client-server applications that communicate over TCP-based networks [Postel 1981]. Some of the RMI design goals, however, result in severe performance limitations for high-performance applications in closely connected environments, such as clusters of workstations and distributed memory processors. The Applications and Concurrency Working Group (ACG) of the Java Grande Forum (JGF) [Java Grande Forum Report] assessed the suitability of Java RMI for parallel and distributed computing. JGF proposed a set of recommendations for changes in the Java language, Java libraries, and JVM implementation to make Java more suitable for high-end computing. ACG emphasized improvements in two key areas, object serialization and the RMI implementation, that could improve the performance of parallel and distributed programs based on Java RMI.
Experimental results [Java Grande Forum Report] suggest that up to 30% of the execution time of a Java RMI call is spent in object serialization. Accordingly, ACG recommended a number of improvements, including a slim encoding technique for type information, more efficient serialization of float and double data types, and an enhanced reflection mechanism that requires fewer calls to obtain information about a class. In the current RMI implementation, a new socket connection is created for every remote method invocation, and establishing this connection can take up to 30% of the total execution time of the invocation [Java Grande Forum Report]. The key recommendations of JGF for reducing the connection overhead include improved socket connection management, improved resource management, and improved support for custom transports.
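The cost of carrying type information can be made concrete with a small measurement. The sketch below is not the encoding proposed by JGF; it only compares the number of bytes produced by standard object serialization with the bytes of the raw field data. The `Point` class and the method names are hypothetical and chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationOverhead {
    /** Hypothetical payload: two doubles, i.e. 16 bytes of actual data. */
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Bytes written by standard object serialization (stream header, class descriptor, fields). */
    static int serializedSize(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Bytes written when only the two field values are encoded, with no type information. */
    static int rawSize(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeDouble(p.x);
            out.writeDouble(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Point p = new Point(1.5, -2.0);
        System.out.println("object serialization: " + serializedSize(p) + " bytes");
        System.out.println("raw field data:       " + rawSize(p) + " bytes");
    }
}
```

For small argument objects the type description can outweigh the data itself, which is one reason a slimmer encoding of type information was among the recommendations.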
4.4.3. Java for High-Performance Numeric Computation
The current design and implementation of Java is poorly suited to high-performance numerical applications. To make the performance of Java numerical programs comparable to that obtained with programming languages such as C or Fortran, the numeric features of the Java programming language must be improved. The Numerics Working Group of the Java Grande Forum assessed the suitability of Java for numerical programming and proposed a number of improvements to Java’s language features, including floating-point and complex arithmetic, multidimensional arrays, lightweight classes, and operator overloading [Java Grande Forum Report].
To address these limitations directly, IBM has developed a special library that improves the performance of numerically intensive Java applications [Moreira et al. 2000]. This library takes the form of the Java Array package and is implemented in the IBM HPCJ. The Array package supports Fortran 90-like features such as complex numbers, multidimensional arrays, and a linear algebra library. A number of Java numeric applications that used this library were shown to achieve between 55% and 90% of the performance of corresponding highly optimized Fortran codes.
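To illustrate why true multidimensional arrays matter, the following sketch stores a rectangular matrix in one contiguous `double[]` in row-major order, instead of Java's built-in array of arrays. It is not the API of the IBM Array package; the `FlatMatrix` class and its methods are hypothetical and show only the general idea of a fixed-shape, contiguous layout.

```java
final class FlatMatrix {
    final int rows;
    final int cols;
    private final double[] data; // row-major: element (i, j) lives at index i * cols + j

    FlatMatrix(int rows, int cols) {
        if (rows <= 0 || cols <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    /** Builds a matrix from values listed row by row. */
    static FlatMatrix of(int rows, int cols, double... values) {
        FlatMatrix m = new FlatMatrix(rows, cols);
        if (values.length != rows * cols) {
            throw new IllegalArgumentException("expected " + rows * cols + " values");
        }
        System.arraycopy(values, 0, m.data, 0, values.length);
        return m;
    }

    double get(int i, int j) {
        return data[i * cols + j];
    }

    void set(int i, int j, double value) {
        data[i * cols + j] = value;
    }

    /** Matrix product; the i-k-j loop order walks both operands along contiguous rows. */
    FlatMatrix multiply(FlatMatrix other) {
        if (cols != other.rows) {
            throw new IllegalArgumentException("inner dimensions do not match");
        }
        FlatMatrix result = new FlatMatrix(rows, other.cols);
        for (int i = 0; i < rows; i++) {
            for (int k = 0; k < cols; k++) {
                double aik = data[i * cols + k];
                for (int j = 0; j < other.cols; j++) {
                    result.data[i * other.cols + j] += aik * other.data[k * other.cols + j];
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FlatMatrix a = FlatMatrix.of(2, 2, 1, 2, 3, 4);
        FlatMatrix b = FlatMatrix.of(2, 2, 5, 6, 7, 8);
        FlatMatrix c = a.multiply(b);
        System.out.println(c.get(0, 0) + " " + c.get(0, 1));
        System.out.println(c.get(1, 0) + " " + c.get(1, 1));
    }
}
```

Because the shape is fixed and the storage is contiguous, every row is guaranteed to have the same length and no row can alias another, properties that an array of arrays does not guarantee.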
4.4.4. Garbage Collection
Most JVM implementations use conservative garbage collectors, which are easy to implement but perform rather poorly. A conservative garbage collector cannot always determine where all object references are located. As a result, it must treat any value that might be a reference as one, so that no object that is potentially in use is freed prematurely. Because the collector cannot safely update a value that only might be a reference, it cannot relocate objects, and this inability sometimes leads to memory fragmentation.
To reduce the negative performance impact of garbage collection, Sun’s Hotspot JVM [The Java Hotspot Performance Engine Architecture] implements a fully accurate garbage collection (GC) mechanism. This implementation allows all inaccessible objects to be reclaimed, while the remaining objects can be relocated to eliminate memory fragmentation. Hotspot uses three different GC algorithms to handle garbage collection efficiently. A generational garbage collector is used in most cases to increase the speed and efficiency of garbage collection. Generational garbage collectors cannot, however, handle long-lived objects efficiently. Consequently, Hotspot uses an old-object garbage collector, such as a mark-compact garbage collector, to collect objects that accumulate in the “Old Object” area of the generational garbage collector. The old-object garbage collector is invoked when very little free memory is available or in response to programmatic requests.
In applications that manipulate a large amount of data, long GC pauses are encountered when a mark-compact collector is used. Such pauses may not be acceptable for latency-sensitive or data-intensive Java applications, such as server applications and animations. To solve this problem, Hotspot provides an alternative incremental garbage collector to collect objects in the “Old Object” area. Incremental garbage collectors can potentially eliminate all user-perceived GC pauses by interleaving garbage collection with program execution.
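Which collectors a given JVM actually runs, and how much time they consume, can be observed from within a program through the standard `java.lang.management` API. The sketch below is an illustration under assumptions: the `GcReport` class name and the output format are hypothetical, and the collector names reported depend on the JVM and its configuration, so they may not match the three algorithms described above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcReport {
    /** Returns one summary line per collector registered in the running JVM. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime()
                    + ", pools=" + String.join(", ", gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so that the collectors have some work to do.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }
        // A programmatic request for collection; the JVM treats it as a hint only.
        System.gc();
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

On a generational configuration the report typically lists separate collectors for the young and old areas, and the accumulated collection time gives a rough indication of how much GC pauses contribute to total run time.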