Distributed Objects
From an object point of view, one of the biggest disadvantages of HTTP is its connectionless nature. Object-oriented architects and designers have been working for years with distributed object systems and are familiar with their architecture and design patterns. If we consider how most Web applications work, we know that a client browser navigates a system's Web pages, each containing certain content, either active or passive. At one level of abstraction, the pages in a Web application can be considered objects. Each possesses content (state) and may execute scripts (behavior). Pages have relationships (hyperlinks) with other pages and with objects in the system (the Document Object Model, database connectors, and so on). The fact that pages are distributed to clients, where they are executed, in a sense makes the simple Web system a type of distributed object system. In fact, the principal goal of this book is to show how to model the pages of Web applications and other Web-specific components in an object-oriented manner, consistent with the models of the rest of the system.
It should therefore be no surprise that architects and designers of Web applications have a natural affinity toward distributed object systems. Distributed object systems have certain advantages, including the ability to truly distribute the execution of the business logic to the system nodes that are the most appropriate to handle it. But because HTTP is connectionless, many existing distributed object design patterns are difficult to apply directly to Web applications. Most distributed object systems depend on a consistent network connection between client and server.
In addition to the connection issue, there is a limit to how much functionality can be delivered with plain HTML and client-side scripting, even with applets or ActiveX controls. In many cases, the bandwidth to the server is too restricted by HTTP for sophisticated objects to perform business tasks on the server.
A classic example is the basic address collection interface. Many software systems that require the collection of personal and business addresses allow the user to enter a zip (postal) code and have the system automatically populate the state and city fields of the address. For a Web application to do this, either the entire postal code list needs to be accessible on or downloaded to the client, or the system must make an additional server request for the information. Neither solution is well suited for the Web, since the zip code list itself might be many megabytes and take several minutes just to download to the client, and an extra server trip is lengthy: on the order of several seconds. Additionally, the usability issues related to a several-second discontinuity in the data entry process of a common address block will prevent most designers from pursuing this route.
The solution for this common problem is beyond the ability of HTTP. The client needs a quick and efficient way to query the server for the city and the state of a given zip code. In a distributed object system, this is not a problem; a client object simply obtains a reference to a server-side object that can answer the query. The server-side object will have access to a database of postal codes and city/state mappings. The response time in such a system will probably be milliseconds, depending on network bandwidth, instead of the seconds that a full HTTP page request would require (see Figure 4-1).
Figure 4-1. Mechanisms for implementing a smart address form
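As a minimal sketch of the server-side object described above, the following Java classes answer a city/state query for a postal code. The names PostalCodeDirectory and CityState are hypothetical, and the in-memory map with two sample rows only stands in for the postal code database the text assumes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical value type holding the answer to a postal code query.
final class CityState {
    final String city;
    final String state;

    CityState(String city, String state) {
        this.city = city;
        this.state = state;
    }
}

// Hypothetical server-side object; a real one would query a postal code database.
final class PostalCodeDirectory {
    private final Map<String, CityState> byZip = new HashMap<>();

    PostalCodeDirectory() {
        // Sample rows standing in for the database table.
        byZip.put("10001", new CityState("New York", "NY"));
        byZip.put("60601", new CityState("Chicago", "IL"));
    }

    // Returns the city and state for a zip code, or an empty result if it is unknown.
    Optional<CityState> lookup(String zip) {
        return Optional.ofNullable(byZip.get(zip == null ? null : zip.trim()));
    }
}

class SmartAddressDemo {
    public static void main(String[] args) {
        PostalCodeDirectory directory = new PostalCodeDirectory();
        directory.lookup("60601")
                 .ifPresent(cs -> System.out.println(cs.city + ", " + cs.state));
    }
}
```

The client only needs the small answer, not the whole list, which is why the query style suits a distributed object far better than downloading the table.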
Using distributed objects in a Web application can solve a lot of functionality and performance issues that are often encountered in Web application development. There are costs, however, the most obvious being an additional order of magnitude of complexity in the architecture of the system. The key to the use of distributed objects in a Web application is to incorporate the distributed object system without losing the main benefits of a Web architecture in the first place. The most notable benefit of the Web is its ease of deployment. For a Web application to make effective use of distributed objects, there must be a way to automatically send to the client the objects and interfaces necessary for it to participate in the system without having the user stop the application and install special software.
Another benefit of Web architectures is the ability to leverage heterogeneous and minimally powered clients. Depending on the choice of distributed object infrastructures, there may be an impact on the types of clients that can participate in the system. In addition to the requirements on the client computer, using distributed objects in a Web application requires a reliable network; managing and designing for spotty network connections is often more trouble than it is worth.
At the time of this writing, two principal distributed object infrastructures are associated with Web application development: Java's RMI and Microsoft's DCOM. The goal of both is to hide the details of distributed communications and make them the responsibility of the infrastructure, not of the class designer or the implementer. They work on the principle of location transparency, which states that the object designer/implementer should never need to know the location of a given object instance. That decision is left to the architect or the deployment individual. Ideally, the designer of a class should not care whether a given instance of a class is located on the same machine, although the reality of distributed object design is that the actual location can be important, especially when designing to meet certain performance requirements.
The approaches taken by these infrastructures are for the most part the same. Each uses proxies, or stubs, as interfaces between the distributed object infrastructure and objects that use and implement the business functionality. Both provide naming and directory services to locate objects in the system, and both provide some security services. Bridges can even be built that allow objects from one infrastructure to communicate with objects in the other; however, these bridges are subject to severe functional limitations and performance consequences. The following sections offer an overview of these two infrastructures and how they are leveraged in Web applications.
RMI / IIOP
Remote Method Invocation (RMI), the Java standard for distributed objects, allows Java classes to communicate with other Java classes, which might be located on different machines. Java RMI is a set of APIs and a model for distributed objects that allows developers to build distributed systems easily. The initial release of the RMI API used Java serialization and the Java Remote Method Protocol (JRMP) to make method invocations across a network look like local invocations. Today, the use of the Internet Inter-ORB Protocol (IIOP), a product of the CORBA[1] initiatives, as the transport protocol is preferred, because it makes integration with non-Java objects easier. Built-in support for this protocol is included in the latest releases of the Java Development Kit (JDK).
[1] CORBA, like RMI and DCOM, is a distributed object infrastructure. It is managed by the Object Management Group (OMG), the same group that manages the evolution of UML.
From the designer's point of view, the underlying transport protocol should have nothing to do with the design of the system's classes. This is not always the case, however. When IIOP is the underlying transport protocol, special care must be taken when designing operation signatures. Most CORBA implementations limit operation parameters to primitive types. Operations that accept Java object references as parameters might not be usable in IIOP-based systems. Also, the present release of the JDK does not support output and input/output parameters on operations. It is conceivable that existing CORBA objects might have operations that expect or require such parameters, and so it would be difficult for Java-based clients to use them.
RMI introduces two new types of object: stub and skeleton. The stub is a client-side object that represents the remote object and executes on the client machine. The skeleton is responsible for managing all the details of being remote (responding to communications from objects on another machine) and exists on the server machine. The best part about these two objects is that you don't have to write the code for them yourself. They are automatically generated by a special compiler: rmic. This compiler creates the stub and skeleton classes from business objects that implement certain interfaces. For example, the rmic command would take the Java class MyObject as an argument and produce class files of the form MyObject_Skel.class and MyObject_Stub.class.
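To make the division of labor concrete, here is a hand-written sketch of the stub and skeleton roles. It is not what rmic generates, and every name in it (CityLookup, CityLookupStub, CityLookupSkeleton) is hypothetical. An in-memory byte array stands in for the network connection, so the example shows only the marshalling idea.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Business interface shared by the client and the server (hypothetical).
interface CityLookup {
    String cityFor(String zip) throws IOException;
}

// Server-side business object.
class CityLookupImpl implements CityLookup {
    public String cityFor(String zip) {
        return "10001".equals(zip) ? "New York" : "unknown";
    }
}

// Skeleton: unmarshals a request, calls the real object, marshals the reply.
class CityLookupSkeleton {
    private final CityLookup target;

    CityLookupSkeleton(CityLookup target) {
        this.target = target;
    }

    byte[] dispatch(byte[] request) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
        String method = in.readUTF();
        if (!"cityFor".equals(method)) {
            throw new IOException("unknown method " + method);
        }
        String zip = in.readUTF();
        ByteArrayOutputStream reply = new ByteArrayOutputStream();
        new DataOutputStream(reply).writeUTF(target.cityFor(zip));
        return reply.toByteArray();
    }
}

// Stub: implements the business interface on the client and forwards each call.
class CityLookupStub implements CityLookup {
    private final CityLookupSkeleton transport; // stands in for a network connection

    CityLookupStub(CityLookupSkeleton transport) {
        this.transport = transport;
    }

    public String cityFor(String zip) throws IOException {
        ByteArrayOutputStream request = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(request);
        out.writeUTF("cityFor"); // method name first, then the argument
        out.writeUTF(zip);
        byte[] reply = transport.dispatch(request.toByteArray());
        return new DataInputStream(new ByteArrayInputStream(reply)).readUTF();
    }
}

class StubSkeletonDemo {
    public static void main(String[] args) throws IOException {
        CityLookup remote = new CityLookupStub(new CityLookupSkeleton(new CityLookupImpl()));
        System.out.println(remote.cityFor("10001"));
    }
}
```

The client code depends only on the CityLookup interface, so it cannot tell whether it holds the implementation or the stub.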
The goal is, again, to insulate the designer and developer as much as possible from the details of remote communication. Figure 4-2 shows an overview of the layered architecture for RMI.
Figure 4-2. RMI layered architecture
To use a remote object, a client must first obtain a reference to it. This means that the client will need to know the name and the location of the remote object. This information can be expressed in terms of a URL. For example, a CityStateServer object existing on a machine called myhost.com would have the URL
rmi://myhost.com/CityStateServer
The rmi: part of the URL indicates its type or protocol. Clients wishing to communicate with an instance of this object use the RMI naming service (the java.rmi.Naming class) to look up and obtain a reference. Obtaining a reference and using an instance of a CityStateServer object in an applet is as simple as
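    // Look up the remote object by its URL; the lookup returns the object's stub.
    CityStateServer cs = (CityStateServer) Naming.lookup("rmi://myhost.com/CityStateServer");
    // Calls made on the stub are forwarded to the server object.
    aCity = cs.getCity(zip);
    aState = aCity.getState();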
The Naming class is a well-known entry point: it connects to the RMI registry on the remote server and requests a reference to the named object. It returns the object stub, which the client program uses to invoke methods.
One of the most significant features of RMI is that if the stub for the remote object doesn't exist on the client, it will automatically be downloaded to the client in accordance with the policies of the security manager. Every instance of an RMI-enabled program must install and run a security manager object. Applets, however, have the option of defaulting to the existing AppletSecurityManager instance.
On the server, an interface is defined for the remote object as follows:
    package myapp.CityStateServer;

    import myapp.Address.*;
    import java.rmi.Remote;
    import java.rmi.RemoteException;

    public interface CityStateServer extends Remote {
        City getCity( String zip ) throws RemoteException;
    }
Each method of the remote interface must declare that it throws RemoteException. All parameters passed to remote methods must be serializable (they will eventually be sent over the network).
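The serialization requirement can be illustrated with a small sketch. The City class below is a hypothetical stand-in for the type returned by getCity (its fields and accessors are assumptions). The roundTrip helper writes an object to a byte array and reads it back, which is roughly what happens to a value when it crosses the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value class; implementing Serializable lets it travel by value.
class City implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;
    private final String state;

    City(String name, String state) {
        this.name = name;
        this.state = state;
    }

    String getName() { return name; }
    String getState() { return state; }
}

class SerializationDemo {
    // Serializes the object to bytes and deserializes a copy from those bytes.
    static City roundTrip(City original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (City) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        City copy = roundTrip(new City("Chicago", "IL"));
        System.out.println(copy.getName() + ", " + copy.getState());
    }
}
```

The receiver gets a copy, not the original instance, so changes made to the copy on one side are not seen on the other.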
Setting up a remote object server involves three steps:
- Start a security manager class so that the server can accept stub classes from other machines and, in effect, become a client to another machine.
- Create one or more instances of the server object.
- Register at least one of the server objects with the RMI naming registry so that it can be found by client programs.
Presently, only Java applications, not applets, can be remote object hosts. This makes sense when the normal use of applets is in HTML Web pages that are quickly created and destroyed. The following code fragment for an application's main function shows the CityStateServer getting registered on a host:
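    public static void main( String [] args )
            throws RemoteException, java.net.MalformedURLException, RMISecurityException {
        System.setSecurityManager( new RMISecurityManager() );
        // CityStateServer is an interface, so an implementing class is instantiated here.
        // CityStateServerImpl is a hypothetical name; it is assumed to extend UnicastRemoteObject.
        CityStateServer css = new CityStateServerImpl();
        Naming.rebind( "rmi://myhost.com/CityStateServer", css );
    }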
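The fragment above can be expanded into a self-contained sketch that runs the server and a client in one JVM. Several things here are assumptions made for the demo: the ZipLookup interface and ZipLookupImpl class are hypothetical and return a plain String, the registry is created in-process on a free port instead of at myhost.com, and the security manager step is left out because recent JDK releases have deprecated it.

```java
import java.net.ServerSocket;
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface, reduced to a String result to keep the sketch small.
interface ZipLookup extends Remote {
    String getCityName(String zip) throws RemoteException;
}

// Hypothetical implementation; clients only ever see the interface.
class ZipLookupImpl implements ZipLookup {
    public String getCityName(String zip) throws RemoteException {
        return "60601".equals(zip) ? "Chicago" : "unknown";
    }
}

class RegistryDemo {
    static String serveAndQuery(String zip) throws Exception {
        // Keep the demo on the loopback address (assumption for local runs).
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // Find a free port for the in-process registry.
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }
        Registry registry = LocateRegistry.createRegistry(port);

        // Create the server object and export it; the stub is produced at run time.
        ZipLookupImpl impl = new ZipLookupImpl();
        Remote stub = UnicastRemoteObject.exportObject(impl, 0);
        try {
            // Register the object so that clients can find it by name.
            registry.rebind("CityStateServer", stub);

            // Client side: look up the stub by URL and call through it.
            ZipLookup remote =
                (ZipLookup) Naming.lookup("rmi://127.0.0.1:" + port + "/CityStateServer");
            return remote.getCityName(zip);
        } finally {
            // Unexport so the JVM can exit once the demo is done.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serveAndQuery("60601"));
    }
}
```

A long-running server would skip the finally block and keep the objects exported for as long as clients need them.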
The designer and the implementer are not completely isolated from the RMI infrastructure; remotable server objects must implement a certain interface and throw special exceptions. Clients must catch these exceptions and handle them gracefully. Additionally, all parameters passed as arguments to the interface's functions must be serializable, that is, able to send their state out in a stream. The implementer also has other responsibilities; for the most part, however, RMI does isolate the designer and the developer from the complex issues of remote procedure calls, networking, and deployment.
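One way a client might handle these exceptions gracefully is sketched below. The RemoteCall interface and the attempt helper are hypothetical, not part of RMI. The idea is simply to turn a RemoteException into an empty result so that the user interface can fall back, for example by letting the user type the city by hand.

```java
import java.rmi.RemoteException;
import java.util.Optional;

// Hypothetical helper type: a call that may fail for network reasons.
interface RemoteCall<T> {
    T call() throws RemoteException;
}

class RemoteCalls {
    // Runs the call; a RemoteException becomes an empty result instead of propagating.
    static <T> Optional<T> attempt(RemoteCall<T> call) {
        try {
            return Optional.ofNullable(call.call());
        } catch (RemoteException e) {
            System.err.println("Remote call failed: " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Simulate an unreachable server; a real client would call a stub method here.
        Optional<String> city = attempt(() -> {
            throw new RemoteException("server unreachable");
        });
        System.out.println(city.orElse("(enter city manually)"));
    }
}
```

Whether to retry, fall back, or report the failure is an application decision; the sketch shows only the fallback case.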
In a Web application, RMI is typically used as a communication mechanism between an applet and an application server. The applet is delivered as part of a Web page that the user navigates to. All that is required on the part of the client is a Java-enabled Web browser. All the classes necessary to invoke and to use remote objects will be downloaded to the client as necessary. Once the client applet is run, it can contact the remote server, request a remote object instance reference, and begin to invoke methods on it as if it were a local object instance. All marshalling of arguments and results is handled by the stub and skeleton classes and the RMI infrastructure. Figure 4-3 shows how applets and remote objects work together.
Figure 4-3. Applets using RMI
DCOM
Microsoft's solution to the distributed-object problem is provided by Distributed COM (DCOM), an extension to the popular Component Object Model (COM). Microsoft describes DCOM as COM with a longer wire. Most of the effort in making COM objects distributed is in their deployment and registration.
Just like RMI, DCOM isolates the object developer from the details of distributing an object. DCOM goes even farther by providing facilities to make COM-only objects live on remote servers. Unlike RMI, in which server objects must implement certain remote interfaces, DCOM gives the object developer independence from the distributed-object infrastructure.
COM object implementations are assigned special class identifiers (CLSID). Clients who want instances of a particular COM object request them with the CLSID from the operating system. When the client machine has the DCOM-supporting facilities installed, it is possible for these objects to be located on a remote server. When a client creates an object instance, the following happens:
- The client calls CoCreateInstance() on a CLSID supported by a local server.
- The DCOM runtime, working with the SCM (service control manager), determines whether the requested local server is running and can be connected to.
- The client is provided with a reference to an interface proxy to the object. If an existing instance of the object is available, it will be used; otherwise, a new instance is created.
The principal responsibility for locating objects rests with the service control manager. The SCM will locate or create the object instance on the local machine or across the network, if necessary.
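The activation behavior described above (find the class by its identifier, reuse a running instance if there is one, otherwise create it) can be modeled with a toy sketch. This is plain Java and does not correspond to any real COM or DCOM API: ToyServiceControlManager, its methods, and the sample identifier are all hypothetical, and the network and proxy layers are left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: none of these names correspond to real COM or DCOM APIs.
class ToyServiceControlManager {
    private final Map<String, Supplier<Object>> factories = new HashMap<>();
    private final Map<String, Object> running = new HashMap<>();

    // Records how to create an object for a class identifier.
    void register(String clsid, Supplier<Object> factory) {
        factories.put(clsid, factory);
    }

    // Returns the running instance for the identifier, creating it on first use.
    Object activate(String clsid) {
        Supplier<Object> factory = factories.get(clsid);
        if (factory == null) {
            throw new IllegalArgumentException("class not registered: " + clsid);
        }
        return running.computeIfAbsent(clsid, id -> factory.get());
    }
}

class ActivationDemo {
    public static void main(String[] args) {
        // Made-up identifier used only for this illustration.
        String clsid = "{00000000-0000-0000-0000-000000000001}";
        ToyServiceControlManager scm = new ToyServiceControlManager();
        scm.register(clsid, Object::new);

        Object first = scm.activate(clsid);
        Object second = scm.activate(clsid);
        System.out.println("same instance reused: " + (first == second));
    }
}
```

In the real infrastructure the client would receive an interface proxy, not the object itself, and the instance might live on another machine.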
Once an object reference is obtained by the client, it can invoke operations on it. In normal COM, communication between client and server objects that are in different process spaces is managed by the remote procedure call (RPC) mechanism of the Distributed Computing Environment (DCE). When the objects are located on different machines, the DCOM infrastructure enters the picture and marshals the messages and replies over the network.
DCOM uses a scheme similar to RMI and CORBA, creating proxy and stub objects to act as interfaces between the client program, or server object implementation, and the COM infrastructure. These objects are supplied by DCOM and are invisible to the implementer. Figure 4-4 shows an overview of the DCOM architecture.
Figure 4-4. Overview of DCOM architecture
The principal strategy for deploying these objects is to either manually install the object proxies on the client or use the code-downloading capabilities of Internet Explorer (IE) to do it for you. IE versions 3 and higher are capable of requesting and downloading COM components from servers. The download is, of course, subject to the security policies set up on the client. The objects that are downloaded are complete COM objects that can run entirely on the client. This means that if it is possible to download a COM object, it is possible to download proxies for remote objects as well.
When DCOM is used in a Web application, Web pages contain ActiveX controls that are downloaded to the client and executed. Along with these controls, proxy objects can be downloaded and registered to point to implementation objects on the appropriate application server (see Figure 4-5).
Figure 4-5. Use of distributed objects in a Web application
The biggest disadvantage of using DCOM instead of RMI or CORBA is the client requirement of running the Windows operating system. Even though COM and DCOM are public specifications, the reality is that only Windows-based operating systems support them. Additionally, Microsoft's Internet Explorer is the only major browser that has native support for COM. For intranet applications, however, this may not be a problem.