Understanding .NET (2nd Edition)
Given that Web services provide a quite general mechanism for communication between software on different machines, a simpleminded view might suggest that Web services are all that's required. After all, this technology can be used on both intranets and the Internet, and it potentially allows the client and server to be written using software from different vendors. Effective distributed computing requires more than just Web services, however. To see why, think about some of the limitations of standard Web services. For one thing, an explicit goal of Web services is to allow communication between different vendor implementations. Yet doing this isn't free. Mapping from the CLR's type system into the one defined by XML can be problematic, and depending on how it's done, this translation might lose some information. If both parties in the communication are built using the same technology, such as the .NET Framework, there's no reason to pay this price. Because using the .NET Framework at each end is a common scenario, some option that allows transmitting the complete set of CLR types must exist.
Another problem is that the XML-based serialization used in Web services is not very efficient. We may have to live with this for Internet-based communication, since XML and SOAP are becoming the world's common mechanisms for exchanging information. Yet for communication inside a firewall, such as on a corporate intranet, there's no need to use a relatively inefficient XML-based format for data. Instead, a faster binary representation can be used.
Still another issue is how an object's lifetime is handled. In ASP.NET Web Services, every client call results in a new instance of the target class being created and then destroyed when the call returns. But for some kinds of applications, allowing the same instance to handle multiple calls from the same or different clients is very useful. This kind of behavior isn't what ASP.NET Web Services is designed to do.
.NET Remoting addresses these concerns. While it is possible to expose SOAP-based Web services using .NET Remoting, it's typical to use this part of the class library when both ends of the communication are using the .NET Framework. Whether they're communicating across an intranet or over the Internet through firewalls, the communicating systems will then have the same type system, a common set of available communication protocols, and even the same implementation of those protocols.
As you might expect, .NET Remoting provides traditional RPC functionality, allowing a client to invoke a method in a remote object and have some result returned. It can also be used for asynchronous (nonblocking) calls, however, as well as one-way calls that have no result. The mission of .NET Remoting is to make all of these interactions as simple yet as flexible as possible.
An Overview of the Remoting Process
Although the word remoting implies communication between different machines, it's used a bit more broadly in the .NET Framework. Here, remoting refers to any communication between objects in different application domains, whether those app domains are on the same machine or on machines connected by a network. Figure 7-2 shows a very high-level view of the major components of the remoting process.
Figure 7-2. Calls to remote objects rely on a proxy object in the calling app domain and channel objects in both app domains.
When a client calls a method on an object in another app domain, that call is first handled by a proxy object running in the client's app domain. The proxy represents the remote object in the client's app domain, allowing the client to behave as if that object were running locally. The CLR automatically creates a proxy by using reflection to access the metadata of the remote object being accessed. (Note what this implies: The assembly containing the remote object's classes and/or interfaces must be available on the client's machine.)
A proxy eventually hands a call's information to a channel object. The channel object is responsible for using some appropriate mechanism, such as a TCP connection, to convey the client's request to the remote app domain. Once the request arrives in that app domain, a channel object running there locates the object for which this call is destined, perhaps creating it if the object isn't already running. The call is then passed to the object, which executes it and passes any results back through the same path.
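The path just described (client, proxy, channel, remote channel, object) can be modeled in a few lines. The following is only a conceptual sketch in Python, not .NET code: every class name, the URL tcp://host/calc, and the use of pickle as a stand-in for a formatter are assumptions made for illustration, and both "app domains" live in one process.

```python
import pickle


class Calculator:
    """Stands in for an object living in the remote app domain."""

    def add(self, a, b):
        return a + b


class ServerChannel:
    """Receives serialized requests and dispatches them to target objects."""

    def __init__(self):
        self._objects = {}    # url -> live instance
        self._factories = {}  # url -> callable that creates the instance

    def register(self, url, factory):
        self._factories[url] = factory

    def handle(self, payload):
        url, method, args = pickle.loads(payload)
        # Locate the target, creating it if it isn't already running.
        if url not in self._objects:
            self._objects[url] = self._factories[url]()
        result = getattr(self._objects[url], method)(*args)
        return pickle.dumps(result)


class ClientChannel:
    """Conveys bytes to the server side; a real channel would use TCP or HTTP."""

    def __init__(self, server_channel):
        self._server = server_channel

    def send(self, payload):
        return self._server.handle(payload)


class Proxy:
    """Represents the remote object in the caller's domain."""

    def __init__(self, url, channel):
        self._url = url
        self._channel = channel

    def __getattr__(self, method):
        # Any unknown attribute is treated as a remote method name.
        def invoke(*args):
            request = pickle.dumps((self._url, method, args))
            return pickle.loads(self._channel.send(request))
        return invoke


server = ServerChannel()
server.register("tcp://host/calc", Calculator)
calc = Proxy("tcp://host/calc", ClientChannel(server))
print(calc.add(2, 3))
```

The caller writes calc.add(2, 3) as if the object were local; the proxy turns the call into a message, and the result comes back along the same path.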
At a high level, the process is simple. There's much more going on than this initial description shows, however. It's possible, for instance, to insert code that intercepts and customizes the in-progress call at several points in the path between caller and object. The details of .NET Remoting can get fairly involved (remote access is never simple to implement well), but thankfully, most of the complexity can remain invisible to developers.
Passing Information to Remote Objects
Calling a method in an object is straightforward when both the client and the object are in the same app domain. Parameters of value types such as integers are passed by value, which means that their contents are simply copied from client to object. Parameters of reference types, such as classes, are passed by reference, which means that a reference to the instance itself is passed; no separate copy is made. Calling a method in an object gets more complicated when the two are in different app domains, however, and so .NET Remoting must address these complications. For one thing, accessing a remote object's properties or fields requires some way to transfer information across an app domain boundary. The process of packaging values for transfer to another app domain is called marshaling, and there are several options for how it gets done.
One option is marshal by value (MBV). As the name suggests, transferring an instance of some type using this option copies its value to the remote app domain. For this to work, a user-defined type must be serializable, that is, its definition must be marked with the Serializable attribute. When an instance of that type is passed as a parameter in a remote call, the object's state is automatically serialized and passed to the remote app domain. Once it arrives, a new instance of that type is created and initialized using the serialized state of the original. (Note that the code for the type isn't passed, however, which means that for types such as classes, an assembly containing the MSIL for that type must exist on whatever machine the object's state is passed to.) An MBV object should usually be reasonably simple, or the cost of serializing and transferring the entire object to the remote app domain will be very high.
It's also possible to pass an instance of a reference type across an app domain boundary by reference. This option, called marshal by reference (MBR), is possible only with reference types that inherit from MarshalByRefObject, a class contained in the System namespace. When an MBR object is passed across an app domain boundary, only a reference to the object is passed. This reference, which is more complex than the one used to refer to the object in its own app domain, is used to construct a proxy back to the original object in its home app domain. When code in the remote app domain references this object, such as by calling one of its methods, those references are actually sent back to the original instance of this object. Passing MBR objects as parameters makes sense in cases where the overhead of accessing the object remotely is less than the cost of making a copy of the object.
Figure 7-3 illustrates the difference between MBV and MBR objects. When object X, an MBV object in app domain 1, is passed to app domain 2 as a parameter on a call to object O, a copy of X is created in the remote app domain. Passing object Y, however, does not result in a copy of Y being created in app domain 2 because Y is an MBR object. Instead, a proxy to Y is created, and all accesses to object Y are sent back to the instance of Y in app domain 1. (Although it's not shown in the picture, communication from Y's proxy back to Y itself relies on channels, just as described earlier.)
Figure 7-3. Marshal by value objects are copied when passed across an app domain boundary, while marshal by reference objects have a proxy created for them in the remote app domain.
Finally, if a user-defined type isn't serializable and doesn't inherit from MarshalByRefObject, it is neither an MBV nor an MBR object. In this case, instances of that type can't be marshaled across an app domain boundary at all. In other words, any instance of this type can be used only within the app domain in which it is created.
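The three cases (copied, proxied, or not marshalable at all) can be illustrated with a small model. This is a Python sketch of the idea only, not the .NET mechanism: the marker classes Serializable and MarshalByRef, the marshal function, and the sample types are all hypothetical, deepcopy stands in for serialization, and built-in value types are not handled.

```python
import copy


class Serializable:
    """Marker base class, loosely analogous to the Serializable attribute."""


class MarshalByRef:
    """Marker base class, loosely analogous to MarshalByRefObject."""


class RefProxy:
    """Forwards every attribute access back to the original instance."""

    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        return getattr(self._target, name)

    def __setattr__(self, name, value):
        setattr(self._target, name, value)


def marshal(obj):
    """Return what the receiving app domain would see for obj."""
    if isinstance(obj, MarshalByRef):
        return RefProxy(obj)        # MBR: only a reference crosses
    if isinstance(obj, Serializable):
        return copy.deepcopy(obj)   # MBV: the state is copied
    raise TypeError(f"{type(obj).__name__} cannot cross the boundary")


class Point(Serializable):
    def __init__(self, x):
        self.x = x


class Counter(MarshalByRef):
    def __init__(self):
        self.count = 0


class Local:
    """Neither serializable nor marshal by reference."""


x, y = Point(1), Counter()
x_remote, y_remote = marshal(x), marshal(y)
x_remote.x = 99       # changes only the copy
y_remote.count = 5    # forwarded to the original
print(x.x, y.count)   # prints: 1 5
```

Changing the copy of the MBV object leaves the original untouched, while a write through the MBR proxy reaches the original instance. Passing a Local instance to marshal raises TypeError, mirroring a type that can be used only within its own app domain.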
Choosing a Channel
Applications using .NET Remoting ultimately rely on channels to convey calls and responses between app domains. Three standard channels are provided: a TCP channel, an HTTP channel, and an interprocess communication option called the IPC channel. It's also possible to build custom channels when necessary. While not especially simple to create, a custom channel might provide special security services, use a nonstandard protocol, or perform some other function in a unique way. It's safe to assume, however, that most applications will work happily with one of the three choices built into the .NET Framework.
The TCP channel is the best choice for fast machine-to-machine communication. By default, it serializes and deserializes a call's parameters using the binary formatter described in Chapter 4, although the SOAP formatter (which was also described in Chapter 4) can be used instead. Once the parameters have been serialized, they're transmitted directly in TCP packets. In version 2.0 of the .NET Framework, the TCP channel can also provide authentication and data encryption if required.
The second option, the HTTP channel, uses the SOAP formatter by default to serialize and deserialize a call's parameters. Rather than being sent directly over TCP, those parameters are sent as SOAP requests and responses embedded in HTTP. It's also possible to use the binary formatter with the HTTP channel, which can be useful for communication through firewalls. The binary formatter is more efficient than the SOAP formatter, so if the .NET Framework is on both sides of the communication, this option makes sense. For applications that need distributed security, the HTTP channel can use the security options provided by Internet Information Services (IIS). In this case, one possibility is to use the Secure Sockets Layer (SSL) protocol with HTTP, an option sometimes referred to as HTTPS.
The third choice, the IPC channel, is new in version 2.0 of the .NET Framework. Rather than allowing communication between applications on different machines, the IPC channel is intended for communication between applications in different processes on the same machine. It uses named pipes, a standard Windows mechanism for interprocess communication.
Deciding which channel to use depends on your goals. If the communication is cross-machine and entirely within an organization's intranet (that is, if no firewalls will be traversed), use the fast and simple TCP channel. If a cross-machine communication must go through firewalls, however, as do most packets sent on the Internet, use the HTTP channel. Although it's a bit less efficient, riding on HTTP means passing through port 80, the only port that virtually all firewalls leave open. Also, if the goal is to provide a standard Web service whose clients might not be based on the .NET Framework, the HTTP channel is the only .NET Remoting option you can use. (It's worth pointing out that ASP.NET Web Services is typically a better choice in this situation, however, since unlike .NET Remoting, it's designed to interoperate with non-Microsoft platforms.) And for communication between processes on the same machine, the IPC channel is the obvious choice.
It's also possible for a single application to use different kinds of channels simultaneously. This allows clients to communicate with remote objects using the mechanism that's most appropriate for each one. A client, for instance, might use the more efficient TCP channel to talk to an object inside the firewall while also invoking methods in an object across the Internet via the HTTP channel.
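The selection guidance above can be condensed into a small decision function. This is a Python sketch that simply encodes the rules of thumb from this section; the function name, its parameters, and the returned labels are illustrative and not part of any .NET API.

```python
def choose_channel(same_machine, crosses_firewall=False, non_dotnet_clients=False):
    """Return the channel this section recommends for a given scenario."""
    if non_dotnet_clients:
        return "http"   # the only option for exposing a standard Web service
    if same_machine:
        return "ipc"    # named pipes between processes on one machine
    if crosses_firewall:
        return "http"   # rides through port 80
    return "tcp"        # fast path inside the intranet


print(choose_channel(same_machine=True))
print(choose_channel(same_machine=False))
print(choose_channel(same_machine=False, crosses_firewall=True))
```

Because one application can register several channels at once, a real client might apply this kind of choice per remote object rather than once per process.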
Creating and Destroying Remote Objects
One of the most challenging issues in designing a remoting technology is determining what options to support for creating remote objects, referred to as activation. .NET Remoting provides three options, each of which is illustrated in Figure 7-4.
Figure 7-4. Objects accessed via .NET Remoting can be single-call, singleton, or client-activated.
Regardless of which option is chosen, the server will create each new object on its own thread. Importantly, any object that will be accessed from outside its app domain must be an MBR object, which means that the class must inherit from MarshalByRefObject. These similarities notwithstanding, however, each of these options varies in who creates the object, how the object is destroyed, as well as in other ways. Accordingly, each is worthy of its own short discussion.
Single-Call Objects
As the name suggests, single-call objects exist only for the life of one method call from one client. A new instance is created for each new call, and that instance is destroyed when the call ends. This model, which is also how ASP.NET and ASP.NET Web Services applications work, means that the object can't maintain any internal state between method calls since a new instance is created for each call. It works well with a load-balanced set of server machines, however, where each request might be handled by a different machine. Since the object stores no in-memory state, using different machines for a sequence of requests is not a problem.
To expose a single-call object to clients in other app domains, a server must register the object's type with the .NET Remoting infrastructure. There are two ways for servers to do this. One possibility is to perform the registration explicitly by calling methods provided by classes in System.Runtime.Remoting and its subordinate namespaces. For example, a server process can specify a channel that clients can use to access the single-call object by calling the RegisterChannel method of the ChannelServices class, found in the namespace System.Runtime.Remoting.Channels. Next, the server can call the RegisterWellKnownServiceType method of the RemotingConfiguration class, contained in System.Runtime.Remoting. The server specifies several things on this call, including the type being registered, the mode (which in this case is SingleCall), and the URL at which this object can be found.
A second (and usually better) way for a server to register a single-call type is to specify the desired options in a configuration file and then tell the remoting infrastructure to read this file by passing the file's name as a parameter on a call to RemotingConfiguration's Configure method. This allows changing details of the exposed type, such as the URL at which it can be found, without recompiling the server code. However it's done, registering a type doesn't actually create an instance of that type. No running instance is created until it's absolutely necessary, as described later in this section.
Once a server is running and has registered an appropriate type for a single-call object, a client can invoke methods on that object. A client has two choices for how it does this. The first lets the client use the standard new operator provided by CLR-based languages such as C# and Visual Basic (VB). With this option, the client application first tells the .NET Framework's remoting infrastructure various things, such as what channel to use, the type of the remote object, and a URL at which that object can be found. As with the server, this can be done either by using explicit calls or by referencing a configuration file. Note that to access a remote object, the client must know the URL at which it can be found (there's no built-in support for using a directory service such as Active Directory to learn this information). Whichever approach a developer chooses, the client code can now create instances of the remote object using the new operator.
If a developer is willing to forgo the relative convenience of using the standard new operator, she can use another approach for accessing a remote single-call object. Rather than setting up the configuration information and then calling new, a client can instead call the Activator class's GetObject method. The parameters to this call include the type of the object to be accessed and the URL at which the object can be found. Instead of specifying these separately, as in the previous case, they're passed directly on this call.
Whichever choice is used, however, neither one actually creates an instance of the remote object. Instead, single-call objects are server-activated, which means that the server creates an instance of the object only when a method call actually arrives. And because they're single-call objects, the object is destroyed after the call completes.
Singleton Objects
Like single-call objects, singleton objects are activated by the server. Accordingly, the steps required for the server to register and the client to access a singleton object are similar to those just described for a single-call object. The only difference is that the server specifies a mode of Singleton instead of SingleCall on its call to RemotingConfiguration.RegisterWellKnownServiceType or in the configuration file. On the client, the code is exactly the same as with single-call objects.
The behavior of the object is not the same, however. Unlike a single-call object, which gets destroyed after each method call, a singleton object stays active until its (configurable) lifetime expires. Since a singleton object isn't destroyed between calls, it can maintain state internally. Yet because the same instance is accessed by all clients that use this singleton, that state is potentially accessible by any of these clients.
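The contrast between the two server-activated modes can be made concrete with a toy server. The following is a Python model under stated assumptions, not .NET code: the Server and Tally classes, the register_well_known method, and the URLs are invented for illustration, and singleton lifetime expiry is not modeled.

```python
class Tally:
    """A remotable class that keeps in-memory state."""

    def __init__(self):
        self.hits = 0

    def hit(self):
        self.hits += 1
        return self.hits


class Server:
    """Registers types under a URL with a mode, then activates on demand."""

    MODES = ("SingleCall", "Singleton")

    def __init__(self):
        self._registered = {}  # url -> (type, mode)
        self._singletons = {}  # url -> the one live instance

    def register_well_known(self, cls, url, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        # Registration alone creates no instance.
        self._registered[url] = (cls, mode)

    def call(self, url, method, *args):
        cls, mode = self._registered[url]
        if mode == "SingleCall":
            target = cls()  # fresh instance, discarded after this call
        else:
            if url not in self._singletons:
                self._singletons[url] = cls()  # created on first call only
            target = self._singletons[url]
        return getattr(target, method)(*args)


server = Server()
server.register_well_known(Tally, "tcp://host/single", "SingleCall")
server.register_well_known(Tally, "tcp://host/shared", "Singleton")
single = [server.call("tcp://host/single", "hit") for _ in range(3)]
shared = [server.call("tcp://host/shared", "hit") for _ in range(3)]
print(single, shared)  # prints: [1, 1, 1] [1, 2, 3]
```

The single-call counter never gets past 1 because each call sees a new instance, while the singleton accumulates state across calls from any client.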
If another client makes a call on a singleton class after the running instance of that class has died, a new instance will be created. This new instance will handle all calls from all clients until its lifetime expires. Note, however, that for a singleton object accessible at a given URL, there is never more than one instance of the class active at any time.
Client-Activated Objects
Even though a client can use the new operator to "create" an instance of a single-call or singleton object, the server doesn't really create this instance until the first method call from the client arrives. This is why these two choices are called server-activated: The server is in charge of determining when activation occurs. Client-activated objects, by contrast, are explicitly created when the client requests it. The server still does the actual creation, of course, since that's where the object is running; the name is something of a misnomer. Still, the distinction between client-activated objects and the two types of server-activated objects is significant. The most important difference is that with client-activated objects, each client gets its own object instance, and each object can maintain internal state specific to its client between method calls; the object isn't destroyed after each call. Instead, as described later in this section, each client-activated object has a lease that determines when the object is destroyed.
Just as with the first two types of remotely accessible objects, the server must register the type before the client can access it. As before, this can be done either through explicit calls or via a configuration file. To create the object, the client can also make explicit calls, much like the previous cases, or rely on a configuration file. In either case, the client can either use the new operator or make an explicit call to the CreateInstance method provided by the Activator class. (GetObject can't be used with client-activated objects.) Both of these directly contact the server, which then creates an instance of the specified client-activated type. All calls made by the client to this object will be handled by this instance, and each client that creates a client-activated object of this type will have its own instance.
One problem remains: When is a client-activated object destroyed? With single-call and singleton objects, the server decides when to destroy the object. With client-activated objects, however, the server can't destroy the object until it knows the client is finished using it. Theoretically, the client could tell the server when it's done with the object, but what happens if the client fails unexpectedly, or just forgets to do this? The server could wind up with objects that no longer have clients yet will never be destroyed. To avoid this problem, each client-activated object has a lease assigned to it[2]. The lease controls how long an object can remain in existence. A client can set an object's lease time when that object is created, or an administrator can control default lease times. Optionally, each method call from a client can reset the lease timer to zero. If an object's lease time elapses, the lease manager in the app domain that contains this object contacts any sponsors of the object. If any of these sponsors wishes to renew the lease, the object's lifetime is extended. If not, the object can be marked for garbage collection. Clients can also explicitly extend the lease of an object or even set it to infinity, ensuring that the server won't destroy it prematurely.
[2] In fact, leasing is used to control the lifetime of all remotely accessed MBR objects, including those passed as parameters.
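The interplay of lease time, call-based renewal, and sponsors can be traced with a small simulation. This is a Python sketch with abstract time units, not the .NET leasing API: the Lease and LeaseManager classes, the renew_on_call parameter, and the rule that a call extends the lease to at least renew_on_call units are assumptions chosen as one plausible way to model the behavior described above.

```python
class Lease:
    """Tracks how much time a remote object has left."""

    def __init__(self, initial, renew_on_call):
        self.remaining = initial
        self.renew_on_call = renew_on_call
        self.sponsors = []  # callables returning extra time, or 0 to decline

    def on_call(self):
        # Assumption: a method call keeps the object alive for at least
        # renew_on_call more units.
        self.remaining = max(self.remaining, self.renew_on_call)


class LeaseManager:
    """Owns the leases for all objects in one app domain."""

    def __init__(self):
        self.leases = {}  # object id -> Lease

    def tick(self, elapsed):
        """Advance time; return ids of objects whose leases have ended."""
        expired = []
        for obj_id, lease in list(self.leases.items()):
            lease.remaining -= elapsed
            if lease.remaining > 0:
                continue
            # The lease ran out: ask the sponsors whether to renew it.
            extension = max((ask() for ask in lease.sponsors), default=0)
            if extension > 0:
                lease.remaining = extension
            else:
                expired.append(obj_id)  # now eligible for garbage collection
                del self.leases[obj_id]
        return expired


manager = LeaseManager()
manager.leases["a"] = Lease(initial=5, renew_on_call=2)
manager.leases["b"] = Lease(initial=5, renew_on_call=2)
manager.leases["b"].sponsors.append(lambda: 3)  # sponsor grants 3 more units

first = manager.tick(4)         # both objects have 1 unit left
manager.leases["a"].on_call()   # a call extends a's lease to 2 units
second = manager.tick(1)        # b hits zero, but its sponsor renews it
third = manager.tick(1)         # a hits zero with no sponsor
print(first, second, third)     # prints: [] [] ['a']
```

Object a survives one extra tick because a call renewed it, then expires once no sponsor speaks up; object b stays alive for as long as its sponsor keeps renewing.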
Remotely activating and accessing objects is inherently nontrivial. Is it better to provide many options, running the risk of making the technology too complex to use? Or should the design stay simple, supporting only the most common scenarios? .NET Remoting aims for a middle ground, offering built-in services for common situations while still allowing enough complexity to address more advanced applications. Pleasing everybody is hard, but .NET Remoting offers enough choice to please at least most of the people most of the time.