Understanding .NET (2nd Edition)

Given that Web services provide a quite general mechanism for communication between software on different machines, a simpleminded view might suggest that Web services are all that's required. After all, this technology can be used on both intranets and the Internet, and it potentially allows the client and server to be written using software from different vendors. Effective distributed computing requires more than just Web services, however. To see why, think about some of the limitations of standard Web services. For one thing, an explicit goal of Web services is to allow communication between different vendor implementations. Yet doing this isn't free. Mapping from the CLR's type system into the one defined by XML can be problematic, and depending on how it's done, this translation might lose some information. If both parties in the communication are built using the same technology, such as the .NET Framework, there's no reason to pay this price. Because using the .NET Framework at each end is a common scenario, some option that allows transmitting the complete set of CLR types must exist.

Web services are necessary but not sufficient

Another problem is that the XML-based serialization used in Web services is not very efficient. We may have to live with this for Internet-based communication, since XML and SOAP are becoming the world's common mechanisms for exchanging information. Yet for communication inside a firewall, such as on a corporate intranet, there's no need to use a relatively inefficient XML-based format for data. Instead, a faster binary representation can be used.

Serializing data into XML isn't always the best choice

Still another issue is how an object's lifetime is handled. In ASP.NET Web Services, every client call results in a new instance of the target class being created and then destroyed when the call returns. But for some kinds of applications, allowing the same instance to handle multiple calls from the same or different clients is very useful. This kind of behavior isn't what ASP.NET Web Services is designed to do.

It's sometimes useful for the same object instance to handle multiple calls

.NET Remoting addresses these concerns. While it is possible to expose SOAP-based Web services using .NET Remoting, it's typical to use this part of the class library when both ends of the communication are using the .NET Framework. Whether they're communicating across an intranet or over the Internet through firewalls, the communicating systems will then have the same type system, a common set of available communication protocols, and even the same implementation of those protocols.

.NET Remoting focuses on communication between CLR-based applications

As you might expect, .NET Remoting provides traditional RPC functionality, allowing a client to invoke a method in a remote object and have some result returned. It can also be used for asynchronous (nonblocking) calls, however, as well as one-way calls that have no result. The mission of .NET Remoting is to make all of these interactions as simple yet as flexible as possible.

.NET Remoting supports both synchronous and asynchronous communication

An Overview of the Remoting Process

Although the word remoting implies communication between different machines, it's used a bit more broadly in the .NET Framework. Here, remoting refers to any communication between objects in different application domains, whether those app domains are on the same machine or on machines connected by a network. Figure 7-2 shows a very high-level view of the major components of the remoting process.

Figure 7-2. Calls to remote objects rely on a proxy object in the calling app domain and channel objects in both app domains.

.NET Remoting is used for communication between different app domains

When a client calls a method on an object in another app domain, that call is first handled by a proxy object running in the client's app domain. The proxy represents the remote object in the client's app domain, allowing the client to behave as if that object were running locally. The CLR automatically creates a proxy by using reflection to access the metadata of the remote object being accessed. (Note what this implies: The assembly containing the remote object's classes and/or interfaces must be available on the client's machine.)

Clients rely on proxy objects

A proxy eventually hands a call's information to a channel object. The channel object is responsible for using some appropriate mechanism, such as a TCP connection, to convey the client's request to the remote app domain. Once the request arrives in that app domain, a channel object running there locates the object for which this call is destined, perhaps creating it if the object isn't already running. The call is then passed to the object, which executes it and passes any results back through the same path.

Communication is handled by channel objects

At a high level, the process is simple. In fact, however, there's much more going on than this initial description shows. It's possible, for instance, to insert code that intercepts and customizes the in-progress call at several points in the path between caller and object. In fact, the details of .NET Remoting can get fairly involved (remote access is never simple to implement well), but thankfully, most of the complexity can remain invisible to developers.

.NET Remoting provides many opportunities for customization

Passing Information to Remote Objects

Calling a method in an object is straightforward when both the client and the object are in the same app domain. Parameters of value types such as integers are passed by value, which means that their contents are simply copied from client to object. Parameters of reference types, such as classes, are passed by reference, which means that a reference to the instance itself is passed; no separate copy is made. Calling a method in an object gets more complicated when the two are in different app domains, however, and so .NET Remoting must address these complications. For one thing, accessing a remote object's properties or fields requires some way to transfer information across an app domain boundary. The process of packaging values for transfer to another app domain is called marshaling, and there are several options for how it gets done.

Values passed between app domains must be marshaled and unmarshaled

One option is marshal by value (MBV). As the name suggests, transferring an instance of some type using this option copies its value to the remote app domain. For this to work, a user-defined type must be serializable, that is, its definition must be marked with the Serializable attribute. When an instance of that type is passed as a parameter in a remote call, the object's state is automatically serialized and passed to the remote app domain. Once it arrives, a new instance of that type is created and initialized using the serialized state of the original. (Note that the code for the type isn't passed, however, which means that for types such as classes, an assembly containing the MSIL for that type must exist on whatever machine the object's state is passed to.) An MBV object should usually be reasonably simple, or the cost of serializing and transferring the entire object to the remote app domain will be very high.

Marshal by value passes the value itself to another app domain
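A minimal sketch of an MBV type follows; the class and member names are hypothetical. All that's required for marshal by value is the Serializable attribute:

```csharp
using System;

// A hypothetical MBV type: marking it [Serializable] lets .NET Remoting
// serialize its state and re-create a copy in the remote app domain
// whenever an instance is passed as a parameter.
[Serializable]
public class CustomerRecord
{
    public string Name;
    public decimal Balance;

    public CustomerRecord(string name, decimal balance)
    {
        Name = name;
        Balance = balance;
    }
}
```

Remember that only the state travels: the assembly containing this class's MSIL must already be present on the receiving machine.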

It's also possible to pass an instance of a reference type across an app domain boundary by reference. This option, called marshal by reference (MBR), is possible only with reference types that inherit from MarshalByRefObject, a class contained in the System namespace. When an MBR object is passed across an app domain boundary, only a reference to the object is passed. This reference, which is more complex than the one used to refer to the object in its own app domain, is used to construct a proxy back to the original object in its home app domain. When code in the remote app domain references this object, such as by calling one of its methods, those references are actually sent back to the original instance of this object. Passing MBR objects as parameters makes sense in cases where the overhead of accessing the object remotely is less than the cost of making a copy of the object.

Marshal by reference passes only a reference to another app domain
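An MBR type, by contrast, simply inherits from MarshalByRefObject. A sketch, again with hypothetical names:

```csharp
using System;

// A hypothetical MBR type: because it inherits from MarshalByRefObject,
// only a reference crosses the app domain boundary. Calls made through
// that reference travel back to this instance in its home app domain.
public class AccountManager : MarshalByRefObject
{
    private decimal balance;

    public void Deposit(decimal amount)
    {
        // Executes in the object's home app domain, not the caller's
        balance += amount;
    }

    public decimal GetBalance()
    {
        return balance;
    }
}
```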

Figure 7-3 illustrates the difference between MBV and MBR objects. When object X, an MBV object in app domain 1, is passed to app domain 2 as a parameter on a call to object O, a copy of X is created in the remote app domain. Passing object Y, however, does not result in a copy of Y being created in app domain 2 because Y is an MBR object. Instead, a proxy to Y is created, and all accesses to object Y are sent back to the instance of Y in app domain 1. (Although it's not shown in the picture, communication from Y's proxy back to Y itself relies on channels, just as described earlier.)

Figure 7-3. Marshal by value objects are copied when passed across an app domain boundary, while marshal by reference objects have a proxy created for them in the remote app domain.

Finally, if a user-defined type isn't serializable and doesn't inherit from MarshalByRefObject, it is neither an MBV nor an MBR object. In this case, instances of that type can't be marshaled across an app domain boundary at all. In other words, any instance of this type can be used only within the app domain in which it is created.

Not all types can be marshaled

Choosing a Channel

Applications using .NET Remoting ultimately rely on channels to convey calls and responses between app domains. Three standard channels are provided: a TCP channel, an HTTP channel, and an interprocess communication option called the IPC channel. It's also possible to build custom channels when necessary. While not especially simple to create, a custom channel might provide special security services, use a nonstandard protocol, or perform some other function in a unique way. It's safe to assume, however, that most applications will work happily with one of the three choices built into the .NET Framework.

.NET Remoting provides a TCP channel, an HTTP channel, and an IPC channel

The TCP channel is the best choice for fast machine-to-machine communication. By default, it serializes and deserializes a call's parameters using the binary formatter described in Chapter 4, although the SOAP formatter (which was also described in Chapter 4) can be used instead. Once the parameters have been serialized, they're transmitted directly in TCP packets. In version 2.0 of the .NET Framework, the TCP channel can also provide authentication and data encryption if required.

The TCP channel sends binary information directly over TCP

The second option, the HTTP channel, uses the SOAP formatter by default to serialize and deserialize a call's parameters. Rather than sending those parameters directly over TCP, they're sent as SOAP requests and responses embedded in HTTP. It's also possible to use the binary formatter with the HTTP channel, which can be useful for communication through firewalls. The binary formatter is more efficient than the SOAP formatter, so if the .NET Framework is on both sides of the communication, this option makes sense. For applications that need distributed security, the HTTP channel can use the security options provided by Internet Information Services (IIS). In this case, one possibility is to use the Secure Sockets Layer (SSL) protocol with HTTP, an option sometimes referred to as HTTPS.

The HTTP channel sends SOAP envelopes over HTTP

The third choice, the IPC channel, is new in version 2.0 of the .NET Framework. Rather than allowing communication between applications on different machines, the IPC channel is intended for communication between applications in different processes on the same machine. It uses named pipes, a standard Windows mechanism for interprocess communication.

The IPC channel allows communication between app domains on a single machine
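The three standard channels live in parallel namespaces under System.Runtime.Remoting.Channels. A sketch of constructing and registering each (the port numbers and pipe name here are arbitrary examples):

```csharp
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Tcp;
using System.Runtime.Remoting.Channels.Http;
using System.Runtime.Remoting.Channels.Ipc;

class ChannelSetup
{
    static void Main()
    {
        // Fast binary communication between machines
        ChannelServices.RegisterChannel(new TcpChannel(8085), false);

        // SOAP over HTTP, able to pass through firewalls
        ChannelServices.RegisterChannel(new HttpChannel(8086), false);

        // Named-pipe communication between processes on one machine
        ChannelServices.RegisterChannel(new IpcChannel("MyAppPipe"), false);
    }
}
```

The second argument to RegisterChannel, added in version 2.0, indicates whether the channel must provide security.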

Deciding which channel to use depends on your goals. If the communication is cross-machine and entirely within an organization's intranet (that is, if no firewalls will be traversed), use the fast and simple TCP channel. If a cross-machine communication must go through firewalls, however, as do most packets sent on the Internet, use the HTTP channel. Although it's a bit less efficient, riding on HTTP means passing through port 80, the only port that virtually all firewalls leave open. Also, if the goal is to provide a standard Web service whose clients might not be based on the .NET Framework, the HTTP channel is the only .NET Remoting option you can use. (It's worth pointing out that ASP.NET Web Services is typically a better choice in this situation, however, since unlike .NET Remoting, it's designed to interoperate with non-Microsoft platforms). And for communication between processes on the same machine, the IPC channel is the obvious choice.

Which channel is best depends on the situation

It's also possible for a single application to use different kinds of channels simultaneously. This allows clients to communicate with remote objects using the mechanism that's most appropriate for each one. A client, for instance, might use the more efficient TCP channel to talk to an object inside the firewall while also invoking methods in an object across the Internet via the HTTP channel.

Performance: What's the Fastest Choice?

Given that the .NET Framework offers several choices for creating distributed applications, it's natural to wonder which ones offer the best performance. Before answering the question, it's important to emphasize that relatively few applications require the highest performance possible. Other considerations, such as interoperability with other technologies, are often more important. Still, it's useful to know which choices are fastest.

Of the three main choices the .NET Framework offers, Enterprise Services and .NET Remoting using binary encoding over a TCP channel are fastest. This shouldn't be surprising. Enterprise Services relies on the efficient and highly tuned Distributed COM (DCOM) protocol, and using Remoting's binary-over-TCP option clearly minimizes overhead (although Enterprise Services-based applications will scale better). ASP.NET Web Services is a significantly slower option, as is the seldom-used SOAP option in .NET Remoting. All three technologies have their place, however, since each addresses a somewhat different set of problems.

For a more detailed look at the relative performance of these three technologies in the .NET Framework 2.0, see the paper by Ingo Rammer and Richard Turner at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/asmxremotesperf.asp.

Creating and Destroying Remote Objects

One of the most challenging issues in designing a remoting technology is determining what options to support for creating remote objects, referred to as activation. .NET Remoting provides three options, each of which is illustrated in Figure 7-4.

  • Single-call objects: A new instance of the class is created for each call from each client and then is destroyed (that is, made available for garbage collection) when the call ends.

  • Singleton objects: Only one instance of the class is created on a given machine, and that same instance handles all calls made by all clients to that class on this machine.

  • Client-activated objects: A separate instance of the class is created for each client that wishes to use it and then is destroyed only when that client is finished with it.

Figure 7-4. Objects accessed via .NET Remoting can be single-call, singleton, or client-activated.

.NET Remoting provides three styles of activation for remote objects

Regardless of which option is chosen, the server will create each new object on its own thread. Importantly, any object that will be accessed from outside its app domain must be an MBR object, which means that the class must inherit from MarshalByRefObject. These similarities notwithstanding, however, each of these options varies in who creates the object, how the object is destroyed, as well as in other ways. Accordingly, each is worthy of its own short discussion.

Single-Call Objects

As the name suggests, single-call objects exist only for the life of one method call from one client. A new instance is created for each new call, and that instance is destroyed when the call ends. This model, which is also how ASP.NET and ASP.NET Web Services applications work, means that the object can't maintain any internal state between method calls since a new instance is created for each call. It works well with a load-balanced set of server machines, however, where each request might be handled by a different machine. Since the object stores no in-memory state, using different machines for a sequence of requests is not a problem.

A new single-call object is created for each call and then is destroyed when the call ends

To expose a single-call object to clients in other app domains, a server must register the object's type with the .NET Remoting infrastructure. There are two ways for servers to do this. One possibility is to perform the registration explicitly by calling methods provided by classes in System.Runtime.Remoting and its subordinate namespaces. For example, a server process can specify a channel that clients can use to access the single-call object by calling the RegisterChannel method of the ChannelServices class, found in the namespace System.Runtime.Remoting.Channels. Next, the server can call the RegisterWellKnownServiceType method of the RemotingConfiguration class, contained in System.Runtime.Remoting. The server specifies several things on this call, including the type being registered, the mode (which in this case is SingleCall), and the URL at which this object can be found.

Single-call objects must be registered with the .NET Remoting infrastructure
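The explicit registration just described can be sketched as follows. Calculator is a hypothetical MBR class, and the port number and URI are arbitrary examples:

```csharp
using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Tcp;

class Server
{
    static void Main()
    {
        // Register a channel clients can use to reach this app domain
        ChannelServices.RegisterChannel(new TcpChannel(8085), false);

        // Expose Calculator as a single-call object; clients will find it
        // at tcp://<host>:8085/Calculator.rem
        RemotingConfiguration.RegisterWellKnownServiceType(
            typeof(Calculator),
            "Calculator.rem",
            WellKnownObjectMode.SingleCall);

        // Keep the server process alive so calls can arrive
        Console.WriteLine("Server running; press Enter to exit.");
        Console.ReadLine();
    }
}
```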

A second (and usually better) way for a server to register a single-call type is to specify the desired options in a configuration file and then tell the remoting infrastructure to read this file by passing the file's name as a parameter on a call to RemotingConfiguration's Configure method. This allows changing details of the exposed type, such as the URL at which it can be found, without recompiling the server code. However it's done, registering a type doesn't actually create an instance of that type. No running instance is created until it's absolutely necessary, as described later in this section.
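A server configuration file expressing the same registration might look like this; the type and assembly names are placeholders:

```xml
<configuration>
  <system.runtime.remoting>
    <application>
      <channels>
        <channel ref="tcp" port="8085" />
      </channels>
      <service>
        <!-- "Calculator, ServerAssembly" is a hypothetical type/assembly pair -->
        <wellknown mode="SingleCall"
                   type="Calculator, ServerAssembly"
                   objectUri="Calculator.rem" />
      </service>
    </application>
  </system.runtime.remoting>
</configuration>
```

The server then loads it with a single call such as RemotingConfiguration.Configure("Server.exe.config", false), and the channel and URI can later be changed without recompiling.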

Once a server is running and has registered an appropriate type for a single-call object, a client can invoke methods on that object. A client has two choices for how it does this. The first lets the client use the standard new operator provided by CLR-based languages such as C# and Visual Basic (VB). With this option, the client application first tells the .NET Framework's remoting infrastructure various things, such as what channel to use, the type of the remote object, and a URL at which that object can be found. As with the server, this can be done either by using explicit calls or by referencing a configuration file. Note that to access a remote object, the client must know the URL at which it can be found (there's no built-in support for using a directory service such as Active Directory to learn this information). Alternatively, rather than explicitly passing the remoting infrastructure the information required to access the remote object, a client can specify this information in a configuration file, just like the server. Whichever approach a developer chooses, the client code can now create instances of the remote object using the new operator.

A client can use the new operator to access a single-call object
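Assuming the client's configuration file contains a matching wellknown entry under a client element, the code itself can be very simple. Calculator and its Add method are hypothetical:

```csharp
using System.Runtime.Remoting;

class Client
{
    static void Main()
    {
        // Read channel, type, and URL information from the config file
        RemotingConfiguration.Configure("Client.exe.config", false);

        // Because Calculator is registered as a well-known remote type,
        // new returns a proxy rather than creating a local instance
        Calculator calc = new Calculator();
        int sum = calc.Add(2, 3);
    }
}
```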

If a developer is willing to forgo the relative convenience of using the standard new operator, she can use another approach for accessing a remote single-call object. Rather than setting up the configuration information and then calling new, a client can instead call the Activator class's GetObject method. The parameters to this call include the type of the object to be accessed and the URL at which the object can be found. Instead of specifying these separately, as in the previous case, they're passed directly on this call.

A client can also use Activator.GetObject to access a single-call object
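With this approach, the type and URL travel directly on the call, and no prior registration is needed. The host name and URI below are hypothetical:

```csharp
using System;

class Client
{
    static void Main()
    {
        // GetObject returns a proxy immediately; no network traffic occurs
        // until the first method call is actually made
        Calculator calc = (Calculator)Activator.GetObject(
            typeof(Calculator),
            "tcp://somehost:8085/Calculator.rem");

        int sum = calc.Add(2, 3);  // this call activates the remote object
    }
}
```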

Whichever choice is used, however, neither one actually creates an instance of the remote object. Instead, single-call objects are server-activated, which means that the server creates an instance of the object only when a method call actually arrives. And because they're single-call objects, the object is destroyed after the call completes.

In either case, the server actually creates the single-call object

Singleton Objects

Like single-call objects, singleton objects are activated by the server. Accordingly, the steps required for the server to register and the client to access a singleton object are similar to those just described for a single-call object. The only difference is that the server specifies a mode of Singleton instead of SingleCall on its call to RemotingConfiguration.RegisterWellKnownServiceType or in the configuration file. On the client, the code is exactly the same as with single-call objects.

One singleton object handles all client requests for a particular class
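In a configuration file, the only change from the single-call case is the mode attribute (type and assembly names remain placeholders):

```xml
<wellknown mode="Singleton"
           type="Calculator, ServerAssembly"
           objectUri="Calculator.rem" />
```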

The behavior of the object is not the same, however. Unlike a single-call object, which gets destroyed after each method call, a singleton object stays active until its (configurable) lifetime expires. Since a singleton object isn't destroyed between calls, it can maintain state internally. Yet because the same instance is accessed by all clients that use this singleton, that state is potentially accessible by any of these clients.

Singleton objects can maintain state between calls, unlike single-call objects

If another client makes a call on a singleton class after the running instance of that class has died, a new instance will be created. This new instance will handle all calls from all clients until its lifetime expires. Note, however, that for a singleton object accessible at a given URL, there is never more than one instance of the class active at any time.

Client-Activated Objects

Even though a client can use the new operator to "create" an instance of a single-call or singleton object, the server doesn't really create this instance until the first method call from the client arrives. This is why these two choices are called server-activated: The server is in charge of determining when activation occurs. Client-activated objects, by contrast, are explicitly created when the client requests it. The server still does the actual creation, of course, since that's where the object is running (the name is something of a misnomer). Still, the distinction between client-activated objects and the two types of server-activated objects is significant. The most important difference is that with client-activated objects, each client gets its own object instance, and each object can maintain internal state specific to its client between method calls; the object isn't destroyed after each call. Instead, as described later in this section, each client-activated object has a lease that determines when the object is destroyed.

Each client gets its own instance of a client-activated object

Just as with the first two types of remotely accessible objects, the server must register the type before the client can access it. As before, this can be done either through explicit calls or via a configuration file. To create the object, the client can also make explicit calls, much like the previous cases, or rely on a configuration file. In either case, the client can use either the new operator or make an explicit call to the CreateInstance method provided by the Activator class. (GetObject can't be used with client-activated objects.) Both of these directly contact the server, which then creates an instance of the specified client-activated type. All calls made by the client to this object will be handled by this instance, and each client that creates a client-activated object of this type will have its own instance.

A client can create a client-activated object using the new operator or Activator.CreateInstance
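A sketch of the client side, assuming the server has registered Calculator (a hypothetical class) as a client-activated type via RemotingConfiguration.RegisterActivatedServiceType or its configuration file:

```csharp
using System;
using System.Runtime.Remoting;

class Client
{
    static void Main()
    {
        // Tell the infrastructure that new Calculator() means a
        // client-activated object on this (hypothetical) server
        RemotingConfiguration.RegisterActivatedClientType(
            typeof(Calculator), "tcp://somehost:8085");

        // Unlike the server-activated case, this call contacts the server
        // immediately, creating an instance dedicated to this client
        Calculator calc = new Calculator();

        // Activator.CreateInstance reaches the same result
        Calculator calc2 = (Calculator)Activator.CreateInstance(
            typeof(Calculator));
    }
}
```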

One problem remains: When is a client-activated object destroyed? With single-call and singleton objects, the server decides when to destroy the object. With client-activated objects, however, the server can't destroy the object until it knows the client is finished using it. Theoretically, the client could tell the server when it's done with the object, but what happens if the client fails unexpectedly, or just forgets to do this? The server could wind up with objects that no longer have clients yet will never be destroyed.

To avoid this problem, each client-activated object has a lease assigned to it[2]. The lease controls how long an object can remain in existence. A client can set an object's lease time when that object is created, or an administrator can control default lease times. Optionally, each method call from a client can reset the lease timer to zero. If an object's lease time elapses, the lease manager in the app domain that contains this object contacts any sponsors of the object. If any of these sponsors wishes to renew the lease, the object's lifetime is extended. If not, the object can be marked for garbage collection. Clients can also explicitly extend the lease of an object or even set it to infinity, ensuring that the server won't destroy it prematurely.

[2] In fact, leasing is used to control the lifetime of all remotely accessed MBR objects, including those passed as parameters.

A client-activated object is destroyed when its lease expires
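An MBR class can adjust its own default lease by overriding the InitializeLifetimeService method it inherits from MarshalByRefObject. The lease times below are arbitrary examples:

```csharp
using System;
using System.Runtime.Remoting.Lifetime;

public class Calculator : MarshalByRefObject
{
    public override object InitializeLifetimeService()
    {
        ILease lease = (ILease)base.InitializeLifetimeService();

        // Lease properties can only be changed before the lease is active
        if (lease.CurrentState == LeaseState.Initial)
        {
            // The object lives for at least five minutes...
            lease.InitialLeaseTime = TimeSpan.FromMinutes(5);
            // ...and each incoming call extends the lease by two more
            lease.RenewOnCallTime = TimeSpan.FromMinutes(2);
        }
        return lease;
    }

    // Returning null from this method instead would give the object
    // an infinite lease, so the server would never destroy it
}
```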

Remotely activating and accessing objects is inherently nontrivial. Is it better to provide many options, running the risk of making the technology too complex to use? Or should the design stay simple, supporting only the most common scenarios? .NET Remoting aims for a middle ground, offering built-in services for common situations while still allowing enough complexity to address more advanced applications. Pleasing everybody is hard, but .NET Remoting offers enough choice to please at least most of the people most of the time.

.NET Remoting provides a diverse group of options

Perspective: Why Are There Two Separate SOAP Implementations in the .NET Framework?

.NET Remoting can use the SOAP Formatter to handle SOAP messages. ASP.NET Web Services also uses SOAP, although it provides its own code for this. On the face of it, this makes no sense. Why include two completely separate implementations of the same technology in the .NET Framework class library?

The short answer is: different groups, different goals. Recall that the goal of .NET Remoting is to communicate effectively across a network when both client and server are built on the .NET Framework. SOAP is used primarily because, when mapped to HTTP, it allows this communication to pass through firewalls. In ASP.NET, on the other hand, the goal is to interoperate with any other implementation of SOAP, not just with the .NET Framework. SOAP is used both because it can pass through firewalls and because it's supported by many vendors.

A primary result of these distinct goals is different approaches to mapping serialized CLR types into XML. .NET Remoting uses the SOAP formatter, which allows everything that can be expressed by a CLR-based application to be passed across the network, including private data members and more. ASP.NET's SOAP implementation is not so committed to full-fidelity transfer of CLR types. Instead, it strives to produce a purely standard XML representation in everything it transmits, so it uses the XmlSerializer class to serialize and deserialize information. Sending a serialized type across ASP.NET's SOAP implementation won't send any private data members, for example, since XML has no notion of private members. The XmlSerializer emphasizes faithfulness to XML's XSD type system, while the SOAP formatter used in .NET Remoting emphasizes faithfulness to the CLR types.

Where .NET Remoting targets the homogeneous case, ASP.NET's Web services are optimized for heterogeneity. Given the complexities engendered by different type systems and different vendors, the need for multiple implementations of the same thing shouldn't be so surprising.
