XML: A Manager's Guide (2nd Edition) (Addison-Wesley Information Technology Series)

Various standard and industry initiatives have attempted to tackle the issues of messaging protocol requirements and the two different architectures. The goal of these initiatives is to provide standardization in the name of interoperability. Just as people want to ensure that other parties can understand the data in a document by using a standard XML format, they also want to exchange these documents in a business context by using standard protocols. In determining which of these initiatives is relevant to you, you must consider both the type of architecture you plan to use and the scope of functionality you require.

It would be nice if just knowing whether you will be using a business document or a remote interface architecture told you which standards to use, but, sadly, that is not the case. However, it does help. If you plan to have a remote interface architecture, you know that SOAP is relevant but that neither BizTalk nor ebXML is. You also have to determine if higher-level Web Services protocols are relevant. On the other hand, if you plan to have a business document architecture, you pretty much need to choose among BizTalk, ebXML, and approaches that leverage Web Services protocols within a proprietary business document framework.

While there are certainly philosophical and platform issues that affect this choice, there is an important functional one as well. BizTalk and ebXML have different scopes. BizTalk focuses on delivering a relatively simple document wrapper with higher-level functionality contained in the Microsoft tools. ebXML focuses on delivering a complete methodology and software stack for B2B commerce.

SOAP / W3C XML Protocol

The good news is that all major Web Services initiatives have decided to use SOAP as their low-level XML messaging protocol. There has also been a great deal of cooperation within the developer community to ensure the interoperability of different implementations. Thus XML messaging has, to a certain extent, achieved the goal of using common infrastructure. However, as you will see, these higher-level initiatives use SOAP somewhat differently.

The bad news is that SOAP made itself attractive for a wide variety of uses by deferring many important issues. SOAP provides no extended features. It does provide a mechanism for adding extended features, but such additions aren't necessarily interoperable. The thinking behind this minimalist approach is that it's more important to establish a common low-level XML messaging protocol now rather than figure out how to standardize even a relatively small set of extended features. Eventually, as people begin depending on such features, they could migrate into the SOAP standard if necessary.

Microsoft, DevelopMentor, and Userland Software collaborated on version 1.0 of the Simple Object Access Protocol. The effort rapidly expanded to include a large group of vendors, resulting in the widely used version 1.1. Then these vendors submitted the protocol to the W3C, where it is now evolving under the direction of the XML Protocol Working Group. Hopefully, this working group will issue SOAP 1.2 as a Recommendation by the time you read this.

SOAP works much like our hypothetical messaging protocol. Example 4-1 shows a simple SOAP request-response pair for the shipping status of an order, using an HTTP binding. Example 4-1a is the request. The request arrives via an HTTP POST whose body contains the SOAP message. The SOAP wrapper is called an Envelope, and it consists of two parts. The Header, which is optional and omitted from this example, contains information for custom extended features, and the Body provides a standardized encoding of the function call. The location of the target application is encoded in the namespace declaration. The function is indicated by a corresponding element name, in this case the "GetStatus" element. The parameters of the request are encoded with corresponding elements, in this case the "Acct" and "Order" elements.

Example 4-1a

POST /ShippingStatus HTTP/1.1
Host: www.foocompany.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "http://www.foocompany.com/SOAP/Shipping"

<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP:Body>
    <m:GetStatus xmlns:m="http://www.foocompany.com/SOAP/Shipping">
      <Acct>ABCDEFGHI</Acct>
      <Order>123456789</Order>
    </m:GetStatus>
  </SOAP:Body>
</SOAP:Envelope>
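To make the structure of the request concrete, here is a minimal Python sketch that assembles the same envelope with the standard library. The function names `build_get_status_request` and `http_headers` are hypothetical helpers introduced for illustration; the namespaces and element names are taken from Example 4-1a. The sketch only builds the message and its headers and does not send anything.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENCODING = "http://schemas.xmlsoap.org/soap/encoding/"
SHIPPING_NS = "http://www.foocompany.com/SOAP/Shipping"  # from Example 4-1a


def build_get_status_request(acct: str, order: str) -> bytes:
    """Return a SOAP 1.1 envelope carrying a GetStatus call (hypothetical helper)."""
    # Ask ElementTree to use the same prefixes as the printed example.
    ET.register_namespace("SOAP", SOAP_NS)
    ET.register_namespace("m", SHIPPING_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    envelope.set(f"{{{SOAP_NS}}}encodingStyle", SOAP_ENCODING)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # The element name selects the function; child elements carry parameters.
    call = ET.SubElement(body, f"{{{SHIPPING_NS}}}GetStatus")
    ET.SubElement(call, "Acct").text = acct
    ET.SubElement(call, "Order").text = order
    return ET.tostring(envelope, encoding="utf-8")


def http_headers(payload: bytes) -> dict:
    """Headers for the HTTP binding shown in Example 4-1a."""
    return {
        "Content-Type": 'text/xml; charset="utf-8"',
        "Content-Length": str(len(payload)),
        "SOAPAction": f'"{SHIPPING_NS}"',
    }
```

The payload and headers could then be handed to any HTTP client as the body and headers of a POST to /ShippingStatus.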

Example 4-1b is the response. The response comes embedded in the HTTP 200 response. It uses the same wrapper as the request, but with information about the return values instead of the function call. The outermost body element indicates which function is responding, in this case "GetStatusResponse." Its child elements contain the return types and their values, in this case a single "Status" element with a value of "Shipped."

Example 4-1b

HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP:Body>
    <m:GetStatusResponse xmlns:m="http://www.foocompany.com/SOAP/Shipping">
      <Status>Shipped</Status>
    </m:GetStatusResponse>
  </SOAP:Body>
</SOAP:Envelope>
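On the requesting side, the return value has to be dug out of the envelope again. The following Python sketch shows one way this might look; `parse_status` is a hypothetical helper name, and the embedded `RESPONSE` simply repeats the body of Example 4-1b.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SHIPPING_NS = "http://www.foocompany.com/SOAP/Shipping"  # from Example 4-1b

RESPONSE = b"""\
<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP:Body>
    <m:GetStatusResponse xmlns:m="http://www.foocompany.com/SOAP/Shipping">
      <Status>Shipped</Status>
    </m:GetStatusResponse>
  </SOAP:Body>
</SOAP:Envelope>
"""


def parse_status(response_xml: bytes) -> str:
    """Extract the Status return value from a GetStatusResponse envelope."""
    root = ET.fromstring(response_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("not a SOAP envelope: missing Body")
    # The outermost Body element names the function that is responding.
    reply = body.find(f"{{{SHIPPING_NS}}}GetStatusResponse")
    if reply is None:
        raise ValueError("Body does not contain GetStatusResponse")
    status = reply.findtext("Status")
    if status is None:
        raise ValueError("GetStatusResponse has no Status element")
    return status.strip()
```

Note that the code hard-wires the element names of this one interface, which is exactly the dependence on interface documentation discussed next.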

There are a couple of interesting points to note about this SOAP exchange. First, the element structures of the function call and the return values are completely dependent on a particular interface. The requester must know this structure before it can construct a valid message. Usually, a programmer will need documentation of this interface and then have to write the requesting code to produce the appropriate SOAP messages. In some cases, it may be possible to use higher-level tools to map external interfaces to internal data structures, but this approach still requires someone with programming skills who understands how these pieces work.

The second interesting point is that the HTTP binding obviously depends on the HTTP interaction model. In this case, the return values are embedded in the HTTP response to the HTTP POST method containing the request. This approach results in an inherently synchronous model: the requester must wait for the HTTP response before proceeding. Of course, the response could just contain an acknowledgment of the request, with the results sent later in a separate SOAP message. Unfortunately, the SOAP specification does not describe how to identify individual messages. If one piece of software sends two different "GetStatus" messages to another piece of software and then a "GetStatusResponse" message comes back, SOAP has no mechanism to indicate a correspondence with a particular "GetStatus" message. So using SOAP in an asynchronous model requires additional agreements between sender and recipient.
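One form such an additional agreement could take is a message identifier that the sender attaches to each request and the recipient echoes back with the result. The Python sketch below illustrates the bookkeeping on the sender's side. Everything in it is an assumption for illustration: SOAP itself defines no such identifier, and the `Correlator` class and its method names are made up.

```python
import uuid
from typing import Callable, Dict


class Correlator:
    """Match asynchronous replies to the requests that caused them.

    Assumes both parties have agreed, outside of SOAP, to carry a
    message identifier in each request and to echo it in the reply.
    """

    def __init__(self) -> None:
        self._pending: Dict[str, Callable[[str], None]] = {}

    def register(self, on_reply: Callable[[str], None]) -> str:
        """Record an outgoing request and return the identifier to send with it."""
        message_id = str(uuid.uuid4())
        self._pending[message_id] = on_reply
        return message_id

    def dispatch(self, in_reply_to: str, status: str) -> bool:
        """Route a reply to its request; return False if no request matches."""
        handler = self._pending.pop(in_reply_to, None)
        if handler is None:
            return False  # unknown or duplicate reply
        handler(status)
        return True
```

With this in place, two outstanding "GetStatus" requests can be told apart because each reply names the identifier of the request it answers.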

The lack of message identifiers is symptomatic of a larger issue. SOAP does not include any extended features. Developers must define custom elements and put them into the Header if they want security, reliability, or transactions. Of course, different developers may define these tags differently, leading to an interoperability problem. Example 4-2 shows a SOAP Header with security and transaction information. The encoding of this information is completely hypothetical but nevertheless valid from the perspective of the SOAP standard.

Example 4-2

<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <s:Authenticate
        xmlns:s="http://www.foocompany.com/SOAP/Security"
        SOAP:mustUnderstand="1">
      <Method>SSL</Method>
      <Credential>
        <CN>www.barcorp.com</CN>
      </Credential>
    </s:Authenticate>
    <t:Transaction
        xmlns:t="http://www.foocompany.com/SOAP/Transaction">
      <TransID begin="false" commit="false" rollback="false">
        123456789
      </TransID>
      <SequenceNumber>3</SequenceNumber>
    </t:Transaction>
  </SOAP:Header>
  ...
</SOAP:Envelope>

Example 4-2 has hypothetical encodings for two extended features: authentication and transactions. The exact elements used in these encodings are completely made up. Developers could certainly use the implied format to implement these features, but the resulting software would be incompatible with any other implementation. What is standardized is the specification that understanding the authentication encoding is mandatory, while understanding the transaction encoding is optional. When a requester wants to use an extended feature, it specifies through the mustUnderstand attribute whether the recipient "must support" the feature. This attribute enables developers of proprietary extensions to ensure that their software interacts only with software that subscribes to the same proprietary model.

Specifying "must support" means that the recipient must return an error if it cannot support the feature. If "must support" is not specified, the recipient should use the feature if it can support it but may continue processing if it cannot. Of course, mandating support for extended features means that all the parties must either have agreements about these Header elements or use a higher-level standard. Otherwise, using the extended features simply results in many unprocessable messages.
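The recipient's side of this rule can be sketched in a few lines of Python. The helper name `unsupported_mandatory_headers` is hypothetical, and the sample envelope is a trimmed, well-formed variant of the made-up headers in Example 4-2. The sketch assumes SOAP 1.1, where the flag is the namespace-qualified `mustUnderstand` attribute and a recipient that cannot honor it answers with a MustUnderstand fault.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

SAMPLE = """\
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <s:Authenticate xmlns:s="http://www.foocompany.com/SOAP/Security"
                    SOAP:mustUnderstand="1">
      <Method>SSL</Method>
    </s:Authenticate>
    <t:Transaction xmlns:t="http://www.foocompany.com/SOAP/Transaction">
      <SequenceNumber>3</SequenceNumber>
    </t:Transaction>
  </SOAP:Header>
  <SOAP:Body/>
</SOAP:Envelope>
"""


def unsupported_mandatory_headers(envelope_xml: str, understood: set) -> list:
    """List mandatory header entries that this recipient does not understand.

    A non-empty result means the message must be rejected with a fault.
    Entries without mustUnderstand="1" may simply be ignored.
    """
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return []
    flag = f"{{{SOAP_NS}}}mustUnderstand"
    return [
        entry.tag
        for entry in header
        if entry.get(flag) == "1" and entry.tag not in understood
    ]
```

A recipient that knows only the transaction encoding would have to reject the sample message, while one that knows only the authentication encoding could process it and skip the transaction entry.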

Support for security and transactions works well with this approach because the only issue is whether the recipient knows how to act on the information contained in the Header. Business document applications, however, need guaranteed delivery. Reliability features affect the transport layer, making them somewhat problematic. For each transport binding, partners must agree on how to achieve reliability for a message. There are currently initiatives to add reliability to SOAP, but the complexity of these issues currently makes using SOAP for business document applications more difficult than for remote interface applications. The best solution is to layer a standard on top of SOAP that adds the capabilities necessary to support business document applications.

BizTalk

BizTalk is the Microsoft-backed initiative for business document applications. BizTalk Framework is the open specification for how these applications should behave. BizTalk Server is the Microsoft product that implements this specification. Other vendors may offer implementations as well, but the specification does defer the precise nature of certain optional features, thereby potentially limiting the interoperability of these alternative implementations.

The BizTalk Framework defines a set of header tags that extend SOAP to meet the needs of business document applications. From a manager's perspective, the exact syntax of these tags is not particularly relevant, but the capabilities they deliver are significant. Message properties, endpoint, and service tags address a major shortcoming of SOAP by providing the information necessary for asynchronous communication. Manifest and process management tags further extend the model by allowing for non-XML attachments and automated workflow management.

As you saw in the discussion of SOAP, the lack of message identifiers severely limits the protocol's usefulness in an asynchronous model. The BizTalk Framework makes such identifiers mandatory, along with other useful information such as a timestamp, expiration, and topic. This metadata about the message gives servers the foundation to transport, route, and process all the messages in a complex business interaction properly.

Perhaps the most important feature enabled by the message property tags is the ability to specify the sender and recipient of business documents logically. The transport of the underlying SOAP binding obviously knows the physical endpoints of the message transfer, but this level of addressing merely ensures the successful movement of the message from one machine to another. It does not ensure that the application capable of handling the particular request receives it and processes it. The BizTalk endpoint tags specify the sender and recipient in terms of logical business entities, such as the "Accounts Payable Department of Foo Company." Just as a logical e-mail address enables mail servers to route an e-mail through many hops to the correct desktop, the endpoint tags enable messaging servers to route a message to the application designated to handle the message.

The message property and endpoint tags form the basis for reliable messaging. Suppose the sender wants guaranteed delivery of a given message to a particular recipient. To do its job, the underlying messaging system needs to know the logical recipient and the message identifier. The logical recipient tells the system which party has to acknowledge the message, and the message identifier makes sure a specific recipient gets a specific message. A delivery receipt serves as the tangible proof that the messaging system has done its job. To enable all this processing with the BizTalk Framework, the sender requests a delivery receipt and may optionally request a commitment receipt. The commitment receipt goes a step further by requiring the recipient to specify the date by which it will provide a response.
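The bookkeeping behind delivery receipts can be sketched as follows. This Python example is only an illustration of the idea: the class and field names (`Outbound`, `DeliveryTracker`, `overdue`) are hypothetical and do not reproduce the BizTalk Framework tag syntax. It assumes each message carries an identifier, a logical recipient, and a deadline by which the receipt must arrive.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class Outbound:
    message_id: str             # mandatory message identifier
    recipient: str              # logical address, not a machine name
    receipt_due: datetime       # deadline for the delivery receipt
    receipt_from: Optional[str] = None


class DeliveryTracker:
    """Track which sent messages have been acknowledged by the right party."""

    def __init__(self) -> None:
        self._outbox: Dict[str, Outbound] = {}

    def sent(self, message: Outbound) -> None:
        self._outbox[message.message_id] = message

    def receipt(self, message_id: str, from_party: str) -> bool:
        """Accept a receipt only if it names a known message and
        comes from that message's logical recipient."""
        message = self._outbox.get(message_id)
        if message is None or message.recipient != from_party:
            return False
        message.receipt_from = from_party
        return True

    def overdue(self, now: datetime) -> List[str]:
        """Identifiers of messages whose receipt deadline has passed."""
        return [
            m.message_id
            for m in self._outbox.values()
            if m.receipt_from is None and now >= m.receipt_due
        ]
```

Messages reported by `overdue` are the ones the sender would need to resend or escalate, since no proof of delivery exists for them.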

The primary goal of a business document application is to facilitate the flow of these documents among trading partners. Realistically, a business document may have a set of corresponding supporting documents. Consider the case of a purchase order issued by a semiconductor design firm to an overseas fabrication plant. A number of additional documents such as CAD drawings and process notes would accompany the purchase order to form a precise definition of the expected service. BizTalk accommodates this requirement using a manifest that catalogs all attachments. Each manifest entry provides descriptive information about the attachment and a reference to its location in a larger multipart MIME package that contains the business document and its attachments, similar to the packaging of e-mail attachments.
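The packaging idea can be illustrated with Python's standard `email` library, which builds the same kind of multipart MIME structure used for e-mail attachments. This is a sketch under stated assumptions: the `Message`, `Manifest`, `Attachment`, and `Description` element names and the `build_package` helper are invented for illustration and are not the BizTalk Framework syntax. Each manifest entry refers to its attachment through a `cid:` reference that matches the attachment's Content-ID.

```python
import xml.etree.ElementTree as ET
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_package(document_xml: str, attachments: dict) -> MIMEMultipart:
    """Bundle a business document with its supporting files.

    attachments maps a content id to a (description, raw bytes) pair.
    """
    manifest = ET.Element("Manifest")  # hypothetical element names
    parts = []
    for cid, (description, data) in attachments.items():
        entry = ET.SubElement(manifest, "Attachment", href=f"cid:{cid}")
        ET.SubElement(entry, "Description").text = description
        part = MIMEApplication(data)  # non-XML payload, e.g. a CAD drawing
        part.add_header("Content-ID", f"<{cid}>")
        parts.append(part)

    # The first MIME part carries the manifest plus the business document.
    root = ET.Element("Message")
    root.append(manifest)
    root.append(ET.fromstring(document_xml))
    primary = MIMEText(ET.tostring(root, encoding="unicode"), "xml", "utf-8")
    primary.add_header("Content-ID", "<primary>")

    package = MIMEMultipart("related")
    package.attach(primary)
    for part in parts:
        package.attach(part)
    return package
```

A receiver would read the manifest from the first part and then look up each referenced attachment by its Content-ID.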

Another reality of business document applications is that a given business document rarely exists in isolation. It is usually just part of a larger set of interactions among trading partners. Remember the series of document exchanges necessary to complete the "Purchasing" process discussed earlier. An emerging class of products enables the automated processing of such multistep processes based on a shared definition of the process. The BizTalk Framework accommodates such software by providing a means to specify a reference to a process definition. But the format of this process definition is not standardized. The BizTalk Server product uses this mechanism to leverage its built-in workflow capabilities, based on its own process definition format called XLANG.

While the BizTalk tags provide the information necessary to exchange business documents among trading partners, such exchanges naturally require security. BizTalk messages may use a variety of underlying transports, so the BizTalk Framework defers the issue of securing the communications channel to these transports. It does not, however, defer the issue of securing the messages themselves. Because it uses the standard MIME packaging for messages, it uses the S/MIME standard for securing them. This standard covers the encryption and signing of messages and is already in widespread use for applications such as securing e-mail. Note that S/MIME does rely on each party having some degree of public key infrastructure to take advantage of its most useful features.

While Microsoft's BizTalk Server was the only commercial product available that implements the BizTalk Framework at the time of this writing, there were other implementations under development. A number of custom programs demonstrated the ease of extending a SOAP implementation to create a server that could process BizTalk messages. The BizTalk Framework addresses the basic issues of asynchronous, reliable, and secure business document exchange, but it ignores the condition that trading partners should agree on common business processes to execute with this document exchange. Even simple items like the precise format for logical addressing of sender and recipient are currently not part of the standard. Of course, the Microsoft product suite provides a number of fine tools for dealing with these concerns.

ebXML

In contrast to BizTalk, ebXML deals with the function of establishing automated business processing among trading partners. It specifies a much grander architecture that starts with business process modeling and proceeds to the negotiation of shared business processes. This initiative is proceeding under the auspices of the Organization for the Advancement of Structured Information Standards (OASIS) and UN/CEFACT, an international body with a great deal of experience in facilitating electronic business.

Of course, ebXML does need the basic plumbing to move XML business documents from one point to another. ebXML takes a very similar approach to BizTalk in providing this structure by extending SOAP with its own header tags. These tags address the same issues, such as message identity, logical addressing, and reliable delivery. There are some other additions, such as enhanced error reporting and status checking, but these are not the primary differentiators. Rather, ebXML offers an added set of conventions on top of the basic plumbing.

ebXML looks at business document messaging as part of a larger process of negotiating automated business processes. This perspective comes from its roots in traditional EDI. The experience of EDI is that enterprises typically need to fit electronic business methods mostly within their existing business processes. For two trading partners to automate their interaction, they must arrive at a common process that meets these constraints. Each partner must discover how the other is willing to do business and tailor its electronic interactions to accommodate these practices.

Obviously, this process of discovery goes beyond simply inspecting programming interfaces. While not strictly required, ebXML emphasizes the use of business modeling and accompanying documentation to educate potential partners. When using a modeling language, the specification mandates the use of the Unified Modeling Language (UML).

ebXML has a framework that defines the types of modeling constructs that enterprises should define. An enterprise may adopt a Role within a Business Process. So "Manufacturing" would have "Designers," "Raw Materials Suppliers," and "Factories." A set of trading partner interactions to accomplish a set of interdependent Business Processes is a Business Collaboration. So a group of trading partners could link the "Manufacturing," "Shipping," and "Distribution" Business Processes as part of the larger "Deliver Goods to Market" Business Collaboration.

A particular Business Collaboration consists of multiple Business Transactions. Each Business Transaction defines a pattern of Message exchanges that advance the Business Collaboration by a logical unit of progress. The hypothetical "Deliver Goods to Market" Business Collaboration might include such Business Transactions as "Authorize Manufacturing," "Update Production Schedule," and "Select Shipping Carrier." The "Authorize Manufacturing" Business Transaction could require "Specification," "Quote," and "Purchase Order" messages. Each enterprise may then specify the format it supports for the messages it must exchange within the Business Collaboration. Figure 4-5 summarizes the hypothetical model.

Figure 4-5. Abbreviated ebXML Business Model

Obviously, this model-driven approach to electronic business processes generates a substantial number of documents. Enterprises signal their support for various Roles within different Business Processes by submitting documents to an ebXML registry, where they become available to potential trading partners. Each enterprise maintains control over the conditions under which others may access its information. This registry is part of a larger repository for all human and machine information generated during trading partner collaborations.

Within the registry, each enterprise maintains a Collaboration Protocol Profile (CPP). A CPP describes the pattern of interactions an enterprise will use when fulfilling each of its possible Roles within each of its supported Business Processes. A CPP includes the technical interfaces, messaging, and security capabilities supported by the enterprise. When two partners have opposing Roles within the same Business Process, their interaction patterns are compatible for this Business Process. If they have a common subset of technical capabilities, they can automate this business process. They create a Collaboration Protocol Agreement (CPA) that governs this automated process. The CPA references a Business Process Schema Specification (BPSS) that provides an XML description of the workflow rules used to execute the business process. All CPPs, CPAs, and BPSSs reside in the registry.
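The matching logic described above can be sketched in Python. The sketch is a deliberate simplification under stated assumptions: the `Profile` class, the `COMPLEMENTARY` role table, and `propose_agreement` are hypothetical stand-ins, not the ebXML CPP or CPA formats. Two profiles are treated as compatible when they hold complementary Roles in the same Business Process and share at least one transport and one security capability.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

# Hypothetical table of Role pairs that oppose each other in a process.
COMPLEMENTARY = {frozenset({"Buyer", "Seller"})}


@dataclass
class Profile:
    """Simplified stand-in for a Collaboration Protocol Profile."""
    party: str
    roles: Dict[str, str]        # Business Process -> Role
    transports: FrozenSet[str]
    security: FrozenSet[str]


def propose_agreement(a: Profile, b: Profile, process: str) -> Optional[dict]:
    """Return a simplified agreement, or None if the profiles do not match."""
    role_a, role_b = a.roles.get(process), b.roles.get(process)
    if role_a is None or role_b is None:
        return None  # one party does not support this Business Process
    if frozenset({role_a, role_b}) not in COMPLEMENTARY:
        return None  # the Roles do not oppose each other
    transports = a.transports & b.transports
    security = a.security & b.security
    if not transports or not security:
        return None  # no common technical capabilities to automate with
    return {
        "process": process,
        "parties": {a.party: role_a, b.party: role_b},
        "transports": sorted(transports),
        "security": sorted(security),
    }
```

For example, a buyer that supports HTTP and SMTP and a seller that supports only HTTP would end up with an agreement restricted to HTTP.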

Reaching agreement on CPPs would clearly be easier if trading partners shared a basic vocabulary of business concepts. To this end, ebXML includes a set of common Process Templates and Core Components. The Process Templates cover interactions in common Business Processes such as finance, product development, product manufacturing, transportation and logistics, and product support. The Core Components include XML Schema definitions for elements that define entities like names, addresses, accounts, product descriptions, and prices. Together, they greatly reduce the modeling and negotiation burden for enterprises that use them. Moreover, by mining the repository for common patterns, industry groups can further expand these templates and components, making the process of implementing ebXML systems easier and easier as adoption grows. Figure 4-6 shows the relationships among the different ebXML components.

Figure 4-6. ebXML Components

ebXML is a very inclusive initiative. Large enterprises can have sophisticated and expensive ebXML gateways that integrate with their backend systems to achieve massive efficiency improvements. But the potential advantages for large enterprises don't preclude the participation of small ones. Because the entire system is designed for human readability, a one-person company could easily work with the largest corporations by manually processing ebXML documents accepted through e-mail.

For those enterprises that want a lighter-weight business document messaging model, OASIS has another group working on the Universal Business Language (UBL). The goal of this group is to define a standard set of formats for business documents, such as "Purchase Order," "Shipping Schedule," and "Commercial Invoice." These formats will be compatible with the ebXML Core Components but usable without the rest of the ebXML model. Many enterprises do not initially require all of ebXML's sophistication, but it would be nice if the business document formats they used were compatible with ebXML. It would also be nice if these enterprises could easily upgrade to the full ebXML model should the need arise. Moreover, there are a number of competing business document standards, including commerce eXtensible Markup Language (cXML), RosettaNet, and XML Common Business Library (xCBL). These standards are all somewhat similar because each took the same approach of trying to reformulate preexisting EDI formats as XML, but the existence of multiple options has somewhat fragmented compatibility. Unifying them with UBL will hopefully lead to more universal interoperability.
