XML, Web Services, and the Data Revolution
By Frank P. Coyle
Chapter 3. XML in Practice
As Figure 3.2 shows, the first wave of XML vocabularies centered on defining structures for specific or vertical industries: applications of XML or SGML, such as schemas, document type definitions (DTDs), namespaces, and style sheets, were used to leverage the Web for business purposes. Many of the early applications of XML were extensions of prior development based on SGML, so many companies working with SGML were able to get a head start in XML.
Figure 3.2. The first wave of XML applications: vertical industry data descriptions.
Companies seeking to position themselves in the global economy find XML attractive for several reasons:
In this section we'll look at just a few of the hundreds of vertical industry initiatives centered around XML. These include the Open Financial Exchange (OFX), Mortgage Industry Standards Maintenance Organization (MISMO), and the HR-XML Consortium, an initiative to standardize data for human resources with a focus on recruitment. Each brings to the table its own challenges and solutions, many still in the formative stage, but since with XML we're not constrained by lock-in to binary data representations, data descriptions can evolve with requirements.
In looking at different vertical industry approaches to defining XML vocabularies, we notice that two themes recur. One is the struggle over whether to use elements or attributes to represent data. The second theme, which is more subtle, is whether to focus on defining XML for data storage across an industry or to concentrate on data representations for exchanging data between partners within an industry. As we'll see in the following examples, OFX and MISMO tackle the problem of XML for data exchange, while HR-XML takes on the challenge of defining XML formats for persistent data storage.
Finance: OFX
One of the great ironies of the computer revolution is that it has pushed many clerical responsibilities up the corporate ladder, so that many of us now find ourselves typing and editing our own documents and using financial software to track our personal finances. However, one of the real challenges of any data management system is to keep data synchronized. The OFX specification is an XML-based language that enables brokerage clients to download account information directly into their accounting or tax-preparation software, such as Quicken or TurboTax. OFX also supports the exchange of financial information among financial services companies, their technology outsourcers, and consumers using Web- and PC-based software.
As in any effort to define an industry standard, consensus is required. OFX is an open consortium created by CheckFree, Intuit, and Microsoft in early 1997, and it now has the support of over 1,000 financial institutions, technology solution providers, and payroll companies. Major financial players in the OFX initiative include Prudential, TD Waterhouse Group, Inc., and T. Rowe Price.
OFX supports a range of financial activities, including consumer and small-business banking, consumer and small-business bill payment, bill presentment, and investment download and tracking, including stocks, bonds, and mutual funds. As Figure 3.3 shows, OFX enables the downloading of brokerage information to a user's PC. Downloads can go directly into Web and PC tax software and may include information from 401(k), 1099, and W2 tax forms. OFX also allows consumers to pay bills directly over the Web.
Figure 3.3. OFX enables brokerage clients to download account information directly into tax preparation software.
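To make the download scenario concrete, here is a minimal sketch of how receiving software might read an XML statement and turn it into rows for a ledger. The tag names (STATEMENT, TRANSACTION, DATE, MEMO, AMOUNT), the sample values, and the function name are assumptions made for illustration; they are not taken from the OFX DTD.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical statement fragment; tag names and values are illustrative
# assumptions, not the actual OFX vocabulary.
STATEMENT = """
<STATEMENT>
  <ACCOUNT>
    <ACCTID>12345</ACCTID>
  </ACCOUNT>
  <TRANSACTION>
    <DATE>2001-03-15</DATE>
    <MEMO>Dividend</MEMO>
    <AMOUNT>125.50</AMOUNT>
  </TRANSACTION>
  <TRANSACTION>
    <DATE>2001-03-20</DATE>
    <MEMO>Account fee</MEMO>
    <AMOUNT>-25.50</AMOUNT>
  </TRANSACTION>
</STATEMENT>
"""


def load_transactions(xml_text):
    """Parse a statement and return one dict per TRANSACTION element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for txn in root.iter("TRANSACTION"):
        rows.append({
            "date": txn.findtext("DATE"),
            "memo": txn.findtext("MEMO"),
            # Decimal avoids binary floating-point rounding on money values.
            "amount": Decimal(txn.findtext("AMOUNT")),
        })
    return rows


if __name__ == "__main__":
    for row in load_transactions(STATEMENT):
        print(row["date"], row["memo"], row["amount"])
```

Because the statement is plain text with named fields, the importing program needs no knowledge of the sender's internal storage format, only of the agreed vocabulary.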
As is often the case in broad-based initiatives of this sort, the umbrella consortium allows its members (financial services companies) to enhance application capability by adding new XML content, giving them an opportunity to support value-added features and position themselves in a competitive marketplace.
The focus of the OFX XML vocabulary has been on data exchange, not data storage. OFX makes no recommendation about how data should be represented in the permanent data stores of participants. The important objective for OFX is to define the data formats for moving data from one platform to another.
OFX has taken a strong stance in the elements versus attributes controversy, coming down firmly in favor of elements. The DTD for the financial data exchange defines over 450 elements and no attributes. The DTD does, however, make extensive use of entities, which are shortcut abbreviations that make the DTD and the XML documents themselves more readable.
Human Resources and HR-XML
The hiring and employee management done by human resources departments are data intensive. HR-XML is a nonprofit consortium dedicated to enabling an XML-based e-commerce and human resources data interchange format. The objective is to spare employers and vendors the risk and expense of having to agree upon and implement an ad hoc data exchange mechanism. With a published XML representation for HR data, it will be easier for any company to do business with other companies without having to implement a one-of-a-kind interchange mechanism. HR-XML's current work focuses on standards for staffing and recruiting, benefits enrollment, payroll, competencies, and workforce management.
For any organization attempting to define an XML data representation, it's important to create consensus among stakeholders. HR-XML includes a group called the Cross-Process Objects (CPO) Workgroup with three related roles within the HR-XML Consortium:
The CPO oversees teams that work on models and schemas for common HR objects. Driving the CPO effort is the fact that HR-XML specifically targets XML for data storage, not B2B transactional data. The usage scenario is one in which résumés will be written to a server as XML files. A program will load information from these files to a system of distributed databases, where an intranet-based query program will allow precise skills matches against the databases.
What's a Person?
One of the challenges confronted by the HR-XML Consortium has been to arrive at a consensus on what it means to be a Person (at least as far as hiring managers are concerned). While Person is often used in introductory XML examples, it's not trivial to define, given the requirement that any definition should be capable of global use in a consistent manner. The definition of a Person schema for HR-XML includes a number of requirements:
The result of the effort was the DTD shown in Listing 3.1. It's included here because it illustrates a short, readable design that mixes elements and attributes according to the well-respected schema design principle, "Use elements to represent domain data; use attributes for metadata." For example, the element FormattedName, which is used to describe the full name as it will appear on some document, includes an attribute called type, which is intended to describe the kind of presentation it indicates, for example, a legal form of the name, a form suitable for sorting, or just a default presentation that might be used to address an envelope. Similarly, the element Affix, intended to allow a title of some kind to be included with a name, is supported by an attribute that adds information about the affix, such as whether it represents an academic rank (Professor), an aristocratic title (Lord), or a military title (Colonel).
Listing 3.1 A DTD for the Person Element in the HR-XML Definition for Human Resources Applications [1]
<!ELEMENT PersonName (FormattedName*, LegalName?, GivenName*, PreferredGivenName?, MiddleName?, FamilyName*, Affix*)>
<!ELEMENT FormattedName (#PCDATA)>
<!ATTLIST FormattedName
    type (presentation | legal | sortOrder) 'presentation'>
<!ELEMENT LegalName (#PCDATA)>
<!ELEMENT GivenName (#PCDATA)>
<!ELEMENT PreferredGivenName (#PCDATA)>
<!ELEMENT MiddleName (#PCDATA)>
<!ELEMENT FamilyName (#PCDATA)>
<!ATTLIST FamilyName
    primary (true | false | undefined) 'undefined'
    prefix CDATA #IMPLIED>
<!ELEMENT Affix (#PCDATA)>
<!ATTLIST Affix
    type (academicGrade | aristocraticPrefix | aristocraticTitle | familyNamePrefix | familyNameSuffix | formOfAddress | generation | qualification) #REQUIRED>
[1] Copyright, The HR-XML Consortium. All Rights Reserved. http://www.hr-xml.org.
Examples of XML that satisfy the DTD in Listing 3.1 include a PersonName for Major John Smith (FormattedName comes first, as the content model requires):
<PersonName>
    <FormattedName>John Smith</FormattedName>
    <GivenName>John</GivenName>
    <FamilyName>Smith</FamilyName>
    <Affix type="formOfAddress">Major</Affix>
</PersonName>
and for Mrs. Jane H. Doe:
<PersonName>
    <GivenName>Jane</GivenName>
    <MiddleName>H.</MiddleName>
    <FamilyName>Doe</FamilyName>
    <Affix type="formOfAddress">Mrs.</Affix>
</PersonName>
Mortgage Banking: MISMO
Just about everyone who purchases a home acquires a mortgage loan. Mortgage loans are available through lending institutions such as banks and mortgage companies that supply the cash to buy the home. In order for lending institutions to continue to have money to deliver to borrowers, the loans are sold to companies such as Fannie Mae and Freddie Mac and packaged as mortgage-backed securities. This is big business. Over $378 billion in mortgage-backed securities were issued by Fannie Mae and Freddie Mac in 2000, a statistic that indicates the importance of the transfer of data between lending institutions and Fannie and Freddie.
In 1999 a group of industry representatives formed MISMO and in 2000 began to address electronic commerce issues in the mortgage industry. Their objective was to define an XML schema in the form of DTDs that could be used as the basis for data exchange within the industry. In formulating an XML schema, MISMO has been very explicit about what it is working to standardize. Unlike HR-XML, it is not trying to come up with formats for storing long-term data but is only attempting to standardize loan data as it moves between two organizations at some point in time. Of course, companies are free to archive data as it moves between servers, but the intent is only to describe the data that is needed to carry out the B2B transactions between lenders and Fannie Mae and Freddie Mac.
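One practical consequence of a shared exchange vocabulary is that the receiver of a loan document can check incoming values against the agreed-upon list. The following is a minimal sketch of that idea, not MISMO's own tooling. The element and attribute names (LOAN_APPLICATION, DOWN_PAYMENT, _Type, BORROWER) and the allowed values come from the DTD excerpt in Listing 3.2; the sample document, the invalid value, and the function name are hypothetical, and a real exchange would validate against the full DTD with a validating parser.

```python
import xml.etree.ElementTree as ET

# Enumerated values for DOWN_PAYMENT's _Type attribute, as given in Listing 3.2.
DOWN_PAYMENT_TYPES = {
    "BridgeLoan", "CashOnHand", "CheckingSavings", "DepositOnSalesContract",
    "EquityOnPendingSale", "EquityOnSoldProperty", "EquityOnSubjectProperty",
    "GiftFunds", "LifeInsuranceCashValue", "LotEquity",
    "OtherTypeOfDownPayment", "RentWithOptionToPurchase", "RetirementFunds",
    "SaleOfChattel", "SecuredBorrowedFunds", "StocksAndBonds", "SweatEquity",
    "TradeEquity", "TrustFunds", "UnsecuredBorrowedFunds",
}

# Hypothetical incoming fragment; "LotteryWinnings" is deliberately invalid.
SAMPLE = """
<LOAN_APPLICATION>
  <DOWN_PAYMENT _Type="GiftFunds"/>
  <DOWN_PAYMENT _Type="LotteryWinnings"/>
  <BORROWER/>
</LOAN_APPLICATION>
"""


def invalid_down_payment_types(xml_text):
    """Return the _Type values that are not in the enumerated list."""
    root = ET.fromstring(xml_text.strip())
    bad = []
    for down_payment in root.iter("DOWN_PAYMENT"):
        # _Type is declared #IMPLIED, so it may legitimately be absent.
        value = down_payment.get("_Type")
        if value is not None and value not in DOWN_PAYMENT_TYPES:
            bad.append(value)
    return bad


if __name__ == "__main__":
    print(invalid_down_payment_types(SAMPLE))
```

Keeping the permitted values in the shared definition, rather than in each partner's code, is what lets both sides of a transaction agree on what counts as well-formed loan data.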
In developing a schema for B2B data interchange it's important to establish consensus, which depends on communication. Along the path to standardization, MISMO released a draft version of its dictionary of common data items for review, focusing on information associated with mortgage loan applications. MISMO uses a centralized Web-based repository to provide a single location for managing data elements and generating XML document definitions. The DTD that has evolved is extensible, so other underwriting organizations can use it and extend it with the additional data they need for their own transactions. As Listing 3.2 shows, the DTD is designed around a top-level definition so that parts can be reused in other loan-related transactions.
Listing 3.2 A Portion of the DTD for MISMO
<!ELEMENT LOAN_APPLICATION (
    _DATA_INFORMATION?, ADDITIONAL_CASE_DATA?, AFFORDABLE_LENDING?, ASSET*,
    DOWN_PAYMENT*, GOVERNMENT_LOAN?, INTERVIEWER_INFORMATION?, LIABILITY*,
    LOAN_PRODUCT_DATA?, LOAN_PURPOSE?, LOAN_QUALIFICATION?, MORTGAGE_TERMS?,
    PROPERTY?, PROPOSED_HOUSING_EXPENSE*, REO_PROPERTY*, TITLE_HOLDER*,
    TRANSACTION_DETAIL?, BORROWER+ )>
<!ATTLIST DOWN_PAYMENT
    _Type ( BridgeLoan | CashOnHand | CheckingSavings | DepositOnSalesContract |
            EquityOnPendingSale | EquityOnSoldProperty | EquityOnSubjectProperty |
            GiftFunds | LifeInsuranceCashValue | LotEquity | OtherTypeOfDownPayment |
            RentWithOptionToPurchase | RetirementFunds | SaleOfChattel |
            SecuredBorrowedFunds | StocksAndBonds | SweatEquity | TradeEquity |
            TrustFunds | UnsecuredBorrowedFunds ) #IMPLIED>
Tracking XML Standards
The explosion of XML vocabularies has led to a need for a central repository to track the various XML initiatives. The Organization for the Advancement of Structured Information Standards (OASIS) is a nonprofit international consortium that creates interoperable industry specifications based on public XML and SGML standards. One aspect of the OASIS mission is to develop vertical industry applications, conformance tests, and interoperability specifications that make vertical standards usable. Table 3.1 lists various areas for which there are XML initiatives.
OASIS does not compete with but rather builds upon and supplements the work done by standards bodies such as W3C (for XML) or ISO (for SGML). OASIS's technical work generally falls into one of the following categories:
In keeping with the spirit of the Web and open standards, OASIS has adopted a Technical Committee Process that governs its technical work and provides a vendor-neutral home for standards, giving all interested parties, regardless of their standing in a specific industry, an equal voice in the creation of technical work.
Table 3.1. Some Vertical Industry XML Dialects Registered at OASIS