Securing Web Services with WS-Security: Demystifying WS-Security, WS-Policy, SAML, XML Signature, and XML Encryption


ge·stalt or Ge·stalt (gə-shtält) n. a configuration or pattern of elements so unified as a whole that it cannot be described merely as a sum of its parts. [1]

[1] Cognitive Science Laboratory's WordNet, Princeton University. http://www.cogsci.princeton.edu/~wn/

We speak of the gestalt of Web services because, although most presentations of Web services describe the piece-part standards and technologies that taken together equal Web services, we believe that approach does not do justice to the unified concept of Web services. Instead, it is much more instructive to look back at how distributed computing and middleware systems have evolved, how Web services relate to those previous generations of distributed computing, how Web services are similar to or different from their ancestors, and what all of this means for Web services security. First, you need to understand the business problems that have driven the need for Web-based middleware.

Application Integration

Application integration occurs when you allow different software systems to share or aggregate information. For most of the history of computing, application integration has happened between applications that reside within corporate boundaries, but increasingly, in a networked and integrated world, integrations reach outside organizational boundaries. The fact is, applications do not work in a vacuum; most need to communicate and be integrated with other applications to be effective.

Enterprise Application Integration

There is still a huge need for internal applications to share or aggregate information, and this creates a large backlog of application integration work. One common example of the need for internal application integration comes from the way companies organize data about their customers. Some of that data might be in a customer relationship management (CRM) system, but that is not where the billing history with those same customers resides. When those customers are part of the company supply chain, that aspect of the relationship with them might be in a totally separate system. In some cases, a large enterprise acquires a smaller one and then finds itself with two different vendors' CRM systems. Acquisitions are a common cause of the need to integrate internal applications. Most major enterprise IT initiatives, such as enterprise resource planning (ERP), CRM, sales force automation, and supply chain management, are application integration initiatives.

B2C and B2B Application Integration

Organizations are discovering that there is potentially much higher value when they integrate applications between themselves and others. You can find numerous cases of business-to-consumer (B2C) integration that create a holistic and better experience for consumers from a variety of disparate sources and applications. Amazon, Orbitz, and Yahoo! are all B2C application integration examples. Business-to-business (B2B) application integrations are increasingly important and common. Frequently, they integrate some sort of trading partner community. B2C and B2B integrations are helped by the fact that, for several years now, most companies have been building more and more of their capabilities with a Web interface and are therefore prepared to deal with a networked world.

Countless businesses, each with its unique core competencies and value-add products and services, have been building more and more of their capabilities with a Web interface since the creation of the Web. Sometimes the motivation has been to provide higher value in Web-facing applications by directly integrating a business partner's services as features of the host business's application (as opposed to having users click away and start using the partner's Web site!). Increasingly, there is a huge demand for making applications out of multiple, distinct Web delivery points. There are well-established standards for how you communicate on the Web, and these standards should be leveraged as you strive to build unified Web-based distributed applications. In some cases, the protocols needed to accomplish this did not exist, so the past few years have seen a frenetic effort, which continues apace, to nail down the remaining standards. Consistent with the Web, these protocols and standards are ostensibly vendor neutral, eliminating the mutually assured destruction of past distributed computing efforts.

Note

We say "ostensibly vendor neutral" because some large vendors are influencing the standards processes to such an extent as to call vendor neutrality into question. Although this sounds bad on the surface, in practice if all other vendors fall into line, the desired result is achieved: All vendors agree to and build to a unified standard. The big vendors feel there is so much at stake and the standards committee process can be so lengthy that they are short-circuiting the process by building out standards themselves and then presenting them to a standards group like OASIS for discussion, recommendation, and finalization .

A number of other business issues are driving the need to integrate applications and information. Let's explore them now, starting with the need to automate and streamline business processes.

Automating Business Processes

Business processes, even the simplest ones such as requesting vacation time, do not stay neatly within one application. For an employee vacation request, at least the HR systems and the corporate financial systems need to communicate to approve and accrue for employee paid time off. Different applications assist in different parts of a business process. In the case of processing an order, one application handles ERP, one handles inventory, one handles billing, one handles customer information, and so on. When processes cross application boundaries, information needs to be shared; otherwise, the humans involved will do a poorer or, at best, less efficient job. Frequently, humans compensate for unintegrated applications, and you will find manual translation of information between one system and another, even in the operationally best companies. This outcome is unacceptable in today's business environment. Efficient information exchange, transparency of actual transactions, and data integration are fundamental and critical to the business process. So too are repeatability, auditability, and accountability, all of which require secure, automated, and monitored middleware systems.

Information Aggregation Portals

A portal is an integrated and personalized Web-based interface to information, applications, and collaborative services. Access to most portals is limited to corporate employees (an intracompany portal) or corporate employees and certain qualified vendors, contractors, customers, and other parties within the extended enterprise (an intercompany portal). Portals are also used on the Web to provide a single site for consumers to see a variety of related information, goods, or services.

The trend toward portals as a way to provide an information aggregation point accessible via a browser is testament to the importance of integration. Portals are used to deliver more products and services to employees, partners, or consumers by aggregating information from numerous sources into a single browser interface. Examples of consumer portals are numerous, including CNN, CBS Marketwatch, portal sites for parents of college-age children, and many more. Increasingly, portals are also used to deliver a single point of access for employees or business partners to a company and its internal operations, such as an employee benefits portal or a partner activities portal. Portals also create a simple interface on a collection of features previously encapsulated in proprietary applications that would have required a client download but that now provide the same functionality through a browser. The ubiquity of the browser and the "thinness" of a browser-based client make this paradigm a convenient way to offer application services to the end user.

All these examples have some security challenges. Orbitz's business partners, such as Delta Airlines, Marriott Hotels, and Avis Rent-a-Car, want to respond to Orbitz's requests for product availability but do not want to provide that information to someone other than Orbitz, who they are sure will pay them for it. Company intranet portals are incredibly powerful and useful resources for the people authorized to use them but could be damaging to the company or its employees if accessed by unauthorized people. Even internally accessed applications now surfacing their functionality through browser interfaces must have stringent controls over information security. Because all these types of portals are being developed using Web services, these and other security requirements will have to be addressed by Web services security.

The Importance of Universal Application Connectivity

The need for a universal way of connecting applications has been a driving force in the industry for 20 years. It has led to middleware. Middleware is plumbing software that connects computers and applications together. The goal of middleware is to make it easy for applications (and their users) to access remote resources. Technologies such as the Distributed Computing Environment (DCE) were used to integrate back-end legacy systems with newer, lighter user interfaces like Windows. This need is especially strong on the Web because e-business forces companies to expose their business processes over the Web, and those business processes have traditionally been codified in "back-end" systems.

Virtually every application requires integration. Business process automation and workflow rely on integration. Portals require integration. Knowledge management requires integration. Business intelligence requires integration. Universal application connectivity is all about better decision making. It is about better customer services, better product initiatives, and even better homeland security.

The ultimate goal many hold for Web services is service-oriented architecture (SOA). This is the essence of universal application connectivity. SOA makes every application act as a service hub using the Web as middleware. Web services and SOA provide a significant boon to application integration by making the interface to every SOA-adapted application standard, published, discoverable, and self-describing.

The Evolution of Distributed Computing

Distributed computing is an approach to computer-to-computer communication. It has remained a goal ever since time-sharing on mainframes reached its limitations: either one computer was not really enough to complete the job, or islands of computing proved ineffective once business processes were seen to cross application boundaries. Distributed computing is also an approach to higher reliability and availability because it removes single points of failure for applications. The means of communication between the distributed systems, frequently referred to colloquially as plumbing, is called middleware.

Middleware

The Distributed Computing Environment was an early-'90s effort to standardize various competing remote procedure call (RPC) technologies. DCE was driven by the Open Group, a consortium of otherwise competing companies originally called the Open Software Foundation (OSF). DCE's goal was distributed applications across heterogeneous systems. This was part of the late-'80s to early-'90s focus on "open" systems. If you recall, this was right in the middle of the client/server computing paradigm. At the time, everyone had lots of high-productivity, relatively new client machines that ran nice windowing operating systems. But they also had lots of fast, expensive, business-critical back-end systems that they needed to utilize and even leverage further. This situation seems not to have changed much even today.

DCE was wildly successful in achieving some of its goals. It was the first attempt at middleware that operated independently of the operating system and networking technology, and it succeeded at that effort. DCE was implemented as a set of software services that reside on top of the operating system, middleware that uses lower-level operating system and network resources, as shown in Figure 2.1.

Figure 2.1. Components and services in DCE.

DCE broke ground on major distributed applications and formed part of the knowledge base from which Web services evolved. Barclays, a huge bank, processed live transactions from 2,000 distributed locations using DCE. Charles Schwab handled 25,000 real-time stock transactions per day from a large DCE deployment. 3M deployed distributed clients to 87,000 employees worldwide for DCE-based access to mainframes and legacy applications before the conversion to the company's intranet.

At about the same time, the Common Object Request Broker Architecture (CORBA) was a huge middleware project led by the Object Management Group; this group included more than 700 companies, everyone, it seems, except Microsoft, which had its own competing Distributed Component Object Model (DCOM). The underlying communications protocol used by CORBA was called the Internet Inter-ORB Protocol (IIOP). In October 1996, Marc Andreessen, Netscape's co-founder, unknowingly contributed to the hype when he said, "The next shift catalyzed by the Web will be the adoption of enterprise systems based on distributed objects and IIOP. IIOP will manage the communication between the object components that power the system. Users will be pointing and clicking at objects available on IIOP-enabled servers." [2] This did not happen with IIOP as he had envisioned, but it is precisely what is happening with Web services.

[2] The Netscape Web site. http://wp.netscape.com/columns/techvision/iiop.html

A major failing of CORBA was its complexity, which made it much less scalable than DCE. This failure occurred because of its attempt to adhere too purely to object-oriented philosophies. However, CORBA was very strong on security, and much of what is happening today in Web services security traces directly from CORBA security.

Note

The following quote, from the CORBA security specification introduction, gives a glimpse into how all the issues Web services security has to deal with were front and center in the minds of CORBA's developers:

"CORBA security services specify the authentication, authorization, and encrypting of messages. The user has all the instruments at his disposal. The span ranges from access-control-lists (ACL) for groups of objects (without impact on the application program) to very sophisticated and fine-grained mechanisms for sensitive data. [3] "

[3] See the CORBA security specification at http://www.omg.org/docs/formal/02-03-11.pdf.

We may never see Web services security reach a single 430-page specification, as CORBA's did, nor do we want to. However, when the specifications for XML Signature, XML Encryption, WS-Security, Security Assertion Markup Language (SAML), Kerberos, eXtensible Rights Markup Language (XrML), eXtensible Access Control Markup Language (XACML), and others are all combined, you can be sure the result exceeds the 430 pages CORBA used. The lessons learned and the innovations applied in CORBA security are clearly present in all these components of Web services security.

At about this same time, over at Microsoft, DCOM took the Component Object Model (COM), which was loosely based on DCE, and extended it to allow applications to be built from COM objects that resided on different networked machines. This model was platform-neutral as long as the platform was Win32. DCOM was transport-neutral but virtually always used TCP/IP. DCOM was too complex, too fine-grained, and too proprietary, and it was not scalable. However, it provided critical learning that went directly into Microsoft's .NET Web services.

The Web: The Global Network for Information Exchange

The Web forms the most pervasive distributed computing infrastructure ever envisioned. There are approximately 46 million registered Internet domains [4]. There are 172 million Internet hosts [5]. More than 500,000 of the servers on the Internet have Secure Sockets Layer (SSL) certificates to support Transport Layer Security [6]. There are in excess of 3 billion accessible Web page documents [7]. Across all corporations, for every computer supporting the company's presence on the Web, we postulate that there are, on average, at least 10 completely internal servers also using the same Web technologies. If this is true, then as Web services become pervasive on internal, not just external, servers, this new form of middleware could theoretically affect almost 2 billion servers.

[4] Netcraft Web server survey, January 2004, www.netcraft.com

[5] The Internet Domain Survey, January 2003, http://www.isc.org/ds/WWW-200301/

[6] Netcraft SSL survey, http://www.netcraft.com

[7] Search Engine Showdown, http://www.searchengineshowdown.com/stats/sizeest.shtm

The Web is based on text messages; traditionally, they have been HTML documents, but increasingly they are XML dialect documents. Virtually all applications can deal with text messages.

The Web is also based on resources and links. Hypertext represents the links. The Web has exploited hypertext to create an incredible collection of interconnected documents based on unique identifiers. Those identifiers are called uniform resource identifiers (URIs). A URI is a compact, formatted, text-based name that uniquely identifies a resource. A URL, on the other hand, substitutes locator for identifier and is a URI that is bound to a physical network address.

You interact with these Web resources via protocols such as HTTP, SMTP, and FTP for Web browsing, email, and file transfer, respectively. The first part of a URL can indicate the application protocol to use when interacting with the resource, as in http://www.sams.com.

Virtually every organization has invested in computing infrastructure (or pays fees to someone who has done so on its behalf) to support its participation in the Web. Middleware that leverages this enormous investment and this enormous commitment to a set of ubiquitous international standards will enjoy an immediate acceptance unlike anything in computing history. This is the main reason for Web services' rapid adoption.

Web services were defined to communicate using the Internet and the infrastructure created for the Web (links, resources, transport protocols, security, and firewalls). Not surprisingly, Web Services Security begins with the security developed for the Web. We will discuss SSL and other building blocks for Web Services Security later.

Early Web Services Using HTTP POSTing as an API

You have the Web, you have a Web site and the servers to run it, and you have a Web-facing application that uses HTML forms displayed in your site visitors' browsers. You realize that not just humans using browsers need to access that form, but your business partner's application does as well. This happens because your business partner wants to incorporate your Web application directly into one of its own. What do you do?

If the set of standards and technologies currently called Web services did not yet exist to integrate your applications with theirs, you would do what thousands of developers did: directly co-opt the Web as middleware using HTTP POST technology.

From the formal HTTP specification:

"POST is a standard method in HTTP used to request that the server accept the entity enclosed in the request as a new subordinate of the resource identified in the Request-URI." [8]

Listing 2.1 shows what the POST looks like for a simple HTML-based form posted on a Web site.

Listing 2.1. An HTTP POST Header for a Simple HTML Form

POST /companyForm HTTP/1.1
Host: www.awl.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 63

firstname=Jothy&lastname=Rosenberg&email=jothy@acm.org&sex=male

HTTP POST used in this way creates a Web API. It treats a Web form normally accessed interactively through a browser as a programmatic interface accessed by another application from a remote location. Effectively, this is an RPC. As an API, it is very brittle: one tiny change to a form and it breaks. However, compared to DCE's RPC mechanism, this is much further up the protocol stack, so it is firewall friendly and much easier to support in the IT infrastructure. Companies used POST as an API because they were desperate. The fact that so much of this was going on was a strong impetus for nailing down the Web services standards and putting them into use.

As you will see later, SOAP, one of the key standards enabling Web services, uses the HTTP POST mechanism, which is what lets SOAP slide into IT environments so easily.

[8] RFC 2616: Hypertext Transfer Protocol, HTTP/1.1. ftp://ftp.isi.edu/in-notes/rfc2616.txt
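
To make this concrete, the following is a minimal sketch of what a SOAP request carried by an ordinary HTTP POST looks like. It is modeled on the stock-quote example in the SOAP 1.1 specification; the host, endpoint, and operation names are illustrative placeholders, not a real service:

POST /StockQuote HTTP/1.1
Host: www.stockquoteserver.com
Content-Type: text/xml; charset="utf-8"
SOAPAction: "http://www.stockquoteserver.com/GetLastTradePrice"

<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="http://www.stockquoteserver.com/stocks">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

On the wire, this request looks to firewalls and Web servers just like the form POST in Listing 2.1; only the payload differs, which is exactly why SOAP deploys so easily.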

The Inevitability of Web Services

You may wonder why Web services are being developed now. The reason is that the entire software industry has finally decided that there is more value in firms and consumers easily sharing information than in keeping data locked away. This is what happens in any maturing industry: making sure one vendor's product works with everybody else's product. That's right, standards. Think about the light bulb socket, two-by-four lumber, and standard-gauge rail beds; each of those standards led to an explosion of use. That is what we and many others believe we are witnessing now with Web services.

There is an inevitability about Web services. They are not just the newest fad that is being overhyped (although they are being overhyped). This is the natural evolution of the Web, of distributed computing, and of e-business needs. As e-business has progressed from islands of marketing information about products and services to islands of credit-card retail stations to portals, the Web has been evolving from an information-centric platform to an application-centric platform. This is not a replay of the CORBA fiasco. The unification and the sheer number of vendors working on these services dwarf what was accomplished with CORBA, DCOM, and DCE combined. Every tool, every major language, every vendor, even every enterprise is jumping into Web services because each one sees them as an approach that will work.

As with all previous successful shifts in development paradigm, real success comes only when programmers adopt something, not when business people alone see the benefits. Web services give developers tools that can locate and utilize all functions and interfaces for interesting applications within and between organizations. When you deliver innovation to developers that actually increases their productivity in a major way, you see rapid adoption, which is the only thing that can shift the entire computing industry to a new application development paradigm like Web services. The same thing happened with Unix, C++, client/server computing, Java, and XML, to name just a few radical developer productivity multipliers.

What we want from Web services is both "perfect" enterprise application integration and "perfect" distributed computing in the same package of standard protocols, tools, languages, and interfaces. That grandiose vision may be unrealistic. But developers will not be fooled by hype and will continue to push these standards and aim for these goals. Several standards are still incomplete or immature, especially in the security area, the focus of this book. The development and management tools are still quite immature as well. However, the pent-up demand that met Java's introduction, an unprecedented adoption rate 10 times that of C++, seems to be at least matched by the demand for Web services.

How We Got Here

It may at times seem as though Web services either just suddenly appeared out of nowhere or, at a minimum, evolved from the basic standards upon which the browser-based Web was built. Neither is true.

Web services are a natural evolutionary result of decades of work on distributed computing and, as with those previous generations, were driven by a continuing application integration crisis. The need to integrate all applications remains and, if anything, grows stronger because the current "integration" is frequently a human who copies information from one application's screen to another.

Successive generations of middleware evolved through experimentation, with each generation building on the strengths and lessons learned from those preceding it. RPC mechanisms evolved in this fashion. So did message-oriented middleware. The vision of a service-oriented architecture appeared early but remained quite elusive.

Then came the explosion of the Web, which began in earnest in 1994 when TCP/IP connectivity landed on everyone's desktops and Tim Berners-Lee's HTTP and HTML combined with Marc Andreessen's Mosaic and Netscape browsers to create a new meaning for the word surfing. The pervasiveness of the Web is what made clear how the next generation of middleware should be built: It should leverage all the technology, standards, and infrastructure that already existed in most organizations supporting the Web.

Client/server was a major paradigm shift in how applications should be constructed, and it had its biggest impact on companies that for the first time started connecting all employees to the network to access centralized application resources. But it didn't take long for frustration with fat clients to build. Getting them installed and configured for all the people who needed them, updating them when new versions appeared, and providing training and support, because each one had its own ideas about how user interfaces should work, became so prohibitively expensive and time-consuming that companies and users began to rebel. The alternative was browser-based applications, sometimes called portals when they combine information from numerous sources or applications. Many, if not most, of the issues of fat clients were solved, but thin clients created their own new frustrations. The lack of interactivity and slow response times were huge backward steps from the high productivity that users of fat clients were accustomed to. In effect, browser-based applications with their Submit buttons felt very much like the old "green screen" applications of mainframe days.

There were some interesting early attempts at what are now called Web services. The first was the invention of browser frames. With frames, a site visited by a browser could create a new frame in the user's browser and redirect just that portion of the screen to a new URL. The original site remained in control of the overall browser, its outermost frame, and its security. Sites wanted to retain their users (a concept called stickiness) while still "integrating" other sites into the overall information presented to the browser's user. CGI was another example of stretching existing capabilities to deliver what is expected now from Web services. CGI allowed a browser to initiate execution of any program, as long as a script to do so was installed and accessible on the Web server.

The most telling of all examples of the need for Web services and the almost desperate attempts to simulate them was the extensive use of HTTP POST as an integration scheme. Site A that had a Web application humans normally accessed through browsers could be integrated into site B's Web application if site B could "screen scrape" all the interactions necessary to make site A think a browser was interacting with it. This result was accomplished by having site B send the same HTTP POSTs to site A that would have been sent by a browser directly accessing it.

Along came XML to solve the problem of how different applications, organizations, and databases reach agreement on the way the data they want to interchange should be structured and defined. XML made business information transportable. This was a huge contrast to client/server applications, each of which defined its database schema and the binary data being transported unto itself, with no eye toward integration with other applications, much less other organizations. But business processes never had and never will stay isolated within only one application. Companies were driving cost and inefficiencies out of their business processes, and they were looking at their supply chains, where paper-based processes ruled.

As companies became the dominant force behind application development, their intolerance for lack of interoperability between platforms and applications became dominant as well. This was one of the reasons for Java's unprecedented acceptance, growth, and displacement of C++ as the enterprise's application development language of choice. IT budget pressures also drove companies to demand better ways to deal with their application integration crises.

The drive for platform neutrality, language independence, and interoperability in the face of heterogeneity finally reached the major vendors. For the first time, IBM, Microsoft, and others who joined them drove a set of middleware standards that were common, consistent, and quickly supported in their respective tools and platforms. This started with the collaboration between Microsoft and IBM to drive SOAP. It is fair to say that, the moment SOAP was proposed as a standard from these two vendors, Web services were born.

Security Challenges

A Web service is middleware that uses the Web infrastructure. It integrates applications inside and outside the organization. It enables a service-oriented architecture. Distributed computing has always had a challenging set of security issues. In this section, we explore the security challenges brought on by Web services.

Identities

One of the greatest security challenges brought on by Web services revolves around identities. Web services transport potentially unknown identities into your organization. These identities are not individuals directly connected to your computer; they are connected to someone else's computer, which presents those identities on their behalf. Their identities are essentially attached to the Web service messages. Say a 401(k) provider offers a Web service to employers, who then have the problem of authenticating an employee and passing the authenticated identity of that employee from the company intranet portal back through the Web service to the 401(k) provider. Who are these individuals really? What services are they requesting? Are they authorized to do so? These are the security questions that must be answered for the 401(k) scenario to play out.
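
As a minimal sketch of where this book is headed, the following shows one way such an identity travels with a message: a WS-Security UsernameToken carried in the SOAP header. The namespace URIs are those of the OASIS 2004 WS-Security standard; the username, digest, and timestamp values are hypothetical placeholders:

<SOAP-ENV:Header>
  <wsse:Security
      xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"
      xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">
    <wsse:UsernameToken>
      <wsse:Username>jdoe</wsse:Username>
      <wsse:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordDigest">weYI3nXd8LjMNVksCKFV8t3rgHh3Rw==</wsse:Password>
      <wsu:Created>2004-03-15T09:30:00Z</wsu:Created>
    </wsse:UsernameToken>
  </wsse:Security>
</SOAP-ENV:Header>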

Another big problem created by Web services is asset protection. It is a sad but true fact that the vast majority of corporate intellectual property compromises are perpetrated from inside the organization. Web services can potentially make this problem worse unless you secure them first. A critical question, then, is this: What proprietary information is leaving the organization at the request of these persons? After they are done, can it be proven that they did these things?

Web services security problems related to identities are such a critical issue that we discuss them in several places in this book. All of Chapter 6, "Portable Identity, Authentication, and Authorization," is about portable identities and the authentication and authorization of those identities. In addition, Chapter 7, "Building Security into SOAP," focuses on how the WS-Security standard represents an identity and attaches it to a SOAP message.

Messages

Web services, at their core, are messages being transported from one place to another. Securing these messages is a variant of classic information security. The messages themselves, or things they refer to, need to be secured. That means confidentiality and integrity, two core concepts of information security. Because these messages are sometimes legal documents, they may need to be signed, so you need non-repudiation to be able to prove that a transaction took place, who initiated it, and when. These security concepts are explored in depth in Chapter 3, "The Foundations of Distributed Message-Level Security."
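
As a preview of the machinery behind integrity and non-repudiation, the following skeleton shows the shape an XML Signature takes; the digest and signature values are elided, and the #Body reference assumes the signed SOAP body carries a matching Id attribute:

<ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
  <ds:SignedInfo>
    <ds:CanonicalizationMethod
        Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
    <ds:SignatureMethod
        Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
    <ds:Reference URI="#Body">
      <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
      <ds:DigestValue>...</ds:DigestValue>
    </ds:Reference>
  </ds:SignedInfo>
  <ds:SignatureValue>...</ds:SignatureValue>
</ds:Signature>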

Service-Oriented Architectures

Service-oriented architectures provide software as a service, which leads to composite applications constructed from a collection of reusable service components. SOAs built with Web services result in multi-hop message flows, as one Web service calls another service to handle one piece of functionality before the result is passed on to yet another service in the chain. The information transported this way must be kept confidential from all the entities that touch it along the way. The essence of SOA is that, instead of having software systems deployed as functionality accessed through specialized client software, functionality is accessed by sending the SOA application a request to which it responds. Stated another way, an application's full functionality is accessible through a request-response paradigm. The target application provides a service. The communication with this service is expected to be a Web service. If the service is not free, authentication is needed to make sure only authorized users access the service. If the content is confidential, responses to service requests may need to be kept confidential in transit, or perhaps the authenticity of the information's source may need to be proved in some secure way. Figure 2.2 shows how many different applications used by different types of users might all access a shared service.

Figure 2.2. Service-oriented architecture diagram showing different types of applications accessing a shared service.
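
As a sketch of the message-level confidentiality such multi-hop flows require, XML Encryption can replace a sensitive element's content with ciphertext that intermediate services relay but cannot read. The CreditCard element and cipher value here are hypothetical placeholders:

<CreditCard>
  <xenc:EncryptedData xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
      Type="http://www.w3.org/2001/04/xmlenc#Content">
    <xenc:EncryptionMethod
        Algorithm="http://www.w3.org/2001/04/xmlenc#tripledes-cbc"/>
    <xenc:CipherData>
      <xenc:CipherValue>A23B45C56d78...</xenc:CipherValue>
    </xenc:CipherData>
  </xenc:EncryptedData>
</CreditCard>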

At this point, we are ready to delve into the core component standards that make up Web services. We will cover each of these topics generally, to be complete, but in all cases we will show how the various standards interrelate and make sure the security context of each standard is plain. First, we will build on the description of XML, which was briefly introduced in Chapter 1, "Basic Concepts of Web Services Security."

