Web Services and Distributed Programming
Distributed programming is like network programming: only the audience is different. The point of network programming is to let a human control a computer across the network. The point of distributed programming is to let computers communicate between themselves.
Humans use networking software to get data and use algorithms they don't have on their own computers. With distributed programming, automated programs can get in on this action. The programs are (one hopes) designed for the ultimate benefit of humans, but an end user doesn't see the network usage or even necessarily know that it's happening.
The simplest and most common form of distributed programming is the web service. Web services work on top of HTTP: they generally involve sending an HTTP request to a certain URL (possibly including an XML document), and getting a response in the form of another XML document. Rather than showing this document to an end user the way a web browser would, the web service client parses the XML response document and does something with it.
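As a minimal sketch of that client-side flow, the code below parses an XML response with REXML and extracts the fields a program would act on. The document layout, the element names, and the `fetch_xml` helper are hypothetical, invented for illustration; a real service defines its own format and URL.

```ruby
require 'net/http'
require 'uri'
require 'rexml/document' # ships with Ruby (as a bundled gem in recent versions)

# Fetch the raw XML body from a web service URL.
# Hypothetical helper: it is not called below, so the sketch runs offline.
def fetch_xml(url)
  Net::HTTP.get(URI.parse(url))
end

# Turn a (hypothetical) <books> response document into an array of hashes.
def parse_books(xml)
  doc = REXML::Document.new(xml)
  doc.elements.to_a('//book').map do |book|
    { title: book.elements['title'].text,
      price: book.elements['price'].text.to_f }
  end
end

# A canned response standing in for what a service might return.
SAMPLE_RESPONSE = <<~XML
  <?xml version="1.0"?>
  <books>
    <book><title>Ruby Cookbook</title><price>49.99</price></book>
    <book><title>Programming Ruby</title><price>44.95</price></book>
  </books>
XML

books = parse_books(SAMPLE_RESPONSE)
books.each { |b| puts "#{b[:title]} costs #{b[:price]}" }
```

In a real client, the string passed to `parse_books` would come from `fetch_xml` rather than from a constant.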
We start the chapter with a number of recipes that show how to provide and use web services. We include generic recipes like Recipe 16.3, and recipes for using specific, existing web services like Recipes 16.1, 16.6, and 16.9. The specific examples are useful in their own right, but they should also help you see what kind of features you should expose in your own web services.
There are three main approaches to web services: REST-style services,[1] XML-RPC, and SOAP. You don't need any special tools to offer or use REST-style services. On the client end, you just need a scriptable web client (Recipe 14.1) and an XML parser (Recipes 11.2 and 11.3). On the server side, you just write a web application that knows how to generate XML (Recipe 11.9). We cover some REST philosophy while exploring useful services in Recipe 16.1 and Recipe 16.2.
[1] Why am I saying "REST-style" instead of REST? Because REST is a design philosophy, not a technology standard. REST basically says: use the technologies of the web the way they were designed to work. A lot of so-called "REST Web Services" fall short of the REST philosophy in some respect (the Amazon web service, covered in Recipe 16.1, is the most famous example). These might more accurately be called "HTTP+XML" services, or "HTTP+POX" (Plain Old XML) services. Don't get too hung up on the exact terminology.
REST is HTTP; XML-RPC and SOAP are protocols that run on top of HTTP. We've devoted several recipes to Ruby's SOAP client: Recipes 16.4 and 16.7 are the main ones. Ruby's standalone SOAP server is briefly covered in Recipe 16.5. Rails provides its own SOAP server (Recipe 15.18), which incidentally also acts as an XML-RPC server.
XML-RPC isn't used much nowadays, so we've just provided a client recipe (Recipe 16.3). If you want to write a standalone XML-RPC server, check out the documentation at http://www.ntecs.de/projects/xmlrpc4r/server.html.
You can use a web service to store data on a server or change its state, but web service clients don't usually use the server to communicate with each other. Web services work well when there's a server with some interesting data and many clients who want it. They work less well when you want to get multiple computers to cooperate, or distribute a computation across multiple CPUs.
This is where DRb (Distributed Ruby) comes in. It's a network protocol that lets Ruby programs share objects, even when they're running on totally different computers. We cover a number of the possibilities, from simple data structure sharing (Recipe 16.10) to a networked application (Recipe 16.18) that, after the initial connection, has no visible networking code at all.
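To give a feel for what sharing an object looks like, here is a minimal sketch that publishes an array through DRb and then modifies it through a proxy. The variable names and the choice of an array are assumptions made for illustration, and both ends run in one process only so the example is self-contained.

```ruby
require 'drb/drb'

# The object the server shares: a plain array acting as a message board.
board = []

# Port 0 asks the operating system for any free port; DRb.uri reports
# the address that was actually bound.
DRb.start_service('druby://localhost:0', board)
uri = DRb.uri

# A client needs only the URI. In real use this part lives in a different
# program, possibly on a different machine. Because both ends share one
# process here, DRb may dispatch the calls locally instead of over the socket.
remote_board = DRbObject.new_with_uri(uri)
remote_board << 'hello from the client'
remote_board << 'no visible networking code here'

puts board.inspect
remote_size = remote_board.size

DRb.stop_service
```

After the call to `DRbObject.new_with_uri`, the proxy is used like any local array, which is the sense in which the networking code disappears.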
Distributed programming with DRb is a lot like multithreaded programming, except the "threads" are actually running on multiple computers. This can be great for performance. On a single CPU, multithreading makes it look like two things are happening at once, but it's just an illusion. Run two "threads" on different computers, and you can actually do twice as much work in the same time. You just need to figure out a way to split up the work and combine the results.
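The split-and-combine pattern can be sketched in a few lines. This example uses local threads as stand-ins for remote machines, so it shows only the shape of the technique, not a real speedup; the helper names `split_work` and `sum_of_squares` are hypothetical.

```ruby
# Divide a list of work items into roughly equal chunks, one per worker.
def split_work(items, n_workers)
  size = [(items.size / n_workers.to_f).ceil, 1].max
  items.each_slice(size).to_a
end

# The job each worker performs on its own chunk.
def sum_of_squares(chunk)
  chunk.sum { |n| n * n }
end

numbers = (1..100).to_a
chunks = split_work(numbers, 4)

# Each thread stands in for a remote worker; Thread#value waits for the
# worker to finish and returns its partial result.
partials = chunks.map { |chunk| Thread.new { sum_of_squares(chunk) } }
                 .map(&:value)

# Combine the partial results into the final answer.
total = partials.sum
puts total
```

With DRb, the body of each thread would instead call a method on a remote worker object, while the splitting and combining steps would stay the same.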
That's the tricky part. When you start coordinating computers through DRb, you'll run into concurrency problems and deadlock: the same problems you encounter when you share data structures between threads. You can address these problems using the same techniques that worked in Recipes 20.4 and 20.11. You'll also encounter brand new problems, like the tendency of machines to drop off the network at unfortunate times. These are more troublesome, and the solutions usually depend on the specific tasks you've assigned the machines. Recipe 16.10, the first DRb recipe, provides a brief introduction to these problems.
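One common technique for the shared-data half of the problem is to guard each read-modify-write step with a mutex. The sketch below assumes that approach (whether it matches the cited recipes exactly is an assumption) and uses a hypothetical `SharedCounter` class. It runs with local threads, but the same object could be published through DRb, where several clients may call it concurrently.

```ruby
# A counter that is safe to share between threads.
class SharedCounter
  attr_reader :value

  def initialize
    @value = 0
    @lock = Mutex.new
  end

  # Only one caller at a time may run the read-modify-write step.
  def increment
    @lock.synchronize { @value += 1 }
  end
end

counter = SharedCounter.new

# Ten concurrent callers, each incrementing the counter 1000 times.
threads = 10.times.map do
  Thread.new { 1000.times { counter.increment } }
end
threads.each(&:join)

puts counter.value
```

A mutex does nothing for the network failures mentioned above; those need task-specific handling such as timeouts or retries.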