Parsing HTTP Requests

The HTTPEchoProtocol class in Example 4-1 provides an interesting glimpse into HTTP in action, but it's a long way from being ready for use in a real web server. It doesn't even parse the request to figure out what resource the client is trying to access, or what HTTP method she's using. Before you try to build a real web application, you need a better way to parse and respond to requests. This lab shows you how.

4.2.1. How Do I Do That?

Write a subclass of twisted.web.http.Request with a process method that processes the current request. The Request object will already contain all the important information about an HTTP request when process is called, so all you have to do is decide how to respond. Example 4-2 demonstrates how to run an HTTP server based on a subclass of http.Request.

Example 4-2. requesthandler.py

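    from twisted.web import http


    class MyRequestHandler(http.Request):
        # Keys and values are byte strings: Twisted exposes the request
        # path as bytes and expects bytes when writing the response body.
        pages = {
            b'/': b'<h1>Home</h1>Home page',
            b'/test': b'<h1>Test</h1>Test page',
        }

        def process(self):
            # Tell the browser that the body is HTML.
            self.setHeader(b'Content-Type', b'text/html')
            if self.path in self.pages:
                self.write(self.pages[self.path])
            else:
                self.setResponseCode(http.NOT_FOUND)
                self.write(b'<h1>Not Found</h1>Sorry, no such page.')
            # Signal that the response is complete.
            self.finish()


    class MyHttp(http.HTTPChannel):
        # Build a MyRequestHandler for every incoming request.
        requestFactory = MyRequestHandler


    class MyHttpFactory(http.HTTPFactory):
        protocol = MyHttp


    if __name__ == "__main__":
        from twisted.internet import reactor
        reactor.listenTCP(8000, MyHttpFactory())
        reactor.run()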

Run requesthandler.py and it will start up a web server on port 8000. You should be able to view both the home page (http://localhost:8000/) and the page /test (http://localhost:8000/test) in your browser. Figure 4-2 shows you how the page /test will look in your browser.

Figure 4-2. A page generated by the requesthandler.py web server

If you attempt to load any other page, you should get an error message, as shown in Figure 4-3.

Figure 4-3. The requesthandler.py 404 page

 

4.2.2. How Does That Work?

The http.Request class parses an incoming HTTP request and provides an interface for working with the request and generating a response. In Example 4-2, MyRequestHandler is a subclass of http.Request that provides a custom process method. The process method is called after the request has been completely received. It is responsible for generating a response and then calling self.finish() to indicate that the response is complete. MyRequestHandler uses the path attribute to find out which path is being requested, and attempts to find a matching path in the pages dictionary. If a matching page is found, MyRequestHandler uses the write method to send back the text of the page as the response.

Note that write is used only to write the body portion of the response, not to generate the raw HTTP response itself. The setResponseCode method can be used to change the HTTP status code. The twisted.web.http module provides constants for all the status codes defined by HTTP, so you can write http.NOT_FOUND instead of 404.

Request.setResponseCode takes an optional second argument, a human-readable status message. Feel free to leave this out: the twisted.web.http module includes a built-in list of descriptions for common status codes, which it uses by default.

The Request class also provides a setHeader method for adding headers to the response. MyRequestHandler uses setHeader to set the Content-Type header to text/html; this setting tells the browser that the response body is in HTML format.

The twisted.web.http module provides two additional classes that you'll need to turn your subclass of Request into a functioning web server. The HTTPChannel class is a Protocol that creates a Request object for each request it receives on a connection. To make the HTTPChannel use your subclass of Request, override the requestFactory class attribute. HTTPFactory is a ServerFactory that adds some extra features, including a log method that takes a Request object and generates a log message in the standard Combined Log Format used by Apache and other web servers.
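For readers who haven't seen it, a Combined Log Format entry records the client host, a timestamp, the request line, the status code, the response size, the referrer, and the user agent. The function below is a standalone sketch of that layout, not Twisted's own formatter; combined_log_line and its parameters are hypothetical names, and the sample values are made up for illustration.

```python
def combined_log_line(host, timestamp, request_line, status, size,
                      referrer="-", user_agent="-"):
    """Format one entry in the Combined Log Format (illustrative helper).

    The two literal dashes after the host stand for the identd and
    authenticated user fields, which are usually unknown.
    """
    return '%s - - [%s] "%s" %d %s "%s" "%s"' % (
        host, timestamp, request_line, status, size, referrer, user_agent)


line = combined_log_line(
    "127.0.0.1",
    "10/Oct/2000:13:55:36 -0700",
    "GET /test HTTP/1.1",
    200,
    9,
)
print(line)
```

A dash is the conventional placeholder for any field that has no value, which is why the referrer and user agent default to "-".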
