Professional Java Servlets 2.3
So far we have looked at the J2EE's approach to modularization and development of J2EE applications. However, because we are focusing upon servlets in this book, to understand the roles they perform we need to examine the J2EE web tier in more detail. We have seen that J2EE containers provide services and an environment for application components, so we need to examine the web container and the components that can be deployed into, and managed by, the web container.
J2EE web components are usually JSP pages or servlets (they may also be filters, or lifecycle event listener classes, or tag libraries). The container has the responsibility of instantiating, initializing, calling, and destroying the web components deployed into it. It may create a pool of instances of a component, or it may execute methods on the instance in multiple threads corresponding to multiple requests.
Java Servlets
Servlets are Java classes that dynamically process requests and construct responses. In practice this often means that they dynamically generate HTML web pages in response to requests. However, they may also send data in other formats to clients, such as serialized Java objects (to applets and Java applications) and XML. Servlets run in a servlet container, and have access to the services provided by that container.
The client of the servlets can be a browser, applet, Java application or any other client that can construct a request (normally an HTTP request that the servlet can recognize and respond to) and receive the response to it.
From the servlet's point of view the request must be properly formatted at the basic level (for example, if we are using an HTTP servlet it must be an HTTP request) and at the higher level where the servlet may expect certain data in a specific format from the client. Any client that correctly prepares the request will receive the appropriate response based on the processing logic in the servlet. Servlets should also be prepared to handle incorrectly configured requests, but it is the programmer's responsibility to decide how to handle this.
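As a minimal sketch of the higher-level check described above, the following plain Java class validates one expected parameter and chooses an HTTP-style status code. The parameter name quantity and the ValidationResult and QuantityRequestCheck classes are assumptions made for illustration. In a real HTTP servlet the value would typically come from request.getParameter() and the failure would be reported with response.sendError(); here a Map stands in for the request so that the sketch runs without a container.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical result type: an HTTP-style status code plus a message body. */
final class ValidationResult {
    final int status;
    final String body;

    ValidationResult(int status, String body) {
        this.status = status;
        this.body = body;
    }
}

final class QuantityRequestCheck {
    /**
     * Checks that the request carries a positive integer "quantity" parameter.
     * The parameter name is an assumption made for this sketch.
     */
    static ValidationResult check(Map<String, String> params) {
        String raw = params.get("quantity");
        if (raw == null || raw.trim().isEmpty()) {
            return new ValidationResult(400, "Missing parameter: quantity");
        }
        try {
            int quantity = Integer.parseInt(raw.trim());
            if (quantity <= 0) {
                return new ValidationResult(400, "quantity must be positive");
            }
            return new ValidationResult(200, "Ordered " + quantity + " item(s)");
        } catch (NumberFormatException e) {
            // The request was well-formed HTTP but carried data in the wrong format
            return new ValidationResult(400, "quantity is not a number: " + raw);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("quantity", "3");
        ValidationResult ok = check(params);
        System.out.println(ok.status + " " + ok.body);

        params.put("quantity", "three");
        ValidationResult bad = check(params);
        System.out.println(bad.status + " " + bad.body);
    }
}
```

Returning a 400-class status for bad input is only one possible policy; as noted above, how to handle an incorrectly formed request remains the programmer's decision.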
Servlet Lifecycle
Generally the lifecycle of a servlet is as follows:
1. The container is responsible for ensuring that the servlet is initialized before it processes requests.
2. Servlet components then receive requests from the client tier. The container actually receives the request, transparently maps the request to the appropriate component instance, and passes the component properly-formatted request and response objects.
3. The servlet then processes the request, normally either with the help of the business tier logic (EJBs) or by retrieving information directly from the database or enterprise information tier.
4. Once processing has been completed a response is returned to the client tier.
5. Finally, the container is responsible for destroying any servlet instances that it has created.
Steps 1 and 5 (initialization and destruction) execute once only, but steps 2, 3, and 4 will loop many times to process many requests.
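The five steps above can be sketched in plain Java. MiniServlet, LoggingServlet, and MiniContainer are hypothetical stand-ins written for this illustration; they are not the real javax.servlet types, whose init() and service() methods take container-supplied configuration, request, and response objects. The sketch only shows the shape of the lifecycle: one initialization, many service calls, one destruction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a servlet; NOT the real javax.servlet.Servlet interface. */
interface MiniServlet {
    void init();
    String service(String request);
    void destroy();
}

/** Records each lifecycle call so that the order can be inspected afterwards. */
final class LoggingServlet implements MiniServlet {
    final List<String> events = new ArrayList<>();

    public void init() {
        events.add("init");
    }

    public String service(String request) {
        events.add("service:" + request);
        return "response to " + request;
    }

    public void destroy() {
        events.add("destroy");
    }
}

/** Hypothetical container: initializes once, dispatches many requests, destroys once. */
final class MiniContainer {
    static List<String> run(MiniServlet servlet, List<String> requests) {
        List<String> responses = new ArrayList<>();
        servlet.init();                           // step 1: executed once
        try {
            for (String request : requests) {     // steps 2 to 4: once per request
                responses.add(servlet.service(request));
            }
        } finally {
            servlet.destroy();                    // step 5: executed once
        }
        return responses;
    }

    public static void main(String[] args) {
        LoggingServlet servlet = new LoggingServlet();
        run(servlet, Arrays.asList("/a", "/b"));
        System.out.println(servlet.events);       // [init, service:/a, service:/b, destroy]
    }
}
```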
Servlet Communication
Communication between a servlet and the outside world may occur at four points within a web application:
- With the client, during a request/response cycle
- With the servlet container, to access information about the container environment or to access JNDI resources
- With other resources on the server, such as other servlets, EJBs, and so on
- With external resources in order to fulfill the request, including databases, legacy systems, and EIS
Generally, a servlet's role is to communicate with the client. Communication with the container is unlikely to produce data to return to the client; instead, it gives the servlet access to resources that can provide a service. To be truly useful, a servlet is likely to also communicate with either other server components or external backend resources (often a database).
Servlet-Client Communication
Communication with the client can take many forms, with the most popular being text-based communication. In HTTP communication, some or all of the information parameters will be supplied as part of the request. In the server's response, there is a range of possible formats to return the data in, depending on the client.
The most obvious is the HTML page for display in the browser. This is the markup language of the Web and is ideal for business-to-consumer e-commerce sites. Alternatively, if the client is a mobile phone, WML (Wireless Markup Language) is optimal. If you wish to send data back for editing in a spreadsheet-type package, a CSV (comma-separated values) file is a good option, and if you set the MIME type correctly the client's browser may open the client's spreadsheet program to process the file. For communication with systems written in languages other than Java, XML is rapidly becoming the preferred option. It is also suitable for Java client applets or applications where data forms the substantive part of the communication.
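As an illustration of the CSV option, the sketch below builds a CSV body in plain Java and names the MIME type that would accompany it. CsvBody is a hypothetical helper, not part of the Servlet API. In a servlet, we would assume the content type is set with response.setContentType() before the body is written to the response.

```java
import java.util.Arrays;
import java.util.List;

final class CsvBody {
    /** MIME type for CSV data; a servlet would set this as the response content type. */
    static final String CONTENT_TYPE = "text/csv";

    /** Quotes a field only if it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are doubled, then the whole field is wrapped in quotes
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins the rows into a CSV document with CRLF line endings. */
    static String toCsv(List<List<String>> rows) {
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    out.append(',');
                }
                out.append(escape(row.get(i)));
            }
            out.append("\r\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("item", "price"),
                Arrays.asList("Widget, large", "9.99"));
        System.out.println("Content-Type: " + CONTENT_TYPE);
        System.out.print(toCsv(rows));
    }
}
```

Whether the browser actually hands the file to a spreadsheet program depends on how the client is configured, so the content type should be treated as a hint rather than a guarantee.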
Serialized Java objects can also be exchanged and may be the best option when data objects are already created on the server or client. They need no parsing (unlike XML or other text formats) and are ready to go. For an application with changing requirements, modifications to only client- or server-side code do not affect the other side - only changes to serialized object classes affect both. In this case it is usually good practice to separate such classes into a separate common package so that they can be clearly identified during early stages of testing and releases (and minor modifications).
Problems sometimes occur when server- (or client-) side modifications affect common classes, and the deployer might overlook this and update the server (or client) classes only. Then the application breaks because the client- and server-side versions of the common classes clash.
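The following sketch shows a round trip of a serialized object using only the standard java.io classes. OrderSummary is a hypothetical example of a class that would live in the shared common package; its field names are invented for illustration. Declaring an explicit serialVersionUID is one common way to make the compatibility of the client and server copies a deliberate decision: if the two sides hold versions of the class with different values, deserialization fails with an InvalidClassException rather than silently misreading the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical shared class that both client and server would compile against. */
final class OrderSummary implements Serializable {
    // Pinning the version makes an incompatible change an explicit decision
    private static final long serialVersionUID = 1L;

    final String customer;
    final int itemCount;

    OrderSummary(String customer, int itemCount) {
        this.customer = customer;
        this.itemCount = itemCount;
    }
}

final class SerializationRoundTrip {
    /** Serializes a value to bytes, as the sending side would before writing a response. */
    static byte[] toBytes(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Rebuilds the object from bytes, as the receiving side would. */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = toBytes(new OrderSummary("Ada", 3));
        OrderSummary copy = (OrderSummary) fromBytes(wire);
        System.out.println(copy.customer + " ordered " + copy.itemCount + " item(s)");
    }
}
```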
Servlets can also return files, either existing files or ones created specifically. By setting the appropriate MIME type we can dynamically serve many different files.
Implementing the Servlet Specification
The Servlet specification is just that, a specification. It outlines what a servlet container should do, what services it must provide, and a set of APIs (the javax.servlet and javax.servlet.http packages) consisting of classes and interfaces that it must implement. It does not build or provide a servlet container per se, but defines the framework within which one may be built.
It also outlines what a servlet component deployed in a container can do and what services it should expect from the container in which it is deployed. This means that standard servlets (or servlet components, if you like) can be deployed on any fully-compliant implementation of the Servlet specification.
This leads to two questions:
- Who implements or builds the container?
  Vendors that want to incorporate Servlet technologies into their product (usually some flavor of web server) implement the container. They provide the container as part of their product so that programmers can develop servlet-based web applications.
- Who implements or builds the servlet components?
  Programmers who want to develop web applications for the enterprise using Java/servlet technology build servlets and assemble them into web applications. Any servlets or web applications built within the standard Java and Servlet specification will run within any container that implements the servlet specifications. Of course this also assumes that the web application and container are matched to the same version of the specification.
However, the following questions then arise:
- Why define the container so that vendors must implement their own version of the container?
- Why do we need implementations other than the reference implementation?
To answer the second question first, the reference implementation is only that - a reference implementation. It may be freely used, adapted, enhanced and/or improved (subject to the license). Developers use the reference implementations to gain experience of the technologies. This experience is portable onto other servers that meet the same set of specifications. It is not necessarily the best implementation for a specific purpose, and vendors often implement their own version of the servlet container (or adapt/include the reference version) so that they can optimize its performance for the specific purpose.
This also answers the first question of why the specification is defined so as to allow different implementations. Vendors develop their own implementations and add enhancements that differentiate their version from competitors' versions. They may also implement the full J2EE specifications, or only the Servlet specification. In particular they may add additional web container management tools that enhance the administrator's or developer's ability to deploy or redeploy components/web applications without having to restart the server.
In reality, this competition is generally a good thing for users, because it gives users a choice of containers that may have varying performance characteristics and possibly additional services. We, as programmers and users of these implementations (at least of Tomcat), have specific application requirements, including:
- Fast database performance
- Additional services or APIs
- Cost
- Reliability
The choice of implementations allows us the maximum flexibility (within a given budget) to choose the best implementation for our requirements. However, we need to be aware that using any additional non-standard services or APIs can tie the application to the specific server and reduce its portability. Application portability is important, as this allows us to move a web application if a particular implementation becomes more suitable.
Tomcat and Catalina
Tomcat, developed under the Jakarta Apache project (http://jakarta.apache.org/), supports version 2.3 of the Servlet API, which is the current version. The current version of the Tomcat servlet container, named Catalina, has been redeveloped from the ground up to meet the latest standards of flexibility and performance.
Tomcat 4 is the official Reference Implementation of the Servlet 2.3 and JavaServer Pages 1.2 API.
The Apache project is a collection of open source projects related to web development. The Jakarta subproject is an umbrella for the Java open source projects. The source code for the Sun Reference Implementation was released to the Apache subproject Tomcat, which is run by volunteers from the Java developer community.
The key advantage for developers (and users) of having the official Reference Implementation as a Jakarta Apache project is that the source code is available to individuals (developers like us) and companies to use without needing to pay royalties. It should also provide the benefit of standardizing and improving the implementations of web containers and the deployment of web applications among competing products from different providers (in addition to bug fixes and future development being driven by the open source community). It is also likely to mean that Tomcat (or component parts of it like the Catalina servlet container) will become components in new servers/applications, which should be to the developers' advantage as well as being good for the product provider.
It is important to stress here that while we are going to be using Tomcat throughout this book, we could also use one of the many alternative implementations out there. These include Allaire's JRun, Iona's iPortal Application Server, BEA's WebLogic, IBM's WebSphere, and Oracle's Oracle9i Application Server.
However, by using Tomcat, we are learning about the standard implementation, about the API, and about how to develop web applications. When we develop a web application based on the Servlet 2.3 specification on Tomcat, we know that we can move our application to another container that implements the same version of the Servlet specification, because all container implementers or vendors are implementing the same specification. Deploying and running the web application on another compliant container is a fairly standardized process across containers, so there is often little extra that needs to be learnt to get up and running with a new container.
The Web Server-Web Container Relationship
A web server is the software that resides on a server computer. The purpose of the web server, as shown below, is to receive client requests and return the relevant response of a static resource or a dynamically created response. Often the web server will receive requests over HTTP, but requests can be made over any suitable protocol that the web server supports.
In Java enterprise server side programming with servlets and J2EE technologies, the web server's purpose is similar to that described above. The web server receives requests from clients and maps the request to the appropriate resource. If the request is for a static resource (an HTML web page, an image), it simply returns this resource to the client (or an error code if the resource could not be located).
The request could also be for a J2EE component, such as a servlet. In this case the J2EE server provides a web container and/or an EJB container to the web server. The web server then forwards requests for components within the containers to the specific container and then the container passes the request to the relevant component, which then processes the request and returns a response. This is shown below:
On a practical level the J2EE server may be integrated within the web server or may be "plugged" in. In the "plug in" process the web server is configured to recognize requests for components within the J2EE containers and to forward these requests to the appropriate container:
The web container does not receive requests directly from the client, but receives them from the web server. The web container can be configured as an add-on, or plugged into the web server, and receives the requests that map to components within the container. Examples of this configuration would include JRun-to-IIS, or Tomcat-to-Apache web server.
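The forwarding decision described above can be sketched as a small routing table in plain Java. RequestRouter and its Handler interface are hypothetical names used only for this illustration; real web servers and their container plug-ins are configured through their own mechanisms, which are not shown here. The sketch assumes that container-managed components are identified by a URL prefix and that everything else is looked up as a static resource.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RequestRouter {
    /** Hypothetical handler type standing in for a container that serves a path. */
    interface Handler {
        String handle(String path);
    }

    // URL prefixes that the web server forwards to a container
    private final Map<String, Handler> containerMappings = new LinkedHashMap<>();
    // Static resources that the web server returns itself
    private final Map<String, String> staticFiles = new LinkedHashMap<>();

    void mapToContainer(String prefix, Handler handler) {
        containerMappings.put(prefix, handler);
    }

    void addStaticFile(String path, String content) {
        staticFiles.put(path, content);
    }

    /** Forwards to a container when a prefix matches, otherwise serves a static file. */
    String route(String path) {
        for (Map.Entry<String, Handler> mapping : containerMappings.entrySet()) {
            if (path.startsWith(mapping.getKey())) {
                return mapping.getValue().handle(path);
            }
        }
        String content = staticFiles.get(path);
        return content != null ? "200 " + content : "404 Not Found";
    }

    public static void main(String[] args) {
        RequestRouter router = new RequestRouter();
        router.addStaticFile("/index.html", "<h1>Home</h1>");
        router.mapToContainer("/app/", path -> "200 dynamic response for " + path);

        System.out.println(router.route("/index.html"));
        System.out.println(router.route("/app/orders"));
        System.out.println(router.route("/missing.gif"));
    }
}
```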
The case shown below is similar but slightly different. In this case the web container is an integral part of the web server. The web server functions similarly, providing static content as requested, but instead of forwarding requests to the plugged-in web container, the web container is internal to the web server and the request is passed internally to the web container:
WebLogic, WebSphere and Tomcat are examples of this scenario. As you have probably noticed, Tomcat (and others) can be configured with either setup as required.
Tomcat, run as a standalone server, provides web server functionality together with the web container for web applications. Like other such Java web servers it will run our web application, but does not provide the wider range of J2EE services that a full J2EE-compliant application server would. Examples developed in this book will work with Tomcat 4 in this setup.
Advantages of Using Servlets
So what are the benefits of using servlets?
Originally the web consisted of static web pages. As the requirements for the web evolved, static pages were not enough. Dynamically-created pages were required and the Common Gateway Interface (CGI) was developed to fill this void. CGI works by passing the request on to the CGI program, which is spawned in a separate process. However, running the CGI script in a new process has obvious cost (in terms of time and server processing resources) and scalability issues.
CGI scripts can be written in almost any language, and Perl is the language most frequently used with CGI. However, CGI programs written in languages like C can cause problems if poorly programmed, and CGI scripts suffer from the disadvantage of not being able to access the server's resources or information once started. Therefore CGI scripts cannot share or pool resources such as database connections where relevant, which impedes performance. Some improvements were made to the original concept, including FastCGI, which made CGI processes persistent (eliminating some of the start-up costs) but still had the same scaling limitations. CGI is not as cross-platform as Java, but is still relatively cross-platform subject to support and testing.
Alternative options to CGI have been provided by server vendors including Netscape's (now iPlanet) server extension APIs (NSAPI). However, these APIs are server-specific and contain various security and reliability problems. Microsoft provides ASP that runs VBScript.
Probably the most obvious difference between CGI and Java is process execution. CGI scripts are executed in separate processes while Java servlets run within the server process, with obvious performance advantages. Servlet instances persist between calls so they do not have to be created on each call. Also servlets have access to the servlet container and information about the environment in which the servlet is running. This means that servlets can also share resources such as connections to databases. The end result is a significant improvement in performance and scalability over CGI applications.
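To illustrate what a persistent, shared instance implies, the sketch below creates one component and calls it from several threads, much as a container may do with a single servlet instance. SharedCounterComponent and ConcurrentCallsDemo are hypothetical classes written for this example, and the thread pool merely imitates concurrent requests. Because the instance state survives between calls, it has to be thread-safe; here an AtomicInteger is used for that purpose.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical long-lived component: created once, then called from many threads. */
final class SharedCounterComponent {
    // Instance state persists between calls, so it must be safe for concurrent use
    private final AtomicInteger served = new AtomicInteger();

    String handle(String request) {
        int n = served.incrementAndGet();
        return "request " + request + " was number " + n;
    }

    int servedCount() {
        return served.get();
    }
}

final class ConcurrentCallsDemo {
    /** Sends the given number of requests to ONE shared instance from a thread pool. */
    static int serveConcurrently(int requests, int threads) {
        SharedCounterComponent component = new SharedCounterComponent();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            final String name = "r" + i;
            pool.execute(() -> component.handle(name));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return component.servedCount();
    }

    public static void main(String[] args) {
        System.out.println(serveConcurrently(1000, 8) + " requests served by one instance");
    }
}
```

The same reasoning applies to any resource a servlet shares between requests, such as a pool of database connections: the saving comes from reuse, and the cost is that access must be coordinated between threads.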
Running servlets inside the server process could pose risks for a server, but Java's error and exception handling greatly reduces this potential problem. Servlets and servers use the standard Java error and exception handling mechanism, so servlet programmers should code their servlets robustly enough to cope with errors. In the event that they fail to anticipate every possible error, and the servlet does fail, the server catches the error and returns a standard error to the client. This informs the client that there was an error and protects the server against poorly written web applications. Servers are also protected against servlets that violate security through the security manager, which effectively sandboxes the web application, restricting its access to server resources outside the web application's domain.
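A minimal sketch of this containment, in plain Java: the caller wraps each invocation in a try/catch block so that a failing component produces an error response instead of bringing down the caller. ErrorContainment and its Component interface are hypothetical and do not reflect how any particular container is implemented; the status strings are simplified for illustration.

```java
final class ErrorContainment {
    /** Hypothetical component type; processing is allowed to throw any exception. */
    interface Component {
        String process(String request) throws Exception;
    }

    /** Invokes the component and converts any exception into a standard error response. */
    static String invoke(Component component, String request) {
        try {
            return "200 " + component.process(request);
        } catch (Exception e) {
            // The failure stays inside this call; the caller keeps running
            return "500 Internal Server Error";
        }
    }

    public static void main(String[] args) {
        Component healthy = request -> request.toUpperCase();
        Component broken = request -> {
            throw new IllegalStateException("unexpected failure");
        };

        System.out.println(invoke(healthy, "ok"));
        System.out.println(invoke(broken, "ok"));
        System.out.println(invoke(healthy, "still serving"));
    }
}
```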
Servlets offer all of the services of CGI involved in request-response handling, and then add significantly by giving the servlet access to a full range of Java libraries including the J2SE API, the Servlets API and often the full J2EE API (depending on the server). We can also include many other APIs from third-party vendors, such as JDBC drivers and XML parsers. These libraries and optional APIs are available cross-platform and cross-server.
Protocol Flexibility
The Servlet API provides two core packages:
- javax.servlet
- javax.servlet.http
Most applications tend to extend from javax.servlet.http.HttpServlet to create servlets, or from their own subclass of this, therefore implicitly choosing the HTTP protocol. This means that a common perception of servlets is that they are tied to HTTP.
However, there is no reason why we could not implement our own package extending from javax.servlet to develop our own protocol support, much as the javax.servlet.http supports HTTP-based servlets. It would be relatively simple to extend the Servlets API to support other protocols such as file transfer (FTP), and e-mail (POP3, SMTP, IMAP). It would also be relatively simple to develop and implement our own protocol appropriate for our application. We can also use the HTTP-based servlet to tunnel through firewalls that block other connections, as part of our application.
JSP Components
JavaServer Pages (JSPs) are an extension of servlet technology, because they simplify part of the process of creating web content. A JSP can contain directive tags, blocks of Java code (known as "scriptlets"), and HTML. The tags and scriptlets are used to generate dynamic content within the page. JSP pages are compiled into a servlet on first call, for execution.
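To give a feel for what compilation into a servlet means, consider a hypothetical page whose body is the template text Hello, followed by the expression <%= name %>, wrapped in html and body tags. The sketch below shows, in plain Java, roughly the kind of code such a page could be translated into: template text becomes write calls and the expression becomes a write of its value. HelloPageSketch is an invented class for illustration. The code a real container generates is different and more involved (for example, it places the page body in a _jspService() method that receives the request and response objects).

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hand-written approximation of a translated page; not actual container output. */
final class HelloPageSketch {
    /** Writes the page: fixed template text plus the value of one expression. */
    static void render(Writer out, String name) throws IOException {
        out.write("<html><body>\n");        // template text is copied through as-is
        out.write("Hello, ");
        out.write(String.valueOf(name));    // <%= name %> becomes a write of the value
        out.write("!\n");
        out.write("</body></html>\n");
    }

    /** Convenience wrapper that renders into a String instead of a response stream. */
    static String renderToString(String name) {
        StringWriter buffer = new StringWriter();
        try {
            render(buffer, name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderToString("Ada"));
    }
}
```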
The specifications tend to limit their identification of the uses of JSP pages to text-based documents, but, because JSP pages are an extension of servlets and servlet functionality, any use that a servlet may be put to may also be incorporated into a JSP. JSP pages are particularly useful components for generating page content, so they are normally used for presentation logic, and developers are discouraged from placing processing logic in them. We'll learn more about JavaServer Pages in Chapter 8.