XML: A Manager's Guide (2nd Edition) (Addison-Wesley Information Technology Series)

In this application category, the XML paradigm is something of a replacement for HTML on its own. The end delivery format in most cases will still be HTML, but XML-formatted content and XSLT style sheets drive the process. Usually, the document author wants an XML solution instead of plain old HTML for one of these three reasons.

  1. The author wants to generate content from an existing electronic source. Generating personalized news documents from news feeds and databases is an excellent example. In this case, the document author wants to define a common structure for all the documents and populate the structure automatically based on certain parameters. Another example is an online sports pool in which the author wants to generate results electronically for each user based on that user's individual picks.

  2. The authoring process is highly collaborative. Online publishing is the prototypical example. A group of geographically dispersed authors wants to contribute their pieces of content, which are then processed for multiple audience-specific outlets. The group needs highly structured units of information so all the pieces will fit together for delivery, and the formatting must be completely separated from the content. Distributed project management applications are another example.

  3. Users need the ability to search through a document archive using the metadata provided by XML tags. The search for works by William Shakespeare introduced in Chapter 1 is one example. In this case, the document author wants to provide plays online with the content marked up to indicate the role of each element. Online product catalogs are perhaps the best example of when facilitating such a finely tuned search would benefit both users and the application provider.
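As a rough illustration of the third reason, the sketch below searches a small archive by the metadata that XML tags provide. It assumes a hypothetical archive layout: the element names (archive, play, title, author, genre) and the helper titles_by are invented for this example and are not taken from the book's schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive; element names are illustrative assumptions.
ARCHIVE = """
<archive>
  <play>
    <title>Hamlet</title>
    <author>William Shakespeare</author>
    <genre>tragedy</genre>
  </play>
  <play>
    <title>Twelfth Night</title>
    <author>William Shakespeare</author>
    <genre>comedy</genre>
  </play>
  <play>
    <title>Doctor Faustus</title>
    <author>Christopher Marlowe</author>
    <genre>tragedy</genre>
  </play>
</archive>
"""


def titles_by(root, author, genre=None):
    """Return titles of plays whose <author> (and optionally <genre>) match."""
    titles = []
    for play in root.findall("play"):
        if play.findtext("author") != author:
            continue
        if genre is not None and play.findtext("genre") != genre:
            continue
        titles.append(play.findtext("title"))
    return titles


root = ET.fromstring(ARCHIVE)
shakespeare = titles_by(root, "William Shakespeare")
tragedies = titles_by(root, "William Shakespeare", genre="tragedy")
print(shakespeare)
print(tragedies)
```

Because the author and genre are marked up as separate elements, the search can match on the role of each piece of text rather than on the words alone, which is what a plain HTML page cannot offer.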

People are the primary consumers of Content Documents. Software applications may index and search content documents, but primarily the information within the documents themselves is meaningful to people. It follows that presentation is important. Therefore, Content Document applications must heavily involve XSL stylesheets as well as XML documents.

Development Process

Because Content Document applications often use XML to replace HTML, the development bears a certain resemblance to that of traditional Web applications. The primary differences lie in XML's cleaner separation of responsibilities between information design and presentation design. As Figure 6-1 shows, the development process for Content Document applications has the following seven steps.

Figure 6-1. Development Process for Content Document

  1. Design schemas. The primary goal of most Content Document applications is to deliver structured information to human users. Therefore, the first step in the development process is assessing what information the users need and organizing this information as a set of schemas. Because the information needs of users will vary from application to application, most Content Document applications require custom schemas. Information designers may be able to leverage existing schemas in whole or in part. However, because the value proposition of a new Content Document application is the delivery of information tailored to the needs of target users, some custom schema design is almost always necessary.

  2. Design stylesheets. The information consumer for Content Document applications is the human user. Human users process information most effectively when it is presented in a well-designed visual format. Therefore, once application developers have decided what information to present, the next step is to develop the layouts for that information. In most Content Document applications, users view documents within an XML-capable browser, using stylesheets, so layout design implies stylesheet design. Because one of the advantages of XML for Content Document applications is the customization of layouts, designers require a model of the intended audiences and their requirements. Each distinct audience segment may then need its own stylesheet for each schema related to its information needs.

  3. Implement document management. The schema design phase produces templates for the information delivered to users. The stylesheet design phase produces the layout for these templates. The next step is implementing the infrastructure necessary to marshal documents and stylesheets and then deliver them to users. This document management may be as simple as creating a filesystem directory with the appropriate permissions and configuring a Web server. It may instead involve a more sophisticated content management system that manages collaborative authoring, authorization processes for releasing content to users, and update cycles for different types of content. Completing this phase requires managers and producers to define policies first; administrators then configure the document management systems to reflect those policies.

  4. Author documents. With the document management infrastructure in place, the application needs the content to deliver. Most Content Document applications include static XML documents. Human authors must create these documents. As discussed in Chapter 5, they should use either ad hoc word processor-like editors or wizards that accept form-based input. Administrators have to acquire these tools and then integrate them with the document management systems. In most cases, this integration should be a simple matter of configuration, but it might also require the application of update patches or purchase of add-on products. For particularly specialized applications, developers may have to write custom authoring tools.

  5. Generate documents. Many Content Document applications include dynamic documents generated from data in external sources. For these documents, developers must establish connections to the appropriate sources and specify the rules for generating documents from the data that they contain. Ideally, developers can use third-party data integration tools to access the data sources. Otherwise, they may have to write custom code for this purpose. Applying the generation rules requires capturing parameters from user input, and developers may implement the rules with a publishing tool or a scripting language. During the testing phase, developers must ensure that the external sources can handle the load imposed by the generation process for the expected number of users.

  6. Integrate front end. After providing for the creation of the content documents, developers must provide for their delivery to users by integrating the application with front-end clients. For XML-capable Web browsers, this integration may consist simply of providing a link to a starting page from a well-known Web location. For non-XML-capable Web browsers, this integration will require distributing XML plug-ins or updated versions of the browser. This step is particularly important in the testing phase because the differences in the platform and version spread of browser clients require thorough functional testing to ensure that all users have acceptable experiences.

  7. Integrate storage. In most cases, an application requires persistent storage for content documents. Developers have to integrate the document management infrastructure with this repository. XML-specific stores include automated functions for the storage and retrieval of XML documents. More generic repositories may require writing custom storage and retrieval code. Note that repository integration can occur in parallel with front-end integration. As with document generation, load testing is important to make sure the storage mechanism can handle the expected load.
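Step 5 can be made concrete with a small sketch based on the sports pool example mentioned earlier. It is only an illustration under stated assumptions: an in-memory SQLite table stands in for the external source, and the table, column, and element names (picks, player, results, game, and so on) are hypothetical, as are the helpers load_results and build_document.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_results(conn, player):
    """Fetch one player's picks and outcomes from the stand-in external source."""
    return conn.execute(
        "SELECT game, pick, winner FROM picks WHERE player = ? ORDER BY game",
        (player,),
    ).fetchall()


def build_document(player, rows):
    """Populate a common <results> structure from the rows for one player."""
    root = ET.Element("results", {"user": player})
    for game, pick, winner in rows:
        entry = ET.SubElement(root, "game", {"id": str(game)})
        ET.SubElement(entry, "pick").text = pick
        ET.SubElement(entry, "winner").text = winner
        # Generation rule: mark whether the pick matched the outcome.
        ET.SubElement(entry, "correct").text = "yes" if pick == winner else "no"
    return root


# Hypothetical data source: an in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE picks (player TEXT, game INTEGER, pick TEXT, winner TEXT)")
conn.executemany(
    "INSERT INTO picks VALUES (?, ?, ?, ?)",
    [
        ("ana", 1, "Lions", "Lions"),
        ("ana", 2, "Bears", "Hawks"),
        ("raj", 1, "Tigers", "Lions"),
    ],
)

# The player name plays the role of the parameter captured from user input.
doc = build_document("ana", load_results(conn, "ana"))
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

The same document structure is produced for every player; only the parameter changes. In a real application, each request would trigger a query like this one, which is why the text recommends load testing the external source.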

For very simple applications that deliver static documents via a Web server, the Content Document development process is nearly identical to the development of a traditional Web site. Schema design is the important new element. Developing applications that deliver dynamic XML documents via a Web server is very similar to developing dynamic Web applications with HTML. Schema design and the separation between designing stylesheets and generating documents are the important new elements. However, applications that use custom document management or sophisticated repositories add steps similar to those in traditional client-server software development.
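The separation between generating documents and designing their presentation can be sketched as follows. Plain Python functions stand in here for XSLT stylesheets, which is a deliberate simplification rather than the approach the text describes; the article structure, the audience names, and the functions render_full and render_headline are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document; element names are illustrative assumptions.
ARTICLE = (
    "<article>"
    "<headline>Lions win opener</headline>"
    "<byline>A. Reporter</byline>"
    "<body>The Lions won 3-1.</body>"
    "</article>"
)


def render_full(article):
    """Layout for an audience that reads the whole story."""
    headline = escape(article.findtext("headline"))
    byline = escape(article.findtext("byline"))
    body = escape(article.findtext("body"))
    return f'<h1>{headline}</h1><p class="byline">{byline}</p><p>{body}</p>'


def render_headline(article):
    """Layout for an audience that only scans headlines."""
    headline = escape(article.findtext("headline"))
    return f"<li>{headline}</li>"


# One layout per audience segment, all applied to the same content document.
LAYOUTS = {"reader": render_full, "ticker": render_headline}

article = ET.fromstring(ARTICLE)
pages = {audience: render(article) for audience, render in LAYOUTS.items()}
for audience, html_text in pages.items():
    print(audience, html_text)
```

The content document is written once, and each audience gets its own layout without any change to the content, which mirrors the split between information design and presentation design.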

Required Staff

The similarities to traditional Web application development make the types of staff required for Content Document applications similar as well. As in HTML Web applications, there is a split between authors of static documents and designers of dynamic documents. There is an additional split between information designers and presentation designers. As one would expect, similarities emerge in project management and support personnel. The following types of staff are required.

  • Producer. Whether for an intranet, extranet, or the Internet, Content Document applications must address the requirements of their target users. Because the target user groups are usually quite large and the expectations for responsiveness are usually quite high, there must be a staff member devoted to assessing user requirements and taking appropriate action: the producer. The producer coordinates the identification of changes in user requirements with the allocation of resources necessary to implement appropriate application enhancements. During initial application development, the producer plays much the same role as a traditional project manager, determining application goals and managing the development schedule. However, appealing to thousands of users and making daily changes require a more dynamic ongoing role similar to that of a producer in the media industry. The producer will have experience in project management, product development or design, and perhaps media production.

  • Information designer. Content Document applications deliver information to human users. The information designer, taking direction from the producer, determines what information to provide and how to organize it. This staff type participates primarily in the schema design phase. However, the information designer may also participate in the document management implementation and storage integration phases. These phases require the information designer's input on the probable usage patterns of the different document types so that other staff can implement efficient document management infrastructure and storage access. The information designer will have experience in requirements gathering, requirements specification, and data modeling.

  • Layout designer. The layout designer develops the stylesheets for the application. These designs depend on the types of documents designed by the information designer and the information gathered about users. In some cases, the producer may arrange user focus groups in which the layout designer participates to gather this data. The layout designer will have experience in graphic design, Web page design, and perhaps user interface design.

  • Application developer. Sophisticated Content Document applications may require the coding of custom software. These applications require an application developer. The role of this staff type is to supply programming expertise for the creation of custom document management infrastructure, custom document authoring tools, and custom document generation tools. The application developer will have expertise in appropriate programming languages such as Java, Visual Basic, and Perl.

  • Document author. Document authors have domain-specific knowledge that they need to encode as XML documents. To facilitate the rapid capture of this knowledge, they should use wizard-based authoring tools, although ad hoc tools will work better if the nature of the information does not lend itself to the template approach. They will have expertise in the particular domain for which the application provides content.

  • Data integrator. For applications that generate documents from external sources, the data integrator translates the information in these sources into the appropriate XML format. In some cases, this translation may simply require using a graphical tool. In others, it may require some scripting. The data integrator will have experience in database programming or administration and perhaps in programming languages such as Java, C++, and Perl.

  • Administrator. The administrator is responsible for configuring the document management infrastructure (for instance, Web servers and content management systems) and ensuring the ongoing operation of the application. This includes the application hardware, the document management infrastructure, the authoring system, the generation system, and the persistent storage mechanism. Sophisticated applications may require more than one administrator. The administrator will have experience as a Webmaster or system administrator, perhaps as a network administrator, and perhaps as a database administrator.

When compared with the staff for traditional Web applications, the information designer is probably the most revolutionary addition to the development team. The separation of information format design from presentation design and document authoring makes explicit a role fulfilled implicitly by a number of different traditional staff types. The combination of general requirements gathering skills and specific schema design skills is likely to be the most difficult to find. In the short term, layout designers, application developers, and data integrators with XML experience may be in short supply, but their skills will be transferable enough to make learning XML relatively straightforward.
