XML: A Manager's Guide (2nd Edition) (Addison-Wesley Information Technology Series)
The fundamental unit of XML content is the element, which is an author-specified chunk of information. An element consists of an element name and element content. XML is case sensitive, so you must pay attention to case when assigning element names and creating element content. Start and end tags denote the boundaries of the element and contain the element name. The element content may consist of character data or other elements. An element may also be empty of content. Consider Example 2-3, an annotated version of our business card document, to see examples of these content types.
Example 2-3
    <BusinessCard>                          document element
      <Name>                                element content
        <GivenName>Kevin</GivenName>        data content
        <MiddleName>Stewart</MiddleName>    data content
        <FamilyName>Dick</FamilyName>       data content
      </Name>
      <Title>
        Software Technology Analyst         data content
      </Title>
      <Author/>                             empty content
      <ContactMethods>                      element content
        <Phone>650-555-5000</Phone>         data content
        <Phone>650-555-5001</Phone>         data content
      </ContactMethods>
    </BusinessCard>
In Example 2-3, "BusinessCard" is the top-level element. In XML, there can be only one element at the top level. This element is called the document element or sometimes the root element. Think of this element as the trunk of the tree from which all other elements branch. Figure 2-2 shows the corresponding tree for Example 2-3, with each node representing an element and identified by the element name. Conceptually, the element content resides within the node.
Figure 2-2. Business Card Element Tree
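The tree view can be reproduced with a few lines of code. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the helper name outline is illustrative and does not come from the book. It parses the business card without its annotations and prints one indented line per element, so the document element appears first with its branches beneath it.

```python
import xml.etree.ElementTree as ET

# The business card from Example 2-3, without the annotations.
BUSINESS_CARD = """\
<BusinessCard>
  <Name>
    <GivenName>Kevin</GivenName>
    <MiddleName>Stewart</MiddleName>
    <FamilyName>Dick</FamilyName>
  </Name>
  <Title>Software Technology Analyst</Title>
  <Author/>
  <ContactMethods>
    <Phone>650-555-5000</Phone>
    <Phone>650-555-5001</Phone>
  </ContactMethods>
</BusinessCard>
"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(BUSINESS_CARD)  # root is the document element
print("\n".join(outline(root)))
```

The first line printed is BusinessCard, and every other element name is indented beneath the element that contains it.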
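The annotations in Example 2-3 indicate the content model for each element. There are four allowable content models for elements: data content (character data only), element content (other elements only), mixed content (both character data and elements), and empty content (neither). Example 2-3 shows all of these except mixed content.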
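A short program can assign a content model to each element. The sketch below assumes Python's standard xml.etree.ElementTree module and assumes the four models are data, element, mixed (both data and elements), and empty (neither). The function name content_model and the Note element are hypothetical additions for illustration. It also assumes that whitespace-only text is just indentation, which is a simplification.

```python
import xml.etree.ElementTree as ET


def content_model(element):
    """Classify an element as 'data', 'element', 'mixed', or 'empty' content."""
    has_children = len(element) > 0
    # Character data can sit before the first child (element.text) or after
    # any child (child.tail); whitespace-only runs are treated as indentation.
    chunks = [element.text] + [child.tail for child in element]
    has_data = any(chunk and chunk.strip() for chunk in chunks)
    if has_children and has_data:
        return "mixed"
    if has_children:
        return "element"
    if has_data:
        return "data"
    return "empty"


# <Note> is a hypothetical element, added only to show mixed content.
card = ET.fromstring(
    "<BusinessCard>"
    "<Name><GivenName>Kevin</GivenName></Name>"
    "<Author/>"
    "<Note>Call <Phone>650-555-5000</Phone> first</Note>"
    "</BusinessCard>"
)
for element in card.iter():
    print(element.tag, "->", content_model(element))
```

Here BusinessCard and Name are classified as element content, GivenName and Phone as data content, Author as empty, and the hypothetical Note as mixed.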
Notice that, except for empty elements, all elements in Example 2-3 have a start tag and an end tag. The start tag is bounded by angle brackets, for example, <ElementName>. The end tag is bounded by angle brackets and has a leading slash, as in </ElementName>. All content, whether data or element, must occur between the start and end tags. An empty element may use this syntax by simply providing no content between the start and end tags, for example, <ElementName></ElementName>. An empty element may also use an abbreviated syntax, a single tag bounded by angle brackets with a trailing slash, for example, <ElementName/>.
A document that obeys all the XML syntax rules is well formed. There are several technical criteria for well-formedness, but the primary ones are the following.
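One way to see the syntax rules in action is to hand small fragments to an XML parser and note which ones it rejects. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name is_well_formed and the sample fragments are illustrative. The rejected fragments break rules stated in this section: tag names must match exactly, including case; every start tag needs an end tag; elements must nest rather than overlap; and there can be only one top-level element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts text without a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    "matched tags": "<Phone>650-555-5000</Phone>",
    "empty, long form": "<Author></Author>",
    "empty, short form": "<Author/>",
    "case mismatch": "<Phone>650-555-5000</phone>",
    "missing end tag": "<Name><GivenName>Kevin</Name>",
    "overlapping tags": "<Name><GivenName>Kevin</Name></GivenName>",
    "two top-level elements": "<Phone>1</Phone><Phone>2</Phone>",
}
for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

The first three fragments are accepted and the last four are rejected. A check like this only confirms syntax; it says nothing about whether the element names or content make sense for a given application.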
An XML processor can process a well-formed XML document unambiguously, building a tree data structure in which each node is an element that contains data content, references to its subelements, both, or neither. You could use such documents to represent many different kinds of content. Example 2-4 shows a document that represents the schema for a simple contact database. Figure 2-3 shows the corresponding tree.
Example 2-4
    <Database>
      <Table>
        <Column>Name</Column>
        <Column>Phone Number</Column>
      </Table>
      <Table>
        <Column>Date</Column>
        <Column>Person</Column>
      </Table>
    </Database>
Figure 2-3. Contact Database Element Tree
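The tree data structure that a processor builds from Example 2-4 can be sketched in code. The example below assumes Python's standard xml.etree.ElementTree module and represents each node as a (name, data, children) tuple; that tuple layout and the helper name to_node are illustrative choices, not something an XML processor is required to use.

```python
import xml.etree.ElementTree as ET

# The contact database schema from Example 2-4.
SCHEMA = """\
<Database>
  <Table>
    <Column>Name</Column>
    <Column>Phone Number</Column>
  </Table>
  <Table>
    <Column>Date</Column>
    <Column>Person</Column>
  </Table>
</Database>
"""


def to_node(element):
    """Convert an element into a (name, data, children) tuple."""
    data = (element.text or "").strip()
    children = [to_node(child) for child in element]
    return (element.tag, data, children)


tree = to_node(ET.fromstring(SCHEMA))

# Collect the column names held by each Table node.
tables = [[data for _, data, _ in columns] for _, _, columns in tree[2]]
print(tables)
```

The program recovers two lists of column names, one per Table element, and nothing more: the tables themselves carry no names, and the columns carry no types.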
Although the document in Example 2-4 captures the basic structure of the contact database, there is not enough information for a software application to process the document, establish a connection to the database in question, and perform queries. The element names "Database," "Table," and "Column" are insufficiently descriptive. Clearly, you need a richer syntax for describing the metadata associated with an element.