Real World XML (2nd Edition)

In the previous example, I associated an image, image.gif, with a document, but only by setting an attribute to the text "image.gif". What if I wanted to make image.gif a real part of the document?

I can do that by treating image.gif as an external unparsed entity. The creators of XML realized that XML was not ideal for storing data that is not text, so they added unparsed entities as a way of associating non-XML data, such as non-XML text or binary data, with XML documents.

To declare an external unparsed entity, you use an <!ENTITY> declaration like this. Note the keyword NDATA, which indicates that I'm referring to an unparsed entity:

<!ENTITY NAME SYSTEM VALUE NDATA TYPE >

Here, NAME is the name of the external unparsed entity, VALUE is the quoted system identifier that locates the entity (a URI, such as the name of an external file, as in "image.gif"), and TYPE is the name of a notation declared with <!NOTATION>.

You can also use public external unparsed entities if you use the PUBLIC keyword with a formal public identifier (FPI), followed by the system identifier:

<!ENTITY NAME PUBLIC FPI VALUE NDATA TYPE >
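To make the PUBLIC form concrete, here is a minimal sketch in Python that parses such a declaration with the standard library's xml.parsers.expat module and records what the parser reports. The FPI "-//EXAMPLE//Customer Snapshot//EN" and the one-element DTD are made up for illustration, and the choice of Python and expat is an assumption of this sketch, not something this chapter prescribes.

```python
import xml.parsers.expat

# Hypothetical document: the FPI and the tiny DTD are invented for this sketch.
DOC = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT EMPTY>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 PUBLIC "-//EXAMPLE//Customer Snapshot//EN" "image.gif" NDATA GIF>
]>
<DOCUMENT/>
"""


def unparsed_entities(text):
    """Return {entity name: (public id, system id, notation name)}."""
    found = {}

    # expat calls this handler once for every <!ENTITY ... NDATA ...> declaration.
    def on_unparsed(name, base, system_id, public_id, notation):
        found[name] = (public_id, system_id, notation)

    parser = xml.parsers.expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed
    parser.Parse(text, True)  # True marks this as the final chunk of input
    return found


entities = unparsed_entities(DOC)
print(entities)
```

The handler receives both the FPI and the system identifier, which mirrors the order in which they appear in the declaration.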

Here's an example. In this case, I start by declaring a notation named GIF that stands for the image/gif MIME type:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
. . .

Now I create an external unparsed entity named SNAPSHOT1 to refer to the external image file, image.gif:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
. . .

After you've declared an external unparsed entity such as SNAPSHOT1, you can't just embed it in an XML document directly. Instead, you create a new attribute of the ENTITY type that you can assign the entity to. I'll call this new attribute IMAGE:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!ATTLIST CUSTOMER IMAGE ENTITY #IMPLIED>
]>
. . .

Now, finally, I'm able to assign the IMAGE attribute the value SNAPSHOT1 like this, making image.gif an official part of the document:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!ATTLIST CUSTOMER IMAGE ENTITY #IMPLIED>
]>
<DOCUMENT>
<CUSTOMER IMAGE="SNAPSHOT1">
. . .
</CUSTOMER>
</DOCUMENT>

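If you use external unparsed entities like this, XML processors won't try to read and parse them. A validating processor does check that the attribute's value names a declared unparsed entity and that the entity's notation has been declared, and some will also check that the external file is actually there, which helps confirm that the document is complete.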
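To see what a processor hands to an application without ever opening image.gif, here is a minimal sketch in Python using the standard library's xml.parsers.expat module. Expat is a non-validating parser, so this shows only the reporting side, not validation. The DTD is a trimmed-down version of the one above (CUSTOMER is declared EMPTY purely to keep the sketch short), and the function name resolve_images is hypothetical.

```python
import xml.parsers.expat

# Trimmed-down version of this section's document; CUSTOMER is declared EMPTY
# here only to keep the sketch short.
DOC = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER EMPTY>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!ATTLIST CUSTOMER IMAGE ENTITY #IMPLIED>
]>
<DOCUMENT>
<CUSTOMER IMAGE="SNAPSHOT1"/>
</DOCUMENT>
"""


def resolve_images(text):
    """Return (element, system id, notation system id) for each IMAGE attribute."""
    notations = {}  # notation name -> its system identifier
    entities = {}   # unparsed entity name -> (system id, notation name)
    images = []

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_unparsed(name, base, system_id, public_id, notation):
        entities[name] = (system_id, notation)

    def on_start(tag, attrs):
        # The parser delivers the attribute value as a plain string; mapping it
        # back to the declared entity is left to the application.
        if "IMAGE" in attrs:
            system_id, notation = entities[attrs["IMAGE"]]
            images.append((tag, system_id, notations[notation]))

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed
    parser.StartElementHandler = on_start
    parser.Parse(text, True)  # succeeds even though image.gif is never read
    return images


print(resolve_images(DOC))
```

The parse completes whether or not image.gif exists, because the parser only reports the entity's name, system identifier, and notation. Deciding what to do with a GIF is up to the application.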

What if I wanted to embed multiple unparsed entities? Take a look at the next topic.
