If you need a DTD in a hurry, create it from an XML instance using Trang, Relaxer, DTDGenerator, or xmlspy.

Several free Java tools are available that can generate a DTD based on an XML instance or instances. James Clark's Trang (http://www.thaiopensource.com/relaxng/trang.html) can, among other things, convert an XML document to a DTD, as can Relaxer (http://www.relaxer.org). Michael Kay's DTDGenerator (http://saxon.sourceforge.net/dtdgen.html), once part of the Saxon project, consists of a single Java class that is dedicated to XML-to-DTD conversion. This hack walks you through the steps to generate a DTD from a simple instance using each of these tools.

5.7.1 Trang

You can download the current Trang JAR (trang.jar) from http://www.thaiopensource.com/download/, then place the JAR in the working directory. The Trang archive comes with a manual (trang-manual.html) that provides details on how to use Trang. I will cover only what is needed to create a DTD in this section. (If you need help with Java, refer to [Hack #10].)

To create a DTD from time.xml, run this command:

    java -jar trang.jar -I xml -O dtd time.xml generated.dtd

The -I switch indicates the type of input (XML), and -O indicates the type of output (DTD). time.xml is the input file and generated.dtd is the output file.
You could simplify this command by skipping the -I and -O options, which will produce the same result:

    java -jar trang.jar time.xml generated.dtd

The file generated.dtd looks like this:

    <?xml encoding="UTF-8"?>

    <!ELEMENT time (hour,minute,second,meridiem,atomic)>
    <!ATTLIST time
      xmlns CDATA #FIXED ''
      timezone #REQUIRED>

    <!ELEMENT hour (#PCDATA)>
    <!ATTLIST hour
      xmlns CDATA #FIXED ''>

    <!ELEMENT minute (#PCDATA)>
    <!ATTLIST minute
      xmlns CDATA #FIXED ''>

    <!ELEMENT second (#PCDATA)>
    <!ATTLIST second
      xmlns CDATA #FIXED ''>

    <!ELEMENT meridiem (#PCDATA)>
    <!ATTLIST meridiem
      xmlns CDATA #FIXED ''>

    <!ELEMENT atomic EMPTY>
    <!ATTLIST atomic
      xmlns CDATA #FIXED ''
      signal #REQUIRED>

Trang automatically declares an xmlns attribute for every element in an effort to be namespace-friendly. Trang apparently orders the declarations it outputs according to the order in which the elements appear in the source.

5.7.2 Relaxer

With Relaxer installed [Hack #37], you can type this command to generate a DTD from time.xml:

    relaxer -dir:out -dtd time.xml

Relaxer automatically derives the name of the output file (time.dtd) from the name of the input file (time.xml). To keep from clobbering the existing time.dtd, the -dir:out option places the output file in the subdirectory out. If the subdirectory does not exist, Relaxer creates it. The result of this command, the file out/time.dtd, is shown here:

    <!-- Generated by Relaxer 1.0 -->
    <!-- Tue Mar 02 17:21:20 MST 2004 -->

    <!ELEMENT hour (#PCDATA)>
    <!ELEMENT time (hour, minute, second, meridiem, atomic)>
    <!ATTLIST time timezone CDATA #REQUIRED>
    <!ELEMENT minute (#PCDATA)>
    <!ELEMENT atomic EMPTY>
    <!ATTLIST atomic signal CDATA #REQUIRED>
    <!ELEMENT meridiem (#PCDATA)>
    <!ELEMENT second (#PCDATA)>

Relaxer can consider the content models of more than one XML document in order to produce a DTD.
Try this:

    relaxer -dir:out -dtd time1.xml time.xml

Relaxer uses the name of the first file in the list for its output filename, so the output file will be out/time1.dtd, which follows:

    <!-- Generated by Relaxer 1.0 -->
    <!-- Tue Mar 02 17:28:30 MST 2004 -->

    <!ELEMENT hour (#PCDATA)>
    <!ELEMENT time (hour, minute, second, meridiem, atomic?)>
    <!ATTLIST time timezone CDATA #REQUIRED>
    <!ELEMENT minute (#PCDATA)>
    <!ELEMENT atomic EMPTY>
    <!ATTLIST atomic signal CDATA #REQUIRED>
    <!ELEMENT meridiem (#PCDATA)>
    <!ELEMENT second (#PCDATA)>

In the declaration for time, atomic is now followed by a question mark (?), meaning zero or one occurrence. This is because atomic does not appear in time1.xml, so Relaxer interprets it as an optional element rather than a required one.

5.7.3 DTDGenerator

The DTDGenerator JAR file (dtdgen.jar) comes with the files for the book. To generate a DTD from time.xml with this utility, type the following command while in the working directory:

    java -cp dtdgen.jar DTDGenerator time.xml

The result of the command is sent to standard output:

    <!ELEMENT atomic EMPTY >
    <!ATTLIST atomic signal NMTOKEN #REQUIRED >
    <!ELEMENT hour ( #PCDATA ) >
    <!ELEMENT meridiem ( #PCDATA ) >
    <!ELEMENT minute ( #PCDATA ) >
    <!ELEMENT second ( #PCDATA ) >
    <!ELEMENT time ( hour, minute, second, meridiem, atomic ) >
    <!ATTLIST time timezone NMTOKEN #REQUIRED >

Judging from this output, DTDGenerator lists the element declarations in alphabetical order by element name rather than in document order. To redirect the output of DTDGenerator to a file, do this:

    java -cp dtdgen.jar DTDGenerator time.xml > somesuch.dtd

5.7.4 xmlspy

You can also generate a DTD from an XML document with xmlspy (http://www.xmlspy.com). This example demonstrates how to do this in xmlspy 2004 Enterprise Edition (Release 3).

Start xmlspy, and use File → Open to open time.xml in the working directory. Choose DTD/Schema → Generate DTD/Schema. The Generate DTD/Schema dialog box appears (Figure 5-3). The DTD radio button is selected by default. Click OK.
You are then asked if you want to assign the generated DTD to the document. Click Yes. This adds a document type declaration to the XML document. Then you are asked to save the DTD. Give it the name spytime.dtd and click the Save button. Overwrite spytime.dtd if it already exists in the working directory (it should). With File → Save As, save time.xml, which now has a document type declaration, as spytime.xml. Choose XML → Validate or press F8 to validate spytime.xml against spytime.dtd (Figure 5-4).
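
If xmlspy is not at hand, the same validity check can be approximated with the JAXP parser that ships with the JDK. The following is only a sketch: the class name DtdCheck and its validate method are made up for this illustration, and it assumes the document carries a document type declaration pointing to its DTD, as spytime.xml does after the steps above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; not part of xmlspy or any tool in this hack.
public class DtdCheck {

    /** Parses the file with DTD validation turned on and returns the validity errors. */
    public static List<String> validate(File xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD named in the DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }
            @Override
            public void error(SAXParseException e) {
                // validity errors are collected instead of stopping the parse
                problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
            @Override
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // well-formedness errors cannot be recovered from
            }
        });
        builder.parse(xml);
        return problems;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java DtdCheck file.xml");
            return;
        }
        List<String> problems = validate(new File(args[0]));
        if (problems.isEmpty()) {
            System.out.println(args[0] + " is valid");
        } else {
            for (String problem : problems) {
                System.out.println(problem);
            }
        }
    }
}
```

Compiled with javac, it could be run as java DtdCheck spytime.xml from the working directory. A relative system identifier such as spytime.dtd is resolved against the location of the XML file.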
Figure 5-3. xmlspy's Generate DTD/Schema dialog box with DTD selected

Figure 5-4. Validating spytime.xml in xmlspy (text view)

If you have xmlspy available, it provides one of the quickest and easiest ways to generate a DTD for an XML document.
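
To get a feel for what such generators do, here is a minimal sketch of inferring a DTD from one or more instances using only the JDK. It is not how Trang, Relaxer, DTDGenerator, or xmlspy are implemented. The class name TinyDtdGuess is hypothetical, and the sketch makes simplifying assumptions: children are listed in the order they are first seen, every attribute is declared as CDATA, and a child missing from some occurrence of its parent is marked optional.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical illustration class; not part of any tool named in this hack.
public class TinyDtdGuess {
    // Statistics collected for one element name across all instances.
    static class Info {
        int seen;                                              // occurrences of the element
        boolean text;                                          // non-whitespace text was seen
        Map<String, Integer> childIn = new LinkedHashMap<>();  // child -> parents containing it
        Set<String> repeated = new HashSet<>();                // children repeated within a parent
        Map<String, Integer> attrs = new TreeMap<>();          // attribute -> occurrences with it
    }

    private final Map<String, Info> infos = new LinkedHashMap<>();

    /** Parses one instance and folds its structure into the statistics. */
    public void add(InputStream xml) throws Exception {
        visit(DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(xml).getDocumentElement());
    }

    private void visit(Element e) {
        Info info = infos.computeIfAbsent(e.getTagName(), k -> new Info());
        info.seen++;
        NamedNodeMap as = e.getAttributes();
        for (int i = 0; i < as.getLength(); i++) {
            info.attrs.merge(as.item(i).getNodeName(), 1, Integer::sum);
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                counts.merge(((Element) n).getTagName(), 1, Integer::sum);
                visit((Element) n);
            } else if (n instanceof Text && !n.getNodeValue().trim().isEmpty()) {
                info.text = true;
            }
        }
        for (Map.Entry<String, Integer> c : counts.entrySet()) {
            info.childIn.merge(c.getKey(), 1, Integer::sum);
            if (c.getValue() > 1) {
                info.repeated.add(c.getKey());
            }
        }
    }

    /** Writes one ELEMENT declaration per element name, followed by its ATTLISTs. */
    public String toDtd() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Info> en : infos.entrySet()) {
            Info info = en.getValue();
            String model;
            if (info.childIn.isEmpty()) {
                model = info.text ? "(#PCDATA)" : "EMPTY";
            } else if (info.text) {
                model = "(#PCDATA|" + String.join("|", info.childIn.keySet()) + ")*";
            } else {
                List<String> parts = new ArrayList<>();
                for (Map.Entry<String, Integer> c : info.childIn.entrySet()) {
                    boolean optional = c.getValue() < info.seen;
                    boolean many = info.repeated.contains(c.getKey());
                    parts.add(c.getKey()
                            + (many ? (optional ? "*" : "+") : (optional ? "?" : "")));
                }
                model = "(" + String.join(",", parts) + ")";
            }
            out.append("<!ELEMENT ").append(en.getKey()).append(' ').append(model).append(">\n");
            for (Map.Entry<String, Integer> a : info.attrs.entrySet()) {
                out.append("<!ATTLIST ").append(en.getKey()).append(' ').append(a.getKey())
                   .append(" CDATA ")
                   .append(a.getValue() < info.seen ? "#IMPLIED" : "#REQUIRED").append(">\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        TinyDtdGuess guess = new TinyDtdGuess();
        for (String name : args) {
            try (InputStream in = new FileInputStream(name)) {
                guess.add(in);
            }
        }
        System.out.print(guess.toDtd());
    }
}
```

Once compiled, it could be run as java TinyDtdGuess time1.xml time.xml. If atomic is missing from one of the instances, the sketch marks it with ? in the content model for time, much as Relaxer does. A real tool has to cope with cases this sketch ignores, such as children that appear in different orders.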