18.3 The XML Registry
An XML Registry provides domain-wide configuration settings for XML parsers and XSL transformers, as well as for the resolution of external entities. It is a domain resource that can be administered on a per-server basis using the Administration Console. Once you associate an XML Registry with a server instance, the settings apply to all applications running on that server. A WebLogic domain may define multiple XML Registries, but only one XML Registry may be assigned to a given server instance. Of course, the same XML Registry may be shared by multiple servers in the domain. If a server instance doesn't have an XML Registry targeted to it, any server-side application that uses the JAXP interfaces will use the default parsers and XSL transformers shipped with your WebLogic distribution.
An XML Registry allows you to configure alternative parsers and transformers, instead of WebLogic's built-in parsers and transformers.
As we saw in earlier examples, an application that relies on JAXP does not need to use any code specific to the parser or transformer. The XML Registry lets you plug in different parsers and transformers without changing any code. The parser implementation returned by JAXP depends on the following conditions:
- If you have defined application-scoped parsers or transformer factories for an application EAR, WebLogic will use these configuration settings to determine the parser or transformer.
- If an application EAR doesn't have any such application-scoped XML configuration, WebLogic will look for an XML Registry that may be targeted to the server. If it exists, then the following occurs:
    - If the XML Registry defines a parser specific to the XML document being parsed, WebLogic will use this configured value.
    - Otherwise, WebLogic will choose from the default parsers defined in the registry.
- If there is neither an application-scoped configuration nor any XML Registry targeted to the server, WebLogic will use its built-in parsers.
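The lookup order above can be sketched as a small piece of plain Java. This is only a model of the precedence rules as described here, not WebLogic's implementation; the class name ParserLookupSketch, the method chooseFactory, and the placeholder factory names are all hypothetical.

```java
import java.util.Optional;

// Hypothetical model of the precedence rules described in the text.
// It is not WebLogic code and uses no WebLogic APIs.
final class ParserLookupSketch {

    /**
     * Returns the factory class name that wins under the described order:
     * application-scoped setting, then a document-specific registry entry,
     * then the registry default, then the built-in parser.
     * An empty registryDefault stands for "no XML Registry is targeted".
     */
    static String chooseFactory(Optional<String> appScoped,
                                Optional<String> registryDocumentSpecific,
                                Optional<String> registryDefault,
                                String builtIn) {
        if (appScoped.isPresent()) {
            return appScoped.get();
        }
        if (registryDefault.isPresent() && registryDocumentSpecific.isPresent()) {
            return registryDocumentSpecific.get();
        }
        return registryDefault.orElse(builtIn);
    }

    public static void main(String[] args) {
        // Placeholder names, used only to show which source wins.
        System.out.println(chooseFactory(
            Optional.empty(), Optional.of("doc.Factory"),
            Optional.of("registry.Factory"), "builtin.Factory"));
    }
}
```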
Thus, an XML Registry is a domain-wide resource that takes effect on a per-server basis. It determines the actual parser and transformer implementations used by all applications running on a server, provided they use the JAXP interface and don't have an application-scoped configuration. An XML Registry consists of the following:
- A list of default factories that will be used to create a parser or transformer.
- A list of external entity resolvers that map external entities to possible local URIs, with options for caching the entities as well.
- A list of XML factories that will be used for particular XML document types. Each document type is identified either by its root element or by its public and system identifiers.
An XML Registry acts as a deploy-time parser configuration for server-side applications running on a particular server instance. Document-specific parsers provide a simple yet powerful way to transparently alter the actual parser, without any change to the code.
18.3.1 Creating an XML Registry
In order to create an XML Registry, open the Administration Console, move to the Services/XML node in the left pane, and then select the "Configure a new XML Registry" option from the right pane. You will need to supply a name for the registry and the fully qualified class names for the SAX and DOM parser factories, as well as an XSL transformer. If any of these fields are left blank, WebLogic's default parsers will be used. As it is, the fields are initialized with the default values for WebLogic's built-in parser and transformer factories. After creating an XML Registry, select the Target and Deploy tab to associate the registry with particular server instances and make it available.
Suppose you've set weblogic.xml.babel.jaxp.SAXParserFactoryImpl as the SAX parser factory for an XML Registry, and the registry is targeted to server A. If no other configuration overrides this setting, any server-side application running on server A that uses JAXP will automatically use the FastParser as its SAX parser.
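One simple way to check which implementations are actually in effect is to ask JAXP for its factories and print their class names. The sketch below uses only the standard JAXP calls; the class name JaxpFactoryReport is our own. Run inside a server-side component, it should reflect whatever configuration applies to that server; run as a standalone program, it simply reports the JDK defaults.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Reports the concrete classes behind the JAXP factory lookups.
final class JaxpFactoryReport {

    static String saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    static String domFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    static String transformerFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("SAX factory:         " + saxFactoryClass());
        System.out.println("DOM factory:         " + domFactoryClass());
        System.out.println("Transformer factory: " + transformerFactoryClass());
    }
}
```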
18.3.2 Configuring Document-Specific Parsers
Once you configure the default parsers for an XML Registry, you can further specify document-specific parser factories. You can configure this by setting up a new Parser Select Registry Entry. This option is available from the Parser Select Entries node under the selected XML Registry. Once again, you will need to specify the fully qualified class names of the XML factories (you can ignore the defunct Parser Class Name field). In addition, you need to associate these XML factories with a specific document. You can specify the document type information in two ways:
- You can supply the public or system identifier that corresponds to a DTD. If a server-side application parses a document that includes a DTD reference with the same public or system ID, it will use the associated parser factories.
- You can supply the name of a root element. Because XML is case-sensitive, be sure to use the correct case for the tag name. If the XML document defines a namespace, include the namespace-prefix for the root element.
Remember, the Parser Select Entries associated with an XML Registry apply only to server-side applications that use the JAXP interface to acquire parser factories. When an application is about to parse a document, WebLogic tries to determine the document type by searching through the first 1000 characters of the document. If it does find a public or system identifier, or a root element that matches one of the parser select entries, WebLogic uses the parser specified for that document type.
This document-based selection of a parser is useful when you want to use parsers that are more optimal for specific document types (e.g., the FastParser for SOAP messages). Another benefit of document-specific parsers is that you can override the default XML configuration transparently, without requiring any code changes. However, because WebLogic needs to inspect the document type for any XML document, this feature may carry a small performance penalty.
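To make the inspection step more concrete, here is a rough sketch of how a document's type might be sniffed from its first 1000 characters. It is an illustration only: the class DocumentTypeSniffer and its regular expressions are assumptions of ours, WebLogic's real detection logic is not shown in this chapter, and the patterns ignore corner cases such as markup inside comments.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of document-type sniffing; not WebLogic's implementation.
final class DocumentTypeSniffer {

    // Only this many leading characters are inspected, as described in the text.
    static final int LOOKAHEAD_CHARS = 1000;

    // <!DOCTYPE name SYSTEM "id"> (the system ID of a PUBLIC declaration is ignored here)
    private static final Pattern SYSTEM_ID = Pattern.compile(
        "<!DOCTYPE\\s+[^\\s\\[>]+\\s+SYSTEM\\s+(['\"])(.*?)\\1");
    // <!DOCTYPE name PUBLIC "id" ...>
    private static final Pattern PUBLIC_ID = Pattern.compile(
        "<!DOCTYPE\\s+[^\\s\\[>]+\\s+PUBLIC\\s+(['\"])(.*?)\\1");
    // First start tag that is not a declaration, comment, or processing instruction.
    private static final Pattern ROOT_ELEMENT = Pattern.compile("<([A-Za-z_][\\w.:\\-]*)");

    private static Optional<String> find(Pattern pattern, int group, String document) {
        String head = document.substring(0, Math.min(document.length(), LOOKAHEAD_CHARS));
        Matcher m = pattern.matcher(head);
        return m.find() ? Optional.of(m.group(group)) : Optional.empty();
    }

    static Optional<String> systemId(String document) {
        return find(SYSTEM_ID, 2, document);
    }

    static Optional<String> publicId(String document) {
        return find(PUBLIC_ID, 2, document);
    }

    /** Returns the root element name, including any namespace prefix. */
    static Optional<String> rootElement(String document) {
        return find(ROOT_ELEMENT, 1, document);
    }

    public static void main(String[] args) {
        String doc = "<?xml version='1.0'?>"
            + "<!DOCTYPE foo SYSTEM 'http://oreilly.com/dtds/foo'><foo/>";
        System.out.println(systemId(doc).orElse("(none)"));
        System.out.println(rootElement(doc).orElse("(none)"));
    }
}
```

A registry-like lookup would then compare these values against its configured entries and fall back to the default factories when nothing matches.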
18.3.3 Configuring External Entity Resolution
An XML Registry also can define a number of entity resolution mappings. Each mapping associates an external entity with either a local file or contents of a remote URL. It also provides additional cache settings that determine when the external entity is fetched, and the length of time it will be cached. Creating an entity resolution mapping requires a little more effort than defining a document-specific parser. Select an XML Registry entry and then select the Entity Spec Entries option. You can now configure an entity resolution mapping by mapping a public or system identifier to an entity URI.
The URI specifies the location from which the external entity can be fetched. Its value is either the path to a local copy of the external entity, or a URL that refers to a remote copy. If the entity URI refers to a local file, the path is interpreted relative to the directory associated with the XML Registry: domainRoot/xml/registries/registryName, where registryName is the name of the registry. You have to create this directory manually. It will stock local copies of files that will be used to resolve external entities configured in the XML Registry.
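The directory convention can be illustrated with a few lines of java.nio code. This is a sketch under assumptions: the class RegistryPaths and its method are hypothetical, only the local-file case is handled, and stripping a leading slash so that the result stays under the registry directory is our own simplification.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that mirrors the directory layout described in the text.
final class RegistryPaths {

    /** Builds domainRoot/xml/registries/registryName/entityUri for a local entity URI. */
    static Path resolveEntityFile(Path domainRoot, String registryName, String entityUri) {
        // Assumption: a leading slash is treated as relative to the registry directory.
        String relative = entityUri.startsWith("/") ? entityUri.substring(1) : entityUri;
        return domainRoot.resolve("xml")
                         .resolve("registries")
                         .resolve(registryName)
                         .resolve(relative)
                         .normalize();
    }

    public static void main(String[] args) {
        // "mydomain" is a placeholder for the domain root directory.
        System.out.println(resolveEntityFile(Paths.get("mydomain"), "MyRegistry", "ext.txt"));
    }
}
```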
As an example, let's add external entity resolution to our XML Registry, MyRegistry. Start by creating a directory in the domain root called xml/registries/MyRegistry. Now create a file, called ext.txt, which holds the text . Next, configure an external entity mapping, with a system identifier of example, and specify a URI of ext.txt. We have now effectively mapped an external entity with a system identifier of example to substitution text contained in the ext.txt file. We can test this configuration by creating a server-side application (servlet, JSP, etc.) that parses an XML fragment that includes this entity reference:
// Grab the SAX parser using JAXP
SAXParserFactory spf = SAXParserFactory.newInstance( );
SAXParser sp = spf.newSAXParser( );
sp.parse(new java.io.StringBufferInputStream(
    " ]> &a;" ),
  new org.xml.sax.helpers.DefaultHandler( ) {
    public void startElement (String uri, String lName,
                              String qName, Attributes attr){
      System.err.println(qName);
    }
  }
);
Here, the SAX handler simply prints the name of each element encountered during the parse. If the external entity is resolved successfully, the resulting XML should be:
The parse yields the following output, as expected:
outside
side
in
So, an Entity Spec Entry allows you to map an external entity to a local file that holds the replacement XML. This kind of mapping is also useful for DTD references, which are treated like external entities. For example, you could create another mapping under MyRegistry that maps a document type with the system identifier http://oreilly.com/dtds/foo to a local entity URI of /dtds/foo.dtd.
Then, any XML document that includes a DTD reference with the same system ID will resolve the DTD to the local copy held under the /xml/registries/MyRegistry folder.
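Outside of WebLogic, the same idea can be approximated in plain JAXP with a SAX entity resolver. The sketch below is not how the XML Registry is implemented; it merely shows a system identifier being mapped to substitute content. The class EntityMappingSketch, the sample document, and the system identifier http://example.invalid/ext.txt are all made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Plain-JAXP approximation of an entity mapping; hypothetical, not WebLogic code.
final class EntityMappingSketch {

    /** Parses xml and returns element names in document order, resolving mapped entities. */
    static List<String> elementNames(String xml, Map<String, String> replacementBySystemId) {
        List<String> names = new ArrayList<>();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public InputSource resolveEntity(String publicId, String systemId) {
                    String text = replacementBySystemId.get(systemId);
                    // Returning null lets the parser fall back to its default resolution.
                    return text == null ? null : new InputSource(new StringReader(text));
                }

                @Override
                public void startElement(String uri, String lName,
                                         String qName, Attributes attr) {
                    names.add(qName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up mapping: system identifier -> replacement text.
        Map<String, String> mappings = new HashMap<>();
        mappings.put("http://example.invalid/ext.txt", "<side/>");

        String xml = "<!DOCTYPE outside ["
            + "<!ENTITY a SYSTEM 'http://example.invalid/ext.txt'>]>"
            + "<outside>&a;<in/></outside>";

        for (String name : elementNames(xml, mappings)) {
            System.out.println(name);
        }
    }
}
```

The registry achieves the same effect declaratively, so the application code needs no resolver of its own.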
18.3.4 Caching Entities
WebLogic provides a caching facility that improves the performance of external entity resolution. You can configure WebLogic's support for caching by adjusting when the external entity is fetched, and the period after which it is considered stale. The When to Cache field for an Entity Spec Entry determines when an external entity is fetched. If you select an XML Registry from the left pane of the Administration Console and then choose the Configuration tab from the right pane, you can set a value for the When to Cache field. WebLogic permits the following values for this setting:
cache-on-reference
This setting ensures that WebLogic caches the item after it has been referenced for the first time while parsing a document.
cache-at-initialization
This setting ensures that WebLogic caches the item when the server starts up.
cache-never
This setting instructs the server never to cache the item.
defer-to-registry-setting
This setting instructs the cache to use the value set in the XML Registry's main configuration page.
At the registry level, the When to Cache field can take any one of the first three values listed above; the fourth, defer-to-registry-setting, is meaningful only for an individual entity resolution mapping. By default, the XML Registry is configured to cache an external entity when it is first referenced, and if an entity resolution mapping doesn't override the XML Registry setting, it will inherit the value of the registry's cache setting.
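The inheritance rule can be captured in a few lines. This is a sketch of the behavior as described here, not WebLogic code; the enum and the class name WhenToCacheSketch are hypothetical.

```java
// Hypothetical model of how an entry's When to Cache value is resolved.
final class WhenToCacheSketch {

    enum WhenToCache {
        CACHE_ON_REFERENCE,
        CACHE_AT_INITIALIZATION,
        CACHE_NEVER,
        DEFER_TO_REGISTRY_SETTING
    }

    /** An entry that defers takes the registry's value; otherwise its own value wins. */
    static WhenToCache effective(WhenToCache entrySetting, WhenToCache registrySetting) {
        if (registrySetting == WhenToCache.DEFER_TO_REGISTRY_SETTING) {
            throw new IllegalArgumentException("The registry itself cannot defer");
        }
        return entrySetting == WhenToCache.DEFER_TO_REGISTRY_SETTING
            ? registrySetting
            : entrySetting;
    }

    public static void main(String[] args) {
        System.out.println(effective(WhenToCache.DEFER_TO_REGISTRY_SETTING,
                                     WhenToCache.CACHE_ON_REFERENCE));
    }
}
```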
Finally, you can adjust the Cache Timeout Interval setting for an entity resolution mapping, which determines the duration (in seconds) after which the cached entity is considered stale. A subsequent request to a cached entity that has become stale causes WebLogic to fetch the resource from the location specified by the URI. Otherwise, the server will continue to use the cached value of the entity resolution. Although you may specify a timeout interval for each entity mapping, you also can specify a value of -1 for the timeout interval. In this case, the actual timeout will be determined by the value of the Cache Timeout Interval setting associated with the server instance that the XML Registry is targeted to. All entity mappings with a timeout value of -1 will inherit the cache timeout setting for the targeted server itself.
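The timeout rules lend themselves to a short sketch as well. Again, this only models the behavior described above; the class CacheTimeoutSketch, its method names, and the use of millisecond timestamps are assumptions made for illustration.

```java
// Hypothetical model of the cache timeout rules; not WebLogic code.
final class CacheTimeoutSketch {

    // A mapping with this timeout inherits the targeted server's setting.
    static final int INHERIT_FROM_SERVER = -1;

    /** Returns the timeout (in seconds) that applies to an entity mapping. */
    static int effectiveTimeoutSeconds(int entryTimeoutSeconds, int serverTimeoutSeconds) {
        return entryTimeoutSeconds == INHERIT_FROM_SERVER
            ? serverTimeoutSeconds
            : entryTimeoutSeconds;
    }

    /** A cached entity is stale once more than the timeout has elapsed since it was cached. */
    static boolean isStale(long cachedAtMillis, long nowMillis, int timeoutSeconds) {
        return nowMillis - cachedAtMillis > timeoutSeconds * 1000L;
    }

    public static void main(String[] args) {
        // The values 120 and 30 are arbitrary sample timeouts.
        int timeout = effectiveTimeoutSeconds(INHERIT_FROM_SERVER, 120);
        System.out.println("Effective timeout: " + timeout + "s");
        System.out.println("Stale after 30s? " + isStale(0L, 30_000L, timeout));
    }
}
```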
The Cache Timeout Interval server setting can be found by clicking on the server in the left pane of the Administration Console and then selecting the Services/XML node. This setting determines the timeout period for all entity mappings whose Cache Timeout Interval has a value of -1. Three other settings on this screen are of interest:
XML Registry
This setting determines the name of the XML Registry targeted to the server instance. You can choose any one of the XML Registries defined in the domain. Of course, you also can change this value by selecting an XML Registry and assigning the server from the Targets panel. Make sure that you do not target more than one XML Registry to the same server.
Cache Memory Size
WebLogic can cache some of the external entities in memory. This setting specifies how much memory (in kilobytes) to set aside for this cache. It defaults to 500 KB.
Cache Disk Size
When the memory cache has reached its maximum allotted size, WebLogic persists the little-used external entities to disk. This setting determines the maximum size (in megabytes) for the disk cache. It defaults to 5 MB.
Using these settings, you can specify on a per-server basis the size of the cache for external entities, both in memory and on disk, and when to refresh a cached external entity.
The final option on this screen, Monitor XML Entity Cache, allows you to monitor how the cache is being used. Select this option if you need to access usage statistics such as the total number of cached entries, the frequency of timeouts, and the resource usage.