XML, Web Services, and the Data Revolution
By Frank P. Coyle
Chapter 2. The XML Technology Family
RDF is an effort to bring order to the Web. It is part of the W3C's Semantic Web initiative, an effort not to create a separate Web but to extend the current one in a way that gives information well-defined meaning, better enabling computers and people to work in cooperation. Because there are billions of pieces of data on the Web, the problem is getting to the information you really need. Most search engines fail miserably, returning thousands of unhelpful links because Web pages don't provide information about their content. However, some search engines do better than others because they use metadata.
Technically, metadata is data about data. The search engines Yahoo and Google use metadata to build useful search links. When you search Yahoo, you're searching through human-generated subject categories and site labels. Google, on the other hand, uses a method that ranks relevant Web sites based on the link structure of the Web itself. For example, Google interprets a link from page A to page B as a vote for page B by page A. The more votes (links) a page receives, the higher its rank. Also, votes cast by "important" Web pages count more than votes from "unimportant" pages. Either way, smart search requires metadata.
Metadata
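The link structure that Google exploits is one concrete form of metadata. The following is a minimal sketch of the link-as-vote idea, written as a simplified PageRank-style iteration. It is not Google's actual implementation: the function name `rank_pages`, the damping factor of 0.85, the iteration count, and the page names are all assumptions chosen for illustration.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Rank pages by treating every link as a vote for its target.

    links maps a page name to the list of pages it links to.
    Returns a dict of page name -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so a vote from a high-ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: B receives three votes, E receives one vote
# from the well-linked page B, and G receives one vote from the obscure page F.
links = {
    "A": ["B"],
    "C": ["B"],
    "D": ["B"],
    "B": ["E"],
    "F": ["G"],
}
ranks = rank_pages(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, B outranks G because it collects more votes, and E outranks G even though each has a single incoming link, because E's vote comes from a more important page.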
Metadata includes the indexing and organization required to retrieve library material such as books by author, title, or subject. It is the software infrastructure behind a large video store catalog that lets a customer find a movie directed by Quentin Tarantino or all movies where the director also appears in the film (Reservoir Dogs, Apocalypse Now). Then, when the customer gets the movie home, the metadata of the yellow pages lets them find the phone number for pizza delivery so there will be something to eat while watching the movie. The common thread here is information about information. In each case, there is a need for information about what you're looking for (the book's location, the video's name, the pizza shop's phone number) to zero in on your goal.
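The video store queries described above can be sketched in a few lines. This is only an illustration: the `Movie` record, the `CATALOG` list, and both query functions are hypothetical, and the cast lists are abbreviated sample data rather than complete credits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    """Metadata record for one title in a hypothetical store catalog."""
    title: str
    director: str
    cast: tuple  # abbreviated cast list, for illustration only


CATALOG = [
    Movie("Reservoir Dogs", "Quentin Tarantino",
          ("Harvey Keitel", "Tim Roth", "Quentin Tarantino")),
    Movie("Apocalypse Now", "Francis Ford Coppola",
          ("Martin Sheen", "Marlon Brando", "Francis Ford Coppola")),
    Movie("The Godfather", "Francis Ford Coppola",
          ("Marlon Brando", "Al Pacino")),
]


def directed_by(catalog, director):
    """Titles of all movies directed by the given person."""
    return [movie.title for movie in catalog if movie.director == director]


def director_appears(catalog):
    """Titles of all movies in which the director is also in the cast."""
    return [movie.title for movie in catalog if movie.director in movie.cast]


print(directed_by(CATALOG, "Quentin Tarantino"))
print(director_appears(CATALOG))
```

Neither query looks at the films themselves. Both are answered entirely from the records that describe the films, which is what makes them fast.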
Is metadata required? In theory, no. The brute-force approach (looking through a library one book at a time, wandering past video store shelves until you find a movie, or calling every possible number in your area code until you hit on pizza delivery) is always a possibility. But that would be far too time-consuming. Without metadata there wouldn't be time for much else besides brute-force searching.
Metadata: Beyond Search
Although metadata is most commonly used to find things, it is also used to support the business side of an enterprise. The video store uses metadata to determine how often videos are being rented, when it's time to move rentals to the for-sale bin, and who its best customers are. Running a viable video store operation would be impossible without metadata.
The Components of RDF
RDF is used to identify the commonality behind different ways of categorizing data and to represent that commonality in such a way that Web architects can use it to build new and more complex technologies. The Resource Description Framework, as its name implies, is a framework for describing and interchanging metadata. It is built on the following three definitions.
Resources
All things described by RDF expressions are called resources. A resource may be an entire Web page, such as the HTML document http://www.w3.org/Overview.html. A resource may also be a part of a Web page, such as a specific HTML or XML element within the document. A resource is anything that can have a URI; this includes all the Web's pages, as well as individual elements of an XML document.
Properties
Properties are specific aspects, characteristics, attributes, or relations used to describe resources. A particular property is a resource that has a name and can be used as a property, for example, Author or Title. In many cases, all we really care about is the name, but a property needs to be a resource so that it can have its own properties.
Statements
A statement consists of a resource, a property, and a value. These parts are known as the subject, predicate, and object of a statement. A typical statement is, "The Author of http://davenet.userland.com/2001/09/10/openSourceIn2001 is Dave Winer." The value can be just a string, for example "Dave Winer," or it can be another resource, as in the statement, "The Home-Page of http://davenet.userland.com/2001/09/10/openSourceIn2001 is http://davenet.userland.com." A specific resource together with a named property plus the value of that property for that resource is an RDF statement. The object of a statement (that is, the property value) can be another resource, specified by a URI, or it can be a literal, that is, a simple string or other primitive data type defined by XML. In RDF terms, a literal may have content that is XML markup but is not further evaluated by the RDF processor. Consider as a simple example the sentence, "Dave Winer is the creator of the resource http://davenet.userland.com/2001/09/10/openSourceIn2001." This sentence has the following parts: the subject (resource) is http://davenet.userland.com/2001/09/10/openSourceIn2001, the predicate (property) is Creator, and the object (literal) is "Dave Winer."
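The subject, predicate, and object of such a statement map naturally onto a three-field record. The sketch below models the two Dave Winer statements in plain Python. Everything in it is an assumption made for illustration: the `Resource`, `Literal`, and `Statement` classes, the `http://example.org/vocab#` namespace used to give the properties URIs, and the simplified N-Triples-style text output, which handles only basic escaping.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    """Anything identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain string value that is not interpreted further."""
    value: str


@dataclass(frozen=True)
class Statement:
    subject: Resource
    predicate: Resource  # a property is itself a resource
    obj: Union[Resource, Literal]


# Hypothetical namespace; a real vocabulary would publish its own URIs.
VOCAB = "http://example.org/vocab#"
ARTICLE = Resource("http://davenet.userland.com/2001/09/10/openSourceIn2001")
CREATOR = Resource(VOCAB + "Creator")
HOME_PAGE = Resource(VOCAB + "Home-Page")

statements = [
    # Object is a literal.
    Statement(ARTICLE, CREATOR, Literal("Dave Winer")),
    # Object is another resource.
    Statement(ARTICLE, HOME_PAGE, Resource("http://davenet.userland.com")),
    # Because a property is a resource, it can have properties of its own.
    Statement(CREATOR, Resource(VOCAB + "label"), Literal("Creator")),
]


def values_of(statements, subject, predicate):
    """All objects asserted for the given subject and predicate."""
    return [s.obj for s in statements
            if s.subject == subject and s.predicate == predicate]


def to_line(statement):
    """Render one statement as a simplified N-Triples-style line."""
    if isinstance(statement.obj, Resource):
        obj = "<" + statement.obj.uri + ">"
    else:
        escaped = statement.obj.value.replace("\\", "\\\\").replace('"', '\\"')
        obj = '"' + escaped + '"'
    return "<{}> <{}> {} .".format(
        statement.subject.uri, statement.predicate.uri, obj)


for statement in statements:
    print(to_line(statement))
```

Keeping `Resource` and `Literal` as distinct types mirrors the distinction in the text: a resource object can become the subject of further statements, while a literal is an end point.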
RDF Vocabularies
Properties standing alone, however, are not very useful. The expectation is that properties will be packaged, for example, as a set of basic bibliographic properties such as Author, Title, and Date. Over time, property collections or vocabularies will emerge in competition with each other, such as vocabularies for online learning or wine connoisseurship. This means that opinions, pointers, indexes, or anything that helps discovery will have high value. Diversity of ideas inevitably leads to a diversity of vocabularies, since anyone can come up with a vocabulary, advertise it, and charge a fee. The market will help the good ones survive. RDF is designed to have the following characteristics:
For Tim Berners-Lee, the real power of the Semantic Web will come about when program agents collect Web content from a variety of sources and exchange their results with other programs and agents. The effectiveness of these agents will increase as more Web content and services become available. The dream is that a Semantic Web will allow agents not explicitly designed to work together to transfer data by using RDF semantics that describe what the data really is all about.
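As a rough illustration of why shared semantics matter, the sketch below merges statements gathered by two independent agents and answers a question that neither could answer alone. It is a toy model, not an RDF implementation: statements are plain (subject, predicate, object) tuples, and the `example.org` URIs, the agent data sets, and the `merge` and `follow` helpers are all hypothetical.

```python
# Hypothetical URIs shared by both agents.
ARTICLE = "http://davenet.userland.com/2001/09/10/openSourceIn2001"
PERSON = "http://example.org/people#dave-winer"
CREATOR = "http://example.org/vocab#Creator"
HOME_PAGE = "http://example.org/vocab#Home-Page"
NAME = "http://example.org/vocab#Name"

# One agent knows who created the article.
agent_a = {
    (ARTICLE, CREATOR, PERSON),
}

# A second, unrelated agent knows facts about the person.
agent_b = {
    (PERSON, NAME, "Dave Winer"),
    (PERSON, HOME_PAGE, "http://davenet.userland.com"),
}


def merge(*sources):
    """Combine statement sets; identical statements collapse into one."""
    merged = set()
    for source in sources:
        merged |= set(source)
    return merged


def follow(graph, start, *predicates):
    """Follow a chain of predicates from a starting subject."""
    current = {start}
    for predicate in predicates:
        current = {obj for subj, pred, obj in graph
                   if subj in current and pred == predicate}
    return current


merged = merge(agent_a, agent_b)
# "What is the home page of the article's creator?"
print(follow(merged, ARTICLE, CREATOR, HOME_PAGE))
```

The merge works only because both agents use the same URI for the person and the same property URIs. That shared identification is what lets programs that were never designed together combine their data.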