XML Hacks: 100 Industrial-Strength Tips and Tools

   

FOAF provides a framework for creating and publishing personal information in a machine-readable fashion. As you learn FOAF, you will also get acquainted in a practical way with RDF.

The Friend of a Friend or FOAF project (http://www.foaf-project.org/) is a community-driven effort to define an RDF vocabulary for expressing metadata about people and their interests, relationships, and activities. Founded by Dan Brickley and Libby Miller, the FOAF project is an open, community-led initiative that is tackling head-on a small and relatively manageable piece of the W3C's wider Semantic Web goal of creating a machine-processable web of data. Achieving this goal quickly requires a network effect that will rapidly yield a mass of data. Network effects mean people. It seems a fairly safe bet that any early Semantic Web successes are going to be riding on the back of people-centric applications. Indeed, everything interesting that we might want to describe on the Semantic Web was arguably created by or involves people in some form or another. And FOAF is all about people.

FOAF facilitates the creation of the Semantic Web equivalent of the archetypal personal homepage: my name is Leigh, this is a picture of me, I'm interested in XML, and here are some links to my friends. And just like the HTML version, FOAF documents can be linked together to form a web of data, with well-defined semantics.

Being a W3C Resource Description Framework (RDF) application (http://www.w3.org/RDF/) means that FOAF can claim the usual benefits of being easily harvested and aggregated. And like all RDF vocabularies, it can be easily combined with other vocabularies, allowing the capture of a very rich set of metadata. This hack introduces the basic terms of the FOAF vocabulary, illustrating them with a number of examples. The hack concludes with a brief review of the more interesting FOAF applications and considers some other uses for the data.

4.7.1 The FOAF Vocabulary

Like any well-behaved vocabulary, FOAF publishes both its schema and specification at its namespace URI: http://xmlns.com/foaf/0.1/. The documentation is thorough and includes definitions of all classes and properties defined in the associated RDF schema. The schema, described using RDF Schema (http://www.w3.org/TR/rdf-schema/) and the Web Ontology Language, or OWL (http://www.w3.org/TR/owl-features/), is embedded near the end of the specification, which is written in XHTML 1.0 (http://www.w3.org/TR/xhtml1/). It can also be accessed directly (http://xmlns.com/foaf/0.1/index.rdf).

Rather than cover the whole vocabulary, this hack focuses on two of the most commonly used classes it defines: Person and Image. The remaining definitions cover the description of documents, projects, groups, and organizations; consult the specification for more information. The community also has a lively mailing list (http://rdfweb.org/mailman/listinfo/rdfweb-dev), IRC channel (http://www.ilrt.bris.ac.uk/discovery/chatlogs/foaf/), and project Wiki (http://rdfweb.org/topic/), which serve as invaluable sources of additional information and discussion.

4.7.2 Personal Metadata

The Person class is the core of the FOAF vocabulary. A simple example (Example 4-7) will illustrate its basic usage.

Example 4-7. Person.rdf

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Peter Parker</foaf:name>
    <foaf:mbox rdf:resource="mailto:peter.parker@dailybugle.com"/>
  </foaf:Person>
</rdf:RDF>

In other words, Person.rdf says there is a person with the name "Peter Parker" who has an email address of "peter.parker@dailybugle.com".

Publishing data containing plain-text email addresses is just asking for spam; to avoid this, FOAF defines another property, foaf:mbox_sha1sum, whose value is the SHA1 hash of the email address, computed over the full mailbox URI including the mailto: scheme prefix. The FOAF project Wiki has a handy reference page (http://rdfweb.org/topic/HashPrograms) pointing to a number of different ways of generating a SHA1 sum. The end result of applying this algorithm is a string that is, for practical purposes, unique to a given email address (or mailbox). The next fragment (Example 4-8) demonstrates the use of this and several other new properties that further describe Peter Parker.

Example 4-8. Augmented Person class

<foaf:Person>
  <foaf:name>Peter Parker</foaf:name>
  <foaf:gender>Male</foaf:gender>
  <foaf:title>Mr</foaf:title>
  <foaf:givenname>Peter</foaf:givenname>
  <foaf:family_name>Parker</foaf:family_name>
  <foaf:mbox_sha1sum>cf2f4bd069302febd8d7c26d803f63fa7f20bd82</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://www.example.com/spidey"/>
  <foaf:weblog rdf:resource="http://www.example.com/spidey/blog/"/>
</foaf:Person>

This is a slightly richer description of Peter Parker, including some granularity in the markup of his name through the use of foaf:title, foaf:givenname, and foaf:family_name. We also now know that Peter Parker is male (foaf:gender) and has both a homepage (foaf:homepage) and a weblog (foaf:weblog).
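The foaf:mbox_sha1sum value described above can be generated in many ways. As one minimal sketch, the following Python uses the standard hashlib module; the function name mbox_sha1sum is an illustrative choice, not part of FOAF or of any FOAF tool.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the hex SHA1 digest of an email address written as a mailto: URI."""
    address = email.strip()
    # The hash covers the whole mailbox URI, so add the scheme prefix if missing.
    if not address.lower().startswith("mailto:"):
        address = "mailto:" + address
    return hashlib.sha1(address.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 40-character hexadecimal string.
    print(mbox_sha1sum("peter.parker@dailybugle.com"))
```

Because the hash is computed over the exact characters of the URI, any stray whitespace or difference in letter case produces a different value, which would stop two descriptions of the same mailbox from matching.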

4.7.3 Identifying Marks

Keen-eyed RDF enthusiasts will already have noticed that neither of these examples assigns a URI to the resource called Peter Parker; that is, there is no rdf:about attribute on the foaf:Person resource, as in:

<foaf:Person rdf:about="...uri to identify peter..."/>

That's because there is still some debate around both the social and technical implications of assigning URIs to people. Which URI identifies you? Who assigns these URIs? What problems are associated with having multiple URIs (assigned by different people) for the same person? Side-stepping this potential minefield, FOAF borrows the concept of an inverse functional property (IFP) from OWL. An inverse functional property is simply a property whose value uniquely identifies a resource.

The FOAF schema defines several inverse functional properties, including foaf:mbox, foaf:mbox_sha1sum, and foaf:homepage; consult the schema documentation for the complete list. An application harvesting FOAF data can, on encountering two resources that have the same value for an inverse functional property, safely merge the description of each and the relations of which they are part. This process, often referred to as smushing, must be carried out when aggregating FOAF data to ensure that descriptions of the same resource gathered from different sources are correctly merged. As an example, consider the following RDF fragment:

<foaf:Person>
  <foaf:name>Peter Parker</foaf:name>
  <foaf:mbox_sha1sum>cf2f4bd069302febd8d7c26d803f63fa7f20bd82</foaf:mbox_sha1sum>
</foaf:Person>
<foaf:Person>
  <foaf:name>Spider-Man</foaf:name>
  <foaf:mbox_sha1sum>cf2f4bd069302febd8d7c26d803f63fa7f20bd82</foaf:mbox_sha1sum>
</foaf:Person>

Applying our knowledge that foaf:mbox_sha1sum is an inverse functional property, we can merge the descriptions together to discover that these statements actually describe a single person. Spider-Man is unmasked! While perfectly valid, this may not be desirable in all circumstances, and it highlights the importance of FOAF aggregators recording the source (provenance) of their data. This allows incorrect and potentially malicious data to be identified and isolated.

Before moving on, it's worth noting that while FOAF defines the email address properties (foaf:mbox_sha1sum and foaf:mbox) as uniquely identifying a person, this is not the same thing as saying that all email addresses are owned by a unique person. What the FOAF schema claims is that any email address used in a foaf:mbox (or encoded as a foaf:mbox_sha1sum) property uniquely identifies a person. If it doesn't, then it's not a suitable value for that property.
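Smushing on inverse functional properties can be sketched in a few lines of Python. This is only an illustration under stated assumptions: descriptions are plain dictionaries rather than RDF graphs, the IFPS tuple lists just the three properties named above, the source URLs are made up, and provenance is reduced to the set of documents each merged description came from.

```python
# Properties treated as inverse functional in this sketch.
IFPS = ("mbox", "mbox_sha1sum", "homepage")


def smush(descriptions):
    """Merge descriptions that share a value for any inverse functional property.

    Each description is a dict with a "source" URL and a "properties" dict
    mapping property names to lists of values.
    """
    groups = []
    for desc in descriptions:
        props = desc["properties"]
        new = {
            "keys": {(p, v) for p in IFPS for v in props.get(p, [])},
            "properties": {p: set(vs) for p, vs in props.items()},
            "sources": {desc["source"]},
        }
        remaining = []
        for group in groups:
            if group["keys"] & new["keys"]:
                # A shared identifying value: fold the existing group into the new one.
                new["keys"] |= group["keys"]
                new["sources"] |= group["sources"]
                for p, vs in group["properties"].items():
                    new["properties"].setdefault(p, set()).update(vs)
            else:
                remaining.append(group)
        remaining.append(new)
        groups = remaining
    return groups


people = smush([
    {"source": "http://example.org/a.rdf",  # hypothetical source document
     "properties": {"name": ["Peter Parker"],
                    "mbox_sha1sum": ["cf2f4bd069302febd8d7c26d803f63fa7f20bd82"]}},
    {"source": "http://example.org/b.rdf",  # hypothetical source document
     "properties": {"name": ["Spider-Man"],
                    "mbox_sha1sum": ["cf2f4bd069302febd8d7c26d803f63fa7f20bd82"]}},
])
```

Here people holds a single merged description carrying both names and both source URLs. Keeping the sources alongside the merged data is what lets an aggregator later back out statements from a document that turns out to be wrong or malicious.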

4.7.4 It's Who You Know

Having captured some basic metadata about Peter Parker, it's time to go a step further and begin describing his relationships with others. The foaf:knows property is used to assert that there is some relationship between two people. Precisely what this relationship is and whether it's reciprocal (if you know me, do I automatically know you?), is deliberately left undefined.

For obvious reasons, modeling interpersonal relationships can be a tricky business. The FOAF project has therefore taken the prudent step of simply allowing a relationship to be defined without additional qualification. It is up to other communities (and vocabularies) to further define different types of relationships.

Using foaf:knows is simple: one foaf:Person foaf:knows another. The following example (knows.rdf in Example 4-9) shows two alternative ways of writing this using the RDF/XML syntax. The first uses a cross reference to a person defined in the same document (using the rdf:nodeID attribute), while the second describes the foaf:Person in situ within the foaf:knows property. The end result is the same: Peter Parker knows both Aunt May and Harry Osborn.

Example 4-9. knows.rdf

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <foaf:Person rdf:nodeID="harry">
    <foaf:name>Harry Osborn</foaf:name>
    <rdfs:seeAlso rdf:resource="http://www.osborn.com/harry.rdf"/>
  </foaf:Person>
  <foaf:Person>
    <foaf:name>Peter Parker</foaf:name>
    <foaf:knows rdf:nodeID="harry"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Aunt May</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>

The other thing to notice is that, in addition to the foaf:knows relationship between Peter and Harry, a link has also been introduced to Harry's own FOAF document, using the rdfs:seeAlso property. Defined by the RDF Schema specification, the rdfs:seeAlso property indicates a resource that may contain additional information about its associated resource. In this case, it's being used to point to Harry Osborn's own FOAF description.

It is through the use of the rdfs:seeAlso property that FOAF can be used to build a web of machine-processable metadata; rdfs:seeAlso is to RDF what the anchor element is to HTML. Applications can be written to spider (or scutter, using the FOAF community's terminology; see http://rdfweb.org/topic/Scutter) these RDF hyperlinks to build a database of FOAF data.
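The idea of following rdfs:seeAlso links can be sketched as a small breadth-first crawler. The Python below is an assumption-laden illustration rather than the design of any existing scutter: the function names are invented, the fetch argument is a caller-supplied function that returns RDF/XML text for a URL, and relative URLs, politeness delays, and provenance tracking are all left out.

```python
import xml.etree.ElementTree as ET
from collections import deque

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def see_also_links(rdf_xml):
    """Return the rdf:resource target of every rdfs:seeAlso element."""
    root = ET.fromstring(rdf_xml)
    links = []
    for element in root.iter("{%s}seeAlso" % RDFS_NS):
        target = element.get("{%s}resource" % RDF_NS)
        if target:
            links.append(target)
    return links


def scutter(start_url, fetch, limit=100):
    """Walk rdfs:seeAlso links breadth-first, starting from start_url.

    Returns a dict mapping each successfully read URL to its document text.
    """
    documents = {}
    visited = set()
    queue = deque([start_url])
    while queue and len(documents) < limit:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            text = fetch(url)
            links = see_also_links(text)
        except Exception:
            # Unreachable or malformed documents are skipped, not fatal.
            continue
        documents[url] = text
        queue.extend(link for link in links if link not in visited)
    return documents
```

In practice fetch might wrap urllib.request.urlopen; passing it in as a parameter keeps the traversal logic separate from the network and makes the crawler easy to exercise against canned documents.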

4.7.5 Finer-Grained Relationships

The loose definition of foaf:knows won't fit all applications, particularly those geared to capture information about complex social and business networks. However, this doesn't mean that FOAF is unsuitable for such purposes; indeed, FOAF has the potential to be an open interchange format used by many different social networking applications.

The expectation is that additional vocabularies will be created to refine the general foaf:knows relationship to create something more specific. The correct way to achieve this is to declare new subproperties of foaf:knows. Stepping outside of FOAF for a moment, we can briefly demonstrate one example of this using the relationship schema created by Eric Vitiello and Ian Davis (http://purl.org/vocab/relationship).

The relationship schema defines a number of subproperties of foaf:knows, including parentOf, siblingOf, friendOf, and so on, from the namespace http://www.perceive.net/schemas/relationship/. Example 4-10 uses these properties to make some clearer statements about the relationships between Peter Parker and some of his contemporaries (relationship.rdf).

Example 4-10. relationship.rdf

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rel="http://www.perceive.net/schemas/relationship/">
  <foaf:Person rdf:ID="spiderman">
    <foaf:name>Spider-Man</foaf:name>
    <rel:enemyOf rdf:resource="#green-goblin"/>
  </foaf:Person>
  <foaf:Person rdf:ID="green-goblin">
    <foaf:name>Green Goblin</foaf:name>
    <rel:enemyOf rdf:resource="#spiderman"/>
  </foaf:Person>
  <foaf:Person rdf:ID="peter">
    <foaf:name>Peter Parker</foaf:name>
    <rel:friendOf rdf:resource="#harry"/>
  </foaf:Person>
  <foaf:Person rdf:ID="harry">
    <foaf:name>Harry Osborn</foaf:name>
    <rel:friendOf rdf:resource="#peter"/>
    <rel:childOf rdf:resource="#norman"/>
  </foaf:Person>
  <foaf:Person rdf:ID="norman">
    <foaf:name>Norman Osborn</foaf:name>
    <rel:parentOf rdf:resource="#harry"/>
  </foaf:Person>
</rdf:RDF>
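Declaring these properties as subproperties of foaf:knows matters because an application that understands only FOAF can still treat each statement as a plain foaf:knows. A rough Python sketch of that inference follows; the triples are simple tuples of prefixed names, and the SUBPROPERTY_OF table is a hand-written assumption covering only the three properties the text names, not a copy of the relationship schema.

```python
FOAF_KNOWS = "foaf:knows"

# Assumed subproperty declarations, limited to the names listed in the text.
SUBPROPERTY_OF = {
    "rel:parentOf": FOAF_KNOWS,
    "rel:siblingOf": FOAF_KNOWS,
    "rel:friendOf": FOAF_KNOWS,
}


def infer_superproperties(triples, subproperty_of=SUBPROPERTY_OF):
    """Return the triples plus those implied by subproperty declarations."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        seen = {predicate}
        parent = subproperty_of.get(predicate)
        # Walk up the hierarchy, guarding against accidental cycles.
        while parent is not None and parent not in seen:
            seen.add(parent)
            inferred.add((subject, parent, obj))
            parent = subproperty_of.get(parent)
    return inferred


statements = [
    ("#peter", "rel:friendOf", "#harry"),
    ("#norman", "rel:parentOf", "#harry"),
]
expanded = infer_superproperties(statements)
```

After this call, expanded also contains ("#peter", "foaf:knows", "#harry") and ("#norman", "foaf:knows", "#harry"), so a FOAF-only consumer loses the detail but keeps the link.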

While it is possible to model quite fine-grained relationships using this method, the most interesting applications will be those that can infer relationships between people based on other metadata. For example, have they collaborated on the same project, worked for the same company, or been pictured together in the same image? This brings us to Image, the other commonly used FOAF class.

4.7.6 Image Is Everything

Digital cameras being all the rage these days, it's not surprising that many people are interested in capturing metadata about their pictures. FOAF provides for this use case in several ways. First, using the foaf:depiction property we can make a statement that says "this person (resource) is shown in this image." FOAF also supports an inverse of this property (foaf:depicts) that allows us to make statements of the form "this image is a picture of this resource." Example 4-11 (Image.rdf) illustrates both of these properties.

Example 4-11. Image.rdf

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <foaf:Person rdf:ID="peter">
    <foaf:name>Peter Parker</foaf:name>
    <foaf:depiction rdf:resource="http://www.example.com/spidey/photos/peter.jpg"/>
  </foaf:Person>
  <foaf:Person rdf:ID="spiderman">
    <foaf:name>Spider-Man</foaf:name>
  </foaf:Person>
  <foaf:Person rdf:ID="green-goblin">
    <foaf:name>Green Goblin</foaf:name>
  </foaf:Person>
  <!-- codepiction -->
  <foaf:Image rdf:about="http://www.example.com/spidey/photos/spiderman/statue.jpg">
    <dc:title>Battle on the Statue Of Liberty</dc:title>
    <foaf:depicts rdf:resource="#spiderman"/>
    <foaf:depicts rdf:resource="#green-goblin"/>
    <foaf:maker rdf:resource="#peter"/>
  </foaf:Image>
</rdf:RDF>

This RDF instance says that the image at http://www.example.com/spidey/photos/peter.jpg is a picture of Peter Parker. It also defines a foaf:Image resource (i.e., an image that can be found at a specific URI), which depicts both Spider-Man and the Green Goblin. Elements from the Dublin Core Metadata Initiative (http://dublincore.org/), such as dc:title, are often added to FOAF documents to title images, documents, and so forth.

Notice also that Peter Parker is defined as the author of the image using the foaf:maker property, which is used to relate a resource to its creator. The dc:creator term from Dublin Core isn't used here due to some issues with its loose definition.
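Image metadata of this kind is what makes it possible to infer that two people have been pictured together. As a hedged sketch, the Python below groups depiction statements by image and reports each pair of co-depicted resources; the function name codepictions and the tuple-based input are illustrative assumptions, and a real application would first extract these pairs from parsed RDF.

```python
from collections import defaultdict
from itertools import combinations


def codepictions(depicts):
    """Map each pair of co-depicted resources to the images showing both.

    `depicts` is an iterable of (image, resource) pairs, one per
    foaf:depicts statement (or per foaf:depiction statement, inverted).
    """
    shown_in = defaultdict(set)
    for image, resource in depicts:
        shown_in[image].add(resource)
    pairs = defaultdict(set)
    for image, resources in shown_in.items():
        # Sorting gives each pair a single canonical ordering.
        for first, second in combinations(sorted(resources), 2):
            pairs[(first, second)].add(image)
    return dict(pairs)


STATUE = "http://www.example.com/spidey/photos/spiderman/statue.jpg"
PETER = "http://www.example.com/spidey/photos/peter.jpg"

together = codepictions([
    (PETER, "#peter"),
    (STATUE, "#spiderman"),
    (STATUE, "#green-goblin"),
])
```

With the statements from Example 4-11, together contains one entry, pairing #green-goblin with #spiderman through the Statue of Liberty image; Peter appears alone in his picture and so yields no pair.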

4.7.7 Publishing FOAF Data

Having created an RDF document containing FOAF terms and copied it to the Web, the next step is to link the new information into the existing web of FOAF data. There are a few ways to do this:

  • Through foaf:knows. Ensuring that people who know you link to your FOAF data via an rdfs:seeAlso link will make the data discoverable.

  • Through the FOAF Bulletin Board, which is a Wiki page that links to dozens of FOAF files. FOAF harvesters generally include the RDF view of this page as one of their starting locations.

  • Through auto-discovery. The FOAF project has defined a means to link to a FOAF document from an HTML page using the link element; several tools now support this mechanism.
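Auto-discovery can be illustrated with a short Python sketch built on the standard html.parser module. The attribute values it looks for (rel="meta" and type="application/rdf+xml") reflect the convention as commonly described and are an assumption here, since the text above does not spell them out; the class and function names are likewise invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FoafLinkFinder(HTMLParser):
    """Collect the href of each link element that points at an RDF document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        # Assumed convention: rel="meta" with an RDF/XML media type.
        if (attributes.get("rel", "").lower() == "meta"
                and attributes.get("type", "").lower() == "application/rdf+xml"
                and attributes.get("href")):
            self.links.append(urljoin(self.base_url, attributes["href"]))


def discover_foaf(html, base_url):
    """Return absolute URLs of FOAF documents advertised by an HTML page."""
    finder = FoafLinkFinder(base_url)
    finder.feed(html)
    return finder.links


page = ('<html><head><link rel="meta" type="application/rdf+xml" '
        'title="FOAF" href="foaf.rdf" /></head><body></body></html>')
found = discover_foaf(page, "http://www.example.com/spidey/")
```

For the sample page, found is a one-element list holding http://www.example.com/spidey/foaf.rdf, the relative href resolved against the page's URL.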

Beyond its initial developments, FOAF has potential in many areas. For example, painful web site registrations can become a thing of the past: instead, you can just indicate the location of your FOAF description, from which a script can grab your personal information. Throw in the relationships, and FOAF can be used as an interchange format between social networking sites, building an open infrastructure that allows end users to retain control over their own data. Also consider e-commerce sites such as Amazon, which have become successful because of their high levels of personalization. Getting the most from these sites involves a learning process in which the sites discover your interests, either through explicit preference setting or by adapting product suggestions based on your purchase history. With FOAF, there's the potential to capture this information once, in a form that can be used not just by one site, but by many. The user could then move freely between systems.

4.7.8 See Also

  • The FOAF application most immediately useful to the owner of a freshly published FOAF description is Morten Frederiksen's FOAF Explorer (http://xml.mfd-consult.dk/foaf/explorer/), which can generate an HTML view of FOAF data, complete with referenced images and links to other data. For example, look at my FOAF description at http://xml.mfd-consult.dk/foaf/explorer/?foaf=http://www.ldodds.com/ldodds.rdf.

  • FOAF Explorer provides an effective way to browse the network of FOAF data. With the addition of a JavaScript bookmarklet (see http://www.ldodds.com/blog/archives/000026.html) to perform auto-discovery, it's easy to jump from a blog posting to a description of that person and their interests.

  • The most elegant way to browse the relationships in the network of FOAF data is by using Jim Ley's foafnaut (http://www.foafnaut.org/), an SVG application that provides a neat visualization of foaf:knows relationships. Here's the foafnaut view starting from my description: http://blub.foafnaut.org/?sha1=1bca73e5c6916c738d6ec7cc0597ad0e395e7ace

  • plink is a social networking site: http://www.plink.org/

  • foafbot (http://usefulinc.com/foaf/foafbot) and whwhwhwh (http://swordfish.rdfweb.org/discovery/2003/10/whwhwhwh/) are IRC bots that provide conversational interfaces onto FOAF data.

  • Libby Miller's codepiction experiments demonstrate a novel way to explore FOAF image metadata: http://swordfish.rdfweb.org/discovery/2001/08/codepict/

  • FOAF-a-matic tool for FOAF generation: http://www.ldodds.com/foaf/foaf-a-matic

Leigh Dodds
