Web Site Cookbook: Solutions & Examples for Building and Administering Your Web Site (Cookbooks (OReilly))

Problem

You want to notify the world about updates on your web site with an RSS feed.

Figure 6-5. Three easy-to-use feed-parsing toolsFeed2JS, RSS-to-JavaScript, and CaRPcan reformat RSS feeds from other sites and display them on your site

Solution

Use a PHP script to get your web site content from a SQL database, format the records as valid RSS feed entries, and save the complete feed file to your web server.

First, define a connection to your SQL database:

$dbName = "mydatabase"; $conn = mysql_connect("dbhost","dbuser", "dbpassword") or die("Couldn't Connect."); $db = mysql_select_db($dbName, $conn) or die("Couldn't select database.");

Then define a function to generate the RSS code for individual items:

function createItem ($title, $description, $link, $pubdate) { $item = " <item>\n"; $item .= " <title>".$title."</title>\n"; $item .= " <description>".$description."</description>\n"; $item .= " <link>http://yourwebsite.com/".$link."</link>\n"; $item .= " <pubDate>".$pubdate."</pubDate>\n"; $item .= " </item>\n"; return $item; }

To save the feed file on your web server, define the full server path to the file in the variable $feedfile:

$feedfile = "/full/path/to/rss.xml";

Your web server must have write privileges for the directory the feed file will be saved in.

Also, define a variable for a properly formatted last-modified timestamp for the feed using the PHP date( ) function:

$lastBuildDate = date("D, j M Y H:i:s");

The $lastBuildDate variable will have the format Tue, 29 Mar 2005 09:58:08.

The content of the feed file will be stored in the variable $feedcontent. First, the script needs to add the RSS tags to $feedcontent that provide a general description of the feed:

$feedcontent = "<?xml version=\"1.0\"?>\n"; $feedcontent .= "<rss version=\"2.0\">\n"; $feedcontent .= " <channel>\n"; $feedcontent .= " <title>title</title>\n"; $feedcontent .= " <link>link</link>\n"; $feedcontent .= " <description>description</description>\n"; $feedcontent .= " <lastBuildDate>".$lastBuildDate."</lastBuildDate>\n"; $feedcontent .= " <language>en-us</language>\n"; $feedcontent .= " <copyright>copyright</copyright>\n"; $feedcontent .= " <webMaster>webmaster@yoursite.com</webMaster>\n"; $feedcontent .= " <ttl>1</ttl>\n";

Most of these tags are self-explanatory. The latest version of the RSS specification is 2.0, although some feeds are published using old versions. Links to specifications and RSS file validators are in the "See Also" section of this Recipe.

The <language> element allows feed aggregators to group feeds published in the same language. The <ttl> element stands for time to live and defines the number of minutes that must pass before a cached feed can be refreshed from the source.

Next, the script queries the database for 15 records:

$sql ="SELECT id, title, description, pubDate FROM table ORDER BY id DESC LIMIT 0,15"; $result = @mysql_query($sql, $conn) or die("Error #". mysql_errno() . ": " . mysql_error());

Depending on the structure of your database, you will likely want to tweak the query with a WHERE statement that checks an RSS_publish field for yes or true to make sure that the set of found records includes only the content you want to publish in your RSS feed.

With a while loop, the script uses the createItem( ) function to add a new item entry to $feedcontent for every found record:

while ($row = mysql_fetch_array($result)) { $id=$row['id']; $title=$row['title']; $description=$row['description']; $link = "feeds/ ".$id."/"; $pubDate=$row['pubDate']; $pubDate = date("D, j M Y H:i:s", $pubDate); $feedcontent .= createItem($title, $description, $link, $pubDate); }

Two more additions to $feedcontent close the file:

$feedcontent .= " </channel>\n"; $feedcontent .= "</rss>\n";

Finally, save the feed in a static file, using this code:

$file = fopen($feedfile, "w+"); fwrite($feedfile,$feedcontent); fclose($file);

To generate your feed file, you can request the PHP script manually by requesting its URL in a web browser as often as you care to. Or, make your life a little easier and set up a cron job to automate the process using the Unix utility wget. A cron task that generates the feed file at the top of every hour would look like this:

0 * * * * /usr/local/bin/wget http://yourwebsite.com/path/to/feed.php

Discussion

RSS files are XML documents, so they must conform to the W3C specification. A valid RSS file generated from the script presented in this Recipe will look something like this:

<?xml version="1.0"?> <rss version="2.0">; <channel> <title>Web Site Cookbook Daily Feed</title> <link>http://daddison.com/wscb/</link> <description>Late-breaking news and updates.</description> <lastBuildDate>Thu, 18 Aug 2005 17:10:00 -0500</lastBuildDate> <language>en-us</language> <copyright>Copyright 2005</copyright> <webMaster>webmaster@daddison.com</webMaster> <ttl>1</ttl> <item> <title>A Little RSS Goes a Long Way</title> <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit, set eiusmod tempor incidunt et labore et dolore magna aliquam. Ut enim ad minim veniam, quisnostrud exerc. Irure dolor in reprehend incididunt ut labore et dolore magna aliqua. Ut enim ad minim Ipsum dolor sit amet, consectetur adipiscing elit, set eiusmod tempor incidunt et labore et dolore magna aliquam. Ut enim ad minim veniam, quisnostrud exerc.</description> <link>http://daddison.com/wscb/post20050818171000.php</link> <pubDate>Thu, 18 Aug 2005 17:10:00 -0500</pubDate> </item> </channel> </rss>

The code elements are arranged hierarchically, similar to HTML. The first two lines define the document type (XML) and the RSS version. The entire body of the feed goes between one channel element. The lines before the first item list the feed's meta-information. Only title, link, and description are required (the bolded elements in the example). The other elements, such as language and copyright, are for the benefit of subscribers and aggregators.

All of the elements that can be placed within an item are optional, but either the title or the description must be present for the entry to appear in the feed. For a complete list of optional channel and item elements, check out the RSS specification web site listed in the "See Also" section in this Recipe.

The enclosure element allows feed publishers to attach a media object, such as an MP3 file, to a feed entry. For example, feeds that include a podcasta syndicated, periodic audio-based feed distributed via RSSuse the enclosure element. The enclosure element must include attributes for the file's URL, size, and type:

<enclosure url="http://yourwebsite.com/podcast.mp3" length="2038879" type="audio/mpeg"/>

Making sure your feed file validates before unleashing it on the world might involve setting or overriding the web server's notion of the content type for the feed file. RSS feeds should be served as application/rss+xml or application/xml. You can do this by creating an .htaccess file in the same directory as the feed file and adding this instruction to the web server:

AddType 'application/rss+xml; charset=utf-8' .xml

For more information about web server content types and character sets, see Recipe 4.2.

See Also

Recipe 1.8 explains cron in more detail. Recipe 2.7 explains wget in more detail. The W3C specification for XML documents is online at http://www.w3.org/TR/REC-xml. Feed Validator has an overview of the latest RSS specification at http://feedvalidator.org/docs/rss2.html and a web-based feed validating utility on the home page. Feeds also may be formatted according to the Atom specification.

Build It with a Blog

An alternative way to generate an RSS feed for your web site is with a blog. While the popular stereotype for a "blogger" tends toward either the self-styled political pundit or late-night diarist, you don't need to be either to benefit from this nearly ubiquitous form of web publishing.

Taken out of its pigeonhole in the public consciousness, blogging is nothing more than a new solution to the longstanding problem of managing web site content. And blogs do it in a way that's ideal for web surfers, allowing an author to publish short, categorized, time-stamped entries that canif the author choosesbe commented on or even updated by the audience.

Nearly all blogs are accompanied by an RSS feed. Simply put, the blog is the HTML view of the posts published on the author's web site (or on a free blogging service site such as Blogger) and the feed is a stripped-down version in XML format that can be viewed with news reader software or syndicated and formatted back into HTML on other web sites.

If you don't want or need to set up your own database to generate a customized RSS feed, there are several varieties of blog software and services at your disposal. With the aforementioned Blogger, you can be blogging within minutes of setting up a free account. Blogger blogs are hosted on owner Google's servers, but you can self-syndicate a feed onto your own site. You can exercise more control of your blog if you host it on your own site using software such as WordPress (http://wordpress.org) or Movable-Type (http://movable.org). A link to a site with a side-by-side comparison of all the popular blogging services and software is listed in the "See Also" section of this Recipe.

Finally, a word of caution about setting up a blog, especially if it's only for the purpose of generating a feed to display or syndicate elsewhere. Most blogging platforms allow readers to add comments to posts, a feature that both enriches and plagues the web. When left unattended, blog comments can quickly devolve into a new form of spam, as malicious web surfers and their automated minions fill the comments section with links to their own sites in an attempt to boost their own search engine rankings. If you don't want comments or don't have time to monitor the comments you will get on your blog, disable the commenting feature.

For more information on this method, see http://www.atomenabled.org. For a thorough comparison of several varieties of blog software and services, see Blog Software Breakdown at http://www.asymptomatic.net/blogbreakdown.htm.

Категории