Mining Google Web Services: Building Applications with the Google API

Many of the performance enhancements you can add to your Google Web Services application revolve around some type of offline storage. How your application uses offline storage makes a big difference in how much performance improves. In most cases, the vendor and product you choose will determine factors such as reliability and availability. The following sections describe a few of the issues you need to consider as part of your offline storage strategy.

Choosing the Correct Offline Storage Strategy

All of the examples so far in the book have considered offline storage from the perspective of storing the Google data that an application needs to handle multiple requests for the same information. This technique is the most important strategy to learn from a speed and reliability perspective. However, it's not the only strategy to consider because some applications simply don't benefit from this approach.

Another offline storage technique to consider is the use of a database containing links, page titles, phrases, categories, or other information. You can associate the information with specific non-changing data such as a research topic description. This database can help users quickly identify previously researched information. Even if you can't save other information from Google Web Services, saving these values can improve user efficiency by reducing the number of research requests the user has to make.
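A minimal sketch of such a link database might look like the following. It uses Python's built-in sqlite3 module rather than any of the database managers discussed in this book, and the table name (research_links), the column names, and the helper functions are all hypothetical choices made for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the hypothetical research-link store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS research_links (
               topic TEXT NOT NULL,
               url   TEXT NOT NULL,
               title TEXT,
               PRIMARY KEY (topic, url)
           )"""
    )
    return conn


def save_link(conn, topic, url, title):
    """Associate a link and its page title with a research topic."""
    # INSERT OR REPLACE keeps a single row per (topic, url) pair,
    # so saving the same link twice does not create duplicates.
    conn.execute(
        "INSERT OR REPLACE INTO research_links (topic, url, title) "
        "VALUES (?, ?, ?)",
        (topic, url, title),
    )
    conn.commit()


def links_for_topic(conn, topic):
    """Return the (url, title) pairs already researched for a topic."""
    rows = conn.execute(
        "SELECT url, title FROM research_links "
        "WHERE topic = ? ORDER BY title",
        (topic,),
    )
    return rows.fetchall()
```

An application could call links_for_topic() before it makes a new request and show the user what has already been found for that topic.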

Don't become fixated on output data when working with Google Web Services. All of the requests you make have value too. For example, you might create a database of recent request data to provide hints to the user. As the user fills in request data, your application can make suggestions for the next input value and reduce the chance the user will make an invalid request. Likewise, you can store requests that didn't work. Making a quick check for these requests before you send the data to Google will save a round-trip over the Internet and improve application speed. The application can also alert the user to the fact that the request won't work and make suggestions on how to change the request.

Tip  

Google provides the <searchComments> and <searchTips> elements to alert users to search conditions that didn't result in a valid return value. For example, a specific combination of search terms might not return any links. You can look up these search combinations in a database and simply return the stored <searchComments> and <searchTips> values to the user, rather than waste time making a round-trip call to Google Web Services.
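The idea of checking for known-bad requests before calling Google can be sketched as follows. This is only an illustration under several assumptions: the FailedRequestCache class, the search() function, and the send_to_google callable are hypothetical, and the response is modeled as a plain dictionary with "results", "searchComments", and "searchTips" keys rather than the actual SOAP response object. A real application would probably keep the failed requests in a database instead of a dictionary.

```python
class FailedRequestCache:
    """Remembers queries that returned no links (hypothetical helper)."""

    def __init__(self):
        self._failed = {}

    @staticmethod
    def _key(query):
        # Assumption: queries that differ only in spacing are equivalent.
        return " ".join(query.split())

    def remember(self, query, comments, tips):
        self._failed[self._key(query)] = (comments, tips)

    def lookup(self, query):
        return self._failed.get(self._key(query))


def search(query, cache, send_to_google):
    """Return a response dict, skipping the round trip for known failures.

    send_to_google is a stand-in for whatever function actually
    performs the Web service call.
    """
    known = cache.lookup(query)
    if known is not None:
        comments, tips = known
        return {
            "results": [],
            "searchComments": comments,
            "searchTips": tips,
            "from_cache": True,
        }

    response = dict(send_to_google(query))
    if not response.get("results"):
        # Store the explanation so the next identical request stays local.
        cache.remember(
            query,
            response.get("searchComments", ""),
            response.get("searchTips", ""),
        )
    response["from_cache"] = False
    return response
```

The second time a user submits a query that is known to fail, the application can display the stored comments and tips immediately and suggest a change to the request.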

Selecting a Database That Suits Your Needs

The database-related examples in the book rely on one of three database managers: SQL Server, Microsoft Access, or MySQL. You can find many other alternatives; these are just a sampling of what's available. I chose these three database managers because they represent several steps in functionality, performance, ease of use, and cost. It's important to get a database manager that you can afford, that will perform the tasks you need it to do, and is easy enough to manage, so you might choose any of the myriad alternatives on the market.

SQL Server is the most expensive of the three, but it also provides the best functionality. Microsoft constantly touts the speed of SQL Server, but it's a memory hog and can consume copious amounts of hard drive space. Given the complex tasks that SQL Server can perform, you might not find it as easy to use as Access, but the GUI-based tools do make it easier to use than the command-line interface of MySQL.

Microsoft Access is probably the easiest of the database managers to use because it provides a single GUI where you manage everything. Some developers feel that Access is only useful for local databases, but many small businesses rely on Access as their only multiuser database manager. From a speed perspective, Access is probably the slowest of the three database managers. However, it's very easy on hard drive use and relatively light on memory use as well.

Note  

You can use the Microsoft Data Engine (MSDE) as a substitute for SQL Server in some cases. It always works as a good alternative to SQL Server for local development. In some cases, you can also use MSDE as an alternative to SQL Server for groups of up to five people. Make sure that MSDE actually meets your needs before you spend time installing it, because this product doesn't include all the features of SQL Server. Because MSDE relies on the same DLLs as SQL Server in many cases, you'll also want to apply any required patches to ensure the integrity of your system. Learn more about MSDE at http://www.microsoft.com/sql/msde/default.asp.

MySQL is the least expensive of the three database managers; you simply download your copy from a Web site. You'll find that this database manager is the hardest of the three to use because almost everything happens at the command line. Some middle-sized companies use MySQL because it has the speed required to handle larger applications. It's also relatively easy on memory use, but about equal with SQL Server when it comes to hard drive space requirements.

Considering Database Storage Alternatives

Don't assume that you need a database to provide the benefits of local storage. It's true that you need a database when the usage requirements are high or you need long-term storage of information. However, storing the Google data isn't exactly rocket science; you can use any of a number of alternatives. For example, the "Using a Script to Call an XSLT Page" section of Chapter 3 discusses a technique where you rely on the capabilities of a browser to process the Google data. In this case, the simple fact that the browser caches pages it downloads from the Internet is enough to improve performance for multiple calls for the same data, at least for the local user. Obviously, the browser caching solution won't work for multiple sessions if the user sets the browser to clear the cache after each session.

Sometimes you need something a little more substantial than the cache provided by a browser, but still don't need permanent storage. In these situations, you can use an in-memory solution. The simplest solution is an array or other memory structure. However, many languages also provide actual caches you can use, and some vendors provide caches as part of their third-party products. For example, the DataSet object provided with Visual Studio .NET is actually a form of in-memory cache, but it definitely has database functionality and you can link it to a physical database.
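As a rough illustration of the simplest in-memory approach, the following sketch wraps a dictionary in a small class. The MemoryCache name, its methods, and the expiration policy are assumptions added for this example; the text above doesn't prescribe any particular aging scheme, and you could drop the time check entirely if stale results aren't a concern.

```python
import time


class MemoryCache:
    """A dictionary-backed cache whose entries expire after a set age."""

    def __init__(self, max_age_seconds=300, clock=time.monotonic):
        self._max_age = max_age_seconds
        self._clock = clock      # injectable so the cache is easy to test
        self._entries = {}       # key -> (time stored, value)

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        """Return the cached value, or None if it is missing or too old."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._max_age:
            # Discard stale data so the caller fetches a fresh copy.
            del self._entries[key]
            return None
        return value
```

An application would call get() with the search string first and only contact Google Web Services when the result is None, storing the new response with put() afterward. Everything in the cache disappears when the application ends, which is the trade-off of any in-memory solution.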

Tip  

You can find a great article on caching techniques for PHP and Web services ("Caching With PHP Cache_Lite") at http://www.devshed.com/c/b/PHP or http://www.devshed.com/c/a/PHP/Caching-With-PHP-Cache_Lite/. This article considers important issues, such as using the browser cache and implementing server-based caching.

It's possible to get by without using a database even when you need some form of permanent storage. For example, you could store a list of home page links in an XML file. In fact, you can easily extend XML storage to entire requests or other types of permanent data. At some point, the performance of such a system is going to become problematic, but it works for small amounts of data for one user and could even work for a few users if you use a central storage location for the XML files.
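One way to picture XML file storage for a list of home page links is the short sketch below, which uses Python's standard xml.etree.ElementTree module. The element names (links and link), the url attribute, and the two helper functions are hypothetical; any layout that your application can read back would work equally well.

```python
import xml.etree.ElementTree as ET


def save_links(path, links):
    """Write (url, title) pairs to an XML file, replacing its contents."""
    root = ET.Element("links")
    for url, title in links:
        item = ET.SubElement(root, "link", {"url": url})
        item.text = title
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_links(path):
    """Read the (url, title) pairs back from the XML file."""
    root = ET.parse(path).getroot()
    return [(item.get("url"), item.text or "") for item in root.findall("link")]
```

Because save_links() rewrites the whole file on every call, this approach fits the small, single-user case described above. Several users writing to a shared file would need some form of locking, which this sketch doesn't attempt.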
