Web Page Access Logging
18.14.1 Problem
You want to know more about a page than just the number of times it's been accessed, such as the time of access and the host from which the request originated.
18.14.2 Solution
Maintain a hit log rather than a simple counter.
18.14.3 Discussion
The hitcount table records only the count for each page registered in it. If you want to record other information about page access, use a different approach. Suppose you want to track the client host and time of access for each request. In this case, you need a log for each page rather than just a count. But you can still maintain the counts by using a multiple-column index that combines the page path and an AUTO_INCREMENT sequence column:
CREATE TABLE hitlog ( path VARCHAR(255) BINARY NOT NULL, hits BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, t TIMESTAMP, host VARCHAR(64), PRIMARY KEY (path,hits) );
To insert new records, use this query:
INSERT INTO hitlog (path, host) VALUES(path_val,host_val);
For example, in a JSP page, hits can be logged like this:
<%= request.getRemoteHost ( ) %> <%= request.getRemoteAddr ( ) %> UNKNOWN INSERT INTO hitlog (path, host) VALUES(?,?) <%= request.getRequestURI ( ) %>
The hitlog table has the following useful properties:
- Access times are recorded automatically in the TIMESTAMP column t when you insert new records.
- By linking the path column to an AUTO_INCREMENT column hits, the counter values for a given page path increment automatically whenever you insert a new record for that path. The counters are maintained separately for each distinct path value. (For more information on how multiple-column sequences work, see Recipe 11.15.)
- There's no need to check whether the counter for a page already exists, because you insert a new row each time you record a hit for a page, not just for the first hit.
- If you want to determine the current counters for each page, select the record for each distinct path value that has the largest hits value:
SELECT path, MAX(hits) FROM hitlog GROUP BY path;