Hack 21. Report on Any Date Criteria
A report may depend on ranges of dates that can be tricky to calculate. Monthly totals are pretty straightforward; but how about current month, last month, and year to date?
To report performance indicators you need to generate values for specific time periods. Business analysts commonly are interested in the current month compared to the preceding month, or the corresponding period in the preceding year. You can do all of this in SQL.
In the examples that follow, the original data is in a table, t. This table records individual incidents of paperclip usage. Every row contains the date (whn) and the number of paperclips used (v):
mysql> SELECT * FROM t; +------------+------+ | whn | v | +------------+------+ | 2006-01-07 | 53 | | 2006-01-13 | 46 | | 2006-01-18 | 99 | | 2006-01-19 | 15 | | 2006-01-26 | 9 | ...
4.3.1. Monthly Totals
If you want to see monthly totals, you must include the year and the month in the GROUP BY expression:
mysql> SELECT YEAR(whn), MONTH(whn), COUNT(v), SUM(v) -> FROM t -> GROUP BY YEAR(whn),MONTH(whn); +-----------+------------+----------+--------+ | YEAR(whn) | MONTH(whn) | COUNT(v) | SUM(v) | +-----------+------------+----------+--------+ | 2006 | 1 | 7 | 348 | | 2006 | 2 | 5 | 329 | | 2006 | 3 | 10 | 585 | | 2006 | 4 | 8 | 293 | | 2006 | 5 | 7 | 413 | | 2006 | 6 | 8 | 465 | | 2006 | 7 | 6 | 206 | | 2006 | 8 | 9 | 456 | | 2006 | 9 | 4 | 217 | | 2006 | 10 | 10 | 401 | | 2006 | 11 | 9 | 540 | | 2006 | 12 | 7 | 402 | | 2007 | 1 | 2 | 139 | | 2007 | 2 | 13 | 800 | | 2007 | 3 | 14 | 674 | | 2007 | 4 | 6 | 456 | | 2007 | 5 | 4 | 171 | +-----------+------------+----------+--------+
|
You can combine the year and month into a single number if you want. If you multiply the year by 100 and add the month you can be certain that each month will be distinct and sortable. Also, the resulting number is human readable and is suitable for processing as a string; you can easily turn it back into a date [Hack #19]:
mysql> SELECT 100*YEAR(whn)+MONTH(whn), COUNT(v), SUM(v) -> FROM t -> GROUP BY 100*YEAR(whn)+MONTH(whn); +--------------------------+----------+--------+ | 100*YEAR(whn)+MONTH(whn) | COUNT(v) | SUM(v) | +--------------------------+----------+--------+ | 200601 | 7 | 348 | | 200602 | 5 | 329 | | 200603 | 10 | 585 | | 200604 | 8 | 293 | | 200605 | 7 | 413 | | 200606 | 8 | 465 | | 200607 | 6 | 206 | | 200608 | 9 | 456 | | 200609 | 4 | 217 | | 200610 | 10 | 401 | | 200611 | 9 | 540 | | 200612 | 7 | 402 | | 200701 | 2 | 139 | | 200702 | 13 | 800 | | 200703 | 14 | 674 | | 200704 | 6 | 456 | | 200705 | 4 | 171 | +--------------------------+----------+--------+
|
4.3.2. Current Month
If you want to see the data for the current month you can test both month and year in the WHERE clause:
mysql> SELECT * FROM t -> WHERE MONTH(whn)=MONTH(CURRENT_DATE) -> AND YEAR(whn)=YEAR(CURRENT_DATE) -> ORDER BY whn; +------------+------+ | whn | v | +------------+------+ | 2006-06-07 | 96 | | 2006-06-11 | 4 | | 2006-06-12 | 78 | | 2006-06-12 | 36 | | 2006-06-17 | 57 | | 2006-06-29 | 74 | | 2006-06-29 | 94 | | 2006-06-30 | 26 | +------------+------+
If you want to see the data for the preceding month do not change MONTH(whn)=MONTH(CURRENT_DATE) to MONTH(whn)=MONTH(CURRENT_DATE)-1. If you do that you will get data from the wrong year when you run this query in January. Instead, you need to subtract one month from CURRENT_DATE. It is a little neater if you do the date calculation in a nested SELECT:
mysql> SELECT * FROM t, -> (SELECT CURRENT_DATE - INTERVAL 1 MONTH lastMnth) p -> WHERE MONTH(whn)=MONTH(lastMnth) -> AND YEAR(whn)=YEAR(lastMnth); +------------+------+------------+ | whn | v | lastMnth | +------------+------+------------+ | 2006-05-04 | 43 | 2006-05-23 | | 2006-05-06 | 55 | 2006-05-23 | | 2006-05-08 | 89 | 2006-05-23 | | 2006-05-15 | 87 | 2006-05-23 | | 2006-05-22 | 90 | 2006-05-23 | | 2006-05-29 | 22 | 2006-05-23 | | 2006-05-30 | 27 | 2006-05-23 | +------------+------+------------+
4.3.2.1. SQL Server
In SQL Server, you should use the DATEADD function. You can specify the interval as m for month and the number of months as -1:
SELECT * FROM t, (SELECT DATEADD(m,-1,GETDATE( )) lastMnth) p WHERE MONTH(whn)=MONTH(lastMnth) AND YEAR(whn)=YEAR(lastMnth);
4.3.2.2. Oracle
You need to use the dual table in a subselect in Oracle. Also, you can use the TO_CHAR function to match the year and month in a single function:
SELECT * FROM t, (SELECT CURRENT_DATE - INTERVAL '1' MONTH lastMnth FROM dual) WHERE TO_CHAR(whn,'yyyymm')=TO_CHAR(lastMnth,'yyyymm');
4.3.3. Year-to-Date Totals
To calculate year-to-date totals you must make sure that the year matches the current date and that the records occur on or before the current date:
mysql> SELECT COUNT(v), SUM(v) FROM t -> WHERE whn <= CURRENT_DATE -> AND YEAR(whn)=YEAR(CURRENT_DATE); +----------+--------+ | COUNT(v) | SUM(v) | +----------+--------+ | 42 | 2239 | +----------+--------+
4.3.3.1. Fiscal year to date
Suppose you are reporting over a 365-day period, but your year does not start on January 1. This is the case with reports over a fiscal year or tax year.
For instance, say that your fiscal year starts on April 6. Calculating which dates are in the current fiscal year is rather complicated; the easiest thing to do is to work with the number of days between January 1 and April 6. You can get SQL to do the calculation as follows:
mysql> select DATEDIFF(DATE '2006-04-06',DATE '2006-01-01'); +-----------------------------------------------+ | DATEDIFF(DATE '2006-04-06',DATE '2006-01-01') | +-----------------------------------------------+ | 95 | +-----------------------------------------------+
In SQL Server, the DATEDIFF function needs another parameter. You use 'd' to indicate that you want the result as the number of days: DATEDIFF('d', '2006-04-06','2006-01-01').
In Oracle, you can simply subtract dates to get the number of days between them as an integer: DATE '2006-04-06' - DATE '2006-01-01'.
Once you have this offset you can determine the relevant fiscal year by subtracting this from both the date to be tested and the current date. This means that you don't have to worry about the different cases. In this example, March 29, 2006 is in fiscal year 2005, but April 20, 2006 is in fiscal year 2006:
mysql> SELECT whn, -> YEAR(whn - INTERVAL '95' DAY) whnFiscalYear, -> YEAR(CURRENT_DATE - INTERVAL '95' DAY) currentFiscalYear -> FROM t -> WHERE whn IN (DATE '2006-03-29', DATE '2006-04-20'); +------------+---------------+-------------------+ | whn | whnFiscalYear | currentFiscalYear | +------------+---------------+-------------------+ | 2006-03-29 | 2005 | 2006 | | 2006-04-20 | 2006 | 2006 | +------------+---------------+-------------------+
You can then use this as a condition to ensure that you are reporting on only the current fiscal year:
mysql> SELECT MIN(whn),MAX(whn), COUNT(v), SUM(v) FROM t -> WHERE whn <= CURRENT_DATE -> AND YEAR(whn - INTERVAL '95' DAY)= -> YEAR(CURRENT_DATE - INTERVAL '95' DAY); +------------+------------+----------+--------+ | MIN(whn) | MAX(whn) | COUNT(v) | SUM(v) | +------------+------------+----------+--------+ | 2006-04-09 | 2006-06-17 | 28 | 1443 | +------------+------------+----------+--------+
The minimum and maximum relevant dates are included in the output. This is a complicated expression and you might want to check by hand that the MIN(whn) value shown matches the first record following 2006-04-06 and that the MAX(whn) value is the last record to the current date.
In SQL Server, you can invoke the DATEADD function: DATEADD('d', whn, -95).
Perhaps your fiscal year is not a fixed number of days relative to January 1. In that case, you really have no alternative than to record the start-of-year dates in a table.
Suppose the taxYear table was created with the following format:
mysql> SELECT * FROM taxYear; +------------+ | strt | +------------+ | 2005-04-06 | | 2006-04-06 | | 2007-04-07 | +------------+
You can perform the same calculation as performed earlier:
mysql> SELECT MIN(whn), MAX(whn), COUNT(v), SUM(v) -> FROM t, -> (SELECT MAX(strt) txStrt FROM taxYear -> WHERE strt < CURRENT_DATE) tx -> WHERE whn >= txStrt AND whn <= CURRENT_DATE; +------------+------------+----------+--------+ | MIN(whn) | MAX(whn) | COUNT(v) | SUM(v) | +------------+------------+----------+--------+ | 2006-04-09 | 2006-06-17 | 28 | 1443 | +------------+------------+----------+--------+