Published June 6th, 2008
Fun with Serial Dates
I ran into a bit of an oddity today while working with serial dates in SQL Server 2005 and MS Excel 2003. I’ve dealt with this before with Excel using FoxPro, but don’t recall the specifics. Anyway…
A serial date is a sequential number, starting on January 1st, 1900 (or if you want compatibility with a Mac serial date, 1904) that represents the number of days since any date to that point in time (you can also represent time by using a decimal, but I won’t get into that here). Excel has a bug (or a “Lotus 123 compatibility feature” as some like to call it) in it that counts Feb 29, 1900 as an actual date, when in fact 1900 was not a leap year.
Hence the first bit of fun. Run this TSQL code:
SELECT DATEDIFF(dd,'19000101','20080606')
The above query returns ‘39603′. So far so good. In SQL Server, you can now derive the date from the serial value by using convert:
SELECT CONVERT(DATETIME,DATEDIFF(dd,'19000101','20080606'))
…which gives me ‘2008-06-06 00:00:00.000′. So far so good.
The reason we’re using serial dates in the first place is to accommodate some models developed in Excel. The data is fed in directly from the database and calculations are performed.
When I take value 39603, put it into an Excel cell and re-format the column from General to Date, I get ‘June 4, 2008′! Two days off from what I get from SQL Server.
I already mentioned that Excel counts Feb 29, 1900. That’s one day. So why am I off by two? Here’s some more fun. In SQL Server, try the following:
SELECT CONVERT(DATETIME,0)
You get ‘1900-01-01 00:00:00.000′. But January 1st is supposed to be day one. Not day zero (must be a zero-based array sort of thing). When you try the same in Excel, you get a more appropriate (but equally odd) value of ‘January 0, 1900′. You’d think I’m dealing with scientific ephemeris or something.
SELECT CONVERT(DATETIME,1)
So, in SQL Server, we’re off by a day. The first day shouldn’t result in the second day (’1900-01-02 00:00:00.000′) of the year as the above TSQL shows.
This helps to explain the 2 day offset between Excel and SQL Server.
Come on Redmond! Work with me here!
To solve (read: work around) the problem, we’ve decided to stick with keeping Excel happy (against my recommendation, but at the end of the day, this is a business decision). Our data integration packages add two days to the serial date we calculate from a YYYYMMDD value. When data is moved to Excel from the data warehouse, no transformation is done on the serial value and Excel calculates as expected. When we use the serial date in a SQL Server context, we need to subtract 2 before the convert.
This is not a very good design and will undoubtedly cause someone problems some day.
I recommend storing the ISO YYYYMMDD format instead. But Excel won’t format a value such as 20080606 to a date. What a shame! If anyone knows of a way, or can enlighten me more about this subject — do not hesitate!
Data Warehousing and Business Intelligence consultant, with expertise in business analysis, data modeling, and data integration. Extensive experience developing vertical and integrated desktop and Internet applications spanning municipal, clinical, and financial industries.