Fun with Serial Dates
I ran into a bit of an oddity today while working with serial dates in SQL Server 2005 and MS Excel 2003. I’ve dealt with this before with Excel using FoxPro, but don’t recall the specifics. Anyway…
A serial date is a sequential number, starting on January 1st, 1900 (or if you want compatibility with a Mac serial date, 1904) that represents the number of days since any date to that point in time (you can also represent time by using a decimal, but I won’t get into that here). Excel has a bug (or a “Lotus 123 compatibility feature” as some like to call it) in it that counts Feb 29, 1900 as an actual date, when in fact 1900 was not a leap year.
Hence the first bit of fun. Run this TSQL code:
SELECT DATEDIFF(dd,'19000101','20080606')
The above query returns ‘39603′. So far so good. In SQL Server, you can now derive the date from the serial value by using convert:
SELECT CONVERT(DATETIME,DATEDIFF(dd,'19000101','20080606'))
…which gives me ‘2008-06-06 00:00:00.000′. So far so good.
The reason we’re using serial dates in the first place is to accommodate some models developed in Excel. The data is fed in directly from the database and calculations are performed.
When I take value 39603, put it into an Excel cell and re-format the column from General to Date, I get ‘June 4, 2008′! Two days off from what I get from SQL Server.
I already mentioned that Excel counts Feb 29, 1900. That’s one day. So why am I off by two? Here’s some more fun. In SQL Server, try the following:
SELECT CONVERT(DATETIME,0)
You get ‘1900-01-01 00:00:00.000′. But January 1st is supposed to be day one. Not day zero (must be a zero-based array sort of thing). When you try the same in Excel, you get a more appropriate (but equally odd) value of ‘January 0, 1900′. You’d think I’m dealing with scientific ephemeris or something.
SELECT CONVERT(DATETIME,1)
So, in SQL Server, we’re off by a day. The first day shouldn’t result in the second day (’1900-01-02 00:00:00.000′) of the year as the above TSQL shows.
This helps to explain the 2 day offset between Excel and SQL Server.
Come on Redmond! Work with me here!
To solve (read: work around) the problem, we’ve decided to stick with keeping Excel happy (against my recommendation, but at the end of the day, this is a business decision). Our data integration packages add two days to the serial date we calculate from a YYYYMMDD value. When data is moved to Excel from the data warehouse, no transformation is done on the serial value and Excel calculates as expected. When we use the serial date in a SQL Server context, we need to subtract 2 before the convert.
This is not a very good design and will undoubtedly cause someone problems some day.
I recommend storing the ISO YYYYMMDD format instead. But Excel won’t format a value such as 20080606 to a date. What a shame! If anyone knows of a way, or can enlighten me more about this subject — do not hesitate!
I'm a Quant Technical Specialist (Data Warehousing and Business Intelligence), with expertise in business analysis, data modeling, and data integration. I have extensive experience developing vertical and integrated desktop, Internet, and BI applications spanning municipal, clinical, and financial industries.

June 6th, 2008 at 10:35 am
Tod,
Don’t know if this helps any with your date transformation:
http://www.screencast.com/users/RDCrozier/folders/Jing/media/a6ed6820-912d-442f-befd-040f49677529
or
http://tinyurl.com/5pe6tl
Dave Crozier
June 9th, 2008 at 12:47 am
Tod,
In your data warehouse you can also add an extra column to the data dimension to capture the Excel date. Use this field when you move data to Excel. Make sure you keep on working with SQL Server conform dates in the data warehouse. At least it stays conform in one system. Otherwise you will have problems when somebody would go directly to the database.
What will you do when the logic in Excel changes?
Regards,
Tjomme
June 9th, 2008 at 1:56 am
Exactly! I suppose no one thinks about moving away from Excel anytime soon :-s
Storing bad data to accommodate external software is always a bad idea. Although we discussed adding an extra metric to the fact table (the serial date in our context is actually a business metric stored in the fact!), we determined that the size of the fact table would grow unnecessarily too large.
The date dimension is an interesting option, but because we’re dealing with a number that for us is semi-additive, doesn’t “feel” like it would work. But I’ll explore that option more.
Actually, adding the serial date in actual, SQL Server, and Excel varieties is an excellent idea in general!!