This article is part of a series discussing the Kimball Group’s “34 Subsystems of ETL“. The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual FoxPro.

You’ve just spent the better part of the year designing a perfect ETL system – from data profiling to data propagation. You’re in production, and all jobs are running on schedule in what seems to be an eerie harmony. You lean back in your chair. Victory, you think.

Then comes the call: Your database server crashed. Now what?

You must have a reliable backup system in place — and not just overnight tape backups conducted by your IT department. Those are a good first (baby) step, but they are usually not good enough, especially to help you recover from failures in mid ETL stride. Even with a fully redundant environment, you have got to be sure that you can re-run jobs from your staging area no matter where they failed. This is an ETL responsibility not to be passed off to IT. This means that an accessible backup of all staging data is essential. Remember, you should stage data at each major step in the ETL process (after load, after cleansing, conforming, and just before delivery). Doing so allows you to return to these staging points in order to recover from failures when they occur during ETL processing. I’ll discuss this more in my next post “ETL Subsystem 24: Recovery and Restart System”.

If you use a staging area, then you’re already part of the way there. The only thing you need to manage at this point is to plan and implement your backup strategy. Backing up staged data as it is being staged can be tricky. You can write to two different locations, on different machines or on different drives of the same machine. Consider this the poor-man’s approach to staging area backups. I’ve done this often; as it is easy and inexpensive (disk space is cheap and getting cheaper). More sophisticated mechanisms exist (for example, check out IBM Tivoli). Your policies, existing services, compliance concerns, and your ability to recreate lost data will all play a factor in determining what level of backups you need to make.

Storage of these backups, including the question on how long to keep the archives, is something you’ll need to consider as well. For critical data that cannot be reproduced, you will want to store it for a very long time in a safe location. For easily reproduced data, you might want to keep it long enough to appease the auditors. For the staging area, it might be safe to keep it only until the next run.

From Here

I won’t frame this subsystem using SSIS or Visual FoxPro, as I’ve done with the other subsystems to date. Backing up your data and files — unless you use the poor-man’s approach — is not a function that you can do using the ETL tooling. And doing a manual copy of staging data might be as simple as moving files from one directory to another, or using database replication to replicate changes from your main database to a linked, backup database on another machine.

You have a lot of decisions to make when it comes to your backup system. Next post, I’ll discuss recovery and restart, ETL Subsystem 24.

Tags: , , , , ,