This article is part of a series discussing the Kimball Group’s “34 Subsystems of ETL“. The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual FoxPro.
It will often become necessary for you to move data from the data warehouse to other applications, databases, and servers. This is a little different then scaling out your data warehouse to support multiple servers (an integration server, relational database server, reporting server(s), analytic server(s), etc.); typically, distributing data in this way is done via replication techniques. But this subsystem addresses another important need.
I usually refer to this need as “Data Provisioning“. Provisioning is a term borrowed from the telecommunications industry meant to describe the process of providing various telecom services to a customer. A data provision is simply a service that provides data to some client. The client can be an Excel sheet, XML file, application, operational system, or perhaps a master (MDM) database.
This subsystem is not the only place that you’ll be provisioning data to various client applications. BI Applications often allow various extracts and downloads that can be made available to the same list of clients. This usually comes by way of a downloadable report or extract. However, the data propagation subsystem provides a few additional benefits above what BI Applications can usually deliver. First, as an ETL process, you can utilize the full power of the ETL system to perform additional transformations on the data before delivery. Second, propagating data can be part of the ETL pipeline so that provisions can be made available as soon as the dimensional models are loaded (highly desirable if the provision is meant to update an operational or master data system).
Consider though that as an ETL function, these data provisions are best suited for well-defined, almost-never-changing requests for data. You would not use this subsystem to provide data in an ad hoc way to an analyst! You would, however, use this subsystem to provide a daily extract to your risk department for compliance reasons.
SQL Server 2005 Integration Services (SSIS)
The only challenge that you’re bound to face is providing the data in the various formats required by the client. I’ve found that Excel is particularly tricky to work with and as a consequence, I usually push my clients to accept CSV or Tab delimited files instead!
Hand Coding with Visual FoxPro (VFP9)
You’ll have no problems propagating data with FoxPro. In many cases, you can use SQL to get the data you need and simply write it to a table or file on disk. The COPY TO command supports different formats and it isn’t difficult to create outputs in other formats such as XML. Using the same techniques you would employ to connect to various servers (using ODBC drivers for example), you can push data into almost any environment. This is a strength of FoxPro that sets it apart from other ETL scripting languages.
From Here
There’s no real magic to propagating data. It’s an ETL function with the typical extract, transform, and load operations. You can treat it as part of your current ETL system; perhaps running provisions after your facts are loaded. Alternatively, provisions can be run on a schedule independent of normal ETL processing or can be run on demand as needed.
…
This post wraps up the “Data Delivery” function in the ETL environment. In my next post, I’ll move on to “ETL Management” functions including scheduling, backup, security and metadata.