34 Subsystems of ETL Data Integration
In this, and in the next series of posts, I will be exploring the 34 subsystems of ETL Data Integration (note: there used to be 38) as defined by the Kimball Group. I introduce the subsystems in this post, and then I will discuss how each fits (or does not fit) into both a SQL Server environment and in a hand-coded environment using VFP9.
The subsystem concept is a best-practice initiative formulated by The Kimball Group to help organizations design effective and efficient Data Integration environments for Data Warehousing using the Dimensional Model. As you’ll see, the subsystems are well-defined, battle-tested, and represent a relatively complete picture of what is needed for ETL Integration.
The Kimball Group categorizes the subsystems into 4 distinct groups: Data Extraction, Cleansing and Conforming Tasks, Data Delivery, and Management. Management is an often forgotten aspect of ETL (If the acronym was ETLM, we’d be OK!). You’ll notice that there are quite a few management-specific subsystems.
- Data Profiling
- Change Data Capture (CDC)
- Extraction
- Data Cleansing Subsystem
- Error Event Management
- Auditing
- Removing Duplicates
- Data Conformance
- Slowly Changing Dimensions (SCD)
- Surrogate Key Generator
- Hierarchy Manager
- Special Dimensions Manager
- Fact Table Builders
- Surrogate Key Management
- Bridge Table Builder
- Late Arriving Data Handler
- Dimension Manager
- Fact Table Provider
- Aggregate Generation
- OLAP Cube Builder
- Data Propagation Manager
- Scheduler
- Backup System
- Recovery and Restart
- Version Control
- Version Migration
- Work flow Monitor
- Sorting
- Data Lineage and Dependency
- Problem Escalation
- Paralleling and Pipelining
- Security
- Compliance Manager
- Metadata Repository
Data Extraction
Cleansing and Conforming Tasks
Data Delivery
Management
Data Integration architects have a lot to be thankful for the work and effort put forth by the Kimball Group. As you can probably guess, to do ETL right, you’ll need adequate time and resources. It is no wonder that data integration is by far the most time-consuming aspect of data warehousing.
In subsequent articles I will define each of these systems, provide examples, and discuss how they can be accomplished in SQL Server as well as VFP9. I have a lot of work to do, so stay tuned!
I'm a Quant Technical Specialist (Data Warehousing and Business Intelligence), with expertise in business analysis, data modeling, and data integration. I have extensive experience developing vertical and integrated desktop, Internet, and BI applications spanning municipal, clinical, and financial industries.

January 14th, 2008 at 4:15 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implimented in SSIS or hand coded in Visual […]
January 14th, 2008 at 4:16 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implimented in SSIS or hand coded in Visual […]
January 16th, 2008 at 3:23 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implimented in SSIS or hand coded in Visual […]
January 16th, 2008 at 3:25 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
February 8th, 2008 at 3:35 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
February 14th, 2008 at 7:45 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
February 26th, 2008 at 8:51 am
I am unable to access the 1 Data Profiling link as it points to 3-Data-Extraction
February 26th, 2008 at 4:34 pm
Sorry about that Dan. I’ve updated the entry. The link to that post is here:
http://blog.todmeansfox.com/2007/12/07/etl-subsystem-1-data-profiling/
Regards,
Tod
February 29th, 2008 at 11:28 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
March 17th, 2008 at 7:56 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
March 25th, 2008 at 7:07 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
April 15th, 2008 at 9:27 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
May 8th, 2008 at 4:13 pm
[…] requirements, standards, implementation details for the business requirements, ETL service details, Subsystem strategies, data access strategies, metadata maintenance plans, data storage, backup and recovery, […]
May 10th, 2008 at 10:44 am
[…] in future postings, and I’ll also be picking up where I left off with my postings on the ETL Subsytems 1 through 34. So if there is something you would like me to address first, let me […]
May 15th, 2008 at 6:47 am
[…] be getting back on track with my dissection of the 34 Subsystems of ETL (in which I give my thoughts on each subsystem through SSIS and Visual FoxPro). I took a few weeks […]
May 20th, 2008 at 11:50 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
May 26th, 2008 at 6:10 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
June 16th, 2008 at 9:06 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
June 22nd, 2008 at 1:25 pm
Todd,
Would you be publishing your views about “Bridge Table Builder” soon ?. The fact that the Bridge table is going to support many-to-many relationships in the form of composite keys is making me fear that it is not possible to prepare the data with the help of database itself. I am sorry If am sounding meaning less. I would be glad to give an example, but i figure that replies are meant to be short.
regards,
jana
June 23rd, 2008 at 2:07 am
Dear Jana,
Yes, I have been preparing posts on the following topics: Surrogate Key Management, Bridge Table Builder, and Late Arriving Data Handler. All should be online by early next week.
Bridge tables are quite interesting. I recently designed a model where bridge tables were essential — and in fact the only way — to make it all work. Consider a sale that involves many sellers, buyers, and items (an item can have multiple parts of different types). In the end, the dimensional model makes the design a breeze and although complex, is still easy for business users to understand and navigate. Performance is excellent as well.
Stay tuned, it will be online soon!
June 23rd, 2008 at 1:04 pm
todd,
That sounds like panacea, you have cracked every major problem a datawareshouse might face. Performance,Navigation,Many to Many Relationships. Will you be posting the Schema or ER diagrams by any chance ?
regards,
jana
June 24th, 2008 at 3:43 am
Great word! “Panacea”
I’ll certainly be posting more about data modeling, enterprise mappings, master data management, governance, and so on. My focus now is on getting through all ETL considerations first. Data Integration consumes most DW/BI resources, so I felt it would be a good place to focus my efforts…
June 24th, 2008 at 9:51 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
July 2nd, 2008 at 9:36 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
July 10th, 2008 at 9:23 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
July 14th, 2008 at 9:40 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
July 18th, 2008 at 8:44 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
August 11th, 2008 at 8:34 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
August 26th, 2008 at 9:39 am
[…] of what I discuss on Tod means Fox (and especially in my 34 Subsystem series) has to do with relational database engines (SQL Server 2005 and Visual FoxPro in particular). For […]
September 4th, 2008 at 8:17 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
September 16th, 2008 at 9:22 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
October 5th, 2008 at 8:09 am
[…] a long list of improvements to the team. Many items on my list were directly related to many of the ETL Subsystems that I’ve been discussing. Taking these to heart, we began the long process of refactoring, […]
October 20th, 2008 at 12:46 pm
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
October 28th, 2008 at 10:42 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
November 3rd, 2008 at 9:38 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
November 10th, 2008 at 11:12 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
November 20th, 2008 at 10:19 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]
December 26th, 2008 at 5:05 am
[…] The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual […]