Posts Tagged BI

Live from Kimball University: Day 1 (planning, requirements, dimensional models)

As you may know, I am currently in Amsterdam attending the Kimball University (organized by Quest for Knowledge / Q4K) training on the Data Warehouse Lifecycle. Margy Ross and Warren Thornthwaite are here and so far so good. Margy is teaching for the first two days, while Warren will take the final two days. I am not disappointed thus far.

Admittedly, much of the material in this course is not new to me. I’ve been exposed to dimensional modeling for some time now, and with a few projects under my belt, I’m starting to feel like I know what I’m talking about (which could be dangerous, I know). I’ve read the books, the articles, the tips, and have even written my own articles and blogged on many subjects. Attending these four day sessions will be primarily to validate and shore-up what I already know and hopefully correct some of my misconceptions and mistakes. With that said, I am getting a lot out of each session so far.

Project and Program Planning

Margy started out with a brief introduction, detailing the roots of the Kimball Group. Its pretty interesting to hear how they started out as 3 2-person companies and essentially merged into one: The Kimball Group. These six people have fundamentally changed the way data modelers and architects think about and approach data warehousing. What sets them apart, I believe, is that their ideas and theories stem from needs of the user.

Tremendous emphasis is placed on getting the requirements right. Margy pointed out that involving the users early and often is a key to success. Oftentimes, she said, users are involved early on and then forgotten once development begins. Three or Four months later, what is delivered might not be exactly what the end user had in mind. As the development team works, they uncover more and more hidden issues that must be resolved. The best person to help resolve many of these issues is the business user — not the data modeler or ETL architect.

In short, planning for a data warehouse must involve the business community. You need a strong and charismatic sponsor who has vision and influence, a data steward who cares about convention and quality, and a business analyst who can straddle the fence between the green grassy fields of IT and the barren wasteland of the business (kidding…).

Margy also developed a list of warning signs that ranged from failing to fill the various shoes I mentioned in the previous paragraph to acknowledging “BI/DW success is tied to user acceptance”. One of her points, not to underestimate the data cleansing workload, is finally starting to become less common. But it still remains one of the main reasons why data warehousing projects fail.

The Bus Architecture

Here’s something you don’t get by reading a book: Margy and the group didn’t quite like Ralph’s name for what we know as the Bus Architecture. I don’t have an electrical engineering background, but Ralph does. aw bus.thumbnail Live from Kimball University: Day 1 (planning, requirements, dimensional models)I suppose he felt that the way business processes and conformed dimensions related together reminded him of how a computer’s bus architecture worked. Essentially, being able to swap in and out components (If you have an electrical engineering background, or are just plain smarter than me on this, then please correct me if I’m off-base!).

I asked Margy what she would have called it, and she didn’t really know. I guess that’s why Ralph won-out. He was the only one with an idea!

For our purposes, the Bus Architecture “provides a standardized master set of conformed dimensions and conformed facts used throughout the data warehouse.” Click the image for a larger picture. This image comes from a SQL Server Mag article “Dimensional Modeling Basics” by Joy Mundy and Warren Thornthwaite.

Dimensional Modeling and Conformed Dimensions

I’ve talked at length in my articles and posts about dimensional modeling. Margy went over the basics today, starting out with a discussion on the differences between the Kimball approach and the Corporate Information Factory (CIF) approach pushed by Inmon. Although the dust between the two camps has settled in recent years, Margy was quite adamant about the benefits of dimensional modeling over the enterprise data warehouse concepts. Margy detailed these, but until I get my notes together, I’ll leave it at that.

snowflake1.thumbnail Live from Kimball University: Day 1 (planning, requirements, dimensional models)She also stressed the reliance on the star schema, and warned of snowflaking designs. Snowflaking occurs mainly when you attempt to normalize a dimension, usually by breaking the dimension up by its hierarchal data. For example, you might be tempted to break the “Store” dimension, which contains city, district, region, and state into separate City, District, Region, and State dimensions. Of course, doing so creates a snowflake and makes the model much more complex than it needs to be.

Conformed dimensions and facts are really the key to a successful dimensional modeling — and by extension data warehousing — project or program. They allow you to conduct drill-across queries, where you query data across business processes, and ultimately conformed dimensions and facts give your end users a consistent version of the data no matter what process they are looking at. Using the Bus Architecture, dimensions that are used across business processes are easy to spot.

Comments and Questions

I am also paying close attention to many of the questions and comments being made during the Q&A moments. The room is roughly composed of about 75% of people who are somewhat new to data warehousing and business intelligence. So it is a great opportunity to hear some of the questions the group has had for Margy.

Some questions: a.) How do you link a promotion to a product to show all the products that did not sell during a promotion? b.) Can you add the time to the date dimension? c.) Why should I conduct interviews during requirement gathering when I know the business well enough myself? d.) When determining the feasibility of implementing a business process, are there other important factors outside of data availability to consider?

I’ll elaborate more on all of these answers (and more) in future postings. In short, the answer key is simply: a.) Use a factless fact between product and promotion; b.) No, use either a Time dimension or store a timestamp in the Fact table; c.) You don’t know as much as you think you know and interviews will form the basis for future documentation; d.) available resources and technology issues can be overcome, perhaps the next biggest factor is dependencies between business processes.

And Lastly…

Almost everyone in attendance is a consultant. I am still getting used to this. I’ve seen and met more consultants in my short time here in Europe (about 5 months so far) than I have my entire life. Margy was also surprised. She did make a very interesting observation though. Consultants, who usually pay their own way through courses and seminars, typically stay away when times are tough. When the economy is good and/or they have plenty of work, consultants attend more conferences. Interesting. I suppose this could mean that the consultant business is doing quite well in this part of Europe.

I couldn’t get everything into this single post. I’ll be sure to post more notes as time goes on. For now, it’s on to tomorrow!

Tags: , , , , ,

1 Comment

Data Warehouse Lifecycle in Depth

I’m starting to get excited about next week. I’m leaving Brussels for Amsterdam at about 5pm on Monday to attend a four day course at Kimball University. Subject: The Data Warehouse Lifecycle in Depth. Instructors: Margy Ross & Warren Thornthwaite. I’m looking forward to diving into the DW Lifecycle with industry experts: those who literally wrote the book. Ralph Kimball won’t be there but I trust that Margy and Warren will give us a great week of sessions.

What’s really, really cool is that I’ll be getting the new edition of “The Data Warehouse Lifecycle Toolkit“:

What better way to start 2008 than with a major new data warehousing book! Joy, Warren, Margy, Bob and I have been working on the second edition of The Data Warehouse Lifecycle Toolkit for most of the last year. Virtually every paragraph of the first edition was changed, and more than 60% of the book is new material. The happy looks on our faces are both pride in our new baby, as well as relief from a long writing and editing stint.

I plan on blogging from the event, so if anyone is interested in getting some insight on any of the sessions, please let me know. I’ll be sure to take especially good notes!

Tags: , , , ,

3 Comments

What is MDM Anyway?

What is master data and what is the master data domain? What does it cover? Where is the business value in MDM? Is MDM a data warehousing function? How can business users be sold on the MDM investment?

While the industry dukes it out over answering these questions (you can’t get two experts or vendors to give you the same answer to any of those questions), I thought I’d share my thoughts. I admit, that my MDM experience to this point is minimal, having only been involved in one official MDM project (and only at the planning and data modeling level). But nevertheless, I do have some of my own ideas.

The bottom line, though, is that MDM is important and should not be overlooked when planning for new business intelligence or enterprise integration initiatives.

What MDM is

Master Data Management, an information activity, is the process of ensuring that an organization’s data (including its metadata) is consistent, reliable, accessible, distinctive and well defined. This master data, once it meets these requirements, can then be used by the enterprise as a system of record for key business entities and attributes.

There are different types of MDM: Analytical (A-MDM), Operational (O-MDM), and Enterprise (E-MDM). I see it like this: A-MDM is related to data warehousing and would therefore be implemented solely as a compliment to (or a reaction from) a data warehouse/business intelligence project (I don’t agree that simply having conformed dimensions constitutes “having” A-MDM, though). O-MDM is about collecting, managing, and redistributing master data to be used by operational systems (which is largely an effort in synchronization). E-MDM, is the Holy Grail, and would be a combination of both O-MDM and A-MDM. E-MDM reminds me of Enterprise Data Modeling (EDM), in that you truly need a 360-degree view of the business to make it work (READ: get business and IT together for tea).

Data governance and stewardship, as well as data quality management, data integration, and service-oriented architecture (SOA) are all functions and processes related in some way to Master Data.

What MDM is Not

MDM is not a technology. It is a business function. You are either managing master data, or you are not. It is unlikely that any single approach or software technology will solve the Master Data problem. It drives me nuts to hear about how vendors sell MDM (and BI, for that matter), but I digress…

The work of developing and maintaining master data is not a data warehousing function. To restate: Data warehouses are not developed to create master data (even A-MDM). Some will disagree with me on this. In my opinion/observation, MDM is a separate, highly important activity that works in conjunction with the data warehouse and all other applications and processes that exist in the enterprise.

Other Thoughts

I think that a common mistake made with regard to planning for, and implementing MDM, is that there is not enough emphasis on the data quality gains that will be realized. In addition, the data integration process for data warehousing projects will be infinitely easier, as the complex work of conforming dimensions and attributes will already be complete (at least in definition, with plenty of good metadata to use as a reference).

Master data, in many ways, provides a new way of looking at data as an asset with tangible, strategic value.

I’ll post more on this topic as time rolls on. As always, thoughts, questions, and criticisms welcome!

Tags: , , ,

2 Comments

Trends in Business Intelligence

I had the pleasure of attending an excellent TDWI Benelux Chapter meeting in Antwerpen, Belgium last Thursday. Presenting were Steve Hoberman and Davy Nys. Steve discussed the latest trends in business intelligence, including the drivers that influence them. Davy’s presentation, which I’ll talk about in a future posting, discussed the place for open source technologies in a BI environment.

Steve Hoberman

I have been reading Steve’s work for a few years now. I first discovered him in print when I picked up his book “Data Modeler’s Workbench: Tools and Techniques for Analysis and Design”. Unfortunately (for me), I lent the book to a former colleague some time ago, and will likely not get it back! It looks like I’ll need to re-order. You can keep up with Steve at The Data Administrator’s Newsletter website, where Steve is a columnist. His writings are excellent. He offers a great deal of insight and experience into data modeling and related topics.

So what are the trends?

Steve identified five drivers and five trends. I don’t have access to his Power Point slides, so I will do my best in regurgitating what I heard based on my notes and memory. First, the drivers, which include:

  • Cheaper storage and computing;
  • businesses are more BI savvy and demand more of IT;
  • increased pressure for business and IT to perform well;
  • struggling global economy; and
  • scarce IT resources.

These drivers have contributed to the following five trends:

  • The merging of data warehouses and operational data stores;
  • increased integration efforts;
  • excitement around unstructured data;
  • pressure to deliver solutions correctly the first time; and
  • the need for BI personnel to wear multiple hats.

Certainly, Steve explained the above with more elegance and grace. Hopefully, I haven’t butchered his thoughts too much.

The point though, is that BI is heading in a particular direction. I’ve had a sense of this for the past couple of years:

  • I have witnessed firsthand the merging of DW and ODS.
  • I have seen a rise in integration efforts — including application and data.
  • Unstructured data and text mining are certainly talked about now as integral to future BI initiatives.
  • Because of the fact that business users are now more involved in BI projects (it used to be that IT had to struggle to obtain business sponsorship), the ideas come faster than the ability of IT to implement them, meaning that IT must strive to get them right the first time.
  • Lastly, I do wear multiple hats: DBA, developer, integration architect, researcher, business analyst, project manager, etc.

I would be interested in hearing your thoughts on this subject. Feel free to write me at Tod at Blackstone Providence dot com, or simply just make a comment to this post. You can also reach Steve at his website, or visit TDWI for more details on events that might be coming in your area.

Tags: , , ,

No Comments

Compare a String to a List of Formats

Conforming data and building master data are two very important aspects of data warehousing, migrating legacy systems, and federating reports. A common task is to compare a string to a known format and return the matching value. For example, you may need to take parcel identifier information from multiple towns and do some smart parsing to obtain atomic map, blocks, and lots. Town A might store their Parcel ID in the format MMM-BBB/LLL, while another in the format MM/LL.BBB. You could store all known formats in a table, do some magic to convert all parts to a format code, and finally find the match in the formats table.

In VFP, there are many ways to compare the contents of a string to a particular format. TRANSFORM() and LIKE() come to mind. Using TRANSFORM, you can do something like the following:

lcTestVal = "SOMETEST_VAL"
? TRANSFORM(lcTestVal,"XXXXXXXX_XXX") == lcTestVal     && returns .T.
? TRANSFORM(lcTestVal,"XXXXXXX_XX") == lcTestVal    && returns .F.

LIKEwise, you can do some comparisons with LIKE:

lcTestVal = "SOMETEST_VAL"
? LIKE("????????_???",lcTestVal)    && returns .T.
? LIKE("???????_??",lcTestVal)    && returns .F.

But in some cases, it might be necessary to compare a single string against multiple formats, returning the format that the string matches (a common task when conforming data for data warehousing). There are a lot of ways to handle this, but simply using TRANSFORM or LIKE won’t do it alone.

Take the following example, where we have three different account number formats that come from three disparate source systems. You could manage this easily in FoxPro:

lcFormat1 = "NCCCCCN"
lcFormat2 = "CCCNNNNNN"
lcFormat3 = "NNNNNNNCCC"
 
DIMENSION aTestValues [10]
aTestValues[1] = "3UUFGP7"
aTestValues[2] = "MCK000989"
aTestValues[3] = "0090892LLG"
aTestValues[4] = "9ABCD2"
aTestValues[5] = "JKL230945"
aTestValues[6] = "JKLM00989"
aTestValues[7] = "JJ2000989"
aTestValues[8] = "3U99G7"
aTestValues[9] = "3UUFGP"
aTestValues[10] = "7XYZA1"
 
FOR x = 1 TO 10
 
    lcTestValMasked = CHRTRAN(CHRTRAN(UPPER(aTestValues[x]),"ABCDEFGHIJKLMNOPQRSTUVWXYZ","CCCCCCCCCCCCCCCCCCCCCCCCCC"),"1234567890","NNNNNNNNNN")
 
    ? ICASE( lcTestValMasked == lcFormat1 , "String " + aTestValues[x] + " matches format 1",;
            lcTestValMasked == lcFormat2 , "String " + aTestValues[x] + " matches format 2",;
            lcTestValMasked == lcFormat3 , "String " + aTestValues[x] + " matches format 3",;
            "String " + aTestValues[x] + " does not match any format")
 
NEXT

In the above example, there are three acceptable format codes that we’re using to compare against 10 values. Using ICASE, we can easily determine which codes match the format string. The inner CHRTRAN converts all character values to the letter “C”, while the outer CHRTRAN converts numbers to “N”. If we did the numbers first, our “N”s would become “C”s! Of course, you can use “X” and 9, “A” and “#”, or whatever letters you want to use as your mask.

Ideally, the format codes would belong to a table which would contain additional information about the code, such as where it came from and its status (active or inactive). Then, a loop through the table, or a seek on the format column would point you to the corresponding record.

I would love to hear some additional solutions on this. If you have any, please comment!

Tags: , , , , , ,

No Comments

ETL Subsystem 2: Changed Data Capture

This article is part of a series discussing the Kimball Group’s “34 Subsystems of ETL“. The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual FoxPro.

Managing changed data is a major challenge. Sometimes referred to as “incremental updates”, Changed Data Capture is essential for data warehousing operations, as it keeps data volume low and helps to manage the synchronization between the source and the warehouse.

Without a Changed Data Capture methodology, you would be forced to blow away and reload your data warehouse every time you needed it refreshed. If you only need your data warehouse once a month, this might be OK, but that would be rare.

Here’s the definition that I’ll be using:

Changed Data Capture
Changed Data Capture (CDC) is a method of identifying changes made to a source database or file for the purposes of integrating the data into the data warehousing pipeline. CDC reduces data volume and processing needed for the data warehouse.

Accomplishing this is easier said than done. In some cases you will need to trust the source system to adequately identify its changes, in others you’ll need to figure it out on your own using a brute force row-by-row comparison. If you have access to, and can work with transaction logs of the source system, then this route may be best.

If you can trust the source system, you may be relying on a “last updated” datatime stamp stored in an audit table or in the row of the table that contains the change. You may also have some versioning information available as well. In this scenario, you rely on the source system to tell you when a change was made. Your ETL solution will look for these changes and only select that data for further processing.

The danger is that you are trusting a source system to report all changes faithfully. If you can access the logs, or if the changes are logged using a database trigger, than your trust is likely well placed. If the source relies on application logic to maintain this information, you should consider a different approach. It is quite possible that a database can change through a back-door, bypassing the application logic that would normally track the change. Steer clear.

If the source cannot tell you what has changed (or you simply don’t trust it), then you will need to do a comparison. Depending on the size of the project, it would seem to make most sense to do this type of comparison in the staging area — avoiding any contact with your dimensions in the dimensional model. Touching the Dimension for this purpose would result in a performance hit and could affect any analysis or queries currently being run against the data. In the “Hand Coding” section below, I outline a brute force method that relies on a staging scheme and some SQL to determine changed data for further processing.

In general, the CDC process involves the following steps:

  1. Identify changed rows if possible
  2. Do a lookup on the dimension or staging table for the natural key of the record in question
  3. For Dimensions, handle Type 1 and Type 2 changes using a Script component (see Dimensional Modeling: Loading The Slowly Changing Dimension for the logic behind this).
  4. Perform the updates and inserts
  5. Handle deletes through negation techniques


SQL Server 2005 Integration Services (SSIS)

SQL Server 2005 will allow you to do all CDC work within SSIS. For example, if the source system reports changes, you can write an extract (utilizing variables set during package execution) that will look for the specific change indicators whether it be a date range, version, or some status flag.

For brute force comparisons on Dimension tables, you can utilize the Slowly Changing Dimension task. I have already discussed the SCD at length in my post “Dimensional Modeling: Loading The Slowly Changing Dimension“, so please read that if you need more information about how a SCD is loaded and maintained. For the purposes of CDC, the only change I would make to that post is to use a staging area for the task instead of the production database. This will allow you to focus processing efforts on identifying changes, filtering out any unnecessary data in the process which will speed up processing down the line.

If the Slowly Changing Dimension task won’t work in your scenario (perhaps you don’t have dimensions available to check for new and changed rows or you have very, very large dimensions which the SCD task is not great at handling), then you can use SQL, checksums (see System.Security.Cryptography for more information), or a variety of other SSIS transforms to do the trick. In the next section, I show how SQL can be used.

Update: For SQL Server 2008 Enterprise Edition, there will be a CDC feature for tables (for DML only). I’m not too thrilled about this new feature, and in fact, I’m not sure if it will really work in a true ETL environment. But, I have not evaluated it, so I could be way off base. Check out these blog postings for more information:

Hand Coding with Visual FoxPro (VFP9)

In the following code example, I present a brute-force CDC solution in VFP. This solution, as discussed in my article “From Source to Warehouse: A Microsoft Visual FoxPro Data Integration Strategy” and at FoxForward this past September, relies on using VFP tables in a staging directory called sometable_curr (which represents the fresh data) and sometable_prev which represents the data from the previous run. Comparing these two data sets for changes is fairly straightforward.

The code for this is located in my 2007 Fox Forward Presentation Examples (zip file, prgs in file). Specifically, open program files “4 – extract.prg” and “5 – changed data capture.prg”. The examples use the Northwind Database.

Here are the bare-bones (refer to the programs above for more details)


*-- open the current and previous tables
USE sometable_curr IN 0 
USE sometable_prev IN 0 
 
*-- do the compare using UNION and GROUP BY. 
SELECT MIN(source) as source, sometable_id,;
    sometable_name, sometable_phone ;
FROM ( ;
  SELECT "curr" as source, sc.sometable_id, ;
      sc.sometable_name, sc.sometable_phone ;
    FROM sometable_curr sc ;
  UNION ;
  SELECT "prev" as source, sp.sometable_id,;
      sp.sometable_name, sp.sometable_phone ;
  FROM sometable_prev sp ;
) as currprev ;
GROUP BY sometable_id, sometable_name, sometable_phone;
HAVING COUNT(*) = 1 ;
INTO TABLE sometable_cdc.d
*-- open the current and previous tables
USE sometable_curr IN 0 
USE sometable_prev IN 0 

*-- do the compare using UNION and GROUP BY. 
SELECT MIN(source) as source, sometable_id,;
    sometable_name, sometable_phone ;
FROM ( ;
  SELECT "curr" as source, sc.sometable_id, ;
      sc.sometable_name, sc.sometable_phone ;
    FROM sometable_curr sc ;
  UNION ;
  SELECT "prev" as source, sp.sometable_id,;
      sp.sometable_name, sp.sometable_phone ;
  FROM sometable_prev sp ;
) as currprev ;
GROUP BY sometable_id, sometable_name, sometable_phone;
HAVING COUNT(*) = 1 ;
INTO TABLE sometable_cdc.d
*-- do the compare using UNION and GROUP BY. 
SELECT MIN(source) as source, sometable_id,;
    sometable_name, sometable_phone ;
FROM ( ;
  SELECT "curr" as source, sc.sometable_id, ;
      sc.sometable_name, sc.sometable_phone ;
    FROM sometable_curr sc ;
  UNION ;
  SELECT "prev" as source, sp.sometable_id,;
      sp.sometable_name, sp.sometable_phone ;
  FROM sometable_prev sp ;
) as currprev ;
GROUP BY sometable_id, sometable_name, sometable_phone;
HAVING COUNT(*) = 1 ;
INTO TABLE sometable_cdc.dbf

As you can see, a combination of using the staging area (curr and prev) and some SQL can give you changed data with minimal effort. This method is fairly efficient because it is done in the staging area where you could isolate the process on a dedicated a machine.

To use a checksum to do a comparison, please take a look at my post “Sys(2017) and Comparing Data“. I go into detail on how this can be accomplished.

From here

There are several avenues to follow when designing a CDC solution. Moreover, each source will likely demand a different approach. Figuring this out ahead of time is essential for the success of the integration system.

The next ETL Subsystem is [Data] Extraction. Once the source system has been profiled and a mechanism for CDC has been developed, the ETL team can begin the process of extracting the changed data from the source.

Tags: , , , , , , ,

2 Comments

Defining Business Intelligence and Data Warehousing

I’ve been asked on occasion to define Business Intelligence and Data Warehousing. Typically, I am not sure how to answer. I’m terrible at inventing definitions, but here is what I usually respond with: For Business Intelligence, it is “a broad term describing the acquired knowledge obtained through reporting and analysis of organizational information” and for Data Warehousing, “A subject-oriented, integrated, time-variant, non-volatile process of collecting data for supporting decisions” (close to Inmon’s definition). But those answers don’t really satisfy me or those who have asked the question.

There are dozens of ‘accepted’ definitions for Business Intelligence and Data Warehousing. To make matters more confusing, the community itself seems torn. For example, the wikipedia article on the subject is woeful at best. I’ve seen others define BI as a ‘front-end’ to Data Warehousing, which I wholeheartedly disagree with. There is also a strong misconception that Business Intelligence and Data Warehousing are one-in-the same or connected at the hip. They are not.

I believe some of the onus can be placed on vendors who sell “BI” or “BIDW” solutions.

First, Business Intelligence

Business Intelligence is not an activity or process. It’s a result. You either have it or you don’t. This is a very important distinction, considering the level of misconception surrounding it.

How can you get it? Well, lots of ways.

Having intelligence about the business allows you to make informed decisions. The intelligence is derived from many sources. One of which is through Data Warehousing. Enterprise Resource Planning (ERP), knowledge management, and technical reporting are others, for example.

Second, Data Warehousing

Data Warehousing is a process. It is not just a database, nor is its purpose in life to just hold historical data (a common misconception). The process involves Data Profiling, Metadata Management, Dimensional Modeling, Data Integration (ex: ETL, EAI), Data Quality, Reporting, and Governance. There are many variations to this process, all of which are driven by business needs.

The Data Warehouse facilitates Business Intelligence by providing the necessary data and processes needed to integrate disparate transactional data into valuable information presented through various Applications such as executive dashboards, ad-hoc reporting tools, analytics, etc. These Applications are used by decisions-makers to make informed decisions.

It follows that Business Intelligence can be an outcome of Data Warehousing, provided that the data in the warehouse is promoted to information and then applied to decision-making.

Purpose and Justification

Strategic initiatives, goals, and competition will drive how organizations approach obtaining Business Intelligence. One organization might look at it for tactical reasons (i.e. decision support). Another might see it as a mission-critical part of running their day-to-day operations. Yet another might see it as a competitive differentiator. These drivers define the tools, techniques, and processes that might be used. Data Warehousing may solve one or more of these requirements.

Additionally, the drivers dictate the type of Applications that will be created on top of the Data Warehouse Database. Some departments may insist on printable reports or Excel sheets. Some Applications might be designed to analyze trends, issues, and events to gain insight into a process. It is very common to see outcome analysis, scoring, data exports, and some automated processes (such as bulk email operations).

The point of this entry is not to “set the record straight”, but rather to help draw the line between Business Intelligence and Data Warehousing. It is an expression of my opinion on the subject. Please understand that this is an important discussion to be having if you are involved or are planning to be involved in a Data Warehousing project.

Tags: , ,

1 Comment