Posts Tagged Productivity

Mindmapping Dimensional Models

When presented with a data modeling problem, I start with a conceptual design and then create the logical and physical designs as each concept becomes more mature and stable. This is an iterative process that can take many hours. Using mind mapping software has always given me a head start.

Mind mapping has been around for a long time. It’s a visual technique for diagramming ideas around a central theme. For dimensional modeling, the theme is some event in the business process, while the ideas are the dimensions and dimension hierarchies. Mind maps are quick to make, easy to follow and share, and let you see all interconnected concepts in one place. Software developers and data modelers have been using mind maps for a long time, but their use (as far as I’ve seen) isn’t quite mainstream in the dimensional modeling space.

Mind mapping an Orders business process dimensional model
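
As a rough sketch of the same idea in code, a mind map is just a tree: the central theme is the business event, the first-level branches are dimensions, and deeper levels are hierarchy levels. The dimension and level names below (Date, Customer, Product, Sales Channel and their children) are hypothetical and are not taken from the diagram above; `outline` and `hierarchy_depth` are illustrative helper names.

```python
# Hypothetical Orders mind map: the root is the business event,
# first-level branches are dimensions, deeper levels are hierarchy levels.
orders_mind_map = {
    "Orders": {
        "Date": {"Year": {"Quarter": {"Month": {"Day": {}}}}},
        "Customer": {"Region": {"Country": {"City": {}}}},
        "Product": {"Category": {"Subcategory": {}}},
        "Sales Channel": {},
    }
}


def outline(node, depth=0):
    """Return the mind map as indented outline lines, one per idea."""
    lines = []
    for name, children in node.items():
        lines.append("  " * depth + name)
        lines.extend(outline(children, depth + 1))
    return lines


def hierarchy_depth(node):
    """Count how many levels sit below a node (0 for a leaf)."""
    if not node:
        return 0
    return 1 + max(hierarchy_depth(child) for child in node.values())


if __name__ == "__main__":
    print("\n".join(outline(orders_mind_map)))
```

Printing the outline gives a text version of the map that is easy to paste into requirements notes, and `hierarchy_depth` gives a quick feel for which dimensions carry the deepest hierarchies before the conceptual model is formalized.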

When I start to construct a new data model (ER/DM), my first attempt at a design is often a mind map. As I read through requirements and examine business processes and business entities, I start to draw out how they may relate. Traditionally I used paper and pencil, but recently I’ve switched to mind mapping software that I can access and share on all my devices. MinDgo, for example, works on my Mac, iPhone, and iPad. Once I’ve sufficiently covered all key concepts and requirements, I complete the conceptual model in PowerDesigner.

I use mind mapping software for the following reasons:

  1. Quick, easy, and structured way of designing high-level business process dimensional models
  2. Available on all my devices (unlike the heavy case/modeling tools we use), so when inspiration strikes, or when that coffee-machine meeting concludes, I can quickly get the ideas into the design
  3. Very easy to show, explain, and help interpret the models to business and technology colleagues
  4. Organizing different iterations of a design, and interconnecting related designs, is easy using software (try this in your Moleskine)

I also use this technique when I am trying to understand existing data models. For example, if I’m analyzing complex database documentation (from vendors like WorldScope or Charles River), I can get a good feel for how things relate by mind mapping as I go.

Chaos Theory and the Data Warehouse

Have you ever considered the Data Warehouse as a chaotic system? The work of the Data Warehouse team is never complete: new requirements trickle in every day, and user feedback gets more and more sophisticated as time passes. Chaos Theory can help explain this, and in the end, offer us some insight into how we can better plan Data Warehouse development, deployment, and maintenance.

The Data Warehouse is a process which forms the center of an information supply chain, with several inputs and several outputs. Each input and each output is subject to change based on factors such as vendor upgrades, new interfaces, expanded interfaces, and perhaps most importantly end-user (client) evolution. All of these changes happen continuously. As people use the Data Warehouse, they become more inquisitive. They want their output and analysis rolled up or down in different ways. Predicting (and therefore planning for) Data Warehouse change can be as difficult as predicting (and therefore planning for) the weather. This environment of ever-changing needs fits neatly into the confines of Chaos Theory. But what is chaos in this context? What is Chaos Theory exactly?

From the book “Chaos Theory Tamed”, author Garnett P. Williams writes:

Chaos is sustained and disorderly-looking long-term evolution that satisfies certain mathematical criteria and that occurs in a deterministic non-linear system. Chaos theory is the principles and mathematical operations underlying chaos. (pg 9)

Meteorologist Edward Lorenz in the 1960s determined that even the tiniest differences in an initial measurement can have a huge impact on an outcome. In other words, as his butterfly effect posits, a butterfly flapping its wings in Africa can affect weather patterns in North America. Weather is a system which has a highly sensitive dependence on its initial inputs.
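
Sensitive dependence on initial inputs is easy to see numerically. The sketch below does not use Lorenz’s weather model; it substitutes the logistic map (x → 4x(1 − x)), a much simpler textbook chaotic system, and follows two trajectories whose starting values differ by only 1e-10. The starting value 0.2, the step count, and the function name `logistic_orbit` are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)

# Absolute gap between the two trajectories at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]

if __name__ == "__main__":
    for step in (0, 10, 20, 30, 40, 50, 60):
        print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap starts out negligible and roughly doubles with each iteration, so within a few dozen steps the two trajectories have nothing to do with each other, even though the rule that generates them is completely deterministic.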

The foundation of the Data Warehouse is only as stable as your control over the tiniest changes to the inputs into the information structure. Like the weather, it too has a highly sensitive dependence on inputs. One tiny change to a source system can have almost catastrophic effects on the Data Warehouse.

Finding Order

However, despite the chaos, we should be able to find some order. This is what Lorenz and scientists after him tried to do. The first step in this process is understanding that even seemingly random changes are not always as random as they seem. If we can understand that changes to our Data Warehouse are not random, then we can build a better Data Warehouse.

There are a few things you can do to tame the chaos:

  • Be consistent and systematic. The more predictable you and your Data Warehouse team are, the easier it will be to handle change. In other words, control any and all variables that you can.
  • Adopt proven analysis and development methodologies that others have had success with. This is not to say that some level of adaptation to your environment, team skills, and situation is not required, but rather, start off with a good foundation and follow along where it makes sense.
  • Keep the team close. Quality and frequent interaction among the people who make and run the DWH is essential.
  • Stay in the groove like an improvisational jazz band. If your data modelers are not in tune with your decision-support analysts who are not in tune with your DBA, then you can’t expect to handle the challenges of chaos.
  • Feedback and evolution are two very important aspects of Data Warehousing. Keep your ear to the ground and try to anticipate changes before they occur. This takes practice, but (back to the improvisational jazz band analogy) practice makes perfect.
  • Keep in step. In the Data Warehouse world, change is natural and will come in waves. More significantly, if changes cannot be implemented quickly, your clients will lose confidence in your ability to keep up.
  • Think and act quickly. The longer you debate, the longer your client must wait. While they wait, they construct workarounds or look elsewhere. If you’re lucky and they do wait for you, their change may become outdated and no longer relevant; an opportunity might have been missed (and you’ve essentially failed them).
  • Don’t be afraid to be wrong. The consequence of acting quickly is that you might get something wrong. Just be agile enough to respond and deliver new change with urgency.

I’ll post more thoughts on this over the next weeks. I’m particularly interested in how users of the Data Warehouse become more and more sophisticated as they use its tools and applications.

Scoping Data Warehouse Initiatives

Data warehousing is a complex operation. From start to finish (if there is a finish), project teams are faced with many challenges. In all phases of the lifecycle, there are opportunities for derailment. The best way to mitigate potential issues and stay on time and within budget is to carefully define and manage scope. Managing scope can be an ongoing struggle (especially if requirements are not clearly defined or justified). While this is really a PM101-type of topic, I feel there are some fine points in a DW/BI environment that are not mentioned enough.

Consider the following:

Programs versus projects

I won’t get into a deep PM discussion here, but it is important to point out that data warehousing (or business intelligence, master data management, etc.) initiatives should be thought of as programs and not projects. This mindset will help in scoping.

A program (which might also be called a “project portfolio” in some circles) is basically just a set of related projects. With a program, the emphasis is on organizing, prioritizing, and allocating resources to the right projects. Program scope is more strategic, and answers long-term questions about what type of value the organization hopes to achieve from the initiative.

A project, on the other hand, is much more specific — with a set number of deliverables and goals that have a high immediate impact. The scope at the project level is therefore more tactical in nature: high impact, fast delivery. Be aware that some projects may never be given the green light (for example, if there is a low business impact or if there is a low feasibility rating because of data source or data quality complications).

What I find odd is that organizations still choose to tackle immense data warehousing initiatives in one or two shots, trying to deliver everything at once over a period of 18 or more months. This is the wrong approach (here’s why). Break this large initiative into individual projects and try to deliver functionality every 6 to 8 weeks.

The business process

The best way to break down data warehousing programs into high-impact projects is along business process lines. A business process, as defined here, is:

The complete response that a business makes to an event. A business process entails the execution of a sequence of one or more process steps. It has a clearly defined deliverable or outcome. A Business Process is defined by the business event that triggers the process, the inputs and outputs, all the operational steps required to produce the output, the sequential relationship between the process steps, the business decisions that are part of the event response, and the flow of material and/or information between process steps.

Some examples of the above: inventory tracking, Internet sales, retail sales, marketing, tax assessment, tax collection, pitching, batting.

In any data warehousing environment, you can expect to have several business processes to model. Each business process you tackle will have elements touching upon different aspects of the data warehouse, including infrastructure, middleware, data modeling, ETL, business logic development, presentation elements, and so on. If you scope each project to the business process, you can deliver complete solutions in the shortest amount of time. (It should be obvious that the very first business process you implement will take the longest, as the team works out the core infrastructure. Most of this infrastructure will be reused by other business processes.)

Avoid scoping to a data source

Do not fall into the trap of scoping to a data source. Scoping to a data source is almost guaranteed to deliver mediocre outcomes. These projects typically include many unfinished or inadequate business processes all delivered at once some time in the distant future and long after the excitement over the initiative has subsided.

While it is true that only one or two data sources might exist in some organizations, it is not true that inventory, customers, sales, procurement, shipping, and other business processes need to be taken on at once. Create a single project for each business process, prioritize based on impact and feasibility, and then badabing badaboom, you deliver. Next.
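
To make “prioritize based on impact and feasibility” concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the 1-to-5 scales, the scores given to each business process, the impact × feasibility ranking, and the cutoffs below which a project is not given the green light. Your own scoring scheme will differ.

```python
from dataclasses import dataclass


@dataclass
class Project:
    business_process: str
    impact: int       # assumed scale: 1 (low) to 5 (high)
    feasibility: int  # assumed scale: 1 (low) to 5 (high)


def prioritize(projects, min_impact=2, min_feasibility=2):
    """Split projects into a ranked backlog and a list that is not green-lit."""
    approved = [p for p in projects
                if p.impact >= min_impact and p.feasibility >= min_feasibility]
    deferred = [p for p in projects if p not in approved]
    # Highest combined score first; ties keep their original order.
    approved.sort(key=lambda p: p.impact * p.feasibility, reverse=True)
    return approved, deferred


# Hypothetical scores, one project per business process.
candidates = [
    Project("Inventory tracking", impact=4, feasibility=2),
    Project("Internet sales", impact=5, feasibility=4),
    Project("Retail sales", impact=5, feasibility=3),
    Project("Procurement", impact=3, feasibility=1),
]
backlog, deferred = prioritize(candidates)

if __name__ == "__main__":
    for rank, p in enumerate(backlog, start=1):
        print(rank, p.business_process, p.impact * p.feasibility)
    print("not green-lit:", [p.business_process for p in deferred])
```

With these made-up scores, Internet sales is delivered first, and Procurement waits until its data source problems (its low feasibility) are resolved.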

Along the same lines, do not adjust your scope if the data source is unavailable, uncooperative, or lacking in quality. Instead, bring the fight to the data source (here is where a good, preferably C-level, business sponsor can come in handy) and set things right. This is obviously a project risk, and also an organizational risk. If you are having problems extracting inventory data, then maybe it’s time to put down your data warehousing gloves and get a new inventory system.

Last thoughts

Scoping the data warehouse is a difficult problem. Troubles start early on with the initial idea, continue through requirements gathering, and finally carry into the development phase of the lifecycle. There is not a lot of good advice in this area for data warehousing (if you happen to know of a good source, please send me a link or title). But I do find that if you work towards business processes, think in terms of programs and projects, and avoid the data source trap, scoping decisions will settle into the real needs of the business.

The Three Faces of a Good ETLer

Hiring a “data integration expert” or consultant for your next, greatest, data warehousing project? Don’t take it lightly. ETL personnel are critical to the success or failure of your project.

The following are what I deem to be essential technology-related aspects, or faces, of a good ETL developer and/or architect (herein referred to as an ETLer for lack of creativity). While you need to consider business and industry knowledge, personality, and experience in your team-building process, you should start by checking off the following on your interview sheet:

First Face: the technologist

Programming must come naturally to an ETLer. Objects, logical constructs, expression construction, program flow, and the like must be well understood. The truth is that no matter how much your vendor proclaims that their tool does it all, chances are excellent that some hand coding will be required. On top of that, ETL tools work a lot like procedural programs. Technologists are very good at putting their best foot forward, and will generally think of things to make the ETL flow perform better. They also think about logging, auditing, and exception handling, all of which are important.
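
To illustrate what thinking about logging, auditing, and exception handling can look like in hand-coded ETL, here is a minimal sketch using only Python’s standard library. The job name `etl.orders`, the row layout, and the `transform` and `run_step` functions are hypothetical; a real ETL tool would provide its own equivalents.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl.orders")  # hypothetical job name


def transform(row):
    """Hypothetical transformation: tidy the customer key, cast the amount."""
    return {"customer": row["customer"].strip().upper(),
            "amount": float(row["amount"])}


def run_step(rows):
    """Transform rows, divert bad ones to a reject list, return audit counts."""
    loaded, rejected = [], []
    for line_no, row in enumerate(rows, start=1):
        try:
            loaded.append(transform(row))
        except (KeyError, ValueError, AttributeError) as exc:
            # Exception handling: keep the flow running and keep the evidence.
            log.warning("row %d rejected: %r (%r)", line_no, row, exc)
            rejected.append((line_no, row, repr(exc)))
    audit = {"read": len(rows), "loaded": len(loaded), "rejected": len(rejected)}
    # Auditing: every row read must be accounted for.
    assert audit["read"] == audit["loaded"] + audit["rejected"]
    log.info("audit: %s", audit)
    return loaded, rejected, audit


source = [
    {"customer": " acme ", "amount": "19.99"},
    {"customer": "globex", "amount": "n/a"},  # amount cannot be cast
    {"amount": "5.00"},                       # customer is missing
]
loaded, rejected, audit = run_step(source)
```

The point of the design is that one bad row never stops the load: it is logged, set aside with enough context to investigate later, and the audit counts show that nothing silently disappeared.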

Second Face: the theorist

But a solid programming background is not enough. Knowledge of Data Integration theory and best practices is equally important. While I believe in and use Kimball’s methodologies for integrating data into a dimensional data warehouse, other methodologies exist that may be more suitable to your business and integration needs. Following a proven methodology, with slight modifications to suit your environment, will get you further, faster. Having little or no theory behind what you’re doing gets you somewhere, slower. Identify your methodology, and then find someone who understands it.

Third Face: the specialist

Knowing the ins and outs of your ETL tool (SSIS, OWB, DataStage, Talend Open Studio, etc.) is essential. I would venture to guess that a solid programmer who has a great understanding of ETL theory will be able to get by using most tools with little learning curve. What I worry about (and you should too) are the nuances in the tooling that can stump even the best. These nuances can cost you many project hours and force rewrites if blocking issues are encountered. (SSIS, my tool of *ehem* choice, has many of them. Sorry, I needed to clear my throat.) Tool knowledge is also essential to know when it is appropriate to forgo the tool because of I/O issues, or because hierarchical data is better handled elsewhere, or because business logic is best not bundled within a data flow.

About Face

While junior members of your data integration team can be one or two-faced (that came out funny), senior members and architects must have more meat on the bone.

I suppose this is why good ETLers are difficult to come by. The ETLer needs to have a healthy mix of programming talent, a disciplined approach, and tool knowledge. Trained DBAs and software developers might have a lot to offer, as might a troop of certified tool jocks and method junkies, but to get your project in on time and within budget, don’t settle.

Top 7 Reasons I Wear a Suit

I dislike wearing suits.

It used to be that I could code in my favorite Phish t-shirt and sandals. I had a key instead of a badge, and lunch usually meant a few greasy pizzas or clam cakes. In those days, my attire only meant something if there was an off-site or if clients were coming to visit “the shop” (which was a tiny building several miles from the heart of the big city). I could easily bounce back and forth between long and short hair and between full beard and clean-shaven. Ahh… those were the days.

Now I work in a major international city for a rather large bank. I code in a suit when I’m not in meetings, wear nice shoes, carry a badge, and eat salads and yogurt for lunch. *sigh*

To be fair, I enjoy the new challenges and the big city. And if wearing a suit on occasion is a consequence, I can live with it. So while a suit is not fully mandatory, I still wear one at times. Here’s why:

  1. It easily puts me in line with the dress code
  2. Dressing is simpler in the morning (although sometimes it takes a couple of tries to get the perfect knot in my tie)
  3. My wife tells me I look great
  4. Dressing down on Friday never felt so good
  5. I look more important than I am
  6. I feel more important than I am
  7. My jacket flaps behind me in the wind when I ride my bike to the train station, which makes me feel like a super hero with a cape

Other than those fantastic reasons, wearing a suit is a real drag.

(I do admit, there is something rather Monty Pythonish about wearing a suit on a bike. I bet I look pretty silly to the folks driving past me. But riding my bike gives me more than 30 minutes a day of much-needed exercise, and on top of that, the price of gas here in Europe would blow your mind!)

Business Casual

So today I am wearing jeans and a faded (but once pleasant) button-down shirt. No tie, although my shoes are nice and I am wearing dark socks. A few others around me are similarly dressed, which is typical for summer Fridays. Then there are those who are in full suits, as if preparing for a job interview or some important sales meeting. Others are in suits but without a tie, which is in and of itself a very strange practice; I call these “half-suits”.

You can draw two lines in the sand separating all three groups of Friday dressers. First, you have the “workers”: those actively engaged in business operations, like myself. Next come the middle-management types, who are no longer “workers” and who aspire to delegate more and more activities (you know the type; it’s just like them not to wear ties with their suits). Lastly, there are the full suits: at one end of the spectrum, those who make only infrequent strategic decisions of the C-level type, and at the other end, those who work mostly for commission and must rely on their superior charisma (and sharp suits) to get ahead.

My colleague and I are working feverishly at the moment to improve the performance of one of our BI applications: a process we hope to cut from roughly 15 minutes to about 7 minutes (so the half-suits and full suits can have more time at the cooler). In addition, I am troubleshooting a foreign key violation in one of our ETL loads, and my partner-in-crime is hunting down the results of some replication testing in our production environment. Meanwhile, a full suit is currently browsing an online golf store; the half-suits are centered around the water cooler.

This, symbolically, highlights the problems with business and IT alignment in general, especially in large organizations. I find that IT is normally of the first variety: willing to dress down whenever possible to add a little comfort to an otherwise fast-paced existence full of responsibility and accountability. Dressing down in no way implies a dressing down of activities or a dumbing down of skills.

As you can tell, this is a bit of a rant and a fallacious attempt at tossing my colleagues into generalized buckets. But one thing is very true: business and IT need to get in sync. I would like to think that my team is above average in this regard. The immediate team consists of business and IT personnel, all of whom are fighting for a successful project.

I’d like to hear your thoughts on this matter.

Cooling Down

It is (or should be) common knowledge that you should never send an email, write a blog or forum post, or make a phone call when you’re totally ticked off about something! You are likely to say something you don’t mean or perhaps you’ll be a little too honest.

First cool down, and then respond. Easy enough, but what if you can’t wait to cool down using traditional methods (you know, like taking a long hot bath)?

The solution: Simply write your name a few times on a piece of paper using your non-dominant hand. Apparently, it will force the logical side of your brain to start working, giving your emotional side a few seconds to forget why it is so upset (or sad, or excited, etc.). For all the neurosurgeons out there who might want to debate brain lateralization, I’m not the guy for you! But this technique has worked many times for me (and it recently got my sister-in-law out of a funk).

Over the past several days, I’ve also been looking into other ways to train my brain to help with logical tasks, management tasks, programming, motivation, and so on. I stumbled upon a blog entry (from Gary’s Historical Art) that spoke of the book “Drawing on the Right Side of the Brain”. I remember this book from my childhood and was thrilled to see it has a new edition. The new edition adds material on (a) the latest developments in brain research and (b) using drawing skills for problem solving. I plan to get a copy soon.
