Archive for June, 2009

10 Commandments of Data Integration

  1. You shall compile and document all requirements and mappings; segregate the work by business process. You may have more than one of these business processes, some of which may come before others.
  2. Do not begin without first conducting a thorough data profile; otherwise, you will be punished for your iniquities, as will the generations that come after you.
  3. Do not think commandments one or two are in vain, lest you be overrun by the dead line, scope creepers, and a great exodus of people from your tribe; if this happens to you, do not swear or curse, for you have been warned.
  4. Remember that latency and timeliness are equal in importance to non-volatility and having a traceable lineage; a staging area may lead you to this promised land.
  5. Honor the rules of data conformance.
  6. Do not kill dirty data: you shall clean them, or take them back to their sources for retribution.
  7. Do not commit the worst data integration transgression of all and ignore data quality; your ignorance will not be forgiven.
  8. Do not be shy about stealing your neighbor’s work, for his trials have led to best practices that you can make equally good use of.
  9. Do not rely solely on business keys; surrogates are your friend and will permit you to engage in slowly changing your dimensions.
  10. You shall covet a proper audit and log system; for on the day of judgment, you will need proof of your compliance.
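
Commandment nine can be illustrated with a minimal sketch of a slowly changing dimension that keeps history (often called "Type 2"). Everything named here is hypothetical: the `CustomerRow` structure, the `apply_change` helper, the tracked `city` attribute, and the sample key "C-100". The point is only that the warehouse-assigned surrogate key lets several versions of the same business key coexist.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    customer_key: int                 # surrogate key, assigned by the warehouse
    customer_id: str                  # business key from the source system
    city: str                         # attribute whose history we want to keep
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_change(dim: List[CustomerRow], customer_id: str, city: str, as_of: date) -> CustomerRow:
    """Close the current version of a customer and append a new one if the city changed."""
    current = next(
        (r for r in dim if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == city:
            return current            # nothing changed, keep the current row
        current.valid_to = as_of      # expire the old version
    new_row = CustomerRow(
        customer_key=len(dim) + 1,    # simplistic key generator, for illustration only
        customer_id=customer_id,
        city=city,
        valid_from=as_of,
    )
    dim.append(new_row)
    return new_row


dim: List[CustomerRow] = []
apply_change(dim, "C-100", "Boston", date(2009, 1, 1))
apply_change(dim, "C-100", "Denver", date(2009, 6, 1))
```

After the two calls the dimension holds two rows for the same business key, each with its own surrogate key. A fact row can point at whichever version was valid when the fact occurred, which a business key alone could not express.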


Actionable Business-IT Alignment

For several years I’ve been following the Business-IT alignment movement. Alignment is necessary for fostering innovation and realizing greater profits as IT resources are used to their fullest to satisfy business goals and objectives.

But many organizations still struggle to make the connection.

Some suggest varying 3-, 5-, and 7-step approaches for better alignment. These typically consist of diving into new organizational structures or implementing new frameworks and service models. Other sources say that the business is to blame, that IT is misunderstood and underutilized. Finally, some say it is a dead issue altogether.

I’m not convinced this is a dead issue, nor am I convinced that the business is to blame (especially nowadays). I’m also not convinced that 3-, 5-, and 7-step plans will magically work.

A few basic actions, initiated by IT, must happen first.

The Actions

Business long ago recognized that better alignment with IT is essential. We can argue about how well they’ve understood IT and how it can best be utilized, but across the board, the recognition is there. IT, after many years of making the case for alignment, seems to be coming up short during this crucial period. I hope that the following actions, from an IT perspective, can help:

  • Understand: Learn the language of the business. Plainly put, this means you need to become more financially intelligent. All businesses exist to earn a profit, and understanding how this works is the first critical step. You must understand where the estimates and assumptions are in the numbers, when you would want to depreciate or amortize, and what constitutes a capital expenditure or operating expense. Among many things, you must understand ROI, cash conversion, and how to use profitability ratios.
  • Participate: Take part in strategic and other long-term planning initiatives. IT professionals must be able to see the company’s vision and turn the vision into actionable IT initiatives. If a representative from IT is not present during steering committee and other board meetings, make this a priority. You will need to convince the business that IT is capable of playing an important role in all long-term decision making. Fortunately, businesses already realize this need, but in many cases don’t feel that IT can effectively contribute (perhaps because IT doesn’t understand the business language).
  • Contribute: Provide business with the ability to make fast and accurate decisions. This means pioneering smart business intelligence initiatives that can provide decision-makers with the tools and reports — distilled — that business can utilize. You need to prove your flexibility, agility, and ability to understand what the business really needs. Much of this is tied to understanding, and extended with participation. These initiatives also include business process improvements; tighter integration of business, meta, and master data; and managing performance.

If IT can take action, better alignment can be achieved. This isn’t to say that the business can’t do more, but before the business can make IT a full partner, understanding, participation, and contributions from IT are a must.


New VFP User Group in Iraq

The FoxPro community has always been diverse and vibrant, with user groups spread out all over the world. Although VFP9 is the last installment we’ll get from Microsoft, I’m happy to see and re-report that a new group has been created: Iraqi Visual FoxPro Programmers Group (Iraqi.vfppg)

You can read the original announcement here on Foxite.

Founder Ammar Hadi writes:

It was my dream to build this group since about more than 4 years ago, and at that time I send an e-mail to one of the Iraqi Foxers who is a member in foxite too (we never met) and told him that I will work on creating a foxpro group in Iraq. He show his readiness to help me on that. But at that time, I was totally engaged in my Medical study to get the certificate of Neurosurgical Profession. Now it is about a year and a half since I finished my study and training and got the certificate to work as a professional neurosurgeon (with very good marks ;-D ….. ). Now I can say that I got some more time to make my dream come true.

I wish Ammar the very best!


Naming Conventions and the Underscore

I’ve seen and worked with a lot of naming conventions. When I start a new development project, I always, without exception, document how I intend to name the new items I create, whether they are physical objects such as tables and fields in a database, or names used in code for objects, variables, and the like. This document is shared with the team, adjusted where necessary, and adopted as the standard for the project or group of projects.

The important thing in this effort is consistency, and not the technique we adopt. If all single-part surrogate key fields are to end in “Key”, and all single-part varchar business key fields are to end in “ID”, then the rule must be obeyed by all developers and DBAs.

A major point of contention that seems to crop up every single time naming conventions are discussed is how and when to use an underscore. Should the field be “CustomerID” or “Customer_ID”? “ProductKey” or “Product_Key”? “SSNumber” or “SS_Number”?

CamelCase

I have a general rule: If the container (database, property window, etc.) can remember case, then only use an underscore when combining an acronym with additional information, when that information comes after the acronym. This is because you can use “CamelCase”, which I find more readable.

Here are a few examples, showing how I would normally handle underscores when case is remembered:

  • CustomerID and not Customer_ID
  • ProductKey and not Product_Key
  • PhoneNumber and not Phone_Number
  • SS_Number and not SSNumber
  • ISO_Date and not ISODate
  • WP_Theme and not WPTheme
  • ValueEPS and not Value_EPS

And while I’m at it, avoid redundancy: VIN (Vehicle Identification Number) is not VIN_Number; GICS (Global Industry Classification Standard) is not GICS_Classification; and DG (Disease Group) is not DG_Group!

No Case Support?

If the container does not remember case, then always use an underscore to separate the parts of a name. Examples:

  • customer_id and not customerid
  • product_key and not productkey
  • phone_number and not phonenumber
  • ss_number and not ssnumber
  • iso_date and not isodate
  • wp_theme and not wptheme
  • value_eps and not valueeps
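
As a rough sketch of how the two conventions relate, the hypothetical helper below (`to_underscore` is my own name, not part of any library) derives the no-case form from a name written under the case-aware rules above. It assumes that word boundaries are marked either by an existing underscore or by a lowercase letter or digit followed by an uppercase letter.

```python
import re


def to_underscore(name: str) -> str:
    """Convert a case-aware name such as 'CustomerID' to 'customer_id'."""
    # Insert an underscore wherever a lowercase letter or digit is
    # immediately followed by an uppercase letter, then lowercase everything.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return with_breaks.lower()


for original in ["CustomerID", "ProductKey", "SS_Number", "ISO_Date", "ValueEPS"]:
    print(original, "->", to_underscore(original))
```

Note that a name like "ISODate" has no detectable boundary between the acronym and the word, so this helper would turn it into "isodate". That is one practical argument for writing the underscore after a leading acronym in the first place.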

If you can be strict about naming, project members and future generations will have an easier time understanding your code, objects, and documentation. I find that the underscore — even within a good naming system — is often used inconsistently. I also find that the underscore can make a major visual impact if used correctly. So it pays to pay special attention to ASCII character 95!

As an aside, naming conventions also play an important role in data warehouse conformity. So while it is important to have good standards, you will also need good governance to be sure that your names are meaningful and consistent.


Innovation Interrupted?

Michael Mandel writes in his June 3rd cover story for Business Week that “During the past decade, innovation has stumbled. And that may help explain America’s economic woes”. From Facebook to flat-screen televisions, we’ve been led to believe that we’ve entered into a new age of innovation unmatched by the past.

But like Mandel, I disagree. I think we’re spinning our collective wheels. Biotech, alternative energy, health care, and almost every other industry have produced little impact on the innovation front. While we’re all still enamored with the Internet, social networking, and cloud computing, old and inefficient systems in other key industries have stubbornly remained in place. The US government owns 60% of GM as a result. This is one example in a sea of many sad cases.

Then there is the problem of rebranding. While I believe that corporate America has had opportunities to make real impact, they’ve taken the straighter path to profits by rebranding old models and technology. Cereal, automobiles, and financial products come to mind.

Perhaps motivation and risk tolerance have taken a hit. The dot-com burst, 9/11, Enron, and the housing crisis are all black-swan-type events that can change the way people think and do business.

  • After the dot-com bubble burst, investors steered clear of companies that had no product, no customer base, and no real ROI plan (ok, this is a good thing). Where would the speculators place their efforts next? Housing?

  • After 9/11, US foreign policy became a sticking point for many foreign investors, who perhaps don’t like to be told “us or them”.

  • Enron led to Sarbanes-Oxley, an innovation choker that had (and has) the potential to punish organizations for taking R&D risks. But Enron and other failures of its time exposed some serious flaws in how US companies are allowed to run their finance departments.

  • And while people were making money in the late 90s and in the early part of this decade, banks were lending to anything with a heartbeat. The Fed was a major facilitator, keeping interest rates too low. The resulting foreclosure rate today is unbelievable (how many on average per minute?). Investors — including banks — will be far tighter with credit, stifling innovation even more.

Moving Forward

I would like to see the US — and other nations — create an Office of Innovation. I would like to see companies (private and public) free to explore new R&D projects. I would like to see universities continue to get large grants for scientific study and research. And I would like to see more kids enter the fields of science and mathematics.

I would also like to be alive when quantum computing becomes reality.

Things are generally slower in mainland Europe (as I can attest), so the EU has to double its efforts on the innovation front. While the European Commission has a good innovation policy, and has recognized that the EU has great potential, there is still a lot of ground to cover.

I hope to talk more about this in the coming months.


MDM is a Capability, Not a Product

I had bookmarked, and finally just read, an article by Loraine Lawson of IT Business Edge titled “Consultant: Master Data Management Can Pay off During M&As” which referred to this blog post from Evan Levy, “MDM and M&A”.

MDM is an interesting topic, and one that has a lot of relevance in my work environment. M&As are also interesting and can have a huge impact on a great many people. But while reading these articles, I was reminded of an important MDM axiom.

Evan writes:

MDM provides a company the capability to link the data content from disparate systems within and across companies.

Remember that MDM is a capability and not a technology. You cannot buy MDM, but you can build an MDM strategy. This strategy will likely cross several technologies and platforms. It may consist of data warehousing elements, SOA, and SaaS applications. It will surely consist of certain disciplines such as data governance, data quality management, and data integration.

Vendors will continue to push their MDM solutions, but be careful not to trap yourself into thinking that the job is done once the software is installed. Vendors can wrap most technologies necessary for MDM into a single package, but they cannot provide you with a strategy or the personnel to make it work for your organization.

MDM is a capability you create, and not a product you can buy.


Avoid Data Dead Ends and Information Loss

When analyzing data to make a decision, the last thing you want to encounter is a data dead end. You may be digging into some figures only to find that the data you have access to has been aggregated, combined, filtered, interpreted, or otherwise changed (in an unauthorized way) from its original source. And as an analyst, the last thing that you want to discover is that your ETL processes are solely responsible.

In Business Intelligence and decision-support instances, especially reports and dashboards, data alterations are common. Aggregates, summaries, snapshots, and the like are normal and necessary for a bird’s eye view of whatever business process is being examined. But in order to avoid information loss, be certain that the underlying data is intact at the most atomic and granular level. And also be sure analysts can get at this data (no black boxes allowed). You don’t want this information to be tossed into a black hole never to be seen or heard from again.

Atomic and Granular

I like to distinguish atomicity from granularity in the following way: Atomicity refers to non-additive and descriptive elements, usually stored as dimensions or non-additive facts, while granularity refers to measurement data usually stored as facts in a business process dimensional model. You could interchange these definitions under certain circumstances, but I like to draw the line so it is clear what I’m talking about.

Atomicity

Atomic data elements will give you the ability to conduct deeper research. By atomic, I mean that the data element has an exact meaning and does not represent some concatenated value or total. The individual parts carry more meaning than the combined whole and, in the end, allow analysts to cut analysis across different dimensions at a very minute scale.

  • A phone number is better split into country code, area code, and subscriber number
  • A street address into street number, name, type, and direction
  • A person’s name into surname and given name
  • A parcel ID into plat, lot, and map
  • An industry classification into groups and subgroups
  • A date into year, quarter, month, week of year, day, and day of week
  • Et cetera!
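
To make the date example concrete, here is a minimal sketch that breaks one date into the atomic parts listed above. The function name `date_parts` and the choice of ISO 8601 week and weekday numbering (Monday = 1) are my assumptions; your calendar dimension may number weeks differently.

```python
from datetime import date


def date_parts(d: date) -> dict:
    """Split a date into atomic attributes suitable for a date dimension."""
    _iso_year, iso_week, iso_weekday = d.isocalendar()
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,   # months 1-3 -> 1, 4-6 -> 2, ...
        "month": d.month,
        "week_of_year": iso_week,            # ISO 8601 week number (assumption)
        "day": d.day,
        "day_of_week": iso_weekday,          # ISO 8601: Monday = 1 ... Sunday = 7
    }


parts = date_parts(date(2009, 6, 15))
print(parts)
```

With each part stored separately, an analyst can group by quarter or by day of week without having to parse the original value again.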

Granularity

With granularity, you define the level of detail in a measurement. The more granular, the greater the detail. For a trip to the market, you can define the granularity of your shopping excursion on the item level (each item in the basket), by product (grouping similar items), or perhaps by the entire basket as a whole. The choice is yours. Of course, storing the price of each item is the most granular and will give you the greatest flexibility in your analysis. You can then build your aggregates (by product, entire basket, etc.) from the most granular metrics.

If you decide to load data at larger grains, you are losing information and creating dead ends for your decision-makers. It pays to load data at the finest grain possible.
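
The market-basket example can be sketched in a few lines. The rows, field names, and the `rollup` helper below are all made up for illustration, and prices are kept as integer cents to avoid rounding issues. The sketch shows that both the product-level and basket-level totals can be derived from item-grain rows, while the reverse is impossible.

```python
from collections import defaultdict

# Hypothetical fact rows loaded at the finest grain: one row per item in a basket.
items = [
    {"basket": "B1", "product": "fruit", "item": "apple", "price_cents": 50},
    {"basket": "B1", "product": "fruit", "item": "pear", "price_cents": 70},
    {"basket": "B1", "product": "dairy", "item": "milk", "price_cents": 120},
    {"basket": "B2", "product": "dairy", "item": "butter", "price_cents": 250},
]


def rollup(rows, *keys):
    """Sum price_cents for each distinct combination of the given keys."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["price_cents"]
    return dict(totals)


by_product = rollup(items, "basket", "product")   # coarser grain: product within basket
by_basket = rollup(items, "basket")               # coarsest grain: whole basket
print(by_product)
print(by_basket)
```

Had only `by_basket` been loaded, there would be no way to recover what the fruit cost in basket B1. That is the dead end described above.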

From here…

Integrating data into the data warehouse at an atomic and granular level gets you pretty far. You are likely already doing this (especially if you are familiar with transaction grain fact tables). But there are other ways you can lose data, and therefore information. In a follow-up to this post, I’ll discuss how evaluations and logic gates can also be a source of information loss.

I’d like to know your thoughts on this subject. Have I missed anything important, or have I marked something important that you feel is inconsequential?
