AIX Tip of the Week

AIX Tip of the Week: Ten Mistakes to Avoid for Data Warehousing Managers

Audience: IT Managers and Administrators

Date: May 11, 2000

The Data Warehousing Institute has compiled a list of the top ten mistakes data warehousing managers should avoid. The list is based on surveys with industry experts, data warehousing project managers and IS executives. The mistakes are:

  1. Starting with the Wrong Sponsorship Chain
  2. Setting Expectations You Cannot Meet
  3. Engaging in Politically-Naive Behavior
  4. Loading the Warehouse with Information "Just Because It Was Available"
  5. Designing a Data Warehouse Database the Same as a Transactional DB
  6. Choosing a Data Warehousing Manager Who is Technology Rather than User Oriented
  7. Ignoring the Potential Value of External Data (Text, Images, Sound, Video)
  8. Delivering Data with Overlapping and Confusing Definitions
  9. Believing Performance, Capacity and Scalability Promises
  10. Believing Your Problems are Over Once the Warehouse is Operational
  11. Focusing on Ad Hoc and Periodic Reporting

See the attached file for an explanation of each item, as well as the reason why the list has eleven (not ten) items!

Ten Mistakes to Avoid for Data Warehousing Managers

The Data Warehousing Institute (TDWI) is dedicated to helping organizations increase their understanding and use of business intelligence by educating decision makers and I/S professionals on the proper deployment of data warehousing strategies and technologies. In addition, TDWI helps its membership advance their professional development as data warehousing managers and practitioners.

TDWI accomplishes these goals by sharing information about best practices and real-world lessons learned by data warehousing visionaries, practitioners, and pioneers. TDWI convenes annual worldwide conferences and courses on data warehousing and business information strategies where experienced professionals share real-world experiences. TDWI is also the first professional organization to offer a comprehensive data-warehousing curriculum.

The staff of The Data Warehousing Institute has called upon experts across the industry, and conducted meetings in several cities with active data warehousing project managers and IS executives, to assist us in developing a compendium of the "ten mistakes to avoid for data warehousing managers." This article contains about 65 percent of the complete document.

1. Starting with the Wrong Sponsorship Chain

The right sponsorship chain includes two key individuals above the data-warehousing manager. At the top is an executive sponsor with a great deal of money to invest in effective use of information. A good sponsor, however, is not the only person required in the reporting chain above the warehousing manager. When a data-warehousing project craters, the cause can sometimes be traced to the lack of a key individual between the sponsor and the data-warehousing manager. That person is often called the project "driver" because he or she keeps the project moving in the right direction and ensures the schedule is kept. A good driver is a business person with three essential characteristics: (1) s/he has already earned the respect of the other executives, (2) s/he has a healthy skepticism about technology, and (3) s/he is decisive but flexible.

2. Setting Expectations that You Cannot Meet and Frustrating Executives at the Moment Of Truth

Data warehousing projects have at least two phases: (1) the selling phase, in which you attempt to persuade people that they can expect to get wonderful access to the right data through simple, graphical delivery tools, and (2) the struggle to meet the expectations you have raised in phase one.

Data warehouses do not give users all the information they need. All data warehousing is, by necessity, domain specific, which means it focuses on a particular set of business information. Worse still, many warehouses are loaded with summary information - not detail. If a question asked by an executive requires more detail or requires information from outside the domain, the answer is often, "we haven’t loaded that information, but we can, it will just cost (a bunch) and take (many) weeks." Executives focus their frustration on the person who made the promises.

3. Engaging in Politically-Naive Behavior. (e.g. Saying "This Will Help Managers Make Better Decisions")

A foolish error made by many data warehousing managers is promoting the value of their data warehouse with arguments to the effect of, "This will help managers make better decisions." When a self-respecting manager hears those words, the natural reaction is "This person thinks we have not been making good decisions and that his/her system is going to ‘fix’ us." From that point on, that manager is very, very hard to please.

Most experienced CIOs know that the objective of data warehousing is the same one that fueled the fourth generation language boom of the late seventies, and the EIS craze of the late eighties - giving end users better access to important information. Fourth generation languages have had a long and useful life, but EIS had a quick rise and a quicker fall. Why? One possible answer is that 4GLs were sold as tools to get data while EIS were promoted as change agents that would improve business and enable better management decisions. That raised political issues, and made enemies out of potential supporters.

4. Loading the Warehouse with Information "Just Because It Was Available."

Some inexperienced data warehousing managers send a list of tables and data elements to end users along with a request asking, "which of these elements should be included in the warehouse?" Sometimes they ask for categories such as ‘essential’, ‘important’, and ‘nice-to-have’. They get back long lists of marginally useful information that radically expand the data warehouse storage requirements and, more importantly, slow responsiveness. Extraneous data buries important information. Faced with the need to dig through long guides to find the right field name, and having to deal with multiple versions of the same information, users quickly grow frustrated and may even give up entirely.

5. Believing that Data Warehousing Database Design is the Same as Transactional Database Design

Data warehousing is fundamentally different from transaction processing. The goal here is to access aggregates - sums, averages, trends, and more. Another difference is the user. In transaction processing, a programmer develops a query that will be used tens of thousands of times. In data warehousing, an end-user develops the query and may use it only one time. Data warehousing databases are often denormalized to make them easier to navigate for infrequent users.

An even more fundamental difference is in content. Where transactional systems usually contain only the basic data, data warehousing users increasingly expect to find aggregates and time-series information already calculated for them and ready for immediate display. That’s the impetus behind the multi-dimensional database market.
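As a rough illustration of "aggregates already calculated," the sketch below rolls hypothetical transaction rows up into a summary keyed by region and month, so a user reads a finished total and average instead of summing detail rows at query time. The row layout, the sample figures, and the function name build_summary are assumptions made for this example, not part of any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical detail rows as a transactional system might hold them:
# (region, month, sale amount). Values are invented for illustration.
transactions = [
    ("East", "2000-01", 120.0),
    ("East", "2000-01", 80.0),
    ("East", "2000-02", 150.0),
    ("West", "2000-01", 200.0),
]


def build_summary(rows):
    """Precompute the total and average amount per (region, month)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, month, amount in rows:
        totals[(region, month)] += amount
        counts[(region, month)] += 1
    return {
        key: {"total": totals[key], "average": totals[key] / counts[key]}
        for key in totals
    }


# The summary is built once at load time; users then look up ready-made figures.
summary = build_summary(transactions)
print(summary[("East", "2000-01")])
```

The trade-off is the usual one: the summary answers the anticipated questions quickly, but any question that needs the underlying detail still requires the detail rows to be loaded.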

6. Choosing a Data Warehousing Manager Who is Technology-Oriented Rather than User-Oriented

"The biggest mistake I ever made was putting that propeller-head in as the manager of the project." Those are the exact words from the driver at a large oil company, explaining how the user-hostile project manager had made so many people angry that the entire project was in danger of being scrapped.

Do not let his words tar all technologists. Some make excellent project managers and can serve as effective data warehousing managers; however, many cannot. Data warehousing is a service business, not a storage business, and making clients angry is a near-perfect method of destroying a service business.

7. Focusing on Traditional Internal Record-Oriented Data and Ignoring the Potential Value of External Data and of Text, Images, and - Potentially - Sound and Video

A White House study of commercial executives showed that the very highest executives rely on outside data (news, telephone calls from associates, etc.) for more than 95 percent of all the information they use. Because of their focus on external sources of information, senior executives sometimes see data warehouses as irrelevant. Therefore, it’s valuable to extend the project focus to include external information.

In addition, consider expanding the forms of information available through the warehouse. Users are starting to ask, "Where’s the copy of the contract (image) that explains the information behind the data? And where’s the ad (image) that ran in that magazine? Where’s the tape (audio or video) of the key competitor at a recent conference talking about its business strategy? Where’s the recent product launch (video)?" This is the age of television. Traditional alphanumeric data is two generations behind the current technology.

8. Delivering Data with Overlapping and Confusing Definitions

The Achilles heel of data warehousing is the requirement to gain consensus on data definitions. Conflicting definitions each have champions, and they are not easily reconciled. Many of the most stubborn definitions have been constructed by managers to reflect data in a way that makes their department look effective. To the finance manager, sales means the net of revenue less returns. Sales to the distribution people is what needs to be delivered. Sales to the sales organization is the amount committed by clients. One organization reported twenty-seven different definitions of sales.

Executives do not give up their definitions without a fight, and few data warehousing managers are in a position to bully executives into agreement. Solving this problem is one of the most important tasks of the data-warehousing driver. If it is not solved, users will not have confidence in the information they are getting. Worse, they may embarrass themselves by using the wrong data - in which case, they will inevitably blame the data warehouse.
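To make the three definitions of sales described above concrete, here is a small sketch that computes each one from the same order records. The field names (committed, invoiced, returned, awaiting_delivery) and the figures are hypothetical, chosen only to show how one set of data can yield three different "sales" numbers.

```python
# Hypothetical order records; field names and amounts are illustrative only.
orders = [
    {"committed": 1000.0, "invoiced": 800.0, "returned": 50.0, "awaiting_delivery": 200.0},
    {"committed": 500.0, "invoiced": 500.0, "returned": 0.0, "awaiting_delivery": 0.0},
]


def sales_finance(rows):
    """Finance view: revenue net of returns."""
    return sum(r["invoiced"] - r["returned"] for r in rows)


def sales_distribution(rows):
    """Distribution view: what still needs to be delivered."""
    return sum(r["awaiting_delivery"] for r in rows)


def sales_committed(rows):
    """Sales organization view: the amount committed by clients."""
    return sum(r["committed"] for r in rows)


# Labeling each figure with its definition is one way to keep a report honest.
report = {
    "sales (finance, net of returns)": sales_finance(orders),
    "sales (distribution, to be delivered)": sales_distribution(orders),
    "sales (sales org, committed)": sales_committed(orders),
}
for label, value in report.items():
    print(f"{label}: {value:.2f}")
```

Publishing all three under the single label "sales" would produce exactly the confusion this section warns about; agreeing on which definition the warehouse delivers remains a business decision, not a coding one.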

9. Believing the Performance, Capacity, and Scalability Promises

At a recent conference, CIOs from three companies (a manufacturer, a retailer, and a service company) described their data warehousing efforts. Although the three data warehouses were very different, all three ran into an identical problem. Within four months of getting started, each of the CIOs unexpectedly had to purchase at least one additional processor of a size equal to or larger than the largest computer that they had originally purchased for data warehousing. They simply ran out of power. Two of the three had failed to budget for the addition, and found themselves with a serious problem. The third had budgeted for unforeseen difficulties, and was able to adapt.

A very common capacity problem arises in networking. One company reported that it sized a network to support an image warehouse, but discovered that the network was soon overwhelmed. The surprise was that the images were not at fault. The problem turned out to be network traffic for data transfer between the end-user application and the database of indices on the server. The images moved fast, but the process of finding the right one clogged the network. Network overloads are a very common surprise in client/server systems in general and in data warehousing systems in particular.

10. Believing that Once the Data Warehouse is Up and Running, Your Problems are Finished

Each happy data warehouse user asks for new data and tells others about the ‘great new tool.’ And they, too, ask for more data to be added. And all of them want it immediately. At the same time, each performance or delivery problem results in a high-pressure search for additional technology or a new process. Thus the data warehousing project team needs to maintain high energy over long periods of time. A common error is to place data warehousing in the hands of project-oriented people who believe that they will be able to set it up once and have it run itself. Data warehousing is a journey, not a destination.

11. Focusing On Ad Hoc Data Mining and Periodic Reporting. *

This is a subtle error, but an important one. Fixing it may transform a data-warehousing manager from a data librarian into a hero.

The natural progression of information in a data warehouse is (1) extract the data from legacy systems, clean it, and feed it to the warehouse, (2) support ad hoc reporting until you learn what people want, and then (3) convert the ad hoc reports into regularly scheduled reports. That’s the natural progression, but it isn’t the best progression. It ignores the fact that managers are busy and that reports are liabilities rather than assets unless the recipients have time to read the reports.

Alert systems can be a better approach and they can make a data warehouse mission-critical. Alert systems monitor the data flowing into the warehouse and inform all key people with a need to know, as soon as a critical event takes place. Harris Semiconductor’s industry-leading manufacturing alert server, for example, monitors patterns in semiconductor test data, and screams loudly (via email) when wafer characteristics anywhere in the world (Malaysia, Singapore, or three US sites) creep too far from the ideal. Rethink the manager's need: Does he or she really want reports? Or would an alert system be better?
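The core of an alert of this kind can be sketched in a few lines: compare each incoming reading against an ideal value and flag only the ones that drift beyond a tolerance. This is a minimal illustration, not a description of the Harris system; the site readings, the ideal value, the tolerance, and the function names are all assumptions, and actual delivery (for example by email) is left out.

```python
def find_alerts(readings, ideal, tolerance):
    """Return the (site, value) pairs that drift more than `tolerance` from `ideal`."""
    return [(site, value) for site, value in readings if abs(value - ideal) > tolerance]


def format_alert(site, value, ideal):
    """Build the message text; sending it (e.g. by email) is out of scope here."""
    return f"ALERT: {site} reading {value} deviates from ideal {ideal}"


# Hypothetical readings arriving with a warehouse load; values are invented.
readings = [("Malaysia", 10.1), ("Singapore", 11.5), ("US-1", 9.9)]

alerts = find_alerts(readings, ideal=10.0, tolerance=0.5)
messages = [format_alert(site, value, 10.0) for site, value in alerts]
for message in messages:
    print(message)
```

Only the out-of-range site produces a message, which is the point: the manager hears about the exception and is spared the report covering everything that went as expected.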

*You'll find eleven "mistakes" on our list. Believing there are only ten mistakes to avoid is also a mistake, so we’ve given you eleven to keep you on your toes.