Data Warehouse

By Dr. SubraMANI Paramasivam

Evolution of data warehouses

Evolution in organizational use of data warehouses

Organizations generally start off with relatively simple use of data warehousing. Over time, more sophisticated use of data warehousing evolves. The following general stages of use of the data warehouse can be distinguished:

Offline Operational Databases

Data warehouses in this initial stage are developed by simply copying the data of an operational system to another server, so that the processing load of reporting against the copied data does not affect the operational system’s performance.
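As a rough illustration of this stage, the sketch below copies a small operational database and runs a report against the copy only. It uses Python's standard sqlite3 module with in-memory databases as a stand-in for two separate servers; the orders table and its columns are hypothetical and are not taken from any particular system.

```python
import sqlite3

# Stand-in for the operational system (hypothetical "orders" table).
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product TEXT, quantity INTEGER)"
)
operational.executemany(
    "INSERT INTO orders (product, quantity) VALUES (?, ?)",
    [("widget", 2), ("gadget", 1), ("widget", 3)],
)
operational.commit()

# Stand-in for "another server": a plain copy of the operational data.
reporting_copy = sqlite3.connect(":memory:")
operational.backup(reporting_copy)

# The reporting query runs against the copy, not the operational system.
report = reporting_copy.execute(
    "SELECT product, SUM(quantity) FROM orders GROUP BY product ORDER BY product"
).fetchall()
print(report)
```

Note that the copy keeps the operational table layout unchanged; nothing is restructured for reporting at this stage.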

Offline Data Warehouse

Data warehouses at this stage are updated from the operational systems on a regular basis, and the warehouse data is stored in a data structure designed to facilitate reporting.
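A minimal sketch of such a periodic load follows, again using sqlite3 with in-memory databases in place of real servers. The table names (orders, daily_sales), the columns, and the refresh_warehouse function are assumptions made for illustration; in practice a scheduler would call the refresh on a regular basis.

```python
import sqlite3


def refresh_warehouse(operational, warehouse):
    """Batch load: summarize operational orders into a reporting table."""
    rows = operational.execute(
        "SELECT order_date, product, SUM(quantity), SUM(quantity * unit_price) "
        "FROM orders GROUP BY order_date, product"
    ).fetchall()
    # Full refresh for simplicity; an incremental load is another option.
    warehouse.execute("DELETE FROM daily_sales")
    warehouse.executemany("INSERT INTO daily_sales VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()


# Operational side: one row per order line (hypothetical schema).
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, "
    "product TEXT, quantity INTEGER, unit_price REAL)"
)
operational.executemany(
    "INSERT INTO orders (order_date, product, quantity, unit_price) VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", "widget", 2, 5.0),
        ("2024-01-01", "widget", 3, 5.0),
        ("2024-01-02", "gadget", 1, 20.0),
    ],
)
operational.commit()

# Warehouse side: a structure shaped for reporting, not for transactions.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE daily_sales (order_date TEXT, product TEXT, units INTEGER, revenue REAL)"
)

refresh_warehouse(operational, warehouse)
daily_sales = warehouse.execute(
    "SELECT order_date, product, units, revenue FROM daily_sales "
    "ORDER BY order_date, product"
).fetchall()
print(daily_sales)
```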

Real Time Data Warehouse

Data warehouses at this stage are updated every time an operational system performs a transaction (e.g., an order, a delivery, or a booking).

Integrated Data Warehouse

Data warehouses at this stage are updated every time an operational system performs a transaction. The data warehouses then generate transactions that are passed back into the operational systems.
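The last two stages can be sketched together: the warehouse is updated on every transaction (real time), and it also generates a transaction that is passed back to the operational system (integrated). The class names, the reorder rule, and the threshold of 10 units below are all hypothetical; they only illustrate the direction of the data flow.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class OperationalSystem:
    """Stand-in for an operational system that can accept transactions."""
    inbox: list = field(default_factory=list)

    def receive(self, transaction: dict) -> None:
        self.inbox.append(transaction)


@dataclass
class IntegratedWarehouse:
    operational: OperationalSystem
    reorder_threshold: int = 10  # hypothetical business rule
    units_sold: dict = field(default_factory=lambda: defaultdict(int))
    flagged: set = field(default_factory=set)

    def on_transaction(self, product: str, quantity: int) -> None:
        # Real-time stage: the warehouse is updated per transaction.
        self.units_sold[product] += quantity
        # Integrated stage: the warehouse generates a transaction that is
        # passed back to the operational system (once per product here).
        if (
            self.units_sold[product] >= self.reorder_threshold
            and product not in self.flagged
        ):
            self.flagged.add(product)
            self.operational.receive({"type": "reorder", "product": product})


ops = OperationalSystem()
warehouse = IntegratedWarehouse(ops)
warehouse.on_transaction("widget", 4)
warehouse.on_transaction("widget", 7)  # cumulative total reaches 11
warehouse.on_transaction("gadget", 1)
print(ops.inbox)
```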

Data Warehousing – Definition Plus

The data warehousing market consists of tools, technologies, and methodologies that allow for the construction, usage, management, and maintenance of the hardware and software used for a data warehouse, as well as the actual data itself.

To clear up some of the confusion that is rampant in the market, Definition Plus provides you with the facts:


Data Warehouse:

The term Data Warehouse was coined by Bill Inmon in 1990. He defined it in the following way: “A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process”. He defined the terms in the sentence as follows:

Subject Oriented:

Data that gives information about a particular subject rather than about a company’s ongoing operations.

Integrated:

Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole.

Time-variant:

All data in the data warehouse is identified with a particular time period.

Non-volatile:

Data is stable in a data warehouse. More data is added but data is never removed. This enables management to gain a consistent picture of the business.

(Source: “What is a Data Warehouse?” W.H. Inmon, Prism, Volume 1, Number 1, 1995).

This definition remains reasonably accurate almost ten years later. However, a single-subject data warehouse is typically referred to as a data mart, while data warehouses are generally enterprise in scope. Also, data warehouses can be volatile: because of the large amount of storage a data warehouse requires (multi-terabyte data warehouses are not uncommon), only a certain number of periods of history are kept in the warehouse.
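To make these properties concrete, here is a small sketch under stated assumptions: facts from several sources are merged into one store (integrated), every fact carries a period (time-variant), loads only append (non-volatile), and a retention window drops the oldest periods (the volatility noted above). The Fact and Warehouse classes, the "YYYY-MM" period format, and the sample sources are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    period: str     # time-variant: every fact carries its period ("YYYY-MM")
    source: str     # integrated: facts from several sources share one store
    customer: str
    revenue: float


class Warehouse:
    def __init__(self, periods_kept):
        if periods_kept < 1:
            raise ValueError("periods_kept must be at least 1")
        self.periods_kept = periods_kept
        self._facts = []

    def load(self, period, source, rows):
        # Non-volatile in spirit: loads only append, existing facts are
        # never modified in place.
        self._facts.extend(Fact(period, source, c, r) for c, r in rows)
        self._prune()

    def _prune(self):
        # Retention window: keep only the most recent periods of history.
        periods = sorted({f.period for f in self._facts})
        keep = set(periods[-self.periods_kept:])
        self._facts = [f for f in self._facts if f.period in keep]

    def revenue_by_period(self):
        totals = {}
        for f in self._facts:
            totals[f.period] = totals.get(f.period, 0.0) + f.revenue
        return totals


wh = Warehouse(periods_kept=2)
wh.load("2024-01", "crm", [("acme", 100.0)])
wh.load("2024-01", "billing", [("acme", 50.0)])
wh.load("2024-02", "crm", [("acme", 70.0)])
wh.load("2024-03", "crm", [("acme", 30.0)])  # pushes 2024-01 out of the window
totals = wh.revenue_by_period()
print(totals)
```

Sorting the periods as strings works here only because of the assumed "YYYY-MM" format; other period encodings would need an explicit ordering.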

Ralph Kimball provided a much simpler definition of a data warehouse. As stated in his book, “The Data Warehouse Toolkit”, on page 310, a data warehouse is “a copy of transaction data specifically structured for query and analysis”. This definition provides less insight and depth than Mr. Inmon’s, but is no less accurate.
