What is a Data
Warehouse ?
To keep it simple,
let us take Bill Immon’s definition of
the Data Warehouse (one of earliest authors on DWH):
“A Data
Warehouse is a central repository for the Entire Enterprise”.
As we know, the
end product of Business Intelligence are the reports/dashboards that assist the
stakeholders in analyzing the trends and help them take decisions. But it is
important that these reports are projected on data that is accumulated over a
period of time (historical data) and not just recent data or data from a single
system. In other words, these reports would
make sense only when they are created on top of integrated and historical data.
Considering the fact that Organizations these days have their data scattered
across multiple systems, platforms and formats, DWH’s role becomes all the more
crucial as it acts as a one stop destination.
How is it
different from a traditional database?
Following are the differences between Operational/Transactional
systems and a DWH:
Operational/Transactional
database
|
Data
Warehouse
|
It
deals with operational data i.e. data involved in the operation of a
particular system and it is characterized by a large number of short on-line
transactions
Features
§ Designed Primarily for fast INSERTs,
UPDATEs and DELETEs.
§ Quick
Singular Transactions.
§ Operational/transactional
Data
§ Application
specific DBs
|
It
is a collection of Historical and integrated Data accumulated from various
operational systems.
Features
§ Designed Primarily for : SELECTs
§ Fast
Reading of Large Data Sets
§ Historical
Data or Archival Data
§ Integrated data set with a global
relevance
|
Now that we got to know that a Data Warehouse is the most important Cog in the BI (Business Intelligence) wheel, it is time to get our hands dirty and scan deeper into the Data Warehouse (DWH).