What is a data warehouse? It is a central data hub of data used for reporting and data analytics. Usually the data is highly structured. The data warehouse has been around for a while. In 1989, Bill Inmon originated the concept of a data warehouse. Back then data warehouses on on-premises. Today, even very small companies can use data warehouses in the cloud because they can use the scalable pay-as-you-go model.
There are two types of data warehouses: organizational and technical. A technical data warehouse architecture reflects uthe technical nature of the data warehouse, such as massively parallel processing (MPP). The organizational data warehouse architecture organizes the data along the business teams, functions and processes. For example, the sales data would be organized together.
The organizational data warehouse architecture separates online analytical processing (OLAP) from production databases (online transaction processing – OLTP). As a business grows and the data grows, you’ll want to keep your analytics queries from slowing down the production system. Analytics performance will also be better this way.
The second characteristic is that the organizational database centralizes data and organizes it. Traditionally, a data warehouse pulls data from application systems by using ETL (extract, transform, load). The transformation phase cleans and standardizes data and the load phase pushes the data into the warehouse’s target database system. Next, the data is loaded into the appropriate target data mart.
A variation of the ETL is ELT. Here, data is moved from production systems into a staging area in the data warehouse. The data is in raw form. Transformations are done in the data warehouse, in batches.