Data Warehouse vs. OLAP Cube?

Can anyone explain what is really distinction between Data Warehouse and OLAP Cubes?

Are they different approach for same thing?

Is one of them deprecated in comparison with other?

Are there any performance issues in one of them?

Any explanation is welcomed

57376 次浏览

A data warehouse holds the data you wish to run reports on, analyze, etc.

A cube organize this data by grouping data into defined dimensions. You can have multiple dimensions (think a uber-pivot table in Excel).

For example, in your data warehouse you have all your sales, but running complex SQL queries can be time consuming. So from your data warehouse you create a cube which indexes and precompute the data. In your cube you could have all those precomputed dimensions : sales by months, by week, by salesman, by client, by geographical region, by product color, etc. Then you can run OLAP queries on your cube to have the total, average and maximum sales by (month, salesman, region), or by (color, region), or by (salesman, month). Since all the data is precomputed and indexed, the queries are really fast.

A data warehouse is a database with a design that makes analyzing data easier† and faster, often with data from multiple sources. It usually has a dimensional model, meaning fact tables and dimension tables.

OLAP is a set of operations that one can do on a data set, such as pivoting, slicing, dicing, drilling. For example, one can do OLAP operations with Excel PivotTables. There are certain SQL statements which are "for OLAP", such as PIVOT, group by CUBE(), group by ROLLUP(), and group by GROUPING SETS(), as well as the various window functions

An OLAP Server is a type of server software that facilitates OLAP operations, for example with caching and query re-writing. OLAP operations are often expressed in MDX, and your OLAP server might translate MDX into regular SQL for your database. Or it might work against its own binary file format. A dimensional model inside an OLAP server is called an OLAP cube

You can have a data warehouse and not use OLAP at all (you just run reports).

You can also do OLAP operations on something other than a data warehouse, such as a flat file.

Are they different approach for same thing?

No, a data warehouse is a place to store data in an easily analyzable format, and OLAP is a method to analyze data.

Are one of them deprecated in comparison with other?

No, they compliment each other in that a data warehouse makes it easy to analyze data using OLAP, and OLAP can make analyzing a data warehouse more useful.

Is there any performance issues in one of them?

Yes. A data warehouse is meant to store lots and lots of data, and thus it will take time to query. Performance can be improved by using indexes or a columnar db, caching, RAID 10 SSDs, partitioning, and by pre-aggregating some data.

See also: https://dba.stackexchange.com/questions/45655/what-are-measures-and-dimensions-in-cubes

† as opposed to making transactions easier/more integral

Are they different approach for same thing?

No, a data warehouse is a place to store data in an easily analyzable format, and OLAP is a method to analyze data.

No, they really do the same things! OLAP is more precalculate than DWH. OLAP is like aggregates in DWH