This course provides state-of-the-art coverage of data warehousing (DW) and its use in a business intelligence platform. We will study the Dimensional Fact Model (DFM), which provides the conceptual layer of a DW, and then discuss the logical models used to represent multidimensional data structures: ROLAP and MOLAP. Next, we will discuss the steps involved in populating a DW: ETL (Extract, Transform, and Load). We will also study modern techniques for handling big data in the context of data warehousing: column-store databases, NoSQL, and Hadoop.
Introduction: DW
- DW: requirements, basic architecture, and life cycle
Conceptual Modeling of DW:
- The DFM: facts, measures, dimensions, and cubes
- Events and aggregation: additive and non-additive measures, aggregation along hierarchies
- Advanced concepts: slowly changing dimensions and dynamic hierarchies
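As a small illustration of the additive versus non-additive distinction listed above, the following sketch rolls a handful of sales events up a city-to-country hierarchy. The event data, the column layout, and the helper names (`roll_up_quantity`, `average_price`) are hypothetical and chosen only for this example; it is not taken from the course material.

```python
from collections import defaultdict

# Hypothetical sales events: (city, country, quantity, unit_price)
events = [
    ("Vienna", "Austria", 10, 2.0),
    ("Vienna", "Austria", 30, 4.0),
    ("Graz", "Austria", 20, 6.0),
]


def roll_up_quantity(rows, level):
    """Sum the additive measure 'quantity' at a hierarchy level (0 = city, 1 = country)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[level]] += row[2]
    return dict(totals)


def average_price(rows):
    """Average of the non-additive measure 'unit_price' over the given events."""
    return sum(r[3] for r in rows) / len(rows)


by_city = roll_up_quantity(events, 0)
by_country = roll_up_quantity(events, 1)

# Additive measure: summing the city totals reproduces the country total.
additive_ok = sum(by_city.values()) == by_country["Austria"]

# Non-additive measure: averaging the per-city averages is not the country average.
city_avgs = [
    average_price([r for r in events if r[0] == city]) for city in by_city
]
avg_of_avgs = sum(city_avgs) / len(city_avgs)
overall_avg = average_price(events)

# Keeping additive support measures (sum and count) per city makes the
# average recomputable at the coarser level.
support = {
    city: (sum(r[3] for r in events if r[0] == city),
           sum(1 for r in events if r[0] == city))
    for city in by_city
}
avg_from_support = (
    sum(s for s, _ in support.values()) / sum(c for _, c in support.values())
)
```

Here the quantity totals agree at both levels, while the average of the per-city averages differs from the overall average; carrying the sum and the count as support measures recovers the correct value.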
Logical Modeling of DW:
- ROLAP versus MOLAP
- Star schemas and snowflake schemas
- View materialization and a greedy algorithm for view selection
ETL (Extract, Transform, and Load):
- Immediate and delayed extraction, computing deltas
- Loading dimension and fact tables, and populating materialized views
- Data cleansing
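One simple way to picture "computing deltas" in delayed extraction is to compare two snapshots of a source table keyed by primary key. The sketch below assumes that approach; the function name `compute_delta`, the snapshot layout, and the sample rows are hypothetical, and real extraction often relies on logs, timestamps, or triggers instead.

```python
def compute_delta(old_snapshot, new_snapshot):
    """Compare two snapshots (dicts mapping primary key -> row tuple).

    Returns three dicts: rows inserted, rows deleted, and rows whose
    content changed (with their new values).
    """
    inserted = {k: v for k, v in new_snapshot.items() if k not in old_snapshot}
    deleted = {k: v for k, v in old_snapshot.items() if k not in new_snapshot}
    updated = {
        k: v
        for k, v in new_snapshot.items()
        if k in old_snapshot and old_snapshot[k] != v
    }
    return inserted, deleted, updated


# Hypothetical customer table at two extraction times.
old = {1: ("Alice", "Vienna"), 2: ("Bob", "Graz"), 3: ("Carol", "Linz")}
new = {1: ("Alice", "Vienna"), 2: ("Bob", "Salzburg"), 4: ("Dave", "Innsbruck")}

inserted, deleted, updated = compute_delta(old, new)
```

Only the three resulting sets need to be passed on to the transformation and loading steps; unchanged rows (key 1 here) are skipped.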
Real-time Business Intelligence with DW:
- Detecting changes with sentinels: a new type of data mining
- In-memory implementations for DW
This is a visiting professor course of the Vienna PhD School of Informatics in the area of Business Informatics.
Lecturer: Peter Scheuermann, Northwestern University, US.