--:-- --
↓ Scroll for more

Unit 3.5

Data Warehouses and Data Marts for Business Intelligence

IT 233: Business Information Systems

Data Warehouse Title

Learning Objectives

By the end of this session, you will be able to:

  • ✅ Define a data warehouse and its purpose in supporting Business Intelligence (BI).
  • ✅ Describe the four key characteristics of a data warehouse.
  • ✅ Differentiate between a data warehouse (OLAP) and a transactional database (OLTP).
  • ✅ Define a data mart and explain its relationship to a data warehouse.

The Core Problem: Two Different Workloads

⚡ Day-to-Day Operations

Running the business requires handling thousands of small, fast transactions.

  • Examples: Selling an item, withdrawing cash from an ATM, booking a ticket.
  • Focus: Speed, accuracy, and real-time data.
  • This is called Online Transaction Processing (OLTP).

📊 Big Picture Analysis

Analyzing the business requires asking complex questions over large amounts of historical data.

  • Examples: "What are our sales trends over 5 years?", "Which products are most profitable?"
  • Focus: Insights, trends, and patterns.
  • This is called Online Analytical Processing (OLAP).

Problem: Running complex analysis (OLAP) on a live operational system (OLTP) can drastically slow it down, impacting daily business!

The Solution: The Data Warehouse 🏛️

Data Warehouse: A special type of database designed specifically for data analysis and reporting (OLAP). It serves as a central repository of historical and integrated data from various sources.

Think of it as the company's "corporate memory" or a "single source of truth."

Main Purpose

To separate the analytical workload (OLAP) from the transactional workload (OLTP), enabling powerful Business Intelligence without disrupting operations.

OLTP vs. OLAP: A Head-to-Head Comparison

OLTP (Transactional Database)

  • Purpose: Run the business
  • Data: Current, real-time
  • Queries: Simple, fast, repetitive
    (e.g., "Find customer #123")
  • Users: Front-line workers, clerks
  • Example: E-commerce checkout system

OLAP (Data Warehouse)

  • Purpose: Analyze the business
  • Data: Historical, summarized
  • Queries: Complex, multi-dimensional
    (e.g., "Compare Q1 sales by region for the last 3 years")
  • Users: Analysts, managers, executives
  • Example: Sales forecasting dashboard

The 4 Pillars of a Data Warehouse (Part 1)

1. Subject-Oriented 🎯

Data is organized around the major subjects of the business.

  • Subjects: Customer, Product, Sales
  • Instead of: Business processes like 'Order Entry' or 'Invoicing'.
  • This makes it intuitive for business analysis.

2. Integrated 🔗

Data is combined from multiple, different sources into a single, consistent format.

  • Sources: CRM, ERP, flat files, APIs
  • Process: Data is cleaned and standardized.
  • Example: "USA", "U.S.", "United States" all become "USA".
Data Cube
ETL Pipeline

The 4 Pillars of a Data Warehouse (Part 2)

3. Time-Variant ⏳

Data represents a historical record, allowing for analysis over time.

  • Contains data over a long period (e.g., 5-10 years).
  • Each record is associated with a point in time.
  • Essential for trend analysis and forecasting.

4. Non-Volatile 🔒

Once data is loaded into the warehouse, it is stable and does not change.

  • Data is read-only.
  • New data is added periodically (e.g., daily, weekly), but old data is not overwritten.
  • Ensures a reliable historical record for analysis.

Introducing Data Marts 🔍

Data Mart: A smaller, simpler version of a data warehouse that is focused on a single subject or line of business.

It's a subset of the enterprise data warehouse, designed for a specific department's needs.

Sales Data Mart

Contains data on customers, sales, and performance for the sales team.

Finance Data Mart

Contains data on revenue, costs, and profits for the finance department.

Marketing Data Mart

Contains data on campaigns, leads, and customer segments for marketing.

Why use a Data Mart? Faster access, simpler to use, and quicker to build for specific user groups.

Warehouse vs Mart
Decision Support

Practical Application: A Nepali Telco

Scenario: Analyzing Customer Data at Ncell or Nepal Telecom

OLTP System: The live billing and call record system. It processes millions of transactions (calls, SMS, data usage) per minute. You can't run slow, heavy queries here!

Data Warehouse: Integrates data from call records, billing systems, customer recharge cards, and CRM. It allows analysts to ask complex questions like: "What was the average data consumption per user in Province 3 during the Dashain festival season over the last 5 years?"

Marketing Data Mart: The marketing team gets a focused data mart with just customer demographics, recharge history, and service usage. This allows them to quickly analyze the success of a new data pack offer without needing to query the entire, massive data warehouse.

Chapter 5: Key Takeaways

  • OLTP vs. OLAP: Operational databases (OLTP) run the business; data warehouses (OLAP) analyze the business.
  • The 4 Pillars: Data warehouses are Subject-Oriented, Integrated, Time-Variant, and Non-Volatile.
  • Single Source of Truth: A data warehouse provides one consistent, trusted version of data for reliable business intelligence.
  • Data Marts: These are smaller, departmental subsets of a data warehouse that provide focused, faster data access for specific user groups.

Thank You

Any questions?


Next Up: Chapter 6 - Introduction to Big Data

Back to Start