Unit 3.2
The Challenges of Managing Data
Learning Objectives
By the end of this unit, you will be able to:
- Identify the main difficulties organizations face when managing data.
- Describe the traditional file-based approach to data management.
- Explain the major problems caused by the file-based approach.
The Core Challenge
Organizations collect more data than ever before, but managing it is a massive challenge.
Fundamental Principle: The value of data depends entirely on its quality.
"Garbage In, Garbage Out" (GIGO)
Effective management is crucial for turning raw data into a valuable business asset.
Challenge 1: The Data Deluge
Organizations are overwhelmed by the sheer volume of data from diverse sources.
- Internal Transactions: Sales, inventory, HR records
- Web & Social Media: Clickstreams, user comments, likes
- IoT Devices: Sensor data, smart appliances
This creates significant technical and financial challenges for storage and processing.
Challenge 2: Data Quality
Poor data quality leads to flawed analysis and bad business decisions.
Common Quality Issues:
- Inaccurate: Incorrect customer names or numbers
- Incomplete: Missing contact information
- Inconsistent: "Kathmandu" vs. "KTM"
- Out of Date: Old addresses or phone numbers
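Three of these issues can be checked mechanically. The sketch below is a minimal illustration, not a complete data-quality tool: the records, field names, the `CITY_ALIASES` mapping, and the three-year staleness threshold are all assumptions made up for this example. Inaccurate data is deliberately left out, because detecting it requires a trusted reference source to compare against.

```python
from datetime import date

# Hypothetical customer records; names, fields, and values are illustrative only.
customers = [
    {"name": "Sita Sharma", "phone": "9800000001", "city": "Kathmandu",
     "updated": date(2024, 5, 1)},
    {"name": "Ram Thapa", "phone": "", "city": "KTM",
     "updated": date(2015, 1, 10)},
]

# Assumed mapping from known variant spellings to one canonical city name.
CITY_ALIASES = {"KTM": "Kathmandu"}


def quality_issues(record, today, max_age_days=365 * 3):
    """Return a list of quality problems found in a single record."""
    issues = []
    if not record["phone"]:
        issues.append("incomplete: missing phone")
    if record["city"] in CITY_ALIASES:
        issues.append("inconsistent: city written as " + record["city"])
    if (today - record["updated"]).days > max_age_days:
        issues.append("out of date: last updated " + record["updated"].isoformat())
    return issues


# A fixed "today" keeps the example reproducible.
report = {c["name"]: quality_issues(c, today=date(2025, 1, 1)) for c in customers}
for name, issues in report.items():
    print(name, "->", issues or "no issues found")
```

In this made-up data, the first record passes all three checks and the second fails all three.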
Challenge 3: Data Security
Data is a valuable asset that must be protected from threats.
Ensuring data security and privacy is a complex and continuous effort.
- Protecting against unauthorized internal access.
- Defending against external cyberattacks and data breaches.
- Complying with data privacy regulations.
Challenge 4: Data Governance
Without clear rules, data becomes chaotic and loses its value.
Data Governance: A collection of policies and procedures to manage the availability, usability, integrity, and security of data.
Lack of governance leads to:
- Inconsistent data definitions
- Unclear data ownership
- Diminished trust in data across the organization
The Traditional Solution: File-Based Approach
Before modern databases, organizations managed data in separate, application-specific files.
- Sales Dept.: customer_sales.dat, orders.dat
- Accounting Dept.: customer_billing.dat, invoices.dat
- Marketing Dept.: customer_prospects.dat, campaigns.dat
This created "Information Silos," where departments could not easily share data.
Problem #1: Data Redundancy
Data Redundancy: The same piece of data is stored in multiple, separate files.
Example: A customer's address exists in the Sales file, the Accounting file, AND the Marketing file.
Consequences:
- Wasted storage space.
- Increased complexity in data management.
- Leads directly to the next, more serious problem...
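Redundancy can be made concrete with a small sketch. The dictionaries below are hypothetical in-memory stand-ins for the three departmental customer files; the customer ID, name, and address are invented for illustration, and real `.dat` files would each have their own on-disk format.

```python
# Hypothetical stand-ins for the three departmental customer files.
files = {
    "customer_sales.dat": {
        "C001": {"name": "Sita Sharma", "address": "Lakeside, Pokhara"},
    },
    "customer_billing.dat": {
        "C001": {"name": "Sita Sharma", "address": "Lakeside, Pokhara"},
    },
    "customer_prospects.dat": {
        "C001": {"name": "Sita Sharma", "address": "Lakeside, Pokhara"},
    },
}


def copies_of(customer_id, field):
    """List every file that stores its own copy of the given field."""
    return [
        file_name
        for file_name, records in files.items()
        if field in records.get(customer_id, {})
    ]


address_copies = copies_of("C001", "address")
print(len(address_copies), "separate copies of the address:", address_copies)
```

Every extra copy is storage that holds no new information and one more place that must be kept up to date.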
Problem #2: Data Inconsistency
A direct result of data redundancy. When data is duplicated, it often becomes inconsistent.
Scenario: Address Change
- A customer calls Sales to update their address.
- The Sales clerk updates customer_sales.dat.
- The address in customer_billing.dat and customer_prospects.dat remains unchanged.
Result: The organization now has multiple, conflicting "versions of the truth."
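The address-change scenario can be simulated in a few lines. This is a sketch under stated assumptions: the dictionaries stand in for the three files, and `update_address` is a hypothetical helper that, like the Sales clerk, only knows about one file.

```python
# Hypothetical stand-ins for the three files; all start with the same address.
files = {
    "customer_sales.dat": {"C001": {"address": "Lakeside, Pokhara"}},
    "customer_billing.dat": {"C001": {"address": "Lakeside, Pokhara"}},
    "customer_prospects.dat": {"C001": {"address": "Lakeside, Pokhara"}},
}


def update_address(file_name, customer_id, new_address):
    """Change the address in ONE file only, as a departmental application would."""
    files[file_name][customer_id]["address"] = new_address


# The customer calls Sales; only the Sales file is updated.
update_address("customer_sales.dat", "C001", "Thamel, Kathmandu")

# Collect the address each file now reports for the same customer.
versions = {name: records["C001"]["address"] for name, records in files.items()}
distinct_addresses = set(versions.values())

for name, address in versions.items():
    print(name, "->", address)
print("Conflicting versions:", len(distinct_addresses))
```

Nothing in the file-based setup propagates the change, so the three files now disagree until someone fixes each one by hand.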
Problem #3: Data Isolation
Data Isolation: Data is stored in different files with different formats, making it difficult to access and integrate.
Example: A manager needs a report on "total sales per customer and their outstanding balance."
- Sales data is in the Sales system.
- Billing data is in the Accounting system.
- Creating this report requires a complex, often manual, process of extracting and combining data.
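The manual extract-and-combine process might look roughly like the sketch below. The two file layouts (a comma-separated sales file and a pipe-delimited billing file), the column order, and all values are assumptions invented for this example; the point is only that each format needs its own parsing code before the data can be joined.

```python
import csv
import io

# Hypothetical file contents in two different, department-specific formats.
sales_csv = "customer_id,amount\nC001,1500\nC002,700\nC001,500\n"
billing_txt = "C001|Sita Sharma|300\nC002|Ram Thapa|0\n"

# Step 1: extract total sales per customer from the comma-separated Sales data.
total_sales = {}
for row in csv.DictReader(io.StringIO(sales_csv)):
    customer_id = row["customer_id"]
    total_sales[customer_id] = total_sales.get(customer_id, 0) + int(row["amount"])

# Step 2: extract names and outstanding balances from the pipe-delimited
# Accounting data, which needs separate parsing logic.
balances = {}
for line in billing_txt.splitlines():
    customer_id, name, balance = line.split("|")
    balances[customer_id] = (name, int(balance))

# Step 3: combine both sources by customer ID into one report.
report = [
    (cid, balances[cid][0], total_sales.get(cid, 0), balances[cid][1])
    for cid in sorted(balances)
]

for cid, name, sales, balance in report:
    print(cid, name, "total sales:", sales, "outstanding:", balance)
```

Each new report of this kind needs its own extraction code, and the join only works if both files happen to use the same customer IDs.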
Real-World Impact: A Nepali Context
Scenario: A Bank in Nepal Using a File-Based System
Imagine you open a savings account at a branch in Pokhara. Later, you apply for a loan at a different branch in Kathmandu.
What problems might you face?
- You have to provide all your KYC ("Know Your Customer") details again.
- If you update your phone number for your loan, your savings account record might not be updated, causing you to miss important alerts.
- The bank struggles to get a single view of you as a customer, making it hard to offer you relevant products.
Summary & Key Takeaways
- Modern data management faces challenges of volume, quality, security, and governance.
- The traditional file-based approach created isolated "information silos."
- Data redundancy (duplicate data) wastes resources and directly causes data inconsistency (conflicting data).
- Data isolation makes it nearly impossible to get a holistic, unified view of the business.
- These problems highlight the need for a modern database approach.
Thank You!
Any Questions?
Next Topic: The Database Approach
Creating a "Single Source of Truth"