IT 233: Business Information Systems
By the end of this chapter, you will be able to:
Big Data refers to vast, complex datasets that are too large to be managed or analyzed using traditional data processing tools.
The challenge isn't just storage. It's about...
Big Data is commonly defined by five key characteristics:
Volume: the scale of data
Velocity: the speed of data
Variety: the different forms of data
Veracity: the trustworthiness of data
Value: the business outcome from data
Volume refers to the sheer quantity of data being generated and stored.
We've moved beyond Gigabytes (GB) and Terabytes (TB) to...
Example: Facebook stores hundreds of petabytes of user photos and videos. The Large Hadron Collider's detectors produce on the order of 1 petabyte of raw collision data per second, most of which is filtered out before storage.
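To get a feel for these scales, the short Python sketch below converts between storage units and works out how quickly a 1 PB/s stream would fill a drive. It assumes decimal (SI) units, where 1 PB = 1,000 TB; the 1 TB drive and the helper names `convert` and `seconds_to_fill` are illustrative and not part of the chapter.

```python
# Decimal (SI) storage units expressed in bytes (assumption: SI, not binary, units).
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}


def convert(amount, from_unit, to_unit):
    """Convert a storage amount from one unit in UNITS to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


def seconds_to_fill(capacity, capacity_unit, rate, rate_unit):
    """Seconds needed to fill a store at a constant data rate per second."""
    capacity_bytes = capacity * UNITS[capacity_unit]
    rate_bytes_per_second = rate * UNITS[rate_unit]
    return capacity_bytes / rate_bytes_per_second


one_pb_in_tb = convert(1, "PB", "TB")
one_pb_in_gb = convert(1, "PB", "GB")

# Hypothetical scenario: a 1 TB drive receiving data at 1 PB per second.
fill_time = seconds_to_fill(1, "TB", 1, "PB")

print(f"1 PB = {one_pb_in_tb:,.0f} TB = {one_pb_in_gb:,.0f} GB")
print(f"A 1 TB drive fills in {fill_time * 1000:.0f} ms at 1 PB/s")
```

At that rate a single consumer drive would be full in about a millisecond, which is why data at this scale has to be filtered or spread across many machines.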
Velocity is the speed at which new data is created and must be processed.
Often, insights are needed in real-time or near-real-time to be useful.
Example: Real-time stock market analysis, live social media trend monitoring, or data from IoT sensors on a factory floor require immediate processing.
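As a rough illustration of processing data as it arrives, the sketch below keeps a rolling average over the most recent sensor readings and raises an alert as soon as that average crosses a threshold. The temperature values, window size, threshold, and the names `RollingAverage` and `detect_alerts` are all made up for this example; a real factory system would typically rely on a dedicated streaming platform.

```python
from collections import deque


class RollingAverage:
    """Tracks the average of the most recent `window_size` readings."""

    def __init__(self, window_size):
        # A deque with maxlen discards the oldest reading automatically.
        self.window = deque(maxlen=window_size)

    def update(self, reading):
        """Add a new reading and return the current rolling average."""
        self.window.append(reading)
        return sum(self.window) / len(self.window)


def detect_alerts(readings, window_size, threshold):
    """Return the positions at which the rolling average exceeds the threshold."""
    monitor = RollingAverage(window_size)
    alerts = []
    for index, reading in enumerate(readings):
        # Each reading is handled once, as it arrives, without storing the full history.
        if monitor.update(reading) > threshold:
            alerts.append(index)
    return alerts


# Hypothetical temperature readings from one machine on a factory floor.
temperatures = [70, 71, 69, 90, 95, 98, 72]
alerts = detect_alerts(temperatures, window_size=3, threshold=80)
print(alerts)  # [4, 5, 6]
```

The single spike to 90 does not trigger an alert on its own; the alert fires only once the recent average is high, which is one simple way to avoid reacting to a lone noisy reading.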
Variety refers to the different forms that data can take. Big Data is rarely neat and tidy.
Structured: highly organized, like a spreadsheet or SQL database.
Unstructured: no predefined format, like text, images, or video.
Semi-structured: has tags or markers that describe the data, like XML or JSON files.
The vast majority of Big Data is unstructured.
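The difference between these forms is easiest to see when you try to read them with code. The sketch below loads one small sample of each using only Python's standard library; the orders, names, and review text are invented for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (like a spreadsheet).
structured = "order_id,customer,amount\n1001,Asha,250.00\n1002,Bikram,120.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total_sales = sum(float(row["amount"]) for row in rows)

# Semi-structured: keys label the data, but records may nest or omit fields.
semi_structured = (
    '{"order_id": 1003, "customer": {"name": "Chen"}, "tags": ["gift", "express"]}'
)
record = json.loads(semi_structured)
customer_name = record["customer"]["name"]
coupon = record.get("coupon")  # This record has no coupon field, so this is None.

# Unstructured: free text with no fields at all; meaning must be extracted.
unstructured = "Loved the quick delivery, but the box arrived slightly damaged."
word_count = len(unstructured.split())

print(total_sales)    # 370.5
print(customer_name)  # Chen
print(coupon)         # None
print(word_count)     # 10
```

Summing a column of structured data takes one line, while the review text only yields a word count without further work. That gap is why unstructured data usually needs extra tools, such as text analytics, before it can support a business decision.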
Veracity refers to the trustworthiness, accuracy, and quality of the data.
With data from so many sources, uncertainty and "noise" are major challenges.
Example: Analyzing social media sentiment is difficult due to sarcasm, slang, and fake accounts. This affects the data's veracity and can lead to wrong conclusions.
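A minimal sketch of one way to reduce noise before analysis is shown below: it drops empty posts, repeated posts, and posts from accounts already flagged as suspicious. The field names (`user`, `text`, `suspected_bot`) and the sample posts are hypothetical, and the sketch assumes some earlier step has already flagged the suspicious accounts.

```python
def clean_posts(posts):
    """Keep posts that have text, are not flagged as bots, and are not duplicates."""
    seen = set()
    kept = []
    for post in posts:
        text = (post.get("text") or "").strip()
        if not text:
            continue  # Missing or empty text carries no sentiment.
        if post.get("suspected_bot"):
            continue  # Assumes an earlier step flagged fake accounts.
        key = (post.get("user"), text.lower())
        if key in seen:
            continue  # Same user repeating the same message.
        seen.add(key)
        kept.append(post)
    return kept


# Hypothetical sample of social media posts.
posts = [
    {"user": "a", "text": "Great service!"},
    {"user": "a", "text": "great service!"},
    {"user": "b", "text": ""},
    {"user": "c", "text": "Buy followers now", "suspected_bot": True},
    {"user": "d", "text": "Oh great, another delay."},
]

cleaned = clean_posts(posts)
usable_share = len(cleaned) / len(posts)
print(len(cleaned), usable_share)  # 2 0.4
```

Only two of the five posts survive, and the sarcastic post from user "d" is still in the cleaned data. Simple rules remove obvious noise, but they cannot judge tone, so veracity remains a concern even after cleaning.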
Value is arguably the most important V. Does the data lead to a tangible business outcome?
If you cannot turn your data into value, it's not an asset; it's a costly storage problem.
The goal is to derive insights that lead to:
A streaming service such as Netflix analyzes massive volumes of viewing data (what you watch, when you pause, what you search for) to power its recommendation engine and decide which new shows to produce.
The Nepal Tourism Board could analyze unstructured data from social media (Instagram geotags, travel blogs, TripAdvisor reviews) to identify emerging tourist destinations, understand visitor sentiment, and plan marketing campaigns more effectively.
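One simple way to spot an emerging destination is to compare how often each place is tagged in two time periods. The sketch below does this with a handful of made-up posts; the geotags, the periods "Q1" and "Q2", and the function name `mention_growth` are illustrative only, and a real analysis would first have to collect and clean the posts.

```python
from collections import Counter


def mention_growth(posts, earlier, later):
    """Rank places by how much their geotag count grew between two periods."""
    before = Counter(p["geotag"] for p in posts if p["period"] == earlier)
    after = Counter(p["geotag"] for p in posts if p["period"] == later)
    # A Counter returns 0 for places that were never tagged in the earlier period.
    growth = {place: after[place] - before[place] for place in after}
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data: each dictionary stands for one geotagged post.
posts = [
    {"geotag": "Kathmandu", "period": "Q1"},
    {"geotag": "Kathmandu", "period": "Q1"},
    {"geotag": "Rara Lake", "period": "Q1"},
    {"geotag": "Kathmandu", "period": "Q2"},
    {"geotag": "Kathmandu", "period": "Q2"},
    {"geotag": "Rara Lake", "period": "Q2"},
    {"geotag": "Rara Lake", "period": "Q2"},
    {"geotag": "Rara Lake", "period": "Q2"},
]

ranking = mention_growth(posts, "Q1", "Q2")
print(ranking)  # [('Rara Lake', 2), ('Kathmandu', 0)]
```

In this sample the established destination has more posts overall, but the smaller one is growing faster. Growth, not raw volume, is the signal that points to an emerging destination.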
Let's discuss the chapter questions.