Learning Objectives
By the end of this chapter, you will be able to:
- Explain why computers use the binary system.
- Define bit and byte.
- Describe the purpose of character encoding standards.
- Differentiate between ASCII and Unicode.
How Computers Represent Data
Digital computers do not understand human language. At their most fundamental level, they work with only two states: on and off. These two states are represented by the digits 1 (on) and 0 (off). This two-digit (base-2) number system is called the binary system.
All data—whether it’s text, numbers, images, or sound—must be translated into a series of 1s and 0s before a computer can process it.
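To make this concrete, the short Python sketch below shows a number and a two-letter string reduced to their binary digits. The variable names are illustrative only.

```python
# A minimal sketch of the idea that all data reduces to 1s and 0s.
# The names sample_number and sample_text are illustrative only.

sample_number = 42
sample_text = "Hi"

# An integer written out as binary digits.
print(bin(sample_number))            # 0b101010

# Text is first encoded to bytes; each byte can then be shown as 8 bits.
for b in sample_text.encode("ascii"):
    print(format(b, "08b"))          # 01001000, 01101001
```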
Bits and Bytes
The basic units of data representation are:
- Bit (Binary Digit): The smallest unit of data in a computer. A bit can have a value of either 0 or 1.
- Byte: A group of 8 bits. A single byte can represent 256 different values (2^8). The byte is the standard unit used to measure storage capacity.
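The following Python sketch illustrates these two units: how many values a bit and a byte can hold, and a small bytes object whose values all fall between 0 and 255.

```python
# A short sketch illustrating bits and bytes.

# One bit has 2 possible values; 8 bits (one byte) have 2**8 = 256.
print(2 ** 1)              # 2 values for a single bit
print(2 ** 8)              # 256 values for a byte

# In Python, a bytes object is a sequence of byte values from 0 to 255.
data = bytes([0, 127, 255])
print(list(data))          # [0, 127, 255]
print(len(data))           # 3 bytes
```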
Storage capacities are measured in multiples of bytes:
- Kilobyte (KB): Approximately 1,000 bytes.
- Megabyte (MB): Approximately 1 million bytes.
- Gigabyte (GB): Approximately 1 billion bytes.
- Terabyte (TB): Approximately 1 trillion bytes.
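As an illustration, here is a minimal Python sketch that converts a raw byte count into the decimal units above. The helper name describe_size and the 1,000-based factors are assumptions chosen to match this chapter's approximate values (binary-prefix units based on 1,024 also exist).

```python
# A sketch that converts a raw byte count into the decimal units listed above.
# describe_size is a hypothetical helper; the 1,000-based factors match the
# approximate values in this chapter, not the 1,024-based binary prefixes.

def describe_size(num_bytes: int) -> str:
    for unit, factor in [("TB", 10**12), ("GB", 10**9), ("MB", 10**6), ("KB", 10**3)]:
        if num_bytes >= factor:
            return f"{num_bytes / factor:.2f} {unit}"
    return f"{num_bytes} bytes"

print(describe_size(5_000_000_000))  # 5.00 GB
print(describe_size(2_500))          # 2.50 KB
```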
Representing Characters: ASCII and Unicode
To represent text (letters, numbers, and symbols), computers use character encoding standards. These standards assign a unique binary number to each character.
- ASCII (American Standard Code for Information Interchange): An early and widely used standard. It uses 7 or 8 bits to represent each character. The original 7-bit ASCII can represent 128 characters, which is enough for all the uppercase and lowercase English letters, numbers, and common punctuation marks.
- Unicode: A modern standard designed to represent every character from every language in the world. Unicode assigns a unique numeric code point to each character, and its encodings store those code points using a variable number of bytes, allowing the standard to cover over a million unique characters. It is the standard used on the internet and in all modern operating systems. The most common Unicode encoding is UTF-8.
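The Python sketch below illustrates both ideas: ord() and chr() map characters to and from their numeric codes, and encoding to UTF-8 shows that an ASCII letter fits in one byte while a character outside ASCII needs several.

```python
# A small sketch of character encoding: each character maps to a number,
# and that number is stored as one or more bytes.

# ord() gives the numeric code point; chr() goes the other way.
print(ord("A"))                   # 65 (same value in ASCII and Unicode)
print(chr(65))                    # 'A'

# ASCII characters fit in a single byte when encoded as UTF-8...
print("A".encode("utf-8"))        # b'A' (1 byte)

# ...while characters outside ASCII need more bytes in UTF-8.
print("€".encode("utf-8"))        # b'\xe2\x82\xac' (3 bytes)
print(len("€".encode("utf-8")))   # 3
```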
Representing Programs
Just like data, computer programs (which are sets of instructions) must also be represented in binary. When a programmer writes code in a high-level language like Python or Java, a special program called a compiler or an interpreter translates that human-readable code into the binary machine code that the computer’s processor can directly execute.
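As a rough illustration, Python's standard dis module can show the lower-level instructions the interpreter produces from a small function. Python bytecode is not the processor's raw binary machine code, but it demonstrates the same translation step from human-readable source to machine-oriented instructions.

```python
# A sketch of the translation from human-readable code to lower-level
# instructions. dis shows Python bytecode, an intermediate form, not the
# processor's binary machine code, but the idea is the same.

import dis

def add(a, b):
    return a + b

dis.dis(add)   # prints instructions such as LOAD_FAST and BINARY_ADD / BINARY_OP
```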
Summary
Computers represent all data and instructions in the binary system, using combinations of 0s and 1s. The smallest unit of data is a bit, and 8 bits make up a byte, which is the standard measure of storage. To represent text, computers use character encoding standards like ASCII for basic English text and Unicode for all international languages. Even the programs we write must be compiled down to binary machine code before a computer can run them.
Key Takeaways
- Computers use the binary system (0s and 1s) to represent all data.
- A bit is a single binary digit; a byte is 8 bits.
- ASCII and Unicode are standards for encoding text characters into binary numbers.
- Unicode is the modern standard that supports all world languages.
Discussion Questions
- Why is the binary system a natural fit for electronic computers?
- A standard text message (SMS) is limited to 160 characters. If it uses 7-bit ASCII encoding, how many bits are in a maximum-length message?
- Why was it necessary to create Unicode to replace ASCII?

