How does a 10GB file become 2GB after zipping?
Understanding Compression and Zipping
Have you ever compressed a file from 10GB down to 2GB and wondered… what actually happened to the other 8GB?
The answer lies in data compression.
Compression is the process of reducing file size by storing data more efficiently. In most cases it works not by deleting data, but by removing redundancy.
Before diving into how it works, it’s important to understand that there are two main types of compression:
Lossless Compression
This method preserves all original data. When you unzip or decompress the file, you get an exact replica of the original.
Common examples include ZIP, RAR, and PNG files.
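To make the "exact replica" claim concrete, here is a minimal sketch using Python's standard zlib module (the same DEFLATE family of compression that ZIP archives commonly use). The sample data is made up for illustration, and the exact compressed size will vary with the input.

```python
import zlib

# Hypothetical sample data: a highly repetitive byte string.
original = b"hello world, " * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("exact match after decompressing:", restored == original)
```

The comparison at the end is the whole point of lossless compression: the restored bytes are identical to the original bytes.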
Lossy Compression
This method reduces file size by permanently removing some data, typically details that are less noticeable to humans.
It is commonly used for media files like JPEG images, MP3 audio, and MP4 videos.
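As a rough illustration of the lossy idea, the sketch below rounds numbers to a coarser grid (quantization). This is a toy example with made-up sample values, not how JPEG or MP3 work internally; real codecs use far more sophisticated models of human perception. The function name quantize is hypothetical.

```python
def quantize(samples, step):
    """Round each non-negative integer sample to the nearest multiple of `step`.

    This is lossy: several different inputs map to the same output,
    so the original values cannot be recovered afterwards.
    """
    return [(s + step // 2) // step * step for s in samples]

# Hypothetical "signal" values, e.g. pixel brightness or audio levels.
samples = [101, 103, 98, 102, 250, 251, 249]
approx = quantize(samples, step=10)

print(approx)  # [100, 100, 100, 100, 250, 250, 250]
```

The small differences between neighboring values are gone for good, but the result is far more repetitive, which is exactly what makes it easier to store compactly.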
So, how does compression actually work?
At its core, compression identifies patterns and repetition within data and encodes them more efficiently.
For example, instead of storing: “A A A A A A A A”
It can store: “A × 8”
Same information, less space.
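The "A × 8" idea is known as run-length encoding. A minimal sketch might look like the following; the function names rle_encode and rle_decode are hypothetical, and real ZIP tools use more general pattern-matching algorithms rather than plain run-length encoding.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]

def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(char * count for char, count in pairs)

encoded = rle_encode("AAAAAAAA")
print(encoded)              # [('A', 8)]
print(rle_decode(encoded))  # AAAAAAAA
```

Decoding gives back the original string exactly, so nothing was lost along the way.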
When you “zip” a file, you’re essentially applying a lossless compression algorithm (most commonly DEFLATE) that re-encodes the data in a more compact form. When the file is unzipped, the system reconstructs the original data exactly as it was.
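Here is a small sketch of zipping and unzipping with Python's standard zipfile module, done entirely in memory. The helper names zip_bytes and unzip_bytes, the entry name sample.bin, and the sample data are all assumptions for illustration. It also compares repetitive data with random data to show how much the result depends on redundancy.

```python
import io
import os
import zipfile

def zip_bytes(name, data):
    """Return a ZIP archive (as bytes) containing one DEFLATE-compressed entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, data)
    return buffer.getvalue()

def unzip_bytes(name, archive_bytes):
    """Read one entry back out of an in-memory ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)

repetitive = b"ABCD" * 25_000       # 100,000 bytes with an obvious pattern
random_data = os.urandom(100_000)   # 100,000 bytes with no pattern to exploit

for label, data in [("repetitive", repetitive), ("random", random_data)]:
    archive_bytes = zip_bytes("sample.bin", data)
    # Unzipping must reproduce the input byte for byte.
    assert unzip_bytes("sample.bin", archive_bytes) == data
    print(label, len(data), "->", len(archive_bytes), "bytes")
```

The repetitive input shrinks dramatically, while the random input barely changes or even grows slightly. That is why one 10GB file might zip down to 2GB while another, such as a folder of already-compressed videos, hardly shrinks at all.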
In simple terms:
Lossless compression doesn’t remove your data; it removes inefficiency.
Understanding this concept is fundamental for developers working with file storage, data transfer, and system optimization.