Object storage is the quiet workhorse behind half the internet: every uploaded photo, video, backup, and static asset almost certainly lives in an object store like Amazon S3, Google Cloud Storage, or Azure Blob. It's the durable, effectively-infinite bucket our Google Drive and Pastebin designs offload their bytes to. What makes it different from a database or a filesystem is a deliberate set of constraints — a flat namespace, immutable objects, and an HTTP API — that trade convenience for near-limitless scale and durability.
- Objects, not files or blocks — each object is a blob + a key + metadata, stored in a flat namespace (no real directories), accessed over HTTP.
- Built for durability — ~11 nines via replication across AZs and erasure coding; you offload "never lose this" to the storage layer.
- Effectively infinite + cheap — scale to exabytes with no capacity planning; pay per GB stored and per request.
- Immutable objects — you replace a whole object, not edit in place; there's no cheap "rename" or "append."
- Big files use multipart upload; clients transfer directly via presigned URLs so app servers never proxy the bytes.
- Storage classes / lifecycle tier data from hot to archive to cut cost; pair with a CDN for fast global reads.
Object storage keeps immutable blobs addressed by key in a flat namespace, exposed over an HTTP API. It achieves ~11 nines of durability with erasure coding and cross-AZ replication, scales to exabytes with no provisioning, and is cheap. Upload big files with multipart and let clients transfer directly via presigned URLs. It's not a filesystem (no rename, listing is paginated and not free) and not a database (no queries) — it's the durable bucket you put bytes in and front with a CDN.
Object vs Block vs File Storage
There are three storage paradigms, and choosing object storage means accepting its model on purpose:
| Aspect | Object (S3) | Block (EBS) | File (NFS) |
|---|---|---|---|
| Unit | Object (blob + metadata) | Fixed-size blocks | Files in directories |
| Access | HTTP API by key | Attached as a disk | Mounted, POSIX paths |
| Mutation | Replace whole object | Edit any block | Edit in place |
| Scale | Effectively unlimited | Per-volume limits | Limited by server |
| Best for | Blobs: media, backups, assets | Databases, OS disks | Shared app files |
Block storage is a raw disk (great for a database's files); file storage is a shared, mountable hierarchy (great for legacy apps). Object storage gives up in-place edits and a real directory tree in exchange for limitless scale and a simple network API — ideal for write-once, read-many blobs.
Anatomy of an Object and the Flat Namespace
An object is three things: the data (the bytes), a unique key (its name within a bucket), and metadata (content type, size, custom tags). Crucially, the namespace is flat — there are no real folders. A key like 2023/09/photo.jpg looks hierarchical, but the slashes are just characters in the key; "folders" are a UI convenience produced by listing keys with a common prefix. This flatness is what lets the system scale: there's no directory tree to traverse or lock, just a giant distributed key→blob map.
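To make the "folders are just prefixes" point concrete, here is a minimal Python sketch that models a bucket as a plain dictionary and derives a folder view from it. The `list_prefix` function and the sample keys are illustrative assumptions, not any provider's API; real stores expose a similar prefix-plus-delimiter listing over HTTP.

```python
from typing import Dict, List, Tuple


def list_prefix(store: Dict[str, bytes], prefix: str = "",
                delimiter: str = "/") -> Tuple[List[str], List[str]]:
    """Return (keys, common_prefixes) directly under `prefix`.

    The store itself is flat; "folders" are computed on the fly by
    grouping keys that share the text up to the next delimiter.
    """
    keys: List[str] = []
    folders = set()
    for key in sorted(store):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # There is a delimiter further along: show it as a "folder".
            folders.add(prefix + head + delimiter)
        else:
            keys.append(key)
    return keys, sorted(folders)


# Hypothetical bucket contents: a flat key -> blob map.
bucket = {
    "2023/09/photo.jpg": b"...",
    "2023/09/clip.mp4": b"...",
    "2023/10/notes.txt": b"...",
    "readme.txt": b"...",
}

print(list_prefix(bucket))              # (['readme.txt'], ['2023/'])
print(list_prefix(bucket, "2023/"))     # ([], ['2023/09/', '2023/10/'])
print(list_prefix(bucket, "2023/09/"))  # (['2023/09/clip.mp4', '2023/09/photo.jpg'], [])
```

Nothing in the dictionary knows about directories; deleting every key under `2023/09/` makes that "folder" disappear without any separate directory operation.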
The HTTP API
You interact with object storage over plain HTTP verbs against a key:
PUT /my-bucket/2023/09/photo.jpg # upload (replaces if exists)
GET /my-bucket/2023/09/photo.jpg # download
DELETE /my-bucket/2023/09/photo.jpg # remove
GET /my-bucket?prefix=2023/09/ # list keys under a prefix (paginated)
# big files: multipart upload (parallel, resumable)
initiate → upload part 1..N (in parallel) → complete
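The initiate, upload-parts, complete flow listed above can be sketched in Python against an in-memory stand-in. `FakeMultipartStore`, its method names, and the tiny `part_size` are hypothetical and chosen for illustration; real services have their own request formats and minimum part sizes.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class FakeMultipartStore:
    """In-memory stand-in for a multipart API (hypothetical, not a real SDK)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self._uploads: Dict[str, Dict[int, bytes]] = {}

    def initiate(self, key: str) -> str:
        upload_id = f"upload-{len(self._uploads) + 1}"
        self._uploads[upload_id] = {}
        return upload_id

    def upload_part(self, upload_id: str, part_number: int, data: bytes) -> str:
        # Re-sending a part simply overwrites it, which is what makes retries safe.
        self._uploads[upload_id][part_number] = data
        return hashlib.sha256(data).hexdigest()  # checksum returned as the part's tag

    def complete(self, upload_id: str, key: str, part_numbers: List[int]) -> None:
        # The object only becomes visible once all parts are stitched together.
        parts = self._uploads.pop(upload_id)
        self.objects[key] = b"".join(parts[n] for n in part_numbers)


def multipart_upload(store: FakeMultipartStore, key: str,
                     payload: bytes, part_size: int = 5) -> int:
    """Split payload into parts, upload them in parallel, then complete."""
    chunks = [payload[i:i + part_size] for i in range(0, len(payload), part_size)]
    upload_id = store.initiate(key)
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(store.upload_part, upload_id, n, chunk)
                   for n, chunk in enumerate(chunks, start=1)]
        for future in futures:
            future.result()  # surface any upload error before completing
    store.complete(upload_id, key, list(range(1, len(chunks) + 1)))
    return len(chunks)


store = FakeMultipartStore()
data = b"hello object storage world"
part_count = multipart_upload(store, "big/file.bin", data)
print(part_count, store.objects["big/file.bin"] == data)
```

Because parts are numbered, they can arrive in any order and a failed part can be retried on its own; nothing is readable under the key until `complete` runs.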
Two patterns matter at scale. Multipart upload splits a large object into parts uploaded in parallel and reassembled server-side — resumable and fast for multi-GB files. Presigned URLs are time-limited, signed links that let a client PUT/GET an object directly to/from the store without routing the bytes through your application servers — essential for keeping bulk traffic off your backend (exactly how the Drive design moves chunks).
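The idea behind a presigned URL can be illustrated with a small HMAC sketch using only the Python standard library. This is a simplified, hypothetical scheme (the `presign` and `verify` functions, the `storage.example.com` host, and the secret are all assumptions), not the signing algorithm any real provider uses; it only shows why the store can trust a link your backend handed to a client.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret-key"  # hypothetical key shared by your backend and the store


def _signature(method: str, path: str, expires_at: int, secret: bytes) -> str:
    message = f"{method}\n{path}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def presign(method: str, path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Backend side: issue a link valid for one method, one key, until expires_at."""
    query = urlencode({"expires": expires_at,
                       "signature": _signature(method, path, expires_at, secret)})
    return f"https://storage.example.com{path}?{query}"


def verify(method: str, url: str, now: int, secret: bytes = SECRET) -> bool:
    """Store side: accept the request only if it is unexpired and untampered."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(method, parsed.path, expires_at, secret)
    return hmac.compare_digest(expected, signature)


url = presign("PUT", "/my-bucket/2023/09/photo.jpg", expires_at=1_700_000_300)
print(verify("PUT", url, now=1_700_000_000))  # True: right method, not expired
print(verify("GET", url, now=1_700_000_000))  # False: signed for PUT only
print(verify("PUT", url, now=1_700_000_301))  # False: link has expired
```

The client never sees the secret, only a signature bound to one method, one key, and one expiry, so the application server can authorize an upload without touching the bytes.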
Durability via Replication and Erasure Coding
The headline feature is durability. Providers advertise ~11 nines (99.999999999%) of annual durability: store 10 million objects and you would expect to lose one roughly once every 10,000 years. Two techniques get there. Replication stores copies across multiple availability zones (physically separate datacenters), so a fire or flood in one doesn't lose data. Erasure coding is cleverer and cheaper than full replication: it splits an object into k data fragments plus m parity fragments, and can reconstruct the object from any k of the k+m total. With (say) 10 data + 4 parity, you tolerate losing any 4 fragments while using only 1.4× storage instead of 3× for triple replication.
object → split into k=10 data + m=4 parity = 14 fragments
spread across 14 disks / zones
rebuild from ANY 10 of 14 → tolerate losing up to 4
storage overhead = 14/10 = 1.4x (vs 3x for triple replication)
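The "rebuild from the survivors" idea can be shown with a deliberately simplified Python sketch. Production systems typically use Reed-Solomon-style codes to support m > 1 parity fragments; the code below swaps in single XOR parity (m = 1, as in RAID 5) because it fits in a few lines. The function names are illustrative assumptions.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments (zero-padded) plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def rebuild(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Recover the original data when at most one of the k+1 fragments is lost (None)."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity tolerates only one lost fragment")
    restored = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        # XOR of all surviving fragments equals the missing one.
        restored[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(restored[:-1])[:length]  # drop parity, strip padding


def storage_overhead(k: int, m: int) -> float:
    return (k + m) / k


data = b"object storage!!"
stored = encode(data, k=4)   # 4 data fragments + 1 parity fragment
stored[2] = None             # simulate losing one disk
print(rebuild(stored, len(data)) == data)
print(storage_overhead(10, 4))  # overhead for the 10 + 4 layout in the text
```

With one parity fragment the overhead is (k+1)/k and only one loss is survivable; codes with m parity fragments generalize the same trick so that any m losses can be repaired.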
Consistency Model
Historically S3 was eventually consistent for some operations: a freshly written object might briefly 404 on read, and an overwrite or delete might briefly return the old data. Modern object stores now offer strong read-after-write consistency: once a PUT succeeds, a subsequent GET returns the new data. This matters for pipelines that write then immediately read. On S3 (since December 2020) the guarantee also covers list operations, though some S3-compatible or older systems may still show listing lag, so check the store you are using. There are no multi-object transactions: each object operation is independent and atomic on its own.
Storage Classes and Lifecycle
Not all data is accessed equally, so object stores offer storage classes at different price/latency points, and lifecycle rules to move objects between them automatically:
| Class | Use | Trade-off |
|---|---|---|
| Standard (hot) | Frequently accessed | Highest storage cost, instant access |
| Infrequent access | Accessed monthly | Cheaper storage, retrieval fee |
| Archive (cold) | Backups, compliance | Very cheap, minutes-to-hours to restore |
A lifecycle policy might keep an object in Standard for 30 days, move it to Infrequent Access, then to Archive after a year, and finally expire it — all automatically, slashing cost for aging data.
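The policy described above amounts to a function from object age to storage class. A minimal Python sketch follows, assuming the 30-day and one-year thresholds from the example; the class names and the `expire_after` value used in the demo are illustrative assumptions, since the text does not say when expiry happens.

```python
from typing import Optional


def storage_class_for(age_days: int,
                      ia_after: int = 30,
                      archive_after: int = 365,
                      expire_after: Optional[int] = None) -> Optional[str]:
    """Return the class an object of this age should be in, or None if expired."""
    if expire_after is not None and age_days >= expire_after:
        return None  # the lifecycle rule deletes the object
    if age_days >= archive_after:
        return "ARCHIVE"
    if age_days >= ia_after:
        return "INFREQUENT_ACCESS"
    return "STANDARD"


# Hypothetical expiry after roughly seven years.
for age in (0, 29, 30, 364, 365, 2555):
    print(age, storage_class_for(age, expire_after=2555))
```

In a real store these rules are declared once as bucket configuration and applied by the service in the background, so no application code has to scan and move objects.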
Scaling and Performance
Because the namespace is flat, the store partitions the keyspace and scales nearly linearly, with no directory tree to bottleneck. Historically performance was tied to key prefixes (the store partitioned by prefix, so many keys sharing one prefix could create a hotspot), which is why the advice used to be to randomize prefixes. Modern S3 scales request capacity per prefix automatically (AWS documents at least 3,500 writes and 5,500 reads per second per prefix), but spreading load across several prefixes still helps for extreme throughput. For read-heavy public content, put a CDN in front so most reads are served from the edge and never hit the origin bucket.
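The older "randomize prefixes" advice can be sketched as prepending a short hash to each key. This is one possible approach, assuming a hypothetical `spread_key` helper; whether it is worth doing depends on the store and on measured throughput.

```python
import hashlib
from collections import Counter


def spread_key(key: str, width: int = 4) -> str:
    """Prefix the key with the first `width` hex characters of its SHA-256 hash.

    The mapping is deterministic, so a reader can recompute the stored key
    from the logical key without a lookup table.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:width]}/{key}"


# Date-based keys all share the prefix "2023/09/"; hashed keys do not.
logical_keys = [f"2023/09/photo-{i}.jpg" for i in range(1000)]
stored_keys = [spread_key(k) for k in logical_keys]

first_chars = Counter(k[0] for k in stored_keys)
print(stored_keys[0])
print(len(first_chars), "distinct leading characters")
```

The cost is that keys no longer sort by date, so listing "everything from September" by prefix stops working; that query has to come from a metadata index instead.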
What It's Used For
- Static assets & media — images, video, JS/CSS bundles, served via CDN.
- Backups & archives — durable, cheap, lifecycle-tiered cold storage.
- Data lake — raw data for batch/analytics jobs to read directly (schema-on-read).
- Application blobs — user uploads, file-sync chunks, generated artifacts (Drive, Pastebin).
Pitfalls
- No rename / move — "renaming" means copy-to-new-key + delete-old; moving a big "folder" is many operations.
- Listing isn't free — large buckets list in paginated batches and per-request costs add up; don't use listing as a query engine.
- Not a database — no queries, joins, or partial updates; pair it with a metadata DB (as the Drive design does).
- Request costs — millions of tiny objects can cost more in requests than in storage; batch small items.
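The first pitfall, rename as copy plus delete, can be made concrete with a small Python sketch over an in-memory bucket. The `rename_prefix` function is a hypothetical helper that counts operations; in a real store each copy and each delete is a separate billed request.

```python
from typing import Dict


def rename_prefix(store: Dict[str, bytes], old_prefix: str, new_prefix: str) -> int:
    """'Move a folder' by copying each object to a new key and deleting the old one.

    Returns the number of object operations performed (2 per object).
    """
    if new_prefix.startswith(old_prefix):
        # Overlapping source and destination would need extra care; rejected in this sketch.
        raise ValueError("destination must not be inside the source prefix")
    operations = 0
    for key in [k for k in store if k.startswith(old_prefix)]:
        new_key = new_prefix + key[len(old_prefix):]
        store[new_key] = store[key]  # copy to the new key
        del store[key]               # delete the old key
        operations += 2
    return operations


bucket = {
    "2023/09/a.jpg": b"a",
    "2023/09/b.jpg": b"b",
    "2023/09/c.jpg": b"c",
    "2023/10/d.jpg": b"d",
}
ops = rename_prefix(bucket, "2023/09/", "archive/2023/09/")
print(ops, sorted(bucket))
```

The work grows linearly with the number of objects under the prefix, and a failure partway through leaves some objects at the old keys and some at the new ones, since there are no multi-object transactions.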
Object storage trades the conveniences of a filesystem (rename, in-place edit, cheap listing) and a database (queries) for three superpowers: limitless scale via a flat namespace, ~11 nines durability via erasure coding and cross-AZ replication, and a dead-simple HTTP API. Use it for immutable blobs, move bytes directly with presigned URLs and multipart, tier with lifecycle rules, and front it with a CDN.
Object vs block vs file? Object = blobs by key over HTTP, flat namespace, limitless (S3); block = raw disk for databases (EBS); file = mountable POSIX hierarchy (NFS).
How does it get 11 nines? Cross-AZ replication plus erasure coding (k data + m parity, rebuild from any k) — durability far cheaper than 3x replication.
Why presigned URLs and multipart? Clients upload/download directly to the store (bytes never touch your servers); multipart parallelizes and resumes large uploads.
Is it a filesystem? No — flat namespace, immutable objects, no cheap rename, paginated non-free listing. "Folders" are just key prefixes.
Why front it with a CDN? Reads are bulk and cacheable; the edge serves hot content so the origin bucket isn't hammered.