Infrastructure & DevOps

Object Storage

A storage architecture that manages data as discrete objects in a flat namespace rather than as files in a hierarchical directory. Object storage services like Amazon S3 provide virtually unlimited scalability, high durability, and cost-effective storage for large data volumes.

Object storage differs from file systems and block storage by treating each piece of data as a self-contained object that bundles the data itself, its metadata, and a unique identifier (the key). Because the namespace is flat, objects can be distributed across many servers without maintaining a directory hierarchy, which is what makes massive scale practical. Services such as Amazon S3 are designed for 99.999999999% (eleven nines) durability through automatic replication. Storage is priced per gigabyte-month, which keeps large datasets economical.
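As a rough illustration of the flat namespace, here is a minimal in-memory sketch. `FlatObjectStore`, `StoredObject`, and the example keys are hypothetical names invented for this example, not any provider's API. It shows that a key such as `datasets/train/part-0000.parquet` is just a string: the slashes do not create directories, and "listing a folder" is a prefix match over keys.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoredObject:
    """One self-contained object: payload, metadata, and a content hash."""
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)
    etag: str = ""


class FlatObjectStore:
    """Toy store (hypothetical): a single key -> object map, no directories."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes,
            metadata: Optional[Dict[str, str]] = None) -> str:
        # The key is an opaque string; "/" has no structural meaning here.
        etag = hashlib.sha256(data).hexdigest()
        self._objects[key] = StoredObject(data, dict(metadata or {}), etag)
        return etag

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> List[str]:
        # Folder-style listing is only a string prefix filter over all keys.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("datasets/train/part-0000.parquet", b"rows",
          {"content-type": "application/parquet"})
store.put("datasets/train/part-0001.parquet", b"more rows")
store.put("models/checkpoint-0001.bin", b"weights")

train_keys = store.list_keys("datasets/train/")
```

Real services add authentication, replication, and paginated listings on top of this idea, but the key-to-object mapping is the core of the model.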

For AI product teams, object storage is the backbone of data infrastructure. Training datasets, model artifacts, embeddings, user-generated content, and log files all reside there. ML pipelines use it as a central coordination point, reading training data from it and writing model checkpoints back to it. Growth teams store experiment configurations, analytics exports, and historical metric snapshots in object storage for long-term access. Its cost profile, cheap per gigabyte stored but charged per API call, influences how AI pipelines should be designed: reading the same data in a few large requests costs far less in request fees than reading it in many small ones, which shapes data loading patterns for model training and feature computation.
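The effect of per-request pricing can be sketched with simple arithmetic. The price below is a hypothetical placeholder, not a quote from any provider, and `request_cost` is an illustrative helper. The example compares reading a 1 TB dataset stored as 100 KB objects against the same data packed into 100 MB shards.

```python
from typing import Tuple

# Hypothetical price in USD per 1,000 GET requests (assumption; check your
# provider's current pricing before relying on any number here).
PRICE_PER_1000_GETS = 0.0004

DATASET_BYTES = 1_000_000_000_000  # 1 TB, decimal units


def request_cost(total_bytes: int, bytes_per_request: int,
                 price_per_1000: float) -> Tuple[int, float]:
    """Return (request count, request fees) for one full pass over the data."""
    # Integer ceiling division: a partial final chunk still costs one request.
    requests = -(-total_bytes // bytes_per_request)
    return requests, requests / 1000 * price_per_1000


small_requests, small_cost = request_cost(
    DATASET_BYTES, 100_000, PRICE_PER_1000_GETS)      # 100 KB objects
large_requests, large_cost = request_cost(
    DATASET_BYTES, 100_000_000, PRICE_PER_1000_GETS)  # 100 MB shards

print(f"100 KB objects: {small_requests:,} requests per pass")
print(f"100 MB shards:  {large_requests:,} requests per pass")
print(f"request-fee ratio: {small_cost / large_cost:.0f}x")
```

Under these assumptions the small-object layout issues 1,000 times as many requests per pass, and the fees scale with it. The gap compounds across training epochs, which is one reason pipelines often pack many small records into larger shard files before training.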

Related Terms

Content Delivery Network

A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.

Edge Computing

A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.

Serverless Computing

A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.

Function as a Service

A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.

Platform as a Service

A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.

Infrastructure as a Service

A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.