Backup Strategy
A comprehensive plan for creating, storing, verifying, and restoring copies of data to protect against loss from hardware failure, software bugs, human error, or security breaches. An effective backup strategy defines backup frequency, retention periods, and storage locations.
A widely used baseline is the 3-2-1 rule: maintain at least three copies of data on two different media types, with one copy stored offsite. Modern cloud implementations typically use automated snapshots for databases, continuous replication to secondary regions, and object storage lifecycle policies for long-term retention. Critically, backups are worthless unless restoration is regularly tested.
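The verification step above can be sketched in a few lines: a backup only counts once a copy has been written and confirmed to match the original. This is a minimal illustration using local directories to stand in for second media and offsite storage; the paths and function names are hypothetical, not part of any real backup tool.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def backup_and_verify(source: Path, destinations: list[Path]) -> bool:
    """Copy `source` to each destination and confirm every copy
    matches the original checksum before trusting the backup."""
    expected = sha256_of(source)
    for dest in destinations:
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, dest)
        if sha256_of(dest) != expected:
            return False
    return True

# Demo: one primary file plus two copies, simulating second media and offsite.
root = Path(tempfile.mkdtemp())
primary = root / "primary" / "data.db"
primary.parent.mkdir(parents=True)
primary.write_bytes(b"critical application state")

ok = backup_and_verify(primary, [root / "local-disk" / "data.db",
                                 root / "offsite" / "data.db"])
print(ok)  # True when both copies verify
```

Real systems add encryption, incremental copies, and scheduling on top, but the checksum-after-copy check is the part most often skipped.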
For AI product teams, backup strategy must cover unique assets beyond traditional application data. Model artifacts represent weeks or months of training compute and must be versioned and backed up in model registries. Training datasets, which may be expensive or impossible to reconstruct, need archival storage. Feature store state, experiment configurations, and analytics data all require backup coverage. Growth teams should verify that experiment results and historical metrics are included in backup scope, because losing this data means losing the institutional knowledge that informs future strategy. Regular restoration drills should include restoring not just the database but the complete AI serving stack: model weights, configuration, feature pipelines, and dependent services.
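A restoration drill for the full serving stack can be reduced to a manifest check: record a checksum for every artifact the stack needs, then after a simulated restore, flag anything missing or corrupt. The artifact names below (model weights, serving config, feature pipeline) are illustrative placeholders, not a real registry schema.

```python
import hashlib
import tempfile
from pathlib import Path

def checksum(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def record_manifest(artifacts: dict[str, Path]) -> dict[str, str]:
    """Record a checksum for every artifact the serving stack needs."""
    return {name: checksum(p) for name, p in artifacts.items()}

def restore_drill(manifest: dict[str, str], restored: dict[str, Path]) -> list[str]:
    """Return names of artifacts that are missing or corrupt after a
    simulated restore; an empty list means the drill passed."""
    failures = []
    for name, expected in manifest.items():
        path = restored.get(name)
        if path is None or not path.exists() or checksum(path) != expected:
            failures.append(name)
    return failures

# Demo with hypothetical artifact names.
root = Path(tempfile.mkdtemp())
artifacts = {}
for name, payload in [("model_weights", b"weights-v3"),
                      ("serving_config", b"config-yaml"),
                      ("feature_pipeline", b"pipeline-def")]:
    p = root / "live" / name
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_bytes(payload)
    artifacts[name] = p

manifest = record_manifest(artifacts)

# Simulate a restore that forgot the feature pipeline.
restored = {name: p for name, p in artifacts.items() if name != "feature_pipeline"}
print(restore_drill(manifest, restored))  # ['feature_pipeline']
```

The point of the drill is exactly this kind of negative result: a backup that restores the database but not the feature pipeline passes a naive check and fails in production.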
Related Terms
Content Delivery Network
A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.
Edge Computing
A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.
Serverless Computing
A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.
Function as a Service
A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.
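The event-driven model described above is easiest to see in a handler: a stateless function the platform invokes once per event. This is a minimal Lambda-style sketch; the event fields are hypothetical, and locally the handler is just a plain function.

```python
import json

def handler(event, context):
    """Lambda-style entry point: invoked per event, stateless,
    scaled by the platform rather than by the application."""
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# The FaaS platform supplies event and context at invocation time;
# locally we can call the function directly.
resp = handler({"name": "growth-team"}, None)
print(resp["statusCode"], resp["body"])
```

Because each invocation is independent, the platform can run zero or thousands of copies concurrently, which is what makes per-function scaling and pay-per-invocation billing possible.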
Platform as a Service
A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.
Infrastructure as a Service
A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.