Serverless Computing

A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.

Serverless computing abstracts away server management entirely. The cloud provider handles provisioning, scaling, patching, and capacity planning. Functions execute in response to events like HTTP requests, database changes, or message queue entries, and scale automatically from zero to thousands of concurrent instances. This model eliminates idle capacity costs and reduces operational burden.
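The event-driven model described above can be sketched as a minimal function handler. The example below follows the handler signature used by AWS Lambda's Python runtime; the event shape and the local-invocation pattern are illustrative assumptions rather than any platform's exact contract.

```python
import json

def handler(event, context):
    """Entry point the platform invokes once per event.

    `event` carries the trigger payload (an HTTP request body here);
    `context` carries runtime metadata. The platform, not the developer,
    decides how many concurrent copies of this function are running.
    """
    body = json.loads(event.get("body", "{}"))
    name = body.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}"}),
    }

# Local invocation for testing -- in production the platform calls handler().
if __name__ == "__main__":
    fake_event = {"body": json.dumps({"name": "serverless"})}
    print(handler(fake_event, None))
```

Note that the function holds no server state between invocations: each call must be self-contained, which is what lets the platform scale from zero to thousands of instances without coordination.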

For AI product teams, serverless is compelling for inference workloads with variable traffic patterns. A recommendation API that handles 10 requests per minute overnight but 10,000 during peak hours benefits enormously from automatic scaling. Growth teams running experiments appreciate serverless because they can deploy new API endpoints, webhooks, and data processing pipelines without infrastructure tickets. However, serverless has limitations for AI workloads: cold starts add latency, execution time limits may not accommodate complex model inference, and GPU access is limited on most serverless platforms. Teams must evaluate whether their AI workload's latency and compute requirements fit the serverless model or need dedicated infrastructure.
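The break-even intuition, that pay-per-use wins for spiky or low-utilization traffic while reserved capacity wins when servers stay busy, can be sketched with a simple cost model. The prices below are hypothetical placeholders for illustration, not any provider's actual rates.

```python
def serverless_cost(invocations, avg_duration_s,
                    memory_gb=1.0,
                    price_per_gb_s=0.0000167,    # hypothetical rate
                    price_per_request=0.0000002):  # hypothetical rate
    """Pay only for compute actually consumed, plus a per-request fee."""
    compute = invocations * avg_duration_s * memory_gb * price_per_gb_s
    requests = invocations * price_per_request
    return compute + requests

def reserved_cost(hours, price_per_hour=0.05):  # hypothetical rate
    """Pay for the instance whether or not it serves traffic."""
    return hours * price_per_hour

# A spiky workload: 1 million requests per month at 100 ms each
# versus one always-on instance for the same month (~730 hours).
monthly_serverless = serverless_cost(1_000_000, 0.1)
monthly_reserved = reserved_cost(730)
```

With these placeholder numbers the serverless bill is a small fraction of the always-on instance, but the comparison flips as utilization rises: sustained heavy inference traffic keeps a dedicated server busy enough that per-invocation pricing becomes the more expensive option.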

Related Terms

Content Delivery Network

A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.

Edge Computing

A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.

Function as a Service

A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.

Platform as a Service

A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.

Infrastructure as a Service

A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.

Container Orchestration

The automated management of containerized applications across a cluster of machines, handling deployment, scaling, networking, and health monitoring. Kubernetes is the dominant orchestration platform, providing declarative configuration for complex distributed systems.