Service Mesh
A dedicated infrastructure layer that handles service-to-service communication within a microservices architecture. It provides observability, traffic management, and security features like mutual TLS without requiring changes to application code.
A service mesh works by deploying a lightweight proxy sidecar alongside each service instance. These proxies intercept all network traffic, enabling the mesh to implement traffic routing rules, retry policies, circuit breaking, mutual TLS encryption, and detailed telemetry collection transparently. Popular service mesh implementations include Istio, Linkerd, and Consul Connect.
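The sidecar behavior described above — intercepting calls, retrying transient failures, tripping a circuit breaker, and counting telemetry — can be sketched in a few lines. This is a toy illustration under assumed names and thresholds (`SidecarProxy`, `failure_threshold`, etc. are invented for the example), not the implementation of any real mesh:

```python
import time

class CircuitOpenError(Exception):
    """Raised when the circuit breaker refuses to forward a request."""

class SidecarProxy:
    """Toy model of a mesh sidecar: retries, circuit breaking, telemetry.
    All names and thresholds are illustrative, not from any real mesh."""

    def __init__(self, send, max_retries=2, failure_threshold=3, reset_after=30.0):
        self.send = send                      # underlying network call
        self.max_retries = max_retries
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after        # seconds before a half-open probe
        self.failures = 0
        self.opened_at = None
        self.metrics = {"requests": 0, "retries": 0, "failures": 0}

    def call(self, request):
        self.metrics["requests"] += 1
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("upstream marked unhealthy")
            self.opened_at = None             # half-open: allow one probe
        for attempt in range(self.max_retries + 1):
            try:
                response = self.send(request)
                self.failures = 0             # success resets the breaker
                return response
            except IOError:
                self.metrics["failures"] += 1
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.opened_at = time.monotonic()
                    raise CircuitOpenError("failure threshold reached")
                if attempt < self.max_retries:
                    self.metrics["retries"] += 1
        raise IOError("request failed after retries")
```

The point of the pattern is that the application only ever sees `send`-like calls; the retry and breaker logic lives entirely in the proxy, which is why a mesh can add these behaviors without application code changes.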
For AI product teams running microservices architectures, a service mesh provides critical visibility into how AI services interact with the rest of the application. It reveals latency between the API gateway and model serving endpoints, success rates for feature store lookups, and traffic patterns across model versions during canary deployments. Growth teams benefit from its traffic management capabilities: a service mesh can gradually shift traffic between model versions, implement header-based routing for experiment cohorts, and enforce rate limits that protect AI services from traffic spikes. The observability data from a service mesh also helps diagnose performance issues that affect user experience, such as slow model responses that degrade page load times.
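The two routing behaviors mentioned above — header-based routing for experiment cohorts and weighted traffic shifting between model versions — reduce to a simple rule that can be sketched as follows. The header name, version labels, and function signature are all assumptions made for the example:

```python
import random

def route(request, weights, cohort_header="x-experiment-cohort",
          cohort_routes=None, rng=random.random):
    """Toy traffic-splitting rule: a pinned experiment cohort wins first,
    otherwise a weighted random draw picks the model version.
    Header name and version labels are illustrative, not a real mesh API."""
    cohort_routes = cohort_routes or {}
    cohort = request.get("headers", {}).get(cohort_header)
    if cohort in cohort_routes:
        return cohort_routes[cohort]          # header-based cohort routing
    r = rng()                                 # weighted canary split
    cumulative = 0.0
    for version, weight in weights.items():
        cumulative += weight
        if r < cumulative:
            return version
    return version                            # fall through on rounding error
```

A canary rollout then amounts to changing the weights over time, e.g. from `{"v1": 0.99, "v2": 0.01}` toward `{"v1": 0.0, "v2": 1.0}`, while pinned cohorts keep seeing a fixed version regardless of the split.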
Related Terms
Content Delivery Network
A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.
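The cache-at-the-nearest-edge idea can be sketched in a few lines. Distance is modeled as a plain number and the data shapes are invented for the example; real CDNs steer users to edges via DNS or anycast routing rather than an explicit lookup like this:

```python
def serve(url, user_location, edges, origin_fetch):
    """Toy CDN lookup: pick the edge nearest the user, serve from its
    cache, and fill the cache from the origin server on a miss."""
    edge = min(edges, key=lambda e: abs(e["location"] - user_location))
    if url not in edge["cache"]:
        edge["cache"][url] = origin_fetch(url)   # cache miss: go to origin
    return edge["cache"][url]                    # cache hit: served at the edge
```

Repeated requests from nearby users hit the edge cache, so the origin sees each asset roughly once per edge rather than once per request.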
Edge Computing
A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.
Serverless Computing
A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.
Function as a Service
A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.
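An individual function of this kind is typically just an event handler; in AWS Lambda's Python runtime, for instance, the platform invokes a handler with an event payload and a context object. The event shape and response fields below are illustrative assumptions:

```python
def handler(event, context=None):
    """Minimal Lambda-style function: receives an event dict, returns a
    response; the platform handles provisioning, invocation, and scaling.
    The 'name' field and response shape are made up for this sketch."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```

Everything outside the function body — servers, concurrency, scaling each function independently — is the platform's job, which is what distinguishes FaaS from deploying a long-running service.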
Platform as a Service
A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.
Infrastructure as a Service
A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.