CAP Theorem
A fundamental distributed systems principle stating that a system can guarantee at most two of three properties simultaneously: Consistency, Availability, and Partition tolerance.
The CAP theorem, conjectured by Eric Brewer in 2000 and formally proved by Seth Gilbert and Nancy Lynch in 2002, establishes that when a network partition occurs (some nodes cannot communicate with others), a distributed system must choose between consistency (every read reflects the most recent write, so all nodes appear to hold the same data) and availability (every request to a working node receives a non-error response). Since partitions cannot be ruled out in a distributed system, the practical choice is between CP (consistent, but may reject requests during a partition) and AP (available, but may return stale data during a partition).
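The CP/AP choice can be illustrated with a toy replica. This is a minimal sketch, not a model of any real database: the `Replica` class, the `Unavailable` exception, and the `"CP"`/`"AP"` mode labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP" (hypothetical labels for this sketch)
    value: str = "v1"          # last value this replica has seen
    partitioned: bool = False  # True when the replica cannot reach its peers

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # The replica cannot confirm its copy is current, so it refuses.
            raise Unavailable("cannot reach peers; refusing a possibly stale read")
        # AP mode (or no partition): answer from the local copy, which may be stale.
        return self.value


def demo() -> dict:
    cp = Replica(mode="CP")
    ap = Replica(mode="AP")
    # A partition begins; assume a newer write lands on the far side
    # and never reaches either replica.
    cp.partitioned = True
    ap.partitioned = True

    results = {}
    try:
        results["CP"] = cp.read()
    except Unavailable:
        results["CP"] = "rejected"
    results["AP"] = ap.read()
    return results
```

Calling `demo()` shows the two behaviors side by side: the CP replica rejects the read, while the AP replica answers with its local, possibly outdated value.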
In practice, most systems make nuanced trade-offs rather than a binary choice. PostgreSQL configured with synchronous replication leans CP: a commit waits for the standby to confirm, so writes can stall when the standby is unreachable. Cassandra and DynamoDB lean AP by default, accepting eventual consistency to stay available, though both let clients request stronger consistency per operation (tunable consistency levels in Cassandra, strongly consistent reads in DynamoDB).
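One way to see why the choice is rarely binary is quorum replication, a technique used by Dynamo-style stores. The sketch below assumes N replicas, writes acknowledged by W of them, and reads that consult R of them. The function names are hypothetical, and real systems add details (hinted handoff, read repair, timestamps) that are omitted here.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Rule of thumb: every read quorum meets every write quorum when R + W > N."""
    return r + w > n


def overlap_by_enumeration(n: int, w: int, r: int) -> bool:
    """Brute-force check of the same property over all possible quorums."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

With N = 3, choosing W = 2 and R = 2 guarantees that a read contacts at least one replica holding the latest acknowledged write, at the cost of needing two reachable replicas per operation. Choosing W = 1 and R = 1 stays available with a single reachable replica, but a read may miss the latest write.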
Understanding CAP helps AI engineering teams make informed database and architecture decisions. A feature store serving real-time predictions might prioritize availability (AP), accepting slightly stale features rather than failing requests. A billing system processing payments requires strong consistency (CP) even if it means occasional unavailability during partitions.
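The feature-store side of this trade-off might look like the following. It is a minimal sketch under stated assumptions: `fetch_fresh` stands in for a call to the authoritative store, `cache` is a plain in-process dictionary, and `StoreUnreachable` is a hypothetical exception; none of these correspond to a specific feature-store API.

```python
from typing import Callable, Dict, Tuple


class StoreUnreachable(Exception):
    """Hypothetical error raised when the authoritative store cannot be reached."""


def get_features(
    user_id: str,
    fetch_fresh: Callable[[str], dict],
    cache: Dict[str, dict],
) -> Tuple[dict, bool]:
    """Return (features, is_stale), preferring availability over freshness."""
    try:
        features = fetch_fresh(user_id)
        cache[user_id] = features  # remember the latest known-good value
        return features, False
    except StoreUnreachable:
        if user_id in cache:
            # AP behavior: serve the last known value instead of failing the request.
            return cache[user_id], True
        raise  # nothing cached, so there is nothing to fall back to
```

A billing path would make the opposite choice: rather than falling back to a cached balance, it would let the error propagate and reject the payment until the authoritative store is reachable again.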
Related Terms
A/B Testing
A controlled experiment comparing two or more variants to determine which performs better on a defined metric, using statistical methods to ensure reliable results.
Feature Flag
A software mechanism that enables or disables features at runtime without deploying new code, used for gradual rollouts, A/B testing, and targeting specific user segments.
MLOps
The set of practices combining machine learning, DevOps, and data engineering to reliably deploy, monitor, and maintain ML models in production.
Model Serving
The infrastructure and systems that host trained ML models and handle inference requests in production, optimizing for latency, throughput, and cost.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords, typically powered by embedding-based similarity comparison.
CI/CD (Continuous Integration / Continuous Deployment)
An automated software practice where code changes are continuously integrated into a shared repository, tested, and deployed to production, reducing manual intervention and accelerating delivery cycles.