Connection Pooling

A technique that maintains a pool of reusable database or network connections rather than creating and destroying connections for each request. Connection pooling reduces the overhead of connection establishment and improves response times for database-heavy applications.

Establishing a new database connection involves a TCP handshake, authentication, and session initialization, which can take tens of milliseconds. Under high concurrency, the overhead of repeatedly creating and tearing down connections becomes a significant bottleneck. Connection pooling maintains a set of pre-established connections that requests check out, use, and return, amortizing setup costs across many operations.
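The check-out/use/return cycle can be sketched as a minimal fixed-size pool. This is an illustrative sketch, not any particular library's API; the `ConnectionPool` class and its `connect` factory parameter are hypothetical names, and production pools (such as SQLAlchemy's QueuePool or HikariCP) add health checks, timeouts, and connection recycling on top of this idea.

```python
import queue

class ConnectionPool:
    """A minimal fixed-size connection pool (illustrative sketch).

    `connect` is a zero-argument factory that creates a new connection.
    All connections are established up front, so per-request setup cost
    is paid once and amortized across every later check-out.
    """

    def __init__(self, connect, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # Pre-establish every connection at startup.
            self._pool.put(connect())

    def acquire(self, timeout=None):
        # Blocks until a connection is free, so concurrency
        # against the database is capped at `size`.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        # Return the connection for reuse instead of closing it.
        self._pool.put(conn)
```

Because `acquire` blocks when the pool is empty, the pool also acts as a natural back-pressure mechanism: the database never sees more than `size` concurrent connections, no matter how many requests arrive.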

For AI product teams, connection pooling is critical because AI features often require multiple database queries per request: fetching user context, looking up feature values, retrieving cached predictions, and logging results. Without pooling, each AI inference request might establish four or five database connections, quickly exhausting database connection limits under load. Growth teams should monitor connection pool metrics because pool exhaustion is a common cause of sudden latency spikes during traffic surges. Serverless functions add complexity because each function instance may create its own pool, leading to connection count explosion. Tools like PgBouncer or ProxySQL sit between applications and databases to centralize connection management and prevent this problem.
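To make the PgBouncer approach concrete, a configuration along these lines puts a single pooler between many application (or serverless function) clients and the database. The values below are illustrative, not tuned recommendations, and the host, database name, and file paths are placeholders:

```ini
; pgbouncer.ini (illustrative values only)
[databases]
appdb = host=10.0.0.5 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction       ; share server connections at transaction granularity
max_client_conn = 1000        ; many clients may connect to the pooler...
default_pool_size = 20        ; ...but the database sees at most 20 connections
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
```

Applications connect to port 6432 as if it were the database itself; PgBouncer multiplexes those thousand client connections onto twenty server connections, which is what prevents the serverless connection-count explosion described above.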

Related Terms

Content Delivery Network

A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.

Edge Computing

A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.

Serverless Computing

A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.

Function as a Service

A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.

Platform as a Service

A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.

Infrastructure as a Service

A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.