Convolutional Neural Network (CNN)
A neural network architecture designed for processing grid-structured data such as images. It uses convolutional filters that slide over the input to detect local patterns like edges, textures, and shapes.
CNNs exploit the spatial structure of images through three key ideas: local connectivity (each neuron connects to a small region of the input), weight sharing (the same filter is applied across the entire image), and pooling (progressively reducing spatial dimensions so that features become more tolerant of small shifts in position). Because a filter's weights are reused at every position, the parameter count of a convolutional layer depends on the filter size and channel counts rather than on the image size. This makes CNNs far more parameter-efficient than fully connected networks for image tasks.
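The three ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not how production frameworks implement convolution: the function names (`conv2d`, `max_pool2d`, `conv_params`, `dense_params`), the toy 6x6 image, and the layer sizes used in the parameter comparison (a 224x224x3 input, 64 output channels or units) are all assumptions chosen for the example.

```python
def conv2d(image, kernel):
    """Valid cross-correlation: slide ONE shared kernel over every position.

    Local connectivity: each output value only looks at a kh x kw window.
    Weight sharing: the same kernel values are reused at every window.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel; independent of image size."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit; grows with the input size."""
    return in_features * out_features + out_features


# Toy 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-written vertical-edge filter (a trained CNN would learn such filters).
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

features = conv2d(image, vertical_edge)   # 4x4 map, large where the edge is
pooled = max_pool2d(features)             # 2x2 summary of the strongest responses

# Parameter comparison for an assumed 224x224 RGB input.
conv_count = conv_params(kernel_size=3, in_channels=3, out_channels=64)
dense_count = dense_params(in_features=224 * 224 * 3, out_features=64)

print(features)
print(pooled)
print(conv_count, dense_count)
```

In this sketch the filter responds only where the dark-to-bright boundary falls inside its window, and the convolutional layer needs 1,792 parameters where the fully connected layer over the same input needs 9,633,856.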
The architecture typically stacks convolutional layers that detect increasingly complex features: early layers detect edges and colors, middle layers detect textures and shapes, and deep layers detect objects and scenes. This hierarchical feature learning is what allows CNNs to learn useful representations directly from pixels, without manual feature engineering.
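One way to see why deeper layers can respond to larger structures is to track the receptive field, the patch of input pixels that can influence a single output value. The sketch below uses the standard recurrence (each layer adds `(kernel - 1) * jump` pixels, and the jump is multiplied by the layer's stride). The layer stack itself is a hypothetical example, not a specific published architecture.

```python
def receptive_fields(layers):
    """Return the receptive field size (in input pixels) after each layer.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    field = 1   # a single input pixel sees only itself
    jump = 1    # distance in input pixels between adjacent outputs
    sizes = []
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
        sizes.append(field)
    return sizes


# Hypothetical stack: two 3x3 convs, a 2x2 pool, two 3x3 convs, a 2x2 pool.
stack = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]

sizes = receptive_fields(stack)
print(sizes)
```

For this stack the receptive field grows from 3 pixels after the first layer to 16 pixels after the last, which is consistent with early layers responding to small local patterns and later layers to larger ones.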
While vision transformers (ViTs) have matched or exceeded CNN performance on many benchmarks, CNNs remain widely used in production due to their efficiency, well-understood behavior, and extensive tooling. They power image classification, object detection, image segmentation, and video analysis. For growth applications, CNNs enable visual content moderation, product image search, automated quality inspection, and document processing. Pre-trained CNN backbones like ResNet and EfficientNet provide excellent starting points for transfer learning.
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings with low-latency similarity search.
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.