Underfitting
When a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data because it fails to learn the relevant relationships.
Underfitting is the opposite of overfitting: the model lacks the capacity to fit the data. A linear regression model fitted to a highly nonlinear relationship will underfit. The telltale sign is that training error stays high, not just test error, because the model cannot represent the patterns present in the data.
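As a minimal sketch of the linear-on-nonlinear case, the snippet below fits a straight line to a noise-free quadratic. The toy data, the helper names fit_line and mse, and the choice of mean squared error are assumptions made for illustration, not part of any particular library.

```python
# Toy data (assumed for illustration): a noise-free quadratic, y = x^2.
def target(x):
    return x * x

train_x = [-3 + 0.5 * i for i in range(13)]     # -3.0, -2.5, ..., 3.0
test_x = [-2.75 + 0.5 * i for i in range(12)]   # points between the training samples
train_y = [target(x) for x in train_x]
test_y = [target(x) for x in test_x]

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mse(xs, ys, intercept, slope):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

intercept, slope = fit_line(train_x, train_y)
train_mse = mse(train_x, train_y, intercept, slope)
test_mse = mse(test_x, test_y, intercept, slope)

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"train MSE={train_mse:.3f} test MSE={test_mse:.3f}")
```

Because the inputs are symmetric around zero, the best line is essentially flat, so the error is large on the training points as well as the held-out ones. High error on both sets is the pattern that separates underfitting from overfitting.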
Common causes include using a model that is too simple for the problem (a linear model for nonlinear data), insufficient training (too few epochs or overly aggressive early stopping), excessive regularization that constrains the model too much, poor feature engineering that leaves the model without informative inputs, and a learning rate too high or too low for training to converge.
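The regularization cause can be illustrated with a rough sketch of single-feature ridge regression, where the penalty strength lam shrinks the slope toward zero. The data, the name ridge_slope, and the lam values are hypothetical; the closed form used here assumes centered inputs and targets with no intercept term.

```python
# Toy data (assumed): an exactly linear relationship, y = 2x, already centered.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [2.0 * x for x in xs]

def ridge_slope(xs, ys, lam):
    """Minimize sum((y - b*x)^2) + lam * b^2 for centered data: b = Sxy / (Sxx + lam)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)

def train_mse(xs, ys, slope):
    """Mean squared error of y = slope * x on the training points."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

results = {}
for lam in (0.0, 10.0, 1000.0):
    b = ridge_slope(xs, ys, lam)
    results[lam] = (b, train_mse(xs, ys, b))
    print(f"lam={lam:>6}: slope={b:.4f} train MSE={results[lam][1]:.4f}")
```

With no penalty the fit recovers the true slope. As lam grows, the slope is pushed toward zero and training error rises, even though a linear model is the right shape for this data.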
Fixing underfitting typically involves increasing model complexity (more layers, more parameters, a more flexible architecture), training longer, reducing regularization strength, engineering better features, or rethinking the problem formulation. The balance between overfitting and underfitting is a core tension in machine learning, captured by the bias-variance tradeoff: underfitting is the high-bias side, overfitting the high-variance side. The goal is a model complex enough to learn the signal but constrained enough to ignore the noise.
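One of these fixes, engineering a better feature, can be sketched on a toy problem. The example below assumes the target is an exact quadratic and that the squared input is the right feature to add; fit_line and mse are illustrative helpers, and real data would rarely be this clean.

```python
# Toy data (assumed): y = x^2 on a symmetric grid, with no noise.
xs = [-3 + 0.5 * i for i in range(13)]
ys = [x * x for x in xs]

def fit_line(features, targets):
    """Closed-form ordinary least squares for target = intercept + slope * feature."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    sff = sum((f - mean_f) ** 2 for f in features)
    sft = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    slope = sft / sff
    return mean_t - slope * mean_f, slope

def mse(features, targets, intercept, slope):
    """Mean squared error of the fitted line on the given feature values."""
    return sum((t - (intercept + slope * f)) ** 2
               for f, t in zip(features, targets)) / len(features)

# Same linear model, two different inputs: the raw x and the engineered x^2.
raw_fit = fit_line(xs, ys)
raw_mse = mse(xs, ys, *raw_fit)

squared = [x * x for x in xs]
eng_fit = fit_line(squared, ys)
eng_mse = mse(squared, ys, *eng_fit)

print(f"raw feature x:     train MSE={raw_mse:.3f}")
print(f"engineered x^2:    train MSE={eng_mse:.3f}")
```

The learning algorithm is unchanged between the two fits. Only the input representation differs, which is enough here to move the model from underfitting to fitting the training data closely.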
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings with low-latency similarity search.
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.