Regression
A supervised learning task that predicts a continuous numerical value based on input features, used for forecasting metrics like revenue, estimating customer lifetime value, and predicting engagement scores.
Regression models predict numbers rather than categories. Linear regression is the simplest form: it fits a straight line (or, with several features, a hyperplane) to the data. Polynomial regression captures curved relationships by adding powers of the input features. Ridge and Lasso regression add regularization penalties (L2 and L1, respectively) that reduce overfitting. Gradient-boosted trees and neural networks handle complex nonlinear relationships. The choice depends on the complexity of the underlying relationship and the amount of available data.
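As a rough illustration of the two simplest options above, the sketch below fits a one-feature line by ordinary least squares and, optionally, with a ridge-style penalty on the slope. The function name `fit_line`, the parameter `l2_penalty`, and the sample numbers are all hypothetical; a real project would more likely use a library such as scikit-learn.

```python
from statistics import mean


def fit_line(xs, ys, l2_penalty=0.0):
    """Fit y ~ slope * x + intercept by least squares.

    With l2_penalty > 0 this becomes a one-feature ridge fit: the slope is
    shrunk toward zero, while the intercept is left unpenalized.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + l2_penalty)
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: months since launch vs. revenue (arbitrary units).
months = [1, 2, 3, 4, 5]
revenue = [2.1, 3.9, 6.2, 7.8, 10.1]

plain_slope, plain_intercept = fit_line(months, revenue)
ridge_slope, ridge_intercept = fit_line(months, revenue, l2_penalty=10.0)

print(f"least squares: slope={plain_slope:.3f}, intercept={plain_intercept:.3f}")
print(f"ridge:         slope={ridge_slope:.3f}, intercept={ridge_intercept:.3f}")
```

The penalized fit always has a slope no larger in magnitude than the unpenalized one, which is the mechanism by which regularization trades a little bias for less sensitivity to noise.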
Evaluation metrics for regression differ from classification: mean absolute error (MAE) is the average absolute difference between predicted and actual values, mean squared error (MSE) averages the squared differences and so penalizes large errors more heavily, and R-squared measures what fraction of the variance in the target the model explains. The choice of metric should reflect business impact: MAE is appropriate when the cost of an error grows in proportion to its size, while MSE is better when large errors are disproportionately costly.
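To make the difference between these metrics concrete, here is a minimal sketch that computes all three from their standard definitions. The function names and the sample values are made up for illustration.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of the squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Fraction of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual values and two sets of predictions.
actual = [100, 200, 300, 400]
many_small_errors = [110, 190, 310, 390]  # every prediction is off by 10
one_large_error = [100, 200, 300, 440]    # one prediction is off by 40

for name, preds in [("many small", many_small_errors),
                    ("one large", one_large_error)]:
    print(name, mae(actual, preds), mse(actual, preds),
          round(r_squared(actual, preds), 3))
```

Both prediction sets have the same MAE (10.0), but the MSE of the second is four times that of the first (400.0 versus 100.0), because squaring concentrates the penalty on the single large miss.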
For growth teams, regression models drive quantitative predictions: forecasting monthly recurring revenue, estimating customer lifetime value for acquisition budget allocation, predicting time-to-conversion for lead prioritization, and forecasting demand for capacity planning. The practical challenge is that regression predictions are continuous, but business decisions are often discrete (invest/don't invest), so combining regression predictions with threshold-based decision rules is a common production pattern.
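One way to picture the threshold pattern is the sketch below, which turns a predicted customer lifetime value into an invest / don't invest decision. The function name, the `min_ratio` parameter, and its default of 3.0 are assumptions chosen for illustration, not recommended values; the predicted figures stand in for the output of some regression model.

```python
def acquisition_decision(predicted_ltv, acquisition_cost, min_ratio=3.0):
    """Map a continuous LTV prediction to a discrete decision.

    Returns "invest" when predicted LTV is at least min_ratio times the
    acquisition cost, and "don't invest" otherwise.
    """
    if acquisition_cost <= 0:
        raise ValueError("acquisition_cost must be positive")
    ratio = predicted_ltv / acquisition_cost
    return "invest" if ratio >= min_ratio else "don't invest"


# Hypothetical model outputs for three acquisition channels:
# channel -> (predicted LTV, acquisition cost per customer).
channels = {
    "search": (900.0, 250.0),
    "social": (400.0, 200.0),
    "referral": (600.0, 200.0),
}

for channel, (ltv, cost) in channels.items():
    print(channel, acquisition_decision(ltv, cost))
```

Keeping the threshold as an explicit parameter separates the model (which estimates a number) from the business rule (which decides what to do with it), so either can be adjusted without retraining or rewriting the other.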
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings with low-latency similarity search.
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.