Loss Function
A mathematical function that quantifies the difference between a model's predictions and the actual target values, providing the signal that guides the optimization process during training.
The loss function defines what "good" means for your model. It converts the abstract goal of "make accurate predictions" into a concrete number that gradient descent can minimize. Different tasks require different loss functions: cross-entropy for classification, mean squared error for regression, contrastive loss for embeddings, and custom losses for specific business objectives.
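To make the "concrete number" idea tangible, here is a minimal pure-Python sketch of two of the losses named above, mean squared error and binary cross-entropy (the function names and signatures are illustrative, not from any particular library):

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average squared gap between prediction and target."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy: penalizes confident, wrong probability estimates."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

Either function collapses "how good are these predictions?" into a single scalar: `mse` for regression targets, `binary_cross_entropy` for predicted probabilities, which is exactly the quantity gradient descent then drives down.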
Choosing the right loss function is a critical design decision because it directly determines what the model optimizes for. Cross-entropy loss encourages the model to output calibrated probabilities. Focal loss emphasizes hard examples, useful when easy examples dominate. Weighted losses let you penalize certain types of errors more heavily, reflecting their business impact. If misclassifying a churning customer costs 10x more than misclassifying a retained one, your loss function should reflect that.
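A weighted loss like the one described can be sketched by adding a class weight to binary cross-entropy. Assuming the churn scenario above, a `pos_weight` of 10.0 mirrors the 10x cost asymmetry; the parameter name and the exact ratio are illustrative assumptions, not a fixed convention:

```python
import math

def weighted_bce(y_true, p_pred, pos_weight=10.0, eps=1e-12):
    """Binary cross-entropy where missing a positive (e.g. a churning
    customer) costs pos_weight times more than a false alarm.
    pos_weight=10.0 is a hypothetical value mirroring the 10x example."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

With the default weight, confidently missing a churner (`y=1`, `p=0.1`) incurs ten times the loss of an equally confident false alarm (`y=0`, `p=0.9`), so gradient descent is pushed toward catching churners even at the cost of extra false positives.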
For production ML systems, the loss function used during training often differs from the business metric you ultimately care about. A recommendation model trained with cross-entropy loss is evaluated on click-through rate. A churn model trained with log loss is evaluated on business value saved. Understanding the gap between the training loss and the business metric helps you design evaluation frameworks that accurately predict production impact.
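The training-loss/business-metric gap can be illustrated by scoring the same churn predictions two ways. The `value_saved` metric below, along with its `save_value` and `outreach_cost` parameters, is a hypothetical example, not a standard formula:

```python
import math

def log_loss(y_true, p_pred, eps=1e-12):
    """The training objective: average negative log-likelihood."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def value_saved(y_true, p_pred, threshold=0.5,
                save_value=100.0, outreach_cost=10.0):
    """A hypothetical business metric: every customer flagged above the
    threshold triggers an outreach costing outreach_cost; each flagged
    customer who really was churning saves save_value. All dollar
    figures are illustrative assumptions."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        if p >= threshold:
            total -= outreach_cost
            if y == 1:
                total += save_value
    return total
```

Log loss rewards well-calibrated probabilities on every example, while `value_saved` only cares what happens around the decision threshold, so two models can trade ranks between the two metrics. Offline evaluation should report the business-facing metric, not just the training loss.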
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings with sub-millisecond similarity search.
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.