Randomization Unit

The entity (user, session, page view, device, cluster, or geographic region) at which random assignment to experiment variants occurs, determining the independence structure of the data and affecting both the validity and statistical power of the experiment.

The randomization unit is one of the most consequential design decisions in experiment planning. It determines at what level random assignment happens and therefore at what level outcomes are independent. The most common randomization unit in online experiments is the user (identified by a persistent ID), but alternatives include session (each visit is independently randomized), page view (each request is independently randomized), device (for cross-platform consistency), cookie (as a proxy for user when login is not required), company or organization (for B2B products), and geographic region (for marketplace or infrastructure experiments). For growth teams, choosing the wrong randomization unit can either invalidate the experiment through interference violations or waste statistical power through unnecessary variance.
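In practice, assignment is usually deterministic: the experiment platform hashes the randomization unit's ID together with an experiment-specific salt, so the same unit always lands in the same variant. A minimal sketch (the function name and salt scheme here are illustrative, not any particular platform's API):

```python
import hashlib

def assign_variant(unit_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically assign a randomization unit to a variant.

    Hashing the unit ID with an experiment-specific salt yields a stable,
    approximately uniform assignment: the same user (or session, device,
    or cluster) always maps to the same variant for a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# Pass whatever ID corresponds to the chosen randomization unit:
# a user ID for user-level, a session ID for session-level, and so on.
variant = assign_variant("user_42", "new_checkout")
```

Because the assignment is a pure function of the ID, switching the randomization unit is just a matter of which ID is hashed; the independence structure of the resulting data changes accordingly.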

The choice of randomization unit should match the level at which the treatment is experienced and the level at which outcomes are meaningfully independent. User-level randomization is appropriate when the treatment provides a consistent experience across sessions and when the user's outcome is independent of other users' assignments. Session-level randomization provides more independent observations (increasing power) but means a user might see different variants in different sessions, which is inappropriate for experiments testing persistent features or where learning effects span sessions. Page-view-level randomization is useful for testing layout or content presentation changes where each page view is an independent opportunity, but creates a confusing experience if the same user sees different variants within a single session. For B2B products, randomizing at the company level ensures all users within an organization have a consistent experience, which is essential for collaborative features but dramatically reduces the effective sample size.

The randomization unit directly affects statistical power and analysis. If the unit is the user, the sample size for power analysis is the number of users, and each user contributes one independent observation. If the unit is the session and each user has multiple sessions, there are more independent observations but within-user correlation between sessions must be accounted for (or the standard errors will be underestimated). If the unit is a cluster (company, region), the effective sample size is the number of clusters, which is typically much smaller than the number of individual users. Common pitfalls include using page-view-level randomization for experiments that should be user-level (creating inconsistent experiences), not accounting for the correlation structure when the analysis unit differs from the randomization unit, and underestimating the sample size impact of cluster-level randomization.
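The power cost of cluster-level randomization can be quantified with the design effect, 1 + (m − 1) × ICC, where m is the average cluster size and ICC is the intraclass correlation of the outcome within clusters. A small sketch of this standard calculation (the function name is illustrative):

```python
def effective_sample_size(n_individuals: int, avg_cluster_size: float,
                          icc: float) -> float:
    """Effective sample size under cluster randomization.

    Within-cluster correlation inflates the variance of the treatment
    effect estimate by the design effect 1 + (m - 1) * ICC; dividing the
    raw headcount by it gives the number of effectively independent
    observations for power analysis.
    """
    design_effect = 1 + (avg_cluster_size - 1) * icc
    return n_individuals / design_effect

# 10,000 users in companies of ~50 with a modest within-company
# correlation of 0.1:
print(round(effective_sample_size(10_000, 50, 0.1)))  # → 1695
```

Even a modest ICC shrinks 10,000 individuals to fewer than 2,000 effective observations, which is why company-level experiments need far more traffic (or longer durations) than the raw user count suggests.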

Advanced randomization unit considerations include using multiple randomization layers for different types of experiments (user-level for UI changes, session-level for content ranking, region-level for pricing), implementing ID resolution to ensure consistent randomization across devices for the same user, and handling anonymous-to-logged-in transitions where the randomization unit changes mid-session. Some experimentation platforms support adaptive randomization units that start at a finer grain (page view) and coarsen to user-level once the user is identified. For network experiments, the randomization unit might be a community or cluster identified through graph partitioning algorithms. The choice of randomization unit also affects the interference structure: user-level randomization assumes users do not influence each other, which may be violated in social or marketplace products, necessitating cluster or switchback designs.
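Layered randomization can be sketched by giving each layer its own salt and its own unit type, so assignments across layers are statistically independent even for the same user. This is a minimal illustration of the idea, not any specific platform's implementation; the layer names and ID keys are hypothetical:

```python
import hashlib

def layered_assignment(ids: dict, layers: dict) -> dict:
    """Assign one unit to a variant in each independent experiment layer.

    Each layer hashes with its own salt and its own randomization unit
    (e.g. user for UI changes, session for ranking, region for pricing),
    so a unit's assignment in one layer tells you nothing about its
    assignment in another.
    """
    out = {}
    for layer, (unit_kind, variants) in layers.items():
        key = f"{layer}:{ids[unit_kind]}"
        bucket = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(variants)
        out[layer] = variants[bucket]
    return out

# Hypothetical IDs for one request and three concurrent layers:
ids = {"user": "u_42", "session": "s_20240701_a", "region": "us-west"}
layers = {
    "ui_redesign": ("user", ("control", "treatment")),
    "ranking_model": ("session", ("baseline", "candidate")),
    "pricing": ("region", ("current", "plus_5pct")),
}
assignments = layered_assignment(ids, layers)
```

Note that the "pricing" layer randomizes at the region level, so its effective sample size is the number of regions, as discussed above.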

Related Terms

Cluster Randomization

An experimental design that randomly assigns groups (clusters) of users rather than individual users to treatment conditions, used when individual randomization is not feasible or when interference between users within the same cluster would violate independence assumptions.

Split Testing

The practice of randomly dividing users into two or more groups and exposing each group to a different version of a product experience to measure which version performs better on a target metric, commonly known as A/B testing.

Sample Ratio Mismatch

A diagnostic check that detects whether the observed ratio of users in experiment groups matches the expected ratio from the randomization design, where a significant deviation signals a data quality problem that can invalidate experiment results.

Multivariate Testing

An experimentation method that simultaneously tests multiple variables and their combinations to determine which combination of changes produces the best outcome, unlike A/B testing which typically varies a single element at a time.

Holdout Testing

An experimental design where a small percentage of users are permanently excluded from receiving a new feature or set of features, serving as a long-term control group to measure the cumulative impact of product changes over time.

Power Analysis

A statistical calculation performed before an experiment to determine the minimum sample size required to detect a meaningful effect with a specified probability, balancing the risk of false negatives against practical constraints like traffic and experiment duration.