Most A/B tests fail before they even launch. 📉

Not because the feature was bad, but because the experiment design was flawed. We often rush to "get it live" without doing the math. Here is the end-to-end framework I use to design experiments that actually yield truth. ✅

My A/B test design framework

1) Start with the problem + hypothesis
- What problem are we solving?
- What user behaviour are we trying to change?
- What do we expect to happen, and why?

2) Define the treatment (very concretely)
- What exactly changes in the product?
- Where is the exposure?
- Who is eligible?
- What could be unintentionally affected?

3) Ask the painful question: do we even need a test?
- Sometimes A/B is overkill.
- Sometimes it's impossible to run cleanly.
- Sometimes a rollback, a logging fix, or qualitative research gives more value.

4) Choose the right experiment type
Not everything is "classic 50/50".
- Classic A/B
- Causal inference (diff-in-diff, CausalImpact/forecasting)
- Geo experiments (city split, switchback)

5) Metrics: keep it simple, but complete
- 1–2 goal metrics (the decision metric)
- Proxy metrics (to understand the mechanism)
- Guardrails (to prevent "winning by breaking the product")

6) Randomization & exposure: avoid dilution
- Randomization unit (user? session? order? city?)
- Exposure place & time
- Minimal dilution (don't randomize in one place and measure impact somewhere else)

7) Know what you'll do statistically before launch
Metric type → method:
- Conversion → z-test / t-test for proportions
- Numeric metrics → t-test / bootstrap
- Ratio metrics → linearization / delta method / bootstrap
Plus:
- Variance reduction (stratification, CUPED)
- Multiple-testing correction (if you're looking at many metrics/segments)

8) Sample size & duration (the reality check)
- Choose alpha and power
- Define the MDE (based on history + stakeholder expectations)
- Account for: number of variants, baseline rate, ratio metrics, CUPED adjustments
If the duration is 6 months, the correct decision might be: don't run this test.

9) Launch like an engineer, not like a gambler
- Ramp up gradually
- Check for bugs and sudden metric drops
- Monitor SRM (sample ratio mismatch)
- Track dynamics (sequential testing helps a lot)

10) Analysis + conclusion (the part everyone rushes)
- Check key segments (only the important ones)
- Make the decision with trade-offs: prioritise what matters for the product's ultimate goal / North Star

A/B testing isn't just about p-values. It's about risk management.

Which part of the design phase do you see teams skip the most? 👇 Let me know in the comments.
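Steps 7–9 above (pick the test before launch, run the sample-size reality check, monitor SRM) can be sketched with only the Python standard library. This is a minimal illustration under the usual normal approximation for a two-proportion test; the function names are my own, not from the post:

```python
from math import ceil, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal, used for z quantiles and tail areas

def sample_size_per_variant(baseline, mde_rel, alpha=0.05, power=0.8):
    """Approximate per-variant sample size for a two-sided
    two-proportion z-test, given a baseline conversion rate and a
    relative MDE (e.g. 0.10 for a 10% lift)."""
    p1 = baseline
    p2 = baseline * (1 + mde_rel)      # rate we want to be able to detect
    z_a = _N.inv_cdf(1 - alpha / 2)    # two-sided significance threshold
    z_b = _N.inv_cdf(power)            # power requirement
    pooled = (p1 + p2) / 2
    n = ((z_a * sqrt(2 * pooled * (1 - pooled))
          + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p2 - p1) ** 2)
    return ceil(n)

def srm_pvalue(n_control, n_treatment, expected_split=0.5):
    """Sample-ratio-mismatch check: chi-square goodness-of-fit of the
    observed group sizes against the planned split (1 degree of freedom).
    A tiny p-value means the split is broken; investigate before trusting
    any metric."""
    total = n_control + n_treatment
    exp_c = total * expected_split
    exp_t = total - exp_c
    stat = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # A chi-square variable with 1 df is the square of a standard normal.
    return 2 * (1 - _N.cdf(sqrt(stat)))
```

For a 10% baseline and a 10% relative MDE this lands near 15k users per variant, which is exactly the kind of number that turns "let's just launch it" into a duration conversation.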
How to Conduct a Randomized Controlled Experiment
Summary
A randomized controlled experiment is a scientific method used to test whether an intervention or change truly influences outcomes by comparing randomly assigned groups. This approach helps eliminate bias, ensuring that any difference observed is due to the intervention itself rather than other factors.
- Define your hypothesis: Start by clearly stating what problem you’re investigating and what outcome you expect from the intervention.
- Set up randomization: Randomly assign participants to control and treatment groups to ensure results aren’t influenced by pre-existing differences.
- Analyze results carefully: Review the data to confirm changes are statistically meaningful and check for unintended effects before making any decisions.
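The "analyze results" step above often comes down to a two-proportion z-test on conversion counts. A minimal sketch using only the Python standard library; the helper name is illustrative, not from the summary:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between a
    control (a) and a treatment (b). Returns (absolute_lift, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)   # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, p_value
```

A statistically significant lift is necessary but not sufficient: guardrail metrics and unintended effects still need the same scrutiny before a decision.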
I'm working on 5 behavioural science experiments across 570k+ people today.

An experiment, specifically a randomised controlled experiment (RCE), is the gold standard for testing whether a drug 💊 or a vaccine 💉 works. Scientists use randomised controlled trials to infer whether a drug is causing the healing effect for which they designed it.

In my work as a behavioural strategist, I always prefer to test an intervention (aka treatment) to establish causal evidence between the intervention and the target change in behaviour. For instance, if I'm adjusting the buying journey, I'd like to test whether the adjustment is causing the shift in buying behaviour before I throw funds into scaling the adjustment. 🤔 I also want to avoid false positives ❌, i.e., concluding that the adjustment works when I just got lucky.

Many loosely refer to RCEs as an "A/B test" (or "A/B/n test" if there is more than one intervention to test). I'm careful about using those terms because many of these tests have sloppily disregarded an RCE cornerstone: randomisation. Why is randomisation important? Because without randomisation, we end up biasing our results. 😮

There are 3 points in an RCE relevant to randomisation that I'd like to highlight: selection, allocation, and intervention delivery.

1️⃣ & 2️⃣ Selection and allocation. Randomly select individuals from your population to create a representative sample for the experiment, and randomly allocate them to your treatment and control groups. Both the selection and the allocation should be random. Selecting and allocating by the first letter of the last name, by the order individuals are stored in a database, or by the city they live in aren't random. Any time you follow a pattern, you throw randomisation out and bring bias in.

3️⃣ Intervention delivery. Ideally, you should deliver the intervention and control (baseline) treatments to participants at the same time. But sometimes this is not possible. For instance, if you have to broadcast a treatment message to 500k participants and your messaging system only allows you 10k messages per hour, you must randomly sequence your broadcast. You don't want treatment group A to receive your message at 7 a.m. while treatment group F receives it at 7 p.m.; randomise the send order so broadcast time doesn't bias your results (unless broadcast time is a treatment in itself).

As Matteo Maria Galizzi, my mentor from The London School of Economics and Political Science (LSE), taught me: no amount of data science can fix a faulty experimental design.

#behavioraleconomics #behavioralscience #behavioraldesign
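The allocation and delivery points above can be sketched in a few lines of Python: shuffling erases any ordering pattern (name, database order, city), and a second shuffle fixes the broadcast-sequencing problem. All names here are illustrative, not from the post:

```python
import random

def allocate(user_ids, n_groups=2, seed=42):
    """Randomly allocate a selected sample into near-equal groups by
    shuffling first, so no ordering pattern in the input leaks into the
    assignment. Fixed seed only for reproducibility of the example."""
    ids = list(user_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Deal the shuffled ids round-robin into n_groups lists.
    return [ids[i::n_groups] for i in range(n_groups)]

def broadcast_order(participants, seed=7):
    """Randomize send order so a rate-limited broadcast (e.g. 10k
    messages per hour) does not confound send time with group."""
    order = list(participants)
    random.Random(seed).shuffle(order)
    return order
```

With a randomized send order, early and late delivery hours contain a mix of all groups, so broadcast time averages out instead of lining up with one treatment.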
Writing that a study will be "randomized, double-blind, cross-over, adaptive with sample size re-estimation" is EASY compared to executing on that design. Here are some tips for how to ensure your designs are executed as intended:

🔀 Randomized: before the protocol is finalized
- Finalize stratification factors (the variables you want to ensure are balanced between groups)
- Select a randomization scheme (permuted block with a set block size, adaptive, etc.)
- Ensure you know who is writing the randomization plan and who is drafting the randomization codes (have them generate dummy randomization codes beforehand to test)

😎 Blinding: while writing the protocol
- WRITE A BLINDING PLAN: who is blinded, who is not, how emergency unblinding will occur, etc.
- Set up your AND YOUR VENDOR'S environments to have separate areas for the blinded and unblinded teams
- Work with the programming team to ensure they know the plan for programming unblinded outputs from blinded data (assuming that is how you will be working) and how unblinded outputs will be generated once the data is locked

💊 Cross-Over: before the protocol is finalized
- Ensure a washout period is included, and work with medical to confirm it is sufficient and will not impact analysis results
- Work with the sponsor to determine how cross-over data is to be handled/presented via TLFs
- Work with data management to understand how subject data will be tracked between treatment regimens

🦠 Adaptive Design: during protocol development
- Be VERY specific about what will be adaptive in the study, and set concrete rules during protocol writing so the adaptive nature of the study stays limited and cannot balloon out of control

It can be very easy to write down a study design and not think about all the nuances that go into actually executing it. That is the first step toward a poorly conducted trial. Clinical trials are won or lost during setup; take the time to do the mental reps on what will need to be completed, and when, to ensure a successfully executed clinical trial.

Happy Monday
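The permuted-block scheme mentioned above can be sketched as follows. Within each block, every arm appears an equal number of times in random order, so group sizes stay balanced throughout enrollment. This is an illustrative sketch only, not a validated randomization system, and the function name is my own:

```python
import random

def permuted_block_assignments(n_subjects, block_size=4,
                               arms=("A", "B"), seed=2024):
    """Generate a permuted-block randomization schedule: each block
    contains every arm block_size / len(arms) times, shuffled, so the
    arms never drift far out of balance during enrollment."""
    assert block_size % len(arms) == 0, "block must hold equal counts per arm"
    rng = random.Random(seed)          # seeded only to make the example reproducible
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(arms) * per_arm   # balanced block, e.g. A B A B
        rng.shuffle(block)             # random order within the block
        schedule.extend(block)
    return schedule[:n_subjects]
```

Small block sizes keep balance tight but make upcoming assignments easier to guess, which is one reason the block size itself is often concealed from investigators.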