A/B Test Calculator for Marketers
Want to know if your new idea actually worked? This tool helps you measure the real impact of any change — from marketing messages to product features — using a classic A/B testing approach without the math headache.
What is an A/B Test?
An A/B test is a simple and powerful method to compare two versions of something — such as a landing page, email subject line, or ad creative — to see which one performs better.
You split your audience into two groups at random. One group (the control) sees version A, and the other (the treatment) sees version B. This calculator helps you understand if the difference in results — like click-through rate or sign-ups — is big enough to matter or could have happened by chance.
How Our Calculator Works
We use statistical tests to compare your two groups — control and treatment — and determine whether the difference in performance is real or could have happened by chance.
- Inputs: For each group, enter:
  - Total users (or traffic), e.g., how many people saw the email or visited the page
  - Number of successes, e.g., how many signed up, clicked, or converted
- Output: estimated effect size, confidence interval, p-value, and a clear verdict on statistical significance
You can test any success rate. Here are two examples of inputs, with a quick worked check of the first one right after the list:
- Email A vs. Email B: 1,000 users saw each version; 120 clicked A, 100 clicked B.
- Landing Page Test: 800 users visited Version A and 1,200 visited Version B; 80 signed up from A, 132 from B.
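If you'd like to sanity-check a result yourself, the same kind of comparison can be run in a few lines of Python. Here is a minimal sketch using the statsmodels library on the Email example above; it illustrates the standard two-proportion z-test rather than our exact implementation:

```python
from statsmodels.stats.proportion import proportions_ztest

# Email A vs. Email B: 1,000 users saw each version; 120 clicked A, 100 clicked B
clicks = [120, 100]
users = [1000, 1000]

z_stat, p_value = proportions_ztest(clicks, users, alternative='two-sided')
print(f"click-through rates: {clicks[0] / users[0]:.1%} vs. {clicks[1] / users[1]:.1%}")
print(f"z = {z_stat:.2f}, p-value = {p_value:.3f}")   # roughly z = 1.43, p = 0.15
```

With samples this size, a 12% vs. 10% gap comes out above the usual 0.05 threshold, so it could plausibly be chance; you would want more data before declaring a winner.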
An A/B test compares two versions of something — like a web page or ad — to see which performs better.
You show version A (control) to one group and version B (treatment) to another, then compare outcomes like clicks or sign-ups.
Example: Test two email subject lines. The one with more opens is your winner!
Control group: sees the original version — your business-as-usual.
Treatment group: sees the new variation you want to test.
Comparing both helps you see if the change had a real impact.
This calculator is for metrics based on proportions (successes out of total):
- Conversion Rate
- Click-through Rate
- Sign-up Rate
- Purchase Rate
It does not support continuous outcomes like revenue per user. For that, consider other tools or reach out to us.
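In other words, every metric above is just "successes divided by total", which is exactly what the calculator asks for. A tiny illustration with made-up numbers:

```python
clicks, recipients = 120, 1000
click_through_rate = clicks / recipients   # 0.12, i.e. a 12% click-through rate
```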
The confidence level is how sure you want to be about the result.
A 95% confidence level means you accept only a 5% risk of calling a difference real when it's actually just random chance.
Alpha is simply 1 - confidence level, so if confidence = 95%, alpha = 0.05.
We don't show alpha to keep things simple — stick with 95% if you're unsure. It's the industry default.
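For the curious, the relationship is a one-liner. A small sketch (the variable names are just for illustration; scipy is used only to show the matching critical value):

```python
from scipy.stats import norm

confidence = 0.95
alpha = 1 - confidence                   # 0.05
# z value the result must exceed in a two-sided test at 95% confidence
z_critical = norm.ppf(1 - alpha / 2)     # about 1.96
print(round(alpha, 2), round(z_critical, 2))
```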
You can choose the type of test:
- Two-sided (default): checks for any change — higher or lower.
- One-sided (greater): checks only if the treatment is better.
- One-sided (less): checks only if the treatment is worse.
If you're unsure, use two-sided — it's safest and works for most marketing tests.
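If you ever run the same check in a statistics library, the test type typically maps to an "alternative" option. A minimal sketch with statsmodels, assuming the treatment group's numbers are listed first:

```python
from statsmodels.stats.proportion import proportions_ztest

successes = [100, 120]   # [treatment (B), control (A)] clicks, reusing the email example
totals = [1000, 1000]

# Two-sided (default): detects a change in either direction
_, p_two_sided = proportions_ztest(successes, totals, alternative='two-sided')
# One-sided (greater): only asks whether the treatment is better
_, p_greater = proportions_ztest(successes, totals, alternative='larger')
# One-sided (less): only asks whether the treatment is worse
_, p_less = proportions_ztest(successes, totals, alternative='smaller')
```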
A statistically significant result means the observed difference is unlikely to be due to chance.
In business terms: You can be reasonably confident your change actually worked.
If your result is not significant, it might mean the effect is small — or you need more data.
p-value: How likely you'd see a difference this large if there were really no effect. Smaller is better. If p < 0.05, it's usually considered statistically significant.
z-score: A value showing how many standard errors your result is from "no difference."
We don't show z-score directly — it's more technical and can be confusing. We use it behind the scenes to calculate results.
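For readers who do want to peek behind the curtain, the sketch below shows the standard two-proportion z-test that turns your four inputs into a z-score and p-value. It uses only Python's standard library and illustrates the textbook formula, not necessarily our exact implementation:

```python
import math

def two_proportion_ztest(successes_a, total_a, successes_b, total_b):
    """Pooled two-proportion z-test; returns the z-score and two-sided p-value."""
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    # Pooled success rate under the "no difference" assumption
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_error
    # Two-sided p-value from the normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p = two_proportion_ztest(120, 1000, 100, 1000)   # Email example from earlier
print(round(z, 2), round(p, 3))                     # roughly 1.43 and 0.153
```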
Sometimes random assignment isn't possible — like testing a campaign only in certain cities.
In that case, try a geographic test instead.
Use our GeoLift Calculator — it compares treated vs. control regions over time and works great for:
- Retail store tests
- Localized campaigns
- City-based advertising
This tool is for analyzing results after a test.
If you're still planning your test, use our Sample Size Calculator instead. It helps you figure out:
- How many people you need
- The smallest effect you can detect
To trust your results, these assumptions must hold:
- Random assignment: groups are comparable
- No external changes: no major events during the test
- Stable environment: users aren't switching between groups
If any of these are violated, your results could be biased.
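If you're setting up the split yourself, hashing each user's ID is one common way to get an assignment that is effectively random but stays consistent for the same person. A minimal sketch (the 50/50 split and the helper name are just illustrative):

```python
import hashlib

def assign_group(user_id: str) -> str:
    """Deterministic 50/50 split based on a hash of the user ID."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
    return "treatment" if bucket < 50 else "control"

print(assign_group("user-1234"))   # the same user always lands in the same group
```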
The Minimum Detectable Effect (MDE) is used during test setup.
It helps you answer: "What's the smallest change I care about detecting?"
This calculator is for after-the-fact analysis — use our Sample Size Calculator for planning ahead with MDE.
Power is the chance of finding a real effect if one exists.
This calculator computes observed power behind the scenes, but we don't display it because it can be misleading and depends heavily on the results you happened to get.
To plan a test with enough power, use our Sample Size Calculator.
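To give a feel for what that planning step involves, here is a rough sketch of the kind of calculation a sample-size tool performs, using statsmodels. The baseline rate and the smallest lift you care about (your MDE) are placeholder numbers for illustration:

```python
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline_rate = 0.10      # current conversion rate (illustrative)
target_rate = 0.12        # baseline plus the smallest lift worth detecting (the MDE)

effect_size = proportion_effectsize(target_rate, baseline_rate)
n_per_group = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,            # 95% confidence
    power=0.80,            # 80% chance of detecting a real lift of this size
    alternative='two-sided',
)
print(round(n_per_group))  # roughly 1,900 users needed in each group
```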
We want this tool to be as accessible and understandable as possible. If you're unsure about any concept or how to format your data, feel free to reach out — we'd be happy to clarify or guide you.
Talk to a marketing data expert to refine your approach.