Design statistically sound experiments with clear hypotheses and sample size calculations.
2 hrs → 20 min
Compared to doing it manually
/ab-test-designer
Type this in Claude to run the skill
Underpowered experiments waste weeks of traffic and produce inconclusive results. Without proper planning, you're flipping coins and calling it data-driven.
This skill is part of a workflow that automates multiple steps together:
Build comprehensive metrics frameworks using AARRR pirate metrics or the input/output methodology.
Diagnose conversion funnel problems and generate data-backed improvement hypotheses.
Design A/B tests with proper methodology, sample sizes, and success criteria.
Interpret experiment results with statistical rigor and clear ship/no-ship recommendations.
A/B test when: you have enough traffic for statistical significance, the change is measurable, and the risk of the change warrants validation. Don't A/B test obvious fixes or low-traffic features.
Until you reach statistical significance (usually 95% confidence) AND at least one full business cycle (typically 1-2 weeks). Don't stop tests early just because results look good.
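To see why early stopping tempts you, it helps to know how many visitors a test actually needs. Here's a minimal sketch of the standard two-proportion sample size formula, using only the Python standard library; the baseline rate, lift, and traffic numbers are hypothetical, not from this skill:

```python
from statistics import NormalDist

def sample_size_per_variant(p_base, mde, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-proportion test.

    p_base: baseline conversion rate (e.g. 0.05 for 5%)
    mde:    absolute minimum detectable effect (e.g. 0.01 for +1 point)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided 95% -> ~1.96
    z_beta = z.inv_cdf(power)            # 80% power -> ~0.84
    p_var = p_base + mde
    p_bar = (p_base + p_var) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p_base * (1 - p_base)
                             + p_var * (1 - p_var)) ** 0.5) ** 2
    return numerator / mde ** 2

# Hypothetical scenario: 5% baseline, want to detect a +1 point lift
n = sample_size_per_variant(0.05, 0.01)      # roughly 8,000+ per variant
daily_visitors = 2000                         # hypothetical site traffic
days_needed = 2 * n / daily_visitors          # both variants share traffic
```

With those assumed numbers the test needs over a week of traffic, which is why a promising-looking day two is not a reason to stop.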
Statistical significance (usually 95% confidence) means that if there were no real difference, you'd see a result this extreme only 5% of the time by random variation. It's not about how big the difference is — it's about how confident you are it's real.
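The significance check itself is a short calculation. This is a sketch of a standard two-sided z-test for comparing two conversion rates, using only the Python standard library; the visitor and conversion counts are made-up examples:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test p-value for the difference between two rates.

    conv_a, n_a: conversions and visitors in the control
    conv_b, n_b: conversions and visitors in the variant
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)       # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical results: 5.0% vs 5.8% conversion on 10,000 visitors each
p = two_proportion_p_value(500, 10_000, 580, 10_000)
significant = p < 0.05
```

A p-value below 0.05 clears the 95% bar, but remember the other half of the rule above: the test still has to run a full business cycle.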
Download this skill and drop it in your .claude/skills/ folder.
This skill + 70+ more, context files, and agent workflows — $499