Sample Size Calculator & Statistical Power

Calculating sample size is an important first step in running an A/B split test. Many people skip this step, or do not understand the purpose it serves. Let’s explore why to calculate sample size and how to do it.

What is Statistical Power

An experiment’s “power” is 1 minus the probability of a Type II error. A Type II error is when you declare that the challenger had no effect when in fact it did. So if you’re OK with “missing” a real effect 20% of the time, then the test’s statistical power is 80%.

The question is... do you have enough traffic to do this A/B test? Or put another way, how long will it take you to run this test and still have a reasonable chance of picking out the “signal” from the noise, if that signal really exists?

This is where a good sample size / power calculator comes into play. You’ll enter a few numbers, and then you can make a call on whether it makes sense to run this test or not.

Here’s the information you need when using such a calculator. You will need ALL BUT ONE of the following:

Challenger Proportion: Whatever conversion rate you need to be able to detect. When expressed as a percentage change, this is called your Lift Threshold. Your test will also detect anything more extreme than that, that is, more different from your control. But you will NOT have a good chance of detecting a less extreme conversion rate.

Type I error: This is the chance of declaring a difference when there really isn’t one.

Traffic: These calculators are typically geared for 2-recipe tests. If you have more recipes, they may not be appropriate, or you’ll have to adjust the sample sizes.

The calculator will solve for whatever number you left blank.

Here’s a nice online sample size / power calculator you can use. Of course, stats programs such as SPSS, SAS, Minitab, and R can also calculate power.
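To make the relationship between these inputs concrete, here is a minimal Python sketch of what such a calculator does for a 2-recipe test. It is an illustration under stated assumptions, not the code behind any particular calculator: it uses the standard normal approximation for comparing two proportions (intended to follow the same approach as R's power.prop.test, though treat that as an assumption), and the function names and the 10% vs. 12% conversion rates are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def power_two_proportions(n, p1, p2, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    n is the sample size in EACH group, p1 the baseline conversion rate,
    p2 the challenger conversion rate, alpha the Type I error.
    """
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))       # spread if there is no real difference
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))  # spread if the difference is real
    return _norm.cdf((sqrt(n) * abs(p1 - p2) - z_alpha * sd_null) / sd_alt)


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size needed in EACH group (not rounded)."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_beta = _norm.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ((z_alpha * sd_null + z_beta * sd_alt) / abs(p1 - p2)) ** 2


# Hypothetical inputs: 10% baseline, 12% challenger, 5% Type I error, 80% power.
n_needed = sample_size_two_proportions(0.10, 0.12)
print("visitors needed per recipe:", round(n_needed))
print("power at that sample size:", round(power_two_proportions(n_needed, 0.10, 0.12), 3))
```

Leaving a different input "blank" just means rearranging the same relationship: the first function solves for power given traffic, the second solves for traffic given power.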
If you want to use the open-source program R to do this calculation, here are some worked examples.

Example: Here we’re solving for power.
Two-sample comparison of proportions power calculation
n = 24
NOTE: n is number in *each* group

Here we’re solving for sample size (each group):
Two-sample comparison of proportions power calculation
n = 1932.588
NOTE: n is number in *each* group

Running an A/B Test With a Predefined Sample Size

Now you need to run the test until you reach that sample size, then stop and perform your statistical test to find your p-value. If the p-value is below, say, 0.05, you will reject the null hypothesis. If not, you will not reject the null hypothesis, but you will know that your test was powerful enough to detect a change at least as large as your Lift Threshold.

What if you know you are constrained to running the test inside of, say, two weeks? If you don't have the liberty of "solving for" sample size, then you can solve for Lift Threshold. This requires a bit of working backwards, but you can do it. First figure out how much traffic you get in those two weeks, then plug that into the calculator. Keep your Power at .80 and your Type I error at .05. Input your baseline conversion rate. Now the calculator will solve for your Conversion Rate in the Challenger recipe.

What this tells you is that "within 2 weeks your test has an 80% chance of finding a conversion rate change as different as... [whatever number it spit out]." If that number is outrageous and you don't believe the challenger has a snowball's chance in hell of producing that result, then maybe you should rethink the test.

Conclusion

If you want to learn more about the steps involved in running an A/B test using statistics, check out this online course: A/B Testing: Test Design & Statistics
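For the two-week scenario described above, "working backwards" to a Lift Threshold can be sketched in Python as a simple search: fix the traffic, Power, and Type I error, then look for the challenger conversion rate at which power reaches the target. This is a rough sketch under assumptions, not the method of any specific calculator. It uses the normal approximation for two proportions, only searches for challengers that beat the baseline, and the function names and the traffic and baseline figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def power_two_proportions(n, p1, p2, alpha=0.05):
    """Approximate two-sided power; n is the sample size in EACH group."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return _norm.cdf((sqrt(n) * abs(p1 - p2) - z_alpha * sd_null) / sd_alt)


def detectable_challenger_rate(n, baseline, alpha=0.05, power=0.80, tol=1e-9):
    """Challenger rate ABOVE baseline at which power reaches the target.

    Found by bisection between the baseline and (almost) 100%.
    Returns None if even a near-100% challenger would not reach the target.
    """
    lo, hi = baseline, 1.0 - 1e-12
    if power_two_proportions(n, baseline, hi, alpha) < power:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if power_two_proportions(n, baseline, mid, alpha) < power:
            lo = mid  # not enough power yet: the challenger must be more extreme
        else:
            hi = mid  # enough power: try a less extreme challenger
    return hi


# Hypothetical inputs: 5,000 visitors per recipe in two weeks, 10% baseline.
rate = detectable_challenger_rate(5000, 0.10)
if rate is None:
    print("Not enough traffic to reach 80% power for any challenger rate.")
else:
    lift = (rate - 0.10) / 0.10
    print(f"detectable challenger rate: {rate:.4f} (Lift Threshold: {lift:.1%})")
```

If the Lift Threshold it reports looks unrealistic for your challenger, that is the signal to rethink the test or find more traffic.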

Nice post
Hi Jared,
The lift estimator calculator is an interesting idea, thanks for sharing. I modified the sample size calculator and used the Goal Seek tool in Excel given a specific sample size.
Dylan