Blazeway


A/B Test Significance Calculator

Same data, two methods, different answers. Enter your experiment numbers and see how chi-squared and Bayesian analysis interpret the results.

Control (A)

Rate: 5.0%

Variant (B)

Rate: 6.5%

Frequentist

Chi-Squared Test

χ² Statistic: 2.076
p-Value: 0.1496
Significant (p < 0.05): No
The observed difference could easily occur by chance (p = 0.150). Not enough evidence to declare a winner.

Bayesian

Beta-Binomial Model

P(B beats A): 92.4%
Expected Lift: +31.9%
95% Credible Interval: -9.1% to +85.4%
B is likely better (92.4% chance), but there's still meaningful uncertainty. The true lift could be anywhere from -9.1% to +85.4%.

Why two methods?

Most A/B testing tools show you one number and call it a day. But that number comes from a specific statistical framework, and the framework shapes what the number means. This calculator shows you both so you can see the difference yourself.

Chi-squared: "Can I reject the null hypothesis?"

The frequentist approach asks a narrow question: if there were no real difference between A and B, how likely is data this extreme? If unlikely enough (p < 0.05), you call it significant. If not, you can't conclude anything. It's binary. You either cross the threshold or you don't.

This works well when you need a strict decision rule. But it doesn't tell you how much better B is, or how confident you should be. P-values of 0.049 and 0.051 are practically identical, yet one is "significant" and the other isn't.
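In Python, a chi-squared test like the one above can be sketched with SciPy. The 1,000 visitors per variant below is an illustrative assumption, not a number from the widget, though it happens to reproduce the χ² = 2.076 and p = 0.1496 shown in the results:

```python
from scipy.stats import chi2_contingency

# 2x2 contingency table: [conversions, non-conversions] per variant.
# Sample sizes are assumed for illustration: 50/1,000 (5.0%) vs 65/1,000 (6.5%).
table = [[50, 950],   # Control (A)
         [65, 935]]   # Variant (B)

# correction=False disables the Yates continuity correction,
# matching a plain chi-squared test on the observed counts.
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")  # chi2 = 2.076, p = 0.1496
```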

Bayesian: "How likely is it that B is better, and by how much?"

The Bayesian approach directly answers the question you actually care about. It gives you a probability that B beats A (e.g. 94.3%) and a credible interval for the lift (e.g. +5% to +25%). No arbitrary threshold. You decide what confidence level is enough for your decision.

This is more useful for product decisions. "There's a 92% chance this improves conversion by 8-20%" is a better basis for a shipping decision than "p = 0.07, not significant."

When they agree, when they don't

With large samples and clear winners, both methods tell the same story. The differences matter at the margins: small samples, ambiguous results, marginal effects. That's exactly where you need the extra nuance that Bayesian analysis provides.

What this calculator uses

The frequentist column uses a standard chi-squared test on a 2x2 contingency table with one degree of freedom. The Bayesian column uses a Beta-Binomial model with a uniform prior (Beta(1,1)) and 100,000 Monte Carlo samples to estimate the posterior distribution. All calculations run in your browser. No data is sent anywhere.
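The Bayesian column described above can be sketched in a few lines of NumPy: a uniform Beta(1,1) prior updated with the observed counts, then 100,000 Monte Carlo draws from each posterior. The counts below (50/1,000 vs 65/1,000) are an illustrative assumption consistent with the 5.0% and 6.5% rates in the widget:

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 100_000  # Monte Carlo samples, as described above

# Beta(1,1) prior + binomial data -> Beta(1 + successes, 1 + failures) posterior
a_post = rng.beta(1 + 50, 1 + 950, n_draws)  # posterior draws for A's rate
b_post = rng.beta(1 + 65, 1 + 935, n_draws)  # posterior draws for B's rate

p_b_beats_a = (b_post > a_post).mean()       # P(B beats A)
lift = (b_post - a_post) / a_post            # relative lift per draw
lo, hi = np.percentile(lift, [2.5, 97.5])    # 95% credible interval

print(f"P(B beats A) = {p_b_beats_a:.1%}")
print(f"Expected lift = {lift.mean():+.1%}")
print(f"95% credible interval: {lo:+.1%} to {hi:+.1%}")
```

With these counts the output lands close to the widget's numbers (roughly a 92% chance B beats A, with a wide credible interval), up to Monte Carlo noise.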

FAQ

Frequently Asked Questions

What is the difference between chi-squared and Bayesian A/B testing?

A chi-squared test gives a binary answer: significant or not, based on a p-value threshold (typically 0.05). A Bayesian approach tells you the probability that one variant beats the other and the expected size of the difference. Chi-squared answers "can I reject the null hypothesis?" while Bayesian answers "how confident should I be that B is better, and by how much?"

Which method should I use for my A/B test?

For most product decisions, Bayesian analysis is more useful because it gives you a probability and a range rather than a binary yes/no. Chi-squared is better when you need a strict, conventional significance test, for example in academic research or regulated environments. Both methods converge on the same conclusion with large enough sample sizes.

What is a p-value?

A p-value is the probability of observing results as extreme as yours if there were no real difference between variants. A p-value of 0.03 means there is a 3% chance of seeing data at least this extreme if A and B were identical. It does not mean there is a 97% chance that B is better. That distinction is why Bayesian analysis exists.

What is a credible interval?

A 95% credible interval is the range that contains the true lift with 95% probability. If the interval is +5% to +25%, you can say: there is a 95% chance the real improvement is between 5% and 25%. This is more intuitive than a frequentist confidence interval, which answers a subtly different question.

How many visitors do I need?

It depends on your baseline conversion rate and the effect size you want to detect. As a rough guide: to detect a 20% relative lift on a 5% conversion rate with a conventional two-sided test (5% significance, 80% power), you need roughly 8,000 visitors per variant. The Bayesian approach can give useful signals earlier because it provides a probability rather than a binary threshold.
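The rough guide above can be reproduced with the standard two-proportion z-test sample-size formula. This is a sketch using conventional defaults (two-sided α = 0.05, 80% power), not something the calculator itself computes:

```python
from scipy.stats import norm

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect p1 -> p2 with a two-sided z-test."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for significance level
    z_beta = norm.ppf(power)            # critical value for desired power
    p_bar = (p1 + p2) / 2               # pooled rate under the null
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p2 - p1) ** 2

# 5% baseline, 20% relative lift -> 6% target rate
n = sample_size_per_variant(0.05, 0.06)
print(round(n))  # roughly 8,000 per variant
```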

Turn numbers into product decisions

Blazeway helps you document the hypothesis behind every test and the learning behind every result. So significance becomes insight.

Start free