This is a very basic question, so please bear with me. I've been learning about AB Testing, which is largely used in internet marketing to examine the effectiveness of certain aspects of ads, websites, etc. Here are a couple of links for people who want to know more about AB Testing:
http://visualwebsiteoptimizer.com/split-testing-blog/what-you-really-need-to-know-about-mathematics-of-ab-split-testing/
http://20bits.com/articles/statistical-analysis-and-ab-testing/
http://elem.com/~btilly/effective-ab-testing/

Let's say that I have a website that registers users for a forum. I want to know whether Headline 1 or Headline 2 is more effective at getting visitors to the website to register for the forum. So I have the following data.

# counts stored as numbers rather than strings so they can be used in arithmetic
dat = data.frame(Headline=c("Headline 1", "Headline 2"),
                 Visitors=c(1000, 1300),
                 Clicks=c(500, 600),
                 Conversions=c(100, 150))

And here are the click-through rates and conversion rates (in percent) for each of the headlines.

ctr1 = (500/1000)*100 # for headline 1
ctr2 = (600/1300)*100 # for headline 2
ctr1; ctr2

conv1 = (100/1000)*100 # for headline 1
conv2 = (150/1300)*100 # for headline 2
conv1; conv2

Based on the sites above, what I'm really interested in is the confidence interval for the conversion rate of each headline. While 95% confidence would be ideal, I'm open to anything 80% and up, so I need to calculate confidence intervals where I am 80%/85%/90%/95% confident that the conversion rate for a headline is within a certain range.

I'm really not sure how to go about this. Are there specific tests and/or functions in R that will provide me with the appropriate information? (confint, chisquare, gtest, etc.?)

Thanks for your patience and help.

EDIT: So I tried the following, and I'm not sure if I'm doing it properly or drawing the right conclusions. Furthermore, there has to be a more efficient way to perform this task in R.
For a given conversion rate (p) and number of trials (n):

p1 = 0.1 # 100/1000
n1 = 1000
se1 = sqrt( p1 * (1-p1) / n1 )
se1
se1 * 1.96
(p1 + 1.96*se1) * 100
(p1 - 1.96*se1) * 100

p2 = 150/1300 # about 0.115, not rounded down to 0.11
n2 = 1300
se2 = sqrt( p2 * (1-p2) / n2 )
se2
se2 * 1.96
(p2 + 1.96*se2) * 100
(p2 - 1.96*se2) * 100

# 95% intervals, in percent:
# (8.1, 11.9) for headline 1
# (9.8, 13.3) for headline 2

# these confidence intervals for the two headlines overlap.
# therefore, the variation (headline 2) isn't more effective
# than the control headline

Thanks again.

I'm running R 2.13 on Windows 7.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
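
The manual calculation in the EDIT can be sketched as a small function, so that the same steps cover the 80%/85%/90%/95% levels asked about. This is only a sketch under the same normal-approximation formula used there, with the counts from the data above; `wald_ci` and `conf_levels` are names made up for illustration, while `qnorm()` and `prop.test()` come with base R (stats package).

```r
# Normal-approximation (Wald) interval for a proportion, in percent.
# wald_ci is a hypothetical helper name, not a base R function.
wald_ci <- function(successes, trials, level = 0.95) {
  p  <- successes / trials
  se <- sqrt(p * (1 - p) / trials)
  z  <- qnorm(1 - (1 - level) / 2)   # about 1.96 when level = 0.95
  c(lower = (p - z * se) * 100, upper = (p + z * se) * 100)
}

conf_levels <- c(0.80, 0.85, 0.90, 0.95)

# One row per confidence level, columns "lower" and "upper" (percent)
ci1 <- t(sapply(conf_levels, function(l) wald_ci(100, 1000, l)))  # headline 1
ci2 <- t(sapply(conf_levels, function(l) wald_ci(150, 1300, l)))  # headline 2
rownames(ci1) <- rownames(ci2) <- paste(conf_levels * 100, "%", sep = "")
ci1
ci2

# prop.test() also reports a confidence interval; it uses a different
# method, so its numbers will not match the Wald interval exactly.
prop.test(100, 1000, conf.level = 0.95)$conf.int * 100
```

Taking the z value from `qnorm()` instead of hard-coding 1.96 is what lets one function serve every confidence level.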