I'm trying to reproduce some results from the American Community Survey PUMS data using the "survey" package. I'm using the one-year 2012 estimates for New Hampshire (http://www2.census.gov/acs2012_1yr/pums/csv_pnh.zip) and comparing to the estimates for user verification from http://www.census.gov/acs/www/Downloads/data_documentation/pums/Estimates/pums_estimates_12.csv
Once the age groups are set up as specified in the verification estimates, the following SAS code produces the correct estimated totals with standard errors: proc surveyfreq data = acs2012 varmethod = jackknife; weight pwgtp; repweights pwgtp1 -- pwgtp80 / jkcoefs = 0.05; table SEX agegroup; run; I've not been successful in reproducing the standard errors with R, although they are very close. My code follows; what revisions do I need to make? Thanks, Mike L. # load estimates for verification pums_est <- read.csv("pums_estimates_12.csv") pums_est[,4] <- as.integer(gsub(",", "", pums_est[,4])) # load PUMS data pums_p <- read.csv("ss12pnh.csv") # convert sex and age group to factors pums_p$SEX <- factor(pums_p$SEX, labels = c("M","F")) pums_p$agegrp <- cut(pums_p$AGEP, c(0,5,10,15,20,25,35,45,55,60,65,75,85,100), right = FALSE) # create replicate-weight survey object library(survey) pums_p.rep <- svrepdesign(repweights = pums_p[207:286], weights = pums_p[,7], combined.weights = TRUE, type = "Fay", rho = 1 - 1/sqrt(4), scale = 1, rscales = 1, data = pums_p) # using type "JK1" with scale = 4/80 and rscales = rep(1,80) # seems to produce the same results # total population by sex with SE's by.sex <- svyby(~SEX, ~ST, pums_p.rep, svytotal, na.rm = TRUE) round(by.sex[1,4:5]) # se1 se2 # 33 1606 1606 # compare results with Census pums_est[966:967, 5] #[1] 1610 1610 # total population by age group with SE's by.agegrp <- svyby(~agegrp, ~ST, pums_p.rep, svytotal, na.rm = TRUE) round((by.agegrp)[15:27]) # se1 se2 se3 se4 se5 se6 se7 se8 se9 se10 se11 se12 se13 # 33 874 2571 2613 1463 1398 1475 1492 1552 2191 2200 880 1700 1678 # compare results with Census pums_est[968:980, 5] # [1] 874 2578 2613 1463 1399 1476 1493 1555 2191 2200 880 1702 1684 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.