On Thu, 28 Aug 2008, Farley, Robert wrote:

I'm feeling like I just don't get it.  My attempt at rake now fails
with:
Error in postStratify.survey.design(design, strata[[i]],
population.margins[[i]],  :
 Stratifying variables don't match

Ah. Now we have an easy one to fix. The error means that the variable names don't match: the formulas use lineon and NumStn, but the population tables use StnName and StnTraveld. You just need to rename the variables in the population tables so that they match the names in the formulas.
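A minimal sketch of that renaming, using base R only. The totals are copied from the listing quoted below; the rake() call is left as a comment because it needs the survey package and the EBDesign object from that listing, neither of which is recreated here.

```r
# Population margins as defined in the quoted listing.
StnName <- as.factor(c("Warner Center", "De Soto", "Pierce College",
                       "Tampa", "Reseda", "Balboa", "Woodley", "Sepulveda",
                       "Van Nuys", "Woodman", "Valley College",
                       "Laurel Canyon", "North Hollywood"))
EBOnNewTots <- c(1000, 600, 1200, 500, 1000, 500, 200, 250, 1000, 300,
                 100, 123.65, 0)
ByEBOn <- data.frame(StnName, Freq = EBOnNewTots)

StnTraveld <- as.factor(1:12)
EBNumStn <- c(673.65, 800, 1000, 1000, 800, 700, 600, 500, 400, 200, 50, 50)
ByEBNum <- data.frame(StnTraveld, Freq = EBNumStn)

# Rename the stratifying columns so they match the formulas ~lineon and ~NumStn.
names(ByEBOn)[names(ByEBOn) == "StnName"] <- "lineon"
names(ByEBNum)[names(ByEBNum) == "StnTraveld"] <- "NumStn"

names(ByEBOn)   # "lineon" "Freq"
names(ByEBNum)  # "NumStn" "Freq"

# With the survey package loaded and EBDesign built as in the listing:
# RakedEBSurvey <- rake(EBDesign, list(~lineon, ~NumStn),
#                       list(ByEBOn, ByEBNum))
```

Equivalently, the columns could be named at construction time, e.g. data.frame(lineon = StnName, Freq = EBOnNewTots).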

        -thomas


The factors in the data frame look fine.  Should I have the same
structure in the design?
str(EBDesign$lineon)
NULL
str(EBSurvey$lineon)
Factor w/ 13 levels "Warner Center",..: 3 1 1 1 2 13 1 5 1 5 ...
str(ByEBOn$StnName)
Factor w/ 13 levels "Balboa","De Soto",..: 11 2 5 8 6 1 12 7 10 13 ...
all(levels(EBSurvey$lineon)==StnName)
[1] TRUE
#
str(EBDesign$NumStn)
NULL
str(EBSurvey$NumStn)
Factor w/ 12 levels "1","2","3","4",..: 10 12 4 12 8 1 8 8 12 4 ...
str(ByEBNum$StnTraveld)
Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
all(levels(EBSurvey$NumStn)==StnTraveld)
[1] TRUE

A complete listing is below:
**************************************************
**************************************************
**************************************************
sessionInfo()        # List loaded packages
R version 2.7.2 (2008-08-25)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] graphics  grDevices utils     datasets  stats     methods   base


other attached packages:
[1] survey_3.8-1   fortunes_1.3-5 moonsun_0.1    prettyR_1.3-2  foreign_0.8-29

SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",
use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)

#===============================================================================
temp <- sub(' +$', '', SurveyData$direction_)
SurveyData$direction_ <- temp

#===============================================================================
# Calc. # stations traversed from StnOn/StnOff

SurveyData$NumStn <- abs(as.numeric(SurveyData$lineon) - as.numeric(SurveyData$lineoff))
#################################################### Kludge
mean(SurveyData$NumStn)
[1] 6.785276
SurveyData$NumStn <- pmax(1,SurveyData$NumStn)
mean(SurveyData$NumStn)
[1] 6.789877
####################################################
SurveyData$NumStn <- as.factor(SurveyData$NumStn)

#===============================================================================
# Adjust one direction at a time.  Start W/ EB {learn subsetting later}
EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
EBDesign <- svydesign(id=~sampn, weights=~expwgt, data=EBSurvey)

#===============================================================================
# New Marginals {start w/ 2 dimensions: StnOn X Distance}
StnName <- as.factor(c( "Warner Center", "De Soto", "Pierce College",
"Tampa", "Reseda", "Balboa", "Woodley", "Sepulveda", "Van Nuys",
"Woodman", "Valley College", "Laurel Canyon", "North Hollywood"))
EBOnNewTots       <- c(            1000,       600,             1200,
500,     1000,      500,       200,         250,       1000,       300,
100,          123.65,                0 )
ByEBOn  <- data.frame(StnName, Freq=EBOnNewTots)
#
StnTraveld <- as.factor(1:12)
EBNumStn   <- c(673.65,     800, 1000, 1000,  800,  700,  600, 500,
400, 200,  50, 50 )
ByEBNum    <- data.frame(StnTraveld, Freq=EBNumStn)
#
RakedEBSurvey <- rake(EBDesign, list(~lineon, ~NumStn), list(ByEBOn,
ByEBNum) )
Error in postStratify.survey.design(design, strata[[i]],
population.margins[[i]],  :
 Stratifying variables don't match
#
str(EBDesign$lineon)
NULL
str(EBSurvey$lineon)
Factor w/ 13 levels "Warner Center",..: 3 1 1 1 2 13 1 5 1 5 ...
str(ByEBOn$StnName)
Factor w/ 13 levels "Balboa","De Soto",..: 11 2 5 8 6 1 12 7 10 13 ...
all(levels(EBSurvey$lineon)==StnName)
[1] TRUE
#
str(EBDesign$NumStn)
NULL
str(EBSurvey$NumStn)
Factor w/ 12 levels "1","2","3","4",..: 10 12 4 12 8 1 8 8 12 4 ...
str(ByEBNum$StnTraveld)
Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
all(levels(EBSurvey$NumStn)==StnTraveld)
[1] TRUE
#
**************************************************
**************************************************
**************************************************

Robert Farley
Metro
www.Metro.net


-----Original Message-----
From: Thomas Lumley [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 28, 2008 11:43
To: Farley, Robert
Cc: r-help@r-project.org
Subject: Re: [R] Survey Design / Rake questions

On Mon, 25 Aug 2008, Farley, Robert wrote:

I see a number of things that bother me.
 1) str(ByEBNum$StnTraveld) says "int [1:12] 1 2 3 4 5 6 7 8 9 10 ..."
        Even though "StnTraveld  <- c(as.factor(1:12))"

You don't want the c()
a<-as.factor(1:12)
str(a)
 Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
str(c(a))
 int [1:12] 1 2 3 4 5 6 7 8 9 10 ...

As the help for c() says  "all attributes except names are removed.",
which includes the factor levels.
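As a small sketch of the same point: build the factor directly and check its class before it goes into the population table. The console output above is from R 2.7.2; as far as I know, later releases (R 4.1.0 onward) give c() a factor method, so str(c(a)) may print a factor there. The check below uses is.factor() and so does not depend on that behaviour.

```r
# Build the factor without wrapping it in c().
StnTraveld <- factor(1:12)
stopifnot(is.factor(StnTraveld))
nlevels(StnTraveld)   # 12

# Explicitly dropping to the integer codes loses the levels in any R version.
codes <- as.integer(StnTraveld)
is.factor(codes)      # FALSE

# The factor survives being placed in the population table.
EBNumStn <- c(673.65, 800, 1000, 1000, 800, 700, 600, 500, 400, 200, 50, 50)
ByEBNum <- data.frame(StnTraveld, Freq = EBNumStn)
is.factor(ByEBNum$StnTraveld)   # TRUE
```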

 2) ByEBOn$StnName[1:5] seems to imply I have extra spaces in the
data.  Where would they have come from?

No, that's just R printing things in columns
a<-factor(1:12, labels=c(1:11,"antidisestablishmentarianism"))
a
 [1] 1                            2
 [3] 3                            4
 [5] 5                            6
 [7] 7                            8
 [9] 9                            10
[11] 11                           antidisestablishmentarianism
Levels: 1 2 3 4 5 6 7 8 9 10 11 antidisestablishmentarianism


 3) I'd like to verify that the order (value) of "EBSurvey$lineon"
matches my definition in "StnName"

all(levels(EBSurvey$lineon)==StnName)
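A slightly stricter variant of that check, as a sketch. The SPSS file is not available here, so survey_lineon is a hypothetical stand-in for EBSurvey$lineon, built with its levels in station order. identical() also fails when the two vectors differ in length, whereas == would recycle the shorter one.

```r
station_order <- c("Warner Center", "De Soto", "Pierce College", "Tampa",
                   "Reseda", "Balboa", "Woodley", "Sepulveda", "Van Nuys",
                   "Woodman", "Valley College", "Laurel Canyon",
                   "North Hollywood")

# Hypothetical stand-in for EBSurvey$lineon (levels in station order).
survey_lineon <- factor(c("Pierce College", "Warner Center", "De Soto"),
                        levels = station_order)

# StnName as in the listing: as.factor() sorts its levels alphabetically,
# but the values themselves stay in station order.
StnName <- as.factor(station_order)

# The check from the reply: survey levels against the values of StnName.
all(levels(survey_lineon) == StnName)                     # TRUE

# Same comparison, but also sensitive to a length mismatch.
identical(levels(survey_lineon), as.character(StnName))   # TRUE

# The two factors hold the same set of labels, in a different level order.
setequal(levels(survey_lineon), levels(StnName))          # TRUE
identical(levels(survey_lineon), levels(StnName))         # FALSE
```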

        -thomas


Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

