Jason,
Thank you for your reply. Interesting ... so you think the 'classes' in the 
error message "The combined number of values in at least one class..."  is 
referring to the CDF bins rather than the sub-population classes that I 
defined. 
 
That makes sense as I only defined two classes (!). I was worried it was 
detecting and treating polygons as classes, somehow (ecoregions in example 
below).
 
I had already reached out to Kincaid and Olsen but had not received an answer 
yet so I moved on to R-help.   I'll go back to them. 
 
Thanks again.
Best, 
Tim

>>> "Law, Jason" <jason....@portlandoregon.gov> 11/1/2013 2:47 PM >>>
I use the spsurvey package a decent amount.  The cont.cdftest function bins the 
cdf in order to perform the test which I think is the root of the problem.  
Unfortunately, the default is 3 which is the minimum number of bins.

I would contact Tom Kincaid or Tony Olsen at NHEERL WED directly to ask about 
this problem.

Another option would be to take a different analytical approach (e.g., a mixed 
effects model) which would allow you a lot more flexibility.

Jason Law
Statistician
City of Portland
Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
503-823-1038
jason....@portlandoregon.gov


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tim Howard
Sent: Friday, November 01, 2013 7:49 AM
To: r-help@r-project.org
Subject: [R] spsurvey analysis

All,
I've used the excellent package, spsurvey, to create spatially balanced samples 
many times in the past. I'm now attempting to use the analysis portion of the 
package, which compares CDFs among sub-populations to test for differences in 
sub-population metrics. 

- My data (count data) have many zeros, following a negative binomial or even 
zero-inflated negative binomial distribution.
- Samples are within polygons of varying sizes
- I want to test whether a sample at time 1 is different from a sample at time 
2. Essentially the same sample areas and number of samples.

The problem:
- cont.cdftest  throws a warning and does not complete for most (but not all) 
species sampled. Warning message: "The combined number of values in at least 
one class is less than five. Action: The user should consider using a smaller 
number of classes."

- There are plenty of samples in my two time periods (the dummy set below: 
Yr1=27, Yr2=31 non-zero values). 

My Question:
Why is it throwing this error and is there a way to get around it?



Reproduceable example (change path to spsurvey sample data), requires us to use 
spsurvey to generate sample points:

### R code tweaked from vignettes 'Area_Design' and 'Area_Analysis'
library(spsurvey)
### Analysis set up
setwd("C:/Program Files/R/R-3.0.2/library/spsurvey/doc")
att <- read.dbf("UT_ecoregions")
shp <- read.shape("UT_ecoregions")

set.seed(4447864)

# Create the design list
Stratdsgn <- list("Central Basin and Range"=list(panel=c(PanelOne=25), 
seltype="Equal"),
                  "Colorado Plateaus"=list(panel=c(PanelOne=25), 
seltype="Equal"),
                  "Mojave Basin and Range"=list(panel=c(PanelOne=10), 
seltype="Equal"),
                  "Northern Basin and Range"=list(panel=c(PanelOne=10), 
seltype="Equal"),
                  "Southern Rockies"=list(panel=c(PanelOne=14), 
seltype="Equal"),
                  "Wasatch and Uinta Mountains"=list(panel=c(PanelOne=10), 
seltype="Equal"),
                  "Wyoming Basin"=list(panel=c(PanelOne=6), seltype="Equal"))

# Select the sample design for each year
Stratsites_Yr1 <- grts(design=Stratdsgn, DesignID="STRATIFIED",
                   type.frame="area", src.frame="sp.object",
                   sp.object=shp, att.frame=att, stratum="Level3_Nam", 
shapefile=FALSE)

Stratsites_Yr2 <- grts(design=Stratdsgn, DesignID="STRATIFIED",
                   type.frame="area", src.frame="sp.object",
                   sp.object=shp, att.frame=att, stratum="Level3_Nam", 
shapefile=FALSE)

#extract the core information, add year as a grouping variable, add a plot ID 
to link with dummy data
Yr1 <- cbind(pltID = 1001:1100, Stratsites_Yr1@data[,c(1,2,3,5)], grp = "Yr1")
Yr2 <- cbind(pltID = 2001:2100, Stratsites_Yr2@data[,c(1,2,3,5)], grp = "Yr2")  
       
sitedat <- rbind(Yr1, Yr2)

# create dummy sampling data. Lots of zeros!
bn.a <- rnbinom(size = 0.06, mu = 19.87, n=100) bn.b <- rnbinom(size = 0.06, mu 
= 20.15, n=100) dat.a <- data.frame(pltID = 1001:1100, grp = "Yr1",count = 
bn.a) dat.b <- data.frame(pltID = 2001:2100, grp = "Yr2",count = bn.b) dat <- 
rbind(dat.a, dat.b)

####
## Analysis begins here
####
data.cont <- data.frame(siteID = dat$pltID, Density=dat$count) sites <- 
data.frame(siteID = dat$pltID, Use=rep(TRUE, nrow(dat))) subpop <- 
data.frame(siteID = dat$pltID, 
                        All_years=(rep("allYears",nrow(dat))),
                        Year = dat$grp)
design <- data.frame(siteID = sitedat$pltID,
                        wgt = sitedat$wgt,
                        xcoord = sitedat$xcoord,
                        ycoord = sitedat$ycoord)
framesize <- c("Yr1"=888081202000, "Yr2"=888081202000)                        
                        
## There seem to be pretty good estimates CDF_Estimates <- cont.analysis(sites, 
subpop, design, data.cont, 
                        popsize = list(All_years=sum(framesize),
                        Year = as.list(framesize)))

print(CDF_Estimates$Pct)

## this test fails
CDF_Tests <- cont.cdftest(sites, subpop[,c(1,3)], design, data.cont,
   popsize=list(Year=as.list(framesize)))
warnprnt()

## how many records have values greater than zero, by year?   Probably 
irrelevant!
notZero <- dat[dat$count > 0,]
table(notZero$grp)

### end

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

Thanks in advance.
Tim

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to