Thank you very much Joshua. I was thinking to use logistic regression with
glm(). But this will pool the individual factories which share the same
factor levels together. I was puzzled by how to deal with individual
factory. Any idea? I will try your method anyway.

Xiang



On Mon, Mar 22, 2010 at 12:55 PM, Joshua Wiley <jwiley.ps...@gmail.com>wrote:

> Dear Xiang,
>
> Now I understand what you meant.  If you are only interested in
> comparing the Good Samples, I think you would have to use the
> proportion (Good Sample/Total Sample) or something similar.  Another
> thought would be to dummy code the data (e.g., Good = +1, Fair = 0,
> Bad = -1).  Then you could compare the means.  Obviously in my example
> the mean would be relatively less impacted by Fair than Bad samples.
> Another benefit of this approach (over comparing mean number of good
> samples from each city) is that you estimate variability within
> factories (based of the dummy codes) which controls for many variables
> relative to within city.
>
> Once you dummy code, the test itself is not difficult.  Below is a
> sample function.  You can either include the mean, standard deviation,
> and n of each group OR include the raw data with each factory being a
> separate object in a list that you use as "x".  The lambdas are the
> weights to apply to the means.  If you want to compare City A to City
> B, you could use -1 for every factory in A and +1 for every factory in
> B (same idea as a two sample t-test, you just estimate the variability
> more finely because it is by factory and then pooled).  A cautionary
> note, I just wrote this function myself.  I tested that it gives the
> same result as the t.test() function on two samples (mine uses
> one-tailed p-values) for simple numeric vectors; however, I have no
> idea what it would do with other types of data and it may even appear
> to have worked but returned wrong results.
>
>
> ######################################
> t.contrast.test <- function(x=NA, lambda, m=NULL, s=NULL, n=NULL,
> raw=TRUE, na.rm=TRUE) {
> ifelse(identical(raw, TRUE), {
> for(i in seq_along(x)) {m[i] <- mean(x[[i]], na.rm=na.rm)};
> for(i in seq_along(x)) {s[i] <- sd(x[[i]], na.rm=na.rm)};
> for(i in seq_along(x)) {n[i] <- length(x[[i]])};
> NA}, {
> NA})
> df <- sum(n-1)
> effect <- sum(m*lambda)
> s2.pooled <- weighted.mean(x=s^2, w=n-1)
> sample.correction <- sum((lambda^2)/n)
> variability <- sqrt(sample.correction*s2.pooled)
> t.score <- effect/variability
> p.score <- pt(q=t.score, df=df, lower.tail=F)
> r.score <- t.score/sqrt((t.score^2)+df)
> value <- list(t.score, p.score, r.score, s2.pooled, df)
> names(value) <- c("t.contrast", "p.value", "r.contrast",
> "pooled.variance", "df")
> return(value)}
> ######################################
>
>
> I hope that all made sense.
>
> Best Regards,
>
> Joshua
>
> --
> Joshua Wiley
> Senior in Psychology
> University of California, Riverside
> http://www.joshuawiley.com/
>



-- 
Xiang Gao, Ph.D.
Department of Biology
University of North Texas

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to