Hi, Krishna:

<in line>

On 4/3/2011 7:35 AM, Krishna Kirti Das wrote:
Thank you, John.

Yes, your answers do help. For me it's mainly about getting familiar with
the "R" way of doing things.

Your response also confirms what I suspected: there is no explicit,
widely used interface (functions or packages) that represents an
unbalanced design in the same way that aov represents a balanced one.
Analyzing both balanced and unbalanced data is obviously possible, but
what has to be done is intuitive within the language for balanced designs
via aov and unintuitive for unbalanced designs.

Intuition is subject to one's background and expectations. If you think in terms of a series of nested hypotheses, then the standard R anova is very intuitive. I never use aov, because it's not intuitive to me and not very general. 'aov' is only useful for a balanced design with normal independent errors with constant variance. The real world is rarely so simple. The 'aov' algorithm was wonderful over half a century ago, when all computations were done by hand or using a mechanical calculator (e.g., an abacus or a calculator with gears). Unbalanced designs were largely impractical because of computational difficulties. There were many procedures for imputing missing values for a design that was "almost balanced".


I encourage you to think in terms of alternative sequences of nested hypotheses, including the implications of A being significant by itself but not once B is already in the model, and how that reading changes depending on whether the A:B interaction is significant.
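
As a rough illustration of "alternative sequences of nested hypotheses", here is a minimal sketch in base R. The data frame d, its cell counts, and the effect sizes are invented for illustration; nothing here comes from Krishna's data. It fits the same two-way model with the factors entered in both orders, so the sequential tables can be compared side by side.

```r
# Hypothetical unbalanced two-way layout (simulated, for illustration only).
# Cell counts: a1/b1 = 9, a1/b2 = 3, a2/b1 = 2, a2/b2 = 4.
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(12, 6))),
  B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 4))))
)
d$y <- rnorm(nrow(d)) + 1.0 * (d$A == "a2") + 0.8 * (d$B == "b2")

# Sequential tables: each row tests a term given the rows above it.
tab_AB <- anova(lm(y ~ A * B, data = d))  # A; then B given A; then A:B
tab_BA <- anova(lm(y ~ B * A, data = d))  # B; then A given B; then B:A
print(tab_AB)
print(tab_BA)

# The first sequence written out explicitly as a chain of nested models.
m0    <- lm(y ~ 1,     data = d)
mA    <- lm(y ~ A,     data = d)
mAB   <- lm(y ~ A + B, data = d)
mFull <- lm(y ~ A * B, data = d)
print(anova(m0, mA, mAB, mFull))
```

With unequal cell counts like these, the sum of squares for A differs between the two tables (A alone versus A after B), while the interaction and residual lines agree.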

I did notice that this question gets asked several times and in slightly
different ways, and I think the lack of an interface that represents an
unbalanced design in the same way aov represents balanced designs is why the
question will probably keep getting asked again.

I had mentioned nlme and lme4 because I saw in some of the discussions that
using those were recommended for working with unbalanced designs. And
specifying random effects with zero variance, for example, would probably
serve my purposes.

      I'd be surprised if nlme or lme4 changes what I wrote above.


      Hope this helps.
      Spencer

Thank you for your help.

Sincerely,

Krishna

On Sun, Apr 3, 2011 at 7:28 AM, John Fox<j...@mcmaster.ca>  wrote:

Dear Krishna,

Although it's difficult to explain briefly, I'd argue that balanced and
unbalanced ANOVA are not fundamentally different, in that the focus should
be on the hypotheses that are tested, and these are naturally expressed as
functions of cell means and marginal means. For example, in a two-way
ANOVA, the null hypothesis of no interaction is equivalent to parallel
profiles of cell means for one factor across levels of the other. What is
different, though, is that in a balanced ANOVA all common approaches to
constructing an ANOVA table coincide.
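
A small base-R sketch of the two points above, using simulated data (the layout, the 4 observations per cell, and the effect size are assumptions made only for this example). It computes the cell-mean profiles, shows that the interaction sum of squares measures departure from parallel profiles, and checks that the order of terms does not matter when the design is balanced.

```r
# Hypothetical balanced 2 x 3 layout, 4 observations per cell (simulated).
set.seed(2)
b <- expand.grid(rep = 1:4, A = c("a1", "a2"), B = c("b1", "b2", "b3"))
b$y <- rnorm(nrow(b)) + ifelse(b$A == "a2", 1, 0)

# Cell means as a 2 x 3 matrix: rows are levels of A, columns levels of B.
cell_means <- with(b, tapply(y, list(A, B), mean))
print(cell_means)

# Profiles are parallel when the a2 - a1 difference is constant across B.
profile_diff <- cell_means["a2", ] - cell_means["a1", ]

# For a balanced design with two levels of A and n per cell, the
# interaction sum of squares is (n / 2) * sum((d_j - mean(d))^2).
n_per_cell <- 4
ss_int <- n_per_cell / 2 * sum((profile_diff - mean(profile_diff))^2)

# In the balanced case, sequential tables do not depend on term order.
tab1 <- anova(lm(y ~ A * B, data = b))
tab2 <- anova(lm(y ~ B * A, data = b))
print(all.equal(tab1["A", "Sum Sq"], tab2["A", "Sum Sq"]))
print(all.equal(ss_int, tab1["A:B", "Sum Sq"]))
```

The hand-computed ss_int is zero exactly when the profile differences are all equal, which is the "parallel profiles" form of the no-interaction hypothesis.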

Without getting into the explanation in detail (which you can find in a
text like my Applied Regression Analysis and Generalized Linear Models),
so-called type-I (or sequential) tests, such as those performed by the
standard anova() function in R, test hypotheses that are rarely of
substantive interest, and, even when they are, are of interest only by
accident. So-called type-II tests, such as those performed by default by
the Anova() function in the car package, test hypotheses that are almost
always of interest. Type-III tests, which the Anova() function in car can
perform optionally, require careful formulation of the model for the
hypotheses tested to be sensible, and even then have less power than
corresponding type-II tests in the circumstances in which a test would be
of interest.
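
For comparison, a hedged sketch of the three kinds of tables on one simulated unbalanced data set. It assumes the car package is installed; the data frame d and the fitted objects are hypothetical names. Sum-to-zero contrasts are used for the type-III fit as one common way of formulating the model so that those tests are interpretable.

```r
# Runs only if car is available; the data are simulated for illustration.
if (requireNamespace("car", quietly = TRUE)) {
  set.seed(3)
  d <- data.frame(
    A = factor(rep(c("a1", "a2"), times = c(12, 6))),
    B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
                 rep(c("b1", "b2"), times = c(2, 4))))
  )
  d$y <- rnorm(nrow(d)) + (d$A == "a2")

  fit <- lm(y ~ A * B, data = d)

  # Type I (sequential): depends on the order of terms in the formula.
  print(anova(fit))

  # Type II (the default in car::Anova): each main effect is tested after
  # the other main effect, ignoring the interaction.
  type2 <- car::Anova(fit)
  print(type2)

  # Type III: refit with sum-to-zero contrasts before requesting the tests.
  fit_sum <- lm(y ~ A * B, data = d,
                contrasts = list(A = "contr.sum", B = "contr.sum"))
  type3 <- car::Anova(fit_sum, type = "III")
  print(type3)
}
```

The type-II sum of squares for A should match the "A after B" line of the sequential table from lm(y ~ B + A), which is one way to see which hypothesis it tests.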

Since you're addressing fixed-effects models, I'm not sure why you
introduced nlme and lme4 into the discussion, but I note that Anova() in
the car package has methods that can produce type-II and -III Wald tests
for the fixed effects in mixed models fit by lme() and lmer().
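
A minimal sketch of that last point, assuming both car and nlme are installed. The Orthodont data set ships with nlme and is used here only as a convenient example; it is unrelated to the original question.

```r
# Runs only if both packages are available.
if (requireNamespace("car", quietly = TRUE) &&
    requireNamespace("nlme", quietly = TRUE)) {
  # Random-intercept model for the Orthodont growth data from nlme.
  fit_lme <- nlme::lme(distance ~ age * Sex, random = ~ 1 | Subject,
                       data = nlme::Orthodont)

  # Type-II Wald tests for the fixed effects.
  wald_tab <- car::Anova(fit_lme, type = "II")
  print(wald_tab)
}
```

Type-III tests can be requested with type = "III", subject to the same caution about formulating the model carefully.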

Your question has been asked several times before on the r-help list. For
example, if you enter terms like "type-II" or "unbalanced ANOVA" in the
RSeek search engine and look under the "Support Lists" tab, you'll see many
hits -- e.g.,
<https://stat.ethz.ch/pipermail/r-help/2006-August/111927.html>.

I hope this helps,
  John

--------------------------------
John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox



-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Krishna Kirti Das
Sent: April-03-11 3:25 AM
To: r-help@r-project.org
Subject: [R] Unbalanced Anova: What is the best approach?

I have a three-way unbalanced ANOVA that I need to calculate (fixed
effects plus interactions, no random effects). But word has it that aov()
is good only for balanced designs. I have seen a number of different
recommendations for working with unbalanced designs, but they seem to
differ widely (car, nlme, lme4, etc.). So I would like to know what is
the best or most usual way to go about working with unbalanced designs and
extracting a reliable ANOVA table from them in R?


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567

