On Dec 8, 2011, at 3:28 PM, Michael wrote:
Hi all,
If we wanted to study the effect on the mean of the hourly data based on
the hours within a day...
and we wanted to do Anova analysis...
We have two choices:
Who is "we" and how were these constraints imposed?
Please see below:
Why are these two approaches giving very different p-values?
They are markedly different statistical models.
And which one shall I use?
Without knowing your situation better and the eventual purposes of
this analysis, it would be difficult to give sensible advice. I
suspect the answer is "neither".
--
David.
Thanks a lot!
1. treating the hours as double/floating numbers:
anova(lm(hourlydata~as.double(hours_factors)))
                            Df Sum Sq    Mean Sq F value Pr(>F)
as.double(hours_factors)     1 0.0002 0.00019876  1.3425 0.2466
Residuals                14868 2.2013 0.00014806
2. treating the hours as factors:
anova(lm(hourlydata~hours_factors))
                 Df  Sum Sq    Mean Sq F value Pr(>F)
hours_factors     9 0.00077 8.5979e-05  0.5806 0.8142
Residuals     14860 2.20072 1.4810e-04
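
For readers comparing the two tables above: the first fit spends 1 degree of freedom on a single straight-line slope across hours, while the second spends 9 on a separate mean for each of the 10 hour levels, so the two F tests address different hypotheses and need not agree. The sketch below reproduces that structure on simulated data. The hour labels 9 to 18, the sample size, and the noise level are assumptions made up for illustration; they are not the poster's data. It also shows a detail worth checking in the first call: as.double() applied to a factor returns the internal level codes (1, 2, ...), not the hour labels.

```r
# Simulated stand-in for the poster's data (all values hypothetical).
set.seed(1)
hours <- rep(9:18, each = 50)          # 10 distinct hours, matching Df = 9 above
hours_factors <- factor(hours)
hourlydata <- rnorm(length(hours), mean = 0, sd = 0.012)

# as.double() on a factor yields level codes, not the labels.
codes <- as.double(hours_factors)                        # 1, 2, ..., 10
hour_values <- as.numeric(as.character(hours_factors))   # 9, 10, ..., 18

# Model 1: one slope for a linear trend in hour (1 df).
fit_linear <- lm(hourlydata ~ hour_values)

# Model 2: a separate mean for every hour (9 df).
fit_factor <- lm(hourlydata ~ hours_factors)

anova(fit_linear)
anova(fit_factor)

# The linear model is nested in the factor model, so the two can be
# compared directly; this tests for departure from a straight line (8 df).
anova(fit_linear, fit_factor)
```

Because the level codes here are equally spaced just like the hour labels, the F test in the first table would be the same with either coding; with unequally spaced or non-numeric levels it would not be.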
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.