On Dec 8, 2011, at 3:28 PM, Michael wrote:

Hi all,

If we wanted to study the effect on the mean of the hourly data based on
the hours within a day...

and we wanted to do Anova analysis...

We have two choices:

Who is "we" and how were these constraints imposed?


Please see below:

Why are these two approaches giving very different p-values?

They are markedly different statistical models.

And which one shall I use?


Without knowing your situation better and the eventual purposes of this analysis, it would be difficult to give sensible advice. I suspect the answer is "neither".

--
David.

Thanks a lot!

1. treating the hours as double/floating numbers:


anova(lm(hourlydata~as.double(hours_factors)))

                            Df Sum Sq    Mean Sq F value Pr(>F)
as.double(hours_factors)     1 0.0002 0.00019876  1.3425 0.2466
Residuals                14868 2.2013 0.00014806

2. treating the hours as factors:



anova(lm(hourlydata~hours_factors))

                 Df  Sum Sq    Mean Sq F value Pr(>F)
hours_factors     9 0.00077 8.5979e-05  0.5806 0.8142
Residuals     14860 2.20072 1.4810e-04
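
For anyone comparing the two outputs, here is a minimal sketch of why the two calls fit different models. It runs on simulated data, not the poster's, and the names hours, y, codes, values, fit_linear and fit_factor are made up for illustration. The numeric version estimates a single slope (1 df), i.e. a straight-line trend across hours. The factor version estimates a separate mean for each of the 10 levels (9 df). Note also that as.double() applied to a factor returns the internal level codes, not the hour values stored in the labels.

```r
# Simulated stand-in data (hypothetical; NOT the original hourlydata).
set.seed(1)
hours <- factor(rep(c(8, 9, 10, 11, 12, 13, 14, 15, 16, 17), each = 50))
y <- rnorm(length(hours), mean = 0, sd = 0.01)

# as.double() on a factor gives the level codes 1, 2, ..., 10.
codes <- as.double(hours)
# Going through as.character() recovers the hour values 8, 9, ..., 17.
values <- as.numeric(as.character(hours))

fit_linear <- lm(y ~ values)  # one slope: linear trend in hour, 1 df
fit_factor <- lm(y ~ hours)   # one mean per hour: 9 df for 10 levels

anova(fit_linear)
anova(fit_factor)

# The linear model is nested in the factor model, so this F test asks
# whether the hourly means depart from a straight line.
anova(fit_linear, fit_factor)
```

Which of the two (if either) is appropriate depends on the question being asked: a steady drift over the day, or arbitrary differences between particular hours.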

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.