On 10/16/2008 11:43 AM, Greg Snow wrote:
I wonder if including the p-values for the normality test is the best approach in your
animation? The CLT does not say that the distribution of the means will be normal, just
that it approaches normality (and therefore the normal may be a decent approximation). The
normality test can only reject the null that the data (simulated means) come from a
normal distribution. Since the true distribution of the means is not normal (unless you
use a sample size of Inf, and I for one have better things to do than wait for a computer to
simulate several samples of size Inf), the null for the normality test is always false and
therefore the test will always result in either saying it is not normal or a type II
error. The real goal is not to show normality, but to show that using the normal gives a
"good enough" approximation. I would prefer the bottom plot to show either the
proportion of p-values from a normal-based test on the simulated data that are less than
alpha, or the proportion of confidence intervals based on the normal-based test that
include the true parameter. Then the user can see when those values become a close
enough approximation.
But the p-value is not the test. The test comes later, when you
interpret the p-value. So there's no such thing as a Type II error in
a p-value. The demo does show that for n < 20 (or whatever), the test
is very likely to reject the null. After that, it becomes less and less
likely.
My suggestion (and this is a matter of taste) would be to do the tests
independently, rather than using the same dataset plus new observations
each time. It is hard to understand the behaviour of p-values even
without complicating things by giving a correlated sequence of them.
And this is even more a matter of taste: I'd plot the p-values as
points, not as vertical bars. Showing that a p-value of 0.8 is twice as
big as a p-value of 0.4 isn't useful for interpreting them.
Duncan Murdoch
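A minimal sketch of the two suggestions above (independent tests, p-values drawn as points), not taken from the 'animation' package. The exponential population, the number of means per test (m) and the range of sample sizes are all assumptions chosen for illustration.

```r
set.seed(2)                   # arbitrary seed, only for reproducibility
m <- 300                      # simulated means per test (hypothetical choice)
sizes <- 1:50                 # sample sizes n to try (hypothetical choice)

# For each n, simulate a fresh, independent set of m sample means from an
# exponential population and record the Shapiro-Wilk p-value for that set.
p_values <- sapply(sizes, function(n) {
  means <- colMeans(matrix(rexp(m * n), nrow = n))  # one mean per column
  shapiro.test(means)$p.value
})

# Points rather than vertical bars, with a reference line at alpha = 0.05
plot(sizes, p_values, pch = 19, ylim = c(0, 1),
     xlab = "sample size n", ylab = "Shapiro-Wilk p-value")
abline(h = 0.05, lty = 2)
```

Because every p-value comes from its own simulated dataset, neighbouring points are not correlated the way they are when observations are added to one growing dataset.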
What is your target audience for this demo? In my opinion, anyone who could
understand the bottom plot should already understand the CLT well enough not to need
the demo; those that I would aim the demo at would just be confused by the
current bottom plot.
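The alternative bottom plot proposed in this message (the proportion of normal-based confidence intervals that include the true parameter) might be sketched roughly as follows. This is only an illustration under assumptions: an exponential population with true mean 1, a 95% level, and hypothetical choices for the sample sizes and the number of replications.

```r
set.seed(3)                   # arbitrary seed, only for reproducibility
true_mean <- 1                # mean of the assumed exponential(rate = 1) population
reps <- 2000                  # simulated samples per sample size (hypothetical)
sizes <- c(2, 5, 10, 20, 50, 100)

# For each n, the fraction of normal-based 95% intervals covering the true mean
coverage <- sapply(sizes, function(n) {
  covered <- replicate(reps, {
    x <- rexp(n, rate = 1)
    half_width <- qnorm(0.975) * sd(x) / sqrt(n)
    abs(mean(x) - true_mean) <= half_width
  })
  mean(covered)
})

plot(sizes, coverage, type = "b", log = "x",
     ylim = range(c(coverage, 0.95)),
     xlab = "sample size n (log scale)", ylab = "proportion of CIs covering the mean")
abline(h = 0.95, lty = 2)     # nominal coverage
```

The reader can then judge at which n the observed coverage gets close enough to the nominal 0.95 for the normal approximation to be acceptable.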
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Yihui Xie
Sent: Wednesday, October 15, 2008 10:51 PM
To: roger koenker
Cc: r-help
Subject: Re: [R] plot - central limit theorem
Thanks, Roger, your demo is interesting. I'm thinking about improving
it later.
I've also made a demo for the CLT in my package 'animation', in which
there's also normality testing for the sample means, because I don't
think "bell-shaped" alone means normality - so I performed the
Shapiro-Wilk test and plotted the P-values under the demo. See the
function clt.ani() in the package 'animation', or
http://animation.yihui.name/prob:central_limit_theorem
You can use any function to denote the population (specify the
argument 'FUN') in clt.ani().
Regards,
Yihui
--
Yihui Xie <[EMAIL PROTECTED]>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
On Thu, Oct 16, 2008 at 4:22 AM, roger koenker <[EMAIL PROTECTED]> wrote:
> Galton's 19th century mechanical version of this is the quincunx. I have a
> (very primitive) version of this for R at:
>
> http://www.econ.uiuc.edu/~roger/courses/476/routines/quincunx.R
>
>
> url: www.econ.uiuc.edu/~roger Roger Koenker
> email [EMAIL PROTECTED] Department of Economics
> vox: 217-333-4558 University of Illinois
> fax: 217-244-6678 Champaign, IL 61820
>
>
>
>> Jörg Groß wrote:
>>>
>>> Hi,
>>>
>>>
>>> Is there a way to simulate a population with R and pull out m samples,
>>> each with n values, for calculating m means?
>>>
>>> I need that kind of data to plot a graphic demonstrating the central
>>> limit theorem, and I don't know how to begin.
>>>
>>> So, perhaps someone can give me some tips and hints on how to start and
>>> which functions to use.
>>>
>>>
>>>
>>> thanks for any help,
>>> joerg
>>>
>
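For the original question (simulate a population, pull out m samples of n values each, and compute the m means), one possible starting point uses only base R. It is a sketch, not the code behind any of the demos mentioned in this thread; the exponential population and the values of m and n are assumptions for illustration.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility

# Hypothetical population: right-skewed exponential values with mean 1
population <- rexp(100000, rate = 1)

m <- 1000                         # number of samples
n <- 30                           # size of each sample

# Draw m samples of n values each and keep the mean of every sample
sample_means <- replicate(m, mean(sample(population, n, replace = TRUE)))

# Skewed population on top, roughly bell-shaped sample means below
op <- par(mfrow = c(2, 1))
hist(population, breaks = 50, main = "Population", xlab = "value")
hist(sample_means, breaks = 30, freq = FALSE,
     main = paste("Means of", m, "samples of size", n), xlab = "sample mean")
# Normal curve suggested by the CLT: mean(population), sd(population) / sqrt(n)
curve(dnorm(x, mean(population), sd(population) / sqrt(n)), add = TRUE, lty = 2)
par(op)
```

Changing n and re-running shows how the histogram of means tightens around the population mean and moves closer to the dashed normal curve.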
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.