Re: [R] P values

Joris Meys Sun, 09 May 2010 15:28:43 -0700

On Sun, May 9, 2010 at 6:53 PM, Bak Kuss <bakk...@gmail.com> wrote:

> Thank you for your replies.
>
> As I said (wrote) before, 'I am no statistician'.
> But I think I  know what Random Variables are (not).
>
> Random variables are not random, neither are they variable.
> [It sounds better in french: Une variable aléatoire n'est pas variable,
> et n'a rien d'aléatoire.]
>
Semantics. What's in a name? If I want to represent something that I expect
to behave with random noise, and that can have different values, I represent
it by some letter X for example. Just for simplicity, I'd call such an X a
random variable. I can post another 10 definitions from another 10
introductions to the same statistics. So what?



> See this definition from: 'Introduction to the mathematical and statistical
> foundations of econometrics',
> Herman J. Bierens, Cambridge University Press, 2004, page 21.
>
> http://docs.google.com/View?id=dct7h449_8748tjc6g9
>
>
> Simply put: A random variable is just a mapping (transformation) from a set
> to the real line.
> If the mapping is limited to be from a set to the  [0,1] segment of the
> real line,
> one calls it a probability. But it is still not variable nor random.
> Just a 'simple transformation'.
>
Who said a p-value is a random variable? If I write it down in a theoretical
calculation, it is called "a variable". If I calculate it from a dataset,
it's an observation.

>
>
> As far as Central Limit Theorems are concerned,
> are they not...  well, far from reality.
>

Not that far, apparently. I can't find too many datasets with more than 50
observations whose mean does not behave approximately normally.
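
A quick way to check this is by simulation. The sketch below is only an
illustration under assumptions of my own: it draws from an exponential
distribution (strongly right-skewed), uses nothing but Python's standard
library, and the helper names `sample_means` and `skewness` are made up for
this example. It compares the skewness of single observations with the
skewness of means of 50 observations.

```python
import random
import statistics


def sample_means(n_obs, n_reps, seed=1):
    """Return n_reps means, each computed from n_obs exponential draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n_obs)])
        for _ in range(n_reps)
    ]


def skewness(xs):
    """Sample skewness: the mean of the cubed standardized values."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


# Single observations keep the skewness of the exponential distribution.
skew_raw = skewness(sample_means(1, 10000))

# Means of 50 observations are far closer to symmetric.
skew_means50 = skewness(sample_means(50, 10000))

print("skewness of single observations:", round(skew_raw, 2))
print("skewness of means of 50:        ", round(skew_means50, 2))
```

For independent draws the skewness of the mean shrinks by a factor
1/sqrt(n), so for an exponential distribution (skewness 2) the means of 50
observations should come out near 2/sqrt(50), roughly 0.28. That is not
exactly normal, but close enough for most uses.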

> They belong to asymptotics. By definition 'asymptotics' do not belong
> to reality.  'As if...' kind of arguments they are.
>

The fact that they don't belong to reality doesn't mean they are far from
reality. By definition, science is not reality. It is a way of modelling
reality closely enough that you can make use of it. If you really want to
dot the i's, Newton was wrong as well. Who cares, as long as you can build
bridges with it?

> Are they not excuses for our 'misbehavior'? An alibi?
> Just like 'p-values'? They just _indicate_  that  _probably_
> we were wrong in having thought such and such...
>

They give a measure for that indication, which is rather more specific than
_probably_. Which is exactly the point of statistics: putting a measure on
uncertainty by approximation. You're never exact, but then again, neither was
Newton.

> Without ever getting close to whatever the 'real reality' was, is, could
> be... probably!
>
If I wasn't there, there's no way of ever knowing what the 'real reality'
was. If I knew what the 'real reality' was, I'd be a historian.


> bak
>
>
>
>
>
> That's a common misconception. A p-value expresses no more than the chance
> of obtaining the dataset you observe, given that your null hypothesis _and
> your assumptions_ are true. Essentially, a p-value is as "real" as your
> assumptions. In that way I can understand what Robert wants to say. But
> with large enough datasets, bootstrapping or permutation tests often give
> about the same p-value as the asymptotic approximation. *At that moment,
> the central limit theorem comes into play*
>
>
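
The agreement between a permutation test and the asymptotic approximation
can be tried out on simulated data. What follows is a minimal sketch, not a
recipe: the data are made up (two skewed groups of 100 observations, one
shifted by 0.3), the function names are hypothetical, and the asymptotic
p-value is a plain normal approximation to the difference in means.

```python
import math
import random
from statistics import NormalDist, fmean, variance

rng = random.Random(2010)
n = 100

# Two right-skewed samples; the second is shifted by an assumed effect of 0.3.
group_a = [rng.expovariate(1.0) for _ in range(n)]
group_b = [rng.expovariate(1.0) + 0.3 for _ in range(n)]


def mean_diff(a, b):
    return fmean(b) - fmean(a)


def asymptotic_p_value(a, b):
    """Two-sided p-value from the normal approximation to the mean difference."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = mean_diff(a, b) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def permutation_p_value(a, b, n_perm=2000, rng=rng):
    """Two-sided Monte Carlo permutation p-value for the mean difference."""
    observed = abs(mean_diff(a, b))
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the observations
        if abs(mean_diff(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    # Count the observed labelling as one of the permutations.
    return (hits + 1) / (n_perm + 1)


p_asym = asymptotic_p_value(group_a, group_b)
p_perm = permutation_p_value(group_a, group_b)
print("asymptotic p-value: ", round(p_asym, 4))
print("permutation p-value:", round(p_perm, 4))
```

With groups of this size the two p-values should land close to each other,
even though the data are far from normal. With much smaller groups the gap
can widen, which is where the assumptions start to matter.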
> On Sat, May 8, 2010 at 9:38 PM, Duncan Murdoch
> <murdoch.dun...@gmail.com> wrote:
>
>> On 08/05/2010 9:14 PM, Joris Meys wrote:
>>
>>> On Sat, May 8, 2010 at 7:02 PM, Bak Kuss <bakk...@gmail.com> wrote:
>>>
>>>
>>>
>>>> Just wondering.
>>>>
>>>> The smaller the p-value, the closer to 'reality' (the more accurate)
>>>> the model is supposed to (not) be (?).
>>>>
>>>> How realistic is it to be that (un-) real?
>>>>
>>>>
>>>>
>>>
>>> That's a common misconception. A p-value expresses no more than the
>>> chance of obtaining the dataset you observe, given that your null
>>> hypothesis _and your assumptions_ are true.
>>>
>>
>>
>> I'd say it expresses even less than that.  A p-value is simply a
>> transformation of the test statistic to a standard scale.  In the nicer
>> situations, if the null hypothesis is true, it'll have a uniform
>> distribution on [0,1].  If H0 is false but the truth lies in the direction
>> of the alternative hypothesis, the p-value should have a distribution that
>> usually gives smaller values.  So an unusually small value is a sign that H0
>> is false:  you don't see values like 1e-6 from a U(0,1) distribution very
>> often, but that could be a common outcome under the alternative hypothesis.
>>   (The not so nice situations make things a bit more complicated, because
>> the p-value might have a discrete distribution, or a distribution that tends
>> towards large values, or the U(0,1) null distribution might be a limiting
>> approximation.)
>> So to answer Bak, the answer is that yes, a well-designed statistic will
>> give p-values that tend to be smaller the further the true model gets from
>> the hypothesized one, i.e. smaller p-values are probably associated with
>> larger departures from the null.  But the p-value is not a good way to
>> estimate that distance.  Use a parameter estimate instead.
>>
>> Duncan Murdoch
>>
>>
>>
>>> Essentially, a p-value is as "real" as your assumptions. In that way I
>>> can understand what Robert wants to say. But with large enough datasets,
>>> bootstrapping or permutation tests often give about the same p-value as
>>> the asymptotic approximation. At that moment, the central limit theorem
>>> comes into play, which says that when the sample size is big enough, the
>>> mean is -close to- normally distributed. In those cases, the test
>>> statistic also follows the proposed distribution and your p-value is
>>> closer to "reality". Mind you, the "sample size" for a specific statistic
>>> is not always merely the number of observations, especially in more
>>> advanced methods. Plus, violations of other assumptions, like
>>> independence of the observations, change the picture again.
>>>
>>> The point is: what is reality? As Duncan said, a small p-value is a sign
>>> that your null hypothesis is not true. That's exactly what you look for,
>>> because that is the evidence that the relation you're looking at in your
>>> dataset did not emerge merely by chance. You're not out to calculate the
>>> exact chance. Robert is right, reporting an exact p-value of 1.23e-7
>>> doesn't make sense at all. But the rejection of your null hypothesis is
>>> as real as life.
>>>
>>> The trick is to test the correct null hypothesis, and that's where it most
>>> often goes wrong...
>>>
>>> Cheers
>>> Joris
>>>
>>>
>>>
>>>> bak
>>>>
>>>> p.s. I am no statistician
>>>>
>>>>       [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
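
Duncan's description quoted above, a p-value that is uniform on [0,1] under
the null and tends towards small values under the alternative, is easy to
see in a simulation. The sketch below assumes one of the "nicer situations":
a two-sided z-test on normal data with known standard deviation 1. The
function names, the sample size of 30 and the alternative mean of 0.5 are
choices made for this example only.

```python
import math
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == mu0, with sigma known."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_p_values(true_mean, n_obs=30, n_reps=5000, seed=42):
    """Repeat the experiment n_reps times and collect the p-values."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n_obs)])
        for _ in range(n_reps)
    ]


def fraction_below(p_values, cutoff):
    return sum(p < cutoff for p in p_values) / len(p_values)


p_null = simulate_p_values(true_mean=0.0)  # H0 is true
p_alt = simulate_p_values(true_mean=0.5)   # H0 is false

for cutoff in (0.05, 0.25, 0.5):
    print(
        f"P(p < {cutoff}): under H0 {fraction_below(p_null, cutoff):.3f}, "
        f"under the alternative {fraction_below(p_alt, cutoff):.3f}"
    )
```

Under H0 the fraction of p-values below each cutoff should be close to the
cutoff itself, which is what a uniform distribution means. Under the
alternative most of the p-values pile up near zero. Neither column says how
far the true mean is from zero; for that you still need the estimate.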


-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

