Ah, the plot thickens! The p-value imbroglio again.

I won't comment except to note that all of this so far assumes a
simple null. What if you have a composite null? E.g., my null is that
the data are drawn from a normal with unknown mean and variance, versus
the alternative that they are drawn from a mixture of two normals with
two different unknown means but the same unknown variance. Constructing
appropriate tests gets dicier and dicier in these situations.

Cheers,
Bert

On Fri, Sep 3, 2010 at 7:19 AM, Greg Snow <greg.s...@imail.org> wrote:
> Ted,
>
> I agree that we are measuring discrepancies and that large discrepancies 
> correspond to p-values near 0 and small discrepancies correspond to large 
> p-values.  But interpreting discrepancies on a p-value scale leads more to 
> confusion than understanding.  If you are interested in the discrepancy, then 
> focus on the meaningful discrepancy scale (confidence intervals are great in 
> many of these cases).  I also agree that the correspondence between small 
> p-values and large discrepancies is meaningful: a large discrepancy is 
> indicative of a real difference rather than just luck.
>
> My point was more focused on the over interpretation of differences in large 
> p-values (remember this thread started with the original poster 
> misinterpreting a p-value of 1).  Try this exercise: consider a sample of 
> size 100 from a normal population with a known standard deviation of 1.  The 
> null hypothesis is that the true mean is 50; what sample mean(s) will result 
> in a p-value of 0.4? In a p-value of 0.9?  Is the difference between the two 
> discrepancies worth getting excited about?  Compare the conclusions you 
> would draw by comparing the two confidence intervals to what might be 
> concluded by comparing the two p-values.
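
A quick R sketch of this exercise (assuming a two-sided z-test, since sigma is known; the numbers below follow from that assumption):

```r
n  <- 100; s <- 1; mu0 <- 50
se <- s / sqrt(n)                     # standard error = 0.1
z  <- qnorm(1 - c(0.4, 0.9) / 2)      # |z| giving two-sided p = 0.4 and 0.9
xbar <- mu0 + z * se                  # sample means producing those p-values
print(round(xbar, 4))                 # 50.0842 and 50.0126
# the two 95% confidence intervals are nearly indistinguishable:
print(cbind(lower = xbar - 1.96 * se, upper = xbar + 1.96 * se))
```

Both intervals contain 50 and their centers differ by less than a tenth of a standard error, so the two "discrepancies" are practically the same even though the p-values look quite different.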
>
> The difference between a p-value of 0.01 and 0.1 is very meaningful (if using 
> an alpha=0.05 or close), the difference between a p-value of 0.4 and 0.9 is 
> much less meaningful even though the difference is bigger.
>
> Also for alpha=0.05, I don't think it is worth getting any more excited over 
> a p-value of 0.000000001 than one of 0.0001, but people do.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
>> -----Original Message-----
>> From: Ted Harding [mailto:ted.hard...@manchester.ac.uk]
>> Sent: Thursday, September 02, 2010 3:59 PM
>> To: Greg Snow
>> Cc: r-help@r-project.org; Kay Cecil Cichini
>> Subject: Re: [R] general question on binomial test / sign test
>>
>> On 02-Sep-10 18:01:55, Greg Snow wrote:
>> > Just to add to Ted's addition to my response.  I think you are moving
>> > towards better understanding (and your misunderstandings are common),
>> > but to further clarify:
>> > [Wise words about P(A|B), P(B|A), P-values, etc., snipped]
>> >
>> > The real tricky bit about hypothesis testing is that we compute a
>> > single p-value, a single observation from a distribution, and based
>> on
>> > that try to decide if the process that produced that observation is a
>> > uniform distribution or something else (that may be close to a
>> uniform
>> > or very different).
>>
>> Indeed. And this is precisely why I began my original reply as follows:
>>
>> >> Zitat von ted.hard...@manchester.ac.uk:
>> >>> [...]
>> >>> The general logic of a significance test is that a test statistic
>> >>> (say T) is chosen such that large values represent a discrepancy
>> >>> between possible data and the hypothesis under test. When you
>> >>> have the data, T evaluates to a value (say t0). The null hypothesis
>> >>> (NH) implies a distribution for the statistic T if the NH is true.
>> >>>
>> >>> Then the value of Prob(T >= t0 | NH) can be calculated. If this is
>> >>> small, then the probability of obtaining data at least as
>> discrepant
>> >>> as the data you did obtain is small; if sufficiently small, then
>> >>> the conjunction of NH and your data (as assessed by the statistic
>> T)
>> >>> is so unlikely that you can decide to not believe that it is
>> >>> possible.
>> >>> If you so decide, then you reject the NH because the data are so
>> >>> discrepant that you can't believe it!
>>
>> The point is that the test statistic T represents *discrepancy*
>> between data and NH in some sense. In what sense? That depends on
>> what you are interested in finding out; and, whatever it is,
>> there will be some T that represents it.
>>
>> It might be whether two samples come from distributions with equal
>> means, or not. Then you might use T = mean(Ysample) - mean(Xsample).
>> Large values of |T| represent discrepancy (in either direction)
>> between data and an NH that the true means are equal. Large values
>> of T, discrepancy in the positive direction; large values of -T,
>> discrepancy in the negative direction. Or it might be whether or
>> not the two samples are drawn from populations with equal variances,
>> when you might use T = var(Ysample)/var(Xsample). Or it might be
>> whether the distribution from which X was sampled is symmetric,
>> in which case you might use skewness(Xsample). Or you might be
>> interested in whether the numbers falling into disjoint classes
>> are consistent with hypothetical probabilities p1,...,pk of
>> falling into these classes -- in which case you might use the
>> chi-squared statistic T = sum(((ni - N*pi)^2)/(N*pi)). And so on.
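
As a concrete sketch of the last of these in R (the counts and class probabilities here are made up for illustration, not taken from the thread):

```r
# Ted's chi-squared discrepancy, on made-up counts over k = 4 classes
ni <- c(30, 22, 25, 23)            # observed counts (illustrative data)
p0 <- rep(1/4, 4)                  # hypothesised class probabilities (NH)
N  <- sum(ni)
T0 <- sum((ni - N*p0)^2 / (N*p0)) # T = sum(((ni - N*pi)^2)/(N*pi))
pval <- pchisq(T0, df = length(ni) - 1, lower.tail = FALSE) # Prob(T >= t0 | NH)
print(c(statistic = T0, p.value = pval))
print(chisq.test(ni, p = p0))      # the built-in test computes the same T
```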
>>
>> Once you have decided on what "discrepant" means, and chosen a
>> statistic T to represent discrepancy, then the NH implies a
>> distribution for T and you can calculate
>>   P-value = Prob(T >= t0 | NH)
>> where t0 is the value of T calculated from the data.
>>
>> *THEN* small P-value is in direct correspondence with large T,
>> i.e. small P is equivalent to large discrepancy. And it is also
>> the direct measure of how likely you were to get so large a
>> discrepancy if the NH really was true.
>>
>> Thus the P-values, calculated from the distribution of (T | NH),
>> are ordered, not just numerically from small P to large, but also
>> equivalently by discrepancy (from large discrepancy to small).
>>
>> Thus the uniform distribution of P under the NH does not just
>> mean that any value of P is as likely as any other (which would
>> invite "So what? Why prefer one P-value to another?").
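
The uniformity itself is easy to see by simulation (a sketch in R; the one-sample t-test here is my choice of example, any exact test would do):

```r
set.seed(1)
# 10,000 replicated experiments in which the NH (true mean 0) holds exactly
pvals <- replicate(10000, t.test(rnorm(20))$p.value)
print(mean(pvals < 0.05))   # close to 0.05, as uniformity on [0,1] implies
print(mean(pvals < 0.50))   # close to 0.50
```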
>>
>> We also have that different parts of the [0,1] P-scale have
>> different *meanings* -- the parts near 0 are highly discrepant
>> from NH, the parts near 1 are highly consistent with NH,
>> *with respect to the meaning of "discrepancy" implied by the
>> choice of test statistic T*.
>>
>> So it helps to understand hypothesis testing if you keep in
>> mind what the test statistic T *represents* in real terms.
>>
>> Greg's point about "try to decide if the process that produced that
>> observation is a uniform distribution or something else (that may
>> be close to a uniform or very different)" is not in the first instance
>> relevant to the direct interpretation of small P-value as large
>> discrepancy -- that involves only the Null Hypothesis NH, under
>> which the P-values have a uniform distribution.
>>
>> Where it comes into its own is that an Alternative Hypothesis AH
>> would correspond to some degree of discrepancy of a certain kind,
>> and if T is well chosen then its distribution under AH will give
>> large values of T greater probability than they would get under NH.
>> Thus the AHs that are implied by a large value of a certain test
>> statistic T are those AHs that give such values of T greater
>> probability than they would get under NH. Thus we are now getting
>> into the domain of the Power of the test to detect discrepancy.
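
A small simulation illustrates the point (the same one-sample t-test as a sketch, with data generated under an AH whose true mean of 0.5 is an arbitrary choice for illustration):

```r
set.seed(2)
# Data generated under an AH: true mean 0.5 rather than the NH's 0
pvals_ah <- replicate(10000, t.test(rnorm(20, mean = 0.5))$p.value)
print(mean(pvals_ah < 0.05))  # the power: well above the 5% seen under the NH
```

Under the AH the p-values pile up near 0 instead of being uniform, which is exactly the sense in which a well-chosen T gives large values greater probability under the AH than under the NH.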
>>
>> Ted.
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 02-Sep-10                                       Time: 22:59:23
>> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml

