> On Aug 14, 2017, at 10:49 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>> On Aug 14, 2017, at 5:17 AM, peter dalgaard <pda...@gmail.com> wrote:
>>
>>> On 14 Aug 2017, at 13:43, Spencer Graves <spencer.gra...@effectivedefense.org> wrote:
>>>
>>> On 2017-08-14 5:53 AM, peter dalgaard wrote:
>>>>> On 14 Aug 2017, at 10:13, Troels Ring <tr...@gvdnet.dk> wrote:
>>>>>
>>>>> Dear friends - I hope you will accept a naive question on lm: R version 3.4.1, Windows 10.
>>>>>
>>>>> I have 204 "baskets" of three types, corresponding to a factor F, each of size 2 to 33 and containing measurements. I need to know whether the standard deviation of the measurements in each basket, sdd, differs across the types F. Plotting the observed sdd against the basket sizes 2 to 33, called "k", does show a decreasing spread as k increases towards 33.
>>>>>
>>>>> I tried lm(sdd ~ F, weight = k) and got different results when omitting the weight argument, but would it be correct to use sqrt(k) as the weight instead?
>>>>>
>>>> I doubt that there is a "correct" way, but theory says that if the baskets have the same SD and the data are normally distributed, then the variance of the sample VARIANCE is proportional to 1/f = 1/(k-1). Weights in lm are inverse-variance, so the "natural" thing to do would seem to be to regress the square of sdd with weights (k-1).
>>>>
>>>> (If the distribution is not normal, the variance of the sample variance is complicated by a term that involves both n and the excess kurtosis, whereas the variance of the sample SD is complicated in any case. All according to the gospel of St. Google.)
>>>
>>> The Wikipedia article on "standard deviation" gives the more general formula. (That article does NOT give a citation for that formula. If you know one, please add it -- or post it here, to make it easier for someone else to add it.)
>>>
>> Er, I don't see that (i.e. var(S) etc.) in there?
>>
>> My sources were
>>
>> https://math.stackexchange.com/questions/72975/variance-of-sample-variance
>> https://stats.stackexchange.com/questions/631/standard-deviation-of-standard-deviation
>>
>> which contain further links, but no references to publications. I suspect that this stuff is easy enough to do ab initio that people don't bother to fire up a literature search.
>
> I don't see why that page doesn't cite:
> https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation
>
> ... which has several citations, including one to Johnson, Kotz and Balakrishnan, v. 1, ch. 13, sect. 8.2. I dug out my copy from the bottom of a large pile of tomes that I had not reshelved, and I can confirm that the formula is almost (but not quite) the same as the one that appears in print.
>
> JK&B give a formula (p. 127) with no derivation or citation:
>
> E[S] = sigma * (2/n)^(1/2) * Gamma(n/2) / Gamma((n-1)/2)
>
> whereas the Wikipedia page, citing a 1968 TAS article, gives:
>
> E[S] = sigma * (2/(n-1))^(1/2) * Gamma(n/2) / Gamma((n-1)/2)
>
> I looked up the Bolch note online:
>
> http://www.tandfonline.com/doi/abs/10.1080/00031305.1968.10480476?journalCode=utas20
>
> and it does not have the formula. It was a note on an earlier article by Cureton, who in turn cited an American Journal of Psychology article by Holtzman (1950, v. 63, 615-617).
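For what it's worth, a quick numerical check of the two expressions for E[S] quoted above. This is just a sketch in base R: it assumes standard-normal data and an arbitrary n, and compares the simulated mean of S, computed with divisors (n-1) and n, against the two formulas:

## Sketch: check the two quoted E[S] expressions by simulation.
## Assumes standard-normal data; n and nsim are arbitrary choices.
set.seed(1)
n    <- 5
nsim <- 1e5
x    <- matrix(rnorm(n * nsim), nrow = nsim)

s.nm1 <- apply(x, 1, sd)               # S with divisor (n - 1), as sd() uses
s.n   <- s.nm1 * sqrt((n - 1) / n)     # S with divisor n

## Wikipedia / Holtzman form: sigma * (2/(n-1))^(1/2) * Gamma(n/2) / Gamma((n-1)/2)
c(simulated = mean(s.nm1),
  formula   = sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2))

## JK&B form: sigma * (2/n)^(1/2) * Gamma(n/2) / Gamma((n-1)/2)
c(simulated = mean(s.n),
  formula   = sqrt(2 / n) * gamma(n / 2) / gamma((n - 1) / 2))

In this sketch the Wikipedia form matches the usual divisor-(n-1) S and the JK&B form matches the divisor-n version, i.e. the two expressions differ by a factor of sqrt((n-1)/n).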
> http://amstat.tandfonline.com/doi/abs/10.1080/00031305.1968.10480435?src=recsys
>
> Searching on that Cureton article, I see the first hit is a citation to some R documentation for the MBESS::s.u function, which does implement it as recommended by Holtzman.
>
> If I were voting on this I would put greater weight on JK&B, but that's just because it is incredibly unlikely that I could do the math.
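Going back to Troels' original lm() question, here is a minimal sketch of the weighting Peter suggested, on made-up data (the real sdd, F and k are not posted in the thread, so the numbers below are only illustrative):

## Sketch on simulated data: 204 baskets of sizes 2..33, three types.
## Under normality, var(sample variance) is proportional to 1/(k - 1), and
## weights in lm() are inverse-variance, hence regress sdd^2 with weights k - 1.
set.seed(2)
k   <- sample(2:33, 204, replace = TRUE)                      # basket sizes
F   <- factor(sample(c("a", "b", "c"), 204, replace = TRUE))  # basket type ("F" kept from the thread)
sdd <- sapply(k, function(n) sd(rnorm(n)))                    # one observed SD per basket

fit <- lm(I(sdd^2) ~ F, weights = k - 1)
summary(fit)
anova(fit)   # overall test for differences across the three types

With all three types simulated from the same distribution the F-test should usually be non-significant; run on the real sdd, F and k it addresses the original question directly.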
And after reading the historical note by Jarrett (cited by http://davegiles.blogspot.com/2013/12/unbiased-estimation-of-standard.html):

http://www.tandfonline.com/doi/abs/10.1080/00031305.1968.10480474

I'm wondering if these may be equivalent after correcting for varying definitions.

--
David.

> Best;
> David.
>
>> -pd
>>
>>> Thanks, Peter.
>>> Spencer Graves
>>>>
>>>> -pd

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'
-Gehm's Corollary to Clarke's Third Law

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.