Re: [Rd] Problem with fitdistr for gamma in R 2.2.0

2005-12-06 Thread Prof Brian Ripley
The problem lies in dgamma, the function which is stated to be implicated. 
See the following NEWS item:

 o  [dpqr]gamma now returns NaN for an invalid 'shape' parameter
(rather than throw an error), for consistency with other
distribution functions.

That setting lower/upper works is just intelligence in the function to 
choose an appropriate method.
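For concreteness, a minimal sketch of the two behaviours described above (assuming R >= 2.2.0; `otm` is the data from the original report):

```r
## Per the NEWS item, an invalid 'shape' now yields NaN (plus a
## warning) instead of throwing an error:
dgamma(1, shape = -1)   # NaN in R >= 2.2.0; an error in R <= 2.1.x

## Supplying any bound makes fitdistr() hand optim() the "L-BFGS-B"
## method, which keeps the search inside the feasible region:
## fitdistr(otm, "gamma", lower = 0)
```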

On Thu, 17 Nov 2005, P Ehlers wrote:

> I think the problem may lie with fitdistr().
> Specifically, replacing the code in fitdistr.R (VR_7.2-20)
> (line 137 to end) with the code in VR_7.2-8 (line 92 to end)
> seems to handle
>
>   fitdistr(otm, "gamma")
>
> just fine. But I haven't done much testing.
>
> Peter Ehlers
>
> Peter Dalgaard wrote:
>> P Ehlers <[EMAIL PROTECTED]> writes:
>>
>>
>>> Gregor,
>>>
>>>  fitdistr(otm, "gamma", method="L-BFGS-B")
>>>
>>> works for me (on WinXP). Or you could specify "lower = 0".
>>
>>
>> The really odd thing is that it even works with
>>
>>
>>> fitdistr(otm, "gamma",lower=-Inf)
>>
>>  shape rate
>>   1.03081094   0.18924370
>>  (0.09055117) (0.02117350)
>>
>> or even
>>
>>
>>> fitdistr(otm, "gamma",upper=Inf)
>>
>>  shape rate
>>   1.03081094   0.18924370
>>  (0.09055117) (0.02117350)
>>
>>
>> Also
>>
>>
>>> fitdistr(otm, "gamma",control=list(parscale=c(.1,.1)))
>>
>>  shape rate
>>   1.03079500   0.18923897
>>  (0.09055106) (0.02117363)
>>
>> and quite amusingly:
>>
>>
>>
>>> fitdistr(otm, "gamma",method="BFGS",lower=0)
>>
>>  shape rate
>>   1.03081096   0.18924371
>>  (0.09055118) (0.02117350)
>> Warning message:
>> bounds can only be used with method L-BFGS-B in: optim(x = 
>> c(0.059610966029577, 0.0591496321922168, 0.14, 0.18,
>>
>>> fitdistr(otm, "gamma",method="CG",lower=0)
>>
>>  shape rate
>>   1.03081096   0.18924371
>>  (0.09055118) (0.02117350)
>> Warning message:
>> bounds can only be used with method L-BFGS-B in: optim(x = 
>> c(0.059610966029577, 0.0591496321922168, 0.14, 0.18,
>>
>> whereas the same calls without the dysfunctional lower= give the
>> error about `shape` needing to be strictly positive.
>>
>> This probably all indicates that something inside optim() is broken.
>>
>>
>>
>>
>>> I no longer have 2.1.0 running, so I don't know why this
>>> wasn't needed in 2.1.0.
>>>
>>> "R version 2.2.0, 2005-10-24"
>>> MASS version: 7.2-20
>>>
>>> -peter
>>>
>>> Gorjanc Gregor wrote:
>>>
 Dear R developers,

 I have encountered strange behaviour of fitdistr for gamma in a recent R
 build, i.e. 2.2.0. I have attached the code for the data at the end of this mail
 so you can reproduce the problem. In short, I am able to run fitdistr under
 2.1.0 without problems, while I get the following error under 2.2.0
 (Version 2.2.0 Patched (2005-11-15 r36348)):



> fitdistr(otm, "gamma")

 Error in densfun(x, parm[1], parm[2], ...) :
'shape' must be strictly positive

 The results on 2.1.1 (Version 2.1.1 (2005-06-20)) are



> fitdistr(otm, "gamma")

shape   rate
  1.030667   0.189177
 (0.090537) (0.021166)

 Platform: Windows XP

 Thank you in advance for your effort on this remarkable tool!

 Here is the data for above problem/results:

 "otm" <-
 c(0.059610966029577, 0.0591496321922168, 0.14, 0.18, 0.24, 0.25,
 0.270071982912719, 0.270758049933706, 0.269911804412492, 0.280138451903593,
 0.279787947586738, 0.279429937571753, 0.3, 0.320746235495899,
 0.319553311037365, 0.51, 0.54, 0.56, 0.6, 0.609812622915953,
 0.609198293855879, 0.64, 0.69, 0.74, 0.76, 0.770972826186568,
 0.769288654833566, 0.78, 0.789181584270671, 0.78991363293305,
 0.8, 0.89, 0.900691718998831, 0.8991656800583, 0.92, 0.93, 0.94,
 1.01, 1.02, 1.13, 1.18, 1.26, 1.29, 1.33, 1.42, 1.43, 1.47, 
 1.47940529614314,
 1.47920716832764, 1.6, 1.61, 1.63, 1.68938231960637, 1.6894849291523,
 1.82, 1.88088044053270, 1.8792804789003, 1.89, 1.92, 2, 2.04,
 2.07, 2.12, 2.17, 2.18, 2.22, 2.23, 2.27, 2.28, 2.3, 2.32092240267433,
 2.31912300181622, 2.38, 2.39, 2.43, 2.46, 2.51, 2.52, 2.55, 2.56,
 2.61, 2.66091404781397, 2.6595832825806, 2.67, 2.7, 2.77, 2.8,
 2.81, 2.86, 2.87, 2.93, 3.01, 3.05, 3.14, 3.15, 3.17, 3.18, 3.24,
 3.26, 3.33, 3.44, 3.45, 3.52, 3.55, 3.63, 3.73, 3.9, 4, 4.01,
 4.04, 4.13, 4.15934497380769, 4.16094719917513, 4.3, 4.33, 4.34,
 4.66, 4.76, 4.82, 4.83, 4.89, 4.92, 5.06, 5.14, 5.16, 5.26, 5.31,
 5.36, 5.48, 5.66, 5.79, 5.8, 5.85, 5.87, 5.92952534468565, 
 5.92962284128508,
 6.04, 6.11, 6.13, 6.16, 6.19, 6.42, 6.66, 6.69, 7.11, 7.16, 7.29,
 7.3, 7.31, 7.33, 7.72, 7.82, 7.87, 7.91, 8.01, 8.17, 8.45, 8.49,
 8.73, 8.86, 8.95, 9, 9.05, 9.13, 9.22, 9.52, 9.82, 9.88, 9.91,
 9.99, 10.03, 10.4, 10.59, 10.83, 11.06, 11.64, 11.85, 12.02,
 12.4, 12.64, 12.96, 13.44, 14.06, 14.07, 14.37, 15.4, 15.6, 15.92,
 16.23, 16.6, 16.97, 17.06, 17.8, 18.69, 18

[Rd] problems with initialize-method, difference between Win XP & Linux

2005-12-06 Thread Matthias Kohl
Dear R devels,

I have some questions concerning the use of "initialize".

Situation:
There are two packages, "S4pkg1" and "S4pkg2", which contain S4 classes 
and methods. "S4pkg1" defines a new S4 class "S4pkg1Class", and "S4pkg2" 
defines a new S4 class "S4pkg2Class" which has a slot of class 
"S4pkg1Class". Both packages have a namespace; I use 
exportClasses("S4pkg1Class") in the namespace of "S4pkg1" and 
import("S4pkg1") in the namespace of "S4pkg2".

#
1. Solution:
I provide a prototype argument in the definition of "S4pkg1Class" and 
use new("S4pkg1Class") in the prototype of "S4pkg2Class".
Then, everything works fine under Windows XP and (Suse 9.3) Linux using 
R 2.2.0 and R 2.3.0 devel; i.e., calling "new("S4pkg2Class")" returns an 
object of class "S4pkg2Class" and the slot of class "S4pkg1Class" is 
filled with the corresponding prototype.

#
2. Solution:
I don't provide a prototype argument in the definition of "S4pkg1Class". 
Instead, I define an "initialize"-method for class "S4pkg1Class" with 
default arguments for the slots of "S4pkg1Class" and again I use 
"new("S4pkg1Class")" in the prototype of class "S4pkg2Class".
Moreover, I use exportMethods("initialize") in the namespace of package 
"S4pkg1".

Then, everything seems to work fine (at least on my PC) under Windows XP 
using R 2.2.0 and R 2.3.0 devel; i.e., calling "new("S4pkg2Class")" 
returns an object of class "S4pkg2Class" where the slot of class 
"S4pkg1Class" now is filled with the default object generated by the 
initialize-method of class "S4pkg1Class".
However, under (Suse 9.3) Linux using R 2.2.0 and R 2.3.0 devel 
"new("S4pkg2Class")" returns an object of class "S4pkg2Class" where the 
slot of class "S4pkg1Class" is not filled with the default object 
generated by the initialize-method of class "S4pkg1Class" but with a 
"default prototype" (slots are filled with "numeric(0)", "character(0)", 
...).
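
A minimal, self-contained sketch of the two set-ups (the slot names and default values here are hypothetical; the real definitions are in the linked packages):

```r
library(methods)

## 1. Solution: the default comes from a prototype in the class definition.
setClass("S4pkg1Class", representation(x = "numeric"), prototype(x = 1))

## 2. Solution: no prototype; the default comes from an initialize method.
setClass("S4pkg1ClassB", representation(x = "numeric"))
setMethod("initialize", "S4pkg1ClassB",
          function(.Object, x = 1, ...) {
              .Object@x <- x
              callNextMethod(.Object, ...)
          })

## In either case the second package, in effect, does:
setClass("S4pkg2Class", representation(pkg1 = "S4pkg1Class"),
         prototype(pkg1 = new("S4pkg1Class")))
new("S4pkg2Class")@pkg1   # should be filled with the S4pkg1Class default
```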

Can someone confirm this behavior?

The sources of two sample packages can be found under:
http://www.stamats.de/S4pkg1_0.1-1.tar.gz
and
http://www.stamats.de/S4pkg2_0.1-1.tar.gz

After installation please try:
require(S4pkg1)
new("S4pkg1Class") # o.k., default values of initialize are used

require(S4pkg2)
new("S4pkg2Class") # is slot "pkg1" filled with the output of 
new("S4pkg1Class") given above???

Why does this work under Windows XP but not under (Suse 9.3) Linux?
Am I doing something wrong - or is this a bug?

Many thanks for any help!
Matthias

-- 
StaMatS - Statistik + Mathematik Service
Dipl.Math.(Univ.) Matthias Kohl
www.stamats.de

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] standardized residuals (rstandard & plot.lm) (PR#8367)

2005-12-06 Thread Heather . Turner
Full_Name: Heather Turner
Version: 2.2.0
OS: Windows XP
Submission from: (NULL) (137.205.240.44)


Standardized residuals as calculated by rstandard.lm, rstandard.glm and plot.lm
are Inf/NaN rather than zero when the un-standardized residuals are zero. This
causes plot.lm to break when calculating 'ylim' for any of the plots of
standardized residuals. Example:

"occupationalStatus" <-
structure(as.integer(c(50, 16, 12, 11, 2, 12, 0, 0, 19, 40, 35,
    20, 8, 28, 6, 3, 26, 34, 65, 58, 12, 102, 19, 14, 8,
    18, 66, 110, 23, 162, 40, 32, 7, 11, 35, 40, 25, 90,
    21, 15, 11, 20, 88, 183, 46, 554, 158, 126, 6, 8, 23,
    64, 28, 230, 143, 91, 2, 3, 21, 32, 12, 177, 71, 106)),
    .Dim = as.integer(c(8, 8)),
    .Dimnames = structure(list(
        origin = c("1", "2", "3", "4", "5", "6", "7", "8"),
        destination = c("1", "2", "3", "4", "5", "6", "7", "8")),
        .Names = c("origin", "destination")),
    class = "table")
Diag <- as.factor(diag(1:8))
Rscore <- scale(as.numeric(row(occupationalStatus)), scale = FALSE)
Cscore <- scale(as.numeric(col(occupationalStatus)), scale = FALSE)
Uniform <- glm(Freq ~ origin + destination + Diag + 
   Rscore:Cscore, family = poisson, data = occupationalStatus)
residuals(Uniform)[as.logical(diag(8))] #zero/near-zero
rstandard(Uniform)[as.logical(diag(8))] #mostly Inf/NaN
plot(Uniform) #breaks on qqnorm plot (or any 'which' > 1)

This could be fixed by replacing standardized residuals with zero where the hat
value is one, e.g.
rstandard.glm <- function (model,
    infl = lm.influence(model, do.coef = FALSE),
    ...) {
    res <- infl$wt.res
    hat <- infl$hat
    ifelse(hat == 1, 0, res/sqrt(summary(model)$dispersion * (1 - hat)))
}
etc.
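
The mechanism, in brief: the standardized residual divides by sqrt(dispersion * (1 - h)), so a cell fitted exactly (hat value h == 1, as the Diag terms force on the diagonal) produces 0/0. A sketch with hypothetical values for one diagonal cell:

```r
## residual 0 and hat value 1, as for an exactly fitted cell
res <- 0; h <- 1; dispersion <- 1
res / sqrt(dispersion * (1 - h))     # 0/0 = NaN
1e-16 / sqrt(dispersion * (1 - h))   # a near-zero residual gives Inf
```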



Re: [Rd] standardized residuals (rstandard & plot.lm) (PR#8367)

2005-12-06 Thread ripley
Curiously, I was just looking at that, since I believe the answer should 
be NaN, and some optimizing compilers/fast BLASes are not giving that.
(There's an example in reg-test-3.R.)  So I think we need to return NaN 
when hat is within rounding error of 1.
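
One way to implement "NaN when hat is within rounding error of 1" (the tolerance below is an illustrative assumption, not R's actual choice):

```r
std_resid <- function(res, hat, dispersion,
                      eps = 10 * .Machine$double.eps) {
    ## treat hat values within rounding error of 1 as exactly 1
    ifelse(1 - hat < eps, NaN, res / sqrt(dispersion * (1 - hat)))
}
```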

My take is that plot.lm should handle this: you will see most but not all 
cases have na.rm=TRUE in calculating ylim, but as Inf is theoretically
impossible it has not been considered.

Note that plot.lm does not use rstandard and so needs a separate fix.

Thanks for the report


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
