Arnaud Le Rouzic wrote:
> Hi the list,
>
> I am experiencing several issues with profile.mle (and consequently with
> confint.mle) (stats4 version 2.9.2), and I have to spend a lot of time
> finding workarounds for what look like interface bugs. I would be glad to
> get feedback from experienced users to know whether I am really asking too
> much or whether there is room for improvement.
>
> * Problem #1 with fixed parameters. I can't manage to get profile-based
> confidence intervals when some parameters of the likelihood function are
> fixed:
>
> -----------8<--------------------------------
> library(stats4)
>
> minusLogL1 <- function(mu, logsigma2) {
>     N*log(2*pi*exp(logsigma2))/2 + N*(var(x)+(mean(x)-mu)^2)/(2*exp(logsigma2))
> }
> minusLogL2 <- function(mu) {
>     logsigma2 <- 0
>     N*log(2*pi*exp(logsigma2))/2 + N*(var(x)+(mean(x)-mu)^2)/(2*exp(logsigma2))
> }
>
> N <- 100
> x <- rnorm(N, 0, 1)
>
> fit <- mle(minusLogL1, start=list(mu=0, logsigma2=0))
> confint(fit)
>
> fit2 <- mle(minusLogL1, start=list(mu=0), fixed=list(logsigma2=0))
> confint(fit2)
>
> fit3 <- mle(minusLogL2, start=list(mu=0))
> confint(fit3)
> ----------->8--------------------------------
>
> Is it unreasonable to expect an identical result with fit2 and fit3?
> When looking into the code of the "profile" method for mle, one can find
> something like call$fixed <- fix; i.e. the fixed effects in the call
> are completely replaced by the combination of fixed effects needed by
> the profile function. Perhaps I am missing something, but I don't
> understand why this is necessary.
>
> * Problem #2 deals with the scope of the variables used in the "call"
> function.
> I understand that there are technical constraints, and similar
> remarks were previously answered on the mailing list ("It's not a bug,
> it's a feature, and you have to cope with it"), but the direct
> consequence is that the user has to understand the internals of the
> functions provided by the stats4 package, which does not look like a
> good idea to me. One of the simplest and most striking examples is
> something like:
>
> -----------8<--------------------------------
>
> fit3 <- mle(minusLogL2, start=list(mu=0))
> confint(fit3)
> x <- rnorm(N, 0, 1)
> confint(fit3)
>
> ----------->8--------------------------------
>
> Although I understand the technical reason why the results differ,
> I think such side effects are catastrophic in terms of UI. The user
> should never have to know whether the information he/she wants is
> already stored in the object (vcov, coef) or whether the function will
> recompute something.
>
> Incidentally, such side effects make it very complicated to write
> wrappers for mle (which is my actual problem). A straightforward way
> to solve this would be to store more information, including the data,
> in the mle object (as many others, e.g. lm, do), but if stats4 has been
> included among the core packages, it is probably because it was
> considered stable and flexible enough. Are there any tricks or
> alternatives to wrap mle calls properly?
>
Well... Issue #1 looks like a programming oversight; the "fixed" argument
was mainly put there to enable profiling, not as a model-building tool.
However, it sounds easy enough to maintain a larger set of fixed
variables. Patches will be considered...

Issue #2 is stickier. I think I must say that the ideology of mle() is
that the user passes a likelihood function. If the likelihood function
depends on global data, and the user changes the global data, the user
deserves what he or she gets... It is pretty much impossible for mle()
to take an arbitrary function and detect which data it might depend on.

You can go the bbmle route and pass in the data set as a data frame, but
then you lose the flexibility that a likelihood can depend on data that
doesn't fit within a single rectangular data frame. I think a better way
is to create likelihood functions that do not depend on global data,
e.g. by keeping the data in an environment that is within the scope of
the likelihood function, but not (easily) accessible from elsewhere. In
your case, maybe

minusLogL1 <- local({
    N <- 100
    x <- rnorm(N, 0, 1)
    function(mu, logsigma2) {
        N*log(2*pi*exp(logsigma2))/2 + N*(var(x)+(mean(x)-mu)^2)/(2*exp(logsigma2))
    }
})

Or maybe

NormalLike <- function(x) {
    force(x)
    N <- length(x)
    function(mu, logsigma2) {
        N*log(2*pi*exp(logsigma2))/2 + N*(var(x)+(mean(x)-mu)^2)/(2*exp(logsigma2))
    }
}

x <- rnorm(100, 0, 1)
minusLogL1 <- NormalLike(x)

(The force(x) is not strictly necessary here, as the calculation of N
will automatically force the evaluation of x, but forgetting to do it
can otherwise cause some rather magnificent grief because of R's lazy
evaluation mechanisms.)
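The lazy-evaluation grief mentioned above can be made concrete with a
small sketch (make_like and its companion are hypothetical names, not
part of stats4): without force(), the inner function closes over an
unevaluated promise for x, which is resolved only at the first call and
therefore sees whatever x is bound to at that moment:

```r
make_like <- function(x) {
  # No force(x): 'x' remains an unevaluated promise until first used
  function(mu) sum((x - mu)^2)
}

make_like_forced <- function(x) {
  force(x)  # evaluate 'x' now, freezing the value at creation time
  function(mu) sum((x - mu)^2)
}

x <- c(1, 2, 3)
f <- make_like(x)
g <- make_like_forced(x)

x <- c(100, 200, 300)  # rebind the global AFTER the closures were made

f(0)  # 140000: the promise is evaluated here, so it sees the new x
g(0)  # 14: force() captured the original c(1, 2, 3)
```

The same mechanism is why the NormalLike() factory above is safe even
without the explicit force(): computing N <- length(x) forces the
promise as a side effect.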
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd....@cbs.dk   Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.