Agreed that this looks like a real bug, and is independent of how one regards the more general issue about specifying methods for a public generic and a non-exported class.
John On Aug 2, 2016, at 11:48 AM, Kevin Ushey <kevinus...@gmail.com> wrote: > Hi Martin, John, > > Thanks for the responses! I've tidied up some of the notes from this > mailing list thread and posted them on the bug tracker. > > John, in this case, I think namespaces are relevant because for > non-exported S4 classes, the class information is made available > through the '.__C__<package>' symbol in the package's namespace, but > not the package environment that gets attached to the search path. In > this (rare, yet not impossible) sequence of events, it looks like R > attempts to resolve the '.__C__<package>' symbol in the wrong > environment, and so class information lookup fails, and we end up > caching the wrong inheritance information. > > Thanks, > Kevin > > On Sun, Jul 31, 2016 at 5:12 AM, John Chambers <j...@r-project.org> wrote: >> (Just returning from the "wilds" of Canada, so not able to comment on the >> specifics, but ...) >> >> There is a basic point about generic functions that may be related to the >> "private" class question and my earlier remarks that Martin alluded to. >> >> R (and S4 before it) allows packages to define methods for a generic >> function in another package. Say, for plot() in graphics. >> >> The current model is that the generic plot() remains one function, >> specifically a generic associated with the graphics package. >> >> Package A might define a method corresponding to one or two classes defined >> in that package. When A is loaded, those methods are added to the table for >> plot() in the current session. >> >> Now suppose a user calls a function, foo(), in package B, and that foo() in >> turn calls plot(). This is the same plot() function, and in particular will >> include the methods supplied from package A. >> >> This is regardless of the two packages having any overt connection. Also, >> the methods are accepted by the generic function regardless of whether the >> class is explicitly exported or not. In this sense, classes cannot be >> entirely private if methods are defined for a non-local function. >> Namespaces are not directly relevant. >> >> Whether this can lead to strange behavior isn't clear, and if so, is it a >> sign that something undesirable was done in the particular example? (In >> Extending R, I suggested a "right to write methods" that would discourage a >> package from having methods unless it owned the function or some of the >> classes.) >> >> R could adopt a different model for generic functions, where a package that >> defined a method for a non-exported class would create a "local" version of >> the generic, but that would likely raise some other issues. >> >> But seems like a useful topic for discussion. >> >> John >> >> On Jul 30, 2016, at 11:07 AM, Martin Maechler <maech...@stat.math.ethz.ch> >> wrote: >> >>>>>>>> Kevin Ushey <kevinus...@gmail.com> >>>>>>>> on Fri, 29 Jul 2016 11:46:19 -0700 writes: >>> >>>> I should add one more item that may be related here -- >>>> calling 'methods:::.requirePackage' returns a different >>>> result based on whether the package namespace is already >>>> loaded or not. >>> >>>> If the package namespace is not loaded, the package is >>>> loaded and attached, and the package environment is >>>> returned: >>> >>>>> methods:::.requirePackage("digest") >>>> Loading required package: digest <environment: >>>> package:digest> attr(,"name") [1] "package:digest" >>>> attr(,"path") [1] >>>> "/Users/kevin/Library/R/3.3/library/digest" >>>>> "digest" %in% loadedNamespaces() >>>> [1] TRUE >>>>> "package:digest" %in% search() >>>> [1] TRUE >>> >>>> On the other hand, if the package namespace has already >>>> been loaded, the package namespace is returned without >>>> attaching the package: >>> >>>>> requireNamespace("digest") >>>> Loading required namespace: digest >>>>> methods:::.requirePackage("digest") >>>> <environment: namespace:digest> >>>>> "digest" %in% loadedNamespaces() >>>> [1] TRUE >>>>> "package:digest" %in% search() >>>> [1] FALSE >>> >>>> This may be intentional, but the behavior seems surprising >>>> and could be responsible for the behavior outlined >>>> earlier. >>> >>> Yes, the behavior you outlined earlier is buggy, and I also have >>> seen similar bugous behavior for the case of non-exported >>> classes. >>> >>> Part of it is historical: The S4 code was mostly written before >>> namespaces were introduced into R; I vaguely remember John >>> Chambers (the principal creator of S4) saying that he did not >>> intend the formal classes to be not visible... which in some >>> sense only contains the fact that he (or anybody) would not >>> think much about hidden objects before they were introduced. >>> >>> Still, in the mean time, most of us have seen many cases where >>> we wanted to have "private" classes, and many packages do have >>> them, too.... and they "mostly work" ;-) >>> >>> In other words, I agree that it would be very desirable to get >>> to the bottom of this and fix such problems. >>> >>> .requirePackage() is among the parts of the methods package code >>> which are quite delicate (and not much documented AFAIK, the hidden >>> .requirePackage() function is a good example!). >>> >>> Delicate for at least two reasons: >>> >>> 1) They are not only used in crucial steps when "bootstrapping" >>> the methods package ('methods' has to define its own S4 >>> generics, methods, and classes before the package "exists"), >>> >>> 1b) they are also used both when building and installing another >>> 'methods'-dependent package. This could be called >>> "bootstrapping another package". >>> >>> 2) they are also much used whenever S4 code (aka >>> 'methods'-dependent) packages are loaded and attached, >>> so should be as fast as possible. >>> I think you know that in recent years there have been >>> considerable (and successful!) efforts to speedup package >>> loading time, e.g. speeding up 'library(Matrix)' by about >>> 2 factors of 2, and changing such functions should not >>> deteriorate package-load speed noticably. >>> >>> Still, it is very desirable to improve / fix the issue: >>> After all this musings, I'd currently guess that it would be a >>> good idea if .requirePackage() would always return the >>> *namespace* of the corresponding package unless that does not >>> yet exist, as in the the 'bootstrap' situations above. >>> ... or we'd add a new argument to .requirePackage() dealing with >>> that, or we use two functions: .requirePackage() needed for >>> boostrapping, returning a package envionment, and >>> .requireNamespace() used to access class (and generic function!) >>> environments. >>> >>> Well tested patches are very welcome; as is filing a formal >>> bug report with bugzilla (https://bugs.r-project.org/ ; >>> (if you have no "account" there, we will have to >>> manually create one, for now, see the 'Note' on >>> https://www.r-project.org/bugs.html because of the spammer >>> attacks earlier this month ... but I see you, Kevin are >>> registered there), >>> or >>> just continuing your findings here. >>> >>> Martin >>> >>> >>>> Best, Kevin >>> >>>> On Fri, Jul 29, 2016 at 11:37 AM, Kevin Ushey >>>> <kevinus...@gmail.com> wrote: >>>>> I have a small idea as to what's going on now; at least, >>>>> why exporting the class resolves this particular issue. >>>>> >>>>> Firstly, when an S4 class is not exported, the associated >>>>> '.__C__<class>' object is not made part of the package >>>>> environment. For example, I see: >>>>> >>>>>> getAnywhere(".__C__SubMatrix") A single object matching >>>>> '.__C__SubMatrix' was found It was found in the following >>>>> places namespace:s4inherits with value < ... > >>>>> >>>>> Note that the symbol is only discovered in the package >>>>> namespace. When the class is exported (e.g. with >>>>> 'exportClasses(SubMatrix)' in the NAMESPACE file), it's >>>>> found both in 'package:s4inherits' and >>>>> 'namespace:s4inherits'. >>>>> >>>>> Secondly, when R attempts to resolve the superclasses for >>>>> an S3 class, the function 'methods:::.extendsForS3' is >>>>> called. Tracing that code eventually gets us here: >>>>> >>>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L255 >>>>> >>>>> Note that we reach this code path as the S3 class cache >>>>> has not been populated yet; ie, this code returns NULL: >>>>> >>>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L238-L240 >>>>> >>>>> So, the class hierarchy is looked up using this code: >>>>> >>>>> if(isTRUE(nzchar(package))) { whereP <- >>>>> .requirePackage(package) value <- get0(cname, whereP, >>>>> inherits = inherits) } >>>>> >>>>> However, because the '.__C__SubMatrix' object is only >>>>> made available in the package's namespace, not within the >>>>> package environment, it is not resolved, and so lookup >>>>> fails. (Presumedly, that lookup is done when initially >>>>> building a cache for S3 dispatch?) So, I wonder if that >>>>> class lookup should occur within the package's namespace >>>>> instead? >>>>> >>>>> Thanks for your time, Kevin >>>>> >>>>> On Sat, Jun 25, 2016 at 12:46 PM, Kevin Ushey >>>>> <kevinus...@gmail.com> wrote: >>>>>> Hi, >>>>>> >>>>>> (sorry for the wall of text; the issue here appears to >>>>>> be rather complicated) >>>>>> >>>>>> I'm seeing a somewhat strange case where checking >>>>>> whether an S4 object inherits from a parent class >>>>>> defined from another package with 'inherits' fails if >>>>>> that object is materialized through a call to >>>>>> 'load'. That's a mouthful, so I've put together a >>>>>> relatively small reproducible example online here: >>>>>> >>>>>> https://github.com/kevinushey/s4inherits >>>>>> >>>>>> This package, 's4inherits', defines an S4 class, >>>>>> 'SubMatrix', that inherits from the 'dsyMatrix' class >>>>>> defined in the Matrix package. After installing the >>>>>> package, I run some simple tests: >>>>>> >>>>>> $ R -f test-save.R >>>>>> >>>>>>> library(s4inherits) data <- SubMatrix(1) >>>>>>> >>>>>>> is(data, "SubMatrix") >>>>>> [1] TRUE >>>>>>> inherits(data, "SubMatrix") >>>>>> [1] TRUE >>>>>>> >>>>>>> is(data, "dsyMatrix") >>>>>> [1] TRUE >>>>>>> inherits(data, "dsyMatrix") >>>>>> [1] TRUE >>>>>>> >>>>>>> save(data, file = "test.RData") >>>>>>> >>>>>> >>>>>> All the inheritance checks report as we would expect. I >>>>>> check that the inheritance reports are as expected, then >>>>>> save that object to 'test.RData'. I then load that data >>>>>> file in a new R session and run the same checks: >>>>>> >>>>>> $ R -f test-load.R >>>>>> >>>>>>> library(methods) load("test.RData") >>>>>>> >>>>>>> inherits(data, "SubMatrix") >>>>>> Loading required package: s4inherits [1] TRUE >>>>>>> is(data, "SubMatrix") >>>>>> [1] TRUE >>>>>>> >>>>>>> inherits(data, "dsyMatrix") >>>>>> [1] FALSE # (??) >>>>>>> is(data, "dsyMatrix") >>>>>> [1] TRUE >>>>>>> >>>>>> >>>>>> Note that R now reports that my loaded object does _not_ >>>>>> inherit from "dsyMatrix", yet this occurs only when >>>>>> checked with 'inherits()' -- 'is' produces the expected >>>>>> result. >>>>>> >>>>>> I do not see the behavior if I explicitly load / attach >>>>>> the 's4inherits' package before loading the associated >>>>>> data file; it only occurs if the package namespace is >>>>>> loaded in response to loading the data object hosting a >>>>>> 'SubMatrix' object. >>>>>> >>>>>> More precisely, the package namespace is loaded when the >>>>>> promise hosting the data object is evaluated; that >>>>>> promise being generated by 'load', if I understand >>>>>> correctly. Somehow, evaluation of that promise within >>>>>> the call to 'inherits' in this very special case causes >>>>>> the unexpected behavior -- ie, if the first thing that >>>>>> you do with the loaded object is check its class with >>>>>> 'inherits', then you get this unexpected result. >>>>>> >>>>>> Even more, this behavior seems to go away if the >>>>>> 's4inherits' package explicitly exports the class -- >>>>>> however, you could imagine this class normally being >>>>>> internal to the package, and so it may be undesirable to >>>>>> export it. >>>>>> >>>>>> I checked a bit into the C code, and IIUC the check here >>>>>> looks up the class hierarchy in the R_S4_extends_table >>>>>> object defined in 'attrib.c', so it seems like that >>>>>> mapping is potentially not getting constructed with the >>>>>> full hierarchy (although hopefully someone with more >>>>>> knowledge of the S4 internals + interaction with S3 can >>>>>> elaborate). >>>>>> >>>>>> (FWIW, this was distilled from a case where S3 dispatch >>>>>> on a similar loaded S4 object failed, due to failure to >>>>>> resolve the class hierarchy for the S3 dispatch >>>>>> context.) >>>>>> >>>>>> Thanks, Kevin >>>>>> >>>>>> --- >>>>>> >>>>>> $ R --slave -e "utils::sessionInfo()" R Under >>>>>> development (unstable) (2016-06-13 r70769) Platform: >>>>>> x86_64-apple-darwin15.5.0 (64-bit) Running under: OS X >>>>>> 10.11.5 (El Capitan) >>> >>>> ______________________________________________ >>>> R-devel@r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>> >>> ______________________________________________ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >> >> ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel