Agreed that this looks like a real bug, and is independent of how one regards 
the more general issue about specifying methods for a public generic and a 
non-exported class.

John

On Aug 2, 2016, at 11:48 AM, Kevin Ushey <kevinus...@gmail.com> wrote:

> Hi Martin, John,
> 
> Thanks for the responses! I've tidied up some of the notes from this
> mailing list thread and posted them on the bug tracker.
> 
> John, in this case, I think namespaces are relevant because for
> non-exported S4 classes, the class information is made available
> through the '.__C__<package>' symbol in the package's namespace, but
> not the package environment that gets attached to the search path. In
> this (rare, yet not impossible) sequence of events, it looks like R
> attempts to resolve the '.__C__<package>' symbol in the wrong
> environment, and so class information lookup fails, and we end up
> caching the wrong inheritance information.
> 
> Thanks,
> Kevin
> 
> On Sun, Jul 31, 2016 at 5:12 AM, John Chambers <j...@r-project.org> wrote:
>> (Just returning from the "wilds" of Canada, so not able to comment on the 
>> specifics, but ...)
>> 
>> There is a basic point about generic functions that may be related to the 
>> "private" class question and my earlier remarks that Martin alluded to.
>> 
>> R (and S4 before it)  allows packages to define methods for a generic 
>> function in another package.  Say, for plot() in graphics.
>> 
>> The current model is that the generic plot() remains one function, 
>> specifically a generic associated with the graphics package.
>> 
>> Package A might define a method corresponding to one or two classes defined 
>> in that package.  When A is loaded, those methods are added to the table for 
>> plot() in the current session.
>> 
>> Now suppose a user calls a function, foo(), in package B, and that foo() in 
>> turn calls plot().  This is the same plot() function, and in particular will 
>> include the methods supplied from package A.
>> 
>> This is regardless of the two packages having any overt connection.  Also, 
>> the methods are accepted by the generic function regardless of whether the 
>> class is explicitly exported or not.  In this sense, classes cannot be 
>> entirely private if methods are defined for a non-local function.  
>> Namespaces are not directly relevant.
>> 
>> Whether this can lead to strange behavior isn't clear, and if so, is it a 
>> sign that something undesirable was done in the particular example?  (In 
>> Extending R, I suggested a "right to write methods" that  would discourage a 
>> package from having methods unless it owned the function or some of the 
>> classes.)
>> 
>> R could adopt a different model for generic functions, where a package that 
>> defined a method for a non-exported class would create a "local" version of 
>> the generic, but that would likely raise some other issues.
>> 
>> But seems like a useful topic for discussion.
>> 
>> John
>> 
>> On Jul 30, 2016, at 11:07 AM, Martin Maechler <maech...@stat.math.ethz.ch> 
>> wrote:
>> 
>>>>>>>> Kevin Ushey <kevinus...@gmail.com>
>>>>>>>>   on Fri, 29 Jul 2016 11:46:19 -0700 writes:
>>> 
>>>> I should add one more item that may be related here --
>>>> calling 'methods:::.requirePackage' returns a different
>>>> result based on whether the package namespace is already
>>>> loaded or not.
>>> 
>>>> If the package namespace is not loaded, the package is
>>>> loaded and attached, and the package environment is
>>>> returned:
>>> 
>>>>> methods:::.requirePackage("digest")
>>>>   Loading required package: digest <environment:
>>>> package:digest> attr(,"name") [1] "package:digest"
>>>> attr(,"path") [1]
>>>> "/Users/kevin/Library/R/3.3/library/digest"
>>>>> "digest" %in% loadedNamespaces()
>>>>   [1] TRUE
>>>>> "package:digest" %in% search()
>>>>   [1] TRUE
>>> 
>>>> On the other hand, if the package namespace has already
>>>> been loaded, the package namespace is returned without
>>>> attaching the package:
>>> 
>>>>> requireNamespace("digest")
>>>>   Loading required namespace: digest
>>>>> methods:::.requirePackage("digest")
>>>>   <environment: namespace:digest>
>>>>> "digest" %in% loadedNamespaces()
>>>>   [1] TRUE
>>>>> "package:digest" %in% search()
>>>>   [1] FALSE
>>> 
>>>> This may be intentional, but the behavior seems surprising
>>>> and could be responsible for the behavior outlined
>>>> earlier.
>>> 
>>> Yes, the behavior you outlined earlier is buggy, and I also have
>>> seen similar bugous behavior for the case of non-exported
>>> classes.
>>> 
>>> Part of it is historical:  The S4 code was mostly written before
>>> namespaces were introduced into R;   I vaguely remember John
>>> Chambers (the principal creator of S4) saying that he did not
>>> intend the formal classes to be not visible... which in some
>>> sense only contains the fact that he (or anybody) would not
>>> think much about hidden objects before they were introduced.
>>> 
>>> Still, in the mean time, most of us have seen many cases where
>>> we wanted to have "private" classes,  and many packages do have
>>> them, too.... and they "mostly work" ;-)
>>> 
>>> In other words, I agree that it would be very desirable to get
>>> to the bottom of this and fix such problems.
>>> 
>>> .requirePackage() is among the parts of the methods package code
>>> which are quite delicate (and not much documented AFAIK, the hidden
>>> .requirePackage() function is a good example!).
>>> 
>>> Delicate for at least two reasons:
>>> 
>>> 1) They are not only used in crucial steps when "bootstrapping"
>>>  the methods package ('methods' has to define its own S4
>>>  generics, methods, and classes before the package "exists"),
>>> 
>>> 1b) they are also used both when building and installing another
>>>   'methods'-dependent package.  This could be called
>>>   "bootstrapping another package".
>>> 
>>> 2) they are also much used whenever S4 code (aka
>>>  'methods'-dependent) packages are loaded and attached,
>>>  so should be as fast as possible.
>>>  I think you know that in recent years there have been
>>>  considerable (and successful!) efforts to speedup package
>>>  loading time, e.g. speeding up 'library(Matrix)' by about
>>>  2 factors of 2, and changing such functions should not
>>>  deteriorate package-load speed noticably.
>>> 
>>> Still, it is very desirable to improve / fix the issue:
>>> After all this musings, I'd currently guess that it would be a
>>> good idea if  .requirePackage()  would always return the
>>> *namespace* of the corresponding package unless that does not
>>> yet exist, as in the the 'bootstrap' situations above.
>>> ... or we'd add a new argument to .requirePackage() dealing with
>>> that, or we use two functions:  .requirePackage() needed for
>>> boostrapping, returning a package envionment, and
>>> .requireNamespace() used to access class (and generic function!)
>>> environments.
>>> 
>>> Well tested patches are very welcome; as is filing a formal
>>> bug report with bugzilla (https://bugs.r-project.org/ ;
>>> (if you have no "account" there, we will have to
>>> manually create one, for now, see the 'Note' on
>>> https://www.r-project.org/bugs.html because of the spammer
>>> attacks earlier this month ... but I see you, Kevin are
>>> registered there),
>>> or
>>> just continuing your findings here.
>>> 
>>> Martin
>>> 
>>> 
>>>> Best, Kevin
>>> 
>>>> On Fri, Jul 29, 2016 at 11:37 AM, Kevin Ushey
>>>> <kevinus...@gmail.com> wrote:
>>>>> I have a small idea as to what's going on now; at least,
>>>>> why exporting the class resolves this particular issue.
>>>>> 
>>>>> Firstly, when an S4 class is not exported, the associated
>>>>> '.__C__<class>' object is not made part of the package
>>>>> environment.  For example, I see:
>>>>> 
>>>>>> getAnywhere(".__C__SubMatrix") A single object matching
>>>>> '.__C__SubMatrix' was found It was found in the following
>>>>> places namespace:s4inherits with value < ... >
>>>>> 
>>>>> Note that the symbol is only discovered in the package
>>>>> namespace. When the class is exported (e.g. with
>>>>> 'exportClasses(SubMatrix)' in the NAMESPACE file), it's
>>>>> found both in 'package:s4inherits' and
>>>>> 'namespace:s4inherits'.
>>>>> 
>>>>> Secondly, when R attempts to resolve the superclasses for
>>>>> an S3 class, the function 'methods:::.extendsForS3' is
>>>>> called. Tracing that code eventually gets us here:
>>>>> 
>>>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L255
>>>>> 
>>>>> Note that we reach this code path as the S3 class cache
>>>>> has not been populated yet; ie, this code returns NULL:
>>>>> 
>>>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L238-L240
>>>>> 
>>>>> So, the class hierarchy is looked up using this code:
>>>>> 
>>>>> if(isTRUE(nzchar(package))) { whereP <-
>>>>> .requirePackage(package) value <- get0(cname, whereP,
>>>>> inherits = inherits) }
>>>>> 
>>>>> However, because the '.__C__SubMatrix' object is only
>>>>> made available in the package's namespace, not within the
>>>>> package environment, it is not resolved, and so lookup
>>>>> fails. (Presumedly, that lookup is done when initially
>>>>> building a cache for S3 dispatch?) So, I wonder if that
>>>>> class lookup should occur within the package's namespace
>>>>> instead?
>>>>> 
>>>>> Thanks for your time, Kevin
>>>>> 
>>>>> On Sat, Jun 25, 2016 at 12:46 PM, Kevin Ushey
>>>>> <kevinus...@gmail.com> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> (sorry for the wall of text; the issue here appears to
>>>>>> be rather complicated)
>>>>>> 
>>>>>> I'm seeing a somewhat strange case where checking
>>>>>> whether an S4 object inherits from a parent class
>>>>>> defined from another package with 'inherits' fails if
>>>>>> that object is materialized through a call to
>>>>>> 'load'. That's a mouthful, so I've put together a
>>>>>> relatively small reproducible example online here:
>>>>>> 
>>>>>> https://github.com/kevinushey/s4inherits
>>>>>> 
>>>>>> This package, 's4inherits', defines an S4 class,
>>>>>> 'SubMatrix', that inherits from the 'dsyMatrix' class
>>>>>> defined in the Matrix package.  After installing the
>>>>>> package, I run some simple tests:
>>>>>> 
>>>>>> $ R -f test-save.R
>>>>>> 
>>>>>>> library(s4inherits) data <- SubMatrix(1)
>>>>>>> 
>>>>>>> is(data, "SubMatrix")
>>>>>> [1] TRUE
>>>>>>> inherits(data, "SubMatrix")
>>>>>> [1] TRUE
>>>>>>> 
>>>>>>> is(data, "dsyMatrix")
>>>>>> [1] TRUE
>>>>>>> inherits(data, "dsyMatrix")
>>>>>> [1] TRUE
>>>>>>> 
>>>>>>> save(data, file = "test.RData")
>>>>>>> 
>>>>>> 
>>>>>> All the inheritance checks report as we would expect. I
>>>>>> check that the inheritance reports are as expected, then
>>>>>> save that object to 'test.RData'. I then load that data
>>>>>> file in a new R session and run the same checks:
>>>>>> 
>>>>>> $ R -f test-load.R
>>>>>> 
>>>>>>> library(methods) load("test.RData")
>>>>>>> 
>>>>>>> inherits(data, "SubMatrix")
>>>>>> Loading required package: s4inherits [1] TRUE
>>>>>>> is(data, "SubMatrix")
>>>>>> [1] TRUE
>>>>>>> 
>>>>>>> inherits(data, "dsyMatrix")
>>>>>> [1] FALSE # (??)
>>>>>>> is(data, "dsyMatrix")
>>>>>> [1] TRUE
>>>>>>> 
>>>>>> 
>>>>>> Note that R now reports that my loaded object does _not_
>>>>>> inherit from "dsyMatrix", yet this occurs only when
>>>>>> checked with 'inherits()' -- 'is' produces the expected
>>>>>> result.
>>>>>> 
>>>>>> I do not see the behavior if I explicitly load / attach
>>>>>> the 's4inherits' package before loading the associated
>>>>>> data file; it only occurs if the package namespace is
>>>>>> loaded in response to loading the data object hosting a
>>>>>> 'SubMatrix' object.
>>>>>> 
>>>>>> More precisely, the package namespace is loaded when the
>>>>>> promise hosting the data object is evaluated; that
>>>>>> promise being generated by 'load', if I understand
>>>>>> correctly. Somehow, evaluation of that promise within
>>>>>> the call to 'inherits' in this very special case causes
>>>>>> the unexpected behavior -- ie, if the first thing that
>>>>>> you do with the loaded object is check its class with
>>>>>> 'inherits', then you get this unexpected result.
>>>>>> 
>>>>>> Even more, this behavior seems to go away if the
>>>>>> 's4inherits' package explicitly exports the class --
>>>>>> however, you could imagine this class normally being
>>>>>> internal to the package, and so it may be undesirable to
>>>>>> export it.
>>>>>> 
>>>>>> I checked a bit into the C code, and IIUC the check here
>>>>>> looks up the class hierarchy in the R_S4_extends_table
>>>>>> object defined in 'attrib.c', so it seems like that
>>>>>> mapping is potentially not getting constructed with the
>>>>>> full hierarchy (although hopefully someone with more
>>>>>> knowledge of the S4 internals + interaction with S3 can
>>>>>> elaborate).
>>>>>> 
>>>>>> (FWIW, this was distilled from a case where S3 dispatch
>>>>>> on a similar loaded S4 object failed, due to failure to
>>>>>> resolve the class hierarchy for the S3 dispatch
>>>>>> context.)
>>>>>> 
>>>>>> Thanks, Kevin
>>>>>> 
>>>>>> ---
>>>>>> 
>>>>>> $ R --slave -e "utils::sessionInfo()" R Under
>>>>>> development (unstable) (2016-06-13 r70769) Platform:
>>>>>> x86_64-apple-darwin15.5.0 (64-bit) Running under: OS X
>>>>>> 10.11.5 (El Capitan)
>>> 
>>>> ______________________________________________
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
>> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to