Thanks for the thoughtful reply, Mikael.

> Any function F with '...' as a formal argument can pass '...' to another 
> function G.

Yes, that's true. The difference is that in print(F) we can _usually_
pick out at a glance how '...' is being used -- we can see which 'G'
is getting '...'.

For S3 generics, we quickly reach the dead end of 'UseMethod' -- F
being S3 generic is in fact _highly_ relevant.
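
To make the "dead end" concrete, here is a minimal sketch using only
base functions. It assumes rbind.data.frame is visible from base and
has the signature shown in the quoted message below:

```r
# Formals of the generic: this is everything print(rbind) can tell us
generic_args <- names(formals(rbind))

# Formals of one method that the generic can dispatch to
method_args <- names(formals(rbind.data.frame))

# Arguments that '...' can carry to the method but print(rbind) never mentions
extra_args <- setdiff(method_args, generic_args)
print(extra_args)
```

Nothing in the printed generic hints at extra_args; they only turn up
once you know which method to look at.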

Yes, the practical issues you raise are interesting & knotty (I
especially have in mind [1] and [2]), but ultimately I think we could
come up with something useful. Whether that becomes the default can
depend on how useful it winds up being and on the empirical risk of
backward incompatibility (which I suspect is low).
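
As a rough illustration of the detection problem, here is a sketch of
the "look through body(F) for UseMethod" idea. calls_use_method() and
the two toy functions are hypothetical names made up for this example,
not anything in base R, and the check is deliberately crude: it only
asks whether the symbol UseMethod appears anywhere in the body.

```r
# Hypothetical helper: does the body of a closure mention UseMethod anywhere?
calls_use_method <- function(f) {
  # Primitives have no body to inspect, so internally generic
  # primitives are reported as FALSE here
  if (!is.function(f) || is.primitive(f)) return(FALSE)
  # all.names() walks the whole expression, including nested calls
  "UseMethod" %in% all.names(body(f))
}

# Toy generic that does some work before dispatching
toy_generic <- function(x, ...) {
  if (is.null(x)) return(invisible(NULL))
  UseMethod("toy_generic")
}

# Toy non-generic that merely forwards '...'
toy_plain <- function(x, ...) paste(x, ...)

calls_use_method(toy_generic)  # TRUE
calls_use_method(toy_plain)    # FALSE
calls_use_method(rbind)        # FALSE: dispatch happens inside .Internal()
```

The last call is the sort of case that makes this knotty: rbind
dispatches internally, so no amount of walking its body will find
UseMethod. The crude check can also give false positives, e.g. for a
function that merely defines a generic inside its body.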

Mike C

[1] utils::isS3stdGeneric
https://stat.ethz.ch/R-manual/R-devel/library/utils/html/isS3stdGen.html,
which has a large number of false negatives
[2] tools::nonS3methods
https://stat.ethz.ch/R-manual/R-devel/library/tools/html/QC.html,
which maintains an onerous list of S3 method lookalikes
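
A small sketch of the kind of false negative I mean in [1]. The two
toy generics are hypothetical and written just for this example; the
second mirrors the shape of as.data.frame, which does some work before
calling UseMethod:

```r
# "Standard" S3 generic: the body is nothing but the UseMethod() call
toy_std <- function(x, ...) UseMethod("toy_std")

# Still an S3 generic, but with a guard before the dispatch
toy_nonstd <- function(x, ...) {
  if (is.null(x)) return(invisible(NULL))
  UseMethod("toy_nonstd")
}

utils::isS3stdGeneric(toy_std)     # TRUE (named "toy_std")
utils::isS3stdGeneric(toy_nonstd)  # FALSE, although it does dispatch
```

To be fair, this is the documented behaviour (the "std" in the name
means "standard"), but it shows the function cannot be the whole
answer to "is F an S3 generic?".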

On Mon, Jun 9, 2025 at 8:44 PM Mikael Jagan <jagan...@gmail.com> wrote:
>
> I don't really understand the premise.  Any function F with '...' as a formal
> argument can pass '...' to another function G.  The actual arguments matching
> '...' in the call to F will be matched to the formal arguments of G.  So the
> maintainer of F may want to alert the user of F to the existence of G and
> the user of F may want to consult the documentation of G.
>
> Whether F is S3 generic and G is registered as a method for F seems irrelevant.
>
> That is a conceptual issue.  There are practical issues, too:
>
>      * print.default is used "everywhere".  Backwards incompatible changes to
>        default behaviour have the potential to break a lot of code out there.
>
>      * Testing that a function F is S3 generic seems nontrivial.  You have to
>        deal with internally generic functions and for closures recurse through
>        body(F) looking for a call to UseMethod.
>
>      * I would not want the output of print(F) to depend on details external
>        to F or the method call, such as the state of the table of registered
>        S3 methods which changes as packages are loaded.  AFAIK, it is
>        intended that options() is the only exception to the rule.
>
>      * More harmonious would be to implement the feature ("give me more
>        information about S3 methods") as an option (disabled by default) of
>        utils::.S3methods if not as a new function altogether.
>
> Mikael
>
> > Date: Fri, 6 Jun 2025 11:59:08 -0700
> > From: Michael Chirico<michaelchiri...@gmail.com>
> >
> > There is a big difference in how to think of '...' for non-generic
> > functions like data.frame() vs. S3 generics.
> >
> > In the former, it means "any number of inputs" [e.g. columns]; in the
> > latter, it means "any number of inputs [think c()], as well as any
> > arguments that might be interpreted by class implementations".
> >
> > Understanding the difference for a given generic can require carefully
> > reading lots of documentation. print(<generic>), which is useful for
> > so many other contexts, can be a dead end.
> >
> > One idea is to extend the print() method to suggest to the reader
> > which other arguments are available (among registered methods). Often
> > ?<generic> will include the most common implementation, but not always
> > so.
> >
> > For rbind (in a --vanilla session), we currently have one method,
> > rbind.data.frame, that offers three arguments not present in the
> > generic: make.row.names, stringsAsFactors, and factor.exclude. The
> > proposal would be to mention this in the print(rbind) output somehow,
> > e.g.
> >
> >> print(rbind)
> > function (..., deparse.level = 1)
> > .Internal(rbind(deparse.level, ...))
> > <bytecode: 0x73d4fd824e20>
> > <environment: namespace:base>
> >
> > +Other arguments implemented by methods
> > +  factor.exclude: rbind.data.frame
> > +  make.row.names: rbind.data.frame
> > +  stringsAsFactors: rbind.data.frame
> >
> > I suggest grouping by argument, not by method, although something like
> > this could be OK too:
> >
> > +Signatures of other methods
> > +  rbind.data.frame(..., deparse.level = 1, make.row.names = TRUE,
> > +      stringsAsFactors = FALSE, factor.exclude = TRUE)
> >
> > Where it gets more interesting is when there are many methods, e.g.
> > for as.data.frame (again, in a --vanilla session):
> >
> >> print(as.data.frame)
> > function (x, row.names = NULL, optional = FALSE, ...)
> > {
> >      if (is.null(x))
> >          return(as.data.frame(list()))
> >      UseMethod("as.data.frame")
> > }
> > <bytecode: 0x73d4fc1e70d0>
> > <environment: namespace:base>
> >
> > +Other arguments implemented by methods
> > +  base: as.data.frame.table
> > +  check.names: as.data.frame.list
> > +  col.names: as.data.frame.list
> > +  cut.names: as.data.frame.list
> > +  fix.empty.names: as.data.frame.list
> > +  make.names: as.data.frame.matrix, as.data.frame.model.matrix
> > +  new.names: as.data.frame.list
> > +  nm: as.data.frame.bibentry, as.data.frame.complex, as.data.frame.Date,
> > +    as.data.frame.difftime, as.data.frame.factor, as.data.frame.integer,
> > +    as.data.frame.logical, as.data.frame.noquote, as.data.frame.numeric,
> > +    as.data.frame.numeric_version, as.data.frame.ordered,
> > +    as.data.frame.person, as.data.frame.POSIXct, as.data.frame.raw
> > +  responseName: as.data.frame.table
> > +  sep: as.data.frame.table
> > +  stringsAsFactors: as.data.frame.character, as.data.frame.list,
> > +    as.data.frame.matrix, as.data.frame.table
> >
> > Or
> >
> > +Signatures of other methods
> > +  as.data.frame.aovproj(x, ...)
> > +  as.data.frame.array(x, row.names = NULL, optional = FALSE, ...)
> > +  as.data.frame.AsIs(x, row.names = NULL, optional = FALSE, ...)
> > +  as.data.frame.bibentry(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.character(x, ..., stringsAsFactors = FALSE)
> > +  as.data.frame.citation(x, row.names = NULL, optional = FALSE, ...)
> > +  as.data.frame.complex(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.data.frame(x, row.names = NULL, ...)
> > +  as.data.frame.Date(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.default(x, ...)
> > +  as.data.frame.difftime(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.factor(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.ftable(x, row.names = NULL, optional = FALSE, ...)
> > +  as.data.frame.integer(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.list(x, row.names = NULL, optional = FALSE, ...,
> > +      cut.names = FALSE, col.names = names(x), fix.empty.names = TRUE,
> > +      new.names = !missing(col.names), check.names = !optional,
> > +      stringsAsFactors = FALSE)
> > +  as.data.frame.logical(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.logLik(x, ...)
> > +  as.data.frame.matrix(x, row.names = NULL, optional = FALSE,
> > +      make.names = TRUE, ..., stringsAsFactors = FALSE)
> > +  as.data.frame.model.matrix(x, row.names = NULL, optional = FALSE,
> > +      make.names = TRUE, ...)
> > +  as.data.frame.noquote(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.numeric(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.numeric_version(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.ordered(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.person(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.POSIXct(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.POSIXlt(x, row.names = NULL, optional = FALSE, ...)
> > +  as.data.frame.raw(x, row.names = NULL, optional = FALSE, ...,
> > +      nm = deparse1(substitute(x)))
> > +  as.data.frame.table(x, row.names = NULL, ..., responseName = "Freq",
> > +      stringsAsFactors = TRUE, sep = "", base = list(LETTERS))
> > +  as.data.frame.ts(x, ...)
> >
> > Obviously that's a bit more cluttered, but as.data.frame() should be a
> > pretty unusual case. It also highlights better the differences in the
> > two approaches: the former economizes on space and focuses on what
> > sorts of arguments are available; the latter shows the defaults, does
> > not hide the arguments shared with the generic, and will always
> > produce as many lines as there are methods.
> >
> > There are other edge cases to think through (multiple registrations,
> > interactions with S4, primitives, ...), but I want to first check with
> > the list if this looks workable & valuable enough to pursue.
> >
> > Mike C
> >
> > ----
> >
> > Code that helped with the above:
> >
> > f = as.data.frame
> > # NB: methods() and getAnywhere() require {utils}
> > m = methods(f)
> > generic_args = names(formals(f))
> > f_methods = lapply(m, \(fn) getAnywhere(fn)$objs[[1L]])
> > names(f_methods) = m
> > # lapply(), not sapply(): sapply() simplifies its result when every
> > # method adds the same number of arguments (e.g. for rbind)
> > new_args = lapply(f_methods, \(g) setdiff(names(formals(g)), generic_args))
> > with( # group by argument name
> >    data.frame(method = rep(names(new_args), lengths(new_args)),
> >               arg = unlist(new_args), row.names = NULL),
> >    {
> >      tbl = tapply(method, arg, toString)
> >      writeLines(paste0(names(tbl), ": ", tbl))
> >    }
> > )
> > # one signature per method: drop the trailing "NULL" body line from args()
> > signatures = sapply(f_methods, \(g) paste(head(format(args(g)), -1),
> >                                           collapse = "\n"))
> > writeLines(paste0(names(signatures),
> >                   gsub("^\\s*function\\s*", "", signatures)))
>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
