I think there are two different use cases here. The first, which I think is driving the design, is that the user writes a function for a particular problem, where the value of iterate is known in advance. The second is that the user gets a summary function from somewhere else (a package) and applies it using reduceBy*. In that case, the user may need to write a wrapper, depending on the formals of the reusable function.

The only way I could make the second use case work with the current design is to have a higher-order function that returns a universal iterator, one that detects the value of iterate via nargs() and behaves appropriately. Only the package developer would need to know about the higher-order function, not the user.
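A rough sketch of such a higher-order function might look like the following. It assumes the REDUCER is called with two positional arguments (last, current) when iterate=TRUE and with a single list when iterate=FALSE, and that no extra '...' arguments are forwarded to it. The name universalReducer is made up for illustration and is not part of GenomicFiles; FUN is assumed to be a binary combiner such as `+` or rbind.

```r
## Hypothetical helper, not part of GenomicFiles: wrap a binary combiner
## so the result can be used as REDUCER under either iterate setting.
universalReducer <- function(FUN) {
    function(x, y) {
        if (nargs() == 1L) {
            ## Assumed iterate=FALSE calling style: 'x' is a list holding
            ## all MAPPER results, so fold FUN over its elements.
            Reduce(FUN, x)
        } else {
            ## Assumed iterate=TRUE calling style: 'x' is the running
            ## result and 'y' is the next MAPPER result.
            FUN(x, y)
        }
    }
}

REDUCER <- universalReducer(`+`)
pairwise <- REDUCER(1, 2)           # two-argument call
listwise <- REDUCER(list(1, 2, 3))  # single-list call
```

If reduceBy* does forward extra arguments through '...' to the REDUCER, counting arguments with nargs() would no longer be enough to tell the two cases apart, so this only works under the assumptions above.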
On Tue, Jun 17, 2014 at 1:39 PM, Martin Morgan <mtmor...@fhcrc.org> wrote:

> Val's out today and I'm at least part of the problem so...
>
> On 06/17/2014 10:13 AM, Michael Lawrence wrote:
>
>> On Tue, Jun 17, 2014 at 7:00 AM, Valerie Obenchain <voben...@fhcrc.org>
>> wrote:
>>
>>> Hi Michael, Ryan,
>>>
>>> Yes, it would be ideal to have a single signature for both cases of
>>> 'iterate'. We went over the pros/cons again and at the end of the day
>>> decided to keep things as they are. No perfect solution here.
>>>
>>> These were the primary points:
>>>
>>> - Disadvantages of defining REDUCER with only '...' is that '...' can
>>> represent variables other than just the output from MAPPER.
>>
>> Do you mean that "..." will capture additional arguments? From where?
>
> reduceBy* takes an argument ... and this is currently available to both
> the MAPPER and REDUCER, see below.
>
>>> - The unappealing aspect of the variadic approach is introducing a new
>>> check each time REDUCER is called.
>>
>> What is this check?
>>
>>> - Going the other direction, considering a single arg for REDUCER instead
>>> two, requires coercing 'last' and 'current' to a list before pulling them
>>> apart again.
>>
>> What is the problem with constructing this list? Isn't that one extremely
>> fast line of code?
>
> it's not the list construction but the lost convenience of named
> arguments, in addition to consistency with Reduce when the data are
> presented iteratively -- REDUCER=`+` instead of REDUCER=function(lst)
> sum(unlist(lst, use.names=FALSE)).
>
>> It seems to me simpler to settle on one signature, and my preference would
>> be for the single list argument, just because the call is smaller and
>> simpler. Then have a convenient adaptor to handle the variadic case.
>
> The variadic adapter concept is easy enough to understand in context, but
> would send me for a head scratch at some later time.
>
> Martin
>
>>> Valerie
>>>
>>> On 06/15/14 16:36, Michael Lawrence wrote:
>>>
>>>> I kind of prefer the adaptor solution, just for the sake of API
>>>> cleanliness (the MAPPER/REDUCER pair has some elegance), but I think
>>>> we agree that the iterate switch introduces undesirable coupling.
>>>>
>>>> On Sun, Jun 15, 2014 at 3:07 PM, Ryan <r...@thompsonclan.org> wrote:
>>>>
>>>>> What about having two separate reducer arguments, one for a reducer
>>>>> that takes two elements at a time and combines them, and the other
>>>>> for a reducer that takes a list and combines all the elements of the
>>>>> list? Specifying both at once would be an error. I think it makes
>>>>> more sense to say "these two arguments expect different things" than
>>>>> "this one argument expects a different thing depending on the value
>>>>> of another argument".
>>>>>
>>>>> -Ryan
>>>>>
>>>>> On Sun Jun 15 11:17:59 2014, Michael Lawrence wrote:
>>>>>
>>>>>> I just thought there is some benefit for the callback to be the
>>>>>> same, regardless of the iterate setting. This would allow
>>>>>> generalization across different data scales. Perhaps all that is
>>>>>> needed is a constructor for an adapter closure, one for each
>>>>>> direction.
>>>>>>
>>>>>> For example, the variadic adapter would look like:
>>>>>>
>>>>>> Variadic <- function(FUN) {
>>>>>>   function(x, y) {
>>>>>>     if (missing(y)) {
>>>>>>       do.call(FUN, x)
>>>>>>     } else {
>>>>>>       FUN(x, y)
>>>>>>     }
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> That would make it easy to e.g. adapt rbind into the framework. I
>>>>>> wonder if there is precedent and better terminology from the
>>>>>> functional programming domain?
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>> On Sun, Jun 15, 2014 at 8:38 AM, Martin Morgan <mtmor...@fhcrc.org>
>>>>>> wrote:
>>>>>>
>>>>>>> On 06/15/2014 07:34 AM, Michael Lawrence wrote:
>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> Was just checking out GenomicFiles and was a little surprised
>>>>>>>> that the arguments to the REDUCER are different depending on
>>>>>>>> iterate=TRUE vs. iterate=FALSE. In my often flawed opinion,
>>>>>>>> iteration should not be a concern of the REDUCER. It should be
>>>>>>>> oblivious to the iteration mode. In other words, when
>>>>>>>> iterate=TRUE, it is a special case of having two objects to
>>>>>>>> combine, instead of multiple.
>>>>>>>
>>>>>>> My 'rationale' was that one would choose iterate=FALSE when one
>>>>>>> required all elements to perform the reduction. I thought of the
>>>>>>> list (rather than ...) as the general R data structure for
>>>>>>> representing N elements, with a special case (consistent with
>>>>>>> Reduce) made for the pairwise reduction of iterate=TRUE. Either
>>>>>>> way, the two cases (x, y vs. list(), x, y vs. ...) seem to require
>>>>>>> some explaining to the user. Is there a clear better choice?
>>>>>>> You're the second person to trip over this, so I guess there's a
>>>>>>> crack in the sidewalk...
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>>> What would be convenient (but unnecessary) is to detect from the
>>>>>>>> formal arguments whether REDUCER is variadic or list-based. In
>>>>>>>> other words, if REDUCER is defined like function(...) { } it is
>>>>>>>> called via do.call(), otherwise it is passed the list.
>>>>>>>>
>>>>>>>> Thoughts? Maybe I'm totally confused?
>>>>>>>>
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioc-devel@r-project.org mailing list
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>>>
>>>>>>> --
>>>>>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>>>>>> 1100 Fairview Ave. N.
>>>>>>> PO Box 19024 Seattle, WA 98109
>>>>>>>
>>>>>>> Location: Arnold Building M1 B861
>>>>>>> Phone: (206) 667-2793