>>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel@r-project.org> >>>>> on Wed, 1 Feb 2017 16:17:06 +0000 writes:
> On 'aggregate data.frame', the URL should be > https://stat.ethz.ch/pipermail/r-help/2016-May/438631.html . thank you. Yes, using 'drop' makes sense there where the result is always "linear(ized)" or "one-dimensional". For tapply() that's only the case for 1D-index. > vector(typeof(ans)) (or vector(storage.mode(ans))) has > length zero and can be used to initialize array. Yes,.. unless in the case where ans is NULL. You have convinced me, that is nicer. > Instead of if(missing(default)) , if(identical(default, > NA)) could be used. The documentation could then say, for > example: "If default = NA (the default), NA of appropriate > storage mode (0 for raw) is automatically used." After some thought (and experiments), I have reverted and no longer use if(missing). You are right that it is not needed (and even potentially confusing) here. Changes are in svn c72106. Martin Maechler > -------------------------------------------- > On Wed, 1/2/17, Martin Maechler > <maech...@stat.math.ethz.ch> wrote: > Subject: Re: [Rd] RFC: tapply(*, ..., init.value = NA) > Cc: R-devel@r-project.org Date: Wednesday, 1 February, > 2017, 12:14 AM >>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org> >>>>> on Tue, 31 Jan 2017 15:43:53 +0000 writes: >> Function 'aggregate.data.frame' in R has taken a >> different route. With drop=FALSE, the function is also >> applied to subset corresponding to combination of >> grouping variables that doesn't appear in the data >> (example 2 in >> https://stat.ethz.ch/pipermail/r-devel/2017-January/073678.html). > Interesting point (I couldn't easily find 'the example 2' > though). However, aggregate.data.frame() is a > considerably more sophisticated function and one goal was > to change tapply() as little as possible for compatibility > (and maintenance!) reasons . > [snip] >> With the code using if(missing(default)) , I consider the >> stated default value of 'default', default = NA , >> misleading because the code doesn't use it. > I know and I also had thought about it and decided to keep > it in the spirit of "self documentation" because "in > spirit", the default still *is* NA. >> Also, tapply(1:3, 1:3, as.raw) is not the same as >> tapply(1:3, 1:3, as.raw, default = NA) . The accurate >> statement is the code in if(missing(default)) , but it >> involves the local variable 'ans'. > exactly. But putting that whole expression in there would > look confusing to those using str(tapply), args(tapply) or > similar inspection to quickly get a glimpse of the > function user "interface". That's why we typically don't > do that and rather slightly cheat with the formal default, > for the above "didactical" purposes. > If you are puristic about this, then missing() should > almost never be used when the function argument has a > formal default. > I don't have a too strong opinion here, and we do have > quite a few other cases, where the formal default argument > is not always used because of if(missing(.)) clauses. > I think I could be convinced to drop the '= NA' from the > formal argument list.. >> As far as I know, the result of function 'array' in is >> not a classed object and the default method of `[<-` will >> be used in the 'tapply' code portion. >> As far as I know, the result of 'lapply' is a list >> without class. So, 'unlist' applied to it uses the >> default method and the 'unlist' result is a vector or a >> factor. > You may be right here ((or not: If a package author makes > array() into an S3 generic and defines S3method(array, *) > and she or another make tapply() into a generic with > methods, are we really sure that this code would not be > used ??)) > still, the as.raw example did not easily work without a > warning when using as.vector() .. or similar. >> With the change, the result of >> tapply(1:3, 1:3, factor, levels=3:1) >> is of mode "character". The value is from the internal >> code, not from the factor levels. It is worse than before >> the change, where it is really the internal code, >> integer. > I agree that this change is not desirable. One could > argue that it was quite a "lucky coincidence" that the > previous code returned the internal integer codes though.. > [snip] >> To initialize array, a zero-length vector can also be >> used. > yes, of course; but my ans[0L][1L] had the purpose to get > the correct mode specific version of NA .. which works for > raw (by getting '00' because "raw" has *no* NA!). > So it seems I need an additional !is.factor(ans) there ... > a bit ugly. > --------- > [snip] > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel