On Mon, Jan 17, 2011 at 3:57 PM, Spencer Graves <spencer.gra...@structuremonitoring.com> wrote:
> For me, a major strength of R is the package development process. I've
> found this so valuable that I created a Wikipedia entry by that name and
> made additions to a Wikipedia entry on "software repository", noting that
> this process encourages good software development practices that I have
> not seen standardized for other languages. I encourage people to review
> this material and make additions or corrections as they like (or send me
> suggestions and I will make the appropriate changes).

I agree that the package development process is a major strength. Other
factors include the high level of user support, hand-holding, feedback, and
prompt bug fixes. It is not uncommon to see support at levels far exceeding
what you would expect from a for-profit business. Newbie questions are
answered in seconds in some cases!

On the package development process: if C/C++ development does become more
popular with the help of packages like Rcpp, then extensions that check this
part of a package for consistency, documentation, etc. might be helpful.
This might exploit features of Doxygen, for example.

Dominick
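To make the Doxygen/Rcpp point concrete, here is a minimal sketch of what
documented compiled code might look like; it assumes Rcpp with its attributes
interface (sourceCpp) available, and the function name cumsumC and the
comment block are purely illustrative, not part of any existing check:

    library(Rcpp)

    sourceCpp(code = '
      #include <Rcpp.h>
      using namespace Rcpp;

      /**
       * @brief Cumulative sum of a numeric vector (toy example).
       * @param x numeric input vector
       * @return numeric vector of running totals
       */
      // [[Rcpp::export]]
      NumericVector cumsumC(NumericVector x) {
          int n = x.size();
          NumericVector out(n);
          double total = 0;
          for (int i = 0; i < n; ++i) {
              total += x[i];      // running total, as base cumsum() computes
              out[i] = total;
          }
          return out;
      }
    ')

    cumsumC(c(1, 2, 3))   # 1 3 6, the same as cumsum(c(1, 2, 3))

A hypothetical check could, for example, warn when an exported compiled
routine lacks a Doxygen block like the one above.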
> While R has other capabilities for unit and regression testing, I often
> include unit tests in the "examples" section of documentation files. To
> keep from cluttering the examples with unnecessary material, I often
> include something like the following:
>
> A1 <- myfunc()  # to test myfunc
> A0 <- ("manual generation of the correct answer for A1")
> \dontshow{stopifnot(}  # so the user doesn't see "stopifnot("
> all.equal(A1, A0)  # compare myfunc output with the correct answer
> \dontshow{)}  # close paren on "stopifnot("
>
> This may not be as good in some ways as a full suite of unit tests, which
> could be provided separately. However, it has the distinct advantage of
> including unit tests with the documentation in a way that should help
> users understand "myfunc". (Unit tests too detailed to show users could
> be completely enclosed in "\dontshow".)
>
> Spencer
>
> On 1/17/2011 11:38 AM, Dominick Samperi wrote:
>> On Mon, Jan 17, 2011 at 2:08 PM, Spencer Graves
>> <spencer.gra...@structuremonitoring.com> wrote:
>>> Another point I have not yet seen mentioned: if your code is painfully
>>> slow, that can often be fixed without leaving R by experimenting with
>>> different ways of doing the same thing -- often after profiling your
>>> code to find the slowest parts, as described in chapter 3 of "Writing R
>>> Extensions".
>>>
>>> If I'm given code already written in C (or some other language), unless
>>> it's really simple, I may link to it rather than recode it in R.
>>> However, the problems with portability, maintainability, transparency
>>> to others who may not be very facile with C, etc., all suggest that it
>>> is well worth some effort experimenting with alternate ways of doing
>>> the same thing in R before jumping to C or something else.
>>>
>>> Hope this helps.
>>> Spencer
>>>
>>> On 1/17/2011 10:57 AM, David Henderson wrote:
>>>> I think we're also forgetting something, namely testing. If you write
>>>> your routine in C, you have placed an additional burden upon yourself
>>>> to test your C code through unit tests, etc. If you write your code in
>>>> R, you still need the unit tests, but you can rely on the well-tested
>>>> nature of R to reduce the number of tests of your algorithm. I
>>>> routinely tell people at Sage Bionetworks, where I am working now,
>>>> that your new C code needs to show at least an order-of-magnitude
>>>> increase in performance to warrant the effort of moving from R to C.
>>>>
>>>> But, then again, I am working with scientists who are not primarily,
>>>> or even secondarily, coders...
>>>>
>>>> Dave H
>>
>> This makes sense, but I have seen some very transparent algorithms turned
>> into vectorized R code that is difficult to read (and thus to maintain or
>> change). These chunks of optimized R code are like embedded assembly, in
>> the sense that nobody is likely to want to mess with them. This could be
>> addressed by including pseudo-code for the original (more transparent)
>> algorithm as a comment, but I have never seen this done in practice
>> (perhaps it could be enforced by R CMD check?!).
>>
>> On the other hand, in principle a well-documented piece of C/C++ code
>> could be much easier to understand, without paying a performance
>> penalty... but "coders" are not likely to place this high on their list
>> of priorities.
>>
>> The bottom line is that R is an adaptor ("glue") language, like Lisp,
>> that makes it easy to mix and match functions (using classes and generic
>> functions), many of which are written in C (or C++ or Fortran) for
>> performance reasons. Like any object-based system there can be a lot of
>> object copying, and like any functional programming system there can be
>> a lot of function calls, resulting in poor performance for some
>> applications.
>>
>> If you can vectorize your R code then you have effectively found a way
>> to benefit from somebody else's C code, thus saving yourself some time.
>> For operations other than pure vector calculations you will have to do
>> the C/C++ programming yourself (or call a library that somebody else has
>> written).
>>
>> Dominick
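To illustrate both of the points quoted above (keeping the transparent
algorithm visible as a pseudo-code comment, and vectorization as reuse of
somebody else's C code), here is a small made-up example; the moving-sum
task and the function names are my own, not from the thread:

    ## Transparent version: moving sum of width k, written as a plain loop.
    movingSumLoop <- function(x, k) {
        n <- length(x) - k + 1
        out <- numeric(n)
        for (j in seq_len(n)) out[j] <- sum(x[j:(j + k - 1)])
        out
    }

    ## Vectorized version. Pseudo-code of the transparent algorithm it
    ## replaces:
    ##   for each window start j: out[j] = x[j] + x[j+1] + ... + x[j+k-1]
    ## cumsum() and diff() are implemented in C, so the work is done by
    ## somebody else's compiled code rather than an R-level loop.
    movingSumVec <- function(x, k) {
        cs <- cumsum(x)
        c(cs[k], diff(cs, lag = k))
    }

    x <- c(1, 2, 3, 4, 5, 6)
    movingSumLoop(x, 3)   # 6 9 12 15
    movingSumVec(x, 3)    # 6 9 12 15
    stopifnot(all.equal(movingSumLoop(x, 3), movingSumVec(x, 3)))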
>>>> ----- Original Message ----
>>>> From: Dirk Eddelbuettel <e...@debian.org>
>>>> To: Patrick Leyshock <ngkbr...@gmail.com>
>>>> Cc: r-devel@r-project.org
>>>> Sent: Mon, January 17, 2011 10:13:36 AM
>>>> Subject: Re: [Rd] R vs. C
>>>>
>>>> On 17 January 2011 at 09:13, Patrick Leyshock wrote:
>>>> | A question, please, about development of R packages:
>>>> |
>>>> | Are there any guidelines or best practices for deciding when and why
>>>> | to implement an operation in R, vs. implementing it in C? The
>>>> | "Writing R Extensions" manual recommends "working in interpreted R
>>>> | code . . . this is normally the best option." But we do write C
>>>> | functions and access them in R - the question is, when/why is this
>>>> | justified, and when/why is it NOT justified?
>>>> |
>>>> | While I have identified helpful documents on R coding standards, I
>>>> | have not seen notes/discussions on when/why to implement in R, vs.
>>>> | when to implement in C.
>>>>
>>>> The (still fairly recent) book 'Software for Data Analysis: Programming
>>>> with R' by John Chambers (Springer, 2008) has a lot to say about this.
>>>> John also gave a talk in November which stressed 'multilanguage'
>>>> approaches; see e.g.
>>>>
>>>> http://blog.revolutionanalytics.com/2010/11/john-chambers-on-r-and-multilingualism.html
>>>>
>>>> In short, it all depends, and it is unlikely that you will get a
>>>> coherent answer that is valid for all circumstances. We all love R for
>>>> how expressive and powerful it is, yet there are times when something
>>>> else is called for. Exactly when that time is depends on a great many
>>>> things, and you have not mentioned a single metric in your question.
>>>> So I'd start with John's book.
>>>>
>>>> Hope this helps, Dirk
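Coming back to Spencer's profiling advice and Dirk's remark about metrics:
before moving anything to C it is worth collecting an actual number. A
minimal sketch using only base R follows; the toy function rowMeansLoop()
and the matrix size are my own illustration, not from the thread:

    ## A deliberately loop-heavy stand-in for code suspected of being slow.
    rowMeansLoop <- function(m) {
        out <- numeric(nrow(m))
        for (i in seq_len(nrow(m))) out[i] <- mean(m[i, ])
        out
    }
    m <- matrix(rnorm(1e7), nrow = 1e5)   # 100000 x 100 matrix

    Rprof("profile.out")                  # start collecting profiling samples
    invisible(rowMeansLoop(m))            # run the code to be profiled
    Rprof(NULL)                           # stop profiling
    summaryRprof("profile.out")$by.self   # which functions dominate run time

    ## Wall-clock time gives one concrete metric for the R-vs-C question:
    system.time(rowMeansLoop(m))          # explicit loop over rows
    system.time(rowMeans(m))              # base function, implemented in C

If the timings differ by less than the order of magnitude David mentions,
staying in R (or just vectorizing) is probably the better trade.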