On Tue, Sep 13, 2011 at 12:54 PM, Joseph Park <jpark...@att.net> wrote:
> Hi, I'm looking for some guidance on whether to use
> S4 or Reference Classes for an analysis application
> I'm developing.
> I'm a C++/Python developer, and like to 'think' in OOD.
> I started my app with S4, thinking that was the best
> set of OO features in R. However, it appears that one
> needs Reference Classes to allow object methods to assign
> values (other than the .Object in the initialize method)
> to slots of the object.
> This is typically what I prefer: creating an object, then
> operating on the object (reference) calling object methods
> to access/modify slots.
> So I'm wondering what (dis)advantages there are in
> developing with S4 vs Reference Classes.
> Things of interest:
> Performance (i.e. memory management)
> Integration compatibility with R packages
> ??? other issues

From a C++/Python background you will probably feel more comfortable
with reference classes. They are newer than S4 classes and much newer
than S3 "classes" (which aren't really classes) and methods. Because
reference classes are newer the support for them has not been as fully
developed and you may encounter warts from time to time.

I use both reference classes and S4 classes. Often I have objects that
represent model/data combinations for which the parameter estimates are
to be determined by optimizing a criterion. In those cases it makes
sense to me to use reference classes because the state of the object
can be changed by a method. I want to update the parameters in the
object and evaluate the estimation criterion without needing to copy
the entire object. If you try to perform some kind of update operation
on an S4 object and not cheat in some way (i.e. adhere to strict
functional programming semantics) you need to create a new instance of
the object each time you update it. When the object is potentially
very large you find yourself worrying about memory usage if you take
that route.
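A minimal sketch of the update semantics described above. The class
names ModelS4 and ModelRC, the field theta, and the functions
updateTheta and setTheta are hypothetical, chosen only for
illustration; they do not come from any package.

```r
library(methods)

# Hypothetical S4 class: an "update" returns a modified copy.
setClass("ModelS4", representation(theta = "numeric"))

setGeneric("updateTheta",
           function(object, value) standardGeneric("updateTheta"))
setMethod("updateTheta", "ModelS4", function(object, value) {
  object@theta <- value  # changes only the copy local to this call
  object                 # the caller has to capture the result
})

m4 <- new("ModelS4", theta = 0)
m4_new <- updateTheta(m4, 1)  # m4 is untouched; m4_new holds the update

# Hypothetical reference class: a method changes the object in place.
ModelRC <- setRefClass("ModelRC",
  fields  = list(theta = "numeric"),
  methods = list(
    setTheta = function(value) {
      theta <<- value    # assigns to the field of this object
      invisible(.self)
    }
  )
)

mr <- ModelRC$new(theta = 0)
mr$setTheta(1)           # no reassignment of mr is needed

alias <- mr              # a second name for the same object, not a copy
alias$setTheta(2)        # the change is also visible through mr
```

After running this, m4@theta is still 0 and m4_new@theta is 1, while
mr$theta is 2 because mr and alias refer to the same object. An
independent duplicate of a reference object has to be requested
explicitly with mr$copy().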
I found that my code started to look pretty ugly because conceptually
I was updating in place but the code had to be written as
replacements.

Having said all that, you should realize that the style of programming
favored in R, and particularly in R packages, is to regard a method as
determined jointly by the generic function and the class(es) of the
argument(s). This is different from most other object-oriented
languages, in which the class is paramount and a method is just a
member of a class that happens to be code, not data. You can get a lot
of mileage out of the idiom of defining methods for common generics
(print, plot, summary, ...) for particular S3 or S4 classes. The
structure of R packages favors S3 generics, but you can define a
method for an S3 generic applied to an object from an S4 class. The
only restriction is that S3 generics can dispatch only on the first
argument, but that is what happens anyway in a language where the
methods are part of the class definitions. When you need multiple
dispatch, S4 generics and methods are worth the pain.

So my current approach is to use S4 classes for objects that are in
some way static but to use reference classes for objects that will
need to be updated when performing some kind of estimation (or other
such operations, such as Markov chain Monte Carlo).
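The two dispatch idioms mentioned above can be sketched as follows.
The class Fit, its slot coef, and the generic describePair are
hypothetical names used only for illustration. The sketch assumes the
S3 method is defined at top level in a script; in a package it would
also need to be registered in the NAMESPACE.

```r
library(methods)

# Hypothetical S4 class used only for illustration.
setClass("Fit", representation(coef = "numeric"))
fit <- new("Fit", coef = c(1.5, -0.3))

# An S3 method for the S3 generic summary(), applied to an S4 object.
# Dispatch looks only at the class of the first argument.
summary.Fit <- function(object, ...) {
  sprintf("Fit with %d coefficients", length(object@coef))
}
s <- summary(fit)

# An S4 generic whose methods are chosen by the classes of both arguments.
setGeneric("describePair",
           function(x, y) standardGeneric("describePair"))
setMethod("describePair", signature("Fit", "Fit"),
          function(x, y) "two fits")
setMethod("describePair", signature("Fit", "numeric"),
          function(x, y) "a fit and a numeric vector")

d1 <- describePair(fit, fit)
d2 <- describePair(fit, 1)
```

Here s is "Fit with 2 coefficients", d1 is "two fits", and d2 is
"a fit and a numeric vector": the same first argument selects a
different method depending on the class of the second.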