Thanks to you both. I think you’re saying/implying that once I “test drive” a particular bit of cleaning I should capture it in a function which does it reproducibly against the raw data, and that becomes the best documentation for it. That makes sense.
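To make that concrete, here is a rough sketch of the idea as I understand it, not anyone's actual code. The data frame, its column names (`height_cm`, `weight_kg`) and the function name `clean_measurements` are all invented for illustration.

```r
# Hypothetical raw data; the column names are invented for illustration.
raw <- data.frame(
  id        = 1:4,
  height_cm = c(170, 182, NA, 165),
  weight_kg = c(65, 90, 72, -1)   # -1 stands in for a bad entry
)

# One cleaning step captured as a function, so it can be rerun against the
# raw data at any time. The comments double as documentation of what each
# derived column means.
clean_measurements <- function(df) {
  # Treat non-positive weights as missing rather than as real values.
  df$weight_kg[!is.na(df$weight_kg) & df$weight_kg <= 0] <- NA
  # bmi: body mass index, weight (kg) divided by height (m) squared.
  df$bmi <- df$weight_kg / (df$height_cm / 100)^2
  df
}

cleaned <- clean_measurements(raw)
```

Because R copies the data frame inside the function, `raw` stays untouched, and calling `clean_measurements(raw)` again rebuilds the derived frame from scratch.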
Pito Salas
Brandeis Computer Science
Feldberg 131

> On Jun 30, 2016, at 11:44 AM, Robert Baer <rb...@atsu.edu> wrote:
>
> You might look at:
>
> http://stackoverflow.com/questions/7979609/automatic-documentation-of-datasets
>
> You might also try File | Compile Notebook from within R-Studio
> (https://www.rstudio.com/) on your well-documented R-scripts to get a nice
> reproducible recording/report of data analysis workflow. Similar
> functionality is available from basic R, but involves more work. There are
> many other approaches, but the best choice depends on your precise needs.
>
> And, as a programmer, you are probably already familiar with things like:
> https://google.github.io/styleguide/Rguide.xml
>
> On 6/30/2016 9:51 AM, Pito Salas wrote:
>> I am studying statistics and using R in doing it. I come from software
>> development where we document everything we do.
>>
>> As I “massage” my data, adding columns to a frame, computing on other data,
>> perhaps cleaning, I feel the need to document in detail what the meaning, or
>> background, or calculations, or whatever of the data is. After all it is now
>> derived from my raw data (which may have been well documented) but it is
>> “new.”
>>
>> Is this a real problem? Is there a “best practice” to address this?
>>
>> Thanks!
>>
>> Pito Salas
>> Brandeis Computer Science
>> Feldberg 131
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.