Vince Buffalo covers this nicely in his book "Bioinformatics Data
Skills". The original data should stay immutable. Vince then suggests
keeping a text file in your data directory that explains where the data
came from, which scripts you used to create each modified version, when
you did this, and so on.
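
A minimal sketch of that workflow in base R, not taken from the book.
The directory layout, file names, values and the script name
clean_measurements.R are all made up for illustration, and everything
is written under tempdir() so the example is self-contained:

```r
# Hypothetical layout: data/raw is never edited, data/derived holds script output.
root    <- file.path(tempdir(), "project")
raw_dir <- file.path(root, "data", "raw")
out_dir <- file.path(root, "data", "derived")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Stand-in for a raw file you received or downloaded (made-up values).
raw_file <- file.path(raw_dir, "measurements.csv")
if (!file.exists(raw_file)) {
  write.csv(data.frame(id = 1:3, height_cm = c(170, 182, 165)),
            raw_file, row.names = FALSE)
  # Mark the raw file read-only to discourage accidental edits.
  Sys.chmod(raw_file, mode = "0444")
}

# Derive a new column and save the result elsewhere; the raw file is untouched.
dat <- read.csv(raw_file)
dat$height_m <- dat$height_cm / 100
derived_file <- file.path(out_dir, "measurements_with_height_m.csv")
write.csv(dat, derived_file, row.names = FALSE)

# Append a provenance note to the README in the data directory:
# what was created, from what, by which script, and when.
readme <- file.path(root, "data", "README.txt")
note <- sprintf(
  "%s: %s created from %s by clean_measurements.R (height_m = height_cm / 100)",
  format(Sys.Date()), basename(derived_file), basename(raw_file)
)
cat(note, "\n", sep = "", file = readme, append = TRUE)
```

Appending rather than overwriting means the README accumulates a dated
history of every derived file.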

I find roxygen comments and knitr extremely useful for keeping track of
what I intend to do and why, because they let me export all the
reasoning, summary tables and plots to a format I can share with
collaborators who don't care about the R code that produced them.
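
A small sketch of what such a script might look like. The analysis is
invented for illustration and uses the built-in mtcars data as a
stand-in. Lines starting with #' are roxygen-style comments that knitr
treats as report text, and a line starting with #+ sets chunk options:

```r
#' # Fuel economy by cylinder count
#'
#' Goal: check whether cars with more cylinders use more fuel.
#' Data: the built-in `mtcars` data set, used here only as a stand-in.

#' ## Derived column
#' `kml` is fuel economy in km per litre, converted from `mpg`
#' (1 mile per US gallon is roughly 0.425 km per litre).
cars <- mtcars
cars$kml <- cars$mpg * 0.425144

#' ## Summary table
#' Mean fuel economy for each cylinder count.
summary_tbl <- aggregate(kml ~ cyl, data = cars, FUN = mean)
summary_tbl

#' ## Plot
#+ kml-by-cyl, fig.width = 5, fig.height = 4
boxplot(kml ~ cyl, data = cars,
        xlab = "Cylinders", ylab = "Fuel economy (km/l)")
```

Saved under a hypothetical name such as analysis.R, and assuming knitr
is installed, knitr::spin("analysis.R") should turn it into a report
in which the comments become text and the table and plot appear next
to the reasoning. The file still runs as an ordinary R script.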

HTH
Ulrik


On Thu, 30 Jun 2016 at 17:30 Pito Salas <pitosa...@brandeis.edu> wrote:

> I am studying statistics and using R in doing it. I come from software
> development where we document everything we do.
>
> As I “massage” my data, adding columns to a frame, computing on other
> data, perhaps cleaning, I feel the need to document in detail what the
> meaning, or background, or calculations, or whatever of the data is. After
> all it is now derived from my raw data (which may have been well
> documented) but it is “new.”
>
> Is this a real problem? Is there a “best practice” to address this?
>
> Thanks!
>
> Pito Salas
> Brandeis Computer Science
> Feldberg 131
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


