On 29.03.2017 17:36, Jeff Newmiller wrote:
> The relevance to R (and therefore R-help) of this question is marginal at best.
> R might not be the language of choice when you go retrieve the data.
> Also, this question seems dangerously close to a troll, because the obvious
> answer is that the data should be in an open format but if you are not
> currently working with data in an open format then you increase the cost of
> archiving and risk losing information up front by extracting it from a
> proprietary format, and balancing those concerns is more political than
> technical.
> Note that there exist open binary formats, and the goals of your archiving task
> and nature of the data would have to be considered in deciding which of the
> many to use. My own experience has been that plain text survives time best, but
> YMMV.

Well, I didn't mean to troll the list. We have a small section on R, and
in response to a question that we got from a user, we thought it would
be a good idea to check with some actual R-users.
I think the responses are pretty much in line with what we expected.
Unsurprisingly, there is no simple solution. A text format is advantageous
because of the many options a user has for working with text data. Your
point about the format of the source data is valid; it can be a clear
constraint (other constraints are, for example, of a legal nature). I'm
not trying to advocate for open formats per se, just trying to gather
information so that we can make a recommendation.
I think we need to restructure the information on our web platform to
clearly differentiate between data and the source code, scripts etc.
which are used to process the data ("algorithms").
A big problem is data that has been archived without anyone knowing
what it is or was for. Archiving, sharing, and reproducibility are
important subjects, and we are interested in the experience of
statisticians in dealing with these problems.
Thanks for the replies!
Joe
--
B 1003
Kommunikations-, Informations-, Medienzentrum (KIM)
Universitaet Konstanz
t: ++49-7531-883234
e: joe.g...@uni-konstanz.de
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.