Brandon Zicha wrote:
Hey Paul,
Hey Brandon, (adding R-help in the cc)
I agree with you that the documentation of R could be better, especially
with more examples in code showing not only the common cases, but also
more esoteric cases. It would be great if everyone invested a lot of
time to write awesome documentation, but this is not the case. I just
objected to the tone (I thought :)) I spotted. Some more comments are inline:
Accepting the main point of my post - that the often VERY incomplete
help files appended to packages can be a major stumbling block for
getting up and running in R - I take your point. I probably went a
bit too far with my language there.
I would point out though that a great many parts of research (like
writing a bibliography - or searching for citations of any kind
usually) aren't much fun, but are an important part of research
related work. Likewise, complete documentation (by which I hardly
mean a paper - looking at STATA help files as a minimum would be a
good start) is part of programming. I agree that one needs to employ
some level of judgement, otherwise you will get a help file that says
"First turn on the computer... then click the 'R' Icon...." But, I
have myself created one or two STATA functions that I have put up for
public use - so I know how not fun, but necessary complete
documentation is. Further, I didn't say that writing documentation
doesn't take time. Everything takes time. My point was that relative
to actually creating the application - writing more complete
documentation takes very little time. If one invests the time to do
the 'fun' stuff of writing a new package for R, it seems reasonable
that taking the (proportionately) little time to write a nicer help
file would be the most 'professional' thing to do. But, this could be
my illusion that all researchers see themselves as professionals -
rather than an anarchic egoistic enclave of independent
self-interested paper producers.
This is what scientists get judged upon, not on how much software they
publish and how good their documentation is. Furthermore, it is quite
hard for a hardcore R programmer to judge what people find hard about
their software.
I am notorious for assuming greater standards as an acceptable 'norm'
than my community at large :-) Furthermore, you are absolutely right
that my standards are apparently even too high for many commercial
applications! R help is sometimes downright good!
So, if I accept that I am a demanding S.O.B. and tone down my thoughts
of proper documentation and professionalism and adopt the (probably
more) reasonable perspective you do at the end of "well, this is the
world we live in... and come on it's free" I totally agree that I
probably went too far! But, better yet, I think that this observation
you make suggests a solution: Perhaps R could use a more integrated
and organized open source help system. I can think of a few
possibilities - the easiest being a wiki version of R help. This way
users could add useful information to help files - such as more
examples, tricks, tips, and known problems. This would take advantage
of the open source, free, user-community centered aspects of R, and
permit those with an interest in helping beginners to post notes for
beginners - on the help files. I know that if such a wiki existed I
would have posted the example of constrained optimization I worked out
recently. It wouldn't be too difficult to add a function
wikihelp(X) that would open the wiki help page rather than the
standard help documentation. Currently, help on any given command is
scattered across help forums all over the web. A central, indexed,
and easily referenced help system might be a solution. Heck, such a
system could go a step further and link R-help listserv archives by
command thus centralizing and integrating the open-source user-built
information resource of the listserv into help(). How many e-mails to
this listserv begin with 'I just spent a few hours cruising the help
forums related to 'X' and couldn't find an answer'?
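For illustration, a wikihelp(X) of the kind proposed above might look
roughly like the sketch below. Everything specific in it is an
assumption: the base URL is a made-up placeholder rather than an
existing service, the "one page per topic" layout is invented, and
wikihelp_url() is a hypothetical helper. Only URLencode() and
browseURL() from the utils package are real functions.

```r
# Hypothetical sketch only: the base URL and the one-page-per-topic
# layout are assumptions, not an existing R wiki service.
wikihelp_url <- function(topic, base = "https://example.org/r-wiki/") {
  # Encode the topic so that names such as "%in%" give a valid URL.
  paste0(base, utils::URLencode(as.character(topic), reserved = TRUE))
}

wikihelp <- function(topic, base = "https://example.org/r-wiki/") {
  url <- wikihelp_url(topic, base)
  # Only open a browser in an interactive session; scripts just get the URL.
  if (interactive()) utils::browseURL(url)
  invisible(url)
}
```

Calling wikihelp("optim") would then open the (assumed) wiki page for
optim in the default browser, next to the standard help("optim").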
Sounds like a good addition, allowing people to add to the documentation
as they see fit. There is of course the R wiki, but it is not widely
used and not firmly embedded into R itself. But how would we keep the
system you propose manageable and prevent it from becoming an enormous
mess? Maybe some kind of moderation?
I note that STATA has all their help files for the latest version of
STATA available on the web (http://www.stata.com/help.cgi?contents).
How difficult would a similar system - only with R, editable and with
links to supplementary information - be to set up? I can't imagine it
would be horribly expensive in terms of set up costs.
A problem is that there is no company that markets R that could set this
up; the community is much looser, much more open source. Probably the R
core team would be the closest thing we have.
What do you think?
Best,
Brandon Z
On Mar 2, 2010, at 1:16 PM, Paul Hiemstra wrote:
Brandon Zicha wrote:
What were your biggest misconceptions or
stumbling blocks to getting up and running
with R?
Easy. In terms of materials I have been unable to find good books
that introduce users to R from the perspective of someone familiar
only with packages like SPSS or STATA, or not familiar with
statistics packages at all. Even introduction texts use jargon
without introducing it.
I think that R-help files should be more thorough than they are, and
contain more examples. I thought that STATA help files were
sparse! The notion that 'R is a user community and thus they do
this in their spare time' is no excuse for those creating new tools
for R not developing complete help files. It doesn't take that much
time relative to actually creating the new function.
Hi Brandon,
I would disagree with your point that documentation doesn't take much
time. Writing documentation that is suitable for both the advanced
user (being a reference, and thus preferably short) and the beginning
user (being sort of a tutorial, and thus preferably longer) is
quite a challenge, comparable to writing a good paper. Apart from the
fact that it takes quite a while, it is also not much fun. Often
people develop packages for their own research and put the software
online so others can benefit; they don't need the documentation
themselves and don't get paid to write the documentation.
So saying 'it's no excuse' really goes too far in my view. R is free,
you did not pay several thousands of euros giving you the right for
good support. Even the support is free through the mailing list. You
can get a paid version of R at Revolution Computing. Then you can
call them if there are problems. I'm not meaning to offend anybody,
but I didn't agree with "is no excuse for those creating new tools
for R not developing complete help files". Partly the strength of R
is in the open source, but sometimes, as with documentation, this can
bite you. But I think the R docs aren't that bad, I've seen
proprietary software that does a worse job than R.
my 2euro on the subject :),
Cheers,
Paul
In terms of actual R use - creating, using, and manipulating data
are the biggest frustration for those of the 'spreadsheet
generation'. I get the impression that one needs to not merely
understand, but be fully fluent in the jargon of matrix mathematics
to even know what is going on half the time. I find myself - even
now - using 'rules of thumb' that 'seemed to work' rather than fully
understanding what I am doing. It is particularly discouraging when
many of those 'intro books' suggest using something besides R for
data manipulation - how clumsy is that!?
I find the actual programming syntax itself is the easiest part to
master. It is certainly more flexible - but without a particularly
large increase in complexity - than trying to write scripts in
SPSS and STATA.
Brandon Zicha
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul