On Mar 2, 2010, at 8:01 AM, Paul Hiemstra wrote:
Brandon Zicha wrote:
Hey Paul,
Hey Brandon, (adding R-help in the cc)
I agree with you that the documentation of R could be better,
especially with more examples in code showing not only the common
cases, but also more esoteric cases. It would be great if everyone
invested a lot of time to write awesome documentation, but this is
not the case. I just objected to the tone (I thought :)) I spotted.
Some more comments are inline:
Accepting the main point of my post - that the often VERY
incomplete help files appended to packages can be a major stumbling
block for getting up and running in R - I take your point. I
probably went a bit too far with my language there.
I would point out though that a great many parts of research (like
writing a bibliography - or searching for citations of any kind
usually) aren't much fun, but are an important part of research
related work. Likewise, complete documentation (by which I hardly
mean a paper - looking at STATA help files as a minimum would be a
good start) is part of programming. I agree that one needs to
employ some level of judgement, otherwise you will get a help file
that says "First turn on the computer... then click the 'R'
Icon...." But, I have myself created one or two STATA functions
that I have put up for public use - so I know how not fun, but
necessary complete documentation is. Further, I didn't say that
writing documentation doesn't take time. Everything takes time. My
point was that relative to actually creating the application -
writing more complete documentation takes very little time. If one
invests the time to do the 'fun' stuff of writing a new package for
R, it seems reasonable that taking the (proportionately) little
time to write a nicer help file would be the most 'professional'
thing to do. But, this could be my illusion that all researchers
see themselves as professionals - rather than an anarchic egoistic
enclave of independent self-interested paper producers.
This is what scientists get judged upon, not on how much software
they publish and how good their documentation is. Furthermore, it is
quite hard for a hardcore R programmer to judge what people find hard
about their software.
I am notorious for assuming greater standards as an acceptable
'norm' than my community at large :-) Furthermore, you are
absolutely right that my standards are apparently even too high for
many commercial applications! R help is sometimes downright good!
So, if I accept that I am a demanding S.O.B. and tone down my
thoughts of proper documentation and professionalism and adopt the
(probably more) reasonable perspective you do at the end of "well,
this is the world we live in... and come on it's free" I totally
agree that I probably went too far! But, better yet, I think that
this observation you make suggests a solution: Perhaps R could use
a more integrated and organized open source help system. I can
think of a few possibilities - the easiest being a wiki version of
R help. This way users could add useful information to help files
- such as more examples, tricks, tips, and known problems. This
would take advantage of the open source, free, user-community
centered aspects of R, and permit those with an interest in helping
beginners to post notes for beginners - on the help files. I know
that if such a wiki existed I would have posted the example of
constrained optimization I did recently. It wouldn't be too
difficult to add a function wikihelp(X) that would open the wiki
help page rather than the standard help documentation. Currently,
help on any given command is scattered across help fora all over
the web. A central, indexed, and easily referenced help system
might be a solution. Heck, such a system could go a step further
and link R-help listserv archives by command thus centralizing and
integrating the open-source user-built information resource of the
listserv into help(). How many e-mails to this listserv begin with
'I just spent a few hours cruising the help forums related to 'X'
and couldn't find an answer'?
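
A minimal sketch of what the wikihelp(X) idea above might look like, purely
as an illustration. The base URL is a placeholder (no such wiki is assumed to
exist), and the names wikihelp_url, wiki_base, and the open argument are
hypothetical. Only utils::URLencode and utils::browseURL are existing R
functions.

```r
# Hypothetical sketch: the wiki address below is a placeholder, not a real service.
wiki_base <- "https://wiki.example.org/rhelp/"

# Build the wiki page address for a help topic given as a single string.
wikihelp_url <- function(topic, base = wiki_base) {
  stopifnot(is.character(topic), length(topic) == 1L)
  # Percent-encode the topic so names with spaces or symbols form a valid URL.
  paste0(base, utils::URLencode(topic, reserved = TRUE))
}

# Open the wiki page in the browser (interactive sessions only by default)
# and return the URL invisibly so it can be reused or checked.
wikihelp <- function(topic, base = wiki_base, open = interactive()) {
  url <- wikihelp_url(topic, base)
  if (open) utils::browseURL(url)
  invisible(url)
}
```

With something like this, wikihelp("optim") would sit alongside help("optim"):
the standard help page stays as the reference, and the wiki page would carry
the user-contributed examples, tips, and known problems.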
Sounds like a good addition, allowing people to add to the
documentation as they see fit. There is of course the R wiki, but
this is not widely used and not firmly embedded into R itself. But
how would we keep the system you propose manageable and prevent
it from becoming an enormous mess? Maybe some kind of moderation?
I note that STATA has all their help files for the latest version
of stata available on the web (http://www.stata.com/help.cgi?contents).
How difficult would a similar system - only with R, editable and
with links to supplementary information - be to set up?
I cannot comment on how difficult it would be to set up, but I must
disagree that it does not exist for R. The default for RSiteSearch is
Jon Baron's search utility. It appears to have been relatively
recently modified so that it searches functions and not r-help but in
that form it addresses your expectations quite well. I suspect that
the cognoscenti could offer other search strategies that would be
equally effective.
http://finzi.psych.upenn.edu/search.html
(I will also comment that when I was referred to the Stata
documentation, my small taste left me thinking R help files were far
superior to what I found for Stata.)
--
David.
I can't imagine it would be horribly expensive in terms of set up
costs.
A problem is that there is no company that markets R that could set
this up, the community is much looser, much more open source.
Probably the R core team would be the closest thing we have.
What do you think?
Best,
Brandon Z
On Mar 2, 2010, at 1:16 PM, Paul Hiemstra wrote:
Brandon Zicha wrote:
What were your biggest misconceptions or
stumbling blocks to getting up and running
with R?
Easy. In terms of materials I have been unable to find good books
that introduce users to R from the perspective of someone
familiar only with packages like SPSS or STATA, or not familiar
with statistics packages at all. Even introduction texts use
jargon without introducing it.
I think that R-help files should be more thorough than they are,
and contain more examples. I thought that STATA help files were
sparse! The notion that 'R is a user community and thus they do
this in their spare time' is no excuse for those creating new
tools for R not developing complete help files. It doesn't take
that much time relative to actually creating the new function.
Hi Brandon,
I would disagree with your point that documentation doesn't take
much time. Writing documentation that is suitable for both the
advanced user (being a reference, and thus preferably short) and
the beginning user (being sort of a tutorial, and thus
preferably longer) is quite a challenge, comparable to writing a
good paper. Apart from the fact that it takes quite a while, it is
also not much fun. Often people develop packages for their own
research and put the software online so others can benefit, they
don't need the documentation themselves and don't get paid to
write the documentation.
So saying 'it's no excuse' really goes too far in my view. R is
free, you did not pay several thousands of euros giving you the
right to good support. Even the support is free through the
mailing list. You can get a paid version of R at Revolution
Computing. Then you can call them if there are problems. I'm not
meaning to offend anybody, but I didn't agree with "is no excuse
for those creating new tools for R not developing complete help
files". Partly the strength of R is in the open source, but
sometimes, as with documentation, this can bite you. But I think
the R docs aren't that bad, I've seen proprietary software that does a
worse job than R.
my 2euro on the subject :),
Cheers,
Paul
In terms of actual R use - creating, using, and manipulating data
are the biggest frustration for those of the 'spreadsheet
generation'. I get the impression that one needs to not merely
understand, but be fully fluent in the jargon of matrix
mathematics to even know what is going on half the time. I find
myself - even now - using 'rules of thumb' that 'seemed to work'
rather than fully understanding what I am doing. It is
particularly discouraging when many of those 'intro books'
suggest using something besides R for data manipulation - how
clumsy is that!?
I find the actual programming syntax itself is the easiest part
to master. It is certainly more flexible - but without a
particularly significant increase in complexity - than trying to
write script in SPSS and STATA.
Brandon Zicha
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Drs. Paul Hiemstra