Gabriel,

I can see some usefulness along the lines you are describing (like
automatic checking for typos), but in my opinion it might come at a
substantial mental cost for the vignette writer.  To some extent typos are
already caught in runnable code chunks, since that code actually gets run.

Whether it would be worthwhile to use the system you are describing
depends on how easily and how well it can be incorporated into our
existing usage.  That may be very easy or very hard, depending on the
current state of the code.  It is unclear to me whether it is still under
development or whether it is a mature, deployable system.

Do you have an actual example of a Sweave (or other format) document and
of how this tool works in practice?  An example would make it much easier
to weigh the pros and cons.

Best,
Kasper


On Tue, Jul 23, 2013 at 12:16 PM, Gabriel Becker <gmbec...@ucdavis.edu> wrote:

> Hey all,
>
> Kasper and Martin were discussing the various LaTeX macros for use in
> vignettes and whether it is valuable to have many specific macros or a
> couple (1 or 2) quite general ones.
>
> I'd like to offer another perspective on this. Markup (which is what the
> macros are) is informative, with specific markup being more informative
> than generic markup.
>
> The markup can be informative to the end user if the different types of
> markup are rendered differently, but, as Kasper pointed out, this
> information is almost always available from context. That isn't the only
> application, however.
>
> The markup can also be informative when *programmatically processing* the
> document. This means that we can write R scripts that look at documents
> with this type of markup and know when we mention R functions, packages,
> or classes.
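>
> As a rough illustration (purely a sketch, assuming the vignette source
> uses macros such as \Rfunction{} and \Rpackage{}), something along these
> lines could recover the marked-up names from an .Rnw file:
>
>     extract_macro_args <- function(file, macro = "Rfunction") {
>       lines <- readLines(file)
>       ## match e.g. \Rfunction{foo} and capture the argument
>       pattern <- sprintf("\\\\%s\\{([^}]+)\\}", macro)
>       hits <- unlist(regmatches(lines, gregexpr(pattern, lines)))
>       ## keep only the text inside the braces
>       unique(gsub(pattern, "\\1", hits))
>     }
>
>     ## e.g. extract_macro_args("MyVignette.Rnw", macro = "Rpackage")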
>
> This means that authors can write and run unit tests on, and find problems
> with, *the documents themselves*. My advisor, Duncan Temple Lang, has done
> quite a bit of work on this subject in the context of a different format,
> and it has proven to be a powerful tool when authoring documents that
> discuss code.
>
> What I mean by unit tests on documents is that we can write functions which
> detect errors in the document itself (not just the code). This includes
> functions which:
>
>    - Check for R functions or S4 classes the document mentions but which
>      are not exported by any of the packages cited by/mentioned in the
>      document (find missing R package citations; a rough sketch of this
>      check follows the list)
>    - Make a table of all function/class/package names mentioned in the
>      document (easily identify typos)
>    - Check for packages which are mentioned by name or loaded by the code
>      but not cited (find missing citations)
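>
> For example, a very rough sketch of the first check (hypothetical inputs,
> assuming the function and package names have already been pulled out of
> the markup, e.g. as sketched above):
>
>     ## mentioned_funs, mentioned_pkgs: character vectors recovered from
>     ## \Rfunction{} and \Rpackage{} markup in the document
>     find_unresolved_functions <- function(mentioned_funs, mentioned_pkgs) {
>       exported <- unlist(lapply(mentioned_pkgs, function(p)
>         tryCatch(getNamespaceExports(p), error = function(e) character())))
>       ## functions the text mentions that no cited package exports
>       setdiff(mentioned_funs, exported)
>     }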
>
> This concept becomes even more powerful when combined with static analysis
> tools (e.g., the CodeDepends and codetools packages) that look at the code
> in the document as well. In this case we can actually compare the entities
> the code uses to the entities the text talks about.
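>
> Once the code chunks have been extracted (e.g. with Stangle()), a very
> rough base-R comparison, reusing the hypothetical mentioned_funs vector
> from above, might look like this (CodeDepends can do the code side of
> this far more carefully):
>
>     ## code_text: the tangled R code as a character vector
>     compare_code_and_text <- function(code_text, mentioned_funs) {
>       exprs <- parse(text = code_text)
>       ## names used but not as plain variables: roughly, the functions called
>       called <- setdiff(all.names(exprs), all.vars(exprs))
>       list(in_code_not_in_text = setdiff(called, mentioned_funs),
>            in_text_not_in_code = setdiff(mentioned_funs, called))
>     }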
>
> None of these functions will ensure that there are no mistakes, of course,
> but they can check for specific types of mistakes in a manner that is much
> easier and more effective than trying to do so by hand, especially in the
> case of longer documents.
>
> So my vote is to keep the numerous, specific macros.
>
> As a side note, this means that, if desired, a journal like JSS or an
> organization like Bioconductor could actually detect violations of package
> citation rules and enforce them.
>
> Thanks for reading,
> ~G
>
> --
> Gabriel Becker
> Graduate Student
> Statistics Department
> University of California, Davis
>

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
