[Bioc-devel] A case for specific markup (was BiocStyle for styling ...)

Gabriel Becker Tue, 23 Jul 2013 12:18:31 -0700

Hey all,

Kasper and Martin were discussing the various LaTeX macros for use in
vignettes and whether it is valuable to have many specific macros or a
couple (1 or 2) quite general ones.


I'd like to offer another perspective on this. Markup (which is what the
macros are) is informative, with specific markup being more informative
than generic markup.

The markup can be informative to the end user if the different types of
markup are rendered differently, but as Kasper pointed out this information
is almost always available from context. That isn't the only application,
however.

The markup can also be informative when *programmatically processing* the
document. This means that we can write R scripts which can look at
documents with this type of markup and know when we mention R functions, or
packages, or classes.
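
To make that concrete, here is a minimal sketch of the extraction step. It
is written in Python only to keep it short and self-contained (the same
idea is a few lines of R). It assumes the document marks entities with the
\Rfunction{}, \Rclass{} and \Rpackage{} macros; the helper name
find_mentions and the package, class and function names in the sample text
are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: entities are marked with \Rfunction{}, \Rclass{} or \Rpackage{}.
MACRO_RE = re.compile(r"\\(Rfunction|Rclass|Rpackage)\{([^{}]*)\}")


def find_mentions(tex):
    """Return {macro name: sorted list of distinct arguments} for a LaTeX string."""
    found = defaultdict(set)
    for macro, arg in MACRO_RE.findall(tex):
        found[macro].add(arg.strip())
    return {macro: sorted(args) for macro, args in found.items()}


# Hypothetical sample text; pkgA, ModelFit, fitModel and plotFit are invented.
sample = r"""
We call \Rfunction{fitModel} from \Rpackage{pkgA} to obtain a
\Rclass{ModelFit} object, and then pass it to \Rfunction{plotFit}.
"""
mentions = find_mentions(sample)
```

With a single generic macro, the same scan could still find the marked-up
names, but it could no longer tell a function from a package or a class.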

This means that authors can write and run *unit tests* on, and find problems
with, *the documents themselves*. My advisor, Duncan Temple Lang, has done
quite a bit of work on this subject in the context of a different format,
and it has proven to be a powerful tool when authoring documents which
discuss code.

What I mean by unit tests on documents is that we can write functions which
detect errors in the document itself (not just the code). This includes
functions which:

   - Check for R functions or S4 classes the document mentions, but which
   are not exported by any of the packages cited by/mentioned in the document
   (find missing R package citations)
   - Make a table of all function/class/package names mentioned in the
   document (easily identify typos)
   - Check for packages which are mentioned by name or loaded by the code
   but not cited (find missing citations)
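
As a rough illustration of the first two checks, the sketch below counts
every marked-up name and then lists functions that none of the mentioned
packages export. It is Python for brevity and only a sketch under
assumptions: the macro names are \Rfunction{}, \Rclass{} and \Rpackage{},
the export table is supplied by hand (in R it could be built with
getNamespaceExports()), and pkgA and its functions are invented.

```python
import re
from collections import Counter

# Assumption: entities are marked with \Rfunction{}, \Rclass{} or \Rpackage{}.
MACRO_RE = re.compile(r"\\(Rfunction|Rclass|Rpackage)\{([^{}]*)\}")


def mention_table(tex):
    """Count each (macro, name) pair; names seen only once deserve a second look."""
    return Counter(MACRO_RE.findall(tex))


def unexported_functions(tex, exports):
    """Functions the text mentions that no mentioned package exports.

    `exports` maps a package name to the set of names it exports. It is
    hand-written toy data here, not read from any real package.
    """
    table = mention_table(tex)
    packages = {name for (macro, name) in table if macro == "Rpackage"}
    known = set()
    for pkg in packages:
        known |= exports.get(pkg, set())
    functions = {name for (macro, name) in table if macro == "Rfunction"}
    return sorted(functions - known)


# Hypothetical document: fitModle is a typo, plotFit comes from an unmentioned package.
doc = r"""
\Rpackage{pkgA} provides \Rfunction{fitModel}. We call \Rfunction{fitModel}
twice, then \Rfunction{fitModle} and finally \Rfunction{plotFit}.
"""
exports = {"pkgA": {"fitModel", "summarizeFit"}}
table = mention_table(doc)
missing = unexported_functions(doc, exports)
```

Here `missing` flags both the misspelled name and the function whose
package the text never mentions, which are exactly the two kinds of
problem the first bullet is after.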

This concept becomes even more powerful when combined with static analysis
tools (e.g., the CodeDepends and codetools packages) which look at the code in
the document as well. In this case we can actually compare the entities the
code is using to the entities the text is talking about.
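
A crude sketch of that comparison, again in Python for brevity: it pulls
the code chunks out of a Sweave-style document, looks for library() and
require() calls, and reports packages the code loads but the text never
marks up. The regular expression is only a stand-in for real static
analysis of the kind CodeDepends or codetools would do, and the chunk
syntax (<<...>>= up to @), the \Rpackage{} macro and the names pkgA and
pkgB are assumptions for illustration.

```python
import re

# Assumption: Sweave-style chunks, "<<options>>=" on its own line up to a line starting with "@".
CHUNK_RE = re.compile(r"^<<[^\n]*>>=\n(.*?)^@", re.S | re.M)
# Crude stand-in for static analysis: only sees literal library()/require() calls.
LOAD_RE = re.compile(r"""\b(?:library|require)\(\s*["']?([A-Za-z][A-Za-z0-9.]*)""")
PKG_RE = re.compile(r"\\Rpackage\{([^{}]*)\}")


def packages_loaded(rnw):
    """Package names loaded by the code chunks."""
    code = "\n".join(CHUNK_RE.findall(rnw))
    return set(LOAD_RE.findall(code))


def packages_mentioned(rnw):
    """Package names marked up in the text outside the code chunks."""
    text = CHUNK_RE.sub("", rnw)
    return set(PKG_RE.findall(text))


def loaded_but_not_mentioned(rnw):
    """Packages the code uses that the text never talks about."""
    return sorted(packages_loaded(rnw) - packages_mentioned(rnw))


# Hypothetical vignette source; pkgA and pkgB are invented names.
rnw = r"""
We rely on \Rpackage{pkgA} throughout.

<<setup>>=
library(pkgA)
library("pkgB")
x <- 1
@

Some more text.
"""
unmentioned = loaded_but_not_mentioned(rnw)
```

In this toy document `unmentioned` contains only pkgB, the package that is
loaded in the chunk but never discussed in the text.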

None of these functions will ensure that there are no mistakes, of course,
but they can check for specific types of mistakes in a manner that is much
easier and more effective than trying to do so by hand, especially in the
case of longer documents.

So my vote is to keep the numerous, specific macros.

As a side note, this means that if desired, there are ways a journal like
JSS or an organization like Bioconductor could actually enforce package
citation rules, or at least detect violations of them.

Thanks for reading,
~G

-- 
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel