On Sun, 13 Feb 2011, Yihui Xie wrote:

Regarding the reasons that make the doc directory large, I wonder if
we can make some changes in R:

'we' cannot: only core developers can. However, end users can contribute in many other ways: see below.

1. Use a null graphics device as the default device rather than pdf()
when running Sweave -- this can avoid the useless Rplots.pdf:

options(device = function(...) {
   .Call("R_GD_nullDevice", PACKAGE = "grDevices")
})

This can save some time in building the vignette(s) as well. (see
http://yihui.name/en/?p=673)

However, this undocumented null device may not work for certain
graphics. Here is an example where it fails for ggplot2:
http://stackoverflow.com/questions/4692974/ggplot2-code-that-works-interactively-rkward-crashes-under-lyx-pgfsweave-hint/4707745#4707745

Is it possible for someone to look into the null device (Dr Murrell?)
to make it stable enough?

I don't see a bug report on that, and a patch would help expedite this.

2. Compress the PDF graphics and vignettes using third-party tools,
among which I recommend qpdf (it's free).

qpdf --stream-data=compress input.pdf output.pdf

This can reduce the size of PDF files a lot without quality loss. I'm
using this tool in the animation package to reduce the size of PDF
animations.

*Can*, but I did say

  'There are several ways to reduce the sizes of PDFs with no loss in
   quality, e.g. Adobe Acrobat Standard/Pro.'

and qpdf is often ineffective (or worse), e.g. on package mokken. The problem is that many of the large packages need images re-saved in some other format (or preferably re-generated in some other format).

I've added a --compact-vignettes option to R CMD build (in R-devel). At present it uses qpdf, but I will look out for better/additional options. (I use Acrobat 9 Pro on my Mac and that has always beaten qpdf, often by a large margin. But qpdf is perhaps the most readily available of these tools.)
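Since a compression tool can leave a PDF no smaller (or even larger), one cautious approach is to write the compressed copy to a temporary file and keep it only if it really is smaller. The following is a minimal shell sketch of that idea. The helper name keep_smaller and the file names are hypothetical, and it is not how --compact-vignettes is implemented.

```shell
# keep_smaller ORIGINAL CANDIDATE
# Hypothetical helper (not part of R or qpdf): replace ORIGINAL with
# CANDIDATE only if CANDIDATE is strictly smaller, else discard CANDIDATE.
keep_smaller() {
    orig=$1
    cand=$2
    # $(( ... )) strips any padding that wc may add around the byte count.
    orig_size=$(( $(wc -c < "$orig") ))
    cand_size=$(( $(wc -c < "$cand") ))
    if [ "$cand_size" -lt "$orig_size" ]; then
        mv "$cand" "$orig"
    else
        rm -f "$cand"
    fi
}

# Example use with the qpdf invocation quoted above (requires qpdf;
# the file names are illustrative):
#   qpdf --stream-data=compress vignette.pdf vignette.tmp.pdf &&
#       keep_smaller vignette.pdf vignette.tmp.pdf
```

The same wrapper would work with any other tool that writes its result to a separate output file.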

3. Sorry to bring up this issue again, but I don't understand why
Sweave could not implement the png() device along with pdf() and
postscript(). I'm willing to provide a patch if needed.

Does it need changes to R? I believe that it just needs a different driver, something which could be provided in a package.

This has been raised several times (including recently) with the Sweave maintainer, so maybe it will happen eventually. But a package would retrofit it to earlier versions of R.



Thanks!

Regards,
Yihui
--
Yihui Xie <xieyi...@gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Sun, Feb 13, 2011 at 6:30 AM, Prof Brian Ripley
<rip...@stats.ox.ac.uk> wrote:
Robin Hankin's post reminded me to post about the following recent addition
to 'Writing R Extensions', in the section on 'Submitting a package to CRAN'

 Ensure that the package sources are not unnecessarily large. ...
 As a general rule, doc directories should not exceed 5Mb, and
 where data directories need to be 10Mb or more, consideration should
 be given to a separate package containing just the data. (Similarly
 for external data directories, large jar files and other libraries
 that need to be installed.)
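
A rough way to check a directory against the thresholds quoted above before submitting is sketched below in shell. The function names and the example path are hypothetical, and du reports disk usage, which can differ slightly from the sizes R CMD check computes.

```shell
# dir_size_kb DIR: print the disk usage of DIR in kilobytes.
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# exceeds_limit DIR LIMIT_KB: succeed if DIR uses more than LIMIT_KB.
# (Hypothetical helper for illustration only.)
exceeds_limit() {
    [ "$(dir_size_kb "$1")" -gt "$2" ]
}

# Example: flag a doc directory above 5Mb (5 * 1024 kilobytes).
# The path mypkg/inst/doc is illustrative.
#   exceeds_limit mypkg/inst/doc 5120 && echo "doc directory is over 5Mb"
```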

With 2800 packages on CRAN, overall size is becoming a concern and currently
to install all of CRAN takes 4Gb.  As the attached (I hope) graph shows, the
20 packages over 20Mb take a quarter, and those over 5Mb take half.  (And
this is after we have removed 100Mb from the largest installed package by
re-compression, and archived the second largest, so Robin's package is
currently the largest.)  Some of the largest packages are data/jar packages,
but there are 55 packages with 'doc' directories over 5Mb.  To put that in
perspective, PDFs of whole books with lots of figures (MASS, Paul's R
Graphics) are well under 5Mb.

R CMD check in R-devel reports on large packages, and you should expect that in
future submitted package sizes will be questioned more often.

There are lots of different reasons why doc directories are large, but the
major ones are

- installing files that are unneeded, such as Rplots.pdf and .eps
 figures.
- using PDF figures of images where PNG would be more appropriate.
- including less than relevant material (such as how to install R,
 with screenshots!)

There are several ways to reduce the sizes of PDFs with no loss in quality,
e.g. Adobe Acrobat Standard/Pro.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
