Hi Ian,

I've spoken with Stefan Theussl (CRAN maintainer) about this, and he's concerned about the privacy implications of making the Apache access logs public. A compromise that he mentioned was to have a script run on each CRAN mirror that processes the log files and outputs summary statistics. A central process could then aggregate these and produce a single overall summary.
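For what it's worth, here is a minimal sketch of what such a per-mirror script and central aggregation step might look like. It is an illustration only, not anything that exists on CRAN: it assumes the mirrors write Apache's common or combined log format and use the usual src/contrib and bin/<os>/.../contrib directory layout, and the function names summarise and aggregate are hypothetical. Only (date, package, platform) counts are emitted, so no IP addresses leave the mirror.

```python
import re
from collections import Counter
from datetime import datetime

# Apache common/combined log format (assumed):
#   host ident user [time] "METHOD path PROTOCOL" status size ...
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

# Assumed mirror layout: source tarballs in src/contrib, binaries under bin/<os>/.
PACKAGE_PATH = re.compile(
    r'/(?P<area>src/contrib'
    r'|bin/windows/contrib/[^/]+'
    r'|bin/macosx/(?:[^/]+/)*contrib/[^/]+)'
    r'/(?P<pkg>[A-Za-z][A-Za-z0-9.]*)_[^/]+\.(?:tar\.gz|zip|tgz)$'
)


def summarise(lines):
    """Count successful package downloads per (date, package, platform).

    The client address is never stored, so the result is safe to publish.
    """
    counts = Counter()
    for line in lines:
        entry = LOG_LINE.match(line)
        if not entry:
            continue
        if entry.group("method") != "GET" or entry.group("status") != "200":
            continue
        hit = PACKAGE_PATH.search(entry.group("path"))
        if not hit:
            continue
        area = hit.group("area")
        platform = "source" if area.startswith("src") else area.split("/")[1]
        # The date is taken in the mirror's local time zone, e.g. 22/Nov/2009.
        day = datetime.strptime(entry.group("time").split(":")[0], "%d/%b/%Y")
        counts[(day.date().isoformat(), hit.group("pkg"), platform)] += 1
    return counts


def aggregate(summaries):
    """Central step: add up the per-mirror summaries into one overall table."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total
```

Each mirror would run summarise over its own log (for example, summarise(open(path)) with a path of its choosing) and ship only the resulting counts; the central process then calls aggregate on the collected summaries.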
A few comments on your current site:

* Are you just including packages downloaded interactively from within R?

* I don't think the continent from which the package was downloaded is of much interest. There's definitely no need to include it on the main page.

* I'd be far more interested in changes over time. Sparklines of the last month's worth of data would be a neat addition to the main page.

* More vertical whitespace or subtle zebra striping would make it much easier to read across rows.

* I'm also not sure about displaying the number of unique IPs. R is used a lot in the university setting and until IPv6 comes along, many university downloads will appear to be coming from a single IP address.

* It's not very useful to sort by % Windows because the variance increases as the sample size decreases, so the packages with the highest and lowest % Windows are just the packages that aren't downloaded very often. Maybe a shrunken estimate?

* Have you thought at all about how to take package dependencies into account?

Hadley

On Sun, Nov 22, 2009 at 6:18 PM, Fellows, Ian <ifell...@ucsd.edu> wrote:
> Hi All,
>
> It seems that the question of how many people use (or download) R, and its
> packages is one that comes up on a fairly regular basis in a variety of
> forums (There was also a recent thread on the subject on Stack Overflow). A
> couple of students at UCLA (including myself), wanted to address the issue,
> so we set up a system to get and parse the cran.stat.ucla.edu APACHE logs
> every night, and display some basic statistics. Right now, we have a working
> sketch of a site based on one week of observations.
>
> http://neolab.stat.ucla.edu/cranstats/
>
> We would very much like to incorporate data from all CRAN mirrors, including
> cran.r-project.org. We would also like to set this up in a way that is
> minimally invasive for the site administrators. Internally, our administrator
> has set up a protected directory with the last couple days of cran activity.
> We then pull that down using curl.
>
> What would be the best and easiest way for the CRAN mirrors to share their
> data? Is the contact information for the administrators available anywhere?
>
> Thank you,
> Ian Fellows
>
> ________________________________________
> From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] On Behalf
> Of Steven McKinney [smckin...@bccrc.ca]
> Sent: Thursday, November 19, 2009 2:21 PM
> To: Kevin R. Coombes; r-devel@r-project.org
> Subject: Re: [Rd] R Usage Statistics
>
> Hi Kevin,
>
> What a surprising comment from a reviewer for BMC Bioinformatics.
>
> I just did a PubMed search for "limma" and "aroma.affymetrix",
> just two methods for which I use R software regularly.
> "limma" yields 28 hits, several of which are published
> in BMC Bioinformatics. Bengtsson's aroma.affymetrix paper
> "Estimation and assessment of raw copy numbers at the single locus level."
> is already cited by 6 others.
>
> It almost seems too easy to work up lists of usage of R packages.
>
> Spotfire is an application built around S-Plus that has widespread use
> in the biopharmaceutical industry at a minimum. Vivek Ranadive's
> TIBCO company just purchased Insightful, the S-Plus company.
> (They bought Spotfire previously.)
> Mr. Ranadive does not spend money on environments that are
> not appropriate for deploying applications.
>
> You could easily cull a list of corporation names from the
> various R email listservs as well.
>
> Press back with the reviewer. Reviewers can learn new things
> and will respond to arguments with good evidence behind them.
> Good luck!
>
> Steven McKinney
>
> ________________________________________
> From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] On Behalf
> Of Kevin R.
Coombes [krcoom...@mdacc.tmc.edu]
> Sent: November 19, 2009 10:47 AM
> To: r-devel@r-project.org
> Subject: [Rd] R Usage Statistics
>
> Hi,
>
> I got the following comment from the reviewer of a paper (describing an
> algorithm implemented in R) that I submitted to BMC Bioinformatics:
>
> "Finally, which useful for exploratory work and some prototyping,
> neither R nor S-Plus are appropriate environments for deploying user
> applications that would receive much use."
>
> I can certainly respond by pointing out that CRAN contains more than
> 2000 packages and Bioconductor contains more than 350. However, does
> anyone have statistics on how often R (and possibly some R packages) are
> downloaded, or on how many people actually use R?
>
> Thanks,
> Kevin
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
http://had.co.nz/