On Thursday 14 June 2007 5:54 pm, Tim Churches wrote:
> Michael Hoffman wrote:
> > Talbot Katz wrote:
> >> I hope you'll indulge an ignorant outsider. I work at a financial
> >> software firm, and the tool I currently use for my research is R, a
> >> software environment for statistical computing and graphics. R is
> >> designed with matrix manipulation in mind, and it's very easy to do
> >> regression and time series modeling, and to plot the results and test
> >> hypotheses. The kinds of functionality we rely on the most are standard
> >> and robust versions of regression and principal component / factor
> >> analysis, bayesian methods such as Gibbs sampling and shrinkage, and
> >> optimization by linear, quadratic, newtonian / nonlinear, and genetic
> >> programming; frequently used graphics include QQ plots and histograms.
> >> In R, these procedures are all available as functions (some of them are
> >> in auxiliary libraries that don't come with the standard distribution,
> >> but are easily downloaded from a central repository).
> >
> > I use both R and Python for my work. I think R is probably better for
> > most of the stuff you are mentioning. I do any sort of heavy
> > lifting--database queries/tabulation/aggregation in Python and load the
> > resulting data frames into R for analysis and graphics.
>
> I would second that. It is not either/or. Use Python, including Numpy
> and matplotlib and packages from SciPy, for some things, and R for
> others. And you can even embed R in Python using RPy - see
> http://rpy.sourceforge.net/
>
> We use the combination of Python, Numpy (actually, the older Numeric
> Python package, but soon to be converted to Numpy), RPy and R in our
> NetEpi Analysis project - exploratory epidemiological analysis of large
> data sets - see http://sourceforge.net/projects/netepi - and it is a
> good combination - Python for the Web interface, data manipulation and
> data heavy-lifting, and for some of the more elementary statistics, and
> R for more involved statistical analysis and graphics (with the option
> of using matplotlib or other Python-based graphics packages for some
> tasks if we wish). The main thing to remember, though, is that indexing
> is zero-based in Python and 1-based in R...
>
> Tim C

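To make the indexing caveat at the end of the quoted message concrete, here is a minimal sketch in plain Python. The helper names `r_index_to_python` and `r_range_to_slice` are hypothetical, made up for this illustration and not part of RPy or any other package. They only cover positive indices; R's negative (exclusion) and zero indices are not handled.

```python
def r_index_to_python(i):
    """Convert a positive 1-based R index to a 0-based Python index."""
    if i < 1:
        # Assumption: only positive R indices are supported in this sketch.
        raise ValueError("expected a positive 1-based index")
    return i - 1


def r_range_to_slice(start, stop):
    """Convert R's inclusive 1-based range start:stop to a Python slice.

    R's x[2:4] selects elements 2, 3 and 4. Python slices are 0-based and
    exclude the end, so the equivalent slice is [1:4].
    """
    if start < 1 or stop < start:
        raise ValueError("expected 1 <= start <= stop")
    return slice(start - 1, stop)


values = [10, 20, 30, 40, 50]

first = values[r_index_to_python(1)]      # R: values[1]
middle = values[r_range_to_slice(2, 4)]   # R: values[2:4]
```

The point to watch when moving code between the two languages is that both the starting offset and the treatment of the range end differ, so the start shifts by one while the stop stays the same.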
Thirded. I use R, Python, and Matlab, along with other languages (I hate Pipeline Pilot), in my work, and from what I've seen nothing can compare with R when it comes to stats. I love R, from its brilliant CRAN system (PyPI needs serious work to be considered in the same class as CPAN et al.) to its delicious Emacs integration. I just wish there were a way to distribute R packages without requiring the user to install R separately.

In a similar vein, I wish there were a reasonable Free Software equivalent to Spotfire. The closest I've found (and they're nowhere near as good) are Orange (http://www.ailab.si/orange) and WEKA (http://www.cs.waikato.ac.nz/ml/weka/). Orange is written in Python, but it's tied to Qt 2.x, as the 3.x series was not available on Windows under the GPL.

Josh Gilbert