Re: [Rpy] Making dataframes ... fast

2009-09-30 Thread Laurent Gautier
Gary, The "wrong" order (transposed) is for the creation of a data.frame, which is distinct from reading the information needed to create a data.frame from a file in which each row is represented by a line. In R, the functions read.table, read.csv, read.delim, etc. are doing the transposition ...
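(A minimal sketch of that transposition in plain Python, assuming rows shaped like Gary's d; the variable names are illustrative, not from the thread:)

    # rows as read from a file: one record per line
    rows = [['S80', 'C', 137.5, 0],
            ['S82', 'C', 155.1, 1],
            ['S83', 'T', 11.96, 0]]
    # transpose into one sequence per column -- data.frame()
    # construction wants columns, not rows
    columns = list(zip(*rows))
    # columns[0] == ('S80', 'S82', 'S83'), columns[1] == ('C', 'C', 'T'), ...

This is, in spirit, the reshaping read.table and friends perform on the R side before assembling the columns into a data.frame.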

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Gary Strangman
Very helpful, thanks! As for having data in the "wrong" order, it's a little odd that a datafile that's perfect for loading into R as a dataframe (via read.table) is inherently in the "wrong" order for dataframe creation after reading it into Python (using numpy.genfromtxt() or f.readlines()) ...
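(Tangentially, numpy.genfromtxt with dtype=None and names=True returns a structured array whose fields are already columns, which sidesteps part of the transposition; a minimal sketch with made-up column names:)

    import numpy as np
    from io import StringIO

    data = StringIO("subj cond score flag\n"
                    "S80 C 137.5 0\n"
                    "S82 C 155.1 1\n")
    # dtype=None infers a per-column type; names=True takes the header row
    arr = np.genfromtxt(data, dtype=None, names=True)
    arr['score']  # one column as a 1-D array: array([137.5, 155.1])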

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Nathaniel Smith
On Tue, Sep 29, 2009 at 4:21 AM, Gary Strangman wrote:
> Without benchmarking, that seems mighty inefficient. Nathaniel Smith's
> rnumpy mostly allows the following:
>
> df = rnumpy.r.data_frame(numpy.array(d, np.object))
>
> ... which is 2 conversions (rather than 4), but I haven't been able to get ...

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Gary Strangman
That's the problem ... I don't have the data in R format to start, nor is there a simple way of getting it there (except through Python, of course, in which case I have it in Python, not R ;-) I did actually use the read.table method for a while, but with several hundred thousand disk hits each ...
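(A sketch of the read.table route Gary describes, assuming rpy2's robjects layer; the helper name and the temp-file handling are invented for illustration, and the write-then-read is exactly the per-frame disk round-trip being complained about:)

    import csv, os, tempfile
    import rpy2.robjects as robjects

    def frame_via_read_table(rows, colnames):
        # write the rows to a temporary CSV ...
        fd, path = tempfile.mkstemp(suffix='.csv')
        with os.fdopen(fd, 'w') as f:
            writer = csv.writer(f)
            writer.writerow(colnames)
            writer.writerows(rows)
        # ... then let R's read.table do the row-to-column transposition
        try:
            return robjects.r['read.table'](path, header=True, sep=',')
        finally:
            os.remove(path)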

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Gary Strangman
Great. Thanks for the jump-start!

On Tue, 29 Sep 2009, Laurent Gautier wrote:
> Gary Strangman wrote:
>>
>> Hi Laurent,
>>
>> The only way to reduce the number of transformations is to add an
>> equivalent number of columns to the dataframe (so that instead of several
>> hundred thousand conversions, I need several hundred thousand columns) ...

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Laurent Gautier
Gary Strangman wrote:
>
> Hi Laurent,
>
> The only way to reduce the number of transformations is to add an
> equivalent number of columns to the dataframe (so that instead of
> several hundred thousand conversions, I need several hundred thousand
> columns), and then passing this beast back-and-forth between Python and R ...

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Peter
On Tue, Sep 29, 2009 at 12:21 PM, Gary Strangman wrote:
>
> Hi Laurent,
>
> The only way to reduce the number of transformations is to add an
> equivalent number of columns to the dataframe (so that instead of several
> hundred thousand conversions, I need several hundred thousand columns),
> and then passing this beast back-and-forth between Python and R ...

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Gary Strangman
Hi Laurent,

The only way to reduce the number of transformations is to add an equivalent number of columns to the dataframe (so that instead of several hundred thousand conversions, I need several hundred thousand columns), and then passing this beast back-and-forth between Python and R for ...

Re: [Rpy] Making dataframes ... fast

2009-09-29 Thread Laurent Gautier
Gary,

Two things come to my mind:
- Try having an initial Python data structure that requires fewer transformations than your current one in order to become a DataFrame.
- Use rpy2.rinterface when speed matters. This can already get you faster than R.
http://rpy.sourceforge.net/rpy2/doc-dev/htm ...
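(A rough sketch of the rpy2.rinterface route, against the rpy2 2.x API of that era; the column names and vector types are invented, and it assumes the low-level closure accepts keyword arguments for the column names:)

    import rpy2.rinterface as ri
    ri.initr()

    # build R vectors directly, one per column, skipping the
    # higher-level robjects conversion machinery
    subj = ri.SexpVector(['S80', 'S82'], ri.STRSXP)
    score = ri.SexpVector([137.5, 155.1], ri.REALSXP)
    data_frame = ri.baseenv['data.frame']
    df = data_frame(subj=subj, score=score)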

[Rpy] Making dataframes ... fast

2009-09-28 Thread Gary Strangman
Hi all,

I have a Python list of lists (each sublist is a row of data), plus a list of column names. Something like this ...

>>> d = [['S80', 'C', 137.5, 0],
...      ['S82', 'C', 155.1, 1],
...      ['S83', 'T', 11.96, 0],
...      ['S84', 'T', 47, 1],
...      ['S85', 'T', numpy.nan, 1]]
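(A sketch of the column-wise conversion the thread converges on, using rpy2's robjects layer; the column names and the choice of vector classes are illustrative, not from the original post:)

    import numpy
    import rpy2.robjects as robjects

    d = [['S80', 'C', 137.5, 0],
         ['S82', 'C', 155.1, 1],
         ['S83', 'T', 11.96, 0],
         ['S84', 'T', 47, 1],
         ['S85', 'T', numpy.nan, 1]]

    # transpose the rows into columns, then wrap each column in the
    # matching R vector class before calling data.frame -- one
    # conversion per column instead of one per cell
    subj, cond, score, flag = zip(*d)
    df = robjects.r['data.frame'](
        subj=robjects.StrVector(subj),
        cond=robjects.StrVector(cond),
        score=robjects.FloatVector([float(x) for x in score]),
        flag=robjects.IntVector(flag))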