[Cross-posted to the pandas list since this is at the intersection of
pandas and rpy2; apologies for the redundancy]

Hi all,

I'm trying to plot numpy arrays and pandas DataFrames with rpy2 and am
running into several problems. I import rpy2, pandas and scipy/numpy
as follows:

import rpy2
from rpy2.robjects import r
import rpy2.robjects.numpy2ri
import pandas.rpy.common as com
rpy2.robjects.numpy2ri.activate()
from numpy import *
import scipy
import pandas

Then I read a delimited text file as a pandas DataFrame as usual:

# Read a pandas DataFrame from file
data = pandas.read_table("myfile.txt")
r.hist(data.col1, xlab="", ylab="")

"col1" of the "data" DataFrame contains only floats. When I plot it,
the histogram is littered with seemingly random numbers from the array.
They appear in bold at the top of the plot and in regular font below
the x-axis, where they completely hide the x-tick labels and the
x-axis label (if there was one). If I don't pass xlab="", the entire
histogram is covered with numbers.

My question is: how can I get rpy2 to actually read the information
from the pandas DataFrame and use it in the plot? This is what
happens natively in R with data frames, and I'm trying to get the same
behavior here. For example, since it knows the name/label of each
column (in this case "col1"), it could place that on the x-axis. Is
this possible? Does it require a conversion to an rpy2 DataFrame
first?
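
My working guess (an assumption I have not confirmed) is that the
stray numbers are R's default title and x-axis label, which R builds
from the expression passed to hist(); through rpy2 that expression is
the whole vector of values. If so, passing main, xlab and ylab
explicitly should replace them. A minimal sketch, where
explicit_labels and plot_histogram are hypothetical helper names of
mine, not part of rpy2 or pandas:

```python
def explicit_labels(column_name, ylab="Frequency"):
    """Build title and axis keyword arguments from a column name.

    Hypothetical helper: it only assembles a dict of R plotting arguments.
    """
    return {
        "main": "Histogram of %s" % column_name,
        "xlab": str(column_name),
        "ylab": ylab,
    }


def plot_histogram(series):
    """Draw an R histogram of a pandas Series using explicit labels."""
    # Imported lazily so explicit_labels() works without rpy2 installed.
    from rpy2.robjects import r, FloatVector

    # Convert the values to an R numeric vector by hand.
    values = FloatVector([float(v) for v in series])
    # Explicit main/xlab/ylab override R's defaults, which are derived
    # from the deparsed argument (assumed cause of the stray numbers).
    return r.hist(values, **explicit_labels(series.name))
```

With the DataFrame above this would be called as
plot_histogram(data.col1), which should put "col1" on the x-axis.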

Related issue: When I try to plot the DataFrame with:

r.plot(data)

It fails as well, with the error:

ValueError: Nothing can be done for the type <class
'pandas.core.frame.DataFrame'> at the moment.

Is it possible to get rpy2 to plot the DataFrame as best it can
(just like in native R, where R makes whatever guess is most
reasonable to plot the data frame in the requested way)? Since pandas
DataFrames can do everything R data frames can, it should be possible.

I also tried to explicitly convert to R DataFrames first:

r.plot(com.convert_to_r_dataframe(data))

which generates this output (the first part is just the column printed
by the "print data.rpkm" line earlier in my script):

==
0     1.791385
1     0.152134
2     0.000000
3     0.649393
4     0.000000
5     0.605132
6     0.000000
7     0.000000
8     0.000000
9     0.000000
10    2.084081
11    0.488127
12    0.006791
13    0.000000
14    0.244846
...
21500      1.578385
21501      0.080556
21502    166.923864
21503     15.274696
21504      0.000000
21505      1.333847
21506      0.000000
21507      0.000000
21508      0.000000
21509      0.075611
21510      0.000000
21511      2.025098
21512      0.562991
21513      0.000000
21514      0.000000
Name: rpkm, Length: 21515
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In data.matrix(x) : NAs introduced by coercion
2: In data.matrix(x) : NAs introduced by coercion
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
---------------------------------------------------------------------------
RRuntimeError                             Traceback (most recent call last)
/home/yarden/.local/lib/python2.7/site-packages/IPython/utils/py3compat.pyc
in execfile(fname, *where)
    176             else:
    177                 filename = fname
--> 178             __builtin__.execfile(filename, *where)

/home/yarden/test_rpy2.py in <module>()
     19 print data.rpkm
     20 #r.hist(data.rpkm.values, xlab="", ylab="")
---> 21 r.plot(com.convert_to_r_dataframe(data))
     22
     23

/home/yarden/.local/lib/python2.7/site-packages/rpy2-2.3.2-py2.7-linux-x86_64.egg/rpy2/robjects/functions.pyc
in __call__(self, *args, **kwargs)
     84                 v = kwargs.pop(k)
     85                 kwargs[r_k] = v
---> 86         return super(SignatureTranslatedFunction,
self).__call__(*args, **kwargs)

/home/yarden/.local/lib/python2.7/site-packages/rpy2-2.3.2-py2.7-linux-x86_64.egg/rpy2/robjects/functions.pyc
in __call__(self, *args, **kwargs)
     33         for k, v in kwargs.iteritems():
     34             new_kwargs[k] = conversion.py2ri(v)
---> 35         res = super(Function, self).__call__(*new_args, **new_kwargs)
     36         res = conversion.ri2py(res)
     37         return res

RRuntimeError: Error in plot.window(...) : need finite 'xlim' values
==
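
The "NAs introduced by coercion" warnings make me suspect (again an
assumption on my part) that some columns of my DataFrame hold
non-numeric values such as strings, which R cannot turn into numbers
when plot() coerces the data frame to a matrix. A small sketch for
finding such columns before handing anything to R; non_numeric_columns
is a hypothetical helper, and it takes (name, values) pairs so that it
does not depend on a particular pandas version:

```python
import numbers


def non_numeric_columns(named_columns):
    """Return the names of columns holding at least one non-numeric value.

    named_columns is any iterable of (name, values) pairs.
    """
    bad = []
    for name, values in named_columns:
        # A single string (or None) is enough to flag the column.
        if any(not isinstance(v, numbers.Number) for v in values):
            bad.append(name)
    return bad
```

With a recent pandas, non_numeric_columns(data.items()) should list
the columns to drop or convert before plotting; I am assuming here
that DataFrame.items() yields (column name, Series) pairs.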

Advice on this will be very much appreciated.  Thank you.

_______________________________________________
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list
