While we're on the subject, by the way, I have a pull request open on
IPython to improve how we handle a simple pandas DataFrame being passed
into rmagic. I imagine this is quite a common case, and at present rmagic
makes a complete mess of it, so I'd like to get some improvement in before
our next release. Any suggestions are welcome:

https://github.com/ipython/ipython/pull/2889
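
For the histogram question quoted below, here is a minimal sketch of one likely workaround. It assumes the stray numbers appear because R builds its default title and x-axis label by deparsing the data argument, which under rpy2 is the whole vector rather than a variable name. That is an interpretation of the symptoms, not something confirmed in this thread. The helper names hist_label_kwargs and plot_column are hypothetical, and plot_column needs rpy2 and a working R installation.

```python
def hist_label_kwargs(column_name, title=""):
    """Build explicit label arguments for R's hist().

    Assumption: R derives its default title and x-axis label from the
    deparsed data argument, so supplying all labels avoids that default.
    """
    return {"main": title, "xlab": column_name, "ylab": "Frequency"}


def plot_column(data, column_name):
    """Draw a histogram of one DataFrame column through rpy2."""
    # Imported lazily so the helper above works without rpy2 installed.
    from rpy2.robjects import r
    import rpy2.robjects.numpy2ri

    # Same NumPy conversion the quoted script activates.
    rpy2.robjects.numpy2ri.activate()
    # Pass the underlying NumPy array rather than the pandas Series,
    # and name the axis after the column explicitly.
    return r.hist(data[column_name].values, **hist_label_kwargs(column_name))
```

With the DataFrame from the quoted script, plot_column(data, "col1") would label the x-axis "col1" and leave the title empty.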

Thanks,
Thomas


On 10 February 2013 01:28, Laurent Gautier <lgaut...@gmail.com> wrote:

>
>
> This was presumably moved to StackOverflow, and answered there:
>
> http://stackoverflow.com/questions/14656852/how-to-use-pandas-dataframes-and-numpy-arrays-in-rpy2
>
> Given the current situation, the issue has little to do with your GitHub,
> I'd say (and exists largely because of wanting to move everything to your
> GitHub ;-) ).
>
>
> L.
>
>
>
> On 2013-02-10 05:47, Wes McKinney wrote:
> > On Wed, Feb 6, 2013 at 5:36 PM, Yarden Katz <yarden.k...@gmail.com> wrote:
> >> [cross listed to pandas list since it's at intersection of pandas/rpy2
> >> - apologies for redundancy]
> >>
> >> Hi all,
> >>
> >> I'm trying to plot numpy arrays and pandas DataFrames with Rpy2 and am
> >> running into several problems. I import rpy2, pandas and scipy/numpy
> >> as follows:
> >>
> >> import rpy2
> >> from rpy2.robjects import r
> >> import rpy2.robjects.numpy2ri
> >> import pandas.rpy.common as com
> >> rpy2.robjects.numpy2ri.activate()
> >> from numpy import *
> >> import scipy
> >> import pandas
> >>
> >> Then I read a CSV file as a pandas DataFrame as usual:
> >>
> >> # Read a pandas DataFrame from file
> >> data = pandas.read_table("myfile.txt")
> >> r.hist(data.col1, xlab="", ylab="")
> >>
> >> "col1" of the "data" DataFrame contains only floats. When I plot it,
> >> the histogram is littered with numbers from the array. They appear in
> >> bold at the top of the plot and in regular font below the x-axis,
> >> where they completely hide the xtick labels and the x-axis label (if
> >> there was one). If I don't pass xlab="", the entire histogram is
> >> covered with numbers.
> >>
> >> My question is: how can I get rpy2 to actually read the information
> >> from the pandas DataFrame and use it in the plot? This is what
> >> happens natively in R with DataFrames, and I'm trying to get the same
> >> behavior here. For example, since it knows the names/labels of each
> >> column (in this case "col1"), it can place that on the x-axis. Is
> >> this possible? Does it require a conversion to an rpy2 DataFrame
> >> first?
> >>
> >> Related issue: When I try to plot the DataFrame with:
> >>
> >> r.plot(data)
> >>
> >> It fails as well, with the error:
> >>
> >> ValueError: Nothing can be done for the type <class
> >> 'pandas.core.frame.DataFrame'> at the moment.
> >>
> >> Is it possible to get rpy2 to plot the DataFrame as best as it can
> >> (just like in native R, where R does whatever guessing is most
> >> reasonable to plot the DF in the requested way)?  Since pandas
> >> DataFrames can do everything R DataFrames can, it should be possible.
> >>
> >> I also tried to explicitly convert to R DataFrames first:
> >>
> >> r.plot(com.convert_to_r_dataframe(data))
> >>
> >> which generates this output (the first part just prints a column from
> >> my dataframe for some reason)
> >>
> >> ==
> >> 0     1.791385
> >> 1     0.152134
> >> 2     0.000000
> >> 3     0.649393
> >> 4     0.000000
> >> 5     0.605132
> >> 6     0.000000
> >> 7     0.000000
> >> 8     0.000000
> >> 9     0.000000
> >> 10    2.084081
> >> 11    0.488127
> >> 12    0.006791
> >> 13    0.000000
> >> 14    0.244846
> >> ...
> >> 21500      1.578385
> >> 21501      0.080556
> >> 21502    166.923864
> >> 21503     15.274696
> >> 21504      0.000000
> >> 21505      1.333847
> >> 21506      0.000000
> >> 21507      0.000000
> >> 21508      0.000000
> >> 21509      0.075611
> >> 21510      0.000000
> >> 21511      2.025098
> >> 21512      0.562991
> >> 21513      0.000000
> >> 21514      0.000000
> >> Name: rpkm, Length: 21515
> >> Error in plot.window(...) : need finite 'xlim' values
> >> In addition: Warning messages:
> >> 1: In data.matrix(x) : NAs introduced by coercion
> >> 2: In data.matrix(x) : NAs introduced by coercion
> >> 3: In min(x) : no non-missing arguments to min; returning Inf
> >> 4: In max(x) : no non-missing arguments to max; returning -Inf
> >> 5: In min(x) : no non-missing arguments to min; returning Inf
> >> 6: In max(x) : no non-missing arguments to max; returning -Inf
> >>
> >> ---------------------------------------------------------------------------
> >> RRuntimeError                             Traceback (most recent call last)
> >> /home/yarden/.local/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
> >>      176             else:
> >>      177                 filename = fname
> >> --> 178             __builtin__.execfile(filename, *where)
> >>
> >> /home/yarden/test_rpy2.py in <module>()
> >>       19 print data.rpkm
> >>       20 #r.hist(data.rpkm.values, xlab="", ylab="")
> >> ---> 21 r.plot(com.convert_to_r_dataframe(data))
> >>       22
> >>       23
> >>
> >> /home/yarden/.local/lib/python2.7/site-packages/rpy2-2.3.2-py2.7-linux-x86_64.egg/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
> >>       84                 v = kwargs.pop(k)
> >>       85                 kwargs[r_k] = v
> >> ---> 86         return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
> >>
> >> /home/yarden/.local/lib/python2.7/site-packages/rpy2-2.3.2-py2.7-linux-x86_64.egg/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
> >>       33         for k, v in kwargs.iteritems():
> >>       34             new_kwargs[k] = conversion.py2ri(v)
> >> ---> 35         res = super(Function, self).__call__(*new_args, **new_kwargs)
> >>       36         res = conversion.ri2py(res)
> >>       37         return res
> >>
> >> RRuntimeError: Error in plot.window(...) : need finite 'xlim' values
> >> ==
> >>
> >> Advice on this will be very much appreciated.  Thank you.
> >>
> >>
> ------------------------------------------------------------------------------
> >> Free Next-Gen Firewall Hardware Offer
> >> Buy your Sophos next-gen firewall before the end March 2013
> >> and get the hardware for free! Learn more.
> >> http://p.sf.net/sfu/sophos-d2d-feb
> >> _______________________________________________
> >> rpy-list mailing list
> >> rpy-list@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/rpy-list
> > Could you move this to GitHub? I don't have time to look at it right
> > now but dalejung or other active pandas users may be able to help.
> >
> > - Wes
> >
> >
>
>
>
>