?! phyper is deterministic, I think. Calling it with the same parameters 
will give the same results.

Good point; that is in fact an excellent improvement. I will start 
storing everything already computed in a dictionary.

Anyway, I was able to easily solve the problem with the mapping by adding:

myparams = {'lower.tail': rinterface.SexpVector([False,], rinterface.LGLSXP),
            'log.p': rinterface.SexpVector([True,], rinterface.LGLSXP)}

I tried your code only partially, since I don't know what you mean by:
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr

# this step performs an early binding of all objects
# in the R package "stats"
stats = importr("stats")

rpy2.robjects has no module called packages or anything similar. I now 
have everything at the low level. I just don't like the fact that I need to 
create an R vector every time I change the parameters to phyper. Here is 
how the code looks at the moment:

import rpy2.rinterface as rinterface

rinterface.initr()
phyper = rinterface.globalEnv.get("phyper")
myparams = {'lower.tail': rinterface.SexpVector([False,], rinterface.LGLSXP),
            'log.p': rinterface.SexpVector([True,], rinterface.LGLSXP)}
.....
.....
    def lphyper(self,q,m,n,k):
        """
        loc.phyper(q,m,n,k)
        Calculate p-value using R function phyper from rpy2 low-level
        interface.
        """
        phyper_q = rinterface.SexpVector([q,], rinterface.INTSXP)
        phyper_m = rinterface.SexpVector([m,], rinterface.INTSXP)
        phyper_n = rinterface.SexpVector([n,], rinterface.INTSXP)
        phyper_k = rinterface.SexpVector([k,], rinterface.INTSXP)
        return phyper(phyper_q,phyper_m,phyper_n,phyper_k,**myparams)[0]


I solved the problem by reading the robjects code. I am not sure whether 
yours is a better option, but the bits I was able to port don't seem to 
make a difference on such a small dataset. I will test with a larger 
dataset to see if I can spot any significant difference.

Once more, thank you very much; you have been helping a lot. 
Bruno 

2010/2/3 Laurent Gautier <lgaut...@gmail.com>
On 2/3/10 3:00 PM, B.A.D.C.M.D Santos wrote:
The main reason I was trying to move to the low level interface is
speed. So what I am doing is calculating several times the p-value for
the same python object.

?! phyper is deterministic, I think. Calling it with the same parameters 
will give the same results.



So I bind phyper to the object and then just
perform the test several times. This already improved the speed a lot,
compared to simply calling rinterface.phyper every time.

There is no such thing as rinterface.phyper by default, but I think that I 
understand what you mean ( rinterface.baseenv.get("phyper") ). Early 
binding is definitely improving things. This may even make rpy2 faster than 
the same code in R, in some cases.

# -- for rpy2-2.1.0 (written without actually running it)

import rpy2.robjects as robjects
from rpy2.robjects.packages import importr

# this step performs an early binding of all objects
# in the R package "stats"
stats = importr("stats")

# save the Python lookup in stats.
phyper = stats.phyper

# cast to low-level (to cut the automagic conversion of Python objects)
lowlevel_phyper = robjects.conversion.py2ri(phyper)

my_q = robjects.IntVector((0, ))
my_m = robjects.IntVector((0, ))
my_n = robjects.FloatVector((0, ))
my_k = robjects.FloatVector((0, ))

# note that high-level and low-level objects
# can be interchanged.

results = lowlevel_phyper(my_q, my_m, my_n, my_k)

# low-overhead change of a parameter
my_q[0] = 123

results = lowlevel_phyper(my_q, my_m, my_n, my_k)


At this point, you should profile your code; the largest part of the 
time should be spent in the call lowlevel_phyper(), and then there is not 
much left to optimize (well... in fact there is probably the option of the 
Rmath library, which could still push things a little faster).
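
One way to do the profiling step is Python's standard cProfile module. This is a rough sketch only: `expensive_call` and `run_tests` are hypothetical stand-ins for lowlevel_phyper() and the loop around it, not code from this thread.

```python
import cProfile
import io
import pstats


def expensive_call(x):
    """Stand-in for lowlevel_phyper (hypothetical workload)."""
    return sum(i * i for i in range(x))


def run_tests(n_calls):
    """Stand-in for the loop that computes many p-values."""
    return [expensive_call(2000) for _ in range(n_calls)]


profiler = cProfile.Profile()
profiler.enable()
results = run_tests(200)
profiler.disable()

# Sort by cumulative time and keep the report as a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

If the line for the R call dominates the cumulative time, the Python side is no longer the bottleneck; if it does not, the report shows where the remaining overhead sits.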




But the problem
is that I am using this on huge sequence datasets, taking hours for a
single run. And the bottleneck is the significance calculation, which
takes 3x more time than all the rest.

On a very small dataset that I use for testing, the low-level
interface reduces the run time from ~0.9s to ~0.8s, so
I expect that using it on a large dataset will save even more.

I will take a look at the code and try to make sense of it. I just
thought there was a better solution, since the documentation kind of
points to that.

Every code optimization problem can be different. I agree that the 
documentation could give hints (it is planned, and there is already a 
section on performance), but that will not replace knowing the API inside out.




L.




Thank you very much once more,
Bruno
2010/2/3 Laurent Gautier <lgaut...@gmail.com>
On 2/3/10 12:32 PM, B.A.D.C.M.D Santos wrote:
Hello again,

Today I was trying to port my phyper from the high-level interface to the
low-level interface. My problem is again how to map arguments with dots in
their names. According to the documentation I should be able to use the
special **kwargs again, but I have no idea how to do this. I tried directly,
but it obviously didn't work because the function was expecting a Sexp_Type
object. Can someone shed some light on this?

Also is there any easy way to pass four parameters to phyper without
needing to create four Sexp_Type vectors individually?


Unless you have good reasons to move to the lower-level interface, I
would advise you to stay with the higher-level interface.

The higher level interface is mostly abstracting those details from you.

If you really really want to do it at the lower level, you can study how
the higher level interface is doing it itself (it is implemented in
Python). An answer to your question would just be that code ;-) .


L.





Thanks in advance,
Bruno

2010/2/2 B.A.D.C.M.D Santos<bac...@cam.ac.uk>
Thank you Laurent. I had completely forgotten that solution. It seems to be
working now, although I am getting some weird values. But I think the
problem is in my script.

Thank you very much,
Bruno


2010/2/2 Laurent Gautier<lgaut...@gmail.com>
On 2/2/10 12:40 PM, B.A.D.C.M.D Santos wrote:
Hello everyone,

I am currently using rpy2 to call the phyper function from R in my
Python script. However, I have a problem because I want to use the
argument log.p=TRUE. The last time I checked there was a problem with the
mapping of arguments with dots in the middle. Is this still a bug? If
it is, is there any workaround for it?

Did the link given earlier work for you ?
http://www.mail-archive.com/rpy-list@lists.sourceforge.net/msg02313.html



Or does the new rpy2 alpha
solve this problem?

It does provide syntactic sugar for this. You can read more at:
http://rpy.sourceforge.net/rpy2/doc-2.1/html/robjects.html#functions
(the doc also tells how to make a blind 'turn "." into "_"' function)
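
One possible reading of such a blind translation, sketched in plain Python rather than taken from the rpy2 documentation: keyword names are written with underscores on the Python side and rewritten with dots before the call. `dots_from_underscores` and `fake_r_function` are hypothetical names; the latter merely stands in for an R function so the sketch runs without R.

```python
def dots_from_underscores(func):
    """Wrap func so that keyword names like lower_tail arrive as 'lower.tail'.

    The translation is blind: every underscore becomes a dot, so it is
    unsuitable for R arguments whose names really contain underscores.
    """
    def wrapper(*args, **kwargs):
        translated = dict((name.replace("_", "."), value)
                          for name, value in kwargs.items())
        # Dictionary unpacking accepts keys that are not valid identifiers.
        return func(*args, **translated)
    return wrapper


def fake_r_function(*args, **kwargs):
    """Stand-in for an R function (hypothetical); echoes what it received."""
    return args, kwargs


phyper_like = dots_from_underscores(fake_r_function)
args, kwargs = phyper_like(1, 2, 3, 4, lower_tail=False, log_p=True)
```

The same dictionary unpacking is what makes passing a dict with keys such as 'log.p' work directly, without any wrapper.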


Hoping this helps,


L.




I am using:
Python 2.6.4
R version 2.10.1
rpy2 2.0.4

Thanks in advance,
Bruno

------------------------------------------------------------------------------ 
The Planet: dedicated and managed hosting, cloud storage, colocation Stay 
online with enterprise data centers and the best network in the business 
Choose flexible plans and management services without long-term contracts 
Personal 24x7 support from experience hosting pros just a phone call away. 
http://p.sf.net/sfu/theplanet-com 
_______________________________________________ rpy-list mailing list 
rpy-list@lists.sourceforge.net 
https://lists.sourceforge.net/lists/listinfo/rpy-list


