[Rpy] SF.net SVN: rpy:[579] branches/rpy_nextgen

2008-07-17 Thread lgautier
Revision: 579
  http://rpy.svn.sourceforge.net/rpy/?rev=579&view=rev
Author:   lgautier
Date: 2008-07-17 15:01:27 + (Thu, 17 Jul 2008)

Log Message:
---
rinterface:

- set R default console output to "print"
- disable R's stack checking (to avoid crashes when multithreading)
- set_term_ui for Win32

Modified Paths:
--
branches/rpy_nextgen/rpy/rinterface/__init__.py
branches/rpy_nextgen/rpy/rinterface/rinterface.c
branches/rpy_nextgen/setup.py

Modified: branches/rpy_nextgen/rpy/rinterface/__init__.py
===
--- branches/rpy_nextgen/rpy/rinterface/__init__.py 2008-07-14 15:21:49 UTC (rev 578)
+++ branches/rpy_nextgen/rpy/rinterface/__init__.py 2008-07-17 15:01:27 UTC (rev 579)
@@ -35,8 +35,6 @@
 
 from rpy2.rinterface.rinterface import *
 
-
-
 class StrSexpVector(SexpVector):
     def __init__(self, v):
         super(StrSexpVector, self).__init__(v, STRSXP)
@@ -51,3 +49,10 @@
     def __init__(self, v):
         super(StrSexpVector, self).__init__(v, REALSXP)
 
+
+# wrapper because print is strangely not a function
+# Python prior to version 3.0
+def consolePrint(x):
+    print(x)
+
+setWriteConsole(consolePrint)

Modified: branches/rpy_nextgen/rpy/rinterface/rinterface.c
===
--- branches/rpy_nextgen/rpy/rinterface/rinterface.c 2008-07-14 15:21:49 UTC (rev 578)
+++ branches/rpy_nextgen/rpy/rinterface/rinterface.c 2008-07-17 15:01:27 UTC (rev 579)
@@ -75,6 +75,11 @@
 
 //#define RPY_VERBOSE
 
+#if Win32
+extern __declspec(dllimport) uintptr_t R_CStackLimit; /* C stack limit */
+#endif
+
+
 /* Back-compatibility with Python 2.4 */
 #if (PY_VERSION_HEX < 0x0205)
 typedef int Py_ssize_t;
@@ -249,6 +254,14 @@
   int status = 1;
   Rf_initialize_R(n_args, options);
   R_Interactive = TRUE;
+
+  /* Taken from JRI:
+   * disable stack checking, because threads will thow it off */
+  R_CStackLimit = (uintptr_t) -1;
+  
+  #ifdef Win32
+  setup_term_ui();
+  #endif
   setup_Rmainloop();
 
   Py_XDECREF(embeddedR_isInitialized);

Modified: branches/rpy_nextgen/setup.py
===
--- branches/rpy_nextgen/setup.py   2008-07-14 15:21:49 UTC (rev 578)
+++ branches/rpy_nextgen/setup.py   2008-07-17 15:01:27 UTC (rev 579)
@@ -4,7 +4,7 @@
 
 
 pack_name = 'rpy2'
-pack_version = '2.0.0-a1'
+pack_version = '2.0.0-dev'
 
 RHOMES = os.getenv('RHOMES')
 
@@ -87,10 +87,18 @@
 
 #f_in.close()
 #f_out.close()
+
 
+#FIXME: crude way (will break in many cases)
+#check how to get how to have a configure step
 define_macros = []
-if sys.platform != 'win32':
+
+if sys.platform == 'win32':
+    define_macros.append(('Win32', 1))
+else:
     define_macros.append(('R_INTERFACE_PTRS', 1))
+
+define_macros.append(('CSTACK_DEFNS', 1))
 
 rinterface_ext = Extension(
 pack_name + '.rinterface.rinterface',
@@ -130,6 +138,7 @@
   description = "Python interface to the R language",
   url = "http://rpy.sourceforge.net",
   license = "(L)GPL",
+  author = "Laurent Gautier <[EMAIL PROTECTED]>",
   ext_modules = rinterface_exts,
   package_dir = pack_dir,
   packages = [pack_name,


This was sent by the SourceForge.net collaborative development platform, the 
world's largest Open Source development site.

-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
___
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list


[Rpy] RPY & shared libraries...

2008-07-17 Thread Vince Fulco
Dear RPy Experts-

I am a relative newbie at RPy. Are there any non-obvious pitfalls to
incorporating C-code shared libraries which work properly in standalone R?

TIA

-- 
Vince Fulco



[Rpy] [ rpy-Bugs-2018909 ] Importing rpy in django application fails

2008-07-17 Thread SourceForge.net
Bugs item #2018909, was opened at 2008-07-15 20:09
Message generated for change (Comment added) made by nobody
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=453021&aid=2018909&group_id=48422

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nishant Joshi (nishantj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Importing rpy in django application fails

Initial Comment:
I am using rpy 1.0.3 with R version 2.7.1. When I try to import rpy from a 
Django application, I get the following:

Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit

 *** caught bus error ***
address 0xbfcc, cause 'non-existent physical address'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I know this is a known (and probably resolved) issue, but I couldn't get this 
to work by following the suggestions I found in the old discussions. Any help 
is appreciated.

Thanks

--

Comment By: Nobody/Anonymous (nobody)
Date: 2008-07-17 08:53

Message:
Logged In: NO 

Just FYI: I've seen similar C stack problems discussed on the rpy mailing
list.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=453021&aid=2018909&group_id=48422



Re: [Rpy] segmentation fault after long batch of LM

2008-07-17 Thread Laurent Gautier
I am a bit confused.

Do you have data frames created and bound to the same Python variable names?
Or do you have a set of vectors, with subsets of them put together in
different data frames?
(If you are iterating and building a lot of linear models, you might be
trying to do some sort of variable/model selection.)
In the latter case, this is where a bug could lurk and dangling pointers
could appear, through the recursive release of elements in the data frame.
I cannot explain, though, why it would randomly take 10, 100, 1000, or 5000
iterations, which is what I am experiencing with rpy2.

Any experiment is good. I would be very happy if we could come up with a way
to trigger the bug reproducibly.
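
The gc pooling strategy from the quoted message below (call gc every N
iterations) could look roughly like this. It is a sketch only: the name
run_with_pooled_gc and the default of 50 are made up for illustration, and
the collector is passed in as a plain callable (r.gc in this thread) so that
the code runs without rpy. Whether pooling is enough to avoid the crash is
untested; so far only a gc call before each creation has been reported to
help.

```python
def run_with_pooled_gc(tasks, collect, every=50):
    """Run each task, calling collect() before every `every`-th task.

    `collect` is whatever triggers R's garbage collection (r.gc in this
    thread); it is a parameter so the sketch does not depend on rpy.
    """
    results = []
    for i, task in enumerate(tasks):
        if i % every == 0:
            collect()
        results.append(task())
    return results


# Stand-in workload: ten cheap tasks, collector called before tasks 0, 4, 8.
calls = []
out = run_with_pooled_gc(
    [lambda n=n: n * n for n in range(10)],
    collect=lambda: calls.append("gc"),
    every=4,
)
```

With every=1 this degenerates to the "gc before each creation" workaround,
so the same loop can be used to compare both settings.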


2008/7/16 laurent oget <[EMAIL PROTECTED]>:
> I was thinking of another workaround and would love your opinion on that. My
> intuition is that the problem occurs because I have variables who are in
> several dataframes with the same name. Do you think mangling the names of
> the variables before passing them to R might help? Would it be an
> interesting experiment?
>
> Laurent
>
> 2008/7/16 Laurent Gautier <[EMAIL PROTECTED]>:
>>
>> I have trouble narrowing down where exactly the problem is occurring
>> (it seems to keep moving each time I think that I am getting closer).
>>
>> In the short term, and for your needs, I guess that it is good if calls to
>> gc prevent rpy from crashing.
>> To save on runtime, you can try a gc pooling strategy (call gc every N
>> iterations).
>>
>>
>>
>> 2008/7/15 laurent oget <[EMAIL PROTECTED]>:
>> > Calling r.gc() before each creation seems to have solved the problem. We
>> > ran a whole lot of things over the night without any segmentation fault.
>> > This is however pretty expensive timewise.
>> >
>> > Laurent
>> >
>> > 2008/7/14 laurent oget <[EMAIL PROTECTED]>:
>> >>
>> >> I am running things with a call to gc() before each regression, in case
>> >> what happens is a race condition where the gc is called in the middle
>> >> of the construction of a new dataframe...
>> >>
>> >> Thanks for the prompt help!
>> >>
>> >> Laurent
>> >>
>> >> 2008/7/14 Laurent Gautier <[EMAIL PROTECTED]>:
>> >>>
>> >>> 2008/7/15 laurent oget <[EMAIL PROTECTED]>:
>> >>> > can i get a quick hint on how i would go about calling --verbose
>> >>> > through RPY?
>> >>>
>> >>> I suspect that the only way is to hack
>> >>> line 93 of rpymodule.c
>> >>>
>> >>> char *defaultargv[] = {"rpy", "-q", "--vanilla"};
>> >>>
>> >>> and recompile/install rpy.
>> >>>
>> >>> > I am pretty clueless about the way R does the garbage collection.
>> >>> > One thing I know is there are columns that are shared between
>> >>> > different linear regressions, so it might be that the garbage
>> >>> > collector cleans up a column from iteration n-1 and breaks column n
>> >>> > in the process.
>> >>>
>> >>>
>> >>>
>> >>> > Is there a way to call the garbage collection explicitly before I
>> >>> > create a new dataframe to test this hypothesis?
>> >>>
>> >>> gc()
>> >>> #possibly gc(verbose = TRUE)
>> >>>
>> >>>
>> >>> I have made a toy example to see if I could reproduce it here (with
>> >>> rpy2, but there are similarities in the way R objects pointed at from
>> >>> Python objects are protected from R's garbage collection)... and it
>> >>> seems that there is something going on with garbage collection
>> >>> (depending on the Python implementation, it either crashes after
>> >>> 10-50 iterations... or can go through 1000 iterations).
>> >>>
>> >>>
>> >>> > Thanks,
>> >>> >
>> >>> > Laurent
>> >>> >
>> >>> > 2008/7/14 Laurent Gautier <[EMAIL PROTECTED]>:
>> >>> >>
>> >>> >> Someone else reported on this list a similar sounding problem not
>> >>> >> so long ago.
>> >>> >>
>> >>> >>
>> >>> >> The problem might be caused by manipulating a stale pointer to an R
>> >>> >> object (that is, an object that was discarded during R's garbage
>> >>> >> collection), and troubleshooting this will likely mean running
>> >>> >> things through a C debugger.
>> >>> >> You could try starting up your embedded R process with '--verbose'
>> >>> >> and see if the problem happens right after garbage collection.
>> >>> >> Without having further details on the exact code run, it is
>> >>> >> difficult to say more.
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> 2008/7/14 laurent oget <[EMAIL PROTECTED]>:
>> >>> >> > I am using rpy/R to perform linear regressions on a large number
>> >>> >> > of datasets, in one python run, and am encountering segmentation
>> >>> >> > faults after a large number of iterations, while handling cases
>> >>> >> > which, taken on their own, run without a problem. My intuition is
>> >>> >> > that the previous iterations somehow corrupted the heap (or th

Re: [Rpy] rpy and mod_python problem

2008-07-17 Thread Laurent Gautier
I mentioned R-2.7 because I stumbled upon posts about stack
problems on BSD since R-2.7.

Maybe it has to do with how many threads the apache/mod_python
combo is using. I think that there is a way
to compile mod_python without threads.
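
For anyone comparing setups, the stack limit that "ulimit -s" controls (see
the quoted messages below) can be read from inside the Python process with
the standard resource module, which is Unix-only. This is a small sketch for
inspection; the helper names stack_limits and describe are made up here.
Note that getrlimit reports bytes while "ulimit -s" reports kilobytes.

```python
import resource


def stack_limits():
    """Return the (soft, hard) C stack limits of this process in bytes.

    resource.RLIM_INFINITY stands for "unlimited".
    """
    return resource.getrlimit(resource.RLIMIT_STACK)


def describe(limit):
    """Format a limit the way `ulimit -s` would show it (kB or unlimited)."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return "%d kB" % (limit // 1024)


soft, hard = stack_limits()
print("soft: %s, hard: %s" % (describe(soft), describe(hard)))
```

Running this inside the web server process (rather than in a login shell)
shows the limit that the embedded R actually sees when it is initialized.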

2008/7/14 Toby Hocking <[EMAIL PROTECTED]>:
> I have tried ulimit -s values of up to 200 but even then I get C stack 
> limit usage errors in R -> error initializing RPy. Only ulimit -s unlimited 
> seems to work for my setup.
>
> To clarify, I have had this problem with R-2.4.1 -> R-2.7, so I am inclined 
> to think it is not 2.7-specific. The oldest version that worked without 
> messing with stack usage limits in the shell was R-2.2.1.
>
> -Original Message-
> From: Laurent Gautier [mailto:[EMAIL PROTECTED]
> Sent: Monday, July 14, 2008 8:40 AM
> To: Toby Hocking
> Cc: RPy help, support and design discussion list
> Subject: Re: [Rpy] rpy and mod_python problem
>
>
> Thanks for posting it.
>
> I am not a C stack expert, but I suspect that there is a reason why it is not
> unlimited by default. I would particularly consider putting somehow a limit
> to it on a webserver, although that limit would naturally be larger than the
> default one. It would be nice to come up with such a number and put it in
> the documentation for the time being.
>
> From what I read, it seems that from version 2.7 R is more C stack-hungry.
> Maybe this will improve in time.
>
> I have started playing with threading in python with rpy, and the C stack
> problem appears. Toying with the stack limit again let things run properly.
>
> Are there win32 folks here having similar problems with the C stack ?
>
>
>
>
>
> 2008/7/7 Toby Hocking <[EMAIL PROTECTED]>:
>> Hi all,
>>
>> Just thought I would post with a solution I have found to my own problem.
>>
>> I was getting an error: C stack size is too small (I think from R) in my 
>> shell, so I told my shell (bash) ulimit -s unlimited, which makes the C 
>> stack size unlimited, and that fixed the problem, so I can use R-2.7 with 
>> rpy and Apache2/mod_python/Django.
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] Behalf Of Toby Hocking
>> Sent: Wednesday, June 25, 2008 9:29 AM
>> To: RPy help, support and design discussion list
>> Subject: Re: [Rpy] rpy and mod_python problem
>>
>>
>> I'm inclined to think that it is NOT an inherent incompatibility problem 
>> with mod_python and RPy since I have a working Rpy/apache/mod_python system 
>> on top of Django. Relevant package versions for the working system are:
>>
>> Django-rev7716
>> Apache-2.0.55-4ubuntu2.3
>> mod_python-3.1.4-0ubuntu1.1
>> R-2.2.1
>> RPy-rev563
>>
>> But when I switch to R-2.7 and Python 2.5, I start getting the error 
>> (RPy_Exception: R Function "get" not found). Furthermore, this error happens 
>> on the Django development webserver as well, so it must be an error 
>> independent of apache/mod_python.
>>
>> Any thoughts?
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] Behalf Of Christof
>> Winter
>> Sent: Wednesday, June 25, 2008 7:54 AM
>> To: RPy help, support and design discussion list
>> Subject: Re: [Rpy] rpy and mod_python problem
>>
>>
>> Laurent Gautier wrote, On 06/25/08 15:47:
>>> Wild guess: this has something to do with the multiple interpreters
>>> http://www.modpython.org/live/current/doc-html/pyapi-interps.html
>>>
>>> (reading the doc Python's C-level "Py_NewInterpreter()" is hinting that
>>> this is likely the nature of the problem).
>>>
>>> For now, there is not much to do beside using python CGIs without
>>> mod_python.
>>> I'd like to have this solved, but that will likely have to wait for rpy 2.1.
>>
>> That sounds plausible. Waiting for rpy 2.1 is fine for me, as using a
>> cgi-based Python with Apache works (although it's rather slow).
>>
>> Christof
>>
>>
>> -
>> Check out the new SourceForge.net Marketplace.
>> It's the best place to buy or sell services for
>> just about anything Open Source.
>> http://sourceforge.net/services/buy/index.php
>> __
>> This email has been scanned by the MessageLabs Email Security System.
>> For more information please visit http://www.messagelabs.com/email
>> __

Re: [Rpy] segmentation fault after long batch of LM

2008-07-17 Thread laurent oget
2008/7/17 Laurent Gautier <[EMAIL PROTECTED]>:

> I am a bit confused.
>
> Do you have data frames created and bound to the same Python variable names
> ?


> Or do you have a set of vectors, and subsets put together in different
> data frame ?
> (if you are iterating and building a lot of linear model, you might be
> trying to do some
> sort of variable/model selection).


More precisely:

I build a set of vectors and a dictionary with strings as keys and those
vectors as values. I build an R dataframe object from this dictionary and
run a series of regressions using this dataframe. I do this repeatedly and
some variable names are shared between dataframes, something like

- create dataframe with d=r.as_data_frame(dict(x1=...,x2=..,x3=,x4=..,x5...)
  call r.lm('x1~x2+x3',d) and a whole lot of others

- create new dataframe d=r.as_data_frame(dict(x3=..,)) where
  some of the variable names were in the previous dataframe

I do not really understand either why this occurs randomly. I suspect a race
condition between the garbage collector and the computing process (do you
know how the R gc is triggered?). Most crashes seem to occur during the
construction of the dataframe.

I will try to get a reproducible scenario which triggers the crash.
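
A possible shape for such a reproduction script, as a sketch only. The
function run_batch and the FakeR class are hypothetical; FakeR exists so the
loop can be exercised without R, and in a real run the rpy "r" object would
be passed instead. The only rpy calls assumed are the two named above,
r.as_data_frame and r.lm. The iteration number is flushed before each data
frame is built, so the last line printed before a segmentation fault shows
where it happened.

```python
import sys


def run_batch(r, datasets, formulas, log=sys.stderr):
    """Build a data frame per dataset and fit every formula on it.

    `r` is expected to behave like rpy's `r` object (as_data_frame, lm).
    """
    fits = []
    for i, columns in enumerate(datasets):
        log.write("iteration %d: building data frame\n" % i)
        log.flush()  # make sure the line is out before a possible crash
        d = r.as_data_frame(columns)
        for formula in formulas:
            fits.append(r.lm(formula, d))
    return fits


class FakeR(object):
    """Hypothetical stand-in so the sketch runs without R installed."""

    def as_data_frame(self, columns):
        return dict(columns)

    def lm(self, formula, data):
        return (formula, sorted(data))


# Two data frames sharing the variable names x1, x2 and x3.
datasets = [
    dict(x1=[1, 2], x2=[3, 4], x3=[5, 6]),
    dict(x3=[5, 6], x1=[0, 1], x2=[2, 2]),
]
fits = run_batch(FakeR(), datasets, ["x1~x2+x3"])
```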

Thanks,

Laurent


Re: [Rpy] segmentation fault after long batch of LM

2008-07-17 Thread Laurent Gautier
2008/7/17 laurent oget <[EMAIL PROTECTED]>:
>
>
> 2008/7/17 Laurent Gautier <[EMAIL PROTECTED]>:
>>
>> I am a bit confused.
>>
>> Do you have data frames created and bound to the same Python variable
>> names ?
>>
>> Or do you have a set of vectors, and subsets put together in different
>> data frame ?
>> (if you are iterating and building a lot of linear model, you might be
>> trying to do some
>> sort of variable/model selection).
>
> More precisely:
>
> I build  a set of vector and a dictionary with string as keys and those
> vectors as values. I build an R dataframe object from this dictionary and
> run a series of regression using this dataframe.  I do this repeatedly and
> some variable names are shared between dataframes, someting like
>
> - create dataframe with d=r.as_data_frame(dict(x1=...,x2=..,x3=,x4=..,x5...)
>   call r.lm('x1~x2+x3',d) and a whole lot of others
>
> - create new dataframe d=r.as_data_frame(dict(x3=..,)) where
>   some of the variable names were in the previous dataframe
>
> i do not really understand either why this occurs randomly. I suspect a race
> condition between the garbage collector and the computing process ( do you
> know how the R gc is  triggered?).

It might not really be a race condition, but rather an overzealous
garbage collection.
R does not have reference counting, and objects without an associated
symbol/name in the R space can be flagged as "protected" from garbage
collection. The removal of the protection is recursive, I think, and this
might be causing trouble in a situation where
- your data.frame 'd' above is built from R-anonymous objects
  (e.g. 'x1 = r.rnorm(10)')
- 'd' is scheduled for garbage collection in Python (end of code block,
  re-assignment, explicit call to "del")
- the R-anonymous objects used for 'd' are used again (since the
  recursive release will have exposed them to garbage collection).

I am starting to be able to trigger crashes reproducibly this way with rpy2.
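
The suspected mechanism can be shown with a toy model in plain Python. This
is not how rpy or R are implemented: ToyHeap and all of its methods are
invented for illustration, with a set of "protected" ids standing in for R's
protection from garbage collection and a simple mark-and-sweep standing in
for R's gc. It only demonstrates why a recursive release leaves stale
handles on the Python side.

```python
class ToyHeap(object):
    """Toy heap: objects survive gc() only if reachable from a protected one."""

    def __init__(self):
        self.objects = {}       # id -> (value, tuple of child ids)
        self.protected = set()  # ids flagged as protected from collection
        self._next_id = 0

    def alloc(self, value, children=()):
        oid = self._next_id
        self._next_id += 1
        self.objects[oid] = (value, tuple(children))
        self.protected.add(oid)
        return oid

    def release(self, oid, recursive=True):
        """Drop the protection of an object, and of its children if recursive."""
        self.protected.discard(oid)
        if recursive:
            for child in self.objects[oid][1]:
                self.release(child, recursive=True)

    def gc(self):
        """Mark everything reachable from protected objects, sweep the rest."""
        live = set()
        stack = list(self.protected)
        while stack:
            oid = stack.pop()
            if oid not in live:
                live.add(oid)
                stack.extend(self.objects[oid][1])
        for oid in list(self.objects):
            if oid not in live:
                del self.objects[oid]

    def is_alive(self, oid):
        return oid in self.objects


# Recursive release: the column x1 is collected although Python still holds it.
heap = ToyHeap()
x1 = heap.alloc([1.0, 2.0])
x2 = heap.alloc([3.0, 4.0])
d = heap.alloc("data.frame", children=(x1, x2))
heap.release(d)  # Python drops 'd'
heap.gc()
stale_after_recursive = not heap.is_alive(x1)

# Non-recursive release: x1 keeps its own protection and survives.
heap2 = ToyHeap()
y1 = heap2.alloc([1.0, 2.0])
d2 = heap2.alloc("data.frame", children=(y1,))
heap2.release(d2, recursive=False)
heap2.gc()
alive_after_flat = heap2.is_alive(y1)
```

In the first case the handle x1 on the Python side now points at a collected
object, which is the kind of dangling pointer that would crash once x1 is
reused in a new data frame.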

> Most crashes seem to occur during the
> construction of the dataframe.
>
> I will try to get a reproducible scenario which triggers the crash.
>
> Thanks,
>
> Laurent
>
>
>
