[Rd] Urgent business proposal (PR#7945)
[Translated from German.] Business proposal. First I must ask for your confidence in this transaction; owing to its nature, it is totally CONFIDENTIAL and secret. I know that a transaction of this magnitude will make anyone anxious and worried, but I assure you that everything will be in order at the end of the day. We have decided to reach you by e-mail because of the urgency of this transaction, as we have been reliably convinced of its speed and confidentiality. I would now like to introduce myself. I am Mr. Thomas Mandino (auditor at the Imperial Bank of South Africa). I came upon your contact in my personal search for a reliable and decent person to handle a very confidential transaction: the transfer of funds from a foreign account, which requires the utmost confidence. The proposal: a foreign, deceased engineer, Menfred Becker, a diamond contractor with the federal government of South Africa. He was working as a contractor for the government until his death three years ago in a plane crash. Mr. Becker was our customer here at the Imperial Bank of South Africa and had an account balance of US$18.5 billion (eighteen billion, five hundred thousand United States dollars), which the bank now expects, without question, to be claimed by his relatives coming forward; if no one comes forward, everything will be donated to an African trust fund for arms and munitions procurement for a freedom movement here in Africa. Passionate, valuable efforts are being made by the Imperial Bank to establish contact with someone from the Becker family or his relatives, but so far there has been no success.
It is because of the perceived impossibility of finding a relative of Becker (he had no wife or children) that the management is to issue an order for the fund to be declared unclaimable, and then paid out to the trust fund for arms and munitions procurement, which would be donated to the cause of war in Africa. To avert this negative development, I and some of my trusted colleagues at the bank have decided to transfer the money with your consent, and we now seek your permission for you to declare yourself a relative of the late Eng. Manfred Becker, so that the fund of USD$18.5m can consequently be transferred to your bank account as to the beneficiary (relative of Becker). We will place at your disposal all certifications and proofs that will enable you to claim these funds, so that everything goes smoothly, and we assure you of a 100% risk-free involvement. Your share would be 30% of the total sum, while the remaining 70% is for me and my colleagues. If this proposal is OK for you and you wish to take advantage of the trust we hope to place in you and your company, then kindly send me immediately, via my personal e-mail address, your confidential telephone number, fax number and your confidential e-mail address, so that I can send you the relevant details of this transaction. Thanks in advance. With kind regards. Mr. Thomas Mandino. Please send your reply to my private email: [EMAIL PROTECTED] Turmail.com Free E-mail Service : http://www.turmail.com Advanced Spam and Virus Filtering Services 100 Megabyte e-mail quota __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] source bug ? (PR#7929)
On Fri, 10 Jun 2005 [EMAIL PROTECTED] wrote: > hello bug fixers > > i think this bug has probably been mentioned before but i couldn't find the > answer to it on google. > i have to build R from the source files ( on a mac os x system ) > because i need the python R interface to work. for this to happen R > needs to be configured in the following way > > ./configure --enable-R-shlib > The R for Mac OS X FAQ tells you how to configure. You do need the --with-blas='-framework vecLib' --with-lapack flags that it lists. -thomas __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Citation for R
On Mon, 13 Jun 2005, Gordon K Smyth wrote: > This is just a note that R would get a lot more citations if the > recommended citation was an article in a recognised journal or from a > recognised publisher. > This is unfortunately true, but R is *not* an article or a book, it is a piece of software. I don't think I'm the only person who thinks it is counterproductive in the long run to encourage users to cite an article that they probably haven't read instead of citing the software they actually used. Jan's suggestion of the Journal of Statistical Software might provide a solution, since JSS *does* publish software. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] (no subject)
Online registration for DSC 2005 in Seattle, August 13-14 is now open. See the conference web page at http://depts.washington.edu//dsc2005/ -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] DSC 2005
Persons conversant with W3C standards would have noticed too many // in the URL quoted in my previous message. The correct conference page URL is http://depts.washington.edu/dsc2005 -thomas On Mon, 13 Jun 2005, Thomas Lumley wrote: > > Online registration for DSC 2005 in Seattle, August 13-14 is now open. See > the conference web page at > http://depts.washington.edu//dsc2005/ > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Open device -> glibc 2.3.4 bug for Redhat Enterprise 4?
This was supposed to be fixed in 2.1.1 -- which version are you using? -thomas On Tue, 21 Jun 2005, Martin Maechler wrote: > We have been using Redhat Enterprise 4, on some of our Linux > clients for a while, > and Christoph has just found that opening an R device for a file > without write permission gives a bad glibc error and subsequent > seg.fault: > >> postscript("/blabla.ps") > *** glibc detected *** double free or corruption (!prev): 0x01505f10 > *** > > or > >> xfig("/blabla.fig") > *** glibc detected *** double free or corruption (!prev): 0x01505f10 > *** > > and similar for pdf(); > does not happen for jpeg() {which runs via x11}, > nor e.g. for > >> sink("/bla.txt") > > --- > > Happens both on 32-bit (Pentium) and 64-bit (AMD Athlon) > machines with the following libc : > > 32-bit: > -rwxr-xr-x 1 root root 1451681 May 13 00:17 /lib/tls/libc-2.3.4.so* > 64-bit: > -rwxr-xr-x 1 root root 1490956 May 12 23:26 /lib64/tls/libc-2.3.4.so* > > --- > > Can anyone reproduce this problem? > > Regards, > Martin > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Open device -> glibc 2.3.4 bug for Redhat Enterprise 4?
On Tue, 21 Jun 2005, Martin Maechler wrote: >>>>>> "TL" == Thomas Lumley <[EMAIL PROTECTED]> >>>>>> on Tue, 21 Jun 2005 09:59:31 -0700 (PDT) writes: > >TL> This was supposed to be fixed in 2.1.1 -- which version are you using? > > 2.1.1 -- and 2.1.0 and 2.0.0 all showed the problem. > > But thanks, Thomas, looking in "NEWS" of R-devel showed that > there was a fix for this in R-devel only --- too bad it didn't > make it for R 2.1.1. > It's a double free(), so it produces undefined behaviour and anything can happen. On the Mac (where it was first reported), "anything" was a warning message from malloc, and on many systems "anything" is nothing. I thought I had added it to R-patched as well, but obviously not. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] segmentation fault after max level of parenthesis (PR#7990)
Already fixed, I believe. -thomas On Mon, 4 Jul 2005 [EMAIL PROTECTED] wrote: > Full_Name: Matthias Laabs > Version: 2.1.0 (source compiled) > OS: debian linux (sarge) > Submission from: (NULL) (195.143.236.2) > > > Hi! > R crashes with a segmentation fault when I use more than 85 parentheses (it > actually happened by accidentally hitting a wrong shortcut in emacs ... ) > > code: > sum(() > > > cheers from Berlin, > > Matthias > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_alloc problems in R v1.11
On Wed, 6 Jul 2005, [iso-8859-1] Marie-Hélène Ouellette wrote:

> Dear Dr. Ripley, Or possibly other people on the list. I'm using R v1.11 on Macintosh

Unlikely. There is no R v1.11, and R 1.1.1 (following the most common misspelling) wasn't available for the Mac.

> and I seem to have a problem with the function R_alloc. It crashes when using the following .C function (only an example):
>
> ///
> # include <R.h>
> void Hello(int *n)
> {
>   int i,x;
>   for(i=1;1< *n ; i++)

This is an infinite loop. You probably mean i<*n as the second expression, or perhaps i<=*n (did you want n or n-1 repeats?)

>   {
>     Rprintf('salut!!!\n');

This is invalid C, as your compiler should have told you. You need double quotes.

>   }
>   x = (int *) R_alloc(5,sizeof(int));

x was declared as int, not int *, so again this is invalid.

> }
> ///
>
> I call it in R with this line:
> .C('Hello',as.integer(5))
> Any idea why and how I can resolve this problem?

After fixing the C errors and in a version of R that exists (2.0.1) I get

dyn.load("hello.so")
.C('Hello',as.integer(5))
salut!!!
salut!!!
salut!!!
salut!!!
[[1]]
[1] 5

-thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Problem with dyn.load...or else...
On Wed, 6 Jul 2005, [iso-8859-1] Marie-Hélène Ouellette wrote:

> And try to use the function:
> > K_MEANSR(tab,centers=c(2,4))
> [1] "AA" [1] "AAA" [1] "A" [1] "B"
> Error in .C("K_MEANSC", xrows = as.integer(xrows), xcols = as.integer(xcols), : "C" function name not in load table

Hmm. Strange. R doesn't think your C function is called K_MEANSC (I assume that K_MEANSC *is* actually the name of your C function.) In a Terminal window you can use nm K_MEANSC.so to see all the names of externally visible objects in your DLL. This would narrow down whether the change of names is happening in R or in compilation. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_AllocatePtr
On Tue, 19 Jul 2005, Paul Roebuck wrote: > Had been looking into Luke Tierney's R_AllocatePtr() and > was left with a question about exactly when does R reclaim > heap memory. Implication of 'simpleref.nw' is that one can > allocate C data on the R heap, and as long as pointer object > is alive, the data and pointer will remain valid. But it > calls allocString() which is implemented using R_alloc(). Um, no. allocString is

SEXP allocString(int length)
{
    return allocVector(CHARSXP, length);
}

In fact the reverse is true: R_alloc is implemented using allocString. -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Follow-Up: R on FC4
On Mon, 1 Aug 2005, Gavin Simpson wrote: > R-devel compiles without error (same set of flags as above) but fails > make check-all in p-r-random-tests.R - the relevant section of > p-r-random-tests.Rout.fail is: > > ... >> dkwtest("norm") > norm() PASSED > [1] TRUE >> dkwtest("norm",mean = 5,sd = 3) > norm(mean = 5, sd = 3) PASSED > [1] TRUE >> >> dkwtest("gamma",shape = 0.1) > gamma(shape = 0.1) PASSED > [1] TRUE >> dkwtest("gamma",shape = 0.2) > gamma(shape = 0.2) PASSED > [1] TRUE >> dkwtest("gamma",shape = 10) > gamma(shape = 10) FAILED > Error in dkwtest("gamma", shape = 10) : dkwtest failed > Execution halted > > Is this a tolerance setting in R's tests not being met or is this > indicative of a bad compilation of R-Devel on FC4 with gfortran? > Try this again -- it is a random test. If it fails again then something is wrong. The tolerance on these tests is not very tight, certainly nowhere near rounding error. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] bug? (PR#8074)
On Wed, 17 Aug 2005 [EMAIL PROTECTED] wrote: > I just don't understand this: > >> (2*2)==4 > [1] TRUE >> .2*.2 > [1] 0.04 >> (.2*.2)==.04 > [1] FALSE It's a FAQ, not a bug. Consider: > (.2*.2) - .04 [1] 6.938894e-18 and read the FAQ -thomas > or > >> x=.04 >> x > [1] 0.04 >> y=.2*.2 >> y > [1] 0.04 >> y==x > [1] FALSE > > ______ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] bug? (PR#8074)
On Tue, 16 Aug 2005, Paul Mosquin wrote: > I guess that I expect R to act pretty much as C or C++ would do if I were to > program the same code. It's a bit of a surprise that assignment of > rationals, well within precision, followed by multiplication leading to a > result well within precision picks up those extra bits along the way. > Something to watch out for, to be sure. But those rationals are *not* well within precision. 0.2 is an infinitely repeating binary fraction (in base 16 it is 0.3333...) so it is not stored precisely. 0.04 is also not stored precisely, and it so happens that the error in representing 0.04 is not the same as the error in representing 0.2*0.2. Of course this will still happen in C: R is written in C. For example, on my computer the following C program

---
#include <stdio.h>
int main(){
  double d=0.2;
  double dd;
  dd=d*d;
  if(dd==0.04) printf("Equal\n");
  else printf("Difference=%20.18f\n",0.04-dd);
}
---

prints

[al:~] thomas% ./a.out
Difference=-0.000000000000000007

which happens to agree with the result R gives, though this isn't guaranteed. You simply cannot rely on floating point equality unless you know how the last bit rounding errors are handled. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Memory leakage/violation?
I can't reproduce this on R2.2.0dev on Windows XP (in a few hundred tries), or running under Valgrind on AMD64 Linux (in four or five tries). -thomas On Fri, 26 Aug 2005, Henrik Bengtsson wrote: > Hi, > > I've spotted a possible memory leakage/violation in the latest R v2.1.1 > patched and R v2.2.0dev on Windows XP Pro SP2 Eng. > > I first caught it deep down in a nested svd algorithm when subtracting a > double 'c' from an integer vector 'a' where both had finite values but > when assigning 'a <- a - c' would report NaNs whereas (a-c) alone would > not. Different runs with the identical data would introduce NaNs at > random positions, but not all the time. > > Troubleshooting is after a couple of hours still at v0.5, but here is a > script that generates the strange behavior on the above R setups. I let > the script speak for itself. Note that both the script 'strange.R' and > the data 'strange.RData' are online too, see code below. > > People on other systems (but also on Windows), could you please try it > and see if you can reproduce what I get. > > Cheers > > Henrik > > > # The following was tested on: Windows XP Pro SP2 Eng with > # i) R Version 2.1.1 Patched (2005-08-25) > # ii) R 2.2.0 Under development (unstable) (2005-08-25 r35394M) > > # Start 'R --vanilla' and source() this script, i.e. > # source("http://www.maths.lth.se/help/R/strange.R") > # If you do not get any errors, retry a few times. > > foo <- function(x) { > print(list( > name=as.character(substitute(x)), > storage.mode=storage.mode(x), > na=any(is.na(x)), > nan=any(is.nan(x)), > inf=any(is.infinite(x)), > ok=all(is.finite(x)) > )) > print(length(x)) > print(summary(x)) > } > > # Load data from a complicated "non-reproducible" algorithm. > # The below errors occur also when data is not > # saved and then reloaded from file. Data was generated in > # R v2.1.1 patched (see above).
> if (file.exists("strange.RData")) { > load("strange.RData") > } else { > load(url("http://www.maths.lth.se/help/R/strange.RData")) > } > > # First glance at data... > foo(a) > foo(c) > > ## $name > ## [1] "a" > ## > ## $storage.mode > ## [1] "integer" > ## > ## $na > ## [1] FALSE > ## > ## $nan > ## [1] FALSE > ## > ## $inf > ## [1] FALSE > ## > ## $ok > ## [1] TRUE > ## > ## [1] 15000 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## 41.0 51.0 63.0 292.2 111.0 65170.0 > ## $name > ## [1] "c" > ## > ## $storage.mode > ## [1] "double" > ## > ## $na > ## [1] FALSE > ## > ## $nan > ## [1] FALSE > ## > ## $inf > ## [1] FALSE > ## > ## $ok > ## [1] TRUE > ## > ## [1] 1 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## 53.43 53.43 53.43 53.43 53.43 53.43 > ## > > # But, trying the following, will result in > # non-reproducible error messages. Sometimes > # it errors at kk==1, sometimes at kk >> 1. > # Also, look at the different output for > # different kk:s. > for (kk in 1:100) { > cat("kk=",kk, "\n") > print(summary(a-c)) > } > > ## kk= 1 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## -7.741e+307 -2.431e+00 9.569e+00 5.757e+01 > ## kk= 2 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## -12.430 -2.431 9.569 238.700 57.570 65120.000 > ## kk= 3 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## -12.430 -2.431 9.569 57.570 65120.000 > ## kk= 4 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## -12.430 -2.431 9.569 238.700 57.570 65120.000 > ## kk= 5 > ## Min. 1st Qu. Median Mean 3rd Qu. Max. > ## -12.430 -2.431 9.569 238.700 57.570 65120.000 > ## kk= 6 > ## Error in quantile.default(object) : missing values and NaN's > ## not allowed if 'na.rm' is FALSE > > > ## Comments: If you shorten down 'a', the bug occurs less frequently. > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Compile warning in unique.c
On Mon, 29 Aug 2005, Harris, Michael (NIH/NCI) [E] wrote: > > I am getting a compile warning when building R from source. I am building > on an AMD64 Opteron system with gcc (GCC) 3.3.3 (SuSE Linux) > > The warning is: > > unique.c: In function `cshash': > > unique.c:1146: warning: cast from pointer to integer of different size > The comment immediately above this suggests that it is deliberate

/* Use hashing to improve object.size. Here we want equal CHARSXPs,
   not equal contents. This only uses the bottom 32 bits of the pointer,
   but for now that's almost certainly OK */

The warning is presumably because casting this int back to a pointer would fail (and is a common 32 to 64 bit conversion error), but that's not what is happening here. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] .Call and Segmentation Fault
On Tue, 31 Aug 2005, Peter Dalgaard wrote: > Well, did you try running under a debugger? > > R -d gdb > > then "bt" after the segfault. (Make sure that things are compiled with > -g option) It would also be worth renaming the function -- while I don't see exactly how it could be causing the problem, I would be nervous about having a C function called main() that wasn't the real main(). -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] 64 bit R for Windows
On Fri, 2 Sep 2005, Martin Maechler wrote: >>>>>> "PD" == Peter Dalgaard <[EMAIL PROTECTED]> >>>>>> on 02 Sep 2005 18:48:24 +0200 writes: > >PD> "Milton Lopez" <[EMAIL PROTECTED]> writes: > >>> I appreciate the update. We will consider using Linux, >>> which leads me to one more question: what is the maximum >>> RAM that R can use on each platform (Linux and Windows)? >>> >>> Thanks again for your prompt responses. > >PD> On Win32, something like 3GB. Maybe a little more on >PD> Linux32, but there's a physical limit at 4GB. > > for a *single* object, yes. However (and Peter knows this > probably better than me ..), R's workspace can be very much > larger which makes it realistically possible to start *using* R > functions on objects of around 4GB. No, no. On *Windows* there is an address space limit of about 3Gb (and on other 32bit systems) On a 64bit system the limit is that a vector can't have length greater than 2^31, but this would be 8Gb for integers or 16Gb for doubles and so represents larger objects than you would want to handle on most current 64-bit systems. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] .Call with C and Fortran together (PR#8122)
> > On some machines I don't get the segmentation fault problem, but I don't get > the > message "Just a simple test" either (when using "cg" as the subroutine's > name). > I believe this is a bug in R because if I change my C interface again to return > a > 0 instead of an R_NilValue, and then use it with another C program which loads > the > dynamic library and calls the function simple_program(), everything works > perfectly. > I don't think it is an R bug. I think it is because there is already a Fortran function called cg in R. The fact that changing the name matters suggests that you have a linking problem, and this turns out to be the case. When I try running your code under gdb in R as Peter Dalgaard suggested (after editing it to use R's macros for calling fortran from C instead of "cfortran.h" which I don't have), I get > .Call("simple_program") Calling the function... Program received signal SIGSEGV, Segmentation fault. 0x081604e5 in cg_ (nm=0x9e5dda4, n=0xbfefccfc, ar=0xbfefcce8, ai=0x89a826, wr=0x9e5dda4, wi=0x9790cc0, matz=0x56090a0, zr=0x80992d4, zi=0x0, fv1=0x0, fv2=0x9e745f8, fv3=0x89a810, ierr=0x706d6973) at eigen.f:3416 3416 IERR = 10 * N Current language: auto; currently fortran That is, your program is calling the Fortran subroutine CG in eigen.f, rather than your CG. There should be some set of linker flags that makes sure your definition of CG is used, but I don't know what it would be (and it's probably very platform dependent) -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages [was: Re: [R] change in read.spss, package foreign?]
On Fri, 9 Sep 2005, Gabor Grothendieck wrote: > > I personally put NEWS, WISHLIST and THANKS files in the 'inst' > directory of all my source packages. This has the effect of copying them to > the > top level of the built version so that they are accessible from R via: > I'm not sure that WISHLIST and THANKS need to be available to people who haven't installed the package. NEWS, on the other hand, really does. One option (if it doesn't turn out to be too much work for the CRAN maintainers) would be to have an optional Changelog field in the DESCRIPTION file giving the relative path to the file. This would mean that maintainers would not all have to switch to the same format. eg for foreign Changelog: ChangeLog and for survey Changelog: inst/NEWS This might be enough to make it easy for CRAN to display these when the maintainer provides them. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages [was: Re: [R] change in read.spss, package foreign?]
On Fri, 9 Sep 2005, Gabor Grothendieck wrote: > How about if there were just a standard location and name such as inst/NEWS, > inst/WISHLIST, inst/THANKS (which has the advantage that they are > automatically > made available in the built package under the current way packages are > built) The problem is that there *isn't* a standard location. As Robert Gentleman has pointed out, if you only maintain two or three packages it isn't too bad to change them to some new layout, but if you are the bioconductor project it gets painful quite quickly. Also, there are good reasons for having NEWS in the top level directory. Nearly everything that isn't an R package does this, because it's a useful standard. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages [was: Re: [R] change in read.spss, package foreign?]
> > Standard location or a mechanism like the one you describe are both > similar amount of work (and not much at all), the HTML pages are > generated by perl and I have the parsed DESCRIPTION file there, i.e., > using a fixed name or the value of the Changelog field is basically > the same. > In which case a Changelog entry in DESCRIPTION would be a very nice addition, and would have the advantage of not requiring changes to packages. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages [was: Re: [R] change in read.spss, package foreign?]
On Sat, 10 Sep 2005, Gabor Grothendieck wrote: > > And one more comment. The DESCRIPTION file does not record the > location or existence of the various subdirectories such as R, man, > exec, etc. If NEWS is to be recorded as a meta data line item in > DESCRIPTION then surely all of these should be too so it's symmetric > and they are all on an equal footing (or else none of them > should be, which in fact I think is preferable). > I don't see any advantage in symmetry. The locations of these subdirectories are fixed and I can't see why someone trying to decide whether to install an upgrade needs to know if it has an exec subdirectory before they download the package. I also don't see why THANKS and WISHLIST should need to be visible before you download the package. CRAN does display a URL if one is given, and if these are important they could be at that URL. The changelog, on the other hand, is one piece of information that is really valuable in deciding whether or not to update a package, so it would be worth having it visible on CRAN. Since other coding standards suggest different things for the name and location of this file, a path in DESCRIPTION seems a minimal change. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages [was: Re: [R] change in, read.spss, package foreign?]
On Sat, 10 Sep 2005, Frank E Harrell Jr wrote: > I would vote for allowing a URL or external file name in in DESCRIPTION, > whose contents could be automatically displayed for the user when > needed. Our changelogs are automatically generated by CVS and are on > the web. Yes, this would be nice. However, a URL facility is already present (and you already use it, and link changelogs to the URL, as do I). -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages
On Sat, 10 Sep 2005, Seth Falcon wrote: > For what its worth, I don't like this idea of adding a ChangeLog field > to the DESCRIPTION file. > > Agreeing upon a standard location for NEWS or CHANGES or some such > seems a more simple solution. As long as the presence of such a file > is *optional*. And if the location really needs to be at the top, > then the build tools could grab it from there as they do the > DESCRIPTION file. We're certainly agreed on its being optional. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue tracking in packages [was: Re: [R] change in read.spss, package foreign?]
On Sat, 10 Sep 2005, Gabor Grothendieck wrote: > On 9/10/05, Thomas Lumley <[EMAIL PROTECTED]> wrote: >> On Sat, 10 Sep 2005, Gabor Grothendieck wrote: >>> >>> And one more comment. The DESCRIPTION file does not record the >>> location or existence of the various subdirectories such as R, man, >>> exec, etc. If NEWS is to be recorded as a meta data line item in >>> DESCRIPTION then surely all of these should be too so its symmetric >>> and they are all on an equal footing (or else none of them >>> should be, which in fact I think is preferable). >>> >> >> I don't see any advantage in symmetry. The locations of these > > The present discussion is where the change information may be located > but that is also true of the source and other information. We could > just as easily have a field in the DESCRIPTION that tells the build > where to find the R source. > It's really the same issue. > There are two important differences. 1/ No existing package has its source anywhere other than in the R subdirectory. Existing packages have their change logs in different places and different formats. 2/ Having source code where it will not be found must be an error -- making the source code available to R *cannot* be optional. Making a change log available *must* be optional. -thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] ptr_R_EditFile, R_WriteConsole, and R_ShowMessage
Hi! I have an application embedding R. For that, of course, it is great that since R 2.1.0 the pointers in Rinterface.h allow me to override some callbacks easily. However, after implementing/overriding a couple of those, I'm a bit confused about when exactly they get called. So, here are a few specific questions: ptr_R_EditFile: I can find exactly one point in the R sources where ptr_R_EditFile actually seems to be used (at least if non-NULL). By default the pointer is set to NULL with the comment "for futur expansion". I wonder: 1) Why is this needed at all? Shouldn't the more generic R_EditFiles (ptr_R_EditFiles) suffice for the more specific case of editing a single file? 2) Why is ptr_R_EditFiles only available on aqua? Ok, it says on other platforms this does not currently work. But if I were able to create a working implementation in my application, why shouldn't I be allowed to override it (ok, I still can by just declaring it extern, but it's not exported in Rinterface.h)? R could still check ptr_R_EditFiles for NULL before using it. 3) Am I correct in assuming that the parameter char* buf is supposed to hold the filename? R_ShowMessage (ptr_R_ShowMessage): This one, too, seems to have very few use-cases (but at least some). Most seem to be for errors during startup. I wonder: 1) If this callback is most useful during startR (...), can it even be used in a meaningful way? After all, startR () also initializes all the callbacks to the standard values. 2) That aside, what is the policy for R_ShowMessage? Can I assume all messages shown this way are errors of some sort? Or could there also be mere informational messages (which in a GUI would be presented in slightly different ways)? R_WriteConsole (ptr_R_WriteConsole): This is a great callback. It will allow me to get rid of my hacky sinks (currently I use a sink to a file to retrieve the output). Even better would be an additional callback ptr_R_WriteErr.
Is there any particular reason why this does not exist? Could it be added? Thanks! Thomas __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] ptr_R_EditFile, R_WriteConsole, and R_ShowMessage
> R_ShowMessage (ptr_R_ShowMessage): > This one, too, seems to have very few use-cases (but at least some). Most > seem to be for errors during startup. > I wonder: > 1) If this callback is most useful during startR (...), can it even be used > in a meaningful way? After all, startR () also initializes all the > callbacks to the standard values. Sorry, of course I meant to write Rf_initEmbeddedR (...). I got confused as I have this and a few other initialization calls in a function called startR (...). __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] incomplete make clean for grDevices ( Windows only) (PR#8137)
Full_Name: Thomas Petzoldt
Version: R 2.2.0 alpha
OS: Windows
Submission from: (NULL) (141.30.20.2)

Symptom: If one moves a source tree to another drive letter, a subsequent compile will fail when compiling grDevices. The bug is found on Windows only.

Reason: When performing a "make clean" for the complete installation, several files (in particular *.d) are not cleaned up.

Suggested solution: Modify Makefile.win so that "clean" deletes *.d (and possibly some others??):

clean:
	$(RM) $(DLLNAME).dll *.a $(OBJS) $(DLLNAME).def grDevices_res.rc *.d
Re: [Rd] Typo [Was: Rd and guillemots]
On Fri, 16 Sep 2005, [EMAIL PROTECTED] wrote:
> The name of the "continental" quotation mark « is "guillemet".

For anyone who is still confused:

Left pointing guillemet « (U+00AB): http://www.mathmlcentral.com/characters/glyphs/LeftGuillemet.html
Left pointing guillemot (Uria aalge): http://www.rspb.org.uk/scotland/action/disaster/index.asp
Right pointing guillemet » (U+00BB): http://www.mathmlcentral.com/characters/glyphs/RightGuillemet.html
Right pointing guillemot (Uria aalge): http://www.yptenc.org.uk/docs/factsheets/animal_facts/guillemot.html

-thomas
Re: [Rd] Typo [Was: Rd and guillemots]
On Fri, 16 Sep 2005, Thomas Lumley wrote:
> On Fri, 16 Sep 2005, [EMAIL PROTECTED] wrote:
>> The name of the "continental" quotation mark « is "guillemet".
> For anyone who is still confused:

It should perhaps be noted that the PostScript name for the Unicode "Left pointing guillemet" is guillemotleft, which explains some of the confusion. There does not seem to be a PostScript name for "Left pointing guillemot".

-thomas
Re: [Rd] as.data.frame segfaults on large lists (PR#8141)
Under Valgrind on x86_64 I get

==27405== Access not within mapped region at address 0x33FFEFD8
==27405==    at 0x447045: Rf_substituteList (coerce.c:2003)
==27405== Stack overflow in thread 1: can't grow stack to 0x33FFEF98

-thomas

On Sun, 18 Sep 2005, Peter Dalgaard wrote:
> [EMAIL PROTECTED] writes:
>> Full_Name: Ulrich Poetter
>> Version: 2.1.1
>> OS: i686-pc-linux-gnu FC2
>> Submission from: (NULL) (134.147.95.187)
>>
>> as.data.frame() segfaults on lists with very many elements:
>>
>> dfn <- rep(list(rep(0,2)),198000)
>> test <- as.data.frame.list(dfn)
>>
>> Process R segmentation fault at Sun Sep 18 17:06:02 2005
>
> Not for me on FC4. The process size grows to about 180M and the system
> thrashes badly, but the calculation runs to completion. It's not unlikely
> that we are ignoring a failed allocation somewhere, but there's not much
> hope of finding it from the available information. You could try running
> under gdb and see where things go wrong for you.
>
> --
> Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
> Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
> Ph: (+45) 35327918  FAX: (+45) 35327907  ([EMAIL PROTECTED])

Thomas Lumley, Assoc. Professor, Biostatistics, [EMAIL PROTECTED], University of Washington, Seattle
Re: [Rd] Subscripting fails if name of element is "" (PR#8161)
On Fri, 30 Sep 2005, "Jens Oehlschlägel" wrote:
> Dear all,
> The following shows cases where accessing elements via their name fails (if
> the name is a string of length zero).

This looks deliberate (there is a function NonNullStringMatch that does the matching). I assume this is because there is no other way to indicate that an element has no name. If so, it is a documentation bug -- help(names) and FAQ 7.14 should specify this behaviour. Too late for 2.2.0, unfortunately.

-thomas

> Best regards
> Jens Oehlschlägel
>
> # -- replication code --
> p <- 1:3
> names(p) <- c("a", "", as.character(NA))
> p
> for (i in names(p)) print(p[[i]])   # works: prints 1, 2, 3
> # error 1: vector subscripting with "" fails on the second element
> for (i in names(p)) print(p[i])     # second element yields NA instead of 2
> # error 2: print method for list shows no name for the second element
> p <- as.list(p)
> for (i in names(p)) print(p[[i]])   # works: prints 1, 2, 3
> # error 3: list subscripting with "" fails on the second element
> for (i in names(p)) print(p[i])     # second element yields NULL instead of 2
>
> version: R 2.1.1 (2005-06-20), platform i386-pc-mingw32

Thomas Lumley, Assoc. Professor, Biostatistics, [EMAIL PROTECTED], University of Washington, Seattle
Re: [Rd] access to R parse tree for Lisp-style macros?
On Mon, 3 Oct 2005, Duncan Murdoch wrote:
> On 10/3/2005 3:25 AM, Andrew Piskorski wrote:
>> R folks, I'm curious about possible support for Lisp-style macros in
>> R. I'm aware of the "defmacro" support for S-Plus and R discussed
>> here:
>>
>> http://www.biostat.wustl.edu/archives/html/s-news/2002-10/msg00064.html
>>
>> but that's really just a syntactic short-cut to the run-time use of
>> substitute() and eval(), which you could manually put into a function
>> yourself if you cared to. (AKA, not at all equivalent to Lisp
>> macros.)

Well, yes and no. It is a syntactic shortcut using functions, but what it does is manipulate and then evaluate pieces of parse tree. It doesn't have the efficiency under compilation that real macros would, but we don't have compilation. It doesn't have gensyms, but again, R fails to support these in a fairly fundamental way, so they have to be faked using variables with weird random names. I have a long-term plan to add real macros, but not until after Luke Tierney's byte-code compiler is finished.

-thomas
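The "manipulate and then evaluate pieces of parse tree" idea behind defmacro/substitute()/eval() can be sketched in a few lines of C. This is a deliberately toy model with invented names (expression nodes, a HOLE placeholder standing in for an unevaluated argument); it is not how R's SEXP trees actually work, but it shows the two-step expand-then-evaluate pattern:

```c
#include <stdlib.h>

/* Toy expression tree: numbers, a placeholder, and two operators.
   Trees are deliberately leaked to keep the sketch short. */
typedef enum { NUM, HOLE, ADD, MUL } kind_t;

typedef struct expr {
    kind_t kind;
    double num;           /* for NUM */
    struct expr *l, *r;   /* for ADD / MUL */
} expr;

static expr *mk(kind_t k, double n, expr *l, expr *r) {
    expr *e = malloc(sizeof *e);
    e->kind = k; e->num = n; e->l = l; e->r = r;
    return e;
}

/* substitute(): replace every HOLE in the template with the argument tree */
static expr *subst(expr *tmpl, expr *arg) {
    if (tmpl->kind == HOLE) return arg;
    if (tmpl->kind == NUM)  return tmpl;
    return mk(tmpl->kind, 0, subst(tmpl->l, arg), subst(tmpl->r, arg));
}

/* eval(): reduce the expanded tree to a number */
double eval_expr(expr *e) {
    switch (e->kind) {
    case NUM: return e->num;
    case ADD: return eval_expr(e->l) + eval_expr(e->r);
    case MUL: return eval_expr(e->l) * eval_expr(e->r);
    default:  return 0;  /* an unfilled HOLE */
    }
}

/* a "macro" with template x * 2 + 1, expanded and evaluated at call time */
double double_plus_one(double x) {
    expr *tmpl = mk(ADD, 0,
                    mk(MUL, 0, mk(HOLE, 0, 0, 0), mk(NUM, 2, 0, 0)),
                    mk(NUM, 1, 0, 0));
    return eval_expr(subst(tmpl, mk(NUM, x, 0, 0)));
}
```

As in the post above, this buys convenience but not compilation-time expansion: the substitution still happens at every call.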
[Rd] Catching warning and error output
Hi all,

I'm working on a GUI frontend for R, and I'm looking for a good way to catch all warning and error output. The reason for this is mostly that I would like to know which sections of the output are "normal" output, which are warnings, and which are errors. This would allow for some nice features, such as highlighting warnings and errors in a different color, or popping up a message like "There were these warnings while doing this-or-that:". Maybe a good solution for this already exists, but I have not been able to find it. Therefore, I'll outline what I have tried/considered so far, and what I think might be ways to add a corresponding API. Of course, I'm not very familiar with R, so these suggestions are probably not "the right way to do it".

What I've tried so far:

1) Directing stderr output to a file (using sink ()): This solution splits warnings and errors from "regular" output. Using options (error=quote (myhandler ())) I can additionally tell warnings and errors apart after the fact. However, this solution is problematic in that the output is then handled asynchronously, and I cannot tell at which position in the output a warning should have been printed. Further, if the user runs one "sink ()" too many (the user has a sort of console for direct interaction with R), the output handling will be broken in my GUI.

2) Use options (warning.expression=...) and options (error=...): Unfortunately, it turned out that when setting warning.expression to something non-NULL, the warning is discarded completely. There is no way to get at the warning that would have been generated.

3) Use condition handlers:
3a) Wrap each call inside withCallingHandlers (...): This would probably work, but would add quite a bit of overhead for string manipulation and parsing. The GUI runs a lot of small commands during normal operation. There are additional hassles, such as error messages would then all look like "Error in withCallingHandlers(...", and I'd have to use yet more string operations to make them look normal. Also: can I be sure that all warnings (i.e. even from C code) are signalled as conditions? I'm afraid I do not fully understand the internal goings-on here. If all else fails, I'll have to use this, but I'm not very happy with it.
3b) Use ".Internal (.addCondHands (...))" once to set up persistent handlers: This worked fine when testing it in the R console. However, in my GUI, calls are actually handled using R_tryEval (..., R_GlobalEnv, ...). It seems the condition handlers do not carry over between two successive calls of R_tryEval. So effectively, I'd once again be back at 3a.

What I would like to have: Of course there would be many different ways to accomplish this. Here are some solutions I can think of:

1) In my programmers' dream, there'd simply be three callbacks that I can override: R_WriteConsole (exists), R_WriteWarning, and R_WriteError. R_WriteWarning and R_WriteError would be called from vwarningcall_dflt and verrorcall_dflt, respectively, instead of REprintf. The default implementation of those would of course simply call REprintf. Drawbacks: a) REprintf is available for use in R_ext/Print.h. I'd miss out on any direct calls of REprintf, while those should probably still be recorded as a warning/error. b) I'd have to add a fairly elaborate implementation in order to honor any sinks the user has (purposefully) created -- if I can even access all necessary data/functions from my code (I have not investigated that yet).

2) Similar, but add the hook a little later, in REvprintf. There, instead of directly calling R_WriteConsole, a new callback R_WriteError would be used (which defaults to R_WriteConsole). Using this callback I would get a stream of both warnings and errors, separate from "regular" output. I could tell warnings and errors apart using options (error=...).

3) Yet a little less intrusive: Somehow use a fake R_ErrorCon: there I'd simply add my own callbacks for con->vprintf and con->fflush, then proceed as in 2. However, it seems Rconnection and the related functions are too tightly guarded against this approach. I simply don't see a way how I could fake this connection (but maybe there is one?). Maybe a public API could be added to allow this.

Ok, so much for my helpless attempts. Could you help me with this? Thanks!

Thomas
Re: [Rd] Catching warning and error output
> What I would like to have:
> Of course there would be many different ways to accomplish this. Here are
> some solutions I can think of:

I just had an additional idea:

4) Export inError and inWarning from errors.c: I'm not entirely sure this would really work as expected, but if it did, it would be fairly neat. I would receive all output through R_WriteConsole. There (in my implementation of the callback), I'd check "int inError" and "int inWarning", and classify the output accordingly. Just as in solution 1), I still could not figure out a direct call of REprintf; such output would still be classified as "regular".

Regards
Thomas
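Idea 4 above can be sketched as follows. This is a self-contained model with invented names (in_error/in_warning stand in for the internal errors.c flags; classify_write stands in for the front-end's console callback), not the real R internals: the front-end receives all text through one callback, and classifies each chunk by consulting the status flags at the moment it arrives.

```c
#include <string.h>

/* Stand-ins for R's internal status flags in errors.c (invented names). */
static int in_error = 0, in_warning = 0;

typedef enum { OUT_REGULAR, OUT_WARNING, OUT_ERROR } out_class;

static out_class last_class;

/* The front-end's single write-console callback: classify by current flags.
   A real GUI would color or route the text here based on the class. */
static void classify_write(const char *buf) {
    if (in_error)        last_class = OUT_ERROR;
    else if (in_warning) last_class = OUT_WARNING;
    else                 last_class = OUT_REGULAR;
    (void)buf;
}

/* Model of what the warning printer would do: set the flag around output. */
out_class emit_warning(const char *msg) {
    in_warning = 1;
    classify_write(msg);
    in_warning = 0;
    return last_class;
}

/* Regular output arrives with no flag set and is classified accordingly. */
out_class emit_regular(const char *msg) {
    classify_write(msg);
    return last_class;
}
```

The limitation noted in the post holds in this model too: text emitted while neither flag is set (e.g. a direct REprintf call) is indistinguishable from regular output.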
Re: [Rd] 8 char labels in read.spss
On Tue, 11 Oct 2005, Knut Krueger wrote:
> I was wondering why it is possible to read long labels from the CSV
> files but not from the SPSS files.

The SPSS file formats are not documented, and so we rely on the code from PSPP. At the time, PSPP did not read long variable names. It now does, so it would be possible for someone to update the SPSS-reading code to handle long variable names. This is much more complicated than just changing a #define; the long variable names are stored in a different part of the file. I don't expect anyone on R-core to get around to this any time soon. If you want to try, the current PSPP code is at http://savannah.gnu.org/projects/pspp

-thomas

> I did not have much time to search for the code, but in the
> foreign_0.8-10 source file var.h.in I found:
>
>> /* Definition of the max length of a short string value, generally
>>    eight characters. */
>> #define MAX_SHORT_STRING ((SIZEOF_DOUBLE)>=8 ? ((SIZEOF_DOUBLE)+1)/2*2 : 8)
>> #define MIN_LONG_STRING (MAX_SHORT_STRING+1)
>>
>> /* FYI: It is a bad situation if sizeof(R_flt64) < MAX_SHORT_STRING:
>>    then short string missing values can be truncated in system files
>>    because there's only room for as many characters as can fit in a
>>    R_flt64. */
>> #if MAX_SHORT_STRING > 8
>> #error MAX_SHORT_STRING must be less than 8.
>> #endif
>
> Am I right that there was a restriction in the year 2000? The files are
> from the year 2000.
>
> Now there are some questions:
> Did I find the right code?
> Is it possible that anybody could recompile this with long value names,
> or where is the best manual for a quick start in compiling packages?
>
> A couple of weeks ago I found a thread where somebody described a complete
> way of building packages. He described all of his problems, and there were
> a lot of hints for the first steps, but I am not able to find it again --
> I don't know the search terms which I used before :-(
>
> with regards
> Knut

Thomas Lumley, Assoc. Professor, Biostatistics, [EMAIL PROTECTED], University of Washington, Seattle
Re: [Rd] Catching warning and error output
Dear R developers,

since nobody has pointed me to an existing solution so far, here is a more specific suggestion. Attached you will find two patches against R version 2.2.0.

The first patch does two things:
1) Add a function R_WriteErrConsole similar to R_WriteConsole, and a corresponding ptr_R_WriteErrConsole.
2) Make inError, inWarning, and inPrintWarnings from errors.c accessible (read-only) in R_ext/Error.h.

I believe these changes to be minimally invasive. I did not test-compile on Windows, but the change in gnuwin32 should be trivial.

The second patch adds a test/example to tests/Embedding, showing how this added API can be used to identify warnings and errors in the output stream.

If you need any further comments, a different format of patches, or anything else in order to evaluate this proposal, please let me know. Thanks!

Thomas
Re: [Rd] Catching warning and error output
> since nobody has pointed me to an existing solution so far, here is a more
> specific suggestion. Attached you will find two patches against R version
> 2.2.0.

Sorry, it seems the attachments were stripped (at least they don't show up in the archive). Here are links to the two diffs in question:

http://rkward.sourceforge.net/temp/classify_output_patch.diff
http://rkward.sourceforge.net/temp/classify_output_test.diff

Thomas
Re: [Rd] 8 char labels in read.spss
On Tue, 11 Oct 2005, Knut Krueger wrote:
> I found a definition of the SPSS files:
> http://www.wotsit.org/download.asp?f=spssdata
> but they recommend to use the SPSS input/output DLL to ensure upward
> compatibility.

"Well, they would say that, wouldn't they" (Rice-Davies 1963)

Unfortunately, that document describes an old file format. It doesn't describe record 7 subtype 13, which is where the long variable names live.

-thomas
Re: [Rd] SPSS I/O dll was: 8 char labels in read.spss
On Wed, 12 Oct 2005, Knut Krueger wrote:
> Thomas Lumley schrieb:
>> Unfortunately, that document describes an old file format. It doesn't
>> describe record 7 subtype 13, which is where the long variable names live.
>
> What about using the SPSS input DLL for those R users who would like to use
> their old SPSS files? Most universities here have SPSS licences, and
> therefore the DLL is available. I have not yet found any copyright notice
> on the developer part of the SPSS CD. Maybe the DLL is usable with one's
> own program without the SPSS licence; I will check whether using the I/O
> DLL is possible with R. All necessary C code is delivered with the
> developer kit on the SPSS CD.

Yes, but it can't be incorporated in the "foreign" package, which has a GPL license, and in any case it wouldn't work off Windows. It's not clear that it would be much easier than using the PSPP code in any case.

-thomas
Re: [Rd] [R-gui] R GUI considerations (was: R, Wine, and multi-threadedness)
Hi,

> Qt is C++, cross-platform using native widgets on OS X and Win and (since
> more recently) available without fee or license woes provided it is used
> for GPL'ed code.
>
> So it satisfies both the requirement to make it look and feel native
> wherever possible, and satisfies the preference for an OO paradigm for GUI
> programming.
>
> Would it be an alternative? Is it worth a prototype app?

If you seriously consider writing a Qt app, please have a look at RKWard (at least to find out which portions of the code you could reuse). RKWard is not a pure Qt app, however, but a KDE app. KDE 4 is promised to be cross-platform (but won't be released for another year or so), so I hope RKWard will be cross-platform then. Also, note that RKWard uses a somewhat different approach than most other R GUIs (as far as I know), in that _all_ the GUI stuff is done in C++ code (or plugins). There is no API to build GUI elements from R in RKWard, and I don't have plans to add one in the near future.

On the discussion initiated by Philippe: I don't think there will ever be a single united R GUI. It's not as if this discussion has not come up before. I agree this has something to do with individualistic developers, and I'll admit to being one myself. But there are other reasons as well. What I do believe is that there could be collaboration in some areas. Years ago I proposed a standard for defining some GUI elements. Philippe was pretty much the only person expressing interest at that time, but he now uses an entirely different approach. Another area could be drawing up a flexible output format, and R methods to create such output. R2HTML does a pretty good job for the time being, but ultimately we'll want an output format that does some more abstraction, for instance to allow changing the number of digits to display on the fly, etc. If we could agree on a common standard for this, it could save all projects a lot of effort. (But no, I haven't worked out any specific ideas for this yet.)

I think you'll have a hard time convincing any of the projects to give up their individualistic approaches (including any agreement even on which programming language to use). All I can see is that some projects might share some common standards.

Regards
Thomas
Re: [Rd] [R-gui] R GUI considerations
Hi Philippe,

> I answer to Marc's email, because I think it is the most constructive
> one. I am a little bit disappointed that the discussion about R GUIs,
> whatever the initial subject, inevitably shifts to an endless discussion
> about which graphical toolkit to use, and whether one should interface
> it directly or by means of an intermediary language perhaps more
> suitable for handling widget events.
>
> Should I recall that this thread is *not* about which graphical toolkit
> to use, but is trying to trigger a discussion on how we could work
> together to avoid duplication in coding for R GUIs, and perhaps join in
> a common project. Something totally different!

OK, I misunderstood you then. I thought you were in fact talking about drawing up a common, ultimate R GUI project. In that case, discussions about which toolkit to use, etc., would arise quite naturally, and that -- as I wrote in my last mail -- is what I don't think will ever lead anywhere. If you're talking about finding certain defined areas of collaboration, I'm all for doing that. I don't think I will attend useR, but I'm certainly open to discussions of this sort. I support the idea of a wiki.

Here are some thoughts on what I think might be areas of collaboration:

1) You talk about an API for R GUIs written in R. To me personally, this is not an attractive topic, as my approach is to do _all_ GUI stuff outside of R (I think of R more as an evaluation backend in RKWard). Of course this is no reason against starting such a discussion. However, in addition, I'd like to bring in my idea of XML-specified GUIs again (this is useful in my approach, but probably not so much in yours). This could be discussed as a _separate_ topic.

2) As I mentioned in the last mail: drawing up a flexible output format that allows small modifications on the fly. This includes R functions to generate such an output format. Something like R2HTML, only more flexible, allowing for dynamic changes.

3) R library API enhancements. GUIs have some specific needs for the R (C/library) API. Such needs do not seem to be top priority for the R core developers (no offence intended). Last week I proposed some API additions to the R C API on r-devel, but never received a reply. I have several more items in mind that I would like to see added/changed in the main R library. Probably it would be good to have some collaboration on identifying and elaborating our C API needs.

Many more areas of collaboration could probably be identified. However, I'd strongly advise keeping such topics small, well defined, and separate from each other. Otherwise we'll once again end up in the discussion about the great merger of all R GUI projects, which I don't think will lead anywhere.

Regards
Thomas
[Rd] RFC: API to allow identification of warnings and errors in the output stream
Dear R developers,

this mail is basically a summary / clarification of some previous mails I sent to this list last week (subject "Catching warning and error output"). Duncan Murdoch pointed out that these were badly organized, and suggested I repost, collecting the main points previously spread out over several mails.

Problem statement:
In an application embedding R, I would like to identify warnings and errors in the output stream. That is, basically, I'm looking for a good way to find out which portions of the output stream are warnings, which are error messages, and which are regular output. Such information could be used in a GUI to mark up different types of output in different colors, or to ensure warnings are brought to the user's attention in situations where output would normally be hidden. I believe this is not currently possible in a good way. For a discussion of approaches using the current API, and their drawbacks, refer to https://stat.ethz.ch/pipermail/r-devel/2005-October/034975.html.

Suggested solution:
Here are two patches (against R-2.2.0), which I will describe in more detail below:
A) http://rkward.sourceforge.net/temp/classify_output_patch.diff
B) http://rkward.sourceforge.net/temp/classify_output_test.diff

Description of and rationale for patch A:
This patch does two things:

1) It adds a function R_WriteErrConsole corresponding to R_WriteConsole. Also, a corresponding interface pointer ptr_R_WriteErrConsole is added. I think this change is highly consistent with output handling in R in general. There are different sink ()s for "output" and "message", and there is the distinction between Rprintf/Rvprintf and REprintf/REvprintf. This basically just carries that distinction over to the interface pointers. You will note that I wrote

	ptr_R_WriteErrConsole = R_WriteConsole;

instead of

	ptr_R_WriteErrConsole = Rstd_WriteErrConsole;

or

	ptr_R_WriteErrConsole = ptr_R_WriteConsole;

This way, code that currently overrides ptr_R_WriteConsole, but does not know about ptr_R_WriteErrConsole, will still receive all output in ptr_R_WriteConsole. Hence this change is perfectly backwards compatible. Of course, the naming may not be perfect?

2) While 1) makes it easy to split warnings and errors (and messages) from regular output, it does not suffice to differentiate _between_ warnings, errors, and messages on the error channel. The patch addresses this by making inError, inWarning, and inPrintWarnings from errors.c accessible (read-only) in R_ext/Error.h. This part of the patch may be slightly more delicate, in that inError, inWarning, and inPrintWarnings seem to be more internal status indications. So the question is: are those potentially subject to change if errors.c is redesigned some day? My feeling is that this is still relatively safe, however. Such indications can easily be set/unset at the entry/exit points of the corresponding functions, as long as there are separate handlers for warnings, errors, and printWarnings. I guess this is not likely ever to change fundamentally. As a slightly safer variation, R_inError (), R_inWarning (), and R_inPrintWarnings () could return a boolean indication only, instead of the int code, which might be more subject to change.

Description of and rationale for patch B:
Patch B probably helps a lot to illustrate how patch A works, and how it is useful. It adds an example to tests/Embedding/. This example basically just generates a number of different types of messages, then catches the output and classifies it into several categories.

I'll gladly provide more explanations or a different format of patches if needed.

Looking forward to your comments
Thomas Friedrichsmeier
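The backwards-compatibility argument in patch A (initializing the new pointer to the console-writing *function*, not to a separate stderr route) can be sketched with a self-contained model. All names here are invented stand-ins for the real pointers: an old-style front-end that only overrides the regular console pointer still sees all output, while a new-style front-end that also overrides the error pointer gets separated streams.

```c
#include <string.h>

/* Capture buffers standing in for the front-end's two output views. */
static char out_log[256], err_log[256];

typedef void (*writer_t)(const char *buf);

static void console_writer(const char *buf) {  /* the regular console route */
    strncat(out_log, buf, sizeof(out_log) - strlen(out_log) - 1);
}

static void err_writer(const char *buf) {      /* a new-style override */
    strncat(err_log, buf, sizeof(err_log) - strlen(err_log) - 1);
}

/* The key line: the new error pointer defaults to the SAME function as the
   console route, so by default nothing changes for existing front-ends. */
static writer_t ptr_write_console     = console_writer;
static writer_t ptr_write_err_console = console_writer;

/* Models of Rprintf-style vs REprintf-style output paths. */
static void model_printf(const char *buf)  { ptr_write_console(buf); }
static void model_eprintf(const char *buf) { ptr_write_err_console(buf); }

void demo(int split_streams) {
    out_log[0] = err_log[0] = '\0';
    ptr_write_err_console = split_streams ? err_writer : console_writer;
    model_printf("[1] 1\n");
    model_eprintf("Error: oops\n");
}

const char *demo_out(void) { return out_log; }
const char *demo_err(void) { return err_log; }
```

With demo(0), both messages land in the regular log in order (old behaviour); with demo(1), the error text is diverted to the separate error log.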
Re: [Rd] RFC: API to allow identification of warnings and errors in the output stream
> I am strongly opposed to locking in anything from the C internals of
> error handling that is not already part of the API. This is all very
> much subject to change, and anything along the lines you propose will
> make that change more difficult.

Let's discuss this in two separate parts, then.

The first is: adding ptr_R_WriteErrConsole as an analogue to ptr_R_WriteConsole. I can see nothing wrong with that (of course I'm not an R developer), and it would already help me a lot. As I pointed out in the previous mail, a distinction between a "message" channel and an "output" channel is made everywhere in R, except in the interface pointers. R_WriteErrConsole comes into play at the very end of the "message" channel (REvprintf), and only there, exactly parallel to how R_WriteConsole works/gets invoked. Even if you're going to reject the other part of the proposed patch, please consider this small addition. If it helps, I can provide a stripped-down patch for that.

I'll discuss the second part (making inError, inWarning, and inPrintWarnings available) in more detail below.

> Condition handling was added to make this sort of thing possible. If
> there are aspects of condition handling, or your understanding of
> condition handling, that need to be improved then we can work on that.

Both are quite possible. However, I have reason to believe it's not just my understanding of condition handling that is lacking. If it is, I'll happily accept suggestions. Here's why I think condition handling will not help me much.

First, for clarification, let me tell you some more about what I need: besides other GUI elements, there is a pseudo-console for running commands in R interactively (or at least providing that illusion). In reality, all commands are evaluated using R_tryEval (). Still, the user should not see any of this, and the "console" should behave mostly just like R in a plain terminal, including how errors are printed (plus marking up errors in another color, however).

So how could I use condition handlers?

a) Wrap each call inside withCallingHandlers (...) before evaluating it in R_tryEval (): This would probably work, but would add quite a bit of overhead for string manipulation and parsing. The GUI runs a lot of small commands during normal operation. There are additional hassles, such as error messages would then all look like "Error in withCallingHandlers(...", and I'd have to use yet more string operations to make them look normal. Also: can I be sure that all warnings (i.e. even from C code) are signalled as conditions? I'm afraid I do not fully understand the internal goings-on here.

b) Use ".Internal (.addCondHands (...))" once to set up persistent handlers: This worked fine when testing it in the R console. However, when testing in my GUI, it seems the condition handlers do not carry over between two successive calls of R_tryEval. So effectively, I'd once again be back at a).

Of course, if there were an API to efficiently set up condition handlers from C, persisting over at least one call of R_tryEval, I could in fact use that, and would happily do so. I have not found something like that, however.

Thanks!
Thomas
[Rd] on ptr_R_WriteErrConsole (was: Re: RFC: API to allow identification of warnings and errors in the output stream)
s to present stdout() and stderr() > separately would have to be a review of how they are used. This is not > hypothetical: I have struggled many times over the years with an R-like > system which when scripted wrote error messages in inappropriate places on > a file, at one point sending prompts to stderr yet echoed input to stdout. > We've worked hard in R to avoid that kind of thing, mainly by having a > single route to the console. (There are still residual problems if C++ or > Fortran I/O is used, BTW, and note that R_ReadConsole also *writes* to the > console.) And the single route to the console will remain intact using R_WriteConsole. I'm only asking for the *opportunity*, not the *obligation* to intercept the one call to R_WriteConsole in REvprintf. > c) Anything in R involving more than one of the three main families of > platforms is NOT a `small addition': it involves testing and subsequent > maintenance by two or three people. So, yes, somebody would have to add corresponding code to windows. Unfortunately, I don't think I qualify for this, as I just can't test on windows. I'm fairly confident, the change I did in gnuwin32 will ensure nothing is broken, but you would want a parallel to ptr_R_WriteErrConsole in windows for consistency's sake. But please: Don't conjure up a maintenance nightmare for this simple change. > d) There is an existing mechanism that could be used. If you want > file-like stderr and stdout, you could drive R via a file-like interface > (e.g. ptys). That is not easy on non-Unix-alike platforms (and was > probably impossible on classic MacOS R), but I understood Thomas was using > KDE. (There are live examples of this route, even on Windows.) And it's not like I haven't ventured along that route. Do you know how much fun it is to use two separate ptys, then try to make sure the output arrives in a sensible order, i.e. you don't get all warnings before all output or vice versa, or some strange mixture? 
I'm not the infailable programmer, and I'm not an expert in R internals. But before I go into lengthy discussions, I have checked my options. So again: Why do I want something like this? A GUI may want to do some things, which are not needed on the console. One such thing is to identify "message" output. There are several uses for this: 1) Highlight "message" output to bring it to the users attention 2) Offer context help on "message"s. Of course this is easier said that done, but the first step in this, would be to find out which portions of the output are messages. 3) Show "message"s that come up in operations that would usually redirect the output elsewhere, and not show it to the user directly. Much like the scripting situation you depicted in b) And again: Why can't I just use current facilities? 1) Condition Handling: Probably the way to go for my advanced needs. But totally useless, if I want to catch messages generated using direct calls to REprintf/REvprintf. Those are abundant. They have to be dealt with. The only way to do this is to use a mechanism available in/below/after REvprintf. 2) Ptys, sinks: See above 3) Grepping: come on now. Could not even solve most needs, impossible due to internationalization... 4) Using mind reading? When I first posted about these matters to r-devel (and yes, before that I tried my luck on r-help), it was a plain support question: How can I do this? See also here: https://stat.ethz.ch/pipermail/r-devel/2005-October/034975.html I did not receive any reply on this. What should I conclude? I'll gladly accept alternative solutions. They are not the above though. And I've written why they are not before. > I have spent far longer (and written more) than I intended on this. The > length of correspondence so far (and much in a prolix style) is all part > of the support costs. 
One thing the R project can not afford is to > explain to individual users how internals work -- we have not even been > able to find the time to write down for the core team how a lot of the > internals work, and some developments are being held up by that. So this > has to be weighed against considering proposals which would appear to help > just one user. Sorry about writing more and more lengthy mails. I don't really want to; I have better things to do as well. But this is important, and - sorry to say - IMO you've simply overlooked a number of points. All I can do is restate them, trying to make extra sure to get my point across. > I suspect that we will only want to go forward if a concise and strong > case can be made from a group of developers who have tested it out on all > three main families of platforms. And where do you think I could find such a group of developers, if not on this list? Regards Thomas Friedrichsmeier __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] on ptr_R_WriteErrConsole
> I'll volunteer to do your testing on Windows, provided nobody suggests a > way to do what you want without your patches. That is, if there's a way > for a custom front-end to separate the streams of text coming to > ptr_R_WriteConsole through REvprintf from that coming through Rvprintf, > then your changes are not needed. If there is currently no way to do > that, then I think there should be. That would be great. Thank you. I have created a new version of the diff (still against R 2.2.0). This only contains the R_WriteErrConsole part; the other part is stripped. Also, I have elaborated the Windows version of the handling. I hope I have not overlooked anything: http://rkward.sourceforge.net/temp/r_writeerrconsole.diff If in addition you would like me to create a test case, that would be slightly more difficult for me to do, but I'd be happy to give it a try. Just send me a mail in this case. Probably, however, I would not be able to do this before tomorrow. Regards Thomas Friedrichsmeier
Re: [Rd] Why no .Machine$sizeof.double?
On Tue, 18 Oct 2005, Earl F. Glynn wrote: > Why is there a .Machine$sizeof.longdouble but no .Machine$sizeof.double? > sizeof(double) is always 8 and sizeof(int) is always 4, because R requires the IEEE/IEC standard arithmetic types. R will not compile with any other sizes. -thomas
Re: [Rd] Why no .Machine$sizeof.double?
On Tue, 18 Oct 2005, Earl F. Glynn wrote: > "Thomas Lumley" <[EMAIL PROTECTED]> wrote in message > news:[EMAIL PROTECTED] >> On Tue, 18 Oct 2005, Earl F. Glynn wrote: >> >>> Why is there a .Machine$sizeof.longdouble but no > .Machine$sizeof.double? >>> >> >> sizeof(double) is always 8 and sizeof(int) is always 4, because R requires >> the IEEE/IEC standard arithmetic types. R will not compile with any other >> sizes. > > But it's a common, recommended software engineering practice to define > mnemonic, named constants. > > If I were to see code like .Machine$sizeof.double * N.COL * N.ROW, I know > that's the number of bytes in a matrix of doubles. If I see code that is 8 * > N.COL * N.ROW, I can guess what "8" means, but I could guess wrong. I wrote > code that looks just like this today because I couldn't find the defined > constant. Will someone else reading my code automatically know what the "8" > means? > But why would you ever want to write either .Machine$sizeof.double * N.COL * N.ROW or 8 * N.COL * N.ROW? If you are doing memory allocation in R then numeric() automatically allocates things of the correct size. If you are doing memory allocation in C then you should use sizeof(double) or, even better, sizeof(*yourpointer). In both cases, the language has the facility to let you work in the correct units without explicit constants, named or unnamed. You would have a stronger case for arguing that there should be typedefs in C so that you didn't need to remember that R's numeric type is double (rather than, er, float?) and its integer type is int (rather than long), and in fact we do provide typedef int Sint; in R.h. -thomas
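To make the reply concrete, a small sketch (object.size() is in the standard utils package; the variable names are illustrative):

```r
## Allocation in R never needs a size constant: numeric() works in
## elements, not bytes.
x <- numeric(1000)                     # 1000 doubles, correctly sized
m <- matrix(0, nrow = 10, ncol = 20)   # a 10 x 20 matrix of doubles

## If a byte count is really wanted, ask R rather than hard-coding 8.
## object.size() includes R's object header, so it exceeds 8 * length(x).
length(x)
object.size(x) > 8 * length(x)         # TRUE
```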
Re: [Rd] [R-gui] R GUI considerations (was: R, Wine, and multi-threadedness)
> If you want users to be productive, you have to give them > something they can easily incorporate within the tools they use > on a daily basis. No big applications with everything locked in, > but a set of programs or commands that do specific tasks, with > an easy to understand input and output. You need something that > works in an open environment, so the user can use existing > skills. With a GUI that does "everything", the user has to learn > from scratch all the things that make "everything" happen. Maybe you're just seeking this discussion for the fun of it. In this case, I won't stop you. If not, do you really think it is going anywhere? You don't want/need a GUI? Fine. Don't use one, and don't write one. Do some others feel the need for a GUI? Yes, they do. Could there be at least some reason for that? Well, it's not like you can just "start using R", even if you do have the statistical background, and even if you do have basic programming knowledge. Is a bloated GUI less intimidating to some than a command line? Yes. Do all GUIs necessarily make extra sure to hide everything going on behind the scenes, so you will be kept locked in and helpless forever? No. Should everybody be forced to use a GUI? No. Does anybody advocate otherwise? Not as far as I can see. I don't think there's anything more to be said on this topic. Regards Thomas Friedrichsmeier
[Rd] [R-gui] R GUI considerations (was: R, Wine, and multi-threadedness)
> Is there a "simple" way (e.g. some socket based mechanism) to > feed commands into R and retrieve the results of those commands? > This would require that I program the sequence of commands I > want to use (or a means to generate them) and then be able to parse > the resulting structure - I understand. But it would also allow > separation of the computation, the "statistical reasoning", and > the UI into (potentially) separate units which would not even > need to be on the same machine to inter-operate. If there is a > reasonable way to do this, please tell me. You're roughly describing how things are done in rkward (http://rkward.sourceforge.net). R and the GUI do not run on different machines, or even in different processes. They run in separate threads, however, and there is a high level of separation between the two. There are probably similar implementations which can be reused more easily in new projects. However, if you want to have a look at one possible way to do it, this documentation may be useful to you: Usage perspective/overview: http://rkward.sourceforge.net/development/en/documentation/api/UsingTheInterfaceToR.html Main low level interface: http://rkward.sourceforge.net/development/en/documentation/api/classREmbedInternal.html Regards Thomas Friedrichsmeier
Re: [Rd] [R] unvectorized option for outer()
On Sun, 30 Oct 2005, Jonathan Rougier wrote: > I'm not sure about this. Perhaps I am a dinosaur, but my feeling is > that if people are writing functions in R that might be subject to > simple operations like outer products, then they ought to be writing > vectorised functions! I would agree. How about an oapply() function that does multiway (rather than just two-way) outer products? Basing the name on "apply" would emphasize the similarity to other flexible, not particularly optimized second-order functions. -thomas > Maybe it's not possible to hold this line, and > maybe "outer" is not the right place to draw it, but I think we ought to > respect the "x is a vector" mindset as much as possible in the base > package. As Tony notes, the documentation does try to be clear about > what outer actually does, and how it can be used. > > So I am not a fan of the VECTORIZED argument, and definitely not a fan > of the VECTORIZED=FALSE default. > > Jonathan. > > Gabor Grothendieck wrote: >> If the default were changed to VECTORIZED=FALSE then it would >> still be functionally compatible with what we have now so all existing >> software would continue to run correctly yet would not cause >> problems for the unwary. Existing software would not have to be changed >> to add VECTORIZED=TRUE except for those, presumably few, cases >> where outer performance is critical. One optimization might be to >> have the default be TRUE if the function is * or perhaps if it is specified >> as a single character and FALSE otherwise. >> >> Having used APL, I always regarded the original design of outer in R as >> premature performance optimization, and this would be a chance to get >> it right. >> >> On 10/28/05, Tony Plate <[EMAIL PROTECTED]> wrote: >> >>> [following on from a thread on R-help, but my post here seems more >>> appropriate to R-devel] >>> >>> Would a patch to make outer() work with non-vectorized functions be >>> considered? 
It seems to come up moderately often on the list, which >>> probably indicates that many many people get bitten by the same >>> incorrect expectation, despite the documentation and the FAQ entry. It >>> looks pretty simple to modify outer() appropriately: one extra function >>> argument and an if-then-else clause to call mapply(FUN, ...) instead of >>> calling FUN directly. >>> >>> Here's a function demonstrating this: >>> >>> outer2 <- function (X, Y, FUN = "*", ..., VECTORIZED=TRUE) >>> { >>>no.nx <- is.null(nx <- dimnames(X <- as.array(X))) >>>dX <- dim(X) >>>no.ny <- is.null(ny <- dimnames(Y <- as.array(Y))) >>>dY <- dim(Y) >>>if (is.character(FUN) && FUN == "*") { >>>robj <- as.vector(X) %*% t(as.vector(Y)) >>>dim(robj) <- c(dX, dY) >>>} >>>else { >>>FUN <- match.fun(FUN) >>>Y <- rep(Y, rep.int(length(X), length(Y))) >>>if (length(X) > 0) >>>X <- rep(X, times = ceiling(length(Y)/length(X))) >>>if (VECTORIZED) >>>robj <- FUN(X, Y, ...) >>>else >>>robj <- mapply(FUN, X, Y, MoreArgs=list(...)) >>>dim(robj) <- c(dX, dY) >>>} >>>if (no.nx) >>>nx <- vector("list", length(dX)) >>>else if (no.ny) >>>ny <- vector("list", length(dY)) >>>if (!(no.nx && no.ny)) >>>dimnames(robj) <- c(nx, ny) >>>robj >>> } >>> # Some examples >>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p} >>> outer2(1:2, 3:5, f, 2) >>> outer2(numeric(0), 3:5, f, 2) >>> outer2(1:2, numeric(0), f, 2) >>> outer2(1:2, 3:5, f, 2, VECTORIZED=F) >>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F) >>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F) >>> >>> # Output on examples >>> >>>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p} >>>> outer2(1:2, 3:5, f, 2) >>> >>> in f >>> [,1] [,2] [,3] >>> [1,]9 16 25 >>> [2,] 36 64 100 >>> >>>> outer2(numeric(0), 3:5, f, 2) >>> >>> in f >>> [,1] [,2] [,3] >>> >>>> outer2(1:2, numeric(0), f, 2) >>> >>> in f >>>
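For completeness: the mapply() branch of outer2() above can be had without patching outer() at all, by wrapping the scalar function first. A minimal sketch (`fv` is an illustrative name, not part of any proposal):

```r
f <- function(x, y, p = 1) (x * y)^p        # a scalar-minded function

## Wrap f so it maps element-wise over its first two arguments,
## then hand the wrapper to the stock outer():
fv <- function(x, y, ...) mapply(f, x, y, MoreArgs = list(...))
outer(1:2, 3:5, fv, p = 2)
##      [,1] [,2] [,3]
## [1,]    9   16   25
## [2,]   36   64  100
```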
Re: [Rd] [R] unvectorized option for outer()
On Mon, 31 Oct 2005, Liaw, Andy wrote: >> From: Thomas Lumley >> >> On Sun, 30 Oct 2005, Jonathan Rougier wrote: >> >>> I'm not sure about this. Perhaps I am a dinosaur, but my feeling is >>> that if people are writing functions in R that might be subject to >>> simple operations like outer products, then they ought to be writing >>> vectorised functions! >> >> I would agree. How about an oapply() function that does >> multiway (rather >> than just two-way) outer products. Basing the name on "apply" would >> emphasize the similarity to other flexible, not particularly >> optimized >> second-order functions. >> >> -thomas > > I'll toss in my $0.02: The following is my attempt at creating a "general > outer" that works with more than two arguments. > > gouter <- function (x, FUN, ...) { >   xgrid <- as.list(do.call("expand.grid", x)) >   names(xgrid) <- NULL >   array(do.call(deparse(substitute(FUN)), c(xgrid, list(...))), >     dim = sapply(x, length), dimnames = x) > } > Yes, that's the sort of thing I had in mind. The name "gouter" isn't exactly to my taste -- the point was that outer() is the fast but restricted function and that the general but slow function should have a different name (eg xapply or oapply). -thomas
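A self-contained usage sketch of the gouter() proposal above (`f` and the argument names are illustrative, not from the thread):

```r
gouter <- function(x, FUN, ...) {
    xgrid <- as.list(do.call("expand.grid", x))
    names(xgrid) <- NULL
    array(do.call(deparse(substitute(FUN)), c(xgrid, list(...))),
          dim = sapply(x, length), dimnames = x)
}

f <- function(a, b, c) a + 10 * b + 100 * c   # vectorised in every argument
r <- gouter(list(a = 1:2, b = 1:3, c = 1:2), f)
dim(r)             # a 2 x 3 x 2 array
r[2, 3, 1]         # 2 + 30 + 100 = 132
r["1", "2", "2"]   # dimnames come from the input list: 1 + 20 + 200 = 221
```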
Re: [Rd] [R] unvectorized option for outer()
On Tue, 1 Nov 2005, Duncan Murdoch wrote: > The version I posted yesterday did indeed mess up when some arguments were > unspecified. Here's a revision that seems to work in all the tests I can > think of. I also added the SIMPLIFY and USE.NAMES args from mapply to it, > and a sanity check to the args. > > I did notice and work around one buglet in mapply: if you choose not to > vectorize any arguments, you don't get a call to the original function, > mapply returns "list()". > > For example, > >> mapply(function(x) x^2, MoreArgs = list(x=2)) > list() > > whereas I would think 4 is a more logical answer. > I don't agree at all. The answer should be the length of the longest vectorised argument, and it is. -thomas
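The rule is easy to check at the prompt; a small sketch:

```r
## Result length follows the longest vectorised argument:
mapply(function(x, y) x * y, 1:3, MoreArgs = list(y = 2))   # 2 4 6

## With no vectorised arguments there is nothing to map over,
## hence the empty result under discussion:
mapply(function(x) x^2, MoreArgs = list(x = 2))             # list()
```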
Re: [Rd] A problem with glm() and possibly model-using functions in general?
On Fri, 18 Nov 2005, Byron Ellis wrote: > So, consider the following: > > > example(glm) > > g = function(model) { w = runif(9);glm(model,weights=w); } > > g(counts ~ outcome + treatment) > Error in eval(expr, envir, enclos) : object "w" not found > > Huh?! I suspect that somebody is lazily evaluating arguments in the > wrong environment (probably GlobalEnv in this case). I'm willing to > accept the fact that there's some mysterious reason you'd actually > want this behavior, but this looks like it should be filed as a bug > to me. Yes, there is a reason you'd actually want this behaviour, and it is documented. In help(model.frame) it says All the variables in 'formula', 'subset' and in '...' are looked for first in 'data' and then in the environment of 'formula' (see the help for 'formula()' for further details) and collected into a data frame. In your example the environment of 'formula' is the global environment, since that's where it was created. There isn't a set of scoping rules for formulas that will make everyone happy, but this lexical scope is what R has done for quite some time. -thomas
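Given that documented rule, one common workaround (a sketch, not from the thread) is to point the formula's environment at the frame that holds the extra variables; the data here are those created by example(glm):

```r
## Data as created by example(glm):
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

g <- function(model) {
    w <- runif(9)
    ## Variables are now looked up in g's frame first, falling through
    ## to the environment where g (and the data) were defined:
    environment(model) <- environment()
    glm(model, family = poisson(), weights = w)
}
fit <- g(counts ~ outcome + treatment)   # 'w' is found this time
```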
[Rd] Changes to Windows registry in R-2.2.0
R-Devel, I note from the CHANGES log accompanying the Windows version of R-2.2.0 that the behaviour with respect to the Windows registry has changed. It says: "If the user chooses to register R during installation, a registry entry HKEY_LOCAL_MACHINE\Software\R-core\R\{version}\InstallPath will be added. Users require administrative privileges to create this key. For others, the same key will be put under the HKEY_CURRENT_USER root." The old behaviour was to add or modify the registry entry at HKEY_LOCAL_MACHINE\Software\R-core\R\InstallPath (ie the same entry, but without the extra {version} key). Having installed R-2.2.0, I notice that the entry at this location, which used to say C:\Program Files\R\R-2.1.1 now says C:\Program Files\R\R-2.2.0 I also tried deleting the \R-core\R key, and re-installing R, and it added both the HKEY_LOCAL_MACHINE\Software\R-core\R\R-2.2.0\InstallPath and HKEY_LOCAL_MACHINE\Software\R-core\R\InstallPath entries In other words, the new behaviour seems to be to *both* modify/add an entry under HKEY_LOCAL_MACHINE\Software\R-core\R\InstallPath *and* HKEY_LOCAL_MACHINE\Software\R-core\R\{version}\InstallPath I note also that it adds another entry HKEY_LOCAL_MACHINE\Software\R-core\R\Current Version My questions are: (1) Am I correct that this is the new behaviour? (2) Can the appropriate developer confirm that this behaviour will be continued in future versions (at least for a while)? I ask, because I distribute software that uses R, and it uses the HKEY_LOCAL_MACHINE\Software\R-core\R\InstallPath to find R. (It will also now look under HKEY_CURRENT_USER, as documented in CHANGES.) If future versions will not update this entry, then I'll switch the behaviour of my software. (3) Might it be worth documenting this behaviour somewhere? I've searched all the files in the R-2.2.0 distribution and didn't find it, as well as looking in the recent r-devel and r-help archives. 
There is one out-of-date entry: in R-2.2.0\doc\manual\R-exts.html it says: [...] Find and set the R home directory and the user's home directory. The former may be available from the Windows Registry: it will normally be in HKEY_LOCAL_MACHINE\Software\R-core\R\InstallPath and can be set there by running the program R_HOME\bin\RSetReg.exe Perhaps I missed it elsewhere? Thanks for any help, - Len Thomas -- Len Thomas [EMAIL PROTECTED] http://www.creem.st-and.ac.uk/len/ Centre for Research into Ecological and Environmental Modelling The Observatory, University of St Andrews, Scotland KY16 9LZ Tel. (0)1334-461801 Fax. (0)1334-461800 Secretary (0)1334-461842
Re: [Rd] [R] computing the variance
> > Using the biased variance just because it is the MLE (if that is the > argument) seems confused to me. However, there's another point: > >> var(sample(1:3, 10, replace=TRUE)) > [1] 0.6680556 > > i.e. if we are considering x as the entire population, then the > variance when sampling from it is indeed 1/N*E(X-EX)^2, which is why > some presentations distinguish between the "population" and "sample" > variances. We might want to support this distinction somehow. > We might also consider that the purpose of computing the variance is often to take the square root, and that using 1/(n-1) as the divisor does not give any particular optimality as an estimator of the standard deviation. -thomas
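The two divisors are easy to relate at the prompt; a small sketch:

```r
x <- sample(1:3, 10, replace = TRUE)
n <- length(x)

var(x)                  # sample variance, divisor n - 1
var(x) * (n - 1) / n    # "population" variance, divisor n
mean((x - mean(x))^2)   # the same quantity computed directly
```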
Re: [Rd] 0/1 vector for indexing leads to funny behaviour (PR#8389) (maybe a documentation deficiency?)
On Wed, 14 Dec 2005, Tony Plate wrote: > > That's what I was trying to say: the whole truth is that numeric index > vectors that contain positive integral quantities can also contain > zeros. Upon rereading this passage yet again, I think it is more > misleading than merely incomplete: the phrasings "positive integral > quantities", and "*must* lie in the set ..." rule out the possibility of > the vector containing zeros. > "Someone told me that you can't run without bouncing the ball in basketball. I got a basketball and tried it and it worked fine. He must be wrong" -- a comp.lang.c standard It doesn't rule out the possibility of the vector containing zeros, it tells you that you should not put zeros in the vector. -thomas
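What zeros actually do in an index vector is quick to verify; a sketch:

```r
x <- c(10, 20, 30)
x[c(0, 2)]      # zeros are silently dropped: 20
x[c(2, 0, 3)]   # 20 30
x[0]            # a zero-length vector of the same type as x
```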
Re: [Rd] 2 x 2 chisq.test (PR#8415)
This is the same as PR#8265, which was reported two months ago by someone else from syd.odn.ne.jp. It still isn't a bug. According to Brian Ripley's response at that time, "almost all" the sources he checked gave the correction that R uses. -thomas On Tue, 20 Dec 2005, [EMAIL PROTECTED] wrote: > Full_Name: nobody > Version: 2.2.0 > OS: any > Submission from: (NULL) (219.66.34.183) > > > 2 x 2 table, such as > >> x > [,1] [,2] > [1,] 10 12 > [2,] 11 13 > >> chisq.test(x) > > Pearson's Chi-squared test with Yates' > continuity correction > > data: x > X-squared = 0.0732, df = 1, p-value = 0.7868 > > but, X-squared = 0.0732 is over corrected. > > when abs(a*d-b*c) <= sum(a,b,c,d), chisq.value must be 0!, and P-value must be > 1! > > code of chisq.test must be as follows > > # if (correct && nrow(x) == 2 && ncol(x) == 2) { > # YATES <- 0.5 > # METHOD <- paste(METHOD, "with Yates' continuity correction") > # } > # else YATES <- 0 > # STATISTIC <- sum((abs(x - E) - YATES)^2/E) > ## replace begin > if (correct && nrow(x) == 2 && ncol(x) == 2) { > STATISTIC <- if (abs(x[1,1]*x[2,2]-x[1,2]*x[2,1]) < sum(x)/2) > 0 > > else sum((abs(x - E) - 0.5)^2/E) > METHOD <- paste(METHOD, "with Yates' continuity correction") > } > else STATISTIC <- sum((abs(x - E))^2/E) > ## replace end > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Multiplication (PR#8466)
On Fri, 6 Jan 2006, [EMAIL PROTECTED] wrote: > hi - in version 2.1 the command > > >-2^2 > > gives > > -4 > > as the answer. (-2)^2 is evaluated correctly. So is -2^2. The precedence of ^ is higher than that of unary minus. It may be surprising, but it *is* documented and has been in S for a long time. -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
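The documented precedence is easy to confirm; a sketch:

```r
-2^2      # parsed as -(2^2), hence -4
(-2)^2    # 4
2^-2      # the exponent itself may still be negative: 0.25
```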
Re: [Rd] prod(numeric(0)) surprise
On Mon, 9 Jan 2006, Martin Morgan wrote: > I guess I have to say yes, I'd expect > > x <- 1:10 > sum(x[x>10]) ==> numeric(0) > > this would be reinforced by recognizing that numeric(0) is not zero, > but nothing. I guess the summation over an empty set is an empty set, > rather than a set containing the number 0. Certainly these > > exp(x[x>10]) ==> numeric(0) > numeric(0) + 1 ==> numeric(0) > There are some fairly simple rules in how R does it. You do need to distinguish between functions (binary operators) that map two vectors of length n to a vector of length n and functions such as prod and sum that map a vector of length n to a vector of length 1. The output of sum and prod is always of length 1, so sum(numeric(0)) and prod(numeric(0)) should be of length 1 (or give an error). It is convenient that sum(c(x,y)) is equal to sum(x)+sum(y) and that prod(c(x,y)) is equal to prod(x)*prod(y), which motivates making sum(numeric(0)) give 0 and prod(numeric(0)) give 1. Single argument functions such as exp(numeric(0)) seem fairly obvious: you have no numbers and you exponentiate them so you still have no numbers. You could also argue based on c() and exp() commuting. The rules for binary operators are a little less tidy [my fault]. They come from the idea that x+1 should always add 1 to each element of x. If you add 1 to each element of numeric(0) you get numeric(0). The usual recycling rule says that the shorter vector should be repeated to make it the same length as the longer vector, so this is a wart. On the other hand, you can't recycle a vector of length 0 to have length 1, so the usual recycling rule can't be applied here. This also makes matrix operations work, at least in the sense of getting matrices of the right dimension. -thomas
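The rules above, summarized at the prompt (a sketch):

```r
sum(numeric(0))    # 0: the identity of +, so sum(c(x, y)) == sum(x) + sum(y)
prod(numeric(0))   # 1: the identity of *
exp(numeric(0))    # numeric(0): no numbers in, no numbers out
numeric(0) + 1     # numeric(0): '+ 1' applied to each of zero elements
length(sum(numeric(0)))   # 1: reductions always return a length-one result
```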
Re: [Rd] Issue with c++ .C call
On Tue, 10 Jan 2006, Dominick Samperi wrote: > Sean, > > prm in your function calcStepgram is NOT a vector of doubles, it is of type > SEXP, and you need to use R macros to fetch the value(s). This is done > automatically in the Rcpp package, and if you want to see how this is > done look at the definition of the class RcppVector in Rcpp.cpp Not at all. He is using .C, which passes a double * to the C function. You may be thinking of .Call -thomas > Dominick > > Sean Davis wrote: >> I am still having some difficulties with connecting R to a C++ function. I >> am able to call the function as expected after compiling the shared library >> and such. However, the call to the function is via .C; parameters from the >> .C call are not being passed correctly to the function. As an example, I >> have attached a GDB run of the code. I set a breakpoint on entry to the >> function I am calling from R. What is bothering me (and probably causing >> the segmentation faults I am seeing) is that the parameter >> prm=as.double(c(3.2,1.1)) is not 3.2,1.1 IMMEDIATELY after the call to .C. >> I am sure I am missing something very basic. >> >> Thanks, >> Sean >> >> >> >>> sessionInfo() >>> >> R version 2.2.0, 2005-08-11, powerpc-apple-darwin7.9.0 >> >> attached base packages: >> [1] "methods" "stats" "graphics" "grDevices" "utils" "datasets" >> [7] "base" >> >> >> >> >> holmes:~/Mercury/projects/R/StepGram sdavis$ R -d gdb >> GNU gdb 6.1-20040303 (Apple version gdb-413) (Wed May 18 10:17:02 GMT 2005) >> Copyright 2004 Free Software Foundation, Inc. >> GDB is free software, covered by the GNU General Public License, and you are >> welcome to change it and/or distribute copies of it under certain >> conditions. >> Type "show copying" to see the conditions. >> There is absolutely no warranty for GDB. Type "show warranty" for details. >> This GDB was configured as "powerpc-apple-darwin"...Reading symbols for >> shared libraries ... 
done >> >> (gdb) r >> Starting program: >> /Users/sdavis/R-devel2/R.framework/Versions/2.2.0/Resources/bin/exec/R >> Reading symbols for shared libraries ...+ done >> >> R : Copyright 2005, The R Foundation for Statistical Computing >> Version 2.2.0 Under development (unstable) (2005-08-11 r35256) >> ISBN 3-900051-07-0 >> >> R is free software and comes with ABSOLUTELY NO WARRANTY. >> You are welcome to redistribute it under certain conditions. >> Type 'license()' or 'licence()' for distribution details. >> >> R is a collaborative project with many contributors. >> Type 'contributors()' for more information and >> 'citation()' on how to cite R or R packages in publications. >> >> Type 'demo()' for some demos, 'help()' for on-line help, or >> 'help.start()' for a HTML browser interface to help. >> Type 'q()' to quit R. >> >> Reading symbols for shared libraries >> . done >> Reading symbols for shared libraries . done >> Reading symbols for shared libraries . done >> >>> dyn.load('StepGram/src/Stepgram.so') >>> >> Reading symbols for shared libraries .. done >> >>> mySG <- function(dat1,thresh,noise) { >>> >> vec <- c(thresh,noise) >> .C('calcStepgram', >> data=as.double(dat1), >> prm=as.double(vec), >> intervals=double(1*3+1), >> max=as.integer(1), >> n=as.integer(length(dat1)), >> plot=double(length(dat1)))} >> >>> dat <- c(0.01,0.1,-0.2, 0.1,-0.1,-1000,3.2,3.5,-1.3,3.1, >>> >> 3.2,3.1,-0.1,0.2,0.15,-0.05,-0.1,0.2,0.1,-0.1) >> >> Program received signal SIGINT, Interrupt. >> 0x9001f208 in select () >> (gdb) break calcStepgram >> Breakpoint 1 at 0x10bb418: file Stepgram.cpp, line 22. >> (gdb) c >> Continuing. >> >>> mySG(dat1=dat,thresh=3.2,noise=1.1) >>> >> >> Breakpoint 1, calcStepgram (data=0x1137048, prm=0x1c81eb0, >> intervals=0xbfffd7e0, max=0x2954840, n=0xbfffd6c0, plot=0x180d574) at >> Stepgram.cpp:22 >> (gdb) print prm[0] >> $1 = 1.7716149411915527e-303<<<<<--This should be 3.2! 
>> Current language: auto; currently c++ >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issue with c++ .C call
On Tue, 10 Jan 2006, Sean Davis wrote: > and such. However, the call to the function is via .C; parameters from the > .C call are not being passed correctly to the function. As an example, I > have attached a GDB run of the code. I set a breakpoint on entry to the > function I am calling from R. What is bothering me (and probably causing > the segmentation faults I am seeing) is that the parameter > prm=as.double(c(3.2,1.1)) is not 3.2,1.1 IMMEDIATELY after the call to .C. Is this compiled with optimization? If so, you can't conclude much from the gdb info as the code can be executed in a different order from how it's written. When I use this example extern "C" void calcStepgram(double *data, double *prm, double *intervals, int *max, int *n,double *plot) { prm[0]=data[0]; return; } if I compile with -g -O2 (the default) it looks as though there are problems with initialization like the ones you report, but in fact the function works correctly. If I compile without optimization the initialization looks fine. -thomas
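One way to get such an unoptimized build for debugging is to override the C++ flags when building the shared library (a sketch; the file name is from the thread, and setting PKG_CXXFLAGS in a src/Makevars file works as well):

```shell
# Rebuild without optimization so gdb shows values in source order
PKG_CXXFLAGS="-g -O0" R CMD SHLIB Stepgram.cpp
# then, in R:  dyn.load("Stepgram.so")
```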
Re: [Rd] Issue with c++ .C call
On Tue, 10 Jan 2006, Sean Davis wrote: > > Thanks, Thomas. That did fix the initialization issue (or apparent one). > Unfortunately, the reason that I started debugging was for segmentation > faults, which have not gone away. However, it now looks like the problem is > internal to the C++ code and not with the way the arguments were being > passed. If you can get access to a Linux machine then it's worth trying Valgrind, which is very helpful for this sort of thing. -thomas
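For reference, R can launch itself under a debugger directly, so the whole session can go through valgrind (a sketch; test.R is an illustrative script name, and the extra valgrind options cost a lot of speed):

```shell
# Run a script under valgrind via R's -d flag
R -d valgrind -f test.R
# With fuller diagnostics:
R -d "valgrind --leak-check=full" -f test.R
```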
[Rd] [OT] GPL3 draft
Since someone is bound to point this out soon I will note that a) A discussion draft of the proposed GPL version 3 is up at http://gplv3.fsf.org/ b) If you have comments on the draft, send them to the FSF rather than to r-devel -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [Rd] (PR#8500) standard install on OS 10.3.9 crashes on start without useful diagnostics (PR#8500)
This won't actually help you at all, but I used the standard install on OS 10.3.9 (Powerbook G4) just last week without any problems. On the other hand, my install was on a machine that had previously had other versions of R. -thomas On Wed, 18 Jan 2006, [EMAIL PROTECTED] wrote: > Full_Name: Bob Donahue > Version: 2.2 > OS: Mac OS 10.3.9 > Submission from: (NULL) (204.152.13.26) > > > That's pretty much it. I did the most basic install possible, tried running > the > package through the GUI and on the command line, it crashed hard, immediately, > with absolutely NO useful information as to WHY it crashed. > > To reproduce: > 1) get the installer for OS X > 2) install in the default places > 3) run > 4) watch it crash with no useful diagnostics Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [Rd] Minimum memory requirements to run R.
On Mon, 23 Jan 2006, Hin-Tak Leung wrote: > Prof Brian Ripley wrote: >> We know: we even document it in the appropriate places. > > I went and have a look - it is the last section of R-admin (and of > course, for those who "read the source", R/include/Rinternals.h). It > would be good to mention this in the FAQ (which it doesn't, or maybe I > didn't look hard enough), or the beginning of R-admin? > It's not in the FAQ because it isn't a FAQ (yet). If you use the PDF manual it is in the table of contents on page i. In the HTML manual it is admittedly less clear: there isn't a table of contents and there is nothing obvious in the index. To some extent this is a problem with all the manuals. The structure in the .texi file isn't translated well to HTML form by the makeinfo tools. -thomas
Re: [Rd] Minimum memory requirements to run R.
On Mon, 23 Jan 2006, Hin-Tak Leung wrote: > > The 32-bit/64-bit issue affects purchasing or upgrading decisions > - whether one wants to spend the money on buying cheaper > 32-bit machines, versus more expensive 64-bit machines. That > decision would be based on information available while *not* having > an operational R installation... > Not necessarily. It's perfectly feasible to use a 32-bit build on a 64-bit machine, as it says in the manual, which is available from http://www.r-project.org whether or not you have an R installation. -thomas
[Rd] Bug 16719: kruskal.test documentation for formula
I submit a couple of options for addressing bug 16719: kruskal.test documentation for formula. https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16719

disallow-character.diff changes the documentation and error message to indicate that factors are accepted.

allow-character.diff changes the kruskal.test functions to convert character vectors to factors; documentation is updated accordingly.

I tested the updated functions with the examples in example.R. It is based on the examples in the bug report.

If there is interest in applying either patch, especially the latter, I want first to test the change on lots of existing programs that call kruskal.test, to see if it causes any regressions. Is there a good place to look for programs that use particular R functions?

I am having trouble building R, so I have so far tested these changes only by patching revision 74631 (SVN head) and sourcing the resulting kruskal.test.R in R 3.4.1 on OpenBSD 6.2. I thus have not tested the R documentation files.

disallow-character.diff:

Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R	(revision 74631)
+++ src/library/stats/R/kruskal.test.R	(working copy)
@@ -46,7 +46,10 @@
     x <- x[OK]
     g <- g[OK]
     if (!all(is.finite(g)))
-        stop("all group levels must be finite")
+        if (is.character(g))
+            stop("all group levels must be finite; convert group to a factor")
+        else
+            stop("all group levels must be finite")
     g <- factor(g)
     k <- nlevels(g)
     if (k < 2L)

Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd	(revision 74631)
+++ src/library/stats/man/kruskal.test.Rd	(working copy)
@@ -22,11 +22,12 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
     vectors.  Non-numeric elements of a list will be coerced, with a
     warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a numeric vector or factor object giving the group for the
     corresponding elements of \code{x}.  Ignored with a warning if
     \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-    \code{response} gives the data values and \code{group} a vector or
+    \code{response} gives the data values and \code{group}
+    a numeric vector or
     factor of the corresponding groups.}
   \item{data}{an optional matrix or data frame (or similar: see
     \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +53,8 @@
   list, use \code{kruskal.test(list(x, ...))}.

   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector or factor object of the same length as \code{x}
+  giving
   the group for the corresponding elements of \code{x}.
}
\value{

allow-character.diff:

Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R	(revision 74631)
+++ src/library/stats/R/kruskal.test.R	(working copy)
@@ -45,7 +45,7 @@
     OK <- complete.cases(x, g)
     x <- x[OK]
     g <- g[OK]
-    if (!all(is.finite(g)))
+    if (!is.character(g) & !all(is.finite(g)))
         stop("all group levels must be finite")
     g <- factor(g)
     k <- nlevels(g)

Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd	(revision 74631)
+++ src/library/stats/man/kruskal.test.Rd	(working copy)
@@ -22,11 +22,13 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
     vectors.  Non-numeric elements of a list will be coerced, with a
     warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a character vector, numeric vector, or factor
+    giving the group for the
     corresponding elements of \code{x}.  Ignored with a warning if
     \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-    \code{response} gives the data values and \code{group} a vector or
+    \code{response} gives the data values and \code{group} a
+    character vector, numeric vector, or
     factor of the corresponding groups.}
   \item{data}{an optional matrix or data frame (or similar: see
     \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +54,8 @@
   list, use \code{kruskal.test(list(x, ...))}.

   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector, character vector, or factor of the same length
+  as \code{x} giving
   the group for the corresponding elements of \code{x}.
}
\value{
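To make the difference between the two patches concrete, here is a small sketch (the data are made up for illustration, not taken from the bug report) of how a character grouping vector behaved with kruskal.test of that era:

```r
x <- c(2.9, 3.0, 2.5, 2.6, 3.2, 3.8, 2.7, 4.0, 2.4)
g_chr <- rep(c("a", "b", "c"), each = 3)   # character groups

## Works: groups supplied as a factor
kruskal.test(x, factor(g_chr))

## With a plain character vector, is.finite(g) is FALSE for every
## element, so the unpatched code stops with
## "all group levels must be finite".  The disallow-character patch
## improves that message; the allow-character patch skips the check
## for character input and lets the subsequent factor(g) coerce it.
tryCatch(kruskal.test(x, g_chr), error = conditionMessage)
```

(Behaviour in later R releases may differ from what the thread describes.)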
Re: [Rd] Bug 16719: kruskal.test documentation for formula
Thomas Levine writes:

> I submit a couple of options for addressing bug 16719: kruskal.test
> documentation for formula.
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16719
>
> disallow-character.diff changes the documentation and error message
> to indicate that factors are accepted.
>
> allow-character.diff changes the kruskal.test functions to convert
> character vectors to factors; documentation is updated accordingly.
>
> I tested the updated functions with the examples in example.R. It is
> based on the examples in the bug report.
>
> If there is interest in applying either patch, especially the latter,
> I want first to test the change on lots of existing programs that call
> kruskal.test, to see if it causes any regressions. Is there a good place
> to look for programs that use particular R functions?
>
> I am having trouble building R, so I have so far tested these changes
> only by patching revision 74631 (SVN head) and sourcing the resulting
> kruskal.test.R in R 3.4.1 on OpenBSD 6.2. I thus have not tested the
> R documentation files.

I thought it was important to test the changes on lots of existing programs that call kruskal.test, to see if they cause any regressions.

CRAN testing
------------

I downloaded all CRAN packages and checked whether they contained the fixed expression "kruskal.test". (See "Makefile" and "all-kruskal.r".) I subsequently tested on all packages in CRAN that mentioned "kruskal.test".

I patched the development version of R and built like this:

    ./configure --without-recommended-packages
    gmake
    cd src/library
    gmake all docs Rdfiles

This command was helpful for cleaning the repository tree:

    svn status | sed -n 's/^\? *//p' | xargs rm -r

I tested three versions of kruskal.test:

* SVN checkout 74844 with no modifications
* SVN checkout 74844 with the disallow-character patch
* SVN checkout 74844 with the allow-character patch

The test is to run all of the examples from all of the packages that mention kruskal.test; with each example I ran, I recorded whether an error was raised. I ran all examples, regardless of whether the example mentioned kruskal.test. I compared the raising of an error among the three builds of R/kruskal.test.

I ran these commands for each R version to build R, install the packages referencing kruskal.test, and run the tests in parallel. The procedure is available here; see the Makefile for more detail.

https://thomaslevine.com/scm/r-patches/dir?ci=6ea0db4fde&name=kruskal.test-numeric/testing

Run it like this if you are so inclined:

    make -j 3 install
    make -j 3 test

I found 100 packages that referenced kruskal.test. (This was based on very crude string matching; some of these packages mentioned kruskal.test only in the documentation.) Of these 100 packages, I was able to install 39. I ran all of the examples in all of these packages, a total of 2361 examples. The successes and failures matched exactly among the three builds: 341 examples succeeded, and 2020 failed.

https://thomaslevine.com/scm/r-patches/artifact/5df57add4414970a

This is of course a lot of failures and a small proportion of the packages. I only installed the packages whose dependencies were easy for me to install (on OpenBSD 6.2), and some of those implicitly depended on other things that were not available; this explains all of the examples that raised errors.

Review of r-help
----------------

I also began to collect all kruskal.test calls that I could find in the r-help archives. Formatting them to be appropriate for evaluation is quite tedious, so I doubt I will follow through with this, but all of the calls appear to use ordinary character, numeric, or factor types, and none performed error catching, so no obvious problems with my proposed changes stand out.

Furthermore, in looking through the r-help archives, I noted these messages on r-help where people were having trouble using kruskal.test and where I think either of my proposed changes would have helped them perform their desired Kruskal-Wallis rank sum tests.

<1280836078385-2311712.p...@n4.nabble.com>
<1280849183252-2312063.p...@n4.nabble.com>

Conclusions
-----------

I have yet to find any example of my proposed changes causing a regression. I believe that the most reasonable thing that they might break is something that depends on kruskal.test raising an error, or on the specific text in the error message.

If the limited testing is a concern, I could find a way to install all of the packages and run all of their examples.
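As a rough in-session complement to the CRAN source scan described above (this helper is entirely hypothetical and was not part of the original procedure), one can also scan the namespaces of locally installed packages for functions whose bodies mention a given name:

```r
## Sketch: list installed packages whose exported/internal functions
## mention a function name anywhere in their deparsed source.
mentions <- function(fun = "kruskal.test") {
    pkgs <- rownames(installed.packages())
    hits <- vapply(pkgs, function(p) {
        env <- tryCatch(asNamespace(p), error = function(e) NULL)
        if (is.null(env)) return(FALSE)
        any(vapply(ls(env), function(nm) {
            f <- tryCatch(get(nm, envir = env), error = function(e) NULL)
            is.function(f) &&
                any(grepl(fun, deparse(f), fixed = TRUE))
        }, logical(1)))
    }, logical(1))
    pkgs[hits]
}
```

Like the string matching in the post, this is crude: it finds textual mentions, not actual calls.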
Re: [Rd] Bug 16719: kruskal.test documentation for formula
Thomas Levine writes:

> I have yet to find any example of my proposed changes causing a
> regression. I believe that the most reasonable thing that it might
> break is something that depends on either kruskal.test raising an
> error or that depends on the specific text in the error message.
>
> If the limited testing is a concern, I could find a way to install
> all of the packages and run all of their examples.

In case my April message is hard to find, I have attached the patches redundantly to this email.

disallow-character.diff:

Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R	(revision 74631)
+++ src/library/stats/R/kruskal.test.R	(working copy)
@@ -46,7 +46,10 @@
     x <- x[OK]
     g <- g[OK]
     if (!all(is.finite(g)))
-        stop("all group levels must be finite")
+        if (is.character(g))
+            stop("all group levels must be finite; convert group to a factor")
+        else
+            stop("all group levels must be finite")
     g <- factor(g)
     k <- nlevels(g)
     if (k < 2L)

Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd	(revision 74631)
+++ src/library/stats/man/kruskal.test.Rd	(working copy)
@@ -22,11 +22,12 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
     vectors.  Non-numeric elements of a list will be coerced, with a
     warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a numeric vector or factor object giving the group for the
     corresponding elements of \code{x}.  Ignored with a warning if
     \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-    \code{response} gives the data values and \code{group} a vector or
+    \code{response} gives the data values and \code{group}
+    a numeric vector or
     factor of the corresponding groups.}
   \item{data}{an optional matrix or data frame (or similar: see
     \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +53,8 @@
   list, use \code{kruskal.test(list(x, ...))}.

   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector or factor object of the same length as \code{x}
+  giving
   the group for the corresponding elements of \code{x}.
}
\value{

allow-character.diff:

Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R	(revision 74631)
+++ src/library/stats/R/kruskal.test.R	(working copy)
@@ -45,7 +45,7 @@
     OK <- complete.cases(x, g)
     x <- x[OK]
     g <- g[OK]
-    if (!all(is.finite(g)))
+    if (!is.character(g) & !all(is.finite(g)))
         stop("all group levels must be finite")
     g <- factor(g)
     k <- nlevels(g)

Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd	(revision 74631)
+++ src/library/stats/man/kruskal.test.Rd	(working copy)
@@ -22,11 +22,13 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
     vectors.  Non-numeric elements of a list will be coerced, with a
     warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a character vector, numeric vector, or factor
+    giving the group for the
     corresponding elements of \code{x}.  Ignored with a warning if
     \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-    \code{response} gives the data values and \code{group} a vector or
+    \code{response} gives the data values and \code{group} a
+    character vector, numeric vector, or
     factor of the corresponding groups.}
   \item{data}{an optional matrix or data frame (or similar: see
     \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +54,8 @@
   list, use \code{kruskal.test(list(x, ...))}.

   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector, character vector, or factor of the same length
+  as \code{x} giving
   the group for the corresponding elements of \code{x}.
}
\value{
[Rd] Documentation examples for lm and glm
Hello,

something that has been on my mind for a decade or two has been the examples for lm() and glm(). They encourage poor style because of mismanagement of data frames. Also, having the variables in a data frame means that predict() is more likely to work properly.

For lm(), the variables should be put into a data frame. As two vectors are assigned first in the general workspace, they should be deleted afterwards.

For glm(), the data frame d.AD is constructed but not used. Also, its three components were assigned first in the general workspace, so they float around dangerously afterwards, as in the lm() example.

Rather than attaching improved .Rd files here, they are put at www.stat.auckland.ac.nz/~yee/Rdfiles -- you are welcome to use them!

Best,
Thomas
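The style the post advocates can be sketched as follows (the data here are made up for illustration; this is not the proposed .Rd content):

```r
## Variables live in a data frame, so summary(), pairs() and predict()
## all operate on them together, and nothing floats in the workspace.
set.seed(1)
d <- data.frame(x = 1:10)
d$y <- 2.1 * d$x + rnorm(10)

fit <- lm(y ~ x, data = d)
summary(fit)

## predict() works naturally because 'newdata' matches the variables in 'd'
predict(fit, newdata = data.frame(x = c(11, 12)))

## Contrast with the criticised style:
##   x <- 1:10; y <- 2.1 * x + rnorm(10); lm(y ~ x)
## which leaves x and y loose in the workspace.
```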
Re: [Rd] Documentation examples for lm and glm
Thanks for the discussion. I do feel quite strongly that the variables should always be part of a data frame. Then functions such as summary() and pairs() can operate on them all simultaneously; regression is only one part of the analysis. And what if there are lots of variables? Have them all scattered about the workspace? One of them could easily be overwritten.

The generic predict() will still work when lm() was not given a data frame, but then the 'newdata' argument needs to be assigned a data frame. So this suggests that the original fit should have used a data frame too.

BTW, I believe attach() should be discouraged. Functions like with() and within() are safer. Many users of attach() do not seem to detach(), and subtle problems can arise with attach()---quite dangerous really. The online help has a section called "Good practice", which is good, but I think it should go a little further by actively discouraging its use in the first place.

I do not wish to be contentious on all this... just encouraging good practice, that's all.

cheers
Thomas

On 17/12/18 12:26 PM, Achim Zeileis wrote:

On Sat, 15 Dec 2018, frede...@ofb.net wrote:

I agree with Steve and Achim that we should keep some examples with no data frame. That's Objectively Simpler, whether or not it leads to clutter in the wrong hands. As Steve points out, we have attach() which is an excellent language feature - not to mention with().

Just for the record: Personally, I wouldn't recommend using lm() with attach() or with() but would always encourage using data= instead. In my previous e-mail I just wanted to point out that a pragmatic step for the man page could be to keep one example without a data= argument when adding examples with data=.

I would go even further and say that the examples that are in lm() now should stay at the top. Because people may be used to referring to them, and also because Historical Order is generally a good order in which to learn things.
However, if there is an important function argument ("data=") not in the examples, then we should add examples which use it. Likewise if there is a popular programming style (putting things in a data frame). So let's do something along the lines of what Thomas is requesting, but put it after the existing documentation? Please?

On a bit of a tangent, I would like to see an example in lm() which plots my data with a fitted line through it. I'm probably betraying my ignorance here, but I was asked how to do this when showing R to a friend and I thought it should be in lm(); after all, it seems a bit more basic than displaying a Normal Q-Q plot (whatever that is! gasp...). Similarly for glm(). Perhaps all this can be accomplished with merely doubling the size of the existing examples.

Thanks.
Frederick

On Sat, Dec 15, 2018 at 02:15:52PM +0100, Achim Zeileis wrote:

A pragmatic solution could be to create a simple linear regression example with variables in the global environment and then another example with a data.frame. The latter might be somewhat more complex, e.g., with several regressors and/or mixed categorical and numeric covariates to illustrate how regression and analysis of (co-)variance can be combined. I like to use MASS's whiteside data for this:

    data("whiteside", package = "MASS")
    m1 <- lm(Gas ~ Temp, data = whiteside)
    m2 <- lm(Gas ~ Insul + Temp, data = whiteside)
    m3 <- lm(Gas ~ Insul * Temp, data = whiteside)
    anova(m1, m2, m3)

Moreover, some binary response data.frame with a few covariates might be a useful addition to "datasets". For example a more granular version of the "Titanic" data (in addition to the 4-way table ?Titanic). Or another relatively straightforward data set, popular in econometrics and social sciences, is the "Mroz" data; see e.g. help("PSID1976", package = "AER"). I would be happy to help with these if such additions were considered for datasets/stats.
On Sat, 15 Dec 2018, David Hugh-Jones wrote:

I would argue examples should encourage good practice. Beginners ought to learn to keep data in data frames and not to overuse attach(). Experts can do otherwise at their own risk, but they have less need of explicit examples.

On Fri, 14 Dec 2018 at 14:51, S Ellison wrote:

FWIW, before all the examples are changed to data frame variants, I think there's fairly good reason to have at least _one_ example that does _not_ place variables in a data frame. The data argument in lm() is optional. And there is more than one way to manage data in a project. I personally don't much like lots of stray variables lurking about, but if those are the only variables out there and we can be sure they aren't affected by other code, it's hardly essential to create a data frame to hold something you already have. Also, attach() is still part of R, for those
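As a footnote to the attach() discussion in this thread, a minimal sketch of the safer alternatives (throwaway data, made up for illustration):

```r
d <- data.frame(Gas = c(7.2, 6.9, 6.4), Temp = c(-0.8, -0.7, 0.4))

## with(): evaluate an expression inside the data frame; nothing is
## attached to the search path, so nothing needs detaching
with(d, mean(Gas))

## within(): modify a *copy* of the data frame
d2 <- within(d, GasPerDeg <- Gas / Temp)

## The attach() pattern the thread warns about requires a matching
## detach() and can silently mask workspace variables in between:
##   attach(d); mean(Gas); detach(d)
```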
[Rd] configure script issue with -flto with recent gcc and system ar/ranlib
Hi,

there can be an issue with recent gcc where the system-installed "ar" and "ranlib" commands cannot handle LTO binaries. On compilation, this manifests itself with error messages claiming that they need extra plugins.

This can be fixed by using the command line

    $ AR=gcc-ar RANLIB=gcc-ranlib ./configure --enable-lto

so it is not a big issue, but it would still be nicer if the configure script tested the functionality of ar and ranlib itself and selected the appropriate one accordingly.

This is with R version 3.5.3.

Regards
Thomas
[Rd] R problems with lapack with gfortran
Hi,

I have tried to pinpoint potential problems which could lead to the LAPACK issues that are currently seen in R. I built the current R trunk using

    AR=gcc-ar RANLIB=gcc-ranlib ./configure --prefix=$HOME --enable-lto --enable-BLAS-shlib=no --without-recommended-packages

and used this to find problem areas.

There are quite a few warnings that were flagged, due to mismatches in function types. The prototypes that R has in its header files, for example BLAS.h, are often not compatible with gfortran function declarations.

To take one small example, in src/main/print.c, we have

    void NORET F77_NAME(xerbla)(const char *srname, int *info)

so xerbla_ is defined with two arguments. However, gfortran passes string lengths as hidden arguments. You can see this by compiling the small example

    $ cat xer.f
          SUBROUTINE FOO
          INTEGER INFO
          CALL XERBLA ('FOO', INFO)
          END
    $ gfortran -c -fdump-tree-original xer.f
    $ cat xer.f.004t.original
    foo ()
    {
      integer(kind=4) info;

      xerbla (&"FOO"[1]{lb: 1 sz: 1}, &info, 3);
    }

so here we have three arguments. This mismatch is flagged by -Wlto-type-mismatch, which, for example, yields

    ../../src/extra/blas/blas.f:357:20: warning: type of 'xerbla' does not match original declaration [-Wlto-type-mismatch]
      357 |             CALL XERBLA( 'DGBMV ', INFO )
    print.c:1120:12: note: type 'void' should match type 'long int'

So, why can gcc's r268992 / r269349 matter? Before these patches, gfortran used the variadic calling convention for calling procedures outside the current file, and the non-variadic calling convention for calling procedures found in the current file. Because the procedures were all compiled as non-variadic, the caller's and the callee's signatures did not match if they were not in the same source file, which is an ABI violation. This violation manifested itself in https://gcc.gnu.org/PR87689 , where the problem resulted in crashes on a primary gcc platform, POWER.

How can this potentially affect R? After the fix for PR87689, gfortran's calls to external procedures are no longer variadic. It is quite possible that, while this "works" most of the time, there is a problem with a particular LAPACK routine, the call sequence leading up to it, or the procedures it calls.

How to fix this problem? The only clear way I see is to fix this on the R side, by adding the string lengths to the prototypes. These are size_t (64-bit on 64-bit systems, 32-bit on 32-bit systems). You should then try to make --enable-lto pass without any warnings.

Regarding LAPACK itself, the default build system for R builds it as a shared library. Offhand, I did not see any way to build a *.a file instead, so I could not use LTO to check for mismatched prototypes between R and LAPACK.

Of course, I cannot be sure that this is really the root cause of the problem you are seeing, but it does seem to fit quite well. I hope this analysis helps in resolving this.

Regards
Thomas
Re: [Rd] configure script issue with -flto with recent gcc and system ar/ranlib
Hi Tomas,

> On 4/23/19 2:59 PM, Thomas König wrote:
>> Hi, there can be an issue with recent gcc where the system-installed
>> "ar" and "ranlib" commands cannot handle LTO binaries. On compilation,
>> this manifests itself with error messages claiming that they need
>> extra plugins.
>
> Thanks for the report. What was the version of binutils on the system
> with this problem? On my Ubuntu 18.04 I can use the binutils version of
> "ar" and "ranlib" with --enable-lto without problems. I read that with
> recent binutils (2.25?), the LTO plugin should be loaded automatically,
> so one does not have to use the wrappers anymore.

This was with, on x86_64-pc-linux-gnu,

    GNU ar (GNU Binutils; openSUSE Leap 42.3) 2.31.1.20180828-19

and, on powerpc64le-unknown-linux-gnu,

    GNU ar version 2.27-34.base.el7

both with a recent gcc 9.0.1 snapshot.

Regards
Thomas
Re: [Rd] R problems with lapack with gfortran
Hi Tomas,

thanks a lot for your analysis. I have created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329 for this and put you in CC (if your e-mail address for GCC bugzilla is still current).

Regards
Thomas
Re: [Rd] R problems with lapack with gfortran
Hi Peter,

we (the gfortran team) are currently discussing this at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329 . I invite everybody who has an interest in this topic to take part in the discussion there.

> Workarounds/solutions include:
>
> - disable certain optimizations -- works for now, but doesn't remove
>   the root cause so seems generally fragile

That looks like a short-term solution that could work (at least for x86_64 using the standard Unix ABI). And yes, it is fragile. And whatever other solution people come up with, it will still be fragile unless the caller and the callee agree. The root cause is that the Fortran LAPACK routines are called from C via an incompatible call signature.

> - "onion-skin" all LAPACK routines to call via a Fortran routine that
>   converts integer arguments to the required character -- possible, but
>   it adds overhead and there are hundreds of routines (and it would be
>   kind of ugly!).

I agree.

> - modify LAPACK itself similarly -- requires naming change of routines
>   as per the license, and there are still hundreds of routines; avoids
>   overhead, but creates maintenance nightmare to keep up with changes
>   in LAPACK

I agree that this is not a preferred option.

> - change all prototypes and calls to follow gfortran calling
>   conventions -- still a lot of work since each char* argument needs to
>   be supplemented by a length going at the end of the arglist. If
>   gfortran was the only compiler around, I'd say this would be the
>   least painful route, but still no fun since it requires changes to a
>   lot of user code (in packages too). It is not clear if this approach
>   works with other Fortrans.

The interesting thing is that this convention goes back to at least f2c, which was modeled on the very first Unix compiler.

> - figure out Fortran 2003 specification for C/Fortran interoperability
>   -- this _sounds_ like the right solution, but I don't think many
>   understand how to use it and what is implied (in particular, will it
>   require making changes to LAPACK itself?)
That would actually be fairly easy. If you declare the subroutines BIND(C), as in

    subroutine foo(a,b) BIND(C,name="foo_")
      real a
      character*1 b
    end

you will get the calling signature that you already have in your C sources. This also has the advantage of being standards compliant, and would probably be the preferred method.

> - move towards the LAPACKE C interface -- but that also adds onionskin
>   overhead and ultimately calls Fortran in essentially the same way as
>   R does, so doesn't really solve anything at all (unless we can shift
>   responsibility for sorting things out onto the LAPACK team, but I
>   kind of expect that they do not want it.)

I suspect that they will hit the issue, too.

> - twist the arms of the gfortran team to do something that keeps old
>   code working. Compiler engineers understandably hate that sort of
>   thing, but I seem to recall some precedent (pointer alignment, back
>   in the dark ages?).

We're willing to do reasonable things :-) but so far all of the options we have come up with have very serious drawbacks (see the link to the PR at the top). If you come up with a suggestion, we'd be more than happy to look at it.

I think the best option would really be to use BIND(C).

Regards
Thomas
Re: [Rd] R problems with lapack with gfortran
Hi Steve,

> With the caveat that one may need to use the VALUE attribute to account
> for pass-by-value vs pass-by-reference.

LAPACK should be all pass-by-reference; it is old F77-style code (except that the odd ALLOCATABLE array has snuck in in the testing routines).
Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
It seems that it's an old bug that was found in some other packages, but at that time not optim: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958 and that Duncan Murdoch posted a patch already last Friday :)

Thomas

On 06.05.2019 at 16:40, Ben Bolker wrote:

That's consistent/not surprising if the problem lies in the numerical gradient calculation step ...

On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:

Optim's Nelder-Mead works correctly for this example.

    optim(par=10, fn=fn, method="Nelder-Mead")
    x=10, ret=100.02 (memory)
    x=11, ret=121 (calculate)
    x=9, ret=81 (calculate)
    x=8, ret=64 (calculate)
    x=6, ret=36 (calculate)
    x=4, ret=16 (calculate)
    x=0, ret=0 (calculate)
    x=-4, ret=16 (calculate)
    x=-4, ret=16 (memory)
    x=2, ret=4 (calculate)
    x=-2, ret=4 (calculate)
    x=1, ret=1 (calculate)
    x=-1, ret=1 (calculate)
    x=0.5, ret=0.25 (calculate)
    x=-0.5, ret=0.25 (calculate)
    x=0.25, ret=0.0625 (calculate)
    x=-0.25, ret=0.0625 (calculate)
    x=0.125, ret=0.015625 (calculate)
    x=-0.125, ret=0.015625 (calculate)
    x=0.0625, ret=0.00390625 (calculate)
    x=-0.0625, ret=0.00390625 (calculate)
    x=0.03125, ret=0.0009765625 (calculate)
    x=-0.03125, ret=0.0009765625 (calculate)
    x=0.015625, ret=0.0002441406 (calculate)
    x=-0.015625, ret=0.0002441406 (calculate)
    x=0.0078125, ret=6.103516e-05 (calculate)
    x=-0.0078125, ret=6.103516e-05 (calculate)
    x=0.00390625, ret=1.525879e-05 (calculate)
    x=-0.00390625, ret=1.525879e-05 (calculate)
    x=0.001953125, ret=3.814697e-06 (calculate)
    x=-0.001953125, ret=3.814697e-06 (calculate)
    x=0.0009765625, ret=9.536743e-07 (calculate)

    $par
    [1] 0

    $value
    [1] 0

    $counts
    function gradient
          32       NA

    $convergence
    [1] 0

    $message
    NULL

From: R-devel on behalf of Duncan Murdoch
Sent: Friday, May 3, 2019 8:18:44 AM
To: peter dalgaard
Cc: Florian Gerber; r-devel@r-project.org
Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called;
then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn. I'll put together a patch if nobody else gets there first...

Duncan Murdoch

On 03/05/2019 7:13 a.m., peter dalgaard wrote:

Yes, I think you are right. I was at first confused by the fact that after the optim() call,

    environment(fn)$xx
    [1] 10
    environment(fn)$ret
    [1] 100.02

so not 9.999, but this could come from x being assigned the final value without calling fn.

-pd

On 3 May 2019, at 11:58, Duncan Murdoch wrote:

Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well.

Duncan Murdoch

On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
On 03/05/2019 10:31, Serguei Sokol wrote:
On 02/05/2019 21:35, Florian Gerber wrote:

Dear all,

when using optim() for a function that uses the parent environment, I see the following unexpected behavior:

    makeFn <- function(){
        xx <- ret <- NA
        fn <- function(x){
            if(!is.na(xx) && x==xx){
                cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
                return(ret)
            }
            xx <<- x; ret <<- sum(x^2)
            cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
            ret
        }
        fn
    }
    fn <- makeFn()
    optim(par=10, fn=fn, method="L-BFGS-B")
    # x=10, ret=100 (calculate)
    # x=10.001, ret=100.02 (calculate)
    # x=9.999, ret=100.02 (memory)
    # $par
    # [1] 10
    #
    # $value
    # [1] 100
    # (...)

I would expect that optim() does more than 3 function evaluations and that the optimization converges to 0. Same problem with optim(par=10, fn=fn, method="BFGS"). Any ideas?

I don't have an answer but may be an insight. For some mysterious reason xx is getting changed when it should not.
Consider:

    fn = local({
        n = 0; xx = ret = NA
        function(x) {
            n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n")
            if (!is.na(xx) && x==xx) ret
            else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}
        }
    })
    optim(par=10, fn=fn, method="L-BFGS-B")
    1 in x,xx,ret= 10 NA NA
    out x,xx,ret= 10 10 100
    2 in x,xx,ret= 10.001 10 100
    out x,xx,ret= 10.001 10.001 100.02
    3 in x,xx,ret= 9.999 9.999 100.02

    $par
    [1] 10

    $value
    [1] 100

    $counts
    function gradient 11

    $convergence
    [1] 0

    $message
    [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

At the third call, xx has value 9.999 while it should have kept the value 10.001.

A little follow-up: if you untie the link between xx and x by replacing the expression "xx <<- x" by "xx <<- x+0", it works as expected:

> fn=
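The "+ 0" workaround mentioned at the end of the thread can be sketched in full, based on Florian's makeFn (a sketch of the workaround, not of the eventual optim() patch):

```r
## Memoising objective function with the aliasing broken by '+ 0':
## xx gets a fresh copy of x, so optim()'s in-place modification of
## the parameter vector during gradient evaluation cannot alter the
## cached value.
makeFn <- function() {
    xx <- ret <- NA
    function(x) {
        if (!is.na(xx) && x == xx) return(ret)  # cache hit
        xx <<- x + 0          # force a copy of x
        ret <<- sum(x^2)
        ret
    }
}
fn <- makeFn()
optim(par = 10, fn = fn, method = "L-BFGS-B")$par  # should now be near 0
```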
Re: [Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
With a bit of meta programming that manipulates expressions, I don’t think this would be difficult to implement in a package. Well, as difficult as it is to implement a CAS, but not harder. I wrote some code for symbolic differentiation — I don’t remember where I put it — and that was easy. But that is because differentiation is just a handful of rules and then the chain rule. I don’t have the skills for handling more complex symbolic manipulation, but anyone who could add it to the language could also easily add it as a package, I think. Whether in a standard package or not, I have no preference whatsoever.

Cheers
Thomas

On 25 May 2019 at 00.59.44, Abby Spurdle (spurdl...@gmail.com) wrote:

> Martin Maechler has asked me to send this to R-devel for discussion
> after I submitted it as an enhancement request
> (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17563).

I think R needs to provide more support for CAS-style symbolic computation. That is, support by either the R language itself or the standard packages, or both. (And certainly not by interfacing with another interpreted language.) Obviously, I don't speak for R Core. However, this is how I would like to see R move in the future. ...improved symbolic and symbolic-numeric computation... I think any changes to formula objects or their methods should be congruent with these symbolic improvements.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] justify hard coded in format.ftable
Dear all,

The justify argument is hard coded in format.ftable:

cbind(apply(LABS, 2L, format, justify = "left"),
      apply(DATA, 2L, format, justify = "right"))

It would be useful to be able to choose among c("left", "right", "centre", "none"), as in format.default. The lines could be changed to:

if(length(justify) != 2)
    stop("justify must be length 2")
cbind(apply(LABS, 2L, format, justify = justify[1]),
      apply(DATA, 2L, format, justify = justify[2]))

The justify argument could default to c("left", "right") for backward compatibility. It would then allow:

ftab <- ftable(wool + tension ~ breaks, warpbreaks)
format.ftable(ftab, justify = c("none", "none"))

Best regards,

Thomas
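Until such an argument exists, the effect of justify = c("none", "none") can be approximated after the fact (a hedged workaround, not the proposed patch): format the ftable as usual and strip the padding from the resulting character matrix.

```r
## format() on an ftable returns a character matrix with the hard-coded
## left/right justification; trimws() removes that padding, emulating
## justify = "none" for both labels and data.
ftab <- ftable(wool + tension ~ breaks, warpbreaks)
m <- format(ftab)
m[] <- trimws(m)
```

This keeps the matrix layout intact, so the result can still be written out with write.table() or post-processed for export.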
Re: [Rd] justify hard coded in format.ftable
Dear all,

I haven't received any feedback so far on my proposal to make the "justify" argument available in stats:::format.ftable. Is this list the appropriate place for this kind of proposal?

I hope this follow-up to my message won't be taken as rude. Of course it's not meant to be, but I'm not used to the R mailing lists...

Thank you in advance for your comments,

Best,
Thomas

> Dear all,
>
> The justify argument is hard coded in format.ftable:
>
> cbind(apply(LABS, 2L, format, justify = "left"),
>       apply(DATA, 2L, format, justify = "right"))
>
> It would be useful to be able to choose among c("left", "right", "centre", "none"), as in format.default.
>
> The lines could be changed to:
>
> if(length(justify) != 2)
>     stop("justify must be length 2")
> cbind(apply(LABS, 2L, format, justify = justify[1]),
>       apply(DATA, 2L, format, justify = justify[2]))
>
> The justify argument could default to c("left", "right") for backward compatibility.
>
> It could then allow:
> ftab <- ftable(wool + tension ~ breaks, warpbreaks)
> format.ftable(ftab, justify = c("none", "none"))
>
> Best regards,
>
> Thomas
Re: [Rd] justify hard coded in format.ftable
I suspected it was partly due to the fact that ftable doesn't get much interest / isn't much used... So thank you very much for answering, and for your time!

>> Dear all,
>> I haven't received any feedback so far on my proposal to make the "justify"
>> argument available in stats:::format.ftable
>>
>> Is this list the appropriate place for this kind of proposal?
>
> Yes, it is. Actually such a post is even a "role model" post for R-devel.
>
>> I hope this follow-up to my message won't be taken as rude. Of course it's
>> not meant to be, but I'm not used to the R mailing lists...
>
> Well, there could be said much, and many stories told here ... ;-)
>
>> Thank you in advance for your comments,
>>
>> Best,
>> Thomas
>
> The main reasons for "no reaction" (to such a nice post) are probably a
> combination of the following:
>
> - we are busy
> - if we have time, we think other things are more exciting
> - we have not used ftable much/at all and are not interested.
>
> Even though the first 2 apply to me, I'll have a 2nd look into your post now,
> and may end up well agreeing with your proposal.
>
> Martin Maechler
> ETH Zurich and R Core team
>
>>> Dear all,
>>>
>>> The justify argument is hard coded in format.ftable:
>>>
>>> cbind(apply(LABS, 2L, format, justify = "left"),
>>>       apply(DATA, 2L, format, justify = "right"))
>>>
>>> It would be useful to be able to choose among c("left", "right", "centre", "none"), as in format.default.
>>>
>>> The lines could be changed to:
>>>
>>> if(length(justify) != 2)
>>>     stop("justify must be length 2")
>>> cbind(apply(LABS, 2L, format, justify = justify[1]),
>>>       apply(DATA, 2L, format, justify = justify[2]))
>>>
>>> The justify argument could default to c("left", "right") for backward compatibility.
>>>
>>> It could then allow:
>>> ftab <- ftable(wool + tension ~ breaks, warpbreaks)
>>> format.ftable(ftab, justify = c("none", "none"))
>>>
>>> Best regards,
>>>
>>> Thomas
Re: [Rd] justify hard coded in format.ftable
Thanks for the links. I agree that such a feature would be a nice addition, and could make ftable even more useful.

In the same spirit, I think it could be useful to mention the undocumented base::as.data.frame.matrix function in the documentation of table and xtabs (in addition to the already mentioned base::as.data.frame.table). The conversion from ftable/table/xtabs to data.frame is a common task that some users seem to struggle with (https://stackoverflow.com/questions/10758961/how-to-convert-a-table-to-a-data-frame).

tab <- table(warpbreaks$wool, warpbreaks$tension)
as.data.frame(tab)         # reshaped table
as.data.frame.matrix(tab)  # non-reshaped table

To sum up, for the sake of clarity, these proposals address two different topics:

- The justify argument would reduce the need to reformat the exported ftable.
- An ftable2df-like function (and the mention of as.data.frame.matrix in the documentation) would facilitate the reuse of ftable results for further analysis.

Thank you very much,

Thomas

> If you are looking at ftable, could you also consider adding a way to convert
> an ftable into a usable data.frame, such as the ftable2df function defined here:
>
> https://stackoverflow.com/questions/11141406/reshaping-an-array-to-data-frame/11143126#11143126
>
> and there is an example of using it here:
>
> https://stackoverflow.com/questions/61333663/manipulating-an-array-into-a-data-frame-in-base-r/61334756#61334756
>
> Being able to move back and forth between various base class representations
> seems like something that would be natural to provide.
>
> Thanks.
>
> On Thu, May 14, 2020 at 5:32 AM Martin Maechler wrote:
>>
>>>>>>> SOEIRO Thomas
>>>>>>> on Wed, 13 May 2020 20:27:15 + writes:
>>
>>> Dear all,
>>> I haven't received any feedback so far on my proposal to make the "justify"
>>> argument available in stats:::format.ftable
>>
>>> Is this list the appropriate place for this kind of proposal?
>>
>> Yes, it is. Actually such a post is even a "role model" post for R-devel.
>>
>>> I hope this follow-up to my message won't be taken as rude. Of course it's
>>> not meant to be, but I'm not used to the R mailing lists...
>>
>> Well, there could be said much, and many stories told here ... ;-)
>>
>>> Thank you in advance for your comments,
>>
>>> Best,
>>> Thomas
>>
>> The main reasons for "no reaction" (to such a nice post) are probably a
>> combination of the following:
>>
>> - we are busy
>> - if we have time, we think other things are more exciting
>> - we have not used ftable much/at all and are not interested.
>>
>> Even though the first 2 apply to me, I'll have a 2nd look into your
>> post now, and may end up well agreeing with your proposal.
>>
>> Martin Maechler
>> ETH Zurich and R Core team
>>
>>>> Dear all,
>>>>
>>>> The justify argument is hard coded in format.ftable:
>>>>
>>>> cbind(apply(LABS, 2L, format, justify = "left"),
>>>>       apply(DATA, 2L, format, justify = "right"))
>>>>
>>>> It would be useful to be able to choose among c("left", "right", "centre", "none"), as in format.default.
>>>>
>>>> The lines could be changed to:
>>>>
>>>> if(length(justify) != 2)
>>>>     stop("justify must be length 2")
>>>> cbind(apply(LABS, 2L, format, justify = justify[1]),
>>>>       apply(DATA, 2L, format, justify = justify[2]))
>>>>
>>>> The justify argument could default to c("left", "right") for backward compatibility.
>>>>
>>>> It could then allow:
>>>> ftab <- ftable(wool + tension ~ breaks, warpbreaks)
>>>> format.ftable(ftab, justify = c("none", "none"))
>>>>
>>>> Best regards,
>>>>
>>>> Thomas
[Rd] Patch proposal for bug 17770 - xtabs does not act as documented for na.action = na.pass
Dear all,

(This issue was previously reported on Bugzilla (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17770) and discussed on Stack Overflow (https://stackoverflow.com/q/61240049).)

The documentation of xtabs says: "na.action: When it is na.pass and formula has a left hand side (with counts), sum(*, na.rm = TRUE) is used instead of sum(*) for the counts."

However, this is not the case:

DF <- data.frame(group = c("a", "a", "b", "b"),
                 count = c(NA, TRUE, FALSE, TRUE))
xtabs(formula = count ~ group, data = DF, na.action = na.pass)
# group
# a b
# 1

In the code, na.rm is TRUE if and only if na.action = na.omit:

na.rm <- identical(naAct, quote(na.omit)) ||
         identical(naAct, na.omit) ||
         identical(naAct, "na.omit")

xtabs(formula = count ~ group, data = DF, na.action = na.omit)
# group
# a b
# 1 1

The example works as documented if we change the code to:

na.rm <- identical(naAct, quote(na.pass)) ||
         identical(naAct, na.pass) ||
         identical(naAct, "na.pass")

However, there may be something I am missing, and na.omit may be necessary for something else...

Best regards,

Thomas
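Until the patch lands, the documented na.pass behaviour can be reproduced by summing the counts with na.rm = TRUE directly (a hedged workaround using tapply rather than the xtabs internals):

```r
## Same data as in the report above.
DF <- data.frame(group = c("a", "a", "b", "b"),
                 count = c(NA, TRUE, FALSE, TRUE))

## What the documentation promises for na.action = na.pass:
## NAs in the counts are dropped, but the NA row still contributes
## its group level.
counts <- tapply(DF$count, DF$group, sum, na.rm = TRUE)
counts
# a b
# 1 1
```

The same trick works for multi-way classifications by passing a list of grouping factors to tapply.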
Re: [Rd] Telling Windows how to find DLL's from R?
Dominick,

Dominick Samperi wrote:
> On Fri, Jul 9, 2010 at 3:48 PM, Duncan Murdoch wrote:
>
>> On 09/07/2010 2:38 PM, Dominick Samperi wrote:
>>
>>> Is it possible to set Windows' search path from within R, or to tell
>>> Windows how to find a DLL in some other way from R? Specifically, if
>>> a package DLL depends on another DLL, the normal requirement is that
>>> the second DLL be in the search path so Windows can find it (there
>>> are other tricks, but they apply at the Windows level, not at the R level).
>>
>> I haven't tried this, but can't you use Sys.setenv() to change the
>> PATH to what you want? Presumably you'll want to change it back afterwards.
>
> Thanks, good suggestion, but it does not seem to work. If PATH is
> updated in this way the change is local to the current process, not to
> the top-level Windows process, so a subsequent dyn.load('foo.dll') will
> fail if foo.dll depends on bar.dll, unless bar.dll is placed in the
> search path for the top-level shell. It seems this needs to be done as
> part of system startup, outside of R.
>
> On the other hand, if foo.dll is the package library for package foo,
> and if foo depends on package bar, then there is no need to place
> bar.dll in the top-level search path. R takes care of this (more
> typical) situation.

There is another Windows "feature" which will do the trick for you without modifying the search path. Windows normally only loads DLLs once, so if you first load the dependent DLL (manually, e.g. using dyn.load()) then the already loaded DLL will be used instead of trying to load one from the path. E.g. in your example:

dyn.load("mypath/bar.dll")
dyn.load("foo.dll")

will work, as bar.dll (a dependency of foo.dll) is already loaded.

Thomas
Re: [Rd] Plot window does not update in embedded code
Hi,

On Wednesday 21 July 2010, Jan van der Laan wrote:
> How do I ensure that the windows keep being updated?

In RKWard we run the following periodically during idle phases:

// this is basically copied from R's unix/sys-std.c (Rstd_ReadConsole)
#ifndef Q_WS_WIN
	for (;;) {
		fd_set *what;
		what = R_checkActivityEx(R_wait_usec > 0 ? R_wait_usec : 50, 1, Rf_onintr);
		R_runHandlers(R_InputHandlers, what);
		if (what == NULL) break;
	}
	/* This seems to be needed to make Rcmdr react to events. Has this
	   always been the case? It was commented out for a long time,
	   without anybody noticing. */
	R_PolledEvents();
#else
	R_ProcessEvents();
#endif

Regards
Thomas
[Rd] Visibility of methods in different namespaces
Hey folks,

I have read plenty of material and started to write some new R packages, but I am looking for a solution to the following problem. Perhaps it's a beginner's question about creating packages, but I wasn't able to track it down yet. Hopefully you can provide me with a solution or at least some adequate readings.

I have a package, say it's called "AAA", with an S4 object, say it's called "aObject". On the other hand I have a package "BBB" with an S4 object called "bObject". Both objects want to provide the same method to outside users; the method is called "calculateModel" and was implemented in each package using the following S3 notation:

Package AAA:
---
calculateModel <- function(object, ...) {
  UseMethod("calculateModel")
}
calculateModel.aObject <- function(object, ..., someAdditionalParameters) {
  bla bla
}

Package BBB:
---
calculateModel <- function(object, ...) {
  UseMethod("calculateModel")
}
calculateModel.bObject <- function(object, ..., someAdditionalParameters) {
  bla bla
}

In the NAMESPACE files of package AAA and package BBB, I registered the methods using:

Package AAA:
---
S3method("calculateModel", "aObject")
export("calculateModel")

Package BBB:
---
S3method("calculateModel", "bObject")
export("calculateModel")

Everything works fine when I load only one of the packages. But if I load both packages together, one of the calculateModel methods disappears and is no longer visible, which results in an error. What am I missing?

Thanks!

Thomas
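The masking happens because each package exports its own generic, and S3 methods are registered against their own package's generic only. One way around this (a hedged sketch run in a single session; the class and method names mirror the example above and are not real packages) is to dispatch through one shared generic, e.g. an S4 generic that one package would export and the other import:

```r
library(methods)

# Toy stand-ins for the classes from packages AAA and BBB.
setClass("aObject", representation(a = "numeric"))
setClass("bObject", representation(b = "numeric"))

# A single generic. In a package setting, AAA would export it and BBB
# would import it (or both would set methods on a generic exported from
# a small common package), so neither definition masks the other.
setGeneric("calculateModel", function(object, ...)
  standardGeneric("calculateModel"))

setMethod("calculateModel", "aObject",
          function(object, ...) "model for aObject")
setMethod("calculateModel", "bObject",
          function(object, ...) "model for bObject")

calculateModel(new("aObject"))  # "model for aObject"
calculateModel(new("bObject"))  # "model for bObject"
```

The same idea works with S3 if BBB imports AAA's generic and only registers a method for it, instead of defining a second generic of the same name.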
Re: [Rd] R support for 64 bit integers
On Tue, 10 Aug 2010, Martin Maechler wrote:

{Hijacking the thread from R-help to R-devel -- as I am consciously shifting the focus away from the original question ... }

David Winsemius on Tue, 10 Aug 2010 08:42:12 -0400 writes:

> On Aug 9, 2010, at 2:45 PM, Theo Tannen wrote:
>> Are integers strictly a signed 32 bit number on R even if
>> I am running a 64 bit version of R on a x86_64 bit machine?
>>
>> I ask because I have integers stored in a hdf5 file where
>> some of the data is 64 bit integers. When I read that
>> into R using the hdf5 library it seems any integer
>> greater than 2**31 returns NA.
>
> That's the limit. It's hard coded and not affected by the
> memory pointer size.
>
>> Any solutions?
>
> I have heard of packages that handle "big numbers". A bit
> of searching produces suggestions to look at gmp on CRAN
> and Rmpfr on R-Forge.

Note that Rmpfr has been on CRAN, too, for a while now. If you only need large integers (and rationals), 'gmp' is enough though. *However*, note that the gmp or Rmpfr (or any other arbitrary precision) implementation will be considerably slower in usage than native 64-bit integer support would be.

Introducing 64-bit integers natively into "base R" is an "interesting" project, notably if we also allowed using them for indices, and changed the internal structures to use them instead of 32-bit. This would allow us to free ourselves from the increasingly relevant maximum-atomic-object-length = 2^31 problem. The latter is something we have planned to address, possibly for R 3.0. However, for that, using 64-bit integers is just one possibility, another being to use "double precision integers". Personally, I'd prefer the "long long" (64-bit) integers quite a bit, but there are other considerations: e.g., one big challenge will be to go there in a way such that not all R packages using compiled code will have to be patched extensively... another aspect is how the BLAS / LAPACK team will address the problem.
At the moment, all the following are the same type:

- length of an R vector
- R integer type
- C int type
- Fortran INTEGER type

The last two are fixed at 32 bits (in practice for C, by standard for Fortran), and we would like the first and perhaps the second to become 64-bit.

If both the R length type and the R integer type become the same 64-bit type and replace the current integer type, then every compiled package has to change to declare the arguments as int64 (or long, on most 64-bit systems) and INTEGER*8. That should be all that is needed for most code, since C compilers nowadays already complain if you do unclean things like stuffing an int into a pointer.

If the R length type changes to something /different/ from the integer type, then any compiled code has to be checked to see whether C int arguments are lengths or integers, which is more work and more error-prone. On the other hand, changing the integer type to 64-bit will presumably make integer code run noticeably more slowly on 32-bit systems.

In both cases, the changes could be postponed by having an option to .C/.Call forcing lengths and integers to be passed as 32-bit. This would mean that the code couldn't use large integers or large vectors, but it would keep working indefinitely.

-thomas

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle
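The 32-bit limit under discussion is easy to see from R itself; whole numbers beyond 2^31 - 1 have to be carried in doubles:

```r
## R's integer type is a signed 32-bit C int, so the largest value is
## 2^31 - 1; integer arithmetic past that overflows to NA with a warning.
.Machine$integer.max        # 2147483647, i.e. 2^31 - 1
.Machine$integer.max + 1L   # NA, with an integer-overflow warning
2^31                        # 2147483648: representable, but as a double
is.integer(2^31)            # FALSE
```

Doubles represent all integers exactly up to 2^53, which is why "double precision integers" are mentioned above as one candidate for longer vector lengths.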
Re: [Rd] Non identical numerical results from R code vs C/C++ code?
On Fri, 10 Sep 2010, Duncan Murdoch wrote:

On 10/09/2010 7:07 AM, Renaud Gaujoux wrote:

Thank you Duncan for your reply. Currently I am using 'double' for the computations. What type should I use for extended reals in my intermediate computations?

I think it depends on compiler details. On some compilers "long double" will get it, but I don't think there's a standard type that works on all compilers. (In fact, the 80-bit type on Intel chips isn't necessarily supported on other hardware.) R defines LDOUBLE in its header files and it is probably best to use that if you want to duplicate R results.

As a little more detail, 'long double' is in the C99 standard and seems to be fairly widely implemented, so code using it is likely to compile. The Standard, as usual, doesn't define exactly what type it is, and permits it to be a synonym for 'double', so you may not get any extra precision. On Intel chips it is likely to be the 80-bit type, but the Sparc architecture doesn't have any larger hardware type. Radford Neal has recently reported much slower results on Solaris with long double, consistent with Wikipedia's statement that long double is sometimes a software-implemented 128-bit type on these systems.

The result will still be 'double' anyway, right?

Yes, you do need to return type double.

Duncan Murdoch

On 10/09/2010 13:00, Duncan Murdoch wrote:

On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:

Hi,

Suppose you have two versions of the same algorithm: one in pure R, the other in C/C++ called via .Call(). Assuming there is no bug in the implementations (i.e. they both do the same thing), is there any well known reason why the C/C++ implementation could return numerical results not identical to those obtained from the pure R code? (e.g. could it be rounding errors? please explain.) Has anybody had a similar experience?

R often uses extended reals (80-bit floating point values on Intel chips) for intermediate values. C compilers may or may not do that.
By not identical, I mean very small differences (< 2.4e-14), but enough to have identical() returning FALSE. Maybe I should not bother, but I want to be sure where the differences come from, at least out of mere curiosity. Briefly, the R code performs multiple matrix products; the C code is an optimization of those specific products via custom for loops, where entries are not computed in the same order, etc., which improves both memory usage and speed. The result is theoretically the same.

Changing the order of operations will often affect rounding. For example, suppose epsilon is the smallest number such that 1 + epsilon is not equal to 1. Then 1 + (epsilon/2) + (epsilon/2) will evaluate to either 1 or 1 + epsilon, depending on the order of computing the additions.

Duncan Murdoch

Thank you,
Renaud

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle
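Duncan's epsilon example can be run directly in R (here eps is the machine epsilon, the gap between 1 and the next representable double):

```r
## Associativity fails in floating point: the same three terms give
## different results depending on which addition happens first.
eps <- .Machine$double.eps

(1 + eps/2) + eps/2 == 1   # TRUE: each half-step rounds back to 1
1 + (eps/2 + eps/2) == 1   # FALSE: the halves are summed exactly first,
                           # giving eps, and 1 + eps is representable
```

This is exactly the kind of discrepancy a reordered C loop can introduce relative to R's matrix products, which is why all.equal() with a tolerance, not identical(), is the right comparison.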
[Rd] package test failed on Solaris x86 -- help needed for debugging
Dear R developers,

We currently have a 'mysterious' test problem with one package that successfully passed the tests on all platforms, with the only exception of Solaris x86, where obviously one of our help examples breaks the CRAN test. As we don't own such a machine, I want to ask about a possibility to run a few tests on such a system:

r-patched-solaris-x86

An even more recent version of R on the same OS (Solaris 10) and with the same compiler (Sun Studio 12u1) would help also.

Any assistance is appreciated,

Thomas Petzoldt

--
Thomas Petzoldt
Technische Universitaet Dresden
Institut fuer Hydrobiologie      thomas.petzo...@tu-dresden.de
01062 Dresden                    http://tu-dresden.de/hydrobiologie/
GERMANY
Re: [Rd] package test failed on Solaris x86 -- help needed for debugging
On 16.09.2010 17:05, Martyn Plummer wrote:

> Dear Thomas,
>
> Is this the deSolve package?
> http://www.r-project.org/nosvn/R.check/r-patched-solaris-x86/deSolve-00check.html
>
> I can help you with that. It does pass R CMD check on my OpenSolaris
> installation, but I am getting some compiler warnings. I will send you details.
>
> Martyn

You are right, and there are many candidates for what could be wrong, e.g. an obsolete comma in the example, the call to colorRampPalette after the ode.2D call, or any problem with the C code. I wonder why this problem is so specific, because it runs on all other eleven platforms, including Solaris / Sparc. Details about the compiler warnings are welcome.

Thomas
Re: [Rd] Summary: package test failed on Solaris x86 ...
Dear Martyn,

Many thanks for your effort. Last night we found the same, thanks to the kind assistance of Bill Dunlap. The most important bugs are now already fixed; some minor things and an upload of a new version will follow soon.

Many thanks for the quick and competent assistance to Bill Dunlap, Matthew Doyle and you (Martyn Plummer). I've also set up a new Linux test system, so that next time valgrind checks can be performed before package upload.

Thank you!

Thomas Petzoldt
[Rd] Matrix install fails because of defunct save in require
Dear R-Devel,

I've just tried to compile the fresh R-devel and found that the install of package Matrix failed:

-
** help
*** installing help indices
** building package indices ...
Error in require(Matrix, save = FALSE) : unused argument(s) (save = FALSE)
ERROR: installing package indices failed
-

Possible reason: Matrix/data/*.R

NEWS.Rd says: The \code{save} argument of \code{require()} is defunct.

Thomas Petzoldt
Re: [Rd] How to connect R to Mysql?
I also had problems connecting via RMySQL on Windows several weeks ago. I decided to skip the package and now use RODBC, which runs stably out of the box. Perhaps you should have a look at that package.

HTH,
Thomas

Am 17.09.2010 17:50, schrieb Spencer Graves:

> I've recently been through that with some success. I don't remember all
> the details, but I first looked at "help(pac=RMySQL)". This told me that
> the maintainer was Jeffrey Horner. Google told me he was at Vanderbilt.
> Eventually I found "http://biostat.mc.vanderbilt.edu/wiki/Main/RMySQL";,
> which told me that I needed to build the package myself so it matches
> your version of MySQL, operating system, etc. I did that.
>
> Does the MySQL database already exist? I created a MySQL database and
> tables using MySQL server 5.1.50-win32. (Which version of MySQL do you
> have?)
>
> help('RMySQL-package') includes "A typical usage". That helped me get
> started, except that I needed to write to that database, not just query
> it. For this, I think I got something like the following to work:
>
> d <- dbReadTable(con, "WL")
> dbWriteTable(con, "WL2", a.data.frame)          ## table from a data.frame
> dbWriteTable(con, "test2", "~/data/test2.csv")  ## table from a file
>
> Hope this helps.
> Spencer
>
> On 9/17/2010 7:55 AM, Arijeet Mukherjee wrote:
>> I installed the RMySQL package in R 2.11.1 64 bit.
>> Now how can I connect R with MySQL?
>> I am using a Windows 7 64 bit version. Please help ASAP.
Re: [Rd] Matrix install fails because of defunct save in require
On 17.09.2010 19:22, Uwe Ligges wrote:

> On 17.09.2010 16:04, Thomas Petzoldt wrote:
>> Dear R-Devel,
>>
>> I've just tried to compile the fresh R-devel and found that the install
>> of package Matrix failed:
>>
>> -
>> ** help
>> *** installing help indices
>> ** building package indices ...
>> Error in require(Matrix, save = FALSE) : unused argument(s) (save = FALSE)
>> ERROR: installing package indices failed
>> -
>
> Have you got the Matrix package from the appropriate 2.12/recommended
> repository or installed via
>
> make rsync-recommended
> make recommended
>
> In that case it works for me.
>
> Uwe

Yes, I did it this way, but did you use an svn version before 52932 or a version equal to or newer than 52940? The svn log shows that in the meantime Brian Ripley added a workaround:

Revision: 52940
Author: ripley
Date: 19:31:48, Friday, 17 September 2010
Message: keep dummy require(save=FALSE) for now
Modified: /trunk/doc/NEWS.Rd
Modified: /trunk/src/library/base/R/library.R
Modified: /trunk/src/library/base/man/library.Rd

It is solved for now.

Thanks,
Thomas P.
Re: [Rd] Matrix install fails because of defunct save in require
On 17.09.2010 20:04, Prof Brian Ripley wrote:

> I'm not sure why end users would be using R-devel rather than R-alpha at
> this point, but I have already changed R-devel to allow Matrix to get
> updated before it fails.

Yes, I noticed the update and successfully recompiled it. Many thanks.

"End users" or package developers want to keep their own packages compatible with future versions, so maintaining svn syncs is much more efficient than downloading snapshots. In the current case it would have been much easier for me, of course, to go back to an older svn release (as I sometimes do). However, I felt responsible for reporting issues as a contribution to the open source development process.

O.K., I'll wait a little bit longer in the future, and many thanks for developing this great software.

ThPe
Re: [Rd] Assignment to a slot in an S4 object in a list seems to violate copy rules?
On Thu, Sep 30, 2010 at 8:15 AM, peter dalgaard wrote:
>
> On Sep 30, 2010, at 16:19, Niels Richard Hansen wrote:
>
>> setClass("A", representation(a = "numeric"))
>> B <- list()
>> myA <- new("A", a = 1)
>> B$otherA <- myA
>> B$otherA@a <- 2
>> myA@a
>
> R version 2.12.0 Under development (unstable) (2010-09-13 r52905)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> --- not anymore, it seems: ---
>> setClass("A", representation(a = "numeric"))
> [1] "A"
>> B <- list()
>> myA <- new("A", a = 1)
>> B$otherA <- myA
>> B$otherA@a <- 2
>> myA@a
> [1] 1
>> sessionInfo()
> R version 2.12.0 alpha (2010-09-29 r53067)
> Platform: x86_64-apple-darwin10.4.0 (64-bit)
>
> So somewhere in the last 162 commits, this got caught. Probably r52914, but
> it looks like it hasn't been recorded in NEWS (and it should be, as this was
> apparently a live bug, not an obscure corner case):
>
> r52914 | luke | 2010-09-15 19:06:13 +0200 (Wed, 15 Sep 2010) | 4 lines
>
> Modified applydefine to duplicate if necessary to ensure that the
> assignment target in calls to assignment functions via the complex
> assignment mechanism always has NAMED == 1.

Yes, that was the one. It was reported as a bug back then too, and there was quite a bit of discussion that ended up with Luke's fix.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle
Re: [Rd] Will PrintWarnings remain non static?
On Friday 05 November 2010, Jeffrey Horner wrote:
> Anyone have comments on this? PrintWarnings would be nice to utilize
> for those embedding R.

I had missed your previous post. Rf_PrintWarnings is pretty useful when embedding R, indeed. We do use it in RKWard. So I would certainly appreciate it if it could be promoted into the public C API.

Regards
Thomas