Re: [R] What exactly is an dgCMatrix-class. There are so many attributes.

2017-10-21 Thread Martin Maechler
>>>>> C W 
>>>>> on Fri, 20 Oct 2017 15:51:16 -0400 writes:

> Thank you for your responses.  I guess I don't feel
> alone. I don't find the documentation go into any detail.

> I also find it surprising that,

>> object.size(train$data)
> 1730904 bytes

>> object.size(as.matrix(train$data))
> 6575016 bytes

> the dgCMatrix actually takes less memory, though it
> *looks* like the opposite.

to whom?

The whole idea of these sparse matrix classes in the 'Matrix'
package (and everywhere else in applied math, CS, ...) is that
1. they need  much less memory   and
2. matrix arithmetic with them can be much faster because it is based on
   sophisticated sparse matrix linear algebra, notably the
   sparse Cholesky decomposition for solve() etc.

Of course the efficiency only applies if most of the
matrix entries _are_ 0.
You can measure the "sparsity", or rather the "density", of a
matrix by

  nnzero(A) / length(A)

where length(A) == nrow(A) * ncol(A)  as for regular matrices
(but it does *not* suffer from integer overflow)
and nnzero(.) is a simple utility from Matrix
which -- very efficiently for sparseMatrix objects -- gives the
number of nonzero entries of the matrix.
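
For illustration, a small sketch (hypothetical matrix; it assumes only
that the Matrix package is attached):

  library(Matrix)
  A <- sparseMatrix(i = c(1, 3, 4), j = c(1, 2, 3), x = c(2, 7, 1),
                    dims = c(1000, 500))   # a 1000 x 500 dgCMatrix
  nnzero(A) / length(A)   # 3 / 500000 = 6e-06 : extremely sparse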
   
All of these classes are formally defined classes and therefore
have help pages.  Here  ?dgCMatrix-class  which then points
to  ?CsparseMatrix-class  (and I forget if RStudio really helps
you find these ..; in Emacs ESS they are found nicely via the usual key)

To get started, you may further look at  ?Matrix _and_  ?sparseMatrix
(and possibly the Matrix package vignettes --- though they need
 work -- I'm happy for collaborators there !)

Bill Dunlap's comment indeed applies:
In principle all these matrices should work like regular numeric
matrices, just faster and with less memory footprint if they are
really sparse (and not just formally of a sparseMatrix class)
  ((and there are quite a few more niceties in the package))
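
A tiny sketch of that point (made-up example):

  library(Matrix)
  S <- sparseMatrix(i = 1:4, j = c(2, 4, 1, 3), x = 1:4)  # 4 x 4 sparse
  S %*% S        # matrix product; the result stays sparse
  S + diag(4)    # mixing with a traditional dense matrix works, too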

Martin Maechler
(here, maintainer of 'Matrix')


> On Fri, Oct 20, 2017 at 3:22 PM, David Winsemius 
> wrote:

>> > On Oct 20, 2017, at 11:11 AM, C W  wrote:
>> >
>> > Dear R list,
>> >
>> > I came across dgCMatrix. I believe this class is associated with sparse
>> > matrix.
>> 
>> Yes. See:
>> 
>> help('dgCMatrix-class', pack=Matrix)
>> 
>> If Martin Maechler happens to respond to this you should listen to him
>> rather than anything I write. Much of what the Matrix package does
>> appears to be magical to one such as I.
>> 
>> >
>> > I see there are 8 attributes to train$data, I am confused why are there
>> > so many, some are vectors, what do they do?
>> >
>> > Here's the R code:
>> >
>> > library(xgboost)
>> > data(agaricus.train, package='xgboost')
>> > data(agaricus.test, package='xgboost')
>> > train <- agaricus.train
>> > test <- agaricus.test
>> > attributes(train$data)
>> >
>> 
>> I got a bit of an annoying surprise when I did something similar. It
>> appeared to me that I did not need to load the xgboost library since all
>> that was being asked was "where is the data" in an object that should be
>> loaded from that library using the `data` function. The last command
>> asking for the attributes filled up my console with a 100K length vector
>> (actually 2 of such vectors). The `str` function returns a more useful result.
>> 
>> > data(agaricus.train, package='xgboost')
>> > train <- agaricus.train
>> > names( attributes(train$data) )
>> [1] "i""p""Dim"  "Dimnames" "x""factors"
>> "class"
>> > str(train$data)
>> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
>> ..@ i   : int [1:143286] 2 6 8 11 18 20 21 24 28 32 ...
>> ..@ p   : int [1:127] 0 369 372 3306 5845 6489 6513 8380 8384 10991
>> ...
>> ..@ Dim : int [1:2] 6513 126
>> ..@ Dimnames:List of 2
>> .. ..$ : NULL
>> .. ..$ : chr [1:126] "cap-shape=bell" "cap-shape=conical"
>> "cap-shape=convex" "cap-shape=flat" ...
>> ..@ x   : num [1:143286] 1 1 1 1 1 1 1 1 1 1 ...
>> ..@ factors : list()
>> 
>> > Where is the data, is i

Re: [R] nls() and loop

2017-10-21 Thread Martin Maechler
Dear Vangi,

>>>>> Evangelina Viotto 
>>>>> on Fri, 20 Oct 2017 11:37:12 -0300 writes:

> Hello, I need to fit a growth curve to length-age data. I want to evaluate
> which function best predicts my data; to do so I compare the
> AICs of different models. I now need to evaluate whether changing the
> initial values changes the parameters, and which values do not allow
> the model to be estimated.

> To do this I use the function nls(); 

good!

> and I randomize the initial values (real positive number).

Not a very good idea, I'm sorry to say:
You have enough observations to fit such a simple 3-parameter model:

I'm using your data to show you two models, both already provided with R
as "self-starting models".
Self-starting means they use the data (and math and linear least
squares) to get good initial (aka "starting") values for the parameters.

The first model, SSasymp(), is equivalent to yours but more smartly
parametrized: it uses exp(lrc) (!) -- see   help(SSasymp)   in R;
the 2nd model assumes the true curve goes through the origin (= (0,0))
and hence uses one parameter less.
As we will see, both models fit ok, but the simpler model
may be preferable.

Here is the (self contained) R code, including your data at the beginning:



ANO <- c( 1.65, 1.69, 1.72, 1.72, 1.72, 1.72, 1.73, 2.66 ,2.66, 2.66, 2.66, 
2.76, 2.76, 2.76 ,2.76, 2.78, 2.8, 3.65, 3.65 ,3.65, 3.78, 3.78, 5.07, 7.02, 
7.1, 7.81, 8.72, 8.74, 8.78, 8.8, 8.8, 8.83, 8.98, 9.1, 9.11, 9.75, 9.82, 9.84, 
9.87, 9.87, 10.99, 11.67, 11.8, 11.81, 13.93, 14.83, 15.82, 15.99, 16.87, 
16.88, 16.9, 17.68, 17.79, 17.8, 17.8)

SVL <- 
c(26.11,29.02,41.13,30.96,37.74,29.02,33.38,24.18,34.35,35.8,29.99,42.59,27.57,47.43,46.95,30.47,29.75,35.8,40.65,36.29,34.83,29.02,43.5,75,68,70,67.5,80,77.5,68,68,73.84,72.14,68,64.5,58.5,72.63,78.44,71.17,70.69,77,79,78,68.5,69.72,71.66,77,77,79,76.5,78.5,79,73,80,69.72)

d.SA <- data.frame(SVL, ANO) # creo data frame {but do _not_ call it 'data'}
str(d.SA) ## 55 x 2
summary(d.SA) # to get an idea;  the plot below is more useful

## MM: just do this: it's equivalent to your model (but better parametrized!)
fm1 <- nls(SVL ~ SSasymp(ANO, Asym, lrc, c0), data = d.SA)
summary(fm1)

## Compute nicely spaced predicted values for plotting:
ANO. <- seq(-1/2, 30, by = 1/8)
SVL. <- predict(fm1, newdata = list(ANO = ANO.))

plot(SVL ~ ANO, d.SA, xlim = range(ANO, ANO.), ylim = range(SVL, SVL.))
lines(SVL. ~ ANO., col="red", lwd=2)
abline(h = coef(fm1)[["Asym"]], col = "tomato", lty=2, lwd = 1.5)
abline(h=0, v=0, lty=3, lwd = 1/2)

## Both from the summary (where 'lrc' has a large variance) and from the fit,
## try the asymptotic model through the origin ==> 1 parameter less instead :
fmO <- nls(SVL ~ SSasympOrig(ANO, Asym, lrc), data = d.SA)
summary(fmO)## (much better determined)
SVL.O <- predict(fmO, newdata = list(ANO = ANO.))
lines(SVL.O ~ ANO., col="blue", lwd=2)# does go through origin (0,0)
abline(h = coef(fmO)[["Asym"]], col = "skyblue", lty=2, lwd = 1.5)

## The fits look close; I'd probably take the simpler one:
## and classical statistical inference also does not see a significant
## difference between the two fitted models :
anova(fm1, fmO)
## Analysis of Variance Table

## Model 1: SVL ~ SSasymp(ANO, Asym, lrc, c0)
## Model 2: SVL ~ SSasympOrig(ANO, Asym, lrc)
##   Res.Df Res.Sum Sq Df  Sum Sq F value Pr(>F)
## 1     52     2635.1
## 2     53     2702.6 -1  -67.55   1.333 0.2535

---

I see that the 10 nice self-starting models that came with nls
already in the 1990s are not known and probably not
understood by many.

I'm working on making their help pages nicer, notably by
slightly enhancing the nice model-visualizing plot you already
get in R when you run

example(SSasymp)
or
example(SSasympOrig)

(but unfortunately, they currently use 'lwd = 0' to draw the asymptote,
 which shows up fine on a PDF but not on a typical screen graphics device.)


Martin Maechler
ETH Zurich and R Core team



Re: [R] What exactly is an dgCMatrix-class. There are so many attributes.

2017-10-21 Thread Martin Maechler
>>>>> C W 
>>>>> on Fri, 20 Oct 2017 16:01:06 -0400 writes:

> Subsetting using [] vs. head(), gives different results.
> R code:

>> head(train$data, 5)
> [1] 0 0 1 0 0

The above is surprising ... and points to a bug somewhere.
It is different (and correct) after you do

   require(Matrix)

but I think something like that should happen
semi-automatically.

As I just see, it is even worse if you get the data from xgboost
without loading the xgboost package, which you can do (and is
also more efficient !):

If you start R, and then do

   data(agaricus.train, package='xgboost')

   loadedNamespaces() # does not contain "xgboost" nor "Matrix"

so, no wonder

   head(agaricus.train $ data)

does not find head()'s "Matrix" method [which _is_ exported by Matrix
via  exportMethods(.)].
But even more curiously, even after I do

loadNamespace("Matrix")
methods(head)

now does show the "Matrix" method,
but then head() *still* does not call it.  There's a bug
somewhere and I suspect it's in R's data() or methods package or
?? rather than in 'Matrix'.
But that will be another thread on R-devel or R's bugzilla.
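
For completeness, the path that does work (a sketch, assuming the
xgboost data set is installed):

   data(agaricus.train, package = 'xgboost')
   library(Matrix)   # *attaching* Matrix makes its head() method found
   head(agaricus.train $ data, 3)  # a 3 x 126 sparse Matrix, as expected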

Martin


>> train$data[1:5, 1:5]
> 5 x 5 sparse Matrix of class "dgCMatrix"
>  cap-shape=bell cap-shape=conical cap-shape=convex
> [1,]  . .1
> [2,]  . .1
> [3,]  1 ..
> [4,]  . .1
> [5,]  . .1
>  cap-shape=flat cap-shape=knobbed
> [1,]  . .
> [2,]  . .
> [3,]  . .
> [4,]  . .
> [5,]  . .

> On Fri, Oct 20, 2017 at 3:51 PM, C W  wrote:

>> Thank you for your responses.
>> 
>> I guess I don't feel alone. I don't find the documentation go into any
>> detail.
>> 
>> I also find it surprising that,
>> 
>> > object.size(train$data)
>> 1730904 bytes
>> 
>> > object.size(as.matrix(train$data))
>> 6575016 bytes
>> 
>> the dgCMatrix actually takes less memory, though it *looks* like the
>> opposite.
>> 
>> Cheers!
>> 
>> On Fri, Oct 20, 2017 at 3:22 PM, David Winsemius 
>> wrote:
>> 
>>> 
>>> > On Oct 20, 2017, at 11:11 AM, C W  wrote:
>>> >
>>> > Dear R list,
>>> >
>>> > I came across dgCMatrix. I believe this class is associated with
>>> > sparse matrix.
>>> 
>>> Yes. See:
>>> 
>>> help('dgCMatrix-class', pack=Matrix)
>>> 
>>> If Martin Maechler happens to respond to this you should listen to him
>>> rather than anything I write. Much of what the Matrix package does
>>> appears to be magical to one such as I.
>>> 
>>> >
>>> > I see there are 8 attributes to train$data, I am confused why are
>>> > there so many, some are vectors, what do they do?
>>> >
>>> > Here's the R code:
>>> >
>>> > library(xgboost)
>>> > data(agaricus.train, package='xgboost')
>>> > data(agaricus.test, package='xgboost')
>>> > train <- agaricus.train
>>> > test <- agaricus.test
>>> > attributes(train$data)
>>> >
>>> 
>>> I got a bit of an annoying surprise when I did something similar. It
>>> appeared to me that I did not need to load the xgboost library since
>>> all that was being asked was "where is the data" in an object that should be
>>> loaded from that library using the `data` function. The last command
>>> asking for the attributes filled up my console with a 100K length vector
>>> (actually 2 of such vectors). The `str` function returns a more useful result.
>>> 
>>> > data(agaricus.train, package='xgboost')
>>> > train <- agaricus.train
>>> > names( attributes(train$data) )
>>> [1] "i""p""Dim"  "Dimnames" "x""factor


Re: [R] Syntax for fit.contrast (from package gmodels)

2017-10-23 Thread Martin Maechler
> Sorkin, John 
> on Sun, 22 Oct 2017 22:56:16 + writes:

> David,
> Thank you for responding to my post.


> Please consider the following output (typeregional is a factor having two
> levels, "regional" vs. "general"):

> Call:
> glm(formula = events ~ type, family = poisson(link = log), data = data,
> offset = log(SS))

> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -43.606  -17.295   -4.651    4.204   38.421

> Coefficients:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -2.52830    0.01085 -233.13   <2e-16 ***
> typeregional  0.33788    0.01641   20.59   <2e-16 ***


> Let's forget for a moment that the model is a Poisson regression and
> pretend that the output is from a simple linear regression, e.g. from lm.

> To get the estimate for "general" one simply needs to use the value of
> the intercept, i.e. -2.52830. Similarly, to get the 95% CI of "general" one simply
> needs to compute -2.52830-(1.96*0.01085) and -2.52830+(1.96*0.01085).

I'm pretty sure you can just use  (the base R) functions

  dummy.coef()

or
  model.tables()

possibly with se=TRUE to get coefficients for all levels of a factor..
I'd like to have tried to show this here, but for that we'd have
wanted to see a "MRE" or "ReprEx" (minimal reproducible example) ..
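
A minimal sketch of the idea (made-up data, as we have no MRE here):

  d <- data.frame(type = factor(rep(c("general", "regional"), each = 20)),
                  y    = rnorm(40))
  fit <- lm(y ~ type, data = d)
  dummy.coef(fit)   # the full set of coefficients, one per factor level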

> To get the estimate for "regional" one needs to compute intercept +
> typeregional, i.e. -2.52830 + 0.33788. Getting the 95% CI is somewhat more
> difficult, as one needs results from the variance-covariance matrix,
> specifically the variance of intercept, the variance of "regional", and the
> covariance of (intercept, "regional"), which involves:

> var = var(intercept) + var(regional) + 2*(covar(intercept, regional)),

> and then get the SE from the variance

> SE = sqrt(var)

> 95% CI = intercept + regional - 1.96*SE and intercept + regional +
> 1.96*SE.


> I was hoping that a contrast statement could be written that would give
> me the point estimate and SE for "general", and another contrast
> statement could be written that would give me the point estimate and SE
> for "regional", without my having to work directly with the
> variance-covariance matrix. I tried doing this using fit.contrast
> statements (from the gmodels package):


> fit.contrast(model,type,c(1,0),showall=TRUE)

> fit.contrast(model,type,c(0,1),showall=TRUE)


> and received the error message,

> Error in `[[<-`(`*tmp*`, varname, value = c(0, 1)) :
> no such index at level 1


> Perhaps fit.contrast is not the way to accomplish my goal. Perhaps my
> goal can be accomplished without a contrast statement, but I don't know how.

My guess is that "standard R" aka "base R" would be
sufficient to get what you'd want, notably if you'd consider
using  se.contrast() additionally.
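
For the hand computation sketched above, vcov() saves the manual
bookkeeping ('model' being your fitted glm):

  b <- coef(model)
  V <- vcov(model)
  est <- b[1] + b[2]                       # intercept + typeregional
  se  <- sqrt(V[1,1] + V[2,2] + 2*V[1,2])  # SE of that sum
  est + c(-1, 1) * 1.96 * se               # approximate 95% CI for "regional"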

Martin


> Thank you,
> John



Re: [R] read.table(..., header == FALSE, colClasses = )

2017-10-24 Thread Martin Maechler
>>>>> Benjamin Tyner 
>>>>> on Tue, 24 Oct 2017 07:21:33 -0400 writes:

> Jeff,
> Thank you for your reply. The intent was to construct a minimum 
> reproducible example. The same warning occurs when the 'file' argument 
> points to a file on disk with a million lines. But you are correct, my 
> example was slightly malformed and in fact gives an error under R 
> version 3.2.2. Please allow me to try again; in older versions of R,

>    > read.table(file = textConnection("a\t3.14"), header = FALSE, 
> colClasses = c(x = "character", y = "numeric"), sep="\t")
>      V1   V2
>    1  a 3.14

> (with no warning). As of version 3.3.0,

>    > read.table(file = textConnection("a\t3.14"), header = FALSE, 
> colClasses = c(x = "character", y = "numeric"), sep="\t")
>      V1   V2
>    1  a 3.14
>    Warning message:
>    In read.table(file = textConnection("a\t3.14"), header = FALSE,  :
>      not all columns named in 'colClasses' exist

> My intent was not to complain but rather to learn more about best 
> practices regarding the names attribute.

which is a nice attitude, thank you.

An even shorter MRE (as header=FALSE is default, and the default
sep="" works, too):

> tt <- read.table(textConnection("a 3.14"),
+                  colClasses = c(x="character", y="numeric"))
Warning message:
In read.table(file = textConnection("a 3.14"), colClasses = c(x = "character",  :
  not all columns named in 'colClasses' exist
> 

If you read in the help page -- you did read that before posting, did you? --
how 'colClasses' should be specified,

colClasses: character.  A vector of classes to be assumed for the
  columns.  If unnamed, recycled as necessary.  If named, names
  are matched with unspecified values being taken to be ‘NA’.

  Possible values are ..
  .

and the 'x' and 'y' names you used are matched against the
column names ... which in your case are "V1" and "V2",
and so you provoke a warning.

Once you have read (and understood) the above part of the help
page, it becomes easy, no?

> tt <- read.table(textConnection("a 3.14"),
+                  colClasses = c("character","numeric"))
> t2 <- read.table(textConnection("a 3.14"),
+                  colClasses = c(x="character", y="numeric"),
+                  col.names = c("x","y"))
> t2
  x    y
1 a 3.14
> 

i.e., no warning in either of these two cases.

So please, please, PLEASE: at least non-beginners like you *should*
make the effort to read the help page (and report if it seems
incomplete or otherwise improvable)...

Best,
Martin Maechler
ETH Zurich


Re: [R] run r script in r-fiddle

2017-10-30 Thread Martin Maechler
> Suzen, Mehmet 
> on Mon, 30 Oct 2017 11:16:30 +0100 writes:

> Hi Frank, You could upload your R source file to a public
> URL, for example to github, and read it via RCurl, as source()
> does not support https as far as I know.

well... but your knowledge is severely (:-) outdated.
Why did you not try first?

source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R";)

works for me even in R 3.3.0 which is really outdated itself!



Re: [R] run r script in r-fiddle

2017-10-31 Thread Martin Maechler
>>>>> Suzen, Mehmet 
>>>>> on Mon, 30 Oct 2017 16:05:18 +0100 writes:

> We were talking about r-fiddle. It gives error there [*],

> that's why I suggested using RCurl.

>> source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
> ...  unsupported URL scheme Error : cannot open the
> connection
>> 

and later you note that it seems to run R-3.1.2 --- which is
ridiculously outdated.

I think R-help should not be (mis)used at all to talk about
R-Fiddle, then.
Notably as I think it's been provided by a company that no
longer exists under that name, and even if that'd be wrong,  R-Fiddle
does not seem free software (apart from the R parts, I hope !).

If someone asked about his/her problems using R 3.1.2, I think
we would all tell them to go away and install a current version
of R, and not waste any more bandwidth and R-help readers' time,
wouldn't we ?


> On 30 October 2017 at 15:51, Martin Maechler
>  wrote:
>>>>>>> Suzen, Mehmet  on Mon, 30 Oct 2017
>>>>>>> 11:16:30 +0100 writes:
>> 
>> > Hi Frank, You could upload your R source file to a
>> public > URL, for example to github and read via RCurl,
>> as source > do not support https as far as I know.
>> 
>> well... but your knowledge is severely (:-) outdated.
>> Why did you not try first?
>> 
>> source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
>> 
>> works for me even in R 3.3.0 which is really outdated
>> itself!



Re: [R] run r script in r-fiddle

2017-11-01 Thread Martin Maechler
>>>>> Suzen, Mehmet 
>>>>> on Tue, 31 Oct 2017 19:27:30 +0100 writes:

> Dear List, According to datacamp support team,
> r-fiddle.org is not supported. We asked them to put it
> down as Professor Maechler suggested it is a waste of time
> for the R-help to respond to questions on something not
> maintained and severely outdated. If you would like to use
> R from your browser, you can embed the following into a
> web page:

>  src="<a  rel="nofollow" href="https://cdn.datacamp.com/datacamp-light-latest.min.js"">https://cdn.datacamp.com/datacamp-light-latest.min.js"</a>;>
> 

> Currently, it supports R 3.4.0. See the code base, which
> is open source, here
> https://github.com/datacamp/datacamp-light

> Hope it helps.
> Best, Mehmet

Yes, it does!
Thank you very much, Mehmet, for reaching out to get these
clarifications and reporting back here.

I'm glad to be notified that datacamp seems to adhere to an open source
philosophy also in the tools they write: For datacamp-light they
_do_ adhere (*) to the much stricter
GNU General Public License (GPL) Free Software standard, which emphasizes not
only the free availability of source code, but also other
freedoms, e.g., to *modify* and re-distribute under the GPL; see
https://en.wikipedia.org/wiki/GPL for much more.

An open GitHub repo alone is _no_ guarantee of that, and
unfortunately I think there's much software there whose authors
don't care (or don't want) to use a "truly" free software
licence such as
(*) https://github.com/datacamp/datacamp-light/blob/master/LICENSE.md

Best,
Martin Maechler


> On 31 October 2017 at 15:09, Suzen, Mehmet
>  wrote:
>> On 31 October 2017 at 12:42, Martin Maechler
>>  wrote:
>>> Notably as I think it's been provided by a company that
>>> no longer exists under that name, and even if that'd be
>>> wrong, R-Fiddle does not seem free software (apart from
>>> the R parts, I hope !).
>> 
>> For the record, r-fiddle is maintained by datacamp:
>> 
https://www.datacamp.com/community/blog/r-fiddle-an-online-playground-for-r-code-2



Re: [R] [R-pkgs] Release of ess 0.0.1

2017-11-10 Thread Martin Maechler
>>>>> Jorge Cimentada 
>>>>> on Fri, 10 Nov 2017 14:31:43 +0100 writes:

> Thanks to all. Will consider this change in future releases.
> ---


> Jorge Cimentada
> *https://cimentadaj.github.io/ <https://cimentadaj.github.io/>*


> On Fri, Nov 10, 2017 at 12:41 PM, Rainer Krug 
> wrote:

>> On 9 Nov 2017, at 15:57, Sam Steingold  wrote:
>> 
>> * Jorge Cimentada  [2017-11-09 00:02:53 +0100]:
>> 
>> I'm happy to announce the release of ess 0.0.1 a package designed to
>> download data from the European Social Survey
>> 
>> 
>> Given the existence of ESS (Emacs Speaks Statistics -
>> https://ess.r-project.org/) the package name "ess" seems unfortunate.
>> 
>> 
>> Agreed. I would suggest to rename the package to avoid conflicts (ESS
>> includes some R code, And I wouldn’t wonder if they would like to include
>> it in a package?).

As a matter of fact, we (the ESS core developers) have e-chatted
about this and came to the conclusion that we could get along fine with
an unrelated R package called 'ess', notably as the acronym also
makes sense for the European Social Survey.

We'd like to ask the 'ess' package authors to add something like the
following to their ess/DESCRIPTION file:

  This package ess is named to match the European Social Survey (ESS).
  It is unrelated to the Emacs Speaks Statistics (ESS) project, an
  Emacs-based Integrated Development Environment hosted at
  https://ess.r-project.org

and last but not least we have thought of 'reserving'  ESSR  as
the name of a CRAN package that we'd consider using for the R
part of ESS (there are others, considerably less used, notably
Julia, Stata and SAS).

Martin Maechler, ETH Zurich
for ESS core developers


Re: [R] ggtern and bquote...

2017-12-04 Thread Martin Maechler
>>>>> peter dalgaard 
>>>>> on Mon, 4 Dec 2017 14:55:19 +0100 writes:

>> On 4 Dec 2017, at 11:58 , Levent TERLEMEZ via R-help
>>  wrote:
>> 
>> Dear Users,
>> 
>> What is the proper way to write symbol, superscript,
>> subscript in ggtern/ggplot? I tried every given example,
>> every possible features of ggplot but couldn’t achived. I
>> just want to write P_a, sigma^2, etc, would you please
>> advise me about this problem.

> Did you try expression(P_a)? I don't do much gg-stuff, but
> I seem to recall that quote() doesn't quite cut it the way
> it does in base graphics.

> -pd

Yes, I vaguely remember that indeed also for the lattice package
(which is based on 'grid' the same as 'ggplot2' is ..) sometimes
expressions instead of calls are needed, i.e., expression(*)
instead of just quote(*).

However, I think Levent really meant what you'd get by
 expression(P[a]) ?

@Levent: The key is the need for valid R syntax, and indeed, as
   in LaTeX  x_i  often denotes the i-th element of x,  the R syntax for
   indexing/subsetting is used here, i.e.,
x[i]  for LaTeX  x_i
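
A small sketch in ggplot2 syntax (ggtern should accept the same labels;
the toy data here are made up):

  library(ggplot2)
  ggplot(data.frame(x = 1:3, y = 1:3), aes(x, y)) +
    geom_point() +
    labs(x = expression(P[a]), y = expression(sigma^2))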


Last but not least, if Levent really needs bquote() [i.e. substitute()]
then, a final
  as.expression(.)
may be needed :

identical(as.expression(quote(a == 1)),
          expression(    a == 1))  # --> TRUE

--
Martin Maechler, ETH Zurich


Re: [R] max and pmax of NA and NaN

2018-01-15 Thread Martin Maechler
>>>>> Michal Burda 
>>>>> on Mon, 15 Jan 2018 12:04:13 +0100 writes:

> Dear R users, is the following OK?

>> max(NA, NaN)
> [1] NA
>> max(NaN, NA)
> [1] NA
>> pmax(NaN, NA)
> [1] NA
>> pmax(NA, NaN)
> [1] NaN

> ...or is it a bug? 

> Documentation says that NA has a higher priority over NaN.

which documentation ??
[That would be quite a bit misleading I think. So, it should be amended ...]

> Best regards, Michal Burda


R's help pages are *THE* reference documentation and they have 
(for a long time, I think) had :

?NaN   has in its 3rd 'Note:'

 Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
 which of those two is not guaranteed and may depend on the R
 platform (since compilers may re-order computations).

Similarly,  ?NA  contains, in its 'Details':

 Numerical computations using ‘NA’ will normally result in ‘NA’: a
 possible exception is where ‘NaN’ is also involved, in which case
 either might result (which may depend on the R platform).  

-

Yes, it is a bit unfortunate that this is platform dependent; if
we wanted to make this entirely consistent (as desired in a
perfect world), I'm almost sure R would become slower because
we'd have to add some explicit "bookkeeping" / checking
instead of relying on the underlying C library code.

Note that for these reasons, often NaN and NA should not be
differentiated, and that's the reason why using  is.na(*)  is
typically sufficient and "best" -- it gives TRUE for both NA and NaN.
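
For illustration:

  x <- c(1, NA, NaN)
  is.na(x)   #  FALSE  TRUE  TRUE  -- catches both
  is.nan(x)  #  FALSE FALSE  TRUE  -- NaN only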


Martin Maechler
ETH Zurich


Re: [R] Problem with geterrmessage()

2018-02-23 Thread Martin Maechler
>>>>> Dennis Fisher 
>>>>> on Thu, 22 Feb 2018 13:01:37 -0800 writes:

> Luke
> Thanks — I revised the code to:
> ERRORMESSAGE <- try(source(USERSCRIPTFILE, local=T), silent=T) 

> print(ERRORMESSAGE) now returns:
> $value
> [1] 0

> $visible
> [1] FALSE

> Not clear what to make of that.
> Dennis

The help page   ?try   has contained for a long time:

 ‘try’ is implemented using ‘tryCatch’; for programming, instead of
 ‘try(expr, silent = TRUE)’, something like ‘tryCatch(expr, error =
 function(e) e)’ (or other simple error handler functions) may be
 more efficient and flexible.

and you do use 'silent=T' (which is "unsafe" (*) !)

I'd strongly advocate you use the 2nd proposition given by
Luke Tierney (cited below):
 Learn to use tryCatch() instead of try() and then such things
 can be done considerably less obscurely. 
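
A minimal sketch of that approach (using the file variable from your
own code):

  res <- tryCatch(source(USERSCRIPTFILE, local = TRUE),
                  error = function(e) e)
  if (inherits(res, "error"))
      ERRORMESSAGE <- conditionMessage(res)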

Best,
Martin Maechler, ETH Zurich

--
*) Using 'T' instead of 'TRUE'  (or 'F' instead of 'FALSE') *is* unsafe:
   a previous  'T <- 0'  will change what you really wanted.
   TRUE and FALSE are  "constants" in R, whereas  T and F are variables.


> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com




>> On Feb 22, 2018, at 12:45 PM, luke-tier...@uiowa.edu wrote:
>> 
>> Only the default error handler puts the error message in a buffer
>> where it can be retrieved with geterrmessage. try() replaces the
>> default error handler. Either look at the value returned by try() or
>> use tryCatch with conditionMessage.
>> 
>> Best,
>> 
>> luke
>> 
>> On Thu, 22 Feb 2018, Dennis Fisher wrote:
>> 
>>> R 3.4.3
>>> OS X
>>> 
>>> Colleagues
>>> 
>>> I have a 20K line script in which I encounter an unexpected problem.
>>> 
>>> If the script detects presence of a particular file USERCODE.txt, it
>>> executes:
>>> source(“USERCODE.txt”)
>>> If that file is not present, the script executes without a problem.
>>> 
>>> There might be syntax errors in USERCODE.txt; therefore, the code above
>>> is embedded in a try command:
>>> try(source(“USERCODE.txt", local=T), silent=T)
>>> followed by:
>>> ERRORMESSAGE <- geterrmessage()
>>> 
>>> For unclear reasons, an earlier command is yielding an error message:
>>> unused argument (\"\\n\")
>>> Despite identifying the exact source of that error, I can’t fix it (and
>>> it is of no consequence).
>>> 
>>> Ideally, I would like to clear out the pre-existing error message
>>> immediately before the “try” command (or perhaps at that particular location
>>> where it is being created) — but I can’t figure out how to do so.
>>> 
>>> Any suggestions would be welcome.
>>> 
>>> Dennis
>>> 
>>> Dennis Fisher MD
>>> P < (The "P Less Than" Company)
>>> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
>>> www.PLessThan.com
>>> 
>>> 
>> 
>> -- 
>> Luke Tierney
>> Ralph E. Wareham Professor of Mathematical Sciences
>> University of Iowa  Phone: 319-335-3386
>> Department of Statistics andFax:   319-335-3017
>> Actuarial Science
>> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
>> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu




Re: [R] Possible Improvement to sapply

2018-03-13 Thread Martin Maechler
>>>>> Doran, Harold 
>>>>> on Tue, 13 Mar 2018 16:14:19 + writes:

> You’re right, it sure does. My suggestion causes it to fail when
> simplify = ‘array’.

> From: William Dunlap [mailto:wdun...@tibco.com]
> Sent: Tuesday, March 13, 2018 12:11 PM
> To: Doran, Harold 
> Cc: r-help@r-project.org
> Subject: Re: [R] Possible Improvement to sapply

> Wouldn't that change how simplify='array' is handled?

>> str(sapply(1:3, function(x)diag(x,5,2), simplify="array"))
> int [1:5, 1:2, 1:3] 1 0 0 0 0 0 1 0 0 0 ...
>> str(sapply(1:3, function(x)diag(x,5,2), simplify=TRUE))
> int [1:10, 1:3] 1 0 0 0 0 0 1 0 0 0 ...
>> str(sapply(1:3, function(x)diag(x,5,2), simplify=FALSE))
> List of 3
> $ : int [1:5, 1:2] 1 0 0 0 0 0 1 0 0 0
> $ : int [1:5, 1:2] 2 0 0 0 0 0 2 0 0 0
> $ : int [1:5, 1:2] 3 0 0 0 0 0 3 0 0 0


> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com<http://tibco.com>

Yes, indeed, thank you Bill!

I sometimes marvel at how much the mental capacities of R core
are underestimated.  Of course, nobody is perfect, but the bugs
we produce are really more subtle than that ...  ;-)

Martin Maechler
R core  


> On Tue, Mar 13, 2018 at 6:23 AM, Doran, Harold 
mailto:hdo...@air.org>> wrote:
> While working with sapply, the documentation states that the simplify
> argument will yield a vector, matrix etc "when possible". I was curious how the
> code actually defined "as possible" and see this within the function

> if (!identical(simplify, FALSE) && length(answer))

> This seems superfluous to me, in particular this part:

> !identical(simplify, FALSE)

> The preceding code could be reduced to

> if (simplify && length(answer))

> and it would not need to execute the call to identical in order to
> trigger the conditional execution, which is known from the user's simplify =
> TRUE or FALSE inputs. I *think* the extra call to identical is just unnecessary
> overhead in this instance.

> Take for example, the following toy example code and benchmark results
> and a small modification to sapply:

> myList <- list(a = rnorm(100), b = rnorm(100))

> answer <- lapply(X = myList, FUN = length)
> simplify = TRUE

> library(microbenchmark)

> mySapply <- function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE){
> FUN <- match.fun(FUN)
> answer <- lapply(X = X, FUN = FUN, ...)
> if (USE.NAMES && is.character(X) && is.null(names(answer)))
> names(answer) <- X
> if (simplify && length(answer))
> simplify2array(answer, higher = (simplify == "array"))
> else answer
> }


>> microbenchmark(sapply(myList, length), times = 1L)
> Unit: microseconds
>                    expr    min     lq     mean median     uq    max neval
>  sapply(myList, length) 14.156 15.572 16.67603 15.926 16.634 650.46     1
>> microbenchmark(mySapply(myList, length), times = 1L)
> Unit: microseconds
>                      expr    min     lq     mean median     uq      max neval
>  mySapply(myList, length) 13.095 14.864 16.02964 15.218 15.573 1671.804     1

> My benchmark timings show a timing improvement with only that small
> change made and it is seemingly nominal. In my actual work, the sapply function
> is called millions of times and this additional overhead propagates to some
> overall additional computing time.

> I have done some limited testing on various real data to verify that the
> objects produced under both variants of sapply (base R and my modified one)
> yield identical objects when simplify is either TRUE or FALSE.

> Perhaps someone else sees a counterexample where my proposed fix does not
> cause sapply to behave as expected.

> Harold







Re: [R] Possible Improvement to sapply

2018-03-14 Thread Martin Maechler
>>>>> Henrik Bengtsson 
>>>>> on Tue, 13 Mar 2018 10:12:55 -0700 writes:

> FYI, in R devel (to become 3.5.0), there's isFALSE() which will cut
> some corners compared to identical():

> > microbenchmark::microbenchmark(identical(FALSE, FALSE), isFALSE(FALSE))
> Unit: nanoseconds
>                     expr min   lq    mean median     uq   max neval
>  identical(FALSE, FALSE) 984 1138 1694.13 1218.0 1337.5 13584   100
>           isFALSE(FALSE) 713  761 1133.53  809.5  871.5 18619   100

> > microbenchmark::microbenchmark(identical(TRUE, FALSE), isFALSE(TRUE))
> Unit: nanoseconds
>                    expr  min     lq    mean median   uq   max neval
>  identical(TRUE, FALSE) 1009 1103.5 2228.20 1170.5 1357 14346   100
>           isFALSE(TRUE)  718  760.0 1298.98  798.0  898 17782   100

> > microbenchmark::microbenchmark(identical("array", FALSE), isFALSE("array"))
> Unit: nanoseconds
>                       expr min     lq    mean median     uq  max neval
>  identical("array", FALSE) 975 1058.5 1257.95 1119.5 1250.0 9299   100
>           isFALSE("array") 409  433.5  658.76  446.0  476.5 9383   100

Thank you Henrik!

The speed of the new isTRUE() and isFALSE() is indeed amazing
compared to identical() which was written to be fast itself.

Note that the new code goes back to a proposal by  Hervé Pagès
(of Bioconductor fame) in a thread with R core in April 2017.
The goal of the new code actually *was*  to allow call like

  isTRUE(c(a = TRUE))   

to become TRUE rather than improving speed.
The new source code is at the end of  R/src/library/base/R/identical.R

## NB:  is.logical(.) will never dispatch:
## -- base::is.logical(x)  <==>  typeof(x) == "logical"
isTRUE  <- function(x) is.logical(x) && length(x) == 1L && !is.na(x) && x
isFALSE <- function(x) is.logical(x) && length(x) == 1L && !is.na(x) && !x

and one *reason* this is so fast is that all  6  functions which
are called are primitives :

> sapply(codetools::findGlobals(isTRUE), function(fn) is.primitive(get(fn)))
         !         &&         == is.logical      is.na     length 
      TRUE       TRUE       TRUE       TRUE       TRUE       TRUE 

and a 2nd reason is probably the many recent improvements of the
byte compiler.


> That could probably be used also is sapply().  The difference is that
> isFALSE() is a bit more liberal than identical(x, FALSE), e.g.

> > isFALSE(c(a = FALSE))
> [1] TRUE
> > identical(c(a = FALSE), FALSE)
> [1] FALSE

> Assuming the latter is not an issue, there are 69 places in base R
> where isFALSE() could be used:

> $ grep -E "identical[(][^,]+,[ ]*FALSE[)]" -r --include="*.R" | grep -F "/R/" 
> | wc
>  69 3265472

> and another 59 where isTRUE() can be used:

> $ grep -E "identical[(][^,]+,[ ]*TRUE[)]" -r --include="*.R" | grep -F "/R/" 
> | wc
>  59 3075021

Beautiful use of  'grep' -- thank you for those above, as well.
It does need a quick manual check, but if I use the above grep
from Emacs (via  'M-x grep')  or even better via a TAGS table
and M-x tags-query-replace  I should be able to do the changes
pretty quickly... and will start looking into that later today.

Interestingly and to my great pleasure, the first part of the
'Subject' of this mailing list thread, "Possible Improvement",
*has* become true after all --

-- thanks to Henrik !

Martin Maechler
ETH Zurich



> On Tue, Mar 13, 2018 at 9:21 AM, Doran, Harold  wrote:
> > Quite possibly, and I’ll look into that. Aside from the work I was doing, 
> > however, I wonder if there is a way such that sapply could avoid the 
> > overhead of having to call the identical function to determine the 
> > conditional path.
> >
> >
> >
> > From: William Dunlap [mailto:wdun...@tibco.com]
> > Sent: Tuesday, March 13, 2018 12:14 PM
> > To: Doran, Harold 
> > Cc: Martin Morgan ; r-help@r-project.org
> > Subject: Re: [R] Possible Improvement to sapply
> >
> > Could your code use vapply instead of sapply?  vapply forces you to declare 
> > the type and dimensions
> > of FUN's output and stops if any call to FUN does not match the 
> > declaration.  It can use much less
> > memory and time than sapply because it fills in the output array as it goes 
> > instead of calling lapply()
> > and seeing how it could be simplified.
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com<http://tibco.com>
> >
> > On Tue, Mar 13, 2018 at 7:06 AM, Doran, Harold 
> > mailto:hdo...@air.org>&g

Re: [R] Fwd: the same function returning different values when called differently..

2018-03-14 Thread Martin Maechler
>>>>> Eric Berger 
>>>>> on Wed, 14 Mar 2018 18:27:38 +0200 writes:

> Hi Akshay, (Please include r-help when replying)

> You have learned that PFC.NS and snl[[159]] are not
> identical. Now you have to figure out why they
> differ. This could also point to a bug or a logic error in
> your program.  Figuring out how two objects differ can be
> a bit tricky, but with experience it becomes easier. (Some
> others may even have some suggestions for good ways to do
> it.)  Basically you would work your way down. At the top
> level is the class of each, which you already tested and
> they are identical.

> Now try:

>> str(PFC.NS)

> and compare it to

>> str(snl[[159]]

Note further that

 all.equal(obj1, obj2)

may be a good start to find differences between R objects 'obj1' and 'obj2',
as all.equal() is generic and has a list method
(which works recursively).  The output may be huge, but then if there
is a huge number of differences you would have found them yourself
anyway, and will not need all.equal().
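
A small illustration of the idea:

  obj1 <- list(a = 1:3, b = "x")
  obj2 <- list(a = 1:3, b = "y")
  identical(obj1, obj2)  # FALSE -- but *where* do they differ?
  all.equal(obj1, obj2)  # 'Component "b": 1 string mismatch'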

Martin Maechler
ETH Zurich and R Core


> Look closely at the two outputs to try to detect
> differences. If they are the same then you will have to
> examine the sub-objects described by str().  You didn't
> mention what type of objects they are. Suppose they are
> both "numeric" vectors. Then you can check whether their
> lengths are equal.  And you can compare their values,
> etc. etc. No short cut that I can think of without more
> information.

> It definitely takes work to find discrepancies. Think of
> it as a challenge and perhaps write functions that you can
> use to automate this kind of comparison in the future.
> (Again, other people on this list might be able to point
> to tools that help with this for objects of specific
> type.)

> Good luck, Eric




> dear Eric,

> Bulls eye...! identical(PFC.NS,
> snl[[159]]) is resulting FALSE.. but class(PFC.NS) and
> class(snl[[159]]) are the same...

> I want snl[[159]] to be equal to PFC.NS...how do I effect
> it? create a list with some other element list... for
> example snl[[200]] == PFC.NS?


> very many thanks for your time and effort...


> yours sincerely,

> AKSHAY M KULKARNI




> --
> *From:* Eric Berger  *Sent:*
> Wednesday, March 14, 2018 5:22 PM *To:* akshay kulkarni
> *Cc:* R help Mailing list *Subject:* Re: [R] the same
> function returning different values when called
> differently..

> Hi Akshay, Presumably PFC.NS and snl[[159]] are not
> exactly the same.  You can start by trying to determine if
> (and then how) they differ.  e.g.
>> identical(PFC.NS, snl[[159]])
> presumably this will result in FALSE

> then compare
>> class(PFC.NS) class(snl[[159]])

> etc

> HTH, Eric


> On Wed, Mar 14, 2018 at 1:42 PM, akshay kulkarni
>  wrote:

> dear members,

>  I have a function ygrpc which acts on the daily price
> increments of a stock. It returns the following values:


>  ygrpc(PFC.NS,"h")
>  [1] 2.149997 1.875000 0.75 0.349991 2.16 0.17 4.00 2.574996 0.50 0.34 1.50 0.71
> [13] 0.50 1.33 0.449997 2.83 2.724998 66.150002 0.550003 0.050003 1.224991 4.84 1.375000 1.574997
> [25] 1.649994 0.449997 0.975006 2.475006 0.125000 2.625000 3.649994 0.34 1.33 2.074997 1.025001 0.125000
> [37] 3.84 0.025002 0.824997 1.75 1.67 1.75 1.67 0.275002 2.33 0.349998 0.75 0.224998
> [49] 0.125000 1.475006 1.58 0.125000 0.50 0.75 1.08 0.225006 1.274997 1.33 3.00 0.33
> [61] 0.724999 3.67 2.424995 8.425003 1.01 2.025009 0.850006 4.00 1.724991 0.949997 1.825012 2.799988
> [73] 0.425003 1.75 5.75 2.125000 0.125000 4.00 2.350006 1.524994 1.25 0.33 0.949997 0.449997
> [85] 1.84 1.75 1.150009 0.849998 2.449997 5.33 0.1

> I also have a list of stocks called "snl" with snl[[159]]
> pointing to PFC.NS (above):
> tail(snl[[159]])
>            PFC.NS.Open PFC.NS.High PFC.NS.Low PFC.NS.Close PFC.NS.Volume PFC.NS.Adjusted
> 2018-03-07       98.40       98.45       95.1        95.30       7854861           95.30
> 2018-03-08       94.90       94.95       89.3        91.90      12408061           91.90
> 2018-03-09       92.00       94.50       91.9        93.10       7680222           93.10
> 2018-03-12       93.40       93.85       86.1        88.25      12617833           88.25
> 2018-03-13       89.20       91.85       86.2        89.85      12792

Re: [R] Hacked

2018-04-17 Thread Martin Maechler
>>>>> Peter Langfelder 
>>>>> on Tue, 17 Apr 2018 11:50:19 -0700 writes:

> I got some spam emails after my last post to the list, and the emails
> did not seem to go through r-help. The spammers may be subscribed to
> the r-help, or they get the poster emails from some of the web copies
> of this list (nabble or similar).

> Peter

Thank you Peter (and Ulrich and ..),

From all the cases we (R-help-owner = your friendly volunteer
list moderators) checked,

1) the address from which the spam was sent  was *NOT* among the
   13'000 subscriber addresses to R-help

2) in one case, I replied to the address stating that it
   seemed that *e-mail account* had been hacked.
   The result was nil (NULL), i.e., my conclusion was

   a) the address was a really registered/existing e-mail address that gave
  no "mailer daemon" error 

   b) it probably did go to the spammers: a legitimate user
  would have replied to me.

Martin Maechler
ETH Zurich (= provider of all the r-*@r-project.org mailman mailing lists)


> On Tue, Apr 17, 2018 at 11:37 AM, Ulrik Stervbo  
wrote:
>> I asked the moderators about it. This is the reply
>> 
>> "Other moderators have looked into this a bit and may be able to shed 
more
>> light on it. This is a "new" tactic where the spammers appear to reply to
>> the r-help post. They are not, however, going through the r-help server.
>> 
>> It also seems that this does not happen to everyone.
>> 
>> I am not sure how you can automatically block the spammers.
>> 
>> Sorry I cannot be of more help."
>> 
>> --Ulrik
>> 
>> Jeff Newmiller  schrieb am Di., 17. Apr. 2018,
>> 14:59:
>> 
>>> Likely a spammer has joined the mailing list and is auto-replying to
>>> posts made to the list. Unlikely that the list itself has been "hacked". Agree
>>> that it is obnoxious.
>>> 
>>> On April 17, 2018 5:01:10 AM PDT, Neotropical bat risk assessments <
>>> neotropical.b...@gmail.com> wrote:
>>> >Hi all,
>>> >
>>> >Site has been hacked?
>>> >Bad SPAM arriving
>>> >
>>> 
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>> 
>>> 
>> 
>> 




Re: [R] Question "breakdown point" --> robust statistics

2018-04-20 Thread Martin Maechler
>>>>> Olivier Crouzet 
>>>>> on Thu, 19 Apr 2018 18:13:30 +0200 writes:

> Hi,
> I think he's talking about how much a statistical estimator is influenced
> by extreme datapoints, e.g.

> https://en.m.wikipedia.org/wiki/Robust_statistics#Breakdown_point

> Olivier

Definitely!  Thank you, Olivier.

I had not replied because the question looked more like homework
than something for which R-help is appropriate.

There is the 'robustbase' package and more than a dozen other
high-quality R packages about robust statistics -- in which
"breakdown point" is one crucial concept.

BTW: What econometricians tend to call "robust" is the idea
of using "sandwich" estimates (--> 'sandwich' package in R), aka
Newey-West aka HAC (Heteroskedasticity and Autocorrelation Consistent) estimates.

Robust statistics in a proper sense considers "sensitivity" of
both estimates and their precision estimates in situations where
the distributional assumption (e.g. Gaussian errors) is only
approximately true, say 95% of the observations/errors/.. are Gaussian but the
remaining 5% can be anything (-> arbitrary extreme outliers).
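
A small sketch of the breakdown idea (made-up data):

  set.seed(1)
  x <- c(rnorm(95), rep(1000, 5))  # 5% gross outliers
  mean(x)    # ruined by the outliers  (breakdown point 0%)
  median(x)  # barely affected         (breakdown point 50%)
  ## robustbase::huberM(x)$mu : a robust M-estimate behaves like the median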

There's the CRAN task view on robust statistical methods:
https://cran.r-project.org/web/views/Robust.html

Martin Maechler
ETH Zurich

> -- 
> Olivier Crouzet 
> Assistant Professor
> @LLING UMR6310 - Université de Nantes / CNRS
> Guest Scientist
> @UMCG - University Medical Center Groningen / RijksUniversiteit Groningen

>> On 19 Apr 2018, at 11:00, Keith Jewell  wrote:
>> 
>>> On 15/04/2018 17:26, Marc Girondot via R-help wrote:
>>>> On 15/04/2018 at 17:56, alireza daneshvar wrote:
>>>> break-down point
>>> Can you explain more what you plan to do and give an example of what
>>> you have tried to do until now to do a "break down point" in R. Perhaps a
>>> "break down point" is common in your field, but I have no idea about what it is!
>>> https://en.wikipedia.org/wiki/Breakdown_(1997_film)
>>> Sincerely
>>> Marc
>> Perhaps the OP means "breakpoint" in which case the 'strucchange'
>> package might be relevant
>> <https://cran.r-project.org/web/packages/strucchange/index.html>
>> 





Re: [R] Error : 'start' contains NA values when fitting frank copula

2018-04-21 Thread Martin Maechler
>>>>> Soumen Banerjee 
>>>>> on Sat, 21 Apr 2018 17:22:56 +0800 writes:

> Hello!  I am trying to fit a copula to some data in R and
> I get the error mentioned above. This is the code for a
> reproducible example -

(not really reproducible: You did not set the random seed, so
 the data is different every time;
 OTOH, yes, the following always gives an error)

> library(copula)

Slightly clearer showing what you are doing:

x <- runif(200)
data <- cbind(x, 2*x, 3*x)

> fr_cop = frankCopula(dim=3)
> fit_fr_cop = fitCopula(fr_cop, pobs(data), method = "mpl") #Error Here

> The error says : Error in fitCopula.ml(copula, u = data, method = method,
> start = start, : 'start' contains NA values

> What could be going wrong?

Is this a homework question? [probably not, but ..]

The homework question & answer would be

Q:  What is the best fitting 3D copula if you have perfectly
correlated ("rank 1") data?

A:  The complete-dependence copula ... which is indeed a boundary case,
hard to imagine you'd encounter for non-artificial data.
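
A sketch showing that breaking the perfect dependence lets the fit go
through (the noise level here is made up):

  library(copula)
  set.seed(17)
  x <- runif(200)
  d <- cbind(x, 2*x + rnorm(200, sd = 0.1), 3*x + rnorm(200, sd = 0.1))
  fit <- fitCopula(frankCopula(dim = 3), pobs(d), method = "mpl")
  coef(fit)  # a large, but finite, alpha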

Another hint:

> fitCopula(normalCopula(,3), pobs(data))
Call: fitCopula(copula, data = data)
Fit based on "maximum pseudo-likelihood" and 200 3-dimensional observations.
Copula: normalCopula 
rho.1 
1 
The maximized loglikelihood is 3686 
Optimization converged
>
---

Yes, one could adapt the fitCopula() code to work better for
this extreme boundary case,
and for the Frank copula,  it would return

> fitCopula(frankCopula(,3), pobs(data))
Call: fitCopula(copula, data = data)
Fit based on "maximum pseudo-likelihood" and 200 3-dimensional observations.
Copula: frankCopula 
   alpha 
7.21e+16 
The maximized loglikelihood is 1.798e+308 
Optimization converged


but it would need even more tweaking to also give such
  "alpha ~= +Inf"
results for other cases such as Gumbel or Joe.

I may add the possibility for frank*() as shown above for the
next release of the copula package...
but am a bit hesitant to complicate (and slow down) the current
code by adding an extra check for this situation.

Martin Maechler
ETH Zurich

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How would I color points conditional on their value in a plot of a time series

2018-05-02 Thread Martin Maechler
>>>>> Eivind K Dovik 
>>>>> on Tue, 1 May 2018 23:18:51 +0200 writes:

> You may also want to check this out:

> plot(ttt, type = "p")
> points(ttt, col = ifelse(ttt < 8, "black", "red"))

> Eivind K. Dovik
> Bergen, NO

yes, indeed,  or -- even nicer for a time series:
using  'type = "c"'  which many people don't know / have forgotten about:

ttt <- ts(rpois(12, lambda = 8), start = c(2000, 1), freq = 4)
plot  (ttt, type = "c")
points(ttt, col = ifelse(ttt < 8, "black", "red"))

Martin Maechler



> On Tue, 1 May 2018, Christopher W Ryan wrote:

>> Excellent! Worked like a charm. Thanks.
>> 
>> --Chris Ryan
>> 
>> On Tue, May 1, 2018 at 4:33 PM, William Dunlap  wrote:
>> 
>>> The ts method for plot() is quirky.  You can use the default method:
>>> 
>>> plot(as.vector(time(ttt)), as.vector(ttt), type = "p", col=ifelse(ttt<8,
>>> "black", "red"))
>>> 
>>> 
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>> 
>>> On Tue, May 1, 2018 at 1:17 PM, Christopher W Ryan 

>>> wrote:
>>> 
>>>> How would I color points conditional on their value in a plot of a time
>>>> series.  Something like this:
>>>> 
>>>> ## demonstration data
>>>> ttt <- ts(rpois(12, lambda = 8), start = c(2000, 1), freq = 4)
>>>> ttt
>>>> plot(ttt, type = "p")
>>>> 
>>>> ## doesn't work--all points the same color
>>>> plot(ttt, type = "p", col = ifelse(ttt < 8, "black", "red"))
>>>> 
>>>> ## also doesn't work--all points the same color
>>>> q <- as.numeric(ttt)
>>>> q
>>>> plot(ttt, type = "p", col = ifelse(q < 8, "black", "red"))
>>>> 
>>>> 
>>>> ## works OK with a simple, non-time-series scatterplot, as in
>>>> 
>>>> sss <- data.frame(x = rpois(12, lambda = 8),
>>>>                   y = rnorm(12, mean = 100, sd = 25))
>>>> with(sss, plot(y ~ x, col = ifelse(y > 100, "black", "red")))
>>>> 
>>>> ## but I am missing something about time series.
>>>> 
>>>> Thanks.
>>>> 
>>>> --Chris Ryan
>>>> Broome County Health Department
>>>> and Binghamton University
>>>> Binghamton, NY
>>>> 
>>>> [[alternative HTML version deleted]]
>>>> 
>>>> __
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posti
>>>> ng-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>> 
>>> 
>>> 
>> 
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting a list to a data frame

2018-05-03 Thread Martin Maechler
>>>>> David L Carlson 
>>>>> on Wed, 2 May 2018 21:43:52 + writes:

> Typo: dat[[z]] should be x[[z]]:
> 
> x2 <- do.call(rbind, lapply(names(x), function(z) 
> data.frame(type=z, x[[z]])))
> x2
>   type x y
> 1    A 1 3
> 2    A 2 4
> 3    B 5 7
> 4    B 6 8
> 
> David C

Before this thread gets carried away to data.table and
tibbleverse (and as nobody else has done so) :

Let me nominate this beautiful solution by
Bill Dunlap and David Carlson to win the prize, with a 10 out of 10 grade:

Beautiful use of  do.call() and lapply(), two of the most
versatile and important functions from the base R toolbox.

Congratulations!
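
For completeness, the winning solution as one self-contained snippet
(just restating the code from this thread):

  x <- list(A = data.frame(x = 1:2, y = 3:4),
            B = data.frame(x = 5:6, y = 7:8))
  x2 <- do.call(rbind, lapply(names(x), function(z)
                data.frame(type = z, x[[z]])))
  x2   # 4 rows; 'type' is "A","A","B","B"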

Martin Maechler
R Core Team


> -Original Message-
> From: R-help  On Behalf Of David L Carlson
> Sent: Wednesday, May 2, 2018 3:51 PM
> To: William Dunlap ; Kevin E. Thorpe 
> 
> Cc: r-help mailing list 
> Subject: Re: [R] Converting a list to a data frame
> 
> Or add the type column first and then rbind:
> 
> x <- list(A=data.frame(x=1:2, y=3:4),B=data.frame(x=5:6,y=7:8))
> x2 <- do.call(rbind, lapply(names(x), function(z) 
>   data.frame(type=z, dat[[z]])))
> 
> 
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
> 
> -Original Message-
> From: R-help  On Behalf Of William Dunlap via 
> R-help
> Sent: Wednesday, May 2, 2018 12:28 PM
> To: Kevin E. Thorpe 
> Cc: R Help Mailing List 
> Subject: Re: [R] Converting a list to a data frame
> 
> > x1 <- do.call(rbind, c(x, list(make.row.names=FALSE)))
> > x2 <- cbind(type=rep(names(x), vapply(x, nrow, 0)), x1)
> > str(x2)
> 'data.frame':   4 obs. of  3 variables:
>  $ type: Factor w/ 2 levels "A","B": 1 1 2 2
>  $ x   : int  1 2 5 6
>  $ y   : int  3 4 7 8
> 
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> On Wed, May 2, 2018 at 10:11 AM, Kevin E. Thorpe 
> wrote:
> 
> > I suspect this is pretty easy, but I'm having trouble figuring it out.
> > Basically, I have a list of data frames such as the following example:
> >
> > list(A=data.frame(x=1:2, y=3:4),B=data.frame(x=5:6,y=7:8))
> >
> > I would like to turn this into  data frame where the list elements are 
> > essentially rbind'ed together and the element name becomes a new variable.
> > For example, I would like to turn the list above into a data frame 
> > that looks like this:
> >
> > data.frame(type=c("A","A","B","B"),x=c(1:2,5:6),y=c(3:4,7:8))
> >
> > Appreciate any pointers.
> >
> > Kevin
> >
> > --
> > Kevin E. Thorpe
> > Head of Biostatistics,  Applied Health Research Centre (AHRC) Li Ka 
> > Shing Knowledge Institute of St. Michael's Hospital Assistant 
> > Professor, Dalla Lana School of Public Health University of Toronto
> > email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
> >
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparing figures?

2018-05-07 Thread Martin Maechler
>>>>> Suzen, Mehmet 
>>>>> on Mon, 7 May 2018 15:48:57 + writes:

> I suggest perceptual diff. You could write a wrapper
> around it.  http://pdiff.sourceforge.net

In (base) R's own tests ('make check-devel'), we basically use

   pdf(*, compress=FALSE)

actually also setting 'encoding=.' and 'paper=.', see the R sources at
   https://svn.r-project.org/R/trunk/tests/reg-plot.R

and the check (source at
https://svn.r-project.org/R/trunk/tests/Makefile.common)

basically is using

    R CMD Rdiff  <name>.pdf  <name>.pdf.save

(with  <name> := reg-plot  in this case)
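
The same comparison can also be scripted from inside R -- a sketch using
tools::Rdiff(), the workhorse behind 'R CMD Rdiff' (the file names here
are just for illustration):

  pdf("reg-plot.pdf", compress = FALSE)  # uncompressed PDF => diff-able text
  plot(1:10)
  dev.off()
  ## prints the differences and returns 0 when the files match:
  tools::Rdiff("reg-plot.pdf", "reg-plot.pdf.save")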

Martin Maechler
R Core Team


> On Mon, 7 May 2018 16:49 Ramiro Barrantes,
>  wrote:

>> Hello,
>> 
>> I am working on tests to compare figures.  I have been
>> using ImageMagick, which creates a figure signature, and
>> I can compare a "test" figure signature against a saved
>> "reference" figure signature.  It seems to work pretty
>> well.  However, it is slow as it requires reading from
>> the file system.
>> 
>> Are there any options to compare figures on memory?  For
>> example, if I generate a ggplot or lattice graph, I could
>> have all my saved "reference" figures on memory (which I
>> would have loaded all at once) and compare them.  I just
>> haven't found anything.
>> 
>> I just found out about the vdiffr package and was going
>> to explore it, not sure about the speed.
>> 
>> Any suggestions appreciated.
>> 
>> Thank you,
>> Ramiro
>> 
>> Ramiro Barrantes Ph.D.  Precision Bioassay, Inc.  431
>> Pine St., Suite 110 Burlington, VT 05401 802 865 0155 802
>> 861 2365 FAX www.precisionbioassay.com
>> ram...@precisionbioassay.com
>> 
>> CONFIDENTIALITY NOTICE: This email, including any
>> attach...{{dropped:9}}
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and
>> more, see https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html and provide
>> commented, minimal, self-contained, reproducible code.
>> 

>   [[alternative HTML version deleted]]

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and
> more, see https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide
> commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] par(mfrow=c(3,4)) problem

2018-05-30 Thread Martin Maechler
> Sarah Goslee 
> on Wed, 30 May 2018 05:03:56 -0400 writes:

> Hi,
> You're mixing base plot and ggplot2 grid graphics, which as you've
> discovered doesn't work.

> Here's a strategy that does:
> https://cran.r-project.org/web/packages/egg/vignettes/Ecosystem.html

> This vignette has a good overview, as well as info specific to that package.

> Sarah

A very nice vignette indeed!  Didn't know about these.
Thank you, Sarah, and thanks to the author, Baptiste Auguie !
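
For the concrete problem -- 12 ggplots in a 3 x 4 grid -- a minimal
sketch in the spirit of that vignette; it assumes the 'gridExtra'
package is installed (egg::ggarrange() works analogously):

  library(ggplot2)
  library(gridExtra)
  plots <- lapply(1:12, function(i)
      ggplot(data.frame(x = rnorm(10), y = rnorm(10)), aes(x, y)) +
          geom_point() + ggtitle(paste("plot", i)))
  grid.arrange(grobs = plots, nrow = 3, ncol = 4)  # grid analogue of par(mfrow=c(3,4))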

Martin 

> On Wed, May 30, 2018 at 4:43 AM greg holly  wrote:

>> Hi all;
>> 
>> I need to put 12 different ggplot2 plots into the same matrix, so my array
>> for the matrix will be par(mfrow=c(3,4)).  I am running ggplot2 to produce my 12
>> plots. For some reason, par(mfrow=c(3,4)) did not turn out a 3*4 matrix.
>> 
>> my basic R codes for each plot is
>> par(mfrow=c(3,4))
>> library(ggplot2)
>> p <- ggplot(a, aes(x=Genotypes, y=Plant_hight, size=Plant_hight,
>> color=Showing_rate)) +
>> .
>> .
>> 
>> Best regards,
>> 
>> Greg
>> 
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> -- 
> Sarah Goslee
> http://www.stringpage.com
> http://www.sarahgoslee.com
> http://www.functionaldiversity.org

> [[alternative HTML version deleted]]

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mysterious rounding digits output

2018-05-30 Thread Martin Maechler
> Ted Harding 
> on Thu, 31 May 2018 07:10:32 +0100 writes:

> Well pointed out, Jim!

> It is unfortunate that the documentation for options(digits=...)
> does not mention that these are *significant digits* 
> and not *decimal places* (which is what Joshua seems to want):

Since R 3.4.0  the help on  ?options  *does* say significant!

The change in R's source code was this one,  14 months ago :

r72380 | maechler | 2017-03-21 11:28:13 +0100 (Tue, 21 Mar 2017) | 2 lines
Changed paths:
   M /trunk/src/library/base/man/options.Rd

digits: + "signficant"


and since then, the text has been

 ‘digits’: controls the number of significant digits to print when
  printing numeric values.  It is a suggestion only.
  ...

whereas what you (Ted) cite is from R 3.3.x and earlier

 > "‘digits’: controls the number of digits to print when
 > printing numeric values."

Maybe we should additionally say that this is *not* round()ing,
and give a link to the help for signif() ?
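
A tiny illustration of the difference (plain base R):

  signif(1234.5678, 3)  # 1230     -- 3 *significant* digits
  round (1234.5678, 3)  # 1234.568 -- 3 *decimal places*
  signif(0.5165911, 3)  # 0.517
  round (0.5165911, 3)  # 0.517    -- here the two coincide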


> On the face of it, printing the value "0.517" of 'ccc' looks
> like printing 4 digits! Joshua's output could look even worse if 'ddd'
> had values in the 1000s!

> To achieve exactly what Joshua seems to want, use the round()
> function. Starting with his original assignment of values to
> the variable itemInfo, the result of round(itemInfo,digits=3) is:

>           aaa   bbb   ccc   ddd    eee
> skill     1.396 6.225 0.517 5.775  2.497
> predict   1.326 5.230 0.462 5.116 -2.673
> waiting   1.117 4.948    NA    NA     NA
> complex   1.237 4.170 0.220 4.713  5.642
> novelty   1.054 4.005 0.442 4.260  2.076
> creative  1.031 3.561 0.362 3.689  2.549
> evaluated 0.963 3.013    NA    NA     NA
> body      0.748 2.238 0.596 2.019  0.466
> control   0.620 2.149    NA    NA     NA
> stakes    0.541 1.905 0.227 2.020  2.343
> spont     0.496 1.620    NA    NA     NA
> chatter   0.460 1.563 0.361 1.382  0.540
> present   0.428 1.236 0.215 1.804  2.194
> reward    0.402 1.101 0.288 1.208  0.890
> feedback  0.283 0.662    NA    NA     NA
> goal      0.237 0.474    NA    NA     NA

> Best wishes to all,
> Ted.


> On Thu, 2018-05-31 at 15:30 +1000, Jim Lemon wrote:
>> Hi Joshua,
>> Because there are no values in column ddd less than 1.
>> 
>> itemInfo[3,"ddd"]<-0.3645372
>> itemInfo
>>           aaa   bbb   ccc   ddd    eee
>> skill     1.396 6.225 0.517 5.775  2.497
>> predict   1.326 5.230 0.462 5.116 -2.673
>> waiting   1.117 4.948    NA 0.365     NA
>> complex   1.237 4.170 0.220 4.713  5.642
>> novelty   1.054 4.005 0.442 4.260  2.076
>> creative  1.031 3.561 0.362 3.689  2.549
>> evaluated 0.963 3.013    NA    NA     NA
>> body      0.748 2.238 0.596 2.019  0.466
>> control   0.620 2.149    NA    NA     NA
>> stakes    0.541 1.905 0.227 2.020  2.343
>> spont     0.496 1.620    NA    NA     NA
>> chatter   0.460 1.563 0.361 1.382  0.540
>> present   0.428 1.236 0.215 1.804  2.194
>> reward    0.402 1.101 0.288 1.208  0.890
>> feedback  0.283 0.662    NA    NA     NA
>> goal      0.237 0.474    NA    NA     NA
>> 
>> digits specifies the number of significant digits ("It is a suggestion
>> only"), so when at least one number is padded with a leading zero it
>> affects the formatting of the other numbers in that column. I suspect
>> that this is an esthetic consideration to line up the decimal points.
>> 
>> Jim
>> 
>> 
>> On Thu, May 31, 2018 at 11:18 AM, Joshua N Pritikin 
 wrote:
>> > R version 3.5.0 (2018-04-23) -- "Joy in Playing"
>> > Platform: x86_64-pc-linux-gnu (64-bit)
>> >
>> > options(digits=3)
>> >
>> > itemInfo <- structure(list("aaa" = c(1.39633732316667, 
1.32598263816667,  1.1165832407, 1.23651072616667, 1.0536867998, 
1.0310073738,  0.9630728395, 0.7483865045, 0.62008664617, 0.5411017985, 
 0.49639760783, 0.45952804467, 0.42787704783, 0.40208597967,  
0.28316118123, 0.23689627723), "bbb" = c(6.22533860696667,  
5.229736804, 4.94816041772833, 4.17020503255333, 4.00453781427167,  
3.56058007398333, 3.0125202404, 2.2378235873, 2.14863910661167,  
1.90460903044777, 1.62001089796667, 1.56341257968151, 1.23618558850667,  
1.10086688908262, 0.661981500639833, 0.47397754310745), "ccc" = c(0.5165911355, 
 0.46220470767, NA, 0.21963592433, 0.44186378083,  
0.36150286583, NA, 0.59613794667, NA, 0.22698477157,  NA, 
0.36092266158, 0.2145347068, 0.28775624948, NA, NA ), "ddd" = 
c(5.77538400186667,  5.115877113, NA, 4.71294520316667, 4.25952652129833, 
3.68879921863167,  NA, 2.01942456211145, NA, 2.02032557108, NA, 1.381810875
>> 9571,  1.8043675977

Re: [R] verInd= and HorInd= arguments to pairs() function

2018-06-07 Thread Martin Maechler
(x[, j], x[, i], xlab = "", ylab = "", axes = FALSE,
>  type = "n", ..., log = l)
>  if (i == j || (i < j && has.lower) || (i > j && has.upper)) {
>  box()
>  if (i == 1 && (!(j%%2L) || !has.upper || !has.lower))
>  localAxis(1L + 2L * row1attop, x[, j], x[, i],
>...)
>  if (i == nc && (j%%2L || !has.upper || !has.lower))
>  localAxis(3L - 2L * row1attop, x[, j], x[, i],
>...)
>  if (j == 1 && (!(i%%2L) || !has.upper || !has.lower))
>  localAxis(2L, x[, j], x[, i], ...)
>  if (j == nc && (i%%2L || !has.upper || !has.lower))
>  localAxis(4L, x[, j], x[, i], ...)
>  mfg <- par("mfg")
>  if (i == j) {
>  if (has.diag)
>localDiagPanel(as.vector(x[, i]), ...)
>  if (doText) {
>par(usr = c(0, 1, 0, 1))
>if (is.null(cex.labels)) {
>  l.wid <- strwidth(labels, "user")
>  cex.labels <- max(0.8, min(2, 0.9/max(l.wid)))
>}
>xlp <- if (xl[i])
>  10^0.5
>else 0.5
>ylp <- if (yl[j])
>  10^label.pos
>else label.pos
>text.panel(xlp, ylp, labels[i], cex = cex.labels,
>  font = font.labels)
>  }
>  }
>  else if (i < j)
>  localLowerPanel(as.vector(x[, j]), as.vector(x[,
>        i]), ...)
>  else localUpperPanel(as.vector(x[, j]), as.vector(x[,
>  i]), ...)
>  if (any(par("mfg") != mfg))
>  stop("the 'panel' function made a new plot")
>  }
>  else par(new = FALSE)
>  }
>  if (!is.null(main)) {
>  font.main <- if ("font.main" %in% nmdots)
>  dots$font.main
>  else par("font.main")
>  cex.main <- if ("cex.main" %in% nmdots)
>  dots$cex.main
>  else par("cex.main")
>  mtext(main, 3, line.main, outer = TRUE, at = 0.5, cex = cex.main,
>  font = font.main)
>  }
>  invisible(NULL)
> }
> 
> 
> 
> ## Example:
> 
> mypairs(xmat, xlim=lim, ylim=lim, verInd=1:2, horInd=1:4)

Thank you, Chris, for the report and
Gerrit for your proposed fix !!

It looks good to me,  but I will test some more (also with
'row1attop=FALSE')  before committing the bug fix.

Best regards,

Martin Maechler
ETH Zurich and R Core Team

> On 06.06.2018 at 23:55, Andrews, Chris wrote:
> > 
> > After making scatterplot matrix, I determined I only needed the first 2 
> > columns of the matrix so I added verInd=1:2 to my pairs() call.  However, 
> > it did not turn out as I expected.
> > 
> > Perhaps the attached pdf of the example code will make it through.  If not, 
> > my description is "the wrong scatterplot pairs are in the wrong places" for 
> > the last two pairs() calls.
> > 
> > Thanks,
> > Chris
> > 
> > 
> > 
> > # fake data
> > xmat <- matrix(1:28, ncol=4)
> > lim <- range(xmat)
> > 
> > # what I expected
> > pairs(xmat, xlim=lim, ylim=lim) # 4x4 matrix of scatterplots
> > pairs(xmat, xlim=lim, ylim=lim, verInd=1:2, horInd=1:2) # 2x2 matrix of 
> > scatterplots: upper left
> > 
> > # here comes trouble
> > pairs(xmat, xlim=lim, ylim=lim, horInd=1:2) # 2x4 matrix of scatterplots: 
> > but not the top 2 rows (or bottom 2 rows)
> > pairs(xmat, xlim=lim, ylim=lim, verInd=1:2) # 4x2 matrix of scatterplots: 
> > but not the left 2 columns (or right 2 columns)
> > 
> > 
> > ###
> > 
> >> sessionInfo()
> > R version 3.5.0 (2018-04-23)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > Running under: Windows 7 x64 (build 7601) Service Pack 1
> > 
> > Matrix products: default
> > 
> > locale:
> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
> > States.1252LC_MONETARY=English_United States.1252
> > [4] LC_NUMERIC=C   LC_TIME=English_United 
> > States.1252
> > 
> > attached base packages:
> > [1] stats graphics  grDevices utils datasets  methods   base
> > 
> > loaded via a namespace (and not attached):
> > [1] compiler_3.5.0 tools_3.5.0
> > **

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] verInd= and HorInd= arguments to pairs() function

2018-06-08 Thread Martin Maechler
>>>>> Martin Maechler 
>>>>> on Thu, 7 Jun 2018 18:35:48 +0200 writes:

>>>>> Gerrit Eichner 
>>>>> on Thu, 7 Jun 2018 09:03:46 +0200 writes:

>> Hi, Chris, had the same problem (and first thought it was
>> my fault), but there seems to be a typo in the code of
>> pairs.default. Below is a workaround.  Look for two
>> comments (starting with #) in the code to see what I
>> have changed to make it work at least the way I'd expect
>> it in one of your examples.

>> Hth -- Gerrit


> > mypairs <- function (x, labels, panel = points, ...,
> >  horInd = 1:nc, verInd = 1:nc,
> >  lower.panel = panel, upper.panel = panel, diag.panel = NULL,
> >  text.panel = textPanel, label.pos = 0.5 + has.diag/3, line.main = 3,
> >  cex.labels = NULL, font.labels = 1, row1attop = TRUE, gap = 1,
> >  log = "") {
> >  if (doText <- missing(text.panel) || is.function(text.panel))
> >  textPanel <- function(x = 0.5, y = 0.5, txt, cex, font) text(x,
> >  y, txt, cex = cex, font = font)
> >  localAxis <- function(side, x, y, xpd, bg, col = NULL, main,
> >  oma, ...) {
> >  xpd <- NA
> >  if (side%%2L == 1L && xl[j])
> >  xpd <- FALSE
> >  if (side%%2L == 0L && yl[i])
> >  xpd <- FALSE
> >  if (side%%2L == 1L)
> >  Axis(x, side = side, xpd = xpd, ...)
> >  else Axis(y, side = side, xpd = xpd, ...)
> >  }
> >  localPlot <- function(..., main, oma, font.main, cex.main) plot(...)
> >  localLowerPanel <- function(..., main, oma, font.main, cex.main) 
> > lower.panel(...)
> >  localUpperPanel <- function(..., main, oma, font.main, cex.main) 
> > upper.panel(...)
> >  localDiagPanel <- function(..., main, oma, font.main, cex.main) 
> > diag.panel(...)
> >  dots <- list(...)
> >  nmdots <- names(dots)
> >  if (!is.matrix(x)) {
> >  x <- as.data.frame(x)
> >  for (i in seq_along(names(x))) {
> >  if (is.factor(x[[i]]) || is.logical(x[[i]]))
> >  x[[i]] <- as.numeric(x[[i]])
> >  if (!is.numeric(unclass(x[[i]])))
> >  stop("non-numeric argument to 'pairs'")
> >  }
> >  }
> >  else if (!is.numeric(x))
> >  stop("non-numeric argument to 'pairs'")
> >  panel <- match.fun(panel)
> >  if ((has.lower <- !is.null(lower.panel)) && !missing(lower.panel))
> >  lower.panel <- match.fun(lower.panel)
> >  if ((has.upper <- !is.null(upper.panel)) && !missing(upper.panel))
> >  upper.panel <- match.fun(upper.panel)
> >  if ((has.diag <- !is.null(diag.panel)) && !missing(diag.panel))
> >  diag.panel <- match.fun(diag.panel)
> >  if (row1attop) {
> >  tmp <- lower.panel
> >  lower.panel <- upper.panel
> >  upper.panel <- tmp
> >  tmp <- has.lower
> >  has.lower <- has.upper
> >  has.upper <- tmp
> >  }
> >  nc <- ncol(x)
> >  if (nc < 2L)
> >  stop("only one column in the argument to 'pairs'")
> >  if (!all(horInd >= 1L && horInd <= nc))
> >  stop("invalid argument 'horInd'")
> >  if (!all(verInd >= 1L && verInd <= nc))
> >  stop("invalid argument 'verInd'")
> >  if (doText) {
> >  if (missing(labels)) {
> >  labels <- colnames(x)
> >  if (is.null(labels))
> >  labels <- paste("var", 1L:nc)
> >  }
> >  else if (is.null(labels))
> >  doText <- FALSE
> >  }
> >  oma <- if ("oma" %in% nmdots)
> >  dots$oma
> >  main <- if ("main" %in% nmdots)
> >  dots$main
> >  if (is.null(oma))
> >  oma <- c(4, 4, if (!is.null(main)) 6 else 4, 4)
> >  opar <- par(mfcol = c(length(horInd), length(verInd)),
> > # Changed from mfrow to mfcol
> >  mar = rep.int(gap/2, 4), oma = oma)
> >  on.exit(par(opar))
> >  dev.hold()
> >  on.exit(dev.flush(), add = TRUE)
> >  xl <- 

Re: [R] verInd= and HorInd= arguments to pairs() function

2018-06-08 Thread Martin Maechler
>>>>> Martin Maechler 
>>>>> on Fri, 8 Jun 2018 11:13:24 +0200 writes:

[..]

>> Thank you, Chris, for the report and
>> Gerrit for your proposed fix !!
>> 
>> It looks good to me,  but I will test some more (also with
>> 'row1attop=FALSE')  before committing the bug fix.

> and there, another change was needed:  Instead of your

> for (j in if (row1attop) verInd else rev(verInd))
>for (i in horInd) {

> we do now need

> for(j in verInd)
>for(i in if(row1attop) horInd else rev(horInd)) {

> and the difference is of course only relevant for the
> non-default  'row1attop = FALSE'

> (which some graphic experts argue to be clearly *better* than the default,
> as only in that case,  the upper and lower triangles of the
> matrix are nicely "mirrors of each other", and that is also
> the reason why  lattice::splom()  uses the equivalent of
> 'row1attop=FALSE')

> I will commit the change to R-devel today - and intend to port
> to R-patched in time to make it into the upcoming R 3.5.1.

Well, as I find, there are more bugs there, if you are using
'horInd' and 'verInd' creatively:

In a nice pairs(), the axis ticks (and their labels (numbers))
are always "on the outside" of the scatterplot matrix, and
nicely alternating.  This is not the case unfortunately, when using
 horInd or verInd which are *not* of the form p1:p2 (p1 <= p2)

==> even more changes are needed to make these cases "nice",
or  should we *require* horInd and verInd to be of that form??

This would not be back-compatible, but then such cases have been
"wrong" really in all versions of R anyway, *and*, at least by
reordering the matrix/data.frame columns, the restriction of

(hor|ver)Ind =  p1:p2 (p1 <= p2)

would seem acceptable, would it ?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] verInd= and HorInd= arguments to pairs() function

2018-06-08 Thread Martin Maechler
>>>>> Gerrit Eichner 
>>>>> on Fri, 8 Jun 2018 12:55:31 +0200 writes:

> On 08.06.2018 at 12:02, Martin Maechler wrote:
>>>>>>> Martin Maechler
>>>>>>> on Fri, 8 Jun 2018 11:13:24 +0200 writes:
>> 
>> [..]
>> 
>> >> Thank you, Chris, for the report and
>> >> Gerrit for your proposed fix !!
>> >>
>> >> It looks good to me,  but I will test some more (also with
>> >> 'row1attop=FALSE')  before committing the bug fix.
>> 
>> > and there, another change was needed:  Instead of your
>> 
>> > for (j in if (row1attop) verInd else rev(verInd))
>> >for (i in horInd) {
>> 
>> > we do now need
>> 
>> > for(j in verInd)
>> >for(i in if(row1attop) horInd else rev(horInd)) {
>> 
>> > and the difference is of course only relevant for the
>> > non-default  'row1attop = FALSE'
>> 
>> > (which some graphic experts argue to be clearly *better* than the 
default,
>> > as only in that case,  the upper and lower triangles of the
>> > matrix are nicely "mirrors of each other", and that is also
>> > the reason why  lattice::splom()  uses the equivalent of
>> > 'row1attop=FALSE')
>> 
>> > I will commit the change to R-devel today - and intend to port
>> > to R-patched in time to make it into the upcoming R 3.5.1.
>> 
>> Well, as I find, there are more bugs there, if you are using
>> 'horInd' and 'verInd' creatively:
>> 
>> In a nice pairs(), the axis ticks (and their labels (numbers))
>> are always "on the outside" of the scatterplot matrix, and
>> nicely alternating.  This is not the case unfortunately, when using
>> horInd or verInd which are *not* of the form p1:p2 (p1 <= p2)
>> 
>> ==> even more changes are needed to make these cases "nice",

> Well, the *shown* axis ticks and labels do nicely alternate if
> (hor|ver)Ind = p1:p2 (p1 <= p2) is fulfilled, but not all of
> the axis ticks and labels, which one *might* wish to see, are
> currently drawn anyway ... I would consider changes which "heal"
> this as more interesting than solving this problem in full
> generality, i.e., in cases of creative use of (hor|ver)Ind.
> However, I don't think it's urgent, at all.


>> or  should we *require* horInd and verInd to be of that form??
>> 
>> This would not be back-compatible, but then such cases have been
>> "wrong" really in all versions of R anyway, *and*, at least by
>> reordering the matrix/data.frame columns, the restriction of
>> 
>> (hor|ver)Ind =  p1:p2 (p1 <= p2)
>> 
>> would seem acceptable, would it ?
>> 

> I could live very well with that requirement (while breaking
> back-compatibility), because from my point of view a "creative"
> use of 'horInd' and 'verInd' violating (hor|ver)Ind = p1:p2
> (p1 <= p2) won't occur often.

> On the other hand, why forcing (hor|ver)Ind = p1:p2 (p1 <= p2)?
> If people violated it "they get what they deserve". ;-)

> Btw, 'horInd' and 'verInd' sort of exchange their meaning if
> row1attop = FALSE, but I think this can be explained by the "work flow":
> First, (hor|ver)Ind are used to select the respective rows and
> columns from the full pairs-plot, and then, row1attop is used
> when the results are drawn. I think this is sensible.

Thank you; and yes indeed, and I was not going to change that.

In fact, I've found a relatively nice solution now, which does
*not* restrict the settings of '{hor,ver}Ind' and fixes all
problems mentioned,  quite back compatibly, and in some sense
perfectly labeling the axes.

==> Committed to R-devel svn c74871  a few minutes ago.

Martin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Do there need to be the same number of y-values for each x-value when using tapply?

2018-07-08 Thread Martin Maechler
The answer to your Q is surely a "no": that's exactly the point of tapply,
that the number of y values per group may vary. The message tells you that the
full x & y vectors must have the same length.
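
A minimal illustration (plain base R) -- the group sizes may differ
freely; only the full vectors must match in length:

  y <- c(1, 2, 3, 10, 20)           # 5 response values
  g <- c("a", "a", "a", "b", "b")   # 5 group labels: 3 in "a", 2 in "b"
  tapply(y, g, mean)                # works fine:  a = 2, b = 15
  ## tapply(y, g[-1], mean)         # error: arguments must have same length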

Hoping that helps.
Martin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Database

2018-07-26 Thread Martin Maechler
> Doran, Harold 
> on Wed, 25 Jul 2018 14:57:13 + writes:

> I'm doing some work now to learn which SQL database
> package is the most optimal for the task I am working on.

Hmm... we would have a problem with optimize() and optim() if
this was

   optimal << more optimal << most optimal

:-)  ;-)

Best,
Martin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot Rect Transparency

2018-07-27 Thread Martin Maechler
> Duncan Murdoch 
> on Thu, 28 Jun 2018 20:57:19 -0400 writes:

> On 28/06/2018 5:29 PM, Jeff Reichman wrote:
>> R-Help
>> 
>> 
>> 
>> Is there a way to make a rectangle transparent (alpha=0.1??)
>> 
>> 
>> 
>> plot(c(100, 200), c(300, 450), type= "n", xlab = "", ylab = "")
>> 
>> rect(110, 300, 175, 350, density = 5, border = "red")
>> 
>> 
>> 
>> Can't figure out how to take the changepoint function results and plot in
>> ggplot2 so I can just simply add rectangles to the plot function, but I 
need
>> to make transparent and there doesn't seem to be an alpha option.

> Alpha is part of the colour spec.  For example,

> rect(110, 300, 175, 350, density = 5, border = rgb("red")


> rect(110, 300, 175, 350, density = 5, border = rgb(red=1, green=0, 
> blue=0, alpha=0.1))

> I'm not sure what is the quickest way to work out the rgb values for a 
> named colour (col2rgb can do it, but not in a convenient format) if you 
> want to add alpha to it.

IIUC, it is adjustcolor() you were thinking of.  It had been
created to do that and more. 

I'm using that "all the time" nowadays in my graphics code,
e.g.,

> adjustcolor("red", 2/3)
[1] "#FFAA"

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Safeguarded Newton method for function minimization

2017-04-19 Thread Martin Maechler
> J C Nash 
> on Tue, 18 Apr 2017 13:32:52 -0400 writes:

> Recently Marie Boehnstedt reported a bug in the nlm()
> function for function minimization when both gradient and
> hessian are provided.
Indeed, on R's Bugzilla here :
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17249

> She has provided a work-around for some cases and it seems
> this will get incorporated into the R function eventually.

indeed the first part -- fixing the wrong choldc() -- in the C code has
been in my version of R-devel for a while now.

See my follow up
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17249#c4  and #c5

including my attachment which is an extended version of Marie's original :
https://bugs.r-project.org/bugzilla/attachment.cgi?id=2246

As that mentions _and_ shows: Fixing choldc() solves the problem
for the 2D Rosenbrock example, but _not_ the 4D Wood example.

> However, despite the great number of packages on CRAN,
> there does not appear to be a straightforward Newton
> approach to function minimization. This may be because
> providing the code for a hessian (the matrix of second
> derivatives) is a lot of work and error-prone.

> (R could also use some good tools for Automatic Differentiation).

The last part of what you say above is not at all true:
R -- and S (& S+ | S-PLUS) before it! -- always had deriv() and deriv3()
and my attachment above _shows_ you how to use  deriv3() to get
both the gradient and the hessian  via a version of automatic
differentiation completely effortlessly !!

For ease of readers, that part here, with an example:

##' Wood function (4 arguments 'x1' ... 'x4')
fwood <- function(x1,x2,x3,x4) {
  100*(x1^2-x2)^2 + (1-x1)^2 + 90*(x3^2-x4)^2 + (1-x3)^2 +
10.1*((1-x2)^2 + (1-x4)^2) + 19.8*(1-x2)*(1-x4)
}
## automatically construct correct gradient and hessian:
woodf.gh <- function(x) {
  stopifnot(is.numeric(x))
  woodGH <- deriv3(body(fwood)[[2]],
   c("x1","x2","x3","x4"), function.arg=TRUE)
  if(length(x) == 4)
woodGH(x[1],x[2],x[3],x[4])
  else if(is.matrix(x) && ncol(x) == 4)
woodGH(x[,1], x[,2], x[,3], x[,4])
  else stop("'x' must have length 4 or be a matrix with 4 columns")
}

and now get both the function f(x), gradient g(x) and Hessian H(x) 
for
 x = (0 0 0 0),
 x = (1 1 1 1), and
 x = (1 2 3 4)

with such a simple call:
 
  > woodf.gh(rbind(0, 1, 1:4))
  [1]   42.0    0.0 2514.4
  attr(,"gradient")
         x1    x2   x3     x4
  [1,]   -2 -40.0   -2  -40.0
  [2,]    0   0.0    0    0.0
  [3,] -400 279.6 5404 -819.6
  attr(,"hessian")
  , , x1

        x1   x2 x3 x4
  [1,]   2    0  0  0
  [2,] 802 -400  0  0
  [3,] 402 -400  0  0

  , , x2

         x1    x2 x3   x4
  [1,]    0 220.2  0 19.8
  [2,] -400 220.2  0 19.8
  [3,] -400 220.2  0 19.8

  , , x3

       x1 x2   x3    x4
  [1,]  0  0    2     0
  [2,]  0  0  722  -360
  [3,]  0  0 8282 -1080

  , , x4

       x1   x2    x3    x4
  [1,]  0 19.8     0 200.2
  [2,]  0 19.8  -360 200.2
  [3,]  0 19.8 -1080 200.2

  >

> I have also noted that a number of researchers try to
> implement textbook methods and run into trouble when maths
> and computing are not quite in sync. Therefore, I wrote a
> simple safeguarded Newton and put a small package on
> R-forge at

> https://r-forge.r-project.org/R/?group_id=395

> Note that Newton's method is used to solve nonlinear
> equations. In fact, for function minimization, we apply it
> to solve g(x) = 0 where g is the gradient and x is the
> vector of parameters. In part, safeguards ensure we reduce
> the function f(x) at each step to avoid some of the
> difficulties that may arise from a non-positive-definite
> hessian.

> In the package, I have a very simple quadratic test, the
> Rosenbrock test function and the Wood test function. The
> method fails on the last function -- the hessian is not
> positive definite where it stops.

> Before submitting this package to CRAN, I would like to
> see its behaviour on other test problems, but am lazy
> enough to wish to avoid creating the hessian code. If
> anyone has such code, it would be very welcome. Please
> contact me off-list. If I get some workable examples that
> are open for public view, I'll report back here.

> John Nash

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented

Re: [R] R-3.4.0 and survival_2.41-3 ..

2017-04-25 Thread Martin Maechler
>>>>> Göran Broström 
>>>>> on Tue, 25 Apr 2017 10:22:48 +0200 writes:

> I installed R-3.4.0 and got problems with the survival package, for 
instance
> 
>> library(survival)
>> mort <- data.frame(exit = 1:4, event = rep(1, 4), x = c(0, 1, 0, 1))
>> fit <- coxph(Surv(exit, event) ~ x, data = mort)
> Error in fitter(X, Y, strats, offset, init, control, weights = weights,  :
> object 'Ccoxmart' not found
> -

> No problems with R-3.3.3 and the same (latest) survival version, which 
> makes me think that something is going on in my R installation rather 
> than in the survival package.

> Thanks for any hint,

> Göran

>> sessionInfo()
> R version 3.4.0 (2017-04-21)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.2 LTS

> Matrix products: default
> BLAS: /usr/lib/openblas-base/libblas.so.3
> LAPACK: /usr/lib/libopenblasp-r0.2.18.so

> locale:
> [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
> [3] LC_TIME=sv_SE.UTF-8LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=sv_SE.UTF-8LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=sv_SE.UTF-8   LC_NAME=C
> [9] LC_ADDRESS=C   LC_TELEPHONE=C
> [11] LC_MEASUREMENT=sv_SE.UTF-8 LC_IDENTIFICATION=C

> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base

> other attached packages:
> [1] survival_2.41-3

> loaded via a namespace (and not attached):
> [1] compiler_3.4.0  Matrix_1.2-8splines_3.4.0   grid_3.4.0
> [5] lattice_0.20-35

I'm 99.5% sure that you are using more than just the default
library of packages (a very good thing - we have been doing the
same for years).

We have in NEWS for R 3.4.0

  > PACKAGE INSTALLATION:

  >   [...]

  >   [...]

  >   • Packages which register native routines for .C or .Fortran need
  > to be re-installed for this version (unless installed with
  > R-devel SVN revision r72375 or later).

and Prof Brian Ripley did announce that nicely and early on R-devel.
==> https://hypatia.math.ethz.ch/pipermail/r-devel/2017-March/073940.html

==> You have to re-install quite a few packages for R 3.4.0,
 __if__ they use .C() or .Fortran() 

When we've e-talked about the issue within R-core, 
Uwe Ligges noted we should really ask everyone to run

  update.packages(checkBuilt=TRUE)

after an update to a new major (meaning "R-x.y.0") release of R
and **not** re-use packages {inside R-x.y.z}
that were installed with R-x.(y-1).z'  ..

and of course Uwe is right:
We should ask others to do it _and_ do it ourselves.

Anyway it _is_ considerably more important for the 3.4.0
release.

Martin Maechler
ETH Zurich (and R Core team)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] The effect of tolerance in all.equal()

2017-04-25 Thread Martin Maechler
> Ashim Kapoor 
> on Tue, 25 Apr 2017 14:02:18 +0530 writes:

> Dear all,
> I am not able to understand the interplay of absolute vs relative and
> tolerance in the use of all.equal

> If I want to find out if absolute differences between 2 numbers/vectors 
are
> bigger than a given tolerance I would do:

> all.equal(1,1.1,scale=1,tol= .1)

> If I want to find out if relative differences between 2 numbers/vectors 
are
> bigger than a given tolerance I would do :

> all.equal(1,1.1,tol=.1)

> 
##

> I can also do :

> all.equal(1,3,tol=1)

> to find out if the absolute difference is bigger than 1.But here I won't 
be
> able to detect absolute differences smaller than 1 in this case,so I don't
> think that this is a good way.

> My query is: what is the reasoning behind all.equal returning the absolute
> difference if the tolerance >= target and relative difference if tolerance
> < target?
(above, it is   tol >= |target|   vs   tol < |target| ,  i.e., absolute value)


The following are desiderata / restrictions :

1) Relative tolerance is needed to keep things scale-invariant
   i.e.,  all.equal(x, y)  and  all.equal(1000 * x, 1000 * y)
   should typically be identical for (almost) all (x,y).

   ==> "the typical behavior should use relative error tolerance"

2) when x or y (and typically both!) are very close to zero it
   is typically undesirable to keep relative tolerances (in the
   boundary case, they _are_ zero exactly, and "relative error" is undefined).
   E.g., for most purposes, 3.45e-15 and 1.23e-17 should be counted as
   equal to zero and hence to themselves.

1) and 2) are typically reconciled by switching from relative to absolute
when the arguments are close to zero (*).

The exact cutoff at which to switch from relative to absolute
(or a combination of the two) is somewhat arbitrary(*2) and for
all.equal() has been made in the 1980's (or even slightly
earlier?) when all.equal() was introduced into the S language at
Bell labs AFAIK. Maybe John Chambers (or Rick Becker or ...,
but they may not read R-help) knows more.
*2) Then, the choice for all.equal() is in some way "least arbitrary", 
using c = 1 in the more general   tolerance >= c*|target|  framework.

*) There have been alternatives in "the (applied numerical
 analysis / algorithm) literature" seen in published algorithms,
 but I don't have any example ready.
 Notably some of these alternatives are _symmetric_ in (x,y)
 where all.equal() was designed to be asymmetric using names
 'target' and 'current'.

The alternative idea is along the following thoughts:

Assume that for "equality" we want _both_ relative and
absolute (e := tolerance) "equality"
 
   |x - y| < e (|x|+|y|)/2  (where you could use |y| or |x| 
 instead of their mean; all.equal()
 uses |target|)
   |x - y| < e * e1  (where e1 = 1, or e1 = 10^-7..)

If you add the two inequalities you get

   |x - y| < e (e1 + |x+y|/2)

as check which is a "mixture" of relative and absolute tolerance.

With a somewhat long history, my gut feeling would nowadays
actually prefer this (I think with a default of e1 = e) - which
does treat x and y symmetrically.
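
As a sketch, that symmetric "mixture" check could be coded as follows
(a hypothetical helper, *not* part of base R):

  ## TRUE iff x and y are equal up to mixed absolute/relative tolerance:
  nearEq <- function(x, y, e = 1e-8, e1 = e)
      abs(x - y) < e * (e1 + abs(x + y)/2)
  nearEq(1, 1.1, e = 0.1)  # TRUE : 0.1 < 0.1*(0.1 + 2.1/2) = 0.115
  nearEq(1, 1.3, e = 0.1)  # FALSE: 0.3 > 0.1*(0.1 + 2.3/2) = 0.125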

Note that convergence checks in good algorithms typically check
for _both_ relative and absolute difference (each with its
tolerance providable by the user), and the really good ones for
minimization do  check for (approximate) gradients also being
close to zero - as old timers among us should have learned from
Doug Bates ... but now I'm really diverging.

Last but not least some  R  code at the end,  showing that the *asymmetric*
nature of all.equal() may lead to somewhat astonishing (but very
logical and as documented!) behavior.

Martin

> Best Regards,
> Ashim


> ## The "data" to use:
> epsQ <- lapply(seq(12,18,by=1/2), function(P) bquote(10^-.(P)))
> names(epsQ) <- sapply(epsQ, deparse); str(epsQ)
List of 13
 $ 10^-12  : language 10^-12
 $ 10^-12.5: language 10^-12.5
 $ 10^-13  : language 10^-13
 $ 10^-13.5: language 10^-13.5
 $ 10^-14  : language 10^-14
 $ 10^-14.5: language 10^-14.5
 $ 10^-15  : language 10^-15
 $ 10^-15.5: language 10^-15.5
 $ 10^-16  : language 10^-16
 $ 10^-16.5: language 10^-16.5
 $ 10^-17  : language 10^-17
 $ 10^-17.5: language 10^-17.5
 $ 10^-18  : language 10^-18

> str(lapply(epsQ, function(tl) all.equal(3.45e-15, 1.23e-17, tol = eval(tl))))
List of 13
 $ 10^-12  : logi TRUE
 $ 10^-12.5: logi TRUE
 $ 10^-13  : logi TRUE
 $ 10^-13.5: logi TRUE
 $ 10^-14  : logi TRUE
 $ 10^-14.5: chr "Mean relative difference: 0.9964348"
 $ 10^-15  : chr "Mean relative difference: 0.9964348"
 $ 10^-15.5: chr "Mean relative difference: 0.9964348"
 $ 10^-16  : chr "Mean relative difference: 0.9964348"
 $ 10^-16.5: chr "Mean relative difference: 0.9964348"
 

Re: [R] [FORGED] Logical Operators' inconsistent Behavior

2017-05-20 Thread Martin Maechler
> Ramnik Bansal 
> on Sat, 20 May 2017 08:52:55 +0530 writes:

> Taking this question further.
> If I use a complex number or a numeric as an operand in logical
> operations, to me it APPEARS that these two types are first coerced to
> LOGICAL internally and then THIS logical output is further used as the
> operand.

> For eg.
>> x <- 4+5i; c(x & F, x & T, x | F, x | T)
> [1] FALSE  TRUE  TRUE  TRUE

> This output is consistent with
>> x <- 4+5i; c(as.logical(x) & F, as.logical(x) & T, as.logical(x) | F, 
as.logical(x) | T)
> [1] FALSE  TRUE  TRUE  TRUE

> This consistency makes me draw an on-the-surface conclusion that in
> the case of logical operations if the operand is not of type 'logical'
> it is first coerced into 'logical'.

That conclusion is wrong as you show below.
Rather, as the error message says,
logical
"operations are possible only for numeric, logical or complex types"

Again:

1) Logical/Arithmetic  operations "work" with "numeric-like" types, namely
  numeric, logical or complex, (and numeric = {integer, double})

  ==> all other types give an error (the one you've cited twice)

2) For "numeric-like" types and *logical* operations (&, |, !; plus && and ||)
   the equivalent of as.logical() is applied before performing the Op.
   
Seems pretty consistent ...
and also according to the principle of "least surprise" (for me at least).
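
A quick illustration in a fresh R session:

  x <- 4+5i
  c(x & FALSE, x & TRUE)  # FALSE TRUE : as.logical(x) is applied first
  ## "4" & TRUE           # Error: operations are possible only for
  ##                      #        numeric, logical or complex types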

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error in readRDS(dest) (was Re: Error with installed.packages with R 3.4.0 on Windows)

2017-05-23 Thread Martin Maechler
> Patrick Connolly 
> on Tue, 23 May 2017 20:47:22 +1200 writes:

> On Mon, 22-May-2017 at 05:43AM -0400, Martin Morgan wrote:
> |> On 05/22/2017 05:10 AM, Patrick Connolly wrote:

> |> >Apparently it isn't harmless.
> |> >
> |> >>install.packages("withr")
> |> >Error in readRDS(dest) : error reading from connection
> |> 
> |> that seems like a plain-old network connectivity issue, or perhaps
> |> an issue with the CRAN mirror you're using. Can you debug on your
> |> end, e.g,.
> |> 
> |>   options(error=recover)
> |>   install.packages("withr")
> |>   ...
> |> 
> |> then select the 'frame' where the error occurs, look around
> |> 
> |>   ls()
> |> 
> |> find the value of 'dest', and e.g., try to open dest in your  browser.

> This is what I get

>> options(error=recover)
>> install.packages("withr")
> ^C

> Enter a frame number, or 0 to exit   

> 1: install.packages("withr")
> 2: available.packages(contriburl = contriburl, method = method)
> 3: tryCatch({
> download.file(url = paste0(repos, "/PACKAGES.rds"), destfile
> 4: tryCatchList(expr, classes, parentenv, handlers)
> 5: tryCatchOne(expr, names, parentenv, handlers[[1]])
> 6: doTryCatch(return(expr), name, parentenv, handler)
> 7: download.file(url = paste0(repos, "/PACKAGES.rds"), destfile = dest, 
method

> Selection: 7

'7'  was the wrong choice:  'dest' exists in the frame that
 *calls* download.file, in this case, it is frame 2, i.e.,
 inside available.packages(.)  where the  tryCatch() call to
 download.file() happens.

Given the above stack trace,
it may be easier to just do

debugonce(available.packages)
install.packages("withr")

and then inside available.packages, (using 'n') step to the
point _before_ the tryCatch(...) call happens; there, e.g. use

  ls.str()

which gives an str() of all your local objects, notably 'dest'
and 'method'.
but you can also try other things once inside
available.packages().

Martin


> Called from: eval(substitute(browser(skipCalls = skip), list(skip = 7 - 
which)), 
> envir = sys.frame(which))
> Browse[1]> dest
> Error during wrapup: object 'dest' not found

> That indicates to me that the problem is further back but I have no
> idea how make use of that information.


> Browse[1]> ls()
> [1] "cacheOK"  "destfile" "extra""method"   "mode" "quiet"
"url" 
> Browse[1]> url
> [1] "http://cran.stat.auckland.ac.nz/src/contrib/PACKAGES.rds";
> Browse[1]> destfile
> [1] 
"/tmp/RtmpplJSrB/repos_http%3A%2F%2Fcran.stat.auckland.ac.nz%2Fsrc%2Fcontrib.rds"
> Browse[1]> 

> The destfile above is zero-length and I suppose is where dest is
> intended to end up.

> Where else should I be looking?  Earlier installations never had this
> issue so I don't have anything to compare.

> TIA
> --

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Missing dependencies in pkg installs

2017-06-22 Thread Martin Maechler
> David Winsemius 
> on Wed, 21 Jun 2017 18:04:13 -0700 writes:

>> On Jun 21, 2017, at 1:39 PM, Conklin, Mike (GfK)  
wrote:
>> 
>> I have a Ubuntu server with an R installation that has 384 packages 
installed.  We are trying to replicate the system on a Red Hat Enterprise 
server. I downloaded the list of packages from the Ubuntu machine and read it 
into an R session on the new machine. Then I ran 
install.packages(listofpackages).  Now I have 352 packages on the new machine 
but several very common packages (like much of the tidyverse, ggplot2, Hmisc) 
failed to install because of missing dependencies (most of which are the other 
packages that failed to install).
>> 
>> R version 3.4.0 (2017-04-21)
>> Platform: x86_64-redhat-linux-gnu (64-bit)

> I'd make sure you have reviewed this:

> 
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Essential-programs-and-libraries

yes,  but see also below ..

> Best;
> David.

>> Running under: Red Hat Enterprise Linux Server 7.2 (Maipo)
>> 
>> Matrix products: default
>> BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
>> 
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>> [9] LC_ADDRESS=C   LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>> 
>> other attached packages:
>> [1] dplyr_0.7.0 shiny_1.0.3
>> 
>> loaded via a namespace (and not attached):
>> [1] compiler_3.4.0   magrittr_1.5 assertthat_0.2.0 R6_2.2.2
>> [5] htmltools_0.3.6  tools_3.4.0  glue_1.1.1   tibble_1.3.3
>> [9] Rcpp_0.12.11 digest_0.6.12xtable_1.8-2 httpuv_1.3.3
>> [13] mime_0.5 rlang_0.1.1
>>> 
>> 
>> 
>> If I try and install Hmisc for example I get the following errors:
>> for the first error ERROR: 'configure' exists but is not executable -- 
>> see the 'R Installation and dministration Manual'  I did
>> not find the R Installation and Administration Manual
>> helpful.

why?  It does assume you spend some time understanding what it
is talking about.
OTOH, it is  *THE*  official document on the topic, written and
constantly updated by the  R core team.

Anyway:   " ERROR: 'configure' exists but is not executable "

is I think a crucial piece of information .. it's from stringi's
installation failure, and the top level directory of stringi (1.1-5)
in a Unix like (ls -l with user group names removed) looks like:

  -rw-r--r--. 1451 Apr  7 15:08 DESCRIPTION
  -rw-r--r--. 6207 Apr  7 11:21 INSTALL
  -rw-r--r--. 3580 Mar 21 13:29 LICENSE
  -rw-r--r--.63692 Apr  7 15:08 MD5
  -rw-r--r--. 6204 Oct 24  2016 NAMESPACE
  -rw-r--r--.22407 Apr  7 11:44 NEWS
  drwxr-xr-x. 8192 Mar 28 10:26 R
  -rwxr-xr-x.   66 Apr  2  2015 cleanup
  -rw-rw-r--.32193 Apr  8 05:46 config.log
  -rwxrwxr-x.40648 Apr  8 05:46 config.status
  -rwxr-xr-x.   173757 Apr  7 11:43 configure
  -rw-r--r--.  669 Jun 23  2015 configure.win
  drwxr-xr-x.  512 Apr  7 11:50 inst
  drwxr-xr-x. 8192 Mar 28 10:26 man
  drwxr-xr-x.16384 Apr  8 05:47 src

Note the "x"s  in the 'configure' line's " -rwxr-xr-x. ": 
This means the file is executable and a reasonable shell can
just "call the file" and it will be executed, but that's the
part which failed for you.

.. this *is* peculiar as it looks like some of the standard Unix
tools may be misbehaving for you .. I assume it could be some OS
security "feature" playing against you..

>> 
>> 
>>> install.packages("Hmisc")
>> Installing package into '/usr/lib64/R/library'
>> (as 'lib' is unspecified)
>> also installing the dependencies 'stringi', 'evaluate', 'reshape2', 
'stringr', knitr', 'ggplot2', 'htmlTable', 'viridis'
>> 
>> trying URL 'https://cloud.r-project.org/src/contrib/stringi_1.1.5.tar.gz'
>> Content type 'application/x-gzip' length 3645872 bytes (3.5 MB)
>> ==
>> downloaded 3.5 MB
>> 
>> trying URL 'https://cloud.r-project.org/src/contrib/evaluate_0.10.tar.gz'
>> Content type 'application/x-gzip' length 21914 bytes (21 KB)
>> ==
>> downloaded 21 KB
>> 
>> trying URL 
'https://cloud.r-project.org/src/contrib/reshape2_1.4.2.tar.gz'
>> Content type 'application/x-gzip' length 34688 bytes (33 KB)
>> ==
>> downloaded 33 KB
>> 
>> trying URL 'https://cloud.r-project.org/src/contrib/stringr_1.2.0.tar.gz'
>> Content type 'applicat

Re: [R] getting error while trying to make dendogram based on gene expression

2017-06-23 Thread Martin Maechler
> Yogesh Gupta 
> on Wed, 21 Jun 2017 13:42:15 +0900 writes:

> I am trying to make dendogram based on gene expression matrix , but 
getting
> some error:

> I
> countMatrix = read.table("count.row.txt",header=T,sep='\t',check.names=F)

> colnames(countMatrix)

> count_matrix <- countMatrix[,-1]           # remove first column (gene names)
> rownames(count_matrix) <- countMatrix[,1]  # add first column (gene names) as rownames

>> nonzero_row <- count_matrix[rowSums(count_matrix) > 0, ]  # removed rows whose sum is 0 across all samples

>> x1 = as.matrix(nonzero_row)   # converted data into a matrix

>> x = log2(x1 + 1)              # converted into log values
>> d <- dist(x, method="euclidean")

>> h <- hclust(d, method="complete")


> *Error:*

> *** caught segfault ***
> address 0x7fa39060af28, cause 'memory not mapped'

> Traceback:
> 1: hclust(d, method = "complete")

> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection:

This looks like a problem that should not happen.
Though it could be that it's just too large a problem (for your
hardware, or "in general").
However, we cannot reproduce what you show above.

To help us help you please provide (to this mailing list!)

   - a (minimal) reproducible example
   - sessionInfo()

It is best to take time to carefully read
   https://www.r-project.org/help.html

and/or then just search "reproducible example R" :
   http://lmgtfy.com/?q=reproducible+example+R

Martin


> Thanks
> Yogesh

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Rd] setReplaceMethod creates 'object' in the userworkspace

2017-06-27 Thread Martin Maechler
> Jonathan Fritzemeier 
> on Fri, 23 Jun 2017 16:15:30 +0200 writes:

> Hi,
> I recognized that the function 'setReplaceMethod' is creating a
> character vector in the user workspace having the name (e.g. "newClass")
> of the class used as value. If you can sort out a mistake by myself, I
> would like you to file a bug report.

Yes, a mistake by yourself (and really not fit for R-devel,
but rather for R-help to which I follow up now)

> BBFN,
> Jonathan

> setClass("newClass", representation(value="numeric"))
> 
> setMethod(f = "initialize", signature = "newClass",
> definition = function(.Object){
> .Object@value <- 1
> return(.Object)
> })
> 
> setGeneric(name = "myValue",
> def  = function(object) { standardGeneric("myValue") }
> )
> setGeneric(name = "myValue<-",
> def  = function(object, value) { standardGeneric("myValue<-") }
> )
> 
> setMethod("myValue", signature(object = "newClass"),
> function(object) {
> return(object@value)
> }
> )

> setReplaceMethod("myValue", signature = (object = "newClass"),
> function(object, value) {
> object@value <- value
> return(object)
> }
> )

Q: what do you think happens with the above [last setReplaceMethod() call] ?
A: it creates an R object named 'object' in the globalenv
   (or the package if this would go into a package)

If you just replace  '(object = "newClass")'  by  '"newClass"',
things should be fine.

{{ Removing all the completely redundant return(.) calls, i.e., returning
   implicitly rather than via an extra function call, would also
   make the code "cleaner" and more R-like.  }}
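
For concreteness, a minimal sketch of the corrected replacement method,
combining both points above:

   setReplaceMethod("myValue", signature = "newClass",
                    function(object, value) {
                        object@value <- value
                        object   # implicit return
                    })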

Best,
Martin


> myNewObject <- new("newClass")
> print(object)
> 
> 
> > print(object)
> [1] "newClass"
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readLines without skipNul=TRUE causes crash

2017-07-18 Thread Martin Maechler
>>>>> Anthony Damico 
>>>>> on Sun, 16 Jul 2017 06:40:38 -0400 writes:

> hi, the text file that prompts the segfault is 4gb but only 80,937 lines
>> file.info("S:/temp/crash.txt")
>                         size isdir mode               mtime               ctime               atime exe
> S:/temp/crash.txt 4078192743 FALSE  666 2017-07-15 17:24:35 2017-07-15 17:19:47 2017-07-15 17:19:47  no


> On Sun, Jul 16, 2017 at 6:34 AM, Duncan Murdoch 
> wrote:

>> On 16/07/2017 6:17 AM, Anthony Damico wrote:
>> 
>>> thank you for taking the time to write this.  i set it running last
>>> night and it's still going -- if it doesn't finish by tomorrow, i will
>>> try to find a site to host the problem file and add that link to the bug
>>> report so the archive package can be avoided at least.  i'm sorry for
>>> the bother
>>> 
>>> 
>> How big is that text file?  I wouldn't expect my script to take more than
>> a few minutes even on a huge file.
>> 
>> My script might have a bug...
>> 
>> Duncan Murdoch
>> 
>> On Sat, Jul 15, 2017 at 4:14 PM, Duncan Murdoch
>>> mailto:murdoch.dun...@gmail.com>> wrote:
>>> 
>>> On 15/07/2017 11:33 AM, Anthony Damico wrote:
>>> 
>>> hi, i realized that the segfault happens on the text file in a
>>> new R
>>> session.  so, creating the segfault-generating text file requires
>>> a
>>> contributed package, but prompting the actual segfault does not --
>>> pretty sure that means this is a base R bug?  submitted here:
>>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311
>>> <https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311>
>>> hopefully i am not doing something remarkably stupid.  the text file
>>> itself is 4GB so cannot upload it to bugzilla, and from the
>>> R_AllocStringBuffer error in the previous message, i think most or
>>> all of it needs to be there to trigger the segfault.  thanks!

In the mean time, communication has continued a bit at the bugzilla bug tracker
(https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311 ), and
as you can read there, the bug is fixed now, also thanks to an
initial patch proposal by Hannes Mühleisen.

Martin Maechler
ETH Zurich (and R Core)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] axis() after image.plot() does not work except if points() is inserted between

2017-07-25 Thread Martin Maechler
>>>>> Marc Girondot via R-help 
>>>>> on Mon, 24 Jul 2017 09:35:06 +0200 writes:

> Thanks for the proposition. As you see below, par("usr") is the same 
> before and after the points() (the full code is below):
> 
>> par("usr")
> [1] -0.250  1.250 -0.167  1.167
>> # if you remove this points() function, axis will show nothing.
>> 
>> points(1.5, 1.5, type="p")
>> p2 <- par(no.readonly=TRUE)
>> par("usr")
> [1] -0.250  1.250 -0.167  1.167
> ...

> I can reproduce it in Ubuntu and MacosX R Gui and Rstudio (R 3.4.1).

> Marc

> Here is the code:
> library(fields)
> par(mar=c(5,4.5,4,7))
> D <- matrix(c(10, 20, 25, 30, 12, 22, 32, 35, 13, 25, 38, 40), nrow=3)

> p0 <- par(no.readonly=TRUE)
> image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
>   ylab="Columns", cex.lab = 0.5, 
>   zlim=c(min(D, na.rm=TRUE),max(D, na.rm=TRUE)),
>   las=1, axes=FALSE)
> p1 <- par(no.readonly=TRUE)

> par("usr")
> par("xpd")

> # if you remove this points() function, axis will show nothing.

> points(1.5, 1.5, type="p")
> p2 <- par(no.readonly=TRUE)
> par("usr")
> par("xpd")

> ##
> axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
> axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1,
> cex.axis=0.5)

> identical(p1, p2)

Have you ever carefully read the detailed help page about image.plot()?
   
I haven't, but a cursory reading already shows me that the
author of the function did this partly on purpose:

  > Side Effects:
  > 
  >  After exiting, the plotting region may be changed to make it
  >  possible to add more features to the plot. To be explicit,
  >  ‘par()$plt’ may be changed to reflect a smaller plotting region
  >  that has accommodated room for the legend subplot.

Unfortunately, there it does _not_ mention the following :

From looking at its code, and then re-reading parts of the help page,
I see that there is a 'graphics.reset' argument which you can
set to TRUE in such a case:

 image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
   ylab="Columns", cex.lab = 0.5, 
   zlim= range(D, na.rm=TRUE),
   graphics.reset = TRUE, # <<<<< the solution
   las=1, axes=FALSE)
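
With  graphics.reset = TRUE,  the two axis() calls from your own code
should then work directly, without the points() workaround:

 axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
 axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1, cex.axis=0.5)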

Also note that
 zlim = range(D, na.rm=TRUE)
is infinitely more elegant than
 zlim = c(min(D, na.rm=TRUE), max(D, na.rm=TRUE))


Martin Maechler
ETH Zurich (and R core)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Superscript and subscript in R for legend x-axis and y-axis and colour different subjects in longitudinal data with different colours

2017-07-31 Thread Martin Maechler

> Hi Rosa
> something like

> plot(1,1, sub=expression(lambda^"2"))

> So with your example, do you want something like

> plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5) ,
>   xlab="Day in ICU",
>   ylab="CRP (mg/dL)",
>   sub = mtext(expression(lambda^2)))

OOps!   Either  plot( ...,  sub = *)
or  plot( ... ) ; mtext(*)

but not both!

>  CRP7graph <- apply(CRP7, 1, lines, col="gray")

> Cheers
> Petr


> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rosa 
> > Oliveira
> > Sent: Friday, July 28, 2017 5:07 PM
> > To: r-help mailing list ; R-help@r-project.org
> > Subject: [R] Superscript and subscrib R for legend x-axis and y-axis and 
> > colour
> > different subjects in longitudinal data with different colours
> >
> > I am trying to make a x-axis and y-axis titles with both a special 
> > character and a
> > subscript. I am not being able to do this. I think its just a placing of my
> > parenthesis, but I've tried (seemingly) everything.
> >
> > Even more, when I try the blog users code it works.
> >
> >
> >
> > Is it because I’m using longitudinal data?
> >
> >
> >
> > Even more. Is it possible to colour each one of the 15 lines with a 
> > different
> > colour?
> >
> >
> >
> >
> >
> >
> > library(ggplot2)
> > library(reshape)
> > library(lattice)
> > library(gtable)
> > library(grid)
> >
> > attach(mtcars)
> >
> > beta0 = rnorm (15, 1, .5)
> > beta1 = rnorm (15, -1, .5)
> >
> > tempo = seq(1:5)
> >
> > CRP7raw = matrix(NA, 15, 5)
> > CRP7 = matrix(NA, 15, 5)
> > CRP98raw = matrix(NA, 15, 5)
> > CRP98 = matrix(NA, 15, 5)
> >
> > crp <- for (i in 1:15) {
> >   CRP7raw[i,] = beta0[i] + beta1[i] * tempo
> >   CRP7[i,] =CRP7raw[i,] + rnorm(5, 0, 2.14)
> >
> >   CRP98raw[i,] = beta0[i] + beta1[i] * tempo
> >   CRP98[i,] =CRP98raw[i,] + rnorm(5, 0, .1)
> > }
> >
> >
> > # plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5) ,
> > #  xlab="Day in ICU",
> > #  ylab="CRP (mg/dL)",
> > #  sub = mtext(expression(paste(lambda)))
> > #
> > # CRP7graph <- apply(CRP7, 1, lines, col="gray")
> >
> >
> >
> >
> >
> >
> > # plot(c(1:5), CRP98raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5),
> > #  xlab="Day in ICU",
> > #  ylab="CRP (mg/dL)")
> > # CRP98graph <- apply(CRP98, 1, lines, col="gray")
> >
> > par(mfrow=c(1,2))
> >
> > plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5) ,
> >  xlab="t_i",
> >  ylab="y_ij",
> >  sub = "lambda = 0.7")
> >
> >  CRP7graph <- apply(CRP7, 1, lines, col="gray")
> >
> >
> >  plot(c(1:5), CRP98raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5),
> >   xlab="Day in ICU",
> >   ylab="CRP (mg/dL",
> >   sub = "lambda = 0.98")
> >  CRP98graph <- apply(CRP98, 1, lines, col="gray")
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.


Re: [R] Superscript and subscript in R for legend x-axis and y-axis and colour different subjects in longitudinal data with different colours

2017-07-31 Thread Martin Maechler
>>>>> PIKAL Petr 
>>>>> on Mon, 31 Jul 2017 09:11:18 + writes:

> Hi Martin see in line

>> -Original Message- From: Martin Maechler
>> [mailto:maech...@stat.math.ethz.ch] Sent: Monday, July
>> 31, 2017 10:52 AM To: PIKAL Petr 
>> Cc: Rosa Oliveira ; r-help mailing
>> list  
>> Subject: Re: [R] Superscript and subscrib R for legend
>> x-axis and y-axis and colour different subjects in
>> longitudinal data with different colours
>> 
>> 
>> > Hi Rosa > something like
>> 
>> > plot(1,1, sub=expression(lambda^"2"))
>> 
>> > So with your example, do you want something like
>> 
>> > plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5),
>> ylim=c(-10,5) , > xlab="Day in ICU", > ylab="CRP
>> (mg/dL)", > sub = mtext(expression(lambda^2)))
>> 
>> OOps!  Either plot( ..., sub = *) or plot( ... ) ;
>> mtext(*)
>> 
>> but not both!

> You are right, I used a code from OP and did not much
> think about it. Strangely enough, the code worked without
> any complain. Probably mtext is "stronger" than sub and
> overrides it.

Well, well, "the magic of R"  .
Not quite:  mtext(..) is a valid function call that is evaluated
before plot() finishes; then it returns NULL invisibly, and
'plot(*, sub=NULL)' does nothing "coincidentally" (but for good
reasons)
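
In other words, a minimal sketch of the two correct variants:

  ## either pass the expression to 'sub' directly ...
  plot(1, 1, sub = expression(lambda^2))
  ## ... or draw the plot first, then annotate the margin with mtext()
  ## (side = 1, line = 4 is an assumption of roughly where 'sub' sits):
  plot(1, 1)
  mtext(expression(lambda^2), side = 1, line = 4)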

Martin

> Cheers Petr

>> 
>> > CRP7graph <- apply(CRP7, 1, lines, col="gray")
>> 
>> > Cheers > Petr
>> 
>> 
>> > > -Original Message- > > From: R-help
>> [mailto:r-help-boun...@r-project.org] On Behalf Of Rosa >
>> > Oliveira > > Sent: Friday, July 28, 2017 5:07 PM > >
>> To: r-help mailing list ;
>> R-help@r-project.org > > Subject: [R] Superscript and
>> subscrib R for legend x-axis and y-axis > > and colour
>> different subjects in longitudinal data with different >
>> > colours
>> > >
>> > > I am trying to make a x-axis and y-axis titles with
>> both a special > > character and a subscript. I am not
>> being able to do this. I think > > its just a placing of
>> my parenthesis, but I've tried (seemingly) everything.
>> > >
>> > > Even more, when I try the blog users code it works.
>> > >
>> > >
>> > >
>> > > Is it because I’m using longitudinal data?
>> > >
>> > >
>> > >
>> > > Even more. Is it possible to colour each one of the
>> 15 lines with a > > different colour?
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > library(ggplot2) > > library(reshape) > >
>> library(lattice) > > library(gtable) > > library(grid)
>> > >
>> > > attach(mtcars)
>> > >
>> > > beta0 = rnorm (15, 1, .5) > > beta1 = rnorm (15, -1,
>> .5)
>> > >
>> > > tempo = seq(1:5)
>> > >
>> > > CRP7raw = matrix(NA, 15, 5) > > CRP7 = matrix(NA, 15,
>> 5) > > CRP98raw = matrix(NA, 15, 5) > > CRP98 =
>> matrix(NA, 15, 5)
>> > >
>> > > crp <- for (i in 1:15) { > > CRP7raw[i,] = beta0[i] +
>> beta1[i] * tempo > > CRP7[i,] = CRP7raw[i,] + rnorm(5, 0,
>> 2.14)
>> > >
>> > > CRP98raw[i,] = beta0[i] + beta1[i] * tempo > >
>> CRP98[i,] = CRP98raw[i,] + rnorm(5, 0, .1) > > }
>> > >
>> > >
>> > > # plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5),
>> ylim=c(-10,5) , > > # xlab="Day in ICU", > > # ylab="CRP
>> (mg/dL)", > > # sub = mtext(expression(paste(lambda)))
>> > > #
>> > > # CRP7graph <- apply(CRP7, 1, lines, col="gray")
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > # plot(c(1:5), CRP98raw[1,], type = "n", xlim=c(1,5),
>> ylim=c(-10,5), > > # xlab="Day in ICU", > > #

Re: [R] Generating samples from truncated multivariate Student-t distribution

2017-08-02 Thread Martin Maechler
>>>>> David Winsemius 
>>>>> on Tue, 9 May 2017 14:33:04 -0700 writes:

>> On May 9, 2017, at 2:05 PM, Czarek Kowalski  
wrote:
>> 
>> I have already posted that in attachment - pdf file.

> I see that now. I failed to scroll to the 3rd page.

from a late reader:

Please, Czarek, and every other future reader:   This is an
unwritten "regulation" about all R lists I know (a few of which
I co-maintain): 

___>  R code must be given as unformatted plain text  rather than
___>  in a formatted document (pdf in your case) 


>> I am posting plain text here:

good .. and please always do so on these lists


Martin Maechler, ETH Zurich & R Core


>> 
>>> library(tmvtnorm)
>>> meann = c(55, 40, 50, 35, 45, 30)
>>> covv = matrix(c(  1,  1,  0,   2,  -1,  -1,
>>>                   1, 16, -6,  -6,  -2,  12,
>>>                   0, -6,  4,   2,  -2,  -5,
>>>                   2, -6,  2,  25,   0, -17,
>>>                  -1, -2, -2,   0,   9,  -5,
>>>                  -1, 12, -5, -17,  -5,  36), 6, 6)
>>> df = 4
>>> lower = c(20, 20, 20, 20, 20, 20)
>>> upper = c(60, 60, 60, 60, 60, 60)
>>> X1 <- rtmvt(n=100000, meann, covv, df, lower, upper)
>> 
>> 
>>> sum(X1[,1]) / 100000
>> [1] 54.98258
>>> sum(X1[,2]) / 100000
>> [1] 40.36153
>>> sum(X1[,3]) / 100000
>> [1] 49.83571
>>> sum(X1[,4]) / 100000
>> [1] 34.69571  # "4th element of mean vector"
>>> sum(X1[,5]) / 100000
>> [1] 44.81081
>>> sum(X1[,6]) / 100000
>> [1] 31.10834
>> 
>> And corresponding results received using equation (3) from pdf file:
>> [54.97,
>> 40,
>> 49.95,
>> 35.31, #  "4th element of mean vector"
>> 44.94,
>> 31.32]
>> 

> I get similar results for the output from your code, 

> My 100-fold run of your calculations were:

> meansBig <- replicate(100, {Xbig <- rtmvt(n=100000, meann, covv, df, lower, upper)
> colMeans(Xbig)} )

> describe(meansBig[4,])  # describe is from Hmisc package

> meansBig[4, ]
>        n  missing distinct     Info     Mean      Gmd      .05      .10      .25
>      100        0      100        1     34.7  0.01954    34.68    34.68    34.69
>      .50      .75      .90      .95
>    34.70    34.72    34.72    34.73

> lowest : 34.65222 34.66675 34.66703 34.66875 34.67566
> highest: 34.72939 34.73012 34.73051 34.73742 34.74441


> So I agree, 35.31 is outside the plausible range of an RV formed with that 
package, but I don't have any code relating to your calculations from theory.

> Best;
> David.


>> On 9 May 2017 at 22:17, David Winsemius  wrote:
>>> 
>>>> On May 9, 2017, at 1:11 PM, Czarek Kowalski  
wrote:
>>>> 
>>>> Of course I expected a difference between theory and a sample
>>>> of realizations of RVs, and the resulting mean should still be a random
>>>> variable. But, for example, for the 4th element of the mean vector: 35.31 -
>>>> 34.69571 = 0.61429. It is quite a big difference, isn't it? I have
>>>> expected that the difference would be smaller because of the law of large
>>>> numbers (for 10mln samples the difference is quite similar).
>>> 
>>> I for one have no idea what is meant by a "4th element of mean vector". 
So I have no idea what to consider "big". I have found that my intuitions 
about multivariate distributions, especially those where the covariate 
structure is as complex as you have assumed, are often far from simulated 
results.
>>> 
>>> I suggest you post some code and results.
>>> 
>>> --
>>> David.
>>> 
>>> 
>>>> 
>>>> On 9 May 2017 at 21:40, David Winsemius  wrote:
>>>>> 
>>>>> On May 9, 2017, at 10:09 AM, Czarek Kowalski  
>>>>> wrote:
>>>>> 
>>>>> Dear Members,
>>>>> I am working with 6-dimensional Student-t distribution with 4 degrees
>>>>> of freedom truncated to [20; 60]. I have generated 100 000 samples
>>>>> from truncated multivariate Student-t distribution using rtmvt
>>>>> function from package ‘tmvtnorm’. I have also calculated  mean vector
>>>>> using equation (3) from attached pdf. The problem is, that after
>>>>> summing all elements in one column of rtmvt result (and dividing by
>>>>> 100 000) I do not receive the same 

Re: [R] define a list with names as variables

2017-08-04 Thread Martin Maechler
>>>>> Giovanni Gherdovich 
>>>>> on Fri, 4 Aug 2017 12:36:49 +0200 writes:

> Hello Thomas, Ulrik,
> thanks for your suggestions.

There's also
   setNames(object, nm)
which is a trivial wrapper to achieve this.
I had provided it for R (in package 'stats') just to help people
to write more nicely readable code:

Your function f()  below then becomes a simple one-liner:

   f <- function(foo, bar)  setNames(list(bar), foo)
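
As a quick check, reproducing Thomas Mailund's output further below:

   f("hello", "world")
   ## $hello
   ## [1] "world"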

Martin Maechler
ETH Zurich and R Core


> Giovanni

> On Fri, Aug 4, 2017 at 12:13 PM, Thomas Mailund
>  wrote:
>> Do you mean like this?
>> 
>> 
>>> f <- function(foo, bar) {
>> +   result <- list(bar)
>> +   names(result) <- foo
>> +   result
>> + }
>> 
>>> (x <- f("hello", "world"))
>> $hello
>> [1] "world"
>> 
>>> names(x)
>> [1] "hello"

> On Fri, Aug 4, 2017 at 12:14 PM, Ulrik Stervbo  
wrote:
>> Hi Giovani,
>> 
>> I would create an unnamed list and set the names after.
>> 
>> Best,
>> Ulrik

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help creating the IBM Randu function

2017-08-14 Thread Martin Maechler
> Richard M Heiberger 
> on Mon, 14 Aug 2017 14:36:40 -0400 writes:

> Please look at ?datasets::randu
> for David Donoho's translation of RANDU into R.

Thanks a lot, Rich,  for pointing to this:

Indeed, the RANDU  aka 'randu' data set has been part of R since
the last millennium (and hence before R reached version 1.0.0).
The help page does mention where we got the data set originally,
namely Dave Donoho.  However, I would bet he did not use R at
the time, and indeed it was Prof Brian Ripley who added the
R code to R sources in 1999


r5632 | ripley | 1999-08-27 08:59:05 +0200 (Fri, 27. Aug 1999)

improve clarity (I hope), add an R example to re-generate this


and yes, indeed

example(randu)

already creates the RANDU data set for you .. since 1999.

One curious thing:
the Wikipedia page (1) cited by Martin .. Pedersen seems to make it
clear that RANDU stems from IBM, whereas in R's help page   help(randu)
we only mention that we got it through VAX VMS code.
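
As an aside, the reason the RANDU() definition quoted below misbehaves
is operator precedence:  return(65539*num)  exits the function before
%% is ever applied.  A minimal sketch of the fix:

  RANDU <- function(num) (65539 * num) %% 2^31
  RANDU(1)          # 65539
  RANDU(RANDU(1))   # 393225, matching the hand computation below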


> On Mon, Aug 14, 2017 at 12:49 PM, Martin Møller Skarbiniks Pedersen
>  wrote:
>> Dear all,
>> 
>> I am trying to learn functions in R and 3D plotting so I decided to try
>> to plot
>> the famous bad PRNG Randu from IBM(1).
>> However something is not correct in the function I have created.
>> First I define the function RANDU like this:
>> 
>>> RANDU <-  function(num) { return (65539*num)%%(2^31) }
>> 
>> and test that it works for a seed of 1:
>>> RANDU(1)
>> [1] 65539
>> 
>> but if I want the next value in the sequence I get this number.
>>> (65539*65539)%%(2^31)
>> [1] 393225
>> 
>> However using the RANDU function twice doesn't give the same result as
>> above.
>> 
>>> RANDU(RANDU(1))
>> [1] 4295360521
>> 
>> I expect these two values to be the same but that is not the case.
>> 393225 should be the correct.
>> 
>> I guess it might be something with local vs. global environment ?!
>> 
>> Please advise and thanks.
>> 
>> Regards
>> Martin M. S. Pedersen
>> 
>> (1) https://en.wikipedia.org/wiki/RANDU
>> 
>> [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Find maxima of a function

2017-08-29 Thread Martin Maechler
>>>>> Ranjan Maitra 
>>>>> on Sun, 27 Aug 2017 08:17:15 -0500 writes:

> I have not followed the history of this thread, but I am
> quite flummoxed as to why the OP is rewriting code to
> estimate parameters from an univariate Gaussian mixture
> model when alternatives such as EMCluster (which generally
> appears to handle initialization better than MClust)
> exist. Or perhaps there is more to it in which case I
> apologize. But I thought that I would make the OP aware of
> the package (of which, in full disclosure, I am
> co-author).  Best wishes, Ranjan

in this case, I should also mention the small (but nice) package
'nor1mix' (normal 1-dimensional mixtures) for which I had added
MLE via optim() in addition to the traditional slow and
sometimes hard-to-get-to-converge-correctly EM algorithm.
After
install.packages("nor1mix")
require("nor1mix")

?norMixMLE

Martin Maechler,
ETH Zurich


> On Sun, 27 Aug 2017 16:01:59 +0300 Ismail SEZEN
>  wrote:

>> Dear Niharika,
>> 
>> As I said before, the problem is basically an
>> optimization issue. You should isolate the problematic
>> part from the rest of your study. Sometimes, more
>> information does not help toward a solution. All the answers
>> from us (Ulrik, David, me) are more or less are correct
>> to find a maximum point. Newton’s method is also
>> correct. But after our answers, you only say it didn’t find
>> the right maxima. At this point I’m clueless, because
>> problem might be originated at some other point and we
>> don’t know it. For instance, In my previous answer, my
>> suggested solution was right for both your 2
>> situations. But suddenly you just said, it didin’t work
>> for a mean greater than 6 and I’m not able to reproduce
>> your new situation.
>> 
>> I’m really upset that you lost 4 weeks on a very easy
>> issue (it is a very long time). But if you don’t want to
>> lose any more, please isolate your question from the
>> rest of the work, create a minimal reproducible example,
>> and let’s focus on the problematic part. I assure you
>> that your problem will be solved faster.
>> 
>> I could install the distr package at the end, but I’m
>> getting another error while running your code. But I
>> don’t believe the question is related to this package.
>> 
>> Let me explain about extremum points although most of you
>> know about it. Let’s assume we have a y(x) function. An
>> extremum (max/min) point is defined as where dy/dx =
>> 0. If second order derivative of y is less than 0, it’s a
>> maximum, if greater than 0, it’s a minimum point.  If
>> zero, it’s undefined. So, at the end, we need x and y
>> vectors to find maxima.
>> 
>> y <- c(1,2,3,4,3,2,3,4,5,6,7,8,9,8,7,6,5,6,7,6,5) #
>> sample y values x <- 1:length(y) # correspoinding x
>> values
>> 
>> # Now we need to convert these discrete x-y vectors to a
>> function, because we don’t know the values between points.
>> # That’s why we fit a cubic spline to have the
>> values between discrete points.
>> 
>> fun <- splinefun(x = x, y = y, method = "n")
>> 
>> # now we are able to find the value of y at x = 1.5 with
>> the help of the fun() function.
>> 
>> x2 <- seq(1, max(x), 0.1)
>> y2 <- fun(x2)
>> 
>> plot(x, y, type = "l")
>> lines(x2, y2, col = "red")
>> 
>> # see, the red and black lines are slightly different because
>> we fit a cubic spline to the discrete points.  # Now, we
>> will use the optimize() function to find the maxima in a
>> defined interval. The interval definition is crucial.  # The
>> optimize() function calculates all the required steps explained
>> above to find an extremum and gives the result.
>> 
>> max.x  <- optimize(fun, interval = range(x), maximum = TRUE)
>> max.x2 <- x[which(y == max(y))]  # global maximum of discrete x-y vectors
>> 
>> print(max.x)   # x value of global maximum of y
>> print(max.x2)  # x value of global maximum of y (discrete points)
>> 
>> see max.x and max.x2 are very close. max.x is calculated
>> under a cubic spline assumption and max.x2 is calculated
>> by discrete points.
>> 
>> As a result, problem is reduced to find maxima of x-y
>> vecto

Re: [R] Optimize code to read text-file with digits

2017-09-08 Thread Martin Maechler
> peter dalgaard 
> on Fri, 8 Sep 2017 16:12:21 +0200 writes:

>> On 8 Sep 2017, at 15:51 , Martin Møller Skarbiniks
>> Pedersen  wrote:
>> 
>> On 8 September 2017 at 14:37, peter dalgaard
>>  wrote:
>>> 
>>> 
 On 8 Sep 2017, at 14:03 , peter dalgaard
  wrote:
 
 x <- scan("~/Downloads/digits.txt") x <-
 x[-seq(1,22,11)]
>>> 
>>> ...and, come to think of it, if you really want the
>>> 100 random digits:
>>> 
>>> xx <- c(outer(x,10^(0:4), "%/%")) %% 10
>>> 
>> 
>> Hi Peter, Thanks a lot for the answers. I can see that I
>> need to read about outer().  However I get a different
>> result than expected.
>> 
R> x <- scan("digits.txt")
>> Read 22 items
>> 
R> head(x)
>> [1] 0 10097 32533 76520 13586 34673
>> 
R> x <- x[-seq(1,22,11)]
R> head(x)
>> [1] 10097 32533 76520 13586 34673 54876
>> 
R> head(c(outer(x,10^(0:4), "%/%")) %% 10, 10) #
>> [1] 7 3 0 6 3 6 9 7 2 5
>> 

> Ah, right. You do get all the digits, but in the order of
> the last digit of each 5 digit number, then all the
> penultimate digits, etc. To get digits in the right order,
> try

>> xx <- c(t(outer(x,10^(4:0), "%/%"))) %% 10
>> head(xx, 100)
>  [1] 1 0 0 9 7 3 2 5 3 3 7 6 5 2 0 1 3 5 8 6 3 4 6 7 3 5 4 8 7 6 8 0 9 5
> [35] 9 0 9 1 1 7 3 9 2 9 2 7 4 9 4 5 3 7 5 4 2 0 4 8 0 5 6 4 8 9 4 7 4 2
> [69] 9 6 2 4 8 0 5 2 4 0 3 7 2 0 6 3 6 1 0 4 0 2 0 0 8 2 2 9 1 6 6 5

> I.e., reverse the order of digit generation and transpose
> the matrix that outer() creates (because matrices are
> column-major).

As people are  "exercising" with R and it's Friday:

Try to use  read.fwf() instead of scan() to get to the digits directly,
and see if you get the identical digits, and if it is faster overall or not
[I have no idea of the answer to that].
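
A minimal sketch of that idea, on a fake two-line input whose layout
(five-digit groups separated by single blanks) is an assumption here:

  tc <- textConnection(c("10097 32533", "76520 13586"))
  w  <- c(1,1,1,1,1, -1, 1,1,1,1,1)   # a negative width skips the blank
  d  <- read.fwf(tc, widths = w)      # one column per digit
  close(tc)
  as.vector(t(as.matrix(d)))          # flatten row by row, keeping order
  ##  [1] 1 0 0 9 7 3 2 5 3 3 7 6 5 2 0 1 3 5 8 6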

another Martin.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error messages using nonlinear regression function (nls)

2017-10-20 Thread Martin Maechler
>>>>> PIKAL Petr 
>>>>> on Fri, 20 Oct 2017 06:33:36 + writes:

> Hi
> Keep your messages in the list, you increase your chance to get some 
answer.

> I changed your data to groupedData object (see below), but I did not find 
any problem in it.

> plot(wlg)
> gives reasonable picture and I am not such expert to see any problem with 
data. Seems to me, that something has to be wrong with nlsList function.

>> wheat.list <- nlsList(Prop ~ SSlogis(end,Asym, xmid, scal), data=wlg)
> Warning message:
> 6 times caught the same error in lm.fit(x, y, offset = offset, 
singular.ok = singular.ok, ...): NA/NaN/Inf in 'x'

> produces empty list. So maybe others could find some problem.

> Cheers
> Petr

Thank you, Petr,  for the reproducible example.

This indeed can be traced back to a bug in SSlogis() that has
been there since Doug Bates added the 'nls' package on Martin's
day 1999 (==> before R version 1.0.0 !) to R (svn rev 6455).

It's this part which does contain a thinko (by whomever, only
Doug may be able to remember/recall, but I guess it would have
been José Pinheiro, then the grad student doing the nitty gritty),
I have added 2 '--' below to make the 2 important lines stand out :

   z <- xy[["y"]]
 -
   if (min(z) <= 0) { z <- z - 1.05 * min(z) } # avoid zeroes
   z <- z/(1.05 * max(z))   # scale to within unit height
 -
   xy[["z"]] <- log(z/(1 - z))  # logit transformation
   aux <- coef(lm(x ~ z, xy))

the first of the 2 lines,  commented   "avoid zeroes"
is obviously *not* avoiding zeroes in the case min(z) == 0 , 
and even though the 2 lines  should rescale an interval  [0, M]
to [eps, 1 - eps] they don't in this case.

Consequently, the next line  log(z/(1 - z))  transforms
the 0s into -Inf  and these lead to the warning (or error in nls())

  " NA/NaN/Inf in 'x' "

One could fix this up by  replacing  min(z)  by  min(z, -1e-7)
which may be best for pure back compatibility, but I really
don't like it either :

The famous  logit - transform   log( z / (1-z))
is really anti-symmetric around z = 1/2 , in particular should
treat 0 and 1 "symmetrically" and I find it ugly that
the previous fixup (our two ominous lines) is not at all
symmetric wrt 1/2, notably the 2nd transformation is made
unconditionally but the first one not.

Fortunately, the same source file, /src/library/stats/R/zzzModels.R 
also defines  the SSfpl()  == 4-parameter logistic model

and there, the 'init' function needs to do the same scaling to
(0, 1)  and does it much nicer, indeed (anti)symmetrically.

I'm looking into using that in SSlogis() as well,
fixing this bug.

Martin Maechler
ETH Zurich and R Core Team

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error messages using nonlinear regression function (nls)

2017-10-20 Thread Martin Maechler
>>>>> PIKAL Petr 
>>>>> on Fri, 20 Oct 2017 13:35:01 + writes:

> Thank you Martin.
> If I understand correctly, OP could do

> wheat.list <- nlsList(Prop ~ SSfpl(end, A, B, xmid, scal), data=wlg)

(which is really a different model with one parameter more, and
 given the relatively small sample size per group, it may not be
 the best idea)

> or add some small value to all zeroes

> wlg$prop <- wlg$Prop+1e-7
> wheat.list <- nlsList(prop ~ SSlogis(end,Asym, xmid, scal), data=wlg)

> which gives fairly reasonable results.
> plot(augPred(wheat.list))

> Am I correct?

Yes, indeed, Petr,  I forgot to mention that obvious workaround.

(Interestingly, he could also _subtract_ a small positive value
 which is less intuitive, but would work too, given the code I showed.)

Martin



> Cheers
> Petr

>> -Original Message-
>> From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
>> Sent: Friday, October 20, 2017 1:04 PM
>> To: PIKAL Petr 
>> Cc: Wall, Wade A ERDC-RDE-CERL-IL CIV ; r-
>> h...@r-project.org
>> Subject: Re: [R] Error messages using nonlinear regression function (nls)
>> 
>> >>>>> PIKAL Petr 
>> >>>>> on Fri, 20 Oct 2017 06:33:36 + writes:
>> 
>> > Hi
>> > Keep your messages in the list, you increase your chance to get some
>> answer.
>> 
>> > I changed your data to groupedData object (see below), but I did not 
find
>> any problem in it.
>> 
>> > plot(wlg)
>> > gives reasonable picture and I am not such expert to see any problem 
with
>> data. Seems to me, that something has to be wrong with nlsList function.
>> 
>> >> wheat.list <- nlsList(Prop ~ SSlogis(end,Asym, xmid, scal), data=wlg)
>> > Warning message:
>> > 6 times caught the same error in lm.fit(x, y, offset = offset, 
singular.ok =
>> singular.ok, ...): NA/NaN/Inf in 'x'
>> 
>> > produces empty list. So maybe others could find some problem.
>> 
>> > Cheers
>> > Petr
>> 
>> Thank you, Petr,  for the reproducible example.
>> 
>> This indeed can be traced back to a bug in SSlogis() that has been there 
since
>> Doug Bates added the 'nls' package on Martin's day 1999 (==> before R 
version
>> 1.0.0 !) to R (svn rev 6455).
>> 
>> It's this part which does contain a thinko (by whomever, only Doug may 
be able
>> to remember/recall, but I guess it would have been José Pinheiro, then 
the grad
>> student doing the nitty gritty), I have added 2 '--' below to 
make the 2
>> important lines stand out :
>> 
>> z <- xy[["y"]]
>> -
>> if (min(z) <= 0) { z <- z - 1.05 * min(z) } # avoid zeroes
>> z <- z/(1.05 * max(z)) # scale to within unit height
>> -
>> xy[["z"]] <- log(z/(1 - z))# logit transformation
>> aux <- coef(lm(x ~ z, xy))
>> 
>> the first of the 2 lines,  commented   "avoid zeroes"
>> is obviously *not* avoiding zeroes in the case min(z) == 0 , and even 
though the
>> 2 lines  should rescale an interval  [0, M] to [eps, 1 - eps] they don't 
in this case.
>> 
>> Consequently, the next line  log(z/(1 - z))  transforms the 0s into -Inf 
 and these
>> lead to the warning (or error in nls())
>> 
>> " NA/NaN/Inf in 'x' "
>> 
>> One could fix this up by  replacing  min(z)  by  min(z, -1e-7) which may 
be best
>> for pure back compatibility, but I really don't like it either :
>> 
>> The famous  logit - transform   log( z / (1-z))
>> is really anti-symmetric around z = 1/2 , in particular should treat 0 
and 1
>> "symmetrically" and I find it ugly that the previous fixup (our two 
ominous
>> lines) is not at all symmetric wrt 1/2, notably the 2nd transformation 
is made
>> unconditionally but the first one not.
>> 
>> Fortunately, the same source file, /src/library/stats/R/zzzModels.R
>> also defines  the SSfpl()  == 4-parameter logistic model
>> 
>> and there, the 'init' function needs to do the same scaling to (0, 1)  
a

Re: [R] x-axis tick marks on log scale plot

2016-05-20 Thread Martin Maechler
> Brian Smith 
> on Thu, 19 May 2016 11:04:55 -0400 writes:

> Thanks all !!  On Thu, May 19, 2016 at 9:55 AM, Ivan
> Calandra  wrote:

>> Hi,
>> 
>> You can do it by first plotting your values without the
>> x-axis: plot(x,y,log="xy", xaxt="n")
>> 
>> and then plotting the x-axis with ticks where you need to:
>> axis(side=1, at=seq(2000,8000,1000))

For getting nicer looking axis ticks on log-scale axes (in
traditional graphics) I have created the function
   eaxis()
and the utility function
   pretty10exp()

and I also created standard R's  axTicks()  to help with these.

if(!require("sfsmisc")) install.packages("sfsmisc")
require("sfsmisc")

x <- lseq(1e-10, 0.1, length = 201)
plot(x, pt(x, df=3), type = "l", xaxt = "n", log = "x")
eaxis(1)

gives the attached plot



eaxis-log-example.pdf
Description: Adobe PDF document
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] if else condition - help

2016-05-23 Thread Martin Maechler
>>>>> jim holtman 
>>>>> on Sun, 22 May 2016 16:47:06 -0400 writes:

> if you want to use 'ifelse', here is a way:

hmm, why should he want that ?
The OP did mention that it's about somewhat large objects, so
efficiency is one of the considerations :

ifelse() is often convenient and nicely self-explaining, but it 
is (because of its generality, but also by its definition)
much *less efficient* than the (sometimes slightly less
convenient) ways you were shown previously in this thread :

- For the generalized case findInterval() is order of magnitudes
  better, and
- for the simple case you were shown to use logical indexing,
  i.e., calls à la   x[x > k] <- ..


In summary:
   Use  ifelse()  much less -- notably if writing
   functions/code which should scale !
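
A minimal sketch of both alternatives (vector x and cutpoint k made up):

   x <- runif(1e6) ; k <- 0.5
   ## instead of   y <- ifelse(x > k, "big", "small") :
   y <- rep("small", length(x))
   y[x > k] <- "big"                            # logical indexing
   ## and for several cutpoints, findInterval() generalizes this:
   grp <- findInterval(x, c(0.25, 0.5, 0.75))   # 0,1,2,3 per interval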


Martin Maechler
ETH Zurich

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] asking for large memory - crash running rterm.exe on windows

2016-05-28 Thread Martin Maechler
> Ben Bolker 
> on Sat, 28 May 2016 15:42:45 + writes:

> Anthony Damico  gmail.com> writes:
>> 
>> hi, here's a minimal reproducible example that crashes my
>> R 3.3.0 console on a powerful windows server.  below the
>> example, i've put the error (not crash) that occurs on R
>> 3.2.3.
>> 
>> should this be reported to http://bugs.r-project.org/ or
>> am i doing something silly?  thanx


> From the R FAQ (9.1):

> If R executes an illegal instruction, or dies with an
> operating system error message that indicates a problem in
> the program (as opposed to something like “disk full”),
> then it is certainly a bug.

>   So you could submit a bug report, *or* open a discussion
> on r-de...@r-project.org (which I'd have said was a more
> appropriate venue for this question in any case) ...

Indeed.
In this case, this is a known problem -- not just of R, but of
many programs that you can run ---
You are requesting (much) more memory than your computer has
RAM, and in this situation -- depending on the OS ---
your computer will kill R (what you saw) or it will become
very slow trying to shove all memory to R and start swapping
(out to disk other running / sleeping processes on the
computer).

Both is very unpleasant...
But it is you as R user who asked R to allocate an object of
about 41.6 Gigabytes (26 * 1.6, see below).

As Ben mentioned this may be worth a discussion on R-devel ...
or you rather follow up the existing thread opened by Marius
Hofert  three weeks ago, with subject
 "[Rd] R process killed when allocating too large matrix (Mac OS X)"

  -->  https://stat.ethz.ch/pipermail/r-devel/2016-May/072648.html
 
His simple command to "crash R" was

   matrix(0, 1e5, 1e5)

which for some of use gives an error such as

> x <- matrix(0, 1e5,1e5)
Error: cannot allocate vector of size 74.5 Gb

but for others it had the same effect as your example.
BTW: I repeat it here in a functionalized form with added
 comments which makes apparent what's going on:


## Make simple data.frame
mkDf <- function(grpsize, wrongSize = FALSE) {
ne <- (if(wrongSize) 26 else 1) * grpsize
data.frame(x = rep(LETTERS, each = ne),
   v = runif(grpsize*26), stringsAsFactors=FALSE)
}

g1 <- ceiling(10^5/26)
d1 <- mkDf(g1) # works fine
str(d1)
## 'data.frame':   100022 obs. of  2 variables:

dP <- mkDf(g1, wrong=TRUE)# mis-matching the number of elements

str(dP) # is 26 times larger
## 'data.frame': 2600572 obs. of  2 variables: .


# make this much bigger
gLarge <- ceiling(10^8/26)

dL <- mkDf(gLarge) # works "fine" .. (well, takes time!!)
str(dL)
## 'data.frame': 100000004 obs. of  2 variables:
as.numeric(print(object.size(dL)) / 1e6)
## 1600002088 bytes
## [1] 1600.002  Mega, i.e.,  1.6 GBytes

## Well, this will be 26 times larger than the already large one
## ==> your R may crash *OR* your computer may basically slow down
## to a crawl, when R requests all its memory...
if(FALSE) ## ==> do *NOT* evaluate the following lightly !!
    dLL <- mkDf(gLarge, wrong=TRUE)
# CONSOLE CRASH WITHOUT EXPLANATION
# C:\Users\AnthonyD>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Getting Rid of NaN in ts Object

2016-05-28 Thread Martin Maechler

> Perfect!
> Exactly what I was looking for.
> Thanks

> Lorenzo

> On Fri, May 27, 2016 at 01:50:03PM +0200, Christian Brandstätter wrote:
>> Hi Lorenzo,
>> 
>> Try:
>> 
>> tt[is.nan(tt)] <- NA
>> tt <- na.omit(tt)
>> 

or simply  na.omit(tt)

as it omits both NA and NaN (and *does* keep the 'ts' properties
as you have noted).
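
For instance (toy series made up; note that na.omit() on a 'ts' only
drops leading/trailing values -- dropping internal ones would break the
regular spacing and gives an error):

  tt <- ts(c(NaN, 2, 3, 4, NA))
  na.omit(tt)   # drops the NaN and the NA; the result is still a 'ts'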

Martin

>> Best,
>> 
>> Christian
>>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Making an if condition variable ?

2016-06-01 Thread Martin Maechler
> Jim Lemon 
> on Thu, 2 Jun 2016 13:03:01 +1000 writes:

> Hi ce,

> a<-10
> condition<-expression("a>0")
> if(eval(parse(text=condition))) cat("a>0\n")

While this may answer the question asked,
the above is *not* good advice, excuse me, Jim :

> fortune(106)

If the answer is parse() you should usually rethink the question.
   -- Thomas Lumley
  R-help (February 2005)

> fortune(181)

Personally I have never regretted trying not to underestimate my own future 
stupidity.
   -- Greg Snow (explaining why eval(parse(...)) is often suboptimal, answering 
a question
  triggered by the infamous fortune(106))
  R-help (January 2007)

-

Good advice would emphasize to use  expressions rather than
strings and yes that's a bit more sophistication.

But it's worth it.
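
A minimal sketch of what that can look like (names made up):

  a <- 10
  cond <- quote(a > 0)      # an unevaluated R expression, not a string
  if (eval(cond)) cat("a is bigger\n")
  ## or, often cleaner, wrap the condition in a function:
  isPos <- function() a > 0
  if (isPos()) cat("a is bigger\n")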
Martin


> 
> Jim

> On Thu, Jun 2, 2016 at 12:30 PM, ce  wrote:
>> 
>> Dear all,
>> 
>> I want to make an if condition variable like :
>> 
>> a = 10
>> CONDITION = " a > 0 "
>> 
>> if ( CONDITION ) print(" a is bigger" )
>> 
>> I tried get , getElement , eval without success ?
>> 
>> Thanks
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stats package

2016-06-03 Thread Martin Maechler
>>>>> Dariush Ashtab 
>>>>> on Thu, 2 Jun 2016 17:57:14 +0430 writes:

["accidentally" to R-core .. should have gone to R-help -- and
 hence answered here]
 
> Dear R project I need stats package for optimization in
> simulated annealing but i can not download.  please guide
> me.

If you install R in any "normal way", it comes with 29 R
packages.

Inside, R, you get their names via

> rownames(installed.packages(.Library, priority=c("base","recommended")))
 [1] "base"   "boot"   "class"  "cluster""codetools" 
 [6] "compiler"   "datasets"   "foreign""graphics"   "grDevices" 
[11] "grid"   "KernSmooth" "lattice""MASS"   "Matrix"
[16] "methods""mgcv"   "nlme"   "nnet"   "parallel"  
[21] "rpart"  "spatial""splines""stats"  "stats4"
[26] "survival"   "tcltk"  "tools"  "utils" 
> 

A subset of  14  of these, the "base" ones, are bundled with R,
and entirely part of the source code of R, and hence cannot be 
installed separately :

> rownames(installed.packages(.Library, priority="base"))
 [1] "base"  "compiler"  "datasets"  "graphics"  "grDevices" "grid" 
 [7] "methods"   "parallel"  "splines"   "stats" "stats4""tcltk"
[13] "tools" "utils"

Package 'stats' is among these and even is among those packages
that are loaded and *attached* (to your "search path") by default,
when you start R.


> Many thanks from any pointers,
> Dariush

You are welcome,
Martin

Martin Maechler
ETH Zurich

> -- 
> Dariush Ashtab
> ( MS.c in Environment Assessment & Planning )
> Master Student , Department of Environment, Faculty of Natural Resources,
> Tarbiat Modares University (T.M.U.), Noor, Iran.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Install R version 3.3 on Debian server

2016-07-18 Thread Martin Maechler
>>>>> Mbr Mbr 
>>>>> on Mon, 18 Jul 2016 11:45:00 +0200 writes:

> Hello,
> I'm currently in a internship and I'm working on a Debian server.
> However, I have some errors in my scripts and I think it's because of the
> version of R installed on the server (works well on my PC with the latest
> version of R on Windows aka 3.3.1).

> Here are the details from sessionInfo() of the server :

> - R version 2.15.1 (2012-06-22)
> - Platform : i486-pc-linux-gnu (32 bit)

Yes,  2.15.1  is considered antique, and nobody should be using
it except for "Rchaeology" (the pre-history and history of R
implementations), for which I do occasionally start such a version.

> Since I'm totally a beginner about Linux server and my internship's tutor
> doesn't want to read the tutorial to install R for Debian (even if the
> server is his) or help for this tank for some unknown reason, I would like
> to know how to upgrade the current R version on the server (from 2.15.1 to
> 3.3).

Very good proposal.
Two things:

- We have a *dedicated* mailing list for Debian (and
  Debian-derived such as "ubuntu") Linux distributions:
  --> https://www.R-project.org/mail.html does list the "SIG"
  (Special Interest Groups), including R-SIG-Debian
  which points to http://stat.ethz.ch/mailman/listinfo/r-sig-debian

- CRAN itself has sections about this, notably
  https://cloud.r-project.org/bin/linux/debian/
  will tell you how to tell your debian system to get R from
  CRAN as opposed from the debian repositories.
  

> Thanks in advance

You are welcome,
Martin Maechler

> Sincelerly,
> Mbr

> [[alternative HTML version deleted]]

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ... distribution based on mean values and quantiles in R?

2016-07-19 Thread Martin Maechler
>>>>> Jim Lemon 
>>>>> on Tue, 19 Jul 2016 16:20:49 +1000 writes:

> Hi Daniel,
> Judging by the numbers you mention, the distribution is either very
> skewed or not at all normal. If you look at this:

> plot(c(0,0.012,0.015,0.057,0.07),c(0,0.05,0.4,0.05,0),type="b")

> you will see the general shape of whatever distribution produced these
> summary statistics. Did the paper give any hints as to what the model
> distribution might be?

Yes, that's the correct question:  At first, it's not about
plotting, but about *fitting* a distribution with the desired
properties; then you can easily visualize it.

So, what were the data?  If they are 'amounts' of any kind, they
are necessarily non-negative often always positive, and hence
--- according to John W Tukey --- should be analyzed after
taking logs (Tukey's "first aid transformation" for *any*
amounts data).

Taking logs and analyzing them means considering a normal
("Gaussian") distribution for the log(), which is
equivalent to fitting a lognormal distribution -- R functions
[dpqr]lnorm() -- to the original data.  I'd strongly recommend doing that.

And I did so, finding out, however that if indeed it is the
*mean* and the 15% and 95% quantiles,  a log normal is not
fitting.  Here is the reproducible R code .. with lots of
comments :


##
## MM strongly recommends fitting a log-normal distribution ..
## however, it does *not* fit

## The "data statistics"
qlN <- c(q15 = 0.012, q85 = 0.057) # Quantiles original scale
mlN <- 0.015

(qn <- log(qlN))   # Quantiles on log scale [assumed normal (Gaussian) here]
## as the Gaussian is symmetric, the mid value of the two quantiles
## is the mean and median :
(mu <- mean(qn))   # -3.644
(medlN <- exp(mu)) # 0.02615
## an estimate for the median(.)  -- but  it is *larger* than the mean = 0.015 
above !
## ===> Log-normal does *NOT* fit well :

## From the help page, we learn that
##        E[lN] = exp(mu + 1/2 sigma^2)     {and it is trivial that}
##   median[lN] = exp(mu)
## where here, our medlN is a (moment) estimate of median[lN]

## If the numbers were different, this would solve the problem :
## Consequently, a (moment / plugin) estimate for sigma is
(e12sig <- mlN / medlN)  ## ~= exp( 1/2 sigma^2)
(sig2 <- 2*log(e12sig))  ## ~= sigma^2  [--- is *NEGATIVE* (<==> 'est.median' > mean !)]
(sig  <- sqrt(sig2))     ## ~= sigma    [here of course 'NaN' with a warning !]


My conclusion would be that other distributions (than the
log-normal; the normal is out of the question !!) have to be
considered, if you want to be serious about the topic.

Maybe the poweRlaw package (https://cloud.r-project.org/package=poweRlaw)
may help you (it has 4 vignettes, the last being a nice JSS publication).

The above is a "cute" non-standard problem in any case: to fit very skewed
distributions, given two quantiles and the mean only, and the
approach taken by the "poweRlawyers", namely to minimize the KS
(Kolmogorov-Smirnov) discrepancy, seems a good start to me.

Martin Maechler,
ETH Zurich



> Jim


> On Tue, Jul 19, 2016 at 7:11 AM, gcchenleidenuniv
>  wrote:
>> Hi all,
>> 
>> I need to draw density curves based on some published data. But in the 
article only mean value (0.015 ) and quantiles (Q0.15=0.012 , Q0.85=0.057) were 
given. So I was thinking if it is possible to plot density curves solely based 
on the mean value and quantiles. The dnorm(x, mean, sd, log) function needs the 
standard deviation which was not mentioned, so it can not be used in this 
situation.
>> 
>> Many thanks!!
>> Daniel
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readline issue with 3.3.1

2016-07-21 Thread Martin Maechler
>>>>> Ralf Goertz 
>>>>> on Wed, 20 Jul 2016 16:37:53 +0200 writes:

> On Wed, 20 Jul 2016 11:35:31 +0200
> Ralf Goertz wrote:

>> Hi,
>> 
>> after a recent update to version 3.3.1 on Opensuse Leap I have
>> problems with command lines longer than the terminal width. E.g. when
>> I do this

> I installed readline version 6.3 and the issue is gone. So probably some
> of the recent changes in R's readline code are incompatible with version
> readline version 6.2.

Yes, it seems so, unfortunately.

Thank you for reporting !

Our plan had been different: the NEWS entry for 3.3.1 (among 'BUG FIXES') says

• Use of Ctrl-C to terminate a reverse incremental search started
  by Ctrl-R in the readline-based Unix terminal interface is now
  supported when R was compiled against readline >= 6.0 (Ctrl-G
  always worked).  (PR#16603)

So we had hoped that change (fixing a bug you *should* have been
able to see with readline 6.2, as well) would work correctly in all
versions of readline >= 6.0,
but evidently it did not in yours.

Martin Maechler
ETH Zurich / R Core Team

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] C/C++/Fortran Rolling Window Regressions

2016-07-21 Thread Martin Maechler
>>>>> jeremiah rounds 
>>>>> on Thu, 21 Jul 2016 13:56:17 -0700 writes:

> I appreciate the timing, so much so I changed the code to show the issue.
> It is a problem of scale.

> roll_lm probably has a heavy start-up cost but otherwise completely
> out-performs those other versions at scale.  I suspect you are timing the
> nearly  constant time start-up cost in small data.  I did give code to
> paint a picture, but it was just cartoon code lifted from stackexchange.
> If you want to characterize the real problem it is closer to:
> 30 day rolling windows on 24 daily (by hour) measurements for 5 years with
> 24+7 -1 dummy predictor variables and finally you need to do this for 300
> sets of data.

> Pseudo-code is closer to what follows and roll_lm can handle that input in
> a timely manner.  You can do it with lm.fit, but you need to spend a lot 
of
> time waiting.  The issue of accuracy needs a follow-up check.  Not sure 
why
> it would be different.  Worth a check on that.

Just a note about lm(), lm.fit():
As Achim Zeileis noted, they are very reliable implementations
(based on QR) indeed.   For a year or so now, there's the bare
bone  .lm.fit()  which is even an order of magnitude
faster than lm.fit(), at least for simple cases; see the final
example on the  ?.lm.fit help page.
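
A small sketch (random data; note that .lm.fit() wants the model matrix
built by hand, including the intercept column):

  set.seed(1)
  X <- cbind(1, matrix(rnorm(200), ncol = 2))  # intercept + 2 predictors
  y <- rnorm(100)
  cf <- .lm.fit(X, y)$coefficients
  all.equal(cf, unname(coef(lm(y ~ X[,2] + X[,3]))))  # TRUE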

Martin Maechler
(who had added .lm.fit() to R )


> Thanks,
> Jeremiah


> library(rbenchmark)
> N = 30*24*12*5
> window = 30*24
> npred = 15  #15 chosen arbitrarily...
> set.seed(1)
> library(zoo)
> library(RcppArmadillo)
> library(roll)
> x = matrix(rnorm(N*(npred+1)), ncol = npred+1)
> colnames(x) <- c("y",  paste0("x", 1:npred))
> z <- zoo(x)


> benchmark(
> roll_lm =  roll_lm(coredata(z[, 1, drop = F]), coredata(z[, -1, drop =
> F]), window,
> center = FALSE), replications=3)

> Which comes out as:
>      test replications elapsed relative user.self sys.self user.child sys.child
> 1 roll_lm            3   6.273        1    38.312    0.654          0         0





> ## You arn't going to get that below...

> benchmark(fastLm = rollapplyr(z, width = window,
> function(x) coef(fastLm(cbind(1, x[, -1]), x[, 1])),
> by.column = FALSE),
> lm = rollapplyr(z, width = window,
> function(x) coef(lm(y ~ ., data = as.data.frame(x))),
> by.column = FALSE), replications=3)



> On Thu, Jul 21, 2016 at 1:28 PM, Gabor Grothendieck 
> wrote:

>> I would be careful about making assumptions regarding what is faster.
>> Performance tends to be nonintuitive.
>> 
>> When I ran rollapply/lm, rollapply/fastLm and roll_lm on the example
>> you provided rollapply/fastLm was three times faster than roll_lm.  Of
>> course this could change with data of different dimensions but it
>> would be worthwhile to do actual benchmarks before making assumptions.
>> 
>> I also noticed that roll_lm did not give the same coefficients as the
>> other two.
>> 
>> set.seed(1)
>> library(zoo)
>> library(RcppArmadillo)
>> library(roll)
>> z <- zoo(matrix(rnorm(10), ncol = 2))
>> colnames(z) <- c("y", "x")
>> 
>> ## rolling regression of width 4
>> library(rbenchmark)
>> benchmark(fastLm = rollapplyr(z, width = 4,
>> function(x) coef(fastLm(cbind(1, x[, 2]), x[, 1])),
>> by.column = FALSE),
>> lm = rollapplyr(z, width = 4,
>> function(x) coef(lm(y ~ x, data = as.data.frame(x))),
>> by.column = FALSE),
>> roll_lm =  roll_lm(coredata(z[, 1, drop = F]), coredata(z[, 2, drop =
>> F]), 4,
>> center = FALSE))[1:4]
>> 
>> 
>> test replications elapsed relative
>> 1  fastLm          100    0.22    1.000
>> 2      lm          100    0.72    3.273
>> 3 roll_lm          100    0.64    2.909
>> 
>> On Thu, Jul 21, 2016 at 3:45 PM, jeremiah rounds
>>  wrote:
>> >  Thanks all.  roll::roll_lm was essentially what I wanted.   I think
>> maybe
>> > I would prefer it to have options to return a few more things, but it 
is
>> > the coefficients, and the remaining statistics you might want can be
>> > calculated fast enough from there.
>> >
>> >
>> > On Thu, Jul 21, 2016 at 12:36 PM, Achim Zeileis <
>> achim.zeil...@uibk.ac.at>
>> > wrote:
>> >
>> >> Jeremiah

Re: [R] readline issue with 3.3.1

2016-07-24 Thread Martin Maechler
>>>>> Ralf Goertz 
>>>>> on Fri, 22 Jul 2016 10:15:36 +0200 writes:

> Am Thu, 21 Jul 2016 18:07:43 +0200 schrieb Martin Maechler
> :

>> Ralf Goertz  on Wed, 20 Jul 2016
>> 16:37:53 +0200 writes:
 
>>> I installed readline version 6.3 and the issue is
>>> gone. So probably some of the recent changes in R's
>>> readline code are incompatible with version readline
>>> version 6.2.
>> 
>> Yes, it seems so, unfortunately.
>> 
>> Thank you for reporting !

> It would be great if – while fixing this – you also took
> care of the SIGWINCH problem described in bug report
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16604

Well, that *has* been fixed in 'R-devel' and 'R 3.3.1 patched'
..  but again only for readline >= 6.3

(The release process of 3.3.1 was already too far advanced, and
 such source changes are delicate.)

It is quite a bit of work to get correct code for the different
versions of readline, also dealing with the fact that e.g. on
Mac OS X the code needs to work with the libedit/editline
"substitute".

One reason we have not closed the bug is that the fix is only
for readline >= 6.3, ... and also that I don't think we got much
user confirmation of the form
   "yes, the bug goes away once I compile R-devel or R-patched".

> Thanks, Ralf

You are welcome, Martin


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Estimators for non-normal data

2016-07-25 Thread Martin Maechler
>>>>> Nika Sušac 
>>>>> on Sat, 23 Jul 2016 19:39:48 +0200 writes:

> Hi!  I have non-normal data (items are continuous on a
> 9-point scale, but not normally distributed) and I want to
> conduct cfa. Which of the estimators available in lavaan
> do you recommend me to use?  Thanks in advance!

I think you want *robust* statistical methods then.

Robust factor analysis (if 'cfa' means confirmatory factor
analysis or something like it) is a somewhat prominent topic.
Simple approaches will already be available by using
MASS::cov.rob() for a robust covariance matrix which you then
can pass to other methods.
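
For instance (a minimal sketch, assuming a numeric data matrix  X
and, say, 3 factors; my addition, not from the original post):

  library(MASS)
  rc <- cov.rob(X)  # robust estimate of location and covariance
  ## feed the robust covariance into a classical factor analysis:
  fa <- factanal(covmat = rc$cov, factors = 3, n.obs = nrow(X))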

For more (and more modern) methods and approaches,

- for R packages, I'd recommend you consult the CRAN task view
  about robust statistical methods,
https://cran.r-project.org/web/views/Robust.html
  notably the section on 'Multivariate Analysis'

- for more specific help and expertise, I strongly recommend
  the dedicated  R-SIG-robust mailing list (instead of R-help),
   --> https://stat.ethz.ch/mailman/listinfo/r-sig-robust

Best regards,
Martin Maechler, ETH Zurich

>   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm() silently drops NAs

2016-07-26 Thread Martin Maechler
I have been asked (in private)

> Hi Martin,

y <- c(1, 2, 3, NA, 4)
x <- c(1, 2, 2, 1, 1)

t.test(y ~ x)
lm(y ~ x)

> Normally, most R functions follow the principle that
> "missings should never silently go missing". Do you have
> any background on why these functions (and most model/test
> function in stats, I presume) silently drop missing values?

And I think, the issue and an answer are important enough to be
public, hence this posting to R-help :

First note that in some sense it is not true that lm(y ~ x) silently drops 
NAs: Everybody who is taught about lm() is taught to look at
   summary( fm )  where  fm <- lm(y ~ x)

and that (for the above case)  "says"

 ' (1 observation deleted due to missingness) '

and so is not entirely silent.


This goes all back to the idea of having an 'na.action' argument
which may be a function (a very good idea, "functional
programming" in a double sense!)... which Chambers et al
introduced in The White Book (*1) and which, as I seem to remember,
was quite a revolutionary idea; at least I had liked that very much
once I understood the beauty of passing functions as arguments
to other functions.
One problem already back then has been that we already had the
---much more primitive but often sufficient--- standard of an
'na.rm = FALSE' (i.e. default FALSE) argument.

This has been taught in all good classes/courses about statistical
modeling with S and R ever since ... I had hoped
(but it seems I was too optimistic, .. or many students have too
 quickly forgotten what they were taught ..).
Notably the white book itself, and the MASS (*2) book, do teach
this .. though possibly not loudly enough.

Two more decisions about this were made back then, as well:

  1) The default for na.action to be na.omit  (==> "silently dropping")

  2) na.action being governed by  options(na.action = ..)

'1)' may have been mostly "historical": I think it had been the behavior of
other "main stream" statistical packages back then (and now?) *and*,
possibly more importantly, notably with the later (than the white book = "S3")
advent of na.exclude, you did want to keep the missings in your
data frame for later analysis, e.g. drawing (-> "gaps" in
plots), so the NAs *were* carried along and would not be
forgotten, something very important in most cases.
   
and '2)' is something I personally no longer like very
much, as it is "killing" the functional paradigm.
OTOH, it has to be said in favor of that "session wide" / "global" setting
  options(na.action = *)
that indeed it depends on the specific data analysis, or even
the specific *phase* of a data analysis, *what* behavior of NA
treatment is desired and at the time it was thought smart
that all methods (also functions "deep down" called by
user-called functions) would automatically use the "currently
desired" NA handling.

There have been recommendations (I don't know exactly where and
by whom) to always set

   options(na.action = na.fail)

in your global .First() or nowadays rather your  Rprofile, and
I assume that some of the CRAN packages and some of the "teaching
setups" would do that (and if you do that, the lm() and t.test()
calls above give an error). 
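
For illustration (a minimal sketch, using the example from the
top of this mail):

  y <- c(1, 2, 3, NA, 4)
  x <- c(1, 2, 2, 1, 1)
  op <- options(na.action = na.fail) # missing values now give an error
  try(lm(y ~ x))  # now an error: "missing values in object"
  options(op)     # restore the previous setting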

Of course, there's much more to say about this, quite a bit of which
has been published in scientific papers and books..

Martin Maechler
ETH Zurich and R Core Team



*1) "The White Book": Chambers and Hastie (1992):
 https://www.r-project.org/doc/bib/R-books_bib.html#R:Chambers+Hastie:1992

*2) "MASS - the book": Venables and Ripley (2002 (= 4th ed.)):
 https://www.r-project.org/doc/bib/R-books_bib.html#R:Venables+Ripley:2002

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm() silently drops NAs

2016-07-30 Thread Martin Maechler
>>>>> peter dalgaard 
>>>>> on Tue, 26 Jul 2016 23:30:31 +0200 writes:

>> On 26 Jul 2016, at 22:26 , Hadley Wickham
>>  wrote:
>> 
>> On Tue, Jul 26, 2016 at 3:24 AM, Martin Maechler
>>  wrote:
>>> 
> ...
>> To me, this would be the most sensible default behaviour,
>> but I realise it's too late to change without breaking
>> many existing expectations.

> Probably.

Well, yes, Hadley's proposal would change too many default outputs;
Still, I tend to think we should at least contemplate changing
the default (mildly).  Before doing that we could at least provide one
or a few new  na.* functions which people could try using  as their own
defaults, using options(na.action = *) __ or __ as I read in the
white book (see below) by

  attr(, "na.action") <-  na.

That is actually something that I have not seen advertised much,
but seems very reasonable to me.

It often does make sense to set a dataset-specific NA handling.

I agree (with Hadley's implicit statement) that 'na.exclude' is
often much more reasonable than 'na.omit' and that was the
reason it was invented *after* the white book, IIRC, for S-PLUS,
not S; and quite advertized at the time (at least by some people
to whom I listened ...)


If we'd change the default na.action, we probably would not want
to go as far as a version of Hadley's (corrected in his 2nd e-mail)

   na.warn <- function(object, ...) {
     missing <- !complete.cases(object)
     if (any(missing)) {
       warning("Dropping ", sum(missing), " rows with missing values",
               call. = FALSE)
     }
     na.exclude(object, ...)
   }

[the warning probably needs to be a bit smarter: you don't always have
 "rows", sometimes it's "entries" in a vector]

but *providing* something like the above, possibly rather named
along the line of

'na.warn.and.keep'

and provide a  'na.warn.and.omit'  (which I would call
'na.warn', as after all it would be very close to the current R
default, na.omit, but would warn additionally).




> Re. the default choice, my recollection is that at the
> time the only choices available were na.fail and na.omit.

> S-PLUS was using na.fail for all the usual good
> reasons, but from a practical perspective, the consequence
> was that, since almost every data set has NA values, you
> got an error unless you added na.action=na.omit to every
> single lm() call. And habitually typing na.action=na.omit
> doesn't really solve any of the issues with different
> models being fit to different subsets and all that. 

yes, indeed; you are right:

- The white book (p. 112-113) indeed explains that 'na.fail' is
  (i.e., was there) the default.

- an important example of where na.omit was not good enough, at
  the time was stepwise regression (where the number of complete
  cases may depend on the current subset of predictor
  variables), or also bootstrapping and crossvalidation.

> So the
> rationale for doing it differently in R was that it was
> better to get some probably meaningful output rather than
> to be certain of getting nothing. And, that was what the
> mainstream packages of the time were doing.

>> On a related note, I've never really understood why it's
>> called na.exclude - from my perspective it causes the
>> _inclusion_ of missing values in the
>> predictions/residuals.

> I think the notion is that you exclude them from the
> analysis, but keep them around for the other purposes.

> -pd

>> 
>> Thanks for the (as always!) informative response, Martin.
>> 
>> Hadley
>> 
>> -- 
>> http://hadley.nz
>> 

> -- 
> Peter Dalgaard, Professor, Center for Statistics,
> Copenhagen Business School Solbjerg Plads 3, 2000
> Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23
> Email: pd@cbs.dk Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] No "number of days" format for 'difftime'?

2016-08-08 Thread Martin Maechler
> Loris Bennett 
> on Mon, 8 Aug 2016 14:12:47 +0200 writes:

> Loris Bennett  writes:
>> Thanks for the link, John.  However, there is a hyphen missing.  It
>> should be:
>> 
>> 
http://www.dummies.com/how-to/content/how-to-read-errors-and-warnings-in-r.html
>> 
>> Appropriately, with the correct URL we find the too often forgotten
>> pearl of wisdom:
>> 
>> "Chances are, you just typed something wrong there."
>> 
>> I think I need this on my coffee cup.
>> 
>> Cheers,
>> 
>> Loris

> Continuing the topic for my future self and others equally poorly versed
> in The Art and Dark Science of Interpreting R Error Messages, if I have
> the following in the file "my_data"

> 1094165  2016-07-24T09:40:02 13-23:03:28  1  COMPLETED 
> 1112076  2016-08-01T14:45:49 6-13:26:15  1  COMPLETED 

> and do

>> d <- read.table("my_data")
>> colnames(d) <- c("jobid","start","elapsed","alloccpus","state")
>> df <- transform(d, start = as.POSIXct(start, format="%Y-%m-%dT%H:%M:%S"),
>>                 elapsed = as.difftime(elapsed, format="%d-%H:%M:%S"))

> I get the following:

> Error in as.difftime(elapsed, format = "%d-%H:%M:%S") : 
> 'tim' is not character or numeric

Well, let me argue that you should have found this to be a *helpful*
error message. You are no complete beginner anymore, right,
so

1) the error is in your use of  as.difftime().

2) ?as.difftime  or  str(as.difftime)
   both clearly indicate that  'tim' is the first argument of as.difftime,

and I really do wonder why you continued with the infamous
"trial-and-error programming technique" instead of reading, or at
least quickly browsing, the relevant reference, i.e., the help page.
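
For instance (a minimal sketch):

  str(as.difftime)
  ## function (tim, format = "%X", units = "auto") -- 'tim' comes first
  as.difftime("23:03:28", format = "%H:%M:%S")  # character input: fine
  ## a factor is neither character nor numeric, hence your error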

Martin


> Remembering that if something is not what you think it is, it is
> probably a factor, I find that

>> d <- read.table(data_file,stringsAsFactors=FALSE)

> causes the error to go away.

> So what's all this 'tim' business?  A quick squint at the source code of
> datetime.R reveals the following line:

> if (!is.numeric(tim)) stop("'tim' is not character or numeric")

> So the error message tells me something about the arbitrary name of the
> variable 'tim', which could also have been 'tom', 'dick', or 'harriet',
> but nothing about the value.

> So follow Rolf Turner's fortune(350) advice, then look at the source
> code, and then realise that you weren't being totally dopey in not
> understanding the error message.


> Loris

> -- 
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Diethelm Wuertz (founder 'Rmetrics')

2016-08-10 Thread Martin Maechler
Dear colleagues

We are deeply saddened to inform you that Diethelm and his wife Barbara Würtz
(ascii-fied 'Wuertz') died in a car accident during their vacation in
Hungary last week. 

Both Diethelm and Barbara worked at ETH Zurich,
Diethelm as a researcher and teacher in the Institute for Theoretical
Physics and Barbara as an advisor in the Human Resources division.

In the "R world", Diethelm has been known as an enthousiastic
entrepreneur, having built the "Rmetrics" project (http://www.rmetrics.org) 
quite early in the history of R.  He organized much-liked
workshops / summer schools about "R in Finance" at Meielisalp in
the Swiss mountains, and in cities such as Zurich, Paris or Mumbay. 

Many of us have known Barbara and Diethelm as generous and amiable people
who enjoyed hosting others, both at such workshops and privately.
They leave their son, Fabian, and their daughter-in-law Viktoria.
Our thoughts are with them. 

Sincerely,

Martin Maechler, ETH Zurich (Seminar for Statistics)
Adrian Trapletti, Uster Metrics

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error while fitting gumbel copula

2016-08-10 Thread Martin Maechler
>>>>> Isaudin Ismail 
>>>>> on Tue, 9 Aug 2016 14:30:18 +0100 writes:

> Dear R experts,
> I have 5 time series of data (A, B, C, D and E) with all same lengths. All
> series exhibit gamma distribution except for B which is lognormal
> distribution. I am using copula package to model the joint-distribution of
> these 5 times series.

> I have selected Archimedean copula and successfully fitted  Frank and
> Clayton copula. The problem is when trying to fit Gumbel copula.

> The following are the codes I used to run in R.

> # Data of 5 time series

> A <- A
> B <- B
> C <- C
> D <- D
> E <- E

well, the above is really an "interesting" block of R code ;-)

--

More seriously, please learn to use reproducible examples,
e.g., from here
  http://bit.ly/MRE_R (nice to remember: MRE = Minimal Reproducible Example)
or here
  http://adv-r.had.co.nz/Reproducibility.html

then we will be glad to help you,
notably I as maintainer of the package 'copula' which you are
using (without saying so).

With regards,
Martin Maechler


> # Combined between A and C
> A+C <- A + C

> gumbel.copula <- gumbelCopula(dim = 5)
> m <- pobs(as.matrix(cbind(A+C, B, D, E)))
> fit.gumbel<- fitCopula(gumbel.copula, m, method = 'ml')

> And the error while trying to fit gumbel copula:

> Error in optim(start, loglikCopula, lower = lower, upper = upper, method =
> method,  :
> non-finite finite-difference value [1]
> In addition: Warning message:
> In .local(copula, tau, ...) : tau is out of the range [0, 1]

> Appreciate all help!

> Many thanks,
> Isaudin

> [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error while fitting gumbel copula

2016-08-23 Thread Martin Maechler
>>>>> Isaudin Ismail 
>>>>> on Thu, 18 Aug 2016 17:03:50 +0100 writes:

> Dear Martin, Following my earlier question on "error while
> fitting gumbel copula", I have also crated a new gist at
> https://gist.github.com/anonymous/0bb8aba7adee550d40b840a47d8b7e25 
> for easy checking and copying codes.

> I got no problem fitting other Archimedean copulas except
> gumbel copula as per my code I used above.

> Appreciate your kind help.

> Many thanks, Isaudin


> On Mon, Aug 15, 2016 at 4:28 PM, Isaudin Ismail
>  wrote:

>> Dear Dr. Martin,
>> 
>> I'm glad that you replied to my queries.
>> 
>> As advised, I have prepared the following:

MM: I'm including (cut'n'pasting) my commented and augmented version here:

--

## From: Isaudin Ismail 
## To: Martin Maechler 
## CC: 
## Subject: Re: [R] Error while fitting gumbel copula
## Date: Mon, 15 Aug 2016 16:28:14 +0100

## Dear Dr. Martin,

## I'm glad that you replied to my queries.
## As advised, I have prepared the following:

library(copula)

## 5 series of data, A, B, C, D and E
A <- c(0.849420849, 0.900652985, 0.97144217, 0.817888428, 0.877901578,
   1.070040669, 0.889742431, 0.87588968, 0.853541938, 0.848664688,
   0.876830319, 0.749582638, 0.818515498, 0.890997174, 0.794766966,
   0.784794851, 0.814858959, 1.074396518, 0.83752495, 0.894341116,
   0.880375293, 0.900816803)

B <- c(0.479850746, 0.652111668, 1.880607815, 0.579902303, 0.50669344,
   0.747560182, 0.701754386, 0.48969697, 0.346751006, 0.379234973,
   0.862691466, 0.328280188, 0.317312661, 0.534438115, 0.487002653,
   0.335043612, 0.373346897, 0.627520161, 0.792114695, 0.938253012,
   0.444553967, 0.625972763)

C <- c(0.693491124, 0.866523143, 4.585714286, 1.512055109, 0.387755102,
   0.513435701, 0.76252505, -0.113113113, 0.338521401, 0.333951763,
   0.668755595, 0.401273885, 0.419868791, 0.272885789, 0.541541542,
   0.32751938, 0.386409736, 0.957446809, 0.861195542, 1.531632653,
   0.431610942, 1.226470588)

D <- c(0.807792208, 0.548547718, 0.738232865, 0.542247744, 1.088964927,
   0.862385321, 0.60720268, 1.000816993, 0.699289661, 0.41723356,
   0.604037267, 0.605003791, 0.698940998, 0.764792899, 0.647897898,
   0.825256975, 0.767476085, 0.941391941, 0.889547813, 0.324503311,
   0.942435424, 0.740686633)

E <- c(1.077598829, 0.318507891, 1.152616279, 0.930397727, 1.515994437,
   0.940689655, 0.880886427, 1.054274084, 1.067282322, 0.677419355,
   0.966233766, 0.761029412, 1.05734767, 0.615925059, 1.061988304,
   1.07184241, 1.058890147, 1.123873874, 1.304891923, -0.069584736,
   1.172757475, 0.501096491)

require(copula)
gumbel.copula <- gumbelCopula(dim = 2)
p <- pobs(cbind(D + E, A + B + C))

fit.gumbel <- fitCopula(gumbel.copula, p, method = "ml")

## The error is here when trying to fit the gumbel copula

# I got the following error:
## Error in optim(start, loglikCopula, lower = lower, upper = upper, method =
## method,  :
##  non-finite finite-difference value [1]
##   In addition: Warning message:
##   In .local(copula, tau, ...) : tau is out of the range [0, 1]

## MM:  my version of copula gives the error message  "some tau < 0"
## --  -
## and indeed:
(tau.p <- cor(p[,1], p[,2], method="kendall"))
## [1] -0.1428571
## ^-  Kendall's tau is  -1/7  < 0  ... and that is not good for Gumbel!

plot(p)

##-

So, you tried fitting to *negatively* correlated data; if
you use the default method instead of "ml", the copula is fit and uses
param = 1 (which corresponds to the *independence* copula):
among all the (weakly) positively correlated Gumbel
copulas, the boundary case param = 1 (<==> tau = 0) is the best
fitting one.

What you can do is to "rotate" the data (actually mirror it),
and fit a gumbel copula, which now works nice and easily :

p2 <- p; p2[,2] <- 1-p2[,2]
(tau.p2 <- cor(p2, method="kendall"))
## --> now positively correlated
## --->
gumb.ml.p2 <- fitCopula(gumbel.copula, p2, method = "ml")
summary(gumb.ml.p2) # looks fine now :

  Call: fitCopula(copula, data = data, method = "ml")
  Fit based on "maximum likelihood" and 22 2-dimensional observations.
  Gumbel copula, dim. d = 2 
        Estimate Std. Error
  param    1.121      0.209
  The maximized loglikelihood is 0.1839 
  Optimization converged
  Number of loglikelihood evaluations:
  function gradient
         6        6


---

The next versi

Re: [R] loading .rda file

2016-08-30 Thread Martin Maechler
> Jeff Newmiller 
> on Tue, 30 Aug 2016 09:36:05 -0700 writes:

> You cannot. However, you can load the file into a dedicated environment
> to keep those names separated from your global environment, e.g. [1]

yes, that's my "famous"  only-allowed use of  attach() :
attach() an rda-file to your search path instead of polluting
your globalenv.
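
For example (a minimal sketch, with a hypothetical file name):

  attach("mydata.rda")  # objects become visible via the search path
  search()              # shows a new entry "file:mydata.rda"
  ## the objects are found by name, but your globalenv stays clean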

  > The saveRDS / loadRDS 

saveRDS / readRDS   are the correct names

> functions are an alternative to handle one object at a time without dragging
> the object names into the picture (you have to name the re-loaded object).

The fact that it is readRDS() and not loadRDS()
actually does convey that it is often "much better", in the sense
of functional / transparent programming, to use this pair in
favor of save() / load() :
readRDS() does *return* what we are interested in, and

   result <- readRDS()

is so much better than   load()   silently overwriting all
kinds of objects in my globalenv.

Indeed, I strongly advocate using  saveRDS() and readRDS()
much more frequently than they are used nowadays.
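
For example (a minimal, self-contained sketch):

  x <- rnorm(10)
  saveRDS(x, "x.rds")    # stores the *object*, not its name
  y <- readRDS("x.rds")  # *returns* it; you choose the name
  identical(x, y)        # TRUE

which, incidentally, also answers the original question: read the
object back under whatever name you like.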



> However, the best approach is to write scripts that pull directly from 
your source (non-R) data files. This makes your work process reproducible as 
you develop it.

Yes, but that's sometimes too inefficient.

And then, some of us do simulations and other expensive
computations which we need/want to save and re-read.

and yes,  I *always*  use the equivalent of   q("no")  to leave R; 
never save the workspace or load it at startup, unless
accidentally on non-standard (for me) platforms.

Martin

> [1] https://stat.ethz.ch/pipermail/r-help/2016-August/441078.html
> -- 
> Sent from my phone. Please excuse my brevity.

> On August 30, 2016 7:37:24 AM PDT, Leslie Rutkowski 
 wrote:
>> Hi,
>> 
>> I'm slowly migrating from SAS to R and - for the very first time - I'm
>> working with a native .Rda data file (rather than importing data from
>> other
>> sources). When I load this .Rda file into the global environment using
>> load("file path") I see a data.frame in the global environment called
>> "mydata" that corresponds to the .rda file.
>> 
>> My question: how can I change the name of this data.frame to something
>> of
>> my choosing?
>> 
>> Thanks for considering this very simple question.
>> 
>> Leslie
>> 
>> [[alternative HTML version deleted]]
>> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting Mixture distributions

2016-09-08 Thread Martin Maechler
>>>>> Bert Gunter 
>>>>> on Wed, 7 Sep 2016 23:47:40 -0700 writes:

> "please suggest what can I do to resolve this
> issue."

> Fitting normal mixtures can be difficult, and sometimes the
> optimization algorithm (EM) will get stuck with very slow convergence.
> Presumably there are options in the package to either increase the max
> number of steps before giving up or make the convergence criteria less
> sensitive. The former will increase the run time and the latter will
> reduce the optimality (possibly leaving you farther from the true
> optimum). So you should look into changing these as you think
> appropriate.

I'm jumping in late, without having read everything preceding.

One of the last messages seemed to indicate that you are looking
at mixtures of *one*-dimensional gaussians.

If this is the case, I strongly recommend looking at (my) CRAN
package 'nor1mix' (the "1" is for "*one*-dimensional").

For a while now, that small package has provided an alternative
to the EM, namely direct MLE, simply using optim() where the
likelihood uses a somewhat smart parametrization.

Of course, *as the EM*, this also depends on the starting value,
but my (limited) experience has been that
  nor1mix::norMixMLE()
works considerably faster and more reliably than the EM (which I
also provide as  nor1mix::norMixEM()).

Apropos 'starting value': The help page shows how to use
kmeans() for "somewhat" reliable starts; alternatively, I'd
recommend using cluster::pam() to get a start there.
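
For example (a minimal sketch with simulated data; my addition --
see args(norMixMLE) for the full interface):

  library(nor1mix)
  set.seed(7)
  x <- c(rnorm(200, 0, 1), rnorm(100, 4, 0.7)) # two well-separated components
  fm <- norMixMLE(x, m = 2)  # direct MLE via optim()
  fm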

I'm glad to hear about experiences using these / comparing
these with other approaches.

Martin


--
Martin Maechler,
ETH Zurich


> On Wed, Sep 7, 2016 at 3:51 PM, Aanchal Sharma
>  wrote:
>> Hi Simon
>> 
>> I am facing same problem as described above. i am trying to fit gaussian
>> mixture model to my data using normalmixEM. I am running a Rscript which
>> has this function running as part of it for about 17000 datasets (in 
loop).
>> The script runs fine for some datasets, but it terminates when it
>> encounters one dataset with the following error:
>> 
>> Error in normalmixEM(expr_glm_residuals, lambda = c(0.75, 0.25), k = 2, :
>> Too many tries!

>> (command used: expr_mix_gau <- normalmixEM(expr_glm_residuals,
>> lambda = c(0.75,0.25), k = 2, epsilon = 1e-08, maxit = 1,
>> maxrestarts=200, verb = TRUE))
>> (expr_glm_residuals is my dataset which has residual values for different
>> samples)
>> 
>> It is suggested that one should define the mu and sigma in the command by
>> looking at your dataset. But in my case there are many datasets and it 
will
>> keep on changing every time. please suggest what can I do to resolve this
>> issue.
>> 
>> Regards
>> Anchal
>> 
>> On Tuesday, 16 July 2013 17:53:09 UTC-4, Simon Zehnder wrote:
>>> 
>>> Hi Tjun Kiat Teo,
>>> 
>>> you try to fit a Normal mixture to some data. The Normal mixture is very
>>> delicate when it comes to parameter search: If the variance gets closer 
and
>>> closer to zero, the log Likelihood becomes larger and larger for any 
values
>>> of the remaining parameters. Furthermore for the EM algorithm it is 
known,
>>> that it takes sometimes very long until convergence is reached.
>>> 
>>> Try the following:
>>> 
>>> Use as starting values for the component parameters:
>>> 
>>> start.par <- mean(your.data, na.rm = TRUE) +
>>>              sd(your.data, na.rm = TRUE) * runif(K)
>>> 
>>> For the weights just use either 1/K or the R cluster function with K
>>> clusters
>>> 
>>> Here K is the number of components. Further enlarge the maximum number 
of
>>> iterations. What you could also try is to randomize start parameters and
>>> run an SEM (Stochastic EM). In my opinion the better method is in this 
case
>>> a Bayesian method: MCMC.
>>> 
>>> 
>>> Best
>>> 
>>> Simon
>>> 
>>> 
>>> On Jul 16, 2013, at 10:59 PM, Tjun Kiat Teo >> > wrote:
>>> 
>>> > I was trying to use the normixEM in mixtools and I got this error
>>> message.
>>> >
>>> > And I got this error message
>>> >
>>> > One of the variances is goin

Re: [R] Have help list filters changed recently

2016-09-09 Thread Martin Maechler
> Marc Schwartz 
> on Thu, 8 Sep 2016 22:29:38 -0500 writes:

>> On Sep 8, 2016, at 7:35 PM, Bert Gunter  wrote:
>> 
>> To all:
>> 
>> r-help has been holding up a lot of my recent messages: Have there
>> been any changes to help list filters that caused this? Is there
>> something I'm doing wrong? -- I have made no changes  that I am aware
>> of. Here's what I get:
>> 
>> Your mail to 'R-help' with the subject
>> 
>> Re: [R] with and evaluation [for example]
>> 
>> Is being held until the list moderator can review it for approval.
>> 
>> The reason it is being held:
>> 
>> The message headers matched a filter rule
>> 
>> 
>> Best,
>> Bert


> Bert,

> Have there been a lot of cc's in your replies?

> That is one thing that will tend to trigger the spam filters. I am not
> sure what the threshold is and I am not sure that Martin knows, but that
> has bitten me in the past on R-Help. As co-moderator with Martin on
> R-Devel, I have seen the other side of it there.

> Might also be the e-mail domain of one of the respondents in the thread.

> I think that it is the ETHZ SysAdmins that tend to control the formalized
> spam filters and heuristics.

> Regards,
> Marc

Thank you, Marc.

Additionally, and with a bit more detail, the reason

 " The message headers matched a filter rule "

is really from the very last filter bank, which is 'mailman', and
to answer Bert's question wrt that: no, neither the mailman setup
nor the R-help specific extra filters have changed during the
last months, probably not even during the last years.

The mailman filters however *do* look at some of the upstream
spam filter results (most of which are ETH wide), and these do
change of course, continually being adapted.

Martin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] about data problem

2016-09-21 Thread Martin Maechler
> Joe Ceradini 
> on Tue, 20 Sep 2016 17:06:17 -0600 writes:

> read.csv("your_data.csv", stringsAsFactors=FALSE)
> (I'm just reiterating Jianling said...)

If you do not have very many columns, and want to become more
efficient and knowledgeable,
I strongly recommend, alternatively, using the 'colClasses' argument
to read.csv or read.table (they are the same apart from defaults
for arguments!) and set "numeric" for numeric columns.

This has a similar effect to the *combination* of
 1)  stringsAsFactors = FALSE
 2)  foo <- as.numeric(foo) # for respective columns

Martin


> Joe

> On Tue, Sep 20, 2016 at 4:56 PM, lily li  wrote:

>> Is there a function in read.csv that I can use to avoid converting 
numeric
>> to factor? Thanks a lot.
>> 
>> 
>> 
>> On Tue, Sep 20, 2016 at 4:42 PM, lily li  wrote:
>> 
>> > Thanks. Then what should I do to solve the problem?
>> >
>> > On Tue, Sep 20, 2016 at 4:30 PM, Jeff Newmiller <
>> jdnew...@dcn.davis.ca.us>
>> > wrote:
>> >
>> >> I suppose you can do what works for your data, but I wouldn't 
recommend
>> >> na.rm=TRUE because it hides problems rather than clarifying them.
>> >>
>> >> If in fact your data includes true NA values (the letters NA or simply
>> >> nothing between the commas are typical ways this information may be
>> >> indicated), then read.csv will NOT change from integer to factor
>> >> (particularly if you have specified which markers represent NA using 
the
>> >> na.strings argument documented under read.table)... so you probably DO
>> have
>> >> unexpected garbage still in your data which could be obscuring 
valuable
>> >> information that could affect your conclusions.
>> >> --
>> >> Sent from my phone. Please excuse my brevity.
>> >>
>> >> On September 20, 2016 3:11:42 PM PDT, lily li 
>> >> wrote:
>> >> >I reread the data, and use 'na.rm = T' when reading the data. This 
time
>> >> >it
>> >> >has no such problem. It seems that the existence of NAs convert the
>> >> >integer
>> >> >to factor. Thanks for your help.
>> >> >
>> >> >
>> >> >On Tue, Sep 20, 2016 at 4:09 PM, Jianling Fan 
>> >> >wrote:
>> >> >
>> >> >> Add the "stringsAsFactors = F"  when you read the data, and then
>> >> >> convert them to numeric.
>> >> >>
>> >> >> On 20 September 2016 at 16:00, lily li  wrote:
>> >> >> > Yes, it is stored as factor. I can't check out any problem in the
>> >> >> original
>> >> >> > data. Reread data doesn't help either. I use read.csv to read in
>> >> >the
>> >> >> data,
>> >> >> > do you think it is better to use read.table? Thanks again.
>> >> >> >
>> >> >> > On Tue, Sep 20, 2016 at 3:55 PM, Greg Snow <538...@gmail.com>
>> >> >wrote:
>> >> >> >
>> >> >> >> This indicates that your Discharge column has been
>> >> >stored/converted as
>> >> >> >> a factor (run str(df) to verify and check other columns).  This
>> >> >> >> usually happens when functions like read.table are left to try 
to
>> >> >> >> figure out what each column is and it finds something in that
>> >> >column
>> >> >> >> that cannot be converted to a number (possibly an oh instead of 
a
>> >> >> >> zero, an el instead of a one, or just a letter or punctuation 
mark
>> >> >> >> accidentally in the file).  You can either find the error in 
your
>> >> >> >> original data, fix it, and reread the data, or specify that the
>> >> >column
>> >> >> >> should be numeric using the colClasses argument to read.table or
>> >> >other
>> >> >> >> function.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Tue, Sep 20, 2016 at 3:46 PM, lily li 
>> >> >wrote:
>> >> >> >> > Hi R users,
>> >> >> >> >
>> >> >> >> > I have a problem in reading data.
>> >> >> >> > For example, part of my dataframe is like this:
>> >> >> >> >
>> >> >> >> > df
>> >> >> >> > month day year  Discharge
>> >> >> >> >     3   1 2010      6.4
>> >> >> >> >     3   2 2010      7.58
>> >> >> >> >     3   3 2010      6.82
>> >> >> >> >     3   4 2010      8.63
>> >> >> >> >     3   5 2010      8.16
>> >> >> >> >     3   6 2010      7.58
>> >> >> >> >
>> >> >> >> > Then if I type summary(df), why it converts the discharge data
>> >> >to
>> >> >> >> levels? I
>> >> >> >> > also met the same problem when reading some other csv files. 
How
>> >> >to
>> >> >> solve
>> >> >> >> > this problem? Thanks.
>> >> >> >> >
>> >> >> >> > Discharge
>> >> >> >> > 7.58 :2
>> >> >> >> > 6.4   :1
>> >> >> >> > 6.82 :1
>> >> >> >> > 8.63 :1
>> >> >> >> > 8.16 :1
>> >> >> >> >
>> >> >> >> > [[alternative HT

Re: [R] "R": Redefine default parameter values for help.start()

2015-04-17 Thread Martin Maechler
> paul  
> on Thu, 16 Apr 2015 18:58:12 + writes:

> Because of the setup of R in cygwin, help.start() requires
> the following parameter values:

> help.start(browser="cygstart",remote=R.home())

> Is it possible to make these values the default?

You are building R from the sources, right?

Then, of course it is easily possible: Just modify your version
of the R source code appropriately:

It is file  /src/library/utils/R/help.start.R

(And you need more changes anyway to make things work under Cygwin;
 but don't ask me about it: I do run R on Windows only very rarely)

> I do not want cygstart to be the browser except as a
> parameter for help.start().
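
An alternative that avoids touching the sources (my suggestion,
not part of the original exchange): mask help.start() with a tiny
wrapper, e.g. in your Rprofile:

  help.start <- function(...)
    utils::help.start(browser = "cygstart", remote = R.home(), ...)

Plain help.start() then uses your defaults, and cygstart is not
made the browser for anything else.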

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] list server problem?

2015-04-20 Thread Martin Maechler

> Dear all,
> I use Thunderbird 31.6.0 on Mac OS 10.6.8 to download my e-mails.
> Since this morning, Thunderbird downloads the same e-mails from the
> list several times, some even being from yesterday. It occurs only with
> e-mails from R-help and not with my other professional and private
> e-mails.

> I also checked that it was not a problem with the server on which I get 
> e-mails from the R-help: the problematic e-mails are erased every time 
> from the server but are received several times. So I would say that the 
> R-help server has sent these e-mails several times.

> I am wondering if there is currently an issue with the R-help server. Is 
> someone else having the same problem?

Not having the same problem, but an explanation
from the place where the server runs:

The R mailing list server indeed has had a hard time, it seems
since early Sunday (CEST), including a complete timeout for a few
hours this morning.  The symptoms have been that some e-mails
were badly delayed and --- as you have experienced --- mail
server timeouts do lead to e-mail being sent more than once.
This is the consequence of a safe e-mail server behavior:  if an
e-mail "may have been lost", rather send it again.
As some say:  better to err on the safe side.
Martin Maechler
ETH Zurich


> Thanks!
> Ivan

> -- 
> Ivan Calandra, ATER
> University of Reims Champagne-Ardenne
> 51100 Reims, France

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How numerical data is stored inside ts time series objects

2015-04-22 Thread Martin Maechler
> Paul  
> on Wed, 22 Apr 2015 01:39:16 + writes:

> William Dunlap  tibco.com> writes:
>> Use the str() function to see the internal structure of most
>> objects.  In your case it would show something like:
>> 
>> > Data <- data.frame(theData=round(sin(1:38),1))
>> > x <- ts(Data[[1]], frequency=12) # or Data[,1]
>> > y <- ts(Data, frequency=12)
>> > str(x)
>> Time-Series [1:38] from 1 to 4.08: 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
>> > str(y)
>> ts [1:38, 1] 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
>> - attr(*, "dimnames")=List of 2
>> ..$ : NULL
>> ..$ : chr "theData"
>> - attr(*, "tsp")= num [1:3] 1 4.08 12
>> 
>> 'x' contains a vector of data and 'y' contains a 1-column matrix of
>> data.  stl(x,"per") and stl(y, "per") give similar results as you
>> got.
>> 
>> Evidently, stl() does not know that 1-column matrices can be treated
>> much the same as vectors and gives an error message.  Thus you must
>> extract the one column into a vector: stl(y[,1], "per").

> Thanks, William.

> Interesting that a 2D matrix of size Nx1 is treated as a different
> animal from a length N vector.  It's a departure from math convention,
> and from what I'm accustomed to in Matlab.  

Ha -- Not at all!
The above is exactly the misconception I have been fighting --
mostly in vain -- for years.

Matlab's convention of treating a vector as an  N x 1 matrix is
a BIG source of confusion in much of math teaching:

The vector space  |R^n  is not at all the same space as the space  |R^{n x 1}
even though of course there's a trivial mapping between the
objects (and the metrics) of the two.
A vector *is NOT* a matrix -- but in some matrix calculus
notations there is a convention to *treat* n-vectors as  (n x 1) matrices.
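
In R the distinction is directly visible (a minimal sketch):

  v <- 1:3                    # a vector: no dim attribute
  m <- matrix(1:3, ncol = 1)  # a 3 x 1 matrix
  dim(v)                 # NULL
  dim(m)                 # [1] 3 1
  identical(v, m)        # FALSE: a vector is not a matrix ...
  identical(v, drop(m))  # TRUE : ... though trivially mapped to one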

Good linear algebra teaching does distinguish vectors from
one-column or one-row matrices -- I'm sure still the case in all
good math departments around the globe -- but maybe not in math
teaching to engineers and others who only need applied math.
Yes, linear algebra teaching will also make a point that in
the usual matrix product notations, it is convenient and useful to treat
vectors as if they were 1-column matrices.

> That R's vector seems
> more akin to a list, where the notion of orientation doesn't apply.

Sorry, but again:  not at all in the sense 'list's are used in R.

Fortunately, well thought out languages such as S, R, Julia, Python,
all do make a good distinction between vectors and matrices
i.e. 1D and 2D arrays.  If Matlab still does not do that, it's
just another sign that Matlab users should flee and start using julia
or R or python.

  {and well yes, we could start bickering about S' and hence R's distinction
   between a 1D array and a vector ... which I think has been a
   clear design error... but that's not the topic here}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cbind question, please

2015-04-24 Thread Martin Maechler
> Steve Taylor 
> on Thu, 23 Apr 2015 23:32:00 + writes:

> This works for me...
> get0 = function(x) get(x,pos=1)
> sapply(big.char, get0)

Note that  get0() is a new function since R 3.2.0 (and _somewhat
important for efficient code_),
so you'd rather name your own function differently...
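
Another option (my addition): mget() fetches several names at
once and by default does *not* search enclosing environments, so
base::cat is never picked up:

  dog <- 1:3; cat <- 2:4; tree <- 5:7
  big.char <- c("dog", "cat", "tree")
  do.call(cbind, mget(big.char))  # the desired 3-column matrix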

> The extra step seems necessary because without it, get() gets base::cat()
> instead of cat.

> cheers,
> Steve

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Erin 
Hodgess
> Sent: Friday, 24 April 2015 10:41a
> To: R help
> Subject: [R] cbind question, please

> Hello!

> I have a cbind type question, please:  Suppose I have the following:

> dog <- 1:3
> cat <- 2:4
> tree <- 5:7

> and a character vector
> big.char <- c("dog","cat","tree")

> I want to end up with a matrix that is a "cbind" of dog, cat, and tree.
> This is a toy example.  There will be a bunch of variables.

> I experimented with "do.call", but all I got was
> 1
> 2
> 3

> Any suggestions would be much appreciated.  I still think that do.call
> might be the key, but I'm not sure.

> R Version 3-1.3, Windows 7.

> Thanks,
> Erin


> -- 
> Erin Hodgess
> Associate Professor
> Department of Mathematical and Statistics
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com

> [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] does segfault mean (always) a bug?

2015-05-06 Thread Martin Maechler
>>>>> Duncan Murdoch 
>>>>> on Tue, 5 May 2015 15:36:59 -0400 writes:

> On 05/05/2015 2:54 PM, lejeczek wrote:
>> hi eveybody
>> 
>> I'm trying something simple (Biocunductor packages), so 
>> simple I believe it's example from docs but I get segfault.
>> I don't suppose incorrect scripting can cause segfault, right?

> In R, a segfault always indicates a bug.  What's not so clear is whether
> it is a bug in R, a bug in a contributed package, or a bug in some
> underlying system library.

> If you can only trigger the bug when using a Bioconductor package, then
> the first guess is that it is that package, and the maintainer of that
> package is in the best position to track it down further.  If you can
> simplify the code to trigger it without using any contributed packages,
> then it could well be a bug in R, and we'd like to see code to reproduce 
it.
> Duncan Murdoch

The bug Duncan mentions can also be in the user's R code, outside any package:

If a user does what (s)he should never do, namely directly call
.C(), .Fortran(), .Call() or similar (.Internal(), .External(),..)
it is typically easy to trigger segfaults, and then the bug is
in the user's R code.

Variations of the above involve using the inline package, or
other interfaces to C/C++/... code, libraries, etc: The bug may be in
your code rather than the underlying code which is allowed to
make strong assumptions about how it is called.

Martin Maechler

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transformation of the circular variable into linear variable

2015-05-06 Thread Martin Maechler
>>>>> David Winsemius 
>>>>> on Tue, 5 May 2015 08:41:08 -0700 writes:

> On May 5, 2015, at 3:59 AM, Nasr Al-Dhurafi wrote:

>> Dear All
>> 
>> I am working with a circular data ( wind direction ) and i have problem
>> in transforming  the circular variable into linear variable. I have found
>> an equation that can be used to convert the circular variable into linear
>> variable which is included in this paper *" ARMA based approaches for
>> forecasting the tuple of wind speed and direction". *The equation is 
called
>> as the inverse of a link function and i have attached  the summery of 
it.T
>> herefore,Please, if you have an idea of how to conduct it in R 
programming
>> or  if you have another method for transformation, let me know it. Your
>> cooperation is highly appreciated.
>> 
>> My Best Regards &  All The Best

> Here are a couple of plausible explanations for a lack of response to 
your first message posted 2 months ago:

> Neither of your duplicate messages have any attachments. This could be 
caused by the fact (buried away in one or more of webpages or Posting Guides 
that define the Rhelp facility)  the information that the mail-server used by 
the Rhelp list strips any attachments unless they are one of the limited number 
accepted file formats. (It does not notify posters of htis fact when they 
submit unacceptable formats.)  The 4 formats I know to be accepted are: 
MIME-text, .pdf, .ps or .png. Even if you submit formats that you might think 
would be considered "text", they will not be accepted if they have non-".txt" 
extensions such as ".csv". As far as I can tell ... only .txt extensions are 
passed through.

Thanks a lot, David, for helping Nasr (and many others)!

The above explantion is not quite accurate though and I want to make sure
correct info gets archived here:  The file extension does not
play any role for the mailing list server software, which is "mailman".
Mailman -- rightly -- only looks at the declared MIME type of the attachment.
("MIME": see https://en.wikipedia.org/wiki/MIME).

Now with 99.x% of people using e-mail software which does not
reveal any of these subtle details to you, and many such e-mail
interfaces behaving like Windows programs, that e-mail software may choose
the MIME type (it uses for the attachments it puts into the
e-mail it sends in your behalf) depending on the file extension.
So, from the naive user's point of view with his only e-mail
program he may know, it then seems as if the file extension
determines the acceptance of the attachment by the mailman rules
we have had imposed (for spam and virus protection).

Martin Maechler
ETH Zurich

>> I find an article by that name in Applied Energy Volume 88, Issue 4, 
April 2011, Pages 1405–1414. The publisher offers a copy at the price of 42 
USD. (Expecting us to buy that article seems unreasonable.)

> Most of us are not meteorological researchers. In general list members 
are more likely to respond to requests that include a) sample data, b) code 
that reflects your efforts, and c) a specification of correct output. It may be 
helpful to list details of your position.

> -- 
> David.

>> 
>> Nasr
>> 
>> On Mon, Mar 30, 2015 at 8:22 PM, Nasr Al-Dhurafi 
>> wrote:
>> 
>>> I am working with a circular data ( wind direction ) and i have problem
>>> in transforming  the circular variable into linear variable. I have 
found
>>> an equation that can be used to convert the circular variable into 
linear
>>> variable which is included in this paper *" ARMA based approaches for
>>> forecasting the tuple of wind speed and direction". *The equation is
>>> called as the inverse of a link function and i have attached  the 
summery
>>> of it.Therefore,Please, if you have an idea of how to conduct it in R
>>> programming or  if you have another method for transformation, let me
>>> know it. Your cooperation is highly appreciated.
>>> 
>>> My Best Regards &  All The Best
>>> Nasr
>>> 
>> 
>> [[alternative HTML version deleted]]
>> 

> David Winsemius
> Alameda, CA, USA


Re: [R] does segfault mean (always) a bug?

2015-05-06 Thread Martin Maechler
>>>>> lejeczek  
>>>>> on Wed, 6 May 2015 08:20:46 +0100 writes:

> On 05/05/15 20:36, Duncan Murdoch wrote:
>> On 05/05/2015 2:54 PM, lejeczek wrote:
>>> hi eveybody
>>> 
>>> I'm trying something simple (Biocunductor packages), so
>>> simple I believe it's example from docs but I get segfault.
>>> I don't suppose incorrect scripting can cause segfault, right?
>> In R, a segfault always indicates a bug.  What's not so clear is whether
>> it is a bug in R, a bug in a contributed package, or a bug in some
>> underlying system library.
>> 
>> If you can only trigger the bug when using a Bioconductor package, then
>> the first guess is that it is that package, and the maintainer of that
>> package is in the best position to track it down further.  If you can
>> simplify the code to trigger it without using any contributed packages,
>> then it could well be a bug in R, and we'd like to see code to reproduce 
it.
>> 
>> Duncan Murdoch
>> 
> hi Duncan
> I remember that this was a principle of most programming
> languages: only a bug in the code and/or compiler could
> cause a segfault.
> In my case it is a contributed package, specifically GOSim 
> package, I'm not R programmer and I realise my scripting is 
> far from good and possibly with errors.
> I could send that snippet of the code here if people think 
> it can be looked into and segfault could be replicated?
> I also emailed the author.

> many thanks
> P.

Dear P.,

in the case of segfault from using a contributed package,
you should typically really only email the package maintainer
(who may be different from the package authors), and not R-help.
Only if the maintainer does not respond at all (and only if the
package is open source, typically CRAN) should you ask for help here
or another public forum.
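
To find the right address (a minimal sketch):

  maintainer("GOSim")  # from utils; returns "Name <email>"
  ## equivalently:  packageDescription("GOSim")$Maintainer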

(I would also think it polite towards the maintainer, who has
 volunteered her/his code to be used by you, to give her/him an
 opportunity to see, comment on, and fix the problem.)

Martin Maechler
ETH Zurich

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] does segfault mean (always) a bug?

2015-05-08 Thread Martin Maechler
>>>>> William Dunlap 
>>>>> on Wed, 6 May 2015 09:53:50 -0700 writes:

> It looks like a problem in the Matrix package.  
Indeed.

Thanks to Bill Dunlap for the diagnostics below (and other
offline information and) I have been able to fix the bug
yesterday in the R-forge version of Matrix.
The problem was due to using a version of memory allocation
which is known to be quite fast... but is not to be used for
large objects .. which we have here.

I plan to release the amended version of Matrix to CRAN as 

  Matrix_1.2-1  

rather sooner than later.

With thanks to Bill and Pavel,
Martin Maechler
ETH Zurich

> I made the file KE.rda containing the Matrix objects K and
> E constructed in calc.diffusion.kernel by adding a save()
> call just before where R dies in the original example:

> K = lam$values[1] * I - M
> E = I - matrix(1, ncol = ncol(I), nrow = nrow(I))/ncol(I)
> cat("saving K, E, etc. in /tmp/KE.rda\n")
> save(K, E, deg, invD, I, W, M, lam, file="/tmp/KE.rda")
> cat("done making the file\n")
> K = E %*% K %*% E

> With that file in place I get
>> library(Matrix)
>> load("KE.rda")
>> sessionInfo()
> R version 3.2.0 (2015-04-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
> Running under: Ubuntu precise (12.04.5 LTS)

[ .. ]

> other attached packages:
> [1] Matrix_1.2-0

> loaded via a namespace (and not attached):
> [1] grid_3.2.0  lattice_0.20-31
>> str(E)
> Formal class 'dsyMatrix' [package "Matrix"] with 5 slots
> ..@ x   : num [1:143376676] 1.00 -8.35e-05 -8.35e-05 -8.35e-05 -8.35e-05 ...
> ..@ Dim : int [1:2] 11974 11974
> ..@ Dimnames:List of 2
> .. ..$ : NULL
> .. ..$ : NULL
> ..@ uplo: chr "U"
> ..@ factors : list()
>> str(K)
> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
> ..@ i   : int [1:487692] 0 69 948 951 1027 1192 1414 1420 1421 1714 ...
> ..@ p   : int [1:11975] 0 27 125 147 199 212 221 230 254 274 ...
> ..@ Dim : int [1:2] 11974 11974
> ..@ Dimnames:List of 2
> .. ..$ : chr [1:11974] "GO:002" "GO:003" "GO:012" "GO:018" ...
> .. ..$ : chr [1:11974] "GO:002" "GO:003" "GO:012" "GO:018" ...
> ..@ x   : num [1:487692] 32.2163 -0.004674 -0.000722 -0.005316 -0.014022 ...
> ..@ factors : list()
>> EK <- E %*% K
>> EKE <- EK %*% E

> *** caught segfault ***
> address 0x7fffa7e1ccf8, cause 'memory not mapped'

[...]

> On Wed, May 6, 2015 at 1:57 AM, Martin Maechler <
> maech...@lynne.stat.math.ethz.ch> wrote:

>> >>>>> lejeczek  
>> >>>>> on Wed, 6 May 2015 08:20:46 +0100 writes:
>> 
>> > On 05/05/15 20:36, Duncan Murdoch wrote:
>> >> On 05/05/2015 2:54 PM, lejeczek wrote:
>> >>> hi eveybody
>> >>>
>> >>> I'm trying something simple (Biocunductor packages), so
>> >>> simple I believe it's example from docs but I get segfault.
>> >>> I don't suppose incorrect scripting can cause segfault, right?
>> >> In R, a segfault always indicates a bug.  What's not so clear is
>> whether
>> >> it is a bug in R, a bug in a contributed package, or a bug in some
>> >> underlying system library.
>> >>
>> >> If you can only trigger the bug when using a Bioconductor package,
>> then
>> >> the first guess is that it is that package, and the maintainer of
>> that
>> >> package is in the best position to track it down further.  If you
>> can
>> >> simplify the code to trigger it without using any contributed
>> packages,
>> >> then it could well be a bug in R, and we'd like to see code to
>> reproduce it.
>> >>
>> >> Duncan Murdoch
>> >>
>> > hi Duncan
>> > I remember that this was a principle of most programming
>> > languages: only a bug in the code and/or compiler could
>> > cause a segfault.
>> > In my case it is a contributed package, specifically the GOSim
>> > package; I'm not an R programmer and I realise my scripting is
>> > far from good and po

Re: [R] package implementing continuous binomial?

2015-05-08 Thread Martin Maechler
Dear David,

>>>>> David Winsemius 
>>>>> on Thu, 7 May 2015 12:10:25 -0700 writes:

> On May 6, 2015, at 7:00 PM, Benjamin Tyner wrote:

>> Hi
>> 
>> I'm wondering if anyone is aware of an R package implementing (i.e.,
>> providing a pdf, cdf, and/or quantile function) for the continuous
>> binomial distribution? Specifically the one characterized here:
>> 
>> 
http://www2.math.uni-paderborn.de/fileadmin/Mathematik/AG-Indlekofer/Workshop/Satellite_meeting/ilenko.pdf
>> 
>> Figured I would check here first, before attempting to code it up myself.

> I found that reading the ArXiv version of that material was easier to 
understand:

> http://arxiv.org/abs/1303.5990

> zipfR package has an implementation of the incomplete beta function that 
might make some of the coding of the pdf and cdf more simple. 

> Searching done with Graves' very useful utility package: 

> library('sos')
> findFn("incomplete beta function")

Hmm...  R's   pbeta() function is a pretty good implementation
of the  incomplete beta function ... as it tries to say on its
help page.

If you look closely, these functions (for incomplete gamma,
incomplete beta, and their inverses) are simple wrappers to
pgamma()/qgamma() and pbeta()/qbeta() -- sometimes "regularizing" and
sometimes not -- where regularization is simply a multiplication/division
with  gamma() or beta().

I wonder why you did not find R's help page about the beta distribution
{ -> functions  dbeta, pbeta, qbeta, rbeta }
which does mention the "incomplete beta function" prominently.

I don't think Benjamin should use the zipfR package just for
these functions  [and even the zipfR package help page on these
can be read as saying so .. ]

In the end I wonder if the "continuous Binomial" is not just a
version of the good old Beta distribution... as indeed the
Binomial and the Beta are related in the same way 
that the Gamma and the Poisson are.
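
A small sketch of both points (the identity is standard, not from this
thread): pbeta() *is* the regularized incomplete beta function I_x(a,b),
and for integer k, 0 <= k < n, the Binomial and Beta CDFs are linked:

   ## pbinom(k, n, p) == pbeta(1 - p, n - k, k + 1)
   n <- 10; k <- 3; p <- 0.4
   all.equal(pbinom(k, n, p), pbeta(1 - p, n - k, k + 1))  ## TRUE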

Martin Maechler
ETH Zurich

> (I didn't think that doing a search on "continuous Binomial" was likely
> to be helpful, but I tried it anyway and did not find any functions
> named "continuous binomial" in their help page titles.)

> -- 

> David Winsemius
> Alameda, CA, USA



[R] The R Foundation announces new mailing list 'R-package-devel'

2015-05-21 Thread Martin Maechler
New Mailing list:  R-package-devel -- User R Packages Development

At last week's monthly meeting, the R Foundation decided to
create a new mailing list in order to help R package authors in
their package development and testing.

The idea is that some experienced R programmers (often those
currently helping on R-devel or also R-help) will help package
authors and thus offload some of the burden from the CRAN team
members.

Please read the detailed description of the mailing list here,
   https://stat.ethz.ch/mailman/listinfo/r-package-devel
or also the more extended announcement of the list on R-devel,
archived at
 https://stat.ethz.ch/pipermail/r-devel/2015-May/071208.html


For the R foundation,
Martin Maechler, Secretary General



Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml

2015-06-05 Thread Martin Maechler
>>>>> Christophe Genolini 
>>>>> on Fri, 5 Jun 2015 00:36:42 -0700 writes:

> Hi all,
> There is a compatibility issue between the package 'lme4' and my package
> 'kml'. I define the "[" operator. It works just fine in my package (1).
> If I try to use the lme4 package, then it no longer works (2). Moreover,
> it has some kind of strange behavior (3). Do you know what is wrong? Any
> idea of how I can correct that?

> Here is a reproducible example, and the same code with the result
> follows.

Dear Christophe,
can you please specify the exact sessionInfo() ?
(after loading both kml and lme4).

'Matrix' has been updated on CRAN very recently, and
'lme4' is about to be updated very soon,
so this *is* a somewhat interesting and important problem,
but the exact versions of the packages *do* matter.

Bonnes salutations!

Martin Maechler
ETH Zurich


   []



Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml

2015-06-05 Thread Martin Maechler
>>>>> Martin Maechler 
>>>>> on Fri, 5 Jun 2015 11:33:46 +0200 writes:

>>>>> Christophe Genolini 
>>>>> on Fri, 5 Jun 2015 00:36:42 -0700 writes:

>> Hi all,
>> There is a compatibility issue between the package 'lme4' and my package
>> 'kml'. I define the "[" operator. It works just fine in my package (1).
>> If I try to use the lme4 package, then it no longer works (2). Moreover,
>> it has some kind of strange behavior (3). Do you know what is wrong? Any
>> idea of how I can correct that?

>> Here is a reproducible example, and the same code with the result
>> follows.

> Dear Christophe,
> can you please specify the exact sessionInfo() ?
> (after loading both kml and lme4).

> 'Matrix' has been updated on CRAN very recently, and
> 'lme4' is about to be updated very soon,
> so this *is* a somewhat interesting and important problem,
> but the exact versions of the packages *do* matter.

As a matter of fact,  

1) the package versions don't seem to matter much.

2) lme4 is *not* involved directly:
   Much worse, it is the 'Matrix' package.

If you replace 'lme4' by 'Matrix' in your code, you get the same
symptoms.

... I'm investigating ..

Martin



Re: [R] is.na for S4 object

2015-06-05 Thread Martin Maechler
> Martin Morgan 
> on Thu, 4 Jun 2015 10:33:37 -0700 writes:

> On 06/04/2015 10:08 AM, cgenolin wrote:
>> Hi the list,
>> 
>> I have a variable y that is either NA or some S4 object. I would like to
>> know in which case I am, but it seems that is.na does not work with S4
>> objects; I get a warning:
>> 
>> --- 8< 
>> setClass("myClass",slots=c(x="numeric"))
>> if(runif(1)>0.5){a <- new("myClass")}else{a <- NA}
>> is.na(a)
>> --- 8< 
>> 
>> Any solution?

> getGeneric("is.na")

> shows that it's an S4 generic, so implement a method

> setMethod("is.na", "myClass", function(x) FALSE)

> Martin

For the present special case though,  a more efficient solution would be
to use  isS4(.)  instead of  !is.na(.)
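
A small illustration, reusing the class definition from the question:

   setClass("myClass", slots = c(x = "numeric"))
   a <- new("myClass")
   isS4(a)   ## TRUE  : it is the S4 object
   isS4(NA)  ## FALSE : it is the NA case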

another Martin 


>> Thanks
>> Christophe



Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml

2015-06-05 Thread Martin Maechler
> Christophe Genolini 
> on Fri, 5 Jun 2015 00:36:42 -0700 writes:

> Hi all,
> There is a compatibility issue between the package 'lme4' and my package
> 'kml'. I define the "[" operator. It works just fine in my package (1).
> If I try to use the lme4 package, then it no longer works (2). Moreover,
> it has some kind of strange behavior (3). Do you know what is wrong? Any
> idea of how I can correct that?

> Here is a reproducible example, and the same code with the result
> follows.

> Thanks for your help
> Christophe

  [ ... I'm providing slightly different code below  ]

> --- 8< - Execution of the previous code ---

> > library(kml)
> Loading required package: clv
> Loading required package: cluster
> Loading required package: class
> Loading required package: longitudinalData
> Loading required package: rgl
> Loading required package: misc3d
> > dn <- gald(1)

>  ###
> ### (1) the "[" operator works just fine

> > dn["traj"]
>   t0   t1t2t3t4   t5   t6t7t8t9   t10
> i1 -3.11 4.32  2.17  1.82  4.90 7.34 0.83 -2.70  5.36  4.96  3.16
> i2 -7.11 1.40 -2.40 -2.96  4.31 0.50 1.25  0.52 -0.04  7.55  5.50
> i3  2.80 6.23  6.08  2.87  2.58 2.88 6.58 -2.38  2.30 -1.74 -3.23
> i4  2.24 0.91  6.50 10.92 11.32 7.79 7.78 10.69  9.15  1.07 -0.51

>  ###
> ### (2) using 'lme4', it does no longer work

> > library(lme4)
> Loading required package: Matrix
> Loading required package: Rcpp
> > dn["traj"]
> Error in x[i, j] :
>   error in evaluating the argument 'j' in selecting a method for function
> '[': Error: argument "j" is missing, with no default

>  ###
> ### (3) If I define again the "[", it does not work the first time I call
> it, but it work the second time!
> > setMethod("[",
> +   signature=signature(x="ClusterLongData", i="character", 
> j="ANY",drop="ANY"),
> +   definition=function (x, i, j="missing", ..., drop = TRUE){
> +   x <- as(x, "LongData")
> +   return(x[i, j])
> + }
> + )
> [1] "["

> ### No working the first time I use it
> > dn["traj"]
> Error in dn["traj"] :
>   argument "j" is missing, with no default

> ### But working the second time
> > dn["traj"]
>   t0   t1t2t3t4   t5   t6t7t8t9   t10
> i1 -3.11 4.32  2.17  1.82  4.90 7.34 0.83 -2.70  5.36  4.96  3.16
> i2 -7.11 1.40 -2.40 -2.96  4.31 0.50 1.25  0.52 -0.04  7.55  5.50
> i3  2.80 6.23  6.08  2.87  2.58 2.88 6.58 -2.38  2.30 -1.74 -3.23
> i4  2.24 0.91  6.50 10.92 11.32 7.79 7.78 10.69  9.15  1.07 -0.51 

I have made some investigations, but have to stop for now, and
leave this hopefully to others knowledgable about S4 method
dispatch, etc :

1) I am confident in saying that you have uncovered an "infelicity if
  not a bug" in R.

2) I am also pretty confident that the "[" methods that you
  define in 'kml' and in the package '

3) Diagnosing is not easy: As you have shown yourself above,
  in some situations the bug "bites" and if you repeat the *same*
  code, things work.

  This is related to the fact that S4 methods are __cached__
  (so next time they are found more quickly) under some
  circumstances, and the cache is cleared under other such circumstances.

3b) Actually, I am sure that we have seen +/- the same problem many
months ago, in other contexts but did not get "down to it";
and at the moment, I cannot quickly find where to look for
the problem there...


##--- 8< Commented (incl output) reproducible code--
library(kml)

### Creating some data
dn <- gald(1)
(dnt <- dn["traj"])

showMethods("[")
## Function: [ (package base)
## x="ClusterLongData", i="character"
## x="ListPartition", i="ANY"
## x="LongData", i="ANY"
## x="LongData", i="character"
## (inherited from: x="LongData", i="ANY")
## x="LongData3d", i="ANY"
## x="nonStructure", i="ANY"
## x="ParChoice", i="ANY"
## x="ParKml", i="ANY"
## x="ParLongData", i="ANY"
## x="Partition", i="ANY"
## x="ParWindows", i="ANY"

### using Matrix  (or lme4, which 'Depends' on Matrix; hence same effect)
library(Matrix)
dn["traj"]
## Error in x[i, j] :
##   error in evaluating the argument 'j' in selecting a method for function 
'[': Error: argument "j" is missing, with no default
traceback()
## 3: x[i, j]
## 2: dn["traj"]
## 1: dn["traj"]
(ms <- methods(`[`)) ## 81 methods

## MM: debugging :
trace("[", browser, signature=c("ClusterLongData", "character", "missing",   
"missing"))
trace("[", browser, signature=c("LongData","character", "character", 
"missing"))
dn["traj"]
## -> you get into the browser, just press   "c"   twice (once for each "trace")
## ==> it works !!

## Remove the tracing :
untrace("[", signature=c("ClusterLongData", "character", "missing",   
"missing"))
untrace("[", signa

Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail

2015-06-08 Thread Martin Maechler

> Aehm, adding on this: I incorrectly *assumed* without testing that rounding 
> would help; it doesn't:
> ecdf(round(test2,0))  # a rounding that is way too rough for my application...
> #Error in xy.coords(x, y) : 'x' and 'y' lengths differ
> 
> Digging deeper: The initially mentioned call to unique() is not very helpful, 
> as test2 is a data frame, so I get what I deserve, an unchanged data frame 
> with 1 row. Still, the issue remains and can even be simplified further:
> 
> > ecdf(data.frame(a=3, b=4))
> Empirical CDF 
> Call: ecdf(data.frame(a = 3, b = 4))
>  x[1:2] =  3,  4
> 
> works ok, but
> 
> > ecdf(data.frame(a=3, b=3))
> Error in xy.coords(x, y) : 'x' and 'y' lengths differ
> 
> doesn't (same for a=b=1 or 2, so likely the same for any a=b). Instead, 
> 
> > ecdf(c(a=3, b=3))
> Empirical CDF 
> Call: ecdf(c(a = 3, b = 3))
>  x[1:1] =  3
> 
> does the trick. From ?ecdf, I get that x should be a numeric vector -
> apparently I misused the function by applying it to a row of a data frame
> (i.e. a data frame with one row). In all my other (dozens of) cases that
> worked ok, though, but not for this particular one. A simple unlist() helps:

You were lucky.   Using a one-row data frame instead of a
numerical vector will typically *not* work unless ... well, you
are lucky.

No, do *not*  pass data frame rows instead of numeric vectors.

> 
> > ecdf(unlist(data.frame(a=3, b=3)))
> Empirical CDF 
> Call: ecdf(unlist(data.frame(a = 3, b = 3)))
>  x[1:1] =  3
> 
> Yet, I'm even more confused than before: in my other data, there were also 
> duplicated values in the vector (1-row-data frame), and it never caused any 
> issue. For this particular example, it does. I must be missing something 
> fundamental...
>  

well.. I'm confused about why you are confused,
but if you are thinking about passing rows of data frames as
numeric vectors, this means you are sure that your data frame
only contains "classical numbers" (no factors, no 'Date's,
no...).

In such a case, transform your data frame to a numerical matrix
*once*, preferably using  data.matrix() instead of just  as.matrix()
(though in this case it should not matter).
Then *check* the result, and work with that matrix from then on.
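
A minimal sketch of that advice (with a made-up toy data frame):

   df <- data.frame(a = 3, b = 3)
   m  <- data.matrix(df)  ## numeric matrix, converted *once*
   str(m)                 ## *check* the result
   ecdf(m[1, ])           ## a matrix row is a plain numeric vector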

All other things probably will continue to leave you confused ..
;-)

Martin Maechler, 
ETH Zurich



Re: [R] 'class(.) == **' [was 'Call to a function']

2015-06-24 Thread Martin Maechler
>>>>> Steve Taylor 
>>>>> on Wed, 24 Jun 2015 00:56:26 + writes:

> Note that objects can have more than one class, in which case your == and 
%in% might not work as expected.  

> Better to use inherits().

> cheers,
> Steve

Yes indeed, as Steve said, really do!  

The use of   (class(.) == "..")   is error prone and
against the philosophy of classes (S3 or S4 or ..) in R :

Classes can "extend" other classes or "inherit" from them;
S3 examples in "base R"  are
 - glm() objects which are "glm"
   but also inherit from "lm"
 - multivariate time-series are "mts" and "ts"
 - The time-date objects  POSIXt , POSIXct, POSIXlt

==> do work  with  inherits(x, &lt;class&gt;)


We've seen this use of

 class(.) == ".."   (or "!=", or  %in% ...)

in too many places;  though it may work fine in your test cases,
it is wrong to use in general, e.g. inside a function you
provide for more general use,
and is best replaced with the use of inherits() / is()
everywhere, "out of principle".
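
A small sketch of the point, using a glm() fit as example:

   fit <- glm(dist ~ speed, data = cars)  ## a gaussian glm, for illustration
   class(fit)           ## c("glm", "lm") : two classes
   class(fit) == "lm"   ## FALSE TRUE : length 2, error prone inside if(.)
   inherits(fit, "lm")  ## TRUE : the robust test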

Martin Maechler
ETH Zurich



Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R

2015-07-14 Thread Martin Maechler
>>>>> "P" == ProfJCNash  
>>>>> on Sat, 4 Jul 2015 21:42:27 -0400 writes:

P> n163 <- mpfr(163, 500)

P> is how I set up the number.

Yes, and you have needed to specify the desired precision.

As author and maintainer of Rmpfr, let me give my summary of this overly
long thread (with many wrong statements before finally RK gave
the "obvious" answer):

1) Rmpfr is about high (or low, for didactical reasons, see Rich
   Heiberger's talk at 'useR! 2015') precision __numerical__ computations.
   This is what the GNU MPFR library is about, and Rmpfr wants
   to be a smart and R-like {e.g., automatic coercions wherever they
   make sense} R interface to MPFR.

2) If you use Rmpfr, you as user should decide about the desired accuracy
   of your "inputs".
   I think it would be a very bad idea to redefine 'pi' (in
   Rmpfr) to be Const("pi", 120).

   R-like also means that in a computation   a * b   the
   properties of a and b determine the result, and hence if  a <- pi
   then we know that pi is a double precision number with 53-bit
   mantissa.


p> On 15-07-04 05:10 PM, Ravi Varadhan wrote:
>>> What about numeric constants, like `163'?

well, if you _combine_ them with an mpfr() number, they are used
in their precision.  An integer like 163 is exact of course; but
pi (another numeric constant) is 53-bit as mentioned above, 
*and* sqrt(163)  is ("R-like") a double precision number
computed by R's internal double precision arithmetic.
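
A minimal sketch of the difference (assuming the Rmpfr package, with
500-bit precision as above):

   library(Rmpfr)
   pi500 <- Const("pi", 500)       ## 500-bit pi
   n163  <- mpfr(163, 500)         ## exact integer, carried at 500 bits
   exp(pi500 * sqrt(n163))         ## Ramanujan's near-integer, full precision
   exp(pi * sqrt(mpfr(163, 500)))  ## double-precision 'pi' caps the accuracy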

My summary:


1. Rmpfr behaves entirely correctly and as documented 
   (in all the examples given here).

2. The idea of substituting expressions containing pi with
   something like Const("pi", 120) is not a good one.  (*)



*) One could think of adding a new class, say "symbolicNumber"
   or rather "numericExpression" which in arithmetic would
   try to behave correctly, namely find out what precision all
   the other components in "the arithmetic" have and make sure
   all parts of the 'numericExpression' are coerced to
   'mpfr' with the correct precision, and only then start
   evaluating "the arithmetic".

   But that *does* look like implementing
   Maple/Mathematica/... in R, and that seems silly; rather R
   should interface to Free (as in "speech") aka "open source"
   symbolic math packages such as Pari, Macsyma, ..

Martin Maechler
ETH Zurich



Re: [R] indices of mismatch element in two vector with missing values

2015-07-29 Thread Martin Maechler
>>>>> Hervé Pagès 
>>>>> on Tue, 28 Jul 2015 23:07:14 -0700 writes:

> On 07/28/2015 09:58 PM, Peter Alspach wrote:
>> One way 
>> 
>> seq(test1)[-which(test1==test2)]

> One question is whether 2 NAs should be considered to match or not.
> The OP doesn't tell but I guess he wants them to match:

> test1 <- c("1", "2",  NA, "4", NA, "6")
> test2 <- c("1", "2", "3",  NA, NA, "66")

> seq(test1)[-which(test1==test2)]
> # [1] 3 4 5 6

> 5 should probably not be there!

>> but I imagine there are better ones .
>> 

Yes, indeed, as logical indexing is
considerably safer than "negative indexing with which(.)" :

PLEASE, everyone, do remember:

 %%>===<%%
 %%>  There is one ___big__ disadvantage to  [ - which() ] :  <%%
 %%>  It entirely __fails__ when the logical vector is all FALSE   <%%
 %%>===<%%


  ## Example (remember, we want the indices of *differing* entries):
  ## --- Here, all entries differ, so we want to get  1:8 :

   x <- c(NA, 1:7)
   y <- c(11:17, NA)

   ## now this
   seq(x)[ - which(x == y) ]  ## does not give what you expect

   ## But logical indexing always works:
   x == y & is.na(x) == is.na(y)
   seq(x)[!(x == y & is.na(x) == is.na(y))]

With output :

> x <- c(NA, 1:7)
> y <- c(11:17, NA)
> ## now this
> seq(x)[ - which(x == y) ]  ## does not give what you expect
integer(0)

> ## But logical indexing always works:
>  x == y & is.na(x) == is.na(y)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> seq(x)[!(x == y & is.na(x) == is.na(y))]
[1] 1 2 3 4 5 6 7 8
> 

Martin Maechler
ETH Zurich


>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of baccts
>> Sent: Wednesday, 29 July 2015 8:26 a.m.
>> To: r-help@r-project.org
>> Subject: [R] indices of mismatch element in two vector with missing 
values
>> 
>> How would you return the index where two vectors differs if they may 
contain missing (NA) values?
>> For example:
>> test1 <- c("1","2",NA);
>> test2 <- c("1","2","3");
>> which(test1!=test2) does not return 3!
>> 
>> Thanks in advance.
>> 
>>



Re: [R] How to have NLME return when convergence not reached

2014-12-03 Thread Martin Maechler
>>>>> Bert Gunter 
>>>>> on Tue, 2 Dec 2014 14:03:44 -0800 writes:

> Yes, Bill almost always has helpful ideas.
> Just a comment: If indeed the process is gobbling up too much memory,
> that might indicate a problem with your function or implementation. I
> defer to real experts on this, however.

> Cheers,
> Bert

> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374


Yes, thank you Bert, it could be entirely Ramiro's function, but
we don't know as you have not given any indication about what
the non-linear function is in your nlme() formula.

The little you wrote _seems_ to indicate a bug in nlme, and if
that were the case, the bug should be reported and fixed,
and we (R core - the maintainer of the recommended package nlme)
would really like to get a reproducible example.

If, OTOH, it is your nonlinear function that "goes haywire",
the implicit blame on nlme would not be warranted.

Martin Maechler, 
ETH Zurich and R Core team
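
In the meantime, a hedged sketch of the workaround mentioned below
(Bert's try()/tryCatch() plus Bill Dunlap's R.utils::withTimeout();
the model is the one from ?nlme, and the 60-second limit is made up):

   library(nlme)
   library(R.utils)  ## for withTimeout()
   fit <- tryCatch(
       withTimeout(
           nlme(height ~ SSasymp(age, Asym, R0, lrc), data = Loblolly,
                fixed = Asym + R0 + lrc ~ 1, random = Asym ~ 1,
                start = c(Asym = 103, R0 = -8.5, lrc = -3.3)),
           timeout = 60),                    ## give up after 60 seconds
       TimeoutException = function(e) NULL,  ## did not finish in time
       error = function(e) NULL)             ## nlme() signalled an error
   if (is.null(fit)) message("this fit was skipped")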


> On Tue, Dec 2, 2014 at 1:59 PM, Ramiro Barrantes
>  wrote:
>> Thanks so much for your reply.  I am using try but nlme never returns!!  
and I think the process is getting killed by the system as it is taking over 
all the memory.  However, I do like William Dunlap's idea of using 
R.utils::withTimeout to limit the time.
>> 
>> Thanks again for your help!
>> 
>> From: Bert Gunter [gunter.ber...@gene.com]
>> Sent: Tuesday, December 02, 2014 4:30 PM
>> To: Ramiro Barrantes
>> Cc: r-help@r-project.org
>> Subject: Re: [R] How to have NLME return when convergence not reached
>> 
>> ?try
>> Or
>> ?tryCatch
>> 
>> Bert
>> 
>> Sent from my iPhone -- please excuse typos.
>> 
>>> On Dec 2, 2014, at 12:57 PM, Ramiro Barrantes 
 wrote:
>>> 
>>> Hello,
>>> 
>>> I am trying to fit many hundreds of simulated datasets using NLME (it's 
all in a big loop in R).  Some don't seem to converge.  I am working on 
addressing the issues by perhaps adjusting my simulation, or tweaking iteration 
steps in nlme, etc.  However, when it doesn't converge, NLME just hangs, and my 
program either stalls for hours/days or takes over the computer memory and 
everything crashes eventually.  Is there a way to tell nlme to stop when it 
doesn't seem to be converging somehow?   I have been looking at the parameters 
in nlmeControl() but see nothing obvious.
>>> 
>>> Thanks in advance,
>>> 
>>> Ramiro
>>> 
>>> [[alternative HTML version deleted]]
>>> 




[R] The R Foundation - New board

2014-12-16 Thread Martin Maechler
Dear R lovers and other useRs,
notably benefactors, donors, supporting members and institutions
of the R foundation 
   http://www.r-project.org/foundation/

About 11.7 years ago, in April 2003, the R Foundation was
announced by Fritz Leisch,
  https://stat.ethz.ch/pipermail/r-announce/2003/000385.html
shortly after its creation, and since then, its board has
led the foundation during years of phenomenal growth and success.

Our thanks go to the *past* board members

   Presidents:   Robert Gentleman, Ross Ihaka 
   Secretary General:Friedrich Leisch 
   Treasurer:Kurt Hornik 
   Member at Large:  John Chambers 

In addition, Peter Dalgaard and Martin Mächler had acted as auditors.

After summer meetings of R Core and the R Foundation,
we, represented by Robert Gentleman, announced on Sep. 15
the new ("ordinary") memberships and the fact that the
membership of the foundation had grown to 29 persons:
https://stat.ethz.ch/pipermail/r-announce/2014/000577.html

Subsequently (after a careful nomination and voting process),
the election of a new *board* was finalized on December 1, and the
new board has been acting since then.
It consists of

Presidents:Duncan Murdoch, Martyn Plummer 
Secretary General: Martin Mächler 
Treasurer: Kurt Hornik 
Members at Large:  John Chambers 
   Brian Ripley
   Dirk Eddelbuettel 

and the two auditors Peter Dalgaard and Roger Bivand.
This is very soon to be reflected on
http://www.r-project.org/foundation/board.html


The foundation and its board aim to become considerably more
active in the future.  Details about this will be announced here
in due time. 

Thank you all for supporting the R project, a Free Software
(hence Open source) endeavor only made possible by the dedicated
work of many thousands of volunteers and their contributions in several ways, 
including to become supporting (or ++) members of the R foundation!

Martin Maechler



[R] Do NOT use ifelse() if you can use if(.) else . [was "Checking .."]

2014-12-23 Thread Martin Maechler
>>>>> Steven Yen 
>>>>> on Sun, 14 Dec 2014 09:30:56 -0500 writes:

> Thanks. This worked!! :)
> Fisher <- ifelse(!("Fisher" %in% names(obj$spec)), FALSE, obj$spec$Fisher)

"worked",  yes.

But please --- for the mailing list archives ---
do use better code.

Unfortunately, ifelse() is used and has been advertised much too
often in cases where it is very sub-optimal efficiency-wise.
ifelse(Cond, A, B)  should only be used  when the condition
'Cond', a logical vector, can be (and typically is) of length > 1
{and even then, it may be a nice shortcut, but often still suboptimal
 efficiency-wise ... but let's not get there}

In cases like this one when the condition 'Cond' is a
simple TRUE or FALSE (i.e. of length 1), using

if(Cond) A else B

instead of

ifelse(Cond, A, B)

is

  1. much more R-like  [[ "everything you do is a function call" ]],
 hence much more elegant

  2. considerably more efficient.

  3. :-) less typing: uses two "," less ;-)

  > require(microbenchmark)
  Loading required package: microbenchmark
  > x <- setNames(,LETTERS)
  > y <- setNames(letters, runif(letters))
  > z <- pi
  > microbenchmark(r1 <- ifelse(z > 3, x, y), r2 <- if(z > 3) x else y, times=1000)
  Unit: nanoseconds
                         expr  min     lq     mean median     uq   max neval cld
    r1 <- ifelse(z > 3, x, y) 4466 4971.5 5498.928   5244 5673.5 31705  1000   b
    r2 <- if (z > 3) x else y  171  212.0  265.241    264  291.0  3130  1000  a
  > 

i.e., roughly a factor of 20 more efficient


Martin Maechler,
ETH Zurich and R Core Team.


> At 08:43 AM 12/14/2014, Ben Tupper wrote:
>> Hi,
>> 
>> Does this work for you?  It simply tests if the name Fisher is found 
>> among the names of the elements of spec.
>> 
>> obj = list(spec = list())
>> Fisher <- ifelse(!("Fisher" %in% names(obj$spec)), FALSE, 
obj$spec$Fisher)
>> 
>> Cheers,
>> Ben
>> 
>> On Dec 14, 2014, at 8:07 AM, Steven Yen  wrote:
>> 
>> > My obj does not always come with a logical variable defined. So I do
>> >
>> > my.foo <- function(obj,df,digits=5){
>> > if (!is.na("obj$spec$Fisher")) Fisher<-obj$spec$Fisher
>> > ...
>> > }
>> >
>> > This works when "Fisher" is defined in/passed from obj. When it
>> > is not, I get an error:
>> >
>> > Error in (!is.na("obj$spec$Fisher")) & Fisher :
>> >  operations are possible only for numeric, logical or complex types
>> >
>> > I tried exist(Fisher), missing(Fisher)... to no vail. Any idea? Thanks.



Re: [R] Links to vignettes in the base packages

2015-01-14 Thread Martin Maechler
>>>>> Richard Cotton 
>>>>> on Wed, 14 Jan 2015 09:23:42 +0300 writes:

> Contributed packages have copies of their vignettes on CRAN (in their
> package page at /web/packages/).

> Since base packages no longer have a page here, I can't find a web link 
to them.

> I'm aware that I can find the vignette via browseVignettes() or
> vignette("vignettename", package = "packagename").  There are two use
> cases where a web link is more useful.  If someone is asking for help
> and I want to direct them to a vignette, having a link to a vignette
> rather than getting them to open R and type a command increases the
> chance that they will actually bother to look at the thing. Secondly,
> if I want to read a vignette on a machine without R (ebook reader,
> phone, whatever), then a web link is more convenient.

> Are the base vignettes available online?  

I bet they are ... and I would have bet so even before doing a quick check.

> If so, where are they?  

Indeed, one place is "here" (the place I work) i.e., from
http://stat.ethz.ch/R-manual/

(which I think is one of the oldest still working R related "web pages"
 ... and yes, it shows its age :-)

> If not, is it feasible to make them available?

The drawback of the above is that you see the future versions of
the manuals and I agree that it may make sense to provide them
in a .r-project.org  location  (where I don't "see" them).

Martin Maechler,
ETH Zurich

> -- 
> Regards,
> Richie



Re: [R] two-sample KS test: data becomes significantly different after normalization

2015-01-14 Thread Martin Maechler
> Monnand  
> on Wed, 14 Jan 2015 07:17:02 + writes:

> I know this must be a wrong method, but I cannot help asking: Can I just
> use the p-value from the KS test, saying that if the p-value is greater
> than \beta, then the two samples are from the same distribution? If the
> definition of the p-value is the probability that the null hypothesis is
> true, 

Ouch, ouch, ouch, ouch 

The worst misuse/misunderstanding of statistics  now even on R-help ...

---> please get help from a statistician !!

--> and erase that sentence from your mind (unless you are a pro
and want to keep it for anecdotal or didactical purposes...) 

> then why do so few people use the p-value as a "true" probability?
> E.g., normally people will not multiply or add p-values to get the
> probability that two independent null hypotheses are both true, or that
> one of them is true. I have had this question for a very long time.

> -Monnand

> On Tue Jan 13 2015 at 2:47:30 PM Andrews, Chris 
> wrote:

>> This sounds more like quality control than hypothesis testing.  Rather
>> than statistical significance, you want to determine what is an 
acceptable
>> difference (an 'equivalence margin', if you will).  And that is a 
question
>> about the application, not a statistical one.
>> 
>> From: Monnand [monn...@gmail.com]
>> Sent: Monday, January 12, 2015 10:14 PM
>> To: Andrews, Chris
>> Cc: r-help@r-project.org
>> Subject: Re: [R] two-sample KS test: data becomes significantly different
>> after normalization
>> 
>> Thank you, Chris!
>> 
>> I think it is exactly the problem you mentioned. I did consider
>> 1000-point data is a large one at first.
>> 
>> I down-sampled the data from 1000 points to 100 points and ran KS test
>> again. It worked as expected. Is there any typical method to compare
>> two large samples? I also tried KL divergence, but it only gives me some
>> number and does not tell me how large the distance should be to be
>> considered significantly different.
>> 
>> Regards,
>> -Monnand
>> 
>> On Mon, Jan 12, 2015 at 9:32 AM, Andrews, Chris 
>> wrote:
>> >
>> > The main issue is that the original distributions are the same, you
>> shift the two samples *by different amounts* (about 0.01 SD), and you 
have
>> a large (n=1000) sample size.  Thus the new distributions are not the 
same.
>> >
>> > This is a problem with testing for equality of distributions.  With
>> large samples, even a small deviation is significant.
>> >
>> > Chris
>> >
>> > -Original Message-
>> > From: Monnand [mailto:monn...@gmail.com]
>> > Sent: Sunday, January 11, 2015 10:13 PM
>> > To: r-help@r-project.org
>> > Subject: [R] two-sample KS test: data becomes significantly different
>> after normalization
>> >
>> > Hi all,
>> >
>> > This question is sort of related to R (I'm not sure if I used an R
>> function
>> > correctly), but also related to stats in general. I'm sorry if this is
>> > considered as off-topic.
>> >
>> > I'm currently working on a data set with two sets of samples. The csv
>> file
>> > of the data could be found here: http://pastebin.com/200v10py
>> >
>> > I would like to use KS test to see if these two sets of samples are 
from
>> > different distributions.
>> >
>> > I ran the following R script:
>> >
>> > # read data from the file
>> >> data = read.csv('data.csv')
>> >> ks.test(data[[1]], data[[2]])
>> > Two-sample Kolmogorov-Smirnov test
>> >
>> > data:  data[[1]] and data[[2]]
>> > D = 0.025, p-value = 0.9132
>> > alternative hypothesis: two-sided
>> > The KS test shows that these two samples are very similar. (In fact, 
they
>> > should come from same distribution.)
>> >
>> > However, due to some reasons, instead of the raw values, the actual 
data
>> > that I will get will be normalized (zero mean, unit variance). So I 
tried
>> > to normalize the raw data I have and run the KS test again:
>> >
>> >> ks.test(scale(data[[1]]), scale(data[[2]]))
>> > Two-sample Kolmogorov-Smirnov test
>> >
>> > data:  scale(data[[1]]) and scale(data[[2]])
>> > D = 0.3273, p-value < 2.2e-16
>> > alternative hypothesis: two-sided
>> > The p-value becomes almost zero after normalization indicating these 
two
>> > samples are significantly different (from different distributions).
>> >
>> > My question is: How the normalization could make two similar samples
>> > becomes different from each other? I can see that if two samples are
>> > different, then normalization could make them similar. However, if two
>> sets
>> > of data are similar, then intuitively, applying same operation onto 
them
>> 

Re: [R] sparse matrix from vector outer product

2015-01-15 Thread Martin Maechler
> Philipp A 
> on Wed, 14 Jan 2015 14:02:40 + writes:

> Hi,
> creating a matrix from two vectors a, b by multiplying each combination 
can
> be done e.g. via

> a %*% t(b)

> or via

> outer(a, b)  # default for third argument is '*'

really the best (most efficient) way would be

   tcrossprod(a, b)

> But this yields a normal matrix.
of course.

Please always use small self-contained example code,
here, e.g.,

a <- numeric(17); a[3*(1:5)] <- 10*(5:1)
b <- numeric(12); b[c(2,3,7,11)] <- 1:3


> Is there an efficient way to create sparse matrices (from the Matrix
> package) like that?

> Right now i’m doing

> a.sparse = as(a, 'sparseVector')
> b.sparse = as(t(b), 'sparseMatrix')
> a.sparse %*% b.sparse

> but this strikes me as wasteful.

not really wasteful, I think.  But there is a nicer and more efficient way:

require(Matrix)
tcrossprod(as(a, "sparseVector"),
   as(b, "sparseVector"))

now also gives

 17 x 12 sparse Matrix of class "dgCMatrix"

  [1,] .  .   . . . .   . . . .  . .
  [2,] .  .   . . . .   . . . .  . .
  [3,] . 50 100 . . . 150 . . . 50 .
  [4,] .  .   . . . .   . . . .  . .
  [5,] .  .   . . . .   . . . .  . .
  [6,] . 40  80 . . . 120 . . . 40 .
  [7,] .  .   . . . .   . . . .  . .
  [8,] .  .   . . . .   . . . .  . .
  [9,] . 30  60 . . .  90 . . . 30 .
 [10,] .  .   . . . .   . . . .  . .
 [11,] .  .   . . . .   . . . .  . .
 [12,] . 20  40 . . .  60 . . . 20 .
 [13,] .  .   . . . .   . . . .  . .
 [14,] .  .   . . . .   . . . .  . .
 [15,] . 10  20 . . .  30 . . . 10 .
 [16,] .  .   . . . .   . . . .  . .
 [17,] .  .   . . . .   . . . .  . .
>


Re: [R] "Negative length vectors are not allowed" error

2015-01-20 Thread Martin Maechler

> Hi all, I have a question concerning an error that occurs when using the 
> inspect() function on a set of rules made by the apriori function from the 
> arules package. I have a dataset with 12 million records. It contains some 
> basic sales information (such as product, customer data). I want to use the 
> apriori function from the arules package on it: ruleset <- apriori(sales, 
> parameter=list(support=0.0005, confidence=0.1, minlen=2))It gives me 780379 
> rules. I want to have that much rules on purpose, so hence the parameters 
> being so low (I guess you can reproduce this problem with any large dataset 
> and low settings for support and confidence). But then I want to check out 
> the rules with inspect. It has a subset because I'm only interested in rules 
> with the attribute Product in the rhs. inspect(subset(ruleset, subset=rhs 
> %pin% "Product="))Then this error occurs: Error in 
> inspect(subset(sales3ruleset, subset = rhs %pin% "Product=")) : 
>   error in evaluating the argument 'x' in selecting a method for function 
> 'inspect': Error in .Call("R_or_ngCMatrix", x@data, y@data, PACKAGE = 
> "arules") : 
>   negative length vectors are not allowedI looked around and apparently that 
> part about "negative length vectors are not allowed" means that you want to 
> create a vector that is larger than 2^31. How can you get around this limit? 
> Or how can I make the inspectfunction work in this case?Thanks in advance!Kim 
> 
Dear Kim,

if you learned to post (i.e. write that e-mail) in plain text,
the above would look more humane..

Still, I was able to decipher it and you are right in that
you hit a limitation of the current setup which may well
be linked to the Matrix package which I maintain, and on which
'arules' depends.

Can you please try to find a reproducible example [with randomly
generated data; i.e., you'd use  set.seed(),  runif(), rpois(),
rmultinom(), rnorm(),  ...] so we,
the maintainer of 'arules' Michael Hahsler (BCC'ed: use 
maintainer("arules") to find such an e-mail address),
and myself can look if and how that limitation might be lifted.
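
A hedged sketch of the shape such an example could take (sizes and
density invented, and far too small to actually hit the 2^31 limit):

   library(arules)
   set.seed(1)
   ## random logical "incidence" matrix: 10000 baskets x 50 products
   m <- matrix(runif(10000 * 50) < 0.02, nrow = 10000,
               dimnames = list(NULL, paste0("Product=", 1:50)))
   sales <- as(m, "transactions")
   ruleset <- apriori(sales, parameter = list(support = 0.0005,
                                              confidence = 0.1, minlen = 2))
   inspect(subset(ruleset, subset = rhs %pin% "Product="))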

Best regards,
Martin Maechler, ETH Zurich



Re: [R] Re-order levels of a categorical (factor) variable

2015-01-22 Thread Martin Maechler
>>>>> Bert Gunter 
>>>>> on Wed, 21 Jan 2015 18:52:12 -0800 writes:

> Bill/Ravi:
> I believe the problem is that the factor is automatically created when
> a data frame is created by read.table(). By default, the levels are
> lexicographically ordered. The following reproduces the problem and
> gives a solution.

>> library(lattice)

>> z <- data.frame(y = 1:9, x = rep(c("pre", "day2","day10")))
>> xyplot(y~x,data=z) ## x axis order is day 10, day2, pre

>> levels(z$x)
> [1] "day10" "day2"  "pre"

>> z$x <- factor(as.character(z$x),levels=c(levels(z$x)[3:1])) ## 
explicitly defines level order
>> xyplot(y~x,data=z) ##  desired plot

Indeed, thank you, Bert,
and using  levels(.) <- *   does *not* work... (as I first thought).

However, slightly shorter and maybe even easier to remember than

 z$x <- factor(as.character(z$x), levels = c(levels(z$x)[3:1]))  ## def. level order

is

 z$x <- factor(z$x, levels = levels(z$x)[3:1])   ## def. level order


Martin Maechler, ETH Zurich



Re: [R] Sum function and missing values --- need to mimic SAS sum function

2015-01-26 Thread Martin Maechler
> Jim Lemon 
> on Mon, 26 Jan 2015 11:21:03 +1100 writes:

> Hi Allen, How about this:

> sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))

Excuse me, Jim, but that's yet another  "horrible misuse of  ifelse()"

John Fox's reply *did* contain  the "proper" solution

 if (all(is.na(x))) NA else sum(x, na.rm=TRUE)
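
Wrapped as a function (the name 'sumSAS' is made up here, for
illustration):

   sumSAS <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
   sumSAS(c(1, NA, 2))  ## 3
   sumSAS(c(NA, NA))    ## NA, not 0 -- as SAS's sum()
   ## row-wise over two variables, as in the question:
   apply(cbind(c(1, NA), c(NA, NA)), 1, sumSAS)  ## 1 NA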

The ifelse() function should never be used in such cases.
Read more after googling
 
"Do NOT use ifelse()"

-- include the quotes in your search --

or directly at
   http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html

Yes, this has been on R-help a month ago..
Martin

> On Mon, Jan 26, 2015 at 10:21 AM, Allen Bingham
>  wrote:
>> I understand that in order to get the sum function to
>> ignore missing values I need to supply the argument
>> na.rm=TRUE. However, when summing numeric values in which
>> ALL components are "NA" ... the result is 0.0 ... instead
>> of (what I would get from SAS) of NA (or in the case of
>> SAS ".").
>> 
>> Accordingly, I've had to go to 'extreme' measures to get
>> the sum function to result in NA if all arguments are
>> missing (otherwise give me a sum of all non-NA elements).
>> 
>> So for example here's a snippet of code that ALMOST does
>> what I want:
>> 
>> 
>> 
SumValue<-apply(subset(InputDataFrame,!is.na(Variable.1)|!is.na(Variable.2),
>> select=c(Variable.1,Variable.2)),1,sum,na.rm=TRUE)
>> 
>> In reality this does NOT give me records with NA for
>> SumValue ... but it doesn't give me values for any
>> records in which both Variable.1 and Variable.2 are NA
>> --- which is "good enough" for my purposes.
>> 
>> I'm guessing with a little more work I could come up with
>> a way to adapt the code above so that I could get it to
>> work like SAS's sum function ...
>> 
>> ... but before I go that extra mile I thought I'd ask
>> others if they know of functions in either base R ... or
>> in a package that will better mimic the SAS sum function.
>> 
>> Any suggestions?
>> 
>> Thanks.  __ Allen
>> Bingham aebingh...@gmail.com
>> 


Re: [R] the less-than-minus gotcha

2015-02-02 Thread Martin Maechler

> All the more reason to use = instead of <-

Definitely not!

(As you were told, there are other drawbacks).

R does not have to look like C, it *is* different in many ways.
If you use a decent IDE for R, you get spaces around ' <- ' for
free: Both in ESS and in RStudio, you can use  "[Alt] -"
to produce the 4 characters ' <- '.

{ [Alt] + "-" is called 'M--' in ESS / emacs, which has even
  more options for " <- " and is fully configurable in its key
  bindings anyway. }

The '=' character has many uses in R, and using  ' <- '
for assignment makes the code "more expressive": It makes sense
to highlight the assignment op, but it is a bit stupid to
highlight all "=" signs.  Further, it can be nicely marked up as
a real "left arrow" by e.g. the LaTeX 'listings' package, or the
(oldish) 'a2ps' GNU software.

Further, assignment is not commutative, and hence, 
there is a corresponding ` -> `  operator,
whereas the '=' is a commutative operator in mathematics, but
not when used as assignment op.

[ yes: "Flame war is on.  I'll stop reading R-help for a while.."
  ;-) ;-) ]


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker
> Sent: Monday, 2 February 2015 2:07p
> To: r-h...@stat.math.ethz.ch
> Subject: Re: [R] the less-than-minus gotcha

> Mike Miller  gmail.com> writes:

> > 
> > I've got to remember to use more spaces.  Here's the basic problem:
> > 
> > These are the same:
> > 
> > v< 1
> > v<1
> > 
> > But these are extremely different:
> > 
> > v< -1
> > v<-1
> > 

> This is indeed documented, in passing, in one of the pages you listed:

> http://tim-smith.us/arrgh/syntax.html

> Whitespace is meaningless, unless it isn't. Some parsing ambiguities 
> are resolved by considering whitespace around operators. See and
> despair: x<-y (assignment) is parsed differently than x < -y (comparison)!



  1   2   3   4   5   6   7   8   9   >