Re: [R] What exactly is an dgCMatrix-class. There are so many attributes.
>>>>> C W >>>>> on Fri, 20 Oct 2017 15:51:16 -0400 writes:

> Thank you for your responses. I guess I don't feel alone. I don't find
> that the documentation goes into any detail.
> I also find it surprising that,
>> object.size(train$data)
> 1730904 bytes
>> object.size(as.matrix(train$data))
> 6575016 bytes
> the dgCMatrix actually takes less memory, though it *looks* like the
> opposite.

to whom?

The whole idea of these sparse matrix classes in the 'Matrix' package (and
everywhere else in applied math, CS, ...) is that

1. they need much less memory, and
2. matrix arithmetic with them can be much faster because it is based on
   sophisticated sparse matrix linear algebra, notably the sparse Cholesky
   decomposition for solve() etc.

Of course the efficiency only applies if most of the matrix entries _are_ 0.
You can measure the "sparsity", or rather the "density", of a matrix by

  nnzero(A) / length(A)

where length(A) == nrow(A) * ncol(A) as for regular matrices (but it does
*not* integer overflow), and nnzero(.) is a simple utility from Matrix
which -- very efficiently for sparseMatrix objects -- gives the number of
nonzero entries of the matrix.

All of these classes are formally defined classes and therefore have help
pages. Here ?dgCMatrix-class, which then points to ?CsparseMatrix-class
(and I forget if RStudio really helps you find these ..; in emacs ESS they
are found nicely via the usual key).

To get started, you may further look at ?Matrix _and_ ?sparseMatrix
(and possibly the Matrix package vignettes --- though they need work --
I'm happy for collaborators there !)
Bill Dunlap's comment applies indeed: In principle all these matrices
should work like regular numeric matrices, just faster and with a smaller
memory footprint if they are really sparse (and not just formally of a
sparseMatrix class). ((And there are quite a few more niceties in the
package.))

Martin Maechler (here, maintainer of 'Matrix')

> On Fri, Oct 20, 2017 at 3:22 PM, David Winsemius wrote:
>> > On Oct 20, 2017, at 11:11 AM, C W wrote:
>> >
>> > Dear R list,
>> >
>> > I came across dgCMatrix. I believe this class is associated with sparse
>> > matrix.
>>
>> Yes. See:
>>
>>   help('dgCMatrix-class', pack=Matrix)
>>
>> If Martin Maechler happens to respond to this you should listen to him
>> rather than anything I write. Much of what the Matrix package does
>> appears to be magical to one such as I.
>>
>> > I see there are 8 attributes to train$data, I am confused why there
>> > are so many, some are vectors, what do they do?
>> >
>> > Here's the R code:
>> >
>> > library(xgboost)
>> > data(agaricus.train, package='xgboost')
>> > data(agaricus.test, package='xgboost')
>> > train <- agaricus.train
>> > test <- agaricus.test
>> > attributes(train$data)
>>
>> I got a bit of an annoying surprise when I did something similar. It
>> appeared to me that I did not need to load the xgboost library since all
>> that was being asked was "where is the data" in an object that should be
>> loaded from that library using the `data` function. The last command
>> asking for the attributes filled up my console with a 100K length vector
>> (actually 2 of such vectors). The `str` function returns a more useful
>> result.
>>
>> > data(agaricus.train, package='xgboost')
>> > train <- agaricus.train
>> > names( attributes(train$data) )
>> [1] "i"        "p"        "Dim"      "Dimnames" "x"        "factors"
>> "class"
>> > str(train$data)
>> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
>>   ..@ i : int [1:143286] 2 6 8 11 18 20 21 24 28 32 ...
>> ..@ p : int [1:127] 0 369 372 3306 5845 6489 6513 8380 8384 10991 >> ... >> ..@ Dim : int [1:2] 6513 126 >> ..@ Dimnames:List of 2 >> .. ..$ : NULL >> .. ..$ : chr [1:126] "cap-shape=bell" "cap-shape=conical" >> "cap-shape=convex" "cap-shape=flat" ... >> ..@ x : num [1:143286] 1 1 1 1 1 1 1 1 1 1 ... >> ..@ factors : list() >> >> > Where is the data, is i
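[A small self-contained sketch of the density measure and memory comparison described above; the 1000 x 200 matrix here is made up for illustration, standing in for train$data:]

```r
library(Matrix)

set.seed(1)
A <- rsparsematrix(1000, 200, density = 0.01)  # a random dgCMatrix, ~1% nonzero

## the "density" of the matrix, as described above:
nnzero(A) / length(A)      # about 0.01; length(A) == nrow(A) * ncol(A)

## the sparse form needs far less memory than its dense equivalent:
object.size(A)
object.size(as.matrix(A))
```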
Re: [R] nls() and loop
Dear Vangi,

>>>>> Evangelina Viotto >>>>> on Fri, 20 Oct 2017 11:37:12 -0300 writes:

> Hello, I need to fit a growth curve with length-age data. I want to
> evaluate which function best predicts my data; to do so I compare the
> Akaikes of different models. I now need to evaluate whether changing the
> initial values changes the parameters, and which ones do not allow
> estimating the model.
> To do this I use the function nls();

good!

> and I randomize the initial values (real positive numbers).

Not a very good idea, I'm sorry to say: You have enough observations to
fit such a simple 3-parameter model. I'm using your data, showing you two
models, both provided with R as "self starting models" already.
Self-starting means they use the data (and math and linear least squares)
to get good initial (aka "starting") values for the parameters.
The first model, SSasymp(), is equivalent to yours but more smartly
parametrized: it uses exp(lrc) (!) -- see help(SSasymp) in R;
the 2nd model assumes the true curve goes through the origin (= (0,0))
and hence uses one parameter less. As we will see, both models fit ok,
but the simpler model may be preferable.
Here is the (self contained) R code, including your data at the beginning:

ANO <- c(1.65, 1.69, 1.72, 1.72, 1.72, 1.72, 1.73, 2.66, 2.66, 2.66, 2.66,
         2.76, 2.76, 2.76, 2.76, 2.78, 2.8, 3.65, 3.65, 3.65, 3.78, 3.78,
         5.07, 7.02, 7.1, 7.81, 8.72, 8.74, 8.78, 8.8, 8.8, 8.83, 8.98,
         9.1, 9.11, 9.75, 9.82, 9.84, 9.87, 9.87, 10.99, 11.67, 11.8,
         11.81, 13.93, 14.83, 15.82, 15.99, 16.87, 16.88, 16.9, 17.68,
         17.79, 17.8, 17.8)
SVL <- c(26.11, 29.02, 41.13, 30.96, 37.74, 29.02, 33.38, 24.18, 34.35,
         35.8, 29.99, 42.59, 27.57, 47.43, 46.95, 30.47, 29.75, 35.8,
         40.65, 36.29, 34.83, 29.02, 43.5, 75, 68, 70, 67.5, 80, 77.5,
         68, 68, 73.84, 72.14, 68, 64.5, 58.5, 72.63, 78.44, 71.17,
         70.69, 77, 79, 78, 68.5, 69.72, 71.66, 77, 77, 79, 76.5, 78.5,
         79, 73, 80, 69.72)
d.SA <- data.frame(SVL, ANO) # creo data frame {but do _not_ call it 'data'}
str(d.SA)     ## 55 x 2
summary(d.SA) # to get an idea; the plot below is more useful

## MM: just do this: it's equivalent to your model (but better parametrized!)
fm1 <- nls(SVL ~ SSasymp(ANO, Asym, lrc, c0), data = d.SA)
summary(fm1)

## Compute nicely spaced predicted values for plotting:
ANO. <- seq(-1/2, 30, by = 1/8)
SVL. <- predict(fm1, newdata = list(ANO = ANO.))
plot(SVL ~ ANO, d.SA, xlim = range(ANO, ANO.), ylim = range(SVL, SVL.))
lines(SVL. ~ ANO., col="red", lwd=2)
abline(h = coef(fm1)[["Asym"]], col = "tomato", lty=2, lwd = 1.5)
abline(h=0, v=0, lty=3, lwd = 1/2)

## Both from summary (where 'lrc' has large variance) and because of the fit,
## trying the asymptotic model through the origin ==> 1 parameter less:
fmO <- nls(SVL ~ SSasympOrig(ANO, Asym, lrc), data = d.SA)
summary(fmO) ## (much better determined)
SVL.O <- predict(fmO, newdata = list(ANO = ANO.))
lines(SVL.O ~ ANO., col="blue", lwd=2) # does go through origin (0,0)
abline(h = coef(fmO)[["Asym"]], col = "skyblue", lty=2, lwd = 1.5)

## The fits look close; I'd probably take the simpler one:
## and classical statistical inference also does not see a significant
## difference between the two fitted models :
anova(fm1, fmO)
## Analysis of Variance Table
## Model 1: SVL ~ SSasymp(ANO, Asym, lrc, c0)
## Model 2: SVL ~ SSasympOrig(ANO, Asym, lrc)
##   Res.Df Res.Sum Sq Df Sum Sq F value Pr(>F)
## 1     52     2635.1
## 2     53     2702.6 -1 -67.55   1.333 0.2535

---

I see that the 10 nice self-starting models that came with nls already in
the 1990's are not known and probably not understood by many. I'm working
at making their help pages nicer, notably by slightly enhancing the nice
model-visualizing plot you already get in R when you run
example(SSasymp) or example(SSasympOrig)
(but unfortunately, they currently use 'lwd = 0' to draw the asymptote,
which shows fine on a PDF but not on a typical screen graphics device.)

Martin Maechler
ETH Zurich and R Core team

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
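[For reference, a sketch not part of the original post: the mean functions behind the two self-starting models used above are documented in help(SSasymp) and help(SSasympOrig), and can be checked against the fit directly. This assumes fm1 and d.SA from the code above; fm1 used the name 'c0' for SSasymp's R0 parameter:]

```r
## SSasymp(x, Asym, R0, lrc):   Asym + (R0 - Asym) * exp(-exp(lrc) * x)
## SSasympOrig(x, Asym, lrc):   Asym * (1 - exp(-exp(lrc) * x))

cf <- coef(fm1)
f1 <- function(x)
    cf[["Asym"]] + (cf[["c0"]] - cf[["Asym"]]) * exp(-exp(cf[["lrc"]]) * x)
all.equal(f1(d.SA$ANO), as.vector(fitted(fm1)))  # expect TRUE
```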
Re: [R] What exactly is an dgCMatrix-class. There are so many attributes.
>>>>> C W >>>>> on Fri, 20 Oct 2017 16:01:06 -0400 writes:

> Subsetting using [] vs. head() gives different results.
> R code:
>> head(train$data, 5)
> [1] 0 0 1 0 0

The above is surprising ... and points to a bug somewhere. It is
different (and correct) after you do require(Matrix), but I think
something like that should happen semi-automatically.

As I just see, it is even worse if you get the data from xgboost without
loading the xgboost package, which you can do (and is also more
efficient!): If you start R, and then do

  data(agaricus.train, package='xgboost')
  loadedNamespaces() # does not contain "xgboost" nor "Matrix"

so, no wonder head(agaricus.train $ data) does not find head()'s "Matrix"
method [which _is_ exported by Matrix via exportMethods(.)].

But even more curiously, even after I do

  loadNamespace("Matrix")

methods(head) now does show the "Matrix" method, but then head() *still*
does not call it. There's a bug somewhere, and I suspect it's in R's
data() or the methods package or ?? rather than in 'Matrix'. But that
will be another thread on R-devel or R's bugzilla.

Martin

>> train$data[1:5, 1:5]
> 5 x 5 sparse Matrix of class "dgCMatrix"
>      cap-shape=bell cap-shape=conical cap-shape=convex
> [1,]              .                 .                1
> [2,]              .                 .                1
> [3,]              1                 .                .
> [4,]              .                 .                1
> [5,]              .                 .                1
>      cap-shape=flat cap-shape=knobbed
> [1,]              .                 .
> [2,]              .                 .
> [3,]              .                 .
> [4,]              .                 .
> [5,]              .                 .

> On Fri, Oct 20, 2017 at 3:51 PM, C W wrote:
>> Thank you for your responses.
>>
>> I guess I don't feel alone. I don't find that the documentation goes
>> into any detail.
>>
>> I also find it surprising that,
>>
>> > object.size(train$data)
>> 1730904 bytes
>>
>> > object.size(as.matrix(train$data))
>> 6575016 bytes
>>
>> the dgCMatrix actually takes less memory, though it *looks* like the
>> opposite.
>>
>> Cheers!
>>
>> On Fri, Oct 20, 2017 at 3:22 PM, David Winsemius wrote:
>>>
>>> > On Oct 20, 2017, at 11:11 AM, C W wrote:
>>> >
>>> > Dear R list,
>>> >
>>> > I came across dgCMatrix.
I believe this class is associated with sparse
>>> > matrix.
>>>
>>> Yes. See:
>>>
>>>   help('dgCMatrix-class', pack=Matrix)
>>>
>>> If Martin Maechler happens to respond to this you should listen to him
>>> rather than anything I write. Much of what the Matrix package does
>>> appears to be magical to one such as I.
>>>
>>> > I see there are 8 attributes to train$data, I am confused why there
>>> > are so many, some are vectors, what do they do?
>>> >
>>> > Here's the R code:
>>> >
>>> > library(xgboost)
>>> > data(agaricus.train, package='xgboost')
>>> > data(agaricus.test, package='xgboost')
>>> > train <- agaricus.train
>>> > test <- agaricus.test
>>> > attributes(train$data)
>>>
>>> I got a bit of an annoying surprise when I did something similar. It
>>> appeared to me that I did not need to load the xgboost library since
>>> all that was being asked was "where is the data" in an object that
>>> should be loaded from that library using the `data` function. The last
>>> command asking for the attributes filled up my console with a 100K
>>> length vector (actually 2 of such vectors). The `str` function returns
>>> a more useful result.
>>>
>>> > data(agaricus.train, package='xgboost')
>>> > train <- agaricus.train
>>> > names( attributes(train$data) )
>>> [1] "i"        "p"        "Dim"      "Dimnames" "x"        "factor
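[To see what those slots actually encode (the compressed sparse column, "CsparseMatrix", layout), here is a sketch on a tiny made-up matrix:]

```r
library(Matrix)

M <- Matrix(c(0, 2, 0,
              1, 0, 0,
              0, 3, 4), nrow = 3, byrow = TRUE, sparse = TRUE) # a dgCMatrix

## @x: the nonzero values, stored column by column
## @i: 0-based row index of each value in @x
## @p: column "pointers" -- the entries of column j sit at positions
##     (p[j]+1) .. p[j+1] of @x (1-based), so diff(M@p) counts the
##     nonzeros per column
M@x          # 1 2 3 4
M@i          # 1 0 2 2
M@p          # 0 1 3 4
diff(M@p)    # 1 2 1   (nonzeros in columns 1..3)
```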
Re: [R] What exactly is an dgCMatrix-class. There are so many attributes.
>>>>> David Winsemius >>>>> on Sat, 21 Oct 2017 09:05:38 -0700 writes:

>> On Oct 21, 2017, at 7:50 AM, Martin Maechler wrote:
[... Martin's reply of Oct 21 and the earlier str(train$data) output,
quoted verbatim in full; snipped ...]
Re: [R] Syntax for fit.contrast (from package gmodels)
> Sorkin, John > on Sun, 22 Oct 2017 22:56:16 + writes:

> David,
> Thank you for responding to my post.
> Please consider the following output (typeregional is a factor having
> two levels, "regional" vs. "general"):

> Call:
> glm(formula = events ~ type, family = poisson(link = log), data = data,
>     offset = log(SS))

> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -43.606  -17.295   -4.651    4.204   38.421

> Coefficients:
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -2.52830    0.01085 -233.13   <2e-16 ***
> typeregional  0.33788    0.01641   20.59   <2e-16 ***

> Let's forget for a moment that the model is a Poisson regression and
> pretend that the output is from a simple linear regression, e.g. from lm.
> To get the estimate for "general" one simply needs to use the value of
> the intercept, i.e. -2.52830. Similarly, to get the 95% CI of general
> one simply needs to compute -2.52830-(1.96*0.01085) and
> -2.52830+(1.96*0.01085).

I'm pretty sure you can just use (the base R) functions dummy.coef() or
model.tables(), possibly with SE=TRUE, to get coefficients for all levels
of a factor. I'd like to have tried to show this here, but for that we'd
have wanted to see a "MRE" or "ReprEx" (minimal reproducible example) ..

> To get the estimate for "regional" one needs to compute intercept +
> typeregional, i.e. -2.52830 + 0.33788. Getting the 95% CI is somewhat
> more difficult, as one needs to use results from the variance-covariance
> matrix, specifically the variance of intercept, the variance of
> "regional", and the covariance of (intercept, "regional"), which involves:
> var = var(intercept) + var(regional) + 2*(covar(intercept, regional)),
> and then get the SE from the variance
> SE = sqrt(var)
> 95% CI = intercept + regional - 1.96*SE and intercept + regional + 1.96*SE.
> I was hoping that a contrast statement could be written that would give
> me the point estimate and SE for "general", and another contrast
> statement could be written that would give me the point estimate and SE
> for "regional", without my having to work directly with the
> variance-covariance matrix. I tried doing this using fit.contrast
> statements (from the gmodels package):
> fit.contrast(model, type, c(1,0), showall=TRUE)
> fit.contrast(model, type, c(0,1), showall=TRUE)
> and received the error message,
> Error in `[[<-`(`*tmp*`, varname, value = c(0, 1)) :
>   no such index at level 1
> Perhaps fit.contrast is not the way to accomplish my goal. Perhaps my
> goal can be accomplished without a contrast statement, but I don't know
> how.

My guess is that "standard R" aka "base R" would be sufficient to get
what you'd want, notably if you'd consider using se.contrast()
additionally.

Martin

> Thank you,
> John
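[A sketch of the base-R route suggested above, on made-up data -- an lm() fit rather than the poster's Poisson glm, whose data we don't have:]

```r
set.seed(42)
d <- data.frame(type = factor(rep(c("general", "regional"), each = 10)),
                y    = rnorm(20))
fit <- lm(y ~ type, data = d)

coef(fit)        # (Intercept) and typeregional (treatment contrasts)
dummy.coef(fit)  # coefficients for *both* levels of 'type'

## per-level point estimates, done "by hand":
cf <- coef(fit)
c(general  = cf[["(Intercept)"]],
  regional = cf[["(Intercept)"]] + cf[["typeregional"]])

## SE of the 'regional' estimate via the variance-covariance matrix,
## exactly as in the formula quoted above:
V <- vcov(fit)
sqrt(V[1, 1] + V[2, 2] + 2 * V[1, 2])
```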
Re: [R] read.table(..., header == FALSE, colClasses = )
>>>>> Benjamin Tyner >>>>> on Tue, 24 Oct 2017 07:21:33 -0400 writes:

> Jeff,
> Thank you for your reply. The intent was to construct a minimum
> reproducible example. The same warning occurs when the 'file' argument
> points to a file on disk with a million lines. But you are correct, my
> example was slightly malformed and in fact gives an error under R
> version 3.2.2. Please allow me to try again; in older versions of R,
> > read.table(file = textConnection("a\t3.14"), header = FALSE,
>              colClasses = c(x = "character", y = "numeric"), sep="\t")
>   V1   V2
> 1  a 3.14
> (with no warning). As of version 3.3.0,
> > read.table(file = textConnection("a\t3.14"), header = FALSE,
>              colClasses = c(x = "character", y = "numeric"), sep="\t")
>   V1   V2
> 1  a 3.14
> Warning message:
> In read.table(file = textConnection("a\t3.14"), header = FALSE, :
>   not all columns named in 'colClasses' exist
> My intent was not to complain but rather to learn more about best
> practices regarding the names attribute.

which is a nice attitude, thank you.

An even shorter MRE (as header=FALSE is the default, and the default
sep="" works, too):

> tt <- read.table(textConnection("a 3.14"),
                   colClasses = c(x="character", y="numeric"))
Warning message:
In read.table(file = textConnection("a 3.14"),
              colClasses = c(x = "character", :
  not all columns named in 'colClasses' exist

If you read in the help page -- you did read that before posting, did
you? --- how 'colClasses' should be specified,

  colClasses: character. A vector of classes to be assumed for the
      columns. If unnamed, recycled as necessary. If named, names are
      matched with unspecified values being taken to be 'NA'.
      Possible values are .. .

then: the 'x' and 'y' names you used are matched with the colnames ...
which on the other hand are "V1" and "V2" for you, and so you provoke a
warning. Once you have read (and understood) the above part of the help
page, it becomes easy, no?
> tt <- read.table(textConnection("a 3.14"),
                   colClasses = c("character","numeric"))
> t2 <- read.table(textConnection("a 3.14"),
                   colClasses = c(x="character", y="numeric"),
                   col.names = c("x","y"))
> t2
  x    y
1 a 3.14

i.e., no warning in either of these two cases.

So please, please, PLEASE: at least non-beginners like you *should* take
the effort to read the help page (and report if it seems incomplete or
otherwise improvable)...

Best,
Martin Maechler
ETH Zurich
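[A sketch of both the warning case and the two warning-free variants from the exchange above, using read.table(text=) for brevity:]

```r
txt <- "a\t3.14"

## names in colClasses are matched against the column names; "x"/"y"
## don't match the default "V1"/"V2", so this warns (and the classes
## are taken as NA, i.e. auto-detected):
read.table(text = txt, sep = "\t",
           colClasses = c(x = "character", y = "numeric"))

## unnamed: recycled positionally -- no warning
read.table(text = txt, sep = "\t", colClasses = c("character", "numeric"))

## named, with matching column names supplied -- no warning
read.table(text = txt, sep = "\t",
           colClasses = c(x = "character", y = "numeric"),
           col.names = c("x", "y"))
```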
Re: [R] run r script in r-fiddle
> Suzen, Mehmet > on Mon, 30 Oct 2017 11:16:30 +0100 writes:

> Hi Frank, You could upload your R source file to a public URL, for
> example to github, and read it via RCurl, as source() does not support
> https as far as I know.

well... but your knowledge is severely (:-) outdated. Why did you not try
first?

source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")

works for me even in R 3.3.0, which is really outdated itself!
Re: [R] run r script in r-fiddle
>>>>> Suzen, Mehmet >>>>> on Mon, 30 Oct 2017 16:05:18 +0100 writes:

> We were talking about r-fiddle. It gives an error there [*], that's why
> I suggested using RCurl.
>> source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
> ... unsupported URL scheme Error : cannot open the connection

and later you note that it seems to run R-3.1.2 --- which is ridiculously
outdated.

I think R-help should not be (mis)used at all to talk about R-Fiddle,
then. Notably as I think it's been provided by a company that no longer
exists under that name, and even if that'd be wrong, R-Fiddle does not
seem to be free software (apart from the R parts, I hope!).

If someone asked about his/her problems using R 3.1.2, I think we would
all tell him to go away and install a current version of R, and not waste
any more bandwidth and R-help readers' time, wouldn't we?

> On 30 October 2017 at 15:51, Martin Maechler wrote:
>>>>>>> Suzen, Mehmet on Mon, 30 Oct 2017 11:16:30 +0100 writes:
>>
>> > Hi Frank, You could upload your R source file to a public URL, for
>> > example to github and read via RCurl, as source do not support https
>> > as far as I know.
>>
>> well... but your knowledge is severely (:-) outdated. Why did you not
>> try first?
>>
>> source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
>>
>> works for me even in R 3.3.0 which is really outdated itself!
Re: [R] run r script in r-fiddle
>>>>> Suzen, Mehmet >>>>> on Tue, 31 Oct 2017 19:27:30 +0100 writes:

> Dear List, According to the datacamp support team, r-fiddle.org is not
> supported. We asked them to take it down, as Professor Maechler
> suggested it is a waste of time for R-help to respond to questions on
> something not maintained and severely outdated. If you would like to
> use R from your browser, you can embed the following into a web page:
> <script src="https://cdn.datacamp.com/datacamp-light-latest.min.js"></script>
> Currently, it supports R 3.4.0. See the code base, which is open
> source, here
> https://github.com/datacamp/datacamp-light
> Hope it helps.
> Best, Mehmet

Yes, it does! Thank you very much, Mehmet, for reaching out to get these
clarifications and reporting back here.

I'm glad to be notified that datacamp seems to adhere to an open source
philosophy also in the tools they write: For datacamp-light they _do_
adhere (*) to the much stricter GNU Public Licence (GPL) Free Software
standard, which emphasizes not only the free availability of source code,
but also other freedoms, e.g., to *modify* and re-distribute under the
GPL; see https://en.wikipedia.org/wiki/GPL for much more. An open GitHub
repo alone is _no_ guarantee of that, and unfortunately I think there's
much software there where authors don't care (or don't want) to use a
"truly" free software licence such as

(*) https://github.com/datacamp/datacamp-light/blob/master/LICENSE.md

Best, Martin Maechler

> On 31 October 2017 at 15:09, Suzen, Mehmet wrote:
>> On 31 October 2017 at 12:42, Martin Maechler wrote:
>>> Notably as I think it's been provided by a company that no longer
>>> exists under that name, and even if that'd be wrong, R-Fiddle does
>>> not seem free software (apart from the R parts, I hope !).
>> >> For the record, r-fiddle is maintained by datacamp:
>> https://www.datacamp.com/community/blog/r-fiddle-an-online-playground-for-r-code-2
Re: [R] [R-pkgs] Release of ess 0.0.1
>>>>> Jorge Cimentada >>>>> on Fri, 10 Nov 2017 14:31:43 +0100 writes:

> Thanks to all. Will consider this change in future releases.
> ---
> Jorge Cimentada
> https://cimentadaj.github.io/

> On Fri, Nov 10, 2017 at 12:41 PM, Rainer Krug wrote:
>> On 9 Nov 2017, at 15:57, Sam Steingold wrote:
>>
>> * Jorge Cimentada [2017-11-09 00:02:53 +0100]:
>>
>> I'm happy to announce the release of ess 0.0.1, a package designed to
>> download data from the European Social Survey
>>
>> Given the existence of ESS (Emacs Speaks Statistics --
>> https://ess.r-project.org/) the package name "ess" seems unfortunate.
>>
>> Agreed. I would suggest renaming the package to avoid conflicts (ESS
>> includes some R code, and I wouldn't wonder if they would like to
>> include it in a package?).

As a matter of fact, we (the ESS core developers) have e-chatted about
this and came to the conclusion that we could get along fine with an
unrelated R package called 'ess', notably as the acronym also makes sense
for the European Social Survey.

We'd like to ask the 'ess' package authors to add something like the
following to their ess/DESCRIPTION file:

  This package ess is named to match the European Social Survey (ESS).
  It is unrelated to the Emacs Speaks Statistics (ESS) project, an
  emacs-based Integrated Development Environment hosted at
  https://ess.r-project.org

and last but not least, we have thought of 'reserving' ESSR as the name
of a CRAN package that we'd consider using for the R part of ESS (there
are other, considerably less used parts, notably for Julia, Stata and SAS).

Martin Maechler, ETH Zurich
for the ESS core developers
Re: [R] ggtern and bquote...
>>>>> peter dalgaard >>>>> on Mon, 4 Dec 2017 14:55:19 +0100 writes:

>> On 4 Dec 2017, at 11:58, Levent TERLEMEZ via R-help wrote:
>>
>> Dear Users,
>>
>> What is the proper way to write symbol, superscript, subscript in
>> ggtern/ggplot? I tried every given example, every possible feature of
>> ggplot, but couldn't achieve it. I just want to write P_a, sigma^2,
>> etc.; would you please advise me about this problem.

> Did you try expression(P_a)? I don't do much gg-stuff, but I seem to
> recall that quote() doesn't quite cut it the way it does in base
> graphics.
> -pd

Yes, I vaguely remember that indeed also for the lattice package (which
is based on 'grid', the same as 'ggplot2' is ..) sometimes expressions
instead of calls are needed, i.e., expression(*) instead of just quote(*).

However, I think Levent really meant what you'd get by

  expression(P[a])

@Levent: The clue is the need for valid R syntax, and indeed, as in LaTeX
x_i often is the i-th element of x, the R syntax for indexing/subsetting
is used here, i.e.

  x[i]   for LaTeX   x_i

Last but not least, if Levent really needs bquote() [i.e. substitute()],
then a final as.expression(.) may be needed:

  identical(as.expression(quote(a == 1)), expression(a == 1)) # --> TRUE

-- Martin Maechler, ETH Zurich
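[A minimal plotmath sketch of the advice above, in base graphics; in ggplot2, the same expressions can be passed to labs() and friends:]

```r
## subscripts via [ ], superscripts via ^  -- see ?plotmath
plot(1:10, xlab = expression(P[a]), ylab = expression(sigma^2))

## bquote() substitutes a value in; wrap it in as.expression() where an
## expression (rather than a call) is required, e.g. in grid-based graphics:
s2 <- 1.5
as.expression(bquote(sigma^2 == .(s2)))
```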
Re: [R] max and pmax of NA and NaN
>>>>> Michal Burda >>>>> on Mon, 15 Jan 2018 12:04:13 +0100 writes:

> Dear R users, is the following OK?
>> max(NA, NaN)
> [1] NA
>> max(NaN, NA)
> [1] NA
>> pmax(NaN, NA)
> [1] NA
>> pmax(NA, NaN)
> [1] NaN
> ...or is it a bug?
> Documentation says that NA has a higher priority over NaN.

which documentation?? [That would be quite a bit misleading, I think, so
it should be amended ...]

> Best regards, Michal Burda

R's help pages are *THE* reference documentation, and they have (for a
long time, I think) had:

?NaN has in its 3rd 'Note:'

   Computations involving 'NaN' will return 'NaN' or perhaps 'NA':
   which of those two is not guaranteed and may depend on the R
   platform (since compilers may re-order computations).

Similarly, ?NA contains, in its 'Details':

   Numerical computations using 'NA' will normally result in 'NA': a
   possible exception is where 'NaN' is also involved, in which case
   either might result (which may depend on the R platform).

Yes, it is a bit unfortunate that this is platform dependent; if we
wanted to make this entirely consistent (as desired in a perfect world),
I'm almost sure R would become slower, because we'd have to add some
explicit "book keeping" / checking instead of relying on the underlying C
library code.

Note that for these reasons, NaN and NA should often not be
differentiated, and that's the reason why using is.na(*) is typically
sufficient and "best" -- it gives TRUE for both NA and NaN.

Martin Maechler
ETH Zurich
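[The practical upshot, as a small sketch:]

```r
x <- c(1, NA, NaN)

is.na(x)    # FALSE  TRUE  TRUE  -- catches both kinds of "missing"
is.nan(x)   # FALSE FALSE  TRUE  -- NaN only

## which of NA/NaN survives mixed arithmetic is platform dependent,
## so don't rely on it:
NA + NaN
NaN + NA
```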
Re: [R] Problem with geterrmessage()
>>>>> Dennis Fisher >>>>> on Thu, 22 Feb 2018 13:01:37 -0800 writes:

> Luke
> Thanks -- I revised the code to:
> ERRORMESSAGE <- try(source(USERSCRIPTFILE, local=T), silent=T)
> print(ERRORMESSAGE) now returns:
> $value
> [1] 0
> $visible
> [1] FALSE
> Not clear what to make of that.
> Dennis

The help page ?try has contained for a long time

   'try' is implemented using 'tryCatch'; for programming, instead of
   'try(expr, silent = TRUE)', something like 'tryCatch(expr, error =
   function(e) e)' (or other simple error handler functions) may be
   more efficient and flexible.

and you do use 'silent=T' (which is "unsafe" (*)!)

I'd strongly advocate you use the 2nd proposition given by Luke Tierney
(cited below): Learn to use tryCatch() instead of try(), and then such
things can be done considerably less obscurely.

Best,
Martin Maechler, ETH Zurich

--
*) Using 'T' instead of 'TRUE' (or 'F' instead of 'FALSE') *is* unsafe:
a previous 'T <- 0' will change what you really wanted. TRUE and FALSE
are "constants" in R, whereas T and F are variables.

> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com

>> On Feb 22, 2018, at 12:45 PM, luke-tier...@uiowa.edu wrote:
>>
>> Only the default error handler puts the error message in a buffer
>> where it can be retrieved with geterrmessage. try() replaces the
>> default error handler. Either look at the value returned by try() or
>> use tryCatch with conditionMessage.
>>
>> Best,
>> luke
>>
>> On Thu, 22 Feb 2018, Dennis Fisher wrote:
>>
>>> R 3.4.3
>>> OS X
>>>
>>> Colleagues
>>>
>>> I have a 20K line script in which I encounter an unexpected problem.
>>>
>>> If the script detects presence of a particular file USERCODE.txt, it
>>> executes:
>>>   source("USERCODE.txt")
>>> If that file is not present, the script executes without a problem.
>>> >>> There might be syntax errors in USERCODE.txt; therefore, the code above is embedded in a try command: >>> try(source(“USERCODE.txt", local=T), silent=T) >>> followed by: >>> ERRORMESSAGE <- geterrmessage() >>> >>> For unclear reasons, an earlier command is yielding an error message: >>> unused argument (\"\\n\") >>> Despite identifying the exact source of that error, I can’t fix it (and it is of no consequence). >>> >>> Ideally, I would like to clear out the pre-existing error message immediately before the “try” command (or perhaps at that particular location where it is being created) — but I can’t figure out how to do so. >>> >>> Any suggestions would be welcome. >>> >>> Dennis >>> >>> Dennis Fisher MD >>> P < (The "P Less Than" Company) >>> Phone / Fax: 1-866-PLessThan (1-866-753-7784) >>> www.PLessThan.com >>> >>> __ >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> -- >> Luke Tierney >> Ralph E. Wareham Professor of Mathematical Sciences >> University of Iowa Phone: 319-335-3386 >> Department of Statistics andFax: 319-335-3017 >> Actuarial Science >> 241 Schaeffer Hall email: luke-tier...@uiowa.edu >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
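Following Luke Tierney's advice, here is a minimal sketch of the tryCatch() idiom (the file name USERCODE.txt is taken from the thread): catch the error condition itself and extract its message with conditionMessage(), instead of fishing the message out of geterrmessage()'s global buffer afterwards:

```r
## Source a possibly-broken user script; capture its error (if any):
res <- tryCatch(source("USERCODE.txt", local = TRUE),
                error = function(e) e)
if (inherits(res, "error")) {
    ERRORMESSAGE <- conditionMessage(res)  # message of *this* error only
} else {
    ERRORMESSAGE <- ""                     # the script ran without error
}
```

Unlike geterrmessage(), this cannot be polluted by an unrelated earlier error.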
Re: [R] Possible Improvement to sapply
>>>>> Doran, Harold >>>>> on Tue, 13 Mar 2018 16:14:19 + writes: > You’re right, it sure does. My suggestion causes it to fail when simplify = ‘array’ > From: William Dunlap [mailto:wdun...@tibco.com] > Sent: Tuesday, March 13, 2018 12:11 PM > To: Doran, Harold > Cc: r-help@r-project.org > Subject: Re: [R] Possible Improvement to sapply > Wouldn't that change how simplify='array' is handled? >> str(sapply(1:3, function(x)diag(x,5,2), simplify="array")) > int [1:5, 1:2, 1:3] 1 0 0 0 0 0 1 0 0 0 ... >> str(sapply(1:3, function(x)diag(x,5,2), simplify=TRUE)) > int [1:10, 1:3] 1 0 0 0 0 0 1 0 0 0 ... >> str(sapply(1:3, function(x)diag(x,5,2), simplify=FALSE)) > List of 3 > $ : int [1:5, 1:2] 1 0 0 0 0 0 1 0 0 0 > $ : int [1:5, 1:2] 2 0 0 0 0 0 2 0 0 0 > $ : int [1:5, 1:2] 3 0 0 0 0 0 3 0 0 0 > Bill Dunlap > TIBCO Software > wdunlap tibco.com<http://tibco.com> Yes, indeed, thank you Bill! I sometimes marvel at how much the mental capacities of R core are underestimated. Of course, nobody is perfect, but the bugs we produce are really more subtle than that ... ;-) Martin Maechler R core > On Tue, Mar 13, 2018 at 6:23 AM, Doran, Harold mailto:hdo...@air.org>> wrote: > While working with sapply, the documentation states that the simplify argument will yield a vector, matrix etc "when possible". I was curious how the code actually defined "as possible" and see this within the function > if (!identical(simplify, FALSE) && length(answer)) > This seems superfluous to me, in particular this part: > !identical(simplify, FALSE) > The preceding code could be reduced to > if (simplify && length(answer)) > and it would not need to execute the call to identical in order to trigger the conditional execution, which is known from the user's simplify = TRUE or FALSE inputs. I *think* the extra call to identical is just unnecessary overhead in this instance. 
> Take for example, the following toy example code and benchmark results and a small modification to sapply: > myList <- list(a = rnorm(100), b = rnorm(100)) > answer <- lapply(X = myList, FUN = length) > simplify = TRUE > library(microbenchmark) > mySapply <- function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE){ > FUN <- match.fun(FUN) > answer <- lapply(X = X, FUN = FUN, ...) > if (USE.NAMES && is.character(X) && is.null(names(answer))) > names(answer) <- X > if (simplify && length(answer)) > simplify2array(answer, higher = (simplify == "array")) > else answer > } >> microbenchmark(sapply(myList, length), times = 1L) > Unit: microseconds > exprmin lq mean median uqmax neval > sapply(myList, length) 14.156 15.572 16.67603 15.926 16.634 650.46 1 >> microbenchmark(mySapply(myList, length), times = 1L) > Unit: microseconds > exprmin lq mean median uq max neval > mySapply(myList, length) 13.095 14.864 16.02964 15.218 15.573 1671.804 1 > My benchmark timings show a timing improvement with only that small change made and it is seemingly nominal. In my actual work, the sapply function is called millions of times and this additional overhead propagates to some overall additional computing time. > I have done some limited testing on various real data to verify that the objects produced under both variants of the sapply (base R and my modified) yield identical objects when simply is both TRUE or FALSE. > Perhaps someone else sees a counterexample where my proposed fix does not cause for sapply to behave as expected. > Harold > __ > R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
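The reason sapply() tests !identical(simplify, FALSE) rather than the proposed if (simplify && ...) can be seen directly: simplify may legally be the character string "array", which if() cannot interpret as a logical condition:

```r
simplify <- "array"

identical(simplify, FALSE)   # FALSE, so simplification goes ahead

## the proposed  if (simplify && ...)  would error out instead,
## because a string like "array" is not a valid if() condition:
tryCatch(if (simplify) "simplified",
         error = function(e) conditionMessage(e))
```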
Re: [R] Possible Improvement to sapply
>>>>> Henrik Bengtsson >>>>> on Tue, 13 Mar 2018 10:12:55 -0700 writes: > FYI, in R devel (to become 3.5.0), there's isFALSE() which will cut > some corners compared to identical(): > > microbenchmark::microbenchmark(identical(FALSE, FALSE), isFALSE(FALSE)) > Unit: nanoseconds > expr min lq mean median uq max neval > identical(FALSE, FALSE) 984 1138 1694.13 1218.0 1337.5 13584 100 > isFALSE(FALSE) 713 761 1133.53 809.5 871.5 18619 100 > > microbenchmark::microbenchmark(identical(TRUE, FALSE), isFALSE(TRUE)) > Unit: nanoseconds > expr min lq mean median uq max neval > identical(TRUE, FALSE) 1009 1103.5 2228.20 1170.5 1357 14346 100 > isFALSE(TRUE) 718 760.0 1298.98 798.0 898 17782 100 > > microbenchmark::microbenchmark(identical("array", FALSE), isFALSE("array")) > Unit: nanoseconds > expr min lq mean median uq max neval > identical("array", FALSE) 975 1058.5 1257.95 1119.5 1250.0 9299 100 > isFALSE("array") 409 433.5 658.76 446.0 476.5 9383 100 Thank you Henrik! The speed of the new isTRUE() and isFALSE() is indeed amazing compared to identical() which was written to be fast itself. Note that the new code goes back to a proposal by Hervé Pagès (of Bioconductor fame) in a thread with R core in April 2017. The goal of the new code actually *was* to allow calls like isTRUE(c(a = TRUE)) to become TRUE rather than improving speed. The new source code is at the end of R/src/library/base/R/identical.R ## NB: is.logical(.) will never dispatch: ## -- base::is.logical(x) <==> typeof(x) == "logical" isTRUE <- function(x) is.logical(x) && length(x) == 1L && !is.na(x) && x isFALSE <- function(x) is.logical(x) && length(x) == 1L && !is.na(x) && !x and one *reason* this is so fast is that all 6 functions which are called are primitives : > sapply(codetools::findGlobals(isTRUE), function(fn) is.primitive(get(fn))) ! && == is.logical is.na length TRUE TRUE TRUE TRUE TRUE TRUE and a 2nd reason probably lies with the many recent improvements of the byte compiler.
> That could probably be used also is sapply(). The difference is that > isFALSE() is a bit more liberal than identical(x, FALSE), e.g. > > isFALSE(c(a = FALSE)) > [1] TRUE > > identical(c(a = FALSE), FALSE) > [1] FALSE > Assuming the latter is not an issue, there are 69 places in base R > where isFALSE() could be used: > $ grep -E "identical[(][^,]+,[ ]*FALSE[)]" -r --include="*.R" | grep -F "/R/" > | wc > 69 3265472 > and another 59 where isTRUE() can be used: > $ grep -E "identical[(][^,]+,[ ]*TRUE[)]" -r --include="*.R" | grep -F "/R/" > | wc > 59 3075021 Beautiful use of 'grep' -- thank you for those above, as well. It does need a quick manual check, but if I use the above grep from Emacs (via 'M-x grep') or even better via a TAGS table and M-x tags-query-replace I should be able to do the changes pretty quickly... and will start looking into that later today. Interestingly and to my great pleasure, the first part of the 'Subject' of this mailing list thread, "Possible Improvement", *has* become true after all -- -- thanks to Henrik ! Martin Maechler ETH Zurich > On Tue, Mar 13, 2018 at 9:21 AM, Doran, Harold wrote: > > Quite possibly, and I’ll look into that. Aside from the work I was doing, > > however, I wonder if there is a way such that sapply could avoid the > > overhead of having to call the identical function to determine the > > conditional path. > > > > > > > > From: William Dunlap [mailto:wdun...@tibco.com] > > Sent: Tuesday, March 13, 2018 12:14 PM > > To: Doran, Harold > > Cc: Martin Morgan ; r-help@r-project.org > > Subject: Re: [R] Possible Improvement to sapply > > > > Could your code use vapply instead of sapply? vapply forces you to declare > > the type and dimensions > > of FUN's output and stops if any call to FUN does not match the > > declaration. It can use much less > > memory and time than sapply because it fills in the output array as it goes > > instead of calling lapply() > > and seeing how it could be simplified. 
> > > > Bill Dunlap > > TIBCO Software > > wdunlap tibco.com<http://tibco.com> > > > > On Tue, Mar 13, 2018 at 7:06 AM, Doran, Harold > > mailto:hdo...@air.org>&g
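Henrik's point about isFALSE() being more liberal than identical(x, FALSE) can be verified in a few self-contained checks (requires R >= 3.5.0, where isFALSE() was introduced):

```r
isFALSE(c(a = FALSE))            # TRUE:  names are ignored
identical(c(a = FALSE), FALSE)   # FALSE: the name makes the objects differ

isFALSE(FALSE)   # TRUE
isFALSE(TRUE)    # FALSE
isFALSE(0)       # FALSE: numeric 0 is not a logical
isFALSE(NA)      # FALSE: NA is logical, but not FALSE
```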
Re: [R] Fwd: the same function returning different values when called differently..
>>>>> Eric Berger >>>>> on Wed, 14 Mar 2018 18:27:38 +0200 writes: > Hi Akshay, (Please include r-help when replying) > You have learned that PFC.NS and snl[[159]] are not > identical. Now you have to figure out why they > differ. This could also point to a bug or a logic error in > your program. Figuring out how two objects differ can be > a bit tricky, but with experience it becomes easier. (Some > others may even have some suggestions for good ways to do > it.) Basically you would work your way down. At the top > level is the class of each, which you already tested and > they are identical. > Now try: >> str(PFC.NS) > and compare it to >> str(snl[[159]]) Note further that all.equal(obj1, obj2) may be a good start to find differences between R objects 'obj1' and 'obj2', as all.equal() is generic and has a list method (which works recursively). The output may be huge, but then if there is a huge number of differences you would have found them yourself anyway, and will not need all.equal(). Martin Maechler ETH Zurich and R Core > Look closely at the two outputs to try to detect > differences. If they are the same then you will have to > examine the sub-objects described by str(). You didn't > mention what type of objects they are. Suppose they are > both "numeric" vectors. Then you can check whether their > lengths are equal. And you can compare their values, > etc. etc. No short cut that I can think of without more > information. > It definitely takes work to find discrepancies. Think of > it as a challenge and perhaps write functions that you can > use to automate this kind of comparison in the future. > (Again, other people on this list might be able to point > to tools that help with this for objects of specific > type.) > Good luck, Eric > dear Eric, >Bulls eye...! identical(PFC.NS, > snl[[159]])) is resulting false..but class(PFC.NS) and > class(snl[[159]]) are same... > I want snl[[159]] to be equal to PFC.NS...how do I effect > it?
create a list with some other element list... for > example snl[[200]] == PFC.NS? > very many thanks for your time and effort... > yours sincerely, > AKSHAY M KULKARNI > -- > *From:* Eric Berger *Sent:* > Wednesday, March 14, 2018 5:22 PM *To:* akshay kulkarni > *Cc:* R help Mailing list *Subject:* Re: [R] the same > function returning different values when called > differently.. > Hi Akshay, Presumably PFC.NS and snl[[159]] are not > exactly the same. You can start by trying to determine if > (and then how) they differ. e.g. >> identical(PFC.NS, snl[[159]]) > presumably this will result in FALSE > then compare >> class(PFC.NS) class(snl[[159]]) > etc > HTH, Eric > On Wed, Mar 14, 2018 at 1:42 PM, akshay kulkarni > wrote: > dear members, > I have a function ygrpc which acts on the daily price > increments of a stock. It returns the following values: > ygrpc(PFC.NS,"h") [1] 2.149997 1.875000 0.75 0.349991 > 2.16 0.17 4.00 2.574996 0.50 0.34 > 1.50 0.71 [13] 0.50 1.33 0.449997 2.83 > 2.724998 66.150002 0.550003 0.050003 1.224991 4.84 > 1.375000 1.574997 [25] 1.649994 0.449997 0.975006 2.475006 > 0.125000 2.625000 3.649994 0.34 1.33 2.074997 > 1.025001 0.125000 [37] 3.84 0.025002 0.824997 1.75 > 1.67 1.75 1.67 0.275002 2.33 0.349998 > 0.75 0.224998 [49] 0.125000 1.475006 1.58 0.125000 > 0.50 0.75 1.08 0.225006 1.274997 1.33 > 3.00 0.33 [61] 0.724999 3.67 2.424995 8.425003 > 1.01 2.025009 0.850006 4.00 1.724991 0.949997 > 1.825012 2.799988 [73] 0.425003 1.75 5.75 2.125000 > 0.125000 4.00 2.350006 1.524994 1.25 0.33 > 0.949997 0.449997 [85] 1.84 1.75 1.150009 0.849998 > 2.449997 5.33 0.1 > I also have a list of stocks called "snl" with snl[[159]] > pointing to PFC.NS (above): tail(snl[[159]]) PFC.NS.Open > PFC.NS.High PFC.NS.Low PFC.NS.Close PFC.NS.Volume > PFC.NS.Adjusted 2018-03-07 98.40 98.45 95.1 95.30 7854861 > 95.30 2018-03-08 94.90 94.95 89.3 91.90 12408061 91.90 > 2018-03-09 92.00 94.50 91.9 93.10 7680222 93.10 2018-03-12 > 93.40 93.85 86.1 88.25 12617833 88.25 
2018-03-13 89.20 > 91.85 86.2 89.85 12792
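Eric's top-down strategy in miniature (the objects here are small illustrative stand-ins for PFC.NS and snl[[159]]): identical() is an exact test, while all.equal() compares up to numerical tolerance and, for lists, recurses and reports where the pieces differ:

```r
a <- c(x = 1, y = 2)
b <- c(x = 1, y = 2 + 1e-12)     # numerically equal, but not identical

identical(a, b)           # FALSE: exact comparison
isTRUE(all.equal(a, b))   # TRUE:  within the default tolerance (~1.5e-8)

## for lists, all.equal() recurses and describes the differences:
all.equal(list(u = a), list(u = b, extra = 0))

str(a)   # str() on each object helps locate structural differences by eye
```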
Re: [R] Hacked
>>>>> Peter Langfelder >>>>> on Tue, 17 Apr 2018 11:50:19 -0700 writes: > I got some spam emails after my last post to the list, and the emails > did not seem to go through r-help. The spammers may be subscribed to > the r-help, or they get the poster emails from some of the web copies > of this list (nabble or similar). > Peter Thank you Peter (and Ulrich and ..), From all the cases we (R-help-owner = your friendly volunteer list moderators) checked, 1) the address from which the spam was sent was *NOT* among the 13'000 subscriber addresses to R-help 2) in one case, I replied to the address stating that it seemed that *e-mail account* had been hacked. The result was nil (NULL), i.e., my conclusion was a) the address was a really registered/existing e-mail address that gave no "mailer daemon" error b) it probably did go to the spammers: a legitimate user would have replied to me. Martin Maechler ETH Zurich (= provider of all the r-*@r-project.org mailman mailing lists) > On Tue, Apr 17, 2018 at 11:37 AM, Ulrik Stervbo wrote: >> I asked the moderators about it. This is the reply >> >> "Other moderators have looked into this a bit and may be able to shed more >> light on it. This is a "new" tactic where the spammers appear to reply to >> the r-help post. They are not, however, going through the r-help server. >> >> It also seems that this does not happen to everyone. >> >> I am not sure how you can automatically block the spammers. >> >> Sorry I cannot be of more help." >> >> --Ulrik >> >> Jeff Newmiller schrieb am Di., 17. Apr. 2018, >> 14:59: >> >>> Likely a spammer has joined the mailing list and is auto-replying to posts >>> made to the list. Unlikely that the list itself has been "hacked". Agree >>> that it is obnoxious. >>> >>> On April 17, 2018 5:01:10 AM PDT, Neotropical bat risk assessments < >>> neotropical.b...@gmail.com> wrote: >>> >Hi all, >>> > >>> >Site has been hacked?
>>> >Bad SPAM arriving >>> > >>> >__ >>> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> >https://stat.ethz.ch/mailman/listinfo/r-help >>> >PLEASE do read the posting guide >>> >http://www.R-project.org/posting-guide.html >>> >and provide commented, minimal, self-contained, reproducible code. >>> >>> -- >>> Sent from my phone. Please excuse my brevity. >>> >>> __ >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question "breakdown point" --> robust statistics
>>>>> Olivier Crouzet >>>>> on Thu, 19 Apr 2018 18:13:30 +0200 writes: > Hi, > I think he's talking about how much a statistical estimator is influenced by extreme datapoints, e.g. > https://en.m.wikipedia.org/wiki/Robust_statistics#Breakdown_point > Olivier Definitely! Thank you, Olivier. I had not replied because the question looked more like homework than something for which R-help is appropriate. There is the 'robustbase' package and more than a dozen other high quality R packages about robust statistics -- in which "breakdown point" is one crucial concept. BTW: What econometricians tend to call "robust" is the idea of using "sandwich" estimates (--> 'sandwich' package in R), aka Newey-West aka HAC (Heteroskedasticity and Autocorrelation Consistent) estimates. Robust statistics in a proper sense considers "sensitivity" of both estimates and their precision estimates in situations where the distributional assumption (e.g. Gaussian errors) is only approximately true, say 95% of the observations/errors/.. are Gaussian but the remaining 5% can be anything (-> arbitrary extreme outliers). There's the CRAN task view on robust statistical methods: https://cran.r-project.org/web/views/Robust.html Martin Maechler ETH Zurich > -- > Olivier Crouzet > Assistant Professor > @LLING UMR6310 - Université de Nantes / CNRS > Guest Scientist > @UMCG - University Medical Center Groningen / RijksUniversiteit Groningen >> Le 19 avr. 2018 à 11:00, Keith Jewell a écrit : >> >>> On 15/04/2018 17:26, Marc Girondot via R-help wrote: >>>> Le 15/04/2018 à 17:56, alireza daneshvar a écrit : >>>> break-down point >>> Can you explain more what you plan to do and give an example of what you have tried to do until now to do a "break down point" in R. Perhaps a "break down point" is common in your field, but I have no idea about what it is !
>>> https://en.wikipedia.org/wiki/Breakdown_(1997_film) >>> Sincerely >>> Marc >> Perhaps the OP means "breakpoint" in which case the 'strucchange' package might be relevant <https://cran.r-project.org/web/packages/strucchange/index.html> >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
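The breakdown point quantifies how much contamination an estimator tolerates before it can be carried off arbitrarily far. A minimal illustration using only base R and the recommended MASS package (which ships with R; the 'robustbase' package offers much more):

```r
set.seed(1)
x <- c(rnorm(95), rep(1e6, 5))   # 5% gross outliers

mean(x)      # breakdown point 0: a single bad point already ruins it
median(x)    # breakdown point 50%: still close to 0

MASS::huber(x)$mu   # Huber M-estimate of location: also close to 0
```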
Re: [R] Error : 'start' contains NA values when fitting frank copula
>>>>> Soumen Banerjee >>>>> on Sat, 21 Apr 2018 17:22:56 +0800 writes: > Hello! I am trying to fit a copula to some data in R and > I get the error mentioned above. This is the code for a > reproducible example - (not really reproducible: You did not set the random seed, so the data is different every time; OTOH, yes, the following always gives an error) > library(copula) Slightly clearer showing what you are doing: x <- runif(200) data <- cbind(x, 2*x, 3*x) > fr_cop = frankCopula(dim=3) > fit_fr_cop = fitCopula(fr_cop, pobs(data), method = "mpl") #Error Here > The error says : Error in fitCopula.ml(copula, u = data, method = method, > start = start, : 'start' contains NA values > What could be going wrong? Is this a homework question? [probably not, but ..] The homework question & answer would be Q: What is the best fitting 3D copula if you have perfectly correlated ("rank 1") data? A: The complete-dependence copula ... which is indeed a boundary case, hard to imagine you'd encounter for non-artifical data. Another hint: > fitCopula(normalCopula(,3), pobs(data)) Call: fitCopula(copula, data = data) Fit based on "maximum pseudo-likelihood" and 200 3-dimensional observations. Copula: normalCopula rho.1 1 The maximized loglikelihood is 3686 Optimization converged > --- Yes, one could adapt the fitCopula() code to work better for this extreme boundary case, and for the Frank copula, it would return > fitCopula(frankCopula(,3), pobs(data)) Call: fitCopula(copula, data = data) Fit based on "maximum pseudo-likelihood" and 200 3-dimensional observations. Copula: frankCopula alpha 7.21e+16 The maximized loglikelihood is 1.798e+308 Optimization converged but it would need even more tweaking to also give such "alpha ~= +Inf" results for other cases such as Gumbel or Joe. I may add the possibility for frank*() as shown above for the next release of the copula package... 
but am a bit hesitant to complicate (and slowdown) the current code by adding an extra check for this situation. Martin Maechler ETH Zurich __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How would I color points conditional on their value in a plot of a time series
>>>>> Eivind K Dovik >>>>> on Tue, 1 May 2018 23:18:51 +0200 writes: > You may also want to check this out: > plot(ttt, type = "p") > points(ttt, col = ifelse(ttt < 8, "black", "red")) > Eivind K. Dovik > Bergen, NO yes, indeed, or -- even nicer for a time series: using 'type = "c"' which many people don't know / have forgotten about: ttt <- ts(rpois(12, lambda = 8), start = c(2000, 1), freq = 4) plot (ttt, type = "c") points(ttt, col = ifelse(ttt < 8, "black", "red")) Martin Maechler > On Tue, 1 May 2018, Christopher W Ryan wrote: >> Excellent! Worked like a charm. Thanks. >> >> --Chris Ryan >> >> On Tue, May 1, 2018 at 4:33 PM, William Dunlap wrote: >> >>> The ts method for plot() is quirky. You can use the default method: >>> >>> plot(as.vector(time(ttt)), as.vector(ttt), type = "p", col=ifelse(ttt<8, >>> "black", "red")) >>> >>> >>> Bill Dunlap >>> TIBCO Software >>> wdunlap tibco.com >>> >>> On Tue, May 1, 2018 at 1:17 PM, Christopher W Ryan >>> wrote: >>> >>>> How would I color points conditional on their value in a plot of a time >>>> series. Something like this: >>>> >>>> ## demonstration data >>>> ttt <- ts(rpois(12, lambda = 8), start = c(2000, 1), freq = 4) >>>> ttt >>>> plot(ttt, type = "p") >>>> >>>> ## doesn't work--all points the same color >>>> plot(ttt, type = "p", col = ifelse(ttt < 8, "black", "red")) >>>> >>>> ## also doesn't work--all points the same color >>>> q <- as.numeric(ttt) >>>> q >>>> plot(ttt, type = "p", col = ifelse(q < 8, "black", "red")) >>>> >>>> >>>> ## works OK with a simple, non-time-series scatterplot, as in >>>> >>>> sss <- data.frame(x = rpois(12, lambda = 8), y = rnorm(12, mean = 100, sd >>>> = >>>> 25)) >>>> with(sss, plot(y ~ x, col = ifelse(y > 100, "black", "red"))) >>>> >>>> ## but I am missing something about time series. >>>> >>>> Thanks. 
>>>> >>>> --Chris Ryan >>>> Broome County Health Department >>>> and Binghamton University >>>> Binghamton, NY >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> __ >>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide http://www.R-project.org/posti >>>> ng-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >>> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Converting a list to a data frame
>>>>> David L Carlson >>>>> on Wed, 2 May 2018 21:43:52 + writes: > Typo: dat[[z]] should be x[[z]]: > > x2 <- do.call(rbind, lapply(names(x), function(z) > data.frame(type=z, x[[z]]))) > x2 > type x y > 1A 1 3 > 2A 2 4 > 3B 5 7 > 4B 6 8 > > David C Before this thread gets carried away to data.table and tibbleverse (and as nobody else has done so) : Let me nominate this beautiful solution by Bill Dunlap and David Carlson to win the prize with a 10 out 10 grade: Beautiful use of do.call() and lapply(), two of the most versatile and important functions from the base R toolbox. Congratulations! Martin Maechler R Core Team > -Original Message- > From: R-help On Behalf Of David L Carlson > Sent: Wednesday, May 2, 2018 3:51 PM > To: William Dunlap ; Kevin E. Thorpe > > Cc: r-help mailing list > Subject: Re: [R] Converting a list to a data frame > > Or add the type column first and then rbind: > > x <- list(A=data.frame(x=1:2, y=3:4),B=data.frame(x=5:6,y=7:8)) > x2 <- do.call(rbind, lapply(names(x), function(z) > data.frame(type=z, dat[[z]]))) > > > David L Carlson > Department of Anthropology > Texas A&M University > College Station, TX 77843-4352 > > -Original Message- > From: R-help On Behalf Of William Dunlap via > R-help > Sent: Wednesday, May 2, 2018 12:28 PM > To: Kevin E. Thorpe > Cc: R Help Mailing List > Subject: Re: [R] Converting a list to a data frame > > > x1 <- do.call(rbind, c(x, list(make.row.names=FALSE))) > > x2 <- cbind(type=rep(names(x), vapply(x, nrow, 0)), x1) > > str(x2) > 'data.frame': 4 obs. of 3 variables: > $ type: Factor w/ 2 levels "A","B": 1 1 2 2 > $ x : int 1 2 5 6 > $ y : int 3 4 7 8 > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Wed, May 2, 2018 at 10:11 AM, Kevin E. Thorpe > wrote: > > > I suspect this is pretty easy, but I'm having trouble figuring it out. 
> > Basically, I have a list of data frames such as the following example: > > > > list(A=data.frame(x=1:2, y=3:4),B=data.frame(x=5:6,y=7:8)) > > > > I would like to turn this into data frame where the list elements are > > essentially rbind'ed together and the element name becomes a new variable. > > For example, I would like to turn the list above into a data frame > > that looks like this: > > > > data.frame(type=c("A","A","B","B"),x=c(1:2,5:6),y=c(3:4,7:8)) > > > > Appreciate any pointers. > > > > Kevin > > > > -- > > Kevin E. Thorpe > > Head of Biostatistics, Applied Health Research Centre (AHRC) Li Ka > > Shing Knowledge Institute of St. Michael's Hospital Assistant > > Professor, Dalla Lana School of Public Health University of Toronto > > email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 > > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
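Putting the thread's winning solution together as one runnable piece (with David Carlson's dat[[z]] -> x[[z]] correction applied):

```r
x <- list(A = data.frame(x = 1:2, y = 3:4),
          B = data.frame(x = 5:6, y = 7:8))

## rbind the elements, turning each element's name into a 'type' column
x2 <- do.call(rbind,
              lapply(names(x), function(z) data.frame(type = z, x[[z]])))
x2
##   type x y
## 1    A 1 3
## 2    A 2 4
## 3    B 5 7
## 4    B 6 8
```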
Re: [R] Comparing figures?
>>>>> Suzen, Mehmet >>>>> on Mon, 7 May 2018 15:48:57 + writes: > I suggest perceptual diff. You could write a wrapper > around it. http://pdiff.sourceforge.net In (base) R's own test ('make check-devel'), we basically use pdf(*, compress=FALSE) actually also setting 'encoding=.' and 'paper=.', see the R sources at https://svn.r-project.org/R/trunk/tests/reg-plot.R and the check (source at https://svn.r-project.org/R/trunk/tests/Makefile.common) basically is using R CMD Rdiff .pdf .pdf.save (with := reg-plot in this case) Martin Maechler R Core Team > On Mon, 7 May 2018 16:49 Ramiro Barrantes, > wrote: >> Hello, >> >> I am working on tests to compare figures. I have been >> using ImageMagick, which creates a figure signature, and >> I can compare a "test" figure signature against a saved >> "reference" figure signature. It seems to work pretty >> well. However, it is slow as it requires reading from >> the file system. >> >> Are there any options to compare figures on memory? For >> example, if I generate a ggplot or lattice graph, I could >> have all my saved "reference" figures on memory (which I >> would have loaded all at once) and compare them. I just >> haven't found anything. >> >> I just found out about the vdiffr package and was going >> to explore it, not sure about the speed. >> >> Any suggestions appreciated. >> >> Thank you, < >> https://west.exch023.serverdata.net/owa/?ae=Item&a=New&t=IPM.Note&cc=MTQuMy4zMTkuMixlbi1VUyw2LEhUTUwsMCww&pspid=_1525698150389_875737489# >> > >> Ramiro >> >> Ramiro Barrantes Ph.D. Precision Bioassay, Inc. 
431 >> Pine St., Suite 110 Burlington, VT 05401 802 865 0155 802 >> 861 2365 FAX www.precisionbioassay.com< >> https://west.exch023.serverdata.net/owa/redir.aspx?SURL=wN3KzpoKXAcetH7sTOTnSyfg-iAXFIinpPUtRcduCFCtkgZrUSDTCGgAdAB0AHAAOgAvAC8AdwB3AHcALgBwAHIAZQBjAGkAcwBpAG8AbgBiAGkAbwBhAHMAcwBhAHkALgBjAG8AbQA.&URL=http%3a%2f%2fwww.precisionbioassay.com >> > >> ram...@precisionbioassay.com >> >> CONFIDENTIALITY NOTICE: This email, including any >> attach...{{dropped:9}} >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and >> more, see https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html and provide >> commented, minimal, self-contained, reproducible code. >> > [[alternative HTML version deleted]] > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and > more, see https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html and provide > commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
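The idea behind R's own reg-plot check can be sketched in a few lines: render to uncompressed PDF, which is plain text apart from metadata, then compare the two files as text after dropping the lines (creation/modification dates) that legitimately differ between runs. This is a sketch of the idea, not a drop-in replacement for R CMD Rdiff:

```r
f1 <- tempfile(fileext = ".pdf")
f2 <- tempfile(fileext = ".pdf")
pdf(f1, compress = FALSE); plot(1:10); dev.off()
pdf(f2, compress = FALSE); plot(1:10); dev.off()

## drop the date metadata lines, then compare the remaining textual content
strip <- function(f) grep("Date", readLines(f), invert = TRUE, value = TRUE)
identical(strip(f1), strip(f2))   # TRUE when the drawings agree
```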
Re: [R] par(mfrow=c(3,4)) problem
> Sarah Goslee > on Wed, 30 May 2018 05:03:56 -0400 writes:

 > Hi,
 > You're mixing base plot and ggplot2 grid graphics, which as you've
 > discovered doesn't work.
 > Here's a strategy that does:
 > https://cran.r-project.org/web/packages/egg/vignettes/Ecosystem.html
 > This vignette has a good overview, as well as info specific to that package.
 > Sarah

A very nice vignette indeed! Didn't know about these.
Thank you, Sarah, and thanks to the author, Baptiste Auguie !

Martin

 > On Wed, May 30, 2018 at 4:43 AM greg holly wrote:
 >> Hi all;
 >>
 >> I need to put 12 different ggplot2 plots into the same matrix. So my array
 >> for the matrix will be par(mfrow=c(3,4)). I am running ggplot2 to produce
 >> my 12 plots. For some reason, par(mfrow=c(3,4)) did not produce a 3*4
 >> matrix.
 >>
 >> my basic R code for each plot is
 >> par(mfrow=c(3,4))
 >> library(ggplot2)
 >> p <- ggplot(a, aes(x=Genotypes, y=Plant_hight, size=Plant_hight,
 >> color=Showing_rate)) +
 >> .
 >> .
 >>
 >> Best regards,
 >>
 >> Greg

 > --
 > Sarah Goslee
 > http://www.stringpage.com
 > http://www.sarahgoslee.com
 > http://www.functionaldiversity.org
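The root cause is that ggplot2 draws with grid graphics, so `par(mfrow = c(3, 4))` is simply ignored. A minimal sketch of the grid-side equivalent, assuming the 'gridExtra' package is installed (the 'egg' package from the vignette above offers finer axis alignment; 'patchwork' is another alternative):

```r
library(ggplot2)
library(gridExtra)

## twelve stand-in plots (in Greg's case, his twelve ggplot objects)
plots <- lapply(1:12, function(i)
  ggplot(mtcars, aes(wt, mpg)) + geom_point() + ggtitle(paste("plot", i)))

## the grid-graphics analogue of par(mfrow = c(3, 4)):
grid.arrange(grobs = plots, nrow = 3, ncol = 4)
```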
Re: [R] mysterious rounding digits output
> Ted Harding > on Thu, 31 May 2018 07:10:32 +0100 writes:

 > Well pointed out, Jim!
 > It is unfortunate that the documentation for options(digits=...)
 > does not mention that these are *significant digits*
 > and not *decimal places* (which is what Joshua seems to want):

Since R 3.4.0 the help on ?options *does* say significant!
The change in R's source code was this one, 14 months ago:

    r72380 | maechler | 2017-03-21 11:28:13 +0100 (Tue, 21 Mar 2017) | 2 lines
    Changed paths:
       M /trunk/src/library/base/man/options.Rd

    digits: + "significant"

and since then, the text has been

    'digits': controls the number of significant digits to print when
        printing numeric values. It is a suggestion only. ...

whereas what you (Ted) cite is from R 3.3.x and earlier

 > "'digits': controls the number of digits to print when
 > printing numeric values."

Maybe we should additionally say that this is *not* round()ing,
and give a link to the help for signif() ?

 > On the face of it, printing the value "0.517" of 'ccc' looks
 > like printing 4 digits! Joshua's output could look even worse if 'ddd'
 > had values in the 1000s!

 > To achieve exactly what Joshua seems to want, use the round()
 > function. Starting with his original assignment of values to
 > the variable itemInfo, the result of round(itemInfo, digits=3) is:

 >             aaa   bbb   ccc   ddd    eee
 > skill     1.396 6.225 0.517 5.775  2.497
 > predict   1.326 5.230 0.462 5.116 -2.673
 > waiting   1.117 4.948    NA    NA     NA
 > complex   1.237 4.170 0.220 4.713  5.642
 > novelty   1.054 4.005 0.442 4.260  2.076
 > creative  1.031 3.561 0.362 3.689  2.549
 > evaluated 0.963 3.013    NA    NA     NA
 > body      0.748 2.238 0.596 2.019  0.466
 > control   0.620 2.149    NA    NA     NA
 > stakes    0.541 1.905 0.227 2.020  2.343
 > spont     0.496 1.620    NA    NA     NA
 > chatter   0.460 1.563 0.361 1.382  0.540
 > present   0.428 1.236 0.215 1.804  2.194
 > reward    0.402 1.101 0.288 1.208  0.890
 > feedback  0.283 0.662    NA    NA     NA
 > goal      0.237 0.474    NA    NA     NA

 > Best wishes to all,
 > Ted.
 > On Thu, 2018-05-31 at 15:30 +1000, Jim Lemon wrote:
 >> Hi Joshua,
 >> Because there are no values in column ddd less than 1.
 >>
 >> itemInfo[3,"ddd"] <- 0.3645372
 >> itemInfo
 >>             aaa   bbb   ccc   ddd    eee
 >> skill     1.396 6.225 0.517 5.775  2.497
 >> predict   1.326 5.230 0.462 5.116 -2.673
 >> waiting   1.117 4.948    NA 0.365     NA
 >> complex   1.237 4.170 0.220 4.713  5.642
 >> novelty   1.054 4.005 0.442 4.260  2.076
 >> creative  1.031 3.561 0.362 3.689  2.549
 >> evaluated 0.963 3.013    NA    NA     NA
 >> body      0.748 2.238 0.596 2.019  0.466
 >> control   0.620 2.149    NA    NA     NA
 >> stakes    0.541 1.905 0.227 2.020  2.343
 >> spont     0.496 1.620    NA    NA     NA
 >> chatter   0.460 1.563 0.361 1.382  0.540
 >> present   0.428 1.236 0.215 1.804  2.194
 >> reward    0.402 1.101 0.288 1.208  0.890
 >> feedback  0.283 0.662    NA    NA     NA
 >> goal      0.237 0.474    NA    NA     NA
 >>
 >> digits specifies the number of significant digits ("It is a suggestion
 >> only"), so when at least one number is padded with a leading zero it
 >> affects the formatting of the other numbers in that column. I suspect
 >> that this is an esthetic consideration to line up the decimal points.
>> >> Jim >> >> >> On Thu, May 31, 2018 at 11:18 AM, Joshua N Pritikin wrote: >> > R version 3.5.0 (2018-04-23) -- "Joy in Playing" >> > Platform: x86_64-pc-linux-gnu (64-bit) >> > >> > options(digits=3) >> > >> > itemInfo <- structure(list("aaa" = c(1.39633732316667, 1.32598263816667, 1.1165832407, 1.23651072616667, 1.0536867998, 1.0310073738, 0.9630728395, 0.7483865045, 0.62008664617, 0.5411017985, 0.49639760783, 0.45952804467, 0.42787704783, 0.40208597967, 0.28316118123, 0.23689627723), "bbb" = c(6.22533860696667, 5.229736804, 4.94816041772833, 4.17020503255333, 4.00453781427167, 3.56058007398333, 3.0125202404, 2.2378235873, 2.14863910661167, 1.90460903044777, 1.62001089796667, 1.56341257968151, 1.23618558850667, 1.10086688908262, 0.661981500639833, 0.47397754310745), "ccc" = c(0.5165911355, 0.46220470767, NA, 0.21963592433, 0.44186378083, 0.36150286583, NA, 0.59613794667, NA, 0.22698477157, NA, 0.36092266158, 0.2145347068, 0.28775624948, NA, NA ), "ddd" = c(5.77538400186667, 5.115877113, NA, 4.71294520316667, 4.25952652129833, 3.68879921863167, NA, 2.01942456211145, NA, 2.02032557108, NA, 1.381810875 >> 9571, 1.8043675977
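To condense the thread: `options(digits=)` only *suggests* how many significant digits printing should use (it never changes the stored values), while `round()` fixes decimal places and `signif()` rounds to significant digits:

```r
x <- c(0.5165911, 5.7753840, 2497.123)

options(digits = 3)   # a printing suggestion: significant digits, not rounding
x                     # the stored values themselves are unchanged

round(x, digits = 3)  # 3 decimal places:      0.517, 5.775, 2497.123
signif(x, digits = 3) # 3 significant digits:  0.517, 5.78,  2500
```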
Re: [R] verInd= and HorInd= arguments to pairs() function
(x[, j], x[, i], xlab = "", ylab = "", axes = FALSE, > type = "n", ..., log = l) > if (i == j || (i < j && has.lower) || (i > j && has.upper)) { > box() > if (i == 1 && (!(j%%2L) || !has.upper || !has.lower)) > localAxis(1L + 2L * row1attop, x[, j], x[, i], >...) > if (i == nc && (j%%2L || !has.upper || !has.lower)) > localAxis(3L - 2L * row1attop, x[, j], x[, i], >...) > if (j == 1 && (!(i%%2L) || !has.upper || !has.lower)) > localAxis(2L, x[, j], x[, i], ...) > if (j == nc && (i%%2L || !has.upper || !has.lower)) > localAxis(4L, x[, j], x[, i], ...) > mfg <- par("mfg") > if (i == j) { > if (has.diag) >localDiagPanel(as.vector(x[, i]), ...) > if (doText) { >par(usr = c(0, 1, 0, 1)) >if (is.null(cex.labels)) { > l.wid <- strwidth(labels, "user") > cex.labels <- max(0.8, min(2, 0.9/max(l.wid))) >} >xlp <- if (xl[i]) > 10^0.5 >else 0.5 >ylp <- if (yl[j]) > 10^label.pos >else label.pos >text.panel(xlp, ylp, labels[i], cex = cex.labels, > font = font.labels) > } > } > else if (i < j) > localLowerPanel(as.vector(x[, j]), as.vector(x[, > i]), ...) > else localUpperPanel(as.vector(x[, j]), as.vector(x[, > i]), ...) > if (any(par("mfg") != mfg)) > stop("the 'panel' function made a new plot") > } > else par(new = FALSE) > } > if (!is.null(main)) { > font.main <- if ("font.main" %in% nmdots) > dots$font.main > else par("font.main") > cex.main <- if ("cex.main" %in% nmdots) > dots$cex.main > else par("cex.main") > mtext(main, 3, line.main, outer = TRUE, at = 0.5, cex = cex.main, > font = font.main) > } > invisible(NULL) > } > > > > ## Example: > > mypairs(xmat, xlim=lim, ylim=lim, verInd=1:2, horInd=1:4) Thank you, Chris, for the report and Gerrit for your proposed fix !! It looks good to me, but I will test some more (also with 'row1attop=FALSE') before committing the bug fix. 
Best regards,
Martin Maechler
ETH Zurich and R Core Team

 > On 06.06.2018 at 23:55, Andrews, Chris wrote:
 > >
 > > After making a scatterplot matrix, I determined I only needed the first 2
 > > columns of the matrix so I added verInd=1:2 to my pairs() call. However,
 > > it did not turn out as I expected.
 > >
 > > Perhaps the attached pdf of the example code will make it through. If not,
 > > my description is "the wrong scatterplot pairs are in the wrong places" for
 > > the last two pairs() calls.
 > >
 > > Thanks,
 > > Chris
 > >
 > > # fake data
 > > xmat <- matrix(1:28, ncol=4)
 > > lim <- range(xmat)
 > >
 > > # what I expected
 > > pairs(xmat, xlim=lim, ylim=lim) # 4x4 matrix of scatterplots
 > > pairs(xmat, xlim=lim, ylim=lim, verInd=1:2, horInd=1:2) # 2x2 matrix of
 > > scatterplots: upper left
 > >
 > > # here comes trouble
 > > pairs(xmat, xlim=lim, ylim=lim, horInd=1:2) # 2x4 matrix of scatterplots:
 > > but not the top 2 rows (or bottom 2 rows)
 > > pairs(xmat, xlim=lim, ylim=lim, verInd=1:2) # 4x2 matrix of scatterplots:
 > > but not the left 2 columns (or right 2 columns)
 > >
 > > ###
 > >
 > >> sessionInfo()
 > > R version 3.5.0 (2018-04-23)
 > > Platform: x86_64-w64-mingw32/x64 (64-bit)
 > > Running under: Windows 7 x64 (build 7601) Service Pack 1
 > >
 > > Matrix products: default
 > >
 > > locale:
 > > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
 > > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
 > > [5] LC_TIME=English_United States.1252
 > >
 > > attached base packages:
 > > [1] stats graphics grDevices utils datasets methods base
 > >
 > > loaded via a namespace (and not attached):
 > > [1] compiler_3.5.0 tools_3.5.0
Re: [R] verInd= and HorInd= arguments to pairs() function
>>>>> Martin Maechler >>>>> on Thu, 7 Jun 2018 18:35:48 +0200 writes: >>>>> Gerrit Eichner >>>>> on Thu, 7 Jun 2018 09:03:46 +0200 writes: >> Hi, Chris, had the same problem (and first thought it was >> my fault), but there seems to be a typo in the code of >> pairs.default. Below is a workaround. Look for two >> comments (starting with #) in the code to see what I >> have changed to make it work at least the way I'd expect >> it in one of your examples. >> Hth -- Gerrit > > mypairs <- function (x, labels, panel = points, ..., > > horInd = 1:nc, verInd = 1:nc, > > lower.panel = panel, upper.panel = panel, diag.panel = NULL, > > text.panel = textPanel, label.pos = 0.5 + has.diag/3, line.main = 3, > > cex.labels = NULL, font.labels = 1, row1attop = TRUE, gap = 1, > > log = "") { > > if (doText <- missing(text.panel) || is.function(text.panel)) > > textPanel <- function(x = 0.5, y = 0.5, txt, cex, font) text(x, > > y, txt, cex = cex, font = font) > > localAxis <- function(side, x, y, xpd, bg, col = NULL, main, > > oma, ...) { > > xpd <- NA > > if (side%%2L == 1L && xl[j]) > > xpd <- FALSE > > if (side%%2L == 0L && yl[i]) > > xpd <- FALSE > > if (side%%2L == 1L) > > Axis(x, side = side, xpd = xpd, ...) > > else Axis(y, side = side, xpd = xpd, ...) > > } > > localPlot <- function(..., main, oma, font.main, cex.main) plot(...) > > localLowerPanel <- function(..., main, oma, font.main, cex.main) > > lower.panel(...) > > localUpperPanel <- function(..., main, oma, font.main, cex.main) > > upper.panel(...) > > localDiagPanel <- function(..., main, oma, font.main, cex.main) > > diag.panel(...) > > dots <- list(...) 
> > nmdots <- names(dots) > > if (!is.matrix(x)) { > > x <- as.data.frame(x) > > for (i in seq_along(names(x))) { > > if (is.factor(x[[i]]) || is.logical(x[[i]])) > > x[[i]] <- as.numeric(x[[i]]) > > if (!is.numeric(unclass(x[[i]]))) > > stop("non-numeric argument to 'pairs'") > > } > > } > > else if (!is.numeric(x)) > > stop("non-numeric argument to 'pairs'") > > panel <- match.fun(panel) > > if ((has.lower <- !is.null(lower.panel)) && !missing(lower.panel)) > > lower.panel <- match.fun(lower.panel) > > if ((has.upper <- !is.null(upper.panel)) && !missing(upper.panel)) > > upper.panel <- match.fun(upper.panel) > > if ((has.diag <- !is.null(diag.panel)) && !missing(diag.panel)) > > diag.panel <- match.fun(diag.panel) > > if (row1attop) { > > tmp <- lower.panel > > lower.panel <- upper.panel > > upper.panel <- tmp > > tmp <- has.lower > > has.lower <- has.upper > > has.upper <- tmp > > } > > nc <- ncol(x) > > if (nc < 2L) > > stop("only one column in the argument to 'pairs'") > > if (!all(horInd >= 1L && horInd <= nc)) > > stop("invalid argument 'horInd'") > > if (!all(verInd >= 1L && verInd <= nc)) > > stop("invalid argument 'verInd'") > > if (doText) { > > if (missing(labels)) { > > labels <- colnames(x) > > if (is.null(labels)) > > labels <- paste("var", 1L:nc) > > } > > else if (is.null(labels)) > > doText <- FALSE > > } > > oma <- if ("oma" %in% nmdots) > > dots$oma > > main <- if ("main" %in% nmdots) > > dots$main > > if (is.null(oma)) > > oma <- c(4, 4, if (!is.null(main)) 6 else 4, 4) > > opar <- par(mfcol = c(length(horInd), length(verInd)), > > # Changed from mfrow to mfcol > > mar = rep.int(gap/2, 4), oma = oma) > > on.exit(par(opar)) > > dev.hold() > > on.exit(dev.flush(), add = TRUE) > > xl <-
Re: [R] verInd= and HorInd= arguments to pairs() function
>>>>> Martin Maechler >>>>> on Fri, 8 Jun 2018 11:13:24 +0200 writes:

 [..]

 >> Thank you, Chris, for the report and
 >> Gerrit for your proposed fix !!
 >>
 >> It looks good to me, but I will test some more (also with
 >> 'row1attop=FALSE') before committing the bug fix.

 > and there, another change was needed: Instead of your

 >    for (j in if (row1attop) verInd else rev(verInd))
 >        for (i in horInd) {

 > we do now need

 >    for(j in verInd)
 >        for(i in if(row1attop) horInd else rev(horInd)) {

 > and the difference is of course only relevant for the
 > non-default 'row1attop = FALSE'

 > (which some graphic experts argue to be clearly *better* than the default,
 > as only in that case, the upper and lower triangles of the
 > matrix are nicely "mirrors of each other", and that is also
 > the reason why lattice::splom() uses the equivalent of
 > 'row1attop=FALSE')

 > I will commit the change to R-devel today - and intend to port
 > to R-patched in time to make it into the upcoming R 3.5.1.

Well, as I find, there are more bugs there, if you are using
'horInd' and 'verInd' creatively:

In a nice pairs(), the axis ticks (and their labels (numbers))
are always "on the outside" of the scatterplot matrix, and
nicely alternating. This is unfortunately not the case when using
horInd or verInd which are *not* of the form p1:p2 (p1 <= p2)

==> even more changes are needed to make these cases "nice",
or should we *require* horInd and verInd to be of that form??

This would not be back-compatible, but then such cases have been
"wrong" really in all versions of R anyway, *and*, at least by
reordering the matrix/data.frame columns, the restriction of

    (hor|ver)Ind = p1:p2  (p1 <= p2)

would seem acceptable, would it?
Re: [R] verInd= and HorInd= arguments to pairs() function
>>>>> Gerrit Eichner >>>>> on Fri, 8 Jun 2018 12:55:31 +0200 writes: > Am 08.06.2018 um 12:02 schrieb Martin Maechler: >>>>>>> Martin Maechler >>>>>>> on Fri, 8 Jun 2018 11:13:24 +0200 writes: >> >> [..] >> >> >> Thank you, Chris, for the report and >> >> Gerrit for your proposed fix !! >> >> >> >> It looks good to me, but I will test some more (also with >> >> 'row1attop=FALSE') before committing the bug fix. >> >> > and there, another change was needed: Instead of your >> >> > for (j in if (row1attop) verInd else rev(verInd)) >> >for (i in horInd) { >> >> > we do now need >> >> > for(j in verInd) >> >for(i in if(row1attop) horInd else rev(horInd)) { >> >> > and the difference is of course only relevant for the >> > non-default 'row1attop = FALSE' >> >> > (which some graphic experts argue to be clearly *better* than the default, >> > as only in that case, the upper and lower triangles of the >> > matrix are nicely "mirrors of each other", and that is also >> > the reason why lattice::splom() uses the equivalent of >> > 'row1attop=FALSE') >> >> > I will commit the change to R-devel today - and intend to port >> > to R-patched in time to make it into the upcoming R 3.5.1. >> >> Well, as I find, there are more bugs there, if you are using >> 'horInd' and 'verInd' creatively: >> >> In a nice pairs(), the axis ticks (and their labels (numbers)) >> are always "on the outside" of the scatterplot matrix, and >> nicely alternating. This is not the case unfortunately, when using >> horInd or verInd which are *not* of the form p1:p2 (p1 <= p2) >> >> ==> even more changes are needed to make these cases "nice", > Well, the *shown* axis ticks and labels do nicely alternate if > (hor|ver)Ind = p1:p2 (p1 <= p2) is fulfilled, but not all of > the axis ticks and labels, which one *might* wish to see, are > currently drawn anyway ... 
 > I would consider changes which "heal"
 > this as more interesting than solving this problem in full
 > generality, i.e., in cases of creative use of (hor|ver)Ind.
 > However, I don't think it's urgent, at all.

 >> or should we *require* horInd and verInd to be of that form??
 >>
 >> This would not be back-compatible, but then such cases have been
 >> "wrong" really in all versions of R anyway, *and*, at least by
 >> reordering the matrix/data.frame columns, the restriction of
 >>
 >>     (hor|ver)Ind = p1:p2  (p1 <= p2)
 >>
 >> would seem acceptable, would it?

 > I could live very well with that requirement (while breaking
 > back-compatibility), because from my point of view a "creative"
 > use of 'horInd' and 'verInd' violating (hor|ver)Ind = p1:p2
 > (p1 <= p2) won't occur often.
 > On the other hand, why force (hor|ver)Ind = p1:p2 (p1 <= p2)?
 > If people violated it "they get what they deserve". ;-)

 > Btw, 'horInd' and 'verInd' sort of exchange their meaning if
 > row1attop = FALSE, but I think this can be explained by the "work flow":
 > First, (hor|ver)Ind are used to select the respective rows and
 > columns from the full pairs-plot, and then, row1attop is used
 > when the results are drawn. I think this is sensible.

Thank you; and yes indeed, and I was not going to change that.

In fact, I've found a relatively nice solution now, which does
*not* restrict the settings of '{hor,ver}Ind' and fixes all
problems mentioned, quite back-compatibly, and in some sense
perfectly labeling the axes.

==> Committed to R-devel svn r74871 a few minutes ago.

Martin
Re: [R] Do there need to be the same number of y-values for each x-value when using tapply?
The answer to your Q is surely a "no": that's exactly the point of
tapply() -- the number of y values per group may vary. The message
tells you that the full x and y vectors must have the same length.

Hoping that helps.
Martin
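A short illustration of the distinction (group sizes may differ; vector lengths may not):

```r
y <- c(1, 2, 3, 10, 20)
g <- c("a", "a", "a", "b", "b")   # 3 y-values for "a", 2 for "b" -- fine

tapply(y, g, mean)                # a = 2, b = 15

## only mismatched *vector* lengths are an error:
## tapply(y, g[-1], mean)   # -> "arguments must have same length"
```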
Re: [R] SQL Database
> Doran, Harold > on Wed, 25 Jul 2018 14:57:13 + writes:

 > I'm doing some work now to learn which SQL database
 > package is the most optimal for the task I am working on.

Hmm... we would have a problem with optimize() and optim() if this was

    optimal << more optimal << most optimal

:-) ;-)

Best,
Martin
Re: [R] Plot Rect Transparency
> Duncan Murdoch > on Thu, 28 Jun 2018 20:57:19 -0400 writes:

 > On 28/06/2018 5:29 PM, Jeff Reichman wrote:
 >> R-Help
 >>
 >> Is there a way to make a rectangle transparent (alpha=0.1??)
 >>
 >> plot(c(100, 200), c(300, 450), type= "n", xlab = "", ylab = "")
 >> rect(110, 300, 175, 350, density = 5, border = "red")
 >>
 >> Can't figure out how to take the changepoint function results and plot in
 >> ggplot2 so I can just simply add rectangles to the plot function, but I need
 >> to make transparent and there doesn't seem to be an alpha option.

 > Alpha is part of the colour spec. For example,

 > rect(110, 300, 175, 350, density = 5,
 >      border = rgb(red=1, green=0, blue=0, alpha=0.1))

 > I'm not sure what is the quickest way to work out the rgb values for a
 > named colour (col2rgb can do it, but not in a convenient format) if you
 > want to add alpha to it.

IIUC, it is adjustcolor() you were thinking of. It had been created to
do that and more. I'm using that "all the time" nowadays in my graphics
code, e.g.,

    > adjustcolor("red", 2/3)
    [1] "#FF0000AA"
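Putting Duncan's and Martin's suggestions together on Jeff's original example:

```r
plot(c(100, 200), c(300, 450), type = "n", xlab = "", ylab = "")

## alpha as part of the colour spec, via rgb():
rect(110, 300, 175, 350, border = "red",
     col = rgb(red = 1, green = 0, blue = 0, alpha = 0.1))

## or let adjustcolor() add alpha to an existing named colour:
rect(120, 310, 185, 360, border = "red",
     col = adjustcolor("red", alpha.f = 0.1))

adjustcolor("red", alpha.f = 2/3)  # "#FF0000AA"
```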
Re: [R] Safeguarded Newton method for function minimization
> J C Nash > on Tue, 18 Apr 2017 13:32:52 -0400 writes:

 > Recently Marie Boehnstedt reported a bug in the nlm()
 > function for function minimization when both gradient and
 > hessian are provided.

Indeed, on R's Bugzilla here:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17249

 > She has provided a work-around for some cases and it seems
 > this will get incorporated into the R function eventually.

indeed the first part -- fixing the wrong choldc() -- in the C code has
been in my version of R-devel for a while now.
See my follow-up
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17249#c4
and #c5 including my attachment which is an extended version of Marie's
original:
https://bugs.r-project.org/bugzilla/attachment.cgi?id=2246

As that mentions _and_ shows: Fixing choldc() solves the problem for the
2D Rosenbrock example, but _not_ the 4D Wood example.

 > However, despite the great number of packages on CRAN,
 > there does not appear to be a straightforward Newton
 > approach to function minimization. This may be because
 > providing the code for a hessian (the matrix of second
 > derivatives) is a lot of work and error-prone.
 > (R could also use some good tools for Automatic Differentiation).

The last part of what you say above is not at all true:
R -- and S (& S+ | S-PLUS) before it! -- always had deriv() and
deriv3(), and my attachment above _shows_ you how to use deriv3() to get
both the gradient and the hessian via a version of automatic
differentiation completely effortlessly !!

For ease of readers, that part here, with an example:

##' Wood function (4 arguments 'x1' ... 'x4')
fwood <- function(x1,x2,x3,x4) {
    100*(x1^2-x2)^2 + (1-x1)^2 + 90*(x3^2-x4)^2 + (1-x3)^2 +
        10.1*((1-x2)^2 + (1-x4)^2) + 19.8*(1-x2)*(1-x4)
}

## automatically construct correct gradient and hessian:
woodf.gh <- function(x) {
    stopifnot(is.numeric(x))
    woodGH <- deriv3(body(fwood)[[2]], c("x1","x2","x3","x4"),
                     function.arg = TRUE)
    if(length(x) == 4)
        woodGH(x[1], x[2], x[3], x[4])
    else if(is.matrix(x) && ncol(x) == 4)
        woodGH(x[,1], x[,2], x[,3], x[,4])
    else stop("'x' must have length 4 or be a matrix with 4 columns")
}

and now get both the function f(x), gradient g(x) and Hessian H(x) for
x = (0 0 0 0), x = (1 1 1 1), and x = (1 2 3 4) with such a simple call:

> woodf.gh(rbind(0, 1, 1:4))
[1]   42.0    0.0 2514.4
attr(,"gradient")
       x1    x2   x3     x4
[1,]   -2 -40.0   -2  -40.0
[2,]    0   0.0    0    0.0
[3,] -400 279.6 5404 -819.6
attr(,"hessian")
, , x1

       x1   x2 x3 x4
[1,]    2    0  0  0
[2,]  802 -400  0  0
[3,]  402 -400  0  0

, , x2

       x1    x2 x3   x4
[1,]    0 220.2  0 19.8
[2,] -400 220.2  0 19.8
[3,] -400 220.2  0 19.8

, , x3

     x1 x2   x3    x4
[1,]  0  0    2     0
[2,]  0  0  722  -360
[3,]  0  0 8282 -1080

, , x4

     x1   x2    x3    x4
[1,]  0 19.8     0 200.2
[2,]  0 19.8  -360 200.2
[3,]  0 19.8 -1080 200.2

 > I have also noted that a number of researchers try to
 > implement textbook methods and run into trouble when maths
 > and computing are not quite in sync. Therefore, I wrote a
 > simple safeguarded Newton and put a small package on
 > R-forge at
 > https://r-forge.r-project.org/R/?group_id=395

 > Note that Newton's method is used to solve nonlinear
 > equations. In fact, for function minimization, we apply it
 > to solve g(x) = 0 where g is the gradient and x is the
 > vector of parameters. In part, safeguards ensure we reduce
 > the function f(x) at each step to avoid some of the
 > difficulties that may arise from a non-positive-definite
 > hessian.

 > In the package, I have a very simple quadratic test, the
 > Rosenbrock test function and the Wood test function. The
 > method fails on the last function -- the hessian is not
 > positive definite where it stops.
 > Before submitting this package to CRAN, I would like to
 > see its behaviour on other test problems, but am lazy
 > enough to wish to avoid creating the hessian code. If
 > anyone has such code, it would be very welcome. Please
 > contact me off-list. If I get some workable examples that
 > are open for public view, I'll report back here.

 > John Nash
Re: [R] R-3.4.0 and survival_2.41-3 ..
>>>>> Göran Broström >>>>> on Tue, 25 Apr 2017 10:22:48 +0200 writes: > I installed R-3.4.0 and got problems with the survival package, for instance > >> library(survival) >> mort <- data.frame(exit = 1:4, event = rep(1, 4), x = c(0, 1, 0, 1)) >> fit <- coxph(Surv(exit, event) ~ x, data = mort) > Error in fitter(X, Y, strats, offset, init, control, weights = weights, : > object 'Ccoxmart' not found > - > No problems with R-3.3.3 and the same (latest) survival version, which > makes me think that something is going on in my R installation rather > than in the survival package. > Thanks for any hint, > Göran >> sessionInfo() > R version 3.4.0 (2017-04-21) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 16.04.2 LTS > Matrix products: default > BLAS: /usr/lib/openblas-base/libblas.so.3 > LAPACK: /usr/lib/libopenblasp-r0.2.18.so > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=sv_SE.UTF-8LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=sv_SE.UTF-8LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=sv_SE.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=sv_SE.UTF-8 LC_IDENTIFICATION=C > attached base packages: > [1] stats graphics grDevices utils datasets methods base > other attached packages: > [1] survival_2.41-3 > loaded via a namespace (and not attached): > [1] compiler_3.4.0 Matrix_1.2-8splines_3.4.0 grid_3.4.0 > [5] lattice_0.20-35 I'm 99.5% sure that you are using more than just the default library of package (a very good thing - we have been doing the same for years). We have in NEWS for R 3.4.0 > PACKAGE INSTALLATION: > [...] > [...] > • Packages which register native routines for .C or .Fortran need > to be re-installed for this version (unless installed with > R-devel SVN revision r72375 or later). and Prof Brian Ripley did announce that nicely and early on R-devel. 
==> https://hypatia.math.ethz.ch/pipermail/r-devel/2017-March/073940.html

==> You have to re-install quite a few packages for R 3.4.0, __if__ they
use .C() or .Fortran()

When we've e-talked about the issue within R-core, Uwe Ligges noted we
should really ask everyone to run

    update.packages(checkBuilt=TRUE)

after an update to a new major (meaning "R-x.y.0") release of R and
**not** re-use packages {inside R-x.y.z} that were installed with
R-x.(y-1).z .. and of course Uwe is right: We should ask others to do it
_and_ do it ourselves.

Anyway it _is_ considerably more important for the 3.4.0 release.

Martin Maechler
ETH Zurich (and R Core team)
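In code, the recommended post-upgrade step is the one-liner below; the `installed.packages()` filter is only an illustrative way of listing which installed packages were built under an older R:

```r
## list packages whose "Built" field predates the running R version:
ip <- installed.packages()
old <- ip[package_version(ip[, "Built"]) < getRversion(),
          c("Package", "Built"), drop = FALSE]
old

## rebuild them (run this after every R-x.y.0 upgrade):
## update.packages(checkBuilt = TRUE, ask = FALSE)
```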
Re: [R] The effect of tolerance in all.equal()
> Ashim Kapoor > on Tue, 25 Apr 2017 14:02:18 +0530 writes:

 > Dear all,
 > I am not able to understand the interplay of absolute vs relative and
 > tolerance in the use of all.equal
 > If I want to find out if absolute differences between 2 numbers/vectors
 > are bigger than a given tolerance I would do:
 > all.equal(1, 1.1, scale=1, tol=.1)
 > If I want to find out if relative differences between 2 numbers/vectors
 > are bigger than a given tolerance I would do:
 > all.equal(1, 1.1, tol=.1)
 > ##
 > I can also do:
 > all.equal(1, 3, tol=1)
 > to find out if the absolute difference is bigger than 1. But here I won't
 > be able to detect absolute differences smaller than 1 in this case, so I
 > don't think that this is a good way.
 > My query is: what is the reasoning behind all.equal returning the absolute
 > difference if the tolerance >= target and relative difference if tolerance
 > < target?

(above, it is tol >/<= |target|, i.e., absolute value)

The following are desiderata / restrictions:

1) Relative tolerance is needed to keep things scale-invariant, i.e.,
   all.equal(x, y) and all.equal(1000 * x, 1000 * y) should typically be
   identical for (almost) all (x,y).
   ==> "the typical behavior should use relative error tolerance"

2) when x or y (and typically both!) are very close to zero it is
   typically undesirable to keep relative tolerances (in the boundary
   case, they _are_ zero exactly, and "relative error" is undefined).
   E.g., for most purposes, 3.45e-15 and 1.23e-17 should be counted as
   equal to zero and hence to each other.

1) and 2) are typically reconciled by switching from relative to
absolute when the arguments are close to zero (*).

The exact cutoff at which to switch from relative to absolute (or a
combination of the two) is somewhat arbitrary (*2) and for all.equal()
has been made in the 1980's (or even slightly earlier?) when all.equal()
was introduced into the S language at Bell Labs AFAIK. Maybe John
Chambers (or Rick Becker or ..., but they may not read R-help) knows
more.
*2) Then, the choice for all.equal() is in some way "least arbitrary",
using c = 1 in the more general tolerance >= c*|target| framework.

*) There have been alternatives in "the (applied numerical analysis /
algorithm) literature" seen in published algorithms, but I don't have
any example ready. Notably some of these alternatives are _symmetric_ in
(x,y), where all.equal() was designed to be asymmetric, using names
'target' and 'current'. The alternative idea is along the following
thoughts: Assume that for "equality" we want _both_ relative and
absolute (e := tolerance) "equality"

    |x - y| < e (|x| + |y|)/2   (where you could use |y| or |x| instead
                                 of their mean; all.equal() uses |target|)
    |x - y| < e * e1            (where e1 = 1, or e1 = 10^-7, ..)

If you add the two inequalities you get (up to a factor of 2)

    |x - y| < e (e1 + (|x| + |y|)/2)

as check, which is a "mixture" of relative and absolute tolerance. With
a somewhat long history, my gut feeling would nowadays actually prefer
this (I think with a default of e1 = e) - which does treat x and y
symmetrically.

Note that convergence checks in good algorithms typically check for
_both_ relative and absolute difference (each with its tolerance
providable by the user), and the really good ones for minimization do
check for (approximate) gradients also being close to zero - as old
timers among us should have learned from Doug Bates ... but now I'm
really diverging.

Last but not least, some R code at the end, showing that the
*asymmetric* nature of all.equal() may lead to somewhat astonishing (but
very logical and as documented!) behavior.
Martin

 > Best Regards,
 > Ashim

## The "data" to use:
> epsQ <- lapply(seq(12, 18, by = 1/2), function(P) bquote(10^-.(P)))
> names(epsQ) <- sapply(epsQ, deparse); str(epsQ)
List of 13
 $ 10^-12  : language 10^-12
 $ 10^-12.5: language 10^-12.5
 $ 10^-13  : language 10^-13
 $ 10^-13.5: language 10^-13.5
 $ 10^-14  : language 10^-14
 $ 10^-14.5: language 10^-14.5
 $ 10^-15  : language 10^-15
 $ 10^-15.5: language 10^-15.5
 $ 10^-16  : language 10^-16
 $ 10^-16.5: language 10^-16.5
 $ 10^-17  : language 10^-17
 $ 10^-17.5: language 10^-17.5
 $ 10^-18  : language 10^-18

> str(lapply(epsQ, function(tl) all.equal(3.45e-15, 1.23e-17, tol = eval(tl))))
List of 13
 $ 10^-12  : logi TRUE
 $ 10^-12.5: logi TRUE
 $ 10^-13  : logi TRUE
 $ 10^-13.5: logi TRUE
 $ 10^-14  : logi TRUE
 $ 10^-14.5: chr "Mean relative difference: 0.9964348"
 $ 10^-15  : chr "Mean relative difference: 0.9964348"
 $ 10^-15.5: chr "Mean relative difference: 0.9964348"
 $ 10^-16  : chr "Mean relative difference: 0.9964348"
 $ 10^-16.5: chr "Mean relative difference: 0.9964348"
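One concrete instance of that asymmetry, with the default tolerance 1.5e-8: whether the check is relative or absolute depends on |target| alone, i.e., on the order of the arguments:

```r
## |target| = 2e-8  >  1.5e-8  => *relative* check: 1e-8/2e-8 = 0.5, "too big"
all.equal(2e-8, 1e-8)   # "Mean relative difference: 0.5"

## |target| = 1e-8  <= 1.5e-8  => *absolute* check: 1e-8 <= 1.5e-8
all.equal(1e-8, 2e-8)   # TRUE
```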
Re: [R] [FORGED] Logical Operators' inconsistent Behavior
>>>>> Ramnik Bansal
>>>>>     on Sat, 20 May 2017 08:52:55 +0530 writes:

 > Taking this question further.
 > If I use a complex number or a numeric as an operand in logical
 > operations, to me it APPEARS that these two types are first coerced to
 > LOGICAL internally and then THIS logical output is further used as the
 > operand.
 > For eg.
 >> x <- 4+5i; c(x & F, x & T, x | F, x | T)
 > [1] FALSE  TRUE  TRUE  TRUE
 > This output is consistent with
 >> x <- 4+5i; c(as.logical(x) & F, as.logical(x) & T, as.logical(x) | F, as.logical(x) | T)
 > [1] FALSE  TRUE  TRUE  TRUE
 > This consistency makes me draw an on-the-surface conclusion that in
 > the case of logical operations if the operand is not of type 'logical'
 > it is first coerced into 'logical'.

That conclusion is wrong, as you show below.  Rather, as the error message
says, logical "operations are possible only for numeric, logical or complex
types".

Again:

1) Logical/arithmetic operations "work" with "numeric-like" types, namely
   numeric, logical or complex (and numeric = {integer, double})
   ==> all other types give an error (the one you've cited twice).

2) For "numeric-like" types and *logical* operations (&, |, !; plus && and
   ||) the equivalent of as.logical() is applied before performing the Op.

Seems pretty consistent ... and also according to the principle of "least
surprise" (for me at least).

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
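A short sketch of both points (the error text being the one quoted above):

```r
x <- 4+5i
stopifnot(identical(x & FALSE, FALSE),        # i.e. as.logical(x) & FALSE
          identical(x | FALSE, TRUE),
          identical(as.logical(x),    TRUE),  # nonzero complex -> TRUE
          identical(as.logical(0+0i), FALSE)) # zero complex    -> FALSE

## character is not "numeric-like", so no coercion is attempted -- it errors:
e <- tryCatch("TRUE" & TRUE, error = function(e) conditionMessage(e))
stopifnot(grepl("numeric, logical or complex", e))
```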
[R] Error in readRDS(dest) (was Re: Error with installed.packages with R 3.4.0 on Windows)
> Patrick Connolly > on Tue, 23 May 2017 20:47:22 +1200 writes: > On Mon, 22-May-2017 at 05:43AM -0400, Martin Morgan wrote: > |> On 05/22/2017 05:10 AM, Patrick Connolly wrote: > |> >Apparently it isn't harmless. > |> > > |> >>install.packages("withr") > |> >Error in readRDS(dest) : error reading from connection > |> > |> that seems like a plain-old network connectivity issue, or perhaps > |> an issue with the CRAN mirror you're using. Can you debug on your > |> end, e.g,. > |> > |> options(error=recover) > |> install.packages("withr") > |> ... > |> > |> then select the 'frame' where the error occurs, look around > |> > |> ls() > |> > |> find the value of 'dest', and e.g., try to open dest in your browser. > This is what I get >> options(error=recover) >> install.packages("withr") > ^C > Enter a frame number, or 0 to exit > 1: install.packages("withr") > 2: available.packages(contriburl = contriburl, method = method) > 3: tryCatch({ > download.file(url = paste0(repos, "/PACKAGES.rds"), destfile > 4: tryCatchList(expr, classes, parentenv, handlers) > 5: tryCatchOne(expr, names, parentenv, handlers[[1]]) > 6: doTryCatch(return(expr), name, parentenv, handler) > 7: download.file(url = paste0(repos, "/PACKAGES.rds"), destfile = dest, method > Selection: 7 '7' was the wrong choice: 'dest' exists in the frame that *calls* download.file, in this case, it is frame 2, i.e., inside available.packages(.) where the tryCatch() call to download.file() happens. Given the above stack trace. It may be easier to just do debugonce(available.packages) install.packages("withr") and then inside available.packages, (using 'n') step to the point _before_ the tryCatch(...) call happens; there, e.g. use ls.str() which gives an str() of all your local objects, notably 'dest' and 'method'. but you can also try other things once inside available.packages(). 
Martin > Called from: eval(substitute(browser(skipCalls = skip), list(skip = 7 - which)), > envir = sys.frame(which)) > Browse[1]> dest > Error during wrapup: object 'dest' not found > That indicates to me that the problem is further back but I have no > idea how make use of that information. > Browse[1]> ls() > [1] "cacheOK" "destfile" "extra""method" "mode" "quiet" "url" > Browse[1]> url > [1] "http://cran.stat.auckland.ac.nz/src/contrib/PACKAGES.rds"; > Browse[1]> destfile > [1] "/tmp/RtmpplJSrB/repos_http%3A%2F%2Fcran.stat.auckland.ac.nz%2Fsrc%2Fcontrib.rds" > Browse[1]> > The destfile above is zero-length and I suppose is where dest is > intended to end up. > Where else should I be looking? Earlier installations never had this > issue so I don't have anything to compare. > TIA > -- __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
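A non-interactive way to capture the same information is to repeat just the failing call inside tryCatch(). A sketch — the `file://` repository is a stand-in so it also runs without network access; substitute the real mirror URL from the session above:

```r
## stand-in repository so the sketch runs offline; in the real case:
## repos <- "http://cran.stat.auckland.ac.nz/src/contrib"
repos <- paste0("file://", tempdir())   # has no PACKAGES.rds -> will fail
dest  <- tempfile(fileext = ".rds")

res <- tryCatch(
    suppressWarnings(
        download.file(url = paste0(repos, "/PACKAGES.rds"),
                      destfile = dest, quiet = TRUE)),
    error = function(e) e)

## res is 0 on success, otherwise the condition with the underlying reason:
if (inherits(res, "error")) conditionMessage(res)
```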
Re: [R] Missing dependencies in pkg installs
> David Winsemius > on Wed, 21 Jun 2017 18:04:13 -0700 writes: >> On Jun 21, 2017, at 1:39 PM, Conklin, Mike (GfK) wrote: >> >> I have a Ubuntu server with an R installation that has 384 packages installed. We are trying to replicate the system on a Red Hat Enterprise server. I downloaded the list of packages from the Ubuntu machine and read it into an R session on the new machine. Then I ran install.packages(listofpackages). Now I have 352 packages on the new machine but several very common packages (like much of the tidyverse, ggplot2, Hmisc) failed to install because of missing dependencies (most of which are the other packages that failed to install). >> >> R version 3.4.0 (2017-04-21) >> Platform: x86_64-redhat-linux-gnu (64-bit) > I'd make sure you have reviewed this: > https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Essential-programs-and-libraries yes, but see also below .. > Best; > David. >> Running under: Red Hat Enterprise Linux Server 7.2 (Maipo) >> >> Matrix products: default >> BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] dplyr_0.7.0 shiny_1.0.3 >> >> loaded via a namespace (and not attached): >> [1] compiler_3.4.0 magrittr_1.5 assertthat_0.2.0 R6_2.2.2 >> [5] htmltools_0.3.6 tools_3.4.0 glue_1.1.1 tibble_1.3.3 >> [9] Rcpp_0.12.11 digest_0.6.12xtable_1.8-2 httpuv_1.3.3 >> [13] mime_0.5 rlang_0.1.1 >>> >> >> >> If I try and install Hmisc for example I get the following errors: >> for the first error ERROR: 'configure' exists but is not executable -- >> see the 'R Installation and dministration Manual' I did >> not find the R 
Installation and Administration Manual >> helpful.

why?  It does assume you spend some time understanding what it is talking
about.  OTOH, it is *THE* official document on the topic, written and
constantly updated by the R core team.

Anyway:

    ERROR: 'configure' exists but is not executable

is I think a crucial piece of information.  It's from stringi's
installation failure, and the top level directory of stringi (1.1-5) on a
Unix-alike looks as follows (ls -l with user and group names removed):

-rw-r--r--.   1451 Apr  7 15:08 DESCRIPTION
-rw-r--r--.   6207 Apr  7 11:21 INSTALL
-rw-r--r--.   3580 Mar 21 13:29 LICENSE
-rw-r--r--.  63692 Apr  7 15:08 MD5
-rw-r--r--.   6204 Oct 24  2016 NAMESPACE
-rw-r--r--.  22407 Apr  7 11:44 NEWS
drwxr-xr-x.   8192 Mar 28 10:26 R
-rwxr-xr-x.     66 Apr  2  2015 cleanup
-rw-rw-r--.  32193 Apr  8 05:46 config.log
-rwxrwxr-x.  40648 Apr  8 05:46 config.status
-rwxr-xr-x. 173757 Apr  7 11:43 configure
-rw-r--r--.    669 Jun 23  2015 configure.win
drwxr-xr-x.    512 Apr  7 11:50 inst
drwxr-xr-x.   8192 Mar 28 10:26 man
drwxr-xr-x.  16384 Apr  8 05:47 src

Note the "x"s in the 'configure' line's  "-rwxr-xr-x." :  This means the
file is executable, and a reasonable shell can just "call the file" and it
will be executed -- but that's the part which failed for you.

.. this *is* peculiar, as it looks like some of the standard Unix tools may
be misbehaving for you .. I assume it could be some OS security "feature"
playing against you..
>> >> >>> install.packages("Hmisc") >> Installing package into '/usr/lib64/R/library' >> (as 'lib' is unspecified) >> also installing the dependencies 'stringi', 'evaluate', 'reshape2', 'stringr', knitr', 'ggplot2', 'htmlTable', 'viridis' >> >> trying URL 'https://cloud.r-project.org/src/contrib/stringi_1.1.5.tar.gz' >> Content type 'application/x-gzip' length 3645872 bytes (3.5 MB) >> == >> downloaded 3.5 MB >> >> trying URL 'https://cloud.r-project.org/src/contrib/evaluate_0.10.tar.gz' >> Content type 'application/x-gzip' length 21914 bytes (21 KB) >> == >> downloaded 21 KB >> >> trying URL 'https://cloud.r-project.org/src/contrib/reshape2_1.4.2.tar.gz' >> Content type 'application/x-gzip' length 34688 bytes (33 KB) >> == >> downloaded 33 KB >> >> trying URL 'https://cloud.r-project.org/src/contrib/stringr_1.2.0.tar.gz' >> Content type 'applicat
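For the record, the usual cure for "'configure' exists but is not executable" is to restore the execute bit; from within R that is Sys.chmod(). Sketched here on a throw-away stand-in file, since the real path to the unpacked stringi sources varies per system:

```r
## throw-away stand-in for  <path-to-unpacked-sources>/stringi/configure :
d   <- tempfile("stringi-src-"); dir.create(d)
cfg <- file.path(d, "configure")
writeLines(c("#!/bin/sh", "exit 0"), cfg)    # freshly written: not executable

Sys.chmod(cfg, mode = "0755")                # -rwxr-xr-x, as in the listing
stopifnot(file.access(cfg, mode = 1) == 0)   # mode = 1 tests executability
```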
Re: [R] getting error while trying to make dendogram based on gene expression
>>>>> Yogesh Gupta
>>>>>     on Wed, 21 Jun 2017 13:42:15 +0900 writes:

 > I am trying to make dendogram based on gene expression matrix, but
 > getting some error:

 > countMatrix = read.table("count.row.txt", header=T, sep='\t', check.names=F)
 > colnames(countMatrix)
 > count_matrix <- countMatrix[,-1]            # remove first column (gene names)
 > rownames(count_matrix) <- countMatrix[,1]   # added first column gene names as rownames
 >> nonzero_row <- count_matrix[rowSums(count_matrix) > 0,]  # removed rows with sum 0 across all samples
 >> x1 = as.matrix(nonzero_row)   # converted data into matrix
 >> x = log2(x1 + 1)              # converted into log value
 >> d <- dist(x, method="euclidean")
 >> h <- hclust(d, method="complete")

 > *Error:*
 > *** caught segfault ***
 > address 0x7fa39060af28, cause 'memory not mapped'
 > Traceback:
 > 1: hclust(d, method = "complete")
 > Possible actions:
 > 1: abort (with core dump, if enabled)
 > 2: normal R exit
 > 3: exit R without saving workspace
 > 4: exit R saving workspace
 > Selection:

This looks like a problem that should not happen.  Though it could be that
it's just too large a problem (for your hardware, or "in general").

However, we cannot reproduce what you show above.  To help us help you,
please provide (to this mailing list!)

 - a (minimal) reproducible example
 - sessionInfo()

It is best to take time to carefully read
https://www.r-project.org/help.html  and/or then just search
"reproducible example R" :  http://lmgtfy.com/?q=reproducible+example+R

Martin

 > Thanks
 > Yogesh
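For contrast, a minimal self-contained example of exactly this pipeline, which runs fine on small random data — the kind of thing to post:

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 10)    # 10 "genes" x 4 samples, made-up data
d <- dist(x, method = "euclidean")   # stores 10*9/2 = 45 pairwise distances
h <- hclust(d, method = "complete")
stopifnot(length(d) == 45, inherits(h, "hclust"))
```

Note also that dist() stores all n(n-1)/2 pairwise distances, so memory grows quadratically with the number of rows — one plausible way a large expression matrix becomes "too large a problem".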
Re: [R] [Rd] setReplaceMethod creates 'object' in the userworkspace
> Jonathan Fritzemeier > on Fri, 23 Jun 2017 16:15:30 +0200 writes: > Hi, > I recognized that the function 'setReplaceMethod' is creating a > character vector in the user workspace having the name (e.g. "newClass") > of the class used as value. If you can sort out a mistake by myself, I > would like you to file a bug report. Yes, a mistake by yourself (and really not fit for R-devel, but rather for R-help to which I follow up now) > BBFN, > Jonathan > setClass("newClass", representation(value="numeric")) > > setMethod(f = "initialize", signature = "newClass", > definition = function(.Object){ > .Object@value <- 1 > return(.Object) > }) > > setGeneric(name = "myValue", > def = function(object) { standardGeneric("myValue") } > ) > setGeneric(name = "myValue<-", > def = function(object, value) { standardGeneric("myValue<-") } > ) > > setMethod("myValue", signature(object = "newClass"), > function(object) { > return(object@value) > } > ) > setReplaceMethod("myValue", signature = (object = "newClass"), > function(object, value) { > object@value <- value > return(object) > } > ) Q: what do you think happens with the above [last setReplaceMethod() call] ? A: it creates an R object named 'object' in the globalenv (or the package if this would go into a package) If just replace '(object = "newClass")' by '"newClass"' things should be fine. {{ Removing all the completely redundant return(.), i.e. return implicitly rather than via an extra function call would also make the code "cleaner" and more R-like }} Best, Martin > myNewObject <- new("newClass") > print(object) > > > > print(object) > [1] "newClass" > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
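Spelled out, the corrected version (a condensed sketch of the poster's example, without the redundant initialize method and return() calls):

```r
library(methods)

setClass("newClass", representation(value = "numeric"))
setGeneric("myValue<-", function(object, value) standardGeneric("myValue<-"))

## signature given as a plain character string; the problematic
##   signature = (object = "newClass")
## would first evaluate the assignment  object <- "newClass"  -- which is
## exactly where the stray 'object' in the workspace came from.
setReplaceMethod("myValue", signature = "newClass",
                 function(object, value) {
                     object@value <- value
                     object          # implicit return, no return() needed
                 })

x <- new("newClass")
myValue(x) <- 42
stopifnot(identical(x@value, 42),
          !exists("object", where = globalenv(), inherits = FALSE))
```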
Re: [R] readLines without skipNul=TRUE causes crash
>>>>> Anthony Damico >>>>> on Sun, 16 Jul 2017 06:40:38 -0400 writes: > hi, the text file that prompts the segfault is 4gb but only 80,937 lines >> file.info( "S:/temp/crash.txt") > size isdir mode mtime > ctime atime exe > S:/temp/crash.txt 4078192743 FALSE 666 2017-07-15 17:24:35 2017-07-15 > 17:19:47 2017-07-15 17:19:47 no > On Sun, Jul 16, 2017 at 6:34 AM, Duncan Murdoch > wrote: >> On 16/07/2017 6:17 AM, Anthony Damico wrote: >> >>> thank you for taking the time to write this. i set it running last >>> night and it's still going -- if it doesn't finish by tomorrow, i will >>> try to find a site to host the problem file and add that link to the bug >>> report so the archive package can be avoided at least. i'm sorry for >>> the bother >>> >>> >> How big is that text file? I wouldn't expect my script to take more than >> a few minutes even on a huge file. >> >> My script might have a bug... >> >> Duncan Murdoch >> >> On Sat, Jul 15, 2017 at 4:14 PM, Duncan Murdoch >>> mailto:murdoch.dun...@gmail.com>> wrote: >>> >>> On 15/07/2017 11:33 AM, Anthony Damico wrote: >>> >>> hi, i realized that the segfault happens on the text file in a >>> new R >>> session. so, creating the segfault-generating text file requires >>> a >>> contributed package, but prompting the actual segfault does not -- >>> pretty sure that means this is a base R bug? submitted here: >>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311 >>> <https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311> >>> hopefully i am not doing something remarkably stupid. the text file itself >>> is 4GB >>> so cannot upload it to bugzilla, and from the >>> R_AllocStringBugger error >>> in the previous message, i think most or all of it needs to be >>> there to >>> trigger the segfault. thanks! 
In the mean time, communication has continued a bit at the bugzilla bug tracker (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311 ), and as you can read there, the bug is fixed now, also thanks to an initial patch proposal by Hannes Mühleisen. Martin Maechler ETH Zurich (and R Core) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] axis() after image.plot() does not work except if points() is inserted between
>>>>> Marc Girondot via R-help >>>>> on Mon, 24 Jul 2017 09:35:06 +0200 writes: > Thanks for the proposition. As you see bellow, par("usr") is the same > before and after the points() (the full code is bellow): > >> par("usr") > [1] -0.250 1.250 -0.167 1.167 >> # if you remove this points() function, axis will show nothing. >> >> points(1.5, 1.5, type="p") >> p2 <- par(no.readonly=TRUE) >> par("usr") > [1] -0.250 1.250 -0.167 1.167 > ... > I can reproduce it in Ubuntu and MacosX R Gui and Rstudio (R 3.4.1). > Marc > Here is the code: > library(fields) > par(mar=c(5,4.5,4,7)) > D <- matrix(c(10, 20, 25, 30, 12, 22, 32, 35, 13, 25, 38, 40), nrow=3) > p0 <- par(no.readonly=TRUE) > image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines", > ylab="Columns", cex.lab = 0.5, > zlim=c(min(D, na.rm=TRUE),max(D, na.rm=TRUE)), > las=1, axes=FALSE) > p1 <- par(no.readonly=TRUE) > par("usr") > par("xpd") > # if you remove this points() function, axis will show nothing. > points(1.5, 1.5, type="p") > p2 <- par(no.readonly=TRUE) > par("usr") > par("xpd") > ## > axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5) > axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1, > cex.axis=0.5) > identical(p1, p2) Have you ever carefully read the detailed help page about image.plot()? I haven't, but a cursory reading already shows me that the author of the function did this partly on purpose: > Side Effects: > > After exiting, the plotting region may be changed to make it > possible to add more features to the plot. To be explicit, > ‘par()\$plt’ may be changed to reflect a smaller plotting region > that has accommodated room for the legend subplot. 
Unfortunately, there it does _not_ mention the following:  From looking at
its code, and then re-reading parts of the help page, I see that there is a
'graphics.reset' argument which you can set to TRUE in such a case:

  image.plot(D, col = rev(heat.colors(128)), bty = "n", xlab = "Lines",
             ylab = "Columns", cex.lab = 0.5,
             zlim = range(D, na.rm = TRUE),
             graphics.reset = TRUE,   # <<<<< the solution
             las = 1, axes = FALSE)

Also note that  zlim = range(D, ...)  is infinitely more elegant than
zlim = c(min(D, ...), max(D, ...)).

Martin Maechler
ETH Zurich (and R core)
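More generally, whenever a function leaves par() changed behind, the usual protection is to save and restore the settings yourself:

```r
pdf(tempfile(fileext = ".pdf"))   # any open device; pdf() for a scripted example
op <- par(no.readonly = TRUE)     # save all *settable* graphics parameters
par(mar = c(1, 1, 1, 1))          # ... anything that fiddles with par() ...
par(op)                           # restore
stopifnot(identical(par("mar"), op$mar))
dev.off()
```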
Re: [R] Superscript and subscrib R for legend x-axis and y-axis and colour different subjects in longitudinal data with different colours
> Hi Rosa > something like > plot(1,1, sub=expression(lambda^"2")) > So with your example, do you want something like > plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5) , > xlab="Day in ICU", > ylab="CRP (mg/dL)", > sub = mtext(expression(lambda^2))) OOps! Either plot( ..., sub = *) or plot( ... ) ; mtext(*) but not both! > CRP7graph <- apply(CRP7, 1, lines, col="gray") > Cheers > Petr > > -Original Message- > > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rosa > > Oliveira > > Sent: Friday, July 28, 2017 5:07 PM > > To: r-help mailing list ; R-help@r-project.org > > Subject: [R] Superscript and subscrib R for legend x-axis and y-axis and > > colour > > different subjects in longitudinal data with different colours > > > > I am trying to make a x-axis and y-axis titles with both a special > > character and a > > subscript. I am not being able to do this. I think its just a placing of my > > parenthesis, but I've tried (seemingly) everything. > > > > Even more, when I try the blog users code it works. > > > > > > > > Is it because I’m using longitudinal data? > > > > > > > > Even more. Is it possible to colour each one of the 15 lines with a > > different > > colour? 
> > > > > > > > > > > > > > library(ggplot2) > > library(reshape) > > library(lattice) > > library(gtable) > > library(grid) > > > > attach(mtcars) > > > > beta0 = rnorm (15, 1, .5) > > beta1 = rnorm (15, -1, .5) > > > > tempo = seq(1:5) > > > > CRP7raw = matrix(NA, 15, 5) > > CRP7 = matrix(NA, 15, 5) > > CRP98raw = matrix(NA, 15, 5) > > CRP98 = matrix(NA, 15, 5) > > > > crp <- for (i in 1:15) { > > CRP7raw[i,] = beta0[i] + beta1[i] * tempo > > CRP7[i,] =CRP7raw[i,] + rnorm(5, 0, 2.14) > > > > CRP98raw[i,] = beta0[i] + beta1[i] * tempo > > CRP98[i,] =CRP98raw[i,] + rnorm(5, 0, .1) > > } > > > > > > # plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5) , > > # xlab="Day in ICU", > > # ylab="CRP (mg/dL)", > > # sub = mtext(expression(paste(lambda))) > > # > > # CRP7graph <- apply(CRP7, 1, lines, col="gray") > > > > > > > > > > > > > > # plot(c(1:5), CRP98raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5), > > # xlab="Day in ICU", > > # ylab="CRP (mg/dL)") > > # CRP98graph <- apply(CRP98, 1, lines, col="gray") > > > > par(mfrow=c(1,2)) > > > > plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5) , > > xlab="t_i", > > ylab="y_ij", > > sub = "lambda = 0.7") > > > > CRP7graph <- apply(CRP7, 1, lines, col="gray") > > > > > > plot(c(1:5), CRP98raw[1,], type = "n", xlim=c(1,5), ylim=c(-10,5), > > xlab="Day in ICU", > > ylab="CRP (mg/dL", > > sub = "lambda = 0.98") > > CRP98graph <- apply(CRP98, 1, lines, col="gray") > > > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a jsou > určeny pouze jeho adresátům. > Jestliže jste obdržel(a) tento e-mail omylem, informujte laskavě neprodleně > jeho odesílatele. 
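Back to the original questions: plotmath expressions (see ?plotmath) go directly into xlab/ylab/sub, and distinct colours per subject are just a colour vector indexed inside the loop — a small sketch:

```r
f <- tempfile(fileext = ".pdf")
pdf(f)                                   # scripted example; any device works
plot(c(1, 5), c(-10, 5), type = "n",
     xlab = expression(t[i]),            # subscript:   t_i
     ylab = expression(y[ij]),           # subscript:   y_ij
     sub  = expression(lambda == 0.7))   # Greek letter via plotmath
cols <- rainbow(15)                      # one distinct colour per subject
for (i in 1:15) lines(1:5, rnorm(5), col = cols[i])
dev.off()
stopifnot(file.exists(f))
```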
Re: [R] Superscript and subscrib R for legend x-axis and y-axis and colour different subjects in longitudinal data with different colours
>>>>> PIKAL Petr >>>>> on Mon, 31 Jul 2017 09:11:18 + writes: > Hi Martin see in line >> -Original Message- From: Martin Maechler >> [mailto:maech...@stat.math.ethz.ch] Sent: Monday, July >> 31, 2017 10:52 AM To: PIKAL Petr >> Cc: Rosa Oliveira ; r-help mailing >> list >> Subject: Re: [R] Superscript and subscrib R for legend >> x-axis and y-axis and colour different subjects in >> longitudinal data with different colours >> >> >> > Hi Rosa > something like >> >> > plot(1,1, sub=expression(lambda^"2")) >> >> > So with your example, do you want something like >> >> > plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), >> ylim=c(-10,5) , > xlab="Day in ICU", > ylab="CRP >> (mg/dL)", > sub = mtext(expression(lambda^2))) >> >> OOps! Either plot( ..., sub = *) or plot( ... ) ; >> mtext(*) >> >> but not both! > You are right, I used a code from OP and did not much > think about it. Strangely enough, the code worked without > any complain. Probably mtext is "stronger" than sub and > overrides it. Well, well, "the magic of R" . Not quite: mtext(..) is a valid function call that is evaluated before plot() finishes; then it returns NULL invisibly, and 'plot(*, sub=NULL)' does nothing "coincidentally" (but for good reasons) Martin > Cheers Petr >> >> > CRP7graph <- apply(CRP7, 1, lines, col="gray") >> >> > Cheers > Petr >> >> >> > > -Original Message- > > From: R-help >> [mailto:r-help-boun...@r-project.org] On Behalf Of Rosa > >> > Oliveira > > Sent: Friday, July 28, 2017 5:07 PM > > >> To: r-help mailing list ; >> R-help@r-project.org > > Subject: [R] Superscript and >> subscrib R for legend x-axis and y-axis > > and colour >> different subjects in longitudinal data with different > >> > colours >> > > >> > > I am trying to make a x-axis and y-axis titles with >> both a special > > character and a subscript. I am not >> being able to do this. I think > > its just a placing of >> my parenthesis, but I've tried (seemingly) everything. 
>> > > >> > > Even more, when I try the blog users code it works. >> > > >> > > >> > > >> > > Is it because I’m using longitudinal data? >> > > >> > > >> > > >> > > Even more. Is it possible to colour each one of the >> 15 lines with a > > different colour? >> > > >> > > >> > > >> > > >> > > >> > > >> > > library(ggplot2) > > library(reshape) > > >> library(lattice) > > library(gtable) > > library(grid) >> > > >> > > attach(mtcars) >> > > >> > > beta0 = rnorm (15, 1, .5) > > beta1 = rnorm (15, -1, >> .5) >> > > >> > > tempo = seq(1:5) >> > > >> > > CRP7raw = matrix(NA, 15, 5) > > CRP7 = matrix(NA, 15, >> 5) > > CRP98raw = matrix(NA, 15, 5) > > CRP98 = >> matrix(NA, 15, 5) >> > > >> > > crp <- for (i in 1:15) { > > CRP7raw[i,] = beta0[i] + >> beta1[i] * tempo > > CRP7[i,] = CRP7raw[i,] + rnorm(5, 0, >> 2.14) >> > > >> > > CRP98raw[i,] = beta0[i] + beta1[i] * tempo > > >> CRP98[i,] = CRP98raw[i,] + rnorm(5, 0, .1) > > } >> > > >> > > >> > > # plot(c(1:5), CRP7raw[1,], type = "n", xlim=c(1,5), >> ylim=c(-10,5) , > > # xlab="Day in ICU", > > # ylab="CRP >> (mg/dL)", > > # sub = mtext(expression(paste(lambda))) >> > > # >> > > # CRP7graph <- apply(CRP7, 1, lines, col="gray") >> > > >> > > >> > > >> > > >> > > >> > > >> > > # plot(c(1:5), CRP98raw[1,], type = "n", xlim=c(1,5), >> ylim=c(-10,5), > > # xlab="Day in ICU", > > #
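The evaluation order is easy to verify: mtext() draws as a side effect and returns NULL invisibly, so plot(..., sub = mtext(...)) ends up being plot(..., sub = NULL):

```r
pdf(tempfile(fileext = ".pdf"))
plot(1, 1)
ret <- mtext(expression(lambda^2))   # draws into the margin, a side effect
dev.off()
stopifnot(is.null(ret))              # so  sub = mtext(...)  means  sub = NULL
```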
Re: [R] Generating samples from truncated multivariate Student-t distribution
>>>>> David Winsemius >>>>> on Tue, 9 May 2017 14:33:04 -0700 writes: >> On May 9, 2017, at 2:05 PM, Czarek Kowalski wrote: >> >> I have already posted that in attachement - pdf file. > I see that now. I failed to scroll to the 3rd page. from a late reader: Please, Czarek, and every other future reader: This is an unwritten "regulation" about all R lists I know (a few of which I co-maintain): ___> R code must be given as unformatted plain text rather than ___> in a formatted document (pdf in your case) >> I am posting plain text here: good .. and please do always on these lists Martin Maechler, ETH Zurich & R Core >> >>> library(tmvtnorm) >>> meann = c(55, 40, 50, 35, 45, 30) >>> covv = matrix(c( 1, 1, 0, 2, -1, -1, >> 1, 16, -6, -6, -2, 12, >> 0, -6, 4, 2, -2, -5, >> 2, -6, 2, 25, 0, -17, >> -1, -2, -2, 0, 9, -5, >> -1, 12, -5, -17, -5, 36), 6, 6) >> df = 4 >> lower = c(20, 20, 20, 20, 20, 20) >> upper = c(60, 60, 60, 60, 60, 60) >> X1 <- rtmvt(n=10, meann, covv, df, lower, upper) >> >> >>> sum(X1[,1]) / 10 >> [1] 54.98258 >> sum(X1[,2]) / 10 >> [1] 40.36153 >> sum(X1[,3]) / 10 >> [1] 49.83571 >> sum(X1[,4]) / 10 >> [1] 34.69571 # "4th element of mean vector" >> sum(X1[,5]) / 10 >> [1] 44.81081 >> sum(X1[,6]) / 10 >> [1] 31.10834 >> >> And corresponding results received using equation (3) from pdf file: >> [54.97, >> 40, >> 49.95, >> 35.31, # "4th element of mean vector" >> 44.94, >> 31.32] >> > I get similar results for the output from your code, > My 100-fold run of your calculations were: > meansBig <- replicate(100, {Xbig <- rtmvt(n=10, meann, covv, df, lower, upper) > colMeans(Xbig)} ) > describe(meansBig[4,]) # describe is from Hmisc package > meansBig[4, ] > n missing distinct Info Mean Gmd .05 .10 .25 > 1000 1001 34.7 0.0195434.6834.68 34.69 > .50 .75 .90 .95 > 34.7034.7234.7234.73 > lowest : 34.65222 34.66675 34.66703 34.66875 34.67566 > highest: 34.72939 34.73012 34.73051 34.73742 34.74441 > So agree, 35.31 is outside the plausible range of an RV formed 
with that package, but I don't have any code relating to your calculations from theory. > Best; > David. >> On 9 May 2017 at 22:17, David Winsemius wrote: >>> >>>> On May 9, 2017, at 1:11 PM, Czarek Kowalski wrote: >>>> >>>> Of course I have expected the difference between theory and a sample >>>> of realizations of RV's and result mean should still be a random >>>> variable. But, for example for 4th element of mean vector: 35.31 - >>>> 34.69571 = 0.61429. It is quite big difference, nieprawdaż? I have >>>> expected that the difference would be smaller because of law of large >>>> numbers (for 10mln samples the difference is quite similar). >>> >>> I for one have no idea what is meant by a "4th element of mean vector". So I have now idea what to consider "big". I have found that my intuitions about multivariate distributions, especially those where the covariate structure is as complex as you have assumed, are often far from simulated results. >>> >>> I suggest you post some code and results. >>> >>> -- >>> David. >>> >>> >>>> >>>> On 9 May 2017 at 21:40, David Winsemius wrote: >>>>> >>>>> On May 9, 2017, at 10:09 AM, Czarek Kowalski >>>>> wrote: >>>>> >>>>> Dear Members, >>>>> I am working with 6-dimensional Student-t distribution with 4 degrees >>>>> of freedom truncated to [20; 60]. I have generated 100 000 samples >>>>> from truncated multivariate Student-t distribution using rtmvt >>>>> function from package ‘tmvtnorm’. I have also calculated mean vector >>>>> using equation (3) from attached pdf. The problem is, that after >>>>> summing all elements in one column of rtmvt result (and dividing by >>>>> 100 000) I do not receive the same
Re: [R] define a list with names as variables
>>>>> Giovanni Gherdovich >>>>> on Fri, 4 Aug 2017 12:36:49 +0200 writes: > Hello Thomas, Ulrik, > thanks for your suggestions. There's also the setNames(, names) which is a trivial wrapper to achieve this. I had provided it for R (in package 'stats') just to help people to write more nicely readable code: Your function f() below then becomes a simple one-liner: f <- function(foo, bar) setNames(list(bar), foo) Martin Maechler ETH Zurich and R Core > Giovanni > On Fri, Aug 4, 2017 at 12:13 PM, Thomas Mailund > wrote: >> Do you mean like this? >> >> >>> f <- function(foo, bar) { >> + result <- list(bar) >> + names(result) <- foo >> + result >> + } >> >>> (x <- f("hello", "world")) >> $hello >> [1] "world" >> >>> names(x) >> [1] "hello" > On Fri, Aug 4, 2017 at 12:14 PM, Ulrik Stervbo wrote: >> Hi Giovani, >> >> I would create an unnamed list and set the names after. >> >> Best, >> Ulrik > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
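In use, the one-liner behaves exactly like the two-step version:

```r
f <- function(foo, bar) setNames(list(bar), foo)
x <- f("hello", "world")
stopifnot(identical(names(x), "hello"),
          identical(x$hello,  "world"))

## setNames() itself is essentially just:
##   function(object, nm) { names(object) <- nm; object }
```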
Re: [R] Help creating the IBM Randu function
> Richard M Heiberger > on Mon, 14 Aug 2017 14:36:40 -0400 writes: > Please look at ?datasets::randu > for David Donoho's translation of RANDU into R. Thanks a lot, Rich, for pointing to this: Indeed, the RANDU aka 'randu' data set has been part of R since the last millennium (and hence before R reached version 1.0.0). The help page does mention where we got the data set originally, namely Dave Donoho. However, I would bet he did not use R at the time, and indeed it was Prof Brian Ripley who added the R code to the R sources in 1999: r5632 | ripley | 1999-08-27 08:59:05 +0200 (Fri, 27. Aug 1999) improve clarity (I hope), add an R example to re-generate this and yes, indeed, example(randu) already creates the RANDU data set for you .. since 1999. One curious thing: the Wikipedia page (1) cited by Martin M. S. Pedersen seems to make it clear it stems from IBM, whereas R's help page help(randu) only mentions that we got it through VAX VMS code. > On Mon, Aug 14, 2017 at 12:49 PM, Martin Møller Skarbiniks Pedersen > wrote: >> Dear all, >> >> I am trying to learn functions in R and 3D plotting so I decided to try >> to plot >> the famous bad PRNG Randu from IBM(1). >> However something is not correct in the function I have created. >> First I define the function RANDU like this: >> >>> RANDU <- function(num) { return (65539*num)%%(2^31) } >> >> and test that it works for a seed of 1: >>> RANDU(1) >> [1] 65539 >> >> but if I want the next value in the sequence I get this number. >>> (65539*65539)%%(2^31) >> [1] 393225 >> >> However using the RANDU function twice doesn't give the same result as >> above. >> >>> RANDU(RANDU(1)) >> [1] 4295360521 >> >> I expect these two values to be the same but that is not the case. >> 393225 should be the correct. >> >> I guess it might be something with local vs. global environment ?! >> >> Please advise and thanks. >> >> Regards >> Martin M. S. 
Pedersen >> >> (1) https://en.wikipedia.org/wiki/RANDU
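The actual trap in the quoted definition is not environments but parsing: `return (65539*num)%%(2^31)` is read by R as `return(65539*num) %% (2^31)`, so the function exits at `return(65539*num)` and the modulo is never applied inside it. Parenthesizing the whole expression (or dropping `return` altogether) fixes it:

```r
# corrected RANDU step: the %% must be part of the returned expression
RANDU <- function(num) (65539 * num) %% 2^31

RANDU(1)         # 65539
RANDU(RANDU(1))  # 393225, the expected next value
```
This also explains the observed `4295360521`: that is the bare product `65539 * 65539`, returned before the `%%` could act.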
Re: [R] Fwd: Find maxima of a function
>>>>> Ranjan Maitra >>>>> on Sun, 27 Aug 2017 08:17:15 -0500 writes: > I have not followed the history of this thread, but I am > quite flummoxed as to why the OP is rewriting code to > estimate parameters from a univariate Gaussian mixture > model when alternatives such as EMCluster (which generally > appears to handle initialization better than MClust) > exist. Or perhaps there is more to it, in which case I > apologize. But I thought that I would make the OP aware of > the package (of which, in full disclosure, I am > co-author). Best wishes, Ranjan In this case, I should also mention the small (but nice) package 'nor1mix' (normal 1-dimensional mixtures), for which I had added MLE via optim() in addition to the traditional slow and sometimes hard-to-get-to-converge-correctly EM algorithm. After install.packages("nor1mix") require("nor1mix") ?norMixMLE Martin Maechler, ETH Zurich > On Sun, 27 Aug 2017 16:01:59 +0300 Ismail SEZEN > wrote: >> Dear Niharika, >> >> As I said before, the problem is basically an >> optimization issue. You should isolate the problematic >> part from the rest of your study. Sometimes, more >> information does not help the solution. All the answers >> from us (Ulrik, David, me) are more or less correct >> for finding a maximum point. Newton’s method is also >> correct. But after the answers, you only say it didn’t find >> the right maxima. At this point I’m clueless, because the >> problem might originate at some other point and we >> don’t know it. For instance, in my previous answer, my >> suggested solution was right for both of your 2 >> situations. But suddenly you just said it didn’t work >> for a mean greater than 6, and I’m not able to reproduce >> your new situation. >> >> I’m really upset that you lost 4 weeks on a very easy >> issue (it is a very long time). 
But if you don’t want to >> lose any more, please isolate your question from the >> rest of the work, create a minimal reproducible example, >> and let’s focus on the problematic part. I assure you >> that your problem will be solved faster. >> >> I could install the distr package at the end, but I’m >> getting another error while running your code. But I >> don’t believe the question is related to this package. >> >> Let me explain about extremum points, although most of you >> know about it. Let’s assume we have a y(x) function. An >> extremum (max/min) point is defined as where dy/dx = >> 0. If the second order derivative of y is less than 0, it’s a >> maximum; if greater than 0, it’s a minimum point. If >> zero, it’s undefined. So, at the end, we need x and y >> vectors to find maxima. >> >> y <- c(1,2,3,4,3,2,3,4,5,6,7,8,9,8,7,6,5,6,7,6,5) # >> sample y values x <- 1:length(y) # corresponding x >> values >> >> # Now we need to convert these discrete x-y vectors to a >> function, because we don’t know the values between >> points. # That’s why we fit a cubic spline to have the >> values between discrete points. >> >> fun <- splinefun(x = x, y = y, method = "n") >> >> # now, we are able to find the value of y at x = 1.5 with >> the help of the fun function. >> >> x2 <- seq(1, max(x), 0.1) y2 <- fun(x2) >> >> plot(x, y, type = "l") lines(x2, y2, col = "red") >> >> # see, the red and black lines are slightly different because >> we fit a cubic spline to the discrete points. # Now, we >> will use the optimize function to find the maxima in a >> defined interval. Interval definition is crucial. # The >> optimize function calculates all the required steps explained >> above to find an extremum and gives the result. 
>> >> max.x <- optimize(fun, interval = range(x), maximum = >> TRUE) max.x2 <- x[which(y == max(y))] # global maximum of >> discrete x-y vectors >> >> print(max.x) # x value of global maximum of y >> print(max.x2) # x value of global maximum of y ( discrete >> points) >> >> see, max.x and max.x2 are very close. max.x is calculated >> under a cubic spline assumption and max.x2 is calculated >> by discrete points. >> >> As a result, the problem is reduced to finding the maxima of the x-y >> vectors
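Ismail's recipe, collected into one runnable chunk (a sketch; the cubic-spline interpolation and the search interval are the assumptions that matter, and `optimize()` only does a one-dimensional search, so on a multi-modal curve the interval choice decides which peak it finds):

```r
y <- c(1,2,3,4,3,2,3,4,5,6,7,8,9,8,7,6,5,6,7,6,5)  # sample y values
x <- seq_along(y)                                   # corresponding x values

# interpolate the discrete points with a natural cubic spline
fun <- splinefun(x = x, y = y, method = "natural")

# search the smooth function for its maximum over the whole x range
max.x  <- optimize(fun, interval = range(x), maximum = TRUE)

# discrete global maximum, for comparison (here at x = 13, y = 9)
max.x2 <- x[which.max(y)]
```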
Re: [R] Optimize code to read text-file with digits
> peter dalgaard > on Fri, 8 Sep 2017 16:12:21 +0200 writes: >> On 8 Sep 2017, at 15:51 , Martin Møller Skarbiniks >> Pedersen wrote: >> >> On 8 September 2017 at 14:37, peter dalgaard >> wrote: >>> >>> On 8 Sep 2017, at 14:03 , peter dalgaard wrote: x <- scan("~/Downloads/digits.txt") x <- x[-seq(1,22,11)] >>> >>> ...and, come to think of it, if you really want the >>> 100 random digits: >>> >>> xx <- c(outer(x,10^(0:4), "%/%")) %% 10 >>> >> >> Hi Peter, Thanks a lot for the answers. I can see that I >> need to read about outer(). However I get a different >> result than expected.
>> R> x <- scan("digits.txt")
>> Read 22 items
>> R> head(x)
>> [1]     0 10097 32533 76520 13586 34673
>> R> x <- x[-seq(1,22,11)]
>> R> head(x)
>> [1] 10097 32533 76520 13586 34673 54876
>> R> head(c(outer(x,10^(0:4), "%/%")) %% 10, 10)
>> [1] 7 3 0 6 3 6 9 7 2 5
> Ah, right. You do get all the digits, but in the order of > the last digit of each 5 digit number, then all the > penultimate digits, etc. To get digits in the right order, > try
> xx <- c(t(outer(x,10^(4:0), "%/%"))) %% 10
> head(xx, 100)
>  [1] 1 0 0 9 7 3 2 5 3 3 7 6 5 2 0 1 3 5 8 6 3 4 6 7 3 5 4 8 7 6 8 0 9 5
> [35] 9 0 9 1 1 7 3 9 2 9 2 7 4 9 4 5 3 7 5 4 2 0 4 8 0 5 6 4 8 9 4 7 4 2
> [69] 9 6 2 4 8 0 5 2 4 0 3 7 2 0 6 3 6 1 0 4 0 2 0 0 8 2 2 9 1 6 6 5
> I.e., reverse the order of digit generation and transpose > the matrix that outer() creates (because matrices are > column-major). As people are "exercising" with R and it's Friday: Try to use read.fwf() instead of scan() to get to the digits directly, and see if you get the identical digits, and if it is faster overall or not [I have no idea of the answer to that]. another Martin.
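The digit-splitting trick from the thread (integer-divide by descending powers of ten, transpose, take mod 10) is easy to verify on a couple of the 5-digit groups:

```r
x <- c(10097, 32533)

# outer() gives one row per number, one column per digit position
# (10^4 down to 10^0); t() makes the digits come out number by number
xx <- c(t(outer(x, 10^(4:0), "%/%"))) %% 10
xx   # 1 0 0 9 7 3 2 5 3 3
```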
Re: [R] Error messages using nonlinear regression function (nls)
>>>>> PIKAL Petr >>>>> on Fri, 20 Oct 2017 06:33:36 +0000 writes: > Hi > Keep your messages in the list, you increase your chance to get some answer. > I changed your data to a groupedData object (see below), but I did not find any problem in it. > plot(wlg) > gives a reasonable picture and I am not such an expert to see any problem with the data. Seems to me that something has to be wrong with the nlsList function. >> wheat.list <- nlsList(Prop ~ SSlogis(end,Asym, xmid, scal), data=wlg) > Warning message: > 6 times caught the same error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): NA/NaN/Inf in 'x' > produces empty list. So maybe others could find some problem. > Cheers > Petr Thank you, Petr, for the reproducible example. This indeed can be traced back to a bug in SSlogis() that has been there since Doug Bates added the 'nls' package on Martin's day 1999 (==> before R version 1.0.0 !) to R (svn rev 6455). It's this part which does contain a thinko (by whomever, only Doug may be able to remember/recall, but I guess it would have been José Pinheiro, then the grad student doing the nitty gritty), I have added 2 '--' below to make the 2 important lines stand out :
z <- xy[["y"]]
--
if (min(z) <= 0) { z <- z - 1.05 * min(z) } # avoid zeroes
z <- z/(1.05 * max(z))      # scale to within unit height
--
xy[["z"]] <- log(z/(1 - z)) # logit transformation
aux <- coef(lm(x ~ z, xy))
the first of the 2 lines, commented "avoid zeroes", is obviously *not* avoiding zeroes in the case min(z) == 0, and even though the 2 lines should rescale an interval [0, M] to [eps, 1 - eps] they don't in this case. 
Consequently, the next line log(z/(1 - z)) transforms the 0s into -Inf and these lead to the warning (or error in nls()) " NA/NaN/Inf in 'x' " One could fix this up by replacing min(z) by min(z, -1e-7) which may be best for pure back compatibility, but I really don't like it either : The famous logit - transform log( z / (1-z)) is really anti-symmetric around z = 1/2 , in particular should treat 0 and 1 "symmetrically" and I find it ugly that the previous fixup (our two ominous lines) is not at all symmetric wrt 1/2, notably the 2nd transformation is made unconditionally but the first one not. Fortunately, the same source file, /src/library/stats/R/zzzModels.R also defines the SSfpl() == 4-parameter logistic model and there, the 'init' function needs to do the same scaling to (0, 1) and does it much nicer, indeed (anti)symmetrically. I'm looking into using that in SSlogis() as well, fixing this bug. Martin Maechler ETH Zurich and R Core Team
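The failure mode of the two quoted lines is easy to demonstrate in isolation (a toy z vector, nothing of SSlogis beyond the lines quoted in the thread):

```r
z <- c(0, 0.2, 0.5, 1)   # min(z) == 0: the guard below then does nothing

if (min(z) <= 0) { z <- z - 1.05 * min(z) }  # "avoid zeroes" -- shifts by 0 here
z <- z / (1.05 * max(z))                     # scale to within unit height

logit <- log(z / (1 - z))
logit[1]   # -Inf: exactly what later makes lm()/nls() complain about Inf in 'x'
```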
Re: [R] Error messages using nonlinear regression function (nls)
>>>>> PIKAL Petr >>>>> on Fri, 20 Oct 2017 13:35:01 +0000 writes: > Thank you Martin. > If I understand correctly, OP could do > wheat.list <- nlsList(Prop ~ SSfpl(end, A, B, xmid, scal), data=wlg) (which is really a different model with one parameter more, and given the relatively small sample size per group, it may not be the best idea) > or add some small value to all zeroes > wlg$prop <- wlg$Prop+1e-7 > wheat.list <- nlsList(prop ~ SSlogis(end,Asym, xmid, scal), data=wlg) > which gives fairly reasonable results. > plot(augPred(wheat.list)) > Am I correct? Yes, indeed, Petr, I forgot to mention that obvious workaround. (Interestingly, he could also _subtract_ a small positive value, which is less intuitive but works too, given the code I showed.) Martin > Cheers > Petr
Re: [R] x-axis tick marks on log scale plot
> Brian Smith > on Thu, 19 May 2016 11:04:55 -0400 writes: > Thanks all !! On Thu, May 19, 2016 at 9:55 AM, Ivan > Calandra wrote: >> Hi, >> >> You can do it by first plotting your values without the >> x-axis: plot(x,y,log="xy", xaxt="n") >> >> and then plotting the x-axis with ticks where you need to: >> axis(side=1, at=seq(2000,8000,1000)) For getting nicer looking axis ticks on log-scale axes (and traditional graphics) I have created the function eaxis() and the utility function pretty10exp(), and I also wrote standard R's axTicks() to help with these.
if(!require("sfsmisc")) install.packages("sfsmisc")
require("sfsmisc")
x <- lseq(1e-10, 0.1, length = 201)
plot(x, pt(x, df=3), type = "l", xaxt = "n", log = "x")
eaxis(1)
gives the attached plot (eaxis-log-example.pdf).
Re: [R] if else condition - help
>>>>> jim holtman >>>>> on Sun, 22 May 2016 16:47:06 -0400 writes: > if you want to use 'ifelse', here is a way: hmm, why should he want that ? The OP did mention that it's about somewhat large objects, so efficiency is one of the considerations : ifelse() is often convenient and nicely self-explaining, but it is (because of its generality, but also by its definition) much *less efficient* than the (sometimes slightly less convenient) ways you were shown previously in this thread : - For the generalized case findInterval() is orders of magnitude better, and - for the simple case you were shown to use logical indexing, i.e., calls à la  x[x > k] <- .. In summary: Use ifelse() much less -- notably if writing functions/code which should scale ! Martin Maechler ETH Zurich
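A minimal sketch of the two recommended replacements (logical indexing for the two-level case, findInterval() for several cut points; the cut points here are made up for illustration):

```r
set.seed(1)
x <- runif(1e5)
k <- 0.5

## two-level recoding: logical indexing instead of ifelse()
r.ifelse <- ifelse(x > k, 1, 0)
r.index  <- numeric(length(x))
r.index[x > k] <- 1
stopifnot(all(r.ifelse == r.index))   # same result, without ifelse()'s overhead

## several levels: one findInterval() call instead of nested ifelse()s
breaks <- c(0, 0.25, 0.5, 0.75)
bin <- findInterval(x, breaks)        # 1, 2, 3 or 4 for the four intervals
```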
Re: [R] asking for large memory - crash running rterm.exe on windows
> Ben Bolker > on Sat, 28 May 2016 15:42:45 +0000 writes: > Anthony Damico gmail.com> writes: >> >> hi, here's a minimal reproducible example that crashes my >> R 3.3.0 console on a powerful windows server. below the >> example, i've put the error (not crash) that occurs on R >> 3.2.3. >> >> should this be reported to http://bugs.r-project.org/ or >> am i doing something silly? thanx > From the R FAQ (9.1): > If R executes an illegal instruction, or dies with an > operating system error message that indicates a problem in > the program (as opposed to something like “disk full”), > then it is certainly a bug. > So you could submit a bug report, *or* open a discussion > on r-de...@r-project.org (which I'd have said was a more > appropriate venue for this question in any case) ... Indeed. In this case, this is a known problem -- not just of R, but of many programs that you can run --- You are requesting (much) more memory than your computer has RAM, and in this situation -- depending on the OS --- your computer will kill R (what you saw), or it will become very slow, trying to shove all memory to R and starting to swap out (to disk) other running / sleeping processes on the computer. Both are very unpleasant... But it is you as R user who asked R to allocate an object of about 41.6 Gigabytes (26 * 1.6, see below). As Ben mentioned, this may be worth a discussion on R-devel ... or you rather follow up the existing thread opened by Marius Hofert three weeks ago, with subject "[Rd] R process killed when allocating too large matrix (Mac OS X)" --> https://stat.ethz.ch/pipermail/r-devel/2016-May/072648.html His simple command to "crash R" was matrix(0, 1e5, 1e5) which for some of us gives an error such as > x <- matrix(0, 1e5,1e5) Error: cannot allocate vector of size 74.5 Gb but for others it had the same effect as your example. 
BTW: I repeat it here in a functionalized form with added comments which make apparent what's going on:
## Make a simple data.frame
mkDf <- function(grpsize, wrongSize = FALSE) {
    ne <- (if(wrongSize) 26 else 1) * grpsize
    data.frame(x = rep(LETTERS, each = ne),
               v = runif(grpsize*26), stringsAsFactors=FALSE)
}
g1 <- ceiling(10^5/26)
d1 <- mkDf(g1) # works fine
str(d1)
## 'data.frame': 100022 obs. of 2 variables: ...
dP <- mkDf(g1, wrong=TRUE) # mis-matching the number of elements
str(dP) # is 26 times larger
## 'data.frame': 2600572 obs. of 2 variables: ...
## make this much bigger
gLarge <- ceiling(10^8/26)
dL <- mkDf(gLarge) # works "fine" .. (well, takes time!!)
str(dL)
## 'data.frame': 100000004 obs. of 2 variables: ...
as.numeric(print(object.size(dL)) / 1e6)
## 1600002088 bytes
## [1] 1600.002   Mega, i.e., 1.6 GBytes
## Well, this will be 26 times larger than already large ==> your R may crash *OR*
## your computer may basically slow down to a crawl, when R requests all its memory...
if(FALSE) ## ==> do *NOT* evaluate the following lightly !!
  dLL <- mkDf(gLarge, wrong=TRUE) # CONSOLE CRASH WITHOUT EXPLANATION
# C:\Users\AnthonyD>
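Requests like matrix(0, 1e5, 1e5) are easy to sanity-check before running them: a numeric (double) cell costs 8 bytes, so a back-of-the-envelope estimate (a hypothetical helper, not part of any package) is:

```r
# rough size of a dense numeric matrix, in GiB (8 bytes per double cell)
est_gib <- function(nrow, ncol) nrow * ncol * 8 / 2^30

est_gib(1e5, 1e5)   # ~74.5, matching "cannot allocate vector of size 74.5 Gb"
```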
Re: [R] Getting Rid of NaN in ts Object
> Perfect! > Exactly what I was looking for. > Thanks > Lorenzo > On Fri, May 27, 2016 at 01:50:03PM +0200, Christian Brandstätter wrote: >> Hi Lorenzo, >> >> Try: >> >> tt[is.nan(tt)] <- NA >> tt <- na.omit(tt) >> or simply na.omit(tt) as it omits both NA and NaN (and *does* keep the 'ts' properties as you have noted). Martin >> Best, >> >> Christian
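Worth knowing, and easy to check: for a "ts" object, na.omit() keeps the time-series attributes but can only trim NAs/NaNs at the ends of the series, since dropping an internal value would break the regular spacing. A small sketch:

```r
tt <- ts(c(NaN, 2, 3, 4, NA), start = 2000)

tt2 <- na.omit(tt)   # trims the leading NaN and trailing NA, stays a ts

is.ts(tt2)    # TRUE
length(tt2)   # 3
start(tt2)    # now starts in 2001
```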
Re: [R] Making an if condition variable ?
> Jim Lemon > on Thu, 2 Jun 2016 13:03:01 +1000 writes: > Hi ce, > a<-10 > condition<-expression("a>0") > if(eval(parse(text=condition))) cat("a>0\n") While this may answer the question asked, the above is *not* good advice, excuse me, Jim : > fortune(106) If the answer is parse() you should usually rethink the question. -- Thomas Lumley R-help (February 2005) > fortune(181) Personally I have never regretted trying not to underestimate my own future stupidity. -- Greg Snow (explaining why eval(parse(...)) is often suboptimal, answering a question triggered by the infamous fortune(106)) R-help (January 2007) - Good advice would emphasize using expressions rather than strings, and yes, that's a bit more sophisticated. But it's worth it. Martin > > Jim > On Thu, Jun 2, 2016 at 12:30 PM, ce wrote: >> >> Dear all, >> >> I want to make an if condition variable like : >> >> a = 10 >> CONDITION = " a > 0 " >> >> if ( CONDITION ) print(" a is bigger" ) >> >> I tried get , getElement , eval without success ? >> >> Thanks >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code.
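An expression-based version of the OP's snippet, along the lines of the advice above: quote() stores the unevaluated condition, eval() runs it later, and no string parsing is involved:

```r
a <- 10
cond <- quote(a > 0)   # an unevaluated R expression, not a character string

if (eval(cond)) msg <- "a is bigger"
msg          # "a is bigger"

a <- -1
eval(cond)   # FALSE: the same stored condition, re-evaluated
```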
Re: [R] stats package
>>>>> Dariush Ashtab >>>>> on Thu, 2 Jun 2016 17:57:14 +0430 writes: ["accidentally" to R-core .. should have gone to R-help -- and hence answered here] > Dear R project I need stats package for optimization in > simulated annealing but i can not download. please guide > me. If you install R in any "normal way", it comes with 29 R packages. Inside R, you get their names via
> rownames(installed.packages(.Library, priority=c("base","recommended")))
 [1] "base"       "boot"       "class"      "cluster"    "codetools"
 [6] "compiler"   "datasets"   "foreign"    "graphics"   "grDevices"
[11] "grid"       "KernSmooth" "lattice"    "MASS"       "Matrix"
[16] "methods"    "mgcv"       "nlme"       "nnet"       "parallel"
[21] "rpart"      "spatial"    "splines"    "stats"      "stats4"
[26] "survival"   "tcltk"      "tools"      "utils"
A subset of 14 of these, the "base" ones, are bundled with R, are entirely part of the source code of R, and hence cannot be installed separately:
> rownames(installed.packages(.Library, priority="base"))
 [1] "base"      "compiler"  "datasets"  "graphics"  "grDevices" "grid"
 [7] "methods"   "parallel"  "splines"   "stats"     "stats4"    "tcltk"
[13] "tools"     "utils"
Package 'stats' is among these, and it even is among those packages that are loaded and *attached* (to your "search path") by default when you start R. > Many thanks for any pointers, > Dariush You are welcome, Martin Martin Maechler ETH Zurich > -- > Dariush Ashtab > ( MS.c in Environment Assessment & Planning ) > Master Student, Department of Environment, Faculty of Natural Resources, > Tarbiat Modares University (T.M.U.), Noor, Iran.
Re: [R] Install R version 3.3 on Debian server
>>>>> Mbr Mbr >>>>> on Mon, 18 Jul 2016 11:45:00 +0200 writes: > Hello, > I'm currently in an internship and I'm working on a Debian server. > However, I have some errors in my scripts and I think it's because of the > version of R installed on the server (works well on my PC with the latest > version of R on Windows aka 3.3.1). > Here are the details from sessionInfo() of the server : > - R version 2.15.1 (2012-06-22) > - Platform : i486-pc-linux-gnu (32 bit) Yes, 2.15.1 is considered antique, and nobody should be using it unless for "Rchaelogy" (the pre-history and history of R implementations) -- only for that do I occasionally start such a version. > Since I'm totally a beginner about Linux servers and my internship's tutor > doesn't want to read the tutorial to install R for Debian (even if the > server is his) or help with this task for some unknown reason, I would like > to know how to upgrade the current R version on the server (from 2.15.1 to > 3.3). Very good proposal. Two things: - We have a *dedicated* mailing list for Debian (and Debian-derived, such as "ubuntu") Linux distributions: --> https://www.R-project.org/mail.html lists the "SIG"s (Special Interest Groups), including R-SIG-Debian, which points to http://stat.ethz.ch/mailman/listinfo/r-sig-debian - CRAN itself has sections about this; notably https://cloud.r-project.org/bin/linux/debian/ will tell you how to make your Debian system get R from CRAN as opposed to from the Debian repositories. > Thanks in advance You are welcome, Martin Maechler > Sincerely, > Mbr
Re: [R] ... distribution based on mean values and quantiles in R?
>>>>> Jim Lemon >>>>> on Tue, 19 Jul 2016 16:20:49 +1000 writes: > Hi Daniel, > Judging by the numbers you mention, the distribution is either very > skewed or not at all normal. If you look at this: > plot(c(0,0.012,0.015,0.057,0.07),c(0,0.05,0.4,0.05,0),type="b") > you will see the general shape of whatever distribution produced these > summary statistics. Did the paper give any hints as to what the model > distribution might be? Yes, that's the correct question: at first, it's not about plotting, but about *fitting* a distribution with the desired properties; then you can easily visualize it. So, what were the data? If they are 'amounts' of any kind, they are necessarily non-negative, often even strictly positive, and hence --- according to John W Tukey --- should be analyzed after taking logs (Tukey's "first aid transformation" for *any* amounts data). Taking logs and analyzing then means considering a normal ("Gaussian") distribution for the log(), which is equivalent to fitting a lognormal distribution -- R functions [dpqr]lnorm() -- to the original data. I'd strongly recommend doing that. And I did so, finding out, however, that if indeed it is the *mean* and the 15% and 85% quantiles, a log normal is not fitting. Here is the reproducible R code .. with lots of comments:
## MM strongly recommends fitting a log-normal distribution .. which however does *not* fit
## The "data statistics"
qlN <- c(q15 = 0.012, q85 = 0.057) # Quantiles, original scale
mlN <- 0.015                       # the mean
(qn <- log(qlN)) # Quantiles, log scale [assumed normal (Gaussian) here]
## as the Gaussian is symmetric, the mid value of the two quantiles is the mean and median :
(mu    <- mean(qn)) # -3.644
(medlN <- exp(mu))  # 0.02615
## an estimate for the median(.) -- but it is *larger* than the mean = 0.015 above ! 
## ===> Log-normal does *NOT* fit well :
## From the help page, we learn that
##    E[lN]      = exp(mu + 1/2 sigma^2)   {and it is trivial that}
##    median[lN] = exp(mu)
## where here, our medlN is a (moment) estimate of median[lN].
## If the numbers were different, this would solve the problem :
## Consequently, a (moment / plugin) estimate for sigma is
(e12sig <- mlN / medlN) ## ~= exp( 1/2 sigma^2)
(sig2 <- 2*log(e12sig)) ## ~= sigma^2  [--- is *NEGATIVE* (<==> 'est. median' > mean !)]
(sig  <- sqrt(sig2))    ## ~= sigma    [here of course 'NaN' with a warning !]
My conclusion would be that other distributions (than the log-normal; the normal is out of the question !!) have to be considered, if you want to be serious about the topic. Maybe the poweRlaw package (https://cloud.r-project.org/package=poweRlaw) may help you (it has 4 vignettes, the last being a nice JSS publication). The above is a "cute" non-standard problem in any case: to fit very skewed distributions, given two quantiles and the mean only, and the approach taken by the "poweRlawyers", namely to minimize the KS (Kolmogorov-Smirnov) discrepancy, seems a good start to me. Martin Maechler, ETH Zurich > Jim > On Tue, Jul 19, 2016 at 7:11 AM, gcchenleidenuniv > wrote: >> Hi all, >> >> I need to draw density curves based on some published data. But in the article only the mean value (0.015) and quantiles (Q0.15=0.012, Q0.85=0.057) were given. So I was thinking if it is possible to plot density curves solely based on the mean value and quantiles. The dnorm(x, mean, sd, log) function needs the standard deviation, which was not mentioned, so it can not be used in this situation. >> >> Many thanks!! >> Daniel
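The step that rules out the log-normal is the fact E[lN] = exp(mu + sigma^2/2) > exp(mu) = median[lN] for any sigma > 0, i.e. a log-normal's mean always exceeds its median, while the quantile-based estimate here puts the median (0.02615) above the mean (0.015). A quick numerical confirmation of that fact, for a handful of arbitrary parameter values:

```r
set.seed(42)
mu    <- rnorm(5)            # arbitrary meanlog values
sigma <- abs(rnorm(5)) + 0.1 # arbitrary positive sdlog values

mean_ln   <- exp(mu + sigma^2 / 2)  # E[lN]
median_ln <- exp(mu)                # median[lN], = qlnorm(0.5, mu, sigma)

stopifnot(all(mean_ln > median_ln))
```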
Re: [R] readline issue with 3.3.1
>>>>> Ralf Goertz >>>>> on Wed, 20 Jul 2016 16:37:53 +0200 writes: > On Wed, 20 Jul 2016 11:35:31 +0200, Ralf Goertz wrote: >> Hi, >> >> after a recent update to version 3.3.1 on Opensuse Leap I have >> problems with command lines longer than the terminal width. E.g. when >> I do this > I installed readline version 6.3 and the issue is gone. So probably some > of the recent changes in R's readline code are incompatible with > readline version 6.2. Yes, it seems so, unfortunately. Thank you for reporting ! Our plan had been different: the NEWS entry for 3.3.1 (among 'BUG FIXES') says • Use of Ctrl-C to terminate a reverse incremental search started by Ctrl-R in the readline-based Unix terminal interface is now supported when R was compiled against readline >= 6.0 (Ctrl-G always worked). (PR#16603) So we had hoped that change (fixing a bug you *should* have been able to see with readline 6.2 as well) would work correctly in all versions of readline >= 6.0, but evidently it did not in yours. Martin Maechler ETH Zurich / R Core Team
Re: [R] C/C++/Fortran Rolling Window Regressions
>>>>> jeremiah rounds >>>>> on Thu, 21 Jul 2016 13:56:17 -0700 writes:

> I appreciate the timing, so much so I changed the code to show the issue. It is a problem of scale.
> roll_lm probably has a heavy start-up cost but otherwise completely out-performs those other versions at scale. I suspect you are timing the nearly constant-time start-up cost in small data. I did give code to paint a picture, but it was just cartoon code lifted from stackexchange.
> If you want to characterize the real problem it is closer to: 30 day rolling windows on 24 daily (by hour) measurements for 5 years with 24+7-1 dummy predictor variables, and finally you need to do this for 300 sets of data.
> Pseudo-code is closer to what follows, and roll_lm can handle that input in a timely manner. You can do it with lm.fit, but you need to spend a lot of time waiting. The issue of accuracy needs a follow-up check. Not sure why it would be different. Worth a check on that.

Just a note about lm(), lm.fit(): As Achim Zeileis noted, they are very reliable implementations (based on QR) indeed. For a year or so now, there's the bare-bones .lm.fit() which is even one order of magnitude faster than lm.fit(), at least for simple cases; see the final example on the ?.lm.fit help page.

Martin Maechler (who had added .lm.fit() to R)

> Thanks,
> Jeremiah

> library(rbenchmark)
> N = 30*24*12*5
> window = 30*24
> npred = 15 # 15 chosen arbitrarily...
> set.seed(1)
> library(zoo)
> library(RcppArmadillo)
> library(roll)
> x = matrix(rnorm(N*(npred+1)), ncol = npred+1)
> colnames(x) <- c("y", paste0("x", 1:npred))
> z <- zoo(x)
> benchmark(
>   roll_lm = roll_lm(coredata(z[, 1, drop = F]), coredata(z[, -1, drop = F]), window,
>     center = FALSE), replications = 3)

> Which comes out as:
>      test replications elapsed relative user.self sys.self user.child sys.child
> 1 roll_lm            3   6.273        1    38.312    0.654          0         0

> ## You aren't going to get that below...
> benchmark(fastLm = rollapplyr(z, width = window,
>     function(x) coef(fastLm(cbind(1, x[, -1]), x[, 1])),
>     by.column = FALSE),
>   lm = rollapplyr(z, width = window,
>     function(x) coef(lm(y ~ ., data = as.data.frame(x))),
>     by.column = FALSE), replications = 3)

> On Thu, Jul 21, 2016 at 1:28 PM, Gabor Grothendieck > wrote:
>> I would be careful about making assumptions regarding what is faster. Performance tends to be nonintuitive.
>>
>> When I ran rollapply/lm, rollapply/fastLm and roll_lm on the example you provided, rollapply/fastLm was three times faster than roll_lm. Of course this could change with data of different dimensions but it would be worthwhile to do actual benchmarks before making assumptions.
>>
>> I also noticed that roll_lm did not give the same coefficients as the other two.
>>
>> set.seed(1)
>> library(zoo)
>> library(RcppArmadillo)
>> library(roll)
>> z <- zoo(matrix(rnorm(10), ncol = 2))
>> colnames(z) <- c("y", "x")
>>
>> ## rolling regression of width 4
>> library(rbenchmark)
>> benchmark(fastLm = rollapplyr(z, width = 4,
>>     function(x) coef(fastLm(cbind(1, x[, 2]), x[, 1])),
>>     by.column = FALSE),
>>   lm = rollapplyr(z, width = 4,
>>     function(x) coef(lm(y ~ x, data = as.data.frame(x))),
>>     by.column = FALSE),
>>   roll_lm = roll_lm(coredata(z[, 1, drop = F]), coredata(z[, 2, drop = F]), 4,
>>     center = FALSE))[1:4]
>>
>>      test replications elapsed relative
>> 1  fastLm          100    0.22    1.000
>> 2      lm          100    0.72    3.273
>> 3 roll_lm          100    0.64    2.909
>>
>> On Thu, Jul 21, 2016 at 3:45 PM, jeremiah rounds >> wrote:
>> > Thanks all. roll::roll_lm was essentially what I wanted. I think maybe I would prefer it to have options to return a few more things, but it is the coefficients, and the remaining statistics you might want can be calculated fast enough from there.
>> >
>> > On Thu, Jul 21, 2016 at 12:36 PM, Achim Zeileis < >> achim.zeil...@uibk.ac.at> >> > wrote:
>> >
>> >> Jeremiah
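Martin's note above about the bare-bones `.lm.fit()` can be checked directly: it returns the same coefficients as `lm.fit()` while skipping most of the overhead. A hedged sketch on simulated data (sizes chosen arbitrarily, not taken from the thread):

```r
set.seed(1)
n <- 1000; p <- 5
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # design with intercept
y <- rnorm(n)

f1 <- lm.fit(X, y)
f2 <- .lm.fit(X, y)          # bare-bones QR fit, much less overhead
all.equal(unname(f1$coefficients), f2$coefficients)  # typically TRUE (up to column pivoting)

## rough timing comparison; the exact speed-up depends on size and platform:
system.time(for (i in 1:1000)  lm.fit(X, y))
system.time(for (i in 1:1000) .lm.fit(X, y))
```

For repeated small fits, as in rolling-window regression, that per-call overhead is exactly what dominates, which is why the choice of fitter matters so much here.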
Re: [R] readline issue with 3.3.1
>>>>> Ralf Goertz >>>>> on Fri, 22 Jul 2016 10:15:36 +0200 writes:

> Am Thu, 21 Jul 2016 18:07:43 +0200 schrieb Martin Maechler :
>> Ralf Goertz on Wed, 20 Jul 2016 16:37:53 +0200 writes:
>>> I installed readline version 6.3 and the issue is gone. So probably some of the recent changes in R's readline code are incompatible with readline version 6.2.
>>
>> Yes, it seems so, unfortunately.
>>
>> Thank you for reporting !

> It would be great if – while fixing this – you also took care of the SIGWINCH problem described in bug report https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16604

Well, that *has* been fixed in 'R-devel' and 'R 3.3.1 patched' .. but again only for readline >= 6.3. (The release process of 3.3.1 was already too far advanced, and such source changes are delicate.)

It is quite a bit of work to get correct code for the different versions of readline, also dealing with the fact that e.g. on Mac OSX the code needs to work with the libedit/editlib "substitute".

One reason we have not closed the bug is that the fix is only for readline >= 6.3, ... and also that I don't think we got much user confirmation of the form "yes, the bug goes away once I compile R-devel or R-patched".

> Thanks, Ralf

You are welcome,
Martin
Re: [R] Estimators for non-normal data
>>>>> Nika Sušac >>>>> on Sat, 23 Jul 2016 19:39:48 +0200 writes:

> Hi! I have non-normal data (items are continuous on a 9-point scale, but not normally distributed) and I want to conduct cfa. Which of the estimators available in lavaan do you recommend me to use? Thanks in advance!

I think you want *robust* statistical methods then. Robust factor analysis (if 'cfa' means something like that) is a somewhat prominent topic. Simple approaches will already be available by using MASS::cov.rob() for a robust covariance matrix, which you can then pass to other methods.

For more (and more modern) methods and approaches,

- for R packages, I'd recommend you consult the CRAN task view about robust statistical methods, https://cran.r-project.org/web/views/Robust.html notably the section on 'Multivariate Analysis'

- for more specific help and expertise, I strongly recommend the dedicated R-SIG-robust mailing list (instead of R-help), --> https://stat.ethz.ch/mailman/listinfo/r-sig-robust

Best regards,
Martin Maechler, ETH Zurich
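A minimal sketch of the simple approach mentioned above: a robust covariance matrix from MASS::cov.rob() fed into a classical factor analysis. The data here are simulated stand-ins (an assumption of this illustration), and base R's factanal() is used since no lavaan code was shown in the thread:

```r
library(MASS)

set.seed(42)
## stand-in data: 100 observations on 6 correlated items (illustrative only)
Sigma <- 0.4 + 0.6 * diag(6)              # unit variances, 0.4 correlations
X <- mvrnorm(100, mu = rep(0, 6), Sigma = Sigma)

rc <- cov.rob(X, method = "mcd")          # robust (MCD) covariance estimate

## pass the robust covariance on to a classical factor analysis:
fa <- factanal(covmat = rc$cov, factors = 1, n.obs = nrow(X))
fa$loadings
```

The same `rc$cov` matrix could equally be handed to other covariance-based methods; the point is only the two-step pattern of robust estimation first, modeling second.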
Re: [R] lm() silently drops NAs
I have been asked (in private)

> Hi Martin,
>   y <- c(1, 2, 3, NA, 4)
>   x <- c(1, 2, 2, 1, 1)
>   t.test(y ~ x)
>   lm(y ~ x)
> Normally, most R functions follow the principle that "missings should never silently go missing". Do you have any background on why these functions (and most model/test functions in stats, I presume) silently drop missing values?

And I think the issue and an answer are important enough to be public, hence this posting to R-help:

First note that in some sense it is not true that lm(y ~ x) silently drops NAs: Everybody who is taught about lm() is taught to look at summary(fm) where fm <- lm(y ~ x), and that (for the above case) "says"

  (1 observation deleted due to missingness)

and so is not entirely silent.

This all goes back to the idea of having an 'na.action' argument which may be a function (a very good idea, "functional programming" in a double sense!)... which Chambers et al introduced in The White Book (*1), and which I seem to remember was quite a revolutionary idea; at least I had liked that very much once I understood the beauty of passing functions as arguments to other functions.

One problem already back then has been that we already had the ---much more primitive but often sufficient--- standard of an 'na.rm = FALSE' (i.e. default FALSE) argument. This has been taught in all good classes/courses about statistical modeling with S and R ever since ... I had hoped (but it seems I was too optimistic, .. or many students have too quickly forgotten what they were taught ..) Notably the white book itself, and the MASS (*2) book, do teach this.. though possibly not loudly enough.

Two more decisions about this were made back then, as well:

1) The default for na.action to be na.omit (==> "silently dropping")
2) na.action being governed by options(na.action = ..)

'1)' may have been mostly "historical": I think it had been the behavior of other "main stream" statistical packages back then (and now?)
*and* possibly more importantly, notably with the later (than white book = "S3") advent of na.replace, you did want to keep the missings in your data frame for later analysis, e.g. drawing (-> "gaps" in plots), so the NAs *were* carried along and would not be forgotten, something very important in most cases.

and '2)' is something I personally no longer like very much, as it is "killing" the functional paradigm. OTOH, it has to be said in favor of that "session wide" / "global" setting options(na.action = *) that indeed it depends on the specific data analysis, or even the specific *phase* of a data analysis, *what* behavior of NA treatment is desired, and at the time it was thought smart that all methods (also functions "deep down" called by user-called functions) would automatically use the "currently desired" NA handling.

There have been recommendations (I don't know exactly where and by whom) to always set

  options(na.action = na.fail)

in your global .First() or nowadays rather your Rprofile, and I assume that some of the CRAN packages and some of the "teaching setups" would do that (and if you do that, the lm() and t.test() calls above give an error).

Of course, there's much more to say about this, quite a bit of which has been published in scientific papers and books..

Martin Maechler
ETH Zurich and R Core Team

*1) "The White Book": Chambers and Hastie (1992): https://www.r-project.org/doc/bib/R-books_bib.html#R:Chambers+Hastie:1992
*2) "MASS - the book": Venables and Ripley (2002 (= 4th ed.)): https://www.r-project.org/doc/bib/R-books_bib.html#R:Venables+Ripley:2002
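The behaviours discussed above can be seen side by side with the small example from the top of the thread (na.omit, na.exclude and na.fail are all in base R's stats package):

```r
y <- c(1, 2, 3, NA, 4)
x <- c(1, 2, 2, 1, 1)

fm <- lm(y ~ x)                 # default na.omit: row 4 is dropped
length(residuals(fm))           # 4 residuals
summary(fm)                     # notes "(1 observation deleted due to missingness)"

fm2 <- lm(y ~ x, na.action = na.exclude)
length(residuals(fm2))          # 5: residuals padded with NA at position 4

## the 'strict' setting recommended for .First() / Rprofile:
op <- options(na.action = na.fail)
try(lm(y ~ x))                  # now an error instead of silent dropping
options(op)                     # restore the previous setting
```

The na.exclude variant is what keeps fitted values and residuals aligned with the original data, which is why it is often preferable for plotting and prediction.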
Re: [R] lm() silently drops NAs
>>>>> peter dalgaard >>>>> on Tue, 26 Jul 2016 23:30:31 +0200 writes:

>> On 26 Jul 2016, at 22:26 , Hadley Wickham >> wrote:
>>
>> On Tue, Jul 26, 2016 at 3:24 AM, Martin Maechler >> wrote:
>>> > ...
>> To me, this would be the most sensible default behaviour, but I realise it's too late to change without breaking many existing expectations.

> Probably.

Well, yes, Hadley's proposal would change too many default outputs; still, I'd tend to think we at least should contemplate changing the default (mildly).

Before doing that we could at least provide one or a few new na.* functions which people could try using as their own defaults, using options(na.action = *) __ or __ as I read in the white book (see below) by attr(, "na.action") <- na. That is actually something that I have not seen advertised much, but seems very reasonable to me. It often does make sense to set a dataset-specific NA handling.

I agree (with Hadley's implicit statement) that 'na.exclude' is often much more reasonable than 'na.omit', and that was the reason it was invented *after* the white book, IIRC, for S-PLUS, not S; and quite advertised at the time (at least by some people to whom I listened ...)

If we'd change the default na.action, we probably would not want to go as far as a version of Hadley's (corrected in 2nd e-mail)

na.warn <- function(object, ...) {
  missing <- !complete.cases(object)
  if (any(missing)) {
    warning("Dropping ", sum(missing), " rows with missing values", call. = FALSE)
  }
  na.exclude(object, ...)
}

[the warning probably needs to be a bit smarter: you don't always have "rows", sometimes it's "entries" in a vector]

but *providing* something like the above, possibly rather named along the lines of 'na.warn.and.keep', and provide a 'na.warn.and.omit' (which I would call 'na.warn', as after all it would be very close to the current R default, na.omit, but would warn additionally..

> Re.
the default choice, my recollection is that at the time the only choices available were na.omit and na.fail. S-PLUS was using na.fail for all the usual good reasons, but from a practical perspective, the consequence was that, since almost every data set has NA values, you got an error unless you added na.action=na.omit to every single lm() call. And habitually typing na.action=na.omit doesn't really solve any of the issues with different models being fit to different subsets and all that.

yes, indeed; you are right:

- The white book (p. 112-113) indeed explains that 'na.fail' is (i.e., was there) the default.
- an important example of where na.omit was not good enough, at the time, was stepwise regression (where the number of complete cases may depend on the current subset of predictor variables), or also bootstrapping and crossvalidation.

> So the rationale for doing it differently in R was that it was better to get some probably meaningful output rather than to be certain of getting nothing. And, that was what the mainstream packages of the time were doing.

>> On a related note, I've never really understood why it's called na.exclude - from my perspective it causes the _inclusion_ of missing values in the predictions/residuals.

> I think the notion is that you exclude them from the analysis, but keep them around for the other purposes.

> -pd

>> Thanks for the (as always!) informative response, Martin.
>>
>> Hadley
>>
>> --
>> http://hadley.nz
> --
> Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501  Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
Re: [R] No "number of days" format for 'difftime'?
> Loris Bennett > on Mon, 8 Aug 2016 14:12:47 +0200 writes:

> Loris Bennett writes:
>> Thanks for the link, John. However, there is a hyphen missing. It should be:
>>
>> http://www.dummies.com/how-to/content/how-to-read-errors-and-warnings-in-r.html
>>
>> Appropriately, with the correct URL we find the too often forgotten pearl of wisdom:
>>
>> "Chances are, you just typed something wrong there."
>>
>> I think I need this on my coffee cup.
>>
>> Cheers,
>>
>> Loris

> Continuing the topic for my future self and others equally poorly versed in The Art and Dark Science of Interpreting R Error Messages, if I have the following in the file "my_data"

> 1094165 2016-07-24T09:40:02 13-23:03:28 1 COMPLETED
> 1112076 2016-08-01T14:45:49 6-13:26:15 1 COMPLETED

> and do

>> d <- read.table("my_data")
>> colnames(d) <- c("jobid","start","elapsed","alloccpus","state")
>> df <- transform(d, start = as.POSIXct(start, format="%Y-%m-%dT%H:%M:%S"), elapsed = as.difftime(elapsed, format="%d-%H:%M:%S"))

> I get the following:

> Error in as.difftime(elapsed, format = "%d-%H:%M:%S") :
>   'tim' is not character or numeric

Well, let me argue that you should have found this to be a *helpful* error message. You are no complete beginner anymore, right, so

1) the error is in your use of as.difftime().
2) ?as.difftime or str(as.difftime) both clearly indicate that 'tim' is the first argument of as.difftime,

and I really do wonder why you continued with the infamous "trial-and-error programming technique" instead of reading, or at least quickly browsing, the relevant reference, i.e., help page.

Martin

> Remembering that if something is not what you think it is, it is probably a factor, I find that

>> d <- read.table(data_file, stringsAsFactors=FALSE)

> causes the error to go away.

> So what's all this 'tim' business?
> A quick squint at the source code of datetime.R reveals the following line:
>
>   if (!is.numeric(tim)) stop("'tim' is not character or numeric")
>
> So the error message tells me something about the arbitrary name of the variable 'tim', which could also have been 'tom', 'dick', or 'harriet', but nothing about the value.
> So follow Rolf Turner's fortune(350) advice, then look at the source code, and then realise that you weren't being totally dopey in not understanding the error message.

> Loris

> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin  Email loris.benn...@fu-berlin.de
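The 'tim' error above is easy to trigger and to avoid. A small sketch; the factor arises exactly as in the thread, via read.table() with its historical stringsAsFactors = TRUE default (changed to FALSE in R 4.0.0):

```r
## a factor is neither character nor numeric, so as.difftime() stops:
e <- factor("13:26:15")
try(as.difftime(e, format = "%H:%M:%S"))
## -> Error: 'tim' is not character or numeric

## with a plain character vector it works:
as.difftime("13:26:15", format = "%H:%M:%S")   # a proper 'difftime'

## hence Loris's fix:
## d <- read.table("my_data", stringsAsFactors = FALSE)
```

Note that as.difftime()'s format string has no day-count component, so "13-23:03:28"-style elapsed times still need the day part split off separately before conversion.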
[R] Diethelm Wuertz (founder 'Rmetrics')
Dear colleagues

We are deeply saddened to inform you that Diethelm and his wife Barbara Würtz (ascii-fied 'Wuertz') died in a car accident during their vacation in Hungary last week.

Both Diethelm and Barbara worked at the ETH Zurich, Diethelm as a researcher and teacher in the Institute for Theoretical Physics and Barbara as an advisor in the Human Resources division.

In the "R world", Diethelm has been known as an enthusiastic entrepreneur, having built the "Rmetrics" project (http://www.rmetrics.org) quite early in the history of R. He organized much-liked workshops / summer schools about "R in Finance" at Meielisalp in the Swiss mountains, and in cities such as Zurich, Paris or Mumbai.

Many of us have known Barbara and Diethelm as generous and amiable people who enjoyed hosting others both at such workshops and privately.

They leave their son, Fabian, and their daughter-in-law Viktoria. Our thoughts are with them.

Sincerely,
Martin Maechler, ETH Zurich (Seminar for Statistics)
Adrian Trapletti, Uster Metrics
Re: [R] Error while fitting gumbel copula
>>>>> Isaudin Ismail >>>>> on Tue, 9 Aug 2016 14:30:18 +0100 writes: > Dear R experts, > I have 5 time series of data (A, B, C, D and E) with all same lengths. All > series exhibit gamma distribution except for B which is lognormal > distribution. I am using copula package to model the joint-distribution of > these 5 times series. > I have selected Archimedean copula and successfully fitted Frank and > Clayton copula. The problem is when trying to fit Gumbel copula. > The following are the codes I used to run in R. > # Data of 5 time series > A <- A > B <- B > C <- C > D <- D > E <- E well, the above is really an "interesting" block of R code ;-) -- More seriously, please learn to use reproducible examples, e.g., from here http://bit.ly/MRE_R (nice to remember: MRE = Minimal Reproducible Example) or here http://adv-r.had.co.nz/Reproducibility.html then we will be glad to help you, notably I as maintainer of the package 'copula' which you are using (without saying so). With regards, Martin Maechler > # Combined between A and C > A+C <- A + C > gumbel.copula <- gumbelCopula(dim = 5) > m <- pobs(as.matrix(cbind(A+C, B, D, E))) > fit.gumbel<- fitCopula(gumbel.copula, m, method = 'ml') > And the error while trying to fit gumbel copula: > Error in optim(start, loglikCopula, lower = lower, upper = upper, method = > method, : > non-finite finite-difference value [1] > In addition: Warning message: > In .local(copula, tau, ...) : tau is out of the range [0, 1] > Appreciate all help! > Many thanks, > Isaudin > [[alternative HTML version deleted]] > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Error while fitting gumbel copula
>>>>> Isaudin Ismail >>>>> on Thu, 18 Aug 2016 17:03:50 +0100 writes: > Dear Martin, Following my earlier question on "error while > fitting gumbel copula", I have also crated a new gist at > https://gist.github.com/anonymous/0bb8aba7adee550d40b840a47d8b7e25 > for easy checking and copying codes. > I got no problem fitting other Archimedean copulas except > gumbel copula as per my code I used above. > Appreciate your kind help. > Many thanks, Isaudin > On Mon, Aug 15, 2016 at 4:28 PM, Isaudin Ismail > wrote: >> Dear Dr. Martin, >> >> I'm glad that you replied to my queries. >> >> As advised, I have prepared the following: MM: I'm including (cut'n'pasting) my commented and augmented version here: -- ## From: Isaudin Ismail ## To: Martin Maechler ## CC: ## Subject: Re: [R] Error while fitting gumbel copula ## Date: Mon, 15 Aug 2016 16:28:14 +0100 ## Dear Dr. Martin, ## I'm glad that you replied to my queries. ## As advised, I have prepared the following: library(copula) ## 5 series of data, A, B, C, D and E A <- c(0.849420849, 0.900652985, 0.97144217, 0.817888428, 0.877901578, 1.070040669, 0.889742431, 0.87588968, 0.853541938, 0.848664688, 0.876830319, 0.749582638, 0.818515498, 0.890997174, 0.794766966, 0.784794851, 0.814858959, 1.074396518, 0.83752495, 0.894341116, 0.880375293, 0.900816803) B <- c(0.479850746, 0.652111668, 1.880607815, 0.579902303, 0.50669344, 0.747560182, 0.701754386, 0.48969697, 0.346751006, 0.379234973, 0.862691466, 0.328280188, 0.317312661, 0.534438115, 0.487002653, 0.335043612, 0.373346897, 0.627520161, 0.792114695, 0.938253012, 0.444553967, 0.625972763) C <- c(0.693491124, 0.866523143, 4.585714286, 1.512055109, 0.387755102, 0.513435701, 0.76252505, -0.113113113, 0.338521401, 0.333951763, 0.668755595, 0.401273885, 0.419868791, 0.272885789, 0.541541542, 0.32751938, 0.386409736, 0.957446809, 0.861195542, 1.531632653, 0.431610942, 1.226470588) D <- c(0.807792208, 0.548547718, 0.738232865, 0.542247744, 1.088964927, 0.862385321, 0.60720268, 
1.000816993, 0.699289661, 0.41723356, 0.604037267, 0.605003791, 0.698940998, 0.764792899, 0.647897898, 0.825256975, 0.767476085, 0.941391941, 0.889547813, 0.324503311, 0.942435424, 0.740686633) E <- c(1.077598829, 0.318507891, 1.152616279, 0.930397727, 1.515994437, 0.940689655, 0.880886427, 1.054274084, 1.067282322, 0.677419355, 0.966233766, 0.761029412, 1.05734767, 0.615925059, 1.061988304, 1.07184241, 1.058890147, 1.123873874, 1.304891923, -0.069584736, 1.172757475, 0.501096491) require(copula) gumbel.copula <- gumbelCopula(dim = 2) p <- pobs(cbind(D + E, A + B+ C )) fit.gumbel <- fitCopula(gumbel.copula, p, method = "ml") ## The error is here when trying to fit the gumbel copula # I got the following error: ## Error in optim(start, loglikCopula, lower = lower, upper = upper, method = ## method, : ## non-finite finite-difference value [1] ## In addition: Warning message: ## In .local(copula, tau, ...) : tau is out of the range [0, 1] ## MM: my version of copula gives the error message "some tau < 0" ## -- - ## and indeed: (tau.p <- cor(p[,1], p[,2], method="kendall")) ## [1] -0.1428571 ## ^- Kendall's tau is = - 1/7 < 0 ... and that is not good for Gumbel! plot(p) ##- So, you tried fitting to *negatively* correlated data, and if you use the default instead of "ml" the copula is fit, and uses param = 1 (which corresponds to the *independence* copula: Because among all the (weakly) positively correlated gumbel copulas, the boundary case, param = 1 (<==> tau = 0) is the best fitting. What you can do is to "rotate" the data (actually mirror it), and fit a gumbel copula, which now works nice and easily : p2 <- p; p2[,2] <- 1-p2[,2] (tau.p2 <- cor(p2, method="kendall")) ## --> now positively correlated ## ---> gumb.ml.p2 <- fitCopula(gumbel.copula, p2, method = "ml") summary(gumb.ml.p2) # looks fine now : Call: fitCopula(copula, data = data, method = "ml") Fit based on "maximum likelihood" and 22 2-dimensional observations. Gumbel copula, dim. d = 2 Estimate Std. 
Error
param    1.121      0.209

The maximized loglikelihood is 0.1839
Optimization converged
Number of loglikelihood evaluations:
function gradient
      66

--- The next versi
Re: [R] loading .rda file
> Jeff Newmiller > on Tue, 30 Aug 2016 09:36:05 -0700 writes:

> You cannot. However, you can load the file into a dedicated environment to keep those names separated from your global environment. e.g. [1]

yes, that's my "famous" only-allowed use of attach(): attach() an rda-file to your search path instead of polluting your globalenv.

> The saveRDS / loadRDS

(saveRDS / readRDS are the correct names)

> functions are an alternative that handle one object at a time without dragging the object names into the picture (you have to name the re-loaded object).

The fact that it is readRDS() and not loadRDS() actually does convey this: it is often "much better" in the sense of functional / transparent programming to use this pair in favor of save() / load(): readRDS() does *return* what we are interested in, and

  result <- readRDS()

is so much better than load() silently overwriting all kinds of objects in my globalenv. Indeed, I strongly advocate using saveRDS() and readRDS() much more frequently than they are used nowadays.

> However, the best approach is to write scripts that pull directly from your source (non-R) data files. This makes your work process reproducible as you develop it.

Yes, but that's sometimes too inefficient. And then, some of us do simulations and other expensive computations that we need/want to save and re-read.

And yes, I *always* use the equivalent of q("no") to leave R; never save the workspace or load it at startup, unless accidentally on non-standard (for me) platforms.

Martin
>> When I load this .Rda file into the global environment using
>> load("file path")
>> I see a data.frame in the global environment called "mydata" that corresponds to the .rda file.
>>
>> My question: how can I change the name of this data.frame to something of my choosing?
>>
>> Thanks for considering this very simple question.
>>
>> Leslie
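Both recommendations above fit in a few lines. A hedged sketch using temporary files; the object names here are made up for illustration:

```r
## save()/load() tie you to the original names; attach() at least keeps
## them out of the global environment:
mydata <- data.frame(a = 1:3)
f <- tempfile(fileext = ".rda")
save(mydata, file = f)
rm(mydata)

attach(f)        # 'mydata' is found via the search path; globalenv stays clean
head(mydata)
detach()         # remove the attached file from the search path again

## saveRDS()/readRDS() return the object, so *you* pick the name:
f2 <- tempfile(fileext = ".rds")
saveRDS(data.frame(a = 1:3), f2)
my.own.name <- readRDS(f2)   # any name you like -- Leslie's question answered
```

The readRDS() route is the direct answer to the original question: since the function returns the object rather than restoring names, renaming is just an assignment.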
Re: [R] Fitting Mixture distributions
>>>>> Bert Gunter >>>>> on Wed, 7 Sep 2016 23:47:40 -0700 writes:

> "please suggest what can I do to resolve this issue."

> Fitting normal mixtures can be difficult, and sometimes the optimization algorithm (EM) will get stuck with very slow convergence. Presumably there are options in the package to either increase the max number of steps before giving up or make the convergence criteria less sensitive. The former will increase the run time and the latter will reduce the optimality (possibly leaving you farther from the true optimum). So you should look into changing these as you think appropriate.

I'm jumping in late, without having read everything preceding. One of the last messages seemed to indicate that you are looking at mixtures of *one*-dimensional gaussians. If this is the case, I strongly recommend looking at (my) CRAN package 'nor1mix' (the "1" is for "*one*-dimensional").

For a while now that small package has provided an alternative to the EM, namely direct MLE, simply using optim() where the likelihood uses a somewhat smart parametrization. Of course, *as with the EM*, this also depends on the starting value, but my (limited) experience has been that nor1mix::norMixMLE() works considerably faster and more reliably than the EM (which I also provide, as nor1mix::norMixEM()).

Apropos 'starting value': The help page shows how to use kmeans() for "somewhat" reliable starts; alternatively, I'd recommend using cluster::pam() to get a start there.

I'm glad to hear about experiences using these / comparing these with other approaches.

Martin
--
Martin Maechler, ETH Zurich

> On Wed, Sep 7, 2016 at 3:51 PM, Aanchal Sharma > wrote:
>> Hi Simon
>>
>> I am facing same problem as described above. i am trying to fit gaussian mixture model to my data using normalmixEM. I am running a Rscript which has this function running as part of it for about 17000 datasets (in loop).
>> The script runs fine for some datasets, but it terminates when it >> encounters one dataset with the following error: >> >> Error in normalmixEM(expr_glm_residuals, lambda = c(0.75, 0.25), k = 2, : >> Too many tries! >> >> (command used: expr_mix_gau <- normalmixEM(expr_glm_residuals, lambda = >> c(0.75,0.25), k = 2, epsilon = 1e-08, maxit = 1, maxrestarts=200, verb >> = TRUE)) >> (expr_glm_residuals is my dataset which has residual values for different >> samples) >> >> It is suggested that one should define the mu and sigma in the command by >> looking at your dataset. But in my case there are many datasets and it will >> keep on changing every time. please suggest what can I do to resolve this >> issue. >> >> Regards >> Anchal >> >> On Tuesday, 16 July 2013 17:53:09 UTC-4, Simon Zehnder wrote: >>> >>> Hi Tjun Kiat Teo, >>> >>> you try to fit a Normal mixture to some data. The Normal mixture is very >>> delicate when it comes to parameter search: If the variance gets closer and >>> closer to zero, the log Likelihood becomes larger and larger for any values >>> of the remaining parameters. Furthermore for the EM algorithm it is known, >>> that it takes sometimes very long until convergence is reached. >>> >>> Try the following: >>> >>> Use as starting values for the component parameters: >>> >>> start.par <- mean(your.data, na.rm = TRUE) + sd(your.data, na.rm = TRUE) * >>> runif(K) >>> >>> For the weights just use either 1/K or the R cluster function with K >>> clusters >>> >>> Here K is the number of components. Further enlarge the maximum number of >>> iterations. What you could also try is to randomize start parameters and >>> run an SEM (Stochastic EM). In my opinion the better method is in this case >>> a Bayesian method: MCMC. >>> >>> >>> Best >>> >>> Simon >>> >>> >>> On Jul 16, 2013, at 10:59 PM, Tjun Kiat Teo >> > wrote: >>> >>> > I was trying to use the normixEM in mixtools and I got this error >>> message. 
>>> > >>> > And I got this error message >>> > >>> > One of the variances is goin
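Martin's suggestion above (direct MLE via optim() with a "somewhat smart" unconstrained parametrization, started from kmeans()) can be illustrated in base R. This is a hedged sketch of the *idea* only, not the nor1mix or mixtools implementation: the data, function name, and parametrization here are made up for illustration.

```r
## Hedged sketch of direct MLE for a 2-component 1-D normal mixture:
## a logit for the weight and logs for the sds make optim() unconstrained.
set.seed(1)
x <- c(rnorm(150, mean = 0, sd = 1), rnorm(50, mean = 4, sd = 0.7))

negLogLik <- function(p) {
  w  <- plogis(p[1])      # mixing weight, mapped into (0, 1)
  mu <- p[2:3]            # component means, unconstrained
  s  <- exp(p[4:5])       # component standard deviations, kept > 0
  -sum(log(w * dnorm(x, mu[1], s[1]) + (1 - w) * dnorm(x, mu[2], s[2])))
}

## kmeans() start, as the nor1mix help page suggests:
km <- kmeans(x, centers = 2)
p0 <- c(qlogis(mean(km$cluster == 1)),   # weight of cluster 1
        km$centers,                      # cluster centers as mean starts
        log(rep(sd(x) / 2, 2)))          # rough sd starts
fit <- optim(p0, negLogLik, method = "BFGS")
sort(fit$par[2:3])   # estimated component means, near 0 and 4
```

Because the weight and sds live on an unconstrained scale, optim() never wanders into invalid parameter values, which is one way to sidestep the "Too many tries!" restarts of the EM.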
Re: [R] Have help list filters changed recently
> Marc Schwartz > on Thu, 8 Sep 2016 22:29:38 -0500 writes: >> On Sep 8, 2016, at 7:35 PM, Bert Gunter wrote: >> >> To all: >> >> r-help has been holding up a lot of my recent messages: Have there >> been any changes to help list filters that caused this? Is there >> something I'm doing wrong? -- I have made no changes that I am aware >> of. Here's what I get: >> >> Your mail to 'R-help' with the subject >> >> Re: [R] with and evaluation [for example] >> >> Is being held until the list moderator can review it for approval. >> >> The reason it is being held: >> >> The message headers matched a filter rule >> >> >> Best, >> Bert > Bert, > Have there been a lot of cc's in your replies? > That is one thing that will tend to trigger the spam filters. I am not sure what the threshold is and I am not sure that Martin knows, but that has bitten me in the past on R-Help. As co-moderator with Martin on R-Devel, I have seen the other side of it there. > Might also be the e-mail domain of one of the respondents in the thread. > I think that it is the ETHZ SysAdmins that tend to control the formalized spam filters and heuristics. > Regards, > Marc Thank you, Marc. Additionally, and with a bit more detail: the reason " The message headers matched a filter rule " is really from the very last filter bank, which is 'mailman'; and to answer Bert's question wrt that: No, neither the mailman setup nor the R-help-specific extra filters have changed during the last months, probably not even during the last years. The mailman filters however *do* look at some of the upstream spam filter results (most of which are ETH wide), and these do change of course, continually being adapted. Martin __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] about data problem
> Joe Ceradini > on Tue, 20 Sep 2016 17:06:17 -0600 writes: > read.csv("your_data.csv", stringsAsFactors=FALSE) > (I'm just reiterating Jianling said...) If you do not have very many columns, and want to become more efficient and knowledgeable, I strongly recommend alternatively to use the 'colClasses' argument to read.csv or read.table (they are the same apart from defaults for arguments!) and set "numeric" for numeric columns. This has a similar effect to the *combination* of 1) stringsAsFactors = FALSE 2) foo <- as.numeric(foo) # for respective columns Martin > Joe > On Tue, Sep 20, 2016 at 4:56 PM, lily li wrote: >> Is there a function in read.csv that I can use to avoid converting numeric >> to factor? Thanks a lot. >> >> >> >> On Tue, Sep 20, 2016 at 4:42 PM, lily li wrote: >> >> > Thanks. Then what should I do to solve the problem? >> > >> > On Tue, Sep 20, 2016 at 4:30 PM, Jeff Newmiller < >> jdnew...@dcn.davis.ca.us> >> > wrote: >> > >> >> I suppose you can do what works for your data, but I wouldn't recommend >> >> na.rm=TRUE because it hides problems rather than clarifying them. >> >> >> >> If in fact your data includes true NA values (the letters NA or simply >> >> nothing between the commas are typical ways this information may be >> >> indicated), then read.csv will NOT change from integer to factor >> >> (particularly if you have specified which markers represent NA using the >> >> na.strings argument documented under read.table)... so you probably DO >> have >> >> unexpected garbage still in your data which could be obscuring valuable >> >> information that could affect your conclusions. >> >> -- >> >> Sent from my phone. Please excuse my brevity. >> >> >> >> On September 20, 2016 3:11:42 PM PDT, lily li >> >> wrote: >> >> >I reread the data, and use 'na.rm = T' when reading the data. This time >> >> >it >> >> >has no such problem. It seems that the existence of NAs convert the >> >> >integer >> >> >to factor. Thanks for your help. 
>> >> > >> >> > >> >> >On Tue, Sep 20, 2016 at 4:09 PM, Jianling Fan >> >> >wrote: >> >> > >> >> >> Add the "stringsAsFactors = F" when you read the data, and then >> >> >> convert them to numeric. >> >> >> >> >> >> On 20 September 2016 at 16:00, lily li wrote: >> >> >> > Yes, it is stored as factor. I can't check out any problem in the >> >> >> original >> >> >> > data. Reread data doesn't help either. I use read.csv to read in >> >> >the >> >> >> data, >> >> >> > do you think it is better to use read.table? Thanks again. >> >> >> > >> >> >> > On Tue, Sep 20, 2016 at 3:55 PM, Greg Snow <538...@gmail.com> >> >> >wrote: >> >> >> > >> >> >> >> This indicates that your Discharge column has been >> >> >stored/converted as >> >> >> >> a factor (run str(df) to verify and check other columns). This >> >> >> >> usually happens when functions like read.table are left to try to >> >> >> >> figure out what each column is and it finds something in that >> >> >column >> >> >> >> that cannot be converted to a number (possibly an oh instead of a >> >> >> >> zero, an el instead of a one, or just a letter or punctuation mark >> >> >> >> accidentally in the file). You can either find the error in your >> >> >> >> original data, fix it, and reread the data, or specify that the >> >> >column >> >> >> >> should be numeric using the colClasses argument to read.table or >> >> >other >> >> >> >> function. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Tue, Sep 20, 2016 at 3:46 PM, lily li >> >> >wrote: >> >> >> >> > Hi R users, >> >> >> >> > >> >> >> >> > I have a problem in reading data. 
>> >> >> >> > For example, part of my dataframe is like this:
>> >> >> >> >
>> >> >> >> > df
>> >> >> >> > month day year Discharge
>> >> >> >> > 3     1   2010 6.4
>> >> >> >> > 3     2   2010 7.58
>> >> >> >> > 3     3   2010 6.82
>> >> >> >> > 3     4   2010 8.63
>> >> >> >> > 3     5   2010 8.16
>> >> >> >> > 3     6   2010 7.58
>> >> >> >> >
>> >> >> >> > Then if I type summary(df), why it converts the discharge data to levels? I
>> >> >> >> > also met the same problem when reading some other csv files. How to solve
>> >> >> >> > this problem? Thanks.
>> >> >> >> >
>> >> >> >> > Discharge
>> >> >> >> > 7.58 :2
>> >> >> >> > 6.4 :1
>> >> >> >> > 6.82 :1
>> >> >> >> > 8.63 :1
>> >> >> >> > 8.16 :1
>> >> >> >> >
>> >> >> >> > [[alternative HT
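Martin's colClasses advice above can be sketched as follows. The file here is a temporary stand-in for the poster's CSV, created on the fly so the example is self-contained.

```r
## Create a small CSV like the poster's data:
csv <- tempfile(fileext = ".csv")
writeLines(c("month,day,year,Discharge",
             "3,1,2010,6.4",
             "3,2,2010,7.58",
             "3,3,2010,6.82"), csv)

## Declaring the types up front replaces the stringsAsFactors=FALSE plus
## as.numeric() combination in one step:
df <- read.csv(csv, colClasses = c("integer", "integer", "integer", "numeric"))
str(df)   # Discharge is numeric, never a factor
```

A useful side effect of colClasses: if a stray non-numeric value (an "O" for a zero, say) sneaks into the Discharge column, read.csv() fails loudly instead of silently producing a factor.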
Re: [R] "R": Redefine default parameter values for help.start()
> paul > on Thu, 16 Apr 2015 18:58:12 + writes: > Because of the setup of R in cygwin, help.start() requires > the following parameter values: > help.start(browser="cygstart",remote=R.home()) > Is it possible to make these values the default? You are building R from the sources, right? Then, of course, it is easily possible: Just modify your version of the R source code appropriately: It is the file src/library/utils/R/help.start.R (And you need more changes anyway to make things work under Cygwin; but don't ask me about them: I run R on Windows only very rarely.) > I do not want cygstart to be the browser except as a > parameter for help.start(). __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
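Short of rebuilding R from patched sources, a small wrapper function keeps the Cygwin-specific arguments in one place without changing help.start() for anything else. This is a sketch of an alternative to the source-modification approach above; the wrapper name is made up.

```r
## Hypothetical wrapper: fixes the two arguments the poster needs while
## leaving help.start() itself (and its defaults) untouched.
cygwin.help.start <- function(...)
  help.start(browser = "cygstart", remote = R.home(), ...)
```

Put the definition in the user's .Rprofile and it behaves like a per-user default, which matches the stated wish that cygstart not become the browser anywhere else.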
Re: [R] list server problem?
> Dear all, > I use Thunderbird 31.6.0 on Mac OS 10.6.8 to download my e-mails. > Since this morning, Thunderbird downloads several times the same e-mails > from the list, some even being from yesterday. It occurs only with > e-mails from the R-help and not with my other professional and private > e-mails. > I also checked that it was not a problem with the server on which I get > e-mails from the R-help: the problematic e-mails are erased every time > from the server but are received several times. So I would say that the > R-help server has sent these e-mails several times. > I am wondering if there is currently an issue with the R-help server. Is > someone else having the same problem? Not having the same problem, but an explanation from the place where the server runs: The R mailing list server has indeed had a hard time, it seems since early Sunday (CEST), including a complete timeout for a few hours this morning. The symptoms have been that some e-mails were badly delayed and --- as you have experienced --- mail server timeouts do lead to e-mail being sent more than once. This is the consequence of a safe e-mail server behavior: If an e-mail "may have been lost", rather send it again. As some say: Better to err on the safe side. Martin Maechler ETH Zurich > Thanks! > Ivan > -- > Ivan Calandra, ATER > University of Reims Champagne-Ardenne > 51100 Reims, France __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How numerical data is stored inside ts time series objects
> Paul > on Wed, 22 Apr 2015 01:39:16 + writes: > William Dunlap tibco.com> writes: >> Use the str() function to see the internal structure of most >> objects. In your case it would show something like: >> >> > Data <- data.frame(theData=round(sin(1:38),1)) >> > x <- ts(Data[[1]], frequency=12) # or Data[,1] >> > y <- ts(Data, frequency=12) >> > str(x) >> Time-Series [1:38] from 1 to 4.08: 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 >> ... >> > str(y) >> ts [1:38, 1] 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ... >> - attr(*, "dimnames")=List of 2 >> ..$ : NULL >> ..$ : chr "theData" >> - attr(*, "tsp")= num [1:3] 1 4.08 12 >> >> 'x' contains a vector of data and 'y' contains a 1-column matrix of >> data. stl(x,"per") and stl(y, "per") give similar results as you >> got. >> >> Evidently, stl() does not know that 1-column matrices can be treated >> much the same as vectors and gives an error message. Thus you must >> extract the one column into a vector: stl(y[,1], "per"). > Thanks, William. > Interesting that a 2D matrix of size Nx1 is treated as a different > animal from a length N vector. It's a departure from math convention, > and from what I'm accustomed to in Matlab. Ha -- Not at all! The above is exactly the misconception I have been fighting -- mostly in vain -- for years. Matlab's convention of treating a vector as an N x 1 matrix is a BIG confusion to much of math teaching: The vector space |R^n is not at all the same space as the space |R^{n x 1}, even though of course there's a trivial mapping between the objects (and the metrics) of the two. A vector *is NOT* a matrix -- but in some matrix calculus notations there is a convention to *treat* n-vectors as (n x 1) matrices. Good linear algebra teaching does distinguish vectors from one-column or one-row matrices -- I'm sure still the case in all good math departments around the globe -- but maybe not in math teaching to engineers and others who only need applied math. 
Yes, linear algebra teaching will also make a point that in the usual matrix product notations, it is convenient and useful to treat vectors as if they were 1-column matrices. > That R's vector seems > more akin to a list, where the notion of orientation doesn't apply. Sorry, but again: not at all in the sense in which 'list's are used in R. Fortunately, well-thought-out languages such as S, R, Julia, Python all do make a good distinction between vectors and matrices, i.e., 1D and 2D arrays. If Matlab still does not do that, it's just another sign that Matlab users should flee and start using julia or R or python. {and well yes, we could start bickering about S' and hence R's distinction between a 1D array and a vector ... which I think has been a clear design error... but that's not the topic here} __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
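The vector-versus-one-column-matrix distinction discussed above is easy to see directly by reconstructing William Dunlap's example:

```r
## Same 38 numbers, stored two ways:
Data <- data.frame(theData = round(sin(1:38), 1))
x <- ts(Data[[1]], frequency = 12)   # from a plain vector: dim(x) is NULL
y <- ts(Data,      frequency = 12)   # from the data frame: a 38 x 1 matrix

## stl() only accepts a univariate series, so extract the single column:
fit <- stl(y[, 1], s.window = "periodic")
```

The `y[, 1]` subscript drops the matrix dimension but keeps the ts attributes, which is exactly why `stl(y[, 1], "per")` works where `stl(y, "per")` errors.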
Re: [R] cbind question, please
> Steve Taylor > on Thu, 23 Apr 2015 23:32:00 + writes: > This works for me... > get0 = function(x) get(x,pos=1) > sapply(big.char, get0) Note that get0() is a new function (somewhat important for efficient code) introduced in R 3.2.0, so you'd rather name your own function differently... > The extra step seems necessary because without it, get() gets base::cat() instead of cat. > cheers, > Steve > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess > Sent: Friday, 24 April 2015 10:41a > To: R help > Subject: [R] cbind question, please > Hello! > I have a cbind type question, please: Suppose I have the following: > dog <- 1:3 > cat <- 2:4 > tree <- 5:7 > and a character vector > big.char <- c("dog","cat","tree") > I want to end up with a matrix that is a "cbind" of dog, cat, and tree. > This is a toy example. There will be a bunch of variables. > I experimented with "do.call", but all I got was > 1 > 2 > 3 > Any suggestions would be much appreciated. I still think that do.call > might be the key, but I'm not sure. > R Version 3-1.3, Windows 7. > Thanks, > Erin > -- > Erin Hodgess > Associate Professor > Department of Mathematical and Statistics > University of Houston - Downtown > mailto: erinm.hodg...@gmail.com > [[alternative HTML version deleted]] > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
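For Erin's original question, mget() sidesteps the get()/base::cat() clash that Steve's helper works around; a small sketch:

```r
dog <- 1:3; cat <- 2:4; tree <- 5:7
big.char <- c("dog", "cat", "tree")

## mget() looks all the names up in the current environment in one go
## (inherits = FALSE by default), so the local 'cat' wins over base::cat():
m <- do.call(cbind, mget(big.char))
colnames(m)   # "dog" "cat" "tree"
```

Because mget() returns a *named* list, do.call(cbind, ...) also gets the column names for free, which plain `do.call(cbind, lapply(big.char, get))` would not provide.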
Re: [R] does segfault mean (always) a bug?
>>>>> Duncan Murdoch >>>>> on Tue, 5 May 2015 15:36:59 -0400 writes: > On 05/05/2015 2:54 PM, lejeczek wrote: >> hi eveybody >> >> I'm trying something simple (Biocunductor packages), so >> simple I believe it's example from docs but I get segfault. >> I don't suppose incorrect scripting can cause segfault, right? > In R, a segfault always indicates a bug. What's not so clear is whether > it is a bug in R, a bug in a contributed package, or a bug in some > underlying system library. > If you can only trigger the bug when using a Bioconductor package, then > the first guess is that it is that package, and the maintainer of that > package is in the best position to track it down further. If you can > simplify the code to trigger it without using any contributed packages, > then it could well be a bug in R, and we'd like to see code to reproduce it. > Duncan Murdoch The bug Duncan mentions can also be in the user's R code, outside any package: If a user does what (s)he should never do, namely directly call .C(), .Fortran(), .Call() or similar (.Internal(), .External(),..) it is typically easy to trigger segfaults, and then the bug is in the user's R code. Variations of the above involve using the inline package, or other interfaces to C/C++/... code, libraries, etc: The bug may be in your code rather than the underlying code which is allowed to make strong assumptions about how it is called. Martin Maechler __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Transformation of the circular variable into linear variable
>>>>> David Winsemius >>>>> on Tue, 5 May 2015 08:41:08 -0700 writes: > On May 5, 2015, at 3:59 AM, Nasr Al-Dhurafi wrote: >> Dear All >> >> I am working with circular data ( wind direction ) and I have a problem >> in transforming the circular variable into a linear variable. I have found >> an equation that can be used to convert the circular variable into a linear >> variable, which is included in the paper *"ARMA based approaches for >> forecasting the tuple of wind speed and direction"*. The equation is called >> the inverse of a link function and I have attached a summary of it. Therefore, please, if you have an idea of how to conduct it in R programming >> or if you have another method for transformation, let me know it. Your >> cooperation is highly appreciated. >> >> My Best Regards & All The Best > Here are a couple of plausible explanations for a lack of response to your first message posted 2 months ago: > Neither of your duplicate messages has any attachments. This could be caused by the fact (buried away in one or more of the webpages or Posting Guides that define the Rhelp facility) that the mail-server used by the Rhelp list strips any attachments unless they are in one of the limited number of accepted file formats. (It does not notify posters of this fact when they submit unacceptable formats.) The 4 formats I know to be accepted are: MIME-text, .pdf, .ps or .png. Even if you submit formats that you might think would be considered "text", they will not be accepted if they have non-".txt" extensions such as ".csv". As far as I can tell ... only .txt extensions are passed through. Thanks a lot, David, for helping Nasr (and many others)! The above explanation is not quite accurate though, and I want to make sure correct info gets archived here: The file extension does not play any role for the mailing list server software, which is "mailman". Mailman -- rightly -- only looks at the declared MIME type of the attachment. 
("MIME": see https://en.wikipedia.org/wiki/MIME). Now with 99.x% of people using e-mail software which does not reveal any of these subtle details to you, and many such e-mail interfaces behaving like Windows programs, that e-mail software may choose the MIME type (it uses for the attachments it puts into the e-mail it sends on your behalf) depending on the file extension. So, from the naive user's point of view, using the only e-mail program he may know, it then seems as if the file extension determines the acceptance of the attachment by the mailman rules we have had imposed (for spam and virus protection). Martin Maechler ETH Zurich >> I find an article by that name in Applied Energy Volume 88, Issue 4, April 2011, Pages 1405–1414. The publisher offers a copy at the price of 42 USD. (Expecting us to buy that article seems unreasonable.) > Most of us are not meteorological researchers. In general list members are more likely to respond to requests that include a) sample data, b) code that reflects your efforts, and c) a specification of correct output. It may be helpful to list details of your position. > -- > David. >> >> Nasr >> >> On Mon, Mar 30, 2015 at 8:22 PM, Nasr Al-Dhurafi >> wrote: >> >>> I am working with a circular data ( wind direction ) and i have problem >>> in transforming the circular variable into linear variable. I have found >>> an equation that can be used to convert the circular variable into linear >>> variable which is included in this paper *" ARMA based approaches for >>> forecasting the tuple of wind speed and direction". *The equation is >>> called as the inverse of a link function and i have attached the summery >>> of it.Therefore,Please, if you have an idea of how to conduct it in R >>> programming or if you have another method for transformation, let me >>> know it. Your cooperation is highly appreciated. 
>>> >>> My Best Regards & All The Best >>> Nasr >>> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > David Winsemius > Alameda, CA, USA > __
Re: [R] does segfault mean (always) a bug?
>>>>> lejeczek >>>>> on Wed, 6 May 2015 08:20:46 +0100 writes: > On 05/05/15 20:36, Duncan Murdoch wrote: >> On 05/05/2015 2:54 PM, lejeczek wrote: >>> hi eveybody >>> >>> I'm trying something simple (Biocunductor packages), so >>> simple I believe it's example from docs but I get segfault. >>> I don't suppose incorrect scripting can cause segfault, right? >> In R, a segfault always indicates a bug. What's not so clear is whether >> it is a bug in R, a bug in a contributed package, or a bug in some >> underlying system library. >> >> If you can only trigger the bug when using a Bioconductor package, then >> the first guess is that it is that package, and the maintainer of that >> package is in the best position to track it down further. If you can >> simplify the code to trigger it without using any contributed packages, >> then it could well be a bug in R, and we'd like to see code to reproduce it. >> >> Duncan Murdoch >> > hi Duncan > I remember that this was a principle of most of programming > languages, only a bug in the code and/or compiler could > cause segfault. > In my case it is a contributed package, specifically GOSim > package, I'm not R programmer and I realise my scripting is > far from good and possibly with errors. > I could send that snippet of the code here if people think > it can be looked into and segfault could be replicated? > I also emailed the author. > many thanks > P. Dear P., in the case of a segfault from using a contributed package, you should typically really only email the package maintainer (who may be different from the package authors), and not R-help. Only if the maintainer does not respond at all (and only if the package is open source, typically CRAN) should you ask for help here or in another public forum. 
(It would also be polite to a maintainer who has volunteered her/his code for your use to give her/him an opportunity to see, comment on, and fix the problem.) Martin Maechler ETH Zurich __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] does segfault mean (always) a bug?
>>>>> William Dunlap >>>>> on Wed, 6 May 2015 09:53:50 -0700 writes: > It looks like a problem in the Matrix package. Indeed. Thanks to Bill Dunlap for the diagnostics below (and other offline information) I have been able to fix the bug yesterday in the R-forge version of Matrix. The problem was due to using a memory-allocation routine which is known to be quite fast... but is not meant for large objects, which we have here. I plan to release the amended version of Matrix to CRAN as Matrix_1.2-1 sooner rather than later. With thanks to Bill and Pavel, Martin Maechler ETH Zurich > I made the file KE.rda containing the Matrix objects K and > E constructed in calc.diffusion.kernel by adding a save() > call just before where R dies in the original example: > K = lam$values[1] * I - M > E = I - matrix(1, ncol = ncol(I), nrow = nrow(I))/ncol(I) > cat("saving K, E, etc. in /tmp/KE.rda\n") > save(K, E, deg, invD, I, W, M, lam, file="/tmp/KE.rda") > cat("done making the file\n") > K = E %*% K %*% E > With that file in place I get >> library(Matrix) >> load("KE.rda") >> sessionInfo() > R version 3.2.0 (2015-04-16) > Platform: x86_64-unknown-linux-gnu (64-bit) > Running under: Ubuntu precise (12.04.5 LTS) [ .. ] > other attached packages: > [1] Matrix_1.2-0 > loaded via a namespace (and not attached): > [1] grid_3.2.0 lattice_0.20-31 >> str(E) > Formal class 'dsyMatrix' [package "Matrix"] with 5 slots > ..@ x : num [1:143376676] 1.00 -8.35e-05 -8.35e-05 -8.35e-05 > -8.35e-05 ... > ..@ Dim : int [1:2] 11974 11974 > ..@ Dimnames:List of 2 > .. ..$ : NULL > .. ..$ : NULL > ..@ uplo: chr "U" > ..@ factors : list() >> str(K) > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots > ..@ i : int [1:487692] 0 69 948 951 1027 1192 1414 1420 1421 1714 > ... > ..@ p : int [1:11975] 0 27 125 147 199 212 221 230 254 274 ... > ..@ Dim : int [1:2] 11974 11974 > ..@ Dimnames:List of 2 > .. ..$ : chr [1:11974] "GO:002" "GO:003" "GO:012" > "GO:018" ... > .. 
..$ : chr [1:11974] "GO:002" "GO:003" "GO:012" > "GO:018" ... > ..@ x : num [1:487692] 32.2163 -0.004674 -0.000722 -0.005316 > -0.014022 ... > ..@ factors : list() >> EK <- E %*% K >> EKE <- EK %*% E > *** caught segfault *** > address 0x7fffa7e1ccf8, cause 'memory not mapped' [...] > On Wed, May 6, 2015 at 1:57 AM, Martin Maechler < > maech...@lynne.stat.math.ethz.ch> wrote: >> >>>>> lejeczek >> >>>>> on Wed, 6 May 2015 08:20:46 +0100 writes: >> >> > On 05/05/15 20:36, Duncan Murdoch wrote: >> >> On 05/05/2015 2:54 PM, lejeczek wrote: >> >>> hi eveybody >> >>> >> >>> I'm trying something simple (Biocunductor packages), so >> >>> simple I believe it's example from docs but I get segfault. >> >>> I don't suppose incorrect scripting can cause segfault, right? >> >> In R, a segfault always indicates a bug. What's not so clear is >> whether >> >> it is a bug in R, a bug in a contributed package, or a bug in some >> >> underlying system library. >> >> >> >> If you can only trigger the bug when using a Bioconductor package, >> then >> >> the first guess is that it is that package, and the maintainer of >> that >> >> package is in the best position to track it down further. If you >> can >> >> simplify the code to trigger it without using any contributed >> packages, >> >> then it could well be a bug in R, and we'd like to see code to >> reproduce it. >> >> >> >> Duncan Murdoch >> >> >> > hi Duncan >> > I remember that this was a principle of most of programming >> > languages, only a bug in the code and/or compiler could >> > cause segfault. >> > In my case it is a contributed package, specifically GOSim >> > package, I'm not R programmer and I realise my scripting is >> > far from good and po
Re: [R] package implementing continuous binomial?
>>>>> David Winsemius >>>>> on Thu, 7 May 2015 12:10:25 -0700 writes: > On May 6, 2015, at 7:00 PM, Benjamin Tyner wrote: >> Hi >> >> I'm wondering if anyone is aware of an R package implementing (i.e., >> providing a pdf, cdf, and/or quantile function) for the continuous >> binomial distribution? Specifically the one characterized here: >> >> http://www2.math.uni-paderborn.de/fileadmin/Mathematik/AG-Indlekofer/Workshop/Satellite_meeting/ilenko.pdf >> >> Figured I would check here first, before attempting to code it up myself. > I found that reading the ArXiv version of that material was easier to understand: > http://arxiv.org/abs/1303.5990 > zipfR package has an implementation of the incomplete beta function that might make some of the coding of the pdf and cdf more simple. Dear David, 
> Searching done with Graves' very useful utility package: > library('sos') > findFn("incomplete beta function") Hmm... R's pbeta() function is a pretty good implementation of the incomplete beta function ... as it tries to say on its help page. If you look closely, these functions (for incomplete gamma, incomplete beta, and their inverses) are simple wrappers to pgamma(), qgamma(), pbeta() and qbeta() --- sometimes "regularizing" and sometimes not -- where regularization is simply a multiplication/division with gamma() or beta(). I wonder why you did not find R's help page about the beta distribution { -> functions dbeta, pbeta, qbeta, rbeta } which does mention the "incomplete beta function" prominently. I don't think Benjamin should use the zipfR package just for these functions [and even the zipfR package help page on these can be read as saying so .. ] In the end I wonder if the "continuous Binomial" is not just a version of the good old Beta distribution... as indeed the Binomial and the Beta are related in the same way that the Gamma and the Poisson are. Martin Maechler ETH Zurich > (I didn't think that doing a search on "continuous Binomial" was likely to be helpful, but I tried it anyway and did not find any functions named "continuous binomial" in their help page titles.) > -- > David Winsemius > Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
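Martin's point about pbeta() can be checked numerically: pbeta() already *is* the regularized incomplete beta function, and the non-regularized version is just pbeta() times beta(). A small sketch (the numbers are arbitrary):

```r
a <- 2.5; b <- 4; x <- 0.3
I_x <- pbeta(x, a, b)        # regularized incomplete beta  I_x(a, b)
B_x <- I_x * beta(a, b)      # non-regularized incomplete beta B(x; a, b)

## Cross-check against direct numerical integration of the defining integral:
B_num <- integrate(function(t) t^(a - 1) * (1 - t)^(b - 1),
                   lower = 0, upper = x)$value
all.equal(B_x, B_num, tolerance = 1e-6)   # TRUE
```

So, as the reply says, no extra package is needed just for the incomplete beta function.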
[R] The R Foundation announces new mailing list 'R-package-devel'
New Mailing list: R-package-devel -- User R Packages Development At last week's monthly meeting, the R Foundation decided to create a new mailing list in order to help R package authors in their package development and testing. The idea is that some experienced R programmers (often those currently helping on R-devel or also R-help) will help package authors and thus offload some of the burden from the CRAN team members. Please read the detailed description of the mailing list here, https://stat.ethz.ch/mailman/listinfo/r-package-devel or also the more extended announcement of the list on R-devel, archived at https://stat.ethz.ch/pipermail/r-devel/2015-May/071208.html For the R Foundation, Martin Maechler, Secretary General ___ r-annou...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-announce __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml
>>>>> Christophe Genolini >>>>> on Fri, 5 Jun 2015 00:36:42 -0700 writes: > Hi all, > There is a compatibility issue between the package 'lme4' and my package > 'kml'. I define the "[" operator. It works just fine in my package (1). If I > try to use the lme4 package, then it no longer works (2). Moreover, it > has some kind of strange behavior (3). Do you know what is wrong? Any idea > of how I can correct that? > Here is a reproducible example, and the same code with the result follows. Dear Christophe, can you please specify the exact sessionInfo() ? (after loading both kml and lme4). 'Matrix' has been updated on CRAN very recently, and 'lme4' is about to be updated very soon, so this *is* a somewhat interesting and important problem, but the exact versions of the packages *do* matter. Best regards! Martin Maechler ETH Zurich [] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml
>>>>> Martin Maechler >>>>> on Fri, 5 Jun 2015 11:33:46 +0200 writes: >>>>> Christophe Genolini >>>>> on Fri, 5 Jun 2015 00:36:42 -0700 writes: >> Hi all, >> There is a compatibility issue between the package 'lme4' and my package >> 'kml'. I define the "[" operator. It works just fine in my package (1). If I >> try to use the lme4 package, then it no longer works (2). Moreover, it >> has some kind of strange behavior (3). Do you know what is wrong? Any idea >> of how I can correct that? >> Here is a reproducible example, and the same code with the result follows. > Dear Christophe, > can you please specify the exact sessionInfo() ? > (after loading both kml and lme4). > 'Matrix' has been updated on CRAN very recently, and > 'lme4' is about to be updated very soon, > so this *is* a somewhat interesting and important problem, > but the exact versions of the packages *do* matter. As a matter of fact, 1) the package versions don't seem to matter much. 2) lme4 is *not* involved directly: much worse, it is the 'Matrix' package. If you replace 'lme4' by 'Matrix' in your code, you get the same symptoms. ... I'm investigating .. Martin
Re: [R] is.na for S4 object
> Martin Morgan > on Thu, 4 Jun 2015 10:33:37 -0700 writes: > On 06/04/2015 10:08 AM, cgenolin wrote: >> Hi the list, >> >> I have a variable y that is either NA or some S4 object. I would like to >> know which case I am in, but it seems that is.na does not work with S4 >> objects; I get a warning: >> >> --- 8< >> setClass("myClass",slots=c(x="numeric")) >> if(runif(1)>0.5){a <- new("myClass")}else{a <- NA} >> is.na(a) >> --- 8< >> >> Any solution? > getGeneric("is.na") > shows that it's an S4 generic, so implement a method > setMethod("is.na", "myClass", function(x) FALSE) > Martin For the present special case though, a more efficient solution would be to use isS4(.) instead of !is.na(.) another Martin >> Thanks >> Christophe
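Both suggestions, side by side — a minimal runnable sketch following the class definition in the thread:

```r
library(methods)

## Martin Morgan's suggestion: give the class its own is.na() method
setClass("myClass", slots = c(x = "numeric"))
setMethod("is.na", "myClass", function(x) FALSE)

a <- new("myClass", x = 1)
is.na(a)    # FALSE, via the method -- no warning any more

## Martin Maechler's cheaper test for this special case:
## an S4 object can be told apart from NA directly
isS4(a)     # TRUE
isS4(NA)    # FALSE
```

The isS4() route avoids method dispatch entirely, which is why it is the more efficient answer when the only question is "S4 object or NA?".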
Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml
> Christophe Genolini > on Fri, 5 Jun 2015 00:36:42 -0700 writes: > Hi all, > There is a compatibility issue between the package 'lme4' and my package > 'kml'. I define the "[" operator. It works just fine in my package (1). If I > try to use the lme4 package, then it no longer works (2). Moreover, it > has some kind of strange behavior (3). Do you know what is wrong? Any idea > of how I can correct that? > Here is a reproducible example, and the same code with the result follows. > Thanks for your help > Christophe [ ... I'm providing slightly different code below ] > --- 8< - Execution of the previous code --- > > library(kml) > Loading required package: clv > Loading required package: cluster > Loading required package: class > Loading required package: longitudinalData > Loading required package: rgl > Loading required package: misc3d > > dn <- gald(1) > ### > ### (1) the "[" operator works just fine > > dn["traj"] > t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 > i1 -3.11 4.32 2.17 1.82 4.90 7.34 0.83 -2.70 5.36 4.96 3.16 > i2 -7.11 1.40 -2.40 -2.96 4.31 0.50 1.25 0.52 -0.04 7.55 5.50 > i3 2.80 6.23 6.08 2.87 2.58 2.88 6.58 -2.38 2.30 -1.74 -3.23 > i4 2.24 0.91 6.50 10.92 11.32 7.79 7.78 10.69 9.15 1.07 -0.51 > ### > ### (2) using 'lme4', it no longer works > > library(lme4) > Loading required package: Matrix > Loading required package: Rcpp > > dn["traj"] > Error in x[i, j] : > error in evaluating the argument 'j' in selecting a method > for function '[': Error: argument "j" is missing, with no > default > ### > ### (3) If I define the "[" again, it does not work the first time I call > it, but it works the second time!
> > setMethod("[", > + signature=signature(x="ClusterLongData", i="character", > j="ANY",drop="ANY"), > + definition=function (x, i, j="missing", ..., drop = TRUE){ > + x <- as(x, "LongData") > + return(x[i, j]) > + } > + ) > [1] "[" > ### Not working the first time I use it > > dn["traj"] > Error in dn["traj"] : > argument "j" is missing, with no default > ### But working the second time > > dn["traj"] > t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 > i1 -3.11 4.32 2.17 1.82 4.90 7.34 0.83 -2.70 5.36 4.96 3.16 > i2 -7.11 1.40 -2.40 -2.96 4.31 0.50 1.25 0.52 -0.04 7.55 5.50 > i3 2.80 6.23 6.08 2.87 2.58 2.88 6.58 -2.38 2.30 -1.74 -3.23 > i4 2.24 0.91 6.50 10.92 11.32 7.79 7.78 10.69 9.15 1.07 -0.51 I have made some investigations, but have to stop for now, and leave this hopefully to others knowledgeable about S4 method dispatch, etc : 1) I am confident to say that you have uncovered an "infelicity if not a bug" in R. 2) I am also pretty confident that the "[" methods that you define in 'kml' and in the package ' 3) Diagnosing is not easy: As you have shown yourself above, in some situations the bug "bites", and if you repeat the *same* code, things work. This is related to the fact that S4 methods are __cached__ (so next time they are found more quickly) under some circumstances, and the cache is cleared under other such circumstances. 3b) Actually, I am sure that we have seen +/- the same problem many months ago, in other contexts, but did not get "down to it"; and at the moment, I cannot quickly find where to look for the problem there...
##--- 8< Commented (incl output) reproducible code-- library(kml) ### Creating some data dn <- gald(1) (dnt <- dn["traj"]) showMethods("[") ## Function: [ (package base) ## x="ClusterLongData", i="character" ## x="ListPartition", i="ANY" ## x="LongData", i="ANY" ## x="LongData", i="character" ## (inherited from: x="LongData", i="ANY") ## x="LongData3d", i="ANY" ## x="nonStructure", i="ANY" ## x="ParChoice", i="ANY" ## x="ParKml", i="ANY" ## x="ParLongData", i="ANY" ## x="Partition", i="ANY" ## x="ParWindows", i="ANY" ### using Matrix (or lme4, which 'Depends' on Matrix; hence same effect) library(Matrix) dn["traj"] ## Error in x[i, j] : ## error in evaluating the argument 'j' in selecting a method for function '[': Error: argument "j" is missing, with no default traceback() ## 3: x[i, j] ## 2: dn["traj"] ## 1: dn["traj"] (ms <- methods(`[`)) ## 81 methods ## MM: debugging : trace("[", browser, signature=c("ClusterLongData", "character", "missing", "missing")) trace("[", browser, signature=c("LongData","character", "character", "missing")) dn["traj"] ## -> you get into the browser, just press "c" twice (once for each "trace") ## ==> it works !! ## Remove the tracing : untrace("[", signature=c("ClusterLongData", "character", "missing", "missing")) untrace("[", signa
Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail
> Aehm, adding on this: I incorrectly *assumed* without testing that rounding > would help; it doesn't: > ecdf(round(test2,0)) # a rounding that is way too rough for my application... > #Error in xy.coords(x, y) : 'x' and 'y' lengths differ > > Digging deeper: The initially mentioned call to unique() is not very helpful, > as test2 is a data frame, so I get what I deserve, an unchanged data frame > with 1 row. Still, the issue remains and can even be simplified further: > > > ecdf(data.frame(a=3, b=4)) > Empirical CDF > Call: ecdf(data.frame(a = 3, b = 4)) > x[1:2] = 3, 4 > > works ok, but > > > ecdf(data.frame(a=3, b=3)) > Error in xy.coords(x, y) : 'x' and 'y' lengths differ > > doesn't (same for a=b=1 or 2, so likely the same for any a=b). Instead, > > > ecdf(c(a=3, b=3)) > Empirical CDF > Call: ecdf(c(a = 3, b = 3)) > x[1:1] = 3 > > does the trick. From ?ecdf, I get that x should be a numeric vector - > apparently, my misuse of the function by applying it to a row of a data frame > (i.e. a data frame with one row). In all my other (dozens of) cases that > worked ok, though but not for this particular one. A simple unlist() helps: You were lucky. To use a one-row data frame instead of a numerical vector will typically *not* work unless ... well, you are lucky. No, do *not* pass data frame rows instead of numeric vectors. > > > ecdf(unlist(data.frame(a=3, b=3))) > Empirical CDF > Call: ecdf(unlist(data.frame(a = 3, b = 3))) > x[1:1] = 3 > > Yet, I'm even more confused than before: in my other data, there were also > duplicated values in the vector (1-row-data frame), and it never caused any > issue. For this particular example, it does. I must be missing something > fundamental... > well.. I'm confused about why you are confused, but if you are thinking about passing rows of data frames as numeric vectors, this means you are sure that your data frame only contains "classical numbers" (no factors, no 'Date's, no...). 
In such a case, transform your data frame to a numerical matrix *once*, preferably using data.matrix() instead of just as.matrix() (though in this case it should not matter). Then *check* the result, and then work with that matrix from then on. All other things probably will continue to leave you confused .. ;-) Martin Maechler, ETH Zurich
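A minimal sketch of that advice, with made-up data:

```r
df <- data.frame(a = c(3, 1), b = c(3, 2))

## convert *once*; data.matrix() maps factors/Dates to their numeric
## codes, where as.matrix() would fall back to character
m <- data.matrix(df)

## a row of the matrix is a plain (named) numeric vector, safe for ecdf()
Fn <- ecdf(m[1, ])    # row 1 is c(a = 3, b = 3): duplicates are fine here
Fn(3)     # 1
Fn(2.9)   # 0
```

Note that `ecdf(df[1, ])` — the one-row data frame — is exactly the call that failed in the thread; the matrix row does not have that problem.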
Re: [R] 'class(.) == **' [was 'Call to a function']
>>>>> Steve Taylor >>>>> on Wed, 24 Jun 2015 00:56:26 + writes: > Note that objects can have more than one class, in which case your == and %in% might not work as expected. > Better to use inherits(). > cheers, > Steve Yes indeed, as Steve said, really do! The use of (class(.) == "..") is error prone and against the philosophy of classes (S3 or S4 or ..) in R : Classes can "extend" other classes or "inherit" from them; S3 examples in "base R" are - glm() objects which are "glm" but also inherit from "lm" - multivariate time-series are "mts" and "ts" - The time-date objects POSIXt , POSIXct, POSIXlt ==> do work with inherits() We've seen this use of class(.) == ".." (or "!=" or %in% ...) in too many places; though it may work fine in your test cases, it is wrong to be used in generality, e.g. inside a function you provide for more general use, and is best replaced with the use of inherits() / is() everywhere "out of principle". Martin Maechler ETH Zurich
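The glm/lm example above, spelled out — a small sketch with simulated data:

```r
set.seed(1)
fit <- glm(y ~ x, data = data.frame(x = 1:10, y = rnorm(10)))

class(fit)            # c("glm", "lm"): the class attribute has length 2
class(fit) == "lm"    # c(FALSE, TRUE) -- a length-2 logical, a trap in if()
inherits(fit, "lm")   # TRUE: respects the S3 inheritance
inherits(fit, "glm")  # TRUE
```

With `if (class(fit) == "lm")` the condition silently uses (or, in recent R, errors on) only the first element, so a glm object would be misclassified; inherits() has no such failure mode.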
Re: [R] Ramanujan and the accuracy of floating point computations - using Rmpfr in R
>>>>> "P" == ProfJCNash >>>>> on Sat, 4 Jul 2015 21:42:27 -0400 writes: P> n163 <- mpfr(163, 500) P> is how I set up the number. Yes, and you needed to specify the desired precision. As author and maintainer of Rmpfr, let me give my summary of this overly long thread (with many wrong statements before finally RK gave the "obvious" answer): 1) Rmpfr is about high (or low, for didactical reasons, see Rich Heiberger's talk at 'useR! 2015') precision __numerical__ computations. This is what the GNU MPFR library is about, and Rmpfr wants to be a smart and R-like {e.g., automatic coercions wherever they make sense} R interface to MPFR. 2) If you use Rmpfr, you as user should decide about the desired accuracy of your "inputs". I think it would be a very bad idea to redefine 'pi' (in Rmpfr) to be Const("pi", 120). R-like also means that in a computation a * b the properties of a and b determine the result, and hence if a <- pi then we know that pi is a double precision number with 53-bit mantissa. P> On 15-07-04 05:10 PM, Ravi Varadhan wrote: >>> What about numeric constants, like `163'? well, if you _combine_ them with an mpfr() number, they are used in their precision. An integer like 163 is exact of course; but pi (another numeric constant) is 53-bit as mentioned above, *and* sqrt(163) is ("R-like") a double precision number computed by R's internal double precision arithmetic. My summary: 1. Rmpfr behaves entirely correctly and as documented (in all the examples given here). 2. The idea of substituting expressions containing pi with something like Const("pi", 120) is not a good one. (*) *) One could think of adding a new class, say "symbolicNumber" or rather "numericExpression" which in arithmetic would try to behave correctly, namely find out what precision all the other components in "the arithmetic" have and make sure all parts of the 'numericExpression' are coerced to 'mpfr' with the correct precision, and only then start evaluating "the arithmetic".
But that *does* look like an implementation of Maple/Mathematica/... in R, and that seems silly; rather, R should interface to Free (as in "speech") aka "open source" symbolic math packages such as Pari, Macsyma, .. Martin Maechler ETH Zurich
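For the archives, the 53-bit point can be illustrated in base R alone, no Rmpfr needed: near Ramanujan's constant exp(pi*sqrt(163)) ~ 2.6e17, adjacent doubles are about 32 apart, while the famous deviation from an integer is only ~7.5e-13 — far below what double precision can represent there:

```r
## double precision: 53-bit mantissa, relative spacing 2^-52
eps <- .Machine$double.eps   # 2.220446e-16
eps == 2^-52                 # TRUE

x <- exp(pi * sqrt(163))     # ~2.625e17, computed entirely in doubles
x * eps                      # ~58 : roughly the gap between adjacent doubles near x

## every double of this magnitude is a multiple of 32, hence an integer;
## the ".99999999999925" tail is simply not representable:
x == round(x)                # TRUE
```

This is why mpfr(163, 500) and then a high-precision pi are needed before the near-integer structure becomes visible.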
Re: [R] indices of mismatch element in two vector with missing values
>>>>> Hervé Pagès >>>>> on Tue, 28 Jul 2015 23:07:14 -0700 writes: > On 07/28/2015 09:58 PM, Peter Alspach wrote: >> One way >> >> seq(test1)[-which(test1==test2)] > One question is whether 2 NAs should be considered to match or not. > The OP doesn't tell but I guess he wants them to match: > test1 <- c("1", "2", NA, "4", NA, "6") > test2 <- c("1", "2", "3", NA, NA, "66") > seq(test1)[-which(test1==test2)] > # [1] 3 4 5 6 > 5 should probably not be there! >> but I imagine there are better ones . >> Yes, indeed, as logical indexing is considerably safer than "negative indexing with which(.)" : PLEASE, everyone, do remember: %%>===<%% %%> There is one ___big__ disadvantage to [ - which() ] : <%% %%> It entirely __fails__ when the logical vector is all FALSE <%% %%>===<%% ## Example (remember, we want the indices of *differing* entries): ## --- Here, all entries differ, so we want to get 1:8 : x <- c(NA, 1:7) y <- c(11:17, NA) ## now this seq(x)[ - which(x == y) ] ## gives not what you expect ## But logical indexing always works: x == y & is.na(x) == is.na(y) seq(x)[!(x == y & is.na(x) == is.na(y))] With output : > x <- c(NA, 1:7) > y <- c(11:17, NA) > ## now this > seq(x)[ - which(x == y) ] ## gives not what you expect integer(0) > ## But logical indexing always works: > x == y & is.na(x) == is.na(y) [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE > seq(x)[!(x == y & is.na(x) == is.na(y))] [1] 1 2 3 4 5 6 7 8 > Martin Maechler ETH Zurich >> -Original Message- >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of baccts >> Sent: Wednesday, 29 July 2015 8:26 a.m. >> To: r-help@r-project.org >> Subject: [R] indices of mismatch element in two vector with missing values >> >> How would you return the index where two vectors differs if they may contain missing (NA) values? >> For example: >> test1 <- c("1","2",NA); >> test2 <- c("1","2","3"); >> which(test1!=test2) does not return 3! >> >> Thanks in advance. 
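The logical-indexing idiom above can be wrapped into a small helper (the function name is made up; it treats an NA/NA pair as a match):

```r
## indices where x and y differ, NA-safely; NA/NA counts as a match
mismatch_idx <- function(x, y) {
  same <- (x == y & !is.na(x) & !is.na(y)) | (is.na(x) & is.na(y))
  which(!same)   # logical indexing: safe even when nothing differs
}

test1 <- c("1", "2", NA, "4", NA, "6")
test2 <- c("1", "2", "3", NA, NA, "66")
mismatch_idx(test1, test2)   # 3 4 6  (position 5 is NA in both)
mismatch_idx(1:3, 1:3)       # integer(0), not the all-FALSE which() trap
```

The last call is exactly the case where `seq(x)[-which(...)]` would go wrong, since `x[-integer(0)]` returns the whole vector.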
Re: [R] How to have NLME return when convergence not reached
>>>>> Bert Gunter >>>>> on Tue, 2 Dec 2014 14:03:44 -0800 writes: > Yes, Bill almost always has helpful ideas. > Just a comment: If indeed the process is gobbling up too much memory, > that might indicate a problem with your function or implementation. I > defer to real experts on this, however. > Cheers, > Bert > Bert Gunter > Genentech Nonclinical Biostatistics > (650) 467-7374 Yes, thank you Bert, it could be entirely Ramiro's function, but we don't know, as you have not given any indication about what the non-linear function is in your nlme() formula. The little you wrote _seems_ to indicate a bug in nlme, and if that was the case, the bug should be reported and fixed, and we (R core - the maintainer of the recommended package nlme) would really like to get a reproducible example. If, OTOH, it is your nonlinear function that "goes haywire", the implicit blame on nlme would not be warranted. Martin Maechler, ETH Zurich and R Core team > On Tue, Dec 2, 2014 at 1:59 PM, Ramiro Barrantes > wrote: >> Thanks so much for your reply. I am using try but nlme never returns!! and I think the process is getting killed by the system as it is taking over all the memory. However, I do like William Dunlap's idea of using R.utils::withTimeout to limit the time. >> >> Thanks again for your help! >> >> From: Bert Gunter [gunter.ber...@gene.com] >> Sent: Tuesday, December 02, 2014 4:30 PM >> To: Ramiro Barrantes >> Cc: r-help@r-project.org >> Subject: Re: [R] How to have NLME return when convergence not reached >> >> ?try >> Or >> ?tryCatch >> >> Bert >> >> Sent from my iPhone -- please excuse typos. >> >>> On Dec 2, 2014, at 12:57 PM, Ramiro Barrantes wrote: >>> >>> Hello, >>> >>> I am trying to fit many hundreds of simulated datasets using NLME (it's all in a big loop in R). Some don't seem to converge. I am working on addressing the issues by perhaps adjusting my simulation, or tweaking iteration steps in nlme, etc.
However, when it doesn't converge, NLME just hangs, and my program either stalls for hours/days or takes over the computer memory and everything crashes eventually. Is there a way to tell nlme to stop when it doesn't seem to be converging somehow? I have been looking at the parameters in nlmeControl() but see nothing obvious. >>> >>> Thanks in advance, >>> >>> Ramiro
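Besides R.utils::withTimeout mentioned in the thread, base R's setTimeLimit() can enforce such a per-fit time budget. A sketch — the wrapper name is made up, and the nlme() call is only indicated in a comment:

```r
## give an expression a time budget; return NULL if it runs over
fit_or_give_up <- function(expr, seconds = 60) {
  setTimeLimit(elapsed = seconds, transient = TRUE)
  on.exit(setTimeLimit(elapsed = Inf))    # restore the limit on exit
  tryCatch(expr, error = function(e) {
    setTimeLimit(elapsed = Inf)           # clear the limit before anything else
    message("giving up: ", conditionMessage(e))
    NULL
  })
}

## in the simulation loop one would then write something like
##   fit <- fit_or_give_up(nlme(...), seconds = 300)
##   if (is.null(fit)) next
fit_or_give_up(1 + 1)                     # 2 : finishes in time, returned as-is
fit_or_give_up({while (TRUE) NULL}, 0.25) # NULL, after ~0.25 s
```

Note this bounds time, not memory; a fit that exhausts RAM before the limit fires still needs an OS-level limit (e.g. ulimit).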
[R] The R Foundation - New board
Dear R lovers and other useRs, notably benefactors, donors, supporting members and institutions of the R foundation http://www.r-project.org/foundation/ About 11.7 years ago, in April 2003, the R Foundation was announced by Fritz Leisch, https://stat.ethz.ch/pipermail/r-announce/2003/000385.html shortly after its foundation, and since then, its board has led the foundation during years of phenomenal growth and success. Our thanks go to the *past* board members Presidents: Robert Gentleman, Ross Ihaka Secretary General: Friedrich Leisch Treasurer: Kurt Hornik Member at Large: John Chambers In addition, Peter Dalgaard and Martin Mächler had acted as auditors. After Summer meetings of the R Core and the R Foundation, we, i.e., Robert Gentleman, on Sep. 15 announced the new ("ordinary") memberships and the fact that the membership of the foundation has grown to 29 persons: https://stat.ethz.ch/pipermail/r-announce/2014/000577.html Subsequently (after a careful nomination and voting process), the election of a new *board* was finalized on December 1, and the new board has since started its work. It consists of Presidents: Duncan Murdoch, Martyn Plummer Secretary General: Martin Mächler Treasurer: Kurt Hornik Members at Large: John Chambers Brian Ripley Dirk Eddelbuettel and the two auditors Peter Dalgaard and Roger Bivand. This is very soon to be reflected on http://www.r-project.org/foundation/board.html The foundation and its board aim to become considerably more active in the future. Details about this will be announced here in due time. Thank you all for supporting the R project, a Free Software (hence Open Source) endeavor only made possible by the dedicated work of many thousands of volunteers and their contributions in several ways, including becoming supporting (or ++) members of the R foundation!
Martin Maechler
[R] Do NOT use ifelse() if you can use if(.) else . [was "Checking .."]
>>>>> Steven Yen >>>>> on Sun, 14 Dec 2014 09:30:56 -0500 writes: > Thanks. This worked!! :) > Fisher <- ifelse(!("Fisher" %in% names(obj$spec)), FALSE, obj$spec$Fisher) "worked", yes. But please --- for the mailing list archives --- do use better code. Unfortunately, ifelse() is used and has been advertized much too often in cases where it is very sub-optimal efficiency wise. ifelse(Cond, A, B) should only be used when the condition 'Cond' is a logical vector that can be (and typically is) of length > 1 {and even then, it may be a nice shortcut, but often still suboptimal efficiency wise ... but let's not get there} In cases like this one, when the condition 'Cond' is a simple TRUE or FALSE (i.e. of length 1), using if(Cond) A else B instead of ifelse(Cond, A, B) is 1. much more R-like [[ "everything you do is a function call" ]] hence much more elegant 2. considerably more efficient. 3. :-) less typing: uses two "," less ;-) > require(microbenchmark) Loading required package: microbenchmark > x <- setNames(,LETTERS) > y <- setNames(letters, runif(letters)) > z <- pi > microbenchmark(r1 <- ifelse(z > 3, x, y), r2 <- if(z > 3) x else y, times=1000) Unit: nanoseconds expr min lq mean median uq max neval cld r1 <- ifelse(z > 3, x, y) 4466 4971.5 5498.928 5244 5673.5 31705 1000 b r2 <- if (z > 3) x else y 171 212.0 265.241 264 291.0 3130 1000 a > i.e., roughly a factor of 20 times more efficient Martin Maechler, ETH Zurich and R Core Team. > At 08:43 AM 12/14/2014, Ben Tupper wrote: >> Hi, >> >> Does this work for you? It simply tests if the name Fisher is found >> among the names of the elements of spec. >> >> obj = list(spec = list()) >> Fisher <- ifelse(!("Fisher" %in% names(obj$spec)), FALSE, obj$spec$Fisher) >> >> Cheers, >> Ben >> >> On Dec 14, 2014, at 8:07 AM, Steven Yen wrote: >> >> > My obj does not always come with a logical variable defined.
So I do >> > >> > my.foo <- function(obj,df,digits=5){ >> > if (!is.na("obj$spec$Fisher")) Fisher<-obj$spec$Fisher >> > ... >> > } >> > >> > This works when "Fisher" is defined in/passed from obj. When it >> is not, I get an error: >> > >> > Error in (!is.na("obj$spec$Fisher")) & Fisher : >> > operations are possible only for numeric, logical or complex types >> > >> > I tried exist(Fisher), missing(Fisher)... to no avail. Any idea? Thanks.
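Besides efficiency, there is a correctness trap worth archiving with this thread: ifelse() shapes its result like the *condition*, so with a scalar condition it silently truncates vector answers:

```r
ifelse(TRUE, 1:5, 0)   # 1 -- the result is cut to length(condition)!
if (TRUE) 1:5 else 0   # 1 2 3 4 5

## ifelse() is for genuinely vectorized conditions:
ifelse(c(TRUE, FALSE, TRUE), 1:3, -(1:3))   # 1 -2 3
```

So in the scalar case `if (.) . else .` is not only ~20x faster, it is also the only one of the two that returns what was written.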
Re: [R] Links to vignettes in the base packages
>>>>> Richard Cotton >>>>> on Wed, 14 Jan 2015 09:23:42 +0300 writes: > Contributed packages have copies of their vignettes on CRAN (in their > package page at /web/packages/). > Since base packages no longer have a page here, I can't find a web link to them. > I'm aware that I can find the vignette via browseVignettes() or > vignette("vignettename", package = "packagename"). There are two use > cases where a web link is more useful. If someone is asking for help > and I want to direct them to a vignette, having a link to a vignette > rather than getting them to open R and type a command increases the > chance that they will actually bother to look at the thing. Secondly, > if I want to read a vignette on a machine without R (ebook reader, > phone, whatever), then a web link is more convenient. > Are the base vignettes available online? I bet ... and I would have done so even before doing a quick check. > If so, where are they? Indeed, one place is "here" (the place I work) i.e., from http://stat.ethz.ch/R-manual/ (which I think is one of the oldest still working R related "web pages" ... and yes, it shows its age :-) > If not, is it feasible to make them available? The drawback of the above is that you see the future versions of the manuals and I agree that it may make sense to provide them in a .r-project.org location (where I don't "see" them). Martin Maechler, ETH Zurich > -- > Regards, > Richie
Re: [R] two-sample KS test: data becomes significantly different after normalization
> Monnand > on Wed, 14 Jan 2015 07:17:02 + writes: > I know this must be a wrong method, but I cannot help asking: Can I only > use the p-value from the KS test, saying that if the p-value is greater than \beta, then > the two samples are from the same distribution? If the definition of the p-value is > the probability that the null hypothesis is true, Ouch, ouch, ouch, ouch The worst misuse/misunderstanding of statistics, now even on R-help ... ---> please get help from a statistician !! --> and erase that sentence from your mind (unless you are a pro and want to keep it for anecdotal or didactical purposes...) > then why do so few > people use the p-value as a "true" probability? E.g., normally, people will not > multiply or add p-values to get the probability that two independent null > hypotheses are both true or one of them is true. I have had this question for > a very long time. > -Monnand > On Tue Jan 13 2015 at 2:47:30 PM Andrews, Chris > wrote: >> This sounds more like quality control than hypothesis testing. Rather >> than statistical significance, you want to determine what is an acceptable >> difference (an 'equivalence margin', if you will). And that is a question >> about the application, not a statistical one. >> >> From: Monnand [monn...@gmail.com] >> Sent: Monday, January 12, 2015 10:14 PM >> To: Andrews, Chris >> Cc: r-help@r-project.org >> Subject: Re: [R] two-sample KS test: data becomes significantly different >> after normalization >> >> Thank you, Chris! >> >> I think it is exactly the problem you mentioned. I did consider >> 1000-point data a large sample at first. >> >> I down-sampled the data from 1000 points to 100 points and ran the KS test >> again. It worked as expected. Is there any typical method to compare >> two large samples? I also tried the KL divergence, but it only gives me some >> number and does not tell me how large the distance should be to be >> considered significantly different.
>> >> Regards, >> -Monnand >> >> On Mon, Jan 12, 2015 at 9:32 AM, Andrews, Chris >> wrote: >> > >> > The main issue is that the original distributions are the same, you >> shift the two samples *by different amounts* (about 0.01 SD), and you have >> a large (n=1000) sample size. Thus the new distributions are not the same. >> > >> > This is a problem with testing for equality of distributions. With >> large samples, even a small deviation is significant. >> > >> > Chris >> > >> > -Original Message- >> > From: Monnand [mailto:monn...@gmail.com] >> > Sent: Sunday, January 11, 2015 10:13 PM >> > To: r-help@r-project.org >> > Subject: [R] two-sample KS test: data becomes significantly different >> after normalization >> > >> > Hi all, >> > >> > This question is sort of related to R (I'm not sure if I used an R >> function >> > correctly), but also related to stats in general. I'm sorry if this is >> > considered as off-topic. >> > >> > I'm currently working on a data set with two sets of samples. The csv >> file >> > of the data could be found here: http://pastebin.com/200v10py >> > >> > I would like to use KS test to see if these two sets of samples are from >> > different distributions. >> > >> > I ran the following R script: >> > >> > # read data from the file >> >> data = read.csv('data.csv') >> >> ks.test(data[[1]], data[[2]]) >> > Two-sample Kolmogorov-Smirnov test >> > >> > data: data[[1]] and data[[2]] >> > D = 0.025, p-value = 0.9132 >> > alternative hypothesis: two-sided >> > The KS test shows that these two samples are very similar. (In fact, they >> > should come from same distribution.) >> > >> > However, due to some reasons, instead of the raw values, the actual data >> > that I will get will be normalized (zero mean, unit variance). 
So I tried >> > to normalize the raw data I have and run the KS test again: >> > >> >> ks.test(scale(data[[1]]), scale(data[[2]])) >> > Two-sample Kolmogorov-Smirnov test >> > >> > data: scale(data[[1]]) and scale(data[[2]]) >> > D = 0.3273, p-value < 2.2e-16 >> > alternative hypothesis: two-sided >> > The p-value becomes almost zero after normalization indicating these two >> > samples are significantly different (from different distributions). >> > >> > My question is: How the normalization could make two similar samples >> > becomes different from each other? I can see that if two samples are >> > different, then normalization could make them similar. However, if two >> sets >> > of data are similar, then intuitively, applying same operation onto them >>
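Chris' point — that with n = 1000 even a small shift becomes "significant" — is easy to reproduce; a sketch with simulated data:

```r
set.seed(42)
x <- rnorm(1000)
y <- rnorm(1000)

ks.test(x, y)$p.value          # same distribution: p typically large
ks.test(x, y + 0.3)$p.value    # a 0.3-SD shift at n = 1000: p is tiny
ks.test(x[1:50], y[1:50] + 0.3)$p.value   # the same shift at n = 50: often not "significant"
```

This is why a fixed significance threshold is the wrong tool here: the test answers "is there *any* difference detectable at this n?", not "is the difference *large enough to matter*?" — the latter needs an equivalence margin, as Chris suggested.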
Re: [R] sparse matrix from vector outer product
> Philipp A > on Wed, 14 Jan 2015 14:02:40 + writes: > Hi, > creating a matrix from two vectors a, b by multiplying each combination can > be done e.g. via > a %*% t(b) > or via > outer(a, b) # default for third argument is '*' really the best (most efficient) way would be tcrossprod(a, b) > But this yields a normal matrix. of course. Please always use small self-contained example code, here, e.g., a <- numeric(17); a[3*(1:5)] <- 10*(5:1) b <- numeric(12); b[c(2,3,7,11)] <- 1:3 > Is there an efficient way to create sparse matrices (from the Matrix > package) like that? > Right now i’m doing > a.sparse = as(a, 'sparseVector') > b.sparse = as(t(b), 'sparseMatrix') > a.sparse %*% b.sparse > but this strikes me as wasteful. not really wasteful I think. But there is a nicer and more efficient way : require(Matrix) tcrossprod(as(a, "sparseVector"), as(b, "sparseVector")) now also gives 17 x 12 sparse Matrix of class "dgCMatrix" [1,] . . . . . . . . . . . . [2,] . . . . . . . . . . . . [3,] . 50 100 . . . 150 . . . 50 . [4,] . . . . . . . . . . . . [5,] . . . . . . . . . . . . [6,] . 40 80 . . . 120 . . . 40 . [7,] . . . . . . . . . . . . [8,] . . . . . . . . . . . . [9,] . 30 60 . . . 90 . . . 30 . [10,] . . . . . . . . . . . . [11,] . . . . . . . . . . . . [12,] . 20 40 . . . 60 . . . 20 . [13,] . . . . . . . . . . . . [14,] . . . . . . . . . . . . [15,] . 10 20 . . . 30 . . . 10 . [16,] . . . . . . . . . . . . [17,] . . . . . . . . . . . .
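As a cross-check for the archives, the sparse route gives exactly the same entries as the dense outer product; a small sketch using the example vectors from the thread:

```r
library(Matrix)

a <- numeric(17); a[3 * (1:5)] <- 10 * (5:1)
b <- numeric(12); b[c(2, 3, 7, 11)] <- 1:3

S <- tcrossprod(as(a, "sparseVector"), as(b, "sparseVector"))  # sparse
D <- tcrossprod(a, b)                                          # dense a %*% t(b)

all(as.matrix(S) == D)   # TRUE: identical entries
nnzero(S) / length(S)    # density: 20 / 204, about 0.1
```

At this toy size the sparse object is not smaller than the dense one (S4 slot overhead dominates); the memory and speed gains appear as the matrices grow and the density stays low.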
Re: [R] "Negative length vectors are not allowed" error
> Hi all,
>
> I have a question concerning an error that occurs when using the inspect()
> function on a set of rules made by the apriori function from the arules
> package.  I have a dataset with 12 million records.  It contains some basic
> sales information (such as product, customer data).  I want to use the
> apriori function from the arules package on it:
>
>     ruleset <- apriori(sales,
>                        parameter = list(support = 0.0005,
>                                         confidence = 0.1, minlen = 2))
>
> It gives me 780379 rules.  I want to have that many rules on purpose, hence
> the parameters being so low (I guess you can reproduce this problem with
> any large dataset and low settings for support and confidence).  But then I
> want to check out the rules with inspect.  It has a subset because I'm only
> interested in rules with the attribute Product in the rhs:
>
>     inspect(subset(ruleset, subset = rhs %pin% "Product="))
>
> Then this error occurs:
>
>     Error in inspect(subset(sales3ruleset, subset = rhs %pin% "Product=")) :
>       error in evaluating the argument 'x' in selecting a method for
>       function 'inspect': Error in .Call("R_or_ngCMatrix", x@data, y@data,
>       PACKAGE = "arules") :
>       negative length vectors are not allowed
>
> I looked around, and apparently that part about "negative length vectors
> are not allowed" means that you want to create a vector that is larger than
> 2^31.  How can you get around this limit?  Or how can I make the inspect
> function work in this case?
>
> Thanks in advance!
> Kim

Dear Kim,

if you learned to post (i.e. write that e-mail) in plain text, the above
would look more humane.  Still, I was able to decipher it, and you are right
in that you hit a limitation of the current setup, which may well be linked
to the Matrix package which I maintain, and on which 'arules' depends.

Can you please try to find a reproducible example [with randomly generated
data; i.e., you'd use set.seed(), runif(), rpois(), rmultinom(), rnorm(), ...]
so we, the maintainer of 'arules', Michael Hahsler (BCC'ed; use
maintainer("arules") to find such an e-mail address), and myself can look at
whether and how that limitation might be lifted.

Best regards,
Martin Maechler, ETH Zurich
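A template for the kind of self-contained, randomly generated example Martin asks for might look like the following. It is a sketch, not a confirmed reproduction of the 2^31 failure: the dimensions, density, and item names are all made up, and only calls that already appear in the thread (apriori, inspect, subset, %pin%) plus a standard coercion to 'transactions' are used:

```r
library(arules)
set.seed(42)

## invented dimensions -- scale these up to probe the limit
n.trans <- 1e4
n.items <- 50

## sparse logical incidence matrix: ~2% of items present per transaction
m <- matrix(runif(n.trans * n.items) < 0.02,
            nrow = n.trans,
            dimnames = list(NULL, paste0("Product=", seq_len(n.items))))

trans   <- as(m, "transactions")
ruleset <- apriori(trans, parameter = list(support = 0.0005,
                                           confidence = 0.1, minlen = 2))

## the operation that failed on the real 12-million-record data:
inspect(subset(ruleset, subset = rhs %pin% "Product="))
```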
Re: [R] Re-order levels of a categorical (factor) variable
>>>>> Bert Gunter >>>>> on Wed, 21 Jan 2015 18:52:12 -0800 writes:

    > Bill/Ravi:
    > I believe the problem is that the factor is automatically created when
    > a data frame is created by read.table().  By default, the levels are
    > lexicographically ordered.  The following reproduces the problem and
    > gives a solution.

    >> library(lattice)
    >> z <- data.frame(y = 1:9, x = rep(c("pre", "day2", "day10")))
    >> xyplot(y ~ x, data = z)   ## x axis order is day10, day2, pre
    >> levels(z$x)
    > [1] "day10" "day2"  "pre"
    >> z$x <- factor(as.character(z$x), levels = c(levels(z$x)[3:1]))  ## explicitly defines level order
    >> xyplot(y ~ x, data = z)   ## desired plot

Indeed, thank you, Bert -- and using  levels(.) <- *  does *not* work
(as I first thought).

However, slightly shorter and maybe even easier to remember than

    z$x <- factor(as.character(z$x), levels = c(levels(z$x)[3:1]))  ## def. level order

is

    z$x <- factor(z$x, levels = levels(z$x)[3:1])  ## def. level order

Martin Maechler, ETH Zurich
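For completeness, spelling the desired levels out by name also works and does not depend on knowing the current (alphabetical) ordering, which makes it more robust if the set of level names changes:

```r
z <- data.frame(y = 1:9, x = rep(c("pre", "day2", "day10")))

## name the levels in the order you want them on the axis:
z$x <- factor(z$x, levels = c("pre", "day2", "day10"))
levels(z$x)
# [1] "pre"   "day2"  "day10"
```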
Re: [R] Sum function and missing values --- need to mimic SAS sum function
>>>>> Jim Lemon >>>>> on Mon, 26 Jan 2015 11:21:03 +1100 writes:

    > Hi Allen,
    > How about this:
    > sum_w_NA <- function(x) ifelse(all(is.na(x)), NA, sum(x, na.rm=TRUE))

Excuse me, Jim, but that's yet another "horrible misuse of ifelse()".
John Fox's reply *did* contain the "proper" solution:

    if (all(is.na(x))) NA else sum(x, na.rm=TRUE)

The ifelse() function should never be used in such cases.
Read more after googling "Do NOT use ifelse()" -- include the quotes in your
search -- or directly at
http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html
Yes, this has been on R-help a month ago..

Martin

    > On Mon, Jan 26, 2015 at 10:21 AM, Allen Bingham > wrote:
    >> I understand that in order to get the sum function to ignore missing
    >> values I need to supply the argument na.rm=TRUE.  However, when
    >> summing numeric values in which ALL components are "NA" ... the result
    >> is 0.0 ... instead of what I would get from SAS, namely NA (or in the
    >> case of SAS, ".").
    >>
    >> Accordingly, I've had to go to 'extreme' measures to get the sum
    >> function to result in NA if all arguments are missing (otherwise give
    >> me a sum of all non-NA elements).
    >>
    >> So for example here's a snippet of code that ALMOST does what I want:
    >>
    >> SumValue <- apply(subset(InputDataFrame,
    >>                          !is.na(Variable.1) | !is.na(Variable.2),
    >>                          select = c(Variable.1, Variable.2)),
    >>                   1, sum, na.rm = TRUE)
    >>
    >> In reality this does NOT give me records with NA for SumValue ... but
    >> it doesn't give me values for any records in which both Variable.1 and
    >> Variable.2 are NA --- which is "good enough" for my purposes.
    >>
    >> I'm guessing with a little more work I could come up with a way to
    >> adapt the code above so that I could get it to work like SAS's sum
    >> function ...
    >>
    >> ... but before I go that extra mile I thought I'd ask others if they
    >> know of functions in either base R ... or in a package ... that will
    >> better mimic the SAS sum function.
    >>
    >> Any suggestions?
    >>
    >> Thanks.
    >> Allen Bingham
    >> aebingh...@gmail.com
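Putting John Fox's one-liner to work row-wise, as Allen wants, gives a small sketch of a SAS-style sum (a hypothetical wrapper name, sumNA, is used here):

```r
## NA if ALL inputs are NA, otherwise the sum of the non-NA values
sumNA <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)

m <- cbind(Variable.1 = c(1, NA, NA),
           Variable.2 = c(2,  3, NA))
apply(m, 1, sumNA)
# [1]  3  3 NA
```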
Re: [R] the less-than-minus gotcha
    > All the more reason to use = instead of <-

Definitely not!  (As you were told, there are other drawbacks.)
R does not have to look like C; it *is* different in many ways.

If you use a decent IDE for R, you get spaces around ' <- ' for free:
both in ESS and in RStudio, you can use  [Alt]-  (Alt plus the minus key)
to produce the 4 characters ' <- '.
{ [Alt]- is called 'M--' in ESS / emacs, which has even more options for
" <- " and is fully configurable in its key bindings anyway. }

The '=' character has many uses in R, and using ' <- ' for assignment makes
the code "more expressive": it makes sense to highlight the assignment op,
but it is a bit stupid to highlight all "=" signs.  Further, it can be
nicely marked up by a real "left arrow", e.g. by the LaTeX 'listings'
package, or the (oldish) 'a2ps' GNU software.

Further, assignment is not commutative, and hence there is a corresponding
` -> ` operator, whereas '=' is a commutative operator in mathematics, but
not when used as assignment op.

[ yes: "Flame war is on.  I'll stop reading R-help for a while.." ;-) ;-) ]

    > -----Original Message-----
    > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker
    > Sent: Monday, 2 February 2015 2:07p
    > To: r-h...@stat.math.ethz.ch
    > Subject: Re: [R] the less-than-minus gotcha

    > Mike Miller <...@gmail.com> writes:
    > >
    > > I've got to remember to use more spaces.  Here's the basic problem:
    > >
    > > These are the same:
    > >
    > >     v< 1
    > >     v<1
    > >
    > > But these are extremely different:
    > >
    > >     v< -1
    > >     v<-1

    > This is indeed documented, in passing, in one of the pages you listed:
    > http://tim-smith.us/arrgh/syntax.html
    > "Whitespace is meaningless, unless it isn't.  Some parsing ambiguities
    > are resolved by considering whitespace around operators.  See and
    > despair: x<-y (assignment) is parsed differently than x < -y (comparison)!"
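The gotcha can be seen directly from the parser, without any side effects, by quoting the two expressions:

```r
## the same characters, parsed differently depending on whitespace:
str(quote(x<-1))    # language x <- 1   (assignment)
str(quote(x < -1))  # language x < -1   (comparison with negative one)

## and in an actual session:
x <- 5
x<-1        # this REASSIGNS x, it does not compare x with -1
x
# [1] 1
x < -1      # now an explicit comparison
# [1] FALSE
```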