[R] Error in ginv(A) : 'X' must be a numeric or complex matrix
Dear all, I encountered a problem that has been bugging me for some hours now, and I still can't come up with a viable solution. I tried searching the archives, but to no avail, so here I am sending my first e-mail to the list! I am estimating a binary spatial autoregressive model via a Gibbs sampler. When I do this with a neighborhood matrix, everything goes perfectly fine, but when I switch to a distance matrix, the program stops almost immediately, and outputs the famous (but hardly ever occurring): "Error in ginv(A) : 'X' must be a numeric or complex matrix" What strikes me as odd is that when I try to compute the generalized inverse myself, everything goes smoothly, and all the matrixes seem to be numeric. Below you will find the data and code for the replicable example. Data for the replicable example: https://gist.github.com/lessermatter/66b6488cfe6f5d7893bf And here is the code (the error occurs after executing the final line, which is also a toy model to be estimated through the bsar function): https://gist.github.com/lessermatter/0284be117a19620750aa Any ideas? Kind regards, Matteo Villa Università degli Studi di Milano Italy __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating list with increasing string lengths
Hi Nia, Many ways to do something like this, for example: N<-6 sapply(1:N,function(x) return(sample(1:10,x,TRUE))) I'll let you work out how to generalize this. Jim On Sun, Jun 14, 2015 at 3:06 PM, Bert Gunter wrote: > Is this homework? Homework is deprecated here. > > ?lapply > > is one of many possible approaches. If this is not homework, showing your > unsuccessful code would likely lead to a better learning experience for you. > > Cheers, > Bert > > Bert Gunter > > "Data is not information. Information is not knowledge. And knowledge is > certainly not wisdom." >-- Clifford Stoll > > On Sat, Jun 13, 2015 at 8:04 AM, Nia Gupta via R-help > wrote: > >> Hello, >> >> I am trying to create a list where each name would have an increasing >> vector length. For example, I am trying to obtain something that looks like >> this: >> >> [[1]][1] 2 >> [[2]] >> [1] 2 4 >> >> [[3]] >> [1] 1 2 3 >> . >> The numbers generated would just be any random numbers. My thought was to >> use a for-loop and the sequence function but that doesn't seem to be >> working. Any help on this would be greatly appreciated. >> >> Thank you, >> >> Nia >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R 3.2, Mac 10.10.3 : help.search showing error
On Jun 13, 2015, at 11:04 PM, Berend Hasselman wrote: > >> On 14-06-2015, at 06:25, Ramnik Bansal wrote: >> >> Thanks. But it seems to be an R 3.2.0 specific problem. >> > > I replied with the following to a similar message on R-devel. There was no error either with using help() or using the Mac GUI package manager. Those were the reported difficulties previously reported. > > > See this thread on R-SIG-Mac > > https://stat.ethz.ch/pipermail/r-sig-mac/2015-April/011420.html > > This may help. > Get R 3.2.0-patched or even the release candidate for R 3.2.1 > The error I am getting was from the most recent R 3.2.0 downloaded yesterday. There is no 3.2.0-Patched for Mavericks. The release candidate, R 3.2.1 RC, is marked on the ATT Research webpage as failing Make and it indeed fails to launch. I would NOT recommend that anyone accept that advice. But maybe it is a Mac-specific problem. When I remove the crippled R 3.2.1 RC and reinstall the R 3.2.0 and run from a Terminal window I do not get the help.search() error. So copying to R SIG Mac, and will not copy R-help on any further efforts. -- David. > > Berend > > >> On Sun, Jun 14, 2015 at 8:42 AM, David Winsemius >> wrote: >> >>> >>> On Jun 13, 2015, at 7:41 AM, Ramnik Bansal wrote: >>> Getting following error in using help.search > utils::help.search("linear models") Error in help(db[i, "topic"], package = db[i, "Package"], lib.loc = >>> lib, : 'topic' should be a name, length-one character vector or reserved word >>> >>> I first tried this in a Mac running the SL version for R 3.1.2 and did not >>> get this error. I updated my Mavericks laptop to R 3.2.0 and can now >>> reproduce this error. It does not seem to depend on having a space in the >>> argument. It seems to be thrown by this segment of code in the >>> `help()`-function: >>> >>> ischar <- tryCatch(is.character(topic) && length(topic) == >>> 1L, error = identity) >>> if (inherits(ischar, "error")) >>> ischar <- FALSE >>> if (!ischar) { >>> reserved <- c("TRUE", "FALSE", "NULL", "Inf", "NaN", >>> "NA", "NA_integer_", "NA_real_", "NA_complex_", "NA_character_") >>> stopic <- deparse(substitute(topic)) >>> if (!is.name(substitute(topic)) && !stopic %in% reserved) >>> stop("'topic' should be a name, length-one character vector or >>> reserved word") >>> >>> If gone through the `help.search` function code and cannot find where the >>> `help` function is actually called. This seems unlikely to be a >>> Mac-specific problem. >>> > example(help.search) hlp.sr> help.search("linear models")# In case you forgot how to fit linear Error in help(db[i, "topic"], package = db[i, "Package"], lib.loc = >>> lib, : 'topic' should be a name, length-one character vector or reserved word How to sort this? [[alternative HTML version deleted]] >>> >>> David Winsemius >>> Alameda, CA, USA >>> >>> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with abs function
Hi, Just do the following: > tran<-c(7.2) > tgrid<-c(7.1,7.4,7.3,7.1,7.3) > tgrid<-tgrid-tran > tgrid [1] -0.1 0.2 0.1 -0.1 0.1 > abs(tgrid[tgrid>0.1]) [1] 0.2 Andrés > El 12/06/2015, a las 11:01, Jeff Newmiller > escribió: > > FAQ 7.31 > --- > Jeff NewmillerThe . . Go Live... > DCN:Basics: ##.#. ##.#. Live Go... > Live: OO#.. Dead: OO#.. Playing > Research Engineer (Solar/BatteriesO.O#. #.O#. with > /Software/Embedded Controllers) .OO#. .OO#. rocks...1k > --- > Sent from my phone. Please excuse my brevity. > > On June 12, 2015 8:39:48 AM PDT, "Nelson, Gary (MISC)" > wrote: >> I have come across some odd behavior (to me) using the abs() function >> that I wonder if anyone can explain. >> >> Using R version 3.2.0, I created a vector of absolute values using >> the following code: >> >>> tran<-c(7.2) >>> tgrid<-c(7.1,7.4,7.3,7.1,7.3) >>> dgrid<-abs(tgrid-tran) >>> dgrid >> [1] 0.1 0.2 0.1 0.1 0.1 >> >> When I tried to extract just the rows with values>0.1, I get >> >>> dgrid[dgrid>0.1] >> [1] 0.1 0.2 0.1 >> >> There should be only 1 value extracted. >> >> However, if I enter the values by hand >> >>> bgrid<-c(0.1,0.2,0.1,0.1,0.1) >>> bgrid >> [1] 0.1 0.2 0.1 0.1 0.1 >>> bgrid[bgrid>0.1] >> [1] 0.2 >> >> The result is correct. So why is this happening? >> >> I did explore a little bit and found >> >>> as.character(dgrid) >> [1] "0.101" "0.2" >> [3] "0.0996" "0.101" >> [5] "0.0996" >> >> which shows the absolute values of the negative differences are >> slightly greater than the difference of 0.1 >> >> Is this normal behavior for the abs() function? >> >> Thanks, >> >> Gary Nelson >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] most frequent value
Dear group, I have the following integer object > a 3 4 6 3 3 6 how to get the most frequent value (it should be 3) and get nothing if no frequent one and all is equal 3 4 6 3 4 6 Thanks in advance Ragia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] most frequent value
Hi Ragia, The basic method is to use "table" and examine the result: > a<-matrix(c(3,3,4,3,6,6),nrow=2) > b<-matrix(c(3,3,4,4,6,6),nrow=2) > table(a) a 3 4 6 3 1 2 but you can get the modal value directly like this: library(prettyR) Mode(a) [1] "3" Mode(b) [1] ">1 mode" Jim On Mon, Jun 15, 2015 at 9:04 AM, Ragia Ibrahim wrote: > Dear group, > I have the following integer object > >> a > > 3 4 6 > 3 3 6 > > how to get the most frequent value (it should be 3) > > and get nothing if no frequent one and all is equal > > 3 4 6 > 3 4 6 > Thanks in advance > Ragia > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix/df help populate NA
Is this what you want: > x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3, + -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject", + "A", "B", "C", "D"), class = "data.frame", row.names = c(NA, + -2L)) > > x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4, + 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject", + "A", "D", "F", "H"), class = "data.frame", row.names = c(NA, + -2L)) > > # determine what the missing columns are and then add them to x2 > missing <- setdiff(colnames(x1), colnames(x2)) > > new_x2 <- x2 > > for (i in missing) new_x2[[i]] <- NA > > new_x2 Subject AD FH B C 1 x1 4.3 -2.4 1.3 -2.3 NA NA 2 x2 2.4 0.1 0.5 -1.4 NA NA Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Sat, Jun 13, 2015 at 11:17 PM, Adrian Johnson wrote: > Dear group: > > I have two data frames. The column names of the two data frame has > some common variables but not identical. > > my aim is to make 2 DFs more uniform by taking union of both colnames > > > For example: I have x1 and x2 matrices: > > > x1 > SubjectAB CD > 1 x1 1.5 -1.3 0.4 -0.2 > 2 x2 -1.2 -0.3 0.3 -0.1 > > x2 > Subject AD FH > 1 x1 4.3 -2.4 1.3 -2.3 > 2 x2 2.4 0.1 0.5 -1.4 > > cases = c('A','B','C','D','F','H') > > for X2 I want to create newX2 DF. > > > x3 > Subject A B CD FH > 1 x1 4.3 NA NA -2.4 1.3 -2.3 > 2 x2 2.4 NA NA 0.1 0.5 -1.4 > > > Since B and C are no existing in x2, I put NAs. > > how can I create x3 matrix? > > > > dput code: > > x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3, > -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject", > "A", "B", "C", "D"), class = "data.frame", row.names = c(NA, > -2L)) > > x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4, > 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject", > "A", "D", "F", "H"), class = "data.frame", row.names = c(NA, > -2L)) > > > Could you please help how to create x3 with NAs incorporated. > adrian. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Scatterplot : smoothing colors according to density of points
check out the 'hexbin' package for making scatter plots that have a lot of points overlapping in a small area. Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Tue, Jun 2, 2015 at 9:51 AM, Adams, Jean wrote: > Try this. > > Jean > > D <- structure(list( > id = structure(1:6, .Label = c("O13297", "O13329", "O13525", > "O13539", "O13541", "O13547"), class = "factor"), > X = c(44.44, 31.272085, 6.865672, 14.176245, 73.275862, > 28.991597), > Y = c(21.6122, 4.0159, 2.43884, 7.81217, 3.59012, 258.999)), > .Names = c("id", "X", "Y"), class = "data.frame", > row.names = c("1", "2", "3", "4", "5", "6")) > > # define the number of colors > ncol <- 100 > # define the radius of the neighborhood > distcut <- 30 > pal <- colorRampPalette(c("blue", "yellow", "red"))(ncol) > > # calculate the euclidean distance between all pairs of points, based on X, > Y coordinates > Ddist <- with(D, as.matrix(dist(cbind(X, Y), diag=TRUE, upper=TRUE))) > # count up the number of neighbors within distcut distance of each point > D$C <- apply(Ddist # use this count to define the levels (which will be then used to color > points in the plot > D$Clevels <- with(D, > cut(C, breaks=seq(min(C), max(C), length.out=ncol+1), > labels=FALSE, include.lowest=TRUE)) > > # plot the data > with(D, plot(X, Y, col=pal[Clevels], log="y", pch=16)) > > > > On Tue, Jun 2, 2015 at 5:37 AM, Benjamin Dubreuil < > benjamin.dubre...@weizmann.ac.il> wrote: > > > Hello everyone, > > > > I have a data frame D with 4 columns id,X,Y,C. > > I want to plot a simple scatter plot of D$X vs. D$Y and using D$C values > > as a color. (id is just a text string not used for the plot) > > > > But actually, I don't want to use the raw values of D$C, I would prefer > to > > calculate the average values of D$C according to the density of points > in a > > fixed neighborhood. > > In other words, I would like to smooth the colors according to the > density > > of points. > > > > I am looking for any function,package that could solve this. > > So far, I've been looking at library MASS and the function kde2d which > can > > calculate the density of points in 2 directions, but I don't see how I > > could then use this information to recalculate my D$C values. > > > > Here is a piece of the matrix : > > > head(D) > > id X YC > > 1 O13297 44.44 21.61220 -0.136651639 > > 2 O13329 31.272085 4.01590 -0.117016949 > > 3 O13525 6.865672 2.43884 -0.161173913 > > 4 O13539 14.176245 7.81217 -0.075756757 > > 5 O13541 73.275862 3.59012 -0.006988235 > > 6 O13547 28.991597 258.99900 -0.013985507 > > > > > dim(D) > > [1] 36164 > > > > > apply(D[,-1],2,range) > >X Y C > > [1,] 0.3378378 0.0003 -0.738 > > [2,] 100.000 24556.4000 0.5582500 > > (Y is not linear, so I use log='y' in the plot function) > > > > I used a palette of 100 colors ranging from Blue to Yellow to red. > > >pal = colorRampPalette(c("blue","yellow","red"))(100) > > > > To make D$C values correspond to a color, I used a cut with the following > > breaks (101 breaks from -1.2 to 1.2): > > > BREAKS > > [1] -1.2000 -0.8000 -0.4000 -0.3600 -0.3200 -0.2800 -0.2400 -0.2000 > > -0.1925 > > [10] -0.1850 -0.1775 -0.1700 -0.1625 -0.1550 -0.1475 -0.1400 -0.1368 > > -0.1336 > > [19] -0.1304 -0.1272 -0.1240 -0.1208 -0.1176 -0.1144 -0.1112 -0.1080 > > -0.1048 > > [28] -0.1016 -0.0984 -0.0952 -0.0920 -0.0888 -0.0856 -0.0824 -0.0792 > > -0.0760 > > [37] -0.0728 -0.0696 -0.0664 -0.0632 -0.0600 -0.0568 -0.0536 -0.0504 > > -0.0472 > > [46] -0.0440 -0.0408 -0.0376 -0.0344 -0.0312 -0.0280 -0.0248 -0.0216 > > -0.0184 > > [55] -0.0152 -0.0120 -0.0088 -0.0056 -0.0024 0.0008 0.0040 0.0072 > > 0.0104 > > [64] 0.0136 0.0168 0.0200 0.0232 0.0264 0.0296 0.0328 0.0360 > > 0.0392 > > [73] 0.0424 0.0456 0.0488 0.0520 0.0552 0.0584 0.0616 0.0648 > > 0.0680 > > [82] 0.0712 0.0744 0.0776 0.0808 0.0840 0.0872 0.0904 0.0936 > > 0.0968 > > [91] 0.1000 0.1250 0.1500 0.1750 0.2000 0.2250 0.2500 0.4875 > > 0.7250 > > [100] 0.9625 1.2000 > > > C.levels = as.numeric(cut(D$C,breaks=BREAKS)) > > >length(C.levels) > > [1] 3616 > > > > C.levels ranges from 2 to 98 and then to plot the colors I used > > pal[C.levels]. > > > plot( x=D$x, y=D$Y, col=pal[ C.levels ],log='y') > > > > > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE a
Re: [R] Milisecond problem in as.POSIXct ?
On Wed, Jun 10, 2015 at 8:05 PM, Joshua Ulrich wrote: > > This is known behavior with how POSIXt objects are printed. See the > discussion on StackOverflow: > http://stackoverflow.com/questions/7726034/how-r-formats-posixct-with-fractional-seconds > To summarize the relevant portion of the discussion on StackOverflow: I believe the behavior is due to truncating the fractional seconds instead of rounding, combined with the floating point representation of the POSIXct time (though I would need to take a closer look at do_formatPOSIXlt to verify). If you look at the underlying double value of the POSIXct object, you can see why the printed value in the first example below is 0.2s and the printed value for the second example is 0.4s. R> sprintf("%20.10f", as.POSIXct('2011-10-11 07:49:36.3', tz="UTC")) [1] "1318319376.299523" R> sprintf("%20.10f", as.POSIXct('2011-10-11 07:49:36.4', tz="UTC")) [1] "1318319376.400954" > On Wed, Jun 10, 2015 at 7:41 PM, ce wrote: >> >> Dear all, >> >> my main problem is with miliseconds. I have an array : >> >> library(xts) >> options(digits.secs = 3) >> > x >> [1] "2015-06-10 10:22:06.389 EDT" "2015-06-10 10:22:07.473 EDT" >> [3] "2015-06-10 10:22:08.717 EDT" "2015-06-10 10:22:09.475 EDT" >> >> > x[1] >> [1] "2015-06-10 10:22:06.38 EDT" >> > x[2] >> [1] "2015-06-10 10:22:07.473 EDT" >> >> why it cuts last digit of miliseconds 389 to 38 ? ( it doesn't cut 473 !! ) >> >> I try to dump it to post here: >> >> > dump("x",file=stdout()) >> >> x <- >> structure(c(1433946126.39, 1433946127.474, 1433946128.717, 1433946129.476 >> ), tzone = "", tclass = c("POSIXct", "POSIXt"), class = c("POSIXct", >> "POSIXt")) >> >> new array becomes : >> >> > x >> [1] "2015-06-10 10:22:06.390 EDT" "2015-06-10 10:22:07.473 EDT" >> [3] "2015-06-10 10:22:08.717 EDT" "2015-06-10 10:22:09.476 EDT" >> >> this time first milisecond 389 became 390 ? and last element 475 became 476 >> ? >> >> I do some more tests : >> >> as.POSIXct("2015-06-10 10:22:07.473",format='%Y-%m-%d %H:%M:%OS') >> [1] "2015-06-10 10:22:07.473 EDT" >> >> is correct, but : >> >> as.POSIXct("2015-06-10 10:22:06.389",format='%Y-%m-%d %H:%M:%OS') >> [1] "2015-06-10 10:22:06.388 EDT" >> >> why miliseconds turn to 388 instead of 389 ? >> >> or >> >> as.POSIXct("2015-06-10 10:22:07.478",format='%Y-%m-%d %H:%M:%OS') >> [1] "2015-06-10 10:22:07.477 EDT" >> >> why it shows 477 instead of 478 >> >> > sessionInfo() >> R version 3.2.0 (2015-04-16) >> Platform: x86_64-suse-linux-gnu (64-bit) >> Running under: openSUSE 13.2 (Harlequin) (x86_64) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] xts_0.9-7 zoo_1.7-12 >> >> loaded via a namespace (and not attached): >> [1] tools_3.2.0 grid_3.2.0 lattice_0.20-31 >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > -- > Joshua Ulrich | about.me/joshuaulrich > FOSS Trading | www.fosstrading.com -- Joshua Ulrich | about.me/joshuaulrich FOSS Trading | www.fosstrading.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Different behavior of model.matrix between R 3.2 and R3.1.1
Terry - your example didn't demonstrate the problem because the variable that interacted with strata (zed) was not a factor variable. But I had stated the problem incorrectly. It's not that there are too many strata terms; there are too many non-strata terms when the variable interacting with the stratification factor is a factor variable. Here is a simple example, where I have attached no packages other than the basic startup packages. strat <- function(x) x d <- expand.grid(a=c('a1','a2'), b=c('b1','b2')) d$y <- c(1,3,2,4) f <- y ~ a * strat(b) m <- model.frame(f, data=d) Terms <- terms(f, specials='strat', data=d) specials <- attr(Terms, 'specials') temp <- survival:::untangle.specials(Terms, 'strat', 1) Terms <- Terms[- temp$terms] model.matrix(Terms, m) (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2 1 1 0 0 0 2 1 1 0 0 3 1 0 1 0 4 1 1 0 1 . . . The column corresponding to a='a1' b='b2' should not be there (aa1:strat(b)b2). This does seem to be a change in R. Any help appreciated. Note that after subsetting out strat terms using Terms[ - temp$terms], Terms attributes factor and term.labels are: attr(,"factors") a a:strat(b) y0 0 a1 2 strat(b) 0 1 attr(,"term.labels") [1] "a" "a:strat(b)" Frank On 06/11/2015 08:44 AM, Therneau, Terry M., Ph.D. wrote: Frank, I'm not sure what is going on. The following test function works for me in both 3.1.1 and 3.2, i.e, the second model matrix has fewer columns. As I indicated to you earlier, the coxph code removes the strata() columns after creating X because I found it easier to correctly create the assign attribute. Can you create a worked example? require(survival) testfun <- function(formula, data) { tform <- terms(formula, specials="strata") mf <- model.frame(tform, data) terms2 <- terms(mf) strat <- untangle.specials(terms2, "strata") if (length(strat$terms)) terms2 <- terms2[-strat$terms] X <- model.matrix(terms2, mf) X } tdata <- data.frame(y= 1:10, zed = 1:10, grp = factor(c(1,1,1,2,2,2,1,1,3,3))) testfun(y ~ zed*grp, tdata) testfun(y ~ strata(grp)*zed, tdata) Terry T. - original message -- For building design matrices for Cox proportional hazards models in the cph function in the rms package I have always used this construct: Terms <- terms(formula, specials=c("strat", "cluster", "strata"), data=data) specials <- attr(Terms, 'specials') stra<- specials$strat Terms.ns <- Terms if(length(stra)) { temp <- untangle.specials(Terms.ns, "strat", 1) Terms.ns <- Terms.ns[- temp$terms]#uses [.terms function } X <- model.matrix(Terms.ns, X)[, -1, drop=FALSE] The Terms.ns logic removes stratification factor "main effects" so that if a stratification factor interacts with a non-stratification factor, only the interaction terms are included, not the strat. factor main effects. [In a Cox PH model stratification goes into the nonparametric survival curve part of the model]. Lately this logic quit working; model.matrix keeps the unneeded main effects in the design matrix. Does anyone know what changed in R that could have caused this, and possibly a workaround? --- -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of *Biostatistics* *Vanderbilt University* __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix/df help populate NA
Thank you very much. It worked! On Sun, Jun 14, 2015 at 8:00 PM, jim holtman wrote: > Is this what you want: > >> x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3, > + -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject", > + "A", "B", "C", "D"), class = "data.frame", row.names = c(NA, > + -2L)) >> >> x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4, > + 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject", > + "A", "D", "F", "H"), class = "data.frame", row.names = c(NA, > + -2L)) >> >> # determine what the missing columns are and then add them to x2 >> missing <- setdiff(colnames(x1), colnames(x2)) >> >> new_x2 <- x2 >> >> for (i in missing) new_x2[[i]] <- NA >> >> new_x2 > Subject AD FH B C > 1 x1 4.3 -2.4 1.3 -2.3 NA NA > 2 x2 2.4 0.1 0.5 -1.4 NA NA > > > > > > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > > On Sat, Jun 13, 2015 at 11:17 PM, Adrian Johnson > wrote: >> >> Dear group: >> >> I have two data frames. The column names of the two data frame has >> some common variables but not identical. >> >> my aim is to make 2 DFs more uniform by taking union of both colnames >> >> >> For example: I have x1 and x2 matrices: >> >> > x1 >> SubjectAB CD >> 1 x1 1.5 -1.3 0.4 -0.2 >> 2 x2 -1.2 -0.3 0.3 -0.1 >> > x2 >> Subject AD FH >> 1 x1 4.3 -2.4 1.3 -2.3 >> 2 x2 2.4 0.1 0.5 -1.4 >> >> cases = c('A','B','C','D','F','H') >> >> for X2 I want to create newX2 DF. >> >> > x3 >> Subject A B CD FH >> 1 x1 4.3 NA NA -2.4 1.3 -2.3 >> 2 x2 2.4 NA NA 0.1 0.5 -1.4 >> >> >> Since B and C are no existing in x2, I put NAs. >> >> how can I create x3 matrix? >> >> >> >> dput code: >> >> x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3, >> -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject", >> "A", "B", "C", "D"), class = "data.frame", row.names = c(NA, >> -2L)) >> >> x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4, >> 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject", >> "A", "D", "F", "H"), class = "data.frame", row.names = c(NA, >> -2L)) >> >> >> Could you please help how to create x3 with NAs incorporated. >> adrian. >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.