Re: [R] Problem with X11
Hello! Today on Debian testing R 3.2.5 was delivered among the updates. The X11 problem is no longer there.

Cheers

Lorenzo

On Tue, Apr 19, 2016 at 02:28:44PM -0400, Tom Wright wrote:

I don't have my debian box available so can't confirm, but I would try

$ apt-get install libpng

On Tue, Apr 19, 2016 at 11:23 AM, Lorenzo Isella wrote:

Dear All,

I have never had this problem before. I run Debian testing on my box and I have recently updated my R environment. Now, see what happens when I try the most trivial of all plots:

plot(seq(22))
Error in (function (display = "", width, height, pointsize, gamma, bg,  :
  X11 module cannot be loaded
In addition: Warning message:
In (function (display = "", width, height, pointsize, gamma, bg,  :
  unable to load shared object '/usr/lib/R/modules//R_X11.so':
  /usr/lib/x86_64-linux-gnu/libpng12.so.0: version `PNG12_0' not found (required by /usr/lib/R/modules//R_X11.so)

and this is my sessionInfo():

sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
 [1] LC_CTYPE=en_GB.utf8        LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8         LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=en_GB.utf8     LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8        LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8  LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Does anybody understand what is going on here?

Regards

Lorenzo

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
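For anyone hitting the same libpng mismatch, a quick sanity check from within R once the updated packages are installed (assuming an interactive session with a display attached; this is just a diagnostic sketch, not part of the fix described above):

capabilities(c("X11", "png"))   # both should be TRUE once the libpng mismatch is gone
X11()                           # should now open a device without the shared-object error
plot(seq(22))
dev.off()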
Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?
> Henrik Bengtsson on Tue, 19 Apr 2016 14:04:11 -0700 writes:

 > Using the Matrix package, how can I create a row-oriented sparse
 > Matrix from scratch populated with some data? By default a
 > column-oriented one is created and I'm aware of the note that the
 > package is optimized for column-oriented ones, but I'm only interested
 > in using it for holding my sparse row-oriented data and doing basic
 > subsetting by rows (even using drop=FALSE).

 > Here is what I get when I set up a column-oriented sparse Matrix:

 >> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
 >> Cc[1:3,1] <- 1

A general ("teaching") remark: the above use of Matrix() is seen in many places, and is fine for small matrices and for the case where you only use the `[<-` method very few times (as above). Using Matrix() is also nice when being introduced to the Matrix package. However, for efficiency in non-small cases, do use sparseMatrix() directly to construct sparse matrices.

 >> Cc
 > 5 x 5 sparse Matrix of class "dgCMatrix"
 > [1,] 1 . . . .
 > [2,] 1 . . . .
 > [3,] 1 . . . .
 > [4,] . . . . .
 > [5,] . . . . .

 >> str(Cc)
 > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
 >   ..@ i       : int [1:3] 0 1 2
 >   ..@ p       : int [1:6] 0 3 3 3 3 3
 >   ..@ Dim     : int [1:2] 5 5
 >   ..@ Dimnames:List of 2
 >   .. ..$ : NULL
 >   .. ..$ : NULL
 >   ..@ x       : num [1:3] 1 1 1
 >   ..@ factors : list()

 > When I try to do the analogue for a row-oriented matrix, I get a
 > "dgTMatrix", whereas I would expect a "dgRMatrix":

 >> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
 >> Cr <- as(Cr, "dsRMatrix")
 >> Cr[1,1:3] <- 1
 >> Cr
 > 5 x 5 sparse Matrix of class "dgTMatrix"
 > [1,] 1 1 1 . .
 > [2,] . . . . .
 > [3,] . . . . .
 > [4,] . . . . .
 > [5,] . . . . .

The reason for the above behavior has been

a) efficiency. All the subassignment ( `[<-` ) methods for "RsparseMatrix" objects (of which "dsRMatrix" is a special case) are implemented via TsparseMatrix.

b) because of the general attitude that Csparse (and Tsparse to some extent) are well supported in Matrix, and e.g. further operations on Rsparse matrices would *again* go via T* or C* sparse ones, I had decided to keep things Tsparse.

[...]

 > Trying with explicit coercion does not work:

 >> as(Cc, "dgRMatrix")
 > Error in as(Cc, "dgRMatrix") :
 >   no method or default for coercing "dgCMatrix" to "dgRMatrix"
 >> as(Cr, "dgRMatrix")
 > Error in as(Cr, "dgRMatrix") :
 >   no method or default for coercing "dgTMatrix" to "dgRMatrix"

The general philosophy in 'Matrix', with all the class hierarchies and the many specific classes, has been to allow and foster coercing to abstract super classes, i.e., to "dMatrix" or "generalMatrix", "triangularMatrix", or then "denseMatrix", "sparseMatrix", "CsparseMatrix" or "RsparseMatrix", etc.

So in the above, as(*, "RsparseMatrix") should always work.

As a summary, in other words, for what you want,

  as(sparseMatrix(.), "RsparseMatrix")

should give you what you want reliably and efficiently.

 > Am I doing something wrong here? Or is this what it means that the package is
 > optimized for the column-oriented representation and I shouldn't
 > really work with row-oriented ones? I'm really only interested in
 > access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory
 > footprint).

{ though you could equivalently use Cc[,row, drop=FALSE] with a CsparseMatrix Cc := t(Cr), couldn't you ? }

Martin Maechler (maintainer of 'Matrix') ETH Zurich

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
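Martin's summary line can be turned into a tiny sketch (the indices and values below are arbitrary illustration data, not from the thread):

library(Matrix)
## construct column-oriented (the well-supported path), then coerce once
Cr <- as(sparseMatrix(i = c(1, 1, 1, 3), j = c(1, 2, 3, 5), x = c(1, 1, 1, 2),
                      dims = c(5, 5)),
         "RsparseMatrix")
class(Cr)               # "dgRMatrix"
Cr[1, , drop = FALSE]   # row subsetting, as in the original question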
Re: [R] Merge sort
I indeed used is.na() to check length, as I was not sure whether length() was a simple query or would go through the whole vector to count the elements.

So to sum up: function calls are expensive, therefore recursion should be avoided, and growing the size of a vector (which is probably reassigning and copying?) is also expensive.

Thank you for your help!

On 04/19/2016 11:51 PM, Duncan Murdoch wrote:

On 19/04/2016 3:39 PM, Gaston wrote:

Hello everyone,

I have recently started learning R, and as a small exercise I wanted to write a recursive mergesort. I was extremely surprised to discover that my sorting, although operational, is deeply inefficient in time. Here is my code:

merge <- function(x,y){
  if (is.na(x[1])) return(y)
  else if (is.na(y[1])) return(x)
  else if (x[1]
  return(cbind(c(x[1],division(x[-c(1,2)])[,1]),c(x[2],division(x[-c(1,2)])[,2])))
}

mergesort <- function(x){
  if (is.na(x[2])) return(x)
  else{
    print(x)
    t=division(x)
    return(merge(mergesort(t[,1]),mergesort(t[,2])))
  }
}

I tried my best to write it "the R-way", but apparently I failed. I suppose some of the functions I used are quite heavy. I would be grateful if you could give a hint on how to change that!

I hope I made myself clear and wish you a nice day,

Your use of is.na() looks strange. I don't understand why you are testing element 2 in mergesort(), and element 1 in merge(), and element 3 in division. Are you using it to test the length? It's better to use the length() function for that.

The division() function returns a matrix. It would make more R-sense to return a list containing the two parts, because they might not be the same length.

Generally speaking, function calls are expensive in R, so the recursive merge you're using looks like it would be the bottleneck. You'd almost certainly be better off to allocate something of length(x) + length(y), and do the assignments in a loop.

Here's a merge sort I wrote as an illustration in a class. It's designed for clarity rather than speed, but I'd guess it would be faster than yours:

mergesort <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  # split x into two pieces of approximately equal size, x1 and x2
  x1 <- x[1:(n %/% 2)]
  x2 <- x[(n %/% 2 + 1):n]
  # sort each of the pieces
  x1 <- mergesort(x1)
  x2 <- mergesort(x2)
  # merge them back together
  result <- c()
  i <- 0
  while (length(x1) > 0 && length(x2) > 0) {
    # compare the first values
    if (x1[1] < x2[1]) {
      result[i + 1] <- x1[1]
      x1 <- x1[-1]
    } else {
      result[i + 1] <- x2[1]
      x2 <- x2[-1]
    }
    i <- i + 1
  }
  # put the smaller one into the result
  # delete it from whichever vector it came from
  # repeat until one of x1 or x2 is empty
  # copy both vectors (one is empty!) onto the end of the results
  result <- c(result, x1, x2)
  result
}

If I were going for speed, I wouldn't modify the x1 and x2 vectors, and I'd pre-allocate result to the appropriate length, rather than growing it in the while loop. But that was a different class!

Duncan Murdoch

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
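As a hedged sketch of the speed-oriented variant Duncan alludes to (result pre-allocated once, x1 and x2 never shrunk; the names merge2/mergesort2 are made up here, this is not code from the thread):

merge2 <- function(x, y) {
  nx <- length(x); ny <- length(y)
  result <- numeric(nx + ny)          # pre-allocate the full result once
  i <- 1; j <- 1
  for (k in seq_len(nx + ny)) {
    # take from x while its head is the smaller one, or when y is exhausted
    if (i <= nx && (j > ny || x[i] <= y[j])) {
      result[k] <- x[i]; i <- i + 1
    } else {
      result[k] <- y[j]; j <- j + 1
    }
  }
  result
}

mergesort2 <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  m <- n %/% 2
  merge2(mergesort2(x[1:m]), mergesort2(x[(m + 1):n]))
}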
Re: [R] Merge sort
On 20/04/2016 7:38 AM, Gaston wrote: I indeed used is.na() to check length, as I was not sure weather lenght() was a simple query or would go through the whole vector to count the elements. length() is a simple query, and is very fast. The other problem in your approach (which may not be a problem with your current data) is that NA is commonly used as an element of a vector to represent a missing value. So to sum up, function calls are expensive, therefore recursion should be avoided, and growing the size of a vector (which is probably reassigning and copying?) is also expensive. "Avoided" may be too strong: speed isn't always a concern, sometimes clarity is more important. Growing vectors is definitely expensive. Duncan Murdoch Thank you for your help! On 04/19/2016 11:51 PM, Duncan Murdoch wrote: On 19/04/2016 3:39 PM, Gaston wrote: Hello everyone, I am learning R since recently, and as a small exercise I wanted to write a recursive mergesort. I was extremely surprised to discover that my sorting, although operational, is deeply inefficient in time. Here is my code : merge <- function(x,y){ if (is.na(x[1])) return(y) else if (is.na(y[1])) return(x) else if (x[1] I tried my best to write it "the R-way", but apparently I failed. I suppose some of the functions I used are quite heavy. I would be grateful if you could give a hint on how to change that! I hope I made myself clear and wish you a nice day, Your use of is.na() looks strange. I don't understand why you are testing element 2 in mergesort(), and element 1 in merge(), and element 3 in division. Are you using it to test the length? It's better to use the length() function for that. The division() function returns a matrix. It would make more R-sense to return a list containing the two parts, because they might not be the same length. Generally speaking, function calls are expensive in R, so the recursive merge you're using looks like it would be the bottleneck. You'd almost certainly be better off to allocate something of length(x) + length(y), and do the assignments in a loop. Here's a merge sort I wrote as an illustration in a class. It's designed for clarity rather than speed, but I'd guess it would be faster than yours: mergesort <- function(x) { n <- length(x) if (n < 2) return(x) # split x into two pieces of approximately equal size, x1 and x2 x1 <- x[1:(n %/% 2)] x2 <- x[(n %/% 2 + 1):n] # sort each of the pieces x1 <- mergesort(x1) x2 <- mergesort(x2) # merge them back together result <- c() i <- 0 while (length(x1) > 0 && length(x2) > 0) { # compare the first values if (x1[1] < x2[1]) { result[i + 1] <- x1[1] x1 <- x1[-1] } else { result[i + 1] <- x2[1] x2 <- x2[-1] } i <- i + 1 } # put the smaller one into the result # delete it from whichever vector it came from # repeat until one of x1 or x2 is empty # copy both vectors (one is empty!) onto the end of the results result <- c(result, x1, x2) result } If I were going for speed, I wouldn't modify the x1 and x2 vectors, and I'd pre-allocate result to the appropriate length, rather than growing it in the while loop. But that was a different class! Duncan Murdoch __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Data reshaping with conditions
Dear All,

I am trying to reshape the data with some conditions. A small part of the data looks like below; like this there will be more data with repeating ID.

Count  id   name   type
117    335  sally  A
19     335  sally  A
167    335  sally  B
18     340  susan  A
56     340  susan  A
22     340  susan  B
53     340  susan  B
135    351  lee    A
114    351  lee    A
84     351  lee    A
80     351  lee    A
19     351  lee    A
8      351  lee    A
21     351  lee    A
88     351  lee    B
111    351  lee    B
46     351  lee    B
108    351  lee    B

From the above data I am expecting an output like below.

id   name   type  count_of_B     Max of count B  x             y
335  sally  B     167            167             117,19        NA
340  susan  B     22,53          53              18            56
351  lee    B     88,111,46,108  111             84,80,19,8,2  135,114

Where the columns x and y are:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B)

1) I tried dplyr with the following code for the initial step, to get the values for each column.
2) I thought to transpose the columns which have the unique ID alone. I tried the following code and I am stuck at the initial step itself. The code executes, but the higher and lower values of A are not coming out.

Expected_output = data %>%
  group_by(id, Type) %>%
  mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ",")) %>%
  mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == "B"]), max(count[Type == "A"]))) %>%
  mutate(count_type_A_lesser = ifelse(Type=="B", (paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], collapse = ",")), "NA")) %>%
  mutate(count_type_A_higher = ifelse(Type=="B", (paste(unlist(count[Type=="A"]) > Max_of_count_B[Type=="B"], collapse = ",")), "NA"))

I hope I make my point clear. Please bear with the code, as I am new to this.

Regards,

sri

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
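A rough, untested sketch of the kind of grouped summary described above (it assumes the data frame is called data and the columns are exactly count, id, name and type as printed; groups with no qualifying A counts would come back as "" rather than NA):

library(dplyr)

out <- data %>%
  group_by(id, name) %>%
  summarise(
    count_of_B     = paste(count[type == "B"], collapse = ","),
    Max_of_count_B = max(count[type == "B"]),
    x = paste(count[type == "A"][count[type == "A"] < Max_of_count_B], collapse = ","),
    y = paste(count[type == "A"][count[type == "A"] > Max_of_count_B], collapse = ",")
  )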
[R] Use multiple cores on Linux
I am trying to run the following code in R on a Linux cluster. I would like to use the full processing power (specifying cores/nodes/memory). The code essentially runs predictions based on a GAM regression and saves the results as a CSV file for multiple sets of data (here I only show two).

Is it possible to run this code using HPC packages such as Rmpi/snow/doParallel? Thank you!

#
library(data.table)
library(mgcv)
library(reshape2)
library(dplyr)
library(tidyr)
library(lubridate)
library(DataCombine)
#
gam_max_count_wk <- gam(count_pop ~ factor(citycode) + factor(year) + factor(week) + s(lnincome) + s(tmax) + s(hmax), data=cont, na.action="na.omit", method="ML")
#
# Historic
temp_hist <- read.csv("/work/sd00815/giss_historic/giss_temp_hist.csv")
humid_hist <- read.csv("/work/sd00815/giss_historic/giss_hum_hist.csv")
#
temp_hist <- as.data.table(temp_hist)
humid_hist <- as.data.table(humid_hist)
#
# Merge
mykey <- c("FIPS", "year", "month", "week")
setkeyv(temp_hist, mykey)
setkeyv(humid_hist, mykey)
#
hist <- merge(temp_hist, humid_hist, by=mykey)
#
hist$X.x <- NULL
hist$X.y <- NULL
#
# Max
hist_max <- hist
hist_max$FIPS <- hist_max$year <- hist_max$month <- hist_max$tmin <- hist_max$tmean <- hist_max$hmin <- hist_max$hmean <- NULL
#
# Adding Factors
hist_max$citycode <- rep(101,nrow(hist_max))
hist_max$year <- rep(2010,nrow(hist_max))
hist_max$lnincome <- rep(10.262,nrow(hist_max))
#
# Predictions
pred_hist_max <- predict.gam(gam_max_count_wk,hist_max)
#
pred_hist_max <- as.data.table(pred_hist_max)
pred_hist_max <- cbind(hist, pred_hist_max)
pred_hist_max$tmax <- pred_hist_max$tmean <- pred_hist_max$tmin <- pred_hist_max$hmean <- pred_hist_max$hmax <- pred_hist_max$hmin <- NULL
#
# Aggregate by FIPS
max_hist <- pred_hist_max %>%
  group_by(FIPS) %>%
  summarise(pred_hist = mean(pred_hist_max))
#
### Future
## 4.5
# 4.5_2021_2050
temp_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
humid_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey <- c("FIPS", "year", "month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <- pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim_max %>%
  group_by(FIPS) %>%
  summarise(pred_sim = mean(pred_sim_max))
#
# Merge with Historical Data
max_hist$FIPS <- as.factor(max_hist$FIPS)
max_sim$FIPS <- as.factor(max_sim$FIPS)
#
mykey1 <- c("FIPS")
setkeyv(max_hist, mykey1)
setkeyv(max_sim, mykey1)
max_change <- merge(max_hist, max_sim, by=mykey1)
max_change$change <- ((max_change$pred_sim-max_change$pred_hist)/max_change$pred_hist)*100
#
write.csv(max_change, file = "/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv")

# 4.5_2081_2100
temp_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
humid_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey <- c("FIPS", "year", "month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <- pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim
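The scenario blocks above (historic, 4.5_2021_2050, 4.5_2081_2100, ...) are independent of each other once the GAM is fitted, so they are a natural unit to parallelise. A hedged sketch using the parallel package; the helper run_scenario() and the simplified file names in the scenarios list are hypothetical placeholders for the repeated block, not code from the original script:

library(parallel)

## one self-contained worker per scenario; its body would be the repeated
## read/merge/predict/write block from the script above
run_scenario <- function(paths, gam_fit) {
  temp_sim  <- read.csv(paths$temp)
  humid_sim <- read.csv(paths$humid)
  ## ... merge, add factors, predict.gam(gam_fit, ...), aggregate by FIPS ...
  ## write.csv(result, paths$out)
  invisible(paths$out)
}

scenarios <- list(
  list(temp = "giss_4.5_2021_2050_temp.csv", humid = "giss_4.5_2021_2050_hum.csv",
       out  = "giss_4.5_2021_2050.csv"),
  list(temp = "giss_4.5_2081_2100_temp.csv", humid = "giss_4.5_2081_2100_hum.csv",
       out  = "giss_4.5_2081_2100.csv")
)

## fork one worker per scenario (Linux); the core count could come from the scheduler
mclapply(scenarios, run_scenario, gam_fit = gam_max_count_wk,
         mc.cores = min(length(scenarios), detectCores()))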
Re: [R] Add a vertical arrow to a time series graph using ggplot and xts
Please see updates to the df2 assignment as shown below.

library(xts)      # primary
#library(tseries) # Unit root tests
library(ggplot2)
library(vars)
library(grid)

dt_xts <- xts(x = 1:10, order.by = seq(as.Date("2016-01-01"), as.Date("2016-01-10"), by = "1 day"))
colnames(dt_xts) <- "gdp"
xmin <- min(index(dt_xts))
xmax <- max(index(dt_xts))
df1 <- data.frame(x = index(dt_xts), coredata(dt_xts))
p <- ggplot(data = df1, mapping = aes(x=x, y=gdp)) + geom_line()
rg <- ggplot_build(p)$panel$ranges[[1]]$y.range
y1 <- rg[1]
y2 <- rg[2]
# x = as.Date(..) in place of x = "2016-01-05"
df2 <- data.frame(x = as.Date("2016-01-05"), y1=y1, y2=y2)
p1 <- p + geom_segment(mapping=aes(x=x, y=y1, xend=x, yend=y2), data=df2, arrow=arrow())

--
Best,
GG

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving sparse, singular systems of equations
This is not a solution, but your lsfit attempt

  #Error in lsfit(A, b) : only 3 cases, but 4 variables
  lsfit(A,b)

gave that error because lsfit adds a column of 1's to its first argument unless you use intercept=FALSE. Then it will give you an answer (but I think it converts your sparse matrix into a dense one before doing any linear algebra).

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Apr 20, 2016 at 4:22 AM, A A via R-help wrote:
>
> I have a situation in R where I would like to find any x (if one exists)
> that solves the linear system of equations Ax = b, where A is square,
> sparse, and singular, and b is a vector. Here is some code that mimics my
> issue with a relatively simple A and b, along with three other methods of
> solving this system that I found online, two of which give me an error and
> one of which succeeds on the simplified problem, but fails on my data
> set (attached). Is there a solver in R that I can use in order to get x
> without any errors given the structure of A? Thanks for your time.
>
> #CODE STARTS HERE
> A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
> b = matrix(c(-30,40,-10),nrow=3,ncol=1)
>
> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
>
> #one x that happens to solve Ax = b
> x = matrix(c(-10,10,0),nrow=3,ncol=1)
> A %*% x
>
> #Error in lsfit(A, b) : only 3 cases, but 4 variables
> lsfit(A,b)
> #solves the system, but fails below
> solve(qr(A, LAPACK=TRUE),b)
> #Error in qr.solve(A, b) : singular matrix 'a' in solve
> qr.solve(A,b)
>
> #matrices used in my actual problem (see attached files)
> A = readMM("A.txt")
> b = readMM("b.txt")
>
> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
> solve(qr(A, LAPACK=TRUE),b)
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
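To make Bill's point concrete, a minimal sketch with the toy system from the post (the NA handling assumes lsfit drops aliased columns the way lm does on a rank-deficient matrix, so treat this as illustration only):

A <- matrix(c(1.5,-1.5,0, -1.5,2.5,-1, 0,-1,1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

fit  <- lsfit(A, b, intercept = FALSE)  # no added column of 1's
coef <- fit$coefficients
coef[is.na(coef)] <- 0                  # columns dropped as aliased contribute nothing
A %*% coef - b                          # residuals: near zero when the system is consistent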
Re: [R] Solving sparse, singular systems of equations
This is kind of like asking for a solution to x+1=x+1. Go back to linear algebra and look up Singular Value Decomposition, and decide if you really want to proceed. See also ?svd and package irlba. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 4:22:34 AM PDT, A A via R-help wrote: > > > >I have a situation in R where I would like to find any x (if one >exists) that solves the linear system of equations Ax = b, where A is >square, sparse, and singular, and b is a vector. Here is some code that >mimics my issue with a relatively simple A and b, along with three >other methods of solving this system that I found online, two of which >give me an error and one of which succeeds on the simplified problem, >but fails on my data set(attached). Is there a solver in R that I can >use in order to get x without any errors given the structure of A? >Thanks for your time. >#CODE STARTS HEREA = >as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b >= matrix(c(-30,40,-10),nrow=3,ncol=1) >#solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or >out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) >#one x that happens to solve Ax = bx = >matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x >#Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves >the system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in >qr.solve(A, b) : singular matrix 'a' in solveqr.solve(A,b) >#matrices used in my actual problem (see attached files)A = >readMM("A.txt")b = readMM("b.txt") >#Error in as(x, "matrix")[i, , drop = drop] : subscript out of >boundssolve(qr(A, LAPACK=TRUE),b) > > > > > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
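Jeff's SVD pointer can be made concrete: the pseudoinverse built from the decomposition gives the minimum-norm least-squares solution, which solves Ax = b exactly whenever the singular system is consistent. A small dense sketch with the toy matrix from the post (for the large sparse case the same idea would apply with a truncated SVD, e.g. from irlba):

A <- matrix(c(1.5,-1.5,0, -1.5,2.5,-1, 0,-1,1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

s   <- svd(A)
tol <- max(dim(A)) * max(s$d) * .Machine$double.eps
pos <- s$d > tol                     # keep only the numerically nonzero singular values
x   <- s$v[, pos, drop = FALSE] %*% ((t(s$u[, pos, drop = FALSE]) %*% b) / s$d[pos])
A %*% x - b                          # ~ 0, so x is one (minimum-norm) solution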
[R] Reading Multiple Output Variables
Hi all,

I am trying to read multiple output variables for a sensitivity analysis.

Currently I am using one output value as follows:

Y <- (E1)

However I need to run the analysis against 12 values of Y, so E1-E12.

My matrix will be: inputs are columns = 4, rows = 40, i.e. 40 rows of 4 input variables in different combinations. These will be analysed against 40 rows of output variables in 12 columns, e.g.

     V1 V2 V3 V4   E1 E2 E3 E4 ... E12
1
2
...
40

Can someone provide guidance on how I can plot against all 12 months?

Thanks

Jody

This message is intended solely for the addressee and may contain confidential and/or legally privileged information. Any use, disclosure or reproduction without the sender's explicit consent is unauthorised and may be unlawful. If you have received this message in error, please notify Northumbria University immediately and permanently delete it. Any views or opinions expressed in this message are solely those of the author and do not necessarily represent those of the University. The University cannot guarantee that this message or any attachment is virus free or has not been intercepted and/or amended.

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use multiple cores on Linux
The answer to your question is yes. You might consider using the parallel package., and I would suggest starting with a simpler test case to learn how it works and incrementally adding complexity of packages and data handling. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 7:37:07 AM PDT, Miluji Sb wrote: >I am trying to run the following code in R on a Linux cluster. I would >like >to use the full processing power (specifying cores/nodes/memory). The >code >essentially runs predictions based on a GAM regression and saves the >results as a CSV file for multiple sets of data (here I only show two). > >Is it possible to run this code using HPC packages such as >Rmpi/snow/doParallel? Thank you! > ># >library(data.table) >library(mgcv) >library(reshape2) >library(dplyr) >library(tidyr) >library(lubridate) >library(DataCombine) ># >gam_max_count_wk <- gam(count_pop ~ factor(citycode) + factor(year) + >factor(week) + s(lnincome) + s(tmax) + >s(hmax),data=cont,na.action="na.omit", method="ML") > ># ># Historic >temp_hist <- read.csv("/work/sd00815/giss_historic/giss_temp_hist.csv") >humid_hist <- read.csv("/work/sd00815/giss_historic/giss_hum_hist.csv") ># >temp_hist <- as.data.table(temp_hist) >humid_hist <- as.data.table(humid_hist) ># ># Merge >mykey<- c("FIPS", "year","month", "week") >setkeyv(temp_hist, mykey) >setkeyv(humid_hist, mykey) ># >hist<- merge(temp_hist, humid_hist, by=mykey) ># >hist$X.x <- NULL >hist$X.y <- NULL ># ># Max >hist_max <- hist >hist_max$FIPS <- hist_max$year <- hist_max$month <- hist_max$tmin <- >hist_max$tmean <- hist_max$hmin <- hist_max$hmean <- NULL ># ># Adding Factors >hist_max$citycode <- rep(101,nrow(hist_max)) >hist_max$year <- rep(2010,nrow(hist_max)) >hist_max$lnincome <- rep(10.262,nrow(hist_max)) ># ># Predictions >pred_hist_max <- predict.gam(gam_max_count_wk,hist_max) ># >pred_hist_max <- as.data.table(pred_hist_max) >pred_hist_max <- cbind(hist, pred_hist_max) >pred_hist_max$tmax <- pred_hist_max$tmean <- pred_hist_max$tmin <- >pred_hist_max$hmean <- pred_hist_max$hmax <- pred_hist_max$hmin <- NULL ># ># Aggregate by FIPS >max_hist <- pred_hist_max %>% > group_by(FIPS) %>% > summarise(pred_hist = mean(pred_hist_max)) ># >### Future >## 4.5 ># 4.5_2021_2050 >temp_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv") >humid_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv") ># ># Max >temp_sim <- as.data.table(temp_sim) >setnames(temp_sim, "max", "tmax") >setnames(temp_sim, "min", "tmin") >setnames(temp_sim, "avg", "tmean") ># >humid_sim <- as.data.table(humid_sim) >setnames(humid_sim, "max", "hmax") >setnames(humid_sim, "min", "hmin") >setnames(humid_sim, "avg", "hmean") ># >temp_sim$X <- NULL >humid_sim$X <- NULL ># ># Merge >mykey<- c("FIPS", "year","month", "week") >setkeyv(temp_sim, mykey) >setkeyv(humid_sim, mykey) ># >sim <- merge(temp_sim, humid_sim, by=mykey) ># >sim_max <- sim ># >sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- >sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL ># ># Adding Factors >sim_max$citycode <- rep(101,nrow(sim_max)) >sim_max$year <- rep(2010,nrow(sim_max)) >sim_max$week <- rep(1,nrow(sim_max)) >sim_max$lnincome <- rep(10.262,nrow(sim_max)) ># ># Predictions >pred_sim_max <- predict.gam(gam_max_count_wk,sim_max) ># >pred_sim_max <- as.data.table(pred_sim_max) >pred_sim_max <- cbind(sim, pred_sim_max) >pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <- >pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL 
># ># Aggregate by FIPS >max_sim <- pred_sim_max %>% > group_by(FIPS) %>% > summarise(pred_sim = mean(pred_sim_max)) ># ># Merge with Historical Data >max_hist$FIPS <- as.factor(max_hist$FIPS) >max_sim$FIPS <- as.factor(max_sim$FIPS) ># >mykey1<- c("FIPS") >setkeyv(max_hist, mykey1) >setkeyv(max_sim, mykey1) >max_change <- merge(max_hist, max_sim, by=mykey1) >max_change$change <- >((max_change$pred_sim-max_change$pred_hist)/max_change$pred_hist)*100 ># >write.csv(max_change, file = >"/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv") > > > ># 4.5_2081_2100 >temp_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv") >humid_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv") ># ># Max >temp_sim <- as.data.table(temp_sim) >setnames(temp_sim, "max", "tmax") >setnames(temp_sim, "min", "tmin") >setnames(temp_sim, "avg", "tmean") ># >humid_sim <- as.data.table(humid_sim) >setnames(humid_sim, "max", "hmax") >setnames(humid_sim, "min", "hmin") >setnames(humid_sim, "avg", "hmean") ># >temp_sim$X <- NULL >humid_sim$X <- NULL ># ># Merge >mykey<- c("FIPS", "year","month", "week") >setkeyv(temp_sim, mykey) >setkeyv(humid_sim, mykey) ># >sim <- merge(temp_sim, humid_sim, by=mykey) ># >sim_max <- sim ># >sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- >sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL ># ># Adding Facto
Re: [R] Reading Multiple Output Variables
The word "analysis" is too vague. If you are referring to lm regression, you can specify Y as a matrix instead of a vector. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example Also, please disable HTML in your email when sending to this list, since it will usually come through to us in damaged form. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 8:19:48 AM PDT, "jody.kelly" wrote: > >Hi all, > > >I am trying to read multiple out variables for a sensitivity analysis. > > >Currently using one output value as follows: > > >Y<-(E1) > > >However I need to run analysis against 12 values of Y. So E1-E12. > > >My matrix will be: Inputs are Column=4, Rows = 40 i.e. 40 rows of 4 >input variables in different combinations. These will be analysed >against 40 rows of output variables for 12 columns. > > >e.g. > > > V1 V2 V3 V4E1 E2 E3 E4 ... E12 > >1 > >2 > >... > >40 > > >Can someone provide guidance on How I can plot against all 12 months? > > >Thanks > > >Jody > > >This message is intended solely for the addressee and may contain >confidential and/or legally privileged information. Any use, disclosure >or reproduction without the sender's explicit consent is unauthorised >and may be unlawful. If you have received this message in error, please >notify Northumbria University immediately and permanently delete it. >Any views or opinions expressed in this message are solely those of the >author and do not necessarily represent those of the University. The >University cannot guarantee that this message or any attachment is >virus free or has not been intercepted and/or amended. > > [[alternative HTML version deleted]] > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
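Jeff's matrix-response suggestion might look like this in practice (the data frame and its column names are invented for illustration, and whether an lm fit is the right "analysis" depends on the sensitivity method being used):

## toy data: 40 runs, 4 inputs V1-V4, 12 outputs E1-E12
set.seed(1)
dat <- as.data.frame(matrix(rnorm(40 * 16), nrow = 40))
names(dat) <- c(paste0("V", 1:4), paste0("E", 1:12))

Y   <- as.matrix(dat[, paste0("E", 1:12)])    # all 12 outputs at once
fit <- lm(Y ~ V1 + V2 + V3 + V4, data = dat)  # one multivariate lm: 12 sets of coefficients
coef(fit)                                     # 5 x 12 matrix (intercept + 4 inputs, per output)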
Re: [R] installation of dplyr
Increasing memory resolved the issue for me. Thanks again, Ben > On Apr 19, 2016, at 4:10 PM, Hadley Wickham wrote: > > You normally see these errors when compiling on a vm that has very > little memory. > Hadley > > On Tue, Apr 19, 2016 at 2:47 PM, Ben Tupper wrote: >> Hello, >> >> I am getting a fresh CentOS 6.7 machine set up with all of the goodies for R >> 3.2.3, including dplyr package. I am unable to successfully install it. >> Below I show the failed installation using utils::install.packages() and >> then again using devtools::install_github(). Each yields an error similar >> to the other but not quite exactly the same - the error messages sail right >> over my head. >> >> I can contact the package author if that would be better, but thought it >> best to start here. >> >> Thanks! >> Ben >> >> Ben Tupper >> Bigelow Laboratory for Ocean Sciences >> 60 Bigelow Drive, P.O. Box 380 >> East Boothbay, Maine 04544 >> http://www.bigelow.org >> >>> sessionInfo() >> R version 3.2.3 (2015-12-10) >> Platform: x86_64-redhat-linux-gnu (64-bit) >> Running under: CentOS release 6.7 (Final) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> >> >> >> # utils::install.packages() >> >> >>> install.packages("dplyr", repo = "http://cran.r-project.org";) >> Installing package into ‘/usr/lib64/R/library’ >> (as ‘lib’ is unspecified) >> trying URL 'http://cran.r-project.org/src/contrib/dplyr_0.4.3.tar.gz' >> Content type 'application/x-gzip' length 655997 bytes (640 KB) >> == >> downloaded 640 KB >> >> * installing *source* package ‘dplyr’ ... 
>> ** package ‘dplyr’ successfully unpacked and MD5 sums checked >> ** libs >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c RcppExports.cpp -o >> RcppExports.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c address.cpp -o address.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c api.cpp -o api.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c arrange.cpp -o arrange.o >> In file included from ../inst/include/dplyr.h:131, >> from arrange.cpp:1: >> ../inst/include/dplyr/DataFrameSubsetVisitors.h: In constructor >> ‘dplyr::DataFrameSubsetVisitors::DataFrameSubsetVisitors(const >> Rcpp::DataFrame&, const Rcpp::CharacterVector&)’: >> ../inst/include/dplyr/DataFrameSubsetVisitors.h:40: warning: ‘column’ may be >> used uninitialized in this function >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c between.cpp -o between.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c bind.cpp -o bind.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c combine_variables.cpp -o >> combine_variables.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR
Re: [R] Solving sparse, singular systems of equations
> On 20 Apr 2016, at 13:22, A A via R-help wrote:
>
> I have a situation in R where I would like to find any x (if one exists) that
> solves the linear system of equations Ax = b, where A is square, sparse, and
> singular, and b is a vector. Here is some code that mimics my issue with a
> relatively simple A and b, along with three other methods of solving this
> system that I found online, two of which give me an error and one of which
> succeeds on the simplified problem, but fails on my data set (attached). Is
> there a solver in R that I can use in order to get x without any errors given
> the structure of A? Thanks for your time.
>
> #CODE STARTS HERE
> A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
> b = matrix(c(-30,40,-10),nrow=3,ncol=1)
>
> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
>
> #one x that happens to solve Ax = b
> x = matrix(c(-10,10,0),nrow=3,ncol=1)
> A %*% x
>
> #Error in lsfit(A, b) : only 3 cases, but 4 variables
> lsfit(A,b)
> #solves the system, but fails below
> solve(qr(A, LAPACK=TRUE),b)
> #Error in qr.solve(A, b) : singular matrix 'a' in solve
> qr.solve(A,b)
>
> #matrices used in my actual problem (see attached files)
> A = readMM("A.txt")
> b = readMM("b.txt")
>
> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
> solve(qr(A, LAPACK=TRUE),b)

Your code is a mess.

A singular square system of linear equations has an infinity of solutions if a solution exists at all. How that works you can find here: https://en.wikipedia.org/wiki/System_of_linear_equations in the section "Matrix solutions".

For your simple example you can do it like this:

library(MASS)

Ag <- ginv(A)    # pseudoinverse
xb <- Ag %*% b   # minimum norm solution

Aw <- diag(nrow=nrow(Ag)) - Ag %*% A  # see the Wikipedia page
w <- runif(3)
z <- xb + Aw %*% w
A %*% z - b

N <- Null(t(A))  # null space of A; see the help for Null in package MASS
A %*% N
A %*% (xb + 2 * N) - b

For sparse systems you will have to approach this differently; I have no experience with that.

Berend

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving sparse, singular systems of equations
Thanks for the advice. I fixed the function and ran it on my systems just to see if it would work; for the first set of A and b, I got a valid solution, but for the second set, I got the error "Error in complete.cases(x, y, wt) : not all arguments have the same length". On Wednesday, April 20, 2016 10:59 AM, William Dunlap wrote: This is not a solution but your lsfit attempt #Error in lsfit(A, b) : only 3 cases, but 4 variables lsfit(A,b)gave that error because lsfit adds a column of 1 toits first argument unless you use intercept=FALSE.Then it will give you an answer (but I think it convertsyour sparse matrix into a dense one before doingany linear algebra). Bill Dunlap TIBCO Software wdunlap tibco.com On Wed, Apr 20, 2016 at 4:22 AM, A A via R-help wrote: I have a situation in R where I would like to find any x (if one exists) that solves the linear system of equations Ax = b, where A is square, sparse, and singular, and b is a vector. Here is some code that mimics my issue with a relatively simple A and b, along with three other methods of solving this system that I found online, two of which give me an error and one of which succeeds on the simplified problem, but fails on my data set(attached). Is there a solver in R that I can use in order to get x without any errors given the structure of A? Thanks for your time. #CODE STARTS HEREA = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b = matrix(c(-30,40,-10),nrow=3,ncol=1) #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) #one x that happens to solve Ax = bx = matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x #Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in solveqr.solve(A,b) #matrices used in my actual problem (see attached files)A = readMM("A.txt")b = readMM("b.txt") #Error in as(x, "matrix")[i, , drop = drop] : subscript out of boundssolve(qr(A, LAPACK=TRUE),b) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] installation problem on Ubuntu
I needed to update R so I could install ggplot. I am running Ubuntu 12.04. I cannot upgrade Ubuntu because I am using a work computer. I tried upgrading the normal way:

sudo apt-get update
sudo apt-get install r-base r-base-dev

But this only installed an earlier version. Finally I tried installing from source (./configure, make install). This worked. However, when I try to install packages, I get this error:

Error in download.file(url, destfile = f, quiet = TRUE) :
  internet routines cannot be loaded
In addition: Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
  unable to load shared object '/usr/local/lib/R/modules//internet.so':
  /usr/local/lib/R/modules//internet.so: undefined symbol: curl_multi_wait

>> ls /usr/local/lib/R/modules/
>> R_X11.so  R_de.so  internet.so  lapack.so

Thanks!

P

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Merging Data Sets with Full Outer Join
Hi All,

I would like to match some datasets. Both deliver variables AND cases which might or might not be present in all datasets. This sequence

Kunden <- Kunden_2011
Kunden <- merge(Kunden, Kunden_2012, by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2013, by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2014, by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2015, by.x = "Debitor", by.y = "Debitor")

delivers too few cases, so I guess it does an equi-join.

How can I join the datasets and keep the variables as well as the cases?

I am looking forward to your reply.

Kind regards

Georg

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] project test data into principal components of training dataset
For the records, a slightly hacky answer, by modifying the ggbiplot function, is provided now here: http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot On 18/04/16 17:20, olsen wrote: > Hi there, > > I've a training dataset and a test dataset. My aim is to visually > allocate the test data within the calibrated space reassembled by the > PC's of the training data set, furthermore to keep the training data set > coordinates fixed, so they can serve as ruler for measurement for > additional test datasets coming up. > > Please find a minimum working example using the wine dataset below. > Ideally I would like to use ggbiplot as it comes with the elegant > features but it only accepts objects of class prcomp, princomp, PCA, or > lda, which is not fullfilled by the predicted test data. > > I'm still slightly wet behind my R ears and the only solution I can > think of is to plot the calibrated space in ggbiplot and the training > data in ggplot and then join them, in the worst case by exporting them > as svg and importing them in inkscape. Which is slightly complicated > plus the scaling is different. > > Any indication how this mission can be accomplished very welcome! > > Thanks and greets > Olsen > > I started a threat on stackoverflow on that issue but know relevant > indications so far. > http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot > > ##MWE > library(ggbiplot) > data(wine) > > ##pca on the wine dataset used as training data > wine.pca <- prcomp(wine, center = TRUE, scale. = TRUE) > > wine$class <- wine.class > > ##simulate test data by generating three new wine classes > wine.new.1 <- wine[c(sample(1:nrow(wine), 25)),] > wine.new.2 <- wine[c(sample(1:nrow(wine), 43)),] > wine.new.3 <- wine[c(sample(1:nrow(wine), 36)),] > > ##Predict PCs for the new classes by transforming > #them using the predict.prcomp function > pred.new.1 <- predict(wine.pca, newdata = wine.new.1) > pred.new.2 <- predict(wine.pca, newdata = wine.new.2) > pred.new.3 <- predict(wine.pca, newdata = wine.new.3) > > #simulate the classes for the new sorts > wine.new.1$class <- rep("new.wine.1", nrow(wine.new.1)) > wine.new.2$class <- rep("new.wine.2", nrow(wine.new.2)) > wine.new.3$class <- rep("new.wine.3", nrow(wine.new.3)) > wine.new.bind <- rbind(wine.new.1, wine.new.2, wine.new.3) > > ##compose the plot by joining the PCA ggbiplot training data with the > testing data from ggplot > #plot the calibrated space resulting from the test data > g.train <- ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups = > wine$class, ellipse = TRUE, circle = TRUE) > g.train > #plot the test data resulting from the prediction > df.pred = data.frame(PC1 = wine.new.bind[,1], PC2 = wine.new.bind[,2], > PC3 = wine.new.bind[,3], PC4 = wine.new.bind[,4], > classes = wine.new.bind$class) > g.test <- ggplot(df.pred, aes(PC1, PC2, color = classes, shape = > classes)) + geom_point() + stat_ellipse() > g.test > > > > > -- Our solar system is the cream of the crop http://hasa-labs.org __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Splitting Numerical Vector Into Chunks
Greetings!

I have several large data sets of animal movements. Their pauses (zero magnitude vectors) are of particular interest, in addition to the speed distributions that precede the periods of rest. Here is an example of the kind of data I am interested in analyzing:

x <- abs(c(rnorm(2), replicate(3,0), rnorm(4), replicate(5,0), rnorm(6), replicate(7,0)))
length(x)

This example has 27 elements with strings of zeroes (pauses) situated among the speed values.

Is there a way to split the vector into zero and nonzero chunks and store them in a form where they can be analyzed? I have tried various forms of split() to no avail.

Thank you!

Salvatore A. Sidoti

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Parsing and counting expressions in .txt-files
Dear Community,

I hope that I have selected the right category, because I am relatively new to the "R" world, and I come with a relatively challenging problem.

I would like R to read text files (there are several hundred of them in my folder) sequentially and screen each one for specific terms. If a term is found, the program should write a 1, otherwise a 0. Another task is to scrape a ten-digit number from the file after a particular keyword, so that I can map the results. Ideally the program should create a .txt file.

A brief example:

Keywords: "surpassed", "achieved", "very motivated"

Text1: "Personnel number: 0123456789 The employee has exceeded the set targets and was also otherwise always motivated (...)"

So for this case I want my program ideally to produce the following (in rows and columns):

Personnel number;surpassed;achieved;very motivated   (do not write)
0123456789;1;0;1

For the following files it should continue analogously in lines 2, 3, 4 and so on.

Could you give a brief assessment of how to realize such a thing? How do I best start, and have you perhaps stumbled over something similar in R before? I am grateful for any suggestions/proposals.

Thank you in advance,

Alex

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
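One possible starting point could look roughly like this (the folder path and the output file name are made-up placeholders; the keyword list and the "Personnel number" pattern are taken from the example above; untested on real files):

keywords <- c("surpassed", "achieved", "very motivated")
files    <- list.files("path/to/folder", pattern = "\\.txt$", full.names = TRUE)

read_one <- function(f) {
  txt  <- paste(readLines(f, warn = FALSE), collapse = " ")
  m    <- regmatches(txt, regexpr("Personnel number:\\s*[0-9]{10}", txt))
  id   <- if (length(m)) gsub("[^0-9]", "", m) else NA_character_
  hits <- as.integer(sapply(keywords, grepl, x = txt, fixed = TRUE))  # 1/0 per keyword
  data.frame(personnel_number = id, t(hits), stringsAsFactors = FALSE)
}

res <- do.call(rbind, lapply(files, read_one))
names(res) <- c("personnel_number", keywords)
write.table(res, "results.txt", sep = ";", row.names = FALSE, quote = FALSE)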
Re: [R] Solving sparse, singular systems of equations
Thanks for the response. Yes, in that situation a solution of x = 1 would be just as good as x = 1000 or any other value of x for me (but in my problem the matrix has nonzero rank, so I can't just randomly choose a vector and have it be a solution). If it helps, what I'm interested in is the R equivalent of x = A\b in MATLAB, for these particular kinds of A matrices. I looked into irlba, and it seems to be able to calculate some of the singular values/vectors for the large dataset without taking too much time. I'll look more into seeing how I can solve the system with it. On Wednesday, April 20, 2016 11:01 AM, Jeff Newmiller wrote: This is kind of like asking for a solution to x+1=x+1. Go back to linear algebra and look up Singular Value Decomposition, and decide if you really want to proceed. See also ?svd and package irlba. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 4:22:34 AM PDT, A A via R-help wrote: I have a situation in R where I would like to find any x (if one exists) that solves the linear system of equations Ax = b, where A is square, sparse, and singular, and b is a vector. Here is some code that mimics my issue with a relatively simple A and b, along with three other methods of solving this system that I found online, two of which give me an error and one of which succeeds on the simplified problem, but fails on my data set(attached). Is there a solver in R that I can use in order to get x without any errors given the structure of A? Thanks for your time. #CODE STARTS HEREA = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b = matrix(c(-30,40,-10),nrow=3,ncol=1) #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) #one x that happens to solve Ax = bx = matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x #Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in solveqr.solve(A,b) #matrices used in my actual problem (see attached files)A = readMM("A.txt")b = readMM("b.txt") #Error in as(x, "matrix")[i, , drop = drop] : subscript out of boundssolve(qr(A, LAPACK=TRUE),b) R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving sparse, singular systems of equations
Thanks for the help. Sorry, I am not sure why it looks like that in the mailing list - it looks much more neat on my end (see attached file). On Wednesday, April 20, 2016 2:01 PM, Berend Hasselman wrote: > On 20 Apr 2016, at 13:22, A A via R-help wrote: > > > > > I have a situation in R where I would like to find any x (if one exists) that > solves the linear system of equations Ax = b, where A is square, sparse, and > singular, and b is a vector. Here is some code that mimics my issue with a > relatively simple A and b, along with three other methods of solving this > system that I found online, two of which give me an error and one of which > succeeds on the simplified problem, but fails on my data set(attached). Is > there a solver in R that I can use in order to get x without any errors given > the structure of A? Thanks for your time. > #CODE STARTS HEREA = > as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b = > matrix(c(-30,40,-10),nrow=3,ncol=1) > #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out > of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) > #one x that happens to solve Ax = bx = matrix(c(-10,10,0),nrow=3,ncol=1)A %*% > x > #Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves the > system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in qr.solve(A, b) : > singular matrix 'a' in solveqr.solve(A,b) > #matrices used in my actual problem (see attached files)A = readMM("A.txt")b > = readMM("b.txt") > #Error in as(x, "matrix")[i, , drop = drop] : subscript out of > boundssolve(qr(A, LAPACK=TRUE),b) Your code is a mess. A singular square system of linear equations has an infinity of solutions if a solution exists at all. How that works you can find here: https://en.wikipedia.org/wiki/System_of_linear_equations in the section "Matrix solutions". For your simple example you can do it like this: library(MASS) Ag <- ginv(A) # pseudoinverse xb <- Ag %*% b # minimum norm solution Aw <- diag(nrow=nrow(Ag)) - Ag %*% A # see the Wikipedia page w <- runif(3) z <- xb + Aw %*% w A %*% z - b N <- Null(t(A)) # null space of A; see the help for Null in package MASS A %*% N A %*% (xb + 2 * N) - b For sparse systems you will have to approach this differently; I have no experience with that. Berend __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merging Data Sets with Full Outer Join
> On Apr 19, 2016, at 11:23 PM, g.maub...@weinwolf.de wrote: > > Hi All, > > I would like to match some datasets. Both deliver variables AND cases > which might or might not be present in all datasets: > > This sequence > > Kunden <- Kunden_2011 > Kunden <- merge(Kunden, Kunden_2012, >by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2013, >by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2014, >by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2015, >by.x = "Debitor", by.y = "Debitor") > > delivers too few cases. So I guess it does an equi-join. You should not be guessing. Read the help page. It calls the default setting a natural join. > > How can I join the datasets and keep the variables as well as the cases? > If you want a full outer join use all=TRUE. This, too, should have been in the ?merge help page. > I am looking forward to your reply. > > Kind regards > > Georg > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merging Data Sets with Full Outer Join
Kunden <- Kunden_2011 Kunden <- merge(Kunden, Kunden_2012, by = "Debitor", all = TRUE) etc. See ?merge for details. Best, Ista On Wed, Apr 20, 2016 at 2:23 AM, wrote: > Hi All, > > I would like to match some datasets. Both deliver variables AND cases > which might or might not be present in all datasets: > > This sequence > > Kunden <- Kunden_2011 > Kunden <- merge(Kunden, Kunden_2012, > by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2013, > by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2014, > by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2015, > by.x = "Debitor", by.y = "Debitor") > > delivers too few cases. So I guess it does an equi-join. > > How can I join the datasets and keep the variables as well as the cases? > > I am looking forward to your reply. > > Kind regards > > Georg > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
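When several yearly tables have to be combined, the same merge() call can be folded over a list with Reduce(). A small sketch with toy stand-ins for the Kunden_20xx data frames (the real ones were not posted):

# Toy stand-ins for the poster's yearly tables; column names are made up.
Kunden_2011 <- data.frame(Debitor = c(1, 2), Umsatz_2011 = c(10, 20))
Kunden_2012 <- data.frame(Debitor = c(2, 3), Umsatz_2012 = c(30, 40))
Kunden_2013 <- data.frame(Debitor = c(1, 3), Umsatz_2013 = c(50, 60))

# Full outer join across all of them in one pass.
Kunden <- Reduce(function(x, y) merge(x, y, by = "Debitor", all = TRUE),
                 list(Kunden_2011, Kunden_2012, Kunden_2013))
Kunden   # every Debitor is kept, with NA where a year has no record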
Re: [R] Splitting Numerical Vector Into Chunks
Perhaps x <- split(x, x == 0) Best, Ista On Wed, Apr 20, 2016 at 9:40 AM, Sidoti, Salvatore A. wrote: > Greetings! > > I have several large data sets of animal movements. Their pauses (zero > magnitude vectors) are of particular interest in addition to the speed > distributions that precede the periods of rest. Here is an example of the > kind of data I am interested in analyzing: > > x <- > abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0))) > length(x) > > This example has 27 elements with strings of zeroes (pauses) situated among > the speed values. > Is there a way to split the vector into zero and nonzero chunks and store > them in a form where they can be analyzed? I have tried various forms of > split() to no avail. > > Thank you! > Salvatore A. Sidoti > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting Numerical Vector Into Chunks
> i <- seq_len(length(x)-1)
> split(x, cumsum(c(TRUE, (x[i]==0) != (x[i+1]==0))))
$`1`
[1] 0.144872972504 0.850797178400

$`2`
[1] 0 0 0

$`3`
[1] 0.199304859380 2.063609410700 0.939393760782 0.838781367540

$`4`
[1] 0 0 0 0 0

$`5`
[1] 0.374688091264 0.488423999452 0.783034615362 0.626990428900 0.138188255307 2.324635712186

$`6`
[1] 0 0 0 0 0 0 0

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Apr 20, 2016 at 12:49 PM, Ista Zahn wrote: > Perhaps > > x <- split(x, x == 0) > > Best, > Ista > > On Wed, Apr 20, 2016 at 9:40 AM, Sidoti, Salvatore A. > wrote: > > Greetings! > > > > I have several large data sets of animal movements. Their pauses (zero > magnitude vectors) are of particular interest in addition to the speed > distributions that precede the periods of rest. Here is an example of the > kind of data I am interested in analyzing: > > > > x <- > abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0))) > > length(x) > > > > This example has 27 elements with strings of zeroes (pauses) situated > among the speed values. > > Is there a way to split the vector into zero and nonzero chunks and > store them in a form where they can be analyzed? I have tried various forms > of split() to no avail. > > > > Thank you! > > Salvatore A. Sidoti > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
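An alternative sketch for the same task: rle() exposes the run structure directly, which may be handy if only the pause durations are needed. The seed and the toy vector are arbitrary stand-ins for the real movement data.

# Same idea via rle(): runs of zeros and non-zeros, plus pause durations.
set.seed(1)
x <- abs(c(rnorm(2), rep(0, 3), rnorm(4), rep(0, 5), rnorm(6), rep(0, 7)))
r <- rle(x == 0)
r$lengths[r$values]                               # lengths of the pauses
chunks <- split(x, rep(seq_along(r$lengths), r$lengths))
chunks                                            # same chunks as the cumsum() trick above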
Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?
On Wed, Apr 20, 2016 at 1:25 AM, Martin Maechler wrote: >> Henrik Bengtsson >> on Tue, 19 Apr 2016 14:04:11 -0700 writes: > > > Using the Matrix package, how can I create a row-oriented sparse > > Matrix from scratch populated with some data? By default a > > column-oriented one is created and I'm aware of the note that the > > package is optimized for column-oriented ones, but I'm only interested > > in using it for holding my sparse row-oriented data and doing basic > > subsetting by rows (even using drop=FALSE). > > > Here is what I get when I set up a column-oriented sparse Matrix: > > >> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE) > >> Cc[1:3,1] <- 1 > > A general ("teaching") remark : > The above use of Matrix() is seen in many places, and is fine > for small matrices and the case where you only use the `[<-` > method very few times (as above). > Also using Matrix() is nice when being introduced to using the > Matrix package. > > However, for efficience in non-small cases, do use > >sparseMatrix() > > directly to construct sparse matrices. > > > >> Cc > > 5 x 5 sparse Matrix of class "dgCMatrix" > > > [1,] 1 . . . . > > [2,] 1 . . . . > > [3,] 1 . . . . > > [4,] . . . . . > > [5,] . . . . . > >> str(Cc) > > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots > > ..@ i : int [1:3] 0 1 2 > > ..@ p : int [1:6] 0 3 3 3 3 3 > > ..@ Dim : int [1:2] 5 5 > > ..@ Dimnames:List of 2 > > .. ..$ : NULL > > .. ..$ : NULL > > ..@ x : num [1:3] 1 1 1 > > ..@ factors : list() > > > When I try to do the analogue for a row-oriented matrix, I get a > > "dgTMatrix", whereas I would expect a "dgRMatrix": > > >> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE) > >> Cr <- as(Cr, "dsRMatrix") > >> Cr[1,1:3] <- 1 > >> Cr > > 5 x 5 sparse Matrix of class "dgTMatrix" > > > [1,] 1 1 1 . . > > [2,] . . . . . > > [3,] . . . . . > > [4,] . . . . . > > [5,] . . . . . > > The reason for the above behavior has been > > a) efficiency. All the subassignment ( `[<-` ) methods for >"RsparseMatrix" objects (of which "dsRMatrix" is a special case) >are implemented via TsparseMatrix. > b) because of the general attitude that Csparse (and Tsparse to >some extent) are well supported in Matrix, >and e.g. further operations on Rsparse matrices would *again* >go via T* or C* sparse ones, I had decided to keep things Tsparse. Thanks, understanding these design decisions is helpful. Particularly, since I consider myself a rookie when it comes to the Matrix package. > > [...] > > > Trying with explicit coercion does not work: > > >> as(Cc, "dgRMatrix") > > Error in as(Cc, "dgRMatrix") : > > no method or default for coercing "dgCMatrix" to "dgRMatrix" > > >> as(Cr, "dgRMatrix") > > Error in as(Cr, "dgRMatrix") : > > no method or default for coercing "dgTMatrix" to "dgRMatrix" > > The general philosophy in 'Matrix' with all the class > hierarchies and the many specific classes has been to allow and > foster coercing to abstract super classes, > i.e, to "dMatrix" or "generalMatrix", "triangularMatrix", or > then "denseMatrix", "sparseMatrix", "CsparseMatrix" or > "RsparseMatrix", etc > > So in the above as(*, "RsparseMatrix") should work always. Thanks for pointing this out (and confirming as I since discovered the virtual RsparseMatrix class in the help). > > > As a summary, in other words, for what you want, > >as(sparseMatrix(.), "RsparseMatrix") > > should give you what you want reliably and efficiently. Perfect. > > > > Am I doing some wrong here? 
Or is this what means that the package is > > optimized for the column-oriented representation and I shouldn't > > really work with row-oriented ones? I'm really only interested in > > access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory > > footprint). > > { though you could equivalently use Cc[,row, drop=FALSE] > with a CsparseMatrix Cc := t(Cr), > couldn't you ? > } Yes, I actually went ahead and did that, but since the code I'm writing supports both plain matrix:es and sparse Matrix:es, and the underlying model operates row-by-row, I figured the code would be more consistent if I could use row-orientation everywhere. Not a big deal. Thanks Martin Henrik > > > Martin Maechler (maintainer of 'Matrix') > ETH Zurich > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
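A small sketch of the recommended route, assuming a reasonably recent Matrix version: build the matrix with sparseMatrix() and coerce to the row-compressed superclass, then subset by row.

library(Matrix)
# Build with sparseMatrix(), then coerce to the RsparseMatrix superclass.
Cr <- as(sparseMatrix(i = c(1, 1, 1), j = 1:3, x = 1, dims = c(5, 5)),
         "RsparseMatrix")
class(Cr)               # "dgRMatrix", the concrete row-oriented class
Cr[1, , drop = FALSE]   # row subsetting, as asked for above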
Re: [R] Solving sparse, singular systems of equations
The usual culprit in messy code is posting in HTML format. That usually leads to stripping of the formatting by the mailing list and a notice that that occurred, but I don't see that warning here. I still think posting plain text format would fix the problem. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 11:51:40 AM PDT, A A via R-help wrote: >Thanks for the help. Sorry, I am not sure why it looks like that in the >mailing list - it looks much more neat on my end (see attached file). > >On Wednesday, April 20, 2016 2:01 PM, Berend Hasselman >wrote: > > > >> On 20 Apr 2016, at 13:22, A A via R-help >wrote: >> >> >> >> >> I have a situation in R where I would like to find any x (if one >exists) that solves the linear system of equations Ax = b, where A is >square, sparse, and singular, and b is a vector. Here is some code that >mimics my issue with a relatively simple A and b, along with three >other methods of solving this system that I found online, two of which >give me an error and one of which succeeds on the simplified problem, >but fails on my data set(attached). Is there a solver in R that I can >use in order to get x without any errors given the structure of A? >Thanks for your time. >> #CODE STARTS HEREA = >as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b >= matrix(c(-30,40,-10),nrow=3,ncol=1) >> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A >(or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) >> #one x that happens to solve Ax = bx = >matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x >> #Error in lsfit(A, b) : only 3 cases, but 4 >variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A, >LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in >solveqr.solve(A,b) >> #matrices used in my actual problem (see attached files)A = >readMM("A.txt")b = readMM("b.txt") >> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of >boundssolve(qr(A, LAPACK=TRUE),b) > >Your code is a mess. > >A singular square system of linear equations has an infinity of >solutions if a solution exists at all. >How that works you can find here: >https://en.wikipedia.org/wiki/System_of_linear_equations >in the section "Matrix solutions". > >For your simple example you can do it like this: > >library(MASS) >Ag <- ginv(A) # pseudoinverse > >xb <- Ag %*% b # minimum norm solution > >Aw <- diag(nrow=nrow(Ag)) - Ag %*% A # see the Wikipedia page >w <- runif(3) >z <- xb + Aw %*% w >A %*% z - b > >N <- Null(t(A)) # null space of A; see the help for Null in package >MASS >A %*% N >A %*% (xb + 2 * N) - b > >For sparse systems you will have to approach this differently; I have >no experience with that. > >Berend > > > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Parsing and counting expressions in .txt-files
I suggest you go through some R tutorials to learn about R's capabilities. Some recommendations can be found here: https://www.rstudio.com/online-learning/#R To answer your specific query: ?scan ## Because you do not specify file format. ?grep ?regexp ## to use regular expressions to find text. R may not be the best tool for this task, however. Or certain R packages may be better than the basic R tools. Try searching on the rseek.org site to see what might be available if you do not receive suggestions here. Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24...@novasbe.pt> wrote: > Dear Community, > > > > I hope that I have the right category selected because I am relatively new > to the "R" world. I come with a relatively challenging problem in the > luggage. I would like to realize, that "R" reads text files (there are > several hundred pieces in my folder) sequentially, and screens for specific > terms. If the term is found, the program should write a 1, if not a 0. > Another task is to scrape a ten-digit number from the file after a > particular keyword, so that I can map the results. The Programm should > create an .txt file ideally. > > > > A brief example: > > > > Keywords: "surpassed" "achieved", "very motivated" > > Text1: > > "Personnel number: 0123456789 > > > > The employee has exceeded the set targets and was also otherwise always > motivated (...) " > > > > So I want that my program for this case, ideally reflects the following (in > lines and columns= > > > > Personell number;surpassed;achieved; very motivated (do not write) > 0123456789;1;0;1 > > > For the following files, he shall all continue analogously in line 2, 3, 4 > and so on. > > > > Could you give a brief assessment, how to realize such a thing? How do I > start best and whether you are possibly "stumbled" in advance about > something similar in R? I am grateful for any suggestions/proposals. > > > > Thank you in advance, > > > > Alex > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
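A rough sketch of how those pieces could fit together for the described task; the folder name, the output file and the exact regular expression for the personnel number are assumptions, not tested against the real files.

# Scan each file for keywords and a ten-digit personnel number (kept as text
# so leading zeros survive). "reports" and "keyword_hits.txt" are placeholders.
keywords <- c("surpassed", "achieved", "very motivated")
files <- list.files("reports", pattern = "\\.txt$", full.names = TRUE)
rows <- lapply(files, function(f) {
  txt <- paste(readLines(f, warn = FALSE), collapse = " ")
  m   <- regmatches(txt, regexpr("Personnel number:\\s*[0-9]{10}", txt))
  id  <- if (length(m)) sub("Personnel number:\\s*", "", m) else NA
  c(PersonnelNumber = id,
    setNames(as.integer(sapply(keywords, grepl, x = txt, fixed = TRUE)), keywords))
})
out <- do.call(rbind, rows)
write.table(out, "keyword_hits.txt", sep = ";", row.names = FALSE, quote = FALSE)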
Re: [R] Parsing and counting expressions in .txt-files
also check out this CRAN task view: https://cran.r-project.org/web/views/NaturalLanguageProcessing.html Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24...@novasbe.pt> wrote: > Dear Community, > > > > I hope that I have the right category selected because I am relatively new > to the "R" world. I come with a relatively challenging problem in the > luggage. I would like to realize, that "R" reads text files (there are > several hundred pieces in my folder) sequentially, and screens for specific > terms. If the term is found, the program should write a 1, if not a 0. > Another task is to scrape a ten-digit number from the file after a > particular keyword, so that I can map the results. The Programm should > create an .txt file ideally. > > > > A brief example: > > > > Keywords: "surpassed" "achieved", "very motivated" > > Text1: > > "Personnel number: 0123456789 > > > > The employee has exceeded the set targets and was also otherwise always > motivated (...) " > > > > So I want that my program for this case, ideally reflects the following (in > lines and columns= > > > > Personell number;surpassed;achieved; very motivated (do not write) > 0123456789;1;0;1 > > > For the following files, he shall all continue analogously in line 2, 3, 4 > and so on. > > > > Could you give a brief assessment, how to realize such a thing? How do I > start best and whether you are possibly "stumbled" in advance about > something similar in R? I am grateful for any suggestions/proposals. > > > > Thank you in advance, > > > > Alex > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] overlay two facet_grid
Hi all, Does anyone know how to overlay two facet_grids? I have two facet grids as following: ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) Thanks for any help! Elahe __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] overlay two facet_grid
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example Overlaying aesthetics is possible. Overlaying graphs is not. Without sample data, concrete examples will be unlikely to appear, so read the above link and pay attention to the dput function. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 3:01:43 PM PDT, "ch.elahe via R-help" wrote: >Hi all, >Does anyone know how to overlay two facet_grids? I have two facet grids >as following: > > >ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) >ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) > >Thanks for any help! >Elahe > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
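A toy illustration of what overlaying aesthetics can look like here: two point layers in a single faceted plot, one with a constant colour and one with colour mapped to TST. The data frame below is a made-up stand-in, since the original df was not posted.

library(ggplot2)
set.seed(42)
# Made-up stand-in for the poster's df (TE, TR, TST plus the faceting factors).
df <- data.frame(TE = runif(40, 0, 200), TR = runif(40), TST = runif(40),
                 FS = rep(c("f1", "f2"), each = 20), TRJ = "t1",
                 OR = rep(c("o1", "o2"), 20), INV = "i1")
ggplot(df, aes(x = TE, y = TR)) +
  geom_point(colour = "orange", size = 3) +   # base layer, fixed colour
  geom_point(aes(colour = TST)) +             # overlaid layer, colour mapped to TST
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both) +
  xlim(0, 200) + ylim(0, 1)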
Re: [R] installation problem on Ubuntu
Have you read the CRAN instructions for installing on Ubuntu? Have you read the Posting Guide that mentions the R-sig-debian mailing list and that if you need help compiling R this is not the right list? -- Sent from my phone. Please excuse my brevity. On April 20, 2016 9:36:51 AM PDT, Paul Tremblay wrote: >I needed to update R so I could install ggplot. I am running Ubuntu >12.04. >I cannot upgrade Ubuntu because I am using a work computer. > >I tried upgrading the normal way: > >sudo apt-get update > sudo apt-get install r-base r-base-dev > >But this only installed an earlier version. Finally I tried installing >from >source (./configure, Make install). This worked. However, when I try to >install packages, I get this error: > >Error in download.file(url, destfile = f, quiet = TRUE) : > internet routines cannot be loaded >In addition: Warning message: >In download.file(url, destfile = f, quiet = TRUE) : > unable to load shared object '/usr/local/lib/R/modules//internet.so': >/usr/local/lib/R/modules//internet.so: undefined symbol: >curl_multi_wait > > >>> ls /usr/local/lib/R/modules/ >>> R_X11.so R_de.so internet.so lapack.so > >Thanks! > >P > > [[alternative HTML version deleted]] > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data reshaping with conditions
Hi sri,
As your problem involves a few logical steps, I found it easier to approach it in a stepwise way. Perhaps there are more elegant ways to accomplish this.

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 svdatstr[row,"maxA"]<-
  max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]])
 svdatstr[row,"maxB"]<-
  max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]])
 svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"]
 svdatstr[row,"y"]<-!svdatstr[row,"x"]
}
svdatstr

You can then just extract the columns that you need.

Jim

On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan wrote: > Dear All, > > I am trying to reshape the data with some conditions. A small part of the > data looks like below. Like this there will be more data with repeating ID. > > Count id name type > 117 335 sally A > 19 335 sally A > 167 335 sally B > 18 340 susan A > 56 340 susan A > 22 340 susan B > 53 340 susan B > 135 351 lee A > 114 351 lee A > 84 351 lee A > 80 351 lee A > 19 351 lee A > 8 351 lee A > 21 351 lee A > 88 351 lee B > 111 351 lee B > 46 351 lee B > 108 351 lee B > > >From the above data I am expecting an output like below. > > id name type count_of_B Max of count B x y > 335 sally B 167 167 117,19 NA > 340 susan B 22,53 53 18 56 > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 > > Where, the column x and column y are: > > x = Count_A_less_than_max of (Count type B) > y = Count_A_higher_than_max of (Count type B). > > *1)* I tried with dplyr with the following code for the initial step to get > the values for each column. > *2)* I thought to transpose the columns which has the unique ID alone. > > I tried with the following code and I am struck with the intial step > itself. The code is executed but higher and lower value of A is not coming. > > Expected_output= data %>% > group_by(id, Type) %>% > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>% > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == > "B"]),max(count[Type == "A"]))) %>% > mutate(count_type_A_lesser = ifelse > (Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], > collapse = ",")), "NA"))%>% > mutate(count_type_A_higher = > ifelse(Type=="B",(paste(unlist(count[Type=="A"]) > > Max_of_count_B[Type=="B"], collapse = ",")), "NA")) > > I hope I make my point clear. Please bare with the code, as I am new to > this. > > Regards, > sri > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] overlay two facet_grid
It sounds like you want to use grid.arrange() from gridExtra: https://cran.r-project.org/web/packages/gridExtra/vignettes/arrangeGrob.html Hope this helps, Ulrik On Thu, 21 Apr 2016 at 00:52 Jeff Newmiller wrote: > > http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example > > Overlaying aesthetics is possible. Overlaying graphs is not. Without > sample data, concrete examples will be unlikely to appear, so read the > above link and pay attention to the dput function. > -- > Sent from my phone. Please excuse my brevity. > > On April 20, 2016 3:01:43 PM PDT, "ch.elahe via R-help" < > r-help@r-project.org> wrote: > >Hi all, > >Does anyone know how to overlay two facet_grids? I have two facet grids > >as following: > > > > > > >ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) > > >ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) > > > >Thanks for any help! > >Elahe > > > >__ > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >https://stat.ethz.ch/mailman/listinfo/r-help > >PLEASE do read the posting guide > >http://www.R-project.org/posting-guide.html > >and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
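And a sketch of the grid.arrange() route suggested above, which places the two faceted plots next to each other rather than on top of each other; df is the same kind of made-up stand-in as before.

library(ggplot2)
library(gridExtra)
set.seed(42)
df <- data.frame(TE = runif(40, 0, 200), TR = runif(40), TST = runif(40),
                 FS = rep(c("f1", "f2"), each = 20), TRJ = "t1",
                 OR = rep(c("o1", "o2"), 20), INV = "i1")
p1 <- ggplot(df, aes(TE, TR)) + geom_point(colour = "orange") +
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both)
p2 <- ggplot(df, aes(TE, TR, colour = TST)) + geom_point() +
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both)
grid.arrange(p1, p2, ncol = 2)   # the two faceted plots side by side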
Re: [R] Data reshaping with conditions
Hi sri,
I think that I see what you mean. Your statements:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B).

I took to mean that you wanted a logical value for x and y. Looking more closely at your initial message, I see that you wanted _all_ values of A with respect to maxB in x and y. The error with maximum values was due to a typo. Perhaps this will do what you want:

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 indicesA<-count_ind[as.logical(match(svdatstr[row,type_ind],"A",0))]
 svdatstr[row,"maxA"]<-max(svdatstr[row,indicesA])
 indicesB<-count_ind[as.logical(match(svdatstr[row,type_ind],"B",0))]
 svdatstr[row,"maxB"]<-max(svdatstr[row,indicesB])
 AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA] < svdatstr[row,"maxB"]]
 svdatstr[row,"x"]<-paste(AltB,collapse=",")
 AgeB<-svdatstr[row,indicesA][svdatstr[row,indicesA] >= svdatstr[row,"maxB"]]
 svdatstr[row,"y"]<-paste(AgeB,collapse=",")
}
svdatstr[,c("id","name","maxB","x","y")]

Jim

On Thu, Apr 21, 2016 at 2:23 PM, sri vathsan wrote: > Hi Jim, > > Thanks for your time. But somehow this code did not help me to achieve my > expected output. > Problems: 1) x, y are coming as logical rather than values as I mentioned in > my post >2) The values that I get for Max A and Max B not correct >3) It looks like a pretty big data, but I just need to > concatenate the values with a comma, the final output will be a character > variable. > > Regards, > Sri > > On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon wrote: >> >> Hi sri, >> As your problem involves a few logical steps, I found it easier to >> approach it in a stepwise way. Perhaps there are more elegant ways to >> accomplish this. >> >> svdat<-read.table(text="Count id name type >> 117 335 sally A >> 19 335 sally A >> 167 335 sally B >> 18 340 susan A >> 56 340 susan A >> 22 340 susan B >> 53 340 susan B >> 135 351 lee A >> 114 351 lee A >> 84 351 lee A >> 80 351 lee A >> 19 351 lee A >> 8 351 lee A >> 21 351 lee A >> 88 351 lee B >> 111 351 lee B >> 46 351 lee B >> 108 351 lee B",header=TRUE) >> # you can also do this with other reshape functions >> library(prettyR) >> svdatstr<-stretch_df(svdat,"id",c("Count","type")) >> count_ind<-grep("Count",names(svdatstr)) >> type_ind<-grep("type",names(svdatstr)) >> svdatstr$maxA<-NA >> svdatstr$maxB<-NA >> svdatstr$x<-NA >> svdatstr$y<-NA >> for(row in 1:nrow(svdatstr)) { >> svdatstr[row,"maxA"]<- >> >> max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]]) >> svdatstr[row,"maxB"]<- >> >> max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]]) >> svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"] >> svdatstr[row,"y"]<-!svdatstr[row,"x"] >> } >> svdatstr >> >> You can then just extract the columns that you need. >> >> Jim >> >> >> On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan wrote: >> > Dear All, >> > >> > I am trying to reshape the data with some conditions. A small part of >> > the >> > data looks like below. Like this there will be more data with repeating >> > ID.
>> > >> > Count id name type >> > 117 335 sally A >> > 19 335 sally A >> > 167 335 sally B >> > 18 340 susan A >> > 56 340 susan A >> > 22 340 susan B >> > 53 340 susan B >> > 135 351 lee A >> > 114 351 lee A >> > 84 351 lee A >> > 80 351 lee A >> > 19 351 lee A >> > 8 351 lee A >> > 21 351 lee A >> > 88 351 lee B >> > 111 351 lee B >> > 46 351 lee B >> > 108 351 lee B >> > >> > >From the above data I am expecting an output like below. >> > >> > id name type count_of_B Max of count B x y >> > 335 sally B 167 167 117,19 NA >> > 340 susan B 22,53 53 18 56 >> > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 >> > >> > Where, the column x and column y are: >> > >> > x = Count_A_less_than_max of (Count type B) >> > y = Count_A_higher_than_max of (Count type B). >> > >> > *1)* I tried with dplyr with the following code for the initial step to >> > get >> > the values for each column. >> > *2)* I thought to transpose the columns which has the unique ID alone. >> > >> > I tried with the following code and I am struck with the intial step >> > itself. The code is executed but higher and lower value of A is not >> > coming. >> > >> > Expected_output= data %>% >> > group_by(id, Type) %>% >> > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = >> > ","))%>% >> > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == >> > "B"]),max(count[Type == "A"]))) %>% >> > mutate(count_type_A_lesser
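A compact alternative sketch with dplyr (the package the original poster was already trying), assuming a reasonably recent dplyr version; later columns inside summarise() can reuse earlier ones, and the empty string produced for sally's y corresponds to the NA in the requested output.

library(dplyr)
# 'svdat' is the small example data frame read in above.
svdat %>%
  group_by(id, name) %>%
  summarise(count_of_B = paste(Count[type == "B"], collapse = ","),
            max_B      = max(Count[type == "B"]),
            x = paste(Count[type == "A"][Count[type == "A"] <  max_B], collapse = ","),
            y = paste(Count[type == "A"][Count[type == "A"] >= max_B], collapse = ","),
            .groups = "drop")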
[R] Mailing List
Dear All, I am using R to do my work and thank you very much for developing, maintaining and making such excellent software available to anyone that is interested enough to ask for it. I have registered at Nabble. I was wondering what the right forum is for me to send my help requests to. I have tried sending to R-help@r-project.org. However, I do receive a kind of warning email stating that my email awaits approval from the moderator since I am a non-member posting to a members-only list. Can anyone please direct me to the right forum for me? My problems range from plotting graphs using R, statistics in R, etc. You may have seen some of my requests these past few days. Thank you for your time. Ogbos __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data reshaping with conditions
Hi Jim, Thanks for your time. But somehow this code did not help me to achieve my expected output. Problems: 1) x, y are coming as logical rather than values as I mentioned in my post 2) The values that I get for Max A and Max B not correct 3) It looks like a pretty big data, but I just need to concatenate the values with a comma, the final output will be a character variable. Regards, Sri On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon wrote: > Hi sri, > As your problem involves a few logical steps, I found it easier to > approach it in a stepwise way. Perhaps there are more elegant ways to > accomplish this. > > svdat<-read.table(text="Count id name type > 117 335 sally A > 19 335 sally A > 167 335 sally B > 18 340 susan A > 56 340 susan A > 22 340 susan B > 53 340 susan B > 135 351 lee A > 114 351 lee A > 84 351 lee A > 80 351 lee A > 19 351 lee A > 8 351 lee A > 21 351 lee A > 88 351 lee B > 111 351 lee B > 46 351 lee B > 108 351 lee B",header=TRUE) > # you can also do this with other reshape functions > library(prettyR) > svdatstr<-stretch_df(svdat,"id",c("Count","type")) > count_ind<-grep("Count",names(svdatstr)) > type_ind<-grep("type",names(svdatstr)) > svdatstr$maxA<-NA > svdatstr$maxB<-NA > svdatstr$x<-NA > svdatstr$y<-NA > for(row in 1:nrow(svdatstr)) { > svdatstr[row,"maxA"]<- > > max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]]) > svdatstr[row,"maxB"]<- > > max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]]) > svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"] > svdatstr[row,"y"]<-!svdatstr[row,"x"] > } > svdatstr > > You can then just extract the columns that you need. > > Jim > > > On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan wrote: > > Dear All, > > > > I am trying to reshape the data with some conditions. A small part of the > > data looks like below. Like this there will be more data with repeating > ID. > > > > Count id name type > > 117 335 sally A > > 19 335 sally A > > 167 335 sally B > > 18 340 susan A > > 56 340 susan A > > 22 340 susan B > > 53 340 susan B > > 135 351 lee A > > 114 351 lee A > > 84 351 lee A > > 80 351 lee A > > 19 351 lee A > > 8 351 lee A > > 21 351 lee A > > 88 351 lee B > > 111 351 lee B > > 46 351 lee B > > 108 351 lee B > > > > >From the above data I am expecting an output like below. > > > > id name type count_of_B Max of count B x y > > 335 sally B 167 167 117,19 NA > > 340 susan B 22,53 53 18 56 > > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 > > > > Where, the column x and column y are: > > > > x = Count_A_less_than_max of (Count type B) > > y = Count_A_higher_than_max of (Count type B). > > > > *1)* I tried with dplyr with the following code for the initial step to > get > > the values for each column. > > *2)* I thought to transpose the columns which has the unique ID alone. > > > > I tried with the following code and I am struck with the intial step > > itself. The code is executed but higher and lower value of A is not > coming. 
> > > > Expected_output= data %>% > > group_by(id, Type) %>% > > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>% > > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == > > "B"]),max(count[Type == "A"]))) %>% > > mutate(count_type_A_lesser = ifelse > > (Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], > > collapse = ",")), "NA"))%>% > > mutate(count_type_A_higher = > > ifelse(Type=="B",(paste(unlist(count[Type=="A"]) > > > Max_of_count_B[Type=="B"], collapse = ",")), "NA")) > > > > I hope I make my point clear. Please bare with the code, as I am new to > > this. > > > > Regards, > > sri > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > -- Regards, Srivathsan.K Phone : 9600165206 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.