Re: [R] Problem with X11
Hello! Today on Debian testing R 3.2.5 was delivered among the updates. The X11 problem is no longer there.

Cheers

Lorenzo

On Tue, Apr 19, 2016 at 02:28:44PM -0400, Tom Wright wrote:

I don't have my debian box available so can't confirm, but I would try

$ apt-get install libpng

On Tue, Apr 19, 2016 at 11:23 AM, Lorenzo Isella wrote:

Dear All,

I have never had this problem before. I run Debian testing on my box and I have recently updated my R environment. Now, see what happens when I try the most trivial of all plots:

plot(seq(22))
Error in (function (display = "", width, height, pointsize, gamma, bg,  :
  X11 module cannot be loaded
In addition: Warning message:
In (function (display = "", width, height, pointsize, gamma, bg,  :
  unable to load shared object '/usr/lib/R/modules//R_X11.so':
  /usr/lib/x86_64-linux-gnu/libpng12.so.0: version `PNG12_0' not found (required by /usr/lib/R/modules//R_X11.so)

and this is my sessionInfo():

sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
 [1] LC_CTYPE=en_GB.utf8        LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8         LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=en_GB.utf8     LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8        LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8  LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Does anybody understand what is going on here?

Regards

Lorenzo

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
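For anyone hitting the same libpng mismatch, a quick sanity check from within R once the updated packages are installed (assuming an interactive session with a display attached; this is just a diagnostic sketch, not part of the fix described above):

capabilities(c("X11", "png"))   # both should be TRUE once the libpng mismatch is gone
X11()                           # should now open a device without the shared-object error
plot(seq(22))
dev.off()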
Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?
> Henrik Bengtsson on Tue, 19 Apr 2016 14:04:11 -0700 writes:

 > Using the Matrix package, how can I create a row-oriented sparse
 > Matrix from scratch populated with some data? By default a
 > column-oriented one is created and I'm aware of the note that the
 > package is optimized for column-oriented ones, but I'm only interested
 > in using it for holding my sparse row-oriented data and doing basic
 > subsetting by rows (even using drop=FALSE).

 > Here is what I get when I set up a column-oriented sparse Matrix:

 >> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
 >> Cc[1:3,1] <- 1

A general ("teaching") remark: the above use of Matrix() is seen in many places, and is fine for small matrices and for the case where you only use the `[<-` method very few times (as above). Using Matrix() is also nice when being introduced to the Matrix package. However, for efficiency in non-small cases, do use sparseMatrix() directly to construct sparse matrices.

 >> Cc
 > 5 x 5 sparse Matrix of class "dgCMatrix"
 > [1,] 1 . . . .
 > [2,] 1 . . . .
 > [3,] 1 . . . .
 > [4,] . . . . .
 > [5,] . . . . .

 >> str(Cc)
 > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
 >   ..@ i       : int [1:3] 0 1 2
 >   ..@ p       : int [1:6] 0 3 3 3 3 3
 >   ..@ Dim     : int [1:2] 5 5
 >   ..@ Dimnames:List of 2
 >   .. ..$ : NULL
 >   .. ..$ : NULL
 >   ..@ x       : num [1:3] 1 1 1
 >   ..@ factors : list()

 > When I try to do the analogue for a row-oriented matrix, I get a
 > "dgTMatrix", whereas I would expect a "dgRMatrix":

 >> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
 >> Cr <- as(Cr, "dsRMatrix")
 >> Cr[1,1:3] <- 1
 >> Cr
 > 5 x 5 sparse Matrix of class "dgTMatrix"
 > [1,] 1 1 1 . .
 > [2,] . . . . .
 > [3,] . . . . .
 > [4,] . . . . .
 > [5,] . . . . .

The reason for the above behavior has been

a) efficiency. All the subassignment ( `[<-` ) methods for "RsparseMatrix" objects (of which "dsRMatrix" is a special case) are implemented via TsparseMatrix.

b) because of the general attitude that Csparse (and Tsparse to some extent) are well supported in Matrix, and e.g. further operations on Rsparse matrices would *again* go via T* or C* sparse ones, I had decided to keep things Tsparse.

[...]

 > Trying with explicit coercion does not work:

 >> as(Cc, "dgRMatrix")
 > Error in as(Cc, "dgRMatrix") :
 >   no method or default for coercing "dgCMatrix" to "dgRMatrix"
 >> as(Cr, "dgRMatrix")
 > Error in as(Cr, "dgRMatrix") :
 >   no method or default for coercing "dgTMatrix" to "dgRMatrix"

The general philosophy in 'Matrix', with all the class hierarchies and the many specific classes, has been to allow and foster coercing to abstract super classes, i.e., to "dMatrix" or "generalMatrix", "triangularMatrix", or then "denseMatrix", "sparseMatrix", "CsparseMatrix" or "RsparseMatrix", etc.

So in the above, as(*, "RsparseMatrix") should always work.

As a summary, in other words, for what you want,

  as(sparseMatrix(.), "RsparseMatrix")

should give you what you want reliably and efficiently.

 > Am I doing something wrong here? Or is this what it means that the package is
 > optimized for the column-oriented representation and I shouldn't
 > really work with row-oriented ones? I'm really only interested in
 > access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory
 > footprint).

{ though you could equivalently use Cc[,row, drop=FALSE] with a CsparseMatrix Cc := t(Cr), couldn't you ? }

Martin Maechler (maintainer of 'Matrix') ETH Zurich

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
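Martin's summary line can be turned into a tiny sketch (the indices and values below are arbitrary illustration data, not from the thread):

library(Matrix)
## construct column-oriented (the well-supported path), then coerce once
Cr <- as(sparseMatrix(i = c(1, 1, 1, 3), j = c(1, 2, 3, 5), x = c(1, 1, 1, 2),
                      dims = c(5, 5)),
         "RsparseMatrix")
class(Cr)               # "dgRMatrix"
Cr[1, , drop = FALSE]   # row subsetting, as in the original question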
Re: [R] Merge sort
I indeed used is.na() to check length, as I was not sure whether length() was a simple query or would go through the whole vector to count the elements.

So to sum up: function calls are expensive, therefore recursion should be avoided, and growing the size of a vector (which is probably reassigning and copying?) is also expensive.

Thank you for your help!

On 04/19/2016 11:51 PM, Duncan Murdoch wrote:

On 19/04/2016 3:39 PM, Gaston wrote:

Hello everyone,

I have recently started learning R, and as a small exercise I wanted to write a recursive mergesort. I was extremely surprised to discover that my sorting, although operational, is deeply inefficient in time. Here is my code:

merge <- function(x,y){
  if (is.na(x[1])) return(y)
  else if (is.na(y[1])) return(x)
  else if (x[1]
  return(cbind(c(x[1],division(x[-c(1,2)])[,1]),c(x[2],division(x[-c(1,2)])[,2])))
}

mergesort <- function(x){
  if (is.na(x[2])) return(x)
  else{
    print(x)
    t=division(x)
    return(merge(mergesort(t[,1]),mergesort(t[,2])))
  }
}

I tried my best to write it "the R-way", but apparently I failed. I suppose some of the functions I used are quite heavy. I would be grateful if you could give a hint on how to change that!

I hope I made myself clear and wish you a nice day,

Your use of is.na() looks strange. I don't understand why you are testing element 2 in mergesort(), and element 1 in merge(), and element 3 in division. Are you using it to test the length? It's better to use the length() function for that.

The division() function returns a matrix. It would make more R-sense to return a list containing the two parts, because they might not be the same length.

Generally speaking, function calls are expensive in R, so the recursive merge you're using looks like it would be the bottleneck. You'd almost certainly be better off to allocate something of length(x) + length(y), and do the assignments in a loop.

Here's a merge sort I wrote as an illustration in a class. It's designed for clarity rather than speed, but I'd guess it would be faster than yours:

mergesort <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  # split x into two pieces of approximately equal size, x1 and x2
  x1 <- x[1:(n %/% 2)]
  x2 <- x[(n %/% 2 + 1):n]
  # sort each of the pieces
  x1 <- mergesort(x1)
  x2 <- mergesort(x2)
  # merge them back together
  result <- c()
  i <- 0
  while (length(x1) > 0 && length(x2) > 0) {
    # compare the first values
    if (x1[1] < x2[1]) {
      result[i + 1] <- x1[1]
      x1 <- x1[-1]
    } else {
      result[i + 1] <- x2[1]
      x2 <- x2[-1]
    }
    i <- i + 1
  }
  # put the smaller one into the result
  # delete it from whichever vector it came from
  # repeat until one of x1 or x2 is empty
  # copy both vectors (one is empty!) onto the end of the results
  result <- c(result, x1, x2)
  result
}

If I were going for speed, I wouldn't modify the x1 and x2 vectors, and I'd pre-allocate result to the appropriate length, rather than growing it in the while loop. But that was a different class!

Duncan Murdoch

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
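As a hedged sketch of the speed-oriented variant Duncan alludes to (result pre-allocated once, x1 and x2 never shrunk; the names merge2/mergesort2 are made up here, this is not code from the thread):

merge2 <- function(x, y) {
  nx <- length(x); ny <- length(y)
  result <- numeric(nx + ny)          # pre-allocate the full result once
  i <- 1; j <- 1
  for (k in seq_len(nx + ny)) {
    # take from x while its head is the smaller one, or when y is exhausted
    if (i <= nx && (j > ny || x[i] <= y[j])) {
      result[k] <- x[i]; i <- i + 1
    } else {
      result[k] <- y[j]; j <- j + 1
    }
  }
  result
}

mergesort2 <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  m <- n %/% 2
  merge2(mergesort2(x[1:m]), mergesort2(x[(m + 1):n]))
}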
Re: [R] Merge sort
On 20/04/2016 7:38 AM, Gaston wrote: I indeed used is.na() to check length, as I was not sure weather lenght() was a simple query or would go through the whole vector to count the elements. length() is a simple query, and is very fast. The other problem in your approach (which may not be a problem with your current data) is that NA is commonly used as an element of a vector to represent a missing value. So to sum up, function calls are expensive, therefore recursion should be avoided, and growing the size of a vector (which is probably reassigning and copying?) is also expensive. "Avoided" may be too strong: speed isn't always a concern, sometimes clarity is more important. Growing vectors is definitely expensive. Duncan Murdoch Thank you for your help! On 04/19/2016 11:51 PM, Duncan Murdoch wrote: On 19/04/2016 3:39 PM, Gaston wrote: Hello everyone, I am learning R since recently, and as a small exercise I wanted to write a recursive mergesort. I was extremely surprised to discover that my sorting, although operational, is deeply inefficient in time. Here is my code : merge <- function(x,y){ if (is.na(x[1])) return(y) else if (is.na(y[1])) return(x) else if (x[1] I tried my best to write it "the R-way", but apparently I failed. I suppose some of the functions I used are quite heavy. I would be grateful if you could give a hint on how to change that! I hope I made myself clear and wish you a nice day, Your use of is.na() looks strange. I don't understand why you are testing element 2 in mergesort(), and element 1 in merge(), and element 3 in division. Are you using it to test the length? It's better to use the length() function for that. The division() function returns a matrix. It would make more R-sense to return a list containing the two parts, because they might not be the same length. Generally speaking, function calls are expensive in R, so the recursive merge you're using looks like it would be the bottleneck. You'd almost certainly be better off to allocate something of length(x) + length(y), and do the assignments in a loop. Here's a merge sort I wrote as an illustration in a class. It's designed for clarity rather than speed, but I'd guess it would be faster than yours: mergesort <- function(x) { n <- length(x) if (n < 2) return(x) # split x into two pieces of approximately equal size, x1 and x2 x1 <- x[1:(n %/% 2)] x2 <- x[(n %/% 2 + 1):n] # sort each of the pieces x1 <- mergesort(x1) x2 <- mergesort(x2) # merge them back together result <- c() i <- 0 while (length(x1) > 0 && length(x2) > 0) { # compare the first values if (x1[1] < x2[1]) { result[i + 1] <- x1[1] x1 <- x1[-1] } else { result[i + 1] <- x2[1] x2 <- x2[-1] } i <- i + 1 } # put the smaller one into the result # delete it from whichever vector it came from # repeat until one of x1 or x2 is empty # copy both vectors (one is empty!) onto the end of the results result <- c(result, x1, x2) result } If I were going for speed, I wouldn't modify the x1 and x2 vectors, and I'd pre-allocate result to the appropriate length, rather than growing it in the while loop. But that was a different class! Duncan Murdoch __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Data reshaping with conditions
Dear All,

I am trying to reshape the data with some conditions. A small part of the data looks like below; like this there will be more data with repeating ID.

Count  id   name   type
117    335  sally  A
19     335  sally  A
167    335  sally  B
18     340  susan  A
56     340  susan  A
22     340  susan  B
53     340  susan  B
135    351  lee    A
114    351  lee    A
84     351  lee    A
80     351  lee    A
19     351  lee    A
8      351  lee    A
21     351  lee    A
88     351  lee    B
111    351  lee    B
46     351  lee    B
108    351  lee    B

From the above data I am expecting an output like below.

id   name   type  count_of_B     Max of count B  x             y
335  sally  B     167            167             117,19        NA
340  susan  B     22,53          53              18            56
351  lee    B     88,111,46,108  111             84,80,19,8,2  135,114

Where the columns x and y are:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B)

1) I tried dplyr with the following code for the initial step, to get the values for each column.
2) I thought to transpose the columns which have the unique ID alone. I tried the following code and I am stuck at the initial step itself. The code executes, but the higher and lower values of A are not coming out.

Expected_output = data %>%
  group_by(id, Type) %>%
  mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ",")) %>%
  mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == "B"]), max(count[Type == "A"]))) %>%
  mutate(count_type_A_lesser = ifelse(Type=="B", (paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], collapse = ",")), "NA")) %>%
  mutate(count_type_A_higher = ifelse(Type=="B", (paste(unlist(count[Type=="A"]) > Max_of_count_B[Type=="B"], collapse = ",")), "NA"))

I hope I make my point clear. Please bear with the code, as I am new to this.

Regards,

sri

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
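A rough, untested sketch of the kind of grouped summary described above (it assumes the data frame is called data and the columns are exactly count, id, name and type as printed; groups with no qualifying A counts would come back as "" rather than NA):

library(dplyr)

out <- data %>%
  group_by(id, name) %>%
  summarise(
    count_of_B     = paste(count[type == "B"], collapse = ","),
    Max_of_count_B = max(count[type == "B"]),
    x = paste(count[type == "A"][count[type == "A"] < Max_of_count_B], collapse = ","),
    y = paste(count[type == "A"][count[type == "A"] > Max_of_count_B], collapse = ",")
  )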
[R] Use multiple cores on Linux
I am trying to run the following code in R on a Linux cluster. I would like to use the full processing power (specifying cores/nodes/memory). The code essentially runs predictions based on a GAM regression and saves the results as a CSV file for multiple sets of data (here I only show two).

Is it possible to run this code using HPC packages such as Rmpi/snow/doParallel? Thank you!

#
library(data.table)
library(mgcv)
library(reshape2)
library(dplyr)
library(tidyr)
library(lubridate)
library(DataCombine)
#
gam_max_count_wk <- gam(count_pop ~ factor(citycode) + factor(year) + factor(week) + s(lnincome) + s(tmax) + s(hmax), data=cont, na.action="na.omit", method="ML")
#
# Historic
temp_hist <- read.csv("/work/sd00815/giss_historic/giss_temp_hist.csv")
humid_hist <- read.csv("/work/sd00815/giss_historic/giss_hum_hist.csv")
#
temp_hist <- as.data.table(temp_hist)
humid_hist <- as.data.table(humid_hist)
#
# Merge
mykey <- c("FIPS", "year", "month", "week")
setkeyv(temp_hist, mykey)
setkeyv(humid_hist, mykey)
#
hist <- merge(temp_hist, humid_hist, by=mykey)
#
hist$X.x <- NULL
hist$X.y <- NULL
#
# Max
hist_max <- hist
hist_max$FIPS <- hist_max$year <- hist_max$month <- hist_max$tmin <- hist_max$tmean <- hist_max$hmin <- hist_max$hmean <- NULL
#
# Adding Factors
hist_max$citycode <- rep(101,nrow(hist_max))
hist_max$year <- rep(2010,nrow(hist_max))
hist_max$lnincome <- rep(10.262,nrow(hist_max))
#
# Predictions
pred_hist_max <- predict.gam(gam_max_count_wk,hist_max)
#
pred_hist_max <- as.data.table(pred_hist_max)
pred_hist_max <- cbind(hist, pred_hist_max)
pred_hist_max$tmax <- pred_hist_max$tmean <- pred_hist_max$tmin <- pred_hist_max$hmean <- pred_hist_max$hmax <- pred_hist_max$hmin <- NULL
#
# Aggregate by FIPS
max_hist <- pred_hist_max %>%
  group_by(FIPS) %>%
  summarise(pred_hist = mean(pred_hist_max))
#
### Future
## 4.5
# 4.5_2021_2050
temp_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
humid_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey <- c("FIPS", "year", "month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <- pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim_max %>%
  group_by(FIPS) %>%
  summarise(pred_sim = mean(pred_sim_max))
#
# Merge with Historical Data
max_hist$FIPS <- as.factor(max_hist$FIPS)
max_sim$FIPS <- as.factor(max_sim$FIPS)
#
mykey1 <- c("FIPS")
setkeyv(max_hist, mykey1)
setkeyv(max_sim, mykey1)
max_change <- merge(max_hist, max_sim, by=mykey1)
max_change$change <- ((max_change$pred_sim-max_change$pred_hist)/max_change$pred_hist)*100
#
write.csv(max_change, file = "/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv")

# 4.5_2081_2100
temp_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
humid_sim <- read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey <- c("FIPS", "year", "month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <- pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim
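The scenario blocks above (historic, 4.5_2021_2050, 4.5_2081_2100, ...) are independent of each other once the GAM is fitted, so they are a natural unit to parallelise. A hedged sketch using the parallel package; the helper run_scenario() and the simplified file names in the scenarios list are hypothetical placeholders for the repeated block, not code from the original script:

library(parallel)

## one self-contained worker per scenario; its body would be the repeated
## read/merge/predict/write block from the script above
run_scenario <- function(paths, gam_fit) {
  temp_sim  <- read.csv(paths$temp)
  humid_sim <- read.csv(paths$humid)
  ## ... merge, add factors, predict.gam(gam_fit, ...), aggregate by FIPS ...
  ## write.csv(result, paths$out)
  invisible(paths$out)
}

scenarios <- list(
  list(temp = "giss_4.5_2021_2050_temp.csv", humid = "giss_4.5_2021_2050_hum.csv",
       out  = "giss_4.5_2021_2050.csv"),
  list(temp = "giss_4.5_2081_2100_temp.csv", humid = "giss_4.5_2081_2100_hum.csv",
       out  = "giss_4.5_2081_2100.csv")
)

## fork one worker per scenario (Linux); the core count could come from the scheduler
mclapply(scenarios, run_scenario, gam_fit = gam_max_count_wk,
         mc.cores = min(length(scenarios), detectCores()))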
Re: [R] Add a vertical arrow to a time series graph using ggplot and xts
Please see updates to the df2 assignment as shown below.

library(xts)      # primary
#library(tseries) # Unit root tests
library(ggplot2)
library(vars)
library(grid)

dt_xts <- xts(x = 1:10, order.by = seq(as.Date("2016-01-01"), as.Date("2016-01-10"), by = "1 day"))
colnames(dt_xts) <- "gdp"
xmin <- min(index(dt_xts))
xmax <- max(index(dt_xts))
df1 <- data.frame(x = index(dt_xts), coredata(dt_xts))
p <- ggplot(data = df1, mapping = aes(x=x, y=gdp)) + geom_line()
rg <- ggplot_build(p)$panel$ranges[[1]]$y.range
y1 <- rg[1]
y2 <- rg[2]
# x = as.Date(..) in place of x = "2016-01-05"
df2 <- data.frame(x = as.Date("2016-01-05"), y1=y1, y2=y2)
p1 <- p + geom_segment(mapping=aes(x=x, y=y1, xend=x, yend=y2), data=df2, arrow=arrow())

--
Best,
GG

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving sparse, singular systems of equations
This is not a solution, but your lsfit attempt

  #Error in lsfit(A, b) : only 3 cases, but 4 variables
  lsfit(A,b)

gave that error because lsfit adds a column of 1's to its first argument unless you use intercept=FALSE. Then it will give you an answer (but I think it converts your sparse matrix into a dense one before doing any linear algebra).

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Apr 20, 2016 at 4:22 AM, A A via R-help wrote:
>
> I have a situation in R where I would like to find any x (if one exists)
> that solves the linear system of equations Ax = b, where A is square,
> sparse, and singular, and b is a vector. Here is some code that mimics my
> issue with a relatively simple A and b, along with three other methods of
> solving this system that I found online, two of which give me an error and
> one of which succeeds on the simplified problem, but fails on my data
> set (attached). Is there a solver in R that I can use in order to get x
> without any errors given the structure of A? Thanks for your time.
>
> #CODE STARTS HERE
> A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
> b = matrix(c(-30,40,-10),nrow=3,ncol=1)
>
> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
>
> #one x that happens to solve Ax = b
> x = matrix(c(-10,10,0),nrow=3,ncol=1)
> A %*% x
>
> #Error in lsfit(A, b) : only 3 cases, but 4 variables
> lsfit(A,b)
> #solves the system, but fails below
> solve(qr(A, LAPACK=TRUE),b)
> #Error in qr.solve(A, b) : singular matrix 'a' in solve
> qr.solve(A,b)
>
> #matrices used in my actual problem (see attached files)
> A = readMM("A.txt")
> b = readMM("b.txt")
>
> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
> solve(qr(A, LAPACK=TRUE),b)
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
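To make Bill's point concrete, a minimal sketch with the toy system from the post (the NA handling assumes lsfit drops aliased columns the way lm does on a rank-deficient matrix, so treat this as illustration only):

A <- matrix(c(1.5,-1.5,0, -1.5,2.5,-1, 0,-1,1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

fit  <- lsfit(A, b, intercept = FALSE)  # no added column of 1's
coef <- fit$coefficients
coef[is.na(coef)] <- 0                  # columns dropped as aliased contribute nothing
A %*% coef - b                          # residuals: near zero when the system is consistent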
Re: [R] Solving sparse, singular systems of equations
This is kind of like asking for a solution to x+1=x+1. Go back to linear algebra and look up Singular Value Decomposition, and decide if you really want to proceed. See also ?svd and package irlba. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 4:22:34 AM PDT, A A via R-help wrote: > > > >I have a situation in R where I would like to find any x (if one >exists) that solves the linear system of equations Ax = b, where A is >square, sparse, and singular, and b is a vector. Here is some code that >mimics my issue with a relatively simple A and b, along with three >other methods of solving this system that I found online, two of which >give me an error and one of which succeeds on the simplified problem, >but fails on my data set(attached). Is there a solver in R that I can >use in order to get x without any errors given the structure of A? >Thanks for your time. >#CODE STARTS HEREA = >as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b >= matrix(c(-30,40,-10),nrow=3,ncol=1) >#solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or >out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) >#one x that happens to solve Ax = bx = >matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x >#Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves >the system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in >qr.solve(A, b) : singular matrix 'a' in solveqr.solve(A,b) >#matrices used in my actual problem (see attached files)A = >readMM("A.txt")b = readMM("b.txt") >#Error in as(x, "matrix")[i, , drop = drop] : subscript out of >boundssolve(qr(A, LAPACK=TRUE),b) > > > > > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
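Jeff's SVD pointer can be made concrete: the pseudoinverse built from the decomposition gives the minimum-norm least-squares solution, which solves Ax = b exactly whenever the singular system is consistent. A small dense sketch with the toy matrix from the post (for the large sparse case the same idea would apply with a truncated SVD, e.g. from irlba):

A <- matrix(c(1.5,-1.5,0, -1.5,2.5,-1, 0,-1,1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

s   <- svd(A)
tol <- max(dim(A)) * max(s$d) * .Machine$double.eps
pos <- s$d > tol                     # keep only the numerically nonzero singular values
x   <- s$v[, pos, drop = FALSE] %*% ((t(s$u[, pos, drop = FALSE]) %*% b) / s$d[pos])
A %*% x - b                          # ~ 0, so x is one (minimum-norm) solution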
[R] Reading Multiple Output Variables
Hi all,

I am trying to read multiple output variables for a sensitivity analysis.

Currently I am using one output value as follows:

Y <- (E1)

However I need to run the analysis against 12 values of Y, so E1-E12.

My matrix will be: inputs are columns = 4, rows = 40, i.e. 40 rows of 4 input variables in different combinations. These will be analysed against 40 rows of output variables in 12 columns, e.g.

     V1 V2 V3 V4   E1 E2 E3 E4 ... E12
1
2
...
40

Can someone provide guidance on how I can plot against all 12 months?

Thanks

Jody

This message is intended solely for the addressee and may contain confidential and/or legally privileged information. Any use, disclosure or reproduction without the sender's explicit consent is unauthorised and may be unlawful. If you have received this message in error, please notify Northumbria University immediately and permanently delete it. Any views or opinions expressed in this message are solely those of the author and do not necessarily represent those of the University. The University cannot guarantee that this message or any attachment is virus free or has not been intercepted and/or amended.

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use multiple cores on Linux
The answer to your question is yes. You might consider using the parallel package., and I would suggest starting with a simpler test case to learn how it works and incrementally adding complexity of packages and data handling. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 7:37:07 AM PDT, Miluji Sb wrote: >I am trying to run the following code in R on a Linux cluster. I would >like >to use the full processing power (specifying cores/nodes/memory). The >code >essentially runs predictions based on a GAM regression and saves the >results as a CSV file for multiple sets of data (here I only show two). > >Is it possible to run this code using HPC packages such as >Rmpi/snow/doParallel? Thank you! > ># >library(data.table) >library(mgcv) >library(reshape2) >library(dplyr) >library(tidyr) >library(lubridate) >library(DataCombine) ># >gam_max_count_wk <- gam(count_pop ~ factor(citycode) + factor(year) + >factor(week) + s(lnincome) + s(tmax) + >s(hmax),data=cont,na.action="na.omit", method="ML") > ># ># Historic >temp_hist <- read.csv("/work/sd00815/giss_historic/giss_temp_hist.csv") >humid_hist <- read.csv("/work/sd00815/giss_historic/giss_hum_hist.csv") ># >temp_hist <- as.data.table(temp_hist) >humid_hist <- as.data.table(humid_hist) ># ># Merge >mykey<- c("FIPS", "year","month", "week") >setkeyv(temp_hist, mykey) >setkeyv(humid_hist, mykey) ># >hist<- merge(temp_hist, humid_hist, by=mykey) ># >hist$X.x <- NULL >hist$X.y <- NULL ># ># Max >hist_max <- hist >hist_max$FIPS <- hist_max$year <- hist_max$month <- hist_max$tmin <- >hist_max$tmean <- hist_max$hmin <- hist_max$hmean <- NULL ># ># Adding Factors >hist_max$citycode <- rep(101,nrow(hist_max)) >hist_max$year <- rep(2010,nrow(hist_max)) >hist_max$lnincome <- rep(10.262,nrow(hist_max)) ># ># Predictions >pred_hist_max <- predict.gam(gam_max_count_wk,hist_max) ># >pred_hist_max <- as.data.table(pred_hist_max) >pred_hist_max <- cbind(hist, pred_hist_max) >pred_hist_max$tmax <- pred_hist_max$tmean <- pred_hist_max$tmin <- >pred_hist_max$hmean <- pred_hist_max$hmax <- pred_hist_max$hmin <- NULL ># ># Aggregate by FIPS >max_hist <- pred_hist_max %>% > group_by(FIPS) %>% > summarise(pred_hist = mean(pred_hist_max)) ># >### Future >## 4.5 ># 4.5_2021_2050 >temp_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv") >humid_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv") ># ># Max >temp_sim <- as.data.table(temp_sim) >setnames(temp_sim, "max", "tmax") >setnames(temp_sim, "min", "tmin") >setnames(temp_sim, "avg", "tmean") ># >humid_sim <- as.data.table(humid_sim) >setnames(humid_sim, "max", "hmax") >setnames(humid_sim, "min", "hmin") >setnames(humid_sim, "avg", "hmean") ># >temp_sim$X <- NULL >humid_sim$X <- NULL ># ># Merge >mykey<- c("FIPS", "year","month", "week") >setkeyv(temp_sim, mykey) >setkeyv(humid_sim, mykey) ># >sim <- merge(temp_sim, humid_sim, by=mykey) ># >sim_max <- sim ># >sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- >sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL ># ># Adding Factors >sim_max$citycode <- rep(101,nrow(sim_max)) >sim_max$year <- rep(2010,nrow(sim_max)) >sim_max$week <- rep(1,nrow(sim_max)) >sim_max$lnincome <- rep(10.262,nrow(sim_max)) ># ># Predictions >pred_sim_max <- predict.gam(gam_max_count_wk,sim_max) ># >pred_sim_max <- as.data.table(pred_sim_max) >pred_sim_max <- cbind(sim, pred_sim_max) >pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <- >pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL 
># ># Aggregate by FIPS >max_sim <- pred_sim_max %>% > group_by(FIPS) %>% > summarise(pred_sim = mean(pred_sim_max)) ># ># Merge with Historical Data >max_hist$FIPS <- as.factor(max_hist$FIPS) >max_sim$FIPS <- as.factor(max_sim$FIPS) ># >mykey1<- c("FIPS") >setkeyv(max_hist, mykey1) >setkeyv(max_sim, mykey1) >max_change <- merge(max_hist, max_sim, by=mykey1) >max_change$change <- >((max_change$pred_sim-max_change$pred_hist)/max_change$pred_hist)*100 ># >write.csv(max_change, file = >"/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv") > > > ># 4.5_2081_2100 >temp_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv") >humid_sim <- >read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv") ># ># Max >temp_sim <- as.data.table(temp_sim) >setnames(temp_sim, "max", "tmax") >setnames(temp_sim, "min", "tmin") >setnames(temp_sim, "avg", "tmean") ># >humid_sim <- as.data.table(humid_sim) >setnames(humid_sim, "max", "hmax") >setnames(humid_sim, "min", "hmin") >setnames(humid_sim, "avg", "hmean") ># >temp_sim$X <- NULL >humid_sim$X <- NULL ># ># Merge >mykey<- c("FIPS", "year","month", "week") >setkeyv(temp_sim, mykey) >setkeyv(humid_sim, mykey) ># >sim <- merge(temp_sim, humid_sim, by=mykey) ># >sim_max <- sim ># >sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <- >sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL ># ># Adding Facto
Re: [R] Reading Multiple Output Variables
The word "analysis" is too vague. If you are referring to lm regression, you can specify Y as a matrix instead of a vector. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example Also, please disable HTML in your email when sending to this list, since it will usually come through to us in damaged form. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 8:19:48 AM PDT, "jody.kelly" wrote: > >Hi all, > > >I am trying to read multiple out variables for a sensitivity analysis. > > >Currently using one output value as follows: > > >Y<-(E1) > > >However I need to run analysis against 12 values of Y. So E1-E12. > > >My matrix will be: Inputs are Column=4, Rows = 40 i.e. 40 rows of 4 >input variables in different combinations. These will be analysed >against 40 rows of output variables for 12 columns. > > >e.g. > > > V1 V2 V3 V4E1 E2 E3 E4 ... E12 > >1 > >2 > >... > >40 > > >Can someone provide guidance on How I can plot against all 12 months? > > >Thanks > > >Jody > > >This message is intended solely for the addressee and may contain >confidential and/or legally privileged information. Any use, disclosure >or reproduction without the sender's explicit consent is unauthorised >and may be unlawful. If you have received this message in error, please >notify Northumbria University immediately and permanently delete it. >Any views or opinions expressed in this message are solely those of the >author and do not necessarily represent those of the University. The >University cannot guarantee that this message or any attachment is >virus free or has not been intercepted and/or amended. > > [[alternative HTML version deleted]] > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
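Jeff's matrix-response suggestion might look like this in practice (the data frame and its column names are invented for illustration, and whether an lm fit is the right "analysis" depends on the sensitivity method being used):

## toy data: 40 runs, 4 inputs V1-V4, 12 outputs E1-E12
set.seed(1)
dat <- as.data.frame(matrix(rnorm(40 * 16), nrow = 40))
names(dat) <- c(paste0("V", 1:4), paste0("E", 1:12))

Y   <- as.matrix(dat[, paste0("E", 1:12)])    # all 12 outputs at once
fit <- lm(Y ~ V1 + V2 + V3 + V4, data = dat)  # one multivariate lm: 12 sets of coefficients
coef(fit)                                     # 5 x 12 matrix (intercept + 4 inputs, per output)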
Re: [R] installation of dplyr
Increasing memory resolved the issue for me. Thanks again, Ben > On Apr 19, 2016, at 4:10 PM, Hadley Wickham wrote: > > You normally see these errors when compiling on a vm that has very > little memory. > Hadley > > On Tue, Apr 19, 2016 at 2:47 PM, Ben Tupper wrote: >> Hello, >> >> I am getting a fresh CentOS 6.7 machine set up with all of the goodies for R >> 3.2.3, including dplyr package. I am unable to successfully install it. >> Below I show the failed installation using utils::install.packages() and >> then again using devtools::install_github(). Each yields an error similar >> to the other but not quite exactly the same - the error messages sail right >> over my head. >> >> I can contact the package author if that would be better, but thought it >> best to start here. >> >> Thanks! >> Ben >> >> Ben Tupper >> Bigelow Laboratory for Ocean Sciences >> 60 Bigelow Drive, P.O. Box 380 >> East Boothbay, Maine 04544 >> http://www.bigelow.org >> >>> sessionInfo() >> R version 3.2.3 (2015-12-10) >> Platform: x86_64-redhat-linux-gnu (64-bit) >> Running under: CentOS release 6.7 (Final) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> >> >> >> # utils::install.packages() >> >> >>> install.packages("dplyr", repo = "http://cran.r-project.org";) >> Installing package into ‘/usr/lib64/R/library’ >> (as ‘lib’ is unspecified) >> trying URL 'http://cran.r-project.org/src/contrib/dplyr_0.4.3.tar.gz' >> Content type 'application/x-gzip' length 655997 bytes (640 KB) >> == >> downloaded 640 KB >> >> * installing *source* package ‘dplyr’ ... 
>> ** package ‘dplyr’ successfully unpacked and MD5 sums checked >> ** libs >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c RcppExports.cpp -o >> RcppExports.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c address.cpp -o address.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c api.cpp -o api.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c arrange.cpp -o arrange.o >> In file included from ../inst/include/dplyr.h:131, >> from arrange.cpp:1: >> ../inst/include/dplyr/DataFrameSubsetVisitors.h: In constructor >> ‘dplyr::DataFrameSubsetVisitors::DataFrameSubsetVisitors(const >> Rcpp::DataFrame&, const Rcpp::CharacterVector&)’: >> ../inst/include/dplyr/DataFrameSubsetVisitors.h:40: warning: ‘column’ may be >> used uninitialized in this function >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c between.cpp -o between.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c bind.cpp -o bind.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR >> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" >> -I"/usr/lib64/R/library/BH/include" -fpic -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m64 -mtune=generic -c combine_variables.cpp -o >> combine_variables.o >> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR
Re: [R] Solving sparse, singular systems of equations
> On 20 Apr 2016, at 13:22, A A via R-help wrote:
>
> I have a situation in R where I would like to find any x (if one exists) that
> solves the linear system of equations Ax = b, where A is square, sparse, and
> singular, and b is a vector. Here is some code that mimics my issue with a
> relatively simple A and b, along with three other methods of solving this
> system that I found online, two of which give me an error and one of which
> succeeds on the simplified problem, but fails on my data set (attached). Is
> there a solver in R that I can use in order to get x without any errors given
> the structure of A? Thanks for your time.
>
> #CODE STARTS HERE
> A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
> b = matrix(c(-30,40,-10),nrow=3,ncol=1)
>
> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
>
> #one x that happens to solve Ax = b
> x = matrix(c(-10,10,0),nrow=3,ncol=1)
> A %*% x
>
> #Error in lsfit(A, b) : only 3 cases, but 4 variables
> lsfit(A,b)
> #solves the system, but fails below
> solve(qr(A, LAPACK=TRUE),b)
> #Error in qr.solve(A, b) : singular matrix 'a' in solve
> qr.solve(A,b)
>
> #matrices used in my actual problem (see attached files)
> A = readMM("A.txt")
> b = readMM("b.txt")
>
> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
> solve(qr(A, LAPACK=TRUE),b)

Your code is a mess.

A singular square system of linear equations has an infinity of solutions if a solution exists at all. How that works you can find here: https://en.wikipedia.org/wiki/System_of_linear_equations in the section "Matrix solutions".

For your simple example you can do it like this:

library(MASS)

Ag <- ginv(A)    # pseudoinverse
xb <- Ag %*% b   # minimum norm solution

Aw <- diag(nrow=nrow(Ag)) - Ag %*% A  # see the Wikipedia page
w <- runif(3)
z <- xb + Aw %*% w
A %*% z - b

N <- Null(t(A))  # null space of A; see the help for Null in package MASS
A %*% N
A %*% (xb + 2 * N) - b

For sparse systems you will have to approach this differently; I have no experience with that.

Berend

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving sparse, singular systems of equations
Thanks for the advice. I fixed the function and ran it on my systems just to see if it would work; for the first set of A and b, I got a valid solution, but for the second set, I got the error "Error in complete.cases(x, y, wt) : not all arguments have the same length". On Wednesday, April 20, 2016 10:59 AM, William Dunlap wrote: This is not a solution but your lsfit attempt #Error in lsfit(A, b) : only 3 cases, but 4 variables lsfit(A,b)gave that error because lsfit adds a column of 1 toits first argument unless you use intercept=FALSE.Then it will give you an answer (but I think it convertsyour sparse matrix into a dense one before doingany linear algebra). Bill Dunlap TIBCO Software wdunlap tibco.com On Wed, Apr 20, 2016 at 4:22 AM, A A via R-help wrote: I have a situation in R where I would like to find any x (if one exists) that solves the linear system of equations Ax = b, where A is square, sparse, and singular, and b is a vector. Here is some code that mimics my issue with a relatively simple A and b, along with three other methods of solving this system that I found online, two of which give me an error and one of which succeeds on the simplified problem, but fails on my data set(attached). Is there a solver in R that I can use in order to get x without any errors given the structure of A? Thanks for your time. #CODE STARTS HEREA = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b = matrix(c(-30,40,-10),nrow=3,ncol=1) #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) #one x that happens to solve Ax = bx = matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x #Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in solveqr.solve(A,b) #matrices used in my actual problem (see attached files)A = readMM("A.txt")b = readMM("b.txt") #Error in as(x, "matrix")[i, , drop = drop] : subscript out of boundssolve(qr(A, LAPACK=TRUE),b) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] installation problem on Ubuntu
I needed to update R so I could install ggplot. I am running Ubuntu 12.04. I cannot upgrade Ubuntu because I am using a work computer. I tried upgrading the normal way:

sudo apt-get update
sudo apt-get install r-base r-base-dev

But this only installed an earlier version. Finally I tried installing from source (./configure, make install). This worked. However, when I try to install packages, I get this error:

Error in download.file(url, destfile = f, quiet = TRUE) :
  internet routines cannot be loaded
In addition: Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
  unable to load shared object '/usr/local/lib/R/modules//internet.so':
  /usr/local/lib/R/modules//internet.so: undefined symbol: curl_multi_wait

>> ls /usr/local/lib/R/modules/
>> R_X11.so  R_de.so  internet.so  lapack.so

Thanks!

P

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Merging Data Sets with Full Outer Join
Hi All,

I would like to match some datasets. Both deliver variables AND cases which might or might not be present in all datasets. This sequence

Kunden <- Kunden_2011
Kunden <- merge(Kunden, Kunden_2012, by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2013, by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2014, by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2015, by.x = "Debitor", by.y = "Debitor")

delivers too few cases, so I guess it does an equi-join.

How can I join the datasets and keep the variables as well as the cases?

I am looking forward to your reply.

Kind regards

Georg

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] project test data into principal components of training dataset
For the records, a slightly hacky answer, by modifying the ggbiplot function, is provided now here: http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot On 18/04/16 17:20, olsen wrote: > Hi there, > > I've a training dataset and a test dataset. My aim is to visually > allocate the test data within the calibrated space reassembled by the > PC's of the training data set, furthermore to keep the training data set > coordinates fixed, so they can serve as ruler for measurement for > additional test datasets coming up. > > Please find a minimum working example using the wine dataset below. > Ideally I would like to use ggbiplot as it comes with the elegant > features but it only accepts objects of class prcomp, princomp, PCA, or > lda, which is not fullfilled by the predicted test data. > > I'm still slightly wet behind my R ears and the only solution I can > think of is to plot the calibrated space in ggbiplot and the training > data in ggplot and then join them, in the worst case by exporting them > as svg and importing them in inkscape. Which is slightly complicated > plus the scaling is different. > > Any indication how this mission can be accomplished very welcome! > > Thanks and greets > Olsen > > I started a threat on stackoverflow on that issue but know relevant > indications so far. > http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot > > ##MWE > library(ggbiplot) > data(wine) > > ##pca on the wine dataset used as training data > wine.pca <- prcomp(wine, center = TRUE, scale. = TRUE) > > wine$class <- wine.class > > ##simulate test data by generating three new wine classes > wine.new.1 <- wine[c(sample(1:nrow(wine), 25)),] > wine.new.2 <- wine[c(sample(1:nrow(wine), 43)),] > wine.new.3 <- wine[c(sample(1:nrow(wine), 36)),] > > ##Predict PCs for the new classes by transforming > #them using the predict.prcomp function > pred.new.1 <- predict(wine.pca, newdata = wine.new.1) > pred.new.2 <- predict(wine.pca, newdata = wine.new.2) > pred.new.3 <- predict(wine.pca, newdata = wine.new.3) > > #simulate the classes for the new sorts > wine.new.1$class <- rep("new.wine.1", nrow(wine.new.1)) > wine.new.2$class <- rep("new.wine.2", nrow(wine.new.2)) > wine.new.3$class <- rep("new.wine.3", nrow(wine.new.3)) > wine.new.bind <- rbind(wine.new.1, wine.new.2, wine.new.3) > > ##compose the plot by joining the PCA ggbiplot training data with the > testing data from ggplot > #plot the calibrated space resulting from the test data > g.train <- ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups = > wine$class, ellipse = TRUE, circle = TRUE) > g.train > #plot the test data resulting from the prediction > df.pred = data.frame(PC1 = wine.new.bind[,1], PC2 = wine.new.bind[,2], > PC3 = wine.new.bind[,3], PC4 = wine.new.bind[,4], > classes = wine.new.bind$class) > g.test <- ggplot(df.pred, aes(PC1, PC2, color = classes, shape = > classes)) + geom_point() + stat_ellipse() > g.test > > > > > -- Our solar system is the cream of the crop http://hasa-labs.org __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Splitting Numerical Vector Into Chunks
Greetings!

I have several large data sets of animal movements. Their pauses (zero magnitude vectors) are of particular interest, in addition to the speed distributions that precede the periods of rest. Here is an example of the kind of data I am interested in analyzing:

x <- abs(c(rnorm(2), replicate(3,0), rnorm(4), replicate(5,0), rnorm(6), replicate(7,0)))
length(x)

This example has 27 elements with strings of zeroes (pauses) situated among the speed values.

Is there a way to split the vector into zero and nonzero chunks and store them in a form where they can be analyzed? I have tried various forms of split() to no avail.

Thank you!

Salvatore A. Sidoti

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Parsing and counting expressions in .txt-files
Dear Community,

I hope that I have selected the right category, because I am relatively new to the "R" world, and I come with a relatively challenging problem.

I would like R to read text files (there are several hundred of them in my folder) sequentially and screen each one for specific terms. If a term is found, the program should write a 1, otherwise a 0. Another task is to scrape a ten-digit number from the file after a particular keyword, so that I can map the results. Ideally the program should create a .txt file.

A brief example:

Keywords: "surpassed", "achieved", "very motivated"

Text1: "Personnel number: 0123456789 The employee has exceeded the set targets and was also otherwise always motivated (...)"

So for this case I want my program ideally to produce the following (in rows and columns):

Personnel number;surpassed;achieved;very motivated   (do not write)
0123456789;1;0;1

For the following files it should continue analogously in lines 2, 3, 4 and so on.

Could you give a brief assessment of how to realize such a thing? How do I best start, and have you perhaps stumbled over something similar in R before? I am grateful for any suggestions/proposals.

Thank you in advance,

Alex

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
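One possible starting point could look roughly like this (the folder path and the output file name are made-up placeholders; the keyword list and the "Personnel number" pattern are taken from the example above; untested on real files):

keywords <- c("surpassed", "achieved", "very motivated")
files    <- list.files("path/to/folder", pattern = "\\.txt$", full.names = TRUE)

read_one <- function(f) {
  txt  <- paste(readLines(f, warn = FALSE), collapse = " ")
  m    <- regmatches(txt, regexpr("Personnel number:\\s*[0-9]{10}", txt))
  id   <- if (length(m)) gsub("[^0-9]", "", m) else NA_character_
  hits <- as.integer(sapply(keywords, grepl, x = txt, fixed = TRUE))  # 1/0 per keyword
  data.frame(personnel_number = id, t(hits), stringsAsFactors = FALSE)
}

res <- do.call(rbind, lapply(files, read_one))
names(res) <- c("personnel_number", keywords)
write.table(res, "results.txt", sep = ";", row.names = FALSE, quote = FALSE)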
Re: [R] Solving sparse, singular systems of equations
Thanks for the response. Yes, in that situation a solution of x = 1 would be just as good as x = 1000 or any other value of x for me (but in my problem the matrix has nonzero rank, so I can't just randomly choose a vector and have it be a solution). If it helps, what I'm interested in is the R equivalent of x = A\b in MATLAB, for these particular kinds of A matrices. I looked into irlba, and it seems to be able to calculate some of the singular values/vectors for the large dataset without taking too much time. I'll look more into seeing how I can solve the system with it. On Wednesday, April 20, 2016 11:01 AM, Jeff Newmiller wrote: This is kind of like asking for a solution to x+1=x+1. Go back to linear algebra and look up Singular Value Decomposition, and decide if you really want to proceed. See also ?svd and package irlba. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 4:22:34 AM PDT, A A via R-help wrote: I have a situation in R where I would like to find any x (if one exists) that solves the linear system of equations Ax = b, where A is square, sparse, and singular, and b is a vector. Here is some code that mimics my issue with a relatively simple A and b, along with three other methods of solving this system that I found online, two of which give me an error and one of which succeeds on the simplified problem, but fails on my data set(attached). Is there a solver in R that I can use in order to get x without any errors given the structure of A? Thanks for your time. #CODE STARTS HEREA = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b = matrix(c(-30,40,-10),nrow=3,ncol=1) #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) #one x that happens to solve Ax = bx = matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x #Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in solveqr.solve(A,b) #matrices used in my actual problem (see attached files)A = readMM("A.txt")b = readMM("b.txt") #Error in as(x, "matrix")[i, , drop = drop] : subscript out of boundssolve(qr(A, LAPACK=TRUE),b) R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving sparse, singular systems of equations
Thanks for the help. Sorry, I am not sure why it looks like that in the mailing list - it looks much more neat on my end (see attached file). On Wednesday, April 20, 2016 2:01 PM, Berend Hasselman wrote: > On 20 Apr 2016, at 13:22, A A via R-help wrote: > > > > > I have a situation in R where I would like to find any x (if one exists) that > solves the linear system of equations Ax = b, where A is square, sparse, and > singular, and b is a vector. Here is some code that mimics my issue with a > relatively simple A and b, along with three other methods of solving this > system that I found online, two of which give me an error and one of which > succeeds on the simplified problem, but fails on my data set(attached). Is > there a solver in R that I can use in order to get x without any errors given > the structure of A? Thanks for your time. > #CODE STARTS HEREA = > as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b = > matrix(c(-30,40,-10),nrow=3,ncol=1) > #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out > of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) > #one x that happens to solve Ax = bx = matrix(c(-10,10,0),nrow=3,ncol=1)A %*% > x > #Error in lsfit(A, b) : only 3 cases, but 4 variableslsfit(A,b)#solves the > system, but fails belowsolve(qr(A, LAPACK=TRUE),b)#Error in qr.solve(A, b) : > singular matrix 'a' in solveqr.solve(A,b) > #matrices used in my actual problem (see attached files)A = readMM("A.txt")b > = readMM("b.txt") > #Error in as(x, "matrix")[i, , drop = drop] : subscript out of > boundssolve(qr(A, LAPACK=TRUE),b) Your code is a mess. A singular square system of linear equations has an infinity of solutions if a solution exists at all. How that works you can find here: https://en.wikipedia.org/wiki/System_of_linear_equations in the section "Matrix solutions". For your simple example you can do it like this: library(MASS) Ag <- ginv(A) # pseudoinverse xb <- Ag %*% b # minimum norm solution Aw <- diag(nrow=nrow(Ag)) - Ag %*% A # see the Wikipedia page w <- runif(3) z <- xb + Aw %*% w A %*% z - b N <- Null(t(A)) # null space of A; see the help for Null in package MASS A %*% N A %*% (xb + 2 * N) - b For sparse systems you will have to approach this differently; I have no experience with that. Berend __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merging Data Sets with Full Outer Join
> On Apr 19, 2016, at 11:23 PM, g.maub...@weinwolf.de wrote: > > Hi All, > > I would like to match some datasets. Both deliver variables AND cases > which might or might not be present in all datasets: > > This sequence > > Kunden <- Kunden_2011 > Kunden <- merge(Kunden, Kunden_2012, >by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2013, >by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2014, >by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2015, >by.x = "Debitor", by.y = "Debitor") > > delivers too few cases. So I guess it does an equi-join. You should not be guessing. Read the help page. It calls the default setting a natural join. > > How can I join the datasets and keep the variables as well as the cases? > If you want a full outer join use all=TRUE. This, too, should have been in the ?merge help page. > I am looking forward to your reply. > > Kind regards > > Georg > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merging Data Sets with Full Outer Join
Kunden <- Kunden_2011 Kunden <- merge(Kunden, Kunden_2012, by = "Debitor", all = TRUE) etc. See ?merge for details. Best, Ista On Wed, Apr 20, 2016 at 2:23 AM, wrote: > Hi All, > > I would like to match some datasets. Both deliver variables AND cases > which might or might not be present in all datasets: > > This sequence > > Kunden <- Kunden_2011 > Kunden <- merge(Kunden, Kunden_2012, > by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2013, > by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2014, > by.x = "Debitor", by.y = "Debitor") > > Kunden <- merge(Kunden, Kunden_2015, > by.x = "Debitor", by.y = "Debitor") > > delivers too few cases. So I guess it does an equi-join. > > How can I join the datasets and keep the variables as well as the cases? > > I am looking forward to your reply. > > Kind regards > > Georg > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
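When several yearly tables have to be combined, the same merge() call can be folded over a list with Reduce(). A small sketch with toy stand-ins for the Kunden_20xx data frames (the real ones were not posted):

# Toy stand-ins for the poster's yearly tables; column names are made up.
Kunden_2011 <- data.frame(Debitor = c(1, 2), Umsatz_2011 = c(10, 20))
Kunden_2012 <- data.frame(Debitor = c(2, 3), Umsatz_2012 = c(30, 40))
Kunden_2013 <- data.frame(Debitor = c(1, 3), Umsatz_2013 = c(50, 60))

# Full outer join across all of them in one pass.
Kunden <- Reduce(function(x, y) merge(x, y, by = "Debitor", all = TRUE),
                 list(Kunden_2011, Kunden_2012, Kunden_2013))
Kunden   # every Debitor is kept, with NA where a year has no record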
Re: [R] Splitting Numerical Vector Into Chunks
Perhaps x <- split(x, x == 0) Best, Ista On Wed, Apr 20, 2016 at 9:40 AM, Sidoti, Salvatore A. wrote: > Greetings! > > I have several large data sets of animal movements. Their pauses (zero > magnitude vectors) are of particular interest in addition to the speed > distributions that precede the periods of rest. Here is an example of the > kind of data I am interested in analyzing: > > x <- > abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0))) > length(x) > > This example has 27 elements with strings of zeroes (pauses) situated among > the speed values. > Is there a way to split the vector into zero and nonzero chunks and store > them in a form where they can be analyzed? I have tried various forms of > split() to no avail. > > Thank you! > Salvatore A. Sidoti > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting Numerical Vector Into Chunks
> i <- seq_len(length(x)-1)
> split(x, cumsum(c(TRUE, (x[i]==0) != (x[i+1]==0))))
$`1`
[1] 0.144872972504 0.850797178400

$`2`
[1] 0 0 0

$`3`
[1] 0.199304859380 2.063609410700 0.939393760782 0.838781367540

$`4`
[1] 0 0 0 0 0

$`5`
[1] 0.374688091264 0.488423999452 0.783034615362 0.626990428900 0.138188255307 2.324635712186

$`6`
[1] 0 0 0 0 0 0 0

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Apr 20, 2016 at 12:49 PM, Ista Zahn wrote: > Perhaps > > x <- split(x, x == 0) > > Best, > Ista > > On Wed, Apr 20, 2016 at 9:40 AM, Sidoti, Salvatore A. > wrote: > > Greetings! > > > > I have several large data sets of animal movements. Their pauses (zero > magnitude vectors) are of particular interest in addition to the speed > distributions that precede the periods of rest. Here is an example of the > kind of data I am interested in analyzing: > > > > x <- > abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0))) > > length(x) > > > > This example has 27 elements with strings of zeroes (pauses) situated > among the speed values. > > Is there a way to split the vector into zero and nonzero chunks and > store them in a form where they can be analyzed? I have tried various forms > of split() to no avail. > > > > Thank you! > > Salvatore A. Sidoti > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
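An alternative sketch for the same task: rle() exposes the run structure directly, which may be handy if only the pause durations are needed. The seed and the toy vector are arbitrary stand-ins for the real movement data.

# Same idea via rle(): runs of zeros and non-zeros, plus pause durations.
set.seed(1)
x <- abs(c(rnorm(2), rep(0, 3), rnorm(4), rep(0, 5), rnorm(6), rep(0, 7)))
r <- rle(x == 0)
r$lengths[r$values]                               # lengths of the pauses
chunks <- split(x, rep(seq_along(r$lengths), r$lengths))
chunks                                            # same chunks as the cumsum() trick above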
Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?
On Wed, Apr 20, 2016 at 1:25 AM, Martin Maechler wrote: >> Henrik Bengtsson >> on Tue, 19 Apr 2016 14:04:11 -0700 writes: > > > Using the Matrix package, how can I create a row-oriented sparse > > Matrix from scratch populated with some data? By default a > > column-oriented one is created and I'm aware of the note that the > > package is optimized for column-oriented ones, but I'm only interested > > in using it for holding my sparse row-oriented data and doing basic > > subsetting by rows (even using drop=FALSE). > > > Here is what I get when I set up a column-oriented sparse Matrix: > > >> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE) > >> Cc[1:3,1] <- 1 > > A general ("teaching") remark : > The above use of Matrix() is seen in many places, and is fine > for small matrices and the case where you only use the `[<-` > method very few times (as above). > Also using Matrix() is nice when being introduced to using the > Matrix package. > > However, for efficience in non-small cases, do use > >sparseMatrix() > > directly to construct sparse matrices. > > > >> Cc > > 5 x 5 sparse Matrix of class "dgCMatrix" > > > [1,] 1 . . . . > > [2,] 1 . . . . > > [3,] 1 . . . . > > [4,] . . . . . > > [5,] . . . . . > >> str(Cc) > > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots > > ..@ i : int [1:3] 0 1 2 > > ..@ p : int [1:6] 0 3 3 3 3 3 > > ..@ Dim : int [1:2] 5 5 > > ..@ Dimnames:List of 2 > > .. ..$ : NULL > > .. ..$ : NULL > > ..@ x : num [1:3] 1 1 1 > > ..@ factors : list() > > > When I try to do the analogue for a row-oriented matrix, I get a > > "dgTMatrix", whereas I would expect a "dgRMatrix": > > >> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE) > >> Cr <- as(Cr, "dsRMatrix") > >> Cr[1,1:3] <- 1 > >> Cr > > 5 x 5 sparse Matrix of class "dgTMatrix" > > > [1,] 1 1 1 . . > > [2,] . . . . . > > [3,] . . . . . > > [4,] . . . . . > > [5,] . . . . . > > The reason for the above behavior has been > > a) efficiency. All the subassignment ( `[<-` ) methods for >"RsparseMatrix" objects (of which "dsRMatrix" is a special case) >are implemented via TsparseMatrix. > b) because of the general attitude that Csparse (and Tsparse to >some extent) are well supported in Matrix, >and e.g. further operations on Rsparse matrices would *again* >go via T* or C* sparse ones, I had decided to keep things Tsparse. Thanks, understanding these design decisions is helpful. Particularly, since I consider myself a rookie when it comes to the Matrix package. > > [...] > > > Trying with explicit coercion does not work: > > >> as(Cc, "dgRMatrix") > > Error in as(Cc, "dgRMatrix") : > > no method or default for coercing "dgCMatrix" to "dgRMatrix" > > >> as(Cr, "dgRMatrix") > > Error in as(Cr, "dgRMatrix") : > > no method or default for coercing "dgTMatrix" to "dgRMatrix" > > The general philosophy in 'Matrix' with all the class > hierarchies and the many specific classes has been to allow and > foster coercing to abstract super classes, > i.e, to "dMatrix" or "generalMatrix", "triangularMatrix", or > then "denseMatrix", "sparseMatrix", "CsparseMatrix" or > "RsparseMatrix", etc > > So in the above as(*, "RsparseMatrix") should work always. Thanks for pointing this out (and confirming as I since discovered the virtual RsparseMatrix class in the help). > > > As a summary, in other words, for what you want, > >as(sparseMatrix(.), "RsparseMatrix") > > should give you what you want reliably and efficiently. Perfect. > > > > Am I doing some wrong here? 
Or is this what means that the package is > > optimized for the column-oriented representation and I shouldn't > > really work with row-oriented ones? I'm really only interested in > > access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory > > footprint). > > { though you could equivalently use Cc[,row, drop=FALSE] > with a CsparseMatrix Cc := t(Cr), > couldn't you ? > } Yes, I actually went ahead and did that, but since the code I'm writing supports both plain matrix:es and sparse Matrix:es, and the underlying model operates row-by-row, I figured the code would be more consistent if I could use row-orientation everywhere. Not a big deal. Thanks Martin Henrik > > > Martin Maechler (maintainer of 'Matrix') > ETH Zurich > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
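A small sketch of the recommended route, assuming a reasonably recent Matrix version: build the matrix with sparseMatrix() and coerce to the row-compressed superclass, then subset by row.

library(Matrix)
# Build with sparseMatrix(), then coerce to the RsparseMatrix superclass.
Cr <- as(sparseMatrix(i = c(1, 1, 1), j = 1:3, x = 1, dims = c(5, 5)),
         "RsparseMatrix")
class(Cr)               # "dgRMatrix", the concrete row-oriented class
Cr[1, , drop = FALSE]   # row subsetting, as asked for above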
Re: [R] Solving sparse, singular systems of equations
The usual culprit in messy code is posting in HTML format. That usually leads to stripping of the formatting by the mailing list and a notice that that occurred, but I don't see that warning here. I still think posting plain text format would fix the problem. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 11:51:40 AM PDT, A A via R-help wrote: >Thanks for the help. Sorry, I am not sure why it looks like that in the >mailing list - it looks much more neat on my end (see attached file). > >On Wednesday, April 20, 2016 2:01 PM, Berend Hasselman >wrote: > > > >> On 20 Apr 2016, at 13:22, A A via R-help >wrote: >> >> >> >> >> I have a situation in R where I would like to find any x (if one >exists) that solves the linear system of equations Ax = b, where A is >square, sparse, and singular, and b is a vector. Here is some code that >mimics my issue with a relatively simple A and b, along with three >other methods of solving this system that I found online, two of which >give me an error and one of which succeeds on the simplified problem, >but fails on my data set(attached). Is there a solver in R that I can >use in order to get x without any errors given the structure of A? >Thanks for your time. >> #CODE STARTS HEREA = >as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b >= matrix(c(-30,40,-10),nrow=3,ncol=1) >> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A >(or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps) >> #one x that happens to solve Ax = bx = >matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x >> #Error in lsfit(A, b) : only 3 cases, but 4 >variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A, >LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in >solveqr.solve(A,b) >> #matrices used in my actual problem (see attached files)A = >readMM("A.txt")b = readMM("b.txt") >> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of >boundssolve(qr(A, LAPACK=TRUE),b) > >Your code is a mess. > >A singular square system of linear equations has an infinity of >solutions if a solution exists at all. >How that works you can find here: >https://en.wikipedia.org/wiki/System_of_linear_equations >in the section "Matrix solutions". > >For your simple example you can do it like this: > >library(MASS) >Ag <- ginv(A) # pseudoinverse > >xb <- Ag %*% b # minimum norm solution > >Aw <- diag(nrow=nrow(Ag)) - Ag %*% A # see the Wikipedia page >w <- runif(3) >z <- xb + Aw %*% w >A %*% z - b > >N <- Null(t(A)) # null space of A; see the help for Null in package >MASS >A %*% N >A %*% (xb + 2 * N) - b > >For sparse systems you will have to approach this differently; I have >no experience with that. > >Berend > > > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Parsing and counting expressions in .txt-files
I suggest you go through some R tutorials to learn about R's capabilities. Some recommendations can be found here: https://www.rstudio.com/online-learning/#R To answer your specific query: ?scan ## Because you do not specify file format. ?grep ?regexp ## to use regular expressions to find text. R may not be the best tool for this task, however. Or certain R packages may be better than the basic R tools. Try searching on the rseek.org site to see what might be available if you do not receive suggestions here. Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24...@novasbe.pt> wrote: > Dear Community, > > > > I hope that I have the right category selected because I am relatively new > to the "R" world. I come with a relatively challenging problem in the > luggage. I would like to realize, that "R" reads text files (there are > several hundred pieces in my folder) sequentially, and screens for specific > terms. If the term is found, the program should write a 1, if not a 0. > Another task is to scrape a ten-digit number from the file after a > particular keyword, so that I can map the results. The Programm should > create an .txt file ideally. > > > > A brief example: > > > > Keywords: "surpassed" "achieved", "very motivated" > > Text1: > > "Personnel number: 0123456789 > > > > The employee has exceeded the set targets and was also otherwise always > motivated (...) " > > > > So I want that my program for this case, ideally reflects the following (in > lines and columns= > > > > Personell number;surpassed;achieved; very motivated (do not write) > 0123456789;1;0;1 > > > For the following files, he shall all continue analogously in line 2, 3, 4 > and so on. > > > > Could you give a brief assessment, how to realize such a thing? How do I > start best and whether you are possibly "stumbled" in advance about > something similar in R? I am grateful for any suggestions/proposals. > > > > Thank you in advance, > > > > Alex > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
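A rough sketch of how those pieces could fit together for the described task; the folder name, the output file and the exact regular expression for the personnel number are assumptions, not tested against the real files.

# Scan each file for keywords and a ten-digit personnel number (kept as text
# so leading zeros survive). "reports" and "keyword_hits.txt" are placeholders.
keywords <- c("surpassed", "achieved", "very motivated")
files <- list.files("reports", pattern = "\\.txt$", full.names = TRUE)
rows <- lapply(files, function(f) {
  txt <- paste(readLines(f, warn = FALSE), collapse = " ")
  m   <- regmatches(txt, regexpr("Personnel number:\\s*[0-9]{10}", txt))
  id  <- if (length(m)) sub("Personnel number:\\s*", "", m) else NA
  c(PersonnelNumber = id,
    setNames(as.integer(sapply(keywords, grepl, x = txt, fixed = TRUE)), keywords))
})
out <- do.call(rbind, rows)
write.table(out, "keyword_hits.txt", sep = ";", row.names = FALSE, quote = FALSE)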
Re: [R] Parsing and counting expressions in .txt-files
also check out this CRAN task view: https://cran.r-project.org/web/views/NaturalLanguageProcessing.html Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24...@novasbe.pt> wrote: > Dear Community, > > > > I hope that I have the right category selected because I am relatively new > to the "R" world. I come with a relatively challenging problem in the > luggage. I would like to realize, that "R" reads text files (there are > several hundred pieces in my folder) sequentially, and screens for specific > terms. If the term is found, the program should write a 1, if not a 0. > Another task is to scrape a ten-digit number from the file after a > particular keyword, so that I can map the results. The Programm should > create an .txt file ideally. > > > > A brief example: > > > > Keywords: "surpassed" "achieved", "very motivated" > > Text1: > > "Personnel number: 0123456789 > > > > The employee has exceeded the set targets and was also otherwise always > motivated (...) " > > > > So I want that my program for this case, ideally reflects the following (in > lines and columns= > > > > Personell number;surpassed;achieved; very motivated (do not write) > 0123456789;1;0;1 > > > For the following files, he shall all continue analogously in line 2, 3, 4 > and so on. > > > > Could you give a brief assessment, how to realize such a thing? How do I > start best and whether you are possibly "stumbled" in advance about > something similar in R? I am grateful for any suggestions/proposals. > > > > Thank you in advance, > > > > Alex > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] overlay two facet_grid
Hi all, Does anyone know how to overlay two facet_grids? I have two facet grids as following: ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) Thanks for any help! Elahe __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] overlay two facet_grid
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example Overlaying aesthetics is possible. Overlaying graphs is not. Without sample data, concrete examples will be unlikely to appear, so read the above link and pay attention to the dput function. -- Sent from my phone. Please excuse my brevity. On April 20, 2016 3:01:43 PM PDT, "ch.elahe via R-help" wrote: >Hi all, >Does anyone know how to overlay two facet_grids? I have two facet grids >as following: > > >ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) >ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) > >Thanks for any help! >Elahe > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
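A toy illustration of what overlaying aesthetics can look like here: two point layers in a single faceted plot, one with a constant colour and one with colour mapped to TST. The data frame below is a made-up stand-in, since the original df was not posted.

library(ggplot2)
set.seed(42)
# Made-up stand-in for the poster's df (TE, TR, TST plus the faceting factors).
df <- data.frame(TE = runif(40, 0, 200), TR = runif(40), TST = runif(40),
                 FS = rep(c("f1", "f2"), each = 20), TRJ = "t1",
                 OR = rep(c("o1", "o2"), 20), INV = "i1")
ggplot(df, aes(x = TE, y = TR)) +
  geom_point(colour = "orange", size = 3) +   # base layer, fixed colour
  geom_point(aes(colour = TST)) +             # overlaid layer, colour mapped to TST
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both) +
  xlim(0, 200) + ylim(0, 1)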
Re: [R] installation problem on Ubuntu
Have you read the CRAN instructions for installing on Ubuntu? Have you read the Posting Guide that mentions the R-sig-debian mailing list and that if you need help compiling R this is not the right list? -- Sent from my phone. Please excuse my brevity. On April 20, 2016 9:36:51 AM PDT, Paul Tremblay wrote: >I needed to update R so I could install ggplot. I am running Ubuntu >12.04. >I cannot upgrade Ubuntu because I am using a work computer. > >I tried upgrading the normal way: > >sudo apt-get update > sudo apt-get install r-base r-base-dev > >But this only installed an earlier version. Finally I tried installing >from >source (./configure, Make install). This worked. However, when I try to >install packages, I get this error: > >Error in download.file(url, destfile = f, quiet = TRUE) : > internet routines cannot be loaded >In addition: Warning message: >In download.file(url, destfile = f, quiet = TRUE) : > unable to load shared object '/usr/local/lib/R/modules//internet.so': >/usr/local/lib/R/modules//internet.so: undefined symbol: >curl_multi_wait > > >>> ls /usr/local/lib/R/modules/ >>> R_X11.so R_de.so internet.so lapack.so > >Thanks! > >P > > [[alternative HTML version deleted]] > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data reshaping with conditions
Hi sri,
As your problem involves a few logical steps, I found it easier to approach it in a stepwise way. Perhaps there are more elegant ways to accomplish this.

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 svdatstr[row,"maxA"]<-
  max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]])
 svdatstr[row,"maxB"]<-
  max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]])
 svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"]
 svdatstr[row,"y"]<-!svdatstr[row,"x"]
}
svdatstr

You can then just extract the columns that you need.

Jim

On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan wrote: > Dear All, > > I am trying to reshape the data with some conditions. A small part of the > data looks like below. Like this there will be more data with repeating ID. > > Count id name type > 117 335 sally A > 19 335 sally A > 167 335 sally B > 18 340 susan A > 56 340 susan A > 22 340 susan B > 53 340 susan B > 135 351 lee A > 114 351 lee A > 84 351 lee A > 80 351 lee A > 19 351 lee A > 8 351 lee A > 21 351 lee A > 88 351 lee B > 111 351 lee B > 46 351 lee B > 108 351 lee B > > >From the above data I am expecting an output like below. > > id name type count_of_B Max of count B x y > 335 sally B 167 167 117,19 NA > 340 susan B 22,53 53 18 56 > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 > > Where, the column x and column y are: > > x = Count_A_less_than_max of (Count type B) > y = Count_A_higher_than_max of (Count type B). > > *1)* I tried with dplyr with the following code for the initial step to get > the values for each column. > *2)* I thought to transpose the columns which has the unique ID alone. > > I tried with the following code and I am struck with the intial step > itself. The code is executed but higher and lower value of A is not coming. > > Expected_output= data %>% > group_by(id, Type) %>% > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>% > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == > "B"]),max(count[Type == "A"]))) %>% > mutate(count_type_A_lesser = ifelse > (Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], > collapse = ",")), "NA"))%>% > mutate(count_type_A_higher = > ifelse(Type=="B",(paste(unlist(count[Type=="A"]) > > Max_of_count_B[Type=="B"], collapse = ",")), "NA")) > > I hope I make my point clear. Please bare with the code, as I am new to > this. > > Regards, > sri > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] overlay two facet_grid
It sounds like you want to use grid.arrange() from gridExtra: https://cran.r-project.org/web/packages/gridExtra/vignettes/arrangeGrob.html Hope this helps, Ulrik On Thu, 21 Apr 2016 at 00:52 Jeff Newmiller wrote: > > http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example > > Overlaying aesthetics is possible. Overlaying graphs is not. Without > sample data, concrete examples will be unlikely to appear, so read the > above link and pay attention to the dput function. > -- > Sent from my phone. Please excuse my brevity. > > On April 20, 2016 3:01:43 PM PDT, "ch.elahe via R-help" < > r-help@r-project.org> wrote: > >Hi all, > >Does anyone know how to overlay two facet_grids? I have two facet grids > >as following: > > > > > > >ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) > > >ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1) > > > >Thanks for any help! > >Elahe > > > >__ > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >https://stat.ethz.ch/mailman/listinfo/r-help > >PLEASE do read the posting guide > >http://www.R-project.org/posting-guide.html > >and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
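And a sketch of the grid.arrange() route suggested above, which places the two faceted plots next to each other rather than on top of each other; df is the same kind of made-up stand-in as before.

library(ggplot2)
library(gridExtra)
set.seed(42)
df <- data.frame(TE = runif(40, 0, 200), TR = runif(40), TST = runif(40),
                 FS = rep(c("f1", "f2"), each = 20), TRJ = "t1",
                 OR = rep(c("o1", "o2"), 20), INV = "i1")
p1 <- ggplot(df, aes(TE, TR)) + geom_point(colour = "orange") +
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both)
p2 <- ggplot(df, aes(TE, TR, colour = TST)) + geom_point() +
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both)
grid.arrange(p1, p2, ncol = 2)   # the two faceted plots side by side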
Re: [R] Data reshaping with conditions
Hi sri,
I think that I see what you mean. Your statements:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B).

I took to mean that you wanted a logical value for x and y. Looking more closely at your initial message, I see that you wanted _all_ values of A with respect to maxB in x and y. The error with maximum values was due to a typo. Perhaps this will do what you want:

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 indicesA<-count_ind[as.logical(match(svdatstr[row,type_ind],"A",0))]
 svdatstr[row,"maxA"]<-max(svdatstr[row,indicesA])
 indicesB<-count_ind[as.logical(match(svdatstr[row,type_ind],"B",0))]
 svdatstr[row,"maxB"]<-max(svdatstr[row,indicesB])
 AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA] < svdatstr[row,"maxB"]]
 svdatstr[row,"x"]<-paste(AltB,collapse=",")
 AgeB<-svdatstr[row,indicesA][svdatstr[row,indicesA] >= svdatstr[row,"maxB"]]
 svdatstr[row,"y"]<-paste(AgeB,collapse=",")
}
svdatstr[,c("id","name","maxB","x","y")]

Jim

On Thu, Apr 21, 2016 at 2:23 PM, sri vathsan wrote: > Hi Jim, > > Thanks for your time. But somehow this code did not help me to achieve my > expected output. > Problems: 1) x, y are coming as logical rather than values as I mentioned in > my post >2) The values that I get for Max A and Max B not correct >3) It looks like a pretty big data, but I just need to > concatenate the values with a comma, the final output will be a character > variable. > > Regards, > Sri > > On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon wrote: >> >> Hi sri, >> As your problem involves a few logical steps, I found it easier to >> approach it in a stepwise way. Perhaps there are more elegant ways to >> accomplish this. >> >> svdat<-read.table(text="Count id name type >> 117 335 sally A >> 19 335 sally A >> 167 335 sally B >> 18 340 susan A >> 56 340 susan A >> 22 340 susan B >> 53 340 susan B >> 135 351 lee A >> 114 351 lee A >> 84 351 lee A >> 80 351 lee A >> 19 351 lee A >> 8 351 lee A >> 21 351 lee A >> 88 351 lee B >> 111 351 lee B >> 46 351 lee B >> 108 351 lee B",header=TRUE) >> # you can also do this with other reshape functions >> library(prettyR) >> svdatstr<-stretch_df(svdat,"id",c("Count","type")) >> count_ind<-grep("Count",names(svdatstr)) >> type_ind<-grep("type",names(svdatstr)) >> svdatstr$maxA<-NA >> svdatstr$maxB<-NA >> svdatstr$x<-NA >> svdatstr$y<-NA >> for(row in 1:nrow(svdatstr)) { >> svdatstr[row,"maxA"]<- >> >> max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]]) >> svdatstr[row,"maxB"]<- >> >> max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]]) >> svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"] >> svdatstr[row,"y"]<-!svdatstr[row,"x"] >> } >> svdatstr >> >> You can then just extract the columns that you need. >> >> Jim >> >> >> On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan wrote: >> > Dear All, >> > >> > I am trying to reshape the data with some conditions. A small part of >> > the >> > data looks like below. Like this there will be more data with repeating >> > ID.
>> > >> > Count id name type >> > 117 335 sally A >> > 19 335 sally A >> > 167 335 sally B >> > 18 340 susan A >> > 56 340 susan A >> > 22 340 susan B >> > 53 340 susan B >> > 135 351 lee A >> > 114 351 lee A >> > 84 351 lee A >> > 80 351 lee A >> > 19 351 lee A >> > 8 351 lee A >> > 21 351 lee A >> > 88 351 lee B >> > 111 351 lee B >> > 46 351 lee B >> > 108 351 lee B >> > >> > >From the above data I am expecting an output like below. >> > >> > id name type count_of_B Max of count B x y >> > 335 sally B 167 167 117,19 NA >> > 340 susan B 22,53 53 18 56 >> > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 >> > >> > Where, the column x and column y are: >> > >> > x = Count_A_less_than_max of (Count type B) >> > y = Count_A_higher_than_max of (Count type B). >> > >> > *1)* I tried with dplyr with the following code for the initial step to >> > get >> > the values for each column. >> > *2)* I thought to transpose the columns which has the unique ID alone. >> > >> > I tried with the following code and I am struck with the intial step >> > itself. The code is executed but higher and lower value of A is not >> > coming. >> > >> > Expected_output= data %>% >> > group_by(id, Type) %>% >> > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = >> > ","))%>% >> > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == >> > "B"]),max(count[Type == "A"]))) %>% >> > mutate(count_type_A_lesser
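A compact alternative sketch with dplyr (the package the original poster was already trying), assuming a reasonably recent dplyr version; later columns inside summarise() can reuse earlier ones, and the empty string produced for sally's y corresponds to the NA in the requested output.

library(dplyr)
# 'svdat' is the small example data frame read in above.
svdat %>%
  group_by(id, name) %>%
  summarise(count_of_B = paste(Count[type == "B"], collapse = ","),
            max_B      = max(Count[type == "B"]),
            x = paste(Count[type == "A"][Count[type == "A"] <  max_B], collapse = ","),
            y = paste(Count[type == "A"][Count[type == "A"] >= max_B], collapse = ","),
            .groups = "drop")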
[R] Mailing List
Dear All, I am using R to do my work and thank you very much for developing, maintaining and making such excellent software available to anyone that is interested enough to ask for it. I have registered at Nabble. I was wondering what the right forum is for me to send my help requests to. I have tried sending to R-help@r-project.org. However, I do receive a kind of warning email stating that my email awaits approval from the moderator since I am a non-member posting to a members-only list. Can anyone please direct me to the right forum for me? My problems range from plotting graphs using R, statistics in R, etc. You may have seen some of my requests these past few days. Thank you for your time. Ogbos __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data reshaping with conditions
Hi Jim, Thanks for your time. But somehow this code did not help me to achieve my expected output. Problems: 1) x, y are coming as logical rather than values as I mentioned in my post 2) The values that I get for Max A and Max B not correct 3) It looks like a pretty big data, but I just need to concatenate the values with a comma, the final output will be a character variable. Regards, Sri On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon wrote: > Hi sri, > As your problem involves a few logical steps, I found it easier to > approach it in a stepwise way. Perhaps there are more elegant ways to > accomplish this. > > svdat<-read.table(text="Count id name type > 117 335 sally A > 19 335 sally A > 167 335 sally B > 18 340 susan A > 56 340 susan A > 22 340 susan B > 53 340 susan B > 135 351 lee A > 114 351 lee A > 84 351 lee A > 80 351 lee A > 19 351 lee A > 8 351 lee A > 21 351 lee A > 88 351 lee B > 111 351 lee B > 46 351 lee B > 108 351 lee B",header=TRUE) > # you can also do this with other reshape functions > library(prettyR) > svdatstr<-stretch_df(svdat,"id",c("Count","type")) > count_ind<-grep("Count",names(svdatstr)) > type_ind<-grep("type",names(svdatstr)) > svdatstr$maxA<-NA > svdatstr$maxB<-NA > svdatstr$x<-NA > svdatstr$y<-NA > for(row in 1:nrow(svdatstr)) { > svdatstr[row,"maxA"]<- > > max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]]) > svdatstr[row,"maxB"]<- > > max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]]) > svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"] > svdatstr[row,"y"]<-!svdatstr[row,"x"] > } > svdatstr > > You can then just extract the columns that you need. > > Jim > > > On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan wrote: > > Dear All, > > > > I am trying to reshape the data with some conditions. A small part of the > > data looks like below. Like this there will be more data with repeating > ID. > > > > Count id name type > > 117 335 sally A > > 19 335 sally A > > 167 335 sally B > > 18 340 susan A > > 56 340 susan A > > 22 340 susan B > > 53 340 susan B > > 135 351 lee A > > 114 351 lee A > > 84 351 lee A > > 80 351 lee A > > 19 351 lee A > > 8 351 lee A > > 21 351 lee A > > 88 351 lee B > > 111 351 lee B > > 46 351 lee B > > 108 351 lee B > > > > >From the above data I am expecting an output like below. > > > > id name type count_of_B Max of count B x y > > 335 sally B 167 167 117,19 NA > > 340 susan B 22,53 53 18 56 > > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 > > > > Where, the column x and column y are: > > > > x = Count_A_less_than_max of (Count type B) > > y = Count_A_higher_than_max of (Count type B). > > > > *1)* I tried with dplyr with the following code for the initial step to > get > > the values for each column. > > *2)* I thought to transpose the columns which has the unique ID alone. > > > > I tried with the following code and I am struck with the intial step > > itself. The code is executed but higher and lower value of A is not > coming. 
> > > > Expected_output= data %>% > > group_by(id, Type) %>% > > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>% > > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == > > "B"]),max(count[Type == "A"]))) %>% > > mutate(count_type_A_lesser = ifelse > > (Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], > > collapse = ",")), "NA"))%>% > > mutate(count_type_A_higher = > > ifelse(Type=="B",(paste(unlist(count[Type=="A"]) > > > Max_of_count_B[Type=="B"], collapse = ",")), "NA")) > > > > I hope I make my point clear. Please bare with the code, as I am new to > > this. > > > > Regards, > > sri > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > -- Regards, Srivathsan.K Phone : 9600165206 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.