Re: [R] How to Turn OFF Vectors Recycling Rule in Paste()

2015-08-27 Thread PIKAL Petr

As Jeff said, you cannot do this without modifying the source code of paste().

If your problem is as you described it, you can pad the shorter vector, e.g.

paste(S, append(X, rep(NA, length(S)-length(X))))
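
Note that padding with NA pastes the literal string "NA" into the last element. If an empty value is wanted instead, a minimal sketch along the same lines could look like this; the helper name pad_to is made up for illustration, and trimming the trailing separator is an assumption about the desired result:

```r
S <- c("aa", "bb", "cc", "dd", "ee")
X <- c("aa", "bb", "cc", "dd")

# Hypothetical helper: extend v to length n with a filler value.
pad_to <- function(v, n, fill = "") c(v, rep(fill, n - length(v)))

n <- max(length(S), length(X))

# Both vectors now have equal length, so paste() has nothing to recycle.
# trimws() drops the separator left behind by the empty filler.
z <- trimws(paste(pad_to(S, n), pad_to(X, n)))
z
```

Padding both vectors means the same call works whichever of S and X is the shorter one.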


> Hello Team,
> Platform:Windows 7 32-bit
> R Version:3.2.2
> I have two vectors as follows
> > S = c("aa", "bb", "cc", "dd", "ee")
> > X= c("aa", "bb", "cc", "dd")
> When I trying a paste function
> > z=paste(S,X)
> > z
> I am getting result as
> [1] "aa aa" "bb bb" "cc cc" "dd dd" "ee aa"
> I don't need the last element of S vector should get concatenated with
> first element of X vector rather than it should take a space or null
> value for concatenation. If I specify explicit space value in X vector
> like
> > X= c("aa", "bb", "cc", "dd", " ")
> Then I am getting result as
> > z=paste(S,X)
> > z
> [1] "aa aa" "bb bb" "cc cc" "dd dd" "ee  "
> Could you please suggest me how to turn off the Vector recycle in paste
> function?
> Thanks and Regards
> Preetiranjan Pradhan
Re: [R] igraph plot slowness

2015-08-27 Thread Loris Bennett
Loris Bennett  writes:

> Hi Jim,
> jim holtman  writes:
>> Here is what it does locally on my PC:
>>> library("igraph")
>>>  topo_data <- read.table(text = "ibcore01ibswitch01
>> +  ibcore01ibswitch02
>> +  ibcore01ibswitch03
>> +  ibcore02ibswitch01
>> +  ibcore02ibswitch02
>> +  ibcore02ibswitch03
>> +  ibswitch01  node001
>> +  ibswitch01  node002
>> +  ibswitch01  node003
>> +  ibswitch02  node004
>> +  ibswitch02  node005
>> +  ibswitch02  node006
>> +  ibswitch03  node007
>> +  ibswitch03  node008
>> +  ibswitch03  node009" ,head=FALSE)
>>>  system.time({
>> +  network_data <, directed=F)
>> +  plot(network_data)
>> + })
>>user  system elapsed
>> Does not seem too slow.  Creating a PDF file takes a little longer:
>>> library("igraph")
>>>  topo_data <- read.table(text = "ibcore01ibswitch01
>> +  ibcore01ibswitch02
>> +  ibcore01ibswitch03
>> +  ibcore02ibswitch01
>> +  ibcore02ibswitch02
>> +  ibcore02ibswitch03
>> +  ibswitch01  node001
>> +  ibswitch01  node002
>> +  ibswitch01  node003
>> +  ibswitch02  node004
>> +  ibswitch02  node005
>> +  ibswitch02  node006
>> +  ibswitch03  node007
>> +  ibswitch03  node008
>> +  ibswitch03  node009" ,head=FALSE)
>>>  system.time({
>> +  network_data <, directed=F)
>> +  pdf('test.pdf')
>> +  plot(network_data)
>> +
>> + })
>>user  system elapsed
>> The PDF file is attached.  So maybe it is something with your remote
>> connection.
> You're right.  Running locally even the plot of complete network takes
> less than 0.2 seconds via the X11 device.  I'll have a closer look at
> the connection.
> Thanks,
> Loris

I found the solution here:

namely calling


in the remote R session.



Re: [R] TSclust multivariate time series clustering

2015-08-27 Thread cgenolin
You can also try kml3d. You can either use some default distances or define
your own.



Re: [R] TSclust multivariate time series clustering

2015-08-27 Thread Ranjan Maitra

Although there is no R package available (we did not think of it), if you want 
a Gaussian-mixture-model-based approach, you may look at the paper:

"Model-Based Clustering of Regression Time Series Data via APECM—An AECM 
Algorithm Sung to an Even Faster Beat" by Wei-Chen Chen and Ranjan Maitra 

which appeared in Statistical Analysis and Data Mining in December 2011 (pages 
567-578), has DOI:10.1002/sam.10143 and won the primary author (Wei-Chen Chen) 
a JSM 2011 Best Student Paper award in Statistical Learning and Data Mining.

The code was all in R, so it is available as such upon personal request -- see 
e-mail addresses in the paper, but (as I mentioned earlier) not as an R package.

Many thanks and best wishes,

Re: [R] compiling Rmd - can't find tex file....

2015-08-27 Thread Witold E Wolski
To answer my own question:

I did not find out what knit2pdf is good for,
but rmarkdown::render does the job.


On 26 August 2015 at 11:46, Witold E Wolski  wrote:

> I am using from within R-studio and the .Rmd file builds nicely.
> However, when I try to compile the file using:
> knit2pdf( "specL.Rmd", output=file.path("res.12345","specL.pdf") )
> I am getting tex errors (see below). When I want to check what's wrong I
> can't find the tex file.
> Any ideas?
> Thank you
> PS: Errors:
>  |..   |  95%
> label: writepepProt
>   |.| 100%
>   ordinary text without R code
> *output file: res.12345/specL.pdf*
> *Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,
> : *
> *  Running 'texi2dvi' on 'specL.pdf' failed.*
> *LaTeX errors:*
> *! Missing $ inserted.*
> * *
> *$*
> * *
> *   _*
> *! Missing $ inserted.*
> * *
> *$*
> * *
> *   \par *
> *! You can't use `macro parameter character #' in horizontal mode.*
> *l.15 #*
Re: [R] How to Turn OFF Vectors Recycling Rule in Paste()

2015-08-27 Thread Preeti ranjan Pradhan
Okay, thank you all for your reply.

In SAS it doesn't do that way, so I was bit confused. Anyways thanks all.


[R] Rcpp, function signature

2015-08-27 Thread Michael Meyer via R-help

I am a (very) grateful user of Rcpp.
As such I defined a function 

// [[Rcpp::export]]
leftShift(NumericVector x){

[R] Problem R markdown document

2015-08-27 Thread Conklin, Mike (GfK)
I have successfully done this many times using RStudio's rmarkdown capabilities 
and knitting the document to HTML or Word. However, I am running into this 
error today.

"C:/Program Files/RStudio/bin/pandoc/pandoc" --to 
docx --from 
markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures
 --output FusionTestsAugust25.docx --highlight-style tango 
pandoc.exe: getMBlocks: VirtualAlloc MEM_COMMIT failed
Error: pandoc document conversion failed with error 1

The same error occurs whether knitting to Word or to HTML. It looks like a 
memory issue, but this whole process is of limited use if the entire analysis 
runs but the document can't be output. If anyone has any ideas on how to deal 
with memory issues in the final pandoc conversion step, or can point me to 
where to look for this, I would appreciate it.

W. Michael Conklin
Executive Vice President
Marketing & Data Sciences - North America
GfK | 8401 Golden Valley Road | Minneapolis | MN | 55427 
T +1 763 417 4545 | M +1 612 567 8287 

Re: [R] Problem R markdown document

2015-08-27 thread boB Rudis
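Try increasing the memory for pandoc via knitr YAML options:

---
title: "TITLE"
output:
  html_document:
    pandoc_args: [
      "+RTS", "-K64m",
      "-RTS"
    ]
---

(Use word_document in place of html_document when knitting to Word.)
You can bump up those numbers too, IIRC, if they don't work at first.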

On Thu, Aug 27, 2015 at 1:55 PM, Conklin, Mike (GfK)
> I have successfully done this many times using RStudio's rmarkdown 
> capabilities and knitting the document to HTML or Word. However, I am running 
> into this error today.
> "C:/Program Files/RStudio/bin/pandoc/pandoc" --to 
> docx --from 
> markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures
>  --output FusionTestsAugust25.docx --highlight-style tango
> pandoc.exe: getMBlocks: VirtualAlloc MEM_COMMIT failed
> Error: pandoc document conversion failed with error 1
> Same error occurs whether to knitting to Word or to HTML.  It looks like the 
> a memory issue but this whole process is of limited use if the entire 
> analysis runs but the document can't be output.  If anyone has any ideas on 
> how to deal with memory issues in the final step of pandoc conversion or can 
> point me to where to look for this I would appreciate it.
> Best regards,
> Mike
[R] Problem with gridExtra

2015-08-27 Thread Lorenzo Isella
Dear All,
Please consider the snippet at the end of the email, largely based on
what you find here

When I run it, I get this error

Error in arrangeGrob(p, sub = textGrob("Footnote", x = 0, hjust =
-0.1,  :
 could not find function "textGrob"

However, the code runs on another machine I own. I suppose something
must have changed in the gridExtra library but right now I am banging
my head against the wall.

This is my sessionInfo()

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
[1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
[3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
[5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
[7] LC_PAPER=en_GB.utf8       LC_NAME=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] gridExtra_2.0.0 ggplot2_1.0.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.6      digest_0.6.8     MASS_7.3-43      grid_3.2.2
 [5] plyr_1.8.3       gtable_0.1.2     magrittr_1.5     scales_0.3.0
 [9] stringi_0.5-5    reshape2_1.4.1   proto_0.3-10     labeling_0.3
[13] tools_3.2.2      stringr_1.0.0    munsell_0.4.2    colorspace_1.2-6

Any suggestion is appreciated.


library(ggplot2)
toyota <- mpg[which(mpg$manufacturer == 'toyota'), ]
p <- ggplot(toyota, aes(displ, hwy)) + facet_wrap(~ class, ncol = 2) +
  geom_point(aes(size=cyl))
print(p)
library(gridExtra)
g <- arrangeGrob(p, sub = textGrob("Footnote", x = 0, hjust = -0.1,
  vjust=0.1, gp = gpar(fontface = "italic", fontsize = 18)))
ggsave("/Users/Alan/Desktop/plot_grid_extra.png", g)

[R] Fisher's Test 5x4 table

2015-08-27 thread paul brett
Dear all,
I am trying to do a Fisher's exact test on a 5x4 table in R. I have
already done a chi-squared test using Minitab on this data set, getting
a result of (1, N = 165.953, DF 12, p>0.001), yet these results (even
though they are excellent) may not be suitable for publication. I have
tried numerous other statistical packages in the hope of doing this
test, yet each one offers only the 2x2 table.
I am struggling to adapt the template Fisher's test in R to fit my table
(according to the R book it is possible, yet I cannot get it to work).
The template given in the R documentation and the R book is for a 2x2
Fisher test. What do I need to change to get this to work? I have
attached the data to this email so you can see what I mean. Or do I have
to write my own code to compute this?

 Yours Sincerely,
 Paul Brett
Traps              Insecta  Diplopoda  Arachnia  Malocostan
5.trap.4.barrier       345         77       200         154
1.trap.4.barrier       170         54        61          58
(label missing)        232         19        30           5
(label missing)         59          5         6           5
(label missing)        105         11        26          37
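
For reference, fisher.test() in base R is not limited to 2x2 tables; it accepts any r x c matrix of counts. A minimal sketch with the counts as read from the table in this post follows. The last three row labels did not survive, so the names row3 to row5 are placeholders, and using a Monte Carlo p-value is an assumption made here because the exact computation can run out of workspace on tables with counts this large:

```r
# Counts as read from the posted table; row3..row5 are placeholder labels.
counts <- matrix(c(345, 77, 200, 154,
                   170, 54,  61,  58,
                   232, 19,  30,   5,
                    59,  5,   6,   5,
                   105, 11,  26,  37),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(
                   Traps = c("5.trap.4.barrier", "1.trap.4.barrier",
                             "row3", "row4", "row5"),
                   Taxon = c("Insecta", "Diplopoda", "Arachnia", "Malocostan")))

set.seed(1)  # the simulated p-value depends on the random seed

# simulate.p.value = TRUE estimates the p-value from B random tables
# with the same margins instead of enumerating all of them.
res <- fisher.test(counts, simulate.p.value = TRUE, B = 10000)
res$p.value
```

With a simulated p-value the smallest reportable value is 1/(B + 1), so B should be chosen with the desired resolution in mind.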
Re: [R] Rcpp, function signature

2015-08-27 thread Dirk Eddelbuettel
Michael Meyer via R-help> writes:

> I am an  (very) grateful user of Rcpp.

Glad to hear that!

But you are on the wrong mailing list. Please ask on rcpp-devel.


[R] Issues with RPostgres

2015-08-27 thread Abraham Mathew
I have a user-defined function that I'm using alongside a PostgreSQL
connection to summarize some data. I've connected to the local machine
with no problem. However, the connection keeps throwing the following
error when I attempt to use it. Can anyone point to what I could be
doing wrong?

> ds_summary(con, "test", vars=c("Age"), y=c("Class"))
Error in postgresqlNewConnection(drv, ...) :
  RS-DBI driver: (could not connect postgres@localhost on dbname "test"

con is the connection
test is the database table
age is the attribute that will be summarized
class is the response variable

Can anyone help?


Re: [R] lsqlin in R package pracma

2015-08-27 thread Raubertas, Richard
Is it really that complicated?  This looks like an ordinary quadratic 
programming problem, and 'solve.QP' from the 'quadprog' package seems to solve 
it without user-specified starting values:

library(quadprog)

Dmat <- t(C) %*% C
dvec <- (t(C) %*% d)
# solve.QP wants t(Amat) %*% x >= bvec, so A x <= b is negated
Amat <- -1 * t(A)
bvec <- -1 * b

rslt <- solve.QP(Dmat, dvec, Amat, bvec)
sum((C %*% rslt$solution - d)^2)

[1] 0.01758538

Richard Raubertas
Merck & Co.

On Mon Aug 24 Wang, Xue, Ph.D. Wang.Xue at wrote
> I am looking for a R version of Matlab function lsqlin. I came across
> R pracma package which has a lsqlin function. Compared with Matlab lsqlin,
> the R version does not allow inequality constraints.
> I am wondering if this functionality will be available in future. And also
> like to get your opinion on which R package/function is the best for
> least square minimization problem with linear inequality constraints.
> Thanks very much for your time and attention!

Solving (linear) least-squares problems with linear inequality constraints
is more difficult than one would expect. Inspecting the MATLAB code reveals
that it employs advanced methods such as active-set (linear inequality
constraints) and interior-point (for bounds constraints).

Function nlsLM() in package *minpack.lm* supports bound constraints if that
is sufficient for you. The same is true for *nlmrt*. Convex optimization
might be a promising approach for linear inequality constraints, but there
is no easy-to-handle convex solver in R at this moment.

So the most straightforward way would be to use constrOptim(), that is
optim with linear constraints. It requires a reasonable starting point, and
keeping your fingers crossed that you are able to find such a point in the
interior of the feasible region.

If someone wants to try, here is the example from the MATLAB "lsqlin" page:

C <- matrix(c(
0.9501,   0.7620,   0.6153,   0.4057,
0.2311,   0.4564,   0.7919,   0.9354,
0.6068,   0.0185,   0.9218,   0.9169,
0.4859,   0.8214,   0.7382,   0.4102,
0.8912,   0.4447,   0.1762,   0.8936), 5, 4, byrow=TRUE)
d <- c(0.0578, 0.3528, 0.8131, 0.0098, 0.1388)
A <- matrix(c(
0.2027,   0.2721,   0.7467,   0.4659,
0.1987,   0.1988,   0.4450,   0.4186,
0.6037,   0.0152,   0.9318,   0.8462), 3, 4, byrow=TRUE)
b <- c(0.5251, 0.2026, 0.6721)

The least-square function to be minimized is  ||C x - d||_2 , and the
constraints are  A x <= b :

f <- function(x) sum((C %*% x - d)^2)

The solution x0 returned by MATLAB has a minimum of  f(x0) = 0.01759204 .
This point does not lie in the interior and cannot be used for a start.
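
As a rough illustration of the constrOptim() route described above, here is a minimal sketch using the same example data. Two things are assumptions made here and not part of the original message: the origin is used as the starting value (it is strictly feasible because all entries of b are positive), and an analytic gradient is supplied so that the BFGS variant is used:

```r
C <- matrix(c(
  0.9501, 0.7620, 0.6153, 0.4057,
  0.2311, 0.4564, 0.7919, 0.9354,
  0.6068, 0.0185, 0.9218, 0.9169,
  0.4859, 0.8214, 0.7382, 0.4102,
  0.8912, 0.4447, 0.1762, 0.8936), 5, 4, byrow = TRUE)
d <- c(0.0578, 0.3528, 0.8131, 0.0098, 0.1388)
A <- matrix(c(
  0.2027, 0.2721, 0.7467, 0.4659,
  0.1987, 0.1988, 0.4450, 0.4186,
  0.6037, 0.0152, 0.9318, 0.8462), 3, 4, byrow = TRUE)
b <- c(0.5251, 0.2026, 0.6721)

f  <- function(x) sum((C %*% x - d)^2)
gr <- function(x) drop(2 * t(C) %*% (C %*% x - d))  # gradient of f

# constrOptim() expects constraints in the form ui %*% x - ci >= 0,
# so A x <= b is passed as ui = -A, ci = -b.
x_start <- rep(0, 4)                 # assumed start: A %*% 0 = 0 < b
stopifnot(all(A %*% x_start < b))    # must be strictly interior

res <- constrOptim(x_start, f, gr, ui = -A, ci = -b)
res$par
res$value
```

The value found this way can be compared with the 0.01758538 reported for solve.QP in this thread; a barrier method approaches active constraints from the inside, so small differences are to be expected.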

Re: [R] Issues with RPostgres

2015-08-27 thread John McKown
On Thu, Aug 27, 2015 at 2:29 PM, Abraham Mathew 

> I have a user-defined function that I'm using alongside a postgresql
> connection to
> summarize some data. I've connected to the local machine with no problem.
> However,
> the connection keeps throwing the following error when I attempt to use it.
> Can anyone point
> to what I could be doing wrong.
> > ds_summary(con, "test", vars=c("Age"), y=c("Class"))
> Error in postgresqlNewConnection(drv, ...) :
>   RS-DBI driver: (could not connect postgres@localhost on dbname "test"
> )
> con is the connection

It would be helpful to see the assignment to "con" as well as any other
assignments related to this. If you are using the DBI package, then what I
am talking about would be something like:

drv <- dbDriver("PgSQL")
con <- dbConnect(drv, user=..., password=..., dbname="test")

From looking at the message, it appears to me that you are trying to
connect to PostgreSQL as the "postgres" user. That just seems wrong to me.
Normally that user is only for administration purposes. It does not
normally contain user tables such as "test". I would think that what you
needed would be your PostgreSQL user id. Or the id of the owner of the
"test" table.

> test is the database table
> age is the attribute that will be summarized
> class is the response variable
> Can anyone help?
Re: [R] lsqlin in R package pracma

2015-08-27 thread Wang, Xue, Ph.D.
Hi Richard,

It is good to know that solve.QP can solve quadratic programming problems. The 
difficulty here is that the objective function might not be in quadratic form. 
It is not in the form of t(X)QX, where Q is an n by n symmetric matrix.




Is it really that complicated?  This looks like an ordinary quadratic 
programming problem, and 'solve.QP' from the 'quadprog' package seems to solve 
it without user-specified starting values:

Dmat <- t(C) %*% C
dvec <- (t(C) %*% d)
Amat <- -1 * t(A)
bvec <- -1 * b

rslt <- solve.QP(Dmat, dvec, Amat, bvec)
sum((C %*% rslt$solution - d)^2)

[1] 0.01758538

Richard Raubertas
Merck & Co.

On Mon Aug 24 Wang, Xue, Ph.D. Wang.Xue at wrote
> I am looking for a R version of Matlab function lsqlin. I came across 
> R pracma package which has a lsqlin function. Compared with Matlab 
> lsqlin, the R version does not allow inequality constraints.
> I am wondering if this functionality will be available in future. And 
> also like to get your opinion on which R package/function is the best 
> for
> least square minimization problem with linear inequality constraints.
> Thanks very much for your time and attention!

Solving (linear) least-squares problems with linear inequality constraints is 
more difficult than one would expect. Inspecting the MATLAB code reveals that 
it employs advanced methods such as active-set (linear inequality
constraints) and interior-point (for bounds constraints).

Function nlsLM() in package *minpack.lm* supports bound constraints if that is 
sufficient for you. The same is true for *nlmrt*. Convex optimization might be 
a promising approach for linear inequality constraints, but there is no 
easy-to-handle convex solver in R at this moment.

So the most straightforward way would be to use constrOptim(), that is optim 
with linear constraints. It requires a reasonable starting point, and keeping 
your fingers crossed that you are able to find such a point in the interior of 
the feasible region.

If someone wants to try, here is the example from the MATLAB "lsqlin" page:

C <- matrix(c(
0.9501,   0.7620,   0.6153,   0.4057,
0.2311,   0.4564,   0.7919,   0.9354,
0.6068,   0.0185,   0.9218,   0.9169,
0.4859,   0.8214,   0.7382,   0.4102,
0.8912,   0.4447,   0.1762,   0.8936), 5, 4, byrow=TRUE)
d <- c(0.0578, 0.3528, 0.8131, 0.0098, 0.1388)
A <- matrix(c(
0.2027,   0.2721,   0.7467,   0.4659,
0.1987,   0.1988,   0.4450,   0.4186,
0.6037,   0.0152,   0.9318,   0.8462), 3, 4, byrow=TRUE)
b <- c(0.5251, 0.2026, 0.6721)

The least-square function to be minimized is  ||C x - d||_2 , and the 
constraints are  A x <= b :

f <- function(x) sum((C %*% x - d)^2)

The solution x0 returned by MATLAB has a minimum of  f(x0) = 0.01759204 .
This point does not lie in the interior and cannot be used for a start.

Re: [R] Issues with RPostgres

2015-08-27 thread Hadley Wickham
On Thu, Aug 27, 2015 at 3:46 PM, John McKown
> On Thu, Aug 27, 2015 at 2:29 PM, Abraham Mathew 
> wrote:
>> I have a user-defined function that I'm using alongside a postgresql
>> connection to
>> summarize some data. I've connected to the local machine with no problem.
>> However,
>> the connection keeps throwing the following error when I attempt to use it.
>> Can anyone point
>> to what I could be doing wrong.
>> > ds_summary(con, "test", vars=c("Age"), y=c("Class"))
>> Error in postgresqlNewConnection(drv, ...) :
>>   RS-DBI driver: (could not connect postgres@localhost on dbname "test"
>> )
>> con is the connection
> It would be helpful to see the assignment to "con" as well as any other
> assignments related to this. If you are using the DBI package, then what I
> am talking about would be something like:
> drv<-dbDriver("PgSQL")
> con<-dbConnect(drb,user=...,password=...,dbname="test');

FWIW the best way to create a connection is:

con <- dbConnect(RPostgreSQL::PostgreSQL(), ...)

The older string based approach is not advised.


[R] ggplot2 scale_shape_manual with large numbers instead of shapes

2015-08-27 thread Marian Talbert
I'm trying to produce a plot with climate data in which colors describe one
aspect of the data (emissions scenario) and numbers rather than shapes show
the model used (there are 36 models for one emissions scenario and 34 for
the other).  I'm trying to use numbers rather than symbols because there are
36 climate models and thus not enough symbols.  Numbering seems more
consistent than some combo of letters and symbols.  I couldn't figure out
how to define my own shapes as numbers 1 to 36 using scale_shape_manual so
I'm adding the numbers with annotate.  The problem is that I'd like a second
legend linking the numbering to the long model names but am having a hard
time with this.  I've created a toy example below to make this more clear.
p1 below was my original plot, and I'd like p2, only with a second legend
linking numbers to long model names. Any suggestions?


  Emissions=factor(rep(c("RCP 4.5","RCP 8.5"),each=36)))
 Pquants <- aggregate(Dat$Precp,list(RCP=Dat$Emissions),
 Tquants <- aggregate(Dat$Temp,list(RCP=Dat$Emissions),

#Original Plot
 p1 <- ggplot()+geom_point(Dat,mapping=aes(x=Temp,y=Precp,colour=Emissions),
  annotate("text", label=Labels, x=Dat$Temp,
y=Dat$Precp,colour=c("#EEB422BE","#FFBE")[Dat$Emissions]) +
  guides(fill=guide_legend(reverse=TRUE))+theme(axis.title =
element_text(size = 2)) +

#with numbers instead of model names
  annotate("text", label=Labels, x=Dat$Temp,
  guides(fill=guide_legend(reverse=TRUE))+theme(axis.title =
element_text(size = 2)) +

Re: [R] ggplot2 scale_shape_manual with large numbers instead of shapes

2015-08-27 thread Hadley Wickham
Something like this?

df <- data.frame(
  x = runif(30),
  y = runif(30),
  z = factor(1:30)
)

ggplot(df, aes(x, y)) +
  geom_point(aes(shape = z), size = 5) +
  scale_shape_manual(values = c(letters, 0:9))


On Thu, Aug 27, 2015 at 4:48 PM, Marian Talbert  wrote:
> I'm trying to produce a plot with climate data in which colors describe one
> aspect of the data (emissions scenario) and numbers rather than shapes show
> the model used (there are 36 models for one emissions scenario and 34 for
> the other).  I'm trying to use numbers rather than symbols because there are
> 36 climate models and thus not enough symbols.  Numbering seems more
> consistent than some combo of letters and symbols.  I couldn't figure out
> how to define my own shapes as numbers 1 to 36 using scale_shape_manual so
> I'm adding the numbers with annotate.  The problem is that I'd like a second
> legend linking the numbering to the long model names but am having a hard
> time with this.  I've created a toy example below to make this more clear.
> p1 below was my original plot and I'd like p2 only with the second legend
> linking numbers to long model names any suggestions?
> library(ggplot2)
> Dat<-data.frame(Temp=c(rnorm(36,0,1),rnorm(36,1.5,1)),Precp=c(rnorm(36,0,1),rnorm(36,1,1)),
> model=factor(rep(paste("LongModelName",c(letters,1:10),sep="_"),times=2)),
>   Emissions=factor(rep(c("RCP 4.5","RCP 8.5"),each=36)))
>  EmissionsCol<-c("goldenrod2","red")
>  Pquants <- aggregate(Dat$Precp,list(RCP=Dat$Emissions),
>  Tquants <- aggregate(Dat$Temp,list(RCP=Dat$Emissions),
>  Quants<-data.frame(Emissions=Tquants$RCP,Tmin=Tquants[[2]][,1],
>   TMedian=Tquants[[2]][,2],Tmax=Tquants[[2]][,3],
> Pmin=Pquants[[2]][,1],PMedian=Pquants[[2]][,2],Pmax=Pquants[[2]][,3])
> #Original Plot
> Labels<-Dat$model
>  p1 <- ggplot()+geom_point(Dat,mapping=aes(x=Temp,y=Precp,colour=Emissions),
>  size=.1)+
>  scale_colour_manual(values=c("#EEB422BE","#FFBE"),guide="none")+
>   annotate("text", label=Labels, x=Dat$Temp,
> y=Dat$Precp,colour=c("#EEB422BE","#FFBE")[Dat$Emissions]) +
>   guides(fill=guide_legend(reverse=TRUE))+theme(axis.title =
> element_text(size = 2)) +
> geom_segment(data=Quants,mapping=aes(x=Tmin,y=PMedian,xend=Tmax,yend=PMedian),size=2,colour="black")+
> geom_segment(data=Quants,mapping=aes(x=TMedian,y=Pmin,xend=TMedian,yend=Pmax),size=2,colour="black")+
> geom_segment(data=Quants,mapping=aes(x=Tmin,y=PMedian,xend=Tmax,yend=PMedian,colour=Emissions),size=1)+
> geom_segment(data=Quants,mapping=aes(x=TMedian,y=Pmin,xend=TMedian,yend=Pmax,colour=Emissions),size=1)+
> geom_point(data=Quants,mapping=aes(x=TMedian,y=PMedian,fill=Emissions),size=6,pch=21,colour="black")+
>   scale_fill_manual(values=EmissionsCol)
> p1
> #with numbers instead of model names
> Labels<-as.numeric(factor(Dat$model))
>  p2<-
> ggplot()+geom_point(Dat,mapping=aes(x=Temp,y=Precp,colour=Emissions),size=.1)+
>  scale_colour_manual(values=c("#EEB422BE","#FFBE"),guide="none")+
>   annotate("text", label=Labels, x=Dat$Temp,
> y=Dat$Precp,colour=c("#EEB422BE","#FFBE")[Dat$Emissions])+
>   guides(fill=guide_legend(reverse=TRUE))+theme(axis.title =
> element_text(size = 2)) +
> geom_segment(data=Quants,mapping=aes(x=Tmin,y=PMedian,xend=Tmax,yend=PMedian),size=2,colour="black")+
> geom_segment(data=Quants,mapping=aes(x=TMedian,y=Pmin,xend=TMedian,yend=Pmax),size=2,colour="black")+
> geom_segment(data=Quants,mapping=aes(x=Tmin,y=PMedian,xend=Tmax,yend=PMedian,colour=Emissions),size=1)+
> geom_segment(data=Quants,mapping=aes(x=TMedian,y=Pmin,xend=TMedian,yend=Pmax,colour=Emissions),size=1)+
> geom_point(data=Quants,mapping=aes(x=TMedian,y=PMedian,fill=Emissions),size=6,pch=21,colour="black")+
>   scale_fill_manual(values=EmissionsCol)
> p2
Re: [R] ggplot2 scale_shape_manual with large numbers instead of shapes

2015-08-27 thread Marian Talbert
Not exactly. I was trying to use only numbers for symbols, instead of a mix of
letters and numbers, just to be consistent. I'm pretty sure someone will nag
me if I use both letters and numbers as symbols.

[R] heat map labeling

2015-08-27 thread Angela via R-help

I have a dataset of 985 genes, looks something like the ones below. I want to 
label only those with the high intensities, since labeling all doesn't show up. 
Is there a way to do that? If not, is there a way to pull out the highest ones 
(say, highest 50, or those above X amount) and only show those in a heat map? 


Z transforming gives all cells the same value, just + or - (for example, all 
have 0.5 and -0.5). The researchers want the actual values used.

Gene    var1      var2
A       800       0
B       25        30
C       75        200
D       0         0
E       400       600
E       500       70
E       100       100
F       600       600
F       70        827460
G       420930    40
H       0         0
H       100       100
I       70        60
J       0         70
K       0         0
L       20        50
L       100       300
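
One way to approach the "highest N only" variant is to rank rows by their largest intensity and pass only the top rows to heatmap(). The following is a minimal sketch with a few rows copied from the table above; the cut-off top_n and the choice of the row maximum as ranking criterion are assumptions for illustration:

```r
# A few rows copied from the example table; gene names repeat in the data.
expr <- data.frame(
  Gene = c("A", "B", "C", "D", "E", "E", "F", "G"),
  var1 = c(800, 25, 75, 0, 400, 500, 70, 420930),
  var2 = c(0, 30, 200, 0, 600, 70, 827460, 40),
  stringsAsFactors = FALSE
)

top_n <- 3  # hypothetical cut-off; the post suggests e.g. the highest 50

# Rank rows by their largest intensity across the two variables.
row_max <- pmax(expr$var1, expr$var2)
keep <- order(row_max, decreasing = TRUE)[seq_len(min(top_n, nrow(expr)))]

m <- as.matrix(expr[keep, c("var1", "var2")])
rownames(m) <- make.unique(expr$Gene[keep])  # repeated genes get .1, .2, ...

# scale = "none" keeps the actual values instead of row z-scores.
heatmap(m, Rowv = NA, Colv = NA, scale = "none", margins = c(5, 5))
```

To keep all 985 rows but label only the strong ones, the labRow argument of heatmap() takes a character vector, so labels of rows below a threshold can be set to "".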

Re: [R] xyplot colour points and layout

2015-08-27 thread Duncan Mackay

Following on from David's reply, you can do the following if you want a key or
By putting the colour scheme in par.settings, the "local" equivalent of
setting trellis.par.set() for that plot,
you can get things right for the key without having to add arguments to

  culr <- ifelse(Raw$Year == "Y2002", "Year 2002", "Year 2014")

  xyplot(Abun ~ Date1 | Station, data = Raw,
         groups = culr,
         par.settings = list(strip.background = list(col = "transparent"),
                             superpose.symbol = list(cex = rep(2, 2),
                                                     pch = rep(16, 2))),
         auto.key = TRUE)

for a list of the settings


Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351

Dear All,

I have tried to plot graphs of one row of four figures for each station.  In
each graph, black points indicate data in the year of 2002, denoted as
Y2002, whereas grey points indicate data in the year of 2014, denoted as
Y2014.  I ended up with 2x2 plots with all data points in black.  Can anyone
find out what has gone wrong by any chance please?

Raw<-structure(list(Date = structure(c(6L, 7L, 2L, 4L, 12L, 9L, 7L, 
2L, 4L, 12L, 6L, 15L, 14L, 3L, 6L, 1L, 16L, 5L, 11L, 8L, 4L, 
10L, 13L, 6L, 1L, 16L, 5L, 11L, 8L, 4L, 10L, 13L, 6L, 1L, 16L, 
5L, 11L, 8L, 4L, 10L, 13L, 11L, 8L, 4L, 10L, 13L), .Label = c("1/10", 
"1/11", "11/11", "12/11", "13/10", "19/9", "2/10", "2/11", "20/9", 
"26/11", "29/10", "29/11", "30/11", "31/10", "4/10", "6/10"), class = "factor"), 
Year = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Y2002", "Y2014"), class = "factor"), 
Station = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 4L, 4L, 4L, 4L, 4L), .Label = c("E", "F", "H", "I"
), class = "factor"), Abun = c(3.42, 1.33, 3.67, 3.67, 3.92, 
2.17, 2.5, 1.67, 6.33, 0.67, 1, 1, 1.33, 2.08, 0, 0, 0.33, 
0.08, 0.08, 0, 0.5, 0.17, 0.67, 0.67, 0, 1, 0.58, 1.5, 2.67, 
0.67, 1.33, 3, 0.58, 1.17, 1.25, 0.75, 1.25, 1.75, 0.92, 
1.5, 0.83, 0.75, 2.33, 0.67, 1.33, 1.58), Date1 = structure(c(16697, 
16710, 16740, 16751, 16768, 16698, 16710, 16740, 16751, 16768, 
16697, 16712, 16739, 16750, 16697, 16709, 16714, 16721, 16737, 
16741, 16751, 16765, 16769, 16697, 16709, 16714, 16721, 16737, 
16741, 16751, 16765, 16769, 16697, 16709, 16714, 16721, 16737, 
16741, 16751, 16765, 16769, 16737, 16741, 16751, 16765, 16769
), class = "Date")), .Names = c("Date", "Year", "Station", 
"Abun", "Date1"), row.names = c(NA, -46L), class = "data.frame")

Many thanks.


Re: [R] Problem with gridExtra

2015-08-27 thread Richard M. Heiberger
gridExtra was changed.  This is the email from Baptiste to CRAN package
developers that describes the changes and
points to the vignettes that will describe the changes.  The changes
described here are now in the current release of gridExtra.

Baptiste Auguie 
Jul 9
to Borja, Pablo, Paul-Christian, Zachary, Andrey, Liam, Michael, Rafael,
Mikkel, Xinyu, Christopher, Andrew, Thierry, Diogo, Grigori, Felix, Adelino
, Dean, Wencke, Brian, me, Frank, Jason, Pieter, Timothy
Dear package maintainers,

I'm working on a long-overdue update of gridExtra for CRAN, and I believe
your package depends on it. Please have a look at the dev version on
github, and let me know if it breaks something in your package.

I've removed practically everything; only two main functions are left:
and grid.table(). I believe they were by-and-large the only ones actually
used, and the rest was mostly experimental code that shouldn't stay on
I've rewritten these two functions using gtable, which I found more
practical and extensible. However, this means that the new functions are
entirely different from their predecessor, internally, and may break a lot
of code. I have included two vignettes for an overview of these updated
functions, also reproduced in the wiki:
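
For the error reported in this thread, the sessionInfo() output is consistent with this rewrite: grid is loaded via a namespace but not attached, so textGrob() and gpar() are not found unless grid is attached explicitly. A minimal sketch of an adapted snippet follows, assuming gridExtra 2.0.0 or later, where the footnote argument of arrangeGrob() is called bottom instead of sub:

```r
library(ggplot2)
library(grid)       # provides textGrob() and gpar()
library(gridExtra)

toyota <- mpg[which(mpg$manufacturer == "toyota"), ]
p <- ggplot(toyota, aes(displ, hwy)) +
  facet_wrap(~ class, ncol = 2) +
  geom_point(aes(size = cyl))

# 'bottom' takes the place of the former 'sub' argument.
g <- arrangeGrob(p,
                 bottom = textGrob("Footnote", x = 0, hjust = -0.1, vjust = 0.1,
                                   gp = gpar(fontface = "italic", fontsize = 18)))

grid.newpage()
grid.draw(g)   # arrangeGrob() only builds the layout; this draws it
```

Calling grid::textGrob() and grid::gpar() with the namespace prefix would work as well if attaching grid is not wanted.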



On Thu, Aug 27, 2015 at 3:33 PM, Lorenzo Isella 

> Dear All,
> Please consider the snippet at the end of the email, largely based on
> what you find here
> When I run it, I get this error
> Error in arrangeGrob(p, sub = textGrob("Footnote", x = 0, hjust =
> -0.1,  :
>  could not find function "textGrob"
> However, the code runs on another machine I own. I suppose something
> must have changed in the gridExtra library but right now I am banging
> my head against the wall.
> This is my sessionInfo()
> sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux stretch/sid
> locale:
> [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C
>  [3] LC_TIME=en_GB.utf8LC_COLLATE=en_GB.utf8
>   [5] LC_MONETARY=en_GB.utf8LC_MESSAGES=en_GB.utf8
>[7] LC_PAPER=en_GB.utf8   LC_NAME=C
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> other attached packages:
> [1] gridExtra_2.0.0 ggplot2_1.0.1
> loaded via a namespace (and not attached):
> [1] Rcpp_0.11.6  digest_0.6.8 MASS_7.3-43  grid_3.2.2
>  [5] plyr_1.8.3   gtable_0.1.2 magrittr_1.5 scales_0.3.0
>   [9] stringi_0.5-5reshape2_1.4.1   proto_0.3-10 labeling_0.3
>   [13] tools_3.2.2  stringr_1.0.0munsell_0.4.2
> colorspace_1.2-6
> Any suggestion is appreciated.
> Cheers
> Lorenzo
> ##
> library(ggplot2)
> toyota <- mpg[which(mpg$manufacturer == 'toyota'), ]
> p <- ggplot(toyota, aes(displ, hwy)) + facet_wrap(~ class, ncol = 2) +
> geom_point(aes(size=cyl))
> print(p)
> library(gridExtra)
> g <- arrangeGrob(p, sub = textGrob("Footnote", x = 0, hjust = -0.1,
> vjust=0.1, gp = gpar(fontface = "italic", fontsize = 18)))
> ggsave("/Users/Alan/Desktop/plot_grid_extra.png", g)
[R] Piecewise regression using segmented package plotted in xyplot

2015-08-27 thread Sumitrajit Dhar

xyplot(threshold ~ age |frequency.a, data=rage,
  xlab="Age (years)",
  ylab="Threshold (dB SPL)",
  panel=function(x,y,groups,...) {
# panel.abline(segmented(lm(threshold~age),seg.Z = ~age, psi = NA, control = 

Is there any way to make the commented line work in lattice? I need to fit my 
data in each panel using piecewise regression. Being able to use segmented 
would make it easy.

The code above works to give me a linear fit.

Thanks for your help in advance.


[R] Gaussian Mixture Regression

2015-08-27 thread lucasmalta

I am looking for a way to run a Gaussian Mixture Regression (GMR) in R.

In other words, say that I have a Gaussian Mixture Model (GMM) calculated
using, for example, the MClust library. This model represents the joint
distribution of two independent variables P(A,B). I need to calculate P(A|B
= b).

I have seen implementations in Matlab and Python for that, but the rest
of my project is in R. I wanted to make sure it does not exist in R (no luck
on Google so far...) before trying to implement it myself.
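
To make the target quantity concrete: for a two-dimensional Gaussian mixture, P(A | B = b) is again a mixture of Gaussians, with re-weighted components and the usual conditional mean and variance per component. A minimal base R sketch of that calculation is below. The function name gmr_conditional and the example parameters are made up; the argument layout (proportions, a 2 x K mean matrix, a 2 x 2 x K covariance array) is an assumption chosen to resemble how fitted mixture parameters are commonly stored:

```r
# Conditional distribution of A given B = b for a 2-D Gaussian mixture.
# pro:   mixing proportions (length K)
# mu:    2 x K matrix of component means, row 1 = A, row 2 = B
# sigma: 2 x 2 x K array of component covariance matrices
gmr_conditional <- function(b, pro, mu, sigma) {
  mu_a <- mu[1, ]
  mu_b <- mu[2, ]
  s_aa <- sigma[1, 1, ]
  s_ab <- sigma[1, 2, ]
  s_bb <- sigma[2, 2, ]

  # Component responsibilities given B = b
  w <- pro * dnorm(b, mean = mu_b, sd = sqrt(s_bb))
  w <- w / sum(w)

  # Per-component conditional mean and variance of A
  cond_mean <- mu_a + s_ab / s_bb * (b - mu_b)
  cond_var  <- s_aa - s_ab^2 / s_bb

  list(weights = w, mean = cond_mean, var = cond_var,
       expectation = sum(w * cond_mean))
}

# Made-up two-component model for illustration
pro   <- c(0.4, 0.6)
mu    <- matrix(c(0, 0,
                  3, 2), nrow = 2)
sigma <- array(c(1,  0.5,  0.5, 1,
                 2, -0.3, -0.3, 0.5), dim = c(2, 2, 2))

out <- gmr_conditional(b = 1, pro, mu, sigma)
out$expectation
```

The expectation element is the regression estimate E[A | B = b]; the full conditional density is the mixture defined by weights, mean and var.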

Thanks in advance,


Re: [R] lsqlin in R package pracma

2015-08-27 thread Berend Hasselman
> On 27 Aug 2015, at 23:12, Wang, Xue, Ph.D.  wrote:
> Hi Richard,
> It is good to know that solve.QP could solve quadratic programming problem. 
> The difficulty here is that the objective function might not be in quadratic 
> form. It is not in the form of t(X)QX, where Q is an n by n symmetric matrix.

Unless I’m very mistaken the objective function is in the form you mention.
The quadratic part is  t(x) %*% t(C) %*% C %*% x so your Q is simply equivalent 
to t(C) %*% C.
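
This can be checked numerically. Expanding the squared norm gives t(x) %*% Q %*% x - 2 * t(d) %*% C %*% x + sum(d^2) with Q = t(C) %*% C. A small sketch using the C and d from the example in this thread; the test point x is arbitrary and the helper name quad is made up:

```r
C <- matrix(c(
  0.9501, 0.7620, 0.6153, 0.4057,
  0.2311, 0.4564, 0.7919, 0.9354,
  0.6068, 0.0185, 0.9218, 0.9169,
  0.4859, 0.8214, 0.7382, 0.4102,
  0.8912, 0.4447, 0.1762, 0.8936), 5, 4, byrow = TRUE)
d <- c(0.0578, 0.3528, 0.8131, 0.0098, 0.1388)

# Least-squares objective as used in the thread
f <- function(x) sum((C %*% x - d)^2)

# Same objective written as a quadratic form
Q <- crossprod(C)   # t(C) %*% C, a symmetric 4 x 4 matrix
quad <- function(x) {
  drop(t(x) %*% Q %*% x - 2 * crossprod(d, C) %*% x) + sum(d^2)
}

x <- c(0.1, -0.2, 0.3, 0.4)   # arbitrary test point
all.equal(f(x), quad(x))
```

Dropping the constant sum(d^2) and halving the rest gives the objective solve.QP minimizes with Dmat = t(C) %*% C and dvec = t(C) %*% d, which is the setup used earlier in the thread.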


