[R] Event handling in R

2011-12-16 Thread beetonn
Dear R-helpers, 

I've just started playing with getGraphicsEvent() in R, and was wondering if 
there is a simple way to stop this function waiting for input after a 
pre-defined time, instead of relying either on a non-NULL value from one of the 
event handlers or for a user-interrupt for it to halt (as per the R manual).

The only way that I've thought of to make this work is using setTimeLimit to 
kill the function after a set time - here's a (messy) example:

setTimeLimit(cpu = 1)
plot(0, 0)
eventEnv <- getGraphicsEventEnv()
setGraphicsEventHandlers(prompt = 'Test')
while(TRUE) 
{
   err <- try(q <- getGraphicsEvent(), silent=TRUE)
   print(err)
}
# Not run
setTimeLimit(cpu = Inf)

but setTimeLimit doesn't seem to kill getGraphicsEvent() properly, as running 
this code just ends up with the repeated returned error of "Error in 
getGraphicsEvent(): recursive use of getGraphicsEvent not supported" instead of 
the intended "Error: reached CPU time limit" every second.

Is there a way to get around this problem, or a different method to make this 
work?  Or should I quit dabbling in things beyond my ken?


Thanks,
Nick

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multiple plots in one subplot

2011-12-16 Thread annek
Hi, 
I making a figure with six sub-plots using par(mfcol=c(2,3)). In the last
sub-plot I want to have two graphs instead of one. I have tried using
par(fig=x,y,z,v) but this par seems to overwrite the first par. Is there a
simple solution? 
Thanks!

Anna

--
View this message in context: 
http://r.789695.n4.nabble.com/Multiple-plots-in-one-subplot-tp4203525p4203525.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple plots in one subplot

2011-12-16 Thread Rolf Turner

On 16/12/11 19:36, annek wrote:

Hi,
I making a figure with six sub-plots using par(mfcol=c(2,3)). In the last
sub-plot I want to have two graphs instead of one. I have tried using
par(fig=x,y,z,v) but this par seems to overwrite the first par. Is there a
simple solution?
If I understand you correctly you simply need to use points() and/or 
lines().

See the help for these.

If these functions do not solve your problem you'll have to make your
question clearer.

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: Multiple plots in one subplot

2011-12-16 Thread Petr PIKAL
Hi
> 
> Hi, 
> I making a figure with six sub-plots using par(mfcol=c(2,3)). In the 
last
> sub-plot I want to have two graphs instead of one. I have tried using
> par(fig=x,y,z,v) but this par seems to overwrite the first par. Is there 
a
> simple solution? 

You can try ?layout or grid graphic. This gives you powerfull 
functionality but it probably is not so simple as par.

Regards
Petr


> Thanks!
> 
> Anna
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Multiple-
> plots-in-one-subplot-tp4203525p4203525.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Event handling in R

2011-12-16 Thread Henrik Bengtsson
...because you're catching the timeout error with try() so it has not
effect, and hence the while(TRUE) {} loop keeps running.

Without knowing anything about graphics event handlers per se, here's
an alternative to catch the timeout event:

library("R.utils");

plot(0, 0);
eventEnv <- getGraphicsEventEnv();
setGraphicsEventHandlers(prompt='Test');

q <- NULL;
evalWithTimeout({
  q <- getGraphicsEvent();
}, timeout=1);
print(q);

does it.  See print(evalWithTimeout) to see what you are doing differently.

My $.02

/Henrik


On Thu, Dec 15, 2011 at 9:49 PM, beetonn  wrote:
> Dear R-helpers,
>
> I've just started playing with getGraphicsEvent() in R, and was wondering if 
> there is a simple way to stop this function waiting for input after a 
> pre-defined time, instead of relying either on a non-NULL value from one of 
> the event handlers or for a user-interrupt for it to halt (as per the R 
> manual).
>
> The only way that I've thought of to make this work is using setTimeLimit to 
> kill the function after a set time - here's a (messy) example:
>
> setTimeLimit(cpu = 1)
> plot(0, 0)
> eventEnv <- getGraphicsEventEnv()
> setGraphicsEventHandlers(prompt = 'Test')
> while(TRUE)
> {
>   err <- try(q <- getGraphicsEvent(), silent=TRUE)
>   print(err)
> }
> # Not run
> setTimeLimit(cpu = Inf)
>
> but setTimeLimit doesn't seem to kill getGraphicsEvent() properly, as running 
> this code just ends up with the repeated returned error of "Error in 
> getGraphicsEvent(): recursive use of getGraphicsEvent not supported" instead 
> of the intended "Error: reached CPU time limit" every second.
>
> Is there a way to get around this problem, or a different method to make this 
> work?  Or should I quit dabbling in things beyond my ken?
>
>
> Thanks,
> Nick
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] modify the name of axis of an R function

2011-12-16 Thread Martin Maechler
> David Winsemius 
> on Thu, 15 Dec 2011 09:56:31 -0500 writes:

> On Dec 15, 2011, at 1:53 AM, plocq wrote:

>> Hi,
>> 
>> I use the function fpot of packages evd. If I call the
>> fit that I obtain "fit", I want to modify the name of the
>> axis and the main title that is produced by plot(fit1).

> Usually this would be accomplished with

> plot(fit, main="main title", xlab="X-axis lable",
> ylab="y-axis label")

>> To do this, I want to create a new function where I would
>> have the names modified.

> Seems unlikely that this would be needed.

>> The problem is that I can't find the function that
>> produce these plots... I tried to see in plot.uvevd but
>> it doesn't seems to be the good one. Can anybody help me?
>> Many thanks!

> The way to address this, if you are committed to this
> path, is to first determine the class of the fit-object
> and then to look for a plot method with methods(plot). If
> you see an S3 method you can call up the code with:

> evd:::plot.fit-class # when "fit-class" is the value you
> got with class(fit)

> If it's an S4 method, then it's much more convoluted, and
> over time I've learned not to try.

huh?   
David, I'm disappointed.

Not at all much more convoluted, just different.
I think you should make a 2nd attempt, starting with
things like

  showMethods(plot)

with versions

  showMethods(plot, class = "...")
  showMethods(plot, include = TRUE)
  showMethods(plot, class = "...", include = TRUE)

or then directly

  selectMethod(plot, "")



and yes, this is all not relevant to  evd  where the methods are
S3 only...

Martin Maechler, ETH Zurich

> -- 

> David Winsemius, MD West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] .nc files query

2011-12-16 Thread Prof Brian Ripley

On 16/12/2011 02:19, vaish wrote:

Hello Everyone, I have tried using open.ncdf(con,..) to open .nc files in R,
but somehow, its showing that R could not find function open.ncdf. I am new
to R, please help me out with this.


Yet somewhere you found a reference to open.ncdf -- a pretty arcane 
function not part of R itself.  Please check what else the reference 
said (and remember for the future that we cannot know what you have 
read, so please tell us).


It is likely you need to install and load the package ncdf.  But that is 
not the only way to read NetCDF files in R (I presume that is what you 
meant), and indeed cannot read the most recent NetCDF format.  There are 
also CRAN packages RNetCDF and ncdf4, and other ways have been tried 
(and can be found by googling).



Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/nc-files-query-tp4203048p4203048.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with tick graph

2011-12-16 Thread n.via...@libero.it
Dear all,
I'm having problems with the tick of my graph. I'mpcombining lines and 
barplot. 
For my I'm using the function axis combined with the function pretty to have 
more efficient tick, but all my tick (for example, 300 as my max tick and -100 
as my min tick) are not printed on my graph.
So I would like to have for the left axis the seq from 0 to 100 (with 0 and 
100 printed on my graph) and for the right axis the seq from -100 to 300 (with 
-100 and 300 printed on my graph)
Someone Knows how to get it???


The code and data are:
grafico<-{
  
pdf(file=file, paper="special",width=30, height=20)
par(bg="white",font=3,font.axis=3,las=1,cex.axis=2.2, mar=c(8,6,8,8)+8)
barplot(Imp$TassoV, width=10,space=c(0,0.1),legend.text = 
FALSE,beside=TRUE,
border="grey40", main="",col="midnightblue",cex.main=2.4,
axes = FALSE, axisnames =FALSE)
lab=as.character(pretty(Imp$TassoV))
axis(side=4,at=pretty(Imp$TassoV),labels=lab,cex.axis=2.4)
par(new=T)
chart.TimeSeries(Imp$ValueA, type="l", lwd=8, main="", ylab="", 
xlab="", date.
format="%y", 
col="red3",major.ticks="years",minor.ticks=TRUE, grid.color="grey50", 
grid.
lty="dotted", cex.axis=2.4,yaxis=FALSE)
axis(2,at=c(pretty(Imp$ValueA)))
legend("topleft",c("Livelli mln $ (sc. sx)","Tasso di var. 
(sc.dx)"),col=c
("red3", "midnightblue"),bty="n",lwd=4,cex=2.4)
}
dev.off()

  ValueA  ValueA_L  TassoV
1995-12-1688.06557   NA  NA
1996-12-1688.34237 88.06557   0.3143044
1997-12-1657.86447 88.34237 -34.4997529
1998-12-16   50.19389 57.86447 -13.2561039
1999-12-16   23.06846 50.19389 -54.0412943
2000-12-16  45.79965 23.06846  98.5379304
2001-12-16  22.35262 45.79965 -51.1947722
2002-12-16 66.89223 22.35262 199.2589371
2003-12-1689.24867 66.89223  33.4215852
2004-12-1677.16459 89.24867 -13.5397854
2005-12-16   51.23656 77.16459 -33.6009462
2006-12-16   49.51073 51.23656  -3.3683450
2007-12-16   90.39337 49.51073  82.5732837
2008-12-16   38.84831 90.39337 -57.0230554
2009-12-16   14.70859 38.84831 -62.1384086
2010-12-16   55.23819 14.70859 275.5505995


Thanks for your attention

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Odp: Multiple plots in one subplot

2011-12-16 Thread Walmes Zeviani
You can use a different way of split the plotting area that is by means of
layout() function.

x <- rnorm(100)
M <- matrix(c(rep(1:5, e=2), 6, 7), byrow=TRUE, nrow=2)
layout(M)
plot(x)
hist(x)
qqnorm(x)
boxplot(x)
plot(density(x))
plot(abs(x))
hist(abs(x))

Bests.
Walmes.

==
Walmes Marques Zeviani
LEG (Laboratório de Estatística e Geoinformação, 25.450418 S, 49.231759 W)
Departamento de Estatística - Universidade Federal do Paraná
fone: (+55) 41 3361 3573
VoIP: (3361 3600) 1053 1173
e-mail: wal...@ufpr.br
twitter: @walmeszeviani
homepage: http://www.leg.ufpr.br/~walmes
linux user number: 531218
==

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2011-12-16 Thread Kristi Shoemaker
...Lose weight without diets.  
http://www.compratusaldo.com/page.december.php?gID=73uc5

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multicollinearty in logistic regression models

2011-12-16 Thread David Winsemius


On Dec 15, 2011, at 7:41 PM,  wrote:


David Winsemius  writes:


On Dec 15, 2011, at 11:34 AM, Mohamed Lajnef wrote:


Dear All,

Is there a method to diagnostic multicollinearty in logistic
regression
models  like vif indicator in linear regression ( variance inflation
Factor ...) ?



Wouldn't matrix representation of the predictor "side" of the
regression be the same? Couldn't you just use the same methods you
employ for linear regression?


Harrell's rms package has a vif function that is intended for use with  
fits from his logistic regression model function, lrm. This uses the  
variance covariance matrix from the last iteration of the fitting  
process alluded to below and Bert Gunter's reply.




Trouble is that in logistic regression the Fisher Information for each
case has a factor of p[i]*(1-p[i]) (where 'p' is the vector of success
probabilites and 'i' indexes which case).

If the value of p[i] is very near one or zero, then the information
provided is scant. And this will happen if you have a really good
predictor in the mix.




Even with an orthogonal design, you can wind up with huge variances.  
And

you can have an ill-conditioned var-cov matrix for the coefficients
depending on how different cases get weighted. Thus, you could get the
equivalent of multicollinearity even with an orthogonal design.

And the diagnostics for linear regresson would not be all
that helpful if you have a good predictor.




OTOH, if the predictors were collectively pretty weak, the linear
regression diagnostics might be OK.

Mu advice: Google Scholar 'pregibon logistic regression', click  
where it

says 'cited by ...' and page through the results to find good leads on
this topic.


Yes, that was an interesting exercise. Brought back many fond  
memories. In my training we constructed  df-betas and df-deviance  
using GLIM macros. I think we may even have been given Pregibon's  
paper, since he was a local boy (UW). When one learns logistic  
regression on GLIM interacting with a VAX over a TTY line, R is a real  
treat.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kmeans and plot labels

2011-12-16 Thread Meesters, Christian
Hi,

Thanks Sarah. Unfortunately I did not get a step further.

My question, perhaps a bit clearer, is how to display the case control status 
(or any other arbitrary point label) after clustering in a plot:

With a bit of pseudo code, where dataset is a data.frame, parameters are those 
column names where we find numerical values (no NAs) and nclasses is the 
desired number of classes.

fit <- kmeans(dataset[, parameters], nclasses)
plot(dataset[, parameters], col = fit$cluster, dimen = nclasses)

This gives us a nicely coloured plot where all points are circled. Yet, my 
desire is to see the case / control-Status encoded in a column "Status" with 
single characters, in place of those circled points.

Any hints? I did not get pch() nor text() to work. Perhaps the solution lies 
there, but them I am sure that I do not see the wood for all those trees ...

TIA
Christian

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error constructing probabilities in Zelig

2011-12-16 Thread Abraham Mathew
I've run an ordered logistic regression model in R with Zelig and am
looking to calculate predicted probabilities. Zelig has a series of simple
one line commands to generate the information I want on first differences
and so forth. Unfortunately, I keep getting an error when running the zelig
function and was wondering if there was a quick alternative for generating
predicted probabilities for a ordered logit in R.

library(Zelig)
mod <- zelig(sold ~ age + gender + marital + educ2 + cars + license +
credit +
type + home + id, model="ologit", data=dat, Hess=TRUE)
summary(mod)

For what it's worth, here's the error from my Zelig code.
> x.out <- setx(mod, credit=1)
Error in dta[complete.cases(mf), names(dta) %in% vars, drop = FALSE] :
  incorrect number of dimensions

Can anyone tell me what I might be doing wrong.

If not, can you give me an alternative solution that I can use to generate
the probabilities.

-- 
*Abraham Mathew
Statistical Analyst
www.amathew.com
720-648-0108
@abmathewks*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Improve a browse through list items - Transform a loop to apply-able function.

2011-12-16 Thread Robin Cura
Thanks to all of you for those answers, it now works and it's way faster
than it used to be ;)
Especially, converting my list of matrix to a 3-dimensionnal array
simplifies a lot the statistics I have to run on my data :)

Thanks again,

Robin

2011/12/14 Patrizio Frederic 

> Hi robin,
> I'm not sure is what you need, but that's an esthetically nice
> solution (one single line without any loop :) )
>
>
> matrix(apply(log(cbind(as.numeric(a),as.numeric(b),as.numeric(c),as.numeric(d))),1,sd),3)
>
> hope it could help,
>
> PF
>
>
> On Mon, Dec 12, 2011 at 5:15 PM, Robin Cura  wrote:
> > Hello,
> >
> > I'm currently trying to convert a slow and ugly script I made, so that
> it's
> > faster and can be computed on a computer grid with the multicore package.
> > My problem is that I don't see how to turn some loops into an
> "apply-able"
> > function.
> >
> > Here's an example of my loops :
> > I got a list of dataframes (or matrices like here), and I need to browse
> > each cell of those many dataframes to compute a mean (or standard
> deviation
> > like here).
> >
> > Here's a example script :
> >
> > a <- b <- c <- d <- result <- matrix(nrow=3, ncol=3)
> > a[] <- sample.int(n=100,size=9,replace=TRUE)
> > b[] <- sample.int(n=100,size=9,replace=TRUE)
> > c[] <- sample.int(n=100,size=9,replace=TRUE)
> > d[] <- sample.int(n=100,size=9,replace=TRUE)
> > result[] <- NA
> > mylist <- list(a,b,c,d)
> >
> > for (row in 1:3)
> > {
> >  for (col in 1:3)
> >  {
> >tmpList <- log(mylist[[1]][row, col])
> >for (listitem in 2:4)
> >{
> >  tmpList <- c(tmpList, log(mylist[[listitem]][row, col]))
> >}
> >result[row, col] <- sd(tmpList)
> >  }
> > }
> >
> > Considering I have to look at the same cell in each dataframe, I don't
> > understand how I could turn this into a function, considering I need the
> > row and column number to iterate.
> >
> > I succeeded improving my script duration a lot, but such loops are really
> > long to run, considering that my lists contains like 100 dataframes, who
> > all contains thousands of values.
> >
> > Any help would be really appreciated
> >
> > Thanks in advance,
> >
> > Robin
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> +---
> | Patrizio Frederic,
> | http://www.economia.unimore.it/frederic_patrizio/
> +---
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculate AUC Using the Trapezoidal Method

2011-12-16 Thread arivald
Thx!

--
View this message in context: 
http://r.789695.n4.nabble.com/Calculate-AUC-Using-the-Trapezoidal-Method-tp4200049p4203947.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] main title in plot; outer=TRUE (cut off)

2011-12-16 Thread AlexC
Hello, 

I'm trying to position a plot title "1 a)" in the top left corner of a
graph; i've set outer=TRUE for it to be in the outer margin unfortunately
this is cut off.  Is there a way either to make it so that it is not cut off
or increase the number of margins and then place it in margin below the
outer?

Heres what I have so far
windows(width=7,height=7)
plot(data$x,data$y,ylab="ylab",xlab="xlab",cex=1.5,cex.lab=1.5,cex.axis=1.5,font.axis=3,axes=FALSE)
par(mar=c(5,4,4,2)+0.2)
title(outer=TRUE,adj=0,main = list("1 a)", cex=1.1,col="black", font=2))

thank you for your help in advance



--
View this message in context: 
http://r.789695.n4.nabble.com/main-title-in-plot-outer-TRUE-cut-off-tp4204006p4204006.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multicollinearty in logistic regression models

2011-12-16 Thread Liviu Andronic
On Fri, Dec 16, 2011 at 1:47 PM, David Winsemius  wrote:
> Harrell's rms package has a vif function that is intended for use with fits
> from his logistic regression model function, lrm. This uses the variance
>
Also see vif() in 'car'.

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] main title in plot; outer=TRUE (cut off)

2011-12-16 Thread jim holtman
try moving the 'par' function call before the plot; you need to have
the margins set before you do the 'plot'ting.  Also not reproducible
since 'data' not provided.

On Fri, Dec 16, 2011 at 5:25 AM, AlexC  wrote:
> Hello,
>
> I'm trying to position a plot title "1 a)" in the top left corner of a
> graph; i've set outer=TRUE for it to be in the outer margin unfortunately
> this is cut off.  Is there a way either to make it so that it is not cut off
> or increase the number of margins and then place it in margin below the
> outer?
>
> Heres what I have so far
> windows(width=7,height=7)
> plot(data$x,data$y,ylab="ylab",xlab="xlab",cex=1.5,cex.lab=1.5,cex.axis=1.5,font.axis=3,axes=FALSE)
> par(mar=c(5,4,4,2)+0.2)
> title(outer=TRUE,adj=0,main = list("1 a)", cex=1.1,col="black", font=2))
>
> thank you for your help in advance
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/main-title-in-plot-outer-TRUE-cut-off-tp4204006p4204006.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] main title in plot; outer=TRUE (cut off)

2011-12-16 Thread jim holtman
Also try adding 'line = -1' and adjusting for the proper position.

title(outer=TRUE,adj=0,main = list("1 a)", cex=1.1,col="black",
font=2), line = -1)

On Fri, Dec 16, 2011 at 8:44 AM, jim holtman  wrote:
> try moving the 'par' function call before the plot; you need to have
> the margins set before you do the 'plot'ting.  Also not reproducible
> since 'data' not provided.
>
> On Fri, Dec 16, 2011 at 5:25 AM, AlexC  wrote:
>> Hello,
>>
>> I'm trying to position a plot title "1 a)" in the top left corner of a
>> graph; i've set outer=TRUE for it to be in the outer margin unfortunately
>> this is cut off.  Is there a way either to make it so that it is not cut off
>> or increase the number of margins and then place it in margin below the
>> outer?
>>
>> Heres what I have so far
>> windows(width=7,height=7)
>> plot(data$x,data$y,ylab="ylab",xlab="xlab",cex=1.5,cex.lab=1.5,cex.axis=1.5,font.axis=3,axes=FALSE)
>> par(mar=c(5,4,4,2)+0.2)
>> title(outer=TRUE,adj=0,main = list("1 a)", cex=1.1,col="black", font=2))
>>
>> thank you for your help in advance
>>
>>
>>
>> --
>> View this message in context: 
>> http://r.789695.n4.nabble.com/main-title-in-plot-outer-TRUE-cut-off-tp4204006p4204006.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Model design

2011-12-16 Thread alfreda morinez
Dear List,

I am realtively inexperienced so i apologise in advance and ask for
understanding in the simplicity of my question:

I have data on the amount of grass per km in a cell ( of which i have
lots) "grass" and for each cell i have x/y coordinates - required due
to spatial autocorrelation

Cells can be classfied in a hierarchical nature into  AREAS and STATES

i.e Cell 1, Cell 2, Cell 3 are all in AREA "A"

where as Cell 4,5 and 6 are in AREA "B"

However both area A + B are in state "S1"

I have lots of these (13000) cells which are classfied into ~2000
AREA's and ~750 STATE'S

So my question is do AREA'S differ in the amount of grass they contain
i.e does AREA A contain significantly more grass than AREA B?

I have modelled this by

area_grass <- gls(grass~AREA, correlation=corExp(form=~x+y), data = grassland

I have set the contrasts to options(contrasts = c("contr.treatment",
"contr.poly")) as there are no control groups.

What i will get ( it is taking ages!)

is

AREA A: -0.12 **
AREA B:  0.17*
AREA C..


So can i then say AREA A has significantly less grass than the
average, AREA B significantly more and AREA C is not significantly
different?

Thanks

Alfreda

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mgcv 1.7-12 crashes R

2011-12-16 Thread Reinhard Seifert
Dear community,

I encountered a very disturbing phenomenon today:

When I try to fit any gam() with mgcv R aborts. I could not find any post
regarding this in google, which mades in even more strange.

I am using the latest Ubuntu, latest R and latest mgcv everything up to
date. The crash occured too with the last mgcv 1.7-11.

This is the output from the R shell:



R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

> library(mgcv)
This is mgcv 1.7-12. For overview type 'help("mgcv-package")'.
> dat <- gamSim(1 , n=400 , dist="normal",scale = 2)
Gu & Wahba 4 term additive model
> gam( y ~ s(x0) , data = dat)
Aborted



I am using Ubuntu 11.10 - amd64, R version 2.14.0 and mgcv 1.7-12.

any ideas?

Reinhard

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data Manipulation - make diagonal matrix of each element of a matrix

2011-12-16 Thread Clemontina Alexander
Thank you, that is much simpler!



On Thu, Dec 15, 2011 at 2:04 PM, Rui Barradas  wrote:
> Hello,
>
> I believe I can help, or at least, my code is simpler.
> First, look at your first line:
>
> idd <- length(diag(1,tt))   # length of intercept matrix
> #
> not needed: diag(tt) would do the job but it's not needed,
>    why call 2 functions, and one of them, 'diag', uses memory(*), if the
>    result is tt squared? It's much simpler!
>    (*)like you say, "larger and larger" amounts of it
>
> My solution to your problem is as follows (as a function, and yours).
>
> fun2 <- function(n, tt, numco){
>    M.Unit <- matrix(rep(diag(1,tt),n), ncol=tt, byrow=TRUE)
>    M <- NULL
>    for(i in 1:numco) M <- cbind(M, M.Unit*rep(x[,i], each=tt))
>    M
> }
>
> fun1 <- function(n, tt, numco){
>    idd <- length(diag(1,tt))    # length of intercept matrix
>    X <- matrix(numeric(n*numco*idd),ncol=tt*numco)
>    for(i in 1:numco){
>          X[,((i-1)*tt+1):(i*tt)] <- matrix(
>            c(matrix(rep(diag(1,tt),n),ncol=tt, byrow=TRUE))*
>                rep(rep(x[,i],each=tt),tt)
>           , ncol=tt)
>    }
>    X
> }
>
> I' ve tested the two with larger values of 'n', 'tt' and 'numco'
> using the following timing instructions
>
>
> n  <- 1000
> tt <- 50
> numco <- 15
> set.seed(1)
> x <- matrix(round(rnorm(n*numco),2), ncol=numco)   # the actual covariates
>
> Runs <- 10^1
>
> t1 <- system.time(for(i in 1:Runs) a1 <- fun1(n, tt, numco))[c(1,3)]
> t2 <- system.time(for(i in 1:Runs) a2 <- fun2(n, tt, numco))[c(1,3)]
>
> rbind(t1, t2, t1/t2)
>
>      user.self     elapsed
> t1 23.21   31.06
> t2 14.97   22.54
>     1.550434    1.377995
>
> As you can see, it's not a great speed improvement.
> I hope it's at least usefull.
>
> Rui Barradas
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Data-Manipulation-make-diagonal-matrix-of-each-element-of-a-matrix-tp4200321p4201305.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mgcv 1.7-12 crashes R

2011-12-16 Thread Kevin E. Thorpe

On 12/16/2011 08:52 AM, Reinhard Seifert wrote:

Dear community,

I encountered a very disturbing phenomenon today:

When I try to fit any gam() with mgcv R aborts. I could not find any post
regarding this in google, which mades in even more strange.

I am using the latest Ubuntu, latest R and latest mgcv everything up to
date. The crash occured too with the last mgcv 1.7-11.

This is the output from the R shell:



R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)


library(mgcv)

This is mgcv 1.7-12. For overview type 'help("mgcv-package")'.

dat<- gamSim(1 , n=400 , dist="normal",scale = 2)

Gu&  Wahba 4 term additive model

gam( y ~ s(x0) , data = dat)

Aborted



I am using Ubuntu 11.10 - amd64, R version 2.14.0 and mgcv 1.7-12.

any ideas?

Reinhard


It works for me in Slackware 13.37 (64bit)

> library(mgcv)
This is mgcv 1.7-11. For overview type 'help("mgcv-package")'.
> dat <- gamSim(1 , n=400 , dist="normal",scale = 2)
Gu & Wahba 4 term additive model
> gam( y ~ s(x0) , data = dat)

Family: gaussian
Link function: identity

Formula:
y ~ s(x0)

Estimated degrees of freedom:
2.4812  total = 3.481196

GCV score: 13.66653

> sessionInfo()
R version 2.14.0 Patched (2011-11-29 r57769)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US   LC_NUMERIC=C LC_TIME=en_US
 [4] LC_COLLATE=C LC_MONETARY=en_USLC_MESSAGES=en_US
 [7] LC_PAPER=C   LC_NAME=CLC_ADDRESS=C
[10] LC_TELEPHONE=C   LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] mgcv_1.7-11

loaded via a namespace (and not attached):
[1] Matrix_1.0-2   grid_2.14.0lattice_0.20-0 nlme_3.1-102




--
Kevin E. Thorpe
Biostatistician/Trialist,  Applied Health Research Centre (AHRC)
Li Ka Shing Knowledge Institute of St. Michael's
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Model design

2011-12-16 Thread ONKELINX, Thierry
Dear Alfreda,

anova(area_grass) will tell you IF the average grass area is different among 
areas.

If you want to know WHICH areas are different from each other, then you have to 
do some multiple comparisons. You can use the multcomp package: e.g. 

library(multcomp)
glht(area_grass, linfct = mcp(AREA = "Tukey"))

Best regards,

Thierry

Van: r-help-boun...@r-project.org [r-help-boun...@r-project.org] namens alfreda 
morinez [alfredamori...@gmail.com]
Verzonden: vrijdag 16 december 2011 14:07
Aan: r-help@r-project.org
Onderwerp: [R] Model design

Dear List,

I am realtively inexperienced so i apologise in advance and ask for
understanding in the simplicity of my question:

I have data on the amount of grass per km in a cell ( of which i have
lots) "grass" and for each cell i have x/y coordinates - required due
to spatial autocorrelation

Cells can be classfied in a hierarchical nature into  AREAS and STATES

i.e Cell 1, Cell 2, Cell 3 are all in AREA "A"

where as Cell 4,5 and 6 are in AREA "B"

However both area A + B are in state "S1"

I have lots of these (13000) cells which are classfied into ~2000
AREA's and ~750 STATE'S

So my question is do AREA'S differ in the amount of grass they contain
i.e does AREA A contain significantly more grass than AREA B?

I have modelled this by

area_grass <- gls(grass~AREA, correlation=corExp(form=~x+y), data = grassland

I have set the contrasts to options(contrasts = c("contr.treatment",
"contr.poly")) as there are no control groups.

What i will get ( it is taking ages!)

is

AREA A: -0.12 **
AREA B:  0.17*
AREA C..


So can i then say AREA A has significantly less grass than the
average, AREA B significantly more and AREA C is not significantly
different?

Thanks

Alfreda

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Model design

2011-12-16 Thread alfreda morinez
Hi Thierry

I looked at running an ANOVA but I have spatial autocorrelation in the
data set as indicated by Variograms and significant moran's I i.e the
cells closer together are more likely to be similar than expected
under a normal distibution - is it possible to make this approach take
it into consideration ( through the x/y coordinates of the grass).

Is the approach i am using stasitically incorrect or is it doing
something similar to a ANOVA but weighting it through the correlation
matrix?

Thanks



On Fri, Dec 16, 2011 at 2:17 PM, ONKELINX, Thierry
 wrote:
> Dear Alfreda,
>
> anova(area_grass) will tell you IF the average grass area is different among 
> areas.
>
> If you want to know WHICH areas are different from each other, then you have 
> to do some multiple comparisons. You can use the multcomp package: e.g.
>
> library(multcomp)
> glht(area_grass, linfct = mcp(AREA = "Tukey"))
>
> Best regards,
>
> Thierry
> 
> Van: r-help-boun...@r-project.org [r-help-boun...@r-project.org] namens 
> alfreda morinez [alfredamori...@gmail.com]
> Verzonden: vrijdag 16 december 2011 14:07
> Aan: r-help@r-project.org
> Onderwerp: [R] Model design
>
> Dear List,
>
> I am realtively inexperienced so i apologise in advance and ask for
> understanding in the simplicity of my question:
>
> I have data on the amount of grass per km in a cell ( of which i have
> lots) "grass" and for each cell i have x/y coordinates - required due
> to spatial autocorrelation
>
> Cells can be classfied in a hierarchical nature into  AREAS and STATES
>
> i.e Cell 1, Cell 2, Cell 3 are all in AREA "A"
>
> where as Cell 4,5 and 6 are in AREA "B"
>
> However both area A + B are in state "S1"
>
> I have lots of these (13000) cells which are classfied into ~2000
> AREA's and ~750 STATE'S
>
> So my question is do AREA'S differ in the amount of grass they contain
> i.e does AREA A contain significantly more grass than AREA B?
>
> I have modelled this by
>
> area_grass <- gls(grass~AREA, correlation=corExp(form=~x+y), data = grassland
>
> I have set the contrasts to options(contrasts = c("contr.treatment",
> "contr.poly")) as there are no control groups.
>
> What i will get ( it is taking ages!)
>
> is
>
> AREA A:     -0.12 **
> AREA B:      0.17*
> AREA C..
>
>
> So can i then say AREA A has significantly less grass than the
> average, AREA B significantly more and AREA C is not significantly
> different?
>
> Thanks
>
> Alfreda
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kmeans and plot labels

2011-12-16 Thread Sarah Goslee
Hi,

You include no context, and I only vaguely remember your original
question. You also don't include a reproducible example.

But perhaps this helps?

testmat <- data.frame(x=c(1,1,2,3), y=c(2,1,3,2), names=letters[1:4],
stringsAsFactors=FALSE)
with(testmat, plot(x, y, pch=names))

or

with(testmat, plot(x, y, type="n"))
with(testmat, text(x, y, names))

Otherwise you'll have to actually follow the posting guide and
pose a fully self-contained question.

Sarah

On Fri, Dec 16, 2011 at 4:25 AM, Meesters, Christian  wrote:
> Hi,
>
> Thanks Sarah. Unfortunately I did not get a step further.
>
> My question, perhaps a bit clearer, is how to display the case control status 
> (or any other arbitrary point label) after clustering in a plot:
>
> With a bit of pseudo code, where dataset is a data.frame, parameters are 
> those column names where we find numerical values (no NAs) and nclasses is 
> the desired number of classes.
>
> fit <- kmeans(dataset[, parameters], nclasses)
> plot(dataset[, parameters], col = fit$cluster, dimen = nclasses)
>
> This gives us a nicely coloured plot where all points are circled. Yet, my 
> desire is to see the case / control-Status encoded in a column "Status" with 
> single characters, in place of those circled points.
>
> Any hints? I did not get pch() nor text() to work. Perhaps the solution lies 
> there, but them I am sure that I do not see the wood for all those trees ...
>
> TIA
> Christian
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] optim with simulated annealing SANN for combinatorial optimization

2011-12-16 Thread John C Nash
A particularly unfortunate aspect of "SANN" in optim() is that it will evaluate 
the
objective function 'maxit' times and quit with conv=0, implying it has 
"converged".

The Rd file points out a number of departures of SANN from the other 
optimizers, but a key
one is that it does NOT return a result that can be considered as an optimum 
without
external testing and evaluation. When (if??) I get to a point where I'm 
satisfied with the
local minimizers in the optimx package, I hope to prepare a similar wrapper for 
the
stochastic optimizers like SANN. As with many tools in this domain, for 
effective use they
require more knowledge than many of their users possess, and can be dangerous 
because they
seem to "work".

JN


> Message: 72
> Date: Fri, 16 Dec 2011 18:41:12 +1100
> From: Dae-Jin Lee 
> To: r-help@r-project.org
> Subject: [R] optim with simulated annealing SANN for combinatorial
>   optimization
> Message-ID:
>   
> Content-Type: text/plain
> 
> Hi all
> 
> I am trying to solve a combinatorial optimization problem. Basically, I can
> reduce my problem into the next problem:
> 
> 1.- Given a NxN grid of points, with some values in each cell
> 2.- Find the combination of K points on the grid such that, the maximum
> mean value is obtained
> 
> 
>  I took the Travel SalesMan problem example in ?optim documentation. I am
> not sure if I have understood correctly the SANN implementation in optim,
> as I do not see how the acceptance probability comes out. And it looks like
> I am only evaluating the criteria several times and keep the maximum at the
> end.
> 
> Thanks in advance
> 
> 
> Here is some example code in R
> 
> ### toy example
> N=5
> K=6
> 
> new.points = expand.grid(1:N,1:N)  # grid points
> 
> set.seed(1234)
> 
> resp=rnorm(N2)  # random values on each grid cell
> 
> ###  function to generate the sequence of candidates
> generate<-function(ind){
> 
> idx <- seq(1, nrow(new.points), by=1)   # index of 1 to N2 grid cells
> swap.points <- c(sample(ind,1),sample(idx[-c(ind)], size=1))  # swap
> between points of the initial set and other candidate
> ind[which(ind==swap.points[1])]<-idx[which(idx==swap.points[2])]
> 
> cat("set  =",ind,"\n")
> cat("crit =",media(ind),"\n")
> cat("swap =",swap.points,"\n")
> 
> plot(new.points[,1:2],col='black',xlim=c(range(new.points[,1])),ylim=c(range(new.points[,2])))
> points(new.points[ind,1:2],col='blue',pch=19,xlim=c(range(new.points[,1])),ylim=c(range(new.points[,2])))
> return(as.numeric(ind))
> }
> 
> ###  function to calculate the mean value of the points on the grid
> media=function(sq){
> med=mean(resp[sq])
> return(as.numeric(med))
> }
> 
> ###  initial set of K candidates
> init=sample(1:K,K)
> 
> ###  run SANN
> res <- optim(init, media ,generate, method="SANN",control =
> list(maxit=2, temp=100,tmax=1000, trace=TRUE, REPORT=1, fnscale=-1))
> new.points[sort(res$par),]
> plot(new.points,cex=.1)
> points(new.points[res$par,],col=3,lwd=3,cex=1.5)
> 
>   [[alternative HTML version deleted]]
> 
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kmeans and plot labels

2011-12-16 Thread Meesters, Christian
Hi,

You are right - I promise improvement of my way to ask.

Anyhow, thanks to your comparisons I got much better insight in the way 
plotting environments can be altered - and have my solution. Thank you.

Christian

From: Sarah Goslee [sarah.gos...@gmail.com]
Sent: Friday, December 16, 2011 3:29 PM
To: Meesters, Christian
Cc: r-help@r-project.org
Subject: Re: [R] kmeans and plot labels

Hi,

You include no context, and I only vaguely remember your original
question. You also don't include a reproducible example.

But perhaps this helps?

testmat <- data.frame(x=c(1,1,2,3), y=c(2,1,3,2), names=letters[1:4],
stringsAsFactors=FALSE)
with(testmat, plot(x, y, pch=names))

or

with(testmat, plot(x, y, type="n"))
with(testmat, text(x, y, names))

Otherwise you'll have to actually follow the posting guide and
pose a fully self-contained question.

Sarah

On Fri, Dec 16, 2011 at 4:25 AM, Meesters, Christian  wrote:
> Hi,
>
> Thanks Sarah. Unfortunately I did not get a step further.
>
> My question, perhaps a bit clearer, is how to display the case control status 
> (or any other arbitrary point label) after clustering in a plot:
>
> With a bit of pseudo code, where dataset is a data.frame, parameters are 
> those column names where we find numerical values (no NAs) and nclasses is 
> the desired number of classes.
>
> fit <- kmeans(dataset[, parameters], nclasses)
> plot(dataset[, parameters], col = fit$cluster, dimen = nclasses)
>
> This gives us a nicely coloured plot where all points are circled. Yet, my 
> desire is to see the case / control-Status encoded in a column "Status" with 
> single characters, in place of those circled points.
>
> Any hints? I did not get pch() nor text() to work. Perhaps the solution lies 
> there, but them I am sure that I do not see the wood for all those trees ...
>
> TIA
> Christian
>

--
Sarah Goslee
http://www.functionaldiversity.org


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] odfWeave

2011-12-16 Thread Frank Lawrence
I am new to using odfWeave but I have encountered a problem running both
the example in the help file as well as another file.  I am not sure how to
correct the error.



First example:

library(odfWeave)

filein <- 'c:\\my documents\\example01_in.odt'

fileout <- 'c:\\my documents\\example01.odt'

odfWeave(filein, fileout, control = odfWeaveControl(cleanup = T))



error:

The system cannot find the path specified.

The system cannot find the drive specified.

Error: content.xml  does not seem to be XML, nor to identify a file name



log

> odfWeave(filein, fileout, control = odfWeaveControl(cleanup = T))

  Copying  c:\my documents\example01_in.odt

  Setting wd to  c:\temp\RtmpEhxqNx/odfWeave16110543541

  Unzipping ODF file using unzip -o "example01_in.odt"



c:\temp\RtmpEhxqNx\odfWeave16110543541>n:\reflect\tel-mgr /u



c:\temp\RtmpEhxqNx\odfWeave16110543541>h:



c:\temp\RtmpEhxqNx\odfWeave16110543541>cd \



  Removing  example01_in.odt

  Creating a Pictures directory



> Sys.info()

 sysname  release

   "Windows" "XP"

 version nodename

"build 2600, Service Pack 3" "OFPC5ZL4K1"

 machinelogin

   "x86"   "flawren1"

user   effective_user

  "flawren1"   "flawren1"



> sessionInfo()

R version 2.14.0 (2011-10-31)

Platform: i386-pc-mingw32/i386 (32-bit)



locale:

[1] LC_COLLATE=English_United States.1252

[2] LC_CTYPE=English_United States.1252

[3] LC_MONETARY=English_United States.1252

[4] LC_NUMERIC=C

[5] LC_TIME=English_United States.1252



attached base packages:

[1] grDevices datasets  splines   graphics  stats tcltk utils

[8] methods   base



other attached packages:

[1] odfWeave_0.7.17  XML_3.4-2.2  lattice_0.20-0   svSocket_0.9-51

[5] TinnR_1.0.3  R2HTML_2.2   Hmisc_3.8-3  survival_2.36-10



loaded via a namespace (and not attached):

[1] cluster_1.14.1 grid_2.14.0svMisc_0.9-63  tools_2.14.0

Second Example:
demoFile <- system.file("examples", "simple.odt", package = "odfWeave")
outputFile <- gsub("simple.odt", "output.odt", demoFile)
odfWeave(demoFile, outputFile)

Same error; same session info

Respectfully,

Frank Lawrence

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls start values

2011-12-16 Thread Nick Fankhauser
Hi!

thanks a lot for this suggestion! I tried to implement it like this, and
it worked nicely.
I used the method suggested by Gabor Grothendieck for simplification:

frml <- gene_expression ~ sin(tpoints * afreq + phase) * amp + shift
gridfit <- nls2(frml, algorithm = "grid-search", data=gendat, start =
startdf)
sine_nls <- nls2(gridfit, data=gendat, start = gridfit, algorithm="port")

But I also tried the method of running the nls with start-values for
phase between 0 and 2pi in five steps and then choosing the one which
has the lowest Standardized Residual Sum of Squares. This worked even
better in some cases and I think it's also conceptually simpler.

Yours,
Nick

On 13/12/11 16:53, Hans W Borchers wrote:
> Niklaus Fankhauser  cell.biol.ethz.ch> writes:
>
>   
>> I'm using nls to fit periodic gene-expression data to sine waves. I need
>> to set the upper and lower boundaries, because I do not want any
>> negative phase and amplitude solutions. This means that I have to use
>> the "port" algorithm. The problem is, that depending on what start value
>> I choose for phase, the fit works for some cases, but not for others.
>> In the example below, the fit works using phase=pi,  but not using
>> phase=0. But there are many examples which fit just fine using 0.
>>
>> Is there a comparable alternative to nls that is not so extremely
>> influenced by the start values?
>>
>> 
> Use package `nls2' to first search on a grid, and then apply `nls' again
> to identify the globally best point:
>
> # Data for example fit
> afreq <- 1 / (24 / 2 / pi)
> tpoints <- c(0,0,0,2,2,2,4,4,4,6,6,6,8,8,8,12,12,12,
>  14,14,14,16,16,16,18,18,18,20,20,20,24,24,24)
> gene_expression <-
> c(1.551383, 1.671742, 1.549499, 1.694480, 1.632436, 1.471568, 1.623381,
>   1.579361, 1.809394, 1.753223, 1.685918, 1.754968, 1.963069, 1.820690,
>   1.985159, 2.205064, 2.160308, 2.120189, 2.194758, 2.165993, 2.189981,
>   2.098671, 2.122207, 2.012621, 1.963610, 1.884184, 1.955160, 1.801175,
>   1.829686, 1.773260, 1.588768, 1.563774, 1.559192)
> shift=mean(gene_expression) # y-axis (expression) shift
>
> # Grid search
> library("nls2")
> frml <- gene_expression ~ sin(tpoints * afreq + phase) * amp + shift
> startdf <- data.frame(phase=c(0, 2*pi), amp = c(0, 2))
> nls2(frml, algorithm = "grid-search", start = startdf,
>control = list(maxiter=200))
>
> # Perfect fit
> startvals <- list(phase = 4.4880, amp = 0.2857)
> sine_nls <- nls(frml, start=startvals)
> #  phaseamp 
> # 4.3964 0.2931 
> # residual sum-of-squares: 0.1378
>
> Maybe this can be done in one step.
> Hans Werner
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Model design

2011-12-16 Thread ONKELINX, Thierry
Dear Alfreda,

I'm not suggesting to use ANOVA but to look at the anova table of your gls 
model. You can get that with anova(your.model). 

Thierry

Van: alfreda morinez [alfredamori...@gmail.com]
Verzonden: vrijdag 16 december 2011 15:29
Aan: ONKELINX, Thierry
CC: r-help@r-project.org
Onderwerp: Re: [R] Model design

Hi Thierry

I looked at running an ANOVA but I have spatial autocorrelation in the
data set as indicated by Variograms and significant moran's I i.e the
cells closer together are more likely to be similar than expected
under a normal distibution - is it possible to make this approach take
it into consideration ( through the x/y coordinates of the grass).

Is the approach i am using stasitically incorrect or is it doing
something similar to a ANOVA but weighting it through the correlation
matrix?

Thanks



On Fri, Dec 16, 2011 at 2:17 PM, ONKELINX, Thierry
 wrote:
> Dear Alfreda,
>
> anova(area_grass) will tell you IF the average grass area is different among 
> areas.
>
> If you want to know WHICH areas are different from each other, then you have 
> to do some multiple comparisons. You can use the multcomp package: e.g.
>
> library(multcomp)
> glht(area_grass, linfct = mcp(AREA = "Tukey"))
>
> Best regards,
>
> Thierry
> 
> Van: r-help-boun...@r-project.org [r-help-boun...@r-project.org] namens 
> alfreda morinez [alfredamori...@gmail.com]
> Verzonden: vrijdag 16 december 2011 14:07
> Aan: r-help@r-project.org
> Onderwerp: [R] Model design
>
> Dear List,
>
> I am realtively inexperienced so i apologise in advance and ask for
> understanding in the simplicity of my question:
>
> I have data on the amount of grass per km in a cell ( of which i have
> lots) "grass" and for each cell i have x/y coordinates - required due
> to spatial autocorrelation
>
> Cells can be classfied in a hierarchical nature into  AREAS and STATES
>
> i.e Cell 1, Cell 2, Cell 3 are all in AREA "A"
>
> where as Cell 4,5 and 6 are in AREA "B"
>
> However both area A + B are in state "S1"
>
> I have lots of these (13000) cells which are classfied into ~2000
> AREA's and ~750 STATE'S
>
> So my question is do AREA'S differ in the amount of grass they contain
> i.e does AREA A contain significantly more grass than AREA B?
>
> I have modelled this by
>
> area_grass <- gls(grass~AREA, correlation=corExp(form=~x+y), data = grassland
>
> I have set the contrasts to options(contrasts = c("contr.treatment",
> "contr.poly")) as there are no control groups.
>
> What i will get ( it is taking ages!)
>
> is
>
> AREA A: -0.12 **
> AREA B:  0.17*
> AREA C..
>
>
> So can i then say AREA A has significantly less grass than the
> average, AREA B significantly more and AREA C is not significantly
> different?
>
> Thanks
>
> Alfreda
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] removing contractions for recode in car

2011-12-16 Thread Nicole Marie Ford
Hello, all.

I am encountering a new problem I have not seen before.

I recoded my variable trust in the following manner (thank you John and David):

trust <- recode(Poland$SN35B, " 'Strongly Agree' = 1; 'Agree' = 2; 'Neither 
Agree nor Disagree' = 3; 'Disagree' = 4; 'Strongly Disagree' = 5; \"Can't 
choose\" = 6; else=NA")

when I examine the variable, I find the following:


> levels(Poland$trust)
[1] "1" "2" "3" "4" "6"

level 5 is missing.

i check again here:


> levels(Poland$SN35B)
[1] "NAP:NOT ASKED/BALOT""Strongly Agree" "Agree"   
  
[4] "Neither Agree nor Disagree" "Disagree"   "Cant choose" 
  
[7] "Can't choose"   "NO ANSWER"


yes, 5 should be "Disagree".


So I check to be sure people did answer "Disagree".  I find they did:

> summary(Poland$SN35B)
   NAP:NOT ASKED/BALOT Strongly Agree  
Agree 
 12443 83
545 
Neither Agree nor Disagree   DisagreeCant 
choose 
   306183 
29 
  Can't choose  NO ANSWER 
62 13 

I was wondering if anyone has any ideas as to why my 5th choice might be 
missing after I recode?

Thanks.

~Nicole



- Original Message -
From: "Nicole Marie Ford" 
To: "John Fox" 
Cc: r-help@r-project.org
Sent: Thursday, December 15, 2011 3:39:01 PM
Subject: Re: [R] removing contractions for recode in car

Dear John and David,

Thanks so much for the advice.  I have only come across this one other time in 
my research, so the code escaped me!  Much obliged!

~Nicole

Ph.D. Student 
University of Wisconsin Milwaukee 
c: 813.786.5715 
e: nmf...@uwm.edu 

- Original Message -
From: "John Fox" 
To: "Nicole Marie Ford" 
Cc: r-help@r-project.org
Sent: Thursday, December 15, 2011 3:16:58 PM
Subject: Re: [R] removing contractions for recode in car

Dear Nicole,

On Thu, 15 Dec 2011 14:49:04 -0600 (CST)
 Nicole Marie Ford  wrote:
> hello,
> 
> i need to recode a variable, however the contraction is causing problems.  i 
> had the code to change this written down somewhere and i just can't find it, 
> of course.
> 
> i am using the car library to recode.
> 
> it's only 5 levels when it should have 6... when i do levels(trust).
> 
> here is my recode: 
> 
> 
> trust <- recode(Poland$SN35B, " 'Strongly Agree' = 1; 'Agree' = 2; 'Neither 
> Agree nor Disagree' = 3; 'Disagree' = 4; 'Strongly Disagree' = 5; 'Can't 
> choose' = 6; else=NA")

Although it's awkward, you should be able to do what you want by "escaping" the 
quotation marks surrounding the level name; the following (untested) should 
work:

trust <- recode(Poland$SN35B, " 'Strongly Agree' = 1; 'Agree' = 2; 'Neither 
Agree nor Disagree' = 3; 'Disagree' = 4; 'Strongly Disagree' = 5; \"Can't 
choose\" = 6; else=NA")

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

> 
> 
> thanks.
> 
> ~nicole
> - Original Message -
> From: "David Winsemius" 
> To: "Rui Barradas" 
> Cc: r-help@r-project.org
> Sent: Thursday, December 15, 2011 2:33:32 PM
> Subject: Re: [R] printing all htest class members
> 
> 
> On Dec 15, 2011, at 2:16 PM, Rui Barradas wrote:
> 
> > You're right, David,
> >
> > The first line is wrong, it should be
> >
> > ...  df=2:4  ...
> >
> > As for creating something, try
> >
> >> ht <- structure( ... etc ...
> >> ht
> >> class(ht)
> >
> > See what is printed and what function prints it.
> 
> Well, the function is stats:::print.htest.
> 
> (Do not expect any further replies to emails sent without context.)
> 
> #---#
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> #---#
> >
> 
> David Winsemius, MD
> West Hartford, CT
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.h

Re: [R] removing contractions for recode in car

2011-12-16 Thread Nicole Marie Ford
I see the problem.  So Sorry.
 

- Original Message -
From: "Nicole Marie Ford" 
To: "John Fox" 
Cc: r-help@r-project.org
Sent: Friday, December 16, 2011 11:54:36 AM
Subject: Re: [R] removing contractions for recode in car

Hello, all.

I am encountering a new problem I have not seen before.

I recoded my variable trust in the following manner (thank you John and David):

trust <- recode(Poland$SN35B, " 'Strongly Agree' = 1; 'Agree' = 2; 'Neither 
Agree nor Disagree' = 3; 'Disagree' = 4; 'Strongly Disagree' = 5; \"Can't 
choose\" = 6; else=NA")

when I examine the variable, I find the following:


> levels(Poland$trust)
[1] "1" "2" "3" "4" "6"

level 5 is missing.

i check again here:


> levels(Poland$SN35B)
[1] "NAP:NOT ASKED/BALOT""Strongly Agree" "Agree"   
  
[4] "Neither Agree nor Disagree" "Disagree"   "Cant choose" 
  
[7] "Can't choose"   "NO ANSWER"


yes, 5 should be "Disagree".


So I check to be sure people did answer "Disagree".  I find they did:

> summary(Poland$SN35B)
   NAP:NOT ASKED/BALOT Strongly Agree  
Agree 
 12443 83
545 
Neither Agree nor Disagree   DisagreeCant 
choose 
   306183 
29 
  Can't choose  NO ANSWER 
62 13 

I was wondering if anyone has any ideas as to why my 5th choice might be 
missing after I recode?

Thanks.

~Nicole



- Original Message -
From: "Nicole Marie Ford" 
To: "John Fox" 
Cc: r-help@r-project.org
Sent: Thursday, December 15, 2011 3:39:01 PM
Subject: Re: [R] removing contractions for recode in car

Dear John and David,

Thanks so much for the advice.  I have only come across this one other time in 
my research, so the code escaped me!  Much obliged!

~Nicole

Ph.D. Student 
University of Wisconsin Milwaukee 
c: 813.786.5715 
e: nmf...@uwm.edu 

- Original Message -
From: "John Fox" 
To: "Nicole Marie Ford" 
Cc: r-help@r-project.org
Sent: Thursday, December 15, 2011 3:16:58 PM
Subject: Re: [R] removing contractions for recode in car

Dear Nicole,

On Thu, 15 Dec 2011 14:49:04 -0600 (CST)
 Nicole Marie Ford  wrote:
> hello,
> 
> i need to recode a variable, however the contraction is causing problems.  i 
> had the code to change this written down somewhere and i just can't find it, 
> of course.
> 
> i am using the car library to recode.
> 
> it's only 5 levels when it should have 6... when i do levels(trust).
> 
> here is my recode: 
> 
> 
> trust <- recode(Poland$SN35B, " 'Strongly Agree' = 1; 'Agree' = 2; 'Neither 
> Agree nor Disagree' = 3; 'Disagree' = 4; 'Strongly Disagree' = 5; 'Can't 
> choose' = 6; else=NA")

Although it's awkward, you should be able to do what you want by "escaping" the 
quotation marks surrounding the level name; the following (untested) should 
work:

trust <- recode(Poland$SN35B, " 'Strongly Agree' = 1; 'Agree' = 2; 'Neither 
Agree nor Disagree' = 3; 'Disagree' = 4; 'Strongly Disagree' = 5; \"Can't 
choose\" = 6; else=NA")

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

> 
> 
> thanks.
> 
> ~nicole
> - Original Message -
> From: "David Winsemius" 
> To: "Rui Barradas" 
> Cc: r-help@r-project.org
> Sent: Thursday, December 15, 2011 2:33:32 PM
> Subject: Re: [R] printing all htest class members
> 
> 
> On Dec 15, 2011, at 2:16 PM, Rui Barradas wrote:
> 
> > You're right, David,
> >
> > The first line is wrong, it should be
> >
> > ...  df=2:4  ...
> >
> > As for creating something, try
> >
> >> ht <- structure( ... etc ...
> >> ht
> >> class(ht)
> >
> > See what is printed and what function prints it.
> 
> Well, the function is stats:::print.htest.
> 
> (Do not expect any further replies to emails sent without context.)
> 
> #---#
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> #---#
> >
> 
> David Winsemius, MD
> West Hartford, CT
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, sel

Re: [R] Multiple plots in one subplot

2011-12-16 Thread Greg Snow
Look at the layout function, it may do what you want.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of annek
> Sent: Thursday, December 15, 2011 11:36 PM
> To: r-help@r-project.org
> Subject: [R] Multiple plots in one subplot
> 
> Hi,
> I making a figure with six sub-plots using par(mfcol=c(2,3)). In the
> last
> sub-plot I want to have two graphs instead of one. I have tried using
> par(fig=x,y,z,v) but this par seems to overwrite the first par. Is
> there a
> simple solution?
> Thanks!
> 
> Anna
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Multiple-
> plots-in-one-subplot-tp4203525p4203525.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Zellig Error Message

2011-12-16 Thread Abraham Mathew
I'm trying to calculate predicted probabilities in R with Zelig and keep
getting the following error.

Can anyone help?


> x.low <- setx(mod, type=1)Error in dta[complete.cases(mf), names(dta) %in% 
> vars, drop = FALSE] :
  incorrect number of dimensions


When I ran the model, I ran everything but the explanatory variable as a

numeric variable. Now, I'm trying everything and no luck.


-- 
*Abraham Mathew
Statistical Analyst
www.amathew.com
720-648-0108
@abmathewks*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread Lost in R
I've also attached here a sample of my data in Excel. I'm thinking it must be
a problem with a character, but can't figure it out. Is there a list
somewhere of characters to avoid in R?

Thanks,
Mike

http://r.789695.n4.nabble.com/file/n4205479/Sample_Data_Set.csv
Sample_Data_Set.csv 

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4205479.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] package.skeleton()

2011-12-16 Thread Ben Ganzfried
Hi--

I'm creating an R package, I've read through "Writing R Extensions" and the
package.skeleton() R page-- and I'm still running into a little confusion.
I would greatly appreciate any advice you can provide.

Where do I run my following line of code from?:
> package.skeleton(name = "a", code_files = "EsetObject.r"

I'm currently running it from Rgui, but when I type the line above nothing
happens.

Thank you very much.

Ben

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package.skeleton()

2011-12-16 Thread Ben Ganzfried
*the line in question is*:
> package.skeleton(name = "a", code_files = "EsetObject.r")

On Fri, Dec 16, 2011 at 4:12 PM, Ben Ganzfried <
benganzfr...@post.harvard.edu> wrote:

> Hi--
>
> I'm creating an R package, I've read through "Writing R Extensions" and
> the package.skeleton() R page-- and I'm still running into a little
> confusion.  I would greatly appreciate any advice you can provide.
>
> Where do I run my following line of code from?:
> > package.skeleton(name = "a", code_files = "EsetObject.r"
>
> I'm currently running it from Rgui, but when I type the line above nothing
> happens.
>
> Thank you very much.
>
> Ben
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] crash in using Rcpp and inline packages.

2011-12-16 Thread Guochun Shen
Hi all,

I am using c++ functions in R by Rcpp and inline packages. 
The code is quite simple, but the R session always automatically crash after 
some running time.
Does anyone here familiar with Rcpp and inline? What¡¯s the problem in the 
following code?
I have checked the input values, no NA and other strange value exists.
Thank you for your attention!


> mkc <- cxxfunction( signature(fx0="vector",fy0="vector",fsp0="vector",
x0="vector",y0="vector",sp0="vector",phyd0="matrix",
rmax0="numeric",step0="numeric",binlength0="integer"),
paste(readLines("mkc.cpp"),collapse = "\n"),
plugin = "Rcpp")

The codes in mkc.cpp file are:

NumericVector fx(fx0);
NumericVector fy(fy0);
NumericVector fsp(fsp0);
NumericVector x(x0);
NumericVector y(y0);
NumericVector sp(sp0);
NumericMatrix phyd(phyd0);
double rmax = as(rmax0);
double step = as(step0);
double  totalcout = 0.0;
double  totalsum = 0.0;

int binlength = as (binlength0);

NumericVector bincout(binlength);
NumericVector binsum(binlength);

int nfocal=fx.size();
int ntotal=x.size();
double dist =0.0;
double ibin = 0.0;
double lpd = 0.0;

typedef NumericVector::iterator vec_iterator;
vec_iterator ifx = fx.begin(), ify = fy.begin();
vec_iterator ix= x.begin(), iy = y.begin();
vec_iterator ifsp = fsp.begin(), isp = sp.begin();


for (int i = 0; i < nfocal; i++){
for (int j= 0; j < ntotal; j++){
dist=pow(pow(ix[j]-ifx[i],2)+pow(iy[j]-ify[i],2),0.5);

if(dist == 0.0 || ifsp[i]==isp[j] || dist > rmax ){
continue;
}

ibin=dist/step;
bincout[ceil(ibin)]++;
bincout[floor(ibin)]++;
totalcout++;

lpd=phyd(ifsp[i],isp[j]);
binsum[ceil(ibin)]+=lpd;
binsum[floor(ibin)]+=lpd;
totalsum+=lpd;
}
}

double k = totalsum/totalcout;
GenericVector result(4);
result[0]=totalcout;
result[1]=totalsum;
result[2]=bincout;
result[3]=binsum;

delete ifx;
delete ify;
delete ix;
delete iy;
delete ifsp;
delete isp;

return result;


Best wishes!
Guochun


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: crash in using Rcpp and inline packages.

2011-12-16 Thread Guochun Shen
Hi all,

I am using c++ functions in R by Rcpp and inline packages. 
The code is quite simple, but the R session always automatically crash after 
some running time.
Does anyone here familiar with Rcpp and inline? What¡¯s the problem in the 
following code?
I have checked the input values, no NA and other strange value exists.
Thank you for your attention!


> mkc <- cxxfunction( signature(fx0="vector",fy0="vector",fsp0="vector",
x0="vector",y0="vector",sp0="vector",phyd0="matrix",
rmax0="numeric",step0="numeric",binlength0="integer"),
paste(readLines("mkc.cpp"),collapse = "\n"),
plugin = "Rcpp")

The codes in mkc.cpp file are:

NumericVector fx(fx0);
NumericVector fy(fy0);
NumericVector fsp(fsp0);
NumericVector x(x0);
NumericVector y(y0);
NumericVector sp(sp0);
NumericMatrix phyd(phyd0);
double rmax = as(rmax0);
double step = as(step0);
double  totalcout = 0.0;
double  totalsum = 0.0;

int binlength = as (binlength0);

NumericVector bincout(binlength);
NumericVector binsum(binlength);

int nfocal=fx.size();
int ntotal=x.size();
double dist =0.0;
double ibin = 0.0;
double lpd = 0.0;

typedef NumericVector::iterator vec_iterator;
vec_iterator ifx = fx.begin(), ify = fy.begin();
vec_iterator ix= x.begin(), iy = y.begin();
vec_iterator ifsp = fsp.begin(), isp = sp.begin();


for (int i = 0; i < nfocal; i++){
for (int j= 0; j < ntotal; j++){
dist=pow(pow(ix[j]-ifx[i],2)+pow(iy[j]-ify[i],2),0.5);

if(dist == 0.0 || ifsp[i]==isp[j] || dist > rmax ){
continue;
}

ibin=dist/step;
bincout[ceil(ibin)]++;
bincout[floor(ibin)]++;
totalcout++;

lpd=phyd(ifsp[i],isp[j]);
binsum[ceil(ibin)]+=lpd;
binsum[floor(ibin)]+=lpd;
totalsum+=lpd;
}
}

double k = totalsum/totalcout;
GenericVector result(4);
result[0]=totalcout;
result[1]=totalsum;
result[2]=bincout;
result[3]=binsum;

delete ifx;
delete ify;
delete ix;
delete iy;
delete ifsp;
delete isp;

return result;


Best wishes!
Guochun


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread jim holtman
What exactly is your problem with this file?  The file that you sent
had 10 lines of what appeared to be data and 4489 lines with just
commas which would read in as NAs.  When you do an 'str' you get:

> str(x)
'data.frame':   4498 obs. of  195 variables:
 $ Good_Bad   : Factor w/ 3 levels "","BAD","GOOD": 3
3 3 3 2 2 2 3 3 1 ...
 $ Good1Bad0  : int  1 1 1 1 0 0 0 1 1 NA ...
 $ PercUltColl: num  1 1 1 0.98 0.09 0.01 0.19 1 1 NA ...
 $ GoodMerchant.  : int  1 1 1 1 0 0 0 1 1 NA ...
 $ Fundid

so there are 4498 lines of data in the file, but you probably only
what the first 10.  Is this what your problem is?

On Fri, Dec 16, 2011 at 12:20 PM, Lost in R
 wrote:
> I've also attached here a sample of my data in Excel. I'm thinking it must be
> a problem with a character, but can't figure it out. Is there a list
> somewhere of characters to avoid in R?
>
> Thanks,
> Mike
>
> http://r.789695.n4.nabble.com/file/n4205479/Sample_Data_Set.csv
> Sample_Data_Set.csv
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4205479.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread David Winsemius


On Dec 16, 2011, at 12:20 PM, Lost in R wrote:


I've also attached here a sample of my data in Excel. I'm thinking it


It? What is "it"?


must be
a problem with a character, but can't figure it out. Is there a list
somewhere of characters to avoid in R?

Thanks,
Mike

http://r.789695.n4.nabble.com/file/n4205479/Sample_Data_Set.csv
Sample_Data_Set.csv

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4205479.html
Sent from the R help mailing list archive at Nabble.com.


We are not looking at this with Nabble. This is a mailing list. You  
are asked to attach context. That is something that can be done easily  
done in Nabble, so your failure to do so is seen by most viewers of  
this list as one of:


cause <-  c("privileged attitude", "clueless about mailing lists",  
"persistent failure to read Posting Guide")




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fortune? -- was Re: optim with simulated annealing SANN ...

2011-12-16 Thread Bert Gunter
Folks:

I thought John Nash's comment below was profound and a possible
Fortunes candidate:
(Aside: I believe it applies to a great deal of what is discussed on
this list, not just stochastic optimization.)

Cheers,
Bert

... (in the context of stochastic optimization)
>...  As with many tools in this domain, for effective use they
> require more knowledge than many of their users possess, and can be dangerous 
> because they
> seem to "work".
>
> JN
>
-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] simulation

2011-12-16 Thread Scott Raynaud
I'm using an R program (which I did not write) to simulate multilevel data 
(subjects in locations) used in power calculations. It uses lmer to fit a 
mixed logistic model to the simulated data based on inputs of means, 
variances, slopes and proportions:
 
(fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=1))

where modelformula is set up in another part of the program.  Locations are
treated as random and the model is random intercept only.  The program is 
set to run 1000 simulations.
 
I have temperature, five levels of gestational age (GA), birth wieght (BW) 
and four 
other categorical pedictors, all binary.  I scaled everything so that all my 
slopes are in the 
range of -5.2 to 1.6 and variances from .01 to .08.  I have a couple of 
categories 
of GA that have small probabilities (<.10).  I'm using a structured sampling 
approach
looking at 20, 60, 100, and 140 locations with a total n=75k.  The first looks 
like this:
 
# groups   n
5 800
4 2239
3 3678
3 5117
3 6557
2 7996
Total 20   75000
 
As the level 2 sizes increase, the cell sizes decrease.  When I run this model 
in 
the simulation I get:
 
Warning: glm.fit: algorithm did not converge
 
every time the model is fit (I killed this long before it ran 1000 times).
 
I tried increasing the number of iterations to no avail.  I suspected linear 
dependencies among the predictors, so I took out GA (same result), put
GA back and took out BW (same result) and then took out both GA and
BW.  This ran about half the time with th other half passing warnings 
such as:
 
Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
 
or
 
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
 
in addition to some like the original warning.
 
If I leave everything in but temperature, then it runs fine.  I also tested the 
full 
model separately at 50 and 75 level 2 units each with total n=75k.  Nothing 
converged.
 
I want to include temperature, but I'm not sure what else to try.  Any 
suggestions?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fortune? -- was Re: optim with simulated annealing SANN ...

2011-12-16 Thread Achim Zeileis

On Fri, 16 Dec 2011, Bert Gunter wrote:


Folks:

I thought John Nash's comment below was profound and a possible
Fortunes candidate:
(Aside: I believe it applies to a great deal of what is discussed on
this list, not just stochastic optimization.)


Thanks for the pointer, very true indeed - and now also in the devel 
version of "fortunes" on R-Forge


Best,
Z


Cheers,
Bert

... (in the context of stochastic optimization)

...  As with many tools in this domain, for effective use they
require more knowledge than many of their users possess, and can be dangerous 
because they
seem to "work".

JN


--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulation

2011-12-16 Thread Bert Gunter
Suggestions? -- Yes.

1) Wrong list.. Post on R-sig-mixed-models, not here.

2) Follow the posting guide and provide the modelformula, which may
well be the source of the difficulties (overfitting).

-- Bert

On Fri, Dec 16, 2011 at 1:56 PM, Scott Raynaud  wrote:
> I'm using an R program (which I did not write) to simulate multilevel data
> (subjects in locations) used in power calculations. It uses lmer to fit a
> mixed logistic model to the simulated data based on inputs of means,
> variances, slopes and proportions:
>
> (fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=1))
>
> where modelformula is set up in another part of the program.  Locations are
> treated as random and the model is random intercept only.  The program is
> set to run 1000 simulations.
>
> I have temperature, five levels of gestational age (GA), birth wieght (BW) 
> and four
> other categorical pedictors, all binary.  I scaled everything so that all my 
> slopes are in the
> range of -5.2 to 1.6 and variances from .01 to .08.  I have a couple of 
> categories
> of GA that have small probabilities (<.10).  I'm using a structured sampling 
> approach
> looking at 20, 60, 100, and 140 locations with a total n=75k.  The first 
> looks like this:
>
> # groups   n
> 5 800
> 4 2239
> 3 3678
> 3 5117
> 3 6557
> 2 7996
> Total 20   75000
>
> As the level 2 sizes increase, the cell sizes decrease.  When I run this 
> model in
> the simulation I get:
>
> Warning: glm.fit: algorithm did not converge
>
> every time the model is fit (I killed this long before it ran 1000 times).
>
> I tried increasing the number of iterations to no avail.  I suspected linear
> dependencies among the predictors, so I took out GA (same result), put
> GA back and took out BW (same result) and then took out both GA and
> BW.  This ran about half the time with th other half passing warnings
> such as:
>
> Warning: glm.fit: algorithm did not converge
> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> or
>
> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> in addition to some like the original warning.
>
> If I leave everything in but temperature, then it runs fine.  I also tested 
> the full
> model separately at 50 and 75 level 2 units each with total n=75k.  Nothing 
> converged.
>
> I want to include temperature, but I'm not sure what else to try.  Any
> suggestions?
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread Lost in R
The data set I attached was just those 10 lines. It was only meant to show
any possible obvious mistake I may have made. The real set has the 4498 line
of data.

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4206630.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread David Winsemius


On Dec 15, 2011, at 2:39 PM, Lost in R wrote:

After checking the original data in Excel for blanks and running  
Summary(cm3)
to identify any null values in my data, I'm unable to identify an  
instances.
Yet when I attempted to use the data in Random Forest, I get the  
following
error. Is there something that Random Forest is reading as null  
which is not

actually null? Is there a better way to check for this?


library(randomForest)
system.time(

+ rf1 <- randomForest(as.matrix(


# Are you aware of the effect of using as.matrix(..) on the storage  
mode?



cm3[,c(2:length(colnames(cm3)))]),


# that was the x argument


+ cm3[,1],


# The y variable


data=cm3,


# That's odd. You already offered the data objects.  I wonder what the  
function will do with that?




ntree=50)
+ )
*Error in randomForest.default(as.matrix(cm3[,  
c(2:length(colnames(cm3)))]),

:
 NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In storage.mode(x) <- "double" : NAs introduced by coercion


I can see two potential sources of such an error.


Timing stopped at: 1.33 0.01 1.35 *


Thanks in advance,
Mike

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4201546.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package.skeleton()

2011-12-16 Thread Duncan Murdoch

On 11-12-16 4:12 PM, Ben Ganzfried wrote:

Hi--

I'm creating an R package, I've read through "Writing R Extensions" and the
package.skeleton() R page-- and I'm still running into a little confusion.
I would greatly appreciate any advice you can provide.

Where do I run my following line of code from?:

package.skeleton(name = "a", code_files = "EsetObject.r"


I'm currently running it from Rgui, but when I type the line above nothing
happens.



What is supposed to happen is that a new source package directory will 
be created in the current directory (what getwd() gives you).  Did that 
happen?  (I thought a message would also be printed and apparently 
you're not seeing that, but maybe it just worked without telling you.)


Duncan Murdoch


Thank you very much.

Ben

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] VaR package, where is it?

2011-12-16 Thread jfca283
Hi
I need to perform some analysis about risk management portfolio. There is a
reference to a VaR package, however, i can't find it. It was deprecated? Or
it's inside now of another package? Thanks. 

--
View this message in context: 
http://r.789695.n4.nabble.com/VaR-package-where-is-it-tp4206869p4206869.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] VaR package, where is it?

2011-12-16 Thread Achim Zeileis

On Fri, 16 Dec 2011, jfca283 wrote:


Hi
I need to perform some analysis about risk management portfolio. There is a
reference to a VaR package, however, i can't find it. It was deprecated? Or
it's inside now of another package? Thanks.


Using the package=... syntax of the official CRAN package URLs either 
points you to the active package or to the archive (in case the package 
has been removed from the active repository). In case of the VaR package


http://CRAN.R-project.org/package=VaR

the latter is the case. I'm not sure though whether its functionality has 
been incorporated into some other package. Depending on what you are 
looking for, checking out the Finance task view


http://CRAN.R-project.org/view=Finance

may be helpful.


--
View this message in context: 
http://r.789695.n4.nabble.com/VaR-package-where-is-it-tp4206869p4206869.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread William Dunlap
Try randomForest with a small dataset to see how it works:
  > d <- data.frame(stringsAsFactors=FALSE,
  + Num=(1:10)%%9,
  + Fac=factor(rep(LETTERS[1:2],each=5)),
  + Char=rep(letters[24:26],len=10))
  > randomForest(x=d[,"Char",drop=FALSE], y=d$Num)
  Error in randomForest.default(x = d[, "Char", drop = FALSE], y = d$Num) : 
NA/NaN/Inf in foreign function call (arg 1)
  In addition: Warning message:
  In data.matrix(x) : NAs introduced by coercion
  > randomForest(x=d[,"Fac",drop=FALSE], y=d$Num)

  Call:
   randomForest(x = d[, "Fac", drop = FALSE], y = d$Num) 
 Type of random forest: regression
   Number of trees: 500
  No. of variables tried at each split: 1

Mean of squared residuals: 9.573558
  % Var explained: -40.58

It appears to die if any predictors are character vectors:
it will not convert them to factors (as most modelling functions
do).

as.matrix(data.frame) creates a character matrix if not all columns
are numeric or logical, so I suspect you are running into the
no-character-data limitation.  Try leaving off the as.matrix and
pass in the data.frame that it expects:
   randomForest(x=cm3[,-1,drop=FALSE], y=cm3[,1])
(The is no need or use for the data= argument if you use the x=,y=
interface.  It is only there for the formula interface.)

If you dislike the no-character-data limitation discuss it with
the person at the address given by maintainer("randomForest").

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Lost in R
> Sent: Friday, December 16, 2011 2:55 PM
> To: r-help@r-project.org
> Subject: Re: [R] Random Forest Reading N/A's, I don't see them
> 
> The data set I attached was just those 10 lines. It was only meant to show
> any possible obvious mistake I may have made. The real set has the 4498 line
> of data.
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-
> them-tp4201546p4206630.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] p-value for hazard ratio in Cox proportional hazards regression?

2011-12-16 Thread Joshua Wiley
After some discussion off list, I came to the conclusion that most of
my statements below are erroneous.  This is just an errata for the
public archive to try to correct the misinformation.

Josh

On Sat, Dec 10, 2011 at 9:45 PM, Joshua Wiley  wrote:
> Hi Thierry,
>
> I see what you want now---a significance test for the HR specifically.
>  See inline below
>
> On Sat, Dec 10, 2011 at 6:40 PM, Thierry Julian Panje
>  wrote:
>> Hi Josh,
>>
>> Thank you for your quick response!
>>
>> In several papers the results of a Cox Regression were presented in a table 
>> showing the variable name, the hazard ratio for two values of this variable, 
>> the 95% confidence interval of the hazard ratio and a p-value.
>> (e.g., Lett 2007. Social support and prognosis in patients at increased 
>> psychosocial risk recovering from myocardial infarction. Health psychology 
>> "http://psycnet.apa.org/journals/hea/26/4/418.pdf"; page 6 Table 2)
>>
>> When I use summary() for a coxph() function, for example:
>>
>>> summary(mod.allison)
>> Call:
>> coxph(formula = Surv(week, arrest) ~ fin + age + race + wexp +
>>    mar + paro + prio, data = Rossi)
>>
>>  n= 432, number of events= 114
>>
>>                   coef exp(coef) se(coef)      z Pr(>|z|)
>> finyes         -0.37942   0.68426  0.19138 -1.983  0.04742 *
>> age            -0.05744   0.94418  0.02200 -2.611  0.00903 **
>> raceother      -0.31390   0.73059  0.30799 -1.019  0.30812
>> wexpyes        -0.14980   0.86088  0.21222 -0.706  0.48029
>> marnot married  0.43370   1.54296  0.38187  1.136  0.25606
>> paroyes        -0.08487   0.91863  0.19576 -0.434  0.66461
>> prio            0.09150   1.09581  0.02865  3.194  0.00140 **
>> ---
>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>>               exp(coef) exp(-coef) lower .95 upper .95
>> finyes            0.6843     1.4614    0.4702    0.9957
>> age               0.9442     1.0591    0.9043    0.9858
>> raceother         0.7306     1.3688    0.3995    1.3361
>> wexpyes           0.8609     1.1616    0.5679    1.3049
>> marnot married    1.5430     0.6481    0.7300    3.2614
>> paroyes           0.9186     1.0886    0.6259    1.3482
>> prio              1.0958     0.9126    1.0360    1.1591
>>
>> Concordance= 0.64  (se = 0.027 )
>> Rsquare= 0.074   (max possible= 0.956 )
>> Likelihood ratio test= 33.27  on 7 df,   p=2.362e-05
>> Wald test            = 32.11  on 7 df,   p=3.871e-05
>> Score (logrank) test = 33.53  on 7 df,   p=2.11e-05
>>
>>
>> I thought, the value in the "Pr(>|z|)" column indicated the significance of 
>> the coefficient and not of a hazard ratio.
>
> That is correct.
>
>>
>> Is it possible to assess the statistical significance of a hazard ratio not 
>> only with a confidence interval but also with a p-value and to compute that 
>> in R?
>
> I am hoping someone else will chime in here.  I might naively try this
> (but do validate before publishing).  For the standard error of the
> hazard ratio, I use exp(coef) * se, and then assume that (exp(coef) -
> 1)/(exp(coef) * se) is ~ x2(1).  A quick look through Terry Therneau's
> book, Modeling Survival Data did not turn up any particular ways to
> get this automatically in R.  Note that the confidence interval (and
> this is consistent with summary()) is based on exponentiating the
> linear predictor's lower and upper limits, not the SE of the
> exponetiated predictor.
>
> require(survival)
> set.seed(1)
> d <- data.frame(
>  start = start <- sample(1:10, 600, TRUE),
>  stop = start + sample(1:7, 600, TRUE),
>  event = event <- rbinom(600, 1, .7),
>  x1 = event * rbinom(600, 1, .7),
>  x2 = rnorm(600, 0, 1))
>
> m <- coxph(Surv(start, stop, event) ~ x1 + x2, d)
> beta <- coef(m)
> se <- sqrt(diag(vcov(m)))
> HR <- exp(beta)
> HRse <- HR * se
>
> summary(m)
> round(cbind(coef = beta, se = se, z = beta/se, p = 1 - pchisq((beta/se)^2, 1),
>  HR = HR, HRse = HRse,
>    HRz = (HR - 1) / HRse, HRp = 1 - pchisq(((HR - 1)/HRse)^2, 1),
>    HRCILL = exp(beta - qnorm(.975, 0, 1) * se),
>    HRCIUL = exp(beta + qnorm(.975, 0, 1) * se)), 3)
>
>
>
>> Or is it usual to present a hazard ratio together with the p-value of the 
>> coefficient?
>
> "usual" probably depends on the area.  I have seen the coefficients
> presented with p-values, and the 95%CI of the hazard ratio.  I do not
> know that there is any one required set of statistics to be reported.
>
>> Or do the coefficient and any defined hazard ratio of a variable have in 
>> fact the same p-value?
>
> No, it is a nonlinear transformation and the p-values are not the same.
>
>>
>> I would like to apologize if these were trivial questions and I fear they 
>> are not totally R-specific. Thus I understand if you don't have the time to 
>> answer them. However, I would greatly appreciate any help.
>
> Aspects are not, but aspects are.  Certainly, asking if there is a way
> to get a test of the hazard ratio in R (without hacking something
> together like I did) seems reasonable to me at least to ask.
>