date:20110606

Re: [R] logistic growth model

2011-06-06 Thread Rubén Roa


Write the growth formula in an R script.
Define initial par values.
Input the size and age data.
Plot the size and age data as points.
Plot the growth model with the initial par values as a line.
Play with the initial par values until you see a good agreement between the 
model (the line) and the data (the points).
Optimise.
Re-plot.
Plot a residual histogram.
Plot a residual scatterplot.
Plot a Q-Q residual plot.
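
The steps above can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data are simulated (the original Height and Age data were not posted), and the names logistic_growth, par0 and fit are made up for the example. A likely reason for the "singular gradient" in the quoted question is that with b1*Age as large as 3.48*Age the logistic term is saturated at 1 for nearly every age, so the fit carries almost no information about the coefficients.

```r
# Simulated size-at-age data (an assumption; the original data were not posted)
set.seed(1)
age    <- 1:20
height <- 3 * plogis(-5 + 0.5 * age) + rnorm(length(age), sd = 0.1)

# 1. The growth formula: k * exp(b0 + b1*age) / (1 + exp(b0 + b1*age))
#    plogis(z) is exp(z) / (1 + exp(z))
logistic_growth <- function(age, k, b0, b1) k * plogis(b0 + b1 * age)

# 2. Initial parameter values, adjusted by eye until the line tracks the points
par0 <- list(k = 2.8, b0 = -4, b1 = 0.4)

# 3. Data as points, model at the initial values as a dashed line
plot(age, height)
lines(age, do.call(logistic_growth, c(list(age = age), par0)), lty = 2)

# 4. Optimise
fit <- nls(height ~ logistic_growth(age, k, b0, b1), start = par0)

# 5. Re-plot with the fitted curve
lines(age, fitted(fit))

# 6. Residual histogram, residual scatterplot, Q-Q plot
res <- residuals(fit)
hist(res)
plot(fitted(fit), res); abline(h = 0, lty = 3)
qqnorm(res); qqline(res)
```

If the dashed line in step 3 is far from the points, change par0 and re-run steps 2 and 3 before calling nls().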

HTH

Rubén

 

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN



> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On behalf of Renalda
> Sent: Saturday, 4 June 2011 6:17
> To: r-help@r-project.org
> Subject: [R] logistic growth model
> 
> I want to fit a logistic growth model,
> y = k * exp(b0 + b1*age) / (1 + exp(b0 + b1*age)). Can someone help with how
> to get the initial coefficients b0 and b1? I need to estimate them in
> order to do the regression analysis. When I run it using b0=0.5
> and b1=3.4818, I get the following error:
> 
> 397443.8 :  0.5 3.4818
> Error in nls(Height ~ k * exp(b1 + b2 * Age)/(1 + exp(b1 + b2 
> * Age)),  :
>singular gradient
> "please tell me what is wrong with my initials values, and 
> how to get the initial values"
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can R do zero inflated gamma regression?

2011-06-06 Thread Ben Bolker

siriustar  live.cn> writes:

> 
> Hi, Dear R-help
> I know there are some R packages to deal with zero-inflated count data. But I
> am now looking for an R package to deal with zero-inflated continuous data.
> 
> The response variable (Y) in my dataset contains a large number of zeros, and
> the non-zero responses are quite right-skewed. What I am doing now is first
> to use a logistic regression on the covariates (X) to estimate the probability
> of Y being 0, then to focus on the data where Y is not zero and run a
> linear regression or gamma GLM to estimate the association between Y and X
> when Y is not zero.
> However, the linear regression and gamma GLM fit my data poorly.
> 
> So I am thinking that a zero-inflated gamma or zero-inflated lognormal
> regression might be helpful, where I can estimate the probability of Y being
> zero and the association between non-zero Y and X at the same time. 
> However, I don't know which R package can do that. 

  I think your 'conditional' strategy is quite useful in general, and
may in general give you the same answers as the zero-inflated approach
you're suggesting.  Perhaps there are some other issues with the
conditional (gamma GLM) parts of your analysis?  Have you tried simple
log-linear regression (i.e. assuming that the non-zero values are
lognormally distributed)?
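
A minimal sketch of the 'conditional' (two-part) strategy discussed above, on simulated data. The data-generating values and the names dat, fit_zero, fit_gamma and fit_lnorm are assumptions made for the illustration. Because a gamma or lognormal distribution puts no mass at zero, the zero part and the positive part can be fitted separately.

```r
set.seed(42)
n <- 500
x <- rnorm(n)

# Simulated response (an assumption): non-zero with probability plogis(0.5 + x),
# and gamma distributed with mean exp(1 + 0.5 * x) when non-zero
is_pos <- rbinom(n, 1, plogis(0.5 + x)) == 1
y <- numeric(n)
y[is_pos] <- rgamma(sum(is_pos), shape = 2, rate = 2 / exp(1 + 0.5 * x[is_pos]))
dat <- data.frame(y = y, x = x)

# Part 1: probability that Y is non-zero
fit_zero <- glm(I(y > 0) ~ x, family = binomial, data = dat)

# Part 2a: gamma GLM with log link, positive values only
fit_gamma <- glm(y ~ x, family = Gamma(link = "log"), data = dat, subset = y > 0)

# Part 2b: the lognormal alternative, i.e. linear regression on log(y)
fit_lnorm <- lm(log(y) ~ x, data = dat, subset = y > 0)

# Unconditional mean: P(Y > 0 | x) * E[Y | Y > 0, x]
newd   <- data.frame(x = c(-1, 0, 1))
p_pos  <- predict(fit_zero,  newd, type = "response")
mu_pos <- predict(fit_gamma, newd, type = "response")
p_pos * mu_pos
```

Comparing residual plots of fit_gamma and fit_lnorm is one way to see which positive-part model suits the data better.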

  I would recommend reading this thread in the r-sig-ecology mailing list:

http://thread.gmane.org/gmane.comp.lang.r.ecology/2124


Re: [R] Question about curve function

2011-06-06 Thread Prof Brian Ripley

As a further example of the trickiness, the "function" method of 
plot() relies on curve(x, ...) being a request to plot the function 
x(x) against x.  I've added a comment to that effect to the help page.
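
The points made in the quoted exchange below can be checked directly. This is a small sketch, not the code of curve() itself; identity_fn is a made-up name, and pdf(NULL) is used only so that the sketch does not open a screen device.

```r
x <- seq(0, 1, length.out = 101)

# expression(x) is a call to expression(); evaluating that call gives a
# length-one expression vector, not the 101 numbers in x
y <- eval(quote(expression(x)), envir = list(x = x))
is.expression(y)   # TRUE
length(y)          # 1

# The bare symbol x evaluates to the numbers themselves
length(eval(quote(x), envir = list(x = x)))   # 101

# Two ways to plot the identity with curve()
pdf(NULL)
identity_fn <- function(x) x
curve(identity_fn, 0, 1)   # a bare name is treated as a function of x
curve(1 * x, 0, 1)         # a call involving x is evaluated directly
invisible(dev.off())
```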


On Mon, 6 Jun 2011, Prof Brian Ripley wrote:

> On Sun, 5 Jun 2011, Abhilash Balakrishnan wrote:
>
> > Dear Mr. Murdoch,
> >
> > I find out that still do not understand why the following does not work:
> >
> >   curve(expression(x))
> >
> > Error in xy.coords(x, y, xlabel, ylabel, log) :
> >  'x' and 'y' lengths differ
> >
> > As here the input to curve is an expression, as documented in the help, and
>
> Not really, and certainly not in the sense you seem to understand it.
> 'expression(x)' is a call to the expression() function, and that evaluates to
> a length-one expression vector. As ?expression says:
>
>   ‘Expression’ here is not being used in its colloquial sense, that
>   of mathematical expressions.  Those are calls (see ‘call’) in R,
>   and an R expression vector is a list of calls, symbols etc, for
>   example as returned by ‘parse’.
>
> > the expression is simply x.
>
> 'Simply' untrue.
>
> > What is the y mentioned in the error?  There is no y used here.
>
> Yes, there is.  Please do read the code for 'curve':
>
>    y <- eval(expr, envir = list(x = x), enclos = parent.frame())
>
> so you are trying to plot a length-1 expression vector against a length-101
> 'x'.
>
> As others have said, curve() is a convenience function, and its requirements
> are rather picky.  And you have already been given one good solution,
> curve(I).
>
> > Thank you for support.
> > Abhilash B.
> >
> > On Sun, Jun 5, 2011 at 3:39 PM, Duncan Murdoch wrote:
> >
> > > On 11-06-05 1:07 PM, Abhilash Balakrishnan wrote:
> > >
> > > > Dear Sirs,
> > > >
> > > > I am a new user of the R package.  When I try to use the curve
> > > > function it confuses me.
> > > >
> > > >   curve(x^2)
> > > >
> > > > Works fine.
> > > >
> > > >   curve(x)
> > > >
> > > > Makes a complaint I don't understand.  Why is x^2 valid and x is not?
> > >
> > > curve() is a convenience function, and it tries to guess what you mean.
> > > Sometimes it gets it wrong.
> > >
> > > In the first case, it is clear you want to graph x^2.  In the second it
> > > guesses you have a function named x and want to graph that.  You don't,
> > > so it fails.
> > >
> > > Probably it could try again after the first failure, but I'd guess there
> > > will always be strange cases where it does weird things.
> > >
> > > Duncan Murdoch
> > >
> > > > I check the documentation of curve, and it says the first argument
> > > > must be an expression containing x.
> > > >
> > > >   expression(x)
> > > >
> > > > Is an expression containing x.
> > > >
> > > >   curve(expression(x))
> > > >
> > > > Makes a different complaint and mentions different lengths of x and y
> > > > (but I use no y here).
> > > >
> > > > I understand that plotting the function y(x) = x is rather silly, but
> > > > I want to know what I am doing wrong, for the sake of my understanding
> > > > of how R works.
> > > >
> > > > Thank you for support.
> > > > Abhilash B.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Error with BRugs 0.53 and 0.71, on Win7 with R 2.12.2 and 2.13.0 (crashes R GUI)

2011-06-06 Thread Uwe Ligges




On 05.06.2011 23:42, Chris Chapman wrote:

> Thanks again. FWIW, I tried an even older R version (2.11) + BRugs 0.53
> and 0.70 ... but got the same errors as with the iterations reported
> below. So I'm giving up on trying to solve the issue.
>
> My workaround is that I'm using the R2WinBUGS package instead. So far
> that has worked well -- I've run a few iterations of models with no
> problems.
>
> Specifically, none of the errors below have occurred (this is using R
> 2.13.0, R2WinBUGS 2.1-18, OpenBUGS 3.0.3 as installed by BRugs, running
> under Windows 7/32-bit). Unlike with BRugs, I'm able to use R2WinBUGS to
> run a model, change it, run again with new parameters, etc., with no
> crashes or other problems yet.

Strange, since the BRugs example always worked well for me and some 
others. Hence I cannot tell where the interaction comes from that causes 
the crashes.

> Uwe: thank you for your work there as well!

Well, I am very far behind my schedule for BRugs!!!

Best wishes,
uwe

> -- Chris

--
From: "Uwe Ligges" 
Sent: Monday, May 30, 2011 12:57 AM
To: "Chris Chapman" 
Cc: 
Subject: Re: [R] Error with BRugs 0.53 and 0.71, on Win7 with R 2.12.2
and 2.13.0 (crashes R GUI)




On 29.05.2011 23:19, Chris Chapman wrote:

Uwe -- thank you. No, this occurs on three different machines: two at
work (a Lenovo laptop running Win7-32, plus an HP workstation running
Win7-64) ... and I just tried another Compaq desktop machine at home
running WinXP-32, with the same result.

I agree that this seems highly unusual since the examples are so simple
and obviously work in general; yet it is also highly replicable for me,
and I'm at a loss as to what might be the root cause given the different
machines and Windows versions. FWIW, everything else in my R environment
(Rgui, Tinn-R, RStudio, ggplot2, various other packages) runs OK.
OpenBUGS in itself also seems OK albeit in limited tests.

Could there be something in the "handleRes()" error that suggests
anything to examine (firewall, antivirus, file locations, or some file
permissions, perhaps? -- although those also vary across my machines,
esp. from work to home).


The only thing I can say is, yes, perhaps.
Sorry, but I am still unable to reproduce so far.

Uwe




Thanks again,

-- Chris
--
From: "Uwe Ligges" 
Sent: Sunday, May 29, 2011 10:07 AM
To: "Chris Chapman" 
Cc: 
Subject: Re: [R] Error with BRugs 0.53 and 0.71, on Win7 with R 2.12.2
and 2.13.0 (crashes R GUI)


Sounds like a hardware problem to me, since I do not experience any
problems with the example you gave at first. Is this all on the same
hardware?

Uwe Ligges



On 27.05.2011 18:38, Chris Chapman wrote:

I've run into persistent problems with OpenBUGS crashing when using
BRugs .53 and .71, and am hoping someone has suggestions. There is
obviously something unusual going on in my environment, but I'm at a
loss as to where to begin to try to solve it.

In a nutshell, what happens is that, as soon as I call "modelCheck()"
in BRugs, it gets an error or crashes ... but only some of the time
(90%< p< 100%). Following are details:

1. OpenBUGS 3.0.3 + BRugs 0.531:
It works occasionally, but approximately 90% of the time, I get the
following error from modelCheck():
Error in handleRes(res[[3]]) :

An OpenBUGS module or procedure was called that did not exist.



The specific code seems not to matter, but here is an example (model
taken from the OpenBUGS tutorial):
modelString =
" model
{
for (i in 1:N) {
r[i] ~ dbin(p[i], n[i])
b[i] ~ dnorm(0, tau)
logit(p[i])<- alpha0 + alpha1 * x1[i] + alpha2 * x2[i]
+ alpha12 * x1[i] * x2[i] + b[i]
}
alpha0 ~ dnorm(0, 1.0E-6)
alpha1 ~ dnorm(0, 1.0E-6)
alpha2 ~ dnorm(0, 1.0E-6)
alpha12 ~ dnorm(0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sigma<- 1 / sqrt(tau)
}
"
print(modelString)
writeLines(modelString,con="model3.txt")
modelCheck( "model3.txt" )

Which (usually) produces:

modelCheck( "model3.txt" )

Error in handleRes(res[[3]]) :
An OpenBUGS module or procedure was called that did not exist.

I've copied at the end of this message an example from a single R
session that shows how it may work sometimes but not others.


2. OpenBUGS 3.2.1 + BRugs 0.71:

As above, the model occasionally works, but mostly it crashes R on
the modelCheck() line with the error "R for Windows GUI front-end has
stopped working".


I've tried the following combinations to try to get it to work:
A. Win7 32-bit + R 2.13 + BRugs 0.531 from standard CRAN repository
(installed from R)
B. Win7 32-bit + R 2.13 + BRugs 0.71 + OpenBUGS 3.2.1 (package and
EXE from OpenBUGS site)
C. Win7 64-bit [different machine] + R 2.13 (32-bit) + BRugs 0.531
D. Win7 64-bit + R 2.13 (32-bit) + BRugs 0.71 + OpenBugs 3.2.1
E. Win7 32-bit + *R 2.12.2* + BRugs 0.531 from standard CRAN
repository (installed from R)
F. Win7 64-bit + R 2.12.2 (32-bit) + BRugs 0.531
G. Win7 64-bit + R 2.12.2 (32-bit) + BRugs 0.71 + OpenBugs 3.2.1
H. ... and various combinat

Re: [R] logistic growth model

2011-06-06 Thread Walmes Zeviani

If you use RStudio (www.rstudio.org), you can find good starting values
interactively with the manipulate() function. See the simple code below.

age <- 1:20
k <- 3; b0 <- -5; b1 <- .5
y <-  k*exp(b0+b1*age)/(1+exp(b0+b1*age))+rnorm(age,0,0.1)
plot(y~age)
start <- list()

require(manipulate)
manipulate(
   {
 plot(y~age)
 k <- kk; b0 <- b00; b1 <- b10
 curve(k*exp(b0+b1*x)/(1+exp(b0+b1*x)), add=TRUE)
 start <<- list(k=k, b0=b0, b1=b1)
   },
   kk=slider(0,5,step=0.1,initial=2.7),
   b00=slider(-5,0,step=0.1,initial=-2.5),
   b10=slider(0,1,step=0.1,initial=0.5))

start

n0 <- nls(y~k*exp(b0+b1*age)/(1+exp(b0+b1*age)), start=start)
summary(n0)

If you don't use RStudio you can follow the procedures in this blog (that
uses gWidgetsRGtk2).

http://ridiculas.wordpress.com/2011/04/09/metodo-grafico-interativo-para-valores-iniciais-em-regressao-nao-linear/
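
Another route to starting values that needs no interactive tuning is a self-starting model. The sketch below is an assumption-laden illustration, not part of the original reply: it uses SSlogis() from the stats package on simulated data and converts its parameters to k, b0 and b1 (the names fit_ss and start_ss are made up). SSlogis() fits Asym / (1 + exp((xmid - input) / scal)), which is the same curve with k = Asym, b1 = 1/scal and b0 = -xmid/scal.

```r
set.seed(123)
age <- 1:20
y <- 3 * plogis(-5 + 0.5 * age) + rnorm(length(age), sd = 0.1)

# Self-starting logistic: no start argument is needed
fit_ss <- nls(y ~ SSlogis(age, Asym, xmid, scal))

# Convert to the k, b0, b1 parameterisation used in this thread
cf <- coef(fit_ss)
start_ss <- list(k  = unname(cf["Asym"]),
                 b0 = unname(-cf["xmid"] / cf["scal"]),
                 b1 = unname(1 / cf["scal"]))

# Refit in the original parameterisation from those starting values
n0 <- nls(y ~ k * exp(b0 + b1 * age) / (1 + exp(b0 + b1 * age)),
          start = start_ss)
coef(n0)
```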

Bests.
Walmes.

==
Walmes Marques Zeviani
LEG (Laboratório de Estatística e Geoinformação, 25.450418 S, 49.231759 W)
Departamento de Estatística - Universidade Federal do Paraná
fone: (+55) 41 3361 3573
VoIP: (3361 3600) 1053 1173
e-mail: wal...@ufpr.br
twitter: @walmeszeviani
homepage: http://www.leg.ufpr.br/~walmes
linux user number: 531218
==


On Mon, Jun 6, 2011 at 4:20 AM, Rubén Roa  wrote:

>
> Write the growth formula in an R script.
> Define initial par values.
> Input the size and age data.
> Plot the size and age data as points.
> Plot the growth model with the initial par values as a line.
> Play with the initial par values until you see a good agreement between the
> model (the line) and the data (the points).
> Optimise.
> Re-plot.
> Plot a residual histogram.
> Plot a residual scatterplot.
> Plot a Q-Q residual plot.
>
> HTH
>
> Rubén
>

[R] How to assess the accuracy of fitted logistic regression using glm

2011-06-06 Thread Xiaobo Gu

Hi,

I am using glm with family = binomial to do binary logistic
regression. How can I assess the accuracy of the fitted model? The
summary method prints a lot of information about the returned
object, such as the coefficients, but statistics is not my speciality.
Can you share some rules of thumb for examining the fitted model from a
practical perspective?

Regards,

Xiaobo Gu


[R] simulation

2011-06-06 Thread Stat Consult

Dear ALL
I want to simulate data from Multivariate normal distribution.
GE.N<-mvrnorm(25,mu,S)
S <- matrix(0, nrow=100, ncol=100)
for (i in 1:100){sigma<-runif(100,0.1,10); S[i,i]=sigma[i]; mu<-runif(100,0,10)}
for (i in 1:20){for (j in 1:20){if (i != j){S[i,j]=0.3*sigma[i]*sigma[j]}}}
for (i in 21:40){for (j in 21:40){if (i != j){S[i,j]=0.3*sigma[i]*sigma[j]}}}
for (i in 41:60){for (j in 41:60){if (i != j){S[i,j]=0.3*sigma[i]*sigma[j]}}}
for (i in 61:80){for (j in 61:80){if (i != j){S[i,j]=0.3*sigma[i]*sigma[j]}}}
for (i in 81:100){for (j in 81:100){if (i != j){S[i,j]=0.3*sigma[i]*sigma[j]}}}
How should I do when S is not positive definite matrix?
I saw this error: 'Sigma' is not positive definite.
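
One possible reading of the problem, offered as a sketch rather than a diagnosis: in the code above sigma is redrawn on every pass of the first loop, and the diagonal holds sigma[i] rather than sigma[i]^2, so the off-diagonal entries 0.3*sigma[i]*sigma[j] need not be compatible with the variances. Assuming sigma is meant to be a vector of standard deviations and 0.3 a within-block correlation, a covariance matrix built from a valid correlation matrix is positive definite by construction. The names block and R are made up for the illustration; mvrnorm() is from the MASS package.

```r
library(MASS)   # for mvrnorm()

set.seed(1)
p     <- 100
mu    <- runif(p, 0, 10)
sigma <- runif(p, 0.1, 10)   # treated here as standard deviations (assumption)

# Correlation matrix: 0.3 within each block of 20 variables, 0 between blocks
block <- rep(1:5, each = 20)
R <- 0.3 * outer(block, block, "==")
diag(R) <- 1

# Covariance: S[i, j] = R[i, j] * sigma[i] * sigma[j], so S[i, i] = sigma[i]^2
S <- R * outer(sigma, sigma)

# All eigenvalues are positive, so S is positive definite
min(eigen(S, symmetric = TRUE, only.values = TRUE)$values) > 0   # TRUE

GE.N <- mvrnorm(25, mu, S)
dim(GE.N)   # 25 100
```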

best regards,
Sara


[R] qplot fill and colour not working as expected

2011-06-06 Thread wwreith

I am just learning to use qplot and can't get the fill/colour to work. Below
is the R code for a scatter plot and bar graph.

library(ggplot2)
x<-c(1,2,3,4,5,6,7)
y<-c(1,2,3,2,5,6,3)
qplot(x,y, main="Scatter Plot Test", xlab="X Label Test", ylab="Y Label Test",
      colour="blue")
z<-c("van", "van", "van", "car", "car", "truck",
     "truck", "truck", "truck", "van", "van")
qplot(z, main="Bar Graph Test", ylab="Vehicle Count", xlab="Vehicle Category",
      fill="blue")
If I set fill=z, then I can get something different, i.e. three preselected
colors are used, but what if I want all three bars to just be blue instead
of the default? 

I am having the same issue if I try to use colour. How do I change the
default color to blue instead of black.

Second question what exactly is the difference between colour and fill?  


[R] Wireframe, custom x-axis values

2011-06-06 Thread Rbjørn Nicolaisen


Hi,

I'm plotting some data with wireframe() like so:

wireframe(result ~ u * r, myData, scales=list(arrows=FALSE))

However, I would really like to display something different for the displayed 
values of "u" rather than the actual values.
This is because my u-values are a sequence of quantiles of myData, and I would 
like to display the quantiles used (e.g. "0.8   0.85   0.9   0.95")  instead of 
the actual values of these quantiles, since this is easier to relate to for a 
viewer. This information is accessible in myData in a variable, "qnt".

I've tried meddling around with "axis", "label" and "at" in scales=list(), but 
I've been unable to make it happen.

Can anyone shed some light? Preferably in a short, generic example.
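
A short generic sketch of one approach that may work: give the x component of scales its own list with at (the actual u values) and labels (the quantile levels). Everything about the data here is hypothetical, since myData was not posted, and whether at and labels are honoured should be checked against the documentation of cloud() and wireframe() in the installed lattice version.

```r
library(lattice)

# Hypothetical stand-in for myData: u holds the quantile values,
# qnt the probabilities they were computed from
set.seed(1)
raw <- rexp(1000)
qnt <- c(0.8, 0.85, 0.9, 0.95)
u   <- unname(quantile(raw, qnt))

myData <- expand.grid(u = u, r = 1:5)
myData$result <- with(myData, u * sqrt(r))

# Tick marks are placed at the u values but labelled with the quantile levels
p <- wireframe(result ~ u * r, data = myData,
               scales = list(arrows = FALSE,
                             x = list(at = u, labels = format(qnt))),
               xlab = "quantile")

pdf(NULL)              # draw without opening a screen device
print(p)
invisible(dev.off())
```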

Thanks in advance,
Thor
  

[R] Not missing at random

2011-06-06 Thread Blaz Simcic



Hello!

I would like to sample 30 % of cases (rows with at least one value lower than 3)
and, within the sampled cases, set all values lower than 3 to NA (NMAR: not
missing at random). I managed to sample the cases, but I don't know how to set
the values lower than 3 to NA.

R code:

# data matrix
x <- 
matrix(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
 nrow = 7, ncol=7, byrow=TRUE)

pMiss <- 30      # percent of missing values

N <- dim(x)[1]   # number of cases

# I want to sample among cases with at least 1 value lower than 3,
# so I have to find the candidates
candidate<-which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 | 
x[,7]<3)

idMiss <- sample(candidate, round(N * pMiss / 100))   # sampled cases

Now I'd like to set all values lower than 3 among the sampled cases to NA.

Any suggestion?
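
One way this could be done, as a sketch: build a logical matrix marking the values below 3, switch the marks off in every row that was not sampled, and assign NA through that mask. The matrix is the one from the question; the name low and the use of round() and set.seed() are choices made for the illustration.

```r
x <- matrix(c(rep(1:5, 9), 3, 3, 3, 4), nrow = 7, ncol = 7, byrow = TRUE)
pMiss <- 30          # percent of cases to receive missing values
N <- nrow(x)         # number of cases

set.seed(1)
# Rows with at least one value lower than 3
candidate <- which(rowSums(x < 3) > 0)
idMiss <- sample(candidate, round(N * pMiss / 100))

low <- x < 3                # TRUE where a value is lower than 3
low[-idMiss, ] <- FALSE     # keep the marks only in the sampled rows
x[low] <- NA
x
```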

Thanks,
Blaž

[R] Problem in R documentation

2011-06-06 Thread siddharth arun

I am not able to run the Dickey-Fuller test.
The adf.test() function is not working. It shows 'Error: could not find
function "adf.test"'.


Can anyone tell me how to load the "time series" library?

-- 
Siddharth Arun,
4th Year Undergraduate student
Industrial Engineering and Management,
IIT Kharagpur


Re: [R] Negating two identical characters with regular expressions

2011-06-06 Thread matto in cor

Hello Michael,

try strsplit("aa-bb-cc dd", "-{2,}") . This function returns a list whose
element holds the strings separated by runs of dashes (at least two).
Alternatively if you want the first string only try this: sub("(.*?)--.*",
"\\1", "aa-bbcc dd") (note the reluctant quantifier *? )

Hope this helps
Marco
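
A runnable version of the two suggestions above. The archive appears to have collapsed runs of dashes in this thread, so the input string below, with a double dash before "coding", is an assumption about what the original looked like; the names s, parts and first are made up.

```r
# Assumed input: a single dash inside the part to keep, a double dash after it
s <- "Race-ethnicity--coding information"

# Split on runs of two or more dashes; strsplit() returns a list
parts <- strsplit(s, "-{2,}")[[1]]
parts    # "Race-ethnicity" "coding information"

# Keep only the text before the first run of two or more dashes
first <- sub("-{2,}.*$", "", s)
first    # "Race-ethnicity"

# The same idea with a capture group and a reluctant quantifier
sub("(.*?)-{2,}.*", "\\1", s)
```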

On Sun, Jun 5, 2011 at 9:59 PM, Michael Young wrote:

> Hello all,
>
> Let's say I have a character string
> "Race-ethnicity-coding information"
>
> I want to extract all text before the multiple dashes, including the word
> "ethnicity."
>
> I wrote a handy function to extract the first matched text:
>
> grepcut <- function(pattern,x){
> start.and.length <- regexpr(pattern,x)
> substring(x,start.and.length,start.and.length
> +attr(start.and.length,"match.length")-1)}
>
> grepcut("^[^-]+","Race-ethnicity-coding information")
>
> The above grepcut, of course, returns only the string "Race"  What I really
> want is a to create a class of two dashes in a row and then negate that. Is
> it possible to create a class of repeated characters?  If so, it might be
> further complicated that "-" is a special character in brackets and can
> only
> go first or last.
>
> Can anyone help me out?
>
> Thanks,
> Michael Young
>



-- 
Every age has its own fascism. It is reached in many ways, not
necessarily through the terror of police intimidation, but also by
denying or distorting information, by corrupting justice, by
paralysing the schools, and by spreading in many subtle ways
nostalgia for a world in which order reigned supreme.
(Primo Levi, 1974)


Re: [R] How to assess the accuracy of fitted logistic regression using glm

2011-06-06 Thread Prof Brian Ripley


On Mon, 6 Jun 2011, Xiaobo Gu wrote:


> Hi,
>
> I am trying glm with family = binomial to do binary logistic
> regression, but how can I assess the accuracy of the fitted model, the
> summary method can print a lot of information about the returned
> object, such as coefficients, because statistics is not my speciality,
> so can you share some rule of thumb to exam the  fitted model from the
> practical perspective.


It depends entirely on why you did the fit.  People have written whole 
books on assessing the performance of classification procedures such 
as binary logistic regression.  For example, the residual deviance is 
closely related to log-probability scoring: for some purposes that is 
a good performance measure, for others (e.g. when you are going to 
threshold the predicted probabilities) it can be very misleading.
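
To illustrate the distinction drawn above, the sketch below computes both kinds of measure for a logistic regression on simulated data (the data and the names log_score and pred are assumptions for the example). For ungrouped 0/1 responses the residual deviance equals minus twice the sum of the log predicted probabilities of the observed outcomes, while the error rate after thresholding at 0.5 ignores how confident the predictions were. Both are computed on the fitting data here, so they are optimistic.

```r
set.seed(7)
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))   # simulated 0/1 outcome

fit <- glm(y ~ x, family = binomial)
p   <- fitted(fit)                          # predicted P(y = 1)

# Log-probability score and its link to the residual deviance
log_score <- sum(y * log(p) + (1 - y) * log(1 - p))
all.equal(deviance(fit), -2 * log_score)    # TRUE

# Thresholding the predicted probabilities at 0.5
pred <- as.integer(p > 0.5)
table(observed = y, predicted = pred)
mean(pred != y)                             # in-sample misclassification rate
```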


In short, you need statistical advice, not R advice (the purpose of 
this list).




> Regards,
>
> Xiaobo Gu



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


[R] RCurl and kerberos

2011-06-06 Thread TAPO (Thomas Agersten Poulsen)

Dear list,

I would like to call a Kerberos-authenticated web-service from within R.

Curl can do it:
$ curl --negotiate -u : "http://my.web.service/"

so I would expect that RCurl also has the capability, but I have not been able 
to find the correct options to set.

listCurlOptions() does not return anything with negotiate, and searching the 
source of RCurl, the only thing I found was 

./RCurl/R/curlInfo.S:names(CurlFeatureBits) = c("ipv6", "kerberos4", "ssl", 
"libz", "ntlm", "gssnegotiate",

but e.g.
getURL("http://my.web.service",.opts=curlOptions(username=":",httpauth="gssnegotiate"))

does not work.

Does anybody know if RCurl or another package is able to negotiate the GSS api 
or otherwise access a Kerberos enabled webservice and how?

Testing was done in R 2.12.1 on Ubuntu 10.04.1 LTS / Lucid.

Thanks in advance,
Thomas.


Re: [R] Problem in R documentation

2011-06-06 Thread Sarah Goslee

What operating system are you using?
What version of R are you using?
How did you install the package in question? Did the installation
process give any error messages?
Did you load the package before trying to use it?
What package are you trying to load - there is no "time series"
package. Do you mean timeSeries? Or tseries? Or some other package
entirely?

On Mon, Jun 6, 2011 at 2:41 AM, siddharth arun  wrote:
> I am not able to run Dickey-Fuller test.
> adf.test() function is not working. It is showing 'Error: could not find
> function "adf.test"
>
>
> Can any tell how to call "time series" library?
>
> --

-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] adding an ellipse to a PCA plot

2011-06-06 Thread Lukas Baitsch

Hi,

I created a principal component plot using the first two principal
components. I used the function princomp() to calculate the scores.
Now I would like to superimpose an ellipse representing the center
and the 95% confidence interval of a series of points in my plot (to
illustrate the grouping of my samples).

I looked at the ellipse() function in the ellipse package but can't
get it to work. The princomp() function gives me the scores of each
point, so I can calculate the mean and the 95% CI, but I can't
integrate this into the ellipse() function. Is there a better way of
doing this or can someone help me figure out this function?
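
A base-R sketch of one way to draw such an ellipse without the ellipse package. It draws the contour expected to contain about 95% of the points if the two scores were bivariate normal, which is an assumption; a confidence region for the group mean would need a different scaling. The data are simulated and the names sc, ctr and ell are made up. For several groups, the same steps can be repeated on the rows of each group.

```r
set.seed(3)
# Simulated stand-in for the data: 50 samples, 4 variables
X  <- matrix(rnorm(200), ncol = 4) %*% matrix(runif(16), 4)
pc <- princomp(X)
sc <- pc$scores[, 1:2]          # scores on the first two components

ctr <- colMeans(sc)             # centre of the ellipse
S   <- cov(sc)                  # covariance of the scores

# Points z with (z - ctr)' S^{-1} (z - ctr) = qchisq(0.95, 2):
# map the unit circle through the Cholesky factor of S, scale, then shift
theta  <- seq(0, 2 * pi, length.out = 200)
circle <- cbind(cos(theta), sin(theta))
ell <- sweep(sqrt(qchisq(0.95, df = 2)) * circle %*% chol(S), 2, ctr, "+")

plot(sc, asp = 1)
lines(ell, col = "red")
points(ctr[1], ctr[2], pch = 3, col = "red")
```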

best regards,

Lukas


Re: [R] Problem in R documentation

2011-06-06 Thread Jorge Ivan Velez

Hi Siddharth,

adf.test() is part of the "tseries" package, so you need to download and
install it before using that function. Try the following and let us know what
you get:

install.packages('tseries')
require(tseries)
?adf.test

HTH,
Jorge

On Mon, Jun 6, 2011 at 2:41 AM, siddharth arun <> wrote:

> I am not able to run Dickey-Fuller test.
> adf.test() function is not working. It is showing 'Error: could not find
> function "adf.test"
>
>
> Can any tell how to call "time series" library?
>

Re: [R] Lasso for k-subset regression

2011-06-06 Thread Steve Lianoglou

Hi,

On Sun, Jun 5, 2011 at 9:12 PM, Dae-Jin Lee  wrote:
> Dear R-users
>
> I'm trying to use lasso in lars package for subset regression,  I have a
> large matrix of size 1000x100 and my aim is to select a subset k of the 100
> variables.
>
> Is there any way in lars to fix the number k (i.e. to select the best 10
> variables)
>
> library(lars)
>
> aa=lars(X,Y,type="lasso",max.steps=200)
>
> plot(aa,plottype="Cp")
> aa$RSS
> which.min(aa$RSS)
> round(aa$beta,2)
>
> aa$beta[which.min(aa$RSS),]    #  find which coefficients minimizes the RSS
>
> lasso.ind=which((as.vector((aa$beta[which.min(aa$RSS),])))>0)    # index of
> variables
>
> print(lasso.ind)   # this usually gives more than 10 variables (also depends
> on the max.steps in lars)

First off: I'd suggest using the glmnet package instead of lars.
Setting its `alpha` parameter to 1 will give you the lasso, but you
can also play w/ different values of alpha to see if an
elasticnet-type penalty would be better.

Now that you are using glmnet, check its `dfmax` and `pmax` arguments.

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] qplot fill and colour not working as expected

2011-06-06 Thread Ista Zahn

Hi,

On Mon, Jun 6, 2011 at 9:49 AM, wwreith  wrote:
> I am just learning to use qplot and can't get the fill/colour to work. Below
> is the R code for a scatter plot and bar graph.
>
> library(ggplot2)
> x<-c(1,2,3,4,5,6,7)
> y<-c(1,2,3,2,5,6,3)
> qplot(x,y, main="Scatter Plot Test", xlab="X Label Test",
> ylab="Y Label Test", colour="blue")
> z<-c("van", "van", "van", "car", "car", "truck",
> "truck", "truck", "truck", "van", "van")
> qplot(z, main="Bar Graph Test", ylab="Vehicle Count",
> xlab="Vehicle Category", fill="blue")
> If I set fill=z, then I can get something different, i.e. three preselected
> colors are used, but what if I want all three bars to just be blue instead
> of the default?
>
> I am having the same issue if I try to use colour. How do I change the
> default color to blue instead of black.

Use 'fill=I("blue")' or 'colour=I("blue")'.
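
Why I() matters: without it, qplot() treats "blue" as a data value and maps
it through the fill scale, just like fill=z. The sketch below shows the same
distinction with ggplot(), where mapping and setting are written in different
places. The data frame `d` and the plot names are made up for illustration,
and layer_data() is used only to inspect the colours that end up on the bars.

```r
library(ggplot2)

z <- c("van", "van", "van", "car", "car", "truck",
       "truck", "truck", "truck", "van", "van")
d <- data.frame(z = z)

# Mapped: "blue" is treated as a one-level variable and sent through the
# fill scale, so the bars get the scale's default colour plus a legend.
p_mapped <- ggplot(d, aes(x = z, fill = "blue")) + geom_bar()

# Set: fill is a fixed parameter of the layer, so the bars really are blue.
# This is what qplot(z, fill = I("blue")) asks for.
p_set <- ggplot(d, aes(x = z)) + geom_bar(fill = "blue")

unique(layer_data(p_mapped)$fill)   # a colour chosen by the scale
unique(layer_data(p_set)$fill)      # the literal colour that was set
print(p_set)
```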

>
> Second question what exactly is the difference between colour and fill?

As near as I can tell, colour applies to lines and points, while fill
applies to the interior of rectangles and other areas.

Best,
Ista




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] qplot fill and colour not working as expected

2011-06-06 Thread Jonathan Daily

One difference between colour and fill can be demonstrated by
specifying both with different values. In cases where a polygon is
filled, colour specifies border line color.




-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.


Re: [R] shading in overlap between two ranges

2011-06-06 Thread Graves, Gregory

This worked perfectly.  An example graphic is located here:
ftp://ftp.sfwmd.gov/pub/ggraves/ribbon.bmp


-Original Message-
From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Thursday, June 02, 2011 12:11 PM
To: Graves, Gregory
Cc: r-help@r-project.org; Kemp, Susan K SAJ; patrick_pi...@fws.gov
Subject: Re: [R] shading in overlap between two ranges

Hi:

Here's one approach using geom_ribbon() in ggplot2 - the 'overlap' is
the change in color where the two ribbons intersect. Using your
example data with the same names and the 'one.month' variable removed,

library(ggplot2)
ggplot() +
  geom_ribbon(data = target, aes(x = i.value, ymin = X25, ymax = X75,
 fill = 'Target'), alpha = 0.4) +
  geom_ribbon(data = observed, aes(x = i.value, ymin = X25, ymax = X75,
 fill = 'Observed'), alpha = 0.4) +
  scale_fill_manual("", c('Target' = 'blue', 'Observed' = 'orange')) +
  opts(legend.position = c(0.88, 0.85),
   legend.background = theme_rect(colour = 'transparent'),
   legend.text = theme_text(size = 12)) +
  labs(x = 'Month', y = 'Value')

There is a separate geom_ribbon() for each of target and observed. A
factor variable for fill color is generated on the fly with colors
specified in scale_fill_manual(). The opts() reposition the legend
inside the graphics region (the values represent proportions of the
total graphics area in each direction), make the legend background
transparent and slightly increase the size of the legend labels
(default size = 10 in theme_text).
Alpha transparency is used so that the overlap creates a blend of the
two colors; without it, one overwrites the other.
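
For comparison, the overlap region itself can also be computed and shaded in
base graphics. This is only a sketch and not part of the ggplot2 solution
above: the columns are named X25 and X75 (no trailing dot), the 0.01 grid
step is arbitrary, and the curves are assumed to be straight lines between
months, as in the original plot() calls.

```r
month <- 1:12
target <- data.frame(
  i.value = month,
  X25 = c(10.845225, 12.235813, 14.611749, 17.529332, 19.458738, 15.264505,
          12.370369, 12.471224, 9.716685, 6.470568, 6.180560, 9.673738),
  X75 = c(17.87237, 19.74490, 23.44810, 28.09647, 30.56936, 28.29333,
          23.35455, 21.82794, 17.28762, 12.49830, 14.24961, 15.79208))
observed <- data.frame(
  i.value = month,
  X25 = c(19.81000, 23.64062, 26.04865, 32.02625, 34.74479, 37.48885,
          30.06740, 26.14917, 14.12521, 11.04125, 13.14917, 17.17938),
  X75 = c(27.63500, 30.09125, 35.99104, 41.50958, 47.75958, 46.56448,
          40.10146, 39.49458, 32.39406, 23.55479, 23.56833, 27.02458))

# months in which the two bands overlap at all
overlap_months <- month[pmin(target$X75, observed$X75) >
                        pmax(target$X25, observed$X25)]

# interpolate the four curves on a fine grid so that the shaded region
# follows the lines between months
xx <- seq(1, 12, by = 0.01)
on_grid <- function(y) approx(month, y, xout = xx)$y
lo <- pmax(on_grid(target$X25), on_grid(observed$X25))
hi <- pmin(on_grid(target$X75), on_grid(observed$X75))
hi <- pmax(hi, lo)   # zero height wherever the bands do not overlap

plot(NA, xlim = c(1, 12), ylim = c(0, 55), xlab = "Month", ylab = "Value")
polygon(c(xx, rev(xx)), c(lo, rev(hi)), col = "grey70", border = NA)
matlines(month, cbind(target$X25, target$X75), col = "red", lty = 1)
matlines(month, cbind(observed$X25, observed$X75), col = "black", lty = 1)
```

The overlap is the region between the larger of the two lower curves and the
smaller of the two upper curves, wherever that difference is positive.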

HTH,
Dennis


On Thu, Jun 2, 2011 at 8:04 AM, Graves, Gregory  wrote:
> I have 2 datafiles 'target' and 'observed' as shown below (I will gladly
> email these 2 small files to whomever).  X25. And X75. Indicate the
> value of 25th and 75th-percentile of the target ('what should be') and
> the observed ('what is').  The i.value is simply the month.
>
>> target
>        X        i.value    X25.     X75.
> 1  one.month       1 10.845225 17.87237
> 2  one.month       2 12.235813 19.74490
> 3  one.month       3 14.611749 23.44810
> 4  one.month       4 17.529332 28.09647
> 5  one.month       5 19.458738 30.56936
> 6  one.month       6 15.264505 28.29333
> 7  one.month       7 12.370369 23.35455
> 8  one.month       8 12.471224 21.82794
> 9  one.month       9  9.716685 17.28762
> 10 one.month      10  6.470568 12.49830
> 11 one.month      11  6.180560 14.24961
> 12 one.month      12  9.673738 15.79208
>
>> observed
>     X         i.value   X25.     X75.
> 1  one.month       1 19.81000 27.63500
> 2  one.month       2 23.64062 30.09125
> 3  one.month       3 26.04865 35.99104
> 4  one.month       4 32.02625 41.50958
> 5  one.month       5 34.74479 47.75958
> 6  one.month       6 37.48885 46.56448
> 7  one.month       7 30.06740 40.10146
> 8  one.month       8 26.14917 39.49458
> 9  one.month       9 14.12521 32.39406
> 10 one.month      10 11.04125 23.55479
> 11 one.month      11 13.14917 23.56833
> 12 one.month      12 17.17938 27.02458
>
> The following plots 4 lines on one graph.  The area between the two red
> lines represents the target 'zone', and the area between the two black
> lines is the observed 'zone'.
>
> with(target, plot(X25.~i.value,ylim=c(0,55),type='l',col='red'))
> par(new=T)
> with(target, plot(X75.~i.value,ylim=c(0,55),type='l',col='red'))
> par(new=T)
> with(observed, plot(X25.~i.value,ylim=c(0,55),type='l'))
> par(new=T)
> with(observed, plot(X75.~i.value,ylim=c(0,55),type='l'))
> par(new=F)
>
> Ideally, the target and the observed should overlap in every month -
> they don't.  The desire is to visually accentuate the amount of overlap
> by shading in the area where these two "zones" overlap.  How would you
> do that?  Note, that in some of these characterizations, the overlap
> wanders in and out [I already have routines that calculate the percent
> of overlap, but I have been requested to find a way to shade the
> overlap.]
>

Re: [R] adding an ellipse to a PCA plot

2011-06-06 Thread John Fox

Dear Lukas,

You might try the dataEllipse() function in the car package.

I hope this helps,
 John
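
To show what such an ellipse is, here is a hand-rolled sketch in base R. It
is not how dataEllipse() is implemented: the radius below assumes bivariate
normal scores and uses a chi-square quantile, the iris data stand in for the
real samples, and `ellipse_points` is a made-up helper name.

```r
pc <- princomp(iris[, 1:4])
scores <- pc$scores[, 1:2]
grp <- iris$Species

plot(scores, col = as.integer(grp), pch = 16,
     xlab = "Comp.1", ylab = "Comp.2")

# points on the ellipse (x - m)' S^{-1} (x - m) = r^2 around the group mean m
ellipse_points <- function(xy, level = 0.95, n = 200) {
  m <- colMeans(xy)
  S <- cov(xy)
  r <- sqrt(qchisq(level, df = 2))
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol(S) maps the unit circle onto the covariance ellipse
  sweep(r * circle %*% chol(S), 2, m, "+")
}

for (g in levels(grp)) {
  lines(ellipse_points(scores[grp == g, ]), col = which(levels(grp) == g))
}
```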

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Lukas Baitsch
> Sent: June-06-11 10:33 AM
> To: r-help@r-project.org
> Subject: [R] adding an ellipse to a PCA plot
> 
> Hi,
> 
> I created a principal component plot using the first two principal
> components. I used the function princomp() to calculate the scores.
> now, I would like to superimpose an ellipse representing the center and
> the 95% confidence interval of a series of points in my plot (as to
> illustrate the grouping of my samples).
> 
> I looked at the ellipse() function in the ellipse package but can't get
> it to work. the princomp()-function gives me the scores of each point,
> so I can calculate the mean and the 95%-CI, but I can't integrate this
> into the ellipse()-function). Is there a better way of doing this or can
> someone help me figure out this function?
> 
> best regards,
> 
> Lukas
> 

[R] test cases (data) for data based modeling

2011-06-06 Thread Immanuel B

Hello all,

I'm working mostly with machine learning code in R and looking for a structured
way to check if my code is working properly.

For example, if I train a classifier on some data, how do I know whether the
good or bad results are due to the data and not just to a programming error
that I introduced somewhere?

results are too good: I might have used some part of the test data for training
results are too bad: could have any reason

I know that I can in principle generate data containing no information
at all, or pure information, to benchmark my code, but is there a more
elaborate or easier way to do that?

I guess what I'm basically looking for is some kind of unit testing
framework to generate test data for machine learning tasks. I read about
the package RUnit but don't really know how to proceed from there.
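
The two sanity checks described above (data with no information, data with
pure information) can be sketched as follows. The nearest-centroid
classifier is only a stand-in for the code under test, and all function
names, sizes and thresholds are made up for illustration.

```r
set.seed(42)

# toy classifier standing in for the code under test: nearest class centroid
fit_centroids <- function(x, y) {
  lapply(split(as.data.frame(x), y), colMeans)
}
predict_centroids <- function(model, x) {
  # squared distance of every row of x to every class centroid
  d <- sapply(model, function(m) rowSums(sweep(x, 2, m)^2))
  names(model)[max.col(-d, ties.method = "first")]
}

# train on the even rows, report accuracy on the odd rows
holdout_accuracy <- function(x, y) {
  train <- seq_len(nrow(x)) %% 2 == 0
  model <- fit_centroids(x[train, , drop = FALSE], y[train])
  pred <- predict_centroids(model, x[!train, , drop = FALSE])
  mean(pred == y[!train])
}

n <- 2000
y <- factor(sample(c("a", "b"), n, replace = TRUE))

# 1. no information: features independent of the labels, accuracy near 0.5
x_noise <- matrix(rnorm(n * 5), n, 5)
acc_noise <- holdout_accuracy(x_noise, y)

# 2. pure information: class means far apart, accuracy near 1
x_signal <- x_noise
x_signal[, 1] <- x_signal[, 1] + ifelse(y == "a", -5, 5)
acc_signal <- holdout_accuracy(x_signal, y)

stopifnot(abs(acc_noise - 0.5) < 0.1, acc_signal > 0.95)
```

Checks like the final stopifnot() are what would go into a unit test. An
accuracy well above chance on the noise data would point to test data
leaking into training.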

Any ideas?
How do you test your data analysis code?

best regards,
Immanuel


[R] How can I write methods for 'as()'?

2011-06-06 Thread Janko Thyson

Dear list,

I wonder how to write methods for the function 'as' in the sense that I 
can call 'as(object, Class, strict=TRUE, ext)' and let method dispatch 
figure out the correct method.

AFAIU, there is a difference between, e.g. 'as.data.frame' and the 
methods of 'as()' as stated above since the former depends on arg 'x' 
instead of 'object', 'Class' etc.?

 > methods("as")
 > as.data.frame

I have to admit that I'm not really familiar with the S3 style of 
defining methods as I have been coding in S4 a lot, but my first attempt 
was to write something like this:

as.myClass <- function(x, ...){
    if(is(x, "data.frame")){
        x <- as.list(x)
    }
    if(is(x, "character")){
        x <- as.list(x)
    }
    # ...
    out <- getRefClass("myClass")$new(X=x)
    return(out)
}

But that way I'd have to explicitly call 'as.myClass(x)' whereas I'd 
simply like to type 'as(x, "myClass")'.
Also, how is it possible to have method dispatch recognize two  
signature arguments in S3? I.e., how can I define something like 
'as.data.frame.character' in order to have explicit "sub" methods for 
all the data types of 'x' so I wouldn't have to process them all in the 
definition of 'as.myClass' as I did above?

Thanks for your help,
Janko


-- 


*Janko Thyson*
janko.thy...@googlemail.com 

Jesuitenstraße 3
D-85049 Ingolstadt

Mobile: +49 (0)176 83294257


Re: [R] adding an ellipse to a PCA plot

2011-06-06 Thread Lukas Baitsch

Dear John,

Thanks, this actually works just fine! much easier than the ellipse-package!

Lukas


Re: [R] How can I write methods for 'as()'?

2011-06-06 Thread Janko Thyson

Somehow I don't see my own postings in the list, so sorry for replying 
to my own message and not the one that went out to the list.

I got a little further and I think I found exactly the thing that is 
bothering me: how to get "extended" method dispatch going in 'setAs()':

setRefClass("A", fields=list(X="numeric"))
setRefClass("B", contains="A", fields=list(Y="character"))

setAs(from="numeric", to="A",
 def=function(from,to){
 out <- getRefClass(to)$new(X=from)
 return(out)
 }
)
a <- as(1:5, "A")
a$X

b <- as(1:5, "B")

My problem is the last statement, b <- as(1:5, "B"), which fails. I want 
to get around having to write new 'setAs' methods for all classes 
extending class 'A'. If 'B' inherits from 'A', shouldn't it then be 
possible to tell 'setAs' to look for the next suitable method, i.e. the 
method defined for 'A'? I tried 'NextMethod()' inside the body of 
'setAs' but that didn't work out.
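
One possible workaround, sketched under the assumption that the list of
classes extending 'A' is known: register the same coercion for each class in
a loop. `register_numeric_coercion` is a made-up helper name, and the sketch
relies on the coercion function keeping its enclosing environment so that it
can still see `cls` when as() calls it.

```r
library(methods)

setRefClass("A", fields = list(X = "numeric"))
setRefClass("B", contains = "A", fields = list(Y = "character"))

# register a numeric -> cls coercion for one class;
# the function passed to setAs() is a closure that remembers cls
register_numeric_coercion <- function(cls) {
  setAs("numeric", cls, function(from) getRefClass(cls)$new(X = from))
}

for (cls in c("A", "B")) register_numeric_coercion(cls)

a <- as(c(1, 2, 3), "A")
b <- as(c(1, 2, 3), "B")
class(b)
b$X
```

This still defines one method per class, but the definition is written only
once.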

Thanks a lot,
Janko


Re: [R] simulation

2011-06-06 Thread Petr Savicky

On Mon, Jun 06, 2011 at 04:50:57PM +1000, Stat Consult wrote:
> Dear ALL
> I want to simulate data from Multivariate normal distribution.
> GE.N<-mvrnorm(25,mu,S)
> S <-matrix(rep(0,1),nrow=100)
> for( i in 1:100){sigma<-runif(100,0.1,10);S
> [i,i]=sigma[i];mu<-runif(100,0,10)}
> for (i in 1:20){for (j in 1:20){if (i != j){S [i,j]=0.3*sigma[i]*sigma[j]}}}
> for (i in 21:40){for (j in 21:40){if (i != j){S
> [i,j]=0.3*sigma[i]*sigma[j]}}}
> for (i in 41:60){for (j in 41:60){if (i != j){S
> [i,j]=0.3*sigma[i]*sigma[j]}}}
> for (i in 61:80){for (j in 61:80){if (i != j){S
> [i,j]=0.3*sigma[i]*sigma[j]}}}
> for (i in 81:100){for (j in 81:100){if (i != j){S
> [i,j]=0.3*sigma[i]*sigma[j]}}}
> How should I do when S is not positive definite matrix?
> I saw this error: 'Sigma' is not positive definite.

Hello.

I am not sure how the matrix is created. Should the command

  sigma<-runif(100,0.1,10)

really be inside the loop over i? I suspect not, since otherwise
only the vector sigma generated in the last iteration (i = 100) is
used in the remaining part of the construction.

The matrix is block diagonal, so the corresponding
distribution can be built from parts corresponding to the
blocks, generated independently.

Let me look at the first block assuming that sigma is 
generated only once. The first block may be obtained also as

  B <- 0.3*outer(sigma[1:20], sigma[1:20])
  diag(B) <- sigma[1:20]

The result may have negative eigenvalues. For example,
if all components in sigma[1:20] are 4, which is in
the range used for sigma, then we have a matrix, whose
diagonal elements are 4 and nondiagonal elements are
0.3*4^2 = 4.8 > 4. This matrix has negative eigenvalues,
so it is not a covariance matrix.
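
The counterexample can be checked numerically. The second half of the sketch
is an assumption about what was intended, namely that sigma holds standard
deviations and 0.3 is a common correlation, so the diagonal gets sigma^2; the
names `sigma_const`, `sigma_sd`, `B` and `C` are made up for illustration.

```r
# construction from the post: sigma on the diagonal,
# 0.3 * sigma[i] * sigma[j] off the diagonal
sigma_const <- rep(4, 20)
B <- 0.3 * outer(sigma_const, sigma_const)
diag(B) <- sigma_const
min_B <- min(eigen(B, symmetric = TRUE)$values)
min_B   # 4 - 4.8 = -0.8 up to rounding, so B is not positive definite

# assumed intent: sigma are standard deviations, 0.3 is the correlation
set.seed(1)
sigma_sd <- runif(20, 0.1, 10)
C <- 0.3 * outer(sigma_sd, sigma_sd)
diag(C) <- sigma_sd^2
min_C <- min(eigen(C, symmetric = TRUE)$values)
min_C > 0
```

With sigma^2 on the diagonal, the block is D R D, where D = diag(sigma) and R
is a correlation matrix with all off-diagonal entries equal to 0.3. R is
positive definite, and therefore so is the block.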

Is the construction of the matrix, which you sent, correct?

Petr Savicky.


Re: [R] How can I write methods for 'as()'?

2011-06-06 Thread Janko Thyson

Okay, I found something that is working, but it looks and feels pretty 
awkward as the method def and method lookup takes place in one function ;-)

setRefClass("A", fields=list(X="numeric"))
setRefClass("B", contains="A", fields=list(Y="character"))

mySetAs <- function(
 from,
 to
){
 if(!existsMethod(f="coerce", signature=c(from=class(from), to=to))){
 setAs(from=class(from), to=to,
 def=function(from, to){
 out <- getRefClass(to)$new(X=from)
 return(out)
 }
 )
 }
 mthd <- selectMethod("coerce", signature=c(from=class(from), 
to=to), useInherited= c(from=TRUE, to=TRUE))
 out <- mthd(from=from, to=to)
 return(out)
}

a <- mySetAs(from=1:5, to="A")
a$X
b <- mySetAs(from=1:5, to="B")
b$X

I'm sure there are much better ways. I'd appreciate any comments whatsoever.

Best regards,
Janko


Re: [R] Wireframe, custom x-axis values

2011-06-06 Thread Peter Ehlers


On 2011-06-06 06:33, Rbjørn Nicolaisen wrote:


Hi,

I'm plotting some data with wireframe() like so:

wireframe(result ~ u * r, myData, scales=list(arrows=FALSE))

However, I would really like to display something different for the displayed values of 
"u" rather than the actual values.
This is because my u-values are a sequence of quantiles of myData, and I would like to display the 
quantiles used (e.g. "0.8   0.85   0.9   0.95")  instead of the actual values of these 
quantiles, since this is easier to relate to for a viewer. This information is accessible in myData 
in a variable, "qnt".

I've tried meddling around with "axis", "label" and "at" in scales=list(), but 
I've been unable to make it happen.

Can anyone shed some light? Preferably in a short, generic example.


Here is a slight modification of the second example
in help('wireframe'):

  wireframe(z ~ x * y, data = g, groups = gr,
  scales = list(arrows = FALSE,
  x = list(at = c(2, 5, 10)),
  y = list(at = c(6, 10, 14),
   lab = c('A', 'BBB', 'C'))
  ))
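
Applied to the question, the same idea might look like the sketch below:
ticks are placed at the actual quantile values, but labelled with the
probabilities. The data are simulated stand-ins; only the names u, r, result
and qnt come from the question.

```r
library(lattice)

set.seed(1)
qnt <- c(0.80, 0.85, 0.90, 0.95)
u_vals <- unname(quantile(rnorm(500), qnt))    # the actual u values

myData <- expand.grid(u = u_vals, r = 1:5)
myData$result <- with(myData, u * r)           # any surface will do here

p <- wireframe(result ~ u * r, data = myData,
               scales = list(arrows = FALSE,
                             # ticks at the quantile values, labelled by qnt
                             x = list(at = u_vals, labels = format(qnt))))
print(p)
```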

Peter Ehlers



Thanks in advance,
Thor


[R] Seeking help to define method for show() for an S4 object

2011-06-06 Thread Bogaso Christofer

Dear all, I have created a new S4 class named "MyClass". Please see
below for its definition. Now I want to create a method for the show()
function for this class. What I want is this: when the user displays an
object of this class, some values are shown first. Then R asks the user
to **press Enter**. Once the user presses Enter, the remaining values are
displayed. The following example tries to explain the concept.


> setClass("MyClass", sealed = FALSE, representation(Slot1 = "vector", Slot2 = "vector"))
[1] "MyClass"
> setMethod("show", "MyClass", definition = function(x) {
+ cat("These are the values. Please press enter to see the values.\n")
+ ### User will presss the Enter ##
+ ### Then only following figures will be visible #
+ cat(x@Slot1)
+} )
[1] "show"
Warning message:
For function "show", signature "MyClass": argument in method definition changed from (x) to (object)
> new("MyClass", Slot1 = 1:3, Slot2 = 4:7)
These are the values. Please press enter to see the values.
1 2 3>

 

 


Can somebody guide me how I can achieve that?


Thanks,

 



Re: [R] Seeking help to define method for show() for an S4 object

2011-06-06 Thread Joshua Wiley

Hi,

Take a look at ?readLines. Also, you should use function(object), not
function(x), because of how the generic is defined.  Below is an
example.

Cheers,

Josh

setClass("MyClass", sealed = FALSE, representation(
  Slot1 = "vector",
  Slot2 = "vector"))

setMethod(f = "show", signature = "MyClass",
  definition = function(object) {
    cat("These are the values.  Please press enter to see the values.\n")
    readLines(n = 1)
    cat(object@Slot1)
  })

x <- new("MyClass", Slot1 = 1:3, Slot2 = 4:7)
x
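
A variant sketch that uses readline() instead and pauses only in an
interactive session, so that scripts and batch runs are not blocked. Printing
Slot2 after the pause is an assumption about what "remaining values" means.

```r
library(methods)

setClass("MyClass", representation(Slot1 = "vector", Slot2 = "vector"))

setMethod("show", "MyClass", function(object) {
  cat("Slot1:", object@Slot1, "\n")
  # wait for Enter only when there is a user at the console
  if (interactive()) {
    invisible(readline("Press Enter to see the remaining values: "))
  }
  cat("Slot2:", object@Slot2, "\n")
})

x <- new("MyClass", Slot1 = 1:3, Slot2 = 4:7)
x
```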




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


[R] Question with RExcel

2011-06-06 Thread Maria Helena Mourino Silva Nunes

Dear all,
I’m doing some simulation studies in order to compare the estimates (and 
estimated standard deviations) from the ARMA(2,1) model with an estimator that 
I’ve constructed. To carry out the simulations I created a VBA project 
within Excel.
Now I’m using the RExcel tool to run the R commands in the VBA project. I 
ran 2500 simulations using the “arima” function from R and it worked! 
Nevertheless, the constant was badly estimated, so I decided to use the “arma” 
function from R, and the parameters are now well estimated. However, I cannot 
run the 2500 simulations; it only completes 46 of them! I’ve already tried to 
run the program on another computer, but I got the same problem.

Do you have any suggestions?
Thanks for your attention.
Helena Mouriño.


Re: [R] generating random covariance matrices (with a uniform distribution of correlations)

2011-06-06 Thread Ned Dochtermann

Thank you very much, this does help quite a bit.
Ned

From: Petr Savicky 
Date: Sat, 04 Jun 2011 11:44:52 +0200

On Fri, Jun 03, 2011 at 01:54:33PM -0700, Ned Dochtermann wrote:
> Petr,
> This is the code I used for your suggestion:
>
> k<-6;kk<-(k*(k-1))/2
> x<-matrix(0,5000,kk)
> for(i in 1:5000){
> A.1<-matrix(0,k,k)
> rs<-runif(kk,min=-1,max=1)
> A.1[lower.tri(A.1)]<-rs
> A.1[upper.tri(A.1)]<-t(A.1)[upper.tri(A.1)]
> cors.i<-diag(k)
> t<-.001-min(Re(eigen(A.1)$values))
> new.cor<-cov2cor(A.1+(t*cors.i))
> x[i,]<-new.cor[lower.tri(new.cor)]}
> hist(c(x)); max(c(x)); median(c(x))
>
> This, unfortunately, does not maintain the desired distribution of
> correlations.

Hello.

Contrary to what I originally thought, there are solutions also for
the case of the correlation matrix. The first solution creates a singular
correlation matrix (of rank 3), but the nondiagonal entries have exactly the
uniform distribution on [-1, 1], since the scalar product of two independent
uniformly distributed unit vectors in R^3 has the uniform distribution on
[-1, 1].

  x <- matrix(rnorm(18), nrow=6, ncol=3)
  x <- x/sqrt(rowSums(x^2))
  a <- x %*% t(x)

The next solution produces a correlation matrix of full rank, whose
non-diagonal entries have distribution very close to the uniform on [-1, 1].
KS test finds a difference only with sample size more than 50'000.

  w <- c(0.01459422, 0.01830718, 0.04066405, 0.50148488, 0.60330865,
0.61832829)
  x <- matrix(rnorm(36), nrow=6, ncol=6) %*% diag(w)
  x <- x/sqrt(rowSums(x^2))
  a <- x %*% t(x)
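
The claims about the first construction can be checked empirically. In this
sketch `random_cor3` is a made-up wrapper around the three lines of the first
solution, and the seed and number of replicates are arbitrary.

```r
set.seed(123)

random_cor3 <- function(k = 6) {
  x <- matrix(rnorm(3 * k), nrow = k, ncol = 3)
  x <- x / sqrt(rowSums(x^2))    # rows become uniform points on the unit sphere
  x %*% t(x)
}

a <- random_cor3()
stopifnot(isTRUE(all.equal(diag(a), rep(1, 6))), isSymmetric(a))

# number of non-negligible eigenvalues, i.e. the rank
rank_a <- sum(eigen(a, symmetric = TRUE)$values > 1e-8)
rank_a

# one off-diagonal entry per matrix, so that the draws are independent
r12 <- replicate(5000, random_cor3()[1, 2])
c(mean = mean(r12), var = var(r12))   # uniform on [-1, 1]: mean 0, variance 1/3
ks.test(r12, "punif", min = -1, max = 1)
```

Entries within one matrix are dependent, which is why only a[1, 2] is taken
from each replicate before running the KS test.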

Hope this helps.

Petr Savicky.

-Original Message-
From: Ned Dochtermann [mailto:ned.dochterm...@gmail.com] 
Sent: Friday, June 03, 2011 1:55 PM
To: 'r-help@r-project.org'; 'savi...@praha1.fff.cuni.cz'
Subject: Re: [R] generating random covariance matrices (with a uniform
distribution of correlations)

Petr,
This is the code I used for your suggestion:

k<-6;kk<-(k*(k-1))/2
x<-matrix(0,5000,kk)
for(i in 1:5000){
A.1<-matrix(0,k,k)
rs<-runif(kk,min=-1,max=1)
A.1[lower.tri(A.1)]<-rs
A.1[upper.tri(A.1)]<-t(A.1)[upper.tri(A.1)]
cors.i<-diag(k)
t<-.001-min(Re(eigen(A.1)$values))
new.cor<-cov2cor(A.1+(t*cors.i))
x[i,]<-new.cor[lower.tri(new.cor)]}
hist(c(x)); max(c(x)); median(c(x))

This, unfortunately, does not maintain the desired distribution of
correlations.
I did, however, learn some neat coding tricks (that were new for me) along
the way.

Ned
--
On Thu, Jun 02, 2011 at 04:42:59PM -0700, Ned Dochtermann wrote:
> List members,
> 
> Via searches I've seen similar discussion of this topic but have not seen
> resolution of the particular issue I am experiencing. If my search on this
> topic failed, I apologize for the redundancy. I am attempting to generate
> random covariance matrices but would like the corresponding correlations to
> be uniformly distributed between -1 and 1.
> 
...
> 
> Any recommendations on how to generate the desired covariance matrices would
> be appreciated.

Hello.

Let me suggest the following procedure.

1. Generate a symmetric matrix A with the desired distribution of the
   non-diagonal elements and with zeros on the diagonal.
2. Compute the smallest eigenvalue lambda_1 of A.
3. Replace A by A + t I, where I is the identity matrix and t is a
   number such that t + lambda_1 > 0.

The resulting matrix will have the same non-diagonal elements as A,
but will be positive definite.
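
The three steps above might be sketched as follows. This is only an
illustration: the dimension k = 6, the margin 0.001 and the names `A`, `B`,
`lambda1` and `shift` are assumptions made for the example.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
k <- 6

# Step 1: symmetric matrix with uniform off-diagonal entries and zero diagonal.
A <- matrix(0, k, k)
A[lower.tri(A)] <- runif(k * (k - 1) / 2, min = -1, max = 1)
A <- A + t(A)

# Step 2: smallest eigenvalue of A.
lambda1 <- min(eigen(A, symmetric = TRUE, only.values = TRUE)$values)

# Step 3: shift the diagonal so that shift + lambda1 > 0.
shift <- 0.001 - lambda1
B <- A + shift * diag(k)

min(eigen(B, symmetric = TRUE, only.values = TRUE)$values)  # about 0.001
```

Note that B keeps the off-diagonal elements of A as covariances; rescaling B
with cov2cor() changes them, so the resulting correlations need not stay
uniformly distributed.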

Petr Savicky.


Re: [R] Question with RExcel

2011-06-06 Thread Ethan Brown

It's hard to see where the problem is from this information.

I would suggest subscribing to and asking this question of the RExcel
mailing list (accessible from http://rcom.univie.ac.at/) and providing
more detail of what you're trying to do, what is going wrong, error
messages (is it R or Excel giving the error?) and so on. For all I
know, it may well not be an R issue at all but a problem somewhere in
your Excel or VBA setup.

Best,
Ethan

On Mon, Jun 6, 2011 at 12:39 PM, Maria Helena Mourino Silva Nunes
 wrote:
> Dear all,
> I’m doing some simulation studies in order to compare the estimates (and 
> estimated standard deviations) from the ARMA(2,1) Model with an estimator 
> that I’ve constructed. For carrying out the simulations I created a VBA 
> project within Excel.
> Now, I’m using the RExcel tool for running the R commands in the VBA project. 
> I run 2500 simulation using the “arima” function from R and it worked! 
> Nevertheless, the constant was badly estimated. So, I decided to use the 
> “arma” function from R, and the parameters are now well estimated. However, I 
> cannot run the 2500 simulations. It can only do 46 simulations! I’ve already 
> tried to run the program in another computer, but I’ve got the same problem.
>
> Do you have any suggestions?
> Thanks for your attention.
> Helena Mouriño.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


[R] Plot many x and y

2011-06-06 Thread Alaios

Dear all

could you please plot many x's and y's with one legend per plot?

I would like to thank you in advance for your help

Best Regards
Alex.


Re: [R] Plot many x and y

2011-06-06 Thread Joshua Wiley

Hi Alex,

You have not given enough details for us to answer your question.  Is
this something like what you mean?

par(mfcol = c(2, 2))
plot(1:10, 1:10, pch = 16)
legend(x = 2, y = 8, legend = "Test 1", pch = 16)
plot(1:10, 1:10, pch = 15)
legend(x = 2, y = 8, legend = "Test 2", pch = 15)
plot(1:10, 1:10, pch = 14)
legend(x = 2, y = 8, legend = "Test 3", pch = 14)
plot(1:10, 1:10, pch = 13)
legend(x = 2, y = 8, legend = "Test 4", pch = 13)

The posting guide (  http://www.R-project.org/posting-guide.html )
provides some helpful tips on how to write a question that will get a
good answer.

Cheers,

Josh

On Mon, Jun 6, 2011 at 12:46 PM, Alaios  wrote:
> Dear all
>
> could you please plot many x's and y's with one legend per plot?
>
> I would like to thank you in advance for your help
>
> Best Regards
> Alex.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] Plot many x and y

2011-06-06 Thread Stephan Kolassa


Hi Alex,

could you be a little more specific as to what exactly you mean by 
"plotting many x's and y's with one legend per plot"?


Please note what appears at the bottom of every R-help mail:
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Following this piece of advice usually increases your chances for a 
helpful answer.


Best,
Stephan


On 06.06.2011 21:46, Alaios wrote:

Dear all

could you please plot many x's and y's with one legend per plot?

I would like to thank you in advance for your help

Best Regards
Alex.





[R] Merge two columns of a data frame

2011-06-06 Thread Abraham Mathew

I have the following data:

prefix <- c("cheap", "budget")
roots <- c("car insurance", "auto insurance")
suffix <- c("quote", "quotes")

prefix2 <- c("cheap", "budget")
roots2 <- c("car insurance", "auto insurance")

roots3 <- c("car insurance", "auto insurance")
suffix3 <- c("quote", "quotes")

df1 <- expand.grid(prefix, roots, suffix)
df2 <- expand.grid(prefix2, roots2)
df3 <- expand.grid(roots3, suffix3)
df1; df2; df3

df1, df2, and df3 are separate data structures with separate columns for
root, prefix, and suffix.

    Var1           Var2   Var3
1  cheap  car insurance  quote
2 budget  car insurance  quote
3  cheap auto insurance  quote
4 budget auto insurance  quote
5  cheap  car insurance quotes
6 budget  car insurance quotes
7  cheap auto insurance quotes
8 budget auto insurance quotes

    Var1           Var2
1  cheap  car insurance
2 budget  car insurance
3  cheap auto insurance
4 budget auto insurance

            Var1   Var2
1  car insurance  quote
2 auto insurance  quote
3  car insurance quotes
4 auto insurance quotes


I want to merge df1, df2, and df3, into one data frame column which looks
like.

Var1
  'cheap  car insurance  quote'
 'budget  car insurance  quote'
  'cheap auto insurance  quote'
 'budget auto insurance  quote'
  'cheap  car insurance quotes'
 'budget  car insurance quotes'
  'cheap auto insurance quotes'
 'budget auto insurance quotes'
 'cheap  car insurance'
 'budget  car insurance'
'cheap auto insurance'
'budget auto insurance'
'car insurance  quote'
 'auto insurance  quote'
'car insurance quotes'
'auto insurance quotes'


Help!

[[alternative HTML version deleted]]


Re: [R] Problem in R documentation

2011-06-06 Thread siddharth arun

Thanks Jorge it worked

On Mon, Jun 6, 2011 at 8:02 PM, Jorge Ivan Velez
wrote:

> Hi Siddharth,
>
> adf.test() is part of the "tseries" package, so you need to download and
> install it before using that function. Try the following and let us know what
> you get:
>
> install.packages('tseries')
> require(tseries)
> ?adf.test
>
> HTH,
> Jorge
>
>
> On Mon, Jun 6, 2011 at 2:41 AM, siddharth arun <> wrote:
>
>> I am not able to run Dickey-Fuller test.
>> adf.test() function is not working. It is showing 'Error: could not find
>> function "adf.test"
>>
>>
>> Can any tell how to call "time series" library?
>>
>> --
>> Siddharth Arun,
>> 4th Year Undergraduate student
>> Industrial Engineering and Management,
>> IIT Kharagpur
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>


-- 
Siddharth Arun,
4th Year Undergraduate student
Industrial Engineering and Management,
IIT Kharagpur

[[alternative HTML version deleted]]


[R] Denton benchmarking method

2011-06-06 Thread Pascal Grandeau


Good morning,

Is there R code to do Denton benchmarking for time series?
Thank you.

P. Grandeau


[R] Problem plotting in R under linux (centos)

2011-06-06 Thread Roger Gill

Dear All,

I am running into a slightly odd issue when attempting to produce a png/pdf
file in R under CentOS. The version of CentOS is 5.6, with an old version of
R, 2.12.0 (2010-10-15). My script is roughly:

mydata <- read.csv(myfile)

png(outfile)

plot(0,0,type='n',ylim=c(-500,500),xlim=c(0,700),xlab=...
lines(mydata$x,mydata$y)

dev.off()

It produces a plot with the correct limits and labels, but the lines command
does not seem to always work (I have checked that the data fall within the
defined limits!). No error message is reported. The data sets are reasonably
large (~6.5 million lines), but surely not large enough to cause an issue? It
runs fine on my Windows machine, albeit with the latest R build.


If I loop over the data set plotting chunks at a time it works just fine.

n <- floor(nrow(mydata)/100)
for(i in 0:n){
  lines(mydata[max((i*100),1) : min(((i+1)*100),nrow(mydata)),])
}

I have googled and can't find anything helpful. Is it an issue with R, my R
installation, CentOS, etc.?

Thanks in advance,

Roger


Re: [R] Wireframe, custom x-axis values

2011-06-06 Thread David Winsemius



I'm plotting some data with wireframe() like so:

wireframe(result ~ u * r, myData, scales=list(arrows=FALSE))

However, I would really like to display something different for the
displayed values of "u" rather than the actual values.
This is because my u-values are a sequence of quantiles of myData, and I
would like to display the quantiles used (e.g. "0.8   0.85   0.9   0.95") 
instead of the actual values of these quantiles, since this is easier to
relate to for a viewer. This information is accessible in myData in a
variable, "qnt".

I've tried meddling around with "axis", "label" and "at" in scales=list(),
but I've been unable to make it happen.
#

Right, because you haven't understood that base graphics are different than
lattice or ggplot graphics. Suggest you read the help page for xyplot, which
has material on the names of list elements used. Pay particular attention to
the arguments listed in the section on scales. "at" is a proper argument.
"label" is not. You should also look at the list structure of your plot object
with str().
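
A minimal sketch of what that could look like, with made-up data standing in
for the poster's myData (the values in `qnt`, the simulated sample and the
`result` column are all assumptions for illustration): tick positions go in
"at" and the text to show goes in "labels", inside the x component of scales.

```r
library(lattice)

set.seed(1)                                   # arbitrary seed
qnt <- c(0.80, 0.85, 0.90, 0.95)              # quantile levels to display
u_vals <- unname(quantile(rnorm(1000), qnt))  # actual u values on the axis

# Hypothetical stand-in for myData: a grid over u and r with some response.
myData <- expand.grid(u = u_vals, r = 1:5)
myData$result <- with(myData, u * r)

p <- wireframe(result ~ u * r, data = myData,
               scales = list(arrows = FALSE,
                             x = list(at = u_vals, labels = format(qnt))))
print(p)     # lattice objects are drawn when printed
str(p$call)  # a quick look at part of the stored object
```

The ticks stay at the true u positions; only the text drawn at each tick is
replaced by the quantile level.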

#---

Can anyone shed some light? Preferably in a short, generic example.

# 
You are the one responsible for providing examples on rhelp.
#


-- 
David


--
View this message in context: 
http://r.789695.n4.nabble.com/Wireframe-custom-x-axis-values-tp3576963p3577167.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] adding an ellipse to a PCA plot

2011-06-06 Thread Alain Guillet


Hi,

I think the easiest way is to use the function plotellipses of the 
FactoMineR package (but you have to do your PCA with the PCA function 
included in this package).
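
If staying with princomp() is preferred, a base-R alternative might look like
the sketch below. It is an assumption-laden illustration rather than the
FactoMineR approach: it uses the iris data as a stand-in, the helper name
`ellipse_points` is made up, and it draws a normal-theory 95% ellipse for the
points of each group (not a confidence region for the group mean).

```r
pc <- princomp(iris[, 1:4])
scores <- pc$scores[, 1:2]   # first two principal components
grp <- iris$Species

# Points on the ellipse {y : (y - m)' S^-1 (y - m) = qchisq(level, 2)},
# where m and S are the mean and covariance of the rows of xy.
ellipse_points <- function(xy, level = 0.95, n = 100) {
  ctr <- colMeans(xy)
  S <- cov(xy)
  r <- sqrt(qchisq(level, df = 2))
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol(S) is upper triangular R with t(R) %*% R = S
  sweep((circle %*% chol(S)) * r, 2, ctr, "+")
}

plot(scores, col = as.integer(grp), pch = 16)
for (g in levels(grp)) {
  lines(ellipse_points(scores[grp == g, ]), col = which(levels(grp) == g))
}
```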


Alain

On 06-Jun-11 16:32, Lukas Baitsch wrote:

Hi,

I created a principal component plot using the first two principal
components. I used the function princomp() to calculate the scores.
now, I would like to superimpose an ellipse representing the center
and the 95% confidence interval of a series of points in my plot (as
to illustrate the grouping of my samples).

I looked at the ellipse() function in the ellipse package but can't
get it to work. the princomp()-function gives me the scores of each
point, so I can calculate the mean and the 95%-CI, but I can't
integrate this into the ellipse()-function). Is there a better way of
doing this or can someone help me figure out this function?

best regards,

Lukas




--
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs

Bureau c.316
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve
Belgium

tel: +32 10 47 30 50

Accès: http://www.uclouvain.be/323631.html


[R] Taking Integral and Optimization using Integrate, Optim and maxNR

2011-06-06 Thread MARYAM ZOLGHADR

Dear All, Hello!

I have some questions about R programming, as follows:

Question 1- How can I take the integral of this function with respect to y,
such that x still appears in the output after integration?

f(x,y)=(0.1766*exp(-exp(y+lnx))*-exp(y+lnx))/(1-exp(-exp(y+lnx))),  y in (-6.907,-1.246)

It is doable in Maple but not in R. At least I could not find the way.

p.s: result from maple is:

g(x)=dilog*exp(0.001000755564*x)+0.5*ln(exp(0.001000755564*x))^2-dilog*(exp(0.287653*x))-0.5*ln(exp(0.287653*x))^2

Where dilog=integral(log(t)/(1-t)) for t in (1,x)
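
R has no symbolic integration in its base distribution, but the integral can
be evaluated numerically for any given x. The following is a sketch under
stated assumptions: "lnx" is read as log(x), the formula is taken exactly as
typed above, and the names `f`, `g` and `g_vec` are chosen for the example. No
claim is made that it reproduces the Maple expression.

```r
# Integrand as written in the question; y is the integration variable,
# x is a fixed parameter.
f <- function(y, x) {
  e <- exp(y + log(x))
  0.1766 * exp(-e) * (-e) / (1 - exp(-e))
}

# g(x): integrate f over y in (-6.907, -1.246) for a fixed x.
# Extra named arguments of integrate() are passed on to f.
g <- function(x) {
  integrate(f, lower = -6.907, upper = -1.246, x = x)$value
}

# integrate() handles one x at a time, so vectorise g for plotting.
g_vec <- Vectorize(g)
g_vals <- g_vec(c(1, 10, 50))
curve(g_vec, from = 0.1, to = 100, xlab = "x", ylab = "g(x)")
```

Because g() returns a plain number, it can be handed directly to an optimiser.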



Question 2- Then I want to optimize (maximize) the result of the above
integral over x. Assume x in (0,100).

The answer should be something between 26 and 27.



What I did is: got answer of integral(f(x,y)) from maple which is g(x), and 
then applied it in R. The code is as follows:

##In the case n=1

library(stats)
require(graphics)
integrand=function(t){(log(t))/(1-t)}
#dilog=integrate(integrand, lower=1, upper=x)
fr <- function(x) {
  (integrate(integrand, lower=1, upper=x)$value)*(exp(0.001000755564*x)) +
    0.5*log(exp(0.001000755564*x))^2 -
    (integrate(integrand, lower=1, upper=x)$value)*(exp(0.287653*x)) -
    0.5*log(exp(0.287653*x))^2
}

optim(20, fr, NULL, method = "BFGS")



Question 2-1- By default optim minimizes a function; how can I use it for
maximization?
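
For reference, two standard ways to maximize are sketched here on a toy
function (the function `toy`, with its maximum at x = 3, is made up for the
example and is not the integral from Question 1): optim() with a negative
fnscale in its control list, and optimize() with maximum = TRUE for a
one-dimensional problem on an interval.

```r
toy <- function(x) -(x - 3)^2   # hypothetical objective with maximum at x = 3

# optim() minimizes fn/fnscale, so a negative fnscale turns it into a maximizer.
res_optim <- optim(par = 0, fn = toy, method = "BFGS",
                   control = list(fnscale = -1))
res_optim$par

# For one parameter restricted to an interval, optimize() is often simpler.
res_optimize <- optimize(toy, interval = c(0, 100), maximum = TRUE)
res_optimize$maximum
```

Minimizing the negated function, function(x) -toy(x), is an equivalent third
option.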



Question 2-2- The above code produced errors and did not work. My guess is
that something is wrong with how the integral is taken. The output of the
integrate function in R is an object with several components, so I had to
tell it to take just the value and ignore the rest. But I guess something is
still wrong with it. I also tried maxNR, but it didn't work either.



Question 2-3- Those above are the easiest case of my problem. Consider the
case where I have the sum f(x1,y)+f(x2,y)+...+f(x12,y).



The article did this with the NAG Fortran Library routine E04JAF, but I
want to do it in R.

Thanks all.



Cheers,

Maryam

[[alternative HTML version deleted]]


[R] draw text outside plot boundaries

2011-06-06 Thread Erik Aronesty

i'd like to use the text() function to annotate some points, but the
labels get cropped, if the point is on the right

is there a way to prevent this, and tell the text() function to allow
writing outside the boundaries of the current plot?

i don't mind if it looks "messy" and  steps on the margin a bit.
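
One common approach is the "xpd" graphical parameter, which controls clipping:
xpd = NA lets base graphics draw anywhere on the device. A small sketch
follows; the coordinates and label text are made up, and pdf(NULL) is only
there so that it runs without opening a window.

```r
pdf(NULL)   # off-screen device; drop this line to draw on screen

plot(1:10, 1:10)

# Per call: pass xpd directly to text().
text(10, 5, labels = "label running past the right edge", pos = 4, xpd = NA)

# Or set it for everything that follows, then restore the old setting.
old <- par(xpd = NA)
xpd_during <- par("xpd")
text(10, 8, labels = "another long label", pos = 4)
par(old)

dev.off()
```

If the labels still run off the device, widening the right margin with par(mar
= ...) before plotting leaves more room for them.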

- erik


[R] RPostgreSQL && snowfall

2011-06-06 Thread Florian Endel

Dear expeRts,

I'm currently trying to get data from a PostgreSQL database _in parallel_.
I tried two methods:
* passing the DBI Connection object to the cluster nodes
* connecting to the DB on each node

(1)
The execution of the first method looks like this:
> result = sfClusterApplyLB(input, fun, dbiCon)
and produces an "expired PostgreSQLConnection" error.
(Of course the passed Connection Object is usable at that moment and
afterwards!)

(2)
For the creation of DB connections on every node a function handling
the whole connection is sourced into every node.
This function works perfectly without snowfall.
Calling it with
> sfClusterEval(dbConnect())
again only expired connection objects are produced. Even if I create
the connection 'a line above' the code which is connecting to the DB
it doesn't work...


Is there a possibility to connect to PostgreSQL using snowfall?

--
with kind regards
Florian


[R] lme, stepAIC, predict: scope and visibility

2011-06-06 Thread Boris Hayete

Hello all,

I've run into a problem where I can't run predict.lme on an object simplified 
via a stepAIC.  A similar post has been recorded on this list:
https://stat.ethz.ch/pipermail/r-help/2008-May/162047.html
but in my case, I'm going to great lengths to not repeat that poster's error 
and still coming up short.  Any advice would be much appreciated.  It would 
seem that, after stepAIC, predict.lme cannot find the training data for the lme 
object (and why is it even needed?).

Here's some example code:

foo = function() {
x = c(1:20, 1:20)
y = c(1:20, 5 + (1:20)) + rnorm(40)
s = c(rep(1, 20), rep(2, 20))

library(lattice)
xyplot(y~x|s)

dframe = data.frame(x, y, s)

m = lme(y~x, random=~1|s, method='ML')

newdf = data.frame(x=40, s = 2)
res = predict(m, newdata=newdf)
print(res)

m2 = stepAIC(m, k=log(nrow(dframe)))
#res2 = predict(m2, newdata=newdf)
res2 = eval(substitute(predict(mL, newdata=nL), list(mL=m2, nL=newdf)))
print(res2)
}

> foo()
   2 
45.86875 
attr(,"label")
[1] "Predicted values"
Start:  AIC=136.4
y ~ x

Error in eval(expr, envir, enclos) : object 'y' not found
[[alternative HTML version deleted]]


[R] list demographics

2011-06-06 Thread Sarah Goslee

Hi all,

I got curious about something, so in proper scientific fashion I
obtained some data and analyzed it.

Question: what is the female participation in the R-help email list?

Data: the most recent list postings, obtained from the website. I took
my best shot at classifying the names given in the email header as
male/female, but ended with a fair number of unknowns.

This dataset had 2797 list messages, in 895 questions. 1501 messages
were replies to one of those questions by someone not the original
querent.

Across all messages, 6.5% were from women, 77.8% from men, 15.7% unknown.

For new questions, 11.7% were from women, 61.2% from men, 27.0% unknown.

Among responses to other people's questions, 2.6% were from women,
92.3% from men, 5.1% unknown.

Nine women answered other people's questions, but only two were what
I'd consider active participants, offering more than two answers. (Not
divided up by separate questions, so could be several replies in one
discussion.)

For men, 214 answered questions, and 90 offered more than two answers.
(Not divided up by separate questions, so could be several replies in
one discussion.)

Six active participants were unclassifiable, so even if all of those
were female, that would still be only eight women actively
participating in the list in this sample.

Is the list representative of statisticians? People who use R? People
who participate in statistical software email lists? I have no idea,
but I found it interesting that there is so little female
participation in the list, even asking questions (where you'd expect
to see students and new R users), and almost no female participation
in answering questions.

For those of you who teach, are your classes heavily skewed?

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Not missing at random

2011-06-06 Thread Joshua Wiley

Hi Blaz,

See below.

x <- matrix(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,
              1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
            nrow = 7, ncol = 7, byrow = TRUE)   # matrix

pMiss <- 30      # percent of missing values

N <- dim(x)[1]   # number of cases

## I want to sample all cases with at least 1 value lower than 3,
## so I have to find candidates
candidate <- which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 |
                   x[,7]<3)

## easier to use this
## find all x < 3 and return their row and column indices
## select only row indices, and then find unique
candidate <- unique(which(x < 3, arr.ind = TRUE)[, "row"])

idMiss <- sample(candidate, N * pMiss / 100)   # I sampled cases

## from the subset of x cases that will be missing
## find all that are < 3 and set to NA
x[idMiss, ][x[idMiss, ] < 3] <- NA

## If you are going to do this a lot, consider a function
nmar <- function(x, op = "<", value = 3, p = 30) {
  op <- get(op)
  candidate <- unique(which(op(x, value), arr.ind = TRUE)[, "row"])
  idMiss <- sample(candidate, nrow(x) * p / 100)
  x[idMiss, ][op(x[idMiss, ], value)] <- NA
  return(x)
}

nmar(x)

## has the advantage that you can easily change
## p, the cut off value, the operator (e.g., "<", ">", "<=", etc.)

Cheers,

Josh

On Sun, Jun 5, 2011 at 11:17 PM, Blaz Simcic  wrote:
>
>
> Hello!
>
> I would like to sample 30 % of cases (with at least 1 value lower than 3 - in
> the row) and among them I want to set all values lower than 3 (within selected
> cases) as NA (NMAR- Not missing at random). I managed to sample cases, but I
> don’t know how to set values (lower than 3) as NA.
>
> R code:
>
> x <-
> matrix(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
>  nrow = 7, ncol=7, byrow=TRUE) matrix
>
> pMiss <- 30 percent of missing values
>
> N <- dim(x)[1]   number of cases
>
> candidate<-which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 |
> x[,7]<3)     I want to sample all cases with at least 1 value lower than 
> 3,
> so I have to find candidates
>
> idMiss <- sample(candidate, N * p / 100)     I sampled cases
>
> Now I'd like to set all values among sampled cases as NA.
>
> Any suggestion?
>
> Thanks,
> Blaž
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] Merge two columns of a data frame

2011-06-06 Thread Ista Zahn

Hi Abraham,
Just take it step by step. Paste the values together, combine them,
and assign them to a data.frame column. Like this perhaps:

df.1.2.3 <- data.frame(Var1 =
c(with(df1, paste(Var1, Var2, Var3)),
  with(df2, paste(Var1, Var2)),
  with(df3, paste(Var1, Var2))))

Best,
Ista

On Mon, Jun 6, 2011 at 12:22 PM, Abraham Mathew  wrote:
> I have the following data:
>
> prefix <- c("cheap", "budget")
> roots <- c("car insurance", "auto insurance")
> suffix <- c("quote", "quotes")
>
> prefix2 <- c("cheap", "budget")
> roots2 <- c("car insurance", "auto insurance")
>
> roots3 <- c("car insurance", "auto insurance")
> suffix3 <- c("quote", "quotes")
>
> df1 <- expand.grid(prefix, roots, suffix)
> df2 <- expand.grid(prefix2, roots2)
> df3 <- expand.grid(roots3, suffix3)
> df1; df2; df3
>
> df1, df2, and df3 are seperate data structures with seperate columns for
> root, prefix, and suffix.
>
>  Var1           Var2   Var3
> 1  cheap  car insurance  quote
> 2 budget  car insurance  quote
> 3  cheap auto insurance  quote
> 4 budget auto insurance  quote
> 5  cheap  car insurance quotes
> 6 budget  car insurance quotes
> 7  cheap auto insurance quotes
> 8 budget auto insurance quotes
>    Var1           Var2
> 1  cheap  car insurance
> 2 budget  car insurance
> 3  cheap auto insurance
> 4 budget auto insurance
>            Var1   Var2
> 1  car insurance  quote
> 2 auto insurance  quote
> 3  car insurance quotes
> 4 auto insurance quotes
>
>
> I want to merge df1, df2, and df3, into one data frame column which looks
> like.
>
>                    Var1
>  'cheap  car insurance  quote'
>  'budget  car insurance  quote'
>  'cheap auto insurance  quote'
>  'budget auto insurance  quote'
>  'cheap  car insurance quotes'
>  'budget  car insurance quotes'
>  'cheap auto insurance quotes'
>  'budget auto insurance quotes'
>         'cheap  car insurance'
>         'budget  car insurance'
>        'cheap auto insurance'
>        'budget auto insurance'
>        'car insurance  quote'
>         'auto insurance  quote'
>        'car insurance quotes'
>        'auto insurance quotes'
>
>
> Help!
> WebRep
> Overall rating
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] Merge two columns of a data frame

2011-06-06 Thread Ethan Brown

Another possibility:

dfs <- list(df1, df2, df3)
df.1.2.3 <- as.data.frame(unlist(sapply(dfs, function(x) do.call(paste, x))))

On Mon, Jun 6, 2011 at 2:37 PM, Ista Zahn  wrote:
> Hi Abraham,
> Just take it step by step. Paste the values together, combine them,
> and assign them to a data.frame column. Like this perhaps:
>
> df.1.2.3 <- data.frame(Var1 =
>        c(with(df1, paste(Var1, Var2, Var3)),
>          with(df2, paste(Var1, Var2)),
>          with(df3, paste(Var1, Var2
>
> Best,
> Ista
>
> On Mon, Jun 6, 2011 at 12:22 PM, Abraham Mathew  
> wrote:
>> I have the following data:
>>
>> prefix <- c("cheap", "budget")
>> roots <- c("car insurance", "auto insurance")
>> suffix <- c("quote", "quotes")
>>
>> prefix2 <- c("cheap", "budget")
>> roots2 <- c("car insurance", "auto insurance")
>>
>> roots3 <- c("car insurance", "auto insurance")
>> suffix3 <- c("quote", "quotes")
>>
>> df1 <- expand.grid(prefix, roots, suffix)
>> df2 <- expand.grid(prefix2, roots2)
>> df3 <- expand.grid(roots3, suffix3)
>> df1; df2; df3
>>
>> df1, df2, and df3 are seperate data structures with seperate columns for
>> root, prefix, and suffix.
>>
>>  Var1           Var2   Var3
>> 1  cheap  car insurance  quote
>> 2 budget  car insurance  quote
>> 3  cheap auto insurance  quote
>> 4 budget auto insurance  quote
>> 5  cheap  car insurance quotes
>> 6 budget  car insurance quotes
>> 7  cheap auto insurance quotes
>> 8 budget auto insurance quotes
>>    Var1           Var2
>> 1  cheap  car insurance
>> 2 budget  car insurance
>> 3  cheap auto insurance
>> 4 budget auto insurance
>>            Var1   Var2
>> 1  car insurance  quote
>> 2 auto insurance  quote
>> 3  car insurance quotes
>> 4 auto insurance quotes
>>
>>
>> I want to merge df1, df2, and df3, into one data frame column which looks
>> like.
>>
>>                    Var1
>>  'cheap  car insurance  quote'
>>  'budget  car insurance  quote'
>>  'cheap auto insurance  quote'
>>  'budget auto insurance  quote'
>>  'cheap  car insurance quotes'
>>  'budget  car insurance quotes'
>>  'cheap auto insurance quotes'
>>  'budget auto insurance quotes'
>>         'cheap  car insurance'
>>         'budget  car insurance'
>>        'cheap auto insurance'
>>        'budget auto insurance'
>>        'car insurance  quote'
>>         'auto insurance  quote'
>>        'car insurance quotes'
>>        'auto insurance quotes'
>>
>>
>> Help!
>> WebRep
>> Overall rating
>>
>>        [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] RPostgreSQL && snowfall

2011-06-06 Thread Whit Armstrong

I don't think you can share dbi connections across different instances of R.

Just have each of your helper functions open a local connection.  Or
alternatively, load a package on each instance which keeps a DBI
connection open.

And make sure you bump up your allowed number of connections
(max_connections in postgresql.conf) if you need to.
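
A minimal sketch of the "open a local connection in each helper" idea, in case it helps. The connection settings, the table name my_table and the helper names fetch_one and run_parallel are all made up for illustration; only the package calls (sfInit, sfClusterApplyLB, sfStop, dbConnect, dbGetQuery, dbDisconnect) are real.

```r
# Hypothetical connection settings; replace with your own.
db_args <- list(dbname = "mydb", host = "localhost",
                user = "me", password = "secret")

# Worker: runs on a cluster node, opens its OWN connection,
# runs one query and closes the connection again.
fetch_one <- function(id, db_args) {
  con <- do.call(DBI::dbConnect,
                 c(list(drv = RPostgreSQL::PostgreSQL()), db_args))
  on.exit(DBI::dbDisconnect(con))
  # my_table is a hypothetical table name
  DBI::dbGetQuery(con, paste("SELECT * FROM my_table WHERE id =",
                             as.integer(id)))
}

# Driver: only plain data (ids and settings) is sent to the nodes,
# never a connection object.
run_parallel <- function(ids, cpus = 2) {
  snowfall::sfInit(parallel = TRUE, cpus = cpus)
  on.exit(snowfall::sfStop())
  snowfall::sfClusterApplyLB(ids, fetch_one, db_args = db_args)
}
```

Something like run_parallel(1:10) would then return a list with one data frame per id, assuming the database and table exist.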

-Whit


On Mon, Jun 6, 2011 at 12:40 PM, Florian Endel  wrote:
> Dear expeRts,
>
> I'm currently trying to get data from a PostgreSQL database _in parallel_.
> I tried two methods:
> * passing the DBI Connection object to the cluster nodes
> * connecting to the DB on each node
>
> (1)
> The execution of the first method looks like this:
>> result = sfClusterApplyLB(input, fun, dbiCon)
> and produces an "expired PostgreSQLConnection" error.
> (Of course the passed Connection Object is usable at that moment and
> afterwards!)
>
> (2)
> For the creation of DB connections on every node a function handling
> the whole connection is sourced into every node.
> This function works perfectly without snowfall.
> Calling it with
>> sfClusterEval(dbConnect())
> again only expired connection objects are produced. Even if I create
> the connection 'a line above' the code which is connecting to the DB
> it doesn't work...
>
>
> Is there a possibility to connect to PostgreSQL using snowfall?
>
> --
> with kind regards
> Florian
>

Re: [R] draw text outside plot boundaries

2011-06-06 Thread Peter Ehlers


On 2011-06-06 08:44, Erik Aronesty wrote:

> i'd like to use the text() function to annotate some points, but the
> labels get cropped, if the point is on the right
>
> is there a way to prevent this, and tell the text() function to allow
> writing outside the boundaries of the current plot?


Go to ?par and check out the 'xpd' parameter.
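
A small sketch of what that can look like, assuming made-up data (x, y and the labels are only for illustration). xpd = NA lets text() draw anywhere on the device; the wider right margin is just there to leave room for the labels.

```r
x <- 1:10
y <- (1:10)^2
labs <- paste("point", x)

op <- par(mar = c(5, 4, 4, 6))   # widen the right margin
plot(x, y)
# pos = 4 puts each label to the right of its point;
# xpd = NA stops clipping at the plot region
text(x, y, labels = labs, pos = 4, xpd = NA)
par(op)                          # restore the old margins
```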

Peter Ehlers



> i don't mind if it looks "messy" and steps on the margin a bit.
>
> - erik

[R] A Calculation on list object

2011-06-06 Thread Ron Michael

Hello, I am doing a calculation on a list object and would like to ask
whether there is a shortcut for it.

Let's say I have the following list object:

> List <- vector('list', length = 3)
> set.seed(1)
> List[[1]] <- rnorm(5)
> List[[2]] <- rnorm(2)
> List[[3]] <- rnorm(7)
> List
[[1]]
[1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078

[[2]]
[1] -0.8204684  0.4874291

[[3]]
[1]  0.7383247  0.5757814 -0.3053884  1.5117812  0.3898432 -0.6212406 -2.2146999

> 
> Vector <- 3:5
> Vector
[1] 3 4 5

Now I want to add Vector to List element by element. That is, I want
to do:

> List[[1]] + Vector[1]
[1] 2.373546 3.183643 2.164371 4.595281 3.329508
> List[[2]] + Vector[2]
[1] 3.179532 4.487429
> List[[3]] + Vector[3]
[1] 5.738325 5.575781 4.694612 6.511781 5.389843 4.378759 2.785300

Until now I have done this calculation with a for loop, so I would be
interested in a more elegant way to do the same.

Thanks,

Re: [R] A Calculation on list object

2011-06-06 Thread Joshua Wiley

Hi Ron,

Seems like there might be a really elegant way, but I would use
lapply().  For instance:

lapply(seq_along(List), function(x) List[[x]] + Vector[x])

If you do this regularly and want something that reads more
intuitively, consider defining an operator that does this.  %+% is
undefined (at least on my system), so something like:

set.seed(1)
List <- list(rnorm(5), rnorm(2), rnorm(7))
Vector <- 3:5

"%+%" <- function(e1, e2) {
  if (identical(length(e1), length(e2)))
    lapply(seq_along(e1), function(i) e1[[i]] + e2[[i]])
  else stop("length of e1 (", length(e1),
            ") must match length of e2 (", length(e2), ").")
}

List %+% Vector
List %+% 11:13

This has the advantage of looking more like how you are thinking (add
elements of the list to elements of the vector).

Hope this helps,

Josh

On Mon, Jun 6, 2011 at 3:10 PM, Ron Michael  wrote:
> Hello, I am into some calculation on a list object, therefore requesting the 
> peers if there is any short cut way to so the same calculation.
>
> Let say I have following list object:
>
>> List <- vector('list', length = 3)
>> set.seed(1)
>> List[[1]] <- rnorm(5)
>> List[[2]] <- rnorm(2)
>> List[[3]] <- rnorm(7)
>> List
> [[1]]
> [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078
>
> [[2]]
> [1] -0.8204684  0.4874291
>
> [[3]]
> [1]  0.7383247  0.5757814 -0.3053884  1.5117812  0.3898432 -0.6212406 
> -2.2146999
>
>>
>> Vector <- 3:5
>> Vector
> [1] 3 4 5
>
> Now, what I want to do is, add List with Vector, element-by-element. Means I 
> wanted to do:
>
>> List[[1]] + Vector[1]
> [1] 2.373546 3.183643 2.164371 4.595281 3.329508
>> List[[2]] + Vector[2]
> [1] 3.179532 4.487429
>> List[[3]] + Vector[3]
> [1] 5.738325 5.575781 4.694612 6.511781 5.389843 4.378759 2.785300
>
> Till now I have done this calculation with for-loop. Therefore it would be 
> interesting if there is any elegant way to do the same.
>
> Thanks,
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Re: [R] A Calculation on list object

2011-06-06 Thread Gabor Grothendieck

On Mon, Jun 6, 2011 at 6:10 PM, Ron Michael  wrote:
> Hello, I am into some calculation on a list object, therefore requesting the 
> peers if there is any short cut way to so the same calculation.
>
> Let say I have following list object:
>
>> List <- vector('list', length = 3)
>> set.seed(1)
>> List[[1]] <- rnorm(5)
>> List[[2]] <- rnorm(2)
>> List[[3]] <- rnorm(7)
>> List
> [[1]]
> [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078
>
> [[2]]
> [1] -0.8204684  0.4874291
>
> [[3]]
> [1]  0.7383247  0.5757814 -0.3053884  1.5117812  0.3898432 -0.6212406 
> -2.2146999
>
>>
>> Vector <- 3:5
>> Vector
> [1] 3 4 5
>
> Now, what I want to do is, add List with Vector, element-by-element. Means I 
> wanted to do:
>
>> List[[1]] + Vector[1]
> [1] 2.373546 3.183643 2.164371 4.595281 3.329508
>> List[[2]] + Vector[2]
> [1] 3.179532 4.487429
>> List[[3]] + Vector[3]
> [1] 5.738325 5.575781 4.694612 6.511781 5.389843 4.378759 2.785300
>
> Till now I have done this calculation with for-loop. Therefore it would be 
> interesting if there is any elegant way to do the same.
>

Try this:

mapply("+", List, Vector)
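
A quick sketch to check this against the index-based lapply() approach, using the data from the question. One caveat worth knowing: mapply() returns a list here only because the pieces have different lengths; if they all had the same length it would simplify to a matrix. Map() (or SIMPLIFY = FALSE) always gives a list.

```r
set.seed(1)
List <- list(rnorm(5), rnorm(2), rnorm(7))
Vector <- 3:5

# explicit loop over positions
by_index <- lapply(seq_along(List), function(i) List[[i]] + Vector[i])

# a list here, because the element lengths differ
by_mapply <- mapply("+", List, Vector)

# always a list, whatever the lengths
by_Map <- Map("+", List, Vector)

isTRUE(all.equal(by_index, by_mapply))
isTRUE(all.equal(by_index, by_Map))
```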



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: [R] draw text outside plot boundaries

2011-06-06 Thread Tyler Rinker


Erik,
 
To add to what Peter said...
I created this little function for clicking text anywhere on the plot (I 
probably stole the idea from a list serve or Dalgaard's book or someplace like 
that).  Anyway, it is helpful to me and may be of use to you too.  It is very 
basic, but I use it a ton.  You can modify it to suit your needs.
 
#TEXT CLICK FUNCTION
textClick <- function(express, col = "black", cex = NULL) {
  par(mar = rep(0, 4), xpd = NA)
  text(locator(1), express, col = col, cex = cex)
}
 
#EXAMPLE
frame()
par(mfrow=c(2,2))
with(mtcars,plot(mpg~cyl));with(mtcars,plot(mpg~cyl))
with(mtcars,plot(mpg~cyl));with(mtcars,plot(mpg~cyl))
textClick(expression(sum((bar(X)-X^2))),"pink",.5)
 
Cheers
Tyler
 
> Date: Mon, 6 Jun 2011 15:09:29 -0700
> From: ehl...@ucalgary.ca
> To: e...@q32.com
> CC: r-help@r-project.org
> Subject: Re: [R] draw text outside plot boundaries
> 
> On 2011-06-06 08:44, Erik Aronesty wrote:
> > i'd like to use the text() function to annotate some points, but the
> > labels get cropped, if the point is on the right
> >
> > is there a way to prevent this, and tell the text() function to allow
> > writing outside the boundaries of the current plot?
> 
> Go to ?par and check out the 'xpd' parameter.
> 
> Peter Ehlers
> 
> >
> > i don't mind if it looks "messy" and steps on the margin a bit.
> >
> > - erik
> >
  
[[alternative HTML version deleted]]

[R] extract data from a data frame field

2011-06-06 Thread ads pit

Hi all,
I am given a data frame in which one of the columns combines several
pieces of information; see column 4, peak_loc:
   chr  start    end              peak_loc cluster_TC strand peak_TC
1 chr1 564620 564649 chr1:564644..564645,+         94      +      10
2 chr1 565369 565404 chr1:565371..565372,+        217      +       8
3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15
4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28
5 chr1 565861 565922 chr1:565883..565884,+        316      +      12
6 chr1 566537 566573 chr1:566564..566565,+        119      +      11


I am trying to find out if there's a way to extract the coordinates given
in the 4th column and replace this column with two others that would have
the start coord and the end coord. So instead of chr1:564644..564645,+
I would obtain:
start_peak  end_peak
564644      564645

Best,
nanami

[[alternative HTML version deleted]]

Re: [R] extract data from a data frame field

2011-06-06 Thread jim holtman

Here is a start; you can change the column names:

> x
   chr  start    end              peak_loc cluster_TC strand peak_TC
1 chr1 564620 564649 chr1:564644..564645,+         94      +      10
2 chr1 565369 565404 chr1:565371..565372,+        217      +       8
3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15
4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28
5 chr1 565861 565922 chr1:565883..565884,+        316      +      12
6 chr1 566537 566573 chr1:566564..566565,+        119      +      11
> y <- sub("^.*:([[:digit:]]+)..([[:digit:]]+).*", "\\1 \\2", x$peak_loc)
> y
[1] "564644 564645" "565371 565372" "565480 565481" "565662 565663"
"565883 565884" "566564 566565"
> y <- strsplit(y, ' ')
> y
[[1]]
[1] "564644" "564645"

[[2]]
[1] "565371" "565372"

[[3]]
[1] "565480" "565481"

[[4]]
[1] "565662" "565663"

[[5]]
[1] "565883" "565884"

[[6]]
[1] "566564" "566565"

> x.new <- cbind(x, do.call(rbind, y))
> x.new
   chr  start    end              peak_loc cluster_TC strand peak_TC      1      2
1 chr1 564620 564649 chr1:564644..564645,+         94      +      10 564644 564645
2 chr1 565369 565404 chr1:565371..565372,+        217      +       8 565371 565372
3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15 565480 565481
4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28 565662 565663
5 chr1 565861 565922 chr1:565883..565884,+        316      +      12 565883 565884
6 chr1 566537 566573 chr1:566564..566565,+        119      +      11 566564 566565
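
A possible variant of the same idea, sketched on the first two rows only (the other columns are left out to keep it short). It escapes the two dots so they match literally, stores the coordinates as numbers, names the new columns start_peak and end_peak as in the question, and drops peak_loc. The name pattern is just a local helper variable.

```r
x <- data.frame(
  chr      = "chr1",
  start    = c(564620, 565369),
  end      = c(564649, 565404),
  peak_loc = c("chr1:564644..564645,+", "chr1:565371..565372,+"),
  stringsAsFactors = FALSE
)

# everything up to the colon, then digits, two literal dots, digits
pattern <- "^[^:]*:([0-9]+)\\.\\.([0-9]+).*$"

x$start_peak <- as.numeric(sub(pattern, "\\1", x$peak_loc))
x$end_peak   <- as.numeric(sub(pattern, "\\2", x$peak_loc))
x$peak_loc   <- NULL   # replace the combined column
x
```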


On Mon, Jun 6, 2011 at 8:22 PM, ads pit  wrote:
> Hi all,
> I am given the a data frame in which one of the columns has more information
> together- see column 4, peak_loc:
>  chr  start    end              peak_loc cluster_TC strand peak_TC
> 1 chr1 564620 564649 chr1:564644..564645,+         94      +      10
> 2 chr1 565369 565404 chr1:565371..565372,+        217      +       8
> 3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15
> 4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28
> 5 chr1 565861 565922 chr1:565883..565884,+        316      +      12
> 6 chr1 566537 566573 chr1:566564..566565,+        119      +      11
>
>
>  I am trying to find out if there's a way to extract the coordinates given
> in the 4th column and replace this column with two others that would have
> the start coord and the end coord. so instead of chr1:564644..564645,+
> I would obtain;
> start_peak  end_peak
> 564644       564645
>
> Best,
> nanami
>
>        [[alternative HTML version deleted]]
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

[R] Custom Sort on a Table object

2011-06-06 Thread Galen Moore

Greetings - 

 

I've got the following table (the result of a two-way table operation):

                       f         m
  0 to 5       11.328000  6.900901
  15 to 24      6.100570  5.190058
  25 to 34      9.428707  6.567280
  35 to 44     10.462158  7.513270
  45 to 54      7.621988  5.692905
  5 to 14       6.502741  6.119663
  55 to 64      5.884737  4.319905
  65 to 74      5.075606  4.267810
  75 to 84      4.702020  3.602362
  85 and over   4.75      3.877551

 

I'd like to sort it so that the row names (which represent age bands)
and their corresponding f and m values appear in logical order.  I've
tried a bunch of things: merging with a separate df bearing Age Bands paired
with a sequence number, stripping out row vectors and rbind-ing a new df,
etc., all to no avail.  It seems to be very difficult to spin a table object
into a data frame without being stuck with the table's rownames!

 

I haven't yet tried writing to an external file and then reading it back (so
as to get R to forget that it's a Table object), and then merging on the
group bands to pull in a sequence vector upon which to do an order().  Seems
like it should be easier. 

 

Many thanks,

 

Galen 


[[alternative HTML version deleted]]

Re: [R] Custom Sort on a Table object

2011-06-06 Thread Bill.Venables

Here is one way.

> tab
                    f        m
0 to 5      11.328000 6.900901
15 to 24     6.100570 5.190058
25 to 34     9.428707 6.567280
35 to 44    10.462158 7.513270
45 to 54     7.621988 5.692905
5 to 14      6.502741 6.119663
55 to 64     5.884737 4.319905
65 to 74     5.075606 4.267810
75 to 84     4.702020 3.602362
85 and_over  4.75     3.877551
> 
> lowAge <- as.numeric(sapply(strsplit(rownames(tab)," "), "[", 1))
> (tab <- tab[order(lowAge), ])
                    f        m
0 to 5      11.328000 6.900901
5 to 14      6.502741 6.119663
15 to 24     6.100570 5.190058
25 to 34     9.428707 6.567280
35 to 44    10.462158 7.513270
45 to 54     7.621988 5.692905
55 to 64     5.884737 4.319905
65 to 74     5.075606 4.267810
75 to 84     4.702020 3.602362
85 and_over  4.75     3.877551
>  
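
For anyone who wants to try this without the original data, here is a self-contained sketch on a few of the rows. The matrix is rebuilt by hand from the values in the question, and sub() is used instead of strsplit() to pull out the leading number; both are assumptions for illustration, not part of the session above.

```r
age_bands <- c("0 to 5", "15 to 24", "25 to 34", "5 to 14", "85 and over")
tab <- matrix(c(11.328000, 6.100570, 9.428707, 6.502741, 4.75,
                 6.900901, 5.190058, 6.567280, 6.119663, 3.877551),
              ncol = 2, dimnames = list(age_bands, c("f", "m")))

# the number before the first space is the lower age of each band
lowAge <- as.numeric(sub(" .*$", "", rownames(tab)))

# reorder the rows by that number; row names travel with the rows
sorted <- tab[order(lowAge), ]
rownames(sorted)
```

The same indexing works on a two-way table object, since it is a matrix underneath.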

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Galen Moore
Sent: Tuesday, 7 June 2011 1:23 PM
To: r-help@r-project.org
Subject: [R] Custom Sort on a Table object

Greetings - 

 

I've got the following table (the result of a two-way table operation):

                       f         m
  0 to 5       11.328000  6.900901
  15 to 24      6.100570  5.190058
  25 to 34      9.428707  6.567280
  35 to 44     10.462158  7.513270
  45 to 54      7.621988  5.692905
  5 to 14       6.502741  6.119663
  55 to 64      5.884737  4.319905
  65 to 74      5.075606  4.267810
  75 to 84      4.702020  3.602362
  85 and over   4.75      3.877551

 

Which I'd like to sort so that the column of rownames (which represent age
bands) and their corresponding f and m values appear in logical order.  I've
tried a bunch of things; merging with a separate df bearing Age Bands paired
with a sequence number, stripping out row vectors and rbind-ing a new df,
etc., all to no avail.  It seems to be very difficult to spin a table object
into a data frame without being stuck with the tables rownames!

 

I haven't yet tried writing to an external file and then reading it back (so
as to get R to forget that it's a Table object), and then merging on the
group bands to pull in a sequence vector upon which to do an order().  Seems
like it should be easier. 

 

Many thanks,

 

Galen 


[[alternative HTML version deleted]]

Re: [R] Line Graphs

2011-06-06 Thread Jim Silverton

Hello,
I want to plot 6 line graphs. I have 10 points 0.1, 0.2, 0.3, 0.4, 0.5, 0.6,
0.7, 0.8, 0.9 and 1.0.
At each point say 0.1, I have 6 variables A, B, C, D, E and F. The variables
all have values between 0 and 1 (and including 0 and 1). I also want to
label the x axis from 0.1 to 1.0 and the y axis from 0.1 to 1.0.
My goal is to plot a line graph representing the mean of the variables at
each level, so for 0.1 on the x axis we should expect 6 values for y.
The plot omits 1.0, and the abline function does not draw the line y = x
on the plot.
This is what I have so far:

# Calculate range from 0 to max value of cars and trucks
g_range <- range(0, 1)

# Graph autos using y axis that ranges from 0 to max
# value in cars or trucks vector.  Turn off axes and
# annotations (axis labels) so we can specify them ourself
plot(B, type="o", pch = 0, lty=1,col="blue", ylim=g_range, axes=FALSE,
ann=FALSE)
# Make x axis using the values of pi_0 labels
#

axis(1, at=1:10,
lab=c("0.1","0.2","0.3","0.4","0.5","0.6","0.7","0.8","0.9","1.0"))

# Make y axis with horizontal labels that display ticks at
del = seq(0.1,1, 0.1)
axis(2, at=del,
lab=c("0.1","0.2","0.3","0.4","0.5","0.6","0.7","0.8","0.9","1.0"))

# Create box around plot
box()

# Graph trucks with red dashed line and square points
lines(A, type="o", pch=2, lty=1, col="red")
lines(C, type="o", pch=3, lty=1, col="green")
lines(D, type="o", pch=4, lty=1, col="orange")
lines(E, type="o", pch=6, lty=1, col="brown")
lines(F, type="o", pch=8, lty=1, col="yellow")
abline(0, 1, col = "black")
# Create a title with a red, bold/italic font
#title(main="Methods", col.main="red", font.main=4)

# Label the x and y axes
title(xlab=expression(paste(lambda[0])))
title(ylab= expression(paste("Estimate of ", lambda[0])))

# Create a legend at (1, g_range[2]) that is slightly smaller
# (cex) and uses the same line colors and points used by
# the actual plots
legend(1, g_range[2], c("B","A", "C", "D", "E", "F", "Actual"), cex=0.6,
col=c("blue","red", "green", "orange", "brown","yellow", "black"),
pch=c(0,2,3,4,6,8,9), lty=1)




-- 
Thanks,
Jim.

[[alternative HTML version deleted]]

[R] error with geomap in googleVis

2011-06-06 Thread SNV Krishna

Hi All,
 
I am unable to get the geomap plot from the googleVis package to display.
The data are as follows:
 
> head(index.ret)
    country    ytd
1 Argentina -10.18
2 Australia  -3.42
3   Austria  -2.70
4   Belgium   1.94
5    Brazil  -7.16
6    Canada   0.56
 
> map1 = gvisGeoMap(index.ret,locationvar = 'country', numvar = 'ytd')
> plot(map1)
 
But it just displays a blank page, showing an error symbol at the bottom
right corner. I tried demo(googleVis), and it had a similar problem: the
demo showed all the other plots/maps except for the geomaps. Could anyone
please give me a hint as to what or where the problem could be? Many thanks
for any ideas and support.
 
Regards,
 
SNV Krishna

 

[[alternative HTML version deleted]]

[R] extract data features from subsets

2011-06-06 Thread Williams Scott

I have a large dataset similar to this:

ID  time  result
A   1     5
A   2     2
A   3     1
A   4     1
A   5     1
A   6     2
A   7     3
A   8     4
B   1     3
B   2     2
B   3     4
B   4     6
B   5     8

I need to extract a number of features for each individual in it (identified by 
"ID"). These are:
* The lowest result (the nadir)
* The time of the nadir - but if the nadir level is present at >1 time point, I 
need the minimum and maximum time of nadir
* For the time period from maximum time of nadir to the last result, I need the 
coefficient from a lm(result~time) 

The result would be a table looking like:

ID  NadirLevel  NadirFirstTime  NadirLastTime  Slope
A   1           3               5              1
B   2           2               2              2

I can manage to extract all the required elements in a very cumbersome loop. 
I am sure a more elegant method using apply() or the like could be devised, 
but I can't presently work out the necessary syntax. Any suggestions appreciated.

Thanks 
Scott
_
 
Dr. Scott Williams
Peter MacCallum Cancer Centre
Melbourne, Australia
ph +61 3 9656 
fax +61 3 9656 1424
scott.willi...@petermac.org 
 


This email (including any attachments or links) may contain 
confidential and/or legally privileged information and is 
intended only to be read or used by the addressee.  If you 
are not the intended addressee, any use, distribution, 
disclosure or copying of this email is strictly 
prohibited.  
Confidentiality and legal privilege attached to this email 
(including any attachments) are not waived or lost by 
reason of its mistaken delivery to you.
If you have received this email in error, please delete it 
and notify us immediately by telephone or email.  Peter 
MacCallum Cancer Centre provides no guarantee that this 
transmission is free of virus or that it has not been 
intercepted or altered and will not be liable for any delay 
in its receipt.

Re: [R] extract data features from subsets

2011-06-06 Thread Dennis Murphy

Hi:

Here's one way using package plyr and its ddply() function. ddply()
takes a data frame as input and expects to output either a scalar or a
data frame. In this case, we want the latter.

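library(plyr)
f <- function(df) {
  mn <- min(df$result)
  tms <- df$time[df$result == mn]
  # rows from the last nadir time onward, selected by time rather than
  # by row position so it also works when time is not 1, 2, ..., n
  subdf <- df[df$time >= max(tms), ]
  b1 <- coef(lm(result ~ time, data = subdf))[2]
  data.frame(NadirLevel = mn, NadirFirstTime = min(tms),
             NadirLastTime = max(tms), Slope = b1)
}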

This function takes a data frame df as input - in practice, it will be
a sub-data frame associated with a level of ID. We find the minimum of
result and assign it to mn, and then find the times that match the
minimum.
Next, we construct the subdata on which to run the simple linear
regression line. Finally, an output data frame is created. ddply()
will add in the ID variable. Calling your example data frame d,

> ddply(d, 'ID', f)
  ID NadirLevel NadirFirstTime NadirLastTime Slope
1  A          1              3             5     1
2  B          2              2             2     2
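
If plyr is not available, roughly the same thing can be sketched in base R with split(), lapply() and rbind(). The helper name nadir_features is made up for this sketch, and the data frame d is typed in from the example in the question. Note that if the nadir falls on the last time point, the regression has a single row and the slope comes back as NA.

```r
d <- data.frame(
  ID     = rep(c("A", "B"), times = c(8, 5)),
  time   = c(1:8, 1:5),
  result = c(5, 2, 1, 1, 1, 2, 3, 4, 3, 2, 4, 6, 8)
)

# hypothetical helper: features for the rows of one ID
nadir_features <- function(df) {
  mn  <- min(df$result)
  tms <- df$time[df$result == mn]
  tail_df <- df[df$time >= max(tms), ]   # last nadir time onward
  data.frame(ID = df$ID[1],
             NadirLevel = mn,
             NadirFirstTime = min(tms),
             NadirLastTime = max(tms),
             Slope = unname(coef(lm(result ~ time, data = tail_df))[2]))
}

# one data frame per ID, stacked back together
res <- do.call(rbind, lapply(split(d, d$ID), nadir_features))
rownames(res) <- NULL
res
```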

HTH,
Dennis


On Mon, Jun 6, 2011 at 10:04 PM, Williams Scott
 wrote:
> I have a large dataset similar to this:
>
> ID      time    result
> A       1       5
> A       2       2
> A       3       1
> A       4       1
> A       5       1
> A       6       2
> A       7       3
> A       8       4
> B       1       3
> B       2       2
> B       3       4
> B       4       6
> B       5       8
>
> I need to extract a number of features for each individual in it (identified 
> by "ID"). These are:
> * The lowest result (the nadir)
> * The time of the nadir - but if the nadir level is present at >1 time point, 
> I need the minimum and maximum time of nadir
> * For the time period from maximum time of nadir to the last result, I need 
> the coefficient from a lm(result~time)
>
> The result would be a table looking like:
>
> ID      NadirLevel      NadirFirstTime  NadirLastTime   Slope
> A       1               3                       5                       1
> B       2               2                       2                       2
>
> I can manage to extract all the required elements in a very cumbersome loop, 
> but I am sure an elegant method using apply() or the like could be devised 
> but I cant presently understand the necessary syntax. An suggestions 
> appreciated.
>
> Thanks
> Scott
> _
>
> Dr. Scott Williams
> Peter MacCallum Cancer Centre
> Melbourne, Australia
> ph +61 3 9656 
> fax +61 3 9656 1424
> scott.willi...@petermac.org
>
>
>
