date:20110502

Re: [R] multiple mosaic plots layout

2011-05-02 Thread Achim Zeileis


On Mon, 2 May 2011, baptiste auguie wrote:


Unfortunately, it seems that vcd doesn't return grobs but draws
directly to the device, which prevents a concise solution.


Yes. The reason is that vcd was first written before grobs were available.

When we need multiple plots in a single layout, we use Baptiste's second 
(more verbose) solution. A worked example is included in ?Ord_plot.



You could try the following,

library(gridExtra)
library(vcd)
data("Titanic")

p = grid.grabExpr(mosaic(Titanic))
grid.arrange(p, p, p, ncol=2)

Or, more versatile but also more verbose,

pushViewport(...)
mosaic(...)
upViewport()
pushViewport(...)
mosaic(...)
upViewport()
etc..

HTH,

baptiste
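
For reference, the more verbose viewport route might look roughly like the sketch below. It assumes that mosaic() accepts newpage = FALSE (the argument is passed on to strucplot()), so that it draws into the current viewport instead of starting a new page. draw_side_by_side() is a made-up helper name, not part of grid or vcd, and Titanic and HairEyeColor merely stand in for "different data sets".

```r
library(grid)

# Hypothetical helper: takes a list of zero-argument drawing functions
# and draws each one into its own cell of a 1 x n grid layout.
draw_side_by_side <- function(draw_funs) {
  n <- length(draw_funs)
  grid.newpage()
  pushViewport(viewport(layout = grid.layout(nrow = 1, ncol = n)))
  for (i in seq_len(n)) {
    pushViewport(viewport(layout.pos.row = 1, layout.pos.col = i))
    draw_funs[[i]]()   # must draw into the current viewport
    upViewport()       # back to the layout viewport
  }
  upViewport()         # back to the root viewport
  invisible(n)
}

# Only run the vcd part if the package is installed.
if (requireNamespace("vcd", quietly = TRUE)) {
  draw_side_by_side(list(
    function() vcd::mosaic(Titanic, newpage = FALSE),
    function() vcd::mosaic(HairEyeColor, newpage = FALSE)
  ))
}
```

The same helper should work for any grid-based plot that can be told not to open a new page; base graphics (par(mfrow = ...), layout()) will not respect these viewports.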

On 2 May 2011 11:32, Neuwirth Erich  wrote:

I would like to display multiple mosaic plots from vcd (not defined by a model 
but derived from different data sets)
side by side.
Neither par(mfrow=...)
nor layout seem to allow to arrange multiple mosaic plots in a grid.
Is there an easy way of arranging mosaics in a grid?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Simple General Statistics and R question (with 3 line example) - get z value from pairwise.wilcox.test

2011-05-02 Thread Uwe Ligges

To get the statistics, you will have to run each wilcox.test manually. 
The pairwise... version just extracts the p-values and adjusts them.


Uwe Ligges
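
A minimal sketch of what "run each wilcox.test manually" could look like. The data are a smaller three-group stand-in for the example quoted below. wilcox.test() does not return a Z value, so here Z is backed out from the two-sided normal-approximation p-value (which includes the continuity correction), and r is taken as |Z| / sqrt(N) with N the number of pairs. Both conventions are assumptions on my part, so check them against whatever reporting guideline you follow.

```r
set.seed(1)
a <- c(runif(50, min = 1, max = 50), rnorm(50, 50), rnorm(50, 49.9, 0.5))
b <- rep(c("group_a", "group_b", "group_c"), each = 50)

groups <- split(a, b)
pairs  <- combn(names(groups), 2, simplify = FALSE)

res <- do.call(rbind, lapply(pairs, function(p) {
  x  <- groups[[p[1]]]
  y  <- groups[[p[2]]]
  wt <- wilcox.test(x, y, paired = TRUE, exact = FALSE)
  # Assumption: recover Z from the two-sided p-value; the sign is taken
  # from the median paired difference.
  z  <- qnorm(wt$p.value / 2, lower.tail = FALSE) * sign(median(x - y))
  data.frame(g1 = p[1], g2 = p[2],
             V  = unname(wt$statistic),      # signed rank statistic
             p  = wt$p.value,
             z  = z,
             r  = abs(z) / sqrt(length(x)))  # assumption: N = number of pairs
}))

# Same adjustment that pairwise.wilcox.test(..., p.adj = "bonferroni") applies
res$p.bonf <- p.adjust(res$p, method = "bonferroni")
print(res)
```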


On 28.04.2011 15:18, JP wrote:

Hi there,

I am trying to do multiple pairwise Wilcoxon signed rank tests in a
manner similar to:

a<- c(runif(1000, min=1,max=50), rnorm(1000, 50), rnorm(1000, 49.9,
0.5), rgeom(1000, 0.5))
b<- c(rep("group_a", 1000), rep("group_b", 1000), rep("group_c",
1000), rep("group_d", 1000))
pairwise.wilcox.test(a, b, alternative="two.sided",
p.adj="bonferroni", exact=F, paired=T)

This gives me the following output:

        group_a group_b group_c
group_b <2e-16  -       -
group_c <2e-16  0.25    -
group_d <2e-16  <2e-16  <2e-16

(which is kind of expected since group_b and group_c have similar distributions)

I have found that when doing a Wilcoxon signed rank test you should report:

- The median value (and not the mean or sd, presumably because of the
underlying potential non normal distribution)
- The Z score (or value)
- r
- p value

My questions are:

- Are the above enough/correct values to report (some places even
quote W and df) ?  What else would you suggest?
- How do I calculate the Z score and r for the above example?
- How do I get each statistic from the pairwise.wilcox.test call?

Many Thanks
JP


Re: [R] importing and filtering time series data

2011-05-02 Thread Joel Reymont

My current code looks like this. Anything that can be improved?

#! /usr/bin/rscript

# install.packages(c('zoo','xts'))

library(zoo)
library(xts)

req_stats <- function(data, type = NA)
{
  if (is.na(type))
    csv <- data
  else
    # subset of data matching our request type
    csv <- subset(data, Kind == type)
  # import into a time series
  x <- xts(csv$Duration, as.POSIXct(csv$Time))
  # requests per second
  rps <- period.apply(x, endpoints(x, 'seconds'), length)
  # stats
  c(length(x), mean(x), var(x), quantile(x, c(.05, .95)), mean(rps))
  # indexFormat(x) <- "%Y-%m-%d %H:%M:%OS"
  # options(digits.secs=6)
}

# assumes column headers

data <- read.csv("benchie.csv")

# take out the rows with "N"

all <- subset(data, Include == "Y")

# Kind: R = sidebar request, C = sidebar click, U = upload doc, A = create ad

sidebar_req <- req_stats(all, "R")
# sidebar_click <- req_stats(all, "C")
doc_upload <- req_stats(all, "U")
ad_create <- req_stats(all, "A")
all <- req_stats(all)

# mdat <- rbind(all, sidebar_req, sidebar_click, doc_upload, ad_create)
# rownames(mdat) <- c("all", "sidebar req", "sidebar click", "doc upload", "ad create")
mdat <- rbind(all, sidebar_req, doc_upload, ad_create)
rownames(mdat) <- c("all", "sidebar req", "doc upload", "ad create")
colnames(mdat) <- c("count", "mean", "var", "5%", "95%", "rps")

print(round(mdat, digits = 3))

--
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont

[R] vector decreasing by a factor

2011-05-02 Thread andre bedon


Hi,
I'm quite new to R so this question will sound quite fundamental. I need to 
create a vector of length 160. The first element should be (1+r)^159 and each 
element thereafter should decrease by a factor of (1+r) until the 160th element 
that should be 1. Is there a function similar to seq() but increasing or 
decreasing by factors? I need to do this in one step i.e, not using loops. Any 
help would be greatly appreciated. 
Regards,
Andre
  

Re: [R] bootstrapping problem

2011-05-02 Thread Uwe Ligges




On 29.04.2011 09:01, mimato wrote:

I want to classify bipolar neurons in human cochleas and have data of the
following structure:

Vol_Nuc Vol_Soma
1   186.23   731.96
2   204.58  4370.96
3   539.98  7344.86
4   477.71  6939.28
5   421.22  5588.53
6   276.61  1017.05
7   392.28  6392.32
8   424.43  6190.13
9   256.41  3850.51
10  249.17  3118.14
11  276.97  3037.29
12  295.30  3703.76
13  314.43  5265.97
14  301.15  5781.73

I have already worked with Matlab (I'm not a programmer) and created nice
colour-coded dendrograms and also made some verifications of them. I have now
started working with R and bootstrapped the data with a package named pvclust. It
worked and R computed ...

here is the code:

library(pvclust)

data =
data.frame(Vol_Nuk=c(186.23,204.58,539.98,477.71,421.22,276.61,392.28,424.43,256.41,249.17,276.97,295.3,314.43,301.15),
Vol_Soma=c(731.96,4370.96,7344.86,6939.28,5588.53,1017.05,6392.32,6190.13,3850.51,3118.14,3037.29,3703.76,5265.97,5781.73))

plot(data)
result<-pvclust(data,nboot=100)
plot(result)

It also does not work with the following commands:

cluster.bootstrap<- pvclust(Raw, nboot=1000, method.dist="abscor")
plot(cluster.bootstrap)
pvrect(cluster.bootstrap)

I always get the following problem:

Error in plot.hclust(x$hclust, main = main, sub = sub, xlab = xlab, col = col,  :
  invalid dendrogram input

Does anyone have an idea what's wrong...



Yes: You have only 2 variables, and pvclust clusters the variables (columns); 
with just 2 of them it makes no sense to plot a dendrogram. I guess you want to 
cluster observations rather than variables? Then transpose your data before 
applying pvclust.


Best,
Uwe Ligges
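
A small sketch of the transposition, using the numbers from the post below. Only base R is run here; the pvclust() call is left as a comment because I have not checked how its bootstrap behaves when, after transposing, only two rows are left to resample. Scaling the two volumes and using Euclidean distance are my assumptions (with two values per neuron, a correlation-based distance would be degenerate).

```r
dat <- data.frame(
  Vol_Nuc  = c(186.23, 204.58, 539.98, 477.71, 421.22, 276.61, 392.28,
               424.43, 256.41, 249.17, 276.97, 295.30, 314.43, 301.15),
  Vol_Soma = c(731.96, 4370.96, 7344.86, 6939.28, 5588.53, 1017.05, 6392.32,
               6190.13, 3850.51, 3118.14, 3037.29, 3703.76, 5265.97, 5781.73)
)

# pvclust() clusters columns, so the neurons have to become columns.
dat_t <- t(scale(dat))   # assumption: put both volumes on a common scale
dim(dat_t)               # 2 rows (variables) x 14 columns (neurons)

# Plain (non-bootstrapped) clustering of the 14 neurons for comparison:
hc <- hclust(dist(t(dat_t)), method = "average")
plot(hc, main = "Neurons clustered by nucleus and soma volume")

# Untested here:
# library(pvclust)
# result <- pvclust(dat_t, nboot = 100, method.dist = "euclidean")
# plot(result); pvrect(result)
```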




Thanks a lot!!


Re: [R] help with a survplot

2011-05-02 Thread Marco Barbàra


Thank you very much. 

Despite prof. Harrell's support (for whom I feel great
esteem) I still remain doubtful about this feature.


Re: [R] vector decreasing by a factor

2011-05-02 Thread Ted Harding

On 02-May-11 07:55:27, andre bedon wrote:
> Hi,
> I'm quite new to R so this question will sound quite fundamental.
> I need to create a vector of length 160. The first element should
> be (1+r)^159 and each element thereafter should decrease by a
> factor of (1+r) until the 160th element that should be 1.
> Is there a function similar to seq() but increasing or decreasing
> by factors? I need to do this in one step i.e, not using loops.
> Any help would be greatly appreciated. 
> Regards,
> Andre

One expression which would do what you want is

  rev((1+r)^(0:159))

though there may be more efficient ways to do it. This assumes
that r, hence (1+r), is given. If you are instead given the
value X1 of the first element, which is to be interpreted
as (1+r)^159, then perhaps take (1+r) as X1^(1/159),
though there is a potential slight inaccuracy in recovering
X1 from (1+r)^159. So check this first.

Hoping this helps,
Ted.
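
The expressions in this thread can be checked against each other in a few lines. The value r = 0.05 is made up for illustration; everything else follows the question.

```r
r <- 0.05   # assumed value, not from the original post
n <- 160

v1 <- rev((1 + r)^(0:(n - 1)))   # Ted Harding's form
v2 <- (1 + r)^((n - 1):0)        # the same thing without rev()
all.equal(v1, v2)

v1[1] / v1[2]   # consecutive elements differ by a factor of (1 + r)
v1[n]           # last element is (1 + r)^0

# If only the first element X1 is given, recover (1 + r) from it and
# check the round trip, as suggested above.
X1     <- v1[1]
growth <- X1^(1 / (n - 1))
all.equal(growth, 1 + r)
all.equal(growth^(n - 1), X1)
```

all.equal() is used rather than == because the round trip goes through floating-point powers and roots.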

E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 02-May-11   Time: 09:50:55
-- XFMail --


Re: [R] vector decreasing by a factor

2011-05-02 Thread Uwe Ligges




On 02.05.2011 09:55, andre bedon wrote:


Hi,
I'm quite new to R so this question will sound quite fundamental. I need to 
create a vector of length 160. The first element should be (1+r)^159 and each 
element thereafter should decrease by a factor of (1+r) until the 160th element 
that should be 1. Is there a function similar to seq() but increasing or 
decreasing by factors? I need to do this in one step i.e, not using loops. Any 
help would be greatly appreciated.


Yes:
(1+r)^(159:0)

Uwe Ligges


Regards,
Andre


Re: [R] bwplot in ascending order

2011-05-02 Thread Uwe Ligges




On 01.05.2011 22:52, Doran, Harold wrote:

Can anyone point me to examples with R code where bwplot in lattice is used to 
order the boxes in ascending order? I have found the following discussion and 
it partly works. But, I have a conditioning variable, so my example is more like

bwplot(var1 ~ var2|condition, dat)



I guess you are looking for something along

bwplot(var1 ~ var2 | reorder(condition, var2, median), dat)

Uwe Ligges
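
A small self-contained sketch of reorder() with a conditioning variable. The original post does not say which of var1 and var2 is the factor, so this assumes var1 is the numeric response and var2 the grouping factor whose boxes should appear in ascending order of the median. The data and the column var2_sorted are made up.

```r
library(lattice)

set.seed(42)
var2      <- rep(c("a", "b", "c"), each = 20, times = 2)
condition <- rep(c("first", "second"), each = 60)
shift     <- c(a = 6, b = 0, c = 3)          # made-up group locations
dat <- data.frame(var1 = rnorm(120, mean = shift[var2]),
                  var2 = factor(var2),
                  condition = factor(condition))

# Reorder the levels of the grouping factor by the median of the response.
dat$var2_sorted <- reorder(dat$var2, dat$var1, FUN = median)
levels(dat$var2_sorted)

p <- bwplot(var1 ~ var2_sorted | condition, data = dat)
print(p)
```

Note that this orders the boxes by the overall median, so the same order is used in every panel. Applying reorder() to the conditioning variable instead, as in the reply above, changes the order of the panels rather than of the boxes.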




The example in the discussion below works only when there is no conditioning 
variable, as far as I can tell. I can tweak the example below to work, but then 
I get some ugly labels in the lattice plot. It seems index.cond is supposed to 
help me solve this, but I cannot find good examples showing its use.

Thanks
Harold

http://r.789695.n4.nabble.com/bwplot-reorder-factor-on-y-axis-td790903.html

Re: [R] question of VECM restricted regression

2011-05-02 Thread Pfaff, Bernhard Dr.

Hello Meilan:

'ect' is shorthand for error-correction term, 'sd' denotes the seasonal dummy 
variables and 'LRM.dl1' is the lagged first difference of the variable 'LRM' 
(the log of real money demand).

HTH,
Bernhard 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Meilan Yan
> Sent: Friday, 29 April 2011 11:10
> To: bernhard.pf...@pfaffikus.de; r-help@r-project.org
> Subject: [R] question of VECM restricted regression
> 
> Dear Colleague
> 
>   I am trying to figure out how to use R to do OLS restricted 
> VECM regression. However, there is some notation I cannot understand.
> 
> Please tell me what 'ect', 'sd' and 'LRM.dl1' are in the 
> following example:
> 
> # OLS restricted VECM regression
> library(urca)
> data(denmark)
> sjd <- denmark[, c("LRM", "LRY", "IBO", "IDE")]
> sjd.vecm <- ca.jo(sjd, ecdet = "const", type = "eigen", K = 2,
>                   spec = "longrun", season = 4)
> sjd.vecm.rls <- cajorls(sjd.vecm, r = 1)
> summary(sjd.vecm.rls$rlm)
> sjd.vecm.rls$beta
> 
> Response LRM.d :
> Call:
> lm(formula = substitute(LRM.d), data = data.mat)
> 
> Residuals:
>       Min        1Q    Median        3Q       Max
> -0.027598 -0.012836 -0.003395  0.015523  0.056034
> 
> Coefficients:
>          Estimate Std. Error t value Pr(>|t|)
> ect1    -0.212955   0.064354  -3.309  0.00185 **
> sd1     -0.057653   0.010269  -5.614 1.16e-06 ***
> sd2     -0.016305   0.009177  -1.777  0.08238 .
> sd3     -0.040859   0.008767  -4.660 2.82e-05 ***
> LRM.dl1  0.049816   0.191992   0.259  0.79646
> LRY.dl1  0.075717   0.157902   0.480  0.63389
> IBO.dl1 -1.148954   0.372745  -3.082  0.00350 **
> IDE.dl1  0.227094   0.546271   0.416  0.67959
> 
> > sjd.vecm.rls$beta
>             ect1
> LRM.l2  1.00
> LRY.l2 -1.032949
> IBO.l2  5.206919
> IDE.l2 -4.215879
> 
> 
> Many thanks
> Meilan
> 
> 
> 
> 
> 

[R] subseting data

2011-05-02 Thread Matevž Pavlič

Hi, 

 

Is it possible (I am sure it is) to subset data from a data.frame on the basis 
of something like the SQL »LIKE« operator? I.e., I would like to subset the data 
so that only values which contain the string »GP« are used.

Example:

Gp<-subset(DF, DF$USCS like »GP«)

 

This like of course is not working, 

 

Thanks, m



Re: [R] strange fluctuations in system.time with kernapply

2011-05-02 Thread Uwe Ligges




On 29.04.2011 23:38, Alexander Senger wrote:

Hello expeRts,


here is something which strikes me as kind of odd and I would like to
ask for some enlightenment:

First let's do this:

tkern <- kernel("modified.daniell", c(5,5))
test <- rep(1,100)
system.time(kernapply(test,tkern))
   user  system elapsed
  1.100   0.040   1.136

That was easy. Now this:

test <- rep(1,110)
system.time(kernapply(test,tkern))
   user  system elapsed
   1.40    0.02    1.43

Still fine. Now this:

test <- rep(1,111)
system.time(kernapply(test,tkern))
   user  system elapsed
  1.390   0.020   1.409

Ok, by now it seems boring. But wait:

test <- rep(1,1110300)
system.time(kernapply(test,tkern))
   user  system elapsed
 12.270   0.030  12.319

There is a sudden - and repeatable! - jump in the time needed to execute
kernapply. At least from a naive point of view there should not be much
difference between applying a kernel to a vector 111 or 1110300
entries long. But maybe there is some limit here?

So I tried this:

test <- rep(1,1110400)
system.time(kernapply(test,tkern))
   user  system elapsed
   1.96    0.01    1.97

which doesn't fit into the pattern. But the best thing is still to come.
When I try this

test <- rep(1,1110308)
system.time(kernapply(test,tkern))

then the computer starts to run and does so for longer than 15 minutes
until when I normally kill the process. As noted above this behaviour is
repeatable and occurs every time I issue these commands.

I really would like to know if there is some magic to the number 1110308
I'm not aware of.


The magic is that the length of the vector, 1110308, is inefficient for 
the fft() used within kernapply(). You need integer powers of 2 for a 
really fast FFT.


You can also try smaller numbers  to get longer runtimes, e.g.: 13

As an example, compare:

system.time(fft(rep(1, 32768))) # roughly 0 seconds
system.time(fft(rep(1, 32771))) # almost 10 seconds

Uwe Ligges
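
To see why one length is fast and a neighbouring one is slow, it can help to look at the prime factors of the length. R's fft() is quickest when the length is a product of small primes and slowest when it has a large prime factor. The sketch below uses a hand-written prime_factors() helper (my own name, not a base R function) together with stats::nextn(), which returns the next length whose only prime factors are 2, 3 and 5.

```r
# Trial-division factorisation; fine for numbers of this size.
prime_factors <- function(n) {
  factors <- numeric(0)
  d <- 2
  while (d * d <= n) {
    while (n %% d == 0) {
      factors <- c(factors, d)
      n <- n / d
    }
    d <- d + 1
  }
  if (n > 1) factors <- c(factors, n)
  factors
}

prime_factors(32768)   # only factors of 2: the fast case
prime_factors(32771)   # a single large factor: the slow case

# Next "friendly" length at or above n
n      <- 32771
n_fast <- nextn(n)
prime_factors(n_fast)

# Zero-padding to n_fast keeps fft() on its fast path
x_padded <- c(rep(1, n), rep(0, n_fast - n))
system.time(fft(x_padded))
```

Padding changes the transform itself, so this is only a way to explore the timing effect, not a drop-in fix for kernapply(); trimming or extending the series to a friendlier length before smoothing is a judgment call that depends on the data.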






Last but not least, here is my

sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
[1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C
[3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8
[5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8
[7] LC_PAPER=de_DE.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.10.1


Thank you,

Alex




Re: [R] subseting data

2011-05-02 Thread Andrew Robinson

I wonder if grep()  will help you?

Cheers

Andrew

On Mon, May 02, 2011 at 11:03:52AM +0200, Matevž Pavlič wrote:
> Hi, 
> 
>  
> 
> Is it possible (i am sure it is)  to subset data from a data.frame on the 
> basis of SQL »LIKE« operator. I.e., i would like to subset a data where only 
> values which contains a string »GP« would be used? 
> 
> Example:
> 
> Gp<-subset(DF, DF$USCS like »GP«)
> 
>  
> 
> This like of course is not working, 
> 
>  
> 
> Thanks, m
> 
> 

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/


Re: [R] subseting data

2011-05-02 Thread Matevž Pavlič

Hi, 

When I use your code i get this :

>dat<-data.frame(test=c("abc","cdf","dabc"))

>d<-subset(dat,grepl(test,"abc"))

>d

Warning message:

In grepl(test, "abc") :

  argument 'pattern' has length > 1 and only the first element will be used

> d

  test

1  abc

2  cdf

3 dabc

I can't seem to make it work. Also how would i use the grepl() to select only 
those that are not like i.e. »GP«?

Thanks, m

From: Steven Kennedy [mailto:stevenkennedy2...@gmail.com] 
Sent: Monday, May 02, 2011 11:30 AM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] subseting data

You can use grepl:

> dat<-data.frame(test=c("abc","cdf","dabc"))
> d<-subset(dat,grepl(test,"abc"))
> d
  test
1  abc
3 dabc

On Mon, May 2, 2011 at 7:03 PM, Matevž Pavlič wrote:

Hi,

Is it possible (i am sure it is)  to subset data from a data.frame on the basis 
of SQL »LIKE« operator. I.e., i would like to subset a data where only values 
which contains a string »GP« would be used?

Example:

Gp<-subset(DF, DF$USCS like »GP«)

This like of course is not working,

Thanks, m



Re: [R] subseting data

2011-05-02 Thread Uwe Ligges




On 02.05.2011 11:47, Matevž Pavlič wrote:

Hi,



When I use your code i get this :




dat<-data.frame(test=c("abc","cdf","dabc"))



d<-subset(dat,grepl(test,"abc"))


d <- subset(dat, grepl("abc", test))





d


Warning message:

In grepl(test, "abc") :

   argument 'pattern' has length>  1 and only the first element will be used


d


   test

1  abc

2  cdf

3 dabc



I can't seem to make it work. Also how would i use the grepl() to select only 
those that are not like i.e. »GP«?



Use the negation:

d <- subset(dat, !grepl("abc", test))


Uwe Ligges
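
Putting the pieces of this thread together for the original question: a short sketch on a made-up data frame. The names DF and USCS come from the first post; the values, the depth column and the result names are invented for illustration. The key point is the argument order, grepl(pattern, x).

```r
DF <- data.frame(
  USCS  = c("GP", "GP-GM", "SW", "CL", "gp-sm"),   # made-up soil classes
  depth = c(1.2, 3.4, 0.8, 5.1, 2.2),
  stringsAsFactors = FALSE
)

# Roughly SQL's  USCS LIKE '%GP%'  (fixed = TRUE: literal text, no regex)
gp <- subset(DF, grepl("GP", USCS, fixed = TRUE))

# Roughly  NOT LIKE '%GP%'
not_gp <- subset(DF, !grepl("GP", USCS, fixed = TRUE))

# Roughly  LIKE 'GP%'  (anchor the pattern at the start of the string)
starts_gp <- subset(DF, grepl("^GP", USCS))

# Case-insensitive match
any_case <- subset(DF, grepl("gp", USCS, ignore.case = TRUE))

gp
not_gp
```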







Thanks, m





From: Steven Kennedy [mailto:stevenkennedy2...@gmail.com]
Sent: Monday, May 02, 2011 11:30 AM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] subseting data



You can use grepl:


dat<-data.frame(test=c("abc","cdf","dabc"))
d<-subset(dat,grepl(test,"abc"))
d

   test
1  abc
3 dabc




On Mon, May 2, 2011 at 7:03 PM, Matevž Pavlič wrote:

Hi,



Is it possible (i am sure it is)  to subset data from a data.frame on the basis of 
SQL »LIKE« operator. I.e., i would like to subset a data where only values which 
contains a string »GP« would be used?

Example:

Gp<-subset(DF, DF$USCS like »GP«)



This like of course is not working,



Thanks, m



[R] The R Inferno revised

2011-05-02 Thread Patrick Burns


Hell is new and improved.

The new version is in the same old place:
http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

An explanation of the changes is at:
http://www.portfolioprobe.com/2011/05/02/the-r-inferno-revised/


--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')


Re: [R] --mem-vsize in R

2011-05-02 Thread Uwe Ligges




On 29.04.2011 22:14, kparamas wrote:

Hi,

I am calculating pairwise correlation coefficients for a matrix of 234 X
3.
I am getting the following error,
Error in cbind(as.vector(row(cl)), as.vector(col(cl)), as.vector(cl)) :
   allocMatrix: too many elements specified



The problem is that you try to create a matrix with 3 * nrow(cl) * 
ncol(cl) elements here. The maximal number of elements in one single 
vector or matrix is 2^31 - 1. You can have several of those, if you have 
a sufficient amount of RAM, though.


Uwe Ligges
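
The arithmetic can be checked directly, and the big three-column table can often be avoided by thresholding first and keeping only the upper triangle. In the sketch below the number of variables p is hypothetical (the dimensions in the post are incomplete), the data are random, and the limit is the one quoted above for R at the time of this thread.

```r
# How many elements would cbind(row, col, value) need for p variables?
p          <- 30000                 # hypothetical number of columns
n_elements <- 3 * as.numeric(p)^2   # as.numeric() avoids integer overflow
n_elements > .Machine$integer.max   # above 2^31 - 1, so allocation fails

# Alternative: pick out the strong pairs before building the table.
set.seed(1)
cData <- matrix(rnorm(200 * 10), nrow = 200)   # small made-up data set
COR   <- 0.1                                   # threshold, as in the post
cl    <- cor(cData, use = "pairwise.complete.obs", method = "pearson")

# Row/column indices of the pairs that pass the threshold; upper.tri()
# drops the diagonal and the duplicated lower triangle.
keep  <- which(abs(cl) >= COR & upper.tri(cl), arr.ind = TRUE)
edges <- cbind(keep, cor = cl[keep])
head(edges)
```

This only helps with the edge table; the correlation matrix itself still has p^2 elements and has to fit under the same limit.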





In addition: There were 50 or more warnings (use warnings() to see the first
50)

The function used is,
corGraphPearson = function(cData, COR) #COR is threshold 0.5,0.7, etc
{

 cl = unname(cor(cData, use="pairwise.complete.obs", method="pearson"))

 result = cbind(as.vector(row(cl)),as.vector(col(cl)),as.vector(cl))
 result = result[result[,1] != result[,2],]

 corm = result

# remove low cor pairs
 corm =corm[abs(corm[,3])>= COR, ]
# the network
 net<- network(corm, directed = F)
}


I am running this in a cluster with 4 machines with 24 GB memory each.

How should I start R so that I make max use of the memory available?
Or how to overcome this issue?


[R] Bechmarking my code

2011-05-02 Thread Alaios

Dear all,
I have written a fairly large piece of code that takes about 6 hours to execute 
(measured with system.time).

I was wondering if it is possible to find out which pieces of code are the 
most time consuming, so that I can try to improve them.

Could you please help me understand how I can do this sort of benchmarking?

Best Regards
Alex


Re: [R] caret - prevent resampling when no parameters to find

2011-05-02 Thread Max Kuhn

Yeah, that didn't work. Use

   fitControl<-trainControl(index = list(seq(along = mdrrClass)))

See ?trainControl to understand what this does in detail.

Max


Re: [R] Bechmarking my code

2011-05-02 Thread Uwe Ligges




On 02.05.2011 12:35, Alaios wrote:

Dear all,
I have written a fairly large piece of code that takes about 6 hours to execute 
(measured with system.time).

I was wondering if it is possible to find out which pieces of code are the 
most time consuming, so that I can try to improve them.

Could you please help me understand how I can do this sort of benchmarking?



Use profiling. See ?Rprof and the manuals to get a first idea.

Uwe Ligges
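
A minimal profiling session might look like the following. The three functions are placeholders standing in for the real six-hour job; Rprof() and summaryRprof() are the actual tools from the utils package.

```r
# Placeholder workload: one deliberately slow part and one fast part.
slow_part <- function(seconds = 0.6) {
  t0 <- proc.time()[["elapsed"]]
  while (proc.time()[["elapsed"]] - t0 < seconds) {
    x <- sort(runif(1e5))   # busy work so the profiler has something to sample
  }
  invisible(NULL)
}
fast_part   <- function() sum(seq_len(1000))
my_analysis <- function() {
  fast_part()
  slow_part()
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)   # start sampling the call stack
my_analysis()
Rprof(NULL)                         # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.total)   # time spent in each function including its callees
head(prof$by.self)    # time spent in each function itself
```

For a long job it is usually enough to profile a scaled-down run (fewer iterations, a subset of the data), since the relative cost of the pieces tends to stay similar.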



Best Regards
Alex


Re: [R] The bin/R file - hardcoded paths

2011-05-02 Thread Uwe Ligges

From the manual R Installation and Administration (that you should have 
read yourself before posting):

"You can install into another directory tree by using
make prefix=/path/to/here install"

Uwe Ligges





On 29.04.2011 19:42, Saptarshi Guha wrote:

Hello,

I notice that e.g. /home/sguha/lib64 is hard coded into the /bin/R file.
I installed R as ./configure --prefix=$HOME ...

What i need to do is ship the entire R distribution to remote nodes,
and run R. These are shipped to ephemeral directories
so I dont know the path ahead of time.

R_HOME doesn't change things either.

So I guess one can't run R on a system unless it's been installed?

1. I can't install R on the compute nodes using ./configure 
2. All nodes do have the same architecture
3. I would like to stick to the 'shipping' approach.


Thanks
Saptarshi


Re: [R] logistic regression with glm: cooks distance and dfbetas are different compared to SPSS output

2011-05-02 Thread Uwe Ligges




On 29.04.2011 18:29, "Biedermann, Jürgen" wrote:

Hi there,

I have the problem, that I'm not able to reproduce the SPSS residual
statistics (dfbeta and cook's distance) with a simple binary logistic
regression model obtained in R via the glm-function.

I tried the following:

fit <- glm(y ~ x1 + x2 + x3, data, family=binomial)

cooks.distance(fit)


Just type stats::cooks.distance.glm and see the definition in R yourself:

function (model, infl = influence(model, do.coef = FALSE),
    res = infl$pear.res, dispersion = summary(model)$dispersion,
    hat = infl$hat, ...)
{
    p <- model$rank
    res <- (res/(1 - hat))^2 * hat/(dispersion * p)
    res[is.infinite(res)] <- NaN
    res
}


Now you can dig further yourself. I do not know how to find the 
algorithm actually used by SPSS, hence I cannot tell what is different.


Uwe Ligges
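
The definition above can be reproduced by hand, which makes it easier to compare term by term with whatever formula SPSS documents. The data below are simulated and the variable names are arbitrary; only the formula itself is taken from the function body shown above.

```r
set.seed(123)
n   <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.25 * dat$x2))

fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

pear <- residuals(fit, type = "pearson")   # Pearson residuals
h    <- hatvalues(fit)                     # leverages
p    <- fit$rank                           # number of estimated coefficients
phi  <- summary(fit)$dispersion            # fixed at 1 for the binomial family

# Same expression as in stats:::cooks.distance.glm
cd_manual <- (pear / (1 - h))^2 * h / (phi * p)

all.equal(unname(cd_manual), unname(cooks.distance(fit)))
```

If another package uses, say, a different residual type or a one-step approximation, the discrepancy should show up when each of pear, h, p and phi is compared with its counterpart.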




dfbetas(fit)

When i compare the returned values with the values that I get in SPSS,
they are different, although the same model is calculated (the
coefficients are the same etc.)

It seems that different calculation-formulas are used for cooks.distance
and dfbetas in SPSS compared to R.

Unfortunately I didn't find out, what's the difference in the
calculation and how I could get R to calculate me the same statistics
that SPSS uses.
Or is this an unknown SPSS bug?

Greetings
Jürgen


Re: [R] Specify custom par(mfrow()) layout for defined plot()

2011-05-02 Thread Uwe Ligges




On 29.04.2011 17:10, Michael Bach wrote:

Dear R Users,

I am doing stats::decompose() on 4 different time series.  When I issue

csdA<- decompose(tsA)
plot(csdA)

I get a summary plot for observed, trend, seasonal and random components
of decomposed time series tsA.  As I understand it, the object returned
by decompose() has its own plot method where mfrow(4,1) etc. is
defined.  Now suppose I wanted to wrap those mfrow(4,1) into my own
mfrow(2,2) layout.  How could I achieve this?  Is there a general way to
handle these cases?  Something like a "meta" par(mfrow())?



This does not work and is one of the reasons why the grid package was 
developed.
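
One workaround that stays within base graphics is to skip the plot method and
draw the components yourself inside a single layout(). The following is only a
sketch under that assumption: the four series are stand-ins taken from the
datasets package (not the poster's tsA etc.), and the names series, comps and
n_drawn are hypothetical.

```r
# Four example series standing in for the poster's four time series
series <- list(co2 = co2, nottem = nottem,
               AirPassengers = AirPassengers, UKgas = UKgas)

# 8 x 2 layout filled column-wise: every series gets a stack of 4 panels,
# and the four stacks end up in a 2 x 2 arrangement on the page
layout(matrix(1:16, nrow = 8))
op <- par(mar = c(1.5, 4, 1, 1))

n_drawn <- 0
for (nm in names(series)) {
  dec <- decompose(series[[nm]])
  comps <- list(observed = series[[nm]], trend = dec$trend,
                seasonal = dec$seasonal, random = dec$random)
  for (cn in names(comps)) {
    plot(comps[[cn]], ylab = cn, xlab = "")
    if (cn == "observed") title(main = nm, line = 0.2, cex.main = 0.9)
    n_drawn <- n_drawn + 1
  }
}

par(op)
layout(1)   # back to a single-panel device
```

This gives up the ready-made plot method, but in exchange the panel
arrangement is entirely under your control.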


Uwe Ligges



Best Regards,
Michael Bach


Re: [R] Tests for the need of cluster analysis

2011-05-02 Thread Tal Galili

Hi Mary,
I'm not sure I understood your question.

Are you using this package:
http://cran.r-project.org/web/packages/prabclus/index.html
and asking how to decide whether to use it or not?

Contact Details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

On Sun, May 1, 2011 at 7:54 PM, mary weiss  wrote:

> Does R have the capability to perform tests for the need of clustering
> analysis (e.g., in prabclus)?  I am using panel data with two-way fixed
> effects but am unsure about whether I should be using a cluster option as
> well to estimate my model.--
> View this message in context:
> http://r.789695.n4.nabble.com/Tests-for-the-need-of-cluster-analysis-tp3488097p3488097.html
> Sent from the R help mailing list archive at Nabble.com.
>

[R] (no subject)

2011-05-02 Thread Oliver Sonnentag




Re: [R] Copying to R a rectangular array from a Java class

2011-05-02 Thread Hurr

I discovered that a row of a rectangular array is returned correctly, but a
function parameter is not passed to Java.
My bare-bones Java test class source and the R test code follow:
public class RJavTest {
  public static void main(String[] args) { RJavTest rJavTest = new RJavTest(); }
  public final static String conStg = "testString";
  public final static double con0dbl = 1001;
  public final static double[] con1Arr = new double[] {
    10001, 10002, 10003, 10004, 10005, 10006 };
  public final static double[][] con2Arr = new double[][] {
    { 101, 102, 103, 104 }, { 201, 202, 203, 204 }, { 301, 302, 303, 304 } };
  public final static String retConStg() { return(conStg); }
  public final static double retCon0dbl() { return(con0dbl); }
  public final static double[] retCon1Arr() { return(con1Arr); }
  public final static double[] retCon2Row0() { return(con2Arr[0]); }
  public final static double[] retCon2Row(int row) { return(con2Arr[row]); }
  public final static double[][] retCon2Arr() { return(con2Arr); }
}
library(rJava)
.jinit()
.jaddClassPath("C:/ad/j") # a directory on my disk 
print(.jclassPath())
rJavaTst <- .jnew("RJavTest") # compiled java to class file
connStg <- .jfield(rJavaTst,sig="S","conStg")
print(connStg)
connStgRet <- .jcall(rJavaTst,returnSig="S","retConStg")
print(connStgRet)
conn1Arr <- .jfield(rJavaTst,sig="[D","con1Arr")
print(conn1Arr)
print(conn1Arr[2])
conn1ArrRet <- .jcall(rJavaTst,returnSig="[D","retCon1Arr")
print(conn1ArrRet)
print(conn1ArrRet[2])
conn0dbl <- .jfield(rJavaTst,sig="D","con0dbl")
print(conn0dbl,digits=15)
conn2Row0Ret <- .jcall(rJavaTst,returnSig="[D","retCon2Row0")
print(conn2Row0Ret)
print(conn2Row0Ret[2])
# The above is education, questions on rectangular and parameters are below
conn2Arr<- .jfield(rJavaTst,sig="[[D","con2Arr")
conn2ArrRet <- .jcall(rJavaTst,returnSig="[[D","retCon2Arr")
# I can't identify any complaints so far 
print(conn2Arr)
print(conn2ArrRet)
conn2RowRet <- .jcall(rJavaTst,returnSig="[D","retCon2Row",0)
print(conn2RowRet)
print(conn2RowRet[2])
# But what meaning should I get from these strange messages? 

The results are:
> library(rJava)
> .jinit()
> .jaddClassPath("C:/ad/j") # a directory on my disk 
> print(.jclassPath())
[1] "C:\\Users\\ENVY17\\Documents\\R\\win-library\\2.12\\rJava\\java"
[2] "C:\\ad\\j"  
> rJavaTst <- .jnew("RJavTest") # compiled java to class file
> connStg <- .jfield(rJavaTst,sig="S","conStg")
> print(connStg)
[1] "testString"
> connStgRet <- .jcall(rJavaTst,returnSig="S","retConStg")
> print(connStgRet)
[1] "testString"
> conn1Arr <- .jfield(rJavaTst,sig="[D","con1Arr")
> print(conn1Arr)
[1] 10001 10002 10003 10004 10005 10006
> print(conn1Arr[2])
[1] 10002
> conn1ArrRet <- .jcall(rJavaTst,returnSig="[D","retCon1Arr")
> print(conn1ArrRet)
[1] 10001 10002 10003 10004 10005 10006
> print(conn1ArrRet[2])
[1] 10002
> conn0dbl <- .jfield(rJavaTst,sig="D","con0dbl")
> print(conn0dbl,digits=15)
[1] 1001
> conn2Row0Ret <- .jcall(rJavaTst,returnSig="[D","retCon2Row0")
> print(conn2Row0Ret)
[1] 101 102 103 104
> print(conn2Row0Ret[2])
[1] 102
> # The above is education, questions on rectangular and parameters are
> below
> conn2Arr<- .jfield(rJavaTst,sig="[[D","con2Arr")
> conn2ArrRet <- .jcall(rJavaTst,returnSig="[[D","retCon2Arr")
> # I can't identify any complaints so far 
> print(conn2Arr)
[[1]]
[1] "Java-Array-Object[D:[D@66848c"
[[2]]
[1] "Java-Array-Object[D:[D@8813f2"
[[3]]
[1] "Java-Array-Object[D:[D@1d58aae"
> print(conn2ArrRet)
[[1]]
[1] "Java-Array-Object[D:[D@66848c"
[[2]]
[1] "Java-Array-Object[D:[D@8813f2"
[[3]]
[1] "Java-Array-Object[D:[D@1d58aae"
> conn2RowRet <- .jcall(rJavaTst,returnSig="[D","retCon2Row",0)
Error in .jcall(rJavaTst, returnSig = "[D", "retCon2Row", 0) : 
  method retCon2Row with signature (D)[D not found
> print(conn2RowRet)
Error in print(conn2RowRet) : object 'conn2RowRet' not found
> print(conn2RowRet[2])
Error in print(conn2RowRet[2]) : object 'conn2RowRet' not found
> # But what meaning should I get from these strange messages? 
> 
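
The "signature (D)[D not found" message hints at the argument type. As far as
I understand rJava, the method signature is built from the R types of the
arguments, so a plain 0 (a double in R) is looked up as retCon2Row(double),
while the Java method takes an int. A minimal sketch of the type difference
follows; the commented-out rJava calls are untested assumptions, since they
need the RJavTest class from this post on the class path.

```r
# A bare number in R is a double; the L suffix makes it an integer
typeof(0)     # "double"  -> would be matched against retCon2Row(double)
typeof(0L)    # "integer" -> would be matched against retCon2Row(int)

row <- 0L     # equivalently: as.integer(0)

# Assumed fix for the failing call (not run here):
# conn2RowRet <- .jcall(rJavaTst, returnSig = "[D", "retCon2Row", row)

# The "Java-Array-Object[D" entries are references to the inner double[]
# rows; evaluating each one should give an ordinary R matrix (not run here):
# mat <- do.call(rbind, lapply(conn2ArrRet, .jevalArray))
```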


--
View this message in context: 
http://r.789695.n4.nabble.com/Copying-to-R-a-rectangular-array-from-a-Java-class-tp3486167p3489899.html
Sent from the R help mailing list archive at Nabble.com.


[R] grid

2011-05-02 Thread azam jaafari

Dear All
 
I trained a neural network on 200 data points and ran a prediction on a grid
file (e.g. 100 points) as below:
 
snn <- predict(nn, newdata=data.frame(wetness=wetnessgrid$band1,
                                      ndvi=ndvigrid$band1))
 
The pixels of snn match those of wetnessgrid and ndvigrid.
 
I want to convert this result (snn) to a map that I can open in GIS
software.
 
Thanks a lot
 

 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subseting data

2011-05-02 Thread Steven Kennedy

You can use grepl:

> dat<-data.frame(test=c("abc","cdf","dabc"))
> d<-subset(dat,grepl("abc",test))
> d
  test
1  abc
3 dabc
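
Applied to the original question, a sketch could look like this. The data
frame DF and its columns are made up for illustration; fixed = TRUE treats
"GP" as a literal string rather than a regular expression, which is closest
to SQL's LIKE '%GP%'.

```r
# Hypothetical data with a USCS soil classification column
DF <- data.frame(USCS  = c("GP", "GP-GM", "SW", "CL", "SP-GP"),
                 depth = 1:5)

# grepl(pattern, x): pattern first, then the column to search
Gp <- subset(DF, grepl("GP", USCS, fixed = TRUE))
Gp
```

For a match only at the start of the string (LIKE 'GP%'), a regular
expression such as grepl("^GP", USCS) would be one option.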



On Mon, May 2, 2011 at 7:03 PM, Matevž Pavlič wrote:

> Hi,
>
>
>
> Is it possible (I am sure it is) to subset data from a data.frame on the
> basis of the SQL >LIKE< operator? I.e., I would like to subset the data so
> that only values which contain the string >GP< are used.
>
>
>
> Example:
>
>
>
> Gp<-subset(DF, DF$USCS like >GP<)
>
>
>
> This like of course is not working,
>
>
>
> Thanks, m
>
>

Re: [R] bwplot in ascending order

2011-05-02 Thread Mark Difford

On May 01 (2011) Harold Doran wrote:

>> Can anyone point me to examples with R code where bwplot in lattice is
>> used to order the boxes in 
>> ascending order?

You don't give an example and what you want is not entirely clear.

Presumably you want ordering by the median (boxplot, and based on the
example you point to, where the median is mentioned as an _example_).

Is this what you want?

##
bwplot(var1 ~ var2|condition, dat, index.cond = function(x, y) reorder(y, x,
median))  ## if x is numeric
bwplot(var1 ~ var2|condition, dat, index.cond = function(x, y) reorder(x, y,
median))  ## if y is numeric
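
For reference, my reading of ?xyplot is that a function passed as index.cond
is called once per panel and should return a single number, and the panels are
then arranged in ascending order of that number. A minimal sketch under that
reading, using the barley data shipped with lattice (not the poster's data):

```r
library(lattice)

# Panels (one per site) are ordered by the median yield within each panel;
# in a horizontal box plot x is the numeric variable
p <- bwplot(variety ~ yield | site, data = barley,
            index.cond = function(x, y) median(x))
print(p)

# The resulting panel order is stored in the trellis object
p$index.cond[[1]]
```

Note that this orders the panels, not the boxes inside a panel; the latter
still needs reorder() on the factor that is plotted on the axis.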

Regards, Mark.

--
View this message in context: 
http://r.789695.n4.nabble.com/bwplot-in-ascending-order-tp3488557p3489544.html
Sent from the R help mailing list archive at Nabble.com.


[R] adding legend to matplot

2011-05-02 Thread kcchalmers

Hi all, I am new to R programming and I am trying to write some simple code
to plot my data. The problem is that I am not able to add a legend entry
for each column of the data matrix. Can someone please help me out?
How can I get a legend that identifies each data curve?

This is the code I had written:

matrix<-read.table(file="PNPLA.txt",
header=TRUE,
sep="\t",
row.names=1
)

a<-matrix[1:6]
c1<-a[,1]/a[28,1]
c2<-a[,2]/a[28,2]
c3<-a[,3]/a[28,3]
c4<-a[,4]/a[28,4]
c5<-a[,5]/a[28,5]
c6<-a[,6]/a[28,6]

mat<-cbind(c1,c2,c3,c4,c5,c6)

x1<-mat[,1]/a[,1]
x2<-mat[,2]/a[,1]
x3<-mat[,3]/a[,1]
x4<-mat[,4]/a[,1]
x5<-mat[,5]/a[,1]
x6<-mat[,6]/a[,1]

final<-cbind(x1,x2,x3,x4,x5,x6)
matplot(final,type="l")
tfin<-t(final)
colnames(tfin)<-c("PC26","PC28","PC28:02","PC30","PC32","PC3201","PC3202","PC34",
"PC3401","PC3402","PC3404","PC36","PC3601","PC3602","PC3604","PC3606","PC38","PC3804","PC3806","PC40","PC4002","PC4006","PC4008","PC42","PC44","SM21")
 

matplot(tfin,pch = 1:25, type =
"o",lty=20,lwd=1.9,xlab="Time",ylab="PCmix/SM21:00 ratio")
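
A common pattern is to fix the colours and line types up front and hand the
same vectors to both matplot() and legend(). A minimal sketch with made-up
numbers; the matrix final below is random data with four hypothetical column
names, not the PNPLA data:

```r
set.seed(1)
# Hypothetical data: 10 time points, one named column per curve
final <- matrix(runif(40), nrow = 10,
                dimnames = list(NULL, c("PC26", "PC28", "PC30", "PC32")))

cols <- seq_len(ncol(final))   # one colour per column
ltys <- seq_len(ncol(final))   # one line type per column

matplot(final, type = "l", col = cols, lty = ltys,
        xlab = "Time", ylab = "ratio")

# Reusing cols and ltys keeps the legend in step with the curves
lg <- legend("topright", legend = colnames(final),
             col = cols, lty = ltys, bty = "n")
```

With type = "o" the pch vector can be passed to both calls in the same way.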

--
View this message in context: 
http://r.789695.n4.nabble.com/adding-legend-to-matplot-tp3489844p3489844.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Problem installing new packages

2011-05-02 Thread liyatle

I'm having that problem, what did you do?

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-installing-new-packages-tp1589974p3489573.html
Sent from the R help mailing list archive at Nabble.com.


[R] Axis label colour

2011-05-02 Thread Kang Min

Hi all,

Is there an argument in the axis() function to change the colour of
the tick labels? I only found col.ticks, and col.lab, but they're not
doing what I want.

Thanks,
KM


Re: [R] Global variables

2011-05-02 Thread abhagwat

Well, what would be really helpful is to restrict the scope of all
non-function variables, but keep a global scope for all function
variables. Then, you still have access to all loaded functions, but you
don't mix up variables.

How would one do that?

Adi


> Is there a way I can prevent global variables to be visible within my
> functions?

Yes, but you probably shouldn't.  You would do it by setting the 
environment of the function to something that doesn't have the global 
environment as a parent, or grandparent, etc.  The only common examples 
of that are baseenv() and emptyenv().  For example,

x <- 1
f <- function() print(x)


--
View this message in context: 
http://r.789695.n4.nabble.com/Global-variables-tp3178242p3489796.html
Sent from the R help mailing list archive at Nabble.com.


[R] how to get row name using the which function

2011-05-02 Thread Schumacher, G.

Dear All,

Probably a very basic question, but can't seem to work my way around it.

I want to find which row has the maximum value. But what if the row names do not 
correspond with the row numbers? In the example below, you'll see that the max 
of example is row 4, but the name of row 4 is "9". How do I get R to return "9" 
as value, instead of 4?

example <- matrix(c(0,0,0,1), 4, 1, dimnames=list(c("1", "3", "5", "9"), 
c("1")))
which.max(example)

[1] 4

Hope someone can help out.

Gijs Schumacher, MSc
PhD candidate

--
Department of Political Science
VU University Amsterdam

Contact:
Tel: +31(0)20 5986798
Fax: +31(0)20 5986820
Web: http://home.fsw.vu.nl/g.schumacher
Email: g.schumac...@vu.nl

Visiting address:
Metropolitan
Buitenveldertselaan 2
Room Z - 333

Mail:
De Boelelaan 1081
1081 HV Amsterdam
The Netherlands



Re: [R] Specify custom par(mfrow()) layout for defined plot()

2011-05-02 Thread Michael Bach

Uwe Ligges  writes:

> On 29.04.2011 17:10, Michael Bach wrote:
>> Dear R Users,
>>
>> I am doing stats::decompose() on 4 different time series.  When I issue
>>
>> csdA<- decompose(tsA)
>> plot(csdA)
>>
>> I get a summary plot for observed, trend, seasonal and random components
>> of decomposed time series tsA.  As I understand it, the object returned
>> by decompose() has its own plot method where mfrow(4,1) etc. is
>> defined.  Now suppose I wanted to wrap those mfrow(4,1) into my own
>> mfrow(2,2) layout.  How could I achieve this?  Is there a general way to
>> handle these cases?  Something like a "meta" par(mfrow())?
>
>
> This does not work and is one of the reasons why the grid package was 
> developed.
>

Does this mean that there is no way whatsoever or that there is a
workaround via the grid package??

Kind Regards,
Michael Bach


Re: [R] bwplot in ascending order

2011-05-02 Thread Doran, Harold

Doesn't seem to work. My data structure is below (I will send data to anyone 
off-list who could offer support).

The code below does work, but since I concatenate Region and Gender, 
the labels on the lattice are ugly.

dat$test <- factor(paste(dat$Region, dat$Gender, sep='_'))
bymedian <- with(dat, reorder(test, finalRank, median))
bwplot(reorder(test, finalRank, median) ~ finalRank|Gender, dat, 
subset = Region !="",
scale='free',
xlab = 'Total Score',
ylab = 'Region',
)

> str(dat)
'data.frame':   58921 obs. of  16 variables:
 $ Athlete   : int  13 13 13 13 13 14 14 15 15 15 ...
 $ Workout   : Factor w/ 6 levels "11.1","11.2",..: 1 2 3 4 5 1 2 1 1 2 ...
 $ Result: int  309 375 46 100 300 158 232 353 359 479 ...
 $ Valid : Factor w/ 6 levels "bogus","invalid",..: 5 5 5 5 5 5 5 5 5 5 ...
 $ Gender: Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
 $ Height.cm.: num  196 196 196 196 196 ...
 $ Weight.kg.: num  97.7 97.7 97.7 97.7 97.7 ...
 $ Age   : int  29 29 29 29 29 42 42 24 24 24 ...
 $ Region: Factor w/ 18 levels "","Africa","Asia",..: 16 16 16 16 16 13 13 
18 18 18 ...
 $ AgeCut: num  2 2 2 2 2 4 4 2 2 2 ...
 $ Height.met: num  1.96 1.96 1.96 1.96 1.96 ...
 $ spVar : chr  "11.1_male" "11.2_male" "11.3_male" "11.4_male" ...
 $ Rank  : int  1567 2253 2050 1651 1462 8155 7624 322 208 206 ...
 $ totalRank : int [1:58921(1d)] 8983 8983 8983 8983 8983 15779 15779 1252 1252 
1252 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr  "13" "13" "13" "13" ...
 $ finalRank : int  1274 1274 1274 1274 1274 2643 2643 81 81 81 ...
 $ totalScore: int [1:58921(1d)] 1130 1130 1130 1130 1130 390 390 1768 1768 
1768 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr  "13" "13" "13" "13" ...

> -Original Message-
> From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
> Sent: Monday, May 02, 2011 4:58 AM
> To: Doran, Harold
> Cc: r-help@r-project.org
> Subject: Re: [R] bwplot in ascending order
> 
> 
> 
> On 01.05.2011 22:52, Doran, Harold wrote:
> > Can anyone point me to examples with R code where bwplot in lattice is used
> to order the boxes in ascending order? I have found the following discussion
> and it partly works. But, I have a conditioning variable, so my example is
> more like
> >
> > bwplot(var1 ~ var2|condition, dat)
> 
> 
> I guess you are looking for something along
> 
> bwplot(var1 ~ var2 | reorder(condition, var2, median), dat)
> 
> Uwe Ligges
> 
> 
> >
> > The example in the discussion below works only when there is not a
> conditioning variable as far as I can tell. I can tweak the example below to
> work, but then I get some ugly labels in the lattice plot. It seems index.cond
> is supposed to help me solve this, but I cannot find good examples showing its
> use.
> >
> > Thanks
> > Harold
> >
> > http://r.789695.n4.nabble.com/bwplot-reorder-factor-on-y-axis-td790903.html

Re: [R] how to get row name using the which function

2011-05-02 Thread Downey, Patrick

Perhaps not the most elegant. 

rownames(example)[which.max(example)]

If you wanted to type less, you could always write a function.

names.max <- function(x){
  # use the argument x rather than the global object 'example'
  return(rownames(x)[which.max(x)])
}
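
A short usage sketch, together with a variant for matrices that have more than
one column, where which.max() returns a position counted down the columns
rather than a row number. The helper name row_of_max and the matrix m are
hypothetical, made up for this illustration.

```r
example <- matrix(c(0, 0, 0, 1), 4, 1,
                  dimnames = list(c("1", "3", "5", "9"), c("1")))

# Hypothetical helper: row name of the largest entry in a one-column matrix
row_of_max <- function(x) rownames(x)[which.max(x)]
row_of_max(example)

# With several columns, convert the linear index to (row, column) first
m <- matrix(c(1, 5, 2, 9, 0, 3), 3, 2,
            dimnames = list(c("a", "b", "c"), c("u", "v")))
idx <- arrayInd(which.max(m), dim(m))
rownames(m)[idx[1, 1]]   # row name of the overall maximum
colnames(m)[idx[1, 2]]   # column name of the overall maximum
```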


-Mitch


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Schumacher, G.
Sent: Monday, May 02, 2011 7:54 AM
To: 'r-help@r-project.org'
Subject: [R] how to get row name using the which function

Dear All,

Probably a very basic question, but can't seem to work my way around it.

I want to find which row has the maximum value. But what if the row names do
not correspond with the row numbers? In the example below, you'll see that the
max of example is row 4, but the name of row 4 is "9". How do I get R to
return "9" as value, instead of 4?

example <- matrix(c(0,0,0,1), 4, 1, dimnames=list(c("1", "3", "5", "9"),
c("1")))
which.max(example)

[1] 4

Hope someone can help out.

Gijs Schumacher, MSc
PhD candidate

--
Department of Political Science
VU University Amsterdam

Contact:
Tel: +31(0)20 5986798
Fax: +31(0)20 5986820
Web: http://home.fsw.vu.nl/g.schumacher
Email: g.schumac...@vu.nl

Visiting address:
Metropolitan
Buitenveldertselaan 2
Room Z - 333

Mail:
De Boelelaan 1081
1081 HV Amsterdam
The Netherlands



Re: [R] Specify custom par(mfrow()) layout for defined plot()

2011-05-02 Thread Prof Brian Ripley


On Mon, 2 May 2011, Michael Bach wrote:


Uwe Ligges  writes:


On 29.04.2011 17:10, Michael Bach wrote:

Dear R Users,

I am doing stats::decompose() on 4 different time series.  When I issue

csdA<- decompose(tsA)
plot(csdA)

I get a summary plot for observed, trend, seasonal and random components
of decomposed time series tsA.  As I understand it, the object returned
by decompose() has its own plot method where mfrow(4,1) etc. is
defined.  Now suppose I wanted to wrap those mfrow(4,1) into my own
mfrow(2,2) layout.  How could I achieve this?  Is there a general way to
handle these cases?  Something like a "meta" par(mfrow())?



This does not work and is one of the reasons why the grid package was developed.



Does this mean that there is no way whatsoever or that there is a
workaround via the grid package??


See the gridBase package.



Kind Regards,
Michael Bach




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


[R] problem with Sweave and pdflatex

2011-05-02 Thread Frank Lehmann

Hello,

when I plot figures with Sweave, I get the message "pdflatex: Permission
denied". This problem only occurs while working on the local system. When I
copy the *.rnw file to my AFS drive, there is no problem at all.

Here is a small example:

\documentclass{scrartcl}
\usepackage[OT1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage[ngerman]{babel}
\usepackage[pdftex]{graphicx}
\usepackage{Sweave}

\begin{document}

\setkeys{Gin}{width=\textwidth}
\begin{figure}[htbp]
<>=
x <- 1:10
plot(x)
@
\caption{Eine einfache Grafik}
\end{figure}

\end{document}

Does anyone have an idea how to solve this problem? I'm working with Windows
XP.

Thanks!

Frank



[R] Problems with Rterm 2.13.0 - but not RGui

2011-05-02 Thread Stefan McKinnon Høj-Edwards

Hi all,

I have just installed R 2.13.0 and I am experiencing problems with the 
terminal, but not with the GUI interface.
I am on Windows 7.

When running "R" or "Rterm" from a commandline I receive the following:

Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[3]="C:/Programmer/R/R-2.13.0/library": Adgang nægtet

R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-pc-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Warning message:
package "methods" in options("defaultPackages") was not found
During startup - Warning messages:
1: package 'datasets' in options("defaultPackages") was not found
2: package 'utils' in options("defaultPackages") was not found
3: package 'grDevices' in options("defaultPackages") was not found
4: package 'graphics' in options("defaultPackages") was not found
5: package 'stats' in options("defaultPackages") was not found
6: package 'methods' in options("defaultPackages") was not found


Note: "C:/Programmer/" is the Danish equivalent of "C:/Program Files".
The first message, "Adgang nægtet", translates directly to "Access denied".

Any suggestions as how to fix this?

Kind regards,
Stefan McKinnon Edwards


Re: [R] strange fluctuations in system.time with kernapply

2011-05-02 Thread Ravi Varadhan

Why not do `zero padding' to improve the efficiency, i.e. add a bunch of zeros 
to the end of the data vector so that the length of the resulting vector is a 
power of 2?  This is very common in signal processing, and is legitimate since 
zero padding does not add any new information.

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins 
University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Uwe Ligges
Sent: Monday, May 02, 2011 5:31 AM
To: Alexander Senger
Cc: r-help@r-project.org
Subject: Re: [R] strange fluctuations in system.time with kernapply



On 29.04.2011 23:38, Alexander Senger wrote:
> Hello expeRts,
>
>
> here is something which strikes me as kind of odd and I would like to
> ask for some enlightenment:
>
> First let's do this:
>
> tkern <- kernel("modified.daniell", c(5,5))
> test <- rep(1,100)
> system.time(kernapply(test,tkern))
> User System verstrichen
> 1.100 0.040 1.136
>
> That was easy. Now this:
>
> test <- rep(1,110)
> system.time(kernapply(test,tkern))
> User System verstrichen
> 1.40 0.02 1.43
>
> Still fine. Now this:
>
> test <- rep(1,111)
> system.time(kernapply(test,tkern))
> User System verstrichen
> 1.390 0.020 1.409
>
> Ok, by now it seems boring. But wait:
>
> test <- rep(1,1110300)
> system.time(kernapply(test,tkern))
> User System verstrichen
> 12.270 0.030 12.319
>
> There is a sudden - and repeatable! - jump in the time needed to execute
> kernapply. At least from a naive point of view there should not be much
> difference between applying a kernel to a vector 111 or 1110300
> entries long. But maybe there is some limit here?
>
> So I tried this:
>
> test <- rep(1,1110400)
> system.time(kernapply(test,tkern))
> User System verstrichen
> 1.96 0.01 1.97
>
> which doesn't fit into the pattern. But the best thing is still to come.
> When I try this
>
> test <- rep(1,1110308)
> system.time(kernapply(test,tkern))
>
> then the computer starts to run and does so for longer than 15 minutes
> until when I normally kill the process. As noted above this behaviour is
> repeatable and occurs every time I issue these commands.
>
> I really would like to know if there is some magic to the number 1110308
> I'm not aware of.

The magic is that the length of the vector, 1110308, is inefficient for 
the fft() used within kernapply(). You need integer powers of 2 for a 
really fast FFT.

You can also try smaller numbers  to get longer runtimes, e.g.: 13

As an example, compare:

system.time(fft(rep(1, 32768))) # roughly 0 seconds
system.time(fft(rep(1, 32771))) # almost 10 seconds
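
If padding is acceptable for the analysis, stats::nextn() can pick a
convenient length. This is only a sketch of the padding step on a plain
vector; the names x, n_pad, x_pad and n_pow2 are made up here, and whether
padded input is appropriate for kernapply() itself is something to check
against the edge behaviour you need.

```r
x <- rep(1, 32771)                         # 32771 is prime, awkward for fft()

# Next length that factors into 2s, 3s and 5s only (the nextn() default)
n_pad <- nextn(length(x))
x_pad <- c(x, rep(0, n_pad - length(x)))   # zero padding at the end

timing <- system.time(spec <- fft(x_pad))  # time the padded transform
timing

# If a strict power of 2 is preferred instead
n_pow2 <- nextn(length(x), factors = 2)
```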

Uwe Ligges



>
>
> Last but not least, here is my
>
> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
> [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C
> [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8
> [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8
> [7] LC_PAPER=de_DE.utf8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.1
>
>
> Thank you,
>
> Alex
>


Re: [R] Global variables

2011-05-02 Thread Duncan Murdoch

On 02/05/2011 7:19 AM, abhagwat wrote:

Well, what would be really helpful is to restrict the scope of all
non-function variables, but keep a global scope for all function
variables. Then, you still have access to all loaded functions, but you
don't mix up variables.

How would one do that?

You can't without low level modifications.  Before R has done the 
lookup, it doesn't know if an object is a function or not.  It can guess 
by usage, e.g. it can recognize that "print" should be a function in 
print(1) and it will ignore non-functions named "print", but it is very 
common in R code to do things like

fn <- print
fn(1)

and that would fail.  But if you want to experiment with the change, you 
can, because R is open source.   I doubt if you'll get much help unless 
you give a really convincing argument (on the R-devel list, not on this 
list) why to make the change.

Duncan Murdoch

Adi

>  Is there a way I can prevent global variables to be visible within my
>  functions?

Yes, but you probably shouldn't.  You would do it by setting the
environment of the function to something that doesn't have the global
environment as a parent, or grandparent, etc.  The only common examples
of that are baseenv() and emptyenv().  For example,

x<- 1
f<- function() print(x)
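
The quoted example stops short, so here is a minimal sketch of the idea under
my own assumptions (the functions g and h and the tryCatch() wrappers are
additions for illustration): once baseenv() is the function's environment,
base functions are still found, but neither global variables nor functions
from other packages are.

```r
x <- 1
f <- function() print(x)
environment(f) <- baseenv()        # lookup now goes: f's frame -> base -> empty
res <- tryCatch(f(), error = function(e) e)
inherits(res, "error")             # x is no longer visible from inside f

g <- function() sum(1:3)           # relies on base functions only
environment(g) <- baseenv()
g()                                # still works

h <- function() median(1:3)        # median() lives in stats, not in base
environment(h) <- baseenv()
res_h <- tryCatch(h(), error = function(e) e)
inherits(res_h, "error")           # so the function lookup fails as well
```

This is why the approach is rarely practical: hiding global variables this way
also hides every function outside the base package.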

--
View this message in context: 
http://r.789695.n4.nabble.com/Global-variables-tp3178242p3489796.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Axis label colour

2011-05-02 Thread David Winsemius



On May 2, 2011, at 2:48 AM, Kang Min wrote:


Hi all,

Is there an argument in the axis() function to change the colour of
the tick labels? I only found col.ticks, and col.lab, but they're not
doing what I want.


You just need to read a bit further down in the help page for `axis`.
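
For readers of the archive: the graphical parameter in question is col.axis, which axis() accepts through its "..." argument. A minimal sketch (the off-screen device and the colours are arbitrary choices for this example):

```r
pdf(NULL)  # off-screen device; drop this line and dev.off() to draw on screen

plot(1:10, axes = FALSE, xlab = "x", ylab = "y")

# col.axis colours the tick labels; col and col.ticks colour the axis line
# and the tick marks.
ticks <- axis(1, col.axis = "red")
axis(2, col = "blue", col.ticks = "blue", col.axis = "darkgreen")
box()

dev.off()
```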

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Global variables

2011-05-02 Thread Kenn Konstabel

On Mon, May 2, 2011 at 2:19 PM, abhagwat  wrote:
> Well, what would be really helpful is to restrict the scope of all
> non-function variables, but keep a global scope for all function
> variables. Then, you still have access to all loaded functions, but you
> don't mix up variables.
>
> How would one do that?

But what's the real motivation for this? It could be useful for
ensuring that there are no unexpected global variables in your code
but you can do it using findGlobals in codetools package.

library(codetools)
fun <- function() mean(x)
findGlobals(fun, merge=FALSE)
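
As a slightly larger sketch of the same check (scale_by and factor_k are hypothetical names invented for this example), findGlobals() with merge = FALSE separates the global functions a body calls from the global variables it reads, so an unintended dependency on a global stands out:

```r
library(codetools)  # a recommended package, part of standard R installations

scale_by <- function(v) {
  total <- sum(v)        # 'total' is local
  v / total * factor_k   # 'factor_k' is defined neither locally nor as an argument
}

g <- findGlobals(scale_by, merge = FALSE)
g$functions   # global functions used by the body, e.g. sum
g$variables   # global non-function names the body relies on
```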


Kenn

>> Is there a way I can prevent global variables from being visible within my
>> functions?
>
> Yes, but you probably shouldn't.  You would do it by setting the
> environment of the function to something that doesn't have the global
> environment as a parent, or grandparent, etc.  The only common examples
> of that are baseenv() and emptyenv().  For example,
>
> x <- 1
> f <- function() print(x)
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Global-variables-tp3178242p3489796.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 3-way contingency table

2011-05-02 Thread Mathias Walter

Hi David,

thanks for your quick response. It was really helpful.

--
Kind regards,
Mathias

2011/4/29 David Winsemius :
>
> On Apr 29, 2011, at 6:47 AM, Mathias Walter wrote:
>
>> Hi,
>>
>> I have a large data frame with many columns. A short example is given below:
>>
>>> dataH
>>
>>   host ms01 ms31 ms33 ms34
>> 1  cattle    4   20    9    6
>> 2   sheep    4    3    4    5
>> 3  cattle    4    3    4    5
>> 4  cattle    4    3    4    5
>> 5   sheep    4    3    5    5
>> 6    goat    4    3    4    5
>> 7   sheep    4    3    5    5
>> 8    goat    4    3    4    5
>> 9    goat    4    3    4    5
>> 10 cattle    4    3    4    5
>>
>> Now I want to determine the frequencies of every unique value in
>> every column depending on the host column.
>>
>> It is quite easy to determine the frequencies in total with the
>> following command:
>>
>>> dataH2 <- dataH[,c(2,3,4,5)]
>>> table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")
>>
>>   ms01 ms31 ms33 ms34
>> 3     0    9    0    0
>> 4    10    0    7    0
>> 5     0    0    2    9
>> 6     0    0    0    1
>> 9     0    0    1    0
>> 20    0    1    0    0
>>
>> But I cannot manage to get it dependent on the host.
>>
>> I tried
>>
>>> xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)
>>
>> and many other ways but I'm not successful.
>>
>> I can get it for each column individually with
>>
>>> with(dataH, table(host, ms33))
>>
>>      ms33
>> host     4 5 9
>> cattle 3 0 1
>> deer   0 0 0
>> goat   3 0 0
>> human  0 0 0
>> sheep  1 2 0
>> tick   0 0 0
>>
>> But I do not want to repeat the command for every column. I need a
>> single table which can be plotted as a balloon plot, for instance.
>
> You have obviously not given us the full data from which your "correct
> answer" was drawn, but see if this is going  the right direction:
>
> require(reshape)
>> dataHm <- melt(dataH)
> Using host as id variables
>> xtabs(~host+value+variable, dataHm)
> , , variable = ms01
>
>        value
> host     3 4 5 6 9 20
>  cattle 0 4 0 0 0  0
>  goat   0 3 0 0 0  0
>  sheep  0 3 0 0 0  0
>
> , , variable = ms31
>
>        value
> host     3 4 5 6 9 20
>  cattle 3 0 0 0 0  1
>  goat   3 0 0 0 0  0
>  sheep  3 0 0 0 0  0
>
> , , variable = ms33
>
>        value
> host     3 4 5 6 9 20
>  cattle 0 3 0 0 1  0
>  goat   0 3 0 0 0  0
>  sheep  0 1 2 0 0  0
>
> , , variable = ms34
>
>        value
> host     3 4 5 6 9 20
>  cattle 0 0 3 1 0  0
>  goat   0 0 3 0 0  0
>  sheep  0 0 3 0 0  0
>
>>
>> Does anybody know how to achieve this?
>>
>> --
>> Kind regards,
>> Mathias
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
>
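
For reference, the melt()/xtabs() approach quoted above can also be sketched with base R only, using stack() in place of reshape::melt(). The data frame below is retyped from the ten rows shown in the question; the columns named values and ind are the ones stack() produces.

```r
dataH <- data.frame(
  host = c("cattle", "sheep", "cattle", "cattle", "sheep",
           "goat", "sheep", "goat", "goat", "cattle"),
  ms01 = rep(4, 10),
  ms31 = c(20, rep(3, 9)),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, rep(5, 9))
)

# Long format: one row per (host, marker, value) combination.
long <- data.frame(
  host = rep(dataH$host, times = ncol(dataH) - 1),
  stack(dataH[-1])          # columns 'values' and 'ind' (the marker name)
)

# Three-way table: host x value x marker.
tab <- xtabs(~ host + values + ind, data = long)

# A flat two-dimensional view of the same counts.
ftable(tab, row.vars = c("ind", "host"))
```

The flat view from ftable() may be closer to the single table asked for, since it puts all markers into one two-dimensional layout.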

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] pie of pie chart

2011-05-02 Thread Mathias Walter

Hi,

despite the fact that pie charts often fail, I'll draw them anyway (in
a case where they do not fail ;-) ).

Does anybody know a package/methods which can draw pie of pie or bar
of pie charts similar to that in MS Excel?

--
Kind regards,
Mathias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pie of pie chart

2011-05-02 Thread Jonathan Daily

The package ggplot2 can do this using a density statistic, polar
coordinates, and faceting.

Extra documentation for the package can be found at the author's site [1].

[1] http://had.co.nz/

On Mon, May 2, 2011 at 10:06 AM, Mathias Walter  wrote:
> Hi,
>
> despite the fact that pie charts often fail, I'll draw them anyway (in
> a case where they do not fail ;-) ).
>
> Does anybody know a package/methods which can draw pie of pie or bar
> of pie charts similar to that in MS Excel?
>
> --
> Kind regards,
> Mathias
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] help with a survplot

2011-05-02 Thread Frank Harrell

Please elaborate.
Thanks
Frank


Marco Barbàra-2 wrote:
> 
> Thank you very much. 
> 
> Despite prof. Harrell's support (for whom I feel great
> esteem) I still remain doubtful about this feature.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/help-with-a-survplot-tp3485998p3490126.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem installing new packages

2011-05-02 Thread Uwe Ligges




On 02.05.2011 10:47, liyatle wrote:

I'm having that problem, what did you do?


1. This is the mailing list R-help, not an individual person. I guess 
you sent to the wrong address.
2. PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html and provide commented, 
minimal, self-contained, reproducible code.

3. Please always quote the questions/answers you are referring to.
4. The post you are referring to is ancient. I guess you either just 
downloaded the files, or you do not have write permissions to the library, 
or you install to a different library than the one you expect. But since 
you have not given any details, we cannot help.
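
Two of the guesses above (no write permission, or a different library than expected) can be checked from within R. A minimal sketch using only base functions:

```r
# All library trees R currently knows about; install.packages() uses the
# first one by default.
libs <- .libPaths()

# file.access() returns 0 where the requested access (mode 2 = write) is allowed.
writable <- file.access(libs, mode = 2) == 0

data.frame(library = libs, writable = writable)
```

If the first entry is not writable, installing into a personal library (the lib argument of install.packages()) is one common way around it.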


Uwe Ligges





--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-installing-new-packages-tp1589974p3489573.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems with Rterm 2.13.0 - but not RGui

2011-05-02 Thread Jonathan Daily

The message is pretty clear. Access denied means you don't have
permission to access the path. This also explains why the packages
fail to load - you don't have access to R's package library. It most
likely works on RGui because you are clicking it/running it as admin
(you did not specify how you ran RGui).

2011/5/2 Stefan McKinnon Høj-Edwards :
> Hi all,
>
> I have just installed R 2.13.0 and I am experiencing problems with the 
> terminal, but not with the GUI interface.
> I am on Windows 7.
>
> When running "R" or "Rterm" from a commandline I receive the following:
>
> Warning message:
> In normalizePath(path.expand(path), winslash, mustWork) :
>  path[3]="C:/Programmer/R/R-2.13.0/library": Adgang nægtet
>
> R version 2.13.0 (2011-04-13)
> Copyright (C) 2011 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> Warning message:
> package "methods" in options("defaultPackages") was not found
> During startup - Warning messages:
> 1: package 'datasets' in options("defaultPackages") was not found
> 2: package 'utils' in options("defaultPackages") was not found
> 3: package 'grDevices' in options("defaultPackages") was not found
> 4: package 'graphics' in options("defaultPackages") was not found
> 5: package 'stats' in options("defaultPackages") was not found
> 6: package 'methods' in options("defaultPackages") was not found
>
>
> Notice: "C:/Programmer/" is the Danish equivalent of "C:/Program Files".
> The first error "Adgang nægtet" is directly translated to "Access denied".
>
> Any suggestions as how to fix this?
>
> Kind regards,
> Stefan McKinnon Edwards
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with Sweave and pdflatex

2011-05-02 Thread Uwe Ligges

Have you checked the permissions in the working directory? Is there a 
blank in your path? (LaTeX does not like spaces in the path.)
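
Both questions can be answered from the R session that runs Sweave. A minimal sketch, assuming the .Rnw file is processed in the current working directory:

```r
wd <- getwd()

# TRUE if the path contains a blank, which LaTeX tools may trip over.
has_blank <- grepl(" ", wd, fixed = TRUE)

# TRUE if R may write to the directory (mode 2 = write access).
can_write <- file.access(wd, mode = 2) == 0

c(has_blank = has_blank, can_write = can_write)
```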


Uwe Ligges


On 02.05.2011 14:51, Frank Lehmann wrote:

Hello,

when I plot figures with Sweave, I get the message "pdflatex: Permission
denied". This problem only occurs while working on the local system. When I copy
the *.rnw file to my AFS drive, there is no problem at all.

Here is a small example:

\documentclass{scrartcl}
\usepackage[OT1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage[ngerman]{babel}
\usepackage[pdftex]{graphicx}
\usepackage{Sweave}

\begin{document}

\setkeys{Gin}{width=\textwidth}
\begin{figure}[htbp]
<>=
x <- 1:10
plot(x)
@
\caption{Eine einfache Grafik}
\end{figure}

\end{document}

Does anyone have an idea how to solve this problem? I'm working with Windows
XP.

Thanks!

Frank


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to get row name using the which function

2011-05-02 Thread Uwe Ligges


rownames(which(example == max(example), arr.ind=TRUE))
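
A short sketch of why this works, next to a shortcut that is only safe for a one-column matrix (the names idx, via_arr_ind and via_which_max are made up here): with arr.ind = TRUE, which() returns a matrix of row/column positions whose row names are taken from the original matrix.

```r
example <- matrix(c(0, 0, 0, 1), 4, 1,
                  dimnames = list(c("1", "3", "5", "9"), c("1")))

# Positions of the maximum as a (row, col) matrix; its row names are the
# row names of 'example'.
idx <- which(example == max(example), arr.ind = TRUE)
via_arr_ind <- rownames(idx)

# Shortcut for a single column: which.max() gives a linear index, which
# coincides with the row number only when there is one column.
via_which_max <- rownames(example)[which.max(example)]
```

With ties, the first variant returns every row that attains the maximum, whereas which.max() returns only the first.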

Uwe Ligges

On 02.05.2011 13:54, Schumacher, G. wrote:

Dear All,

Probably a very basic question, but can't seem to work my way around it.

I want to know which row has the maximum value. But what if the row names do not correspond with the row 
numbers. In the example below, you'll see that the max of example is row 4, but the name of row 4 
is "9". How do I get R to return "9" as value, instead of 4.

example<- matrix(c(0,0,0,1), 4, 1, dimnames=list(c("1", "3", "5", "9"), c("1")))
which.max(example)

[1] 4

Hope someone can help out.

Gijs Schumacher, MSc
PhD candidate

--
Department of Political Science
VU University Amsterdam

Contact:
Tel: +31(0)20 5986798
Fax: +31(0)20 5986820
Web: http://home.fsw.vu.nl/g.schumacher
Email: g.schumac...@vu.nl

Visiting address:
Metropolitan
Buitenveldertselaan 2
Room Z - 333

Mail:
De Boelelaan 1081
1081 HV Amsterdam
The Netherlands


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Specify custom par(mfrow()) layout for defined plot()

2011-05-02 Thread Michael Bach

Prof Brian Ripley  writes:

> On Mon, 2 May 2011, Michael Bach wrote:
>
>> Uwe Ligges  writes:
>>
>>> On 29.04.2011 17:10, Michael Bach wrote:
>>>> Dear R Users,
>>>>
>>>> I am doing stats::decompose() on 4 different time series.  When I issue
>>>>
>>>> csdA <- decompose(tsA)
>>>> plot(csdA)
>>>>
>>>> I get a summary plot for observed, trend, seasonal and random components
>>>> of decomposed time series tsA.  As I understand it, the object returned
>>>> by decompose() has its own plot method where mfrow(4,1) etc. is
>>>> defined.  Now suppose I wanted to wrap those mfrow(4,1) into my own
>>>> mfrow(2,2) layout.  How could I achieve this?  Is there a general way to
>>>> handle these cases?  Something like a "meta" par(mfrow())?
>>>
>>>
>>> This does not work and is one of the reasons why the grid package was 
>>> developed.
>>>
>>
>> Does this mean that there is no way whatsoever or that there is a
>> workaround via the grid package??
>
> See the gridBase package.
>

Will do. Thanks for the hint
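
For the archive, one workaround that stays within base graphics is to skip the plot method and draw the components directly into a layout() grid. This is a different technique from the gridBase route suggested above, and only a sketch: the four built-in series stand in for the poster's data, and the temporary PDF file is an arbitrary choice of device.

```r
# Stand-ins for the four time series (all from the datasets package).
series <- list(co2 = co2, AirPassengers = AirPassengers,
               nottem = nottem, UKgas = UKgas)
parts <- c("x", "trend", "seasonal", "random")

out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 8, height = 10)

# 8 x 2 grid of panels: each series fills a block of four stacked panels,
# and the four blocks are arranged 2 x 2.
layout(rbind(cbind(1:4, 5:8), cbind(9:12, 13:16)))
par(mar = c(2, 4, 1.5, 1))

for (nm in names(series)) {
  dec <- decompose(series[[nm]])
  for (p in parts) {
    # One panel per component; the series name goes on top of its block.
    plot(dec[[p]], ylab = p, xlab = "", main = if (p == "x") nm else "")
  }
}

dev.off()
```

The price is that the axis sharing and labelling done by the decompose plot method have to be redone by hand.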

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tests for the need of cluster analysis

2011-05-02 Thread Tal Galili

Hi Mary,
Are you using R for your other analysis?
If so, What commands are you using for your analysis?

p.s: please keep the rest of the R-help mailing list in the loop.

Cheers,
Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Mon, May 2, 2011 at 4:24 PM, MARY A. WEISS  wrote:

> Hi Tal,
>
> Thanks for your answer.  I am running models with two-way fixed effects and
> two-way fixed effects with a cluster option.  The results are very
> different.  I wanted to know if it is appropriate to cluster my data or
> not.  In looking through the R manual, I thought that prabclus might help me
> answer the question.  Does prabclus include any tests that will tell me if
> cluster analysis is appropriate to use with my data?  That is, is cluster
> analysis valid for my data?
>
> Thanks in advance for any help you can give me.  I really appreciate it.
>
> Mary
>
> On Mon, May 2, 2011 at 7:20 AM, Tal Galili  wrote:
>
>> Hi Mary,
>> I'm not sure I understood your question.
>>
>> Are you using this package:
>> http://cran.r-project.org/web/packages/prabclus/index.html
>>  And asking
>> how to decide if to use it or not?
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>> --
>>
>>
>>
>>
>>
>>
>> On Sun, May 1, 2011 at 7:54 PM, mary weiss  wrote:
>>
>>> Does R have the capability to perform tests for the need of clustering
>>> analysis (e.g., in prabclus)?  I am using panel data with two-way fixed
>>> effects but am unsure about whether I should be using a cluster option as
>>> well to estimate my model.--
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/Tests-for-the-need-of-cluster-analysis-tp3488097p3488097.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
>
> --
> Mary A. Weiss, Ph. D.
> Deaver Prof. of Risk, Insurance & Healthcare Mgmt
> Editor, Risk Management and Insurance Review
> Risk, Ins. & Healthcare Mgmt Dept.
> Fox School of Business
> 1801 Liacouras Walk 6th Fl (006-07)
> Temple University
> Philadelphia, PA 19122
> 215-204-1916
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tests for the need of cluster analysis

2011-05-02 Thread MARY A. WEISS

Hi,

I am currently using STATA in my analysis.  STATA has a cluster option but
does not have any tests for whether cluster analysis is necessary or not for
a dataset.  So I am trying to figure out whether R could be used to test
whether I need to be doing cluster analysis or not.  If R does tests to
determine whether cluster analysis is valid for my data, I will learn R and
use it on my data.

My data are panel data consisting of 49 states and 25 years.  Currently, I
am estimating models with fixed state and time effects.

Thanks for any help you can give me.

Cheers,

Mary



On Mon, May 2, 2011 at 1:02 PM, Tal Galili  wrote:

> Hi Mary,
> Are you using R for your other analysis?
> If so, What commands are you using for your analysis?
>
> p.s: please keep the rest of the R-help mailing list in the loop.
>
> Cheers,
> Tal
>
>
>
>
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
>
> --
>
>
>
>
>   On Mon, May 2, 2011 at 4:24 PM, MARY A. WEISS  wrote:
>
>> Hi Tal,
>>
>> Thanks for your answer.  I am running models with two-way fixed effects
>> and two-way fixed effects with a cluster option.  The results are very
>> different.  I wanted to know if it is appropriate to cluster my data or
>> not.  In looking through the R manual, I thought that prabclus might help me
>> answer the question.  Does prabclus include any tests that will tell me if
>> cluster analysis is appropriate to use with my data?  That is, is cluster
>> analysis valid for my data?
>>
>> Thanks in advance for any help you can give me.  I really appreciate it.
>>
>> Mary
>>
>>   On Mon, May 2, 2011 at 7:20 AM, Tal Galili wrote:
>>
>>> Hi Mary,
>>> I'm not sure I understood your question.
>>>
>>> Are you using this package:
>>> http://cran.r-project.org/web/packages/prabclus/index.html
>>>  And asking
>>> how to decide if to use it or not?
>>>
>>> Contact
>>> Details:---
>>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>>> www.r-statistics.com (English)
>>> --
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sun, May 1, 2011 at 7:54 PM, mary weiss  wrote:
>>>
>>>> Does R have the capability to perform tests for the need of clustering
>>>> analysis (e.g., in prabclus)?  I am using panel data with two-way fixed
>>>> effects but am unsure about whether I should be using a cluster option as
>>>> well to estimate my model.--
>>>> View this message in context:
>>>> http://r.789695.n4.nabble.com/Tests-for-the-need-of-cluster-analysis-tp3488097p3488097.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>
>>
>> --
>> Mary A. Weiss, Ph. D.
>> Deaver Prof. of Risk, Insurance & Healthcare Mgmt
>> Editor, Risk Management and Insurance Review
>> Risk, Ins. & Healthcare Mgmt Dept.
>> Fox School of Business
>> 1801 Liacouras Walk 6th Fl (006-07)
>> Temple University
>> Philadelphia, PA 19122
>> 215-204-1916
>>
>
>


-- 
Mary A. Weiss, Ph. D.
Deaver Prof. of Risk, Insurance & Healthcare Mgmt
Editor, Risk Management and Insurance Review
Risk, Ins. & Healthcare Mgmt Dept.
Fox School of Business
1801 Liacouras Walk 6th Fl (006-07)
Temple University
Philadelphia, PA 19122
215-204-1916

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Lasso with Categorical Variables

2011-05-02 Thread Clemontina Alexander

Hi! This is my first time posting. I've read the general rules and
guidelines, but please bear with me if I make some fatal error in
posting. Anyway, I have a continuous response and 29 predictors made
up of continuous variables and nominal and ordinal categorical
variables. I'd like to do lasso on these, but I get an error. The way
I am using "lars" doesn't allow for the factors. Is there a special
option or some other method in order to do lasso with cat. variables?

Here is an example (considering ordinal variables as just nominal):

set.seed(1)
Y <- rnorm(10,0,1)
X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
X4 <- rchisq(10, df=4, ncp=0)
X <- data.frame(X1,X2,X3,X4)

> str(X)
'data.frame':   10 obs. of  4 variables:
 $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
 $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
 $ X3: int  51 46 50 44 43 50 30 42 49 48
 $ X4: num  2.86 1.55 1.94 2.45 2.75 ...


I'd like to do:
obj <- lars(x=X, y=Y, type = "lasso")

Instead, what I have been doing is converting all data to continuous
but I think this is really bad!
XX <- data.matrix(X)
obj <- lars(x=XX, y=Y, type = "lasso")

Thanks for any consideration,
Tina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ID parameter in model

2011-05-02 Thread Mike Harwood

Hello,

I am apparently confused about the use of an id parameter for an event
history/survival model, and why the EHA documentation for aftreg does
not specify one.  All assistance and insights are appreciated.

Attempting to specify an id variable with the documentation example
generates an "overlapping intervals" error, so I sorted the original
mort dataframe and, within each id, set each subsequent entry time to the
previous exit time + 0.0001.  This allowed me to see the effect of the id
parameter on the coefficients and significance tests, and prompted my
question.  The code I used is shown below, with the results at the
bottom.  Thanks in advance!

Mike

library(eha)  ## provides aftreg() and the mort data

head(mort)  ## data clearly contains multiple entries for some of the ids

## Initial model
no.id.aft <- aftreg(Surv(enter, exit, event) ~ ses, data = mort)
## overlapping intervals error
id.aft <- aftreg(Surv(enter, exit, event) ~ ses, data = mort, id = id)

## ensure records ordered
mort.sort <- mort[order(mort$id, mort$enter), ]

## remove overlap
for (i in 2:nrow(mort.sort)) {
  if (mort.sort[i, 'id'] == mort.sort[i - 1, 'id'])
    mort.sort[i, 'enter'] <- mort.sort[i - 1, 'exit'] + 0.0001
}

## initial model on modified df
no.id.aft.sort <- aftreg(Surv(enter, exit, event) ~ ses, data = mort.sort)
## with id parameter
id.aft.sort <- aftreg(Surv(enter, exit, event) ~ ses, id = id, data = mort.sort)
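
As an aside, the overlap-removal loop can be written without an explicit loop. The sketch below uses a small made-up data frame (toy) instead of mort, and assumes the rows are already sorted by id and entry time:

```r
# Hypothetical toy data: two subjects with overlapping spells.
toy <- data.frame(
  id    = c(1, 1, 2, 2, 2),
  enter = c(0, 4, 0, 2, 5),
  exit  = c(5, 9, 3, 6, 8)
)
n <- nrow(toy)

# TRUE for every row that has the same id as the row before it.
same <- c(FALSE, toy$id[-1] == toy$id[-n])

# Start each such spell just after the previous row's exit time. The exit
# column is never modified, so this matches the row-by-row loop.
toy$enter[same] <- toy$exit[which(same) - 1] + 0.0001
toy
```

Like the loop, this resets the entry time for every repeated id, whether or not the original intervals actually overlapped.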


#=== output ===#
> no.id.aft.sort
Call:
aftreg(formula = Surv(enter, exit, event) ~ ses, data = mort.sort)

Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
ses
           lower    0.416     0         1           (reference)
           upper    0.584    -0.347     0.707     0.089     0.000

log(scale)                    3.603    36.704     0.065     0.000
log(shape)                    0.331     1.393     0.058     0.000

Events                    276
Total time at risk        17045
Max. log. likelihood      -1391.4
LR test statistic         16.1
Degrees of freedom        1
Overall p-value           6.04394e-05
> id.aft.sort
Call:
aftreg(formula = Surv(enter, exit, event) ~ ses, data = mort.sort,
id = id)

Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
ses
           lower    0.416     0         1           (reference)
           upper    0.584    -0.364     0.695     0.090     0.000

log(scale)                    3.588    36.171     0.065     0.000
log(shape)                    0.338     1.402     0.058     0.000

Events                    276
Total time at risk        17045
Max. log. likelihood      -1390.8
LR test statistic         17.2
Degrees of freedom        1
Overall p-value           3.3091e-05
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lasso with Categorical Variables

2011-05-02 Thread Steve Lianoglou

Hi,

On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander  wrote:
> Hi! This is my first time posting. I've read the general rules and
> guidelines, but please bear with me if I make some fatal error in
> posting. Anyway, I have a continuous response and 29 predictors made
> up of continuous variables and nominal and ordinal categorical
> variables. I'd like to do lasso on these, but I get an error. The way
> I am using "lars" doesn't allow for the factors. Is there a special
> option or some other method in order to do lasso with cat. variables?
>
> Here is an example (considering ordinal variables as just nominal):
>
> set.seed(1)
> Y <- rnorm(10,0,1)
> X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
> X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
> X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
> X4 <- rchisq(10, df=4, ncp=0)
> X <- data.frame(X1,X2,X3,X4)
>
>> str(X)
> 'data.frame':   10 obs. of  4 variables:
>  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
>  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
>  $ X3: int  51 46 50 44 43 50 30 42 49 48
>  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...
>
>
> I'd like to do:
> obj <- lars(x=X, y=Y, type = "lasso")
>
> Instead, what I have been doing is converting all data to continuous
> but I think this is really bad!

Yeah, it is.

Check out the "Categorical Predictor Variables" section here for a way
to handle such predictor vars:
http://www.psychstat.missouristate.edu/multibook/mlt08m.html

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lasso with Categorical Variables

2011-05-02 Thread David Winsemius



On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:


Hi,

On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander > wrote:

Hi! This is my first time posting. I've read the general rules and
guidelines, but please bear with me if I make some fatal error in
posting. Anyway, I have a continuous response and 29 predictors made
up of continuous variables and nominal and ordinal categorical
variables. I'd like to do lasso on these, but I get an error. The way
I am using "lars" doesn't allow for the factors. Is there a special
option or some other method in order to do lasso with cat. variables?

Here is an example (considering ordinal variables as just nominal):

set.seed(1)
Y <- rnorm(10,0,1)
X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
X4 <- rchisq(10, df=4, ncp=0)
X <- data.frame(X1,X2,X3,X4)


str(X)

'data.frame':   10 obs. of  4 variables:
 $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
 $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
 $ X3: int  51 46 50 44 43 50 30 42 49 48
 $ X4: num  2.86 1.55 1.94 2.45 2.75 ...


I'd like to do:
obj <- lars(x=X, y=Y, type = "lasso")

Instead, what I have been doing is converting all data to continuous
but I think this is really bad!


Yeah, it is.

Check out the "Categorical Predictor Variables" section here for a way
to handle such predictor vars:
http://www.psychstat.missouristate.edu/multibook/mlt08m.html


Steve's citation is somewhat helpful, but not sufficient to take the  
next steps. You can find details regarding the mechanics of typical  
linear regression in R on the ?lm page where you find that the factor  
variables are typically handled by model.matrix. See below:


> model.matrix(~X1 + X2 + X3 + X4, X)
   (Intercept) X1B X1C X1D X2F X2G X2H X2I X3        X4
1            1   0   0   1   0   1   0   0 51 2.8640884
2            1   0   0   0   0   0   1   0 46 1.5462243
3            1   0   1   0   0   1   0   0 50 1.9430901
4            1   0   0   0   1   0   0   0 44 2.4504180
5            1   1   0   0   0   0   0   1 43 2.7535052
6            1   1   0   0   0   0   0   1 50 1.6200326
7            1   0   0   0   0   0   0   1 30 0.5750533
8            1   1   0   0   0   0   0   0 42 5.9224777
9            1   0   0   1   0   0   0   1 49 2.0401528
10           1   1   0   0   0   1   0   0 48 6.2995288
attr(,"assign")
 [1] 0 1 1 1 2 2 2 2 3 4
attr(,"contrasts")
attr(,"contrasts")$X1
[1] "contr.treatment"

attr(,"contrasts")$X2
[1] "contr.treatment"

The numeric variables are passed through, while the dummy variables
for factor columns are constructed (as treatment contrasts) and the
whole thing is returned in a neat package.
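
To connect this back to the lasso question, here is a minimal sketch of handing such a design matrix to lars(). It assumes the lars package is installed (the call is skipped otherwise), uses 50 made-up rows instead of 10 so the fit is less degenerate, and drops the intercept column because lars() fits its own intercept by default. Note that each dummy column is penalized separately, so a factor can be partly in and partly out of the model.

```r
set.seed(1)
n <- 50
Y <- rnorm(n)
X <- data.frame(X1 = factor(sample(LETTERS[1:4], n, replace = TRUE)),
                X2 = factor(sample(LETTERS[5:10], n, replace = TRUE)),
                X3 = sample(30:55, n, replace = TRUE),   # think age
                X4 = rchisq(n, df = 4))

# Expand factors into treatment-contrast dummies; drop "(Intercept)"
Xmat <- model.matrix(~ ., data = X)[, -1, drop = FALSE]

# Only attempt the fit when the lars package is available
if (requireNamespace("lars", quietly = TRUE)) {
    obj <- lars::lars(x = Xmat, y = Y, type = "lasso")
}
```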


--
David.


HTH,
-steve


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Tests for the need of cluster analysis

2011-05-02 Thread Ben Bolker

MARY A. WEISS  temple.edu> writes:

> 
> Hi,
> 
> I am currently using STATA in my analysis.  STATA has a cluster option but
> does not have any tests for whether cluster analysis is necessary or not for
> a dataset.  So I am trying to figure out whether R could be used to test
> whether I need to be doing cluster analysis or not.  If R does tests to
> determine whether cluster analysis is valid for my data, I will learn R and
> use it on my data.
> 
> My data are panel data consisting of 49 states and 25 years.  Currently, I
> am estimating models with fixed state and time effects.
> 
> Thanks for any help you can give me.
> 
> Cheers,
> 
> Mary

  You might want to forward this question to the r-sig-mixed-models
list.   I think you are fairly far off base in comparing 'prabclus'
(spatial clustering) to what Stata means by "clustered standard errors"
(e.g. ).
Cluster _analysis_ has to do with finding clusters in data; prabclus
uses spatial information to do cluster analysis; robust cluster
variances or standard errors have to do with adjusting variance/SE
to account for predetermined grouping variables ("clusters" in the
data, e.g. states).
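
To make the second meaning concrete, here is a minimal base-R sketch of a cluster-robust ("sandwich") variance for an lm fit. The toy data, the grouping variable g and all object names are assumptions for illustration, and no small-sample correction is applied, so the numbers would not match Stata's output exactly.

```r
# Toy panel: 10 groups ("states") with 8 observations each (made-up data)
set.seed(42)
g <- rep(1:10, each = 8)
x <- rnorm(80)
y <- 1 + 2 * x + rnorm(10)[g] + rnorm(80)   # shared shock within each group

fit <- lm(y ~ x)
X <- model.matrix(fit)
u <- residuals(fit)

bread <- solve(crossprod(X))                # (X'X)^-1
# Sum the score contributions x_i * u_i within each cluster
scores <- rowsum(X * u, group = g)          # one row per cluster
meat <- crossprod(scores)                   # sum over clusters of s_g s_g'
vc_cluster <- bread %*% meat %*% bread      # uncorrected sandwich estimator

se_classical <- sqrt(diag(vcov(fit)))
se_cluster <- sqrt(diag(vc_cluster))
cbind(se_classical, se_cluster)
```

Nothing here tests whether clustering is "needed"; it only shows what the adjustment computes once the grouping variable has been chosen.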

  I don't know offhand whether there are packages in R that implement
the "robust cluster variance" estimator; the geepack package (with its
geeglm function) and especially the "sandwich" package are definitely
worth looking at (they implement the equivalent of robust, but not
robust cluster [as far as I can tell], variance estimators), as well as
the Econometrics Task View and the book "R for Stata Users" by
Muenchen and Hilbe.

  A final philosophical note: I don't think you should be
testing _based on your data_ whether robust or robust cluster
variance estimators are more appropriate; there's a fairly
dangerous data snooping issue here.  Rather, you should try to
decide _a priori_ based on your data what's most appropriate.

  Ben Bolker

> 
> On Mon, May 2, 2011 at 1:02 PM, Tal Galili  gmail.com> wrote:
> 
> > Hi Mary,
> > Are you using R for your other analysis?
> > If so, What commands are you using for your analysis?
> >
> > p.s: please keep the rest of the R-help mailing list in the loop.
> >
> > Cheers,
> > Tal
> >
> >
> >
[snip]

> >
> >
> >
> >
> [snip] MARY A. WEISS  temple.edu> wrote:
> >
> >> Hi Tal,
> >>
> >> Thanks for your answer.  I am running models with two-way fixed effects
> >> and two-way fixed effects with a cluster option.  The results are very
> >> different.  I wanted to know if it is appropriate to cluster my data or
> >> not.  In looking through the R manual, 
> >> I thought that prabclus might help me
> >> answer the question.  Does prabclus include any tests that will tell me if
> >> cluster analysis is appropriate to use with my data?  That is, is cluster
> >> analysis valid for my data?
> >>
> >> Thanks in advance for any help you can give me.  I really appreciate it.
> >>
> >> Mary
> >>
[snip]
> >>
> >>> Hi Mary,
> >>> I'm not sure I understood your question.
> >>>
> >>> Are you using this package:
> >>> http://cran.r-project.org/web/packages/prabclus/index.html
> >>>  And asking
> >>> how to decide if to use it or not?
> >>>
> >>> Contact
> >>> Details:---
> >>> Contact me: Tal.Galili  gmail.com |  972-52-7275845
> >>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> >>> www.r-statistics.com (English)
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Sun, May 1, 2011 at 7:54 PM, mary weiss  temple.edu> wrote:
> >>>
> >>>> Does R have the capability to perform tests for the need of clustering
> >>>> analysis (e.g., in prabclus)?  I am using panel data with two-way fixed
> >>>> effects but am unsure about whether I should be using a cluster option
> >>>> as well to estimate my model.

> 

[snip]


[R] Help with coloring segments on a plot

2011-05-02 Thread Paul Davison

Hi. I need a very short piece of help regarding colouring segments plotted
on a graph.

When I am plotting segments for the graph, I am using "red" and "darkgreen"
for the values "1" and "2" respectively. Here's the relevant line of code in
R:

+ col = c("red", "darkgreen")[line.colour.value])

I just need to extend this to refer to a larger range of numbers from 1 to
10, to plot the segments in ten different colours. The values are just the
first ten integers: 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10
Each of the ten values will refer to a different colour just as "1" would
plot a segment in red and "2" would plot a segment in darkgreen.

The only other condition I need is that the colours be in hex format. Would
this be along the right lines? :

+ col = c("#FF", "#FF", "#FF", "#FF", "#FF", "#FF",
"#FF", "#FF", "#FF", "#FF",)[line.colour.value])

Or would I need to adjust the code in other places too?

I have copied the code I am using below. I have also copied below a small
excerpt of the simple data I am plotting - with the headers at the top.

Thank you so much for your help.

Paul Davison
University of Cambridge, UK




> data = read.csv("r.test.data.csv", header = TRUE)
> with(data, {
+ par(bg="#0B5FA5")
+ par(lwd=0.01)
+ plot(NA, NA,
+ xlim = range(start.x.co.ordinate, end.x.co.ordinate, 5),
+ ylim = range(start.y.co.ordinate, end.y.co.ordinate, 5),
+ type = "n", ann = FALSE, axes = FALSE)
+ segments(start.x.co.ordinate, start.y.co.ordinate,
+ end.x.co.ordinate, end.y.co.ordinate,
+ col = c("red", "darkgreen")[line.colour.value])
+ title(main = "10th April 1991",
+ xlab = "Pandora",
+ ylab = "Luna")
+ })
> quartz.save("sample4.png","png")


The values in the following data table for the column "line.colour.value"
are just 1s and 2s. Ideally I would have numbers of 1 through to 10 and each
one would plot a different coloured (using a hex value) segment.


start.x.co.ordinate  start.y.co.ordinate  end.x.co.ordinate  end.y.co.ordinate  line.colour.value
300                  300                  2289               20289              2
300                  300                  2692               20467              1
300                  300                  3010               20608              2
300                  300                  2727               19828              1
300                  300                  2606               20056              2
300                  300                  16244              21416              1
300                  300                  16154              21899              2
300                  300                  16941              21434              1
300                  300                  17356              20205              2
300                  300                  16928              21245              1
300                  300                  16011              21024              2
300                  300                  17323              20053              1
300                  300                  17312              20435              2
300                  300                  17175              21259              1
300                  300                  16851              21268              2


[R] INSERT OR UPDATE

2011-05-02 Thread Mikkel Grum

I'm trying to insert rows of a data.frame into a database table, or update 
where the key fields of a record already exist in the table. I've come up with 
a possible solution below, but would like to hear if anyone has a better 
solution.

# The problem demonstrated:
# Create a data.frame with test values
library(RODBC)
tbl <- data.frame(
key1 = rep(1:3, each = 2),
key2 = rep(LETTERS[1:2], 3),
somevalue = rnorm(6)
)

# Create table in database using the following SQL
CREATE TABLE tbl
(
  key1 integer NOT NULL,
  key2 character varying(1) NOT NULL,
  somevalue double precision,
  CONSTRAINT pktbl PRIMARY KEY (key1, key2)
)

# Continue in R
pg <- odbcConnect("testdb")
sqlSave(pg, tbl[1:2, ], append = TRUE, rownames = FALSE)
sqlSave(pg, tbl[3, ], append = TRUE, rownames = FALSE)

tbl[1, 3] <- 1
sqlUpdate(pg, tbl[1:4, ], index = c("key1", "key2")) # Fails

# Can replace the above sqlUpdate with:
sqlUpdate(pg, tbl[1:3, ], index = c("key1", "key2")) 
sqlSave(pg, tbl[4, ], append = TRUE, rownames = FALSE)

# Proposed solution:
tbl[1, 3] <- 0
tmp <- tbl
yes <- sqlQuery(pg, "SELECT key1, key2 FROM tbl", as.is = TRUE)
for (i in seq(along = yes$key1)) {
    sqlUpdate(pg, tmp[tmp$key1 == yes$key1[i] & tmp$key2 == yes$key2[i], ],
        "tbl", index = c("key1", "key2"))
    tmp <- tmp[!(tmp$key1 == yes$key1[i] & tmp$key2 == yes$key2[i]), ]
}
sqlSave(pg, tmp, "tbl", append = TRUE, rownames = FALSE)

This is fine for small tables, where the need for updates is frequent, and
there is no risk of anyone else doing the same thing at the same time. If the
table is big and updates are rare, it seems like quite an overhead for what
would essentially be inserts. Does anyone have a more rational way of doing this
with big data sets where updates are rare, e.g. only do it if sqlSave fails?

Is it possible to put a lock on the database while doing the updates and  
inserts to avoid problems with concurrency?

I'm working with PostgreSQL, but the example should be generic.

Thanks in advance
Mikkel


[R] Help converting a data.frame to ordered factors

2011-05-02 Thread Robert Cassidy

I have a 96x34 array of Likert scale data (96 cases, 34 items) of
ordered factors (strongly disagree, disagree, neutral, agree, strongly
agree) that are coded numerically (1 through 5).

I cannot seem to convert this array (in any class) into ordered vectors.

I have all the cases as vectors of ordered factors, but any which way
I reassemble those vectors loses the ordered factors and converts back
to numbers.

Can someone tell me how to either convert the data.frame into ordered
factors OR how to assemble the vectors (of ordered factors) into an
array that preserves the factors.

Many thanks in advance for any help.
Robert


-- 
Robert Cassidy, PhD
Department of Psychology
Concordia University
7141 Sherbrooke W.
Montreal (QC) H4B 1R6
tel: (514) 848-2424 x2244
fax: (514) 848-4523
office: PY-119.2


[R] Optimization - n dimension matrix

2011-05-02 Thread petrolmaniac

Dear all,

I am facing the following problem in optimization:

w = (d, o1, ..., op, m1, ..., mq) is a 1 + p + q vector

I want to determine: 

w = argmin (a - d(w))' A (a - d(w))

where a is a 1xK matrix, A is the covariance matrix of vector a, d(w) is a
1xK vector whose elements are functions of parameters d, o1 .. op, m1 ..
mq.

Is there some function to solve this problem easily? I know optim() and
ucminf() for one-dimensional optimization (I believe). Are there some tools
for such n-dimensional problem?

Kind regards,

C.
-- 

--
View this message in context: 
http://r.789695.n4.nabble.com/Optimization-n-dimension-matrix-tp3490772p3490772.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Copying to R a rectangular array from a Java class

2011-05-02 Thread Hurr

I am happy to report that the author and maintainer of rJava informed me
that the 2-dim array in java needs sapply and .jevalArray as follows:

> conn2Arr<- sapply(.jfield(rJavaTst,sig="[[D","con2Arr"),.jevalArray)
> conn2ArrRet <-
> sapply(.jcall(rJavaTst,returnSig="[[D","retCon2Arr"),.jevalArray)
> # I can't identify any complaints so far 
> print(conn2Arr)
     [,1] [,2] [,3]
[1,]  101  201  301
[2,]  102  202  302
[3,]  103  203  303
[4,]  104  204  304
> print(conn2ArrRet)
     [,1] [,2] [,3]
[1,]  101  201  301
[2,]  102  202  302
[3,]  103  203  303
[4,]  104  204  304
 I know there are a few out there interested since I see I got a few views.
But, I don't know the solution to the parameter-passing problem yet.


--
View this message in context: 
http://r.789695.n4.nabble.com/Copying-to-R-a-rectangular-array-from-a-Java-class-tp3486167p3490919.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] QQ plot for normality testing

2011-05-02 Thread Greg Snow

I would use the vis.test function along with vt.qqnorm (both in TeachingDemos 
package).  This will create several plots, one of which is your data, the rest 
are simulated normals with the same mean and standard deviation.  If you can 
tell which plot stands out (and it is your real data) then that suggests that 
the data is not normal.  If you cannot tell which plot is the real data then 
that suggests that your data is close enough to normal.
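
The same idea can be sketched in base R without TeachingDemos. This is only an approximation of the lineup approach, not the vis.test implementation; the data vector w below is made up (deliberately skewed), and the 3 x 3 layout is an arbitrary choice.

```r
set.seed(123)
w <- rexp(200)                      # stand-in for the water content data
n.panels <- 9
real.pos <- sample(n.panels, 1)     # panel that will hold the real data

op <- par(mfrow = c(3, 3), mar = c(2, 2, 1, 1))
for (i in seq_len(n.panels)) {
    # Real data in one panel, simulated normals with matching mean/sd elsewhere
    z <- if (i == real.pos) w else rnorm(length(w), mean(w), sd(w))
    qqnorm(z, main = "", xlab = "", ylab = "")
    qqline(z)
}
par(op)

real.pos   # look at the plots first, then check whether you picked this panel
```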

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Matevž Pavlic
> Sent: Saturday, April 30, 2011 11:28 AM
> To: r-help@r-project.org
> Subject: [R] QQ plot for normality testing
> 
> Hi all,
>
> I am trying to test whether the distribution of my samples is normal
> with a QQ plot.
>
> I have values of water content in clays for a few hundred samples.
> Is the code:
>
> qqnorm(w)  #w being water content
> qqline(w)
>
> sufficient?
>
> How do I know, when I get the plots, which distribution is normal and
> which is not?
>
> Thanks, m


[R] UNIX-like "cut" command in R

2011-05-02 Thread Mike Miller

The R "cut" command is entirely different from the UNIX "cut" command. 
The latter retains selected fields in a line of text.  I can do that kind 
of manipulation using sub() or gsub(), but it is tedious.  I assume there 
is an R function that will do this, but I don't know its name.  Can you 
tell me?


I'm also guessing that there is a web page somewhere that will tell me how
to do a lot of common GNU/UNIX/Linux "text util" command-line kinds of
things in R.  By that I mean by using R functions, not by making system
calls.  Does anyone know of such a web page?


Thanks in advance.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota


Re: [R] Help converting a data.frame to ordered factors

2011-05-02 Thread Phil Spector


Robert -
   It would be helpful to know what you've tried that didn't
work, but the data.frame() function is the usual way of combining
things like this:


a = factor(sample(1:5,100,replace=TRUE),ordered=TRUE)
b = factor(sample(1:5,100,replace=TRUE),ordered=TRUE)
ab = data.frame(a,b)
sapply(ab,class)
     a         b
[1,] "ordered" "ordered"
[2,] "factor"  "factor"

In particular cbind() and matrix() will not work properly for
what you're trying to do.

Of course, if you explained exactly how you're creating the 
96x34 array, there might be a better solution.
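
If the data already sit in a data frame of 1-5 codes, one way to convert every column in place is lapply() with factor(). This is a minimal sketch on made-up data; the object name likert and the label vector lab are assumptions for illustration.

```r
set.seed(1)
# Stand-in for the real data: 96 cases x 34 items coded 1 through 5
likert <- as.data.frame(matrix(sample(1:5, 96 * 34, replace = TRUE),
                               nrow = 96))

lab <- c("strongly disagree", "disagree", "neutral",
         "agree", "strongly agree")

# Assigning into likert[] keeps the data.frame and replaces each column
likert[] <- lapply(likert, factor, levels = 1:5, labels = lab,
                   ordered = TRUE)

str(likert[, 1:3])
```

Assigning to likert[] rather than likert is what keeps the result a data frame; going through cbind() or as.matrix() would drop the factor class again.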


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Mon, 2 May 2011, Robert Cassidy wrote:


I have a 96x34 array of Likert scale data (96 cases, 34 items) of
ordered factors (strongly disagree, disagree, neutral, agree, strongly
agree) that are coded numerically (1 through 5).

I cannot seem to convert this array (in any class) into ordered vectors.

I have all the cases as vectors of ordered factors, but any which way
I reassemble those vectors loses the ordered factors and converts back
to numbers.

Can someone tell me how to either convert the data.frame into ordered
factors OR how to assemble the vectors (of ordered factors) into an
array that preserves the factors.

Many thanks in advance for any help.
Robert


--
Robert Cassidy, PhD
Department of Psychology
Concordia University
7141 Sherbrooke W.
Montreal (QC) H4B 1R6
tel: (514) 848-2424 x2244
fax: (514) 848-4523
office: PY-119.2


Re: [R] INSERT OR UPDATE

2011-05-02 Thread Steven Kennedy

Rather than selecting all the keys, then having R loop through them, why not
have postgres do it for you with something like:

# go through each row of the incoming data
for (i in seq_len(nrow(tbl))) {
    # check whether the primary key already exists
    # (key2 is a character column, so its value is quoted in the SQL)
    q <- paste("SELECT key1, key2 FROM tbl WHERE key1 = ", tbl[i, 1],
               " AND key2 = '", tbl[i, 2], "'", sep = "")
    yes <- sqlQuery(pg, q, as.is = TRUE)
    if (nrow(yes) == 1) {
        # update the row if it exists
        sqlUpdate(pg, tbl[i, ], "tbl", index = c("key1", "key2"))
    } else {
        # add the row if it doesn't
        sqlSave(pg, tbl[i, ], "tbl", append = TRUE, rownames = FALSE)
    }
}

This should work fine for small or large tables (especially if you index the
large table that doesn't change much).
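
When fetching the existing keys is affordable, the split into "update" and "insert" rows can also be done in one vectorized step in R. The sketch below needs no database connection: the data frame existing is a made-up stand-in for what sqlQuery(pg, "SELECT key1, key2 FROM tbl", as.is = TRUE) might return, and the names to.update and to.insert are hypothetical.

```r
# Incoming rows, as in the original post
tbl <- data.frame(key1 = rep(1:3, each = 2),
                  key2 = rep(LETTERS[1:2], 3),
                  somevalue = rnorm(6),
                  stringsAsFactors = FALSE)

# Stand-in for the keys already present in the database table
existing <- data.frame(key1 = c(1, 1, 2),
                       key2 = c("A", "B", "A"),
                       stringsAsFactors = FALSE)

# One composite key per row, then a single membership test
incoming.key <- paste(tbl$key1, tbl$key2, sep = "|")
existing.key <- paste(existing$key1, existing$key2, sep = "|")
is.present <- incoming.key %in% existing.key

to.update <- tbl[is.present, ]    # rows that would go to sqlUpdate()
to.insert <- tbl[!is.present, ]   # rows that would go to sqlSave(append = TRUE)
```

This does not address concurrency: another client could still insert a key between the SELECT and the writes.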


Re: [R] setting options only inside functions

2011-05-02 Thread luke-tierney

On Fri, 29 Apr 2011, William Dunlap wrote:

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of
luke-tier...@uiowa.edu
Sent: Friday, April 29, 2011 9:35 AM
To: Jonathan Daily
Cc: r-help@r-project.org; Hadley Wickham; Barry Rowlingson
Subject: Re: [R] setting options only inside functions

The Python solution does not extend, at least not cleanly, to things
like dev on/ dev off or to Hadley's locale example.  In any case if I
am reading the Python source correctly on how they handle user
interrupts this solution has the same non-robustness to user interrupts
issue that Bill's initial solution had.

As a basis I believe what we need is a mechanism that handles a
setup, an action, and a cleanup, with setup and cleanup occurring with
interrupts disabled and the action with interrupts enabled. Scheme's
dynamic wind is similar, though I don't believe the scheme standard
addresses interrupts and we don't need to worry about continuations,
but some of the issues are similar.  Probably we would want two
flavors, one in which the action has to be a function that takes as a
single argument the result produced by the setup code, and one in
which the action can be an argument expression that is then evaluated
at the appropriate place by lazy evaluation.

This can be done at the R level except for the controlling of
interrupts (and possibly other asynchronous stuff)-- that would need a
new pair of primitives (suspendInterrupts/enableInterrupts or something
like that).  There is something in the Haskell literature on this that
I have looked at a while back -- probably time to have another look.

Luke,

 A similar problem is that if optionList contains an illegal
option then setting options(optionList) will commit changes
to .Options as it works its way down the optionList until it
hits the illegal option, when it throws an error.  Then the
following on.exit is never called (it wouldn't have the output
of options(optionList) to work on if it were called) and the
initial settings in optionList stick around forever.  E.g.,

 > withOptions <- function(optionList, expr) {
 + oldOpt <- options(optionList)
 + on.exit(options(oldOpt))
 + expr
 + }
 > getOption("height")
 NULL
 > getOption("width")
 [1] 80
 > withOptions(list(height=10, width=-2), 666)
 Error in options(optionList) :
   invalid 'width' parameter, allowed 10...1
 > getOption("height")
 [1] 10
 > getOption("width")
 [1] 80

I haven't checked to see if par() works in the same way - it
does in S+.

An ignoreInterrupts(expr) function would not help in that case.

It would be solving an orthogonal problem.

Making options() (and par()) atomic operations would help, but that
may be a lot of work.

But it would be the right thing to do for this purpose, either by
creating an atomic version just for use in this context or by having a
withOptions construct recursively work through each option.
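
A rough sketch of the second idea, setting the options one at a time and recording each old value only after its set has succeeded. This is an illustration under stated assumptions, not a proposal for base R: it handles the illegal-option case from Bill's example but does nothing about interrupts arriving between the individual calls.

```r
withOptions <- function(optionList, expr) {
    old <- list()
    # Registered before any option is touched, so it also runs when a
    # later options() call fails part-way through the list.
    on.exit(options(old), add = TRUE)
    for (nm in names(optionList)) {
        # options() returns the previous value; keep it only if the
        # individual set did not throw an error.
        old[nm] <- options(optionList[nm])
    }
    expr   # lazily evaluated here, with the new options in effect
}

h0 <- getOption("height")
w0 <- getOption("width")
res <- try(withOptions(list(height = 10, width = -2), 666), silent = TRUE)
identical(getOption("height"), h0)   # height is rolled back despite the error
```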

 options() might also warn but not change
.Options if there were an attempt to set an illegal option.

Seems more or less the same as making options() atomic.

Best,

luke

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

On Thu, 28 Apr 2011, Jonathan Daily wrote:

I would also love to see this implemented in R, as my current solution
to the issue of doing tons of open/close, dev/dev.off, etc. is to use
snippets in my IDE, and in the end I feel like it is a hack job. A
pythonic "with" function would also solve most of the situations where
I have had to use awkward try or tryCatch calls. I would be willing to
help with this project, even if it is just testing.

On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
 wrote:

but it's a little clumsy, because

with_connection(file("myfile.txt"), {do stuff...})

isn't very useful because you have no way to reference the connection
that you're using. Ruby's blocks have arguments which would require
big changes to R's syntax.  One option would be to use pronouns:

 Looking very much like python 'with' statements:

http://effbot.org/zone/python-with-statement.htm

 Implemented via the 'with' statement which can operate on anything
that has a __enter__ and an __exit__ method. Very neat.

Barry


--
Luke Tierney
Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  l...@stat.uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Andrew Robinson

Hi Mike,

try substr()
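
A small sketch of how that maps onto the two common uses of UNIX cut, using made-up strings: substr() covers the character-position case (cut -c), and strsplit() is one base-R way to pick delimited fields (cut -d -f).

```r
x <- c("alpha:beta:gamma", "one:two:three")

# Like `cut -c1-3`: characters 1 to 3 of every line
first3 <- substr(x, 1, 3)

# Like `cut -d: -f1,3`: split on the delimiter, keep fields 1 and 3
fields <- strsplit(x, ":", fixed = TRUE)
f13 <- sapply(fields, function(f) paste(f[c(1, 3)], collapse = ":"))

first3   # "alp" "one"
f13      # "alpha:gamma" "one:three"
```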

Cheers

Andrew

On Mon, May 02, 2011 at 03:53:58PM -0500, Mike Miller wrote:
> The R "cut" command is entirely different from the UNIX "cut" command. 
> The latter retains selected fields in a line of text.  I can do that kind 
> of manipulation using sub() or gsub(), but it is tedious.  I assume there 
> is an R function that will do this, but I don't know its name.  Can you 
> tell me?
> 
> I'm also guessing that there is a web page somewhere that will tell me how 
> to do a lot of common GNU/UNIX/Linux "text util" commmand-line kinds of 
> things in R.  By that I mean by using R functions, not by making system 
> calls.  Does anyone know of such a web page?
> 
> Thanks in advance.
> 
> Mike
> 
> --
> Michael B. Miller, Ph.D.
> Minnesota Center for Twin and Family Research
> Department of Psychology
> University of Minnesota
> 

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/


Re: [R] Optimization - n dimension matrix

2011-05-02 Thread Andrew Robinson

Hello,

optim() works for more than one dimension.  You might also find this
page helpful:

http://cran.r-project.org/web/views/Optimization.html
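
As a minimal sketch of what that looks like for a quadratic-form objective: everything specific below is an assumption for illustration, namely the target vector a, the identity weight matrix A, the three-parameter w and in particular the function d_fun(), which stands in for whatever d(w) really is.

```r
K <- 5
a <- c(1.0, 0.5, 0.25, 0.125, 0.0625)   # made-up target vector
A <- diag(K)                            # placeholder weight matrix

# Hypothetical d(w): K values determined by the parameter vector w
d_fun <- function(w) w[1] * w[2]^(0:(K - 1)) + w[3]

# Objective (a - d(w))' A (a - d(w)), returned as a plain number
obj <- function(w) {
    r <- a - d_fun(w)
    drop(t(r) %*% A %*% r)
}

start <- c(0.5, 0.8, 0.1)
fit <- optim(par = start, fn = obj, method = "BFGS")
fit$par     # estimated w
fit$value   # objective at the optimum
```

optim() takes the whole parameter vector as its first argument, so the length of par is what sets the dimension of the problem.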

Cheers

Andrew

On Mon, May 02, 2011 at 12:41:19PM -0700, petrolmaniac wrote:
> Dear all,
> 
> I am facing the following problem in optimization:
> 
> w = (d, o1, ..., op, m1, ..., mq) is a 1 + p + q vector
> 
> I want to determine: 
> 
> w = argmin (a - d(w))' A (a - d(w))
> 
> where a is a 1xK marix, A is the covariance matrix of vector a, d(w) is a
> 1xK vector which parameters are functions of parameters d, o1 .. op, m1 ..
> mq.
> 
> Is there some function to solve this problem easily? I know optim() and
> ucminf() for one-dimensional optimization (I believe). Are there some tools
> for such n-dimensional problem?
> 
> Kind regards,
> 
> C.
> -- 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Optimization-n-dimension-matrix-tp3490772p3490772.html
> Sent from the R help mailing list archive at Nabble.com.
> 

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/


Re: [R] INSERT OR UPDATE

2011-05-02 Thread Mikkel Grum

Thanks Steven. It obviously makes sense to loop on the much smaller dataset 
that is being added than the set of everything that might already be in the 
database. I've added your message in plain text, so that others can see it too. 
Mikkel 

From: Steven Kennedy 
Subject: Re: [R] INSERT OR UPDATE
To: "Mikkel Grum" 
Cc: "R Help" 
Date: Monday, May 2, 2011, 5:15 PM

Rather than selecting all the keys, then having R loop through them, why not 
have postgres do it for you with something like:

# go through each row of the incoming data
for (i in seq_len(nrow(tbl))) {
    # check whether the primary key already exists
    # (key2 is a character column, so its value is quoted in the SQL)
    q <- paste("SELECT key1, key2 FROM tbl WHERE key1 = ", tbl[i, 1],
               " AND key2 = '", tbl[i, 2], "'", sep = "")
    yes <- sqlQuery(pg, q, as.is = TRUE)
    if (nrow(yes) == 1) {
        # update the row if it exists
        sqlUpdate(pg, tbl[i, ], "tbl", index = c("key1", "key2"))
    } else {
        # add the row if it doesn't
        sqlSave(pg, tbl[i, ], "tbl", append = TRUE, rownames = FALSE)
    }
}

This should work fine for small or large tables (especially if you index the 
large table that doesn't change much).


[R] easy way to do a 2-D fit to an array of data?

2011-05-02 Thread Carl Witthoft


Hi,
I've got a matrix, Z, of values representing (as it happens) optical 
power at each pixel location.  Since I know in advance I've got a 
single,  convex peak, I would like to do a 2D parabolic fit of the form 
Z = poly((x+y),2) where x and y are the x,y coordinates of each pixel 
(or equivalently, the row, column numbers).
Is there an R function that lets me easily implement that? I've started 
down the path of something like


zvec <- as.vector(Z), and creating  applicable x,y vectors by something 
like  (where for the sake of argument Z is 128x128)


foo<-matrix(seq(1,128),128,128)

xvec <- as.vector(foo)
yvec <- as.vector(t(foo))

at which point I can feed zvec, xvec, yvec to lm() .

I'm  hopeful someone can point me to a much easier way to do the same 
thing.  Oh, and if there's a 2-D  splinefunction generator, that would 
work for me as well.
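
For what it is worth, the lm() route can be kept fairly short by using row() and col() for the pixel coordinates. The sketch below fits a full second-order surface to a made-up 32 x 32 image and solves for the peak of the fitted quadratic; the synthetic Z and the names d, H and peak are assumptions for illustration.

```r
set.seed(1)
n <- 32
# Made-up image: a single peak at row 12, column 20, plus a little noise
Z <- outer(1:n, 1:n, function(x, y) 100 - (x - 12)^2 - 2 * (y - 20)^2) +
     matrix(rnorm(n * n, sd = 0.5), n, n)

d <- data.frame(z = as.vector(Z),
                x = as.vector(row(Z)),   # row index of every pixel
                y = as.vector(col(Z)))   # column index of every pixel

# Full second-order surface: z ~ 1 + x + y + x^2 + x*y + y^2
fit <- lm(z ~ x + y + I(x^2) + I(x * y) + I(y^2), data = d)
b <- unname(coef(fit))   # order: intercept, x, y, x^2, x*y, y^2

# The fitted peak is where the gradient of the quadratic vanishes
H <- matrix(c(2 * b[4], b[5],
              b[5],     2 * b[6]), 2, 2)
peak <- solve(H, -c(b[2], b[3]))
peak   # estimated (row, column) of the maximum
```

Using poly(x, y, degree = 2) in the formula should give an equivalent fit in an orthogonal basis, but the raw terms make the peak location easier to read off the coefficients.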


thanks
Carl


Re: [R] Help with coloring segments on a plot

2011-05-02 Thread Andrew Robinson

Hi Paul,

not to seem naive, but have you actually tried the code below?  It
doesn't seem that you have, from your text.  I think that if you try
it and hack then ask concrete questions (e.g. can anyone explain why
the following simple, reproducible, commented code does not work) then
you'll have more luck.
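
For reference, a minimal sketch of the indexing idea on made-up data. The ten hex values and the stand-in data frame d are assumptions for illustration, not taken from the original file. Note that the vector inside c() must not end with a trailing comma.

```r
# Ten hex colours; these particular values are made up for illustration
seg.cols <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00",
              "#FFFF33", "#A65628", "#F781BF", "#999999", "#000000")

# Stand-in data: ten segments fanning out from (300, 300), coded 1 to 10
d <- data.frame(x0 = 300, y0 = 300,
                x1 = seq(2000, 17000, length.out = 10),
                y1 = seq(19800, 21900, length.out = 10),
                line.colour.value = 1:10)

# Indexing the colour vector by the integer codes gives one colour per segment
col.per.segment <- seg.cols[d$line.colour.value]

plot(NA, NA, xlim = range(d$x0, d$x1), ylim = range(d$y0, d$y1),
     type = "n", ann = FALSE, axes = FALSE)
segments(d$x0, d$y0, d$x1, d$y1, col = col.per.segment)
```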

Best wishes

Andrew

On Mon, May 02, 2011 at 02:26:16PM -0400, Paul Davison wrote:
> Hi. I need a very short piece of help regarding colouring segments plotted
> on a graph.
> 
> When I am plotting segments for the graph, I am using "red" and "darkgreen
> for the values "1" and "2" respectively. Heres the relevant line of code in
> R:
> 
> + col = c("red", "darkgreen")[line.colour.value])
> 
> I just need to extend this to refer to a larger range of numbers from 1 to
> 10, to plot the segments in ten different colours. The values are just the
> first ten integers: 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10
> Each of the ten values will refer to a different colour just as "1" would
> plot a segment in red and "2" would plot a segment in darkgreen.
> 
> The only other condition I need is that the colours be in hex format. Would
> this be along the right lines? :
> 
> + col = c("#FF", "#FF", "#FF", "#FF", "#FF", "#FF",
> "#FF", "#FF", "#FF", "#FF",)[line.colour.value])
> 
> Or would I need to adjust the code in other places too?
> 
> I have copied the code I am using below. I have also copied below a small
> excerpt of the simple data I am plotting - with the headers at the top.
> 
> Thank you so much for your help.
> 
> Paul Davison
> University of Cambridge, UK
> 
> 
> 
> 
> > data = read.csv("r.test.data.csv", header = TRUE)
> > with(data, {
> + par(bg="#0B5FA5")
> + par(lwd=0.01)
> + plot(NA, NA,
> + xlim = range(start.x.co.ordinate, end.x.co.ordinate, 5),
> + ylim = range(start.y.co.ordinate, end.y.co.ordinate, 5),
> + type = "n", ann = FALSE, axes = FALSE)
> + segments(start.x.co.ordinate, start.y.co.ordinate,
> + end.x.co.ordinate, end.y.co.ordinate,
> + col = c("red", "darkgreen")[line.colour.value])
> + title(main = "10th April 1991",
> + xlab = "Pandora",
> + ylab = "Luna")
> + })
> >> quartz.save("sample4.png","png")
> 
> 
> The values in the following data table for the column "line.colour.value"
> are just 1s and 2s. Ideally I would have numbers of 1 through to 10 and each
> one would plot a different coloured (using a hex value) segment.
> 
> 
> start.x.co.ordinate  start.y.co.ordinate  end.x.co.ordinate  end.y.co.ordinate  line.colour.value
> 300                  300                  2289               20289              2
> 300                  300                  2692               20467              1
> 300                  300                  3010               20608              2
> 300                  300                  2727               19828              1
> 300                  300                  2606               20056              2
> 300                  300                  16244              21416              1
> 300                  300                  16154              21899              2
> 300                  300                  16941              21434              1
> 300                  300                  17356              20205              2
> 300                  300                  16928              21245              1
> 300                  300                  16011              21024              2
> 300                  300                  17323              20053              1
> 300                  300                  17312              20435              2
> 300                  300                  17175              21259              1
> 300                  300                  16851              21268              2
> 

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/


Re: [R] Simulation Questions

2011-05-02 Thread Andrew Robinson

Hi Shane,

it sounds to me as though you have a fairly well-defined problem.  You
want to generate random numbers with a specific mean, variance, and
correlation with another random variable.  I would reverse-engineer
the functions for simple linear regression to get a result like

y = beta_0 + beta_1 * x + rnorm(n, 0, sigma)

and use that as the basis of generating random numbers.
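
A minimal sketch of that idea, assuming x is the binary lunch indicator
and the target mean, standard deviation, and correlation for itbs are
175, 15, and -0.21 (the values mentioned in the question).  The names
target_mean, target_sd, and target_rho are introduced here for
illustration.  The slope is chosen so that the correlation comes out at
the target, and the residual standard deviation absorbs the rest of the
variance.

```r
set.seed(42)
n <- 1000

# Binary predictor, generated once and left unchanged afterwards.
lunch <- rbinom(n, 1, 0.35)

# Targets for the simulated score (assumed values, taken from the question).
target_mean <- 175
target_sd   <- 15
target_rho  <- -0.21

# Slope: cor(y, x) = beta_1 * sd(x) / sd(y), solved for beta_1.
beta_1 <- target_rho * target_sd / sd(lunch)
# Intercept: makes the mean of y equal to the target mean.
beta_0 <- target_mean - beta_1 * mean(lunch)
# Residual sd: the part of the variance not explained by x.
sigma  <- target_sd * sqrt(1 - target_rho^2)

itbs <- beta_0 + beta_1 * lunch + rnorm(n, 0, sigma)

print(c(mean = mean(itbs), sd = sd(itbs), cor = cor(itbs, lunch)))
```

The sample mean, standard deviation, and correlation will only match the
targets approximately, since the noise term is random.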

Not sure how to interpret the second question ...
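
One possible reading of the second question (an assumption, since the
question is not fully specified): in the script quoted below, the matrix
passed to mvrnorm has ones on its diagonal, so it acts as a correlation
matrix and each simulated score ends up with a standard deviation of
about 1.  Scaling it by the intended standard deviations gives a
covariance matrix.  The sketch below uses base R's chol() so that it is
self-contained; MASS::mvrnorm(n, mu = means, Sigma = Sigma) should give
an equivalent draw.  The names R, sds, means, and Sigma are introduced
here for illustration.

```r
set.seed(1)
n <- 1000

# Correlation matrix from the quoted script.
R <- matrix(c(1, 0.84, 0.59, 0.84, 1, 0.56, 0.59, 0.56, 1), ncol = 3)

# Intended means and standard deviations for MAP, ITBS and CogAT.
means <- c(map = 200, itbs = 175, cogat = 100)
sds   <- c(map = 9,   itbs = 15,  cogat = 16)

# Covariance matrix: Sigma[i, j] = sd_i * sd_j * R[i, j].
Sigma <- outer(sds, sds) * R

# Draw standard normals, then impose the covariance via the Cholesky factor.
z   <- matrix(rnorm(n * 3), n, 3)
sim <- sweep(z %*% chol(Sigma), 2, means, "+")
colnames(sim) <- names(means)

print(round(apply(sim, 2, sd), 1))
print(round(cor(sim), 2))
```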

Cheers

Andrew

On Sun, May 01, 2011 at 12:33:41AM -0400, Shane Phillips wrote:
> I have the following script for generating a dataset.  It works like a champ 
> except for a couple of things.
> 
> 1.  I need the variables "itbs" and  "map" to be negatively correlated with 
> the binomial variable "lunch"  (around -0.21 and -0.24, respectively). The 
> binomial variable  "lunch" needs to remain unchanged.
> 2.  While my generated variables do come out with the desired means and 
> correlations, the distribution is very narrow and only represents a small 
> portion of the possible scores.  Can I force it to encompass a wider range of 
> scores, while maintaining my desired parameters and correlations?
> 
> Please help...
> 
> Shane
> 
> Script follows...
> 
> 
> 
> #Number the subjects
> subject=1:1000
> #Assign a treatment condition from a binomial distribution with a
> #probability of 0.13
> treat=rbinom(1*1000,1,.13)
> #Assign a lunch status condition from a binomial distribution with a
> #probability of 0.35
> lunch=rbinom(1*1000,1,.35)
> #Generate age in months from a random normal distribution with mean of 87
> #and sd of 2
> age=rnorm(1000,87,2)
> #invoke the MASS package
> require(MASS)
> #Establish the covariance matrix for MAP, ITBS and CogAT scores
> sigma <- matrix(c(1, 0.84, 0.59, 0.84, 1, 0.56, 0.59, 0.56, 1), ncol = 3)
> #Establish MAP as a random normal variable with mean of 200 and sd of 9
> map   <- rnorm(1000, 200, 9)
> #Establish ITBS as a random normal variable with mean of 175 and sd of 15
> itbs <- rnorm(1000, 175, 15)
> #Establish CogAT as a random normal variable with mean of 100 and sd of 16
> cogat<-rnorm(1000,100,16)
> #Create a dataframe of MAP, ITBS, and CogAT
> data <- data.frame(map, itbs, cogat)
> #Draw from the multivariate distribution defined by MAP, ITBS, and CogAT
> #means and the covariance matrix (colMeans gives the per-column means)
> sim <- mvrnorm(1000, mu=colMeans(data), sigma, empirical=FALSE)
> #Set growth at 0
> growth=0
> #Combine elements into a single dataset
> simtest=data.frame (subject=subject, treat=treat,lunch,
> age=round(age,0),round(sim,0),growth)
> #Set mean growth by treatment condition with treated subjects having a mean
> #growth of 1.5 and non-treated having a mean growth of 0.1
> simtest<-transform(simtest, growth=rnorm(1000,mean=ifelse(treat==0,0.1,1.5),sd=1))
> simtest
> cor (simtest)

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/


Re: [R] Lasso with Categorical Variables

2011-05-02 Thread Clemontina Alexander

Thanks for your response, but I guess I didn't make my question clear.
I am already familiar with the concept of dummy variables and
regression in R. My question is, can the "lars" package (or some other
lasso algorithm) handle factors? I did use dummy variables in my
original data, but lars (lasso) only shrank the coefficients of some
of the levels of one factor to 0. Is this the correct thing to do?
Because intuitively it seems like I would want to shrink the whole
factor coefficient to 0. If this is correct, what is the
interpretation? For example, for X1, if lasso drops the coefficient
for levels A and B, but not C and D, does this mean that X1 should be
included in the model?
Thanks.



On Mon, May 2, 2011 at 2:47 PM, David Winsemius  wrote:
>
> On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:
>
>> Hi,
>>
>> On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander 
>> wrote:
>>>
>>> Hi! This is my first time posting. I've read the general rules and
>>> guidelines, but please bear with me if I make some fatal error in
>>> posting. Anyway, I have a continuous response and 29 predictors made
>>> up of continuous variables and nominal and ordinal categorical
>>> variables. I'd like to do lasso on these, but I get an error. The way
>>> I am using "lars" doesn't allow for the factors. Is there a special
>>> option or some other method in order to do lasso with cat. variables?
>>>
>>> Here is an example (considering ordinal variables as just nominal):
>>>
>>> set.seed(1)
>>> Y <- rnorm(10,0,1)
>>> X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
>>> X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
>>> X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
>>> X4 <- rchisq(10, df=4, ncp=0)
>>> X <- data.frame(X1,X2,X3,X4)
>>>
 str(X)
>>>
>>> 'data.frame':   10 obs. of  4 variables:
>>>  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
>>>  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
>>>  $ X3: int  51 46 50 44 43 50 30 42 49 48
>>>  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...
>>>
>>>
>>> I'd like to do:
>>> obj <- lars(x=X, y=Y, type = "lasso")
>>>
>>> Instead, what I have been doing is converting all data to continuous
>>> but I think this is really bad!
>>
>> Yeah, it is.
>>
>> Check out the "Categorical Predictor Variables" section here for a way
>> to handle such predictor vars:
>> http://www.psychstat.missouristate.edu/multibook/mlt08m.html
>
> Steve's citation is somewhat helpful, but not sufficient to take the next
> steps. You can find details regarding the mechanics of typical linear
> regression in R on the ?lm page where you find that the factor variables are
> typically handled by model.matrix. See below:
>
>> model.matrix(~X1 + X2 + X3 + X4, X)
>   (Intercept) X1B X1C X1D X2F X2G X2H X2I X3        X4
> 1            1   0   0   1   0   1   0   0 51 2.8640884
> 2            1   0   0   0   0   0   1   0 46 1.5462243
> 3            1   0   1   0   0   1   0   0 50 1.9430901
> 4            1   0   0   0   1   0   0   0 44 2.4504180
> 5            1   1   0   0   0   0   0   1 43 2.7535052
> 6            1   1   0   0   0   0   0   1 50 1.6200326
> 7            1   0   0   0   0   0   0   1 30 0.5750533
> 8            1   1   0   0   0   0   0   0 42 5.9224777
> 9            1   0   0   1   0   0   0   1 49 2.0401528
> 10           1   1   0   0   0   1   0   0 48 6.2995288
> attr(,"assign")
>  [1] 0 1 1 1 2 2 2 2 3 4
> attr(,"contrasts")
> attr(,"contrasts")$X1
> [1] "contr.treatment"
>
> attr(,"contrasts")$X2
> [1] "contr.treatment"
>
> The numeric variables are passed through, while the dummy variables for
> factor columns are constructed (as treatment contrasts) and the whole thing
> is returned in a neat package.
>
> --
> David.
>>
>> HTH,
>> -steve
>>
> --
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>


Re: [R] Impulse response analysis within package vars

2011-05-02 Thread jessezeng

Hi, I have a similar question: 

ir <- irf(varsumm, impulse=c("prod", "rea", "rpo"), n.ahead=20, runs=500,
ci=0.95)

will calculate the orthogonalized impulse responses from "prod", "rea", and
"rpo", i.e. a (1, 1, 1)' vector. What do I need to do to make the impulse
(-1, 1, 1)', i.e. I want the first shock to be negative 1 unit? 

Thanks, 

Jesse

--
View this message in context: 
http://r.789695.n4.nabble.com/Impulse-response-analysis-within-package-vars-tp841596p3491284.html
Sent from the R help mailing list archive at Nabble.com.


[R] install rdcomclient source

2011-05-02 Thread Richard Wang

Hi,

I'd like to ask an installation question.  I want to install a source
package through the following command,
R CMD INSTALL RDCOMClient

but get  Error: unexpected symbol in "r cmd"

Please let me know if I am missing anything.  I have the utils package loaded.

Thanks,
Richard


[R] List of Data Frames

2011-05-02 Thread Mike Smith

I'm trying to create a list of Data Frames.  I have 17 data frames that I need 
to move through in a loop, but if I simply make a list of them, then they do 
not stay data frames, and I can't sort through them.  I tried to create an 
array, but the data frames can have anywhere from 14-16 rows, and I couldn't 
find a way to make a variable size array.  If you have any ideas, I would 
greatly appreciate any help, as I'm trying to learn R, and decided to apply it 
to a project that I have been working on.  My goal is splitting a sports season 
into games per week, and then do statistics on each week, but have an average 
running up to that point in the season.  Thus the list would be indexed by 
weeks, and then there's a data frame of the game and all relevant statistics. 

Thank You,
Mike Smith



Re: [R] List of Data Frames

2011-05-02 Thread Jerome Asselin

On Mon, 2011-05-02 at 16:42 -0700, Mike Smith wrote:
> I'm trying to create a list of Data Frames.  I have 17 data frames
> that I need to move through in a loop, but if I simply make a list of
> them, then they do not stay data frames, and I can't sort through
> them.  I tried to create an array, but the data frames can have
> anywhere from 14-16 rows, and I couldn't find a way to make a variable
> size array.  If you have any ideas, I would greatly appreciate any
> help, as I'm trying to learn R, and decided to apply it to a project
> that I have been working on.  My goal is splitting a sports season
> into games per week, and then do statistics on each week, but have an
> average running up to that point in the season.  Thus the list would
> be indexed by weeks, and then there's a data frame of the game and
> all relevant statistics. 

My understanding is that you want to have one data frame per week. I
question whether it is necessary to split these data frames. I would
design a single data frame with one column to identify the week and then
work on subsets of that data frame to calculate the weekly stats or
cumulative indexes as you wish.

Jerome
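
A minimal sketch of the single-data-frame design, assuming one row per
game with a week column and a numeric statistic.  The data and the
column names week and points are made up for illustration.

```r
# One row per game; 'week' identifies the week, 'points' is any game statistic.
games <- data.frame(
  week   = c(1, 1, 2, 2, 2, 3),
  points = c(10, 7, 3, 14, 21, 5)
)

# Statistic for each week on its own.
weekly_mean <- tapply(games$points, games$week, mean)

# Average over all games played up to and including each week.
weeks <- sort(unique(games$week))
running_mean <- sapply(weeks, function(w) mean(games$points[games$week <= w]))

print(weekly_mean)
print(running_mean)
```

Weeks with different numbers of games need no special handling here,
because each subset is just a set of rows.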


Re: [R] List of Data Frames

2011-05-02 Thread Rolf Turner


On 03/05/11 11:42, Mike Smith wrote:

> I'm trying to create a list of Data Frames.  I have 17 data frames that I need 
> to move through in a loop, but if I simply make a list of them, then they do 
> not stay data frames,

That is simply not true. Just ***how*** did you ``make a list of them''???

> and I can't sort through them.  I tried to create an array, but the data frames 
> can have anywhere from 14-16 rows, and I couldn't find a way to make a variable 
> size array.  If you have any ideas, I would greatly appreciate any help, as I'm 
> trying to learn R, and decided to apply it to a project that I have been 
> working on.  My goal is splitting a sports season into games per week, and then 
> do statistics on each week, but have an average running up to that point in the 
> season.  Thus the list would be indexed by weeks, and then there's a data frame 
> of the game and all relevant statistics.


You can make a list of data frames with syntax something like

L <- list(DF1, DF2, DF3)   # and so on, up to DF17

where DF1, ... DF17 are your data frames.

If your data frames all have the same number of columns (and the
column names are the same) you might want to rbind() them together
into a single data frame.  If the data frames correspond to ``week'' then
you might want to add a ``week'' column to each data frame  before
doing the rbind(); the value of week would be constant over the rows
of each of your original data frames, but would of course vary over rows
in your ``big'' data frame (the object returned by rbind).
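
A minimal sketch of both suggestions, using two small made-up data
frames in place of the seventeen real ones.  The column names team and
points are assumptions for illustration.

```r
# Two weekly data frames with different numbers of rows.
DF1 <- data.frame(team = c("A", "B"),      points = c(10, 7))
DF2 <- data.frame(team = c("A", "B", "C"), points = c(3, 14, 21))

# A list keeps each element as a data frame.
L <- list(DF1, DF2)
print(sapply(L, is.data.frame))

# Add a constant 'week' column to each element, then stack them.
for (i in seq_along(L)) {
  L[[i]]$week <- i
}
big <- do.call(rbind, L)
print(big)
```

Elements are extracted with double brackets, e.g. L[[2]]; single
brackets, as in L[2], return a list of length one rather than the data
frame itself.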

It's very hard to recommend anything specific, since your question was
so vague.  I suggest that you read the Posting Guide.

cheers,

Rolf Turner


Re: [R] Lasso with Categorical Variables

2011-05-02 Thread David Winsemius



On May 2, 2011, at 2:22 PM, Clemontina Alexander wrote:


> Thanks for your response, but I guess I didn't make my question clear.
> I am already familiar with the concept of dummy variables and
> regression in R. My question is, can the "lars" package (or some other
> lasso algorithm) handle factors?


The error message when you do so and the help page make it fairly  
clear that it does not.



> I did use dummy variables in my
> original data, but lars (lasso) only shrank the coefficients of some
> of the levels of one factor to 0.


You certainly gave no evidence that would lead anyone to think that  
you did so. Please try to understand that just converting factors to  
'numeric' is not the same as creating dummy variables.


--
David.
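
A small illustration of that distinction, using a made-up factor.
Converting to numeric yields the internal level codes, which impose an
arbitrary ordering and spacing; model.matrix yields one 0/1 column per
non-baseline level.

```r
X1 <- factor(c("D", "A", "C", "A", "B"))

# Level codes: A = 1, B = 2, C = 3, D = 4. Not dummy variables.
codes <- as.numeric(X1)
print(codes)

# Treatment-contrast dummy variables; the intercept column is dropped.
dummies <- model.matrix(~ X1)[, -1]
print(dummies)
```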

> Is this the correct thing to do?
> Because intuitively it seems like I would want to shrink the whole
> factor coefficient to 0. If this is correct, what is the
> interpretation? For example, for X1, if lasso drops the coefficient
> for levels A and B, but not C and D, does this mean that X1 should be
> included in the model?
> Thanks.




David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Lasso with Categorical Variables

2011-05-02 Thread Andrew Robinson

On Mon, May 02, 2011 at 05:22:57PM -0400, Clemontina Alexander wrote:
> Thanks for your response, but I guess I didn't make my question clear.
> I am already familiar with the concept of dummy variables and
> regression in R. My question is, can the "lars" package (or some other
> lasso algorithm) handle factors? I did use dummy variables in my
> original data, but lars (lasso) only shrank the coefficients of some
> of the levels of one factor to 0. Is this the correct thing to do?

It's because, so far as the linear model is concerned, factors are a
convenience to help us handle the dummy variables. So, yes, it's the
correct thing to do.  It sounds to me as though you are after a
shrinkage device that will treat the factor as a whole. 

> Because intuitively it seems like I would want to shrink the whole
> factor coefficient to 0. If this is correct, what is the
> interpretation? For example, for X1, if lasso drops the coefficient
> for levels A and B, but not C and D, does this mean that X1 should be
> included in the model?

It means that X1 should be recoded to be C, D, and the rest. 

Cheers

Andrew
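
A minimal sketch of such a recoding, using a made-up factor and the
hypothetical label "other" for the merged levels.

```r
X1 <- factor(c("A", "B", "C", "D", "A", "C"))

# Keep C and D as they are; merge every remaining level into "other".
X1_recoded <- factor(ifelse(X1 %in% c("C", "D"), as.character(X1), "other"))

print(levels(X1_recoded))
print(table(X1_recoded))
```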

> Thanks.
> 
> 
> 
> On Mon, May 2, 2011 at 2:47 PM, David Winsemius  
> wrote:
> >
> > On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:
> >
> >> Hi,
> >>
> >> On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander 
> >> wrote:
> >>>
> >>> Hi! This is my first time posting. I've read the general rules and
> >>> guidelines, but please bear with me if I make some fatal error in
> >>> posting. Anyway, I have a continuous response and 29 predictors made
> >>> up of continuous variables and nominal and ordinal categorical
> >>> variables. I'd like to do lasso on these, but I get an error. The way
> >>> I am using "lars" doesn't allow for the factors. Is there a special
> >>> option or some other method in order to do lasso with cat. variables?
> >>>
> >>> Here is and example (considering ordinal variables as just nominal):
> >>>
> >>> set.seed(1)
> >>> Y <- rnorm(10,0,1)
> >>> X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
> >>> X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
> >>> X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
> >>> X4 <- rchisq(10, df=4, ncp=0)
> >>> X <- data.frame(X1,X2,X3,X4)
> >>>
>  str(X)
> >>>
> >>> 'data.frame':   10 obs. of  4 variables:
> >>>  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
> >>>  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
> >>>  $ X3: int  51 46 50 44 43 50 30 42 49 48
> >>>  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...
> >>>
> >>>
> >>> I'd like to do:
> >>> obj <- lars(x=X, y=Y, type = "lasso")
> >>>
> >>> Instead, what I have been doing is converting all data to continuous
> >>> but I think this is really bad!
> >>
> >> Yeah, it is.
> >>
> >> Check out the "Categorical Predictor Variables" section here for a way
> >> to handle such predictor vars:
> >> http://www.psychstat.missouristate.edu/multibook/mlt08m.html
> >
> > Steve's citation is somewhat helpful, but not sufficient to take the next
> > steps. You can find details regarding the mechanics of typical linear
> > regression in R on the ?lm page where you find that the factor variables are
> > typically handled by model.matrix. See below:
> >
> >> model.matrix(~X1 + X2 + X3 + X4, X)
> >   (Intercept) X1B X1C X1D X2F X2G X2H X2I X3        X4
> > 1            1   0   0   1   0   1   0   0 51 2.8640884
> > 2            1   0   0   0   0   0   1   0 46 1.5462243
> > 3            1   0   1   0   0   1   0   0 50 1.9430901
> > 4            1   0   0   0   1   0   0   0 44 2.4504180
> > 5            1   1   0   0   0   0   0   1 43 2.7535052
> > 6            1   1   0   0   0   0   0   1 50 1.6200326
> > 7            1   0   0   0   0   0   0   1 30 0.5750533
> > 8            1   1   0   0   0   0   0   0 42 5.9224777
> > 9            1   0   0   1   0   0   0   1 49 2.0401528
> > 10           1   1   0   0   0   1   0   0 48 6.2995288
> > attr(,"assign")
> >  [1] 0 1 1 1 2 2 2 2 3 4
> > attr(,"contrasts")
> > attr(,"contrasts")$X1
> > [1] "contr.treatment"
> >
> > attr(,"contrasts")$X2
> > [1] "contr.treatment"
> >
> > The numeric variables are passed through, while the dummy variables for
> > factor columns are constructed (as treatment contrasts) and the whole thing
> > it returned in a neat package.
> >
> > --
> > David.
> >>
> >> HTH,
> >> -steve
> >>
> > --
> > David Winsemius, MD
> > Heritage Laboratories
> > West Hartford, CT
> >
> >
> 

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au

Re: [R] easy way to do a 2-D fit to an array of data?

2011-05-02 Thread Ravi Varadhan

You may want to consider spatial::surf.ls  

Or, a simplistic approach: use `lm' to fit a model such as

E[Z | x, y] = a + b(x - x0)^2 + c(y - y0)^2 

where (x0, y0) is the location of maximum.

Ravi.
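
A minimal sketch of the `lm' approach on synthetic data.  Expanding the
squares shows that the model is linear in 1, x, x^2, y and y^2, so it
can be fitted directly, and the peak location recovered from the
coefficients as x0 = -b_x / (2 * b_xx), and likewise for y0.  The grid
size and the true peak at (12, 20) are assumptions made for
illustration.

```r
n <- 32

# Row and column index of every pixel, in the same order as as.vector(Z).
x <- as.vector(row(matrix(0, n, n)))
y <- as.vector(col(matrix(0, n, n)))

# Synthetic noise-free peak at (12, 20); real data would replace this.
Z <- matrix(100 - 0.5 * (x - 12)^2 - 0.3 * (y - 20)^2, n, n)
z <- as.vector(Z)

# Quadratic surface without a cross term.
fit <- lm(z ~ x + I(x^2) + y + I(y^2))
b <- coef(fit)

# Vertex of each parabola gives the location of the maximum.
x0 <- unname(-b["x"] / (2 * b["I(x^2)"]))
y0 <- unname(-b["y"] / (2 * b["I(y^2)"]))
print(c(x0 = x0, y0 = y0))
```

Using row() and col() avoids building the coordinate vectors by hand.
Adding an x:y term to the formula would allow a peak whose axes are
rotated relative to the pixel grid, though the formulas for x0 and y0
above would then no longer apply as written.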

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Carl Witthoft [c...@witthoft.com]
Sent: Monday, May 02, 2011 7:14 PM
To: r-help@r-project.org
Subject: [R] easy way to do a 2-D fit to an array of data?

Hi,
I've got a matrix, Z, of values representing (as it happens) optical
power at each pixel location.  Since I know in advance I've got a
single,  convex peak, I would like to do a 2D parabolic fit of the form
Z = poly((x+y),2) where x and y are the x,y coordinates of each pixel
(or equivalently, the row, column numbers).
Is there an R function that lets me easily implement that? I've started
down the path of something like

zvec <- as.vector(Z), and creating  applicable x,y vectors by something
like  (where for the sake of argument Z is 128x128)

foo<-matrix(seq(1,128),128,128)

xvec <- as.vector(foo)
yvec <- as.vector(t(foo))

at which point I can feed zvec, xvec, yvec to lm() .

I'm  hopeful someone can point me to a much easier way to do the same
thing.  Oh, and if there's a 2-D  splinefunction generator, that would
work for me as well.

thanks
Carl


Re: [R] Lasso with Categorical Variables

2011-05-02 Thread Breheny, Patrick

Clemontina,

It sounds like you are looking for the group lasso (Yuan & Lin, 2006).  There 
are two packages on CRAN that have implemented this idea: grpreg and grplasso.  
The syntax of each is similar to lars (in particular requiring a numeric design 
matrix as produced by model.matrix), except you must also supply a vector that 
describes the grouping (e.g., c(1,1,1,2,2,3,3,...)).  The members of each group 
will then either be all zero or all nonzero (i.e., the variable selection 
occurs at the group level).
___
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky
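
A minimal sketch of preparing the two inputs described above, a numeric
design matrix and a grouping vector, using only base R.  The data are
made up.  The "assign" attribute of the model.matrix result records
which term each column came from, so it can serve as the grouping
vector once the intercept is dropped.  The commented-out grpreg call is
an assumption about how recent versions of that package are invoked and
should be checked against its help page.

```r
# Made-up predictors: two factors and two numeric variables.
X <- data.frame(
  X1 = factor(c("A", "B", "C", "D", "A", "B", "C", "D", "A", "B")),
  X2 = factor(c("E", "F", "G", "E", "F", "G", "E", "F", "G", "E")),
  X3 = c(51, 46, 50, 44, 43, 50, 30, 42, 49, 48),
  X4 = c(2.86, 1.55, 1.94, 2.45, 2.75, 1.62, 0.58, 5.92, 2.04, 6.30)
)

mm <- model.matrix(~ X1 + X2 + X3 + X4, X)

# Drop the intercept column and its entry in the term index.
Xmat  <- mm[, -1, drop = FALSE]
group <- attr(mm, "assign")[-1]

print(colnames(Xmat))
print(group)

# Assumed usage, not run here:
# fit <- grpreg::grpreg(Xmat, Y, group = group, penalty = "grLasso")
```

With this grouping, the three X1 dummy columns share group 1 and the two
X2 dummy columns share group 2, so each factor would enter or leave the
model as a unit.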


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Clemontina Alexander [ckale...@ncsu.edu]
Sent: Monday, May 02, 2011 5:22 PM
To: David Winsemius
Cc: r-help@r-project.org
Subject: Re: [R] Lasso with Categorical Variables

Thanks for your response, but I guess I didn't make my question clear.
I am already familiar with the concept of dummy variables and
regression in R. My question is, can the "lars" package (or some other
lasso algorithm) handle factors? I did use dummy variables in my
original data, but lars (lasso) only shrank the coefficients of some
of the levels of one factor to 0. Is this the correct thing to do?
Because intuitively it seems like I would want to shrink the whole
factor coefficient to 0. If this is correct, what is the
interpretation? For example, for X1, if lasso drops the coefficient
for levels A and B, but not C and D, does this mean that X1 should be
included in the model?
Thanks.



On Mon, May 2, 2011 at 2:47 PM, David Winsemius  wrote:
>
> On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:
>
>> Hi,
>>
>> On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander 
>> wrote:
>>>
>>> Hi! This is my first time posting. I've read the general rules and
>>> guidelines, but please bear with me if I make some fatal error in
>>> posting. Anyway, I have a continuous response and 29 predictors made
>>> up of continuous variables and nominal and ordinal categorical
>>> variables. I'd like to do lasso on these, but I get an error. The way
>>> I am using "lars" doesn't allow for the factors. Is there a special
>>> option or some other method in order to do lasso with cat. variables?
>>>
>>> Here is and example (considering ordinal variables as just nominal):
>>>
>>> set.seed(1)
>>> Y <- rnorm(10,0,1)
>>> X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
>>> X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
>>> X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
>>> X4 <- rchisq(10, df=4, ncp=0)
>>> X <- data.frame(X1,X2,X3,X4)
>>>
 str(X)
>>>
>>> 'data.frame':   10 obs. of  4 variables:
>>>  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
>>>  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
>>>  $ X3: int  51 46 50 44 43 50 30 42 49 48
>>>  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...
>>>
>>>
>>> I'd like to do:
>>> obj <- lars(x=X, y=Y, type = "lasso")
>>>
>>> Instead, what I have been doing is converting all data to continuous
>>> but I think this is really bad!
>>
>> Yeah, it is.
>>
>> Check out the "Categorical Predictor Variables" section here for a way
>> to handle such predictor vars:
>> http://www.psychstat.missouristate.edu/multibook/mlt08m.html
>
> Steve's citation is somewhat helpful, but not sufficient to take the next
> steps. You can find details regarding the mechanics of typical linear
> regression in R on the ?lm page where you find that the factor variables are
> typically handled by model.matrix. See below:
>
>> model.matrix(~X1 + X2 + X3 + X4, X)
>   (Intercept) X1B X1C X1D X2F X2G X2H X2I X3        X4
> 1            1   0   0   1   0   1   0   0 51 2.8640884
> 2            1   0   0   0   0   0   1   0 46 1.5462243
> 3            1   0   1   0   0   1   0   0 50 1.9430901
> 4            1   0   0   0   1   0   0   0 44 2.4504180
> 5            1   1   0   0   0   0   0   1 43 2.7535052
> 6            1   1   0   0   0   0   0   1 50 1.6200326
> 7            1   0   0   0   0   0   0   1 30 0.5750533
> 8            1   1   0   0   0   0   0   0 42 5.9224777
> 9            1   0   0   1   0   0   0   1 49 2.0401528
> 10           1   1   0   0   0   1   0   0 48 6.2995288
> attr(,"assign")
>  [1] 0 1 1 1 2 2 2 2 3 4
> attr(,"contrasts")
> attr(,"contrasts")$X1
> [1] "contr.treatment"
>
> attr(,"contrasts")$X2
> [1] "contr.treatment"
>
> The numeric variables are passed through, while the dummy variables for
> factor columns are constructed (as treatment contrasts) and the whole thing
> it returned in a neat package.
>
> --
> David.
>>
>> HTH,
>> -steve
>>
> --
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Mike Miller


On Tue, 3 May 2011, Andrew Robinson wrote:


try substr()


OK.  Apparently, it allows things like this...


substr("abcdef",2,4)

[1] "bcd"

...which is like this:

echo "abcdef" | cut -c2-4

But that doesn't use a delimiter, it only does character-based cutting,
and it is very limited.  With "cut -c" I can do stuff like this:


echo "abcdefghijklmnopqrstuvwxyz" | cut -c-3,12-15,17-

abclmnoqrstuvwxyz

It extracts characters 1 to 3, 12 to 15 and 17 to the end.
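
A base-R sketch of the same character selection is given here for comparison. The helper name cut_chars is hypothetical (it is not an existing R function); it splits each string into single characters and keeps the requested positions, silently ignoring positions past the end of a string.

```r
# Hypothetical helper emulating "cut -c" for a vector of strings.
# idx: integer vector of character positions to keep.
cut_chars <- function(x, idx) {
  vapply(strsplit(x, ""),                      # split into single characters
         function(ch) paste(ch[idx[idx <= length(ch)]], collapse = ""),
         character(1))
}

cut_chars("abcdefghijklmnopqrstuvwxyz", c(1:3, 12:15, 17:26))
# [1] "abclmnoqrstuvwxyz"
```

Unlike "cut -c17-", the open-ended range has to be written out explicitly (17:26 here), so the index vector must be long enough for the longest string.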

That was a great tip, though, because it led me to strsplit, which can do 
what I want, however somewhat awkwardly:



delim <- " "
y <- "a b c d e f g h i j k l m n o p q r s t u v w x y z"
paste(unlist(strsplit(y, delim))[c(1:3,12:15,17:26)], collapse=delim)

[1] "a b c l m n o q r s t u v w x y z"

That gives me what I want, but it is still a little awkward.  I guess I 
don't quite get what I'm doing with lists.  I'm not clear on how this 
would work with a vector of strings.


Mike


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Gabor Grothendieck

On Mon, May 2, 2011 at 10:32 PM, Mike Miller  wrote:
> On Tue, 3 May 2011, Andrew Robinson wrote:
>
>> try substr()
>
> OK.  Apparently, it allows things like this...
>
>> substr("abcdef",2,4)
>
> [1] "bcd"
>
> ...which is like this:
>
> echo "abcdef" | cut -c2-4
>
> But that doesn't use a delimiter, it only does character-based cutting, and
> it is very limited.  With "cut -c" I can do stuff like this:
>
> echo "abcdefghijklmnopqrstuvwxyz" | cut -c-3,12-15,17-
>
> abclmnoqrstuvwxyz
>
> It extracts characters 1 to 3, 12 to 15 and 17 to the end.
>
> That was a great tip, though, because it led me to strsplit, which can do
> what I want, however somewhat awkwardly:
>
>> delim <- " "
>> y <- "a b c d e f g h i j k l m n o p q r s t u v w x y z"
>> paste(unlist(strsplit(y, delim))[c(1:3,12:15,17:26)], collapse=delim)
>
> [1] "a b c l m n o q r s t u v w x y z"
>
> That gives me what I want, but it is still a little awkward.  I guess I
> don't quite get what I'm doing with lists.  I'm not clear on how this would
> work with a vector of strings.
>

Try this:

> read.fwf(textConnection("abcdefghijklmnopqrstuvwxyz"),
+          widths = c(3, 8, 4, 1, 10), colClasses = c(NA, "NULL"))
   V1   V3         V5
1 abc lmno qrstuvwxyz



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] easy way to do a 2-D fit to an array of data?

2011-05-02 Thread Ravi Varadhan

Hi Carl,

Here is another slightly different (not necessarily the easiest) approach that 
uses a profiling technique. An advantage is that you get the maximum location 
directly.

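n <- 20
x <- sort(rnorm(n))
y <- sort(rnorm(n))
xy <- expand.grid(x, y)

# true surface: a single peak at (0.5, -0.5)
zfn <- function(x) 0.5 - 2.2 * (x[1] - 0.5)^2 - 0.9 * (x[2] + 0.5)^2

# evaluate on the grid (unlist() turns each data frame row into a plain
# numeric vector), then add noise
z <- rep(NA_real_, length.out = n^2)
for (i in 1:nrow(xy)) z[i] <- zfn(unlist(xy[i, ]))
z <- z + rnorm(n^2, sd = 0.3)

# profile objective: negative R^2 of the quadratic fit centred at par
obj <- function(par, x, y, z) {
  -summary(lm(z ~ I((x - par[1])^2) + I((y - par[2])^2)))$r.squared
}

library(dfoptim)  # install first if necessary
ans <- nmk(par = colMeans(xy), fn = obj, x = xy[, 1], y = xy[, 2], z = z)
ans$par # location of the maximum

summary(lm(z ~ I((xy[, 1] - ans$par[1])^2) + I((xy[, 2] - ans$par[2])^2)))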


Ravi.

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Carl Witthoft [c...@witthoft.com]
Sent: Monday, May 02, 2011 7:14 PM
To: r-help@r-project.org
Subject: [R] easy way to do a 2-D fit to an array of data?

Hi,
I've got a matrix, Z, of values representing (as it happens) optical
power at each pixel location.  Since I know in advance I've got a
single,  convex peak, I would like to do a 2D parabolic fit of the form
Z = poly((x+y),2) where x and y are the x,y coordinates of each pixel
(or equivalently, the row, column numbers).
Is there an R function that lets me easily implement that? I've started
down the path of something like

zvec <- as.vector(Z), and creating  applicable x,y vectors by something
like  (where for the sake of argument Z is 128x128)

foo<-matrix(seq(1,128),128,128)

xvec <- as.vector(foo)
yvec <- as.vector(t(foo))

at which point I can feed zvec, xvec, yvec to lm() .
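
One possible shortcut for this step, sketched under assumptions: row() and col() return the row and column index of every cell of a matrix, so the coordinate vectors need no helper matrix. The 32x32 matrix Z below is simulated purely for illustration (peak placed at row 20, column 12), and the full quadratic model with a cross term is one choice of "2D parabolic fit", not necessarily the one intended in the question.

```r
set.seed(42)
nr <- 32; nc <- 32

# Simulated stand-in for the measured power matrix: one peak plus noise.
Z <- outer(1:nr, 1:nc, function(r, k) 5 - 0.02 * (r - 20)^2 - 0.03 * (k - 12)^2) +
  matrix(rnorm(nr * nc, sd = 0.1), nr, nc)

# Long format: one record per pixel, with its row (x) and column (y) index.
dat <- data.frame(z = as.vector(Z),
                  x = as.vector(row(Z)),
                  y = as.vector(col(Z)))

# Full quadratic surface in x and y.
fit <- lm(z ~ x + y + I(x^2) + I(y^2) + I(x * y), data = dat)

# Coefficients in model order: intercept, x, y, x^2, y^2, x*y.
b <- unname(coef(fit))

# The peak is where the gradient vanishes: H %*% peak = -g.
H <- matrix(c(2 * b[4], b[6],
              b[6],     2 * b[5]), nrow = 2, byrow = TRUE)
peak <- solve(H, -c(b[2], b[3]))
peak  # estimated (row, column) location of the maximum
```

Because the fitted surface is quadratic, the peak location follows from the coefficients by solving a 2x2 linear system, with no numerical optimisation.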

I'm  hopeful someone can point me to a much easier way to do the same
thing.  Oh, and if there's a 2-D  splinefunction generator, that would
work for me as well.

thanks
Carl


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Mike Miller


On Mon, 2 May 2011, Gabor Grothendieck wrote:


> On Mon, May 2, 2011 at 10:32 PM, Mike Miller  wrote:
>> On Tue, 3 May 2011, Andrew Robinson wrote:
>>
>>> try substr()
>>
>> OK.  Apparently, it allows things like this...
>>
>>> substr("abcdef",2,4)
>>
>> [1] "bcd"
>>
>> ...which is like this:
>>
>> echo "abcdef" | cut -c2-4
>>
>> But that doesn't use a delimiter, it only does character-based cutting, and
>> it is very limited.  With "cut -c" I can do stuff like this:
>>
>> echo "abcdefghijklmnopqrstuvwxyz" | cut -c-3,12-15,17-
>>
>> abclmnoqrstuvwxyz
>>
>> It extracts characters 1 to 3, 12 to 15 and 17 to the end.
>>
>> That was a great tip, though, because it led me to strsplit, which can do
>> what I want, however somewhat awkwardly:
>>
>>> delim <- " "
>>> y <- "a b c d e f g h i j k l m n o p q r s t u v w x y z"
>>> paste(unlist(strsplit(y, delim))[c(1:3,12:15,17:26)], collapse=delim)
>>
>> [1] "a b c l m n o q r s t u v w x y z"
>>
>> That gives me what I want, but it is still a little awkward.  I guess I
>> don't quite get what I'm doing with lists.  I'm not clear on how this would
>> work with a vector of strings.
>
> Try this:
>
>> read.fwf(textConnection("abcdefghijklmnopqrstuvwxyz"),
>>          widths = c(3, 8, 4, 1, 10), colClasses = c(NA, "NULL"))
>    V1   V3         V5
> 1 abc lmno qrstuvwxyz


That gives me a few more functions to study.  Of course the new code 
(using read.fwf() and textConnection()) is not doing what was requested 
and it requires some work to compute the widths from the given numbers 
(c(1:3, 12:15, 17:26) has to be converted to c(3, 8, 4, 1, 10)).
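
The conversion from position ranges to widths can be automated. The sketch below uses a hypothetical helper, ranges_to_widths (not part of R), and relies on the documented read.fwf() behaviour that a negative width skips that many characters. It assumes the ranges are sorted and do not overlap.

```r
# Hypothetical helper: turn "cut -c" style ranges into read.fwf() widths.
# start, end: first and last character position of each field to keep.
ranges_to_widths <- function(start, end) {
  stopifnot(length(start) == length(end), all(start <= end))
  widths <- numeric(0)
  pos <- 1                                   # next unread character position
  for (i in seq_along(start)) {
    gap <- start[i] - pos                    # characters to skip before the field
    if (gap > 0) widths <- c(widths, -gap)   # negative width = skipped columns
    widths <- c(widths, end[i] - start[i] + 1)
    pos <- end[i] + 1
  }
  widths
}

w <- ranges_to_widths(start = c(1, 12, 17), end = c(3, 15, 26))
w
# [1]  3 -8  4 -1 10

df <- read.fwf(textConnection("abcdefghijklmnopqrstuvwxyz"), widths = w)
df
```

With negative widths the skipped stretches never become columns, so the colClasses = c(NA, "NULL") trick is not needed in this variant.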


Mike

[R] adaptIntegrate - how to pass additional parameters to the integrand

2011-05-02 Thread HC

Hello,

I am trying to use the adaptIntegrate function, but I need to pass a few
additional parameters to the integrand. However, the function does not seem
to offer a way to pass such additional parameters.

Am I missing something, or is this a known limitation? If it is, is there a
good alternative?

Many thanks for your time.
HC
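
One general workaround, independent of whether adaptIntegrate itself forwards extra arguments, is to bind the parameters in a closure and hand the resulting one-argument function to the integrator. This is a sketch only: make_integrand and the parameters a and b are hypothetical names, and the adaptIntegrate call assumes the cubature package with its f, lowerLimit and upperLimit arguments.

```r
# Hypothetical integrand factory: a and b are the "additional parameters".
make_integrand <- function(a, b) {
  # The returned function sees a and b through its enclosing environment.
  function(x) a * exp(-b * sum(x^2))   # x is the vector of integration variables
}

f <- make_integrand(a = 2, b = 0.5)
f(c(0, 0))
# [1] 2

# Integrate over the unit square only if cubature is installed.
if (requireNamespace("cubature", quietly = TRUE)) {
  res <- cubature::adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(1, 1))
  res$integral
}
```

An anonymous wrapper such as function(x) g(x, a = 2, b = 0.5) achieves the same thing for an existing multi-argument function g.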



Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread P Ehlers


Mike Miller wrote:

> On Mon, 2 May 2011, Gabor Grothendieck wrote:
>
>> On Mon, May 2, 2011 at 10:32 PM, Mike Miller  wrote:
>>> On Tue, 3 May 2011, Andrew Robinson wrote:
>>>
>>>> try substr()
>>>
>>> OK.  Apparently, it allows things like this...
>>>
>>>> substr("abcdef",2,4)
>>>
>>> [1] "bcd"
>>>
>>> ...which is like this:
>>>
>>> echo "abcdef" | cut -c2-4
>>>
>>> But that doesn't use a delimiter, it only does character-based cutting, and
>>> it is very limited.  With "cut -c" I can do stuff like this:
>>>
>>> echo "abcdefghijklmnopqrstuvwxyz" | cut -c-3,12-15,17-
>>>
>>> abclmnoqrstuvwxyz
>>>
>>> It extracts characters 1 to 3, 12 to 15 and 17 to the end.
>>>
>>> That was a great tip, though, because it led me to strsplit, which can do
>>> what I want, however somewhat awkwardly:
>>>
>>>> delim <- " "
>>>> y <- "a b c d e f g h i j k l m n o p q r s t u v w x y z"
>>>> paste(unlist(strsplit(y, delim))[c(1:3,12:15,17:26)], collapse=delim)
>>>
>>> [1] "a b c l m n o q r s t u v w x y z"
>>>
>>> That gives me what I want, but it is still a little awkward.  I guess I
>>> don't quite get what I'm doing with lists.  I'm not clear on how this would
>>> work with a vector of strings.
>>
>> Try this:
>>
>>> read.fwf(textConnection("abcdefghijklmnopqrstuvwxyz"),
>>>          widths = c(3, 8, 4, 1, 10), colClasses = c(NA, "NULL"))
>>    V1   V3         V5
>> 1 abc lmno qrstuvwxyz
>
> That gives me a few more functions to study.  Of course the new code
> (using read.fwf() and textConnection()) is not doing what was requested
> and it requires some work to compute the widths from the given numbers
> (c(1:3, 12:15, 17:26) has to be converted to c(3, 8, 4, 1, 10)).
>
> Mike


Use str_sub() in the stringr package:

require(stringr)  # install first if necessary
s <- "abcdefghijklmnopqrstuvwxyz"

str_sub(s, c(1,12,17), c(3,15,-1))
#[1] "abc"        "lmno"       "qrstuvwxyz"


Peter Ehlers


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Mike Miller


On Mon, 2 May 2011, P Ehlers wrote:


> Use str_sub() in the stringr package:
>
> require(stringr)  # install first if necessary
> s <- "abcdefghijklmnopqrstuvwxyz"
>
> str_sub(s, c(1,12,17), c(3,15,-1))
> #[1] "abc"        "lmno"       "qrstuvwxyz"



Thanks.  That's very close to what I'm looking for, but it seems to 
correspond to "cut -c", not to "cut -f".  Can it work with delimiters or 
only with character counts?


Mike


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Christian Schulz




> On Mon, 2 May 2011, P Ehlers wrote:
>
>> Use str_sub() in the stringr package:
>>
>> require(stringr)  # install first if necessary
>> s <- "abcdefghijklmnopqrstuvwxyz"
>>
>> str_sub(s, c(1,12,17), c(3,15,-1))
>> #[1] "abc"        "lmno"       "qrstuvwxyz"
>
> Thanks.  That's very close to what I'm looking for, but it seems to
> correspond to "cut -c", not to "cut -f".  Can it work with delimiters
> or only with character counts?
>
> Mike



x <- "this is a string"
unlist(strsplit(x," "))[c(1,4)]

HTH Christian




[R] Comparison of two penalized spline fits in mixed model framework

2011-05-02 Thread Anna-Leena Orsama


Hello!

I have run into a problem with nlme. My intention is to fit a
penalized spline model in the mixed-model framework. I want to
compare the smooth curves of two groups, but for some reason I
get NaN in the output.


Here is the R code I have used.

library(nlme)

# Z.overall holds the truncated-line (spline) basis
knots <- seq(1.5,6.5,by=1)
Z <- outer(weekday,knots,"-")
Z.overall <- Z*(Z>0)

fit1 <- lme(weight~group*weekday, random=list(group=pdIdent(~Z.overall-1)))
summary(fit1)
Linear mixed-effects model fit by REML
 Data: NULL
       AIC      BIC    logLik
  4379.838 4411.895 -2183.919

Random effects:
 Formula: ~Z.overall - 1 | group
 Structure: Multiple of an Identity
        Z.overall1 Z.overall2 Z.overall3 Z.overall4 Z.overall5 Z.overall6  Residual
StdDev:  0.1310617  0.1310617  0.1310617  0.1310617  0.1310617  0.1310617 0.9833188

Fixed effects: normMAweight2 ~ group * weekday
                    Value Std.Error   DF    t-value p-value
(Intercept)    0.23589909 0.4431604 1545  0.5323109  0.5946
group          0.14167744 0.2629714    0  0.5387562     NaN
weekday       -0.08601980 0.2770228 1545 -0.3105152  0.7562
group:weekday -0.02575775 0.1686222 1545 -0.1527542  0.8786
 Correlation:
              (Intr) group  weekdy
group         -0.957
weekday       -0.911  0.883
group:weekday  0.861 -0.915 -0.953

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.87210364 -0.64218983 -0.03839294  0.60534552  4.30615258

Number of Observations: 1549
Number of Groups: 2
Warning message:
In pt(q, df, lower.tail, log.p) : NaNs produced
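
The warning and the NaN can be reproduced directly from the printed table. The sketch below recomputes two of the p-values from the t-values and DF shown above; the group row is listed with 0 denominator degrees of freedom, and pt() is not defined for df = 0. Why the DF is 0 is an interpretation, not something the output states: plausibly it is because group is used both as a fixed effect and as the grouping factor, which has only two levels.

```r
# Two-sided p-value from a t statistic and its denominator DF.
p_from_t <- function(t_value, df) 2 * pt(-abs(t_value), df = df)

# weekday row: DF = 1545, reproduces the printed p-value.
p_weekday <- p_from_t(-0.3105152, 1545)
round(p_weekday, 4)
# [1] 0.7562

# group row: DF = 0, so pt() returns NaN (with the warning seen above).
p_group <- suppressWarnings(p_from_t(0.5387562, 0))
p_group
# [1] NaN
```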




Kind regards,
Anna-Leena Orsama


Re: [R] problem with Sweave and pdflatex

2011-05-02 Thread Frank Lehmann

That might fix the problem... I will test it.

Thanks!

Frank

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Monday, 2 May 2011 17:52
To: Frank Lehmann
Cc: r-help@r-project.org
Subject: Re: [R] problem with Sweave and pdflatex

Have you checked the permissions in the working directory? Is there a
blank in your path? (LaTeX does not like spaces in the path.)

Uwe Ligges


On 02.05.2011 14:51, Frank Lehmann wrote:
> Hello,
>
> when I plot figures with Sweave, I get the message "pdflatex: Permission
> denied". This problem only occurs while working on the local system. When I
> copy the *.rnw file to my AFS drive, there is no problem at all.
>
> Here is a small example:
>
> \documentclass{scrartcl}
> \usepackage[OT1]{fontenc}
> \usepackage[latin1]{inputenc}
> \usepackage[ngerman]{babel}
> \usepackage[pdftex]{graphicx}
> \usepackage{Sweave}
>
> \begin{document}
>
> \setkeys{Gin}{width=\textwidth}
> \begin{figure}[htbp]
> <>=
> x<- 1:10
> plot(x)
> @
> \caption{Eine einfache Grafik}
> \end{figure}
>
> \end{document}
>
> Does anyone have an idea how to solve that problem? I'm working with
> Windows XP.
>
> Thanks!
>
> Frank
>

Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Mike Miller


On Tue, 3 May 2011, Christian Schulz wrote:


>> On Mon, 2 May 2011, P Ehlers wrote:
>>
>>> Use str_sub() in the stringr package:
>>>
>>> require(stringr)  # install first if necessary
>>> s <- "abcdefghijklmnopqrstuvwxyz"
>>>
>>> str_sub(s, c(1,12,17), c(3,15,-1))
>>> #[1] "abc"        "lmno"       "qrstuvwxyz"
>>
>> Thanks.  That's very close to what I'm looking for, but it seems to
>> correspond to "cut -c", not to "cut -f".  Can it work with delimiters or
>> only with character counts?
>>
>> Mike
>
> x <- "this is a string"
> unlist(strsplit(x," "))[c(1,4)]



Thanks.  I did figure that one out a couple of messages back, but to get
it to behave like "cut -d' ' -f1,4", I had to add a paste command to
reassemble the parts:


paste(unlist(strsplit(x," "))[c(1,4)], collapse=" ")

Then I wasn't sure if I could do this to every element of a vector of 
strings without looping -- I have to think not.
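
For what it is worth, strsplit() already accepts a whole character vector and returns one list element per string, so the explicit loop can be replaced by vapply() over that list (which still iterates internally). The helper name cut_fields below is hypothetical; it is a sketch of "cut -d' ' -f1,4" semantics, and details such as the handling of empty or trailing fields may differ from the real cut.

```r
# Hypothetical helper emulating "cut -d<delim> -f<fields>" on a vector of strings.
cut_fields <- function(x, fields, delim = " ") {
  parts <- strsplit(x, delim, fixed = TRUE)    # one character vector per string
  vapply(parts,
         function(p) paste(p[fields[fields <= length(p)]], collapse = delim),
         character(1))
}

x <- c("this is a string", "and here is another one")
cut_fields(x, c(1, 4))
# [1] "this string" "and another"
```

Field numbers beyond the last field of a string are dropped rather than returned as NA.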


Mike


[R] Bootstrapping confidence intervals

2011-05-02 Thread khosoda

Hi,
Sorry for the repeated question.

I performed logistic regression using lrm and penalized it with the pentrace
function. I wanted confidence intervals for the odds ratio of each
predictor, and summary(MyModel) gave them. I also tried to get
bootstrap standard errors for the logistic regression; the bootcov
function in the rms package provided them. I then found that the confidence
intervals based on bootstrapping (bootcov) were narrower than the CIs
based on the usual variance-covariance matrix, as shown below.

My data have no cluster structure.
I am wondering which confidence interval is better. I guess the
bootstrap one, but is that right?

I would appreciate any help.

> summary(MyModel, stenosis=c(70, 80), x1=c(1.5, 2.0), x2=c(1.5, 2.0))
 Effects  Response : outcome

 Factor              Low  High Diff. Effect S.E. Lower 0.95 Upper 0.95
 stenosis            70.0 80   10.0  -0.11  0.24 -0.59       0.37
  Odds Ratio         70.0 80   10.0   0.90    NA  0.56       1.45
 x1                   1.5  2    0.5   1.21  0.37  0.49       1.94
  Odds Ratio          1.5  2    0.5   3.36    NA  1.63       6.95
 x2                   1.5  2    0.5  -0.29  0.19 -0.65       0.08
  Odds Ratio          1.5  2    0.5   0.75    NA  0.52       1.08
 ClinicalScore        3.0  5    2.0   0.61  0.38 -0.14       1.36
  Odds Ratio          3.0  5    2.0   1.84    NA  0.87       3.89
 procedure - CA:CE    2.0  1     NA   0.83  0.46 -0.07       1.72
  Odds Ratio          2.0  1     NA   2.28    NA  0.93       5.59

> summary(MyModel.boot, stenosis=c(70, 80), x1=c(1.5, 2.0), x2=c(1.5, 2.0))
 Effects  Response : outcome

 Factor              Low  High Diff. Effect S.E. Lower 0.95 Upper 0.95
 stenosis            70.0 80   10.0  -0.11  0.28 -0.65       0.43
  Odds Ratio         70.0 80   10.0   0.90    NA  0.52       1.54
 x1                   1.5  2    0.5   1.21  0.29  0.65       1.77
  Odds Ratio          1.5  2    0.5   3.36    NA  1.92       5.89
 x2                   1.5  2    0.5  -0.29  0.16 -0.59       0.02
  Odds Ratio          1.5  2    0.5   0.75    NA  0.55       1.02
 ClinicalScore        3.0  5    2.0   0.61  0.45 -0.28       1.50
  Odds Ratio          3.0  5    2.0   1.84    NA  0.76       4.47
 procedure - CAS:CEA  2.0  1     NA   0.83  0.38  0.07       1.58
  Odds Ratio          2.0  1     NA   2.28    NA  1.08       4.85
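
Both tables look like Wald-type intervals, i.e. Effect plus or minus a normal quantile times S.E., with the odds-ratio limits obtained by exponentiating. That reading is an assumption checked here only against the rounded numbers printed above (x1 row); wald_ci is a hypothetical helper, not an rms function.

```r
# Hypothetical helper: Wald interval on the log-odds scale.
wald_ci <- function(effect, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)             # about 1.96 for a 95% interval
  c(lower = effect - z * se, upper = effect + z * se)
}

ci_model <- wald_ci(1.21, 0.37)   # x1, model-based S.E.
ci_boot  <- wald_ci(1.21, 0.29)   # x1, bootstrap S.E.

round(ci_model, 2)        # compare with 0.49 and 1.94 in the first table
round(ci_boot, 2)         # compare with 0.65 and 1.77 in the second table
round(exp(ci_model), 2)   # odds-ratio scale, compare with 1.63 and 6.95
round(exp(ci_boot), 2)    # odds-ratio scale, compare with 1.92 and 5.89
```

Small discrepancies in the last digit are expected because the inputs are themselves rounded. Under this reading the two sets of intervals differ only through the standard errors, which are smaller after bootcov for x1, x2 and procedure but larger for stenosis and ClinicalScore.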


Re: [R] Lasso with Categorical Variables

2011-05-02 Thread Nick Sabbe

For performance reasons, I advise using the following function instead of
model.matrix:

factorsToDummyVariables<-function(dfr, betweenColAndLevel="")
{
    nc<-dim(dfr)[2]
    firstRow<-dfr[1,]
    coln<-colnames(dfr)
    retval<-do.call(cbind, lapply(seq(nc), function(ci){
        if(is.factor(firstRow[,ci]))
        {
            lvls<-levels(firstRow[,ci])[-1]
            stretchedcols<-sapply(lvls, function(lvl){
                rv<-dfr[,ci]==lvl
                mode(rv)<-"integer"
                return(rv)
            })
            if(!is.matrix(stretchedcols))
                stretchedcols<-matrix(stretchedcols, nrow=1)
            colnames(stretchedcols)<-paste(coln[ci],
                lvls, sep=betweenColAndLevel)
            return(stretchedcols)
        }
        else
        {
            curcol<-matrix(dfr[,ci], ncol=1)
            colnames(curcol)<-coln[ci]
            return(curcol)
        }
    }))
    rownames(retval)<-rownames(dfr)
    return(retval)
}


Just for comparison: here is my old version of the same function, using
model.matrix:

factorsToDummyVariables.old<-function(dfrPredictors,
    form=paste("~",paste(colnames(dfrPredictors), collapse="+"), sep=""))
{
    #note: this function seems to operate quite slowly!
    #Because it is used often, it may be worth improving its speed
    dfrTmp<-model.frame(dfrPredictors, na.action=na.pass)
    frm<-as.formula(form)
    mm<-model.matrix(frm, data=dfrTmp)
    retval<-as.matrix(mm)[,-1]

    return(retval)
}

In a testcase with a reasonably big dataset, I compared the speeds:

#system.time(tmp.fd.convds.full.man<-manualFactorsToDummyVariables(ds))
##    user  system elapsed
##    9.44    0.00    9.48
#system.time(tmp.fd.convds.full<-factorsToDummyVariables.old(ds))
##    user  system elapsed
##   15.49    0.00   15.64
#system.time(invisible(factorsToDummyVariables (ds[10,])))
##    user  system elapsed
##    0.36    0.00    0.36
#system.time(invisible(factorsToDummyVariables.old (ds[10,])))
##    user  system elapsed
##    2.18    0.00    2.20
#system.time(invisible(factorsToDummyVariables (ds[20:30,])))
##    user  system elapsed
##    0.34    0.00    0.38
#system.time(invisible(factorsToDummyVariables.old (ds[20:30,])))
##    user  system elapsed
##    2.11    0.00    2.15

If you have to do this quite often, the difference surely adds up...
More improvements may be possible.
This function only works if you don't include interactions, though.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of David Winsemius
Sent: maandag 2 mei 2011 20:48
To: Steve Lianoglou
Cc: r-help@r-project.org
Subject: Re: [R] Lasso with Categorical Variables


On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:

> Hi,
>
> On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander  > wrote:
>> Hi! This is my first time posting. I've read the general rules and
>> guidelines, but please bear with me if I make some fatal error in
>> posting. Anyway, I have a continuous response and 29 predictors made
>> up of continuous variables and nominal and ordinal categorical
>> variables. I'd like to do lasso on these, but I get an error. The way
>> I am using "lars" doesn't allow for the factors. Is there a special
>> option or some other method in order to do lasso with cat. variables?
>>
>> Here is an example (considering ordinal variables as just nominal):
>>
>> set.seed(1)
>> Y <- rnorm(10,0,1)
>> X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
>> X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
>> X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
>> X4 <- rchisq(10, df=4, ncp=0)
>> X <- data.frame(X1,X2,X3,X4)
>>
>>> str(X)
>> 'data.frame':   10 obs. of  4 variables:
>>  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
>>  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
>>  $ X3: int  51 46 50 44 43 50 30 42 49 48
>>  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...
>>
>>
>> I'd like to do:
>> obj <- lars(x=X, y=Y, type = "lasso")
>>
>> Instead, what I have been doing is converting all data to continuous
>> but I think this is really bad!
>
> Yeah, it is.
>
> Check out the "Categorical Predictor Variables" section here for a way
> to handle such predictor vars:
> http://www.psychstat.missouristate.edu/multibook/mlt08m.html

Steve's citation is somewhat helpful, but not sufficient to take the  
next steps.
