[R] Efficient swapping

2017-07-06 Thread Gang Chen
Suppose that we have the following dataframe:

set.seed(1)
(tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace =
TRUE), R2 = sample(LETTERS[1:5], 10, replace = TRUE)))

x R1 R2
1   1  B  B
2   2  B  A
3   3  C  D
4   4  E  B
5   5  B  D
6   6  E  C
7   7  E  D
8   8  D  E
9   9  D  B
10 10  A  D

I want to do the following: if the difference between the level index
of factor R1 and that of factor R2 is an odd number, the levels of the
two factors need to be switched between them, which can be performed
through the following code:

for(ii in 1:dim(tmp)[1]) {
   kk <- which(levels(tmp$R2) %in% tmp[ii,'R2'], arr.ind = TRUE) -
which(levels(tmp$R1) %in% tmp[ii,'R1'], arr.ind = TRUE)
   if(kk%%2!=0) { # swap the their levels between the two factors
  qq <- tmp[ii,]$R1
  tmp[ii,]$R1 <- tmp[ii,]$R2
  tmp[ii,]$R2 <- qq
  }
}

More concise and efficient way to do this?

Thanks,
Gang

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Efficient swapping

2017-07-06 Thread Gang Chen
Thanks a lot, Ista! I really appreciate it.

How about a slightly different case as the following:

set.seed(1)
(tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace =
 TRUE), R2 = sample(LETTERS[2:6], 10, replace = TRUE)))

   x R1 R2
   1  C  B
   2  B  B
   3  C  E
   4  E  C
   5  E  B
   6  D  E
   7  E  E
   8  D  F
   9  C  D
  10  A  E

Notice that the factor levels between the two factors, R1 and R2,
slide by one level; that is, factor R1 does not have level F while
factor R2 does not have level A. I want to swap the factor levels
based on the combined levels of the two factors as shown below:

tl <- unique(c(levels(tmp$R1), levels(tmp$R2)))
for(ii in 1:dim(tmp)[1]) {
   kk <- which(tl %in% tmp[ii,'R2'], arr.ind = TRUE) - which(tl %in%
  tmp[ii,'R1'], arr.ind = TRUE)
   if(kk%%2!=0) { # swap the their levels between the two factors
  qq <- tmp[ii,]$R1
  tmp[ii,]$R1 <- tmp[ii,]$R2
  tmp[ii,]$R2 <- qq
  }
}

How to go about this case? Thanks!


On Thu, Jul 6, 2017 at 5:16 PM, Ista Zahn  wrote:
> How about
>
> foo <- with(list(r1 = tmp$R1,
>  r2 = tmp$R2,
>  swapme = (as.numeric(tmp$R1) - as.numeric(tmp$R2)) %% 2 != 
> 0),
> {
> tmp[swapme, "R1"] <- r2[swapme]
> tmp[swapme, "R2"] <- r1[swapme]
> tmp
> })
>
> Best,
> Ista
>
> On Thu, Jul 6, 2017 at 4:06 PM, Gang Chen  wrote:
>> Suppose that we have the following dataframe:
>>
>> set.seed(1)
>> (tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace =
>> TRUE), R2 = sample(LETTERS[1:5], 10, replace = TRUE)))
>>
>> x R1 R2
>> 1   1  B  B
>> 2   2  B  A
>> 3   3  C  D
>> 4   4  E  B
>> 5   5  B  D
>> 6   6  E  C
>> 7   7  E  D
>> 8   8  D  E
>> 9   9  D  B
>> 10 10  A  D
>>
>> I want to do the following: if the difference between the level index
>> of factor R1 and that of factor R2 is an odd number, the levels of the
>> two factors need to be switched between them, which can be performed
>> through the following code:
>>
>> for(ii in 1:dim(tmp)[1]) {
>>kk <- which(levels(tmp$R2) %in% tmp[ii,'R2'], arr.ind = TRUE) -
>> which(levels(tmp$R1) %in% tmp[ii,'R1'], arr.ind = TRUE)
>>if(kk%%2!=0) { # swap the their levels between the two factors
>>   qq <- tmp[ii,]$R1
>>   tmp[ii,]$R1 <- tmp[ii,]$R2
>>   tmp[ii,]$R2 <- qq
>>   }
>> }
>>
>> More concise and efficient way to do this?
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Subtraction with aggregate

2016-07-28 Thread Gang Chen
With the following data in data.frame:

subject   QMemotion yi
  s1   75.1017   neutral  -75.928276
  s2  -47.3512   neutral -178.295990
  s3  -68.9016   neutral -134.753906
  s1   17.2099  negative -104.168312
  s2  -53.1114  negative -182.373474
  s3  -33.0322  negative -137.420410

I can obtain the average between the two emotions with

mydata <- read.table('clipboard', header=TRUE)
aggregate(mydata[,c('yi', 'QM')], by=list(subject=mydata$subject), mean)

My question is, what is a nice way to get the difference between the
two emotions?

Thanks,
Gang

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subtraction with aggregate

2016-07-28 Thread Gang Chen
Hi Jim and Jeff,

Thanks for the quick help!

Sorry I didn't state the question clearly: I want the difference
between 'neutral' and 'negative' for each subject. And another person
offered a solution for it:

aggregate(cbind(QM, yi) ~ subject, data = mydata, FUN = diff)


On Thu, Jul 28, 2016 at 4:53 PM, jim holtman  wrote:
> Not sure what you mean by "nice way", but here is a dplyr solution:
>
>> library(dplyr)
>> mydata <- read.table(text = "subject   QMemotion yi
> +s1   75.1017   neutral  -75.928276
> +s2  -47.3512   neutral -178.295990
> +s3  -68.9016   neutral -134.753906
> +s1   17.2099  negative -104.168312
> +s2  -53.1114  negative -182.373474
> +s3  -33.0322  negative -137.420410", header = TRUE)
>> agg <- mydata %>%
> + group_by(subject) %>%
> + summarise(QM = mean(QM),
> +   yi = mean(yi)
> +   )
>>
>>
>> agg
> # A tibble: 3 x 3
>   subject   QM yi
>  
> 1  s1  46.1558  -90.04829
> 2  s2 -50.2313 -180.33473
> 3  s3 -50.9669 -136.08716
>
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Thu, Jul 28, 2016 at 4:40 PM, Gang Chen  wrote:
>>
>> With the following data in data.frame:
>>
>> subject   QMemotion yi
>>   s1   75.1017   neutral  -75.928276
>>   s2  -47.3512   neutral -178.295990
>>   s3  -68.9016   neutral -134.753906
>>   s1   17.2099  negative -104.168312
>>   s2  -53.1114  negative -182.373474
>>   s3  -33.0322  negative -137.420410
>>
>> I can obtain the average between the two emotions with
>>
>> mydata <- read.table('clipboard', header=TRUE)
>> aggregate(mydata[,c('yi', 'QM')], by=list(subject=mydata$subject), mean)
>>
>> My question is, what is a nice way to get the difference between the
>> two emotions?
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] aggregate

2016-08-23 Thread Gang Chen
This is a simple question: With a dataframe like the following

myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))

how can I get the cross product between X and Y for each level of
factor Z? My difficulty is that I don't know how to deal with the fact
that crossprod() acts on two variables in this case.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate

2016-08-24 Thread Gang Chen
Thank you all for the suggestions! Yes, I'm looking for the cross
product between the two columns of X and Y.

A follow-up question: what is a nice way to merge the output of

lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))

with the column Z in myData so that I would get a new dataframe as the
following (the 2nd column is the cross product between X and Y)?

Z   CP
A   10
B   10

Is the following legitimate?

data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
myData$Z), function(x) crossprod(x[, 1], x[, 2]


On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson  wrote:
> Thank you for the reproducible example, but it is not clear what cross 
> product you want. Jim's solution gives you the cross product of the 2-column 
> matrix with itself. If you want the cross product between the columns you 
> need something else. The aggregate function will not work since it will treat 
> the columns separately:
>
>> A <- as.matrix(myData[myData$Z=="A", 1:2])
>> A
>   X Y
> 1 1 4
> 2 2 3
>> crossprod(A) # Same as t(A) %*% A
>X  Y
> X  5 10
> Y 10 25
>> crossprod(A[, 1], A[, 2]) # Same as t(A[, 1] %*% A[, 2]
>  [,1]
> [1,]   10
>>
>> # For all the groups
>> lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))
> $A
>X  Y
> X  5 10
> Y 10 25
>
> $B
>X  Y
> X 25 10
> Y 10  5
>
>> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
> $A
>  [,1]
> [1,]   10
>
> $B
>  [,1]
> [1,]   10
>
> -
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
> Sent: Tuesday, August 23, 2016 6:02 PM
> To: Gang Chen; r-help mailing list
> Subject: Re: [R] aggregate
>
> Hi Gang Chen,
> If I have the right idea:
>
> for(zval in levels(myData$Z))
> crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")]))
>
> Jim
>
> On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen  wrote:
>> This is a simple question: With a dataframe like the following
>>
>> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 
>> 'B'))
>>
>> how can I get the cross product between X and Y for each level of
>> factor Z? My difficulty is that I don't know how to deal with the fact
>> that crossprod() acts on two variables in this case.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate

2016-08-24 Thread Gang Chen
Thanks a lot, David! I want to further expand the operation a little
bit. With a new dataframe:

myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’),
Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))

> myData

  X Y  S Z
1 1 8 S1 A
2 2 7 S1 A
3 3 6 S1 B
4 4 5 S1 B
5 5 4 S2 A
6 6 3 S2 A
7 7 2 S2 B
8 8 1 S2 B

I would like to obtain the same cross product between columns X and Y,
but at each combination level of factors S and Z. In other words, the
cross product would be still performed each two rows in the new
dataframe myData. How can I achieve that?

On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson  wrote:
> Your is fine, but it will be a little simpler if you use sapply() instead:
>
>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
> + function(x) crossprod(x[, 1], x[, 2])))
>   Z CP
> A A 10
> B B 10
>
> David C
>
>
> -Original Message-
> From: Gang Chen [mailto:gangch...@gmail.com]
> Sent: Wednesday, August 24, 2016 10:17 AM
> To: David L Carlson
> Cc: Jim Lemon; r-help mailing list
> Subject: Re: [R] aggregate
>
> Thank you all for the suggestions! Yes, I'm looking for the cross
> product between the two columns of X and Y.
>
> A follow-up question: what is a nice way to merge the output of
>
> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>
> with the column Z in myData so that I would get a new dataframe as the
> following (the 2nd column is the cross product between X and Y)?
>
> Z   CP
> A   10
> B   10
>
> Is the following legitimate?
>
> data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
> myData$Z), function(x) crossprod(x[, 1], x[, 2]
>
>
> On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson  wrote:
>> Thank you for the reproducible example, but it is not clear what cross 
>> product you want. Jim's solution gives you the cross product of the 2-column 
>> matrix with itself. If you want the cross product between the columns you 
>> need something else. The aggregate function will not work since it will 
>> treat the columns separately:
>>
>>> A <- as.matrix(myData[myData$Z=="A", 1:2])
>>> A
>>   X Y
>> 1 1 4
>> 2 2 3
>>> crossprod(A) # Same as t(A) %*% A
>>X  Y
>> X  5 10
>> Y 10 25
>>> crossprod(A[, 1], A[, 2]) # Same as t(A[, 1] %*% A[, 2]
>>  [,1]
>> [1,]   10
>>>
>>> # For all the groups
>>> lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))
>> $A
>>X  Y
>> X  5 10
>> Y 10 25
>>
>> $B
>>X  Y
>> X 25 10
>> Y 10  5
>>
>>> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>> $A
>>  [,1]
>> [1,]   10
>>
>> $B
>>  [,1]
>> [1,]   10
>>
>> -
>> David L Carlson
>> Department of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>>
>>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
>> Sent: Tuesday, August 23, 2016 6:02 PM
>> To: Gang Chen; r-help mailing list
>> Subject: Re: [R] aggregate
>>
>> Hi Gang Chen,
>> If I have the right idea:
>>
>> for(zval in levels(myData$Z))
>> crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")]))
>>
>> Jim
>>
>> On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen  wrote:
>>> This is a simple question: With a dataframe like the following
>>>
>>> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 
>>> 'B'))
>>>
>>> how can I get the cross product between X and Y for each level of
>>> factor Z? My difficulty is that I don't know how to deal with the fact
>>> that crossprod() acts on two variables in this case.
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] aggregate

2016-08-24 Thread Gang Chen
Thanks again for patiently offering great help, David! I just learned
dput() and paste0() now. Hopefully this is my last question.

Suppose a new dataframe is as below (one more numeric column):

myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), N =c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")),
.Names = c("X",
"Y", "N", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")

> myData

  X Y   N  S Z
1 1 8 2.1 S1 A
2 2 7 2.1 S1 A
3 3 6 2.1 S1 B
4 4 5 2.1 S1 B
5 5 4 3.2 S2 A
6 6 3 3.2 S2 A
7 7 2 3.2 S2 B
8 8 1 3.2 S2 B

Once I obtain the cross product,

> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 
> 1], x[, 2]))
S1A S1B S2A S2B
 22  38  38  22

how can I easily add the other 3 columns (N, S, and Z) in a new
dataframe? For S and Z, I can play with the names from the cross
product output, but I have trouble dealing with the numeric column N.




On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson  wrote:
> You need to spend some time with a basic R tutorial. Your data is messed up 
> because you did not use a simple text editor somewhere along the way. R 
> understands ', but not ‘ or ’. The best way to send data to the list is to 
> use dput:
>
>> dput(myData)
> structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
> 5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
> ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
> 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names 
> = c("X",
> "Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")
>
> Combining two labels just requires the paste0() function:
>
>> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 
>> 1], x[, 2]))
> S1A S1B S2A S2B
>  22  38  38  22
>
> David C
>
> -Original Message-
> From: Gang Chen [mailto:gangch...@gmail.com]
> Sent: Wednesday, August 24, 2016 11:56 AM
> To: David L Carlson
> Cc: Jim Lemon; r-help mailing list
> Subject: Re: [R] aggregate
>
> Thanks a lot, David! I want to further expand the operation a little
> bit. With a new dataframe:
>
> myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
> 3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’),
> Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))
>
>> myData
>
>   X Y  S Z
> 1 1 8 S1 A
> 2 2 7 S1 A
> 3 3 6 S1 B
> 4 4 5 S1 B
> 5 5 4 S2 A
> 6 6 3 S2 A
> 7 7 2 S2 B
> 8 8 1 S2 B
>
> I would like to obtain the same cross product between columns X and Y,
> but at each combination level of factors S and Z. In other words, the
> cross product would be still performed each two rows in the new
> dataframe myData. How can I achieve that?
>
> On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson  wrote:
>> Your is fine, but it will be a little simpler if you use sapply() instead:
>>
>>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
>> + function(x) crossprod(x[, 1], x[, 2])))
>>   Z CP
>> A A 10
>> B B 10
>>
>> David C
>>
>>
>> -Original Message-
>> From: Gang Chen [mailto:gangch...@gmail.com]
>> Sent: Wednesday, August 24, 2016 10:17 AM
>> To: David L Carlson
>> Cc: Jim Lemon; r-help mailing list
>> Subject: Re: [R] aggregate
>>
>> Thank you all for the suggestions! Yes, I'm looking for the cross
>> product between the two columns of X and Y.
>>
>> A follow-up question: what is a nice way to merge the output of
>>
>> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>>
>> with the column Z in myData so that I would get a new dataframe as the
>> following (the 2nd column is the cross product between X and Y)?
>>
>> Z   CP
>> A   10
>> B   10
>>
>> Is the following legitimate?
>>
>> data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
>> myData$Z), function(x) crossprod(x[, 1], x[, 2]
>>
>>
>> On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson  wrote:
>>> Thank you for the reproducible example, but it is not clear what cross 
>>> product you want. Jim's solution gives you the cross product of the 
>>> 2-column matrix with itself. If you want the cross product between the 
>>> colum

Re: [R] aggregate

2016-08-24 Thread Gang Chen
Yes, this works out perfectly! Thanks a lot, David. Have a wonderful day...

On Wed, Aug 24, 2016 at 4:24 PM, David L Carlson  wrote:
> This will work, but you should double-check to be certain that CP and 
> unique(myData[, 3:5]) are in the same order. It will fail if N is not 
> identical for all rows of the same S-Z combination.
>
>> CP <- sapply(split(myData, paste0(myData$S, myData$Z)), function(x)
> +   crossprod(x[, 1], x[, 2]))
>> data.frame(CP, unique(myData[, 3:5]))
> CP   N  S Z
> S1A 22 2.1 S1 A
> S1B 38 2.1 S1 B
> S2A 38 3.2 S2 A
> S2B 22 3.2 S2 B
>
> David C
>
> -Original Message-
> From: Gang Chen [mailto:gangch...@gmail.com]
> Sent: Wednesday, August 24, 2016 2:51 PM
> To: David L Carlson
> Cc: r-help mailing list
> Subject: Re: [R] aggregate
>
> Thanks again for patiently offering great help, David! I just learned
> dput() and paste0() now. Hopefully this is my last question.
>
> Suppose a new dataframe is as below (one more numeric column):
>
> myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
> 5, 4, 3, 2, 1), N =c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L,
> 1L, 1L, 1L, 2L, 2L, 2L, 2L
> ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
> 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")),
> .Names = c("X",
> "Y", "N", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")
>
>> myData
>
>   X Y   N  S Z
> 1 1 8 2.1 S1 A
> 2 2 7 2.1 S1 A
> 3 3 6 2.1 S1 B
> 4 4 5 2.1 S1 B
> 5 5 4 3.2 S2 A
> 6 6 3 3.2 S2 A
> 7 7 2 3.2 S2 B
> 8 8 1 3.2 S2 B
>
> Once I obtain the cross product,
>
>> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 
>> 1], x[, 2]))
> S1A S1B S2A S2B
>  22  38  38  22
>
> how can I easily add the other 3 columns (N, S, and Z) in a new
> dataframe? For S and Z, I can play with the names from the cross
> product output, but I have trouble dealing with the numeric column N.
>
>
>
>
> On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson  wrote:
>> You need to spend some time with a basic R tutorial. Your data is messed up 
>> because you did not use a simple text editor somewhere along the way. R 
>> understands ', but not ‘ or ’. The best way to send data to the list is to 
>> use dput:
>>
>>> dput(myData)
>> structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
>> 5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
>> ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
>> 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), 
>> .Names = c("X",
>> "Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")
>>
>> Combining two labels just requires the paste0() function:
>>
>>> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 
>>> 1], x[, 2]))
>> S1A S1B S2A S2B
>>  22  38  38  22
>>
>> David C
>>
>> -Original Message-
>> From: Gang Chen [mailto:gangch...@gmail.com]
>> Sent: Wednesday, August 24, 2016 11:56 AM
>> To: David L Carlson
>> Cc: Jim Lemon; r-help mailing list
>> Subject: Re: [R] aggregate
>>
>> Thanks a lot, David! I want to further expand the operation a little
>> bit. With a new dataframe:
>>
>> myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
>> 3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’),
>> Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))
>>
>>> myData
>>
>>   X Y  S Z
>> 1 1 8 S1 A
>> 2 2 7 S1 A
>> 3 3 6 S1 B
>> 4 4 5 S1 B
>> 5 5 4 S2 A
>> 6 6 3 S2 A
>> 7 7 2 S2 B
>> 8 8 1 S2 B
>>
>> I would like to obtain the same cross product between columns X and Y,
>> but at each combination level of factors S and Z. In other words, the
>> cross product would be still performed each two rows in the new
>> dataframe myData. How can I achieve that?
>>
>> On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson  wrote:
>>> Your is fine, but it will be a little simpler if you use sapply() instead:
>>>
>>>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
>>> + function(x) crossprod(x[, 1], x[, 2])))
>>>   Z CP
>>> A A 10
>>> B B 10
>>>
>>> David C
>>>
>>>
>>> -Original 

[R] String manipulation

2014-12-08 Thread Gang Chen
I want to do the following: if a string does not contain a colon (:),
no change is needed; if it contains one or more colons, break the
string into multiple strings using the colon as a separator. For
example, "happy:" becomes

"happy" ":"

":sad" turns to

":" "sad"

and "happy:sad" changes to

"happy" ":" "sad"

How to do this?

Thanks,
Gang

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with Anova() in package "car"

2015-01-13 Thread Gang Chen
I'm having some trouble with Anova() in package "car". When the model
formula is explicitly expressed:

library('nlme')
library('car')
fm <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)

Anova() works fine:

Anova(fm)

However, if the model formula is scanned from an external source:

myModel <- as.formula("distance ~ age + Sex")
fm2 <- lme(myModel, data = Orthodont, random = ~ 1)

I get the following error:

Anova(fm2)
Error: object of type 'symbol' is not subsettable

How to resolve the situation when the model formula is defined externally?

Thanks,
Gang

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with Anova() in package "car"

2015-01-13 Thread Gang Chen
Dear John,

Thanks a lot for the quick response and fix! I'm looking forward to
try out the development version. I assume that the fix will be
released in the official version at some point.

Thanks again,
Gang

On Tue, Jan 13, 2015 at 5:06 PM, John Fox  wrote:
> Dear Gang,
>
> The problem was in the model.matrix.lme() method provided by the car
> package, and is now fixed in the development version of the car package on
> R-Forge. You should be able to install it from there via
> install.packages("car", repos="http://R-Forge.R-project.org";) after the
> package is next built on R-Forge, usually in a day or so.
>
> Best,
>  John
>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Gang Chen
>> Sent: January-13-15 1:48 PM
>> To: r-help
>> Subject: [R] Problem with Anova() in package "car"
>>
>> I'm having some trouble with Anova() in package "car". When the model
>> formula is explicitly expressed:
>>
>> library('nlme')
>> library('car')
>> fm <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)
>>
>> Anova() works fine:
>>
>> Anova(fm)
>>
>> However, if the model formula is scanned from an external source:
>>
>> myModel <- as.formula("distance ~ age + Sex")
>> fm2 <- lme(myModel, data = Orthodont, random = ~ 1)
>>
>> I get the following error:
>>
>> Anova(fm2)
>> Error: object of type 'symbol' is not subsettable
>>
>> How to resolve the situation when the model formula is defined externally?
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> http://www.avast.com
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Issue with formula conversion

2014-08-27 Thread Gang Chen
A random effect formulation for R package nlme is read in as a string
of characters from an input file:

ranEff <- "pdCompSymm(~1+Age)"

I need to convert 'ranEff' to a formula class. However, as shown below:

> as.formula(ranEff)
~1 + Age

the "pdCompSymm" is lost in the conversion. Any solutions? Thanks!

Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issue with formula conversion

2014-08-27 Thread Gang Chen
Thanks for the help! However, I just need to get

pdCompSymm(~1 + Age)

without a tilde (~) at the beginning.

On Wed, Aug 27, 2014 at 3:34 PM, David Winsemius  wrote:
>
> On Aug 27, 2014, at 11:19 AM, Gang Chen wrote:
>
>> A random effect formulation for R package nlme is read in as a string
>> of characters from an input file:
>>
>> ranEff <- "pdCompSymm(~1+Age)"
>>
>> I need to convert 'ranEff' to a formula class. However, as shown below:
>>
>>> as.formula(ranEff)
>> ~1 + Age
>>
>> the "pdCompSymm" is lost in the conversion. Any solutions?
>
> as.formula(paste("~",ranEff))
> ~pdCompSymm(~1 + Age)
> --
>
> David Winsemius
> Alameda, CA, USA
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issue with formula conversion

2014-08-27 Thread Gang Chen
Good point!

Here is an example:

library(nlme)
fm <- lme(yield ~ nitro, data=Oats, random=list(Block=pdComSymm(~Variety-1)))

Now the problem I'm facing is that the following part

pdComSymm(~Variety-1)

is read in as a string of characters from an external source:

ranEff <- 'pdComSymm(~Variety-1)'

The following

(ranEff1 <- as.formula(ranEff))
~Variety - 1

is not what I want. Even though

fm <- lme(yield ~ nitro, data=Oats, random=list(Block=pdCompSymm(ranEff1)))

works, I don't know the 'pdCompSymm' part in advance and would like to
make the process automatic.

On Wed, Aug 27, 2014 at 3:49 PM, David Winsemius  wrote:
>
> On Aug 27, 2014, at 12:44 PM, Gang Chen wrote:
>
>> Thanks for the help! However, I just need to get
>>
>> pdCompSymm(~1 + Age)
>
> That's not a formula in the R sense of the word. You should do a better job 
> of posting a use case. Perhaps you want an expression?
>
> --
> David.
>>
>> without a tilde (~) at the beginning.
>>
>> On Wed, Aug 27, 2014 at 3:34 PM, David Winsemius  
>> wrote:
>>>
>>> On Aug 27, 2014, at 11:19 AM, Gang Chen wrote:
>>>
>>>> A random effect formulation for R package nlme is read in as a string
>>>> of characters from an input file:
>>>>
>>>> ranEff <- "pdCompSymm(~1+Age)"
>>>>
>>>> I need to convert 'ranEff' to a formula class. However, as shown below:
>>>>
>>>>> as.formula(ranEff)
>>>> ~1 + Age
>>>>
>>>> the "pdCompSymm" is lost in the conversion. Any solutions?
>>>
>>> as.formula(paste("~",ranEff))
>>> ~pdCompSymm(~1 + Age)
>>> --
>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
>
> David Winsemius
> Alameda, CA, USA
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issue with formula conversion

2014-08-27 Thread Gang Chen
Sorry for the misspelling! And more importantly, thanks a lot for the
nice solution and for the quick help!

On Wed, Aug 27, 2014 at 4:22 PM, David Winsemius  wrote:
>
> On Aug 27, 2014, at 1:11 PM, Gang Chen wrote:
>
>> Good point!
>>
>> Here is an example:
>>
>> library(nlme)
>> fm <- lme(yield ~ nitro, data=Oats, random=list(Block=pdComSymm(~Variety-1)))
>>
>
> One problem is that youa re misspelling the function name.
>
>
>> Now the problem I'm facing is that the following part
>>
>> pdComSymm(~Variety-1)
>>
>> is read in as a string of characters from an external source:
>>
>> ranEff <- 'pdComSymm(~Variety-1)'
>>
>> The following
>>
>> (ranEff1 <- as.formula(ranEff))
>> ~Variety - 1
>>
>> is not what I want. Even though
>>
>> fm <- lme(yield ~ nitro, data=Oats, random=list(Block=pdCompSymm(ranEff1)))
>
>> ranEff <- 'pdCompSymm(~Variety-1)'
>> tt <- parse(text=ranEff)
>
> # Returns an unevaluated expression
>
>> fm <- lme(yield ~ nitro, data=Oats, random=list(Block=eval(tt) ))
>> fm
> Linear mixed-effects model fit by REML
>   Data: Oats
>   Log-restricted-likelihood: -296.5209
>   Fixed: yield ~ nitro
> (Intercept)   nitro
>81.8722273.66667
> snipped rest of output.
>
> --
> David
>
>>
>> works, I don't know the 'pdCompSymm' part in advance and would like to
>> make the process automatic.
>>
>> On Wed, Aug 27, 2014 at 3:49 PM, David Winsemius  
>> wrote:
>>>
>>> On Aug 27, 2014, at 12:44 PM, Gang Chen wrote:
>>>
>>>> Thanks for the help! However, I just need to get
>>>>
>>>> pdCompSymm(~1 + Age)
>>>
>>> That's not a formula in the R sense of the word. You should do a better job 
>>> of posting a use case. Perhaps you want an expression?
>>>
>>> --
>>> David.
>>>>
>>>> without a tilde (~) at the beginning.
>>>>
>>>> On Wed, Aug 27, 2014 at 3:34 PM, David Winsemius  
>>>> wrote:
>>>>>
>>>>> On Aug 27, 2014, at 11:19 AM, Gang Chen wrote:
>>>>>
>>>>>> A random effect formulation for R package nlme is read in as a string
>>>>>> of characters from an input file:
>>>>>>
>>>>>> ranEff <- "pdCompSymm(~1+Age)"
>>>>>>
>>>>>> I need to convert 'ranEff' to a formula class. However, as shown below:
>>>>>>
>>>>>>> as.formula(ranEff)
>>>>>> ~1 + Age
>>>>>>
>>>>>> the "pdCompSymm" is lost in the conversion. Any solutions?
>>>>>
>>>>> as.formula(paste("~",ranEff))
>>>>> ~pdCompSymm(~1 + Age)
>>>>> --
>>>>>
>>>>> David Winsemius
>>>>> Alameda, CA, USA
>>>>>
>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
>
> David Winsemius
> Alameda, CA, USA
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Failure with .Rprofile on Mac OS X

2014-09-18 Thread Gang Chen
When R starts in GUI (e.g., /Applications/R.app/Contents/MacOS/R) on
my Mac OS X 10.7.5, the startup configuration in .Rprofile works fine.
However, when R starts on the terminal (e.g.,
/Library/Frameworks/R.framework/Resources/bin/R), it does not work at
all. What could be the reason for the failure?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Failure with .Rprofile on Mac OS X

2014-09-19 Thread Gang Chen
Thanks for the help, Amos!

> The only reason that *should* happen is if there's an .Rprofile in the 
> directory you're in when you start R.

There is only one .Rprofil, which is in my home directory ~/

> Where *exactly* is the .Rprofile file you want loaded

The only one is in my home directory.

> what directory are you starting from

It does not matter where I start R because on my Mac the CLI R is
linked to /Library/Frameworks/R.framework/Resources/bin/R while the
GUI R is linked to /Applications/R.app/Contents/MacOS/R

> what does R say is the user's home directory? Did you make *any* changes to 
> Rprofile.site, or Renviron?

> R.home()
[1] "/Library/Frameworks/R.framework/Resources"

> What is the output from Sys.getenv() in gui and cli, and do they differ?

They differ slightly. I have trouble pinpointing the exact difference
because the format is a little different and vim does not help much in
tracking the differences. I just noticed that the CLI version has a
few extra terms such as

COLUMNS
"130"
...
DYLD_FALLBACK_LIBRARY_PATH
"/Library/Frameworks/R.framework/Resources/lib"

Thanks,
Gang


On Thu, Sep 18, 2014 at 7:04 PM, Amos B. Elberg  wrote:
> The only reason that *should* happen is if there's an .Rprofile in the 
> directory you're in when you start R.
>
> Where *exactly* is the .Rprofile file you want loaded, what directory are you 
> starting from, and what does R say is the user's home directory? Did you make 
> *any* changes to Rprofile.site, or Renviron?
>
> What is the output from Sys.getenv() in gui and cli, and do they differ?
>
>
>> On Sep 18, 2014, at 11:18 AM, Gang Chen  wrote:
>>
>> When R starts in GUI (e.g., /Applications/R.app/Contents/MacOS/R) on
>> my Mac OS X 10.7.5, the startup configuration in .Rprofile works fine.
>> However, when R starts on the terminal (e.g.,
>> /Library/Frameworks/R.framework/Resources/bin/R), it does not work at
>> all. What could be the reason for the failure?
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem running mvrnorm

2015-11-18 Thread Gang Chen
 I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
following problem.

It works fine with the following:

require('MASS’)
var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))

However, when running the following in a loop with simulated data (Sigma):

# Sigma defined somewhere else
mvrnorm(n=1000, rep(0, 190), Sigma)

I get this opaque message:

 *** caught illegal operation ***
address 0x7fe78f8693d2, cause 'illegal operand'

Traceback:
 1: eigen(Sigma, symmetric = TRUE)
 2: mvrnorm(n = nr, rep(0, NN), Sigma)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I tried to do core dump (option 1), but it didn’t go anywhere (hanging
there forever). I also ran the same code on a Mac, and there was no problem
at all. What is causing the problem on the Linux server? In case the
variance-covariance matrix ‘Sigma’ is needed, I can provide its definition
later.

Thanks,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [FORGED] Problem running mvrnorm

2015-11-18 Thread Gang Chen
Thanks a lot for the pointer, Rolf!

You're correct that

 E <- eigen(Sigma,symmetric=TRUE)

does lead to the same error on the RedHat machine. However, the same
computation on my Mac works fine:

E <- eigen(Sigma,symmetric=TRUE)

E$values

  [1] 4.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6
2.6
 [19] 2.6 2.6 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [37] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [55] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [73] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [91] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[109] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[127] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[145] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[163] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[181] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8

As you can see, all the eigenvalues are truly positive. Is it possible that
some numerical library or package is required by eigen() but missing on the
Linux server? In case the real Sigma is useful, here is how the matrix
Sigma is defined:

constrSigma <- function(n, sig) {
   N <- n*(n-1)/2
   Sigma <- diag(N)
   bgn <- 1
   for(ii in (n-1):1) {
  end <- bgn+(ii-1)
  Sigma[bgn:end,bgn:end][lower.tri(Sigma[bgn:end,bgn:end])] <- rep(sig,
ii*(ii-1)/2)
  if(ii<(n-1)) {
 xx <- cbind(rep(sig,ii), diag(ii)*sig)
 yy <- xx
 for(jj in 1:(n-1-ii)) {
if(jj>1) {
   xx <- cbind(rep(0, ii), xx)
   yy <- cbind(xx, yy)
}
 }
 Sigma[bgn:end,1:(bgn-1)] <- yy
  }
  bgn <- end+1
   }
   Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(t(Sigma))]
   return(Sigma)
}

Sigma <- constrSigma(20, 0.1)
mvrnorm(n=1000, rep(0, 190), Sigma)






On Wed, Nov 18, 2015 at 3:56 PM, Rolf Turner 
wrote:
>
>
> I cobbled together a 190 x 190 positive definite matrix Sigma and ran
your example.  I got a result instantaneously, with no error message. (I'm
running Linux; an ancient Fedora 17 system.)
>
> So the problem is peculiar to your particular Sigma.
>
> As the error message tells you, the problem comes from doing an
eigendecomposition of Sigma.  So start your investigation by doing
>
> E <- eigen(Sigma,symmetric=TRUE)
>
> Presumably that will lead to the same error.  How to get around this
error is beyond the scope of my capabilities.
>
> You *might* get somewhere by using the singular value decomposition
> (equivalent for a positive definite matrix) rather than the
eigendecomposition.  I have the vague notion that the svd is more
numerically robust than eigendecomposition.  However I might well have that
wrong.
>
> Doing anything in 190 dimensions is bound to be fraught with numeric
> peril.
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
>
> On 19/11/15 08:28, Gang Chen wrote:
>>
>>   I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
>> following problem.
>>
>> It works fine with the following:
>>
>> require('MASS’)
>> var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))
>>
>> However, when running the following in a loop with simulated data
(Sigma):
>>
>> # Sigma defined somewhere else
>> mvrnorm(n=1000, rep(0, 190), Sigma)
>>
>> I get this opaque message:
>>
>>   *** caught illegal operation ***
>> address 0x7fe78f8693d2, cause 'illegal operand'
>>
>> Traceback:
>>   1: eigen(Sigma, symmetric = TRUE)
>>   2: mvrnorm(n = nr, rep(0, NN), Sigma)
>>
>> Possible actions:
>> 1: abort (with core dump, if enabled)
>> 2: normal R exit
>> 3: exit R without saving workspace
>> 4: exit R saving workspace
>>
>> I tried to do core dump (option 1), but it didn’t go anywhere (hanging
>> there forever). I also ran the same code on a Mac, and there was no
problem
>> at all. What is causing the problem on the Linux server? In case the
>> variance-covariance matrix ‘Sigma’ is needed, I can provide its
definition
>> later.
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Mixed model with multiple response variables?

2008-08-05 Thread Gang Chen
Hi,

I have a data set collected from 10 measurements (response variables)
on two groups (healthy and patient) of subjects performing 4 different
tasks. In other words there are two fixed factors (group and task),
and 10 response variables. I could analyze the data with aov() or
lme() in package nlme for each response variable separately, but since
most likely there are correlations among the 10 response variables,
would it be more meaningful to run a MANOVA? However manova() in R
seems not to allow an error term in the formula. What else can I try
for this kind of multivariate mixed model?

Also, if I want to find out which response variables (among the 10
measurements) are statistically significant in terms of acting as
indicators for group difference, what kind of statistical analysis
would help me sort them out?

Thanks in advance,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating an array of lists

2008-08-07 Thread Gang Chen
Hi,

I want to store some number of outputs from running a bunch of
analyses such as lm() into an array. I know how to do this with a
one-dimensional array (vector) by creating

myArray <- vector(mode='list', length=10)

and storing each lm() result into a component of myArray.

My question is, how can do this for a multiple dimensional array? It
seems array() does not have such a 'mode' option as in vector(). Any
alternatives?

Thanks in advance,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating an array of lists

2008-08-07 Thread Gang Chen
Yes, both array of lists and list of lists work well as desired. Thanks a lot!

Gang

On Thu, Aug 7, 2008 at 4:05 PM, Patrick Burns <[EMAIL PROTECTED]> wrote:
> myListArray <- array(list(NULL), c(3,2))
> myListArray[[1,2]] <- list(letters, 1:4)
>
> Patrick Burns
> [EMAIL PROTECTED]
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
>
> Gang Chen wrote:
>>
>> Hi,
>>
>> I want to store some number of outputs from running a bunch of
>> analyses such as lm() into an array. I know how to do this with a
>> one-dimensional array (vector) by creating
>>
>> myArray <- vector(mode='list', length=10)
>>
>> and storing each lm() result into a component of myArray.
>>
>> My question is, how can I do this for a multiple dimensional array? It
>> seems array() does not have such a 'mode' option as in vector(). Any
>> alternatives?
>>
>> Thanks in advance,
>> Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Calculate variance/covariance with complex numbers

2010-03-27 Thread Gang Chen
Anybody knows what functions can be used to calculate
variance/covariance with complex numbers? var and cov don't seem to
work:

> a
1
V1 0.00810014+0.00169366i
V2 0.00813054+0.00158251i
V3 0.00805489+0.00163295i
V4 0.00809141+0.00159533i
V5 0.00813976+0.00161850i

> var(a)
 1
1 1.141556e-09
Warning message:
In var(a) : imaginary parts discarded in coercion

> cov(a)
 1
1 1.141556e-09
Warning message:
In cov(a) : imaginary parts discarded in coercion

Thanks in advance,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculate variance/covariance with complex numbers

2010-03-27 Thread Gang Chen
Thanks a lot for the help, Ben Bolker and Chuck!

Chuck, it seems your version needs a little modification. Instead of this:

> xri <- matrix(rnorm(1)+1i*rnorm(1),nc=2)
> crossprod(xri-colMeans(xri))/(nrow(xri)-1)

it seems to me it should be this (maybe there is a more elegant
modification than mine?):
> crossprod(t(apply(xri, 1, '-', colMeans(xri/(nrow(xri)-1)

Do you agree?

Gang



On Sat, Mar 27, 2010 at 7:07 PM, Charles C. Berry  wrote:
> On Sat, 27 Mar 2010, Gang Chen wrote:
>
>> Anybody knows what functions can be used to calculate
>> variance/covariance with complex numbers? var and cov don't seem to
>> work:
>
> How about?
>
>> xri <- matrix(rnorm(1)+1i*rnorm(1),nc=2)
>> crossprod(xri-colMeans(xri))/(nrow(xri)-1)
>
> HTH,
>
> Chuck
>
>>
>>> a
>>
>>                       1
>> V1 0.00810014+0.00169366i
>> V2 0.00813054+0.00158251i
>> V3 0.00805489+0.00163295i
>> V4 0.00809141+0.00159533i
>> V5 0.00813976+0.00161850i
>>
>>> var(a)
>>
>>            1
>> 1 1.141556e-09
>> Warning message:
>> In var(a) : imaginary parts discarded in coercion
>>
>>> cov(a)
>>
>>            1
>> 1 1.141556e-09
>> Warning message:
>> In cov(a) : imaginary parts discarded in coercion
>>
>> Thanks in advance,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> Charles C. Berry                            (858) 534-2098
>                                            Dept of Family/Preventive
> Medicine
> E mailto:cbe...@tajo.ucsd.edu               UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Puzzled by an error with apply()

2009-04-06 Thread Gang Chen
I've written a function, myFunc, that works fine with myFunc(data,
...), but when I use apply() to run it with an array of data

apply(myArray, 1, myFunc, ...)

I get a strange error:

Error in match.fun(FUN) : '1' is not a function, character or symbol

which really puzzles me because '1' is meant to be the margin of the
array I want to apply over, but how come does apply() treat it as a
function?

I have been successfully using apply() for a while, so I must have
made a stupid mistake this time. Hopefully somebody can point out
something obviously wrong without me providing any details of the
function.

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Convert bits to numbers in base 10

2009-04-09 Thread Gang Chen
I have some bits stored like the following variable nn

(nn <- c(1, 0, 0, 1, 0, 1,0))
[1] 1 0 0 1 0 1 0

not in the format of

1001010

and I need to convert them to numbers in base 10. What's an easy way to do it?

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert bits to numbers in base 10

2009-04-09 Thread Gang Chen
Yes, such a concise and elegant solution!

Thanks a lot!

Gang

On Thu, Apr 9, 2009 at 5:51 PM, Marc Schwartz  wrote:
> I suspect that Gang was looking for something along the lines of:
>
>> sum(2 ^ (which(as.logical(rev(nn))) - 1))
> [1] 74
>
> You might also want to look at the digitsBase() function in Martin's sfsmisc
> package on CRAN.
>
> HTH,
>
> Marc Schwartz
>
> On Apr 9, 2009, at 4:34 PM, Jorge Ivan Velez wrote:
>
>> Dear Gang,
>> Try this:
>>
>> nn <- c(1, 0, 0, 1, 0, 1,0)
>> paste(nn,sep="",collapse="")
>>
>> See ?paste for more information.
>>
>> HTH,
>>
>> Jorge
>>
>>
>> On Thu, Apr 9, 2009 at 5:23 PM, Gang Chen  wrote:
>>
>>> I have some bits stored like the following variable nn
>>>
>>> (nn <- c(1, 0, 0, 1, 0, 1,0))
>>> [1] 1 0 0 1 0 1 0
>>>
>>> not in the format of
>>>
>>> 1001010
>>>
>>> and I need to convert them to numbers in base 10. What's an easy way to
>>> do
>>> it?
>>>
>>> TIA,
>>> Gang
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] metafor package: effect sizes are not fully independent

2010-02-05 Thread Gang Chen
In a classical meta analysis model y_i = X_i * beta_i + e_i, data
{y_i} are assumed to be independent effect sizes. However, I'm
encountering the following two scenarios:

(1) Each source has multiple effect sizes, thus {y_i} are not fully
independent with each other.
(2) Each source has multiple effect sizes, each of the effect size
from a source can be categorized as one of a factor levels (e.g.,
happy, sad, and neutral). Maybe better denote the data as y_ij, effect
size at the j-th level from the i-th source. I can code the levels
with dummy variables into the X_i matrix, but apparently the data from
the same source are correlated with each other. In this case, I would
like to run a few tests one of which is, for example, whether there is
any difference across all the levels of the factor.

Can metafor handle these two cases?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] metafor package: effect sizes are not fully independent

2010-02-08 Thread Gang Chen
Thanks for the confirmation and pointer, Mike!

Dr. Viechtbauer, I'm looking forward to the new functionality of
specifying covariance structures in metafor().

Thanks both again for the great help,
Gang


On Sun, Feb 7, 2010 at 8:59 PM, Mike Cheung  wrote:
> Dear Gang,
>
> It seems that it is possible to use a univariate meta-analysis to
> handle your multivariate effect sizes. If you want to calculate a
> weighted average first, Hedges and Olkin (1985) has discussed this
> approach.
>
> Hedges, L. V., & Olkin, I. (1985). Statistical methods for
> meta-analysis. Orlando, FL: Academic Press.
>
> Regards,
> Mike
> --
> -
>  Mike W.L. Cheung               Phone: (65) 6516-3702
>  Department of Psychology       Fax:   (65) 6773-1843
>  National University of Singapore
>  http://courses.nus.edu.sg/course/psycwlm/internet/
> -
>
> On Mon, Feb 8, 2010 at 6:48 AM, Gang Chen  wrote:
>> Dear Mike,
>>
>> Thanks a lot for the kind help!
>>
>> Actually a few months ago I happened to read a couple of your posts on
>> the R-help archive when I was exploring the possibility of using lme()
>> in R for meta analysis.
>>
>> First of all, I didn't specify the meta analysis model for my cases
>> correctly in my previous message. Currently I'm only interested in
>> random- or mixed-effects meta analysis. So what you've suggested is
>> directly relevant to what I've been looking for, especially for case
>> (2). I'll try to gather those references you listed, and figure out
>> the details.
>>
>> Also I think I didn't state my case (1) clearly in my previous post.
>> In that case, all the effect sizes are the same and in the same
>> condition too (e.g., happy), but each source has multiple samples of
>> the measurement (and also measurement error, or standard error). Could
>> this still be handled as a multivariate meta analysis since the
>> samples for the the same source are correlated? Or somehow the
>> multiple measures from the same source can be somehow summarized
>> (weighted average?) before the meta analysis?
>>
>> Your suggestions are highly appreciated.
>>
>> Best wishes,
>> Gang
>>
>>
>> On Sun, Feb 7, 2010 at 10:39 AM, Mike Cheung  wrote:
>>> Dear Gang,
>>>
>>> Here are just some general thoughts. Wolfgang Viechtbauer will be a
>>> better position to answer questions related to metafor.
>>>
>>> For multivariate effect sizes, we first have to estimate the
>>> asymptotic sampling covariance matrix among the effect sizes. Formulas
>>> for some common effect sizes are provided by Gleser and Olkin (2009).
>>>
>>> If a fixed-effects model is required, it is quite easy to write your
>>> own GLS function to conduct the multivariate meta-analysis (see e.g.,
>>> Becker, 1992). If a random-effects model is required, it is more
>>> challenging in R. SAS Proc MIXED can do the work (e.g., van
>>> Houwelingen, Arends, & Stijnen, 2002).
>>>
>>> Sometimes, it is possible to transform the multivariate effect sizes
>>> into independent effect sizes (Kalaian & Raudenbush, 1996; Raudenbush,
>>> Becker, & Kalaian, 1988). Then univariate meta-analysis, e.g.,
>>> metafor(), can be performed on the transformed effect sizes. This
>>> approach works if it makes sense to pool the multivariate effect sizes
>>> as in your case (2)- the effect sizes are the same but in different
>>> conditions (happy, sad, and neutral). However, this approach does not
>>> work if the multivariate effect sizes are measuring different
>>> concepts, e.g., verbal achievement and mathematical achievement.
>>>
>>> Hope this helps.
>>>
>>> Becker, B. J. (1992). Using results from replicated studies to
>>> estimate linear models. Journal of Educational Statistics, 17,
>>> 341-362.
>>> Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect
>>> sizes. In H. Cooper, L. V. Hedges, and J. C. Valentine (Eds.), The
>>> handbook of research synthesis and meta-analysis, 2nd edition (pp.
>>> 357-376). New York: Russell Sage Foundation.
>>> Kalaian, H. A., & Raudenbush, S. W. (1996). A multivariate mixed
>>> linear model for meta-analysis. Psychological Methods, 1, 227-235.
>>> Raudenbush, S. W., Becker, B. J., & Kalaian, H. (1988). Modeling
>>> multivariate effect sizes. Psychological Bul

[R] How to capture the printout on the screen?

2013-07-17 Thread Gang Chen
This is most likely a silly question.

First I run the following:

require(car)
mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2,
post.3, post.4, post.5,
  fup.1, fup.2, fup.3, fup.4, fup.5) ~
treatment*gender, data=OBrienKaiser)
phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)),
levels=c("pretest", "posttest", "followup"))
hour <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)
(fm <- linearHypothesis(mod.ok, c("treatment1", "treatment2"),
idata=idata, idesign=~phase*hour, iterms="phase:hour"))

I get the output like the following (only the last part at the end is
shown below):

...
Multivariate Tests:
 Df test stat  approx F num Df den Df  Pr(>F)
Pillai2 0.6623840 0.2475987 16  8 0.99155
Wilks 2 0.4460357 0.1864957 16  6 0.99668
Hotelling-Lawley  2 0.9988990 0.1248624 16  4 0.99904
Roy   2 0.5792977 0.2896488  8  4 0.93605

I want to capture the data frame shown above, but it's not part of 'fm':

str(fm)

I guess the print method for an object of class "Anova.mlm"
("print.Anova.mlm") generates the above data frame. My question is:
How to capture it? I mean, how can I store the output on the screen
and then extract the data frame?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to capture the printout on the screen?

2013-07-17 Thread Gang Chen
Thanks a lot for the pointer!

After I downloaded the source code and saw the innards of the print
function, I know what to do now.

Thanks again,
Gang


On Wed, Jul 17, 2013 at 6:00 PM, Greg Snow <538...@gmail.com> wrote:
> If you are happy with the character strings printed out then you can use the
> capture.output function.
>
> If you want the numbers (without needing to convert) then look at the print
> method for the class of object that you are working with and see what the
> code is.  It is probably calling another function to produce the output and
> you could just call that function directly.  If not, you can copy that
> section of the code into your own function to call and have it return the
> object rather than printing.
>
>
> On Wed, Jul 17, 2013 at 3:38 PM, Gang Chen  wrote:
>>
>> This is most likely a silly question.
>>
>> First I run the following:
>>
>> require(car)
>> mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2,
>> post.3, post.4, post.5,
>>   fup.1, fup.2, fup.3, fup.4, fup.5) ~
>> treatment*gender, data=OBrienKaiser)
>> phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)),
>> levels=c("pretest", "posttest", "followup"))
>> hour <- ordered(rep(1:5, 3))
>> idata <- data.frame(phase, hour)
>> (fm <- linearHypothesis(mod.ok, c("treatment1", "treatment2"),
>> idata=idata, idesign=~phase*hour, iterms="phase:hour"))
>>
>> I get the output like the following (only the last part at the end is
>> shown below):
>>
>> ...
>> Multivariate Tests:
>>  Df test stat  approx F num Df den Df  Pr(>F)
>> Pillai2 0.6623840 0.2475987 16  8 0.99155
>> Wilks 2 0.4460357 0.1864957 16  6 0.99668
>> Hotelling-Lawley  2 0.9988990 0.1248624 16  4 0.99904
>> Roy   2 0.5792977 0.2896488  8  4 0.93605
>>
>> I want to capture the data frame shown above, but it's not part of 'fm':
>>
>> str(fm)
>>
>> I guess the print method for an object of class "Anova.mlm"
>> ("print.Anova.mlm") generates the above data frame. My question is:
>> How to capture it? I mean, how can I store the output on the screen
>> and then extract the data frame?
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Package for comparing sensitivity, specificity, PPV, NPV, and accuracy?

2012-09-13 Thread Gang Chen
Hi, I have two sets of sensitivity, specificity, positive predictive
value, and negative predictive value, and accuracy from two tests on
the same subjects. Is there an R package that does such paired
comparisons?

Thanks,
Gang Chen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Any refit function available for 'car' package?

2014-06-16 Thread Gang Chen
Suppose that I need to run a multivariate linear model

Y = X B + E

many times with the same model matrix X but each time with different
response matrix Y. Is there a function available in 'car' package
similar to refit() in lme4 package so that the model matrix X would
not be reassembled each time? Also, runtime can be saved without
repeatedly performing the same matrix computations such as
(X'X)^(-1)X'.

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Display a dataframe

2014-07-03 Thread Gang Chen
I have a matrix 'dd' defined as below:

dd <- t(matrix(c(153.0216306,  1, 7.578366e-35,
13.3696538,  1, 5.114571e-04,
0.8476713,  1, 7.144239e-01,
1.2196050,  1, 5.388764e-01,
2.6349405,  1, 2.090719e-01,
6.0507714,  1, 2.780045e-02), nrow=3, ncol=6))
dimnames(dd)[[2]] <- c('# Chisq', 'DF', 'Pr(>Chisq)')
dimnames(dd)[[1]] <- c('# Sex', '# Volume', '# Weight', '# Intensity',
'# ISO', '# SEC')

'dd' displays as the following:

# Chisq DF   Pr(>Chisq)
# Sex   153.0216306  1 7.578366e-35
# Volume 13.3696538  1 5.114571e-04
# Weight  0.8476713  1 7.144239e-01
# Intensity   1.2196050  1 5.388764e-01
# ISO 2.6349405  1 2.090719e-01
# SEC 6.0507714  1 2.780045e-02

I would like to display it as:

# Chisq   DF   Pr(>Chisq)term
153.0216306  1 7.578366e-35# Sex
13.3696538  1 5.114571e-04  # Volume
0.8476713  1 7.144239e-01# Weight
1.2196050  1 5.388764e-01# Intensity
2.6349405  1 2.090719e-01# ISO
6.0507714  1 2.780045e-02# SEC

This is what I came up with

(cc <- data.frame(data.frame(dd), term=dimnames(dd)[[1]]))

   X..Chisq DF   Pr..Chisq.term
# Sex   153.0216306  1 7.578366e-35   # Sex
# Volume 13.3696538  1 5.114571e-04# Volume
# Weight  0.8476713  1 7.144239e-01# Weight
# Intensity   1.2196050  1 5.388764e-01 # Intensity
# ISO 2.6349405  1 2.090719e-01   # ISO
# SEC 6.0507714  1 2.780045e-02   # SEC

But I'm not happy with the following two issues:

1) How to get rid of the row names?
2) The special characters of #, (, >,) in the column names are not
displayed correctly.

Any suggestions?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Display a dataframe

2014-07-04 Thread Gang Chen
Perfect! Thanks a lot!

On July 3, 2014 5:10:02 PM EDT, David L Carlson  wrote:
>Not elegant, but it works:
>
>> term <- dimnames(dd)[[1]]
>> dd1 <- dd
>> dimnames(dd1)[[1]] <- rep("", 6)
>> dd2 <- capture.output(dd1)
>> cat(paste(dd2, "   ", c("Term", term)), fill=48)
> # Chisq DF   Pr(>Chisq) Term 
> 153.0216306  1 7.578366e-35 # Sex 
>  13.3696538  1 5.114571e-04 # Volume 
>   0.8476713  1 7.144239e-01 # Weight 
>   1.2196050  1 5.388764e-01 # Intensity 
>   2.6349405  1 2.090719e-01 # ISO 
>   6.0507714  1 2.780045e-02 # SEC
>
>
>David Carlson
>
>-----Original Message-
>From: r-help-boun...@r-project.org
>[mailto:r-help-boun...@r-project.org] On Behalf Of Gang Chen
>Sent: Thursday, July 3, 2014 2:56 PM
>To: r-help
>Subject: [R] Display a dataframe
>
>I have a matrix 'dd' defined as below:
>
>dd <- t(matrix(c(153.0216306,  1, 7.578366e-35,
>13.3696538,  1, 5.114571e-04,
>0.8476713,  1, 7.144239e-01,
>1.2196050,  1, 5.388764e-01,
>2.6349405,  1, 2.090719e-01,
>6.0507714,  1, 2.780045e-02), nrow=3, ncol=6))
>dimnames(dd)[[2]] <- c('# Chisq', 'DF', 'Pr(>Chisq)')
>dimnames(dd)[[1]] <- c('# Sex', '# Volume', '# Weight', '# Intensity',
>'# ISO', '# SEC')
>
>'dd' displays as the following:
>
># Chisq DF   Pr(>Chisq)
># Sex   153.0216306  1 7.578366e-35
># Volume 13.3696538  1 5.114571e-04
># Weight  0.8476713  1 7.144239e-01
># Intensity   1.2196050  1 5.388764e-01
># ISO 2.6349405  1 2.090719e-01
># SEC 6.0507714  1 2.780045e-02
>
>I would like to display it as:
>
># Chisq   DF   Pr(>Chisq)term
>153.0216306  1 7.578366e-35# Sex
>13.3696538  1 5.114571e-04  # Volume
>0.8476713  1 7.144239e-01# Weight
>1.2196050  1 5.388764e-01# Intensity
>2.6349405  1 2.090719e-01# ISO
>6.0507714  1 2.780045e-02# SEC
>
>This is what I came up with
>
>(cc <- data.frame(data.frame(dd), term=dimnames(dd)[[1]]))
>
>   X..Chisq DF   Pr..Chisq.term
># Sex   153.0216306  1 7.578366e-35   # Sex
># Volume 13.3696538  1 5.114571e-04# Volume
># Weight  0.8476713  1 7.144239e-01# Weight
># Intensity   1.2196050  1 5.388764e-01 # Intensity
># ISO 2.6349405  1 2.090719e-01   # ISO
># SEC 6.0507714  1 2.780045e-02   # SEC
>
>But I'm not happy with the following two issues:
>
>1) How to get rid of the row names?
>2) The special characters of #, (, >,) in the column names are not
>displayed correctly.
>
>Any suggestions?
>
>Thanks,
>Gang
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Display a dataframe

2014-07-04 Thread Gang Chen
I really your kind help! This is exactly what I was looking for except that I 
need to get rid of the numbered row names.

On July 3, 2014 9:57:00 PM EDT, arun  wrote:
>Hi,
>May be this helps:
>nC <- max(nchar(row.names(dd)))
> term <- formatC(row.names(dd), width=-nC)
>#or
> term <- sprintf("%-11s", row.names(dd))
>
>  dd1 <- setNames(data.frame(unname(dd), term,stringsAsFactors=F),
>c(colnames(dd), formatC("term",width=-nC)))
>dd1
>#      # Chisq DF   Pr(>Chisq) term       
>#1 153.0216306  1 7.578366e-35 # Sex      
>#2  13.3696538  1 5.114571e-04 # Volume   
>#3   0.8476713  1 7.144239e-01 # Weight   
>#4   1.2196050  1 5.388764e-01 # Intensity
>#5   2.6349405  1 2.090719e-01 # ISO      
>#6   6.0507714  1 2.780045e-02 # SEC      
>
>A.K.
>
>
>
>
>
>On Thursday, July 3, 2014 3:57 PM, Gang Chen 
>wrote:
>I have a matrix 'dd' defined as below:
>
>dd <- t(matrix(c(153.0216306,  1, 7.578366e-35,
>13.3696538,  1, 5.114571e-04,
>0.8476713,  1, 7.144239e-01,
>1.2196050,  1, 5.388764e-01,
>2.6349405,  1, 2.090719e-01,
>6.0507714,  1, 2.780045e-02), nrow=3, ncol=6))
>dimnames(dd)[[2]] <- c('# Chisq', 'DF', 'Pr(>Chisq)')
>dimnames(dd)[[1]] <- c('# Sex', '# Volume', '# Weight', '# Intensity',
>'# ISO', '# SEC')
>
>'dd' displays as the following:
>
>                # Chisq DF   Pr(>Chisq)
># Sex       153.0216306  1 7.578366e-35
># Volume     13.3696538  1 5.114571e-04
># Weight      0.8476713  1 7.144239e-01
># Intensity   1.2196050  1 5.388764e-01
># ISO         2.6349405  1 2.090719e-01
># SEC         6.0507714  1 2.780045e-02
>
>I would like to display it as:
>
># Chisq               DF   Pr(>Chisq)                      
>  term
>153.0216306  1 7.578366e-35                            # Sex
>13.3696538  1 5.114571e-04                              # 
>Volume
>0.8476713  1 7.144239e-01                                # 
>Weight
>1.2196050  1 5.388764e-01                                # 
>Intensity
>2.6349405  1 2.090719e-01                                # ISO
>6.0507714  1 2.780045e-02                                # SEC
>
>This is what I came up with
>
>(cc <- data.frame(data.frame(dd), term=dimnames(dd)[[1]]))
>
>               X..Chisq DF   Pr..Chisq.        term
># Sex       153.0216306  1 7.578366e-35       # Sex
># Volume     13.3696538  1 5.114571e-04    # Volume
># Weight      0.8476713  1 7.144239e-01    # Weight
># Intensity   1.2196050  1 5.388764e-01 # Intensity
># ISO         2.6349405  1 2.090719e-01       # ISO
># SEC         6.0507714  1 2.780045e-02       # SEC
>
>But I'm not happy with the following two issues:
>
>1) How to get rid of the row names?
>2) The special characters of #, (, >,) in the column names are not
>displayed correctly.
>
>Any suggestions?
>
>Thanks,
>Gang
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mapping from one vector to another

2014-07-17 Thread Gang Chen
Suppose I have the following dataframe:

L4 <- LETTERS[1:4]
fac <- sample(L4, 10, replace = TRUE)
(d <- data.frame(x = 1, y = 1:10, fac = fac))

 x  y  fac
1  1  1   B
2  1  2   B
3  1  3   D
4  1  4   A
5  1  5   C
6  1  6   D
7  1  7   C
8  1  8   B
9  1  9   B
10 1 10   B

I'd like to add another column 'var' that is defined based on the
following mapping of column 'fac':

A -> 8
B -> 11
C -> 3
D -> 2

How can I achieve this in an elegant way (with a generic approach for
any length)?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mapping from one vector to another

2014-07-17 Thread Gang Chen
Thanks a lot for the quick and elegant solutions, Sarah, Bill and
Petr! I really appreciate it, including the suggestion of setting a
random seed. Have a nice day!

Gang

On Thu, Jul 17, 2014 at 11:15 AM, Sarah Goslee  wrote:
> What about:
>
> d$var <- c(8, 11, 3, 2)[d$fac]
>
> Side note: it's much appreciated that you included data and a clear
> problem statement. If you use
> set.seed(123)
> before your call to sample(), everyone who tries it will get the same
> fac that you do. Otherwise we all get something different. Or just
> generate your own example data and use dput() to include it in your
> email.
>
> Sarah
>
> On Thu, Jul 17, 2014 at 11:00 AM, Gang Chen  wrote:
>> Suppose I have the following dataframe:
>>
>> L4 <- LETTERS[1:4]
>> fac <- sample(L4, 10, replace = TRUE)
>> (d <- data.frame(x = 1, y = 1:10, fac = fac))
>>
>>  x  y  fac
>> 1  1  1   B
>> 2  1  2   B
>> 3  1  3   D
>> 4  1  4   A
>> 5  1  5   C
>> 6  1  6   D
>> 7  1  7   C
>> 8  1  8   B
>> 9  1  9   B
>> 10 1 10   B
>>
>> I'd like to add another column 'var' that is defined based on the
>> following mapping of column 'fac':
>>
>> A -> 8
>> B -> 11
>> C -> 3
>> D -> 2
>>
>> How can I achieve this in an elegant way (with a generic approach for
>> any length)?
>>
>> Thanks,
>> Gang
>>
> --
> Sarah Goslee
> http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mapping from one vector to another

2014-07-17 Thread Gang Chen
Jeff,

Even though the solutions from the previous responders are good enough
for my current situation, the principle you just raised will be
definitely beneficial to your future work. Thanks a lot for sharing
the insights!

Gang

On Thu, Jul 17, 2014 at 12:06 PM, Jeff Newmiller
 wrote:
> You ask about generic methods for introducing alternate values for factors,
> and some of the other responses address this quite efficiently.
>
> However, a factor has meaning only within one vector at a time, since
> another vector may have additional values or missing values relative to
> the first vector. For example, you used the "sample" function which
> is not guaranteed to select at least one of each of the four letters in L4.
> Or, what if the data has values the mapping doesn't address?
>
> For any work in which I am dealing with categorical data in multiple
> places (e.g. your "d" data frame and whatever data structure you use
> to define your mapping) I prefer NOT to work with factors until all of
> my categories of data are moved into one vector (typically a column
> in a data frame). Rather, I work with character vectors during the
> data manipulation phase and only convert to factor when I start
> analyzing or displaying the data.
>
> With this in mind, I use a general flow something like:
>
> d <- data.frame( x = 1, y = 1:10, fac = fac, stringsAsFactors=FALSE )
> mp <- data.frame( fac=LETTERS[1:4], value=c(8,11,3,2) )
> d2 <- merge( d, mp, all.x=TRUE )
> d2$fac <- factor( d2$fac ) # optional
>
> If you actually are in the analysis phase and are not pulling data from
> multiple external sources, then you may have already confirmed the
> completeness and range of values you have to work with then one of the other
> more efficient methods may still be a better choice for this specific task.
>
> Hadley Wickham's "tidy data" [1] principles address this concern more
> thoroughly than I have.
>
> [1] Google this phrase... paper seems to be a work in progress.
>
>
> On Thu, 17 Jul 2014, Gang Chen wrote:
>
>> Suppose I have the following dataframe:
>>
>> L4 <- LETTERS[1:4]
>> fac <- sample(L4, 10, replace = TRUE)
>> (d <- data.frame(x = 1, y = 1:10, fac = fac))
>>
>> x  y  fac
>> 1  1  1   B
>> 2  1  2   B
>> 3  1  3   D
>> 4  1  4   A
>> 5  1  5   C
>> 6  1  6   D
>> 7  1  7   C
>> 8  1  8   B
>> 9  1  9   B
>> 10 1 10   B
>>
>> I'd like to add another column 'var' that is defined based on the
>> following mapping of column 'fac':
>>
>> A -> 8
>> B -> 11
>> C -> 3
>> D -> 2
>>
>> How can I achieve this in an elegant way (with a generic approach for
>> any length)?
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Puzzling "Execution halted"

2012-11-20 Thread Gang Chen
I wrote an R program that does heavy computations with hundreds of
lines of code. It's running fine both interactively and in batch mode
on a Mac OS X computer. The program also has no problem running on a
Linux system (Fedora 14) interactively. However, when I try it on the
terminal in batch mode on the Linux system, it chokes in the middle of
the computation with the "Execution halted" error message.

I already put try or tryCatch in those places where computation may
throw an error. And the warnings are set in default (options(warn=1)).
I wish I could provide the code for help, but that seems impractical.
How to debug this?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Puzzling "Execution halted"

2012-11-20 Thread Gang Chen
Thanks for the pointer!

Unfortunately it doesn't look like a memory issue: /var/log/messages
contains nothing about memory problems. And the Linux system has
enough RAM for this computation.

Gang

On Tue, Nov 20, 2012 at 1:03 PM, Milan Bouchet-Valat  wrote:
> Le mardi 20 novembre 2012 à 12:54 -0500, Gang Chen a écrit :
>> I wrote an R program that does heavy computations with hundreds of
>> lines of code. It's running fine both interactively and in batch mode
>> on a Mac OS X computer. The program also has no problem running on a
>> Linux system (Fedora 14) interactively. However, when I try it on the
>> terminal in batch mode on the Linux system, it chokes in the middle of
>> the computation with the "Execution halted" error message.
>>
>> I already put try or tryCatch in those places where computation may
>> throw an error. And the warnings are set in default (options(warn=1)).
>> I wish I could provide the code for help, but that seems impractical.
>> How to debug this?
> Have a look at your /var/log/messages: it might well be the
> out-of-memory killer killing your R process because it eats too much
> RAM...
>
> My two cents

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] dataframe manipulation

2013-12-13 Thread Gang Chen
Suppose I have a dataframe defined as

 L3 <- LETTERS[1:3]
 (d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
= TRUE)))

   x  y fac
1  1  1   C
2  1  2   A
3  1  3   B
4  1  4   C
5  1  5   B
6  1  6   B
7  1  7   A
8  1  8   A
9  1  9   B
10 1 10   A

I want to extract those rows that are the first occurrences for each level
of factor 'fac', which are basically the first three rows above. How can I
achieve that? The real dataframe is more complicated than the example
above, and I can't simply list all the levels of factor 'fac' by
exhaustibly listing all the levels like the following

d[d$fac=='A' | d$fac=='B' | d$fac=='C', ]

Thanks,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dataframe manipulation

2013-12-13 Thread Gang Chen
Perfect! Thanks a lot, A.K!


On Fri, Dec 13, 2013 at 4:21 PM, arun  wrote:

>
>
> Hi,
> Try:
>  d[match(unique(d$fac),d$fac),]
> A.K.
>
>
> On Friday, December 13, 2013 4:17 PM, Gang Chen 
> wrote:
> Suppose I have a dataframe defined as
>
>  L3 <- LETTERS[1:3]
>  (d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
> = TRUE)))
>
>x  y fac
> 1  1  1   C
> 2  1  2   A
> 3  1  3   B
> 4  1  4   C
> 5  1  5   B
> 6  1  6   B
> 7  1  7   A
> 8  1  8   A
> 9  1  9   B
> 10 1 10   A
>
> I want to extract those rows that are the first occurrences for each level
> of factor 'fac', which are basically the first three rows above. How can I
> achieve that? The real dataframe is more complicated than the example
> above, and I can't simply list all the levels of factor 'fac' by
> exhaustibly listing all the levels like the following
>
> d[d$fac=='A' | d$fac=='B' | d$fac=='C', ]
>
> Thanks,
> Gang
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dataframe manipulation

2013-12-13 Thread Gang Chen
Another neat solution! Thanks a lot, Sarah!


On Fri, Dec 13, 2013 at 4:35 PM, Sarah Goslee wrote:

> What about:
>
> lapply(levels(d$fac), function(x)head(d[d$fac == x,], 1))
>
>
> Thanks for the reproducible example. If you put set.seed(123) before
> the call to sample, then everyone who tries it will get the same data
> frame d.
>
> Sarah
>
>
> On Fri, Dec 13, 2013 at 4:15 PM, Gang Chen  wrote:
> > Suppose I have a dataframe defined as
> >
> >  L3 <- LETTERS[1:3]
> >  (d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10,
> replace
> > = TRUE)))
> >
> >x  y fac
> > 1  1  1   C
> > 2  1  2   A
> > 3  1  3   B
> > 4  1  4   C
> > 5  1  5   B
> > 6  1  6   B
> > 7  1  7   A
> > 8  1  8   A
> > 9  1  9   B
> > 10 1 10   A
> >
> > I want to extract those rows that are the first occurrences for each
> level
> > of factor 'fac', which are basically the first three rows above. How can
> I
> > achieve that? The real dataframe is more complicated than the example
> > above, and I can't simply list all the levels of factor 'fac' by
> > exhaustibly listing all the levels like the following
> >
> > d[d$fac=='A' | d$fac=='B' | d$fac=='C', ]
> >
> > Thanks,
> > Gang
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Change factor levels

2013-12-14 Thread Gang Chen
Suppose I have a dataframe 'd' defined as

 L3 <- LETTERS[1:3]
 d0 <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
= TRUE))
 (d <- d0[d0$fac %in% c('A', 'B'),])

  x y fac
2 1 2   B
3 1 3   A
4 1 4   A
5 1 5   A
6 1 6   B
8 1 8   A

Even though factor 'fac' in 'd' only has 2 levels, but it seems to bear the
birthmark of 3 levels from its parent 'd0':

str(d)

'data.frame': 6 obs. of  3 variables:
 $ x  : num  1 1 1 1 1 1
 $ y  : num  2 3 4 5 6 8
 $ fac: Factor w/ 3 levels "A","B","C": 2 1 1 1 2 1

How can I cut the umbilical cord so that factor 'fac' in 'd' would have an
accurate birth certificate with the correct number of levels? Apparently
the following does not work:

levels(d$fac) <- c('A', 'B')

Also any reason for this heritage?

Thanks,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change factor levels

2013-12-14 Thread Gang Chen
Thanks both Uwe and Daniel for the great help!


On Sat, Dec 14, 2013 at 3:30 PM, Dániel Kehl  wrote:

> Dear Gang,
>
> this seem to solve your problem.
>
>
> http://stackoverflow.com/questions/1195826/dropping-factor-levels-in-a-subsetted-data-frame-in-r
>
>
> best
> daniel
> 
> Feladó: r-help-boun...@r-project.org [r-help-boun...@r-project.org] ;
> meghatalmazó: Gang Chen [gangch...@gmail.com]
> Küldve: 2013. december 14. 21:09
> To: r-help
> Tárgy: [R] Change factor levels
>
> Suppose I have a dataframe 'd' defined as
>
>  L3 <- LETTERS[1:3]
>  d0 <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
> = TRUE))
>  (d <- d0[d0$fac %in% c('A', 'B'),])
>
>   x y fac
> 2 1 2   B
> 3 1 3   A
> 4 1 4   A
> 5 1 5   A
> 6 1 6   B
> 8 1 8   A
>
> Even though factor 'fac' in 'd' only has 2 levels, but it seems to bear the
> birthmark of 3 levels from its parent 'd0':
>
> str(d)
>
> 'data.frame': 6 obs. of  3 variables:
>  $ x  : num  1 1 1 1 1 1
>  $ y  : num  2 3 4 5 6 8
>  $ fac: Factor w/ 3 levels "A","B","C": 2 1 1 1 2 1
>
> How can I cut the umbilical cord so that factor 'fac' in 'd' would have an
> accurate birth certificate with the correct number of levels? Apparently
> the following does not work:
>
> levels(d$fac) <- c('A', 'B')
>
> Also any reason for this heritage?
>
> Thanks,
> Gang
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trouble installing gsl wrapper

2010-10-29 Thread Gang Chen
Hi,

I'm trying to install the gsl wrapper source code
(http://cran.r-project.org/src/contrib/gsl_1.9-8.tar.gz) on a Linux
system (OpenSuse 11.1), but encountering the following problem. I've
already installed 'gsl' version 1.14
(ftp://ftp.gnu.org/gnu/gsl/gsl-1.14.tar.gz) on the system. What's
missing? Thanks a lot...

> R CMD INSTALL gsl

* installing to library ‘/usr/lib64/R/library’
* installing *source* package ‘gsl’ ...
checking for gsl-config... /usr/local/bin/gsl-config
checking if GSL version >= 1.12... checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
make: Nothing to be done for `all'.
installing to /usr/lib64/R/library/gsl/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '/usr/lib64/R/library/gsl/libs/gsl.so':
  libgsl.so.0: cannot open shared object file: No such file or directory
ERROR: loading failed
* removing ‘/usr/lib64/R/library/gsl’

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Trouble installing gsl wrapper

2010-10-30 Thread Gang Chen
You nailed it, Prof. Ripley! Thanks a lot...

Gang

On Sat, Oct 30, 2010 at 2:58 PM, Prof Brian Ripley
 wrote:
> On Sat, 30 Oct 2010, Gang Chen wrote:
>
>> Hi,
>>
>> I'm trying to install the gsl wrapper source code
>> (http://cran.r-project.org/src/contrib/gsl_1.9-8.tar.gz) on a Linux
>> system (OpenSuse 11.1), but encountering the following problem. I've
>> already installed 'gsl' version 1.14
>> (ftp://ftp.gnu.org/gnu/gsl/gsl-1.14.tar.gz) on the system. What's
>> missing? Thanks a lot...
>
> Installing the gsl library correctly?  I need to guess quite a bit here (and
> a clean install attempt would have given a few more clues).
>
> I suspect you installed into /usr/local/lib whereas your OS probably uses
> /usr/local/lib64 (most x86_64 Linuxen do, and you seem to be using lib64 for
> R).  In that case ld.so most likely will not find the dynamic library in
> /usr/local/lib.
>
> You can avoid such problems by installing auxiliary software such as gsl
> from RPMs -- I would be very surprised if OpenSuse did not have gsl and
> gsl-devel RPMs.  Otherwise you need to install from the sources by something
> like
>
> make install libdir=/usr/local/lib64
>
>>
>>> R CMD INSTALL gsl
>>
>> * installing to library ‘/usr/lib64/R/library’
>> * installing *source* package ‘gsl’ ...
>> checking for gsl-config... /usr/local/bin/gsl-config
>> checking if GSL version >= 1.12... checking for gcc... gcc
>> checking for C compiler default output file name... a.out
>> checking whether the C compiler works... yes
>> checking whether we are cross compiling... no
>> checking for suffix of executables...
>> checking for suffix of object files... o
>> checking whether we are using the GNU C compiler... yes
>> checking whether gcc accepts -g... yes
>> checking for gcc option to accept ISO C89... none needed
>> yes
>> configure: creating ./config.status
>> config.status: creating src/Makevars
>> ** libs
>> make: Nothing to be done for `all'.
>
> This was not a clean install 
>
>> installing to /usr/lib64/R/library/gsl/libs
>> ** R
>> ** inst
>> ** preparing package for lazy loading
>> ** help
>> *** installing help indices
>> ** building package indices ...
>> ** testing if installed package can be loaded
>> Error in dyn.load(file, DLLpath = DLLpath, ...) :
>>  unable to load shared object '/usr/lib64/R/library/gsl/libs/gsl.so':
>>  libgsl.so.0: cannot open shared object file: No such file or directory
>> ERROR: loading failed
>> * removing ‘/usr/lib64/R/library/gsl’
>>
>>> sessionInfo()
>>
>> R version 2.12.0 (2010-10-15)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3D scatter plot with projections

2011-01-08 Thread Gang Chen
I want to create some 3D scatter plot with a diagonal line. In addition, I'd
like to have those points plus the diagonal line projected to those three
planes (xy, yz and xz). Which package can I use to achieve this,
scatterplot3d or something else?

Thanks,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3D scatter plot with projections

2011-01-08 Thread Gang Chen
Thanks a lot for the quick help! How to project the scatter plot with the
diagonal line to the three planes with scatterplot3d? I could not find such
an example demonstrating that in the vignette.

Thanks,
Gang

2011/1/8 Uwe Ligges 

>
>
> On 08.01.2011 16:38, Gang Chen wrote:
>
>> I want to create some 3D scatter plot with a diagonal line. In addition,
>> I'd
>> like to have those points plus the diagonal line projected to those three
>> planes (xy, yz and xz). Which package can I use to achieve this,
>> scatterplot3d or something else?
>>
>
> Yes, scatterplot3d, rgl, and maybe also others.
>
> For looking at it interactively I always prefer rgl, for statical
> representations (e.g. printing) scatterplot3d can be used with all known R
> devices.
>
> Best,
> Uwe Ligges
>
>  Thanks,
>> Gang
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3D scatter plot with projections

2011-01-08 Thread Gang Chen
Yes, too bad I didn't realize that it's so simple like that! Thanks...

On Sat, Jan 8, 2011 at 12:45 PM, David Winsemius wrote:

>
> On Jan 8, 2011, at 11:21 AM, Gang Chen wrote:
>
>  Thanks a lot for the quick help! How to project the scatter plot with the
>> diagonal line to the three planes with scatterplot3d? I could not find
>> such
>> an example demonstrating that in the vignette.
>>
>
> I'm puzzled. If you have (x1, y1, z1) and (x2, y2, z2) as starting and
> ending points, Deducing the three projected segments , i.e. the starting and
> ending points of the projections on the xy, xz and yz planes (z = 0, y=0,
> and x= 0 respectively) would seem to be trivial. So maybe I just don't
> understand.  What part is offering difficulty? Please show your code so far.
>
> --
> David.
>
>>
>> Thanks,
>> Gang
>>
>> 2011/1/8 Uwe Ligges 
>>
>>
>>>
>>> On 08.01.2011 16:38, Gang Chen wrote:
>>>
>>>  I want to create some 3D scatter plot with a diagonal line. In addition,
>>>> I'd
>>>> like to have those points plus the diagonal line projected to those
>>>> three
>>>> planes (xy, yz and xz). Which package can I use to achieve this,
>>>> scatterplot3d or something else?
>>>>
>>>>
>>> Yes, scatterplot3d, rgl, and maybe also others.
>>>
>>> For looking at it interactively I always prefer rgl, for statical
>>> representations (e.g. printing) scatterplot3d can be used with all known
>>> R
>>> devices.
>>>
>>> Best,
>>> Uwe Ligges
>>>
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Convert dataframe with two factors from wide to long format

2011-05-22 Thread Gang Chen
I know how to convert a simple dataframe from wide to long format with one
varying factor. However, for a dataset with two factors like the following,

Subj T1_Cond1 T1_Cond2 T2_Cond1 T2_Cond2
 1   0.125869   4.108232   1.099392   5.556614
 2   1.427940   2.170026   0.120748   1.176353

How to elegantly convert to a long form as

Subj  Time Cond  Value
1 11   0.125869
1 12   4.108232
1 21   1.099392
...

Thanks in advance!

Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert dataframe with two factors from wide to long format

2011-05-22 Thread Gang Chen
Just learned another trick today. Thanks a lot to both of you for the kind
help!

Gang

On Sun, May 22, 2011 at 6:04 PM, Dennis Murphy  wrote:

> Hi:
>
> library(reshape2)
> d1 <- melt(d, id = 'Subj')
> d1 <- cbind(d1, colsplit(d1$variable, '_', c('Time', 'Cond')))
> d1 <- transform(d1,
>Time = substr(Time, 2, 2),
>Cond = substr(Cond, 5, 5))[c(1, 4, 5, 3)]
> str(d1)
> d1
>
> You can decide whether to leave Time and Cond as character or to
> convert them to numeric or factor.
>
> HTH,
> Dennis
>


> Assume this is a dataframe named 'tst'
>
> require(reshape2)
> ltest <- melt(tst, id.vars =1)
>
> dcast(ltest,  ubj+substr(variable, 1,2) + substr(variable, 4,8) ~. )
>  ubj substr(variable, 1, 2) substr(variable, 4, 8)   NA
> 1   1 T1  Cond1 0.125869
> 2   1 T1  Cond2 4.108232
> 3   1 T2  Cond1 1.099392
> 4   1 T2  Cond2 5.556614
> 5   2 T1  Cond1 1.427940
> 6   2 T1  Cond2 2.170026
> 7   2 T2  Cond1 0.120748
> 8   2 T2  Cond2 1.176353
>
> dcast(ltest,  ubj+substr(variable, 1,2) ~ substr(variable, 4,8) )
>
>  ubj substr(variable, 1, 2)Cond1Cond2
> 1   1 T1 0.125869 4.108232
> 2   1 T2 1.099392 5.556614
> 3   2 T1 1.427940 2.170026
> 4   2 T2 0.120748 1.176353
>
> The modifications to get it exactly as requested are left to the reader
>


> On Sun, May 22, 2011 at 2:25 PM, Gang Chen  wrote:
> > I know how to convert a simple dataframe from wide to long format with
> one
> > varying factor. However, for a dataset with two factors like the
> following,
> >
> > Subj T1_Cond1 T1_Cond2 T2_Cond1 T2_Cond2
> >  1   0.125869   4.108232   1.099392   5.556614
> >  2   1.427940   2.170026   0.120748   1.176353
> >
> > How to elegantly convert to a long form as
> >
> > Subj  Time Cond  Value
> > 1 11   0.125869
> > 1 12   4.108232
> > 1 21   1.099392
> > ...
> >
> > Thanks in advance!
> >
> > Gang
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A question about data frame

2011-03-10 Thread Gang Chen
A very simple question. With a data frame like this:

> n = c(2, 3, 5)
> s = c("aa", "bb", "cc")
> df = data.frame(n, s)

I want df$s[1] or df[1,2], but how can I get rid of the extra line in
the output about the factor levels:

> df$s[1]
[1] aa
Levels: aa bb cc

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A question about data frame

2011-03-10 Thread Gang Chen
Thanks! You mean something like:

> print(df$s[1], max.levels=0)

It seems I could also do

> as.character(df$s[1])

Any other/better solutions?

On Thu, Mar 10, 2011 at 4:40 PM, Rob Tirrell  wrote:
> See the max.levels argument in ?print. I think this is what you're looking
> for.
> --
> Robert Tirrell | r...@stanford.edu | (607) 437-6532
> Program in Biomedical Informatics | Butte Lab | Stanford University
>
>
> On Thu, Mar 10, 2011 at 13:35, Gang Chen  wrote:
>>
>> n = c(2, 3, 5)
>> > s = c("aa", "bb", "cc")
>> > df = data.frame(n, s)
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A question about data frame

2011-03-10 Thread Gang Chen
Yes, indeed I wanted them stored as characters instead of factor
levels. Thanks a lot for the help, Jim and Phil!

Gang

On Thu, Mar 10, 2011 at 5:19 PM, Phil Spector  wrote:
> Gang -
>   It sounds like you want your character variables to
> be stored as character values, not factor values.  If that's
> the case, use
>
> df = data.frame(n, s,stringsAsFactors=FALSE)
>
> If you want them to be factors, but not to display as factors,
> others have provided usable solutions.
>
>                                        - Phil Spector
>                                         Statistical Computing Facility
>                                         Department of Statistics
>                                         UC Berkeley
>                                         spec...@stat.berkeley.edu
>
>
>
> On Thu, 10 Mar 2011, Gang Chen wrote:
>
>> A very simple question. With a data frame like this:
>>
>>> n = c(2, 3, 5)
>>> s = c("aa", "bb", "cc")
>>> df = data.frame(n, s)
>>
>> I want df$s[1] or df[1,2], but how can I get rid of the extra line in
>> the output about the factor levels:
>>
>>> df$s[1]
>>
>> [1] aa
>> Levels: aa bb cc
>>
>> Thanks,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Saving a graph

2011-08-29 Thread Gang Chen
I read somewhere that vector graphics such as eps or dpf are more favorable
than alternatives (jpeg, bmp or png) for publication because vector graphics
scale properly when enlarged. However, my problem is that the file generated
from a graph of fixed size is too large (in the order of 10MB) because of
many data points in multiple scatterplots. Any suggestions?

Thanks,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] non-recursive SVAR model with vars

2011-06-20 Thread Gang Chen
Hi, I have a question about SVAR modeling with the package vars. How does it
handle the situation where the A (structural) matrix has a non-recursive
structure in the SVAR model? In other words, what kind of algorithm does
vars adopt to deal with the unidentifiable issue in a non-recursive model?

Thank you,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confused about a warning message

2011-07-07 Thread Gang Chen
I define the following function to convert a t-value with degrees of freedom
DF to another t-value with different degrees of freedom fullDF:

tConvert <- function(tval, DF, fullDF) ifelse(DF>=1, qt(pt(tval, DF),
fullDF), 0)

It works as expected with the following case:

> tConvert(c(2,3), c(10,12), 12)
[1] 1.961905 3.00

However, it gives me warning for the example below although the output is
still as intended:

> tConvert(c(2,3), c(0,12), 12)
[1] 0 3
Warning message:
In pt(q, df, lower.tail, log.p) : NaNs produced

I'm confused about the warning especially considering the fact that the
following works correctly without such warning:

> tConvert(2, 0, 12)
[1] 0

What am I missing?

Thanks,
Gang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confused about a warning message

2011-07-07 Thread Gang Chen
Thanks for the help! Are you sure R version plays a role in this case? My R
version is 2.13.0

Your suggestion prompted me to look into the help content of ifelse, and a
similar example exists there:

 x <- c(6:-4)
 sqrt(x)  #- gives warning
 sqrt(ifelse(x >= 0, x, NA))  # no warning

 ## Note: the following also gives the warning !
 ifelse(x >= 0, sqrt(x), NA)

Based on the above example, now I have a solution for my situation:

tConvert2 <- function(tval, DF, fullDF) qt(pt(ifelse(DF>=1, tval, 0),
ifelse(DF>=1, DF, 1)), fullDF)

> tConvert2(c(2,3), c(0,12), 12)
[1] 0 3

However, I feel my solution is a little kludged. Any better idea?

Thanks,
Gang



On Thu, Jul 7, 2011 at 9:04 PM, David Winsemius wrote:

>
> On Jul 7, 2011, at 8:52 PM, David Winsemius wrote:
>
>
>> On Jul 7, 2011, at 8:47 PM, Gang Chen wrote:
>>
>>  I define the following function to convert a t-value with degrees of
>>> freedom
>>> DF to another t-value with different degrees of freedom fullDF:
>>>
>>> tConvert <- function(tval, DF, fullDF) ifelse(DF>=1, qt(pt(tval, DF),
>>> fullDF), 0)
>>>
>>> It works as expected with the following case:
>>>
>>>  tConvert(c(2,3), c(10,12), 12)
>>>>
>>> [1] 1.961905 3.00
>>>
>>> However, it gives me warning for the example below although the output is
>>> still as intended:
>>>
>>>  tConvert(c(2,3), c(0,12), 12)
>>>>
>>> [1] 0 3
>>> Warning message:
>>> In pt(q, df, lower.tail, log.p) : NaNs produced
>>>
>>> I'm confused about the warning especially considering the fact that the
>>> following works correctly without such warning:
>>>
>>>  tConvert(2, 0, 12)
>>>>
>>> [1] 0
>>>
>>> What am I missing?
>>>
>>
>> The fact that ifelse evaluates both sides of the consequent and
>> alternative.
>>
>
> I also think you should update yur R to the most recent version since a
> current version does not issue that warning.
>
>
> --
> David Winsemius, MD
> West Hartford, CT
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Wide to long form conversion

2011-10-06 Thread Gang Chen
I have some data 'myData' in wide form (attached at the end), and
would like to convert it to long form. I wish to have five variables
in the result:

1) Subj: factor
2) Group: between-subjects factor (2 levels: s / w)
3) Reference: within-subject factor (2 levels: Me / She)
4) F: within-subject factor (2 levels: F1 / F2)
5) J: within-subject factor (2 levels: J1 / J2)

As this is the first time I'm learning such a conversion, could
someone help me out?

Many thanks,
Gang

> myData

   Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj
1  s45 61036 6 9   S1
2  s65 5 643 5 6   S2
3  s74 6 574 5 3   S3
4  s85 8 771 8 6   S4
5  s   106 4 796 4 6   S5
6  s52 4 741 4 2   S6
7  s   13210 4   112 4 3   S7
8  s81 31160 310   S8
9  s69 5 868 5 6   S9
10 s   145 610   135 510  S10
11 s   15218 2   14118 2  S11
12 s69 4 95   11 3 8  S12
13 s55 01243 0 8  S13
14 s56 4 946 2 6  S14
15 s   14512 3   12311 3  S15
16 s7211 35210 2  S16
17 s17 4 516 3 5  S17
18 s62 7 462 7 4  S18
19 s94 8 5   104 6 3  S19
20 s82 6 592 6 4  S20
21 s65 5 766 5 5  S21
22 s88 3 767 5 3  S22
23 s   114 6 711 6 4  S23
24 s63 2 464 2 2  S24
25 s44 6 623 4 6  S25
26 w59 4 737 3 5  S26
27 w76 3 541 0 4  S27
28 w   10414 28410 2  S28
29 w97 5 684 5 3  S29
30 w92 7 562 6 5  S30
31 w67 6 765 5 8  S31
32 w7612 76310 7  S32
33 w   123 8 9   113 4 7  S33
34 w   12210 592 6 3  S34
35 w6310 453 5 3  S35
36 w93 9 963 7 8  S36
37 w5   11 7 74   11 3 4  S37
38 w74 4 673 1 5  S38
39 w65 1 833 0 8  S39
40 w   10310 273 7 2  S40
41 w1   11 7 518 4 3  S41
42 w   105 610   104 3 9  S42
43 w63 9 242 6 0  S43
44 w9511 454 7 3  S44
45 w85 6 384 2 3  S45
46 w84 8 741 2 6  S46
47 w   122 6 2   101 5 2  S47
48 w   106 9 875 7 8  S48
49 w   13615 1   12414 0  S49
50 w78 11247 111  S50
51 w   123 9 491 7 4  S51

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wide to long form conversion

2011-10-06 Thread Gang Chen
Thanks for the pointer! I still couldn't figure out how to convert my
data because the example at stackoverflow seems to have only one
variable (Year) while there are three within-subject variables plus
one between-subjects variable. Any further help?

Gang

On Thu, Oct 6, 2011 at 4:34 PM, Andrew Miles  wrote:
> Take a look here.
> http://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format
> Andrew Miles
> Department of Sociology
> Duke University
> On Oct 6, 2011, at 4:28 PM, Gang Chen wrote:
>
> I have some data 'myData' in wide form (attached at the end), and
> would like to convert it to long form. I wish to have five variables
> in the result:
>
> 1) Subj: factor
> 2) Group: between-subjects factor (2 levels: s / w)
> 3) Reference: within-subject factor (2 levels: Me / She)
> 4) F: within-subject factor (2 levels: F1 / F2)
> 5) J: within-subject factor (2 levels: J1 / J2)
>
> As this is the first time I'm learning such a conversion, could
> someone help me out?
>
> Many thanks,
> Gang
>
> myData
>
>   Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj
> 1  s    4    5 6    10    3    6 6 9   S1
> 2  s    6    5 5 6    4    3 5 6   S2
> 3  s    7    4 6 5    7    4 5 3   S3
> 4  s    8    5 8 7    7    1 8 6   S4
> 5  s   10    6 4 7    9    6 4 6   S5
> 6  s    5    2 4 7    4    1 4 2   S6
> 7  s   13    2    10 4   11    2 4 3   S7
> 8  s    8    1 3    11    6    0 3    10   S8
> 9  s    6    9 5 8    6    8 5 6   S9
> 10 s   14    5 6    10   13    5 5    10  S10
> 11 s   15    2    18 2   14    1    18 2  S11
> 12 s    6    9 4 9    5   11 3 8  S12
> 13 s    5    5 0    12    4    3 0 8  S13
> 14 s    5    6 4 9    4    6 2 6  S14
> 15 s   14    5    12 3   12    3    11 3  S15
> 16 s    7    2    11 3    5    2    10 2  S16
> 17 s    1    7 4 5    1    6 3 5  S17
> 18 s    6    2 7 4    6    2 7 4  S18
> 19 s    9    4 8 5   10    4 6 3  S19
> 20 s    8    2 6 5    9    2 6 4  S20
> 21 s    6    5 5 7    6    6 5 5  S21
> 22 s    8    8 3 7    6    7 5 3  S22
> 23 s   11    4 6 7    1    1 6 4  S23
> 24 s    6    3 2 4    6    4 2 2  S24
> 25 s    4    4 6 6    2    3 4 6  S25
> 26 w    5    9 4 7    3    7 3 5  S26
> 27 w    7    6 3 5    4    1 0 4  S27
> 28 w   10    4    14 2    8    4    10 2  S28
> 29 w    9    7 5 6    8    4 5 3  S29
> 30 w    9    2 7 5    6    2 6 5  S30
> 31 w    6    7 6 7    6    5 5 8  S31
> 32 w    7    6    12 7    6    3    10 7  S32
> 33 w   12    3 8 9   11    3 4 7  S33
> 34 w   12    2    10 5    9    2 6 3  S34
> 35 w    6    3    10 4    5    3 5 3  S35
> 36 w    9    3 9 9    6    3 7 8  S36
> 37 w    5   11 7 7    4   11 3 4  S37
> 38 w    7    4 4 6    7    3 1 5  S38
> 39 w    6    5 1 8    3    3 0 8  S39
> 40 w   10    3    10 2    7    3 7 2  S40
> 41 w    1   11 7 5    1    8 4 3  S41
> 42 w   10    5 6    10   10    4 3 9  S42
> 43 w    6    3 9 2    4    2 6 0  S43
> 44 w    9    5    11 4    5    4 7 3  S44
> 45 w    8    5 6 3    8    4 2 3  S45
> 46 w    8    4 8 7    4    1 2 6  S46
> 47 w   12    2 6 2   10    1 5 2  S47
> 48 w   10    6 9 8    7    5 7 8  S48
> 49 w   13    6    15 1   12    4    14 0  S49
> 50 w    7    8 1    12    4    7 1    11  S50
> 51 w   12    3 9 4    9    1 7 4  S51
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wide to long form conversion

2011-10-07 Thread Gang Chen
Dennis,

Yes, I was wrong! The variables should be:

1) Subj: factor
2) Group: between-subjects factor (2 levels: s / w)
3) Reference: within-subject factor (2 levels: Me / She)
4) F/J: within-subject factor (2 levels: F / J)
5) Time: within-subject factor (2 levels: 1 / 2)
6) Value

So your mData3 is what I was looking for. Thanks a lot!

Gang

On Thu, Oct 6, 2011 at 9:09 PM, Dennis Murphy  wrote:
> Hi:
>
>> I have some data 'myData' in wide form (attached at the end), and
>> would like to convert it to long form. I wish to have five variables
>> in the result:
>>
>> 1) Subj: factor
>> 2) Group: between-subjects factor (2 levels: s / w)
>> 3) Reference: within-subject factor (2 levels: Me / She)
>> 4) F: within-subject factor (2 levels: F1 / F2)
>> 5) J: within-subject factor (2 levels: J1 / J2)
>
> I don't see how you can get all of 3-5 given the way your data is
> structured. The problem is that each column contains two of the three
> variables you want, but not all three. I can see a way to get
>
> Subj   Group  Ref   Time  F   J
>  S1       s      Me     1     4   5
>  S1       s      Me     2     3   6
>  S1       s     She    1     6  10
>  S1       s     She    2     6   9
>
> or an 8 line version with Ref (4 Me, 4 She), Factor (F1, J1, F2, J2)
> repeated twice and the appropriate response vector, but not a way
> where you have three columns for Ref, F and J. For example, what is
> the 'J' for MeF1 or the F for SheJ2?
>
> With that, here are a few stabs using the reshape2 package. The first
> step is to do a little renaming of your data frame so that one can use
> the colsplit() function to generate a new set of variables.
>
> names(myData) <- c("Group", "Me_F_1",  "Me_J_1",  "She_F_1", "She_J_1",
>                   "Me_F_2",  "Me_J_2",  "She_F_2", "She_J_2", "Subj")
>
> library('plyr')
> library('reshape2')
>
> # collapses the eight columns to be reshaped into a factor named
> # variable with a corresponding variable named value
> mData <- melt(myData, id = c('Subj', 'Group'))
> head(mData)
>
> # Split the original variables into three new columns, named
> # Ref, Var and Time, respectively:
> newvars <- colsplit(mData$variable, '_', c('Ref', 'Var', 'Time'))
>
> # Append these to the melted data frame and remove 'variable'
> mData2 <- cbind(mData, newvars)[, -3]
>
> # This comes closest to your original intent:
> mData3 <- arrange(mData2, Subj, Ref, Var, Time)
> head(mData3, 8)
>
>   Subj Group value Ref Var Time
> 1    S1     s     4  Me   F    1
> 2    S1     s     3  Me   F    2
> 3    S1     s     5  Me   J    1
> 4    S1     s     6  Me   J    2
> 5    S1     s     6 She   F    1
> 6    S1     s     6 She   F    2
> 7    S1     s    10 She   J    1
> 8    S1     s     9 She   J    2
>
> # Some rearrangements to consider:
> mData4 <- cast(mData3, Subj + Group + Ref + Time ~ Var, value_var = 'value')
> head(mData4, 4)
>
>  Subj Group Ref Time  F  J
> 1   S1     s  Me    1  4  5
> 2   S1     s  Me    2  3  6
> 3   S1     s She    1  6 10
> 4   S1     s She    2  6  9
>
> mData5 <- cast(mData3, Subj + Group + Ref + Var ~ Time, value_var = 'value')
> head(mData5, 4)
>
>  Subj Group Ref Var  1  2
> 1   S1     s  Me   F  4  3
> 2   S1     s  Me   J  5  6
> 3   S1     s She   F  6  6
> 4   S1     s She   J 10  9
>
> If you like this one, it's probably a good idea to rename the last two
> columns 'Time1' and 'Time2' or something similar.
>
> HTH,
> Dennis
>
> On Thu, Oct 6, 2011 at 1:28 PM, Gang Chen  wrote:
>> I have some data 'myData' in wide form (attached at the end), and
>> would like to convert it to long form. I wish to have five variables
>> in the result:
>>
>> 1) Subj: factor
>> 2) Group: between-subjects factor (2 levels: s / w)
>> 3) Reference: within-subject factor (2 levels: Me / She)
>> 4) F: within-subject factor (2 levels: F1 / F2)
>> 5) J: within-subject factor (2 levels: J1 / J2)
>>
>> As this is the first time I'm learning such a conversion, could
>> someone help me out?
>>
>> Many thanks,
>> Gang
>>
>>> myData
>>
>>   Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj
>> 1      s    4    5     6    10    3    6     6     9   S1
>> 2      s    6    5     5     6    4    3     5     6   S2
>> 3      s    7    4     6     5    7    4     5     3   S3
>> 4  

Re: [R] Wide to long form conversion

2011-10-07 Thread Gang Chen
Jim, I really appreciate your help!

I like the power of rep_n_stack, but how can I use rep_n_stack to get
the following result?

  Subj Group value Ref Var Time
1S1 s 4  Me   F1
2S1 s 3  Me   F2
3S1 s 5  Me   J1
4S1 s 6  Me   J2
5S1 s 6 She   F1
6S1 s 6 She   F2
7S1 s10 She   J1
8S1 s 9 She   J2

On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon  wrote:
> On 10/07/2011 07:28 AM, Gang Chen wrote:
>>
>> I have some data 'myData' in wide form (attached at the end), and
>> would like to convert it to long form. I wish to have five variables
>> in the result:
>>
>> 1) Subj: factor
>> 2) Group: between-subjects factor (2 levels: s / w)
>> 3) Reference: within-subject factor (2 levels: Me / She)
>> 4) F: within-subject factor (2 levels: F1 / F2)
>> 5) J: within-subject factor (2 levels: J1 / J2)
>
> Hi Gang,
> I don't know whether this is the format you want, but:
>
> library(prettyR)
> rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE))
>
> Jim
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wide to long form conversion

2011-10-07 Thread Gang Chen
David, thanks a lot for the code! I've learned quite a bit from all
the generous help...

Gang

On Fri, Oct 7, 2011 at 1:37 PM, David Winsemius  wrote:
>
> On Oct 7, 2011, at 1:30 PM, David Winsemius wrote:
>
>>
>> On Oct 7, 2011, at 7:40 AM, Gang Chen wrote:
>>
>>> Jim, I really appreciate your help!
>>>
>>> I like the power of rep_n_stack, but how can I use rep_n_stack to get
>>> the following result?
>>>
>>> Subj Group value Ref Var Time
>>> 1    S1     s     4  Me   F    1
>>> 2    S1     s     3  Me   F    2
>>> 3    S1     s     5  Me   J    1
>>> 4    S1     s     6  Me   J    2
>>> 5    S1     s     6 She   F    1
>>> 6    S1     s     6 She   F    2
>>> 7    S1     s    10 She   J    1
>>> 8    S1     s     9 She   J    2
>>
>> I was not able to construct a one step solution with `reshape` that will
>> contains all the columns. You can do it in about 4 steps by first making the
>> data "long" and then adding annotation columns. Using just rows 1 and 26 you
>> might get:
>>
>> reshape(myData[c(1,26), ], idvar=c("Group","Subj"),
>>      direction="long",
>>      varying=2:9,
>>      v.names=c("value") )
>>       Group Subj time value
>> s.S1.1      s   S1    1      4
>> w.S26.1     w  S26    1      5
>> s.S1.2      s   S1    2      5
>> w.S26.2     w  S26    2      9
>> s.S1.3      s   S1    3      6
>> w.S26.3     w  S26    3      4
>> s.S1.4      s   S1    4     10
>> w.S26.4     w  S26    4      7
>> s.S1.5      s   S1    5      3
>> w.S26.5     w  S26    5      3
>> s.S1.6      s   S1    6      6
>> w.S26.6     w  S26    6      7
>> s.S1.7      s   S1    7      6
>> w.S26.7     w  S26    7      3
>> s.S1.8      s   S1    8      9
>> w.S26.8     w  S26    8      5
>>
>> The 'time' variable is not really what you wanted but refers to the
>> sequence along the original wide column names
>> You can add the desired  Ref, Var and Time columms with these
>> constructions:
>>
>> > str(times<-rep(c(1,2), length=nrow(myData)*8 )  )
>> num [1:408] 1 2 1 2 1 2 1 2 1 2 ...
>> > str(times<-rep(c("F","J"), each=2, length=nrow(myData)*8 )  )
>> chr [1:408] "F" "F" "J" "J" "F" "F" "J" "J" "F" "F" ...
>> > str(times<-rep(c("Me","She"), each=4, length=nrow(myData)*8 )  )
>> chr [1:408] "Me" "Me" "Me" "Me" "She" "She" "She" "She" ...
>>
> It occured to me that the ordering operation probably should have preceded
> teh ancillary column creation so this method is tested:
>
>> longData <- reshape(myData, idvar=c("Group","Subj"),
>>       direction="long",    #fixed the direction argument
>>      varying=2:9,
>>      v.names=c("value") )
>> longData <- longData[order(longData$Subj), ]
>> longData$Time <- rep(c(1,2), length=nrow(myData)*8 )
>> longData$Var <- rep(c("F","J"), each=2, length=nrow(myData)*8 )
>> longData$Ref <- rep(c("Me","She"), each=4, length=nrow(myData)*8 )
>>
>       Group Subj time value Time Var Ref
> s.S1.1     s   S1    1     4    1   F  Me
> s.S1.2     s   S1    2     5    2   F  Me
> s.S1.3     s   S1    3     6    1   J  Me
> s.S1.4     s   S1    4    10    2   J  Me
> s.S1.5     s   S1    5     3    1   F She
> s.S1.6     s   S1    6     6    2   F She
> s.S1.7     s   S1    7     6    1   J She
> s.S1.8     s   S1    8     9    2   J She
>
>
>>
>> Looking at Jim Lemon's response, I think he just misinterpreted the
>> structure of your data but gave you a perfectly usable response. You could
>> have done much the same thing with a minor modification:
>>
>> > str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE)))
>> 'data.frame':   408 obs. of  4 variables:
>> $ Group : Factor w/ 2 levels "s","w": 1 1 1 1 1 1 1 1 1 1 ...
>> $ Subj  : Factor w/ 51 levels "S1","S10","S11",..: 1 12 23 34 45 48 49 50
>> 51 2 ...
>> $ group1: Factor w/ 8 levels "Me.F.1","Me.F.2",..: 1 1 1 1 1 1 1 1 1 1 ...
>> $ value1: int  4 6 7 8 10 5 13 8 6 14 ...
>>
>> Now you can just split apart the 'group1' column with sub() to make the
>> three specified

[R] Setting up infile for R CMD BATCH

2012-02-07 Thread Gang Chen
Suppose I create an R program called myTest.R with only one line like
the following:

type <- as.integer(readline("input type (1: type1; 2: type2)? "))

Then I'd like to run myTest.R in batch mode by constructing an input
file called answers.R with the following:

source("myTest.R")
1

When I ran the following at the terminal:

R CMD BATCH answer.R output.Rout

it failed to pick up the answer '1' from the 2nd line in answers.R as
shown inside output.Rout:

> source("myTest.R")
input type (0: quit; 1: type1; 2: type2)?
> 1
[1] 1

What am I missing here?

Thanks in advance,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting up infile for R CMD BATCH

2012-02-07 Thread Gang Chen
Thanks for the help.

> You're not missing anything.
> In your output.Rout: the ">1" right after the source('test') is the
> "1" inputed from answers.R. the "[1] 1" is the result of test. Remove
> the second line from answers.R and see what happens (hint: script ends
> after the readline prompt).


That number '1' at the 2nd line was meant to be the answer for the
readline() part, but apparently it does not worked as I intended. If
the script ends right after the readline prompt as you suggested, my
question is, how can I feed in this answer '1' in the infile
'answers.R'?


> Just out of curiosity, why will you use a script that requires user
> input (readlines) in batch mode ?


I know this sounds contradictory, but my intention is that, in case
the user has all the answers available from a previous run in
interactive mode, s/he may try out the batch mode the next time.


> On Tue, Feb 7, 2012 at 4:05 PM, Gang Chen  wrote:
>> Suppose I create an R program called myTest.R with only one line like
>> the following:
>>
>> type <- as.integer(readline("input type (1: type1; 2: type2)? "))
>>
>> Then I'd like to run myTest.R in batch mode by constructing an input
>> file called answers.R with the following:
>>
>> source("myTest.R")
>> 1
>>
>> When I ran the following at the terminal:
>>
>> R CMD BATCH answer.R output.Rout
>>
>> it failed to pick up the answer '1' from the 2nd line in answers.R as
>> shown inside output.Rout:
>>
>>> source("myTest.R")
>> input type (0: quit; 1: type1; 2: type2)?
>>> 1
>> [1] 1
>>
>> What am I missing here?
>>
>> Thanks in advance,
>> Gang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting up infile for R CMD BATCH

2012-02-08 Thread Gang Chen
Sorry Elai for the confusions.

Let me try to reframe my predicament. The main program "myTest.R" has
been written in interactive mode with many readline() lines embedded.
Suppose a user has already run the program once before in interactive
mode with all the answers saved in a text file called answer.R. Now
s/he does not want to go through the interactive again because it's a
tedious process, and would like to run the program but in batch mode
with answer.R. And that's why I tried the following which didn't pan
out:

R CMD BATCH answer.R output.Rout

Of couse I could rewrite a separate program specifically for batch
mode as you suggested previously with eval(), for example. However, is
there an approach to keeping the original program so that the user
could run both interactive and batch mode? That probably requires
modifying the readline() part, but how?

Thanks,
Gang


On Wed, Feb 8, 2012 at 2:08 PM, ilai  wrote:
> Gang,
> Maybe someone here has a different take on things. I'm afraid I have
> no more insights on this unless you explain exactly what you are
> trying to achieve, or more importantly why? That may help understand
> what the problem really is.
>
> Do you want to save an interactive session for future runs? then
> ?save.image and all your "answers" are in the environment. In this
> case consider putting an "if(!exists('type') | length(type)<1 |
> is.na(type))" before "type<- readline(...)"  in your script so type
> wouldn't be overwritten in subsequent runs.
>
> If your goal is to batch evaluate multiple answer files from users
> (why else would you ask questions with readline?), then you should
> have enough to go on with my answer and the examples in ?eval.
>
> Elai
>
>
> On Wed, Feb 8, 2012 at 9:04 AM, Gang Chen  wrote:
>> Hi Elai,
>>
>> Thanks a lot for the suggestions! I really appreciate it...
>>
>> Your suggestion of using eval() and creating those answers in a list
>> would work, but there is no alternative to readline() with which I
>> could read the input in batch mode? I'm asking this because I'd like
>> to have the program work in both interactive and batch mode.
>>
>> Thanks again,
>> Gang
>>
>>
>> On Wed, Feb 8, 2012 at 12:50 AM, ilai  wrote:
>>> Ahh,
>>> I think I'm getting it now. Well, readlines() is not going to work for
>>> you. The help file ?readline clearly states "In non-interactive use
>>> the result is as if the response was RETURN and the value is ‘""’."
>>> The implication is you cannot use it to "insert" different answers as
>>> if you were really there.
>>> How about using eval() instead? You will need to make the answers a
>>> named list (or just assigned objects).
>>>
>>> test <- expression({
>>>  if(a>2) print('+')
>>>  else print('I got more')
>>>  b <- b+3   # reassign b in the environment
>>>  print(b)
>>>  print(c)
>>>  d^2
>>> })
>>> dump('test',file='myTest.R') ; rm(test)
>>>
>>> # make the answers.R file:
>>>
>>> a=5 ; b=2 ; c=2 ; d=3
>>> source("myTest.R")
>>> eval(test)
>>>
>>> # Now, from the terminal  R CMD BATCH answers.R out.R
>>> # And here is my $ cat out.R
>>> ... flushed
>>>> a=5 ; b=2 ; c=2 ; d=3
>>>> source("myTest.R")
>>>> eval(test)
>>> [1] "+"
>>> [1] 5
>>> [1] 2
>>> [1] 9
>>>>
>>>> proc.time()
>>>   user  system elapsed
>>>  0.640   0.048   0.720
>>>
>>> Would this work?
>>> Elai
>>>
>>>
>>>
>>>
>>> On Tue, Feb 7, 2012 at 4:05 PM, Gang Chen  wrote:
>>>> Suppose I create an R program called myTest.R with only one line like
>>>> the following:
>>>>
>>>> type <- as.integer(readline("input type (1: type1; 2: type2)? "))
>>>>
>>>> Then I'd like to run myTest.R in batch mode by constructing an input
>>>> file called answers.R with the following:
>>>>
>>>> source("myTest.R")
>>>> 1
>>>>
>>>> When I ran the following at the terminal:
>>>>
>>>> R CMD BATCH answer.R output.Rout
>>>>
>>>> it failed to pick up the answer '1' from the 2nd line in answers.R as
>>>> shown inside output.Rout:
>>>>
>>>>> source("myTest.R")
>>>> input type (0: quit; 1: type1; 2: type2)?
>>>>> 1
>>>> [1] 1
>>>>
>>>> What am I missing here?
>>>>
>>>> Thanks in advance,
>>>> Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting up infile for R CMD BATCH

2012-02-09 Thread Gang Chen
Hi Elai, yes, the approach works out pretty well. Thanks a lot for
spending time on this and for the great help!

Gang

On Wed, Feb 8, 2012 at 5:43 PM, ilai  wrote:
> I'm going to declare this SOLVED. Yes, if you don't want a separate
> script for batch, you will need to modify the original script so it
> either readline or skips it. Here is an example:
>
> # Save in file myTest.R
> 
> # Add this local function to the beginning of your original "program"
> and replace all the readline prompts with myreadline.
> # The new function takes on two arguments:
> # "what": the original object (in your example it was "type"<-...)
> # "prompt": The same string from readline
> # All it is doing is searching for "Answers.R", sourcing if available
> or prompting if not.
>
> myreadline <- function(what,prompt){
>  ans <- length(grep('Answers.R',list.files(),fixed=T))>0  # add a
> warning for multiple files
>  if(ans) source('Answers.R')
>  else{
>    ret <- as.integer(readline(prompt))
>    assign(what,ret,1)
>  }
> }
>
> # here is an interactive program to square only negative values
>
> print(x <- rnorm(1))
> myreadline('type','x>0 ? 1:T,2:F \n')
> print(x^type)
>
> ### END myTest.R 
>
> Running myTest interactivly:
>
>> source('myTest.R')
> [1] -0.3712215
> x>0 ? 1:T,2:F
> 2
> [1] 0.1378054           # -.37^2
>> source('myTest.R')
> [1] 0.3160747
> x>0 ? 1:T,2:F
> 1
> [1] 0.3160747          # .316^1
>
> # Create a list of answers
>> dump('type',file='Answers.R')
>
> # run again this time with answers available
>> source('myTest.R')
> [1] -1.088665   # skips prompt
> [1] -1.088665   # -1.088^1 (type in Answer.R ==1)
>
> # Now you can also run as batch
> $ R CMD BATCH myTest.R out.R
> $ cat out.R
> # ...
>> print(x <- rnorm(1))
> [1] -1.248487
>> myreadline('type','x>0 ? 1:T,2:F \n')
>> print(x^type)
> [1] -1.248487
>
> That's it. Only thing is to keep the file names in the working
> directory straight.
>
> Enjoy
>
>
> On Wed, Feb 8, 2012 at 12:39 PM, Gang Chen  wrote:
>> Sorry Elai for the confusions.
>>
>> Let me try to reframe my predicament. The main program "myTest.R" has
>> been written in interactive mode with many readline() lines embedded.
>> Suppose a user has already run the program once before in interactive
>> mode with all the answers saved in a text file called answer.R. Now
>> s/he does not want to go through the interactive again because it's a
>> tedious process, and would like to run the program but in batch mode
>> with answer.R. And that's why I tried the following which didn't pan
>> out:
>>
>> R CMD BATCH answer.R output.Rout
>>
>> Of couse I could rewrite a separate program specifically for batch
>> mode as you suggested previously with eval(), for example. However, is
>> there an approach to keeping the original program so that the user
>> could run both interactive and batch mode? That probably requires
>> modifying the readline() part, but how?
>>
>> Thanks,
>> Gang
>>
>>
>> On Wed, Feb 8, 2012 at 2:08 PM, ilai  wrote:
>>> Gang,
>>> Maybe someone here has a different take on things. I'm afraid I have
>>> no more insights on this unless you explain exactly what you are
>>> trying to achieve, or more importantly why? That may help understand
>>> what the problem really is.
>>>
>>> Do you want to save an interactive session for future runs? then
>>> ?save.image and all your "answers" are in the environment. In this
>>> case consider putting an "if(!exists('type') | length(type)<1 |
>>> is.na(type))" before "type<- readline(...)"  in your script so type
>>> wouldn't be overwritten in subsequent runs.
>>>
>>> If your goal is to batch evaluate multiple answer files from users
>>> (why else would you ask questions with readline?), then you should
>>> have enough to go on with my answer and the examples in ?eval.
>>>
>>> Elai
>>>
>>>
>>> On Wed, Feb 8, 2012 at 9:04 AM, Gang Chen  wrote:
>>>> Hi Elai,
>>>>
>>>> Thanks a lot for the suggestions! I really appreciate it...
>>>>
>>>> Your suggestion of using eval() and creating those answers in a list
>>>> would work, but ther

[R] Is there any package can be used to solve individual haplotyping problem?

2008-04-02 Thread Gang Chen
Hi all,

I am working for reconstructing haplotype from individual's SNPs. But I have
met a problem that most of paper and software I found are focus on
reconstructing haplotype from population's SNPs, rather than individual's
SNPs. I wonder is there any R package can be used to solve such problem?

Thanks in advance for any information,

Gang Chen

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract values from a named array

2008-04-07 Thread Gang Chen
Sorry for this dumb question. Suppose I have a named array ww defined as

ww <- 1:5
names(ww) <- c("a", "b", "c", "d", "e")

How can I extract the whole array of numbers without the names?
ww[1:5] does not work while ww[[1]] can only extract one number at a
time.

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract values from a named array

2008-04-07 Thread Gang Chen
Thanks a lot for the suggestions!

Gang

On 4/7/08, Uwe Ligges <[EMAIL PROTECTED]> wrote:
>
>
>  Gang Chen wrote:
>
> > Sorry for this dumb question. Suppose I have a named array ww defined as
> >
> > ww <- 1:5
> > names(ww) <- c("a", "b", "c", "d", "e")
> >
> > How can I extract the whole array of numbers without the names?
> > ww[1:5] does not work while ww[[1]] can only extract one number at a
> > time.
> >
>
>
>  Use as.vector to strip these attributes:
>
>  as.vector(ww)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Formula with no intercept

2008-04-14 Thread Gang Chen
I'm trying to analyze a model with two variables, one is Group with
two levels (male and female), and other is Time with four levels (T1,
T2, T3 and T4). And for the convenience of post-hoc testing I wanted
to consider a model with no intercept for factor Time, so I tried
formula

Group*(Time-1)

However this seems to give me the following terms in the model

GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4, GroupMale:TimeT2,
GroupMale:TimeT3, GroupMale:TimeT4, GroupFemale:TimeT2,
GroupFemale:TimeT3, GroupFemale:TimeT4

which is not exactly what I wanted. Also it seems (Group-1)*Time and
(Group-1)*(Time-1) also give me exactly the same set of terms as
Group*(Time-1).

So I have some conceptual trouble understanding this. And how could I
create a model with terms including all the levels of factor Time?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Post hoc tests with lme

2008-04-16 Thread Gang Chen
Using the "ergoStool" data cited in Mixed-Effects Models in S and
S-PLUS by Pinheiro and Bates as an example, we have


> library(nlme)
> fm <- lme(effort~Type-1, data=ergoStool, random=~1|Subject)
> summary(fm)

Linear mixed-effects model fit by REML
  Data: ergoStool
   AIC  BIC   logLik
  133.1308 141.9252 -60.5654

Random effects:
 Formula: ~1 | Subject
(Intercept) Residual
StdDev:1.332465 1.100295

Fixed effects: effort ~ Type - 1
   Value Std.Error DF  t-value p-value
TypeT1  8.56 0.5760123 24 14.85308   0
TypeT2 12.44 0.5760123 24 21.60448   0
TypeT3 10.78 0.5760123 24 18.71102   0
TypeT4  9.22 0.5760123 24 16.01046   0
 Correlation:
   TypeT1 TypeT2 TypeT3
TypeT2 0.595
TypeT3 0.595  0.595
TypeT4 0.595  0.595  0.595

Standardized Within-Group Residuals:
Min  Q1 Med  Q3 Max
-1.80200345 -0.64316591  0.05783115  0.70099706  1.63142054

Number of Observations: 36
Number of Groups: 9



Now suppose I want to test the following hypothesis

H0: TypeT1 =0 and TypeT2 = 0

I've tried estimable() and glh.test() in package gmodels, esticon() in
package boBy, and linear.hypothesis() in package car, but it seems
none of them would work with objects from lme:


> library(gmodels)
> estimable(fm, rbind(c("TypeT1"=1), c("TypeT2"=1)))
Error in FUN(newX[, i], ...) :
  `param' has no names and does not match number of coefficients of
model. Unable to construct coefficient vector
> glh.test(fm, rbind(c("TypeT1"=1), c("TypeT2"=1)))
Error in glh.test(fm, rbind(c(TypeT1 = 1), c(TypeT2 = 1))) :
  Only defined for lm,glm objects

> library(doBy)
> esticon(fm, rbind(c("TypeT1"=1), c("TypeT2"=1)))
Error in t(abs(t(tmp) * obj$fixDF$X)) :
  dims [product 2] do not match the length of object [4]
In addition: Warning message:
In esticon.lme(fm, rbind(c(TypeT1 = 1), c(TypeT2 = 1))) :
  The esticon function has not been thoroughly teste on 'lme' objects

> library(car)
> linear.hypothesis(fm, rbind(c("TypeT1"=1), c("TypeT2"=1)))
Error in L %*% b : requires numeric matrix/vector arguments


So is there any other package with which I can run this kind of tests?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] Post hoc tests with lme

2008-04-17 Thread Gang Chen
Thanks for the suggestion, Somon! I did try glht from multcomp
package, but the problem is that for the hypothesis

H0: TypeT1 =0 and TypeT2 = 0

it gives results for two separate hypotheses H01: TypeT1 =0 and H02:
TypeT2 = 0, not exactly one statistic for the original hypothesis H0.
So my question is, how can I get only one statistic for H0? Any more
suggestions?

Thanks,
Gang


> library(nlme)
> fm <- lme(effort~Type-1, data=ergoStool, random=~1|Subject)
> library(multcomp)
> summary(glht(fm, linfct=c("TypeT1=0", "TypeT2=0")))

 Simultaneous Tests for General Linear Hypotheses

Fit: lme.formula(fixed = effort ~ Type - 1, data = ergoStool, random = ~1 |
Subject)

Linear Hypotheses:
Estimate Std. Error z value p value
TypeT1 == 08.556  0.576   14.85  <1e-10 ***
TypeT2 == 0   12.444  0.576   21.60  <1e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)


On 4/16/08, Simon Blomberg <[EMAIL PROTECTED]> wrote:
> Try glht in package multcomp.
>
>  Simon.
>
>
>  On Wed, 2008-04-16 at 12:00 -0400, Gang Chen wrote:
Using the "ergoStool" data cited in Mixed-Effects Models in S and
S-PLUS by Pinheiro and Bates as an example, we have


> library(nlme)
> fm <- lme(effort~Type-1, data=ergoStool, random=~1|Subject)
> summary(fm)

Now suppose I want to test the following hypothesis

H0: TypeT1 =0 and TypeT2 = 0

I've tried estimable() and glh.test() in package gmodels, esticon() in
package boBy, and linear.hypothesis() in package car, but it seems
none of them would work with objects from lme.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Formula with no intercept

2008-04-17 Thread Gang Chen
Thanks both Harold Doran and Prof. Ripley for the suggestion.
Time*Group - 1 or Time*(Group-1) does seem better. However as Prof.
Ripley pointed out, it is a little complicated with the interactions.
For example,

==

> set.seed(1)
> group <- as.factor (sample (c("M","F"), 12, T))
> y <- rnorm(12)
> time <- as.factor (rep (1:4, 3))
> summary(fit <- lm ( y ~ time * group - 1))

Call:
lm(formula = y ~ time * group - 1)

Residuals:
 1  2  3  4  5  6  7
-5.122e-01  3.916e-01  5.985e-01  9.547e-01  5.122e-01  1.665e-16 -5.985e-01
 8  9 10 11 12
-9.547e-01  0.000e+00 -3.916e-01 -5.551e-17  2.220e-16

Coefficients:
 Estimate Std. Error t value Pr(>|t|)
time1 1.124930.91795   1.2250.288
time2 0.389840.91795   0.4250.693
time3-0.022730.64909  -0.0350.974
time4-1.260040.64909  -1.9410.124
groupM   -0.125331.12426  -0.1110.917
time2:groupM  0.082181.58994   0.0520.961
time3:groupM  0.131871.58994   0.0830.938
time4:groupM  2.329211.58994   1.4650.217

Residual standard error: 0.918 on 4 degrees of freedom
Multiple R-squared: 0.6962, Adjusted R-squared: 0.08858
F-statistic: 1.146 on 8 and 4 DF,  p-value: 0.4796
=

There are totally 8 fixed effects listed above. I believe I can
interpret time1, time2, time3 and time4 as the fixed effects of those
4 levels of factor Time in groupF. But I'm not so sure about the other
4 fixed effects: are time2:groupM, time3:groupM, and time4:groupM the
fixed effect differences of those 3 levels of factor Time between
groupM and groupF? If so, what is groupM (the 5th)? Or are
time2:groupM, time3:groupM, and time4:groupM the difference (between
groupM and groupF) of the fixed effects of those 3 levels of time
factor versus time1 while groupM (the 5th) the fixed effect of time1
or groupM versus GroupF?

> packages such as multcomp can post-hoc test any (coherent) set of hypotheses 
> you
> choose, irrespective of the model parametrization.

This does not seem true unless I'm missing something. See the following example:

===
> set.seed(1)
> group <- as.factor (sample (c("M","F"), 12, T))
> y <- rnorm(12)
> time <- as.factor (rep (1:4, 3))
> fit <- lm(y ~ time * group)
> library(multcomp)
> summary(glht(fit, linfct=c("time1=0", "time2=0")))
Error in chrlinfct2matrix(linfct, names(beta)) :
  variable(s) 'time1' not found
> summary(glht(fit, linfct=c("time2=0", "time3=0")))

 Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = y ~ time * group)

Linear Hypotheses:
   Estimate Std. Error t value p value
time2 == 0  -0.7351 1.2982  -0.566   0.797
time3 == 0  -1.1477 1.1243  -1.021   0.533
(Adjusted p values reported -- single-step method)
==

The problem is that glht doesn't allow any hypothesis involving time1
if intercept is included in the model specification. Any more
thoughts?

Thanks,
Gang


On 4/16/08, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Wed, 16 Apr 2008, Doran, Harold wrote:
>
>
> > R may not be giving you what you want, but it is doing the right thing.
> > You can change what the base category is through contrasts but you can't
> > get the marginal effects for every level of all factors because this
> > creates a linear dependence in the model matrix.
> >
>
>  I suspect that Time*Group - 1 or Time*(Group-1) come closer to the aim. It
> is the first factor in the model which is coded without contrasts in a
> no-intercept model.
>
>  Once you include interactions I think the 'convenience' is largely lost,
> and packages such as multcomp can post-hoc test any (coherent) set of
> hypotheses you choose, irrespective of the model parametrization.
>
>
>
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Gang Chen
> > > Sent: Monday, April 14, 2008 5:38 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: [R] Formula with no intercept
> > >
> > > I'm trying to analyze a model with two variables, one is
> > > Group with two levels (male and female), and other is Time
> > > with four levels (T1, T2, T3 and T4). And for the convenience
> > > of post-hoc testing I wanted to consider a model with no
> > > intercept for factor Time, so I tried formula
> > >
> > > Group*(Time-1)
> > >
> > > However this seems to give me the following terms in the model
> > >
> > > GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4,
> &g

[R] How to deal with character(0)?

2007-11-15 Thread Gang Chen
I want to identify whether a variable is character(0), but get lost.  
For example, if I have

 > dd<-character(0)

the following doesn't seem to serve as a good identifier:

 > dd==character(0)
logical(0)

So how to detect character(0)?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to deal with character(0)?

2007-11-15 Thread Gang Chen
Thanks a lot for all who've provided suggestions!

Gang

On Nov 15, 2007, at 5:09 PM, Duncan Murdoch wrote:

> On 11/15/2007 4:54 PM, Gang Chen wrote:
>> I want to identify whether a variable is character(0), but get  
>> lost.  For example, if I have
>>  > dd<-character(0)
>> the following doesn't seem to serve as a good identifier:
>>  > dd==character(0)
>> logical(0)
>> So how to detect character(0)?
>
> (length(dd) == 0) && (typeof(dd) == "character")
>
> or if you really want to be specific (and rule out things that just  
> act like character(0) in most respects)
>
> identical(dd, character(0))
>
> Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glm for nominal X ordinal

2008-01-04 Thread Gang Chen
Suppose I have a two-way table of nominal category (party  
affiliation) X ordinal category (political ideology):

party affiliation X (3 levels) - democratic, independent, and republic
political ideology Y (3 levels) - liberal, moderate, and conservative

The dependent variable is the frequency (or count) for all the two- 
way cells sampled from the voters. I want to test whether there is  
any party affiliation effect, and, if there is, the pair-wise  
contrasts. I have never used glm (I assume this the program I should  
use) before, so I am not so sure how I can code the two independent  
variables reflecting the fact that one is nominal while the other is  
ordinal and how to formulate the model.

Any help is highly appreciated,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Create a new dataframe from an existing dataframe

2008-01-07 Thread Gang Chen
I have a dataframe DF with 4 columns (variables) A, B, C, and D, and  
want to create a new dataframe DF2 by keeping B and C in DF but  
counting the frequency of D while collapsing A. I tried

by(DF$D, list(DF$B, DF$C), FUN=summary)

but this is not exactly what I want. What is a good way to do it?

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm for nominal X ordinal

2008-01-07 Thread Gang Chen
Thanks a lot! This is exactly what I wanted.

Gang


On Jan 5, 2008, at 2:20 PM, David Winsemius wrote:

> Gang Chen <[EMAIL PROTECTED]> wrote in
> news:[EMAIL PROTECTED]:
>
>> Suppose I have a two-way table of nominal category (party
>> affiliation) X ordinal category (political ideology):
>>
>> party affiliation X (3 levels) - democratic, independent, and
>> republic
>> political ideology Y (3 levels) - liberal, moderate, and
>> conservative
>>
>> The dependent variable is the frequency (or count) for all the two-
>> way cells sampled from the voters. I want to test whether there is
>> any party affiliation effect, and, if there is, the pair-wise
>> contrasts. I have never used glm (I assume this the program I should
>>  use) before, so I am not so sure how I can code the two independent
>>  variables reflecting the fact that one is nominal while the other
>> is  ordinal and how to formulate the model.
>
> ?factor
>
> Set up affiliation as a factor and ideology as an ordered factor. The
> levels argument in factor sets the sort order.
>
>> theo<-c("cons", "mod", "cons", "cons", "lib", "mod")
>> table(theo)
> theo
> cons  lib  mod
>312
>
>> theo<-ordered(theo, levels=c("lib", "mod", "cons"))
>> table(theo)
> theo
>  lib  mod cons
>123
>
>  The formula  in glm would be something similar to counts ~ theo +  
> affil
>
> Much more detail and worked examples regarding modeling count data  
> would
> be found in:
> Thompson, LA;  (2004) R (and S-PLUS)Manual to Accompany Agresti’s  
> (2002)
> Catagorical Data Analysis;
> https://home.comcast.net/~lthompson221/Splusdiscrete2.pdf
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create a new dataframe from an existing dataframe

2008-01-07 Thread Gang Chen
Yes, I misstated it when I said that I would keep B and C. I want to  
collapse column A, but count the frequency of D as a new column in  
the new dataframe DF2 while collapsing A. The rows of columns B, C,  
and D of course would be reduced because of A collapsing.

For example, if dataframe DF is

AB   C   D
A1  B1 C1 D1
A1  B2 C1 D1
A1  B1 C1 D1
A1  B2 C2 D2
A2  B1 C1 D1
A1  B2 C2 D2
A1  B1 C2 D2
A2  B2 C1 D1
A1  B1 C2 D2
A2  B2  C1 D1
..

I would like to have a new dataframe DF2

B   C   D FreqA
B1 C1 D1 6
B1 C1 D2 31
B1 C2 D1 8
B1 C2 D214
B2 C1 D112
B2 C1 D243
B2 C2 D123
B2 C2 D243

Thanks,
Gang



On Jan 7, 2008, at 2:06 PM, Duncan Murdoch wrote:

> On 1/7/2008 1:28 PM, Gang Chen wrote:
>> I have a dataframe DF with 4 columns (variables) A, B, C, and D,  
>> and  want to create a new dataframe DF2 by keeping B and C in DF  
>> but  counting the frequency of D while collapsing A. I tried
>> by(DF$D, list(DF$B, DF$C), FUN=summary)
>> but this is not exactly what I want. What is a good way to do it?
>
> I think you can't do that.  If you want to keep B and C, then you  
> can't reduce the number of rows, but "collapsing A" sounds like you  
> want fewer rows.
>
> Perhaps if you posted a simple before and after example?
>
> Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create a new dataframe from an existing dataframe

2008-01-07 Thread Gang Chen
Sorry the new column in DF2 should be called FreqD instead of FreqA.  
How can I get DF2 with aggregate?

Gang

On Jan 7, 2008, at 2:38 PM, Gang Chen wrote:

> Yes, I misstated it when I said that I would keep B and C. I want to
> collapse column A, but count the frequency of D as a new column in
> the new dataframe DF2 while collapsing A. The rows of columns B, C,
> and D of course would be reduced because of A collapsing.
>
> For example, if dataframe DF is
>
> AB   C   D
> A1  B1 C1 D1
> A1  B2 C1 D1
> A1  B1 C1 D1
> A1  B2 C2 D2
> A2  B1 C1 D1
> A1  B2 C2 D2
> A1  B1 C2 D2
> A2  B2 C1 D1
> A1  B1 C2 D2
> A2  B2  C1 D1
> ..
>
> I would like to have a new dataframe DF2
>
> B   C   D FreqA
> B1 C1 D1 6
> B1 C1 D2 31
> B1 C2 D1 8
> B1 C2 D214
> B2 C1 D112
> B2 C1 D243
> B2 C2 D123
> B2 C2 D243
>
> Thanks,
> Gang
>
>
>
> On Jan 7, 2008, at 2:06 PM, Duncan Murdoch wrote:
>
>> On 1/7/2008 1:28 PM, Gang Chen wrote:
>>> I have a dataframe DF with 4 columns (variables) A, B, C, and D,
>>> and  want to create a new dataframe DF2 by keeping B and C in DF
>>> but  counting the frequency of D while collapsing A. I tried
>>> by(DF$D, list(DF$B, DF$C), FUN=summary)
>>> but this is not exactly what I want. What is a good way to do it?
>>
>> I think you can't do that.  If you want to keep B and C, then you
>> can't reduce the number of rows, but "collapsing A" sounds like you
>> want fewer rows.
>>
>> Perhaps if you posted a simple before and after example?
>>
>> Duncan Murdoch
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create a new dataframe from an existing dataframe

2008-01-08 Thread Gang Chen
Yes, this works well. Also 'table' suggested by Bert Gunter also  
works perfectly.

Thank all who've helped me on this.

Gang


On Jan 7, 2008, at 7:39 PM, jim holtman wrote:

> Does this do what you want?
>
>> x <- read.table(textConnection("AB   C   D
> + A1  B1 C1 D1
> + A1  B2 C1 D1
> + A1  B1 C1 D1
> + A1  B2 C2 D2
> + A2  B1 C1 D1
> + A1  B2 C2 D2
> + A1  B1 C2 D2
> + A2  B2 C1 D1
> + A1  B1 C2 D2
> + A2  B2  C1 D1"), header=TRUE)
>> counts <- ave(seq(nrow(x)), x$B, x$C, x$D, FUN=length)  # get counts
>> x.new <- x[, -1]  # delete "A"
>> x.new$FreqD <- counts  # add new column
>> # print out unique entries
>> unique(x.new)
>B  C  D FreqD
> 1 B1 C1 D1 3
> 2 B2 C1 D1 3
> 4 B2 C2 D2 2
> 7 B1 C2 D2 2
>>
>
>
> On Jan 7, 2008 2:38 PM, Gang Chen <[EMAIL PROTECTED]> wrote:
>> Yes, I misstated it when I said that I would keep B and C. I want to
>> collapse column A, but count the frequency of D as a new column in
>> the new dataframe DF2 while collapsing A. The rows of columns B, C,
>> and D of course would be reduced because of A collapsing.
>>
>> For example, if dataframe DF is
>>
>> AB   C   D
>> A1  B1 C1 D1
>> A1  B2 C1 D1
>> A1  B1 C1 D1
>> A1  B2 C2 D2
>> A2  B1 C1 D1
>> A1  B2 C2 D2
>> A1  B1 C2 D2
>> A2  B2 C1 D1
>> A1  B1 C2 D2
>> A2  B2  C1 D1
>> ..
>>
>> I would like to have a new dataframe DF2
>>
>> B   C   D FreqA
>> B1 C1 D1 6
>> B1 C1 D2 31
>> B1 C2 D1 8
>> B1 C2 D214
>> B2 C1 D112
>> B2 C1 D243
>> B2 C2 D123
>> B2 C2 D243
>>
>> Thanks,
>> Gang
>>
>>
>>
>>
>> On Jan 7, 2008, at 2:06 PM, Duncan Murdoch wrote:
>>
>>> On 1/7/2008 1:28 PM, Gang Chen wrote:
>>>> I have a dataframe DF with 4 columns (variables) A, B, C, and D,
>>>> and  want to create a new dataframe DF2 by keeping B and C in DF
>>>> but  counting the frequency of D while collapsing A. I tried
>>>> by(DF$D, list(DF$B, DF$C), FUN=summary)
>>>> but this is not exactly what I want. What is a good way to do it?
>>>
>>> I think you can't do that.  If you want to keep B and C, then you
>>> can't reduce the number of rows, but "collapsing A" sounds like you
>>> want fewer rows.
>>>
>>> Perhaps if you posted a simple before and after example?
>>>
>>> Duncan Murdoch
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting- 
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Residual deviance in glm

2008-01-10 Thread Gang Chen
I'm running a categorical data analysis with a two-way design of  
nominal by ordinal structure like the Political Ideology Example  
(Table 9.5) in Agresti's book Categorical Data Analysis. The nominal  
variable is Method while the ordinal variable is Quality (Bad,  
Moderate, Good, Excellent). I rank/quantify Quality with another  
variable QualityR (1, 2, 3, 4), and run the following:

fm <- glm(Count ~  Quality + Method * QualityR, family=poisson, MyData)

I'm pretty happy with the significance testing of the main effects  
and contrasts. However after examining the following deviances, I'm  
concerned about the Poisson fitting of the data:

=
 Null deviance: 426.36  on 35  degrees of freedom
Residual deviance: 171.71  on 16  degrees of freedom
AIC: 369.78

Number of Fisher Scoring iterations: 6
=

If I interpret the deviances correctly, it seems the Poisson fitting  
only explains (426.36-171.71)/426.36 ~ 60% of the total variability  
in the data. Also with a residual deviance of 171.71  on 16  degrees  
of freedom, the p value is 3.83738e-28. So does it indicate Poisson  
is not a good model for the data? If not, how can I improve the  
fitting? Change the ranking numbers or switch to a different model?  
Sorry this seems more like a statistical question than R-related.

Thanks,
Gang





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glht() and contrast() comparison

2008-01-11 Thread Gang Chen
With the example you provided, it seems both glht() and contrast()  
work fine.

Based on my limited experience with contrast(), if you encounter such  
an error message you just mentioned, check

 > dat.lme$apVar

You might see something like this

  [1] "Non-positive definite approximate variance-covariance"

which may indicate an inappropriate model for the data?

Gang


On Jan 11, 2008, at 6:04 PM, array chip wrote:

> Hi, I have been trying glht() from multcomp package
> and contrast() from contrast package to test a
> contrast that I am interested in.
>
> With the following simulated dataset (fixed effect
> "type" with 3 levels (b, m, t), and random effect
> "batch" of 4 levels, a randomized block design with
> interaction), sometimes both glht() and contrast()
> worked and gave nearly the same p values; other times,
> glht()worked, but contrast() didn't and gave the error
> message:
> "Error in dimnames(X) <- list(dn[[1L]],
> unlist(collabs, use.names = FALSE)) : length of
> 'dimnames' [2] not equal to array extent".
>
> so apparently, the error message from contrast() is
> data dependent. glht() always worked.
>
> Here is the code, I just copy and paste in R
> repeatedly:
>
> batch<-as.factor(rep(1:4,3,each=20))
> type<-as.factor(rep(c('b','m','t'),each=80))
> y<-2*(type=='m')+4*(type=='t')+rnorm(240,100,20)
> dat<-cbind(as.data.frame(y),type=type,batch=batch)
> rm(batch,type,y)
>
> library(nlme)
> dat.lme<-lme(y~type, random=~1|batch/type, data=dat)
>
> library(multcomp)
> summary(glht(dat.lme, linfct = mcp(type
> =c("b+t-2*m=0"))),test=Chisqtest())
>
> library(contrast)
> contrast(dat.lme, a=list(type=c('b','t')),
> b=list(type='m'),type='average')
>
> Does anybody have a clue?
>
> Thanks,
>
> John
>
>
>
> __ 
> __
> Never miss a thing.  Make Yahoo your home page.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trouble receiving messages from the mailing list

2008-01-17 Thread Gang Chen
For some unknown reason I stopped receiving any messages from the R- 
help mailing list.  See if this test gets through.

Thanks,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] trouble with tkgetOpenFile

2008-09-17 Thread Gang Chen
I'm trying to use the following loop to open a window multiple times
to select files, but only the last window shows up. What am I missing?

library(tcltk)
nWin <- 6
fn <- vector('list', nWin)
for (ii in nWin) {
   fn[[ii]] <- tclvalue( tkgetOpenFile( filetypes =
   "{{Files} {.1D}} {{All files} *}",
   title = paste('Choose number', ii, 'time series file')))
}

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trouble with tkgetOpenFile

2008-09-17 Thread Gang Chen
Such a silly mistake! Thanks a lot, John!

Gang

On Wed, Sep 17, 2008 at 7:21 PM, John Fox <[EMAIL PROTECTED]> wrote:
> Dear Gang,
>
> Your for loop is in error; try
>
> for (ii in seq(length=nWin))
>
> I hope this helps,
>  John
>
> --
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
>
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On
>> Behalf Of Gang Chen
>> Sent: September-17-08 4:31 PM
>> To: [EMAIL PROTECTED]
>> Subject: [R] trouble with tkgetOpenFile
>>
>> I'm trying to use the following loop to open a window multiple times
>> to select files, but only the last window shows up. What am I missing?
>>
>> library(tcltk)
>> nWin <- 6
>> fn <- vector('list', nWin)
>> for (ii in nWin) {
>>fn[[ii]] <- tclvalue( tkgetOpenFile( filetypes =
>>"{{Files} {.1D}} {{All files} *}",
>>title = paste('Choose number', ii, 'time series file')))
>> }
>>
>> TIA,
>> Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Running source() on a file in another directory

2008-10-01 Thread Gang Chen
Suppose I have a file prog.R stored in a directory under ~/dirname,
and ~/dirname is set in a shell script file (e.g. .cshrc) as one of
the accessible paths on terminal. On a different directory I could run
prog.R interactively by executing

source("~/dirname/prog.R")

It seems that source() does not search for all the paths set in the
shell script. So, is there a better and more elegant way to do this,
for example without explicitly typing the directory?

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running source() on a file in another directory

2008-10-02 Thread Gang Chen
Thanks a lot! It seems to work fine now with what you suggested. I
also tried another approach by running the following on the terminal

R -f ~dir1/prog.R

and it read in some lines in prog.R, but then the execution halted at
one point for some reason I could not figure out. Any way I could get
"R -f ..." working?

Thanks,
Gang


On Wed, Oct 1, 2008 at 11:49 PM, Gabor Grothendieck
<[EMAIL PROTECTED]> wrote:
> Suppose p is a vector of paths, e.g., p <- c("~/dir1", "~/dir2/dir3")
> Then the following will return the full pathname of the first found
> location:
>
>   Find(file.exists, file.path(p, "prog.R"))
>
> so you can source that.
>
> On Wed, Oct 1, 2008 at 6:09 PM, Gang Chen <[EMAIL PROTECTED]> wrote:
>> Suppose I have a file prog.R stored in a directory under ~/dirname,
>> and ~/dirname is set in a shell script file (e.g. .cshrc) as one of
>> the accessible paths on terminal. On a different directory I could run
>> prog.R interactively by executing
>>
>> source("~/dirname/prog.R")
>>
>> It seems that source() does not search for all the paths set in the
>> shell script. So, is there a better and more elegant way to do this,
>> for example without explicitly typing the directory?
>>
>> TIA,
>> Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] t.test() on a list

2008-10-02 Thread Gang Chen
I have a list, myList, with each of its 9 components being a 15X15
matrix. I want to run a t-test across the list for each component in
the matrix. For example, the first t-test is on myList[[1]][1, 1],
myList[[2]][1, 1], ..., myList[[9]][1, 1]; and there are totally 15X15
t-tests. How can I run these t-tests in a simple way?

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Load a program at the front end

2008-10-02 Thread Gang Chen
I want to run a R program, prog.R,  interactively. My question is, is
there a way I can start prog.R on the shell terminal when invoking R,
instead of using source() inside R?

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] t.test() on a list

2008-10-02 Thread Gang Chen
I appreciate your suggestion. The example I provided was fabricated
because I was only focusing on the programming perspective on how to
deal with such a data structure, not real statistical issues. Do you
mind elaborating a little more why that would not be appreciate? I
know I can do it with a couple of loops,  but I still appreciate
suggestions on how to write a line or two to run t-tests on such a
data structure.

Thanks,
Gang


On Thu, Oct 2, 2008 at 5:11 PM, Bert Gunter <[EMAIL PROTECTED]> wrote:
> I am sure you will get helpful answers. I am almost as sure that you
> shouldn't be doing this. I suggest you consult with your local statistician.
>
> -- Bert Gunter
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Gang Chen
> Sent: Thursday, October 02, 2008 11:24 AM
> To: [EMAIL PROTECTED]
> Subject: [R] t.test() on a list
>
> I have a list, myList, with each of its 9 components being a 15X15
> matrix. I want to run a t-test across the list for each component in
> the matrix. For example, the first t-test is on myList[[1]][1, 1],
> myList[[2]][1, 1], ..., myList[[9]][1, 1]; and there are totally 15X15
> t-tests. How can I run these t-tests in a simple way?
>
> TIA,
> Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Load a program at the front end

2008-10-07 Thread Gang Chen
Thanks a lot for the suggestion!

Unfortunately " R --no-save < prog.R" does not work well with my
situation because prog.R
contain lines such as readline() and () that require user response in
the middle of the
execution. I also tried other options such as "R -f prog.R" and "R
--interactive < prog.R", and
they all failed.

Any other suggestions?

Thanks,
Gang


On Mon, Oct 6, 2008 at 10:12 PM, Bernardo Rangel Tura
<[EMAIL PROTECTED]> wrote:
> Em Qui, 2008-10-02 às 14:36 -0400, Gang Chen escreveu:
>> I want to run a R program, prog.R,  interactively. My question is, is
>> there a way I can start prog.R on the shell terminal when invoking R,
>> instead of using source() inside R?
>>
>> TIA,
>> Gang
>
> Hi Gang
>
> I my system just only type:
>
>  R --no-save 
> platform   x86_64-unknown-linux-gnu
> arch   x86_64
> os linux-gnu
> system x86_64, linux-gnu
> status Patched
> major  2
> minor  7.2
> year   2008
> month  09
> day11
> svn rev46532
> language   R
> version.string R version 2.7.2 Patched (2008-09-11 r46532)
>
> --
> Bernardo Rangel Tura, M.D,MPH,Ph.D
> National Institute of Cardiology
> Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Load a program at the front end

2008-10-07 Thread Gang Chen
Thanks for the suggestion, Luke!

Your approach would load prog.R whenever invoking R, but that is not
exactly what I want. Basically I can run prog.R through
source("~/someDir/prog.R") inside R. Also since prog.R involves many
lines of readline() and tclvalue() in tcltk package, the program
requires the user responses in the middle of the execution. If
possible, I would like to create an executable  shell script RunProg
with one line or two inside like

R ... ~/someDir/prog.R

If this is feasible, whenever I want to run prog.R, I can simply type
the following at the terminal prompt:

RunProg

and I would still be able to interact with the user. Is this something doable?

Thanks,
Gang



On Tue, Oct 7, 2008 at 12:43 PM, Luke Tierney <[EMAIL PROTECTED]> wrote:
> Something like
>
>env R_PROFILE=prog.R R
>
> may work for you.  You may need to call .First.sys at the beginning of
> prog.R to get default packages loaded.
>
> luke
>
> On Tue, 7 Oct 2008, Gang Chen wrote:
>
>> Thanks a lot for the suggestion!
>>
>> Unfortunately " R --no-save < prog.R" does not work well with my
>> situation because prog.R
>> contain lines such as readline() and () that require user response in
>> the middle of the
>> execution. I also tried other options such as "R -f prog.R" and "R
>> --interactive < prog.R", and
>> they all failed.
>>
>> Any other suggestions?
>>
>> Thanks,
>> Gang
>>
>>
>> On Mon, Oct 6, 2008 at 10:12 PM, Bernardo Rangel Tura
>> <[EMAIL PROTECTED]> wrote:
>>>
>>> Em Qui, 2008-10-02 às 14:36 -0400, Gang Chen escreveu:
>>>>
>>>> I want to run a R program, prog.R,  interactively. My question is, is
>>>> there a way I can start prog.R on the shell terminal when invoking R,
>>>> instead of using source() inside R?
>>>>
>>>> TIA,
>>>> Gang
>>>
>>> Hi Gang
>>>
>>> I my system just only type:
>>>
>>>  R --no-save >>
>>> platform   x86_64-unknown-linux-gnu
>>> arch   x86_64
>>> os linux-gnu
>>> system x86_64, linux-gnu
>>> status Patched
>>> major  2
>>> minor  7.2
>>> year   2008
>>> month  09
>>> day11
>>> svn rev46532
>>> language   R
>>> version.string R version 2.7.2 Patched (2008-09-11 r46532)
>>>
>>> --
>>> Bernardo Rangel Tura, M.D,MPH,Ph.D
>>> National Institute of Cardiology
>>> Brazil
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
>   Actuarial Science
> 241 Schaeffer Hall  email:  [EMAIL PROTECTED]
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] t.test() on a list

2008-10-07 Thread Gang Chen
Thanks a lot, Petr! This works perfect!

myList was actually a list of data.frame, and the command line
initially choked. But once I converted it to matrix, it worked like a
charm:

apply(do.call(rbind, lapply(lapply(myList, as.matrix), c)), 2, t.test)

Thanks,
Gang


On Mon, Oct 6, 2008 at 9:42 AM, Petr PIKAL <[EMAIL PROTECTED]> wrote:
> Hi
>
> maybe not an answer you like but
>
> apply(do.call(rbind, lapply(myList, c)), 2, t.test)
>
> shall give you desired results
>
> Regards
> Petr
>
>
> [EMAIL PROTECTED] napsal dne 02.10.2008 23:25:23:
>
>> I appreciate your suggestion. The example I provided was fabricated
>> because I was only focusing on the programming perspective on how to
>> deal with such a data structure, not real statistical issues. Do you
>> mind elaborating a little more why that would not be appreciate? I
>> know I can do it with a couple of loops,  but I still appreciate
>> suggestions on how to write a line or two to run t-tests on such a
>> data structure.
>>
>> Thanks,
>> Gang
>>
>>
>> On Thu, Oct 2, 2008 at 5:11 PM, Bert Gunter <[EMAIL PROTECTED]>
> wrote:
>> > I am sure you will get helpful answers. I am almost as sure that you
>> > shouldn't be doing this. I suggest you consult with your local
> statistician.
>> >
>> > -- Bert Gunter
>> >
>> > -Original Message-
>> > From: [EMAIL PROTECTED] [
> mailto:[EMAIL PROTECTED] On
>> > Behalf Of Gang Chen
>> > Sent: Thursday, October 02, 2008 11:24 AM
>> > To: [EMAIL PROTECTED]
>> > Subject: [R] t.test() on a list
>> >
>> > I have a list, myList, with each of its 9 components being a 15X15
>> > matrix. I want to run a t-test across the list for each component in
>> > the matrix. For example, the first t-test is on myList[[1]][1, 1],
>> > myList[[2]][1, 1], ..., myList[[9]][1, 1]; and there are totally 15X15
>> > t-tests. How can I run these t-tests in a simple way?
>> >
>> > TIA,
>> > Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confused with default device setup

2008-10-15 Thread Gang Chen
When invoking dev.new() on my Mac OS X 10.4.11, I get an X11 window
instead of quartz which I feel more desirable.  So I'd like to set
the default device to quartz. However I'm confused because of the
following:

> Sys.getenv("R_DEFAULT_DEVICE")
R_DEFAULT_DEVICE
 "quartz"

> getOption("device")
[1] "X11"

What's going on?

Also is file Renviron under /Library/Frameworks/R.framework/Resources/
etc/ppc/ the one I should modify if I want to change some environment
variables? But I don't see R_DEFAULT_DEVICE there.

TIA,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Parallel computing with snow

2009-01-02 Thread Gang Chen
I've been using parApply() in snow package for parallel computing with
the following lines in R 2.8.1:

  library(snow)
  nNodes <- 4
  cl <- makeCluster(nNodes, type = "SOCK")
  fm <- parApply(cl, myData, c(1,2), func1, ...)

Since I have a Mac OS X (version 10.4.11) with two dual-core
processors, I thought that I could run 4 simultaneous clusters.
However with the 1st job it seems only two clusters (362 and 364
below) were running with roughly the same CPU time (4th column) while
the other two clusters were pretty much idling (I assume the 1st row
with PID 357 was the main process with which I started R):

 PID COMMAND  %CPU   TIME
 357 R 0.0%   0:15.81
 362 R99.8%  11:41.07
 364 R   100.3%  12:26.43
 366 R 0.0%   0:01.67
 368 R 0.0%   0:01.68

Why weren't 4 clusters split roughly equally in CPU time with two barely used?

I also tried a different job with fm <- parApply(cl, myData, c(1,2),
func2, ...), and the result is slightly different with all 4 clusters
more or less involved although they were still not distributed evenly
neither:

 PID COMMAND  %CPU   TIME
 413  R0.0%   0:18.46
 419  R   93.3%   2:53.62
 421  R   93.6%   6:07.85
 423  R   92.8%   5:12.13
 425  R   93.3%   1:39.73

What gives? Why different usage of clusters between the two jobs?

All help is highly appreciated,
Gang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >