Suppose that we have the following dataframe:
(tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace =
TRUE), R2 = sample(LETTERS[1:5], 10, replace = TRUE)))
x R1 R2
1 1 B B
2 2 B A
3 3 C D
4 4 E B
5 5 B D
6 6 E C
7 7 E D
8 8 D E
9 9
swapme = (as.numeric(tmp$R1) - as.numeric(tmp$R2)) %% 2 !=
> 0),
> {
> tmp[swapme, "R1"] <- r2[swapme]
> tmp[swapme, "R2"] <- r1[swapme]
> tmp
> })
With the following data in a data.frame:
subject       QM  emotion           yi
s1       75.1017  neutral   -75.928276
s2      -47.3512  neutral  -178.295990
s3      -68.9016  neutral  -134.753906
s1       17.2099  negative -104.168312
s2      -53.1114  negative -182.373474
s3      -33.0322  negative -137.420410
I can
s3 -50.9669 -136.08716
This is a simple question: With a dataframe like the following
myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))
how can I get the cross product between X and Y for each level of
factor Z? My difficulty is that I don't know how to deal with the fact
that crossprod()
> Hi Gang Chen,
> I
], x[, 2])))
> Z CP
> A A 10
> B B 10
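One way to get a per-level cross product, sketched in base R. It assumes the desired result is sum(X * Y) within each level of Z, which agrees with the 10 and 10 in the output above; the name cp is only illustrative.

```r
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c('A', 'A', 'B', 'B'))

# split the rows by Z, then take crossprod(X, Y) within each piece;
# drop() turns the 1 x 1 matrix returned by crossprod() into a plain number
cp <- sapply(split(myData, myData$Z),
             function(d) drop(crossprod(d$X, d$Y)))
cp
```

The result is a named vector with one entry per level of Z, which can be turned into a data frame with data.frame(Z = names(cp), CP = cp) if needed.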
> 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names
> = c("X",
> "Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")
> Combining two labels just requires the
38 3.2 S2 A
> S2B 22 3.2 S2 B
I want to do the following: if a string does not contain a colon (:),
no change is needed; if it contains one or more colons, break the
string into multiple strings using the colon as a separator. For
example, "happy:" becomes
"happy" ":"
":sad" turns to
":" "sad"
and "happy:sad" changes to
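A possible approach, assuming each colon should be kept as its own element: match either a run of non-colon characters or a single colon. splitColon is a hypothetical helper name, not an existing function.

```r
# return the pieces of s, with each colon kept as a separate element
splitColon <- function(s) regmatches(s, gregexpr("[^:]+|:", s))[[1]]

splitColon("happy:")   # "happy" ":"
splitColon(":sad")     # ":" "sad"
splitColon("happy")    # unchanged: "happy"
```

The helper handles one string at a time; for a character vector it could be applied with lapply().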
I'm having some trouble with Anova() in package "car". When the model
formula is explicitly expressed:
fm <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)
Anova() works fine:
However, if the model formula is scanned from an external source:
fter the
A random effect formulation for R package nlme is read in as a string
of characters from an input file:
ranEff <- "pdCompSymm(~1+Age)"
I need to convert 'ranEff' to a formula class. However, as shown below:
> as.formula(ranEff)
~1 + Age
the "pdCompSymm" is lost in the conversion. Any solutions?
Thanks for the help! However, I just need to get
pdCompSymm(~1 + Age)
without a tilde (~) at the beginning.
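One way to keep the whole expression is to parse the string into an unevaluated call instead of coercing it to a formula. This is only a sketch; ranCall is an illustrative name, and whether a call (rather than an evaluated pdMat object) is what lme() needs in a given script is an assumption to check.

```r
ranEff <- "pdCompSymm(~1+Age)"

# parse() returns an expression vector; its first element is the call itself
ranCall <- parse(text = ranEff)[[1]]
class(ranCall)     # "call"
deparse(ranCall)   # "pdCompSymm(~1 + Age)"

# with nlme loaded, eval(ranCall) should construct the pdCompSymm object
```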
rocess automatic.
Sorry for the misspelling! And more importantly, thanks a lot for the
nice solution and for the quick help!
When R starts in the GUI (e.g., /Applications/) on
my Mac OS X 10.7.5, the startup configuration in .Rprofile works fine.
However, when R starts on the terminal (e.g.,
/Library/Frameworks/R.framework/Resources/bin/R), it does not work at
all. What could be the reason for the failure?
I'm running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
following problem.
It works fine with the following:
var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))
However, when running the following in a loop with simulated data (Sigma):
# Sigma defin
I have a data set collected from 10 measurements (response variables)
on two groups (healthy and patient) of subjects performing 4 different
tasks. In other words there are two fixed factors (group and task),
and 10 response variables. I could analyze the data with aov() or
lme() in package nlme.
I want to store some number of outputs from running a bunch of
analyses such as lm() into an array. I know how to do this with a
one-dimensional array (vector) by creating
myArray <- vector(mode='list', length=10)
and storing each lm() result into a component of myArray.
My question is, how
Does anybody know which functions can be used to calculate
variance/covariance with complex numbers? var and cov don't seem to
> a
V1 0.00810014+0.00169366i
V2 0.00813054+0.00158251i
V3 0.00805489+0.00163295i
V4 0.00809141+0.00159533i
V5 0.00813976+0.00161850i
> var(a)
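A sketch of one common definition, in case it helps: take the variance of a complex sample to be the sum of squared moduli of the centered values divided by n - 1, which is real and equals var(Re) + var(Im). Treating a as a plain complex vector and the name cvar are both assumptions made here.

```r
a <- c(0.00810014+0.00169366i, 0.00813054+0.00158251i,
       0.00805489+0.00163295i, 0.00809141+0.00159533i,
       0.00813976+0.00161850i)

# variance of a complex vector: mean squared distance from the complex mean,
# using the n - 1 denominator
cvar <- function(z) sum(Mod(z - mean(z))^2) / (length(z) - 1)
cvar(a)

# the same quantity from the real and imaginary parts
var(Re(a)) + var(Im(a))
```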
e elegant
modification than mine?):
> crossprod(t(apply(xri, 1, '-', colMeans(xri/(nrow(xri)-1)
Do you agree?
On Sat, Mar 27, 2010 at 7:07 PM, Charles C. Berry wrote:
> On Sat, 27 Mar 2010, Gang Chen wrote:
>> Anybody knows what functions can be used to calculate
I've written a function, myFunc, that works fine with myFunc(data,
...), but when I use apply() to run it with an array of data
apply(myArray, 1, myFunc, ...)
I get a strange error:
Error in : '1' is not a function, character or symbol
which really puzzles me because '1' is meant to be the margin.
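One known way to get this message, offered as a guess since myFunc is not shown: a named argument passed through ... matches (or partially matches) one of apply()'s own arguments X, MARGIN or FUN. The array is then shifted into MARGIN and the 1 into FUN. The functions below are hypothetical stand-ins.

```r
myArray <- matrix(1:6, nrow = 2)

# hypothetical function whose extra argument happens to be called X
myFunc <- function(v, X) sum(v) + X

# X = 10 is matched to apply()'s own X, so myArray becomes MARGIN and 1 becomes FUN
res <- tryCatch(apply(myArray, 1, myFunc, X = 10),
                error = function(e) conditionMessage(e))
res

# renaming the extra argument avoids the clash
myFunc2 <- function(v, offset) sum(v) + offset
ok <- apply(myArray, 1, myFunc2, offset = 10)
ok
```

If the argument names in myFunc do not overlap with X, MARGIN or FUN, the cause is something else.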
I have some bits stored like the following variable nn
(nn <- c(1, 0, 0, 1, 0, 1,0))
[1] 1 0 0 1 0 1 0
not in the format of
and I need to convert them to numbers in base 10. What's an easy way to do it?
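Two short ways to do the conversion, assuming the first element of nn is the most significant bit; dec1 and dec2 are illustrative names.

```r
nn <- c(1, 0, 0, 1, 0, 1, 0)

# weight each bit by its power of two (first element = highest power)
dec1 <- sum(nn * 2^((length(nn) - 1):0))

# or paste the bits into a string and let strtoi() read it as base 2
dec2 <- strtoi(paste(nn, collapse = ""), base = 2)

c(dec1, dec2)
```

If the bits are stored least significant first, rev(nn) can be used in either line.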
In a classical meta-analysis model y_i = X_i * beta_i + e_i, data
{y_i} are assumed to be independent effect sizes. However, I'm
encountering the following two scenarios:
(1) Each source has multiple effect sizes, thus {y_i} are not fully
independent of each other.
(2) Each source has multiple effect sizes.
This is most likely a silly question.
First I run the following:
mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2,
post.3, post.4, post.5,
fup.1, fup.2, fup.3, fup.4, fup.5) ~
treatment*gender, data=OBrienKaiser)
phase <- factor(rep(c("pr
Hi, I have two sets of sensitivity, specificity, positive predictive
value, negative predictive value, and accuracy from two tests on
the same subjects. Is there an R package that does such paired
Gang Chen
Suppose that I need to run a multivariate linear model
Y = X B + E
many times with the same model matrix X but each time with different
response matrix Y. Is there a function available in 'car' package
similar to refit() in lme4 package so that the model matrix X would
not be reassembled each time?
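Independent of any package, one base-R sketch is to factor X once with qr() and reuse that decomposition for every Y. The data below are simulated purely for illustration, and all object names are made up.

```r
set.seed(1)
n <- 20
X <- cbind(1, rnorm(n), rnorm(n))     # model matrix, built once
qrX <- qr(X)                          # QR decomposition, computed once

Y1 <- matrix(rnorm(n * 2), n, 2)      # one response matrix
Y2 <- matrix(rnorm(n * 2), n, 2)      # another response matrix

B1 <- qr.coef(qrX, Y1)                # coefficient matrix for Y1
B2 <- qr.coef(qrX, Y2)                # coefficient matrix for Y2
E1 <- qr.resid(qrX, Y1)               # residual matrix for Y1
```

This gives only coefficients and residuals; multivariate tests would still have to be computed from them.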
I have a matrix 'dd' defined as below:
dd <- t(matrix(c(153.0216306, 1, 7.578366e-35,
13.3696538, 1, 5.114571e-04,
0.8476713, 1, 7.144239e-01,
1.2196050, 1, 5.388764e-01,
2.6349405, 1, 2.090719e-01,
6.0507714, 1, 2.780045e-02), nrow=3, ncol=6))
dimnames(dd)[[2]] <- c('# Chisq', 'DF', 'Pr(>Chisq)')
8366e-35 # Sex
>#2 13.3696538 1 5.114571e-04 # Volume
>#3  0.8476713 1 7.144239e-01 # Weight
>#4  1.2196050 1 5.388764e-01 # Intensity
>#5  2.6349405 1 2.090719e-01 # ISO
>#6  6.0507714 1 2.780045e-02 # SEC
Suppose I have the following dataframe:
L4 <- LETTERS[1:4]
fac <- sample(L4, 10, replace = TRUE)
(d <- data.frame(x = 1, y = 1:10, fac = fac))
x y fac
1 1 1 B
2 1 2 B
3 1 3 D
4 1 4 A
5 1 5 C
6 1 6 D
7 1 7 C
8 1 8 B
9 1 9 B
10 1 10 B
I'd like to add an
range of values you have to work with then one of the other
> more efficient methods may still be a better choice for this specific task.
> Hadley Wickham's "tidy data" [1] principles address this concern more
> thoroughly than I have.
> [1] Google this phrase...
I wrote an R program that does heavy computations with hundreds of
lines of code. It's running fine both interactively and in batch mode
on a Mac OS X computer. The program also has no problem running on a
Linux system (Fedora 14) interactively. However, when I try it on the
terminal in batch mode
Suppose I have a dataframe defined as
L3 <- LETTERS[1:3]
(d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
= TRUE)))
x y fac
1 1 1 C
2 1 2 A
3 1 3 B
4 1 4 C
5 1 5 B
6 1 6 B
7 1 7 A
8 1 8 A
9 1 9 B
10 1 10 A
I want to extract
Perfect! Thanks a lot, A.K!
Suppose I have a dataframe 'd' defined as
L3 <- LETTERS[1:3]
d0 <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
= TRUE))
(d <- d0[d0$fac %in% c('A', 'B'),])
x y fac
2 1 2 B
3 1 3 A
4 1 4 A
5 1 5 A
6 1 6 B
8 1 8 A
Even though factor 'fac' in 'd' onl
I'm trying to install the gsl wrapper source code
( on a Linux
system (OpenSuse 11.1), but encountering the following problem. I've
already installed 'gsl' version 1.14
( on the system. What's wrong?
You nailed it, Prof. Ripley! Thanks a lot...
I want to create a 3D scatter plot with a diagonal line. In addition, I'd
like to have those points plus the diagonal line projected to those three
planes (xy, yz and xz). Which package can I use to achieve this,
scatterplot3d or something else?
Thanks a lot for the quick help! How can I project the scatter plot with the
diagonal line onto the three planes with scatterplot3d? I could not find an
example demonstrating that in the vignette.
I know how to convert a simple dataframe from wide to long format with one
varying factor. However, for a dataset with two factors like the following,
Subj T1_Cond1 T1_Cond2 T2_Cond1 T2_Cond2
1 0.125869 4.108232 1.099392 5.556614
2 1.427940 2.170026 0.120748 1.176353
How to eleg
, 4,8) )
>   Subj substr(variable, 1, 2)    Cond1    Cond2
> 1    1                     T1 0.125869 4.108232
> 2    1                     T2 1.099392 5.556614
> 3    2                     T1 1.427940 2.170026
> 4    2                     T2 0.120748 1.176353
> The modifications to
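For comparison, a base-R sketch with reshape() that produces the same layout from the four wide columns. The column name Time and the object names wide and long are choices made here, not taken from the thread.

```r
wide <- data.frame(Subj = 1:2,
                   T1_Cond1 = c(0.125869, 1.427940),
                   T1_Cond2 = c(4.108232, 2.170026),
                   T2_Cond1 = c(1.099392, 0.120748),
                   T2_Cond2 = c(5.556614, 1.176353))

# each list element collects the columns that become one long-format variable
long <- reshape(wide, direction = "long", idvar = "Subj",
                varying = list(c("T1_Cond1", "T2_Cond1"),
                               c("T1_Cond2", "T2_Cond2")),
                v.names = c("Cond1", "Cond2"),
                timevar = "Time", times = c("T1", "T2"))
long <- long[order(long$Subj, long$Time), ]   # sort by subject, then time
long
```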
A very simple question. With a data frame like this:
> n = c(2, 3, 5)
> s = c("aa", "bb", "cc")
> df = data.frame(n, s)
I want df$s[1] or df[1,2], but how can I get rid of the extra line in
the output about the factor levels:
> df$s[1]
[1] aa
Levels: aa bb cc
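The Levels line appears because s was stored as a factor. Two simple options, sketched below: convert the element with as.character(), or keep s as character when building the data frame. stringsAsFactors = TRUE is set explicitly here only to reproduce the factor behaviour shown in the post; df2 is an illustrative name.

```r
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc")

df <- data.frame(n, s, stringsAsFactors = TRUE)    # s is a factor, as in the post
as.character(df$s[1])                              # plain string, no Levels line

df2 <- data.frame(n, s, stringsAsFactors = FALSE)  # s stays character
df2$s[1]
```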
I read somewhere that vector graphics such as eps or pdf are preferable to
alternatives (jpeg, bmp, or png) for publication because vector graphics
scale properly when enlarged. However, my problem is that the file generated
from a graph of fixed size is too large (on the order of 10 MB) because it
contains too many data points.
Hi, I have a question about SVAR modeling with the package vars. How does it
handle the situation where the A (structural) matrix has a non-recursive
structure in the SVAR model? In other words, what kind of algorithm does
vars adopt to deal with the identification problem in a non-recursive model?
I define the following function to convert a t-value with degrees of freedom
DF to another t-value with different degrees of freedom fullDF:
tConvert <- function(tval, DF, fullDF) ifelse(DF>=1, qt(pt(tval, DF),
fullDF), 0)
It works as expected with the following case:
> tConvert(c(2,3), c(10,12), 12)
c(0,12), 12)
[1] 0 3
However, I feel my solution is a little kludged. Any better idea?
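One alternative sketch: compute the conversion only for the entries with DF >= 1 and leave the rest at 0, so pt() is never called with an unusable DF. It assumes tval and DF have the same length and fullDF is a single number; tConvert2 is a hypothetical name.

```r
tConvert2 <- function(tval, DF, fullDF) {
  out <- numeric(length(tval))    # zeros by default
  ok <- DF >= 1                   # entries with usable degrees of freedom
  out[ok] <- qt(pt(tval[ok], DF[ok]), fullDF)
  out
}

tConvert2(c(2, 3), c(10, 12), 12)
tConvert2(c(2, 3), c(0, 12), 12)
```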
I have some data 'myData' in wide form (attached at the end), and
would like to convert it to long form. I wish to have five variables
in the result:
1) Subj: factor
2) Group: between-subjects factor (2 levels: s / w)
3) Reference: within-subject factor (2 levels: Me / She)
4) F: within-subject fa
value_var = 'value')
> head(mData4, 4)
> Subj Group Ref Time F J
> 1 S1 s Me 1 4 5
> 2 S1 s Me 2 3 6
> 3 S1 s She 1 6 10
> 4 S1 s She 2 6 9
> mData5 <- cast(mData3, Subj + Group + Ref + Var ~ Time, value_var
S1 s 6 She F1
6 S1 s  6 She F2
7 S1 s 10 She J1
8 S1 s  9 She J2
David, thanks a lot for the code! I've learned quite a bit from all
the generous help...
Suppose I create an R program called myTest.R with only one line like
the following:
type <- as.integer(readline("input type (1: type1; 2: type2)? "))
Then I'd like to run myTest.R in batch mode by constructing an input
file called answers.R with the following:
When I ran t
ve such problem?
Sorry for this dumb question. Suppose I have a named array ww defined as
ww <- 1:5
names(ww) <- c("a", "b", "c", "d", "e")
How can I extract the whole array of numbers without the names?
ww[1:5] does not work while ww[[1]] can only extract one number at a time.
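Two base-R options that drop the names and return all the values at once; plain1 and plain2 are illustrative names.

```r
ww <- 1:5
names(ww) <- c("a", "b", "c", "d", "e")

plain1 <- unname(ww)      # same values, names removed
plain2 <- as.vector(ww)   # as.vector() also strips the names attribute
plain1
```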
Thanks a lot for the suggestions!
I'm trying to analyze a model with two variables: one is Group with
two levels (male and female), and the other is Time with four levels (T1,
T2, T3 and T4). For the convenience of post-hoc testing I wanted
to consider a model with no intercept for factor Time, so I tried
Using the "ergoStool" data cited in Mixed-Effects Models in S and
S-PLUS by Pinheiro and Bates as an example, we have
> library(nlme)
> fm <- lme(effort~Type-1, data=ergoStool, random=~1|Subject)
> summary(fm)
Linear mixed-effects model fit by REML
Data: ergoStool
-10 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
I want to identify whether a variable is character(0), but I'm lost.
For example, if I have
> dd<-character(0)
the following doesn't seem to serve as a good identifier:
> dd==character(0)
So how can I detect character(0)?
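The comparison with == returns a zero-length logical vector, so it cannot be used inside if(). A small sketch of two checks that return a single TRUE or FALSE; isEmpty1 and isEmpty2 are illustrative names.

```r
dd <- character(0)

dd == character(0)                                # logical(0): nothing to compare

isEmpty1 <- is.character(dd) && length(dd) == 0   # TRUE
isEmpty2 <- identical(dd, character(0))           # TRUE
```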
Thanks a lot to all who've provided suggestions!
Suppose I have a two-way table of nominal category (party
affiliation) X ordinal category (political ideology):
party affiliation X (3 levels) - Democratic, Independent, and Republican
political ideology Y (3 levels) - liberal, moderate, and conservative
The dependent variable is the frequency (o
I have a dataframe DF with 4 columns (variables) A, B, C, and D, and
want to create a new dataframe DF2 by keeping B and C in DF but
counting the frequency of D while collapsing A. I tried
by(DF$D, list(DF$B, DF$C), FUN=summary)
but this is not exactly what I want. What is a good way to do it?
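A sketch with table(), assuming that "counting the frequency of D" means counting the rows for each combination of B, C and D once A is ignored. The data frame below is made up for illustration, and FreqD is just a chosen column name.

```r
# hypothetical data; the values are invented for this example
DF <- data.frame(A = c("A1", "A2", "A3", "A1", "A2"),
                 B = c("B1", "B1", "B1", "B2", "B2"),
                 C = c("C1", "C1", "C1", "C2", "C2"),
                 D = c("D1", "D1", "D1", "D2", "D2"))

# cross-tabulate B, C and D, then turn the table into a data frame of counts
DF2 <- as.data.frame(table(DF[c("B", "C", "D")]), responseName = "FreqD")
DF2 <- DF2[DF2$FreqD > 0, ]   # drop combinations that never occur
DF2
```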
Thanks a lot! This is exactly what I wanted.
C1 D243
B2 C2 D123
B2 C2 D243
>> <- x[, -1] # delete "A"
>>$FreqD <- counts # add new column
>> # print out unique entries
>> unique(
>B C D FreqD
> 1 B1 C1 D1 3
> 2 B2 C1 D1 3
> 4 B2 C2 D2 2
> 7 B1 C2 D2 2
> On
I'm running a categorical data analysis with a two-way design of
nominal by ordinal structure like the Political Ideology Example
(Table 9.5) in Agresti's book Categorical Data Analysis. The nominal
variable is Method while the ordinal variable is Quality (Bad,
Moderate, Good, Excellent). I
With the example you provided, it seems both glht() and contrast()
work fine.
Based on my limited experience with contrast(), if you encounter such
an error message you just mentioned, check
> dat.lme$apVar
You might see something like this
[1] "Non-positive definite approximate variance
For some unknown reason I stopped receiving any messages from the
R-help mailing list. See if this test gets through.
__ mailing list
PLEASE do read the posting guide http
I'm trying to use the following loop to open a window multiple times
to select files, but only the last window shows up. What am I missing?
nWin <- 6
fn <- vector('list', nWin)
for (ii in nWin) {
fn[[ii]] <- tclvalue( tkgetOpenFile( filetypes =
"{{Files} {.1D}} {{All files} {*}}" ))
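A likely cause, shown without tcltk so it can be run anywhere: for (ii in nWin) loops over the single value 6, so the body runs once with ii equal to 6. Looping over seq_len(nWin) visits 1 to 6. The vectors once and each are only there to record which values of ii were visited.

```r
nWin <- 6

# for (ii in nWin) iterates over the single value 6
once <- integer(0)
for (ii in nWin) once <- c(once, ii)
once

# seq_len(nWin) iterates over 1, 2, ..., 6
each <- integer(0)
for (ii in seq_len(nWin)) each <- c(each, ii)
each
```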
Suppose I have a file prog.R stored in a directory under ~/dirname,
and ~/dirname is set in a shell script file (e.g. .cshrc) as one of
the accessible paths on terminal. On a different directory I could run
prog.R interactively by executing
It seems that source() does n
I have a list, myList, with each of its 9 components being a 15 x 15
matrix. I want to run a t-test across the list for each element of
the matrix. For example, the first t-test is on myList[[1]][1, 1],
myList[[2]][1, 1], ..., myList[[9]][1, 1]; there are 15 x 15
t-tests in total. How can I run th
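One possible approach: stack the list into a 15 x 15 x 9 array and apply t.test() over the third dimension. The data below are simulated stand-ins because the real myList is not shown, and a one-sample test against zero is assumed; arr, tStat and pVal are illustrative names.

```r
set.seed(1)
# stand-in data: 9 matrices of size 15 x 15
myList <- replicate(9, matrix(rnorm(15 * 15, mean = 0.5), 15, 15),
                    simplify = FALSE)

arr <- simplify2array(myList)     # 15 x 15 x 9 array

# one-sample t-test across the 9 values in each cell
tStat <- apply(arr, c(1, 2), function(v) t.test(v)$statistic)
pVal  <- apply(arr, c(1, 2), function(v) t.test(v)$p.value)
dim(tStat)
```

Each of tStat and pVal is a 15 x 15 matrix, with entry [i, j] coming from the test on myList[[1]][i, j], ..., myList[[9]][i, j].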
I want to run an R program, prog.R, interactively. My question is, is
there a way I can start prog.R on the shell terminal when invoking R,
instead of using source() inside R?
When invoking on my Mac OS X 10.4.11, I get an X11 window
instead of quartz, which I find more desirable. So I'd like to set
the default device to quartz. However I'm confused because of the
> Sys.getenv("R_DEFAULT_DEVICE")
> getOption("device")
I've been using parApply() in snow package for parallel computing with
the following lines in R 2.8.1:
nNodes <- 4
cl <- makeCluster(nNodes, type = "SOCK")
fm <- parApply(cl, myData, c(1,2), func1, ...)
Since I have a Mac OS X (version 10.4.11) with two dual-core
