ldn't put the result into a R
> variable.
> Do you have any comments on this?
Try the following.
writeLines(c("uno dos tres", "cuatro cinco", "seis"), "some_file.txt")
out <- system("wc some_file.txt", intern=TRUE)
if (length(x) >= n) break
stopifnot(length(x) >= n)
x <- x[1:n]
Hope this helps.
occurrences of each protocol or
sum(all[, 2] == "UDP")
to get the number of UDP rows or
udp <- all[all[, 2] == "UDP", ]
to extract only UDP rows.
If you cannot change the export to .csv, you can use the function strsplit().
On Fri, Nov 19, 2010 at 10:34:26AM -0800, wangwallace wrote:
> this is a simple question, but I wasn't able to figure it out myself.
> here is the data frame:
> M P Q
> 1 2 3
> 4 5 6
> 7 8 9
> M, P, Q each represent a variable
> I want to draw 2 random sample from each row separatel
On Fri, Nov 19, 2010 at 07:22:57PM -0800, wangwallace wrote:
> actually, what I meant is to draw two random numbers from each row
> separately to create a new data frame. for example, an example output could
> be:
> 1 3
> 4 5
> 9 8
This may be done, for example
X <- matrix(1:9, ncol = 3, byr
On Sun, Nov 21, 2010 at 10:56:14AM -0500, David Winsemius wrote:
> On Nov 21, 2010, at 10:43 AM, madr wrote:
> >Is there any way of suppressing that error, like in other programming
> >languages you can specifically invoke an error or simply exit,
> If you are in a function, then return()
> >
col3 <- A[cbind(seq(nrow(A)), 3 + ind[, 3])]
B <- cbind(col1, col2, col3)
# or with a cycle over rows
C <- matrix(nrow=nrow(A), ncol=3)
for (i in seq(nrow(A))) {
    C[i, 1] <- A[i, ind[i, 1]]
    C[i, 2:3] <- A[i, 3 + ind[i, 2:3]]
}
ize under Linux, one can use
Sys.getenv("COLUMNS"). I do not know whether this also applies to macOS.
s them in some situations, may be found
in the first section of
and in
ngle character
may be done using 0/1 instead of FALSE/TRUE.
000 0.400 0.375 0.3636364
     [,1] [,2] [,3] [,4]
[1,]    6   15   24   33
[2,]    6   15   24   33
[3,]    6   15   24   33
Alternatively, it is possible to use
sweep(M, 2, colSums(M), FUN="/")
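As a small illustration of the sweep() call above, the following sketch assumes that M is matrix(1:12, nrow=3), which appears to be consistent with the column sums 6, 15, 24, 33 in the output shown earlier, but is not confirmed by this excerpt. It also compares the result with prop.table(), which is another base R way to obtain column proportions.

```r
# Assumed example matrix; its column sums are 6, 15, 24, 33
M <- matrix(1:12, nrow = 3, ncol = 4)

# Divide each column by its sum
P1 <- sweep(M, 2, colSums(M), FUN = "/")

# Alternative in base R: proportions within columns
P2 <- prop.table(M, 2)

colSums(P1)              # every column sums to 1
isTRUE(all.equal(P1, P2))
```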
) {
out[i, seq(length(p[[i]]))] <- p[[i]]
     [,1] [,2] [,3] [,4]
[1,]    3   NA   NA   NA
[2,]    2    5   NA   NA
[3,]    3    6    9   11
[4,]    1    3   NA   NA
ies in the table is 310660*17431. Using integer
type, this is 310660*17431*4 bytes, which is 20.17 GB. This probably
does not fit into RAM. Function table() produces a full matrix, not
a sparse one, even if there are empty cells.
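The size estimate above may be checked directly in R. This is only a sketch of the arithmetic: the variable names are chosen for illustration and 4 bytes per cell is the size of an R integer.

```r
# Dimensions quoted in the text above
n_rows <- 310660
n_cols <- 17431

# Numeric (double) arithmetic is used, since the product exceeds .Machine$integer.max
cells <- n_rows * n_cols
bytes <- cells * 4          # 4 bytes per integer cell

cells                       # 5415114460
bytes / 2^30                # about 20.17, in units of 2^30 bytes
```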
(rowSums((a - rep(x, each=nrow(a)))^2)))
xinit <- colMeans(a)
x <- optim(xinit, d, a=a)$par
points(rbind(x), col=2)
Is this what you mean?
Function optim() has further parameters, which influence efficiency
and accuracy, and there are also
ng the end-nodes of the edges starting in each node.
In a sorted file, they form blocks of consecutive lines, so a simple text
processing with perl is sufficient.
3 4"
3 "2"
4 "1 3 5"
5 "2 4"
and to a text (with a possible file= argument)
cat(paste(names(out2), out2), sep="\n")
1 2 3 4 5
2 3 4
3 2
4 1 3 5
5 2 4
> some of the basic function will cause old code to break?
I think that this is an important part of the reason.
On Wed, Dec 15, 2010 at 11:08:06AM -0200, Henrique Dallazuanna wrote:
> Try this:
> gsub("[^0-9]", "", "AB15E9SDF654VKBN?dvb.65")
Consider also
strsplit("AB15E9SDF654VKBN?dvb.65", "[^.0-9][^.0-9]*")
[1] "" "15" "9" "654" ".65"
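A "positive" formulation, which extracts the matching pieces instead of splitting on the non-matching ones, may be sketched as follows. It assumes a version of R which provides regmatches(); the variable names are chosen only for this illustration.

```r
x <- "AB15E9SDF654VKBN?dvb.65"

# gregexpr() locates every maximal run of digits and dots,
# regmatches() extracts the matched substrings
nums <- regmatches(x, gregexpr("[.0-9]+", x))[[1]]
nums
# [1] "15"  "9"   "654" ".65"
```

Unlike the strsplit() solution, this does not produce the empty string at the beginning.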
> On Wed, Dec 15, 2010 at 6:55 AM, Luis Fe
> > pullchar("AB15E9SDF654VKBN?dvb.65", "[0-9]")
> [1] "15965465"
> Still learning regex so if there is a "positive" strategy I'm all
> ears. ...er, eyes?
One of the suggestions in this thread was to use an external program.
On Thu, Dec 16, 2010 at 06:17:45AM -0800, Dieter Menne wrote:
> Petr Savicky wrote:
> >
> > One of the suggestions in this thread was to use an external program.
> > A possible solution without negation in Perl is
> >
> > @a = ("AB15E9SDF654VKBN?dvb.
cars1_diff <- cars2_plus[ - seq(nrow(cars2_set)), ]
cars2_diff <- cars1_plus[ - seq(nrow(cars1_set)), ]
all(cars1_unique == cars1_diff) # [1] TRUE
all(cars2_unique == cars2_diff) # [1] TRUE
"9 Ladies Dancing"
> [5] "8 Maids-a-Milking" "7 Swans-a-Swimming"
Additionally, if you know the exact character positions, which have
to be changed, then substr() can be used.
x <- "123456789"
substr(x, 5, 7) &l
On Fri, Dec 17, 2010 at 07:39:46AM -0500, Gabor Grothendieck wrote:
> On Thu, Dec 16, 2010 at 11:42 AM, Petr Savicky wrote:
> > Can something similar be done in R either specifically for numbers or
> > for a general regular expression?
> Dieter's first p
006 for row 1, 2008 for row 2 and 2008 for row 3.
If the pattern is always c("0","1"), the number of rows is large
and the number of years is relatively small, then this may
also be computed using matrix calculations. For example
M <-
Hi Mark:
> However, if the dataframe contains non-unique rows (two rows with
> exactly the same values in each column) then the unique function will
> delete one of them and that may not be desirable.
In order to get information about equal rows between two dataframes
without removing duplicated
ding errors
in simple situations.
x[i-2]), f(x[i-1])),
points(x[i], 0)
aux <- readline("press Enter to continue")
regulafalsi(function(x) x^(1/2)+3*log(x)-5,1,10)
regulafalsi(function(x) x^(1/2)+3*log(x)-5,1,100)
then it may be seen that
der). Which
of these two was in your actual R code? In a formula, the ASCII tilde is
"Subscribing to R-help" and follow the description.
sample(which()) samples from 1:i.
However, with the parameters 34 and 40, your code applies sample() to vectors
of length at least 35 or at least 40 - 34.
If you want to keep all cases and only reassign the groups, you can either
modify df$mar.y (and not the
0) > 34) {
df$mar.y[sample(ind0, length(ind0) - 34)] <- 1
} else {
df$mar.y[sample(ind1, 34 - length(ind0))] <- 0
}
table(old, new=df$mar.y) # just to check the change
Does this work in your situation?
[1] 0.868
[1] 1.079
[1] 1.29
[1] 1.507
[1] 1.724
[1] 1.947
[1] 2.172
like the following?
xx <- c("abc", "abcd", "abcde")
z1 <- rep("0", times=length(xx))
z2 <- substr(z1, 1, 9 - nchar(xx))
yy <- paste(z2, xx, sep="")
# yy
#[1,] "00abc"
le of the identifiers would be helpful.
Filtering out different types of delimiters may be done as
a preprocessing step, for example, using gsub()
s <- c("ab cd", "ab cd", "a b cd")
gsub(" ", "", s) # [1] "abcd" "abcd" &qu
> b b b b
> c c c c
> d d d d
> But what I really want is:
> a b c d
> b c d a
> c d a b
> d a b c
> How can I do this?
Try the following
A <- c("a","b","c","d")
B <- matrix(
to the previous ones
nro <- as.numeric(readline("no of teams "))
teams <- rep(NA, times=nro)
for (i in seq(length=nro)) {
repeat {
current <- readline(paste("team", i, ""))
if (curren
lution needs to be applied many times, so I need something quick -- I
> was hoping a base function would do it, but I'm drawing a blank.
If the matrix can have a different number of columns, then
also the following can be used
combs <- as.matrix(expand.grid(c(0,1),c(0,1),c(0,1)))
x <
4 4
> 4
> [75] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Let me suggest slightly simpler code, which produces the same
output if the input has length at most 9.
xx <- c("abc", "abcd", "abcde")
xx <- paste(&quo
e sum of numbers, which have very different magnitudes, may
be approximated by their maximum, so max(logx) is an
approximation of the required logarithm. A more accurate
calculation can be done, for example, as follows
maxlog + log(sum(exp(logx - maxlog)))
# [1] 5675.977
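The following sketch shows why the shifted form is needed. The values in logx are hypothetical and are chosen so that exp(logx) overflows in double precision.

```r
# Hypothetical logarithms of three very large numbers
logx <- c(1000, 1001, 1002)

# Direct evaluation overflows: exp(1000) is Inf in double precision
naive <- log(sum(exp(logx)))

# Shifting by the maximum keeps all exponentials in the interval (0, 1]
maxlog <- max(logx)
stable <- maxlog + log(sum(exp(logx - maxlog)))

naive      # Inf
stable     # slightly above max(logx), about 1002.408
```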
Petr Savick
calculate the ratios of the
to the largest of them or to a close approximation of the largest.
a[, ] <- v[row(a) + col(a) - 1]
[,1] [,2] [,3] [,4]
[1,] "a" "b" "c" "d"
[2,] "b" "c" "d" "a"
[3,] "c" "d" "a" "b"
[4,] "d" "a" "b" "c"
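A self-contained version of the indexing used above may look as follows. The definitions of v and of the empty matrix a are not visible in this excerpt, so they are assumptions here: v is taken to be the input vector repeated twice.

```r
A <- c("a", "b", "c", "d")
n <- length(A)

# Assumed helper: the vector repeated twice, so that indices up to 2*n - 1 are valid
v <- rep(A, 2)

a <- matrix("", nrow = n, ncol = n)
# Entry [i, j] takes element i + j - 1, which shifts each row by one position
a[, ] <- v[row(a) + col(a) - 1]
a
```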
dimensions represent the columns.
This does not reorder the elements of the array, only changes the
dimension information.
A1 <- matrix(A, nrow=3*3, ncol=2)
Compute the means
B1 <- c(B)
C1 <- rowsum(A1, group=B1)/c(table(B1))
Put the means to the required positions
for(a in
f the residues in the columns are periodic, like in the
example above, then also the vector x is periodic. If this
really occurs in the application, generating the new column
using this fact may also be useful. The "length.out" argument
of rep() can be used.
y <- rep(rowSums(a[
Sums(a %% 13 == 0) != 0)
b <- cbind(a, x)
x <- (rowSums(a %% 13 == 0) != 0) + 0
"Resp", "other")
Stat <- Vals[c(1, 3, 1, 2, 3, 2, 1, 3, 2)]
ind <- which(Stat %in% c("MagDwn", "Resp"))
Reduced <- Stat[ind]
ind[which(diff(Reduced == "Resp") == 1) + 1]
# [1] 4 9
The positions of the corresponding MagDwn are
> for(i in 1:length(positions)){
If "positions" may, in some situations, be of length 0, then it
is better to use
for(i in seq(along=positions))
when needed. There is an R Wiki
page with some tips concerning factors at
"id", "diagnosis"), class = "data.frame", row.names = c(NA,
tab <- table(df$id, df$diag)
Then, for example, the data cases for "2. Patients with ah but no ihd"
may be obtained
sel <- tab[, "ah"] != 0 &
# AUC diff
# [1,] 0.73920
The difference is not always exactly zero, but is at the level
of the machine rounding error.
> On Thu, Jan 20, 2011 at 3:04 PM, He, Yulei wrote:
> > Hi, there.
> >
> > Suppose I already have sensitivit
tisfy some property, which allows one to find a unique solution, then
the algorithm depends on what is known about the original matrices.
Hope this helps.
619691, 0.018334730, -0.009747171)
x <- numeric(length(y))
for (i in 1:length(y)) {
x[i] <- ifelse(i==1, 1*(1+y[i]), (1+y[i])*x[i-1])
}
z <- 1*cumprod(1 + y)
max(abs(x - z))
# [1] 1.818989e-12
, digits=2)
yields the correct comparison
round(t, 2) == round(tt, 2)
# [1] TRUE
although 0.2 is also not exactly representable. Both sides are rounded
to the same representable number.
See also
for other examples.
nfluence the next iteration of the loop.
For example, the following loop always makes m*n repetitions, although
using the same variable in nested loops is definitely not suggested.
m <- 3
n <- 5
for (i in seq(length=m)) {
for (i in seq(length=n)) {
, if k is 0, since 1:0 is a vector of length 2.
If k may be 0, then it is better to use
for (n in seq(length=k))
since seq(length=0) has length 0.
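The difference may be seen in the following short sketch. The counter variable is introduced only for this illustration; seq_len(k) is a base R equivalent of seq(length=k).

```r
k <- 0

length(1:k)               # 2, since 1:0 is c(1, 0)
length(seq(length = k))   # 0
length(seq_len(k))        # 0

# Count the iterations performed by both forms of the loop
bad <- 0
for (n in 1:k) bad <- bad + 1
good <- 0
for (n in seq(length = k)) good <- good + 1

c(bad, good)              # 2 0
```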
Hope this helps.
On Tue, Jan 25, 2011 at 09:05:03AM +0100, Petr Savicky wrote:
> to foreach loop in Perl. If v is a vector, then
> for (n in v)
> first creates the vector v and then always performs length(v) iterations.
I forgot that ‘break’ may stop the loop. See ?"for" fo
(-1, 1, length=5)
xints <- data.frame(
x1=cut(x[, 1], breaks=breaks),
x2=cut(x[, 2], breaks=breaks),
x3=cut(x[, 3], breaks=breaks))
xtabs(~ ., xints)
Hope this helps.
nd faster, as others already
mentioned. Using replicate(), I obtained on my computer a speedup by a
factor between 5 and 7 for k <= 20 and there is a remarkable speedup
also for larger k. The function seq.int() is more general than the other
two. In particular, it can ge
e, then also the data frame will not
contain the unwanted level.
ncol=3, byrow=TRUE)
A <- matrix(nrow=n+1, ncol=n)
for(i in 1:n){
    A[i, seq.int(along=x)] <- x
    x <- diff(x)
}
M <- matrix(A, nrow=n, ncol=n)
M[upper.tri(M)] <- t(M)[upper.tri(M)]
Reorganizing an (n+1)
root(f, c(0, 10), x=1, y=3)$root
[1] 5
Hope this helps.
logical result, then use
all(A == B)
for exact equality and
all(abs(A - B) <= eps)
for approximate equality of all entries.
See also ?all.equal, which uses the relative error, not the absolute one.
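The three comparisons may behave differently, as the following sketch with two small hypothetical matrices shows. The tolerance eps is an arbitrary choice for the illustration.

```r
# Two matrices which differ only by a rounding error in one entry
A <- matrix(c(0.1 + 0.2, 1, 2, 3), nrow = 2)
B <- matrix(c(0.3, 1, 2, 3), nrow = 2)
eps <- 1e-12                  # hypothetical tolerance

all(A == B)                   # FALSE, exact equality fails
all(abs(A - B) <= eps)        # TRUE, equal up to the absolute error eps
isTRUE(all.equal(A, B))       # TRUE, equal up to the default relative tolerance
```

Note that all.equal() returns a character description of the difference if the objects are not equal, so it should be wrapped in isTRUE() when a logical result is needed.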
Hope this helps.
a b
> 1 1 1 3 3 5 6
> So the desired result is:
> id name
> 8 e
> 17 f
> 20 g
> 4 c
> 11 c
> 19 c
> 6 d
> 9 d
> 10 d
> 1 a
> 5 a
> 12 a
> 14 a
> 15 a
> 2 b
> 3 b
colMeans() and variance using
var(). See also ?var.
Hope this helps.
> strings without looping -- I have to think not.
Try the following
x <- c("this is a string", "this is a numeric")
reassemble <- function(x, ind) paste(x[ind], collapse=" ")
vapply(strsplit(x," "), reassemble, "chara
Col2 = with(x, ave(H, paste(Site, Prof),
mory, but it is more
efficient than read.table(), since it does no parsing of
the file as a whole.
x <- readLines("file")
strsplit(x[length(x)], " +")[[1]][3]
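A runnable variant of the two lines above may be sketched as follows. The file contents are hypothetical and are written to a temporary file, so that the example does not depend on an existing file.

```r
# Hypothetical file contents, written to a temporary file
f <- tempfile()
writeLines(c("a b c", "d  e   f", "10 20   30"), f)

x <- readLines(f)
# Split the last line on runs of spaces and take the third field
fields <- strsplit(x[length(x)], " +")[[1]]
fields[3]
# [1] "30"

unlink(f)
```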
Hope this helps.
On Wed, May 04, 2011 at 08:52:07AM -0700, William Dunlap wrote:
> > -Original Message-
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Petr Savicky
> > Sent: Wednesday, May 04, 2011 12:51 AM
> > To: r-help@r-p
"train" is a subset of "master".
master <- data.frame(ID=2001:2011)
train <- data.frame(ID=2004:2006)
valid <- master[! (master[, 1] %in% train[ ,1]), , drop=FALSE]
Hope this helps.
[1] 6 6 6 6 6 6 6 7 7 7 7 7 7 6 6 6 6 6 6 6 7 7 7 7 7 7
The input vector may be obtained using c() from a matrix. The
output vector may be reformatted using matrix(). However, for
a matrix solution, a more precise description of the question
is needed.
Hope this helps.
more elements than the input, since 1040/160/1 = 6.5. This
corresponds to the understanding that odd elements should repeat 7
times and even elements 6 times. However, it is not clear what
the dimension of the output matrix should be.
Hope this helps.
the standard arithmetic and includes explicit tests like
abs(x) < 1e-20.
the second dataset.
I am not sure whether you can also consider other types
of subsets to increase the number of different samples.
For example, the following selects 16 rows at random
a[sort(sample(1:32, 16)), ]
Hope this helps.
Try the following
a <- matrix(c(0, 0, 2, 0, 4, 0, 1, 8, 0, 56), ncol=2)
a[rowSums(a != 0) != 0, ]
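If only one row may remain after the selection, then the result of the indexing above is a vector, not a matrix. A possible precaution, shown here on a hypothetical small matrix b, is to add drop=FALSE.

```r
a <- matrix(c(0, 0, 2, 0, 4, 0, 1, 8, 0, 56), ncol = 2)
keep <- rowSums(a != 0) != 0      # TRUE for rows with at least one nonzero entry
res <- a[keep, ]                  # rows 2, 3 and 5 of a

# Hypothetical matrix with a single nonzero row
b <- rbind(c(0, 0), c(1, 2))
v <- b[rowSums(b != 0) != 0, ]                  # a plain vector of length 2
m <- b[rowSums(b != 0) != 0, , drop = FALSE]    # still a 1 x 2 matrix
dim(m)
```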
Hope this helps.
the desired covariance matrices would
> be appreciated.
Let me suggest the following procedure.
1. Generate a symmetric matrix A with the desired distribution of the
non-diagonal elements and with zeros on the diagonal.
2. Compute the smallest eigenvalu
On Fri, Jun 03, 2011 at 01:54:33PM -0700, Ned Dochtermann wrote:
> Petr,
> This is the code I used for your suggestion:
> k<-6;kk<-(k*(k-1))/2
> x<-matrix(0,5000,kk)
> for(i in 1:5000){
> A.1<-matrix(0,k,k)
> rs<-runif(kk,min=-1,max=1)
> A.1[lower.tri(A.1)]<-r
, 0.60330865, 0.61832829)
x <- matrix(rnorm(36), nrow=6, ncol=6) %*% diag(w)
x <- x/sqrt(rowSums(x^2))
a <- x %*% t(x)
Hope this helps.
which contains seeds, a critical function used in the simulation
and also a package and R version. The last two things may be obtained
for example as
Up to now, I did not really need these
e negative eigenvalues. For example,
if all components in sigma[1:20] are 4, which is in
the range used for sigma, then we have a matrix, whose
diagonal elements are 4 and nondiagonal elements are
0.3*4^2 = 4.8 > 4. This matrix has negative eigenvalues,
so it is not a covarian
jumping ahead in the sequence
without generating all intermediate numbers, but I do
not know about an efficient available implementation.
squared test for given probabilities
data: x
X-squared = 25, df = 4, p-value = 5.031e-05
It is not clear whether this is suitable for your application.
If you generate the values in a different way, then another
test may be needed. Can you specify more detail on how the
numbers are generated
er of species, which are contained
in a sum of a random selection of k rows may be computed easily,
since we can consider the columns (species) individually and
for each column, the probability to get a nonzero sum may be
computed without actually constructing all the subsets.
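The column-wise calculation may be sketched as follows, assuming that the quantity of interest is the expected number of species found in k rows selected at random without replacement, and that the entries are nonnegative, so that a column sum is nonzero exactly when at least one selected row is nonzero. The matrix a and the value of k are hypothetical. For a species which occurs in m of the n rows, the probability of missing it is choose(n - m, k)/choose(n, k).

```r
# Hypothetical abundance matrix: rows are surveys, columns are species
a <- matrix(c(1, 0, 0,
              0, 2, 0,
              3, 0, 0,
              0, 0, 0), nrow = 4, byrow = TRUE)
n <- nrow(a)
k <- 2

m <- colSums(a != 0)                        # number of rows containing each species
p <- 1 - choose(n - m, k) / choose(n, k)    # P(species occurs in the k selected rows)
expected <- sum(p)

# Brute-force check over all subsets of k rows, feasible only for small n
subsets <- combn(n, k)
counts <- apply(subsets, 2, function(s) sum(colSums(a[s, , drop = FALSE]) != 0))
brute <- mean(counts)

c(expected, brute)                          # both are 4/3
```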
If you need a parame
On Thu, Jan 27, 2011 at 05:30:15PM +0100, Petr Savicky wrote:
> On Thu, Jan 27, 2011 at 11:30:37AM +0100, Serena Corezzola wrote:
> > Hello everybody!
> >
> >
> >
> > I'm trying to define the optimal number of surveys to detect the highest
> > numbe
0.04101562 0.03222656 0.0625 -0.05468750 0.04687500
matrix(apply(expand.grid(x, y), 1, FUN=FUN), nrow=length(x), ncol=length(y))
[,1] [,2]
[1,] 0.05468750 0.0625000
[2,] -0.04101562 -0.0546875
[3,] 0.03222656 0.0468750
Hope this helps.
2 chr1 70 80
3 chr2 90 110
4 chr2 130 140
5 chr3 190 230
The column tb1$index contains for each row the index of the interval [V2, V3]
in tb2, which contains the values V2, V3 from tb1. For example, line
10 chr2 130 131 1 4
of tb1 contains index 4, because the interval
4 chr
tb2. If you need to test a nonempty
intersection, it is slightly more complicated, but not much.
Are you from the same research team?
If r1, r2 should be the names of the columns, then use named
arguments in the call of the function data.frame()
rr[1:3, ]
r1 r2
1 12.3274362 224.7632
2 13.1347464 214.3805
3 0.7495177 219.6179
n <- 33
y <- cumsum(runif(n))
# restarting indices
ind <- 1:n - (1:n) %% 10
ind[ind == 0] <- 1
plot(y - y[ind])
Is this close to what you want?
If not, then I suggest sending the loop solution as a part of the description.
manage to find the
> answer.
The reason is that a number, which has a finite expansion as a decimal
number, need not have a finite expansion as a binary number. Besides FAQ 7.31,
see also
for further examples and some hints.
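A short illustration of this effect follows. The numbers 0.5, 0.25 and 0.75 have finite binary expansions, while 0.1 and 0.3 do not.

```r
# Finite binary expansions: the arithmetic is exact
exact <- (0.5 + 0.25 == 0.75)

# 0.1 and 0.3 are not exactly representable, so the comparison fails
inexact <- (0.1 * 3 == 0.3)

# Comparison up to a tolerance succeeds
approx <- isTRUE(all.equal(0.1 * 3, 0.3))

c(exact, inexact, approx)
# [1]  TRUE FALSE  TRUE
```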
density function over that region. If the
region is a single point, then this integral is zero.
Functions related to a multivariate normal distribution may
be computed using package
on of the points into a finite
number of regions and keeps the information needed for any of the
tasks, which you mention.
Hope this helps.
1 0
3 0 2 0 0
Does this approach work for your data?
rownames(expand) <- NULL
animal N
1 a 2
2 a 2
3 b 1
4 c 3
5 c 3
6 c 3
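One possible way to obtain a table of this form is to repeat the row indices according to N. The construction of "expand" is not visible in this excerpt, so the input data frame df below is an assumption made only to reproduce the output shown above.

```r
# Assumed input: one row per animal with its count N
df <- data.frame(animal = c("a", "b", "c"), N = c(2, 1, 3))

# Repeat each row index N times and select the rows
expand <- df[rep(seq_len(nrow(df)), times = df$N), ]
rownames(expand) <- NULL
expand
```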
Hope this helps.
e numbers, consider
also the function lfactorial(), which computes the logarithm
in the standard numeric type.
[1] 5912.128
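The value above is consistent with lfactorial(1000), although the exact call is not visible in this excerpt. The following sketch, with n chosen under this assumption, compares lfactorial() with a direct sum of logarithms and shows that factorial() itself overflows.

```r
n <- 1000

lf <- lfactorial(n)                   # logarithm of n!, about 5912.128
direct <- sum(log(1:n))               # the same value computed directly

# factorial(n) is Inf in double precision for n this large
overflow <- suppressWarnings(factorial(n))

digits10 <- lf / log(10)              # base 10 logarithm of n!, about 2567.6
c(lf, direct, overflow, digits10)
```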
Hope this helps.
eal variables, which has
a measurable difference of expected value on Mersenne Twister numbers and
truly random ones, then this is likely to be an interesting mathematical
<- rbind(x %o% y, zero)
k <- m + n - 1
b <- matrix(c(a)[1:(n*k)], nrow=k, ncol=n)
Testing this on computing the product of the polynomials (1+t)^4 (1+t)^3
x <- choose(4, 0:4)
y <- choose(3, 0:3)
convolution(x, y)
[1] 1 7 21
5.803329 5.803329
6.289873 6.28988
6.876084 6.876084
7.5992 7.599201
8.518665 8.518683
9.736212 9.736212
11.44329 11.44328
14.05345 14.05345
18.68155 18.68155
29.97659 29.9766
219.3156 219.3155
Hope this helps.
This also suggests that the same distribution on the random assignments
is obtained, if area is created already sorted and only the second
column of "ass" is random
ass <- as.data.frame(cbind("area"=1:7, "strategy"=sample(1:7, 7)))
Whether creating only thi
<- data.frame(V1=1:3, V2="CC", V3=seq(3.1, 3.3, by=0.1))
for (j in 1:3) {
# here any command using tab[[j]] may be used
# using print() for simplicity
See chapter 6 Lists and data frames of R-intro.pdf available at
> How do I do it in R ?
Let me use a small example
df <- data.frame(a=1:7, b=11:17, row.names=letters[1:7])
a b
a 1 11
b 2 12
c 3 13
d 4 14
e 5 15
f 6 16
g 7 17
Is the following what you are asking for in terms of this small example?
