# get multiple vendors for each account
z <- with(Data2, tapply(Vendor, Account, I))
n <- vapply(z, length, 1)
data.frame(Vendor = unlist(z),
           Account = rep(names(z), n),
           NumVen = rep(n, n)
)
## which gives:
    Vendor Account NumVen
A1      V1      A1      1
A21     V2      A2      3
A22     V3      A2      3
A23     V1      A2      3
Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
Hi everyone. I have a dataframe that is a collection of Vendor IDs
plus a bank account number for each vendor. I'm trying to find a way
to count the number of duplicate bank accounts that occur in more than
one unique Vendor_ID, and then assign the count value for each row in
the dataframe.
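One way to attach such a count to every row is sketched below. The data frame `accounts` and its values are made up for illustration (the original Data1 and Data2 are not fully shown), and ave() is only one of several possible tools.

```r
# Hypothetical data: one row per (vendor, bank account) pair
accounts <- data.frame(
  Vendor  = c("V1", "V2", "V3", "V1", "V4", "V2"),
  Account = c("A1", "A2", "A2", "A2", "A3", "A4"),
  stringsAsFactors = FALSE
)

# For each row, count the distinct vendors that share that row's account.
# ave() is applied to row indices so the result stays an integer vector.
accounts$NumVen <- ave(
  seq_along(accounts$Vendor), accounts$Account,
  FUN = function(i) length(unique(accounts$Vendor[i]))
)
accounts
```

Rows with NumVen greater than 1 are the accounts shared by more than one vendor.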
I have the following little script to create a df of summary stats. I'm
having problems obtaining the number of unique values:
unique=sapply(myData, function (x)
length(unique(x), replace = TRUE))
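A likely cause of the problem: length() accepts a single argument, so passing replace = TRUE to it raises an error. A minimal sketch of the per-column count, using a made-up `myData` since the original data are not shown:

```r
# Hypothetical stand-in for the poster's data frame
myData <- data.frame(
  a = c(1, 1, 2, 3),
  b = c("x", "x", "x", "y"),
  stringsAsFactors = FALSE
)

# Number of distinct values in each column
n_unique <- sapply(myData, function(x) length(unique(x)))
n_unique
```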
Hi all,
I have a data frame with a variable Description containing text of speeches and
I would like to count number of sentences in each speech,
> str(data)
'data.frame': 255 obs. of 3 variables:
$ Group : Factor w/ 255 levels "AlzheimerGroup1","AlzheimerGroup10",..: 1
112 179 190
I have a data.table which is shown below. I want to count combinations of
columns on i and count on j with by. A few examples are given below the
table.
I want to:
all months to show in the output, including those that have a zero value
I want the three statements combined into one if possible.
What's the expected output for this sample?
How do _you_ define what should be counted?
Hi all,
I was not clear enough in my example code. Please see below where "blah
blah blah" can be ANY text or numbers: No predictable pattern at all to
what may or may not be written in place of "blah blah blah".
text1<-c("blah blah blah.
blah blah blah
1) blah blah blah 1
2) blah blah blah
stringr::str_count (and stringi::stri_count that it wraps) interpret
the pattern argument as a regular expression by default.
I like Boris's "Hadley" solution. For the record, I've appended a
version that uses regular expressions, the only benefit of which is
that it could be generalized to find more-complicated patterns.
counts <- sapply(text1, function(next_string) {
loc_example <- length(gregexpr("Exampl
I should add: there's a str_count() function in the stringr package.
str_count(text1, "Example")
# [1] 5 5 5 5
I guess that would be the neater solution.
How about:
unlist(lapply(strsplit(text1, "Example"), function(x) { length(x) - 1 } ))
Splitting your string on the five "Examples" in each gives six elements,
so length(x) - 1 is the number of matches. You can use any regex instead of
"Example" if you need to tweak what you are looking for.
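One caveat with the strsplit() approach: a match at the very end of a string produces no trailing empty element, so the count comes out one short. A base R sketch that counts the matches directly avoids this. The helper name `count_matches` and the sample strings are made up for illustration.

```r
# Count regex matches in each element of a character vector (base R only)
count_matches <- function(x, pattern) {
  lengths(regmatches(x, gregexpr(pattern, x)))
}

samples <- c("Example a Example", "none here", "ends with Example")

by_regex <- count_matches(samples, "Example")
by_split <- unlist(lapply(strsplit(samples, "Example"),
                          function(x) length(x) - 1))

by_regex  # counts every match, including one at the end of a string
by_split  # misses the match at the end of the third string
```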
Hi all,
I am looking for a streamlined way of counting the number of enumerated
items in each element of a character vector. For example:
text1<-c("This is an example.
List 1
1) Example 1
2) Example 2
10) Example 10
List 2
1) Example 1
2) Example 2
These have been examples.","This is another ex
On Oct 2, 2015, at 2:33 AM, Duncan Murdoch wrote:
The zoo package replaces as.Date.numeric() with a function that
assumes an origin of "1970-01-01". There may be other packages
that also make a replacement like this. David appears to have one
of t
On 01/10/2015 11:29 PM, Rolf Turner wrote:
> On 02/10/15 15:47, David Winsemius wrote:
>> On Oct 1, 2015, at 6:22 PM, Rolf Turner wrote:
>>> P.S. I have been unable to find a corresponding vector of the names
>>> of the days of the week, although I have a very vague recollection
>>> of
On 02/10/15 15:47, David Winsemius wrote:
On Oct 1, 2015, at 6:22 PM, Rolf Turner wrote:
P.S. I have been unable to find a corresponding vector of the names
of the days of the week, although I have a very vague recollection
of the existence of such a vector. Does it exist, and if so what
On 02/10/15 10:54, peter dalgaard wrote:
On 01 Oct 2015, at 23:04 , Rolf Turner
On 02/10/15 03:45, David L Carlson wrote:
If you want the month names:
mnt <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
+ "July", "Aug", "Sep", "Oct", "Nov", "Dec")
dimnames(tbl)$Month <- mnt
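For reference, base R ships the English month names as the constants month.abb and month.name, so the vector does not have to be typed by hand. As far as I know there is no matching constant for the days of the week. One workaround, sketched here, is to call weekdays() on seven consecutive dates; its result depends on the current locale.

```r
# Built-in constants in base R
mnt <- month.abb        # "Jan" "Feb" ... "Dec"
full_names <- month.name  # "January" ... "December"

# Day names for seven consecutive dates starting on a Monday
# (2015-10-05 was a Monday); the names follow the current locale.
week_days <- weekdays(as.Date("2015-10-05") + 0:6)

mnt
week_days
```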
Unnecessary typing; there is a built-in data set "" (in the
"base" package) that is
df <- data.frame( V1= 1, V2= c( 2, 3, 2, 1), V3= c( 1, 2, 1, 1))
dfO <- df[ do.call( order, df), ]
dfOD <- duplicated( dfO)
dfODTrigger <- ! c( dfOD[-1], FALSE)
dfOCounts <- diff( c( 0, which( dfODTrigger)))
cbind( dfO[ dfODTrigger, ], dfOCounts)
V1 V2 V3 dfOCounts
4 1 1 1 1
3 1 2
Have a look at the dplyr package
library(dplyr)
n <- 1000
data.frame(
  V1 = sample(0:1, n, replace = TRUE),
  V2 = sample(0:1, n, replace = TRUE),
  V3 = sample(0:1, n, replace = TRUE)
) %>%
  group_by(V1, V2, V3) %>%
  summarise(
    Freq = n()
  )
On 10/09/2015 9:11 AM, Thomas Chesney wrote:
> Can anyone suggest a way of counting how frequently sets of values occur in
> a data frame? Like table() only with sets.
Do you want 1,2,1 to be the same as 1,1,2, or different? What about
1,2,2? For sets, those are all the same, but for most purp
Can anyone suggest a way of counting how frequently sets of values occur in a
data frame? Like table() only with sets.
So for a dataset:
V1, V2, V3
1, 2, 1
1, 3, 2
1, 2, 1
1, 1, 1
The output would be something like:
1,2,1: 2
1,3,2: 1
1,1,1: 1
Thank you,
Thomas Chesney
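A minimal base R sketch for the example above, assuming that order within a row matters (so 1,2,1 and 1,1,2 count as different): paste each row into a single key and tabulate the keys. The object names are illustrative only.

```r
# The four example rows from the question
df <- data.frame(V1 = c(1, 1, 1, 1),
                 V2 = c(2, 3, 2, 1),
                 V3 = c(1, 2, 1, 1))

# One string per row, e.g. "1,2,1", then count identical strings
key <- do.call(paste, c(df, sep = ","))
counts <- table(key)
counts
```

If order within a row should not matter, sorting each row before pasting would be one way to treat the rows as sets.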
Try the following:
## step 1: write raw data to an array
# entering the numbers (not the 'year' etc. labels) into R as a vector after
# convert the vector into a 2-d array with 4 columns (year, month, day,
## step 2:
Hello R-users,
I want to ask how to count the number of daily rain data. My data are as below:
Year Month Day Amount
1901 1 1 0
1901 1 2 3
1901 1 3 0
1901 1 4 0.5
1901 1 5 0
1901 1 6 0
1901 1 7 0.3
1901 1 8 0
1901 1 9 0
1901 1 10 0
1901 1 11 0.5
1901 1 12 1.8
1901 1 13 0
1901 1 14 0
1901 1 15 2.5
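A possible sketch, assuming that "rain day" means a day with Amount greater than zero (the question does not say so explicitly): count such days per year and month with aggregate(). The data frame name `rain` is made up.

```r
# The fifteen days shown in the question
rain <- data.frame(
  Year   = 1901,
  Month  = 1,
  Day    = 1:15,
  Amount = c(0, 3, 0, 0.5, 0, 0, 0.3, 0, 0, 0, 0.5, 1.8, 0, 0, 2.5)
)

# Number of days with rainfall above zero, per year and month
wet_days <- aggregate(Amount ~ Year + Month, data = rain,
                      FUN = function(a) sum(a > 0))
names(wet_days)[3] <- "WetDays"
wet_days
```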
On 26.06.2015 at 10:38, PIKAL Petr wrote:
I am a little bit lost in your logic. Why is triple in your fourth line one? I
expected it to be four.
Sorry yes you are right ...
type mismatch
Sorry last count was wrong ...
test =data.frame("first"=c("seven","two","five","four"),
count =data.frame("dobule1"=c("four",
Dear Members,
is there a better solution for counting the number of occurrences in a row
of string data than using loops to get the count data.frame?
test =data.frame("first"=c("seven","two","five","four"),
I normally use rle() for these problems, see ?rle.
for instance,
k <- rbinom(999, 1, .5)
series <- function(run) {
Assuming I understand the problem correctly, you want to check for
runs of at least length five where both Score and Test_desc assume
particular values. You don't care where they are or what other data
are associated, you just want to know if at least one such run exists
in your data frame.
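Under that reading, a short rle() sketch could look like the following. The helper `has_run`, the toy data, and the target values are assumptions made for illustration, and missing values are not handled.

```r
# TRUE if 'flag' contains at least one run of TRUEs of length >= min_len
has_run <- function(flag, min_len = 5) {
  r <- rle(flag)
  any(r$values & r$lengths >= min_len)
}

# Hypothetical data: the condition holds for rows 2 to 6 only
dat <- data.frame(
  Score     = c(0, 1, 1, 1, 1, 1, 0, 1),
  Test_desc = c("a", "a", "a", "a", "a", "a", "a", "b"),
  stringsAsFactors = FALSE
)

flag <- dat$Score == 1 & dat$Test_desc == "a"
found <- has_run(flag, min_len = 5)
found
```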
I have the following dataframe
structure(list(Type = c("QRS", "QRS", "QRS", "QRS", "QRS", "QRS",
"QRS", "QRS", "QRS", "QRS", "QRS", "QRS", "RR", "RR", "RR", "PP",
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "QTc", "QTc",
"QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc",
That's perfect. Many thanks for your much-appreciated help.
In addition to the other suggestions, which are fine for your simple
example, I would take a trip to the CRAN Task View "Natural Language
Processing", and see if there's anything there.
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
table(strsplit("hola mundo mundo", " ")[[1]])
> x <- c("hola mundo mundo");
> table(unlist(strsplit(x, " ")))
hola mundo
1 2
Is this what you are looking for? I hope this helps.
Chel Hee Lee
Hi all,
I want to count the different words in a text.
For example, if the text is "hola mundo mundo", the program should count:
hola 1
mundo 2
Is it possible that CRAN R has a similar function?
Here is a solution using data.table
> require(data.table)
> x <- data.table(v, diff = cumsum(c(1, diff(v)) != 1))
> x
     v diff
 1:  1    0
 2:  2    0
 3:  5    1
 4:  6    1
 5:  7    1
 6:  8    1
 7: 25    2
 8: 30    3
 9: 31    3
10: 32    3
11: 33    3
I have a vector of sorted positive integer values (e.g., positive integers
after applying sort() and unique()). For example, this:
I want to make a matrix from that vector that has two columns: (1) the
first value in every run of consecutive integer values, and (2
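A base R sketch of one way to build such a matrix. Two assumptions: the vector `v` below is the one implied by the data.table output in the reply, and the second column is taken to be the length of each run, since the question is cut off before it says what the second column should hold.

```r
# Assumed example vector (sorted, unique, positive integers)
v <- c(1, 2, 5, 6, 7, 8, 25, 30, 31, 32, 33)

# A new run starts wherever the step from the previous value is not 1
run_id <- cumsum(c(1, diff(v)) != 1)

# Column 1: first value of each run; column 2 (assumed): length of the run
runs <- cbind(start  = v[!duplicated(run_id)],
              length = as.vector(table(run_id)))
runs
```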
> ave(dfa$value, dfa$group, FUN=length)
[1] 3 3 3 4 4 4 4 1
> ave(dfa$value, dfa$group)
[1] 2 2 2 3 3 3 3 1
Help file ?ave should apply here.
Please read the Posting Guide mentioned in the footer of every email on this
list and on the list manager page for this mailing list. It warns you to read
the archives before posting and to post in plain text format rather than HTML
Hi everyone!
I have problems finding a solution to the following two problems:
My sample-dataframe consists of two variables "group" and "value":
group<-c("A", "A", "A", "B", "B", "B", "B", "C")
df<, value))
Problem 1:
Now I'd like
X <-, 4*50,replace=TRUE), ncol=4))
#[1] 15
#[1] 15
Hi Kate,
You could try
sum(X[, 1] == 1 & X[, 2] == 1)
where X is your data set.
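A small self-contained illustration of that idea, with a made-up 4 x 4 matrix standing in for the real data. The rowSums() variant is an equivalent alternative when every column holds only 0s and 1s.

```r
# Hypothetical 0/1 data: 4 rows, 4 columns
X <- cbind(c(1, 1, 0, 1),
           c(1, 0, 1, 1),
           c(0, 0, 1, 1),
           c(1, 1, 1, 0))

# Rows where both the first and the second column equal 1
n_both <- sum(X[, 1] == 1 & X[, 2] == 1)

# Same count via rowSums(), valid because entries are only 0 or 1
n_both_rs <- sum(rowSums(X[, 1:2]) == 2)

n_both
```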
I have 4 columns, and about 300K plus rows with 0s and 1s.
I'm trying to count how many rows satisfy a certain criteria... for
instance, how many rows are there that have the first column == 1 as
well as the second column == 1.
I've tried using rowSums and colSums but it keeps giving me this type
Hi all,
I have a package and I want to count the days from the package's first
execution until 30 days afterwards.
I hope my question is clear.
Please reply if you have anything to share.
Maybe this helps:
vec1 <- c("victory","happiness","medal","war","service","ribbon", "dates")
vec2 <- c("The World War II Victory Medal was first issued as a service ribbon
referred to as the Victory Ribbon.", "By 1946, a full medal had been
established which was referred to as the World W
Hi Mintewab,
With the IRanges packages (from Bioconductor):
> library(IRanges)
> countMatches(z, w)
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 3 1 1 0 1 0 0 0 0 0 0 1 3
2 0 0 1 0 0
[39] 0 0 0 0 0 0 0 0
And if you don't want to depend on I
Here's a solution:
# This gives a vector of counts (if z is a data frame, first convert it to a matrix)
res = sapply(as.vector(z), function(x) sum(w==x))
# This copies the dimensions of the variable 'z' to 'res':
dim(res) = dim(z)
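Another base R option, sketched under the assumption that z is the integer sequence -5:40 (its definition is missing from the excerpts, but that range matches the printed outputs): convert w to a factor whose levels are the elements of z, then tabulate.

```r
w <- c(11, 11, 12, 14, 14, 14, 15, 16, 18, 25, 26, 26, 26, 27, 27, 30)
z <- -5:40  # assumption: not shown in the original post

# Count how often each element of z occurs in w (0 for absent values)
counts <- as.vector(table(factor(w, levels = z)))
names(counts) <- z
counts
```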
To install the IRanges package:
Thank you for the reproducible example, but your description is missing a clear
definition of what you want.
For example, if your desired output is
result <- c(rep(0,16),2,1,0,3,1,1,0,1,0,0,0,0,0,0,1,3,2,0,0,1,rep(0,10))
then one answer might be
Many thanks, Arun.
Res 1 is exactly what I wanted.
Maybe this helps:
res3 <- table(z1[match(w,z1)])
#[1] TRUE
Maybe this helps:
z1 <- factor(z)
res1 <- table(z1[cut(w,breaks=c(-Inf,z,Inf),labels=F)])
#-5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
# 0 0 0 0 0 0 0 0 0 0 2 1 0 3 1 1 0 1 0 0 0 0 0 0 1 3
#21 22 23 24 25 26 27 28 29 30 31 32
Hi all,
I have the following reproducible example
w<-c(11, 11, 12, 14, 14, 14, 15, 16, 18, 25, 26, 26, 26, 27, 27, 30)
r<-z %in% w
now r gives me the presence or absence of elements in z that are in w but I am
interested in getting the number of times each element in z appears (o
data_m <- read.table(text="Abortusovis07918 Agona08561 Anatum08125 Arizonae65S
1 S5305B_IGR S5305B_IGR S5305B_IGR S5305B_IGR S5305B_IGR
2 S5305A_IGR S5300A_IGR S5305A_IGR S5300A_IGR S5300A_IGR
3 S5300A_IGR S5300B_IGR S5300A_IGR S5300B_IGR S5300B_IGR
I'm new in R and I'm writing you asking for some guidance. I had
analyzed a comparative genomic microarray data of /56 Salmonella/
strains to identify absent genes in each of the serovars, and finally I
got a matrix that looks like that:
> data[1:5,1:5]
Abortusovis07918 Agona08561 Anat
> I have a set of data and I need to find out how many points are below a
> certain value but R will not calculate this properly for me.
R will. But you aren't.
> Negative numbers seem to be causing the issue.
You haven't got any negative numbers in your data set. In fact, you haven't got
any nu
It is hard to know exactly what you mean with such a generic question.
If you mean "treat survival as a counting process", then the answer is yes. The survival
package in S (which is the direct ancestor of the Splus package, which is the direct
ancestor of the R package) was the very first to d
Is it what you want?
limits, include.lowest=T)))
> res <- rep(rrr$lengths, rrr$lengths)
> res
> }
> you can use split/lapply approach
> test$res2<-unlist(lapply(split(test$act, factor(test$day, levels=c(1,0))),
> fff))
> Beware of correct ordering of days in output. Without
[0,1] 4
2 14655 (1,199] 1
3 14655 (199,200] 2
try this:
> test <- structure(list(jul = structure(c(14655, 14655, 14655, 14655,
+ 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655,
+ 14655, 14655, 14655), origin = structure(0, class = "Date")),
+ time = structure(c(1266258354, 1266258954, 1266259554, 1266260154,
+ 126626075
Forgot the last part of the question:
> test <- structure(list(jul = structure(c(14655, 14655, 14655, 14655,
+ 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655,
+ 14655, 14655, 14655), origin = structure(0, class = "Date")),
+ time = structure(c(1266258354, 1266258954, 1266259554,
I would appreciate it if somebody could help me with the following calculation.
I have a dataframe recorded at 10-minute intervals, covering roughly one year
of data. This is a small example:
> dput(test)
structure(list(jul = structure(c(14655, 14655, 14655, 14655,
14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 1
Dear Sir,
Thanks a lot for your great help. I couldn't have figured it out.
Thanks again.
try this:
df <- c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", "C")
tab <- table(df)
rep(names(tab), 100 * tab)
I hope it helps.
Dear R forum
I have a vector say as given below
df = c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", "C")
I need to find
(1) how many times each element occurs? e.g. in above vector F occurs 4 times,
C occurs 2 times etc.
(2) Depending on the number of occurrences, I ne
You should look at findInterval. Used with as.numeric it could do what you
request although it has a much wider range of uses.
The TeachingDemos package has %<% and %<=% functions that can be chained
simply, so you could do something like:
sum( 5:1 %<=% 1:5 %<=% 10:14 )
and other similar approaches.
The idea is that you can do comparisons as:
lower %<% x %<% upper
instead of
lower < x & x < upper
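For comparison, the chained example above can be written in base R without the TeachingDemos operators. This is only a sketch of the explicit form with inclusive bounds on both sides.

```r
lower <- 5:1
x     <- 1:5
upper <- 10:14

# Element-wise test: is each x between its lower and upper bound (inclusive)?
inside <- lower <= x & x <= upper

# Number of elements that fall inside their interval
n_inside <- sum(inside)
n_inside
```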
On Mon, Mar 18, 2
There _is_ a function ?within. Maybe your function can be named 'between'
Rui Barradas
I want to count how many
times a number, say 12, lies in the interval. Can anyone assist?
Has anyone else ever wished there was a moderately general 'inside' or
