I am trying to use the aggregate function to run a function, catsbydat2, that
produces the mean, minimum, maximum, and number of observations of the values
in a dataframe, inJan2Test, by levels of the dataframe variable MyDay. The
output should be
#my code:
# This fu
But am I the only one who finds that behavior of aggregate() completely
unexpected and confusing? Especially considering that dplyr::summarise() and
doBy::summaryBy() deal with NAs differe
Just one perhaps extraneous comment.
You said that you were surprised that aggregate() and group_by() did not
have the same behavior. That is a misconception on your part. As you know,
the tidyverse recapitulates the functionality of many base R functions; but
it makes no claims to do so in
So running it with na.pass instead of na.omit gives the same results as
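For readers following the na.omit versus na.pass point, here is a minimal sketch on made-up data (the data frame `toy` and its columns are hypothetical, not the poster's). It assumes the formula interface of aggregate(), whose default na.action = na.omit drops a whole row as soon as any variable in the formula is NA.

```r
# Hypothetical toy data: the only NA sits in column 'b'
toy <- data.frame(g = c("x", "x", "y", "y"),
                  a = c(1, 2, 3, 4),
                  b = c(10, NA, 30, 40))

# Default na.action = na.omit: row 2 is removed entirely before FUN runs,
# so the value a = 2 is lost as well
res_omit <- aggregate(cbind(a, b) ~ g, data = toy, FUN = mean)

# na.pass keeps every row; NAs are then dropped per column inside mean()
res_pass <- aggregate(cbind(a, b) ~ g, data = toy, FUN = mean,
                      na.rm = TRUE, na.action = na.pass)

print(res_omit)
print(res_pass)
```

In the first call the mean of `a` in group "x" is computed from one value only, while the second call uses both, which is usually the behaviour people expect from a per-column summary.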
Dear useRs,
I have just stumbled across a behavior in aggregate() that I cannot
explain. Any help would be appreciated!
Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100",
"FLINT-101", "FLINT-102", "HORN-10", "HORN
no missing data.
> Dear useRs,
Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100",
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103",
"HORN-104"), Edge
You don't have to bother with the subtracting from pi/2 bit ... just assume the
cartesian complex values are (y,x) instead of (x,y).
>I think that using complex numbers to represent the wind velocity makes
>this simpler. You would need to write
I think that using complex numbers to represent the wind velocity makes
this simpler. You would need to write some simple conversion functions
since wind directions are typically measured clockwise from north and the
argument of a complex number is measured counterclockwise from east. E.g.,
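A minimal sketch of what such conversion functions might look like. The helper names (`wind_to_complex`, `complex_to_wd`, `mean_wind`) are made up for illustration and are not the ones from the original post. It assumes wd is in degrees clockwise from north and ws is a non-negative speed.

```r
# Hypothetical helpers, not taken from the thread.
# Compass bearing (degrees clockwise from north) -> complex number
wind_to_complex <- function(wd, ws) {
  theta <- (90 - wd) * pi / 180          # angle counterclockwise from east
  complex(modulus = ws, argument = theta)
}

# Complex number -> compass bearing in [0, 360)
complex_to_wd <- function(z) {
  (90 - Arg(z) * 180 / pi) %% 360
}

# Vector-average a set of wind observations
mean_wind <- function(wd, ws) {
  z <- mean(wind_to_complex(wd, ws))
  c(wd = complex_to_wd(z), ws = Mod(z))
}

# Two observations either side of north average to (about) north,
# not to 180 as a plain mean of 350 and 10 would suggest
print(mean_wind(c(350, 10), c(5, 5)))
```

Because the mean is taken on the complex values, direction and speed are averaged together, which is why a two-argument function is not needed inside the aggregation step.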
Dear list users,
I have to aggregate wind direction data (wd) using a function that requires
also a second input variable, wind speed (ws).
This is the function that I need to use:
my_fun <- function(wd1, ws1){
u_component <- -ws1*sin(2*pi*wd1/360)
v_component <- -ws1*cos(2*pi*wd1/360)
The base R as.difftime function is perfectly usable to create this offset
without pulling in lubridate.
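A small sketch of that idea on invented data (the vectors `stamps` and `snow` are hypothetical). Shifting every timestamp back by 9 hours makes an ordinary calendar date act as a 9-to-9 day label. Whether a reading stamped exactly 09:00 belongs to the day that ends or the day that starts is a convention the real data would have to settle; here it starts the new day.

```r
# Hypothetical semi-hourly style timestamps and increments
stamps <- as.POSIXct(c("2020-01-01 08:30", "2020-01-01 09:00",
                       "2020-01-02 08:30", "2020-01-02 09:30"), tz = "UTC")
snow <- c(1, 2, 3, 4)

# Shift by 9 hours so that 09:00 maps to midnight of the same date
shifted <- stamps - as.difftime(9, units = "hours")
day9 <- as.Date(shifted, tz = "UTC")

# Sum the increments per 9-to-9 day
daily <- tapply(snow, day9, sum)
print(daily)
```

Each label is the calendar date on which that 9-to-9 window starts.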
Dear R-list members,
I have semi-hourly snowfall data.
I should sum the semi-hourly increments (only the positive ones, but this is
not described in my example) day by day, not from 00 to 24 but from 9 to 9.
I am able to use the diff function, create a list of days and use the function
Hi Stefan,
How about this:
sddf<-read.table(text="age x
45 1
45 2
46 1
47 3
47 3",
You didn't say how you wanted to use it as a data.frame, but here is one way
d <- data.frame(
check.names = FALSE,
age = c(45L, 45L, 46L, 47L, 47L),
x = c(1L, 2L, 1L, 3L, 3L))
with(d, as.data.frame(table(age,x)))
which gives:
  age x Freq
1  45 1    1
2  46 1    1
3  47 1    0
4  45 2
Dear All,
I have a seemingly standard problem to which I somehow do not find
a simple solution. I have individual level data where x is a
categorical variable with 3 categories which I would like to aggregate
by age.
age x
45 1
45 2
46 1
47 3
47 3
and so on.
It should after transfo
You can also use 'dplyr'
result <- pcr %>%
group_by(Gene, Type, Rep) %>%
summarise(mean = mean(Ct),
sd = sd(Ct),
oth = sd(Ct) / sqrt(sd(Ct))
Dear users,
i am trying to summarize data using "aggregate" with the following command:
sd(x), sd(x)/sqrt(sd(x)))})
and the structure of the resulting data frame is
'data.frame': 66 obs. of 4 variables:
$ Gen
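If the surprise in that str() output is that the statistics arrive as one matrix column rather than as separate columns (an assumption, since the post is cut off here), the following sketch on invented data shows the effect and one common way to flatten it. The data frame `toy` is hypothetical.

```r
# Hypothetical data standing in for the poster's table
toy <- data.frame(Gene = rep(c("g1", "g2"), each = 3),
                  Ct   = c(20, 21, 22, 30, 32, 34))

# FUN returns two values, so aggregate() stores them in a single matrix column
res <- aggregate(Ct ~ Gene, data = toy,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))
str(res)

# Rebuild the data frame so each statistic becomes its own column
flat <- do.call(data.frame, res)
str(flat)
```

After flattening, the columns are named from the original column plus the name of each statistic.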
if you are willing to use dplyr, you can do all in one line of code:
df%>%group_by(unique_A=A)%>%summarise(list_id=paste(id,collapse=", "))->r
> #given
Massimo
#ok, finally this is my final "best and more compact" solution of the problem
by merging different contributions (thanks to all indeed)
l<-sapply(unique(t$A), function(x) t$id[which(t$A==x)])
vals<- lapply(idx, function(index) x$id[index])
data.frame(unique_A = uA, list_vals=unlist(lapply(vals, paste, collapse = ",
Does this do what you want? I had to change the id values to something more
obvious. It uses tibbles which allow each variable to be a list.
x <- tibble(id=LETTERS[1:10],
uA <- unique(x$
sorry, but by further looking at the example I just realised that the posted
solution is not completely what I need, because in fact I do not need to get
back the 'indices' but instead the corresponding values of column A
#please consider this new example
Hi Massimo,
Something along those lines could help you I guess:
t$A <- factor(t$A)
sapply(levels(t$A), function(x) which(t$A==x))
You can then play with the output using paste()
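As a sketch of that last step, the version below collapses the id values (rather than the indices) for each distinct value of A using tapply() and paste(). The data frame `toy` is invented for illustration and is not the poster's example.

```r
# Hypothetical data: several ids share the same value of A
toy <- data.frame(A  = c(123, 345, 123, 678, 345),
                  id = c("a", "b", "c", "d", "e"))

# One comma-separated string of ids per distinct A
ids_by_A <- tapply(toy$id, toy$A, paste, collapse = ", ")

r <- data.frame(unique_A = as.numeric(names(ids_by_A)),
                list_id  = as.vector(ids_by_A))
print(r)
```

The groups come back sorted by A, since tapply() treats the grouping vector as a factor.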
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Resea
#given the following reproducible and simplified example
#I need to get the following result
r<-data.frame(unique_A=c(123, 345, 678,
# i.e. aggregate over the variable "A
> I could construct example data.frames myself but most probably they
> would be different from yours and also the result would not necessarily be
> the same as you expect.
> >
> > You should post those data frames as output from dput(data) and show us
> real desired result from
Hi All--
I have generated several 2 column data frames with variable length. The
data frames have the same column names and variable types. I was trying to
aggregate over the 2nd column for all the data frames, but could not figure
out how.
I thought I could make them all of equal length then co
Don't use aggregate's simplify=TRUE when FUN() produces return
values of various dimensions. In your case, the shape of table(subset)'s
return value depends on the number of levels in the factor 'subset'.
If you make B a factor before splitting it by C, each split will have the
same number of leve
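A minimal sketch of the factor approach on made-up data (the data frame `toy` is hypothetical; the column names B and C follow the message). Once B is a factor, table() reports every level in every group, so all groups return a vector of the same length.

```r
# Hypothetical data: the value 2 never occurs in group "v"
toy <- data.frame(B = c(1, 2, 1, 1),
                  C = c("u", "u", "v", "v"))

# Convert B to a factor so every split carries the same set of levels
toy$B <- factor(toy$B)

res <- aggregate(toy["B"], by = toy["C"], FUN = table)
print(res)

# The counts sit in a matrix column, one column per level of B
print(res$B)
```

The absent level shows up as an explicit zero count instead of changing the shape of the result.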
The normal input to a factory that builds cars is car parts. Feeding whole
trucks into such a factory is likely to yield odd-looking results.
Both aggregate and table do similar kinds of things, but yield differently
constructed outputs. The output of the table function is not well-suited to be
Dear R users,
When I use aggregate with table as FUN, I get what I would call a
strange behaviour if it involves numerical vectors and one "level" of it
is not present for every "levels" of the "by" variable:
> df <-
Hi Mark,
I think you might want something like this:
# orders the time
I have a data frame that has a set of observed dwell times at a set of
locations. The metadata for the locations includes things that have varying
degrees of specificity. I'm interested in tracking the number of people
present at a given time in a given store, type of store, or zip code.
Here's an
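One way to think about "number of people present at a given time" is to turn each visit into a +1 event at arrival and a -1 event at departure, then take a running sum per store. The sketch below uses invented data and hypothetical column names (`store`, `arrive`, `leave`), not the poster's actual layout.

```r
# Hypothetical visits with arrival and departure times
visits <- data.frame(store  = c("s1", "s1", "s1", "s2"),
                     arrive = c(1, 2, 5, 1),
                     leave  = c(4, 3, 6, 2))

# One row per event: +1 when someone arrives, -1 when someone leaves
events <- data.frame(store = rep(visits$store, 2),
                     time  = c(visits$arrive, visits$leave),
                     delta = rep(c(1, -1), each = nrow(visits)))

# Order by store and time; at equal times departures sort before arrivals
events <- events[order(events$store, events$time, events$delta), ]

# Running head count within each store
events$present <- ave(events$delta, events$store, FUN = cumsum)
print(events)
```

Replacing `store` with a coarser grouping column, such as store type or zip code, would give the same count at that level.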
5, -53.75, -53.75, -53.75,
>>> -53.75), GDP = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
>>> 0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)), .Names =
>>> c("longitude",
>>> "latitude", "GDP"), row.names = c(4L, 17L
>>> same result can be achieved by
>>>
>>> dat.ag<-aggregate(dat[ , c("DCE","DP")], by= list(dat$first.Name,
>>> dat$Name, dat$Department) , "I")
>>>
>>> Sorting according to the first row seems to be quite tricky. You could
>>> probably get closer by using some combination of split and order and
>>> arranging back chunks of data
>>>
>>> ooo1<-order(split(dat$DCE,interaction(dat$first.Name, dat$Name,
>>> dat$Department, drop=T))[[1]])
>>> data.frame(sapply(split(dat$DCE,interaction(dat$first.Name, dat$Name,
>>> dat$Department, drop=T)), rbind))[ooo1,]
>>> Ancient.Nation.QLH Amish.Wives.TAS Auction.Videos.YME
>>> 2
> 4 0.28   NA   NA
> 1 0.54 0.59 0.57
> 3 0.54 0.59 0.57
>
> however I wonder why the order according to the first row is necessary if all
> NAs are on correct positions?
> Cheers
> Petr
> On Nov 17, 2016, at 11:27 PM, Karim Mezhoud wrote:
> Dear all,
> the dat has missing values NA,
>    first.Name   Name Department  DCE   DP       date
> 5     Auction Videos        YME 0.57 0.56 2013-09-30
> 18      Amish  Wives        TAS 0.59 0.56 2013-09-30
> 34    Ancient Natio
Dear all,
the dat has missing values NA,
   first.Name   Name Department  DCE   DP       date
5     Auction Videos        YME 0.57 0.56 2013-09-30
18      Amish  Wives        TAS 0.59 0.56 2013-09-30
34    Ancient Nation        QLH 0.54 0.58 2013-09-30
53    Auction Videos        YME   NA
38 3.2 S2 A
> S2B 22 3.2 S2 B
Suppose a new dataframe is as below (one more numeric column):
myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
>> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[,
>> 1], x[, 2]))
> S1A S1B S2A S2B
> 22 38 38 22
> David C
], x[, 2])))
> Z CP
> A A 10
> B B 10
> David C
> This is a simple question: With a dataframe like the following
> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B',
> 'B'))
> how can I get the cross product between X and Y for each level of
> factor Z? My difficu
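A short sketch of one way to do this, using the data frame from the question. aggregate() hands FUN one column at a time, so a function that needs both X and Y is easier to apply after splitting the whole data frame by Z. The name `cp` is just illustrative.

```r
# Data frame as given in the question
myData <- data.frame(X = c(1, 2, 3, 4),
                     Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"))

# Split into one data frame per level of Z, then compute sum(X * Y) in each
cp <- sapply(split(myData, myData$Z), function(d) sum(d$X * d$Y))
print(cp)
```

For two plain vectors sum(X * Y) is the same number that crossprod(X, Y) returns as a 1 x 1 matrix, so either form can be used inside the anonymous function.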
ecessarily faster, because apply
>>> is still a for loop inside):
>>>>> f <- function( m, nx, ny ) {
>>>>> # redefine the dimensions of my
>>>>> a <- array( m
>>>>> , dim = c( ny
lock.means, 2, 4)
[,1] [,2] [,3] [,4]
[1,] 3.5 11.5 19.5 27.5
[2,] 5.5 13.5 21.5 29.5
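For reference, a self-contained sketch of block averaging by reshaping a matrix into a 4-d array. This is an illustration under stated assumptions, not the truncated function quoted above: the name `block_means` and the meaning of its arguments (nx = block width in columns, ny = block height in rows) are assumptions, and the matrix dimensions must be exact multiples of the block size.

```r
# Hypothetical helper: mean of each ny-by-nx block of matrix m
block_means <- function(m, nx, ny) {
  stopifnot(nrow(m) %% ny == 0, ncol(m) %% nx == 0)
  # Dimensions: row within block, block row, column within block, block column
  a <- array(m, dim = c(ny, nrow(m) / ny, nx, ncol(m) / nx))
  # Average over the two within-block dimensions
  apply(a, c(2, 4), mean)
}

m <- matrix(1:32, nrow = 4)
print(block_means(m, nx = 2, ny = 2))
```

On a 4 x 8 matrix filled with 1:32 and 2 x 2 blocks, this gives a 2 x 4 matrix of block means.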
6L, 69L,
>> 82L, 95L, 108L, 121L), class = "data.frame")
= "data.frame")
I would like to aggregate the data 1 degree by 1 degree. I understand that
the first step is to convert to raster. I have tried:
rasterDF <- rasterFromXYZ(temp)
r <- aggregate(rasterDF,fact=2, fun=sum)
But this does not seem to work. Could anyone help me ou
