Hi,

In my previous solution, the order got messed up. I should have sorted the data by its columns first. Try this:

dat1<-read.table(text="
Trip_id Vessel CommonName Length Count
1 230 Sunlight ShadAmerican 19 1
2 230 Sunlight ShadAmerican 20 1
3 230 Sunlight ShadAmerican 21 1
4 230 Sunlight ShadAmerican 23 1
5 230 Sunlight ShadAmerican 26 1
6 230 Sunlight ShadAmerican 27 1
7 230 Sunlight ShadAmerican 30 2
8 230 Sunlight ShadAmerican 33 1
9 230 Sunlight ShadAmerican 34 1
10 230 Sunlight ShadAmerican 37 1
11 230 Sunlight HerringBlueback 20 1
12 230 Sunlight HerringBlueback 21 2
13 230 Sunlight HerringBlueback 22 5
14 230 Sunlight HerringBlueback 26 1
15 230 Sunlight Alewife 17 1
16 230 Sunlight Alewife 18 1
17 230 Sunlight Alewife 20 2
18 230 Sunlight Alewife 21 4
19 230 Sunlight Alewife 22 16
20 230 Sunlight Alewife 23 22
21 230 Sunlight Alewife 24 16
22 230 Sunlight Alewife 25 4
23 230 Sunlight Alewife 26 1
24 230 Sunlight Alewife 27 2
25 230 Sunlight Alewife 28 2
26 231 Western_Venture ShadAmerican 23 1
27 231 Western_Venture ShadAmerican 24 1
28 231 Western_Venture ShadAmerican 25 1
29 231 Western_Venture ShadAmerican 28 2
30 231 Western_Venture ShadAmerican 29 2
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-dat1[order(dat1$Trip_id,dat1$Vessel,dat1$CommonName,dat1$Length,dat1$Count),]
dat3<-dat2
dat3$Prop<-unlist(tapply(dat3$Count,list(dat3$Trip_id,dat3$CommonName),function(x) x/sum(x)))
#Jean's method:
agg <- with(dat2, aggregate(data.frame(Total=Count), data.frame(Trip_id, CommonName), sum))
# combine the totals with the full data frame
data2 <- merge(dat2, agg)
# then calculate proportions
data2$Prop <- data2$Count/data2$Total
data3<-data2[,-6]
data4<-data3[,c(1,3,2,4:6)]
rownames(dat3)<-1:nrow(dat3)
identical(dat3,data4)
#[1] TRUE
head(dat3)
#  Trip_id   Vessel CommonName Length Count       Prop
#1     230 Sunlight    Alewife     17     1 0.01408451
#2     230 Sunlight    Alewife     18     1 0.01408451
#3     230 Sunlight    Alewife     20     2 0.02816901
#4     230 Sunlight    Alewife     21     4 0.05633803
#5     230 Sunlight    Alewife     22    16 0.22535211
#6     230 Sunlight    Alewife     23    22 0.30985915
head(data4)
#  Trip_id   Vessel CommonName Length Count       Prop
#1     230 Sunlight    Alewife     17     1 0.01408451
#2     230 Sunlight    Alewife     18     1 0.01408451
#3     230 Sunlight    Alewife     20     2 0.02816901
#4     230 Sunlight    Alewife     21     4 0.05633803
#5     230 Sunlight    Alewife     22    16 0.22535211
#6     230 Sunlight    Alewife     23    22 0.30985915

A.K.

----- Original Message -----
From: Jean V Adams <jvad...@usgs.gov>
To: Sally_roman <sro...@umassd.edu>
Cc: r-help@r-project.org
Sent: Thursday, October 25, 2012 2:45 PM
Subject: Re: [R] trying ti use a function in aggregate

Sally,

It's great that you provided data and code. To make it even more user-friendly for R-help readers, supply your data as R code, using (for example) the dput() function.

The reason you were getting all 1s with your code is that you had told it to aggregate by trip, LENGTH, and species. But the data are already summarized by trip, LENGTH, and species, so your myfun() function is calculating count/count = 1 for each row.

You could get rid of LENGTH to use your myfun() function, but the results aren't pretty ...

with(data, aggregate(data.frame(Total=Count), data.frame(Trip_id, CommonName), myfun))

Instead, I suggest you use the aggregate function to calculate the total counts, then merge these totals with your original data to calculate the proportions.
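The diagnosis above can be checked on a toy example. This is only a sketch: the `toy` data frame and its values are made up for illustration and are not Sally's data; `myfun` is the function from her message.

```r
# Hypothetical toy data: one row per (trip, species, length) combination
toy <- data.frame(
  Trip_id    = c(1, 1, 1, 1),
  CommonName = c("Alewife", "Alewife", "Alewife", "Shad"),
  Length     = c(20, 21, 22, 30),
  Count      = c(1, 3, 4, 2)
)
myfun <- function(x) x / sum(x)

# Grouping by trip, species AND length: every group holds exactly one row,
# so myfun() divides each count by itself
by_len <- aggregate(toy["Count"], toy[c("Trip_id", "CommonName", "Length")], myfun)
by_len$Count   # every value is 1

# Dropping Length from the grouping: groups now hold several rows each
sizes  <- aggregate(toy["Count"], toy[c("Trip_id", "CommonName")], length)
totals <- aggregate(toy["Count"], toy[c("Trip_id", "CommonName")], sum)
```

With Length in the grouping, each group is a single row and x/sum(x) can only be 1. The per-group totals only become meaningful once Length is left out of the grouping.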
# small subset of data
data <- structure(list(
Trip_id = c(230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L,
230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L,
230L, 230L, 230L, 230L, 230L, 231L, 231L, 231L, 231L, 231L),
Vessel = c("Sunlight", "Sunlight", "Sunlight", "Sunlight", "Sunlight",
"Sunlight", "Sunlight", "Sunlight", "Sunlight", "Sunlight",
"Sunlight", "Sunlight", "Sunlight", "Sunlight", "Sunlight",
"Sunlight", "Sunlight", "Sunlight", "Sunlight", "Sunlight",
"Sunlight", "Sunlight", "Sunlight", "Sunlight", "Sunlight",
"Western Venture", "Western Venture", "Western Venture",
"Western Venture", "Western Venture"),
CommonName = c("Shad,American", "Shad,American", "Shad,American",
"Shad,American", "Shad,American", "Shad,American", "Shad,American",
"Shad,American", "Shad,American", "Shad,American",
"Herring,Blueback", "Herring,Blueback", "Herring,Blueback",
"Herring,Blueback",
"Alewife", "Alewife", "Alewife", "Alewife", "Alewife", "Alewife",
"Alewife", "Alewife", "Alewife", "Alewife", "Alewife",
"Shad,American", "Shad,American", "Shad,American", "Shad,American",
"Shad,American"),
Length = c(19L, 20L, 21L, 23L, 26L, 27L, 30L, 33L, 34L, 37L,
20L, 21L, 22L, 26L,
17L, 18L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
23L, 24L, 25L, 28L, 29L),
Count = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
1L, 2L, 5L, 1L,
1L, 1L, 2L, 4L, 16L, 22L, 16L, 4L, 1L, 2L, 2L,
1L, 1L, 1L, 2L, 2L)),
.Names = c("Trip_id", "Vessel", "CommonName", "Length", "Count"),
row.names = c(NA, -30L), class = "data.frame")

# calculate the total count for each trip and Species
agg <- with(data, aggregate(data.frame(Total=Count), data.frame(Trip_id, CommonName), sum))
# combine the totals with the full data frame
data2 <- merge(data, agg)
# then calculate proportions
data2$Prop <- data2$Count/data2$Total
data2

Jean

Sally_roman <sro...@umassd.edu> wrote on 10/25/2012 09:19:57 AM:
>
> Hi - I am using R v 2.13.0.
> I am trying to use the aggregate function to
> calculate the percent at length for each Trip_id and CommonName. Here is a
> small subset of the data.
>    Trip_id          Vessel       CommonName Length Count
>  1     230        Sunlight    Shad,American     19     1
>  2     230        Sunlight    Shad,American     20     1
>  3     230        Sunlight    Shad,American     21     1
>  4     230        Sunlight    Shad,American     23     1
>  5     230        Sunlight    Shad,American     26     1
>  6     230        Sunlight    Shad,American     27     1
>  7     230        Sunlight    Shad,American     30     2
>  8     230        Sunlight    Shad,American     33     1
>  9     230        Sunlight    Shad,American     34     1
> 10     230        Sunlight    Shad,American     37     1
> 11     230        Sunlight Herring,Blueback     20     1
> 12     230        Sunlight Herring,Blueback     21     2
> 13     230        Sunlight Herring,Blueback     22     5
> 14     230        Sunlight Herring,Blueback     26     1
> 15     230        Sunlight          Alewife     17     1
> 16     230        Sunlight          Alewife     18     1
> 17     230        Sunlight          Alewife     20     2
> 18     230        Sunlight          Alewife     21     4
> 19     230        Sunlight          Alewife     22    16
> 20     230        Sunlight          Alewife     23    22
> 21     230        Sunlight          Alewife     24    16
> 22     230        Sunlight          Alewife     25     4
> 23     230        Sunlight          Alewife     26     1
> 24     230        Sunlight          Alewife     27     2
> 25     230        Sunlight          Alewife     28     2
> 26     231 Western Venture    Shad,American     23     1
> 27     231 Western Venture    Shad,American     24     1
> 28     231 Western Venture    Shad,American     25     1
> 29     231 Western Venture    Shad,American     28     2
> 30     231 Western Venture    Shad,American     29     2
>
> My code is:
> myfun<-function (x) x/sum(x)
> b<-with(data,aggregate(x=list(Percent=Count),by=list
> (Trip_id=Trip_id,Length=Length,Species=CommonName),
> FUN="myfun"))
>
> My issue is that the percent is not being calculated by Trip_id and CommonName.
> The result is that each row has a percent of 1, indicating that myfun is not
> dividing by the sum of counts within a Trip_id/CommonName group. Any help
> would be appreciated.
> Thank you

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
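A possible alternative to both approaches in this thread is ave(), which returns a vector aligned row by row with its input, so no sorting or merging is needed. This is not a method posted by anyone in the thread. As far as I can tell, the unlist(tapply(...)) line in the first reply lines up with the rows only when the sorted row order happens to match the order in which tapply() returns its groups; that holds for the sample data, but it may not hold once a species occurs in more than one trip. The sketch below uses a made-up `toy` data frame (an assumption, not the thread's data) in which "Alewife" occurs in both trips.

```r
# Hypothetical toy data: the same species appears in two trips,
# and rows are deliberately not sorted by group
toy <- data.frame(
  Trip_id    = c(230, 231, 230, 231, 230),
  CommonName = c("Alewife", "Alewife", "Shad", "Alewife", "Alewife"),
  Count      = c(1, 2, 5, 6, 3)
)

# ave() applies FUN within each Trip_id/CommonName group and puts each
# result back in the position of the row it came from
toy$Total <- ave(toy$Count, toy$Trip_id, toy$CommonName, FUN = sum)
toy$Prop  <- toy$Count / toy$Total

# Proportions within each group should sum to 1
check <- aggregate(toy["Prop"], toy[c("Trip_id", "CommonName")], sum)
```

Because the result is already in row order, the same two lines should work on unsorted data and on groups of any size.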