Hello Petr, thanks for your reply! You are right, my problem is only between ANT01 and ANT02. All the other stations will be kept in the filtered data. I have six more stations.
It looks like your solution will work pretty well for me! Since I have to do this for each fish, I think I can put it inside a function and use lapply to apply it to all the data I have, separating the fish by Code.

There is just one thing I do not understand: the keep variable takes the value 2 for the ANT station that I have to keep, but the value 1 for the other stations. How can I retain only the necessary data after applying your solution?

Thanks again for your attention and help.

Raoni

On Thu, 31 Oct 2019 at 04:30, PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi.
>
> Bert's questions should be clarified. But from your question I understand
> that only ANT01 and ANT02 are the stations you want to filter, and that
> you want to keep all the others regardless of the condition. If this is
> true, I would add a new column that has one value for the ANT stations
> and a different one for all the others (if you have more than one). Then
> you could set a flag marking the biggest number in each day. After that
> you could add, for each day, the stations other than ANT that you want
> to keep.
>
> I named your data test and changed them to a data frame, as I am not
> familiar with tibbles.
>
> The code looks like this:
>
> # mean N_records per Date and Station
> test$m <- ave(test$N_records, interaction(test$Date, test$Station),
>               FUN=mean)
> # flag is 1 where a station's mean is the daily maximum
> test$flag <- ave(test$m, test$Date, FUN=function(x) max(x) == x)
> # also mark the non-ANT station (ETE01) to keep
> test$keep <- test$flag + (test$Station == "ETE01")*1
>
> But you need to think about the questions asked by Bert.
>
> Cheers
> Petr
>
> > -----Original Message-----
> > From: R-help <r-help-boun...@r-project.org> On Behalf Of Bert Gunter
> > Sent: Thursday, October 31, 2019 5:18 AM
> > To: Cacique Samurai <caciquesamu...@gmail.com>
> > Cc: R help <r-help@r-project.org>
> > Subject: Re: [R] Tricky filtering
> >
> > Thanks for the nice dput example, but your specification confuses me.
> > What if the 2 records with largest Mean_power are not the same as the
> > two with largest N_records? Do you want to keep all four records?
> > Or various combinations of this question that would keep 3 records.
> > And will you always have two records on a date, or could you have just
> > one? And if the 2 records with largest Mean_power always also have the
> > largest N_records, then you only need to choose the two with largest
> > Mean_power and can ignore the N_records, right?
> >
> > Once you have answered these questions (or someone else has a better
> > understanding than I), it should be easy. It will require a loop of one
> > form or another, however, and therefore might take a while.
> >
> > Cheers,
> > Bert
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
> >
> >
> > On Wed, Oct 30, 2019 at 7:55 PM Cacique Samurai
> > <caciquesamu...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I have fish telemetry data with more than 11 million lines. There are
> > > some false records in the data that I have to eliminate. I can solve
> > > this using a loop, but I think that dplyr::filter could be faster and
> > > more elegant. I just can't figure out how to do it.
> > >
> > > At this moment, I have already summarized the raw data and have
> > > something like this (dput at end of e-mail):
> > >
> > > Date        Station  Antenna  Mean_power  N_records  Action needed (manually inserted)
> > > 29/03/2019  ANT01    1        108         1704       Remove
> > > 29/03/2019  ANT01    2        94          1219       Remove
> > > 29/03/2019  ANT02    1        220         3029       Keep
> > > 29/03/2019  ANT02    2        219         2711       Keep
> > > 30/03/2019  ANT01    1        204         2289       Keep
> > > 30/03/2019  ANT01    2        172         1477       Keep
> > > 30/03/2019  ANT02    1        88          913        Remove
> > > 30/03/2019  ANT02    2        72          1080       Remove
> > > 30/03/2019  ETE01    AH0      87          1          Keep
> > >
> > > The problem occurs between Stations ANT01 and ANT02. On the same day,
> > > I have to keep the pair of records that has the bigger Mean_power and
> > > more N_records.
> > > In this example, I have to keep the records of Station ANT02 on
> > > 29/03 and of ANT01 and ETE01 on 30/03. If I had nothing other than
> > > ANT01 and ANT02 on the same day, it would be a simple question.
> > >
> > > I have to do this for each marked fish, which is identified by a
> > > Code, omitted here for brevity.
> > >
> > > Thanks in advance,
> > >
> > > Raoni
> > >
> > >
> > > structure(list(Date = structure(c(17984, 17984, 17984, 17984, 17985,
> > > 17985, 17985, 17985, 17985), class = "Date"), Station =
> > > c("ANT01","ANT01", "ANT02", "ANT02", "ANT01", "ANT01", "ANT02",
> > > "ANT02","ETE01"), Antenna = c("1", "2", "1", "2", "1", "2", "1",
> > > "2","AH0"), Media_power = c(108, 94, 220, 219, 204, 172, 88, 72, 87),
> > > N_records = c(1704L, 1219L, 3029L, 2711L, 2289L, 1477L, 913L, 1080L,
> > > 1L)), row.names = c(NA, -9L), class = c("grouped_df", "tbl_df", "tbl",
> > > "data.frame"), groups = structure(list(Date = structure(c(17984,
> > > 17984, 17985, 17985, 17985), class = "Date"), Station = c("ANT01",
> > > "ANT02", "ANT01", "ANT02", "ETE01"), .rows = list(1:2, 3:4, 5:6, 7:8,
> > > 9L)), row.names = c(NA, -5L), class = c("tbl_df", "tbl",
> > > "data.frame"), .drop = TRUE))
> > >
> > > --
> > > Raoni Rosa Rodrigues
> > > Research Associate of Fish Transposition Center CTPeixes
> > > Universidade Federal de Minas Gerais - UFMG
> > > Brasil
> > > rodrigues.ra...@gmail.com
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
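On the question of how to retain only the necessary rows and repeat the step for each fish, the following is a minimal base R sketch and not a definitive answer. It follows the idea in Petr's quoted code (compare the mean N_records per Date and Station), with one deliberate variation: the daily maximum is taken only among the ANT stations, so a non-ANT station can never displace them. The function name filter_ant, its ant argument, the all_data object and the Code values "A" and "B" are hypothetical and used only for illustration; the thread only says that a Code column exists in the full data.

```r
# Example data from the original post, as a plain data frame (grouping dropped).
test <- data.frame(
  Date = as.Date(rep(c(17984, 17985), times = c(4, 5)), origin = "1970-01-01"),
  Station = c("ANT01", "ANT01", "ANT02", "ANT02",
              "ANT01", "ANT01", "ANT02", "ANT02", "ETE01"),
  Antenna = c("1", "2", "1", "2", "1", "2", "1", "2", "AH0"),
  Media_power = c(108, 94, 220, 219, 204, 172, 88, 72, 87),
  N_records = c(1704L, 1219L, 3029L, 2711L, 2289L, 1477L, 913L, 1080L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per day, keep the ANT station with the larger mean
# N_records, and keep every non-ANT station unconditionally.
filter_ant <- function(d, ant = c("ANT01", "ANT02")) {
  is_ant <- d$Station %in% ant
  # Mean N_records per Date and Station (same idea as test$m in Petr's code).
  m <- ave(as.numeric(d$N_records), d$Date, d$Station, FUN = mean)
  # Assumption of this sketch: only ANT rows compete for the daily maximum.
  m_ant <- ifelse(is_ant, m, -Inf)
  best <- ave(m_ant, d$Date, FUN = max)
  # Return only the rows to keep, instead of a numeric keep column.
  d[!is_ant | m == best, , drop = FALSE]
}

# The rows marked "Keep" in the table of the original post.
kept <- filter_ant(test)

# Per fish: split by the (hypothetical) Code column, filter, and recombine.
all_data <- rbind(cbind(Code = "A", test), cbind(Code = "B", test))
filtered <- do.call(rbind, lapply(split(all_data, all_data$Code), filter_ant))
```

With the keep column from Petr's code, the equivalent subsetting step would be test[test$keep > 0, ]. Note that this sketch ranks the ANT stations by N_records only and keeps both stations if their means tie; the cases raised by Bert, where Mean_power and N_records disagree, are not handled.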