Hi all,

I have fish telemetry data with more than 11 million rows. The data contain some false records that I need to eliminate. I can solve this with a loop, but I think dplyr::filter could be faster and more elegant. I just can't figure out how to do it.
At the moment I have already summarized the raw data and have something like this (dput at the end of this e-mail):

Date        Station  Antenna  Mean_power  N_records  Action needed (manually inserted)
29/03/2019  ANT01    1        108         1704       Remove
29/03/2019  ANT01    2        94          1219       Remove
29/03/2019  ANT02    1        220         3029       Keep
29/03/2019  ANT02    2        219         2711       Keep
30/03/2019  ANT01    1        204         2289       Keep
30/03/2019  ANT01    2        172         1477       Keep
30/03/2019  ANT02    1        88          913        Remove
30/03/2019  ANT02    2        72          1080       Remove
30/03/2019  ETE01    AH0      87          1          Keep

The problem occurs between stations ANT01 and ANT02. Within the same day, I have to keep the pair of records that has the higher Mean_power and more N_records. In this example, I have to keep the records from station ANT02 on 29/03 and those from ANT01 and ETE01 on 30/03. If ANT01 and ANT02 were the only stations recorded on a given day, this would be a simple question.

I have to do this for each marked fish, which is identified by a Code that I have omitted here for brevity.

Thanks in advance,
Raoni

structure(list(
  Date = structure(c(17984, 17984, 17984, 17984, 17985, 17985, 17985, 17985, 17985), class = "Date"),
  Station = c("ANT01", "ANT01", "ANT02", "ANT02", "ANT01", "ANT01", "ANT02", "ANT02", "ETE01"),
  Antenna = c("1", "2", "1", "2", "1", "2", "1", "2", "AH0"),
  Media_power = c(108, 94, 220, 219, 204, 172, 88, 72, 87),
  N_records = c(1704L, 1219L, 3029L, 2711L, 2289L, 1477L, 913L, 1080L, 1L)),
  row.names = c(NA, -9L),
  class = c("grouped_df", "tbl_df", "tbl", "data.frame"),
  groups = structure(list(
    Date = structure(c(17984, 17984, 17985, 17985, 17985), class = "Date"),
    Station = c("ANT01", "ANT02", "ANT01", "ANT02", "ETE01"),
    .rows = list(1:2, 3:4, 5:6, 7:8, 9L)),
    row.names = c(NA, -5L),
    class = c("tbl_df", "tbl", "data.frame"),
    .drop = TRUE))

--
Raoni Rosa Rodrigues
Research Associate of Fish Transposition Center CTPeixes
Universidade Federal de Minas Gerais - UFMG
Brasil
rodrigues.ra...@gmail.com

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
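
One possible way to express the rule described above with dplyr, offered as a minimal sketch rather than a definitive answer. It rebuilds the summary table from the dput as an ungrouped data frame (the dput names the power column Media_power, while the table header says Mean_power). The object names telemetry, conflicting, winners and kept are hypothetical. It also assumes that, when the two criteria disagree, the higher mean power decides and the record count only breaks ties; the post does not say which criterion should win.

```r
library(dplyr)

# Summary table from the dput in the post, rebuilt ungrouped.
# The fish Code column is omitted here, as it is in the post.
telemetry <- data.frame(
  Date = as.Date(c(17984, 17984, 17984, 17984, 17985, 17985, 17985, 17985, 17985),
                 origin = "1970-01-01"),
  Station = c("ANT01", "ANT01", "ANT02", "ANT02", "ANT01", "ANT01", "ANT02", "ANT02", "ETE01"),
  Antenna = c("1", "2", "1", "2", "1", "2", "1", "2", "AH0"),
  Media_power = c(108, 94, 220, 219, 204, 172, 88, 72, 87),
  N_records = c(1704L, 1219L, 3029L, 2711L, 2289L, 1477L, 913L, 1080L, 1L),
  stringsAsFactors = FALSE
)

# Stations that compete with each other within a day
conflicting <- c("ANT01", "ANT02")

# For each day, pick the competing station with the higher mean power
# (assumption: total record count is used only to break ties).
winners <- telemetry %>%
  filter(Station %in% conflicting) %>%
  group_by(Date, Station) %>%
  summarise(power = mean(Media_power),
            records = sum(N_records),
            .groups = "drop") %>%
  group_by(Date) %>%
  arrange(desc(power), desc(records), .by_group = TRUE) %>%
  slice(1) %>%
  ungroup() %>%
  select(Date, Station)

# Keep every row from non-competing stations, plus the rows of each day's winner.
kept <- telemetry %>%
  filter(!Station %in% conflicting) %>%
  bind_rows(semi_join(telemetry, winners, by = c("Date", "Station"))) %>%
  arrange(Date, Station, Antenna)

print(kept)
```

For the full data, the same idea would presumably need Code added to every group_by() and to the join keys, and the winners table could then be used in a semi_join() against the 11 million raw rows instead of a loop.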