Hi, I am trying to process a large dataset in R. The dataset contains the following three columns:
key_column - a unique key identifier
begin_date - the start date of the active period
end_date - the end date of the active period

Example data is here:

key_column,begin_date,end_date
123456,2013-01-01,2014-01-01
123456,2013-07-01,2014-07-01
789102,2012-03-01,2014-03-01
789102,2015-02-01,2016-02-01
789102,2015-02-06,2016-02-06

I want to build a condensed table of key_column with its begin_date and end_date values. As you can see in the example data above, some begin and end date periods overlap with other begin_date and end_date pairs for the same key_column. Where overlap exists, I want one record for the key_column with the min(begin_date) and the max(end_date).

Can anyone help me build the commands to process this data in R?

Thanks,
Matt

--
Matt Gross
gro...@gmail.com
503.329.4545
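For reference, the condensing described above can be sketched in base R roughly as follows. This is only one possible approach, not a definitive answer. It assumes every end_date is on or after its begin_date, and it treats two periods as overlapping when one begins on or before the latest end date seen so far for the same key. The names dat, merge_periods and condensed are made up for this illustration.

```r
# Example data from the post, read from an inline string.
dat <- read.csv(text = "key_column,begin_date,end_date
123456,2013-01-01,2014-01-01
123456,2013-07-01,2014-07-01
789102,2012-03-01,2014-03-01
789102,2015-02-01,2016-02-01
789102,2015-02-06,2016-02-06", stringsAsFactors = FALSE)
dat$begin_date <- as.Date(dat$begin_date)
dat$end_date   <- as.Date(dat$end_date)

# Hypothetical helper: collapse the overlapping periods of ONE key.
merge_periods <- function(d) {
  # Sort by start date so overlapping periods sit next to each other.
  d <- d[order(d$begin_date, d$end_date), ]
  b <- as.numeric(d$begin_date)
  e <- as.numeric(d$end_date)
  # Latest end date among all earlier rows (-Inf for the first row).
  prev_max_end <- c(-Inf, head(cummax(e), -1))
  # A new group starts whenever a period begins after everything
  # before it has already ended.
  grp <- cumsum(b > prev_max_end)
  data.frame(
    key_column = d$key_column[1],
    begin_date = as.Date(as.vector(tapply(b, grp, min)), origin = "1970-01-01"),
    end_date   = as.Date(as.vector(tapply(e, grp, max)), origin = "1970-01-01")
  )
}

# Apply the helper per key and stack the results.
condensed <- do.call(rbind, lapply(split(dat, dat$key_column), merge_periods))
rownames(condensed) <- NULL
print(condensed)
```

On the example data this should give three records: 123456 from 2013-01-01 to 2014-07-01, 789102 from 2012-03-01 to 2014-03-01, and 789102 from 2015-02-01 to 2016-02-06. Because the comparison is a strict ">", periods that merely touch (one ends on the day the next begins) are merged as well; if periods separated by a single day should also be merged, the test could be changed to b > prev_max_end + 1. For a very large dataset the split/lapply step may be slow, and a grouped operation in a package such as data.table or dplyr might be worth trying with the same grouping logic.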