Hi,

I am trying to process a large dataset in R.  The dataset contains the
following three columns:

key_column - a unique key identifier
begin_date - the start date of the active period
end_date - the end date of the active period


Example data is here:

key_column,begin_date,end_date
123456,2013-01-01,2014-01-01
123456,2013-07-01,2014-07-01
789102,2012-03-01,2014-03-01
789102,2015-02-01,2016-02-01
789102,2015-02-06,2016-02-06

I want to build a condensed table of key_column values with their
begin_date and end_date periods.  As the example data shows, some periods
overlap with other periods for the same key_column.  Where periods overlap,
I want a single record for that key_column with the min(begin_date) and the
max(end_date) of the overlapping periods.

Can anyone help me build the commands to process this data in R?
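
A minimal sketch of one possible approach in base R follows.  It is only
an illustration of the merging rule described above, not a tuned solution
for a large dataset.  The helper name merge_periods is hypothetical (it
is not from any package), and the sketch assumes that a period starting
on the same day an earlier one ends counts as overlapping.

```r
# Example data from the post, read from inline text.
dat <- read.csv(text = "key_column,begin_date,end_date
123456,2013-01-01,2014-01-01
123456,2013-07-01,2014-07-01
789102,2012-03-01,2014-03-01
789102,2015-02-01,2016-02-01
789102,2015-02-06,2016-02-06", stringsAsFactors = FALSE)
dat$begin_date <- as.Date(dat$begin_date)
dat$end_date   <- as.Date(dat$end_date)

# Hypothetical helper: collapse overlapping periods for ONE key.
merge_periods <- function(d) {
  # Sort so that each period is compared only with earlier-starting ones.
  d <- d[order(d$begin_date, d$end_date), ]
  end_num <- as.numeric(d$end_date)
  # Latest end date among all preceding rows (running maximum).
  prev_max <- c(-Inf, head(cummax(end_num), -1))
  # A new group starts when a period begins after every earlier one has
  # ended (assumption: begin == previous end still counts as overlap).
  grp <- cumsum(as.numeric(d$begin_date) > prev_max)
  data.frame(
    key_column = d$key_column[1],
    # Rows are sorted by begin_date, so the first row of a group is its min.
    begin_date = d$begin_date[!duplicated(grp)],
    end_date   = as.Date(as.vector(tapply(end_num, grp, max)),
                         origin = "1970-01-01")
  )
}

# Apply the helper per key and stack the results.
condensed <- do.call(rbind, lapply(split(dat, dat$key_column), merge_periods))
rownames(condensed) <- NULL
print(condensed)
```

For the example data this should leave one record for 123456
(2013-01-01 to 2014-07-01) and two for 789102 (2012-03-01 to 2014-03-01,
and 2015-02-01 to 2016-02-06).  Splitting by key with split() and
lapply() may be slow when there are very many keys; a grouped operation
in a package such as data.table or dplyr could apply the same rule.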

Thanks,
Matt

-- 
Matt Gross
gro...@gmail.com
503.329.4545


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
