Hello,

Given a list of files:

    In [81]: ec_files[0:10]
    Out[81]:
    [u'EC_20160604002000.csv',
     u'EC_20160604010000.csv',
     u'EC_20160604012000.csv',
     u'EC_20160604014000.csv',
     u'EC_20160604020000.csv']

where the numbers are a timestamp with format %Y%m%d%H%M%S, I'd like to
generate a list of matching files for each 2-hr period in a 2-hr
frequency time series.  Ultimately I'm using Pandas to read and handle
the data in each group of files.  For the task of generating the files
for each 2-hr period, I've done the following:

    beg_tstamp = pd.to_datetime(ec_files[0][-18:-4],
                                format="%Y%m%d%H%M%S")
    end_tstamp = pd.to_datetime(ec_files[-1][-18:-4],
                                format="%Y%m%d%H%M%S")
    tstamp_win = pd.date_range(beg_tstamp, end_tstamp, freq="2H")

So tstamp_win is the 2-hr frequency time series spanning the timestamps
in the files in ec_files.

I've generated the list of matching files for each tstamp_win using a
comprehension:

    win_files = []
    for i, w in enumerate(tstamp_win):
        nextw = w + pd.Timedelta(2, "h")
        ifiles = [x for x in ec_files if
                  pd.to_datetime(x[-18:-4], format="%Y%m%d%H%M%S") >= w and
                  pd.to_datetime(x[-18:-4], format="%Y%m%d%H%M%S") < nextw]
        win_files.append(ifiles)

However, this is proving very slow, because every file name is parsed
again for every window, and I was wondering whether there's a
better/faster way to do this.  Any tips would be appreciated.

-- 
Seb
-- 
https://mail.python.org/mailman/listinfo/python-list
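
One possible direction, offered only as a rough sketch rather than a definitive answer: parse each timestamp once, then compute each file's window number arithmetically instead of rescanning the whole list for every window. The sample list below repeats the five names shown above and adds one hypothetical file (EC_20160604050000.csv) purely to illustrate an empty window. The names `names`, `stamps`, `win` and `idx` are made up for this example, and `pd.Timedelta(hours=2)` is passed as the frequency in place of the "2H" string.

```python
import pandas as pd

ec_files = [
    "EC_20160604002000.csv",
    "EC_20160604010000.csv",
    "EC_20160604012000.csv",
    "EC_20160604014000.csv",
    "EC_20160604020000.csv",
    "EC_20160604050000.csv",  # hypothetical extra file, not in the original list
]

# Parse all timestamps once, in a single vectorized call.
names = pd.Series(ec_files)
stamps = pd.to_datetime(names.str[-18:-4], format="%Y%m%d%H%M%S")

# Same 2-hr windows as before, starting at the earliest timestamp.
win = pd.Timedelta(hours=2)
beg_tstamp, end_tstamp = stamps.min(), stamps.max()
tstamp_win = pd.date_range(beg_tstamp, end_tstamp, freq=win)

# Integer window number for each file: whole 2-hr steps since beg_tstamp.
idx = (stamps - beg_tstamp) // win

# One bucket per window, so windows without files stay as empty lists.
win_files = [[] for _ in tstamp_win]
for name, i in zip(ec_files, idx):
    win_files[i].append(name)
```

With this layout each file name is parsed once and assigned to its bucket in a single pass, so the work grows with the number of files rather than with files times windows. It assumes every file name ends in the 14-digit timestamp followed by ".csv", as in the listing above.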