DFS wrote:
> Have:
> '584323 Fri 13 May 2016 17:37:01 -0000 (UTC) 584324 Fri 13 May 2016
> 13:44:40 -0400 584325 13 May 2016 17:45:25 GMT 584326 Fri 13 May 2016
> 13:47:28 -0400'
>
> Want:
> [('584323', 'Fri 13 May 2016 17:37:01 -0000 (UTC)'),
> ('584324', 'Fri 13 May 2016 13:44:40 -0400'),
> ('584325', '13 May 2016 17:45:25 GMT'),
> ('584326', 'Fri 13 May 2016 13:47:28 -0400')]
>
>
> Or maybe split() on space, then run through and add words of 6+ numbers
> to the list, then recombine everything until you hit the next group of
> 6+ numbers, and so on?
>
> The data is guaranteed to contain those 6+ groups of numbers.

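The split() idea quoted above can be sketched roughly as follows. This is only an illustration: the function name pair_ids_with_dates and the min_digits parameter are made up here, and it assumes that any all-digit word of six or more characters starts a new record.

```python
def pair_ids_with_dates(text, min_digits=6):
    """Group whitespace-separated words into (id, rest) tuples.

    Assumption: a word made only of digits and at least `min_digits`
    long starts a new record; every other word belongs to the date.
    """
    pairs = []
    current_id = None
    words = []
    for word in text.split():
        if word.isdigit() and len(word) >= min_digits:
            # A new ID: flush the record collected so far, if any.
            if current_id is not None:
                pairs.append((current_id, " ".join(words)))
            current_id = word
            words = []
        elif current_id is not None:
            words.append(word)
    # Flush the last record.
    if current_id is not None:
        pairs.append((current_id, " ".join(words)))
    return pairs


s = ('584323 Fri 13 May 2016 17:37:01 -0000 (UTC) 584324 Fri 13 May 2016 '
     '13:44:40 -0400 584325 13 May 2016 17:45:25 GMT 584326 Fri 13 May 2016 '
     '13:47:28 -0400')
result = pair_ids_with_dates(s)
```

Note that joining with a single space collapses any repeated whitespace inside the date strings, which is harmless for the sample data.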
Test with regexp under Python 3:

>>> import re
>>> s = '584323 Fri 13 May 2016 17:37:01 -0000 (UTC) 584324 Fri 13 May 2016 13:44:40 -0400 584325 13 May 2016 17:45:25 GMT 584326 Fri 13 May 2016 13:47:28 -0400'
>>> re.split(r"(\d{6})(.*?)", s)
['', '584323', '', ' Fri 13 May 2016 17:37:01 -0000 (UTC) ', '584324', '', ' Fri 13 May 2016 13:44:40 -0400 ', '584325', '', ' 13 May 2016 17:45:25 GMT ', '584326', '', ' Fri 13 May 2016 13:47:28 -0400']

Discard the empty items and strip the whitespace from the beginning and end of each string, and that's done.

A+

Laurent.

Note: re experts will provide a cleaner solution.
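One possible way to finish the cleanup, as a minimal sketch: it splits on the digit group alone (dropping the lazy second group, so no empty strings appear between items) and uses \d{6,} on the assumption that the IDs may be longer than six digits and that no other run of six or more digits occurs in the data.

```python
import re

s = ('584323 Fri 13 May 2016 17:37:01 -0000 (UTC) 584324 Fri 13 May 2016 '
     '13:44:40 -0400 584325 13 May 2016 17:45:25 GMT 584326 Fri 13 May 2016 '
     '13:47:28 -0400')

# With one capturing group, re.split() returns
# [text_before_first_id, id1, text1, id2, text2, ...].
parts = re.split(r"(\d{6,})", s)

# Odd indexes hold the IDs, the following even indexes hold the dates;
# strip() removes the surrounding spaces left over from the split.
pairs = [(num, text.strip()) for num, text in zip(parts[1::2], parts[2::2])]
```

Here parts[0] is whatever precedes the first ID (an empty string for this input) and is simply skipped by the slicing.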