Re: Splitting text into lines

2016-12-13 Thread Steve D'Aprano
On Wed, 14 Dec 2016 03:45 am, George Trojan - NOAA Federal wrote:
> I have files containing ASCII text with lines separated by '\r\r\n'.
> Example:
>
> $ od -c FTAK31_PANC_131140.1481629265635
> 000 F T A K 3 1 P A N C 1 3 1 1
> 020 4 0 \r \r \n

Re: Splitting text into lines

2016-12-13 Thread Random832
On Tue, Dec 13, 2016, at 12:25, George Trojan - NOAA Federal wrote:
> >
> > Are repeated newlines/carriage returns significant at all? What about
> > just using re and just replacing any repeated instances of '\r' or '\n'
> > with '\n'? I.e. something like
> > >>> # the_string is your file all rea

Re: Splitting text into lines

2016-12-13 Thread George Trojan - NOAA Federal
> Tell Python to keep the newline chars as seen with
> open(filename, newline="")
> For example:
> >>> open("odd-newlines.txt", "rb").read()
> b'alpha\nbeta\r\r\ngamma\r\r\ndelta\n'
> >>> open("odd-newlines.txt", "r", newline="").read().replace("\r", "").splitlines()
> ['alpha', 'be
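
For anyone who wants to try the newline="" suggestion end to end, here is a minimal sketch. The helper name read_lines is made up for illustration, the temporary file stands in for the real data, and encoding="ascii" is an assumption based on the files being described as ASCII text.

```python
import os
import tempfile


def read_lines(path):
    """Read a text file and return its lines with every '\r' removed.

    read_lines is a hypothetical helper name, not something from the thread.
    """
    # newline="" switches off universal-newline translation, so the
    # '\r' characters reach us unchanged and can be removed explicitly.
    with open(path, "r", newline="", encoding="ascii") as f:
        text = f.read()
    return text.replace("\r", "").splitlines()


# Demo on a throwaway file holding the bytes shown in the quoted example.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "odd-newlines.txt")
    with open(demo_path, "wb") as f:
        f.write(b"alpha\nbeta\r\r\ngamma\r\r\ndelta\n")
    demo_lines = read_lines(demo_path)

print(demo_lines)
```

Note that this removes every '\r' in the file, including any that might appear in the middle of a line, which is only safe if a carriage return never carries meaning in the data.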

Re: Splitting text into lines

2016-12-13 Thread George Trojan - NOAA Federal
> Are repeated newlines/carriage returns significant at all? What about
> just using re and just replacing any repeated instances of '\r' or '\n'
> with '\n'? I.e. something like
> >>> # the_string is your file all read in
> >>> import re
> >>> re.sub("[\r\n]+", "\n", the_string)
> and then co
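
A small sketch of the re.sub idea quoted above, to make its side effect visible. The function name collapse_newlines and the sample string are invented for illustration; only the regular expression comes from the quoted message.

```python
import re


def collapse_newlines(the_string):
    """Replace every run of '\r' and '\n' characters with a single '\n'.

    collapse_newlines is a hypothetical name; the pattern is the one
    from the quoted suggestion.
    """
    return re.sub("[\r\n]+", "\n", the_string)


# Made-up sample mixing '\r\r\n' separators with a blank line ('\n\n').
sample = "alpha\r\r\nbeta\n\ngamma\r\r\n"
collapsed = collapse_newlines(sample)
lines = collapsed.splitlines()

print(lines)
```

Because a whole run of line-ending characters becomes a single '\n', blank lines in the input disappear. That is presumably why the suggestion opens by asking whether repeated newlines are significant.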

Re: Splitting text into lines

2016-12-13 Thread Peter Otten
George Trojan - NOAA Federal wrote:
> I have files containing ASCII text with lines separated by '\r\r\n'.
> but it looks cumbersome. In Python 2.x I stripped '\r' before passing the
> string to split():
> open('FTAK31_PANC_131140.1481629265635').read().replace('\r', '')
> 'FTAK31 PANC 13114

Re: Splitting text into lines

2016-12-13 Thread Thomas Nyberg
On 12/13/2016 08:45 AM, George Trojan - NOAA Federal wrote:
> Ideally I'd like to have code that handles both '\r\r\n' and '\n' as the
> split character.
>
> George

Are repeated newlines/carriage returns significant at all? What about just using re and just replacing any repeated instances of '\r' or
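
If blank lines do matter, one possible way to treat both '\r\r\n' and '\n' as the separator is to split on "any number of '\r' followed by '\n'". This is only a sketch: split_lines is a made-up name, the sample text is invented, and the pattern also accepts a plain '\r\n', which goes slightly beyond what was asked for.

```python
import re


def split_lines(text):
    """Split text on '\n' preceded by zero or more '\r' characters.

    Hypothetical helper: handles '\r\r\n' and '\n' (and, as an
    assumption beyond the original request, '\r\n' as well).
    """
    parts = re.split(r"\r*\n", text)
    # A trailing separator leaves one empty string at the end; drop it
    # so the result matches what str.splitlines() would give.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


# Made-up sample: two '\r\r\n' separators, then a blank line, then '\n'.
sample = "alpha\r\r\nbeta\r\r\ngamma\n\ndelta\n"
result = split_lines(sample)

print(result)
```

Each separator is consumed as a unit here, so an empty line in the input stays in the output as an empty string rather than being collapsed away.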

Splitting text into lines

2016-12-13 Thread George Trojan - NOAA Federal
I have files containing ASCII text with lines separated by '\r\r\n'.
Example:

$ od -c FTAK31_PANC_131140.1481629265635
000 F T A K 3 1 P A N C 1 3 1 1
020 4 0 \r \r \n T A F A B E \r \r \n T A
040 F \r \r \n P A