Re: Splitting a file from specific column content

2012-01-22 Thread MRAB
On 22/01/2012 19:58, Arnaud Delobelle wrote: On 22 January 2012 16:09, MRAB wrote: On 22/01/2012 15:39, Arnaud Delobelle wrote: [...] Or more succinctly (but not tested): sections = [ ("3", "section_1") ("5", "section_2") ("\xFF", "section_3") ] with open(input_path)
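
The snippet above is cut off, so the following is only a rough sketch of the table-driven idea, not the code that was posted. It assumes the input lines are sorted and that each section is described by an exclusive upper bound compared as a string, as the `sections` list suggests. The function name `split_sections` is made up for illustration, and the tuples are written with separating commas, which a Python list literal needs.

```python
def split_sections(lines, sections):
    """Group sorted lines into named sections.

    `sections` is a list of (upper_bound, name) pairs; a line belongs to the
    first section whose upper bound compares greater than the line (string
    comparison). Hypothetical helper, not taken from the thread.
    """
    result = {name: [] for _, name in sections}
    index = 0
    for line in lines:
        # Advance to the first section whose upper bound exceeds this line.
        while index < len(sections) and line >= sections[index][0]:
            index += 1
        if index == len(sections):
            break  # past the last bound; remaining lines are dropped
        result[sections[index][1]].append(line)
    return result


sections = [
    ("3", "section_1"),
    ("5", "section_2"),
    ("\xFF", "section_3"),
]

sample = ["0.1 a\n", "2.9 b\n", "3.0 c\n", "4.5 d\n", "5.0 e\n", "7.2 f\n"]
grouped = split_sections(sample, sections)
```

Because the comparison is on strings, this only behaves as intended while the leading field has a single digit before the decimal point.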

Re: Splitting a file from specific column content

2012-01-22 Thread Eelco
The grep solution is not cross-platform, and not really an answer to a question about Python. The by-line iteration examples are inefficient and bad practice from a numpy/vectorization perspective. I would advise doing it the numpythonic way (untested code): breakpoints = [3, 5, 7] data = np.loa
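
The code in the message above is truncated, so here is one possible vectorized sketch rather than the poster's version. It assumes the data is a 2-D array whose first column is sorted in ascending order (timestamps, say); the helper name `split_by_breakpoints` and the small in-memory array standing in for the loaded file are illustrative only.

```python
import numpy as np


def split_by_breakpoints(data, breakpoints):
    """Split the rows of a 2-D array into chunks by first-column value.

    Assumes data[:, 0] is sorted ascending. Hypothetical helper.
    """
    first_col = data[:, 0]
    # For each breakpoint, the index of the first row with value >= breakpoint.
    cut_points = np.searchsorted(first_col, breakpoints)
    return np.split(data, cut_points)


# Small stand-in for the array that loading the real file would produce.
data = np.array([
    [0.5, 10.0],
    [2.9, 11.0],
    [3.0, 12.0],
    [4.2, 13.0],
    [5.0, 14.0],
    [6.7, 15.0],
])

breakpoints = [3, 5, 7]
chunks = split_by_breakpoints(data, breakpoints)
```

With three breakpoints `np.split` returns four chunks; the last one holds rows at or above the final breakpoint and is empty for this sample.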

Re: Splitting a file from specific column content

2012-01-22 Thread Yigit Turgut
On Jan 22, 9:37 pm, Roy Smith wrote: > On Jan 22, 2012, at 2:34 PM, Tim Chase wrote: > > > On 01/22/12 13:26, Roy Smith wrote: > >>> If you wanted to do it in one pass using standard unix > >>> tools, you can use: > > >>> sed -n -e'/^[0-2]/w first-three.txt' -e'/^[34]/w > >>> next-two.txt' -e'/^[5

Re: Splitting a file from specific column content

2012-01-22 Thread Arnaud Delobelle
On 22 January 2012 16:09, MRAB wrote: > On 22/01/2012 15:39, Arnaud Delobelle wrote: [...] >> Or more succinctly (but not tested): >> >> >> sections = [ >>     ("3", "section_1") >>     ("5", "section_2") >>     ("\xFF", "section_3") >> ] >> >> with open(input_path) as input_file: >>     lines = it

Re: Splitting a file from specific column content

2012-01-22 Thread Roy Smith
On Jan 22, 2012, at 2:34 PM, Tim Chase wrote: > On 01/22/12 13:26, Roy Smith wrote: >>> If you wanted to do it in one pass using standard unix >>> tools, you can use: >>> >>> sed -n -e'/^[0-2]/w first-three.txt' -e'/^[34]/w >>> next-two.txt' -e'/^[5-7]/w next-three.txt' >>> >> I stand humbled. >

Re: Splitting a file from specific column content

2012-01-22 Thread Tim Chase
On 01/22/12 13:26, Roy Smith wrote: If you wanted to do it in one pass using standard unix tools, you can use: sed -n -e'/^[0-2]/w first-three.txt' -e'/^[34]/w next-two.txt' -e'/^[5-7]/w next-three.txt' I stand humbled. In all likelihood, you stand *younger*, not so much humbled ;-) -tkc
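
For readers who want the same one-pass routing without sed, a minimal Python sketch follows. The patterns and output file names are copied from the sed command above; the function name `split_one_pass` and the `routes` parameter are illustrative assumptions, not something posted in the thread.

```python
import re
from contextlib import ExitStack

# (pattern, output file) pairs mirroring the three sed expressions.
ROUTES = [
    (re.compile(r"^[0-2]"), "first-three.txt"),
    (re.compile(r"^[34]"), "next-two.txt"),
    (re.compile(r"^[5-7]"), "next-three.txt"),
]


def split_one_pass(input_path, routes=ROUTES):
    """Read the input once, writing each line to every output whose pattern matches.

    Lines matching no pattern are discarded, as with `sed -n`.
    """
    with ExitStack() as stack:
        input_file = stack.enter_context(open(input_path))
        outputs = [
            (pattern, stack.enter_context(open(path, "w")))
            for pattern, path in routes
        ]
        for line in input_file:
            for pattern, out in outputs:
                if pattern.match(line):
                    out.write(line)
```

Like the sed version, this looks only at the first character of each line, so it is correct only while the leading value has a single digit before the decimal point.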

Re: Splitting a file from specific column content

2012-01-22 Thread Roy Smith
I stand humbled. On Jan 22, 2012, at 2:25 PM, Tim Chase wrote: > On 01/22/12 08:45, Roy Smith wrote: >> I would do this with standard unix tools: >> >> grep '^[012]' input.txt> first-three-seconds.txt >> grep '^[34]' input.txt> next-two-seconds.txt >> grep '^[567]' input.txt> next-three-secon

Re: Splitting a file from specific column content

2012-01-22 Thread Tim Chase
On 01/22/12 08:45, Roy Smith wrote: I would do this with standard unix tools: grep '^[012]' input.txt> first-three-seconds.txt grep '^[34]' input.txt> next-two-seconds.txt grep '^[567]' input.txt> next-three-seconds.txt Sure, it makes three passes over the data, but for 20 MB of data, you co

Re: Splitting a file from specific column content

2012-01-22 Thread Yigit Turgut
On Jan 22, 6:56 pm, MRAB wrote: > On 22/01/2012 16:17, Yigit Turgut wrote: > [snip] > > > > > > > > > On Jan 22, 5:39 pm, Arnaud Delobelle  wrote: > [snip] > >>  Or more succinctly (but not tested): > > >>  sections = [ > >>      ("3", "section_1") > >>      ("5", "section_2") > >>      ("\xFF", "s

Re: Splitting a file from specific column content

2012-01-22 Thread MRAB
On 22/01/2012 16:17, Yigit Turgut wrote: [snip] On Jan 22, 5:39 pm, Arnaud Delobelle wrote: [snip] Or more succinctly (but not tested): sections = [ ("3", "section_1") ("5", "section_2") ("\xFF", "section_3") ] with open(input_path) as input_file: lines = iter(input_fi

Re: Splitting a file from specific column content

2012-01-22 Thread Yigit Turgut
On Jan 22, 4:45 pm, Roy Smith wrote: > In article > , > Yigit Turgut wrote: > > Hi all, > > > I have a text file approximately 20mb in size and contains about one > > million lines. I was doing some processing on the data but then the > > data rate increased and it takes very long time to proce

Re: Splitting a file from specific column content

2012-01-22 Thread MRAB
On 22/01/2012 15:39, Arnaud Delobelle wrote: On 22 January 2012 15:19, MRAB wrote: Here's a solution in Python 3: input_path = "..." section_1_path = "..." section_2_path = "..." section_3_path = "..." with open(input_path) as input_file: try: line = next(input_file)

Re: Splitting a file from specific column content

2012-01-22 Thread Arnaud Delobelle
On 22 January 2012 15:19, MRAB wrote: > Here's a solution in Python 3: > > input_path = "..." > section_1_path = "..." > section_2_path = "..." > section_3_path = "..." > > with open(input_path) as input_file: >    try: >        line = next(input_file) > >        # Copy section 1. >        with o

Re: Splitting a file from specific column content

2012-01-22 Thread MRAB
On 22/01/2012 14:32, Yigit Turgut wrote: Hi all, I have a text file approximately 20mb in size and contains about one million lines. I was doing some processing on the data but then the data rate increased and it takes very long time to process. I import using numpy.loadtxt, here is a fragment o

Re: Splitting a file from specific column content

2012-01-22 Thread Roy Smith
In article , Yigit Turgut wrote: > Hi all, > > I have a text file approximately 20mb in size and contains about one > million lines. I was doing some processing on the data but then the > data rate increased and it takes very long time to process. I import > using numpy.loadtxt, here is a frag