Split single file into multiple files based on patterns

2012-10-23 Thread satyam
I have a text file like this:

A1980JE3937 2732 4195 12.527000
A1980JE3937 3465 9720 22.00
A1980JE3937 1853 3278 12.50
A1980JE3937 2732 2732 187.50
A1980JE3937 19 4688 3.619000
A1980JE3937 2995 9720 6.667000
A1980JE3937 1603 9720 30.00
A1980JE3937 234 4195 42.416000
A1980JE3937 2732 9720 18.00
A1980KK18700010 130 303 4.985000
A1980KK18700010 7 4915 0.435000
A1980KK18700010 25 1620 1.722000
A1980KK18700010 25 186 0.654000
A1980KK18700010 50 130 3.199000
A1980KK18700010 186 3366 4.78
A1980KK18700010 30 186 1.285000
A1980KK18700010 30 185 4.395000
A1980KK18700010 185 186 9.00
A1980KK18700010 25 30 3.493000

I want to split the file and get multiple files like A1980JE3937.txt and
A1980KK18700010.txt, where each file contains columns 2, 3 and 4 of the
matching lines.
Thanks
Satyam
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Split single file into multiple files based on patterns

2012-10-23 Thread satyam mukherjee
Thanks, I will take a look. My actual data is 2.5 GB in size.
Satyam

On Tue, Oct 23, 2012 at 10:43 PM, Jason Friedman wrote:

> On Tue, Oct 23, 2012 at 9:01 PM, satyam wrote:
> > I have a text file like this
> >
> > A1980JE3937 2732 4195 12.527000
> > A1980JE3937 3465 9720 22.00
> > A1980JE3937 1853 3278 12.50
> > A1980JE3937 2732 2732 187.50
> > A1980JE3937 19 4688 3.619000
> > A1980KK18700010 30 186 1.285000
> > A1980KK18700010 30 185 4.395000
> > A1980KK18700010 185 186 9.00
> > A1980KK18700010 25 30 3.493000
> >
> > I want to split the file and get multiple files like A1980JE3937.txt
> and A1980KK18700010.txt, where each file will contain column2, 3 and 4.
>
> Unless your source file is very large, this should be sufficient:
>
> $ cat source
> A1980JE3937 2732 4195 12.527000
> A1980JE3937 3465 9720 22.00
> A1980JE3937 1853 3278 12.50
> A1980JE3937 2732 2732 187.50
> A1980JE3937 19 4688 3.619000
> A1980JE3937 2995 9720 6.667000
> A1980JE3937 1603 9720 30.00
> A1980JE3937 234 4195 42.416000
> A1980JE3937 2732 9720 18.00
> A1980KK18700010 130 303 4.985000
> A1980KK18700010 7 4915 0.435000
> A1980KK18700010 25 1620 1.722000
> A1980KK18700010 25 186 0.654000
> A1980KK18700010 50 130 3.199000
> A1980KK18700010 186 3366 4.78
> A1980KK18700010 30 186 1.285000
> A1980KK18700010 30 185 4.395000
> A1980KK18700010 185 186 9.00
> A1980KK18700010 25 30 3.493000
>
> $ python3
> Python 3.2.3 (default, Sep 10 2012, 18:14:40)
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> for line in open("source"):
> ...     file_name, remainder = line.strip().split(None, 1)
> ...     with open(file_name + ".txt", "a") as writer:
> ...         print(remainder, file=writer)
> ...
> >>>
>
> $ ls *txt
> A1980JE3937.txt  A1980KK18700010.txt
>
> $ cat A1980JE3937.txt
> 2732 4195 12.527000
> 3465 9720 22.00
> 1853 3278 12.50
> 2732 2732 187.50
> 19 4688 3.619000
> 2995 9720 6.667000
> 1603 9720 30.00
> 234 4195 42.416000
> 2732 9720 18.00
>
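For a source file in the 2.5 GB range, reopening an output file for every
input line may get slow. Below is a minimal sketch of one possible variant,
assuming (as in the sample above) that lines with the same first column are
mostly adjacent. It opens each output file once per run of identical keys
instead of once per line. The function name split_by_first_column and the
out_dir parameter are made up for illustration; they are not from the
original post.

```python
import os
from itertools import groupby


def split_by_first_column(source_path, out_dir="."):
    """Write columns 2..n of each line to <out_dir>/<column 1>.txt.

    Hypothetical helper: assumes whitespace-separated columns and that
    column 1 is safe to use as a file name.
    """
    with open(source_path) as source:
        # Lazily split each non-blank line into [key, rest-of-line];
        # the file is streamed, never loaded into memory as a whole.
        rows = (line.split(None, 1) for line in source if line.strip())
        # groupby merges only *adjacent* rows that share a key, so each
        # run of identical keys costs a single open() call.
        for key, group in groupby(rows, key=lambda row: row[0]):
            path = os.path.join(out_dir, key + ".txt")
            # Append mode keeps the result correct even if the same key
            # shows up again later in the file.
            with open(path, "a") as writer:
                for row in group:
                    # A line holding only the key yields an empty line.
                    remainder = row[1].strip() if len(row) > 1 else ""
                    writer.write(remainder + "\n")
```

Calling split_by_first_column("source") in the directory shown above should
produce the same two .txt files. Because the files are opened in append mode,
running it twice appends the data twice, so delete old output first.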


