Re: Simple Text Processing

2009-09-13 Thread Tim Roberts
Steven D'Aprano wrote: >On Fri, 11 Sep 2009 21:52:36 -0700, Tim Roberts wrote: > >> Basically, when you're good with Perl, you start to think of every task >> in terms of regular expression matches. When you're good with Python, >> you start to think of every task in terms of lists and tuples. >

Re: Simple Text Processing

2009-09-11 Thread Steven D'Aprano
On Fri, 11 Sep 2009 21:52:36 -0700, Tim Roberts wrote: > Basically, when you're good with Perl, you start to think of every task > in terms of regular expression matches. When you're good with Python, > you start to think of every task in terms of lists and tuples. Not me -- I think of most such

Re: Simple Text Processing

2009-09-11 Thread Tim Roberts
AJAskey wrote: > >Never mind. I guess I had been trying to make it more difficult than >it is. As a note, I can work on something for 10 hours and not figure >it out. But the second I post to a group, then I immediately figure >it out myself. Strange snake this Python... Come sit on the couch

Re: Simple Text Processing

2009-09-10 Thread AJAskey
Never mind. I guess I had been trying to make it more difficult than it is. As a note, I can work on something for 10 hours and not figure it out. But the second I post to a group, then I immediately figure it out myself. Strange snake this Python... Example for anyone else interested: line =

Re: Simple Text Processing

2009-09-10 Thread Benjamin Kaplan
On Thu, Sep 10, 2009 at 11:36 AM, AJAskey wrote: > New to Python. I can solve the problem in Perl by using "split()" to > an array. Can't figure it out in Python. > > I'm reading variable lines of text. I want to use the first number I > find. The problem is the lines are variable. > > Input

Simple Text Processing

2009-09-10 Thread AJAskey
New to Python. I can solve the problem in Perl by using "split()" to an array. Can't figure it out in Python. I'm reading variable lines of text. I want to use the first number I find. The problem is the lines are variable. Input example: this is a number: 1 here are some numbers 1 2 3 4
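For readers following the question above, here is a minimal sketch of one way to pick the first number out of a variable line by splitting on whitespace. It is not the poster's own solution (that snippet is cut off in this archive), the function name `first_number` is made up for illustration, and it assumes Python 3 and that "number" means an integer token.

```python
def first_number(line):
    """Return the first whitespace-separated token that parses as an int, or None."""
    for token in line.split():
        try:
            return int(token)
        except ValueError:
            # Not an integer (e.g. "number:"), so keep scanning.
            continue
    return None


# The two sample lines come from the original question.
samples = ["this is a number: 1", "here are some numbers 1 2 3 4"]
for line in samples:
    print(first_number(line))
```

Trying `int()` on each token rather than testing `str.isdigit()` is a design choice here: it also accepts signed values such as `-3`.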

Re: Simple Text Processing Help

2007-10-17 Thread Tim Roberts
[EMAIL PROTECTED] wrote: > >And now for something completely different... > >I've been reading up a bit about Python and Excel and I quickly told >the program to output to Excel quite easily. However, what if the >input file were a Word document? I can't seem to find much >information about parsi

Re: Simple Text Processing Help

2007-10-16 Thread patrick . waldo
And now for something completely different... I've been reading up a bit about Python and Excel and I quickly told the program to output to Excel quite easily. However, what if the input file were a Word document? I can't seem to find much information about parsing Word files. What could I add

Re: Simple Text Processing Help

2007-10-16 Thread patrick . waldo
And now for something completely different... I see a lot of COM stuff with Python for Excel... and I quickly made the same program output to Excel. What if the input file were a Word document? Where is there information about manipulating Word documents, or what could I add to make the same prog

Re: Simple Text Processing Help

2007-10-16 Thread Peter Otten
patrick.waldo wrote:
> manipulation? Also, I conceptually get it, but would you mind walking
> me through
>> for key, group in groupby(instream, unicode.isspace):
>>     if not key:
>>         yield "".join(group)
itertools.groupby() splits a sequence into groups with the same key; e. g
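The quoted generator can be tried out on a small list of lines. The sketch below is a Python 3 adaptation, so it uses `str.isspace` where the original used `unicode.isspace`; the name `records` and the sample data are assumptions made for illustration, not taken from the thread.

```python
from itertools import groupby


def records(lines):
    """Yield each run of consecutive non-blank lines joined into one string."""
    # groupby() starts a new group whenever the key changes, so the lines
    # alternate between "blank" groups (key True) and "record" groups (key False).
    for is_blank, group in groupby(lines, str.isspace):
        if not is_blank:
            yield "".join(group)


# Hypothetical input: three records separated by one or more blank lines.
sample = ["a 1\n", "b 2\n", "\n", "c 3\n", "\n", "\n", "d 4\n"]
print(list(records(sample)))
```

Consecutive blank lines fall into a single group, so extra separators between records are skipped without special handling.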

Re: Simple Text Processing Help

2007-10-15 Thread Paul McGuire
On Oct 14, 8:48 am, [EMAIL PROTECTED] wrote: > Hi all, > > I started Python just a little while ago and I am stuck on something > that is really simple, but I just can't figure out. > > Essentially I need to take a text document with some chemical > information in Czech and organize it into another

Re: Simple Text Processing Help

2007-10-15 Thread Paul Hankin
On Oct 15, 10:08 pm, [EMAIL PROTECTED] wrote: > Because of my limited Python knowledge, I will need to try to figure > out exactly how they work for future text manipulation and for my own > knowledge. Could you recommend some resources for this kind of text > manipulation? Also, I conceptually g

Re: Simple Text Processing Help

2007-10-15 Thread patrick . waldo
Wow, thank you all. All three work. To output correctly I needed to add: output.write("\r\n") This is really a great help!! Because of my limited Python knowledge, I will need to try to figure out exactly how they work for future text manipulation and for my own knowledge. Could you recommend

Re: Simple Text Processing Help

2007-10-15 Thread Peter Otten
patrick.waldo wrote: > my sample input file looks like this( not organized,as you see it): > 200-720-769-93-2 > kyselina mocová C5H4N4O3 > > 200-001-8 50-00-0 > formaldehyd CH2O > > 200-002-3 > 50-01-1 > guanidínium-chlorid CH5N3.ClH Assuming that the records are al

Re: Simple Text Processing Help

2007-10-15 Thread Paul Hankin
On Oct 15, 12:20 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote: > On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote: > > my sample input file looks like this( not organized,as you see it): > > 200-720-769-93-2 > > kyselina mocová C5H4N4O3 > > > 200-001-8 50-00-0 >

Re: Simple Text Processing Help

2007-10-15 Thread Marc 'BlackJack' Rintsch
On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote: > my sample input file looks like this( not organized,as you see it): > 200-720-769-93-2 > kyselina mocová C5H4N4O3 > > 200-001-8 50-00-0 > formaldehyd CH2O > > 200-002-3 > 50-01-1 > guanidínium-chlorid CH5N3.C

Re: Simple Text Processing Help

2007-10-15 Thread patrick . waldo
> lines = open('your_file.txt').readlines()[:4]
> print lines
> print map(len, lines)
gave me:
['\xef\xbb\xbf200-720-769-93-2\n', 'kyselina mo\xc4\x8dov \xc3\xa1 C5H4N4O3\n', '\n', '200-001-8\t50-00-0\n']
[28, 32, 1, 18]
I think it means that I'm still at option 3. I got
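The leading `\xef\xbb\xbf` in the first line of that output is a UTF-8 byte order mark. As a hedged illustration (nothing in this archive shows that the thread took this route), Python's `utf-8-sig` codec drops the mark while decoding. The helper name `decode_without_bom` is hypothetical, and the sample bytes simply prepend the mark to one of the lines printed above.

```python
def decode_without_bom(raw):
    """Decode UTF-8 bytes, discarding a leading byte order mark if present."""
    return raw.decode("utf-8-sig")


# Illustrative bytes: a UTF-8 BOM followed by one line from the output above.
raw = b"\xef\xbb\xbf200-001-8\t50-00-0\n"

plain = raw.decode("utf-8")      # keeps the mark as U+FEFF
clean = decode_without_bom(raw)  # mark removed

print(repr(plain))
print(repr(clean))
```

When reading from a file in Python 3, passing `encoding="utf-8-sig"` to `open()` has the same effect.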

Re: Simple Text Processing Help

2007-10-14 Thread John Machin
On Oct 14, 11:48 pm, [EMAIL PROTECTED] wrote: > Hi all, > > I started Python just a little while ago and I am stuck on something > that is really simple, but I just can't figure out. > > Essentially I need to take a text document with some chemical > information in Czech and organize it into anothe

Re: Simple Text Processing Help

2007-10-14 Thread Marc 'BlackJack' Rintsch
On Sun, 14 Oct 2007 16:57:06 +, patrick.waldo wrote: > Thank you both for helping me out. I am still rather new to Python > and so I'm probably trying to reinvent the wheel here. > > When I try to do Paul's response, I get tokens = line.strip().split() > [] What is in `line`? Paul wrot

Re: Simple Text Processing Help

2007-10-14 Thread patrick . waldo
Thank you both for helping me out. I am still rather new to Python and so I'm probably trying to reinvent the wheel here. When I try to do Paul's response, I get >>>tokens = line.strip().split() [] So I am not quite sure how to read line by line. tokens = input.read().split() gets me all the in
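One possible reading of the empty list above is that `line` held a blank line, since `"\n".strip().split()` returns `[]`. The sketch below shows per-line tokenising that skips such lines. It assumes Python 3, uses `io.StringIO` as a stand-in for the real input file, and the name `tokens_per_line` and the sample text are illustrative only.

```python
import io


def tokens_per_line(stream):
    """Return one token list per non-blank line of a text stream."""
    result = []
    for line in stream:
        tokens = line.strip().split()
        if tokens:  # a blank line yields [], which is skipped
            result.append(tokens)
    return result


# Stand-in for an open file: two data lines followed by a blank line.
sample = io.StringIO("200-001-8\t50-00-0\nformaldehyd CH2O\n\n")
print(tokens_per_line(sample))
```

Unlike calling `read().split()` on the whole file, this keeps track of which tokens belong to which line.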

Re: Simple Text Processing Help

2007-10-14 Thread Paul Hankin
On Oct 14, 2:48 pm, [EMAIL PROTECTED] wrote: > Hi all, > > I started Python just a little while ago and I am stuck on something > that is really simple, but I just can't figure out. > > Essentially I need to take a text document with some chemical > information in Czech and organize it into another

Re: Simple Text Processing Help

2007-10-14 Thread Marc 'BlackJack' Rintsch
On Sun, 14 Oct 2007 13:48:51 +, patrick.waldo wrote: > Essentially I need to take a text document with some chemical > information in Czech and organize it into another text file. The > information is always EINECS number, CAS, chemical name, and formula > in tables. I need to organize them

Simple Text Processing Help

2007-10-14 Thread patrick . waldo
Hi all, I started Python just a little while ago and I am stuck on something that is really simple, but I just can't figure out. Essentially I need to take a text document with some chemical information in Czech and organize it into another text file. The information is always EINECS number, CAS