Re: Improving my text processing script

2005-09-01 Thread Paul McGuire
Yes indeed, the real data often has surprising differences from the simulations! :) It turns out that pyparsing LineStart()'s are pretty fussy. Usually, pyparsing is very forgiving about whitespace between expressions, but it turns out that LineStart *must* be followed by the next expression, wit

Re: Improving my text processing script

2005-09-01 Thread pruebauno
[EMAIL PROTECTED] wrote:
> Paul McGuire wrote:
> > match...), this program has quite a few holes.
> tried run it though and it is not working for me. The following code
> runs but prints nothing at all:
>
> import pyparsing as prs
>
And this is the point where I have to post the real stuff because

Re: Improving my text processing script

2005-09-01 Thread pruebauno
Paul McGuire wrote:
> match...), this program has quite a few holes.
>
> What if the word "Identifier" is inside one of the quoted strings?
> What if the actual value is "tablename10"? This will match your
> "tablename1" string search, but it is certainly not what you want.
> Did you know there ar
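The "tablename10" versus "tablename1" concern quoted above can be shown in a few lines. This is only a sketch: the names `wanted` and `found` are hypothetical, since the thread does not show the real data, and the exact-match alternative is one possible fix rather than what either poster settled on.

```python
# Hypothetical names; the real table list is not shown in the thread.
wanted = ["tablename1"]
found = "tablename10"

# Substring test: reports a hit even though the two names differ.
substring_hit = any(name in found for name in wanted)

# Exact comparison against a set: no false positive.
exact_hit = found in set(wanted)

print(substring_hit, exact_hit)  # True False
```

Comparing whole extracted names instead of searching for substrings avoids this class of accidental match.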

Re: Improving my text processing script

2005-09-01 Thread pruebauno
Miki Tebeka wrote: > Look at re.findall, I think it'll be easier. Minor changes aside, the interesting thing, as you pointed out, would be using re.findall. I could not figure out how to use it.
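For readers with the same question, here is a minimal sketch of how re.findall could be applied with the pattern from the original script. The `sample` text is a made-up stand-in for the 'plst' file, whose real layout is not shown in the thread.

```python
import re

# Hypothetical sample standing in for the contents of the 'plst' file.
sample = '''Section "one"
    Identifier "tablename1"
EndSection
Section "two"
    Identifier "tablename10"
EndSection
'''

# Same pattern as in the original script.
sep = re.compile(r'Identifier "(.*?)"')

# With exactly one capture group, findall returns the captured strings
# for every non-overlapping match, in order of appearance.
identifiers = sep.findall(sample)
print(identifiers)  # ['tablename1', 'tablename10']
```

This replaces the split-on-"Identifier" loop with a single call, although it returns only the captured names and not the surrounding text of each section.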

Re: Improving my text processing script

2005-09-01 Thread Miki Tebeka
Hello pruebauno,
> import re
> f=file('tlst')
> tlst=f.read().split('\n')
> f.close()
tlst = open("tlst").readlines()
> f=file('plst')
> sep=re.compile('Identifier "(.*?)"')
> plst=[]
> for elem in f.read().split('Identifier'):
>     content='Identifier'+elem
>     match=sep.search(content)
>
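One detail worth knowing when swapping read().split('\n') for readlines(): the two do not return identical lists. The sketch below uses an in-memory io.StringIO object with made-up contents in place of the 'tlst' file, and is written for Python 3 rather than the Python 2 of the thread.

```python
import io

# Hypothetical stand-in for the contents of the 'tlst' file.
text = "alpha\nbeta\n"

# split('\n') drops the newlines but leaves a trailing empty string.
split_lines = text.split("\n")

# readlines() keeps the newline at the end of each line.
read_lines = io.StringIO(text).readlines()

# splitlines() drops the newlines and adds no empty trailing entry.
clean_lines = text.splitlines()

print(split_lines)  # ['alpha', 'beta', '']
print(read_lines)   # ['alpha\n', 'beta\n']
print(clean_lines)  # ['alpha', 'beta']
```

If the list is later compared against names extracted elsewhere, stripping the newlines first (for example with splitlines()) keeps the comparison from silently failing.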

Re: Improving my text processing script

2005-08-31 Thread Paul McGuire
Even though you are using re's to try to look for specific substrings (which you sort of fake in by splitting on "Identifier", and then prepending "Identifier" to every list element, so that the re will match...), this program has quite a few holes. What if the word "Identifier" is inside one of t

Improving my text processing script

2005-08-31 Thread pruebauno
I am sure there is a better way of writing this, but how?
import re
f=file('tlst')
tlst=f.read().split('\n')
f.close()
f=file('plst')
sep=re.compile('Identifier "(.*?)"')
plst=[]
for elem in f.read().split('Identifier'):
    content='Identifier'+elem
    match=sep.search(content)
    i