Re: Multiline regex help

Kent Johnson Thu, 03 Mar 2005 04:15:05 -0800

Yatima wrote:

Hey Folks,

I've got some info in a bunch of files that kind of looks like so:

Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34

and so on...

Anyhow, these "fields" repeat several times in a given file (number of
repetitions varies from file to file). The number on the line following the
"RelevantInfo" lines is really what I'm after. Ideally, I would like to have
something like so:

RelevantInfo1 = 10/10/04 # The variable name isn't actually important
RelevantInfo3 = 23       # it's just there to illustrate what info I'm
                         # trying to snag.


Here is a way to create a list of [RelevantInfo, value] pairs:

import cStringIO

raw_data = '''Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34'''
raw_data = cStringIO.StringIO(raw_data)

data = []
for line in raw_data:
    if line.startswith('RelevantInfo'):
        key = line.strip()
        value = raw_data.next().strip()
        data.append([key, value])

print data
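
The snippet above targets Python 2 (cStringIO, the print statement, and the iterator's .next() method). A rough Python 3 equivalent might look like the sketch below. The helper name relevant_pairs is made up for illustration, and falling back to an empty value when a "RelevantInfo" line is the last line of the input is an assumption, not something the original post specifies.

```python
import io

raw_text = '''Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34'''


def relevant_pairs(lines):
    """Pair each line starting with 'RelevantInfo' with the line after it."""
    lines = iter(lines)  # one shared iterator, so next() consumes the value line
    pairs = []
    for line in lines:
        if line.startswith('RelevantInfo'):
            key = line.strip()
            # Assumption: use '' if the key happens to be the last line,
            # instead of letting StopIteration escape.
            value = next(lines, '').strip()
            pairs.append([key, value])
    return pairs


# io.StringIO stands in for an open file here.
data = relevant_pairs(io.StringIO(raw_text))
print(data)
```

The same function should work on a real file object, e.g. relevant_pairs(open(path)), since it only needs an iterable of lines.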

Score[RelevantInfo1][RelevantInfo3] = 22 # The value from RelevantInfo2


I'm not sure what you mean by this. Do you want to build a Score dictionary 
as well?
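
If so, one possible reading is a nested dictionary where the RelevantInfo1 and RelevantInfo3 values are the keys and the RelevantInfo2 value is the score. The sketch below is only a guess at that structure: the function name build_scores, the sample pairs, and the assumption that a RelevantInfo3 line closes each repetition are all illustrative and not confirmed by the original post.

```python
from collections import defaultdict

# Hypothetical input: [key, value] pairs from one file, in file order.
pairs = [
    ['RelevantInfo1', '10/10/04'],
    ['RelevantInfo2', '22'],
    ['RelevantInfo3', '23'],
]

FIELDS = ('RelevantInfo1', 'RelevantInfo2', 'RelevantInfo3')


def build_scores(pairs):
    """Build scores[info1][info3] = info2 from a flat list of pairs."""
    scores = defaultdict(dict)
    record = {}
    for key, value in pairs:
        record[key] = value
        # Assumption: RelevantInfo3 is the last field of each repetition.
        if key == 'RelevantInfo3':
            if all(field in record for field in FIELDS):
                info1, info2, info3 = (record[f] for f in FIELDS)
                scores[info1][info3] = info2
            record = {}  # start collecting the next repetition
    return scores


scores = build_scores(pairs)
print(scores['10/10/04']['23'])
```

Incomplete repetitions (one of the three fields missing) are silently skipped here; whether that is the right behaviour depends on the real data.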

Kent


Collected from all of the files.

So, there would be several of these "scores" per file and there are a bunch
of files. Ultimately, I am interested in printing them out as a csv file but
that should be relatively easy once they are trapped in my array of doom
<cue evil laughter>.
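
As a rough sketch of the csv step, assuming the scores end up in a nested dictionary of the form scores[info1][info3] = info2 (the sample data, the column names, and the function name write_scores_csv are all made up for illustration), the standard csv module could be used like this:

```python
import csv
import io

# Hypothetical sample in the assumed scores[info1][info3] = info2 layout.
scores = {'10/10/04': {'23': '22'}}


def write_scores_csv(scores, out):
    """Write one CSV row per (info1, info3, score) entry to a file-like object."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['RelevantInfo1', 'RelevantInfo3', 'Score'])
    for info1, inner in scores.items():
        for info3, score in inner.items():
            writer.writerow([info1, info3, score])


# io.StringIO stands in for a real output file.
buf = io.StringIO()
write_scores_csv(scores, buf)
print(buf.getvalue())
```

For a real file, opening it with open(path, 'w', newline='') is the usual way to let the csv module control line endings.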

I've got a fairly ugly "solution" (I am using this term *very* loosely)
using awk and his faithful companion sed, but I would prefer something in
python.

Thanks for your time.

--
http://mail.python.org/mailman/listinfo/python-list
