The parser is a simple-minded state machine that will misbehave if the input does not have entries in the order RelevantInfo1, RelevantInfo2, RelevantInfo3 (with as many intervening lines as you like). It expects each value on the line immediately after its marker.
All three values are available when RelevantInfo3 is detected, so you could do something else with them at that point if you want.
HTH
Kent
import cStringIO

raw_data = '''Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34

Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34

SecondSetofGarbage
2423
YouGetThePicture
342342
RelevantInfo1
10/10/04
HoHum
343
MoreStuffNotNeeded
232
RelevantInfo2
33
RelevantInfo3
44
sdfsdf
RelevantInfo1
10/11/04
InsertBoringFillerHere
43234
Stuff
MoreStuff
RelevantInfo2
45
ExcitingIsntIt
324234
RelevantInfo3
60
Lalala'''
raw_data = cStringIO.StringIO(raw_data)

scores = {}
info1 = info2 = info3 = None

for line in raw_data:
    if line.startswith('RelevantInfo1'):
        info1 = raw_data.next().strip()
    elif line.startswith('RelevantInfo2'):
        info2 = raw_data.next().strip()
    elif line.startswith('RelevantInfo3'):
        info3 = raw_data.next().strip()
        scores.setdefault(info1, {}).setdefault(info3, []).append(info2)
        info1 = info2 = info3 = None

print scores
print scores['10/11/04']['60']
print scores['10/10/04']['23']
## prints:
{'10/10/04': {'44': ['33'], '23': ['22', '22']}, '10/11/04': {'60': ['45']}}
['45']
['22', '22']
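
The listing above targets Python 2 (cStringIO, the print statement, and the .next() method on the file-like object). For anyone on Python 3, here is a rough sketch of the same state machine. The function name parse_scores and the shortened sample input are made up for illustration and are not part of the original post. It also passes a default to next() so that a marker on the very last line yields an empty value instead of raising StopIteration, which is a small departure from the original.

```python
def parse_scores(text):
    """Group RelevantInfo2 values by RelevantInfo1, then by RelevantInfo3.

    Hypothetical helper: same logic as the loop in the post, ported to Python 3.
    """
    lines = iter(text.splitlines())
    scores = {}
    info1 = info2 = info3 = None
    for line in lines:
        if line.startswith('RelevantInfo1'):
            # The value sits on the line right after the marker.
            info1 = next(lines, '').strip()
        elif line.startswith('RelevantInfo2'):
            info2 = next(lines, '').strip()
        elif line.startswith('RelevantInfo3'):
            info3 = next(lines, '').strip()
            # RelevantInfo3 closes a record: store it and reset the state.
            scores.setdefault(info1, {}).setdefault(info3, []).append(info2)
            info1 = info2 = info3 = None
    return scores


# Shortened, made-up sample in the same one-token-per-line layout.
sample = """Gibberish
53
RelevantInfo1
10/10/04
Filler
RelevantInfo2
22
RelevantInfo3
23
Trailing
RelevantInfo1
10/11/04
RelevantInfo2
45
Noise
RelevantInfo3
60
"""

scores = parse_scores(sample)
print(scores)
print(scores['10/11/04']['60'])
```

The ordering caveat still applies: if RelevantInfo3 shows up before the other two markers, the record is stored under None keys and values, so parse_scores("RelevantInfo3\n9\n") gives {None: {'9': [None]}}.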