On Aug 6, 3:55 pm, Tommy Grav <[EMAIL PROTECTED]> wrote:
> I have a file with the format
>
> Field f29227: Ra=20:23:46.54 Dec=+67:30:00.0 MJD=53370.06797690 Frames 5 Set 1
> Field f31448: Ra=20:24:58.13 Dec=+79:39:43.9 MJD=53370.06811620 Frames 5 Set 2
> Field f31226: Ra=20:24:45.50 Dec=+78:26:45.2 MJD=53370.06823860 Frames 5 Set 3
> Field f31004: Ra=20:25:05.28 Dec=+77:13:46.9 MJD=53370.06836020 Frames 5 Set 4
> Field f30782: Ra=20:25:51.94 Dec=+76:00:48.6 MJD=53370.06848210 Frames 5 Set 5
> Field f30560: Ra=20:27:01.82 Dec=+74:47:50.3 MJD=53370.06860400 Frames 5 Set 6
> Field f30338: Ra=20:28:32.35 Dec=+73:34:52.0 MJD=53370.06872620 Frames 5 Set 7
> Field f30116: Ra=20:30:21.70 Dec=+72:21:53.6 MJD=53370.06884890 Frames 5 Set 8
> Field f29894: Ra=20:32:28.54 Dec=+71:08:55.0 MJD=53370.06897070 Frames 5 Set 9
> Field f29672: Ra=20:34:51.89 Dec=+69:55:56.6 MJD=53370.06909350 Frames 5 Set 10
>
> I would like to parse this file by extracting the field id, ra, dec
> and mjd for each line. It is not, however, certain that the width of
> each value of the field id, ra, dec or mjd is the same in each line.
> Is there a way to do this such that even if there was a line where
> Ra=****** and MJD=******** was swapped it would be parsed correctly?
>
> Cheers
> Tommy

Did you consider changing the file format in the first place, so that
you don't have to go through any contortions to parse it?

Anyway, here is a solution with regular expressions (I'm a beginner
with re's in Python, so please correct it if it is wrong and suggest
better solutions):

import re

s = """Field f29227: Ra=20:23:46.54 Dec=+67:30:00.0 MJD=53370.06797690 Frames 5 Set 1
Field f31448: Ra=20:24:58.13 Dec=+79:39:43.9 MJD=53370.06811620 Frames 5 Set 2
Field f31226: Ra=20:24:45.50 Dec=+78:26:45.2 MJD=53370.06823860 Frames 5 Set 3
Field f31004: Ra=20:25:05.28 Dec=+77:13:46.9 MJD=53370.06836020 Frames 5 Set 4
Field f30782: Ra=20:25:51.94 Dec=+76:00:48.6 MJD=53370.06848210 Frames 5 Set 5
Field f30560: Dec=+74:47:50.3 Ra=20:27:01.82 MJD=53370.06860400 Frames 5 Set 6
Field f30338: Ra=20:28:32.35 Dec=+73:34:52.0 MJD=53370.06872620 Frames 5 Set 7
Field f30116: Ra=20:30:21.70 Dec=+72:21:53.6 MJD=53370.06884890 Frames 5 Set 8
Field f29894: Ra=20:32:28.54 Dec=+71:08:55.0 MJD=53370.06897070 Frames 5 Set 9
Field f29672: Ra=20:34:51.89 Dec=+69:55:56.6 MJD=53370.06909350 Frames 5 Set 10"""

# Ra and Dec may come in either order; MJD must follow directly after them.
# Groups: 1 = field id, 2/5 = Ra, 3/4 = Dec, 6 = MJD.
r = re.compile(r'Field (\S+): (?:(?:Ra=(\S+) Dec=(\S+))|(?:Dec=(\S+) Ra=(\S+))) MJD=(\S+)')

for line in s.split('\n'):
    m = r.match(line)
    if m is None:
        continue  # skip lines that do not fit the pattern
    field = m.group(1)
    Ra = m.group(2) or m.group(5)    # whichever alternative matched
    Dec = m.group(3) or m.group(4)
    MJD = m.group(6)
    print(field, Ra, Dec, MJD)
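
The pattern above lets Ra and Dec trade places, but it still expects MJD to come last, so a line where Ra and MJD are swapped would not match. As a rough sketch of one way around that (not the only one), the key=value tokens can be collected into a dict so that their order stops mattering. This assumes every value is free of whitespace and that each key appears at most once per line. The name parse_line and the reordered second sample line are made up for illustration.

```python
import re

# Hypothetical helper patterns: the field id comes from the "Field <id>:"
# prefix, everything else from whitespace-separated key=value tokens.
FIELD_RE = re.compile(r'Field\s+(\S+):')
PAIR_RE = re.compile(r'(\w+)=(\S+)')

def parse_line(line):
    """Return a dict with 'field', 'Ra', 'Dec', 'MJD', or None if the line does not fit."""
    head = FIELD_RE.match(line)
    if head is None:
        return None
    # findall yields (key, value) tuples; dict() makes lookup order-independent.
    values = dict(PAIR_RE.findall(line))
    if not all(key in values for key in ('Ra', 'Dec', 'MJD')):
        return None
    return {'field': head.group(1), 'Ra': values['Ra'],
            'Dec': values['Dec'], 'MJD': values['MJD']}

sample = [
    "Field f29227: Ra=20:23:46.54 Dec=+67:30:00.0 MJD=53370.06797690 Frames 5 Set 1",
    # Same values as the second line of the file, reordered here for illustration.
    "Field f31448: MJD=53370.06811620 Dec=+79:39:43.9 Ra=20:24:58.13 Frames 5 Set 2",
]

for line in sample:
    rec = parse_line(line)
    if rec is not None:
        print(rec['field'], rec['Ra'], rec['Dec'], rec['MJD'])
```

Since the values are matched with \S+, their width does not matter either. Lines that lack the Field prefix or any of the three keys are skipped instead of raising an error.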