Greg Lindstrom wrote:
> Hello-
> I have a file generated by an HP-9000 running Unix containing form feeds
> signified by ^M^L.  I am trying to scan for the linefeed to signal
> certain processing to be performed but can not get the regex to "see"
> it.  Suppose I read my input line into a variable named "input"
>
> The following does not seem to work...
> input = input_file.readline()

You are shadowing a builtin.

> if re.match('\f', input): print 'Found a formfeed!'
> else: print 'No linefeed!'

formfeed == not not linefeed????

>
> I also tried to create a ^M^L (typed in as <ctrl>Q M <ctrlQ> L) but that
> gives me a syntax error when I try to run the program (re does not like
> the control characters, I guess).  Is it possible for me to pull out the
> formfeeds in a straightforward manner?
>

For a start, resolve your confusion between formfeed and linefeed.

Formfeed makes your printer skip to the top of a new page (form), without changing the column position. FF, '\f', ctrl-L, 0x0C.

Linefeed makes the printer skip to a new line, without changing the column position. LF, '\n', ctrl-J, 0x0A.

There is also carriage return, which makes your typewriter return to column 1, without moving to the next line. CR, '\r', ctrl-M, 0x0D.

Now you can probably guess why the writer of your report file is emitting "\r\f": return to column 1, then skip to the top of the next page. What we can't guess for you is where in your file these "\r\f" occurrences are in relation to the newlines (i.e. '\n') which Python is interpreting as line breaks.

As others have pointed out, (1) re.match only matches at the start of the string and (2) you probably don't need to use re anyway.

The solution may be as simple as:

    if input_line[:2] == "\r\f":

BTW, have you checked that there are no other control characters embedded in the file, e.g. ESC (introducing an escape sequence), SI/SO (change character set), BEL * 100 (Hey, Fred, the printout's finished), HT, VT, BS (yeah, probably lots of that, but I mean BackSpace)?

HTH,
John
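For anyone wanting to try the points above, here is a minimal sketch in Python 3 syntax (the original post used Python 2 print statements). The helper names starts_page, has_page_break and other_controls, and the sample strings, are made up for illustration; nothing here is known about the layout of the actual report file.

```python
import re

# ASCII values of the three characters discussed above.
assert "\f" == "\x0c"  # form feed, ctrl-L
assert "\n" == "\x0a"  # line feed, ctrl-J
assert "\r" == "\x0d"  # carriage return, ctrl-M

PAGE_BREAK = "\r\f"  # the ^M^L pair described in the question


def starts_page(line):
    """True if the line begins with CR FF (same test as line[:2] == PAGE_BREAK)."""
    return line.startswith(PAGE_BREAK)


def has_page_break(line):
    """True if CR FF occurs anywhere in the line; no regex needed."""
    return PAGE_BREAK in line


# Every ASCII control character except LF, FF and CR, so that ESC, SI/SO,
# BEL, HT, VT, BS and friends are reported.
OTHER_CONTROLS = re.compile(r"[\x00-\x09\x0b\x0e-\x1f\x7f]")


def other_controls(line):
    """Sorted list of the distinct unexpected control characters in the line."""
    return sorted(set(OTHER_CONTROLS.findall(line)))


# Hypothetical sample lines.
first = "\r\fPage 2 header\n"
later = "last line of page 1\r\f\n"

print(starts_page(first), starts_page(later))        # True False
print(has_page_break(first), has_page_break(later))  # True True

# re.match is anchored at position 0, where the character is '\r', not '\f'.
print(re.match("\f", first))                # None
print(re.search("\f", first) is not None)   # True

print(other_controls("a\x1bb\x07c\td\n"))   # ['\x07', '\t', '\x1b']
```

One caveat if you read the file with Python 3: in text mode the default newline handling translates a lone '\r' to '\n', so the "\r\f" pair would never reach your code intact. Opening the file with newline="\n" (or in binary mode) keeps the carriage returns and still splits lines only on '\n'.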