"ardief" wrote: > sorry if I'm asking something very obvious but I'm stumped. I have a > text that looks like this: > > Sentence 401 > 4.00pm — We set off again; this time via Tony's home to collect > a variety of possessions, finally arriving at hospital no.3. > Sentence 402 > 4.55pm — Tony is ushered into a side ward with three doctors and > I stay outside with Mum. > > And I want the HTML char codes to turn into their equivalent plain > text. I've looked at the newsgroup archives, the cookbook, the web in > general and can't manage to sort it out.
> file = open('filename', 'r')
> ofile = open('otherfile', 'w')
>
> done = 0
>
> while not done:
>     line = file.readline()
>     if 'THE END' in line:
>         done = 1
>     elif '—' in line:
>         line.replace('—', '--')

replace returns a new string; it doesn't update the line in place.
you have to assign the result back to the name (line = line.replace(...))
before writing it out.

>         ofile.write(line)
>     else:
>         ofile.write(line)

for a more general solution to the actual replace problem, see:

    http://effbot.org/zone/re-sub.htm#unescape-html

you may also want to look up the "fileinput" module in the library
reference manual.

</F>
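
A minimal sketch of the corrected loop, under some assumptions that are not in the original thread: it uses the standard library's html.unescape (Python 3.4 and later) rather than the re.sub recipe linked above, it assumes the dashes in the input are written as the numeric reference &#8212;, and the function name unescape_lines and its end_marker parameter are made up for illustration. Iterating over the file object also stops at end of file, so the loop cannot spin forever when 'THE END' is missing.

```python
import html
import io


def unescape_lines(src, dst, end_marker="THE END"):
    """Copy lines from src to dst, decoding HTML character references.

    Copying stops at the first line containing end_marker; that line is
    not written, mirroring the 'THE END' check in the quoted loop.
    """
    for line in src:
        if end_marker in line:
            break
        # String methods return new strings, so keep the result.
        text = html.unescape(line)
        # Optional: map the decoded em dash (U+2014) to plain ASCII '--',
        # as the quoted code intended.
        text = text.replace("\u2014", "--")
        dst.write(text)


# In-memory files stand in for open('filename') / open('otherfile', 'w').
src = io.StringIO("4.00pm &#8212; We set off again\nTHE END\nignored\n")
dst = io.StringIO()
unescape_lines(src, dst)
print(repr(dst.getvalue()))
```

With real files, the same function can be called inside a with block that opens both the input and the output file, so they are closed even if an error occurs.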