Charles Hartman wrote:
> I'm working on text-handling programs that want plain-text files as
> input. It's fine to tell users to feed the programs with plain-text
> only, but not all users know what this means, even after you explain
> it, or they forget. So it would be nice to be able to handle gracefully
> the stuff that MS Word (or any word-processor) puts into a file.
> Inserting a 0-127 filter is easy but not very friendly. Typically, the
> w.p. file loads OK (into a wx.StyledTextCtrl, a.k.a. Scintilla editing
> pane), and is mostly readable. Just a few characters will be wrong:
> "smart" quotation marks and the like.
>
> Is there some well-known way to filter or translate this w.p. garbage?
> I don't know whether encodings are relevant; I don't know what encoding
> an MSW file uses. I don't see how to use s.translate() because I don't
> know how to predict what the incoming format will be.
>
> Any hints welcome.

This may help: http://wvware.sourceforge.net/
[not a recommendation, I've never used it]
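
For the narrower case described in the question, where the file is essentially text and only a few characters such as "smart" quotes come out wrong, a rough sketch along the following lines might be enough. It assumes the stray bytes are Windows-1252 (cp1252), which is a guess about the input rather than something established in this thread. The names SMART_PUNCTUATION and to_plain_text are made up for illustration, and the mapping is deliberately short. It does not attempt to read a real binary .doc file; that would need a converter such as the one linked above.

```python
# Sketch only. Assumption: bytes that are not valid UTF-8 are
# Windows-1252 (cp1252), as is common for text that came out of Word.

# Hypothetical, deliberately small mapping from "smart" punctuation
# (given as Unicode code points) to plain ASCII replacements.
SMART_PUNCTUATION = {
    0x2018: "'",    # left single quotation mark
    0x2019: "'",    # right single quotation mark / apostrophe
    0x201C: '"',    # left double quotation mark
    0x201D: '"',    # right double quotation mark
    0x2013: "-",    # en dash
    0x2014: "--",   # em dash
    0x2026: "...",  # horizontal ellipsis
    0x00A0: " ",    # no-break space
}


def to_plain_text(raw: bytes) -> str:
    """Decode raw bytes and replace common smart punctuation."""
    try:
        # Try UTF-8 first; it rejects most cp1252 byte sequences.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to cp1252. Bytes that cp1252 leaves undefined
        # become U+FFFD instead of raising an error.
        text = raw.decode("cp1252", errors="replace")
    # str.translate accepts a dict keyed by code point.
    return text.translate(SMART_PUNCTUATION)
```

Usage would be something like to_plain_text(open(path, "rb").read()) before handing the result to the editing pane. Characters outside the mapping (accented letters, for example) pass through unchanged, which seems friendlier than a blanket 0-127 filter.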