On Sun, May 04, 2008 at 09:52:10PM +0200, Uwe Stöhr wrote:
> Andre Poenitz wrote:
>
>> And, instead of assuming 7 bit ASCII we should assume UTF-8. This would
>> be a uniformly better guess.
>
> No it won't, most programs still produce TeX-output in plain ASCII.

Sure, and all plain ASCII is also valid UTF-8.

> We assume currently 8-bit latin1 btw.

This would be another sensible choice since a lot of Western legacy files
are Latin1. Using UTF-8 is slightly more politically correct, and has a
slightly higher probability of being a good guess in the long run.

But anyway: We should be able to override any guesstimate on the
command line.

Andre'
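
To illustrate the guess-then-override idea from this thread, here is a minimal Python sketch. It is not the actual importer code: the function name `decode_tex` and the `--encoding` option are hypothetical, made up for this example only. It tries UTF-8 first (which also accepts plain ASCII), falls back to Latin1 if the bytes are not valid UTF-8, and lets an explicit choice win over any guess.

```python
import argparse
from typing import Optional, Tuple


def decode_tex(data: bytes, encoding: Optional[str] = None) -> Tuple[str, str]:
    """Decode raw TeX bytes and return (text, encoding_used).

    Hypothetical helper: an explicit encoding always wins; otherwise
    guess UTF-8 and fall back to Latin1.
    """
    if encoding is not None:
        # User override: no guessing at all.
        return data.decode(encoding), encoding
    try:
        # Plain 7-bit ASCII is valid UTF-8, so this covers both cases.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte value maps to a character in Latin1, so this cannot fail.
        return data.decode("latin-1"), "latin-1"


def build_parser() -> argparse.ArgumentParser:
    """Command-line parser with a hypothetical --encoding override."""
    parser = argparse.ArgumentParser(description="Decode a TeX file.")
    parser.add_argument("texfile")
    parser.add_argument("--encoding", default=None,
                        help="skip the guess and use this encoding")
    return parser
```

A caller would read the file in binary mode and pass `args.encoding` through, e.g. `decode_tex(open(args.texfile, "rb").read(), args.encoding)`. Note that the fallback only detects files that are not valid UTF-8; a Latin1 file whose bytes happen to form valid UTF-8 would still be misread, which is exactly why the override matters.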