On Sunday 21 June 2015 21:22:32 Georg Baum wrote:
> Some general remarks:
> Chardet works fine in some cases, but not at all in some others (if only
> very few non-ASCII symbols are in the file). Therefore I would always try to
> determine the encoding from other sources, and only use chardet as a last
> fallback.
>
> Unfortunately I am not sure which files we are talking about. Do you mean
> lyx2lyx and .lyx files? Or other files?

My main concern as you correctly guessed is with .lyx files and lyx2lyx code.

> In case of lyx2lyx, very old files
> can use several encodings in one file, so these would need to be opened in
> binary mode. Maybe we should split lyx2lyx into one part which works only
> with python2 which converts up to 246 (LyX 1.4), and another one which
> converts from 246 on? I fear that a rewrite of this old code would need far
> too much testing.

You are right in your analysis and that is what I fear. :-)

But in the long term it does not make much sense to have code in python 2 and
code in python 3. We need to go all the way to python 3 ready code.

I think that I will go first after the easy task and leave lyx2lyx to the
end. :-(

> Georg

Regards,
--
José Abílio
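
For illustration, a minimal sketch of the "chardet only as a last fallback" idea quoted above, working on bytes read in binary mode. This is not lyx2lyx code: the function name decode_bytes, its parameters, the 0.8 confidence threshold and the final latin-1 fallback are all assumptions made for the example. The only chardet call used is chardet.detect(), and the sketch still runs if chardet is not installed.

```python
def decode_bytes(data, declared=None, use_chardet=True, min_confidence=0.8):
    """Return (text, encoding_used) for a bytes object.

    Hypothetical helper, not part of lyx2lyx. `declared` stands for an
    encoding learned from another source (e.g. a header in the file).
    """
    # 1. Prefer encodings known from other sources, then try UTF-8.
    candidates = [declared] if declared else []
    candidates.append("utf-8")
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            continue

    # 2. Only now ask chardet, and only trust a confident guess.
    if use_chardet:
        try:
            import chardet  # third-party, optional
        except ImportError:
            chardet = None
        if chardet is not None:
            guess = chardet.detect(data)
            enc = guess.get("encoding")
            confidence = guess.get("confidence") or 0.0
            if enc and confidence >= min_confidence:
                try:
                    return data.decode(enc), enc
                except (UnicodeDecodeError, LookupError):
                    pass

    # 3. latin-1 maps every byte to a character, so this never fails
    #    (assumed fallback; the result may still be the wrong text).
    return data.decode("latin-1"), "latin-1"
```

A caller would read the file with open(path, "rb") and pass the bytes in. For very old files that mix several encodings, the same function could be applied to each line or section separately rather than to the whole file.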