On Thursday 28 December 2006 18:43, Jean-Marc Lasgouttes wrote:
> I still think we want to translate to LaTeX macros (xml entities) the
> characters that do not have a good match in the output charset. To be
> able to do that, I am not sure what is required, but an exception
> seems reasonable.
I agree. I would also like to have a new encoding, ASCII, that would not
contain any non-ASCII characters.

I guess you mean SGML entities? I believe that we don't need anything
special for XML, since the XML variant of DocBook allows UTF-8 encoding.

> However, while I can see how this works in a char-by-char conversion,
> it seems more complicated for a complete string. I guess though that
> we do mainly char-by-char in cases of interest.

Jose outlined a possible implementation with a postprocessor in Denmark
some time ago. I don't think that such a postprocessor is really a good
idea, since it would need to be able to parse LaTeX (e.g. to find an
\inputencoding command). One thing I have learned from tex2lyx is that
it is always possible to find a simple and sensible use case that
breaks it, regardless of how much time and code you invest to make it
work.

Of course we do not want to extend the existing list of hardcoded
commands either. Changing that code so that it reads the list of
commands and needed packages from a file should not be difficult. Then
we can think about how to use existing lists of LaTeX commands for
Unicode characters. We could even find out the list of Unicode
characters that have a representation in a given encoding automatically
(with iconv). I am going to investigate this approach.

Georg
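To make the idea concrete, here is a minimal sketch of the two pieces
discussed above: probing whether a character is representable in the
output encoding (what iconv would be used for), and falling back to a
LaTeX macro plus its required package when it is not. This is not LyX
code; the macro table, function names, and the use of Python's codec
machinery in place of iconv are all assumptions for illustration. In
LyX the table would be read from a file rather than hardcoded.

```python
# Hypothetical sketch of the proposed per-character conversion.
# The macro table and names below are illustrative, not LyX's actual code.

# Unicode char -> (LaTeX macro, required package or None).
# In the proposal this list would be loaded from a file.
LATEX_MACROS = {
    "\u00e9": ("\\'{e}", None),                 # e with acute accent
    "\u20ac": ("\\texteuro{}", "textcomp"),     # euro sign
    "\u03b1": ("\\ensuremath{\\alpha}", None),  # Greek small alpha
}

def representable(char: str, encoding: str) -> bool:
    """Probe (iconv-style) whether `char` can be encoded in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        return False

def convert(text: str, encoding: str):
    """Return (converted text, set of LaTeX packages needed).

    Characters that fit the output encoding pass through unchanged;
    the rest are replaced by LaTeX macros from the table.
    """
    out, packages = [], set()
    for ch in text:
        if representable(ch, encoding):
            out.append(ch)
        elif ch in LATEX_MACROS:
            macro, pkg = LATEX_MACROS[ch]
            out.append(macro)
            if pkg is not None:
                packages.add(pkg)
        else:
            raise ValueError(f"no LaTeX fallback for U+{ord(ch):04X}")
    return "".join(out), packages

def charset_of(encoding: str, limit: int = 0x250):
    """Discover representable code points automatically, as suggested
    for iconv: probe each code point up to `limit`."""
    return {chr(cp) for cp in range(limit) if representable(chr(cp), encoding)}
```

With an ASCII output encoding, `convert("caf\u00e9 \u20ac", "ascii")`
keeps the ASCII letters, replaces the accented e and the euro sign with
macros, and reports that textcomp is needed; with latin-1 the accented
e passes through unchanged and only the euro sign falls back.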