Georg Baum wrote:
On Saturday 30 December 2006 21:26, Dov Feldstern wrote:

I'm sorry for not being clear. You're right --- using the file
heb142-default.lyx, there is a problem with the display, but the
generated latex is correct. But the reason (for both of those things) is
that the lyx file was originally created with 1.4.2 --- so it's not a
unicode file at all!


No, that is not the reason. As you probably know, LyX converts old files with lyx2lyx to the current format. This conversion works perfectly for the file heb142-default.lyx. It also works perfectly for files with a fixed encoding such as cp1255. It does not work for files with multiple encodings (http://bugzilla.lyx.org/show_bug.cgi?id=3049), but I sent a patch yesterday that fixes this bug. With this patch you should not have any file format related bugs anymore (at least I don't know of any).


(And I guess this also implies a problem with conversion of files from 1.4.X to 1.5.0?),


No, see above.


So on the one hand, 1.5 doesn't know how to display the characters anymore (I'm not sure exactly
why, though);


But I know: Since we don't know how LaTeX interprets the "default" encoding (this depends on many things), we treat it internally (and in the lyx2lyx conversion) as latin1. Therefore lyx2lyx interprets files in this encoding as latin1 and converts that to unicode. Since in your case the encoding was actually interpreted as cp1255, the unicode characters in LyX are wrong. However, the generated output is OK, since the wrong unicode characters are converted back to latin1. This is then interpreted as cp1255 by LaTeX, and you get the correct DVI output. I guess that LyX 1.4 displayed the characters correctly because it used the language to determine the encoding. If that is true we can do the same in 1.5, but I am not sure yet.
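
The round trip described above can be reproduced outside LyX. This is only a rough sketch using Python's standard codecs, not the actual lyx2lyx code; the sample word and the variable names are made up for illustration, and it assumes the 1.4 file really stored cp1255 bytes.

```python
# Hypothetical sample text; any Hebrew string that cp1255 can encode would do.
original = "שלום"

# Bytes as the 1.4 file is assumed to contain them (cp1255).
raw = original.encode("cp1255")

# "default" is treated as latin1, so the bytes decode to the wrong characters.
# This is what shows up on screen in 1.5.
misread = raw.decode("latin-1")

# On export the wrong characters are encoded back to latin1,
# which reproduces the original bytes exactly.
exported = misread.encode("latin-1")

# LaTeX then reads those bytes as cp1255 and gets the intended text.
recovered = exported.decode("cp1255")

print(repr(misread))
print(exported == raw, recovered == original)
```

latin1 maps every byte value to exactly one character, so the decode/encode pair loses nothing. That is why the display can be wrong while the output stays correct.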



Georg, you win. I stand corrected. And thanks for the detailed explanation, I think I understand the situation now.

So it seems to me that what we really want (for Hebrew, at least) is for LyX (and lyx2lyx) to determine the encoding based on the language, if the encoding is set to "default" (and maybe also "auto"?). I understand, however, that that may not be the right thing for other languages...
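
To make the idea concrete, here is a very rough sketch of the lookup I have in mind. It is not taken from the LyX or lyx2lyx sources: the function name, the table and the latin1 fallback are all assumptions for illustration, and only the Hebrew entry comes from this thread.

```python
# Hypothetical language-to-encoding table; only "hebrew" is grounded in this thread.
LANGUAGE_ENCODINGS = {
    "hebrew": "cp1255",
}


def effective_encoding(doc_encoding, language, fallback="latin-1"):
    """Pick the encoding used to interpret a document's bytes.

    If the document encoding is "default", look the language up in the
    table and fall back to latin1 (the current behaviour) when it is
    not listed. Any explicit encoding is returned unchanged.
    """
    if doc_encoding == "default":
        return LANGUAGE_ENCODINGS.get(language, fallback)
    return doc_encoding


print(effective_encoding("default", "hebrew"))
print(effective_encoding("default", "english"))
print(effective_encoding("utf8", "hebrew"))
```

Languages that are not in the table keep today's latin1 treatment, so the change would only affect languages that are added to it explicitly. Whether "auto" should go through the same lookup is left open here.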

If you can point me to where in the code this is happening, I'd be willing to take a shot at trying to patch it up. I keep asking you to fix things, but I'm willing to try and help, too...

But here's where the second problem arises, and this time it's LyX's
problem, not latex's (though I'm less sure about this part): it seems
to me like LyX itself --- not only latex --- is also determining the
encoding based on the paragraph, rather than based on the individual
characters' language.

Yes. It is implemented like that because of the limitation of older
inputenc packages.

There's no real reason why LyX should limit itself just because latex
does. Here exactly is an example where latex will manage, if only LyX
would.

I don't think that latex would manage, but I'll create a test patch so
that we can try it out.

The reason I say that latex will manage, is that the generated latex
file should look exactly the same as the file generated by 1.4.2, which
does work...

I don't understand. 1.4 has the same inputenc limitation, or is that not true?

Yes, 1.4 does have the same inputenc limitation. That's why we use "default" encoding instead of "auto". "Auto" uses inputenc, and explicitly states the encoding of each paragraph. But "default" doesn't state anything about encodings at all, and doesn't use inputenc; and latex just manages, I guess because it uses the language in order to determine the encoding, and the language *can* change in the middle of a paragraph.

Dov
