Dov Feldstern wrote:
Abdelrazak Younes wrote:
Dov Feldstern wrote:
The re-ordering of characters should _exclusively_ be based on the Unicode range, not on the language. Fixing the bidi algorithm to do that should not be very difficult.


This is certainly an option, however I am far from convinced that this is the correct behavior. I'd be happy to hear additional opinions... (Georg?)

Abdel.


I feel I should explain *why* I'm always so adamant that the explicit language information is important. Admittedly, the example I'm about to give is rather esoteric --- but then again, that's precisely where the little annoyances arise, and after all, LaTeX is all about getting things "just right", isn't it? And the example is something that I've really wanted to do (and have been able to do with LyX, but not with regular word-processors); it is not totally fabricated...

So here it is (uppercase will represent an RTL language, lowercase -- LTR). First of all, if I were to print an LTR list, it would have the form "xxx, yyy, zzz" (i.e., immediately to the right of each word comes the comma, then a space, then the next word). An RTL list, on the other hand, would have the form (visually) of: "ZZZ ,YYY ,XXX" (the first word XXX is on the right, followed to the left of it by a comma and only then a space, then the next word, and so on). Just to clarify the differences, I'll print them in two rows (note especially the positions of the commas):

LTR: xxx, yyy, zzz
RTL: ZZZ ,YYY ,XXX

So far so good. Now, let's say I want a list of RTL words in an overall LTR sentence. There are situations in which *to me* it makes more sense that the overall structure should remain LTR. In other words, I want something that looks like this:

"this is the list: CBA, FED, IHG..."

--- note that each RTL word is of course in RTL order, but the order of the words, as well as the positioning of the commas, is LTR. I challenge you to try that in OpenOffice or MS-Word --- it's just not possible, because the bidi algorithm decides that the entire string from A to I is all RTL, and therefore renders it as

"IHG ,FED ,CBA"

That is *not* what I want. (To be sure, you could actually get the desired output even in OO/Word, basically by typing backwards --- in other words, you have to mess up the *logical* order in order to get the correct *visual* output; but I find that unacceptable, and if the list is very long, it becomes unmanageable.) In LyX, I *can* do it without typing everything backwards, precisely because I have the explicit language mechanism.
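
To make the difference concrete, here is a toy sketch in Python. It is not LyX's code, nor a real bidi implementation: it uses the convention of this mail (uppercase = RTL, lowercase = LTR, everything else neutral), and the function names are made up for illustration. The first function mimics the character-class approach, where neutrals enclosed by RTL letters on both sides are absorbed into the RTL run; the second reverses only the spans that the user explicitly marked as RTL.

```python
import re

# Toy convention (an assumption of this sketch): uppercase letters stand for
# RTL characters, lowercase for LTR, anything else is a neutral character.

def visual_by_character_class(logical):
    """Reorder based on character class only.

    Neutrals (commas, spaces) between two RTL letters are treated as RTL,
    so the whole list becomes a single RTL run and is reversed as a unit.
    """
    rtl_run = re.compile(r"[A-Z](?:[^A-Za-z]*[A-Z])*")
    return rtl_run.sub(lambda m: m.group(0)[::-1], logical)


def visual_by_explicit_language(spans):
    """Reorder based on explicit language marking.

    `spans` is a list of (text, is_rtl) pairs in logical order; only the
    spans marked as RTL are reversed, everything else keeps its position.
    """
    return "".join(text[::-1] if is_rtl else text for text, is_rtl in spans)


logical = "this is the list: ABC, DEF, GHI..."
print(visual_by_character_class(logical))

spans = [
    ("this is the list: ", False),
    ("ABC", True), (", ", False),
    ("DEF", True), (", ", False),
    ("GHI", True), ("...", False),
]
print(visual_by_explicit_language(spans))
```

The logical input is the same in both cases; only the explicit marking lets the commas and the word order stay LTR.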

I admit that it's debatable whether what I want is correct or not --- I don't know if there are any rules about this --- but nonetheless, I want to be able to do it that way if I choose to.

There are other similar cases --- but they all revolve around the "ambiguous" characters --- punctuation marks, for example, and perhaps digits --- which really are ambiguous in the sense that they are not inherently either RTL or LTR. The bidi algorithms do a good job of guessing what the user wants *usually*, but sometimes they foul up; and it's not their fault --- it's a real ambiguity, which can only be resolved by *explicitly* disambiguating it... In LyX, we already have that built in, so it would be a shame to throw that away...
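
As a small illustration of which characters are "ambiguous", the sketch below uses Python's standard `unicodedata.bidirectional()` to look up the Unicode bidirectional class of a character. The helper name `is_ambiguous` and the choice to count every class other than the strong ones (L, R, AL) as ambiguous are assumptions of this sketch; the Unicode terminology distinguishes further between "weak" types (such as digits) and "neutral" ones (such as whitespace).

```python
import unicodedata

# Strong bidirectional classes: left-to-right, right-to-left, Arabic letter.
STRONG_CLASSES = {"L", "R", "AL"}


def is_ambiguous(ch):
    """Return True if `ch` has no inherent strong direction (sketch only)."""
    return unicodedata.bidirectional(ch) not in STRONG_CLASSES


# Latin 'a', Hebrew alef, comma, space and a digit.
for ch in ["a", "\u05d0", ",", " ", "7"]:
    print(repr(ch), unicodedata.bidirectional(ch), is_ambiguous(ch))
```

Letters come out with a strong class, while the comma, the space and the digit do not, which is why their placement has to be inferred from context or stated explicitly.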
