Hi

On Sun, 11 Feb 2001 [EMAIL PROTECTED] wrote:

> As far as I know, the only *standard* which defines how logical Hebrew
> text must be displayed is Unicode, and more particularly its Technical
> Report 9. The Israeli bureau of standards (SII == Standard Institution of
> Israel) has also adopted the Unicode algorithm as its official algorithm.
>
> Although Microsoft purports that Windows implements the Unicode algorithm,
> they allow themselves some freedom, varying from one Windows version to
> another. NT4 and Win2k are the most conformant, but the handling of makaf
> is not Unicode-conformant even in Win2k.
>
> According to Unicode, hyphen-minus is a "European Number Terminator",
> which means that if it appears within a number, or adjacent to a number on
> either side, it must be handled like a digit, in fact extending the
> number. So if we have a sequence like minus one two three, the minus
> *must* appear on the left of 123, according to the Unicode standard. When
> hyphen-minus is not part of a number, it must be treated like a Neutral
> (never as a LTR, as mentioned in a previous posting).
>
> As said above, MS is still not fully Unicode conformant, but they claim
> that they want to and will be (from the mouth of the National Language
> person at MS Israel). IMHO, it does not seem wise to spec Linux to
> imitate some Microsoft bug/feature and ignore standards, both national and
> international. Also consider that different versions of Windows behave
> differently, so it is not possible to be compatible with all of them
> anyway. While we still are at the beginning of the Hebrew road in Linux,
> let us do the Right Thing, and stand by the standards.
>
> By the way, one proper solution to keep the makaf on the right side of the
> number in the example given is to add a RLM (Right-to-Left Mark) between
> the makaf and the number.

This is only useful if all of the texts in question are created somewhat
under the control of that library. However, it certainly won't work for
text you receive from the outside (e.g. a web page or a mail message).

Also: there are now a couple of different implementations of input widgets
that support bidi. Do any of them try to add such RLM characters?
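
For reference, here is a rough sketch of what a widget or library would
have to do to apply the RLM workaround to text it creates itself. It is
only an illustration under my own assumptions: the function name
add_rlm_after_hyphen is made up, it only looks at a hyphen-minus that sits
directly between a strong RTL character and a European digit, and it
ignores points (nikud) and other marks that may follow the letter. The
bidi classes come from Python's unicodedata module, so they reflect
whatever Unicode version that Python ships with.

```python
import unicodedata

RLM = "\u200f"  # U+200F RIGHT-TO-LEFT MARK


def add_rlm_after_hyphen(text: str) -> str:
    """Insert an RLM between a hyphen-minus and the digit that follows it,
    but only when the hyphen directly follows a strong RTL character.

    Hypothetical helper for illustration; not taken from any existing
    bidi library or widget.
    """
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if (
            ch == "-"
            and 0 < i < len(text) - 1
            # Character before the hyphen is strong right-to-left (e.g. Hebrew).
            and unicodedata.bidirectional(text[i - 1]) == "R"
            # Character after the hyphen is a European digit.
            and unicodedata.bidirectional(text[i + 1]) == "EN"
        ):
            # The RLM separates the hyphen from the number, so the bidi
            # algorithm no longer sees the hyphen as adjacent to it.
            out.append(RLM)
    return "".join(out)


if __name__ == "__main__":
    sample = "\u05d0-123"  # logical order: alef, hyphen-minus, 1, 2, 3
    fixed = add_rlm_after_hyphen(sample)
    print([f"U+{ord(c):04X}" for c in fixed])
```

Note that this does nothing for a plain minus in front of a number in
Latin text, and of course it does not help at all with text that arrives
from elsewhere.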


I'm not satisfied with this behaviour. Can anybody think of a better
workaround? A space there is also not a good idea, because the hyphen here
is supposed to connect two words into one word.


IMHO (and I mean the H here!) setting the hyphen/minus as a Neutral
character would have meant that the case of a minus near an RTL sequence
would have been misinterpreted (but this seems like a less common case),
whereas with the current settings, a hyphen near Hebrew text is
misinterpreted, which is the more common case.


However, I left the above text about standard conformance quoted, because
I wish to reiterate it. There is no point in adding further
incompatibilities.

-- 
Tzafrir Cohen
mailto:[EMAIL PROTECTED]
http://www.technion.ac.il/~tzafrir

