Dear Rimas,

On Sun, Apr 3, 2016 at 11:55 PM, Rimas Kudelis <r...@akl.lt> wrote:
>
> With all the data you shared, I'm even more certain that this belongs to
> the locale data, much like quotation characters and number formatting
> characters. I'm not sure if this locale property is readily available
> for inclusion in locale data though.

A while ago, I created the LO locale XML files for Church Slavonic
("Church Slavic"). Please see here:
https://gerrit.libreoffice.org/#/c/15540/

There was nothing there about hyphenation characters, AFAICT.

> It might be that Slavonic is a very
> rare exception to the common rule of using hyphens for that, and that
> this hasn't been accounted for anywhere.

I don't think it's that uncommon. Unicode includes a number of
script-specific hyphenation characters, for example U+058A Armenian
Hyphen, U+1400 Canadian Syllabics Hyphen, etc. How are users supposed
to use those? Also, some Indic scripts, IIUC, do not use a hyphen
character at all; they just split a word across lines. What if users
are using a legacy codepage where the hyphen is encoded somewhere
other than U+002D? (BTW, strictly speaking, U+002D is *not* a hyphen,
and LO should really be using U+2010 for hyphenation.) Or what if the
user wants to set some decorative character to be a hyphen?
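
To make the U+002D vs. U+2010 point concrete, here is a small sketch that
asks the Unicode character database (via Python's standard unicodedata
module) for the names of the code points mentioned above. The helper name
describe() is just something made up for this illustration; it has
nothing to do with LO's code.

```python
import unicodedata

# Code points mentioned above: the ASCII hyphen-minus, the "real" hyphen,
# and two script-specific hyphens.
HYPHENS = (0x002D, 0x2010, 0x058A, 0x1400)

def describe(cp):
    """Return 'U+XXXX NAME (General_Category)' for one code point."""
    ch = chr(cp)
    return f"U+{cp:04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

for cp in HYPHENS:
    print(describe(cp))
```

Note that U+002D is named HYPHEN-MINUS, while U+2010 is the character
actually named HYPHEN.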

IMHO, a hyphenation character should be settable from the user
interface, for example, together with the "Characters at line end" and
"Characters at line begin" in Format -> Paragraph -> Text Flow. It
should not involve having to hack an XML file and rebuild LO from
source.

BTW, despite setting LEFTHYPHMIN and RIGHTHYPHMIN in the hyphenation
dictionary, "Characters at line end" and "Characters at line begin"
cannot be set lower than 2. But Church Slavic uses LEFTHYPHMIN = 1
(Ancient Greek uses both LEFTHYPHMIN and RIGHTHYPHMIN = 1). Is this a
bug? Or a feature?
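
To illustrate why the floor of 2 matters, here is a rough sketch of how
left/right minimums filter candidate break points. This is only my
assumption of the general idea, not LO's or the hyphenation library's
actual code: allowed_breaks() is a hypothetical helper, the word and
candidate offsets are placeholders, and it counts code points rather
than what a user would perceive as letters.

```python
def allowed_breaks(word, candidates, left_min, right_min):
    """Keep break offsets that leave at least left_min characters before
    the break and at least right_min characters after it."""
    return [
        i for i in candidates
        if i >= left_min and len(word) - i >= right_min
    ]

word = "abcdef"                 # placeholder word, not real Church Slavic
candidates = [1, 2, 3, 4, 5]    # pretend the dictionary allows every break

print(allowed_breaks(word, candidates, 2, 2))  # UI floor of 2 on both sides
print(allowed_breaks(word, candidates, 1, 1))  # minimums of 1 on both sides
```

With minimums of 2 the breaks at offsets 1 and 5 are dropped; with
minimums of 1 they are kept. Those are exactly the breaks that become
unreachable when the UI refuses values below 2.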

> anything about this neither in the LDML standard, nor in our DTD for
> locale definition files
> (https://cgit.freedesktop.org/libreoffice/core/plain/i18npool/source/localedata/data/locale.dtd).
>

So, I guess as a first step, LO should support changing the hyphen
character in the XML locale files. Or, is this really a Hunspell
issue, and it should be specified from the hyphenation dictionary
extension? Could someone confirm this?

Cordially,

Aleksandr

