Re: Prefer luatex for documentation

2022-11-29 Thread Werner LEMBERG
>> What concerns me most is the non-portability of locale names, for
>> example 'fr_FR', 'fr.UTF-8', 'fr_FR.utf8', 'fr_FR.UTF-8', or
>> 'French_France.65001'.
>
> I thought locales were already standardised ...
>
> languagecode underscore countrycode dot characterset

This is the theory, yes.
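To make the "language underscore country dot charset" scheme concrete, here is a minimal Python sketch that splits a locale name into those three parts. The helper name `split_locale` and the codeset normalization (lowercase, hyphens dropped) are assumptions made for illustration and are not taken from the thread; POSIX modifiers such as `@euro` are deliberately not handled.

```python
import re

# Hypothetical pattern for the "language_TERRITORY.codeset" scheme.
# Territory and codeset are both optional.
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z]+))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9-]+))?$"
)


def split_locale(name):
    """Return (language, territory, codeset), or None if the name does not fit."""
    m = _LOCALE_RE.match(name)
    if m is None:
        return None
    codeset = m.group("codeset")
    if codeset is not None:
        # Assumed normalization so that 'UTF-8' and 'utf8' compare equal.
        codeset = codeset.replace("-", "").lower()
    return (m.group("language"), m.group("territory"), codeset)


for name in ["fr_FR", "fr.UTF-8", "fr_FR.utf8", "fr_FR.UTF-8", "French_France.65001"]:
    print(name, "->", split_locale(name))
```

All five names from the quoted message fit the pattern, yet 'French_France.65001' yields the language 'French' and the codeset '65001' rather than 'fr' and 'utf8', so a purely syntactic split does not make the names interchangeable.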

Re: Thoughts on GLib regexes

2022-11-29 Thread Werner LEMBERG
> As shown by https://gitlab.com/lilypond/lilypond/-/issues/6463, > Guile regular expressions are a trap when it comes to Unicode. > Under a non-Unicode locale, characters that can't be expressed in > the locale encoding get converted to "?", both in the pattern and > the search string, before in

Re: Prefer luatex for documentation

2022-11-29 Thread Werner LEMBERG
> [...] locales are not small and easy but quite heavy instead – the > `/usr/lib/locale` directory on my openSUSE GNU/Linux box provides > 494 locales and has a whopping size of 227MByte, mainly for > collation and character type information. Normally, you won't get a > single locale as a separat

Thoughts on GLib regexes

2022-11-29 Thread Jean Abou Samra
Hi, As shown by https://gitlab.com/lilypond/lilypond/-/issues/6463, Guile regular expressions are a trap when it comes to Unicode. Under a non-Unicode locale, characters that can't be expressed in the locale encoding get converted to "?", both in the pattern and the search string, before invoking
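The lossy conversion described in this message can be imitated in Python. This is only an analogy, not Guile's actual code path: the helper names `to_locale_bytes` and `lossy_contains` are hypothetical, and ASCII stands in for a non-Unicode locale encoding.

```python
def to_locale_bytes(text, encoding="ascii"):
    """Encode text, replacing every unencodable character with b'?'."""
    return text.encode(encoding, errors="replace")


def lossy_contains(pattern, subject, encoding="ascii"):
    """Literal substring search performed after the lossy conversion."""
    return to_locale_bytes(pattern, encoding) in to_locale_bytes(subject, encoding)


# Two different characters collapse to the same byte, so they "match".
print(to_locale_bytes("é"), to_locale_bytes("ü"))
print(lossy_contains("é", "ü"))
```

Because both the pattern and the search string lose information in the same way, unrelated characters compare equal after conversion, which is the kind of false match the message warns about.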

Re: Prefer luatex for documentation

2022-11-29 Thread Wol
On 29/11/2022 18:00, Werner LEMBERG wrote: I'd have thought au jour d'ui 227MB qualifies as small, no? It's averaging at 500kB/package which is bigger than I thought (I'd have thought more like 50k tops, tbh) but it seems like it'd be a relatively manageable size for a one-off setup... For me

Re: Prefer luatex for documentation

2022-11-29 Thread Werner LEMBERG
> Is it sustainable to build a 2D dictionary keyed by platform and a > name of our choosing for the locale to use as a map and fill it > gradually as we discover new holes? Well, I will eventually build on the wisdom of other guys, in particular the people from the 'gnulib' library, who deal wit

Re: Prefer luatex for documentation

2022-11-29 Thread Luca Fascione
That last statement is terrifying. Is it sustainable to build a 2D dictionary keyed by platform and a name of our choosing for the locale to use as a map and fill it gradually as we discover new holes? Or would it be too much hassle? I'm thinking new platforms or new translations are events that
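A minimal Python sketch of the two-level map proposed here, keyed first by platform and then by a project-chosen name. The platform keys, the logical name "french", and the helper `lookup_locale` are hypothetical; the locale strings are the ones quoted earlier in the thread, and assigning them to these platforms is an assumption.

```python
# Hypothetical two-level map: platform -> project-chosen name -> system locale name.
# Entries would be added gradually as new holes are discovered.
LOCALE_MAP = {
    "linux": {"french": "fr_FR.UTF-8"},
    "windows": {"french": "French_France.65001"},
}


def lookup_locale(platform, name):
    """Return the system locale name, or None if the combination is unknown."""
    return LOCALE_MAP.get(platform, {}).get(name)


print(lookup_locale("linux", "french"))
print(lookup_locale("windows", "french"))
print(lookup_locale("haiku", "french"))  # unknown platform: a hole still to fill
```

Returning `None` for an unknown combination keeps the holes visible, so a caller can report them instead of silently falling back to some default locale.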

Re: Prefer luatex for documentation

2022-11-29 Thread Werner LEMBERG
> I'd have thought au jour d'ui 227MB qualifies as small, no? It's > averaging at 500kB/package which is bigger than I thought (I'd have > thought more like 50k tops, tbh) but it seems like it'd be a > relatively manageable size for a one-off setup... For me, it's a big package. However, I agre

Re: Prefer luatex for documentation

2022-11-29 Thread Luca Fascione
I'd have thought au jour d'ui 227MB qualifies as small, no? It's averaging at 500kB/package which is bigger than I thought (I'd have thought more like 50k tops, tbh) but it seems like it'd be a relatively manageable size for a one-off setup... A compiler or browser release or a couple weeks worth

Re: Prefer luatex for documentation

2022-11-29 Thread Werner LEMBERG
> Question: I would have thought locales would be a) largely present, > b) small and easy to install as dependencies, like many other > dependencies we have (and substantially less prone to change than > any software dependency) > > Where does the concern with locales not being available on a sys

Re: Prefer luatex for documentation

2022-11-29 Thread Luca Fascione
Question: I would have thought locales would be a) largely present, b) small and easy to install as dependencies, like many other dependencies we have (and substantially less prone to change than any software dependency) Where does the concern with locales not being available on a system come from

Re: Prefer luatex for documentation

2022-11-29 Thread Werner LEMBERG
> Indeed, I didn't find a Perl library that would support locales > without them being present on the system. However, all of > > https://perldoc.perl.org/Unicode::Collate (Perl) > https://github.com/jtauber/pyuca (Python) > https://docs.rs/unicode-collation/0.0.1/unicode_collation/ (Rust) > >

Re: Prefer luatex for documentation

2022-11-29 Thread Jean Abou Samra
On 29/11/2022 at 07:29, Werner LEMBERG wrote: In the long run, however, it would be good to have correctly sorted indices in the non-English manuals. AFAIK, and assuming that `texindex` is eventually capable of doing that, we will need UTF-8 locales for that, because virtually all programs, ir