Hi all!

If you ever need to work with language/locale identifiers, we now have a
clean API to use in all of Rust, C++ and JavaScript.

The most common scenario in which you may encounter the need is when you
want to verify that, say, the Firefox UI locale is English, or that the
regional variant is "US".

Historically you'd write this code as:

```js
const locale = Services.locale.appLocaleAsBCP47;
if (locale.substr(0, 2) == "en") {
  // it's English!
}
```

Unicode Language Identifiers are a bit more complicated than that, and such
DIY code is likely to break in all sorts of creative ways.
You can read the complete spec [0] and implement some of that parsing
yourself, but there's now an easier way: the Locale API!
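
As a minimal sketch of one such breakage: the tag "ast-ES" (Asturian as
used in Spain) is only an illustrative input picked for this example, and
the parsing side uses the standard `Intl.Locale` constructor rather than
any Gecko service.

```js
// Illustrative input: Asturian has a three-letter language subtag.
const tag = "ast-ES";

// The DIY prefix check yields "as", which is the subtag for Assamese.
const diyLanguage = tag.substr(0, 2);

// Parsing the identifier yields the whole language subtag, "ast".
const parsedLanguage = new Intl.Locale(tag).language;

console.log(diyLanguage, parsedLanguage);
```

The prefix check silently treats one language as another, while the parsed
value can be compared safely.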

== How to use ==

The API supports parsing, canonicalizing, validating, serializing,
modifying etc, but here I'm just going to show how testing for a subtag
works:

1) JavaScript

André Bargull implemented the `Intl.Locale` API, part of the upcoming
ECMAScript 2020 edition! This API allows you to parse the language
identifier string and operate on it semantically:

```js
const locale = new Intl.Locale(Services.locale.appLocaleAsBCP47);
if (locale.language == "en") {
  // it's English!
}
```

You can find full docs here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale
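
The other operations mentioned above (canonicalizing, serializing,
modifying, validating) could look roughly like this in JavaScript. This is
a sketch using only the standard `Intl.Locale` constructor; the input
strings are made up for illustration and `isWellFormed` is a hypothetical
helper, not an existing API.

```js
// Parsing canonicalizes the casing of each subtag.
const parsed = new Intl.Locale("EN-latn-us");

// Serializing gives back the canonical string form.
const serialized = parsed.toString();

// Locale objects are immutable, so "modifying" means building a new
// object with the overrides passed as options.
const british = new Intl.Locale(parsed, { region: "GB" });

// Hypothetical helper: the constructor throws a RangeError for input
// that is not a structurally valid identifier.
function isWellFormed(tag) {
  try {
    new Intl.Locale(tag);
    return true;
  } catch (e) {
    if (e instanceof RangeError) {
      return false;
    }
    throw e;
  }
}

console.log(serialized, british.toString(), isWellFormed("not a locale"));
```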

2) C++

```cpp
nsAutoCString locale;
LocaleService::GetInstance()->GetAppLocaleAsBCP47(locale);
Locale loc = Locale(locale);
if (loc.GetLanguage().Equals("en")) {
  // it's English!
}
```

You can find full docs here:
https://searchfox.org/mozilla-central/source/intl/locale/MozLocale.h

3) Rust

```rust
use std::convert::TryFrom;
use unic_langid::LanguageIdentifier;

// No access to LocaleService yet!

let locale = LanguageIdentifier::try_from("en-US").unwrap_or_default();
if locale.language() == "en" {
  // it's English!
}
```

You can find full docs here: https://docs.rs/unic-langid/0.8.0/unic_langid/

== Why it's important ==

In order to improve our handling of Firefox localization, multilingualism
and internationalization, we need to be able to operate on the whole range
of possible language and locale identifiers.

With an API for operating on them centralized in a single place, we can
extend our features, providing improved support for our users and their
regional, linguistic and cultural preferences encoded as part of the
identifier (for example: "en-DE-u-hc-h12-fw-sun" is English in Germany with
a 12-hour clock and the first day of the week set to Sunday).
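
To make that example concrete, here is a small sketch of reading those
preferences back with the standard `Intl.Locale` API. It only uses
accessors that are part of the ECMAScript 2020 API; the `fw` keyword has no
dedicated accessor there, so the sketch just checks that it survives
serialization.

```js
const locale = new Intl.Locale("en-DE-u-hc-h12-fw-sun");

const language = locale.language;   // "en"
const region = locale.region;       // "DE"
const hourCycle = locale.hourCycle; // "h12"
const baseName = locale.baseName;   // "en-DE", the identifier without extensions

// The first-day-of-week keyword is kept in the serialized identifier.
const keepsFirstDay = locale.toString().includes("fw-sun");

console.log(language, region, hourCycle, baseName, keepsFirstDay);
```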

Before we can enable such customizations, we need to clean up our codebase
to make sure that all code that works with those identifiers will not break.

It may sound theoretical, but there are real cases where major software
(Android) had to implement terrible dirty hacks in their code to support
pseudolocalization and language fallbacking because the codebase was
sprinkled with DIY language identifier parsing and just couldn't handle the
standard.

Fortunately, Gecko is already much cleaner than that, and we only have a
small number of places which we have to switch to Locale API, but that code
is hard to find or lint for, and we'd like to make sure that all new code
uses the correct logic.

== Language Negotiation ==

Tangential to that is the topic of language negotiation. If you ever need
to match one language identifier to another, please use `Locale::Matches`.
If you need to negotiate, use `LocaleService::NegotiateLanguages`.
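
To show why matching should go through parsed subtags rather than string
comparison, here is a simplified sketch in plain JavaScript. Both
`subtagMatches` and `localesMatch` are hypothetical helpers written for
this example, and the rule they use (a missing script or region matches
anything) is an assumption. It does not reproduce what `Locale::Matches`
or `LocaleService::NegotiateLanguages` do, so in Gecko code please use
those APIs.

```js
// Hypothetical helper: a subtag missing on either side matches anything.
function subtagMatches(a, b) {
  return a === undefined || b === undefined || a === b;
}

// Hypothetical helper: compare two identifiers subtag by subtag.
function localesMatch(tagA, tagB) {
  const a = new Intl.Locale(tagA);
  const b = new Intl.Locale(tagB);
  return (
    a.language === b.language &&
    subtagMatches(a.script, b.script) &&
    subtagMatches(a.region, b.region)
  );
}

console.log(localesMatch("en", "en-US"), localesMatch("en-GB", "en-US"));
```

Under this assumed rule, "en" matches "en-US" and "en-Latn-US" matches
"en-US", while "en-GB" does not match "en-US".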

== Next steps ==

Over the next few months we'll be looking to migrate the remaining cases
around our codebase [1]. Then we'll want to turn on Unicode Extensions in
the MozLocale class.

We'll also want to switch LocaleService API to return instances of `Locale`
in C++ and JS.

If you have any questions or feedback,
please let me know!

Thank you,
zb.

[0] https://unicode.org/reports/tr35/#Identifiers
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1433329
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
