Hi all! If you ever need to work with language/locale identifiers, we now have a clean API to use for all of Rust, C++ and JavaScript.

The most common scenario in which you may encounter the need is when you want to verify that, say, the Firefox UI locale is English, or that the regional variant is "US". Historically you'd write this code as:

```js
const locale = Services.locale.appLocaleAsBCP47;
if (locale.substr(0, 2) == "en") {
  // it's English!
}
```

Unicode Language Identifiers are a bit more complicated, and such DIY code is likely to break in all sorts of creative ways. You can read the complete spec [0] and implement some of that parsing yourself, but there's now an easier way - Locale API!

== How to use ==

The API supports parsing, canonicalizing, validating, serializing, modifying, etc., but here I'm just going to show how testing for a subtag works:

1) JavaScript

André Bargull implemented the `Intl.Locale` API, part of the upcoming ECMAScript 2020 edition! This API allows you to parse the language identifier string and operate on it semantically:

```js
const locale = new Intl.Locale(Services.locale.appLocaleAsBCP47);
if (locale.language == "en") {
  // it's English!
}
```

You can find full docs here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Locale

2) C++

```cpp
nsAutoCString locale;
LocaleService::GetInstance()->GetAppLocaleAsBCP47(locale);

Locale loc(locale);
if (loc.GetLanguage().Equals("en")) {
  // it's English!
}
```

You can find full docs here: https://searchfox.org/mozilla-central/source/intl/locale/MozLocale.h

3) Rust

```rust
use std::convert::TryFrom;
use unic_langid::LanguageIdentifier;

// No access to LocaleService yet!
let locale = LanguageIdentifier::try_from("en-US").unwrap_or_default();
if locale.language() == "en" {
    // it's English!
}
```

You can find full docs here: https://docs.rs/unic-langid/0.8.0/unic_langid/

== Why it's important ==

In order to improve our handling of Firefox localization, multilingualism and internationalization, we need to be able to operate on the whole range of possible language and locale identifiers.
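
To give a feel for that range, here is a minimal sketch that compares the DIY prefix check with `Intl.Locale` on a few identifiers. The helper names (`naiveIsEnglish`, `isEnglish`, `describeLocale`) are made up for this illustration and are not part of any Gecko API; the sketch only relies on the standard `Intl.Locale` constructor and its `language`, `script`, `region` and `hourCycle` properties, so it should also run outside of Gecko.

```javascript
// Hypothetical helper: the DIY approach, a plain string prefix check.
function naiveIsEnglish(tag) {
  return tag.substr(0, 2) == "en";
}

// Hypothetical helper: the same question asked through Intl.Locale.
function isEnglish(tag) {
  return new Intl.Locale(tag).language == "en";
}

// Hypothetical helper: pull out the subtags we usually care about.
function describeLocale(tag) {
  const loc = new Intl.Locale(tag);
  return {
    language: loc.language,
    script: loc.script,       // undefined when the tag has no script subtag
    region: loc.region,       // undefined when the tag has no region subtag
    hourCycle: loc.hourCycle, // undefined when there is no "hc" extension
  };
}

const samples = [
  "EN-us",          // non-canonical casing: the prefix check misses it
  "enm-GB",         // three-letter language subtag that merely starts with "en"
  "sr-Cyrl-RS",     // carries a script subtag
  "en-DE-u-hc-h12", // carries a Unicode extension
];

for (const tag of samples) {
  console.log(tag, naiveIsEnglish(tag), isEnglish(tag), describeLocale(tag));
}
```

The two checks disagree on the first two samples: the prefix check rejects "EN-us" because of its casing and accepts "enm-GB" even though its language subtag is "enm", while the parsed version gets both right.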

With an API for operating on them centralized in a single place, we can extend our features, providing improved support for our users and their regional, linguistic and cultural preferences encoded as part of the identifier (for example: "en-DE-u-hc-h12-fw-sun" is English in Germany with a 12h clock and the first day of the week set to Sunday).

Before we can enable such customizations, we need to clean up our codebase to make sure that all code that works with those identifiers will not break.

It may sound theoretical, but there are real cases where major software (Android) had to implement terrible dirty hacks in their code to support pseudolocalization and language fallback because the codebase was sprinkled with DIY language identifier parsing and just couldn't handle the standard.

Fortunately, Gecko is already much cleaner than that, and we only have a small number of places which we have to switch to the Locale API, but that code is hard to find or lint for, and we'd like to make sure that all new code uses the correct logic.

== Language Negotiation ==

Tangential to that is the topic of language negotiation. If you ever need to match one language identifier to another, please use `Locale::Matches`. If you need to negotiate, use `LocaleService::NegotiateLanguages`.

== Next steps ==

Over the next months we'll be looking to migrate the remaining cases around our codebase [1]. Then we'll want to turn on Unicode Extensions in the MozLocale class. We'll also want to switch the LocaleService API to return instances of `Locale` in C++ and JS.

If you have any questions or feedback, please let me know!

Thank you,
zb.

[0] https://unicode.org/reports/tr35/#Identifiers
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1433329
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform