This PR corrects the Locale parsing logic for extended language subtags
(extlangs). The BCP 47 syntax requires that extlangs only follow `2*3ALPHA`
langs. This is also
reinforced by the syntax comment described in `LanguageTag.parse` (which is
based off the BNF). However, the current implementation does not respect this,
and allows extlangs to follow `4ALPHA` (future use) as well as `5*8ALPHA` langs.
For example, `Locale.forLanguageTag("quux-bar").toLanguageTag()` currently
returns the extlang "bar", when it should return the lang "quux" and discard
"bar".
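
To make the rule concrete, here is a minimal standalone sketch of the grammar constraint. It is not the JDK's `LanguageTag` code; the class `ExtlangSketch` and its methods are hypothetical names used only for illustration, and the sketch assumes a simple `-`-separated tag and ignores every subtag kind other than lang and extlang.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the JDK implementation.
final class ExtlangSketch {
    // True if s is non-empty and consists only of ASCII letters.
    static boolean isAlpha(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))) {
                return false;
            }
        }
        return !s.isEmpty();
    }

    // Primary language subtag: 2*3ALPHA, 4ALPHA or 5*8ALPHA, i.e. 2 to 8 letters.
    static boolean isLanguage(String s) {
        return s.length() >= 2 && s.length() <= 8 && isAlpha(s);
    }

    // Extlang subtag: exactly 3ALPHA.
    static boolean isExtlang(String s) {
        return s.length() == 3 && isAlpha(s);
    }

    // Only the 2*3ALPHA form of the language subtag may be followed by extlangs.
    static boolean mayTakeExtlang(String language) {
        return language.length() <= 3 && isLanguage(language);
    }

    // Collects the extlangs (at most three) that directly follow the language subtag.
    static List<String> extlangsOf(String tag) {
        String[] subtags = tag.split("-");
        List<String> extlangs = new ArrayList<>();
        if (!mayTakeExtlang(subtags[0])) {
            return extlangs; // 4ALPHA and 5*8ALPHA langs take no extlangs
        }
        for (int i = 1; i < subtags.length && extlangs.size() < 3
                && isExtlang(subtags[i]); i++) {
            extlangs.add(subtags[i]);
        }
        return extlangs;
    }

    public static void main(String[] args) {
        System.out.println(extlangsOf("zh-yue"));   // 2-letter lang, extlang accepted
        System.out.println(extlangsOf("quux-bar")); // 4-letter lang, "bar" is not an extlang
    }
}
```

Under these assumptions, `extlangsOf("zh-yue")` yields `[yue]`, while `extlangsOf("quux-bar")` yields an empty list because "quux" is a `4ALPHA` lang.
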
This is likely an oversight and should be fixed rather than kept and specified
as a BCP deviation, since it is non-standard for extlangs to follow the longer
language subtags mentioned above.
I can file a release note if deemed warranted, since the set of acceptable
inputs shrinks as a result (even though the new behavior is correct).
Personally, I would lean towards not filing one, since such inputs would be
non-standard: there are no extlangs that follow a language prefix that is not
2 or 3 letters long.
---------
- [x] I confirm that I make this contribution in accordance with the [OpenJDK
Interim AI Policy](https://openjdk.org/legal/ai).
-------------
Commit messages:
- init
Changes: https://git.openjdk.org/jdk/pull/31663/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=31663&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8387253
Stats: 21 lines in 2 files changed: 17 ins; 0 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/31663.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/31663/head:pull/31663
PR: https://git.openjdk.org/jdk/pull/31663