@tomhughes commented on this pull request.


> +
+  {
+    :code => code,
+    :native_name => native_name
+  }
+end
+
+AVAILABLE_LANGUAGES.sort_by! do |entry|
+  # https://stackoverflow.com/a/74029319
+  diactrics = [*0x1DC0..0x1DFF, *0x0300..0x036F, *0xFE20..0xFE2F].pack("U*")
+  entry[:native_name]
+    .downcase
+    .unicode_normalize(:nfd)
+    .tr(diactrics, "")
+    .unicode_normalize(:nfc)
+end

A more correct way to do this is to use a gem like `ffi-icu` or 
`twitter_cldr`, which can do proper Unicode collation. With `ffi-icu`, for 
example, you would do something like:

```ruby
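require "ffi-icu" # assumes the ffi-icu gem and the ICU libraries are installed

collator = ICU::Collation::Collator.new("root")
collator.strength = :primary # ignore accent and case differences
AVAILABLE_LANGUAGES.sort! do |a, b|
  # entries are hashes, so compare their native names
  collator.compare(a[:native_name], b[:native_name])
end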
```

Reducing the strength from the default of `:tertiary` to `:primary` makes the 
sort case- and accent-insensitive. Even at tertiary strength, though, it still 
sorts by the base character first and only uses accents (the secondary level) 
and then case (the tertiary level) to break ties, so the choice probably makes 
little practical difference for the data set here.
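
To illustrate what the levels mean, here is a rough pure-Ruby sketch that 
imitates them with array sort keys. It is not ICU and not the full Unicode 
Collation Algorithm; the helper names (`base_letters`, `primary_key`, 
`tiered_key`) and the sample words are made up for the illustration:

```ruby
# Rough emulation of collation levels using only core Ruby.
# Hypothetical helpers for illustration; real collation should come from ICU/CLDR.

# Level 1: base letters only (decompose, drop nonspacing marks, fold case).
def base_letters(str)
  str.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# Comparable to strength :primary, where accents and case are ignored entirely.
def primary_key(str)
  base_letters(str)
end

# Comparable in spirit to strength :tertiary: base letters decide first,
# accents only break ties, and case only breaks the remaining ties.
def tiered_key(str)
  decomposed = str.unicode_normalize(:nfd)
  [
    base_letters(str),    # primary: base letters
    decomposed.downcase,  # secondary: accents kept, case folded
    decomposed.swapcase   # tertiary: crude trick to put lowercase first
  ]
end

names = ["Español", "eesti", "Čeština", "dansk", "Cymraeg"]
p names.sort                            # plain codepoint order
p names.sort_by { |n| primary_key(n) }  # base-letter order

words = ["Résumé", "resume", "résumé", "Resume"]
p words.sort_by { |w| tiered_key(w) }   # accents, then case, only break ties
```

With a plain `sort`, "Čeština" ends up last because `Č` has a higher codepoint 
than any ASCII letter, and the capitalised names sort ahead of the lowercase 
ones. Both keyed sorts place the sample names in the same base-letter order, 
which is the sense in which the strength setting matters little here.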

-- 
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/openstreetmap-website/pull/6024#pullrequestreview-2848905847
_______________________________________________
rails-dev mailing list
rails-dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/rails-dev
