When comparing strings in C++, the default behavior is to order by UTF-8 code points, which affects comparisons such as a < b < c [1][2]. This ordering is not appropriate in all cases. As with the sort function [3], it may be helpful to have an optional field for comparison keys. An example in C++ is at the end of this message. Are there any suggestions for, or objections to, adding an optional field with comparison keys?

[1] https://issues.apache.org/jira/browse/ARROW-9843
[2] https://issues.apache.org/jira/browse/ARROW-14290
[3] https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/vector_sort.cc

/* Follows
* http://www.localizingjapan.com/blog/2011/02/13/sorting-in-japanese-%E2%80%94-an-unsolved-problem/
* https://stackoverflow.com/questions/2803071/c-sort-array-of-strings
*/

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

int main()
{
  // Mixed kanji, katakana and Latin-script labels.
  std::string settings[19] = {
      "システム", "画面", "Windows ファイウォール", "インターネット オプション",
      "キーボード", "メール", "音声認識", "管理ツール", "自動更新", "日付と時刻",
      "タスク", "プログラムの追加と削除", "フォント", "電源オプション", "マウス",
      "地域と言語オプション", "電話とモデムのオプション", "Java", "NVIDIA"};
  // The same two names written in Latin script, katakana, hiragana and kanji.
  std::string names[8] = {"Ayumi", "アユミ", "あゆみ", "歩美",
                          "Tanaka", "タナカ", "たなか", "田中"};

  // std::sort uses operator< on std::string, i.e. byte-wise comparison.
  std::sort(std::begin(settings), std::end(settings));
  std::cout << "Settings" << std::endl;
  for (const auto& word : settings) {
    std::cout << word << std::endl;
  }

  std::sort(std::begin(names), std::end(names));
  std::cout << "Names" << std::endl;
  for (const auto& name : names) {
    std::cout << name << std::endl;
  }
  return 0;
}
