>> You can have these queries return both rows if you use an accent-ignoring collation, like this example in the documentation:
CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-true', deterministic = false); << Indeed. Is the dependency between the character expansion capability and accent-insensitive collations documented anywhere? Another unexpected dependency appears to be @colCaseFirst=upper. If specified in combination with colStrength=secondary, it appears that the upper/lower case ordering is random within a group of characters that are secondary equal, e.g. 'A' < 'a', but 'b' < 'B', 'c' < 'C', ... , but then 'L' < 'l'. It is not even consistently ordered with respect to case. If I make it a nondeterministic CS_AI collation, then it sorts upper before lower consistently. The rule seems to be that you can't sort by case within a group that is case-insensitive. Can a CI collation be ordered upper case first, or is this a limitation of ICU? For example, this is part of the sort order that I'd like to achieve with ICU, with the code point in column 1 and dense_rank() shown in the rightmost column indicating that 'b' = 'B', for example: 66 B B 138 151 98 b b 138 151 <- so within a group that is CI_AS equal, the sort order needs to be upper case first 67 C C 139 152 99 c c 139 152 199 Ç Ç 140 153 231 ç ç 140 153 68 D D 141 154 100 d d 141 154 208 Ð Ð 142 199 240 ð ð 142 199 69 E E 143 155 101 e e 143 155 Can this sort order be achieved with ICU? More generally, is there any interest in leveraging the full power of ICU tailoring rules to get whatever order someone may need, subject to the limitations of ICU itself? what would be required to extend CREATE COLLATION to accept an optional sequence of tailoring rules that we would store in the pg_collation catalog and apply along with the modifiers in the locale string? /Jim