On Mon, 2025-07-28 at 13:20 +0300, Alexander Korotkov wrote:
> I can confirm initcap works with libc and libicu as you stated.  The
> documentation patch looks good to me.  I’ve written a commit message.
> The REL_12_STABLE branch is not relevant anymore as it’s out of
> support.  I’m going to push this if no objections.

Apologies for the late review.

First, the documentation patch doesn't mention the "builtin" provider,
which uses the same word break rules as libc.

Second, word boundaries can be complex, and I'm wondering if we should
not be so precise about what ICU does or doesn't do. For instance, ICU
has options like U_TITLECASE_ADJUST_TO_CASED,
U_TITLECASE_NO_BREAK_ADJUSTMENT, etc.[1], and I'm not sure exactly
which one of those we use.

I'd prefer that we try to explain that INITCAP() is intended for
convenient display, and the specific result should not be relied upon
(at least for ICU; maybe for all providers). If you want specific word
boundary rules, write your own function.
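
To illustrate the "write your own function" option, here is a minimal
Python sketch that makes the word boundary rule explicit. It assumes
the simplest possible rule (a word is a run of alphanumeric characters,
and anything else is a separator). The name initcap_alnum is
hypothetical, and this is not a copy of what any provider actually
does.

```python
def initcap_alnum(text: str) -> str:
    """Capitalize the first character of each word and lowercase the rest.

    Assumed rule (for illustration only): a word is a maximal run of
    alphanumeric characters; every other character ends the word.
    """
    out = []
    in_word = False  # True while we are inside a run of alphanumerics
    for ch in text:
        if ch.isalnum():
            # First character of a word is uppercased, the rest lowercased.
            out.append(ch.lower() if in_word else ch.upper())
            in_word = True
        else:
            # Separators are copied unchanged and end the current word.
            out.append(ch)
            in_word = False
    return "".join(out)


if __name__ == "__main__":
    print(initcap_alnum("hello wORLD"))
    print(initcap_alnum("o'neil"))
```

Under this rule an apostrophe is a separator, so "o'neil" becomes
"O'Neil". A rule that keeps the apostrophe inside the word would give a
different result, which is the kind of difference a caller who cares
should control explicitly.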

Regards,
        Jeff Davis

[1]
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/stringoptions_8h.html#a4975f537b9960f0330b233061ef0608d
