Cases like these are exactly why we do not include an implementation in SWORD for uppercasing a string. I would guess that ICU support, the primary option in SWORD for providing this, does handle your scenario correctly, so I hope the answer is "yes," but I haven't checked.
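For illustration only, here is a minimal sketch of what a fallback limited to the C library's `toupper()` does with UTF-8 text. The helper name `byteWiseUpper` is hypothetical and is not SWORD code, and the sketch assumes the program runs in the default "C" locale (no `setlocale` call).

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, NOT part of SWORD: uppercases one byte at a time,
// the way a fallback built only on the C library's toupper() would.
// Assumption: the process is in the default "C" locale, where toupper()
// maps only the ASCII letters a-z and leaves every other byte unchanged.
std::string byteWiseUpper(std::string text) {
    for (char &c : text) {
        // Cast through unsigned char: passing a negative char to
        // toupper() is undefined behaviour.
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return text;
}
```

Under that assumption, the two bytes that encode "ö" in UTF-8 (0xC3 0xB6) pass through untouched, so "römer" comes back as "RöMER" rather than "RÖMER", and "i" comes back as "I", which is not the uppercase form Turkish expects. A locale-aware library such as ICU is needed to get either case right.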
It wouldn't be a bad idea to include a unit test for stringmgr, which we don't currently have.

Troy

On 1/31/21 10:26 AM, David Haslam wrote:
> Hi Troy,
>
> Does what you describe about case conversion deal correctly with those
> few letters where case rules are different in some locales?
>
> For example, Turkish and Azerbaijani have both lowercase and uppercase
> letters for both dotted iİ and dotless ıI.
>
> Uppercase(“i”) in Turkish is NOT “I” but “İ”.
>
> Asking for a friend. :)
>
> Best regards,
>
> David
>
> Sent from ProtonMail Mobile
>
>
> On Sun, Jan 31, 2021 at 17:11, Troy A. Griffitts <scr...@crosswire.org
> <mailto:scr...@crosswire.org>> wrote:
>>
>> Dear Tobias,
>>
>> My apologies for taking so long to respond to this, but I wanted to
>> give a thorough answer. See the summary at the end if you don't care
>> about the details.
>>
>> So, SWORD has a class StringMgr, which manages strings within SWORD,
>> and by default SWORD includes a very basic implementation, which
>> doesn't necessarily know about or support anything beyond what the
>> basic C string methods support.
>>
>> I am sure this evokes a sense of horror in you at first, so let me
>> explain a bit how we properly handle character sets. First, short
>> background: since we existed well before the Unicode world, we have
>> multiple locale files for each language, which you will still see in
>> the locales.d/ folder, each specifying its character encoding. Most
>> of the time SWORD doesn't need to manipulate characters, so
>> simply holding data, passing that data to a display frontend, and
>> specifying a font which will handle that encoding was enough in the
>> old world. IMPORTANT: the one place we do need to manipulate
>> character data is to perform case-insensitive comparisons. We did
>> this in the past by converting a string to uppercase before
>> comparison.
>> You'll notice this in the section for Bible book
>> abbreviations in each locale: the partial match key must be in a
>> toupper state.
>>
>> Today, everything in SWORD prefers Unicode, specifically encoded
>> as UTF-8. To support this:
>>
>> First, we have utility functions within SWORD for working with
>> Unicode-encoded strings, see:
>>
>> http://crosswire.org/svn/sword/trunk/include/utilstr.h
>>
>> Specifically:
>>
>> SWBuf assureValidUTF8(const char *buf);
>> SW_u32 getUniCharFromUTF8(const unsigned char **buf, bool skipValidation = false);
>> SWBuf *getUTF8FromUniChar(SW_u32 uchar, SWBuf *appendTo);
>> SWBuf utf8ToWChar(const char *buf);
>> SWBuf wcharToUTF8(const wchar_t *buf);
>>
>>
>> To wrap this up, by subclassing StringMgr, SWORD supports
>> implementing character encoding by linking to other libraries, e.g.,
>> ICU, Qt, etc., to handle full Unicode support. And while the
>> StringMgr interface allows implementation of many string functions,
>> upperUTF8 is the only real method the SWORD engine needs to work
>> completely. Some utilities use the other methods in there, but the
>> engine only needs this method.
>>
>>
>> In summary, on Android, you are likely not linking to ICU when you
>> build the native SWORD binary, which I don't do either for Bishop.
>> The Cordova SWORD plugin uses the SWORD java-jni bindings, which use
>> the Java VM to implement StringMgr:
>>
>> https://crosswire.org/svn/sword/trunk/bindings/java-jni/jni/swordstub.cpp
>> Search for: AndroidStringMgr
>>
>> And on iOS the Cordova plugin uses the Swift libraries to do the
>> same. This is done by using the SWORD flatapi call
>> org_crosswire_sword_StringMgr_setToUpper to provide a Swift
>> implementation to uppercase a string.
>>
>> http://crosswire.org/svn/sword/trunk/bindings/cordova/cordova-plugin-crosswire-sword/src/ios/SWORD.swift
>>
>> I hope this gives you the information you need to get things working
>> for you.
>> Please don't hesitate to ask if you need help,
>>
>> Troy
>>
>>
>> On 1/17/21 11:59 AM, Tobias Klein wrote:
>>> Dear Troy,
>>>
>>> I'm playing with an Android Build of Sword and I get issues with the
>>> German Umlauts.
>>>
>>> So I have issues with Bible book names like Römer, Könige, etc.
>>>
>>> The Umlauts are shown as ?.
>>>
>>> I'm configuring the SWORD build with CMake like below (without ICU!)
>>>
>>> I remember having similar issues on Linux when building without ICU.
>>>
>>> How do you build SWORD for Bishop? Any suggestions?
>>>
>>> Best regards,
>>> Tobias
>>>
>>> -- Check for working CXX compiler: /opt/Android/SDK/ndk/r21b/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++
>>> -- Check for working CXX compiler: /opt/Android/SDK/ndk/r21b/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ -- works
>>> -- Detecting CXX compiler ABI info
>>> -- Detecting CXX compiler ABI info - done
>>> -- Detecting CXX compile features
>>> -- Detecting CXX compile features - done
>>> -- Check for working C compiler: /opt/Android/SDK/ndk/r21b/toolchains/llvm/prebuilt/linux-x86_64/bin/clang
>>> -- Check for working C compiler: /opt/Android/SDK/ndk/r21b/toolchains/llvm/prebuilt/linux-x86_64/bin/clang -- works
>>> -- Detecting C compiler ABI info
>>> -- Detecting C compiler ABI info - done
>>> -- Detecting C compile features
>>> -- Detecting C compile features - done
>>> -- Configuring your system to build libsword.
>>> -- SWORD Version 1008900000
>
>
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page