On Sun, Oct 01, 2006 at 06:07:12PM +0200, Georg Baum wrote:

> On Sunday, 1 October 2006 16:45, Enrico Forestieri wrote:
> > On Sun, Oct 01, 2006 at 03:28:20PM +0200, Georg Baum wrote:
> >
> > > On Sunday, 1 October 2006 14:46, Enrico Forestieri wrote:
> > >
> > > > The problem here is that sizeof(wchar_t) is 2 on cygwin and uint32_t
> > > > cannot be used due to missing specializations in STLPort. I am not
> > > > able to fill in the missing bits in Georg's work (after all, I am a
> > > > sorcerer's apprentice :) ) but succeeded in hacking STLPort such that
> > > > uint32_t can be used in place of wchar_t.
> > >
> > > That does not work. You are now using a 4byte character type, but the
> > > internal stlport functions such as std::ctype::to_upper do still assume a
> > > 2byte encoding. Therefore you cannot store 4 bytes in uint32_t, and it
> > > would have been easier to just use a 2byte wchar_t instead of patching
> > > stlport.
> >
> > I don't understand this. I am compiling the library by replacing
> > wchar_t with uint32_t everywhere, so this would also be true on
> > a system where wchar_t is 4 bytes.
>
> No. You do not change the C library, for example (e.g. towupper). I am
> pretty sure that C library functions are used in STLPort. Besides that:
> Even if you replace the 2byte character type by a 4byte character type,
> and C library functions are not used, you do not change the algorithms, so
> you now use UCS2 (or whatever encoding is assumed for wchars on cygwin)
> stored in 4 bytes per character. That is equivalent to simply using the
> 2byte wchar_t in LyX, but requires more memory.

You are right, of course, but a wrapper function could be used.
In STLPort there is code like this:

    #if !defined (_STLP_NO_CSTD_FUNCTION_IMPORTS)
    ...
    using _STLP_VENDOR_CSTD_WFUNC::towupper;
    ...
    #endif /* _STLP_NO_CSTD_FUNCTION_IMPORTS */

So it would be easy to write a wrapper which calls the system functions
where this makes sense and otherwise tries to do the work by itself. Or even
restrict the thing to UCS2. Yes, that requires more memory, but I am able
to compile a working LyX instead of a crashing one...

The right approach would be sticking to wchar_t in LyX, meaning that on
some platforms it would use UCS2 and on some others UCS4, but I think
that there was already a discussion on this matter and I don't want to
reopen it. Frankly, I could not care less about unicode and would be
happy even with latin1 only...

> > > > The patch is intended specifically for cygwin as I don't know whether
> > > > there are other platforms (apart from mingw) where sizeof(wchar_t) is 2.
> > >
> > > IIRC some (older?) commercial unices have this too.
> >
> > So maybe the --with-stlport-hack switch makes sense.
>
> I don't agree for the reasons above. And I am not going to endorse it.

> > I have yet to be convinced that it does not work. Please, can you think
> > of some test that I can perform?
>
> Have a look at which C library functions for wchar_t * are used. Then you can
> tell what will not work. If you don't find C library functions then have a
> look at how the tolower/toupper ctype functions are implemented.
>
> If you restrict your input to latin1 you will probably not see any problem.
> That is a reasonable subset to start with, but IMO an assertion should
> trigger if you have such an implementation and feed it some real unicode
> content.

I am wondering how these things are going to work with MSVC, given that
wchar_t is 2 bytes there as well, which hints that towupper will not work
with a 4byte char...

--
Enrico
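
For illustration, the wrapper idea above could be sketched roughly as follows.
This is only a sketch under assumptions: the names char_type, fits_in_wchar,
towupper_wrapper and towupper_checked are hypothetical and do not come from
STLPort or LyX, and values that do not fit in the platform's wchar_t are
simply returned unchanged instead of being case-mapped.

```cpp
#include <cassert>
#include <cstdint>
#include <cwctype>
#include <limits>

// Hypothetical 4-byte character type used in place of wchar_t.
typedef std::uint32_t char_type;

// True if c can be represented in the platform's wchar_t
// (2 bytes on cygwin/mingw/MSVC, 4 bytes on most other systems).
inline bool fits_in_wchar(char_type c)
{
    return c <= static_cast<char_type>(std::numeric_limits<wchar_t>::max());
}

// Lenient wrapper: call the system towupper where that makes sense,
// otherwise leave the character untouched.
inline char_type towupper_wrapper(char_type c)
{
    if (fits_in_wchar(c))
        return static_cast<char_type>(
            std::towupper(static_cast<std::wint_t>(c)));
    return c; // assumption: no case mapping outside the wchar_t range
}

// Strict variant: assert when fed content the C library cannot handle,
// along the lines of the assertion suggested in the quoted text.
inline char_type towupper_checked(char_type c)
{
    assert(fits_in_wchar(c));
    return static_cast<char_type>(
        std::towupper(static_cast<std::wint_t>(c)));
}
```

With a 2-byte wchar_t the lenient version silently skips everything above
0xFFFF, while the strict version aborts in a debug build; which of the two is
preferable depends on whether restricting the input to UCS2 is acceptable.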