On Sun, Oct 01, 2006 at 06:07:12PM +0200, Georg Baum wrote:

> On Sunday, 1 October 2006 16:45, Enrico Forestieri wrote:
> > On Sun, Oct 01, 2006 at 03:28:20PM +0200, Georg Baum wrote:
> > 
> > > On Sunday, 1 October 2006 14:46, Enrico Forestieri wrote:
> > > 
> > > > The problem here is that sizeof(wchar_t) is 2 on cygwin and uint32_t
> > > > cannot be used due to missing specializations in STLPort. I am not
> > > > able to fill in the missing bits in Georg's work (after all, I am a
> > > > sorcerer's apprentice :) ) but succeeded in hacking STLPort such that
> > > > uint32_t can be used in place of wchar_t.
> > > 
> > > That does not work. You are now using a 4byte character type, but the
> > > internal stlport functions such as std::ctype::to_upper do still assume a
> > > 2byte encoding. Therefore you cannot store 4 bytes in uint32_t, and it
> > > would have been easier to just use a 2byte wchar_t instead of patching
> > > stlport.
> > 
> > I don't understand this. I am compiling the library by replacing
> > everywhere wchar_t with uint32_t, so this would be also true on
> > a system where wchar_t is 4 bytes.
> 
> No. You do not change the C library for example (e.g. towupper). I am 
> pretty sure that C library functions are used in STLPort. Besides that: 
> Even if you replace the 2byte character type by a 4byte character type, 
> and C library functions are not used, you do not change the algorithms, so 
> you now use UCS2 (or whatever encoding is assumed for wchars on cygwin) 
> stored in 4 bytes per character. That is equivalent to simply using the 
> 2byte wchar_t in LyX, but requires more memory.

You are right, of course, but a wrapper function could be used.
In STLPort there is code like this:

#if !defined (_STLP_NO_CSTD_FUNCTION_IMPORTS)
...
using _STLP_VENDOR_CSTD_WFUNC::towupper;
...
#endif /* _STLP_NO_CSTD_FUNCTION_IMPORTS */

So it would be easy to write a wrapper that calls the system functions
where this makes sense and otherwise handles the character by itself.
Or one could even restrict the whole thing to UCS2. Yes, that requires
more memory, but I am able to compile a working LyX instead of a
crashing one...
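
A minimal sketch of such a wrapper, only to illustrate the idea. The names
char_type32 and towupper32 are hypothetical (they are neither STLPort nor
LyX identifiers), and passing through characters the C library cannot
represent is just one possible fallback:

```cpp
#include <cassert>
#include <cstdint>
#include <cwchar>
#include <cwctype>

// Hypothetical 32-bit character type standing in for the uint32_t that
// replaces wchar_t in the hacked STLPort build.
typedef std::uint32_t char_type32;

// Hypothetical wrapper: delegate to the system towupper when the value
// fits into the platform's wchar_t range, otherwise return it unchanged.
inline char_type32 towupper32(char_type32 c)
{
    if (c <= static_cast<char_type32>(WCHAR_MAX))
        return static_cast<char_type32>(
            std::towupper(static_cast<std::wint_t>(c)));
    return c; // outside the range the C library can handle
}
```

With a 2byte wchar_t this leaves everything above 0xFFFF untouched, so case
conversion would silently do nothing for those characters.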

The right approach would be sticking to wchar_t in LyX, meaning that on
some platforms it would use UCS2 and on some others UCS4, but I think
that there was already a discussion on this matter and I don't want to
reopen it. Frankly, I could not care less about unicode and would be
happy even with latin1 only...

> > > > The patch is specifically thought for cygwin as I don't know whether
> > > > there are other platforms (apart from mingw) where sizeof(wchar_t) is 2.
> > > 
> > > IIRC some (older?) commercial unices have this too.
> > 
> > So maybe the --with-stlport-hack switch makes sense.
> 
> I don't agree for the reasons above.

And I am not going to endorse it.

> > I have yet to be convinced that it does not work. Please, can you think
> > of some test that I can perform?
> 
> Have a look which C library functions for wchar_t * are used. Then you can 
> tell what will not work. If you don't find C library functions then have a 
> look how the tolower/toupper ctype functions are implemented.
> 
> If you restrict your input to latin1 you will probably not see any problem. 
> That is a reasonable subset to start with, but IMO an assertion should 
> trigger if you have such an implementation and feed it some real unicode 
> content.

I am wondering how these things are going to work with MSVC, given that
wchar_t is 2 bytes there, as well, hinting that towupper will not work
with a 4byte char...
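
The assertion suggested above could look roughly like the following. This
is a sketch under assumptions: representable_in_wchar and to_wchar_checked
are hypothetical helper names, and WCHAR_MAX is taken as the limit of what
wchar_t based functions such as towupper can receive:

```cpp
#include <cassert>
#include <cstdint>
#include <cwchar>

// Hypothetical helper: true if a 32-bit code point can be handed to
// wchar_t based C library functions on this platform without truncation.
inline bool representable_in_wchar(std::uint32_t c)
{
    return c <= static_cast<std::uint32_t>(WCHAR_MAX);
}

// Hypothetical checked conversion: triggers an assertion instead of
// silently truncating a character that does not fit into wchar_t.
inline wchar_t to_wchar_checked(std::uint32_t c)
{
    assert(representable_in_wchar(c));
    return static_cast<wchar_t>(c);
}
```

Where wchar_t is 2 bytes (cygwin, mingw, MSVC) the assertion would fire for
any character above 0xFFFF, while latin1 input would pass unnoticed.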

-- 
Enrico
