On Sun, 23 Nov 2008 14:11:50 +0200 listmember <[EMAIL PROTECTED]> wrote:
>[...]
> > For very large projects, that should probably be done anyway at some
> > point. But even in that case, using a more memory-efficient string
> > type enables you to keep more data in memory and hence potentially
> > obtain better performance.
>
> The last time I joined a relevant discussion, I was told worrying
> about native UCS-4 string-type would be pointless simply because that
> sort of thing is really needed for word processors only.
>
> Now, I have been informed that Lazarus (and perhaps other IDEs) use
> upwards of 50 MB string space just to do one of their basic
> operations.
>
> That leaves me wondering how much do we lose performance-wise in
> endlessly decompressing UTF-8 data, instead of using, say, UCS-4
> strings.

I'm wondering what you mean by 'endlessly decompressing UTF-8 data'.

You have to make a compromise between memory, ease of use and
compatibility. There is no solution without drawbacks.

If you want to process large 8-bit text files, then UTF-8 is better.
If you want to paint glyphs, then normalized UTF-32 is better.
If you want Unicode with moderate memory overhead, reasonably easy
usage, and compiler support for some compatibility, then UTF-16 is
better.

Mattias
_______________________________________________
fpc-devel maillist  -  [email protected]
http://lists.freepascal.org/mailman/listinfo/fpc-devel
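To make the memory versus ease-of-use compromise concrete, here is a rough sketch. It is written in Python rather than Pascal purely for illustration, and the helper names `encoded_sizes` and `nth_code_point_utf8` are made up for this example; they are not part of FPC, Lazarus, or any library. It compares the byte size of the same text in the three encodings and shows the scan that UTF-8 needs to reach the n-th code point, which is presumably the kind of repeated decoding the question refers to.

```python
def encoded_sizes(text):
    """Return the byte count of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def nth_code_point_utf8(data, n):
    """Return the n-th code point (0-based) of UTF-8 bytes by linear scan.

    Assumes data is valid UTF-8. With UTF-32 the same lookup would be a
    direct index, at the price of 4 bytes per code point.
    """
    index = -1
    for offset, byte in enumerate(data):
        # Bytes of the form 10xxxxxx continue a sequence; skip them.
        if byte & 0xC0 == 0x80:
            continue
        index += 1
        if index == n:
            # The lead byte tells us how long the sequence is.
            if byte < 0x80:
                length = 1
            elif byte >> 5 == 0b110:
                length = 2
            elif byte >> 4 == 0b1110:
                length = 3
            else:
                length = 4
            return data[offset:offset + length].decode("utf-8")
    raise IndexError(n)


if __name__ == "__main__":
    print(encoded_sizes("begin end;"))
    print(nth_code_point_utf8("aü€b".encode("utf-8"), 2))
```

For mostly-ASCII text such as source code, UTF-8 takes a quarter of the space of UTF-32, while random access by code point requires a scan. Which side of that trade matters more depends on the workload, as said above.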
