Asger Ottar Alstrup <[EMAIL PROTECTED]> writes:

| Lars Gullik Bjønnes wrote:
| > No. I am not sure... but it depends... a combining character can be
| > used to produce accents as well... why not an umlaut on top of an
| > grave on top of an 'e'.
|
| The reason I suggest a unicode inset is that we already have it: the
| latex accent inset.

That is rather different... you could only use it as a template.

| Of course you can start playing tricks with fancy underlying character
| types which are in fact composed of other things, but that is an
| astronaut design: it is overlayered, so many abstractions on top of
| abstractions, so many that you need as many complicated mechanisms to
| make it go fast - it is so far up in the sky that there is no oxygen
| left, and the brain stops to work.

I am not convinced.

| I would think it's best to just start with getting what we already
| have to work in a basic unicode setting, and maybe extend to a few
| eastern languages if volunteers come and help out. Don't worry about
| composed Unicode glyphs for now - it's a corner case that can be
| handled once someone feels the heat (which will probably when hell
| freezes over AFAICT).

Well... I claim that this is not uncommon at all. I also claim that we
can come quite close by using a UniChar (basically
struct UniChar { int32_t uni_char; }) and have this optimized for
storage of single UCS-4 code points, while also having the ability to
store n code points.

| The big step that takes us 99.9% of the way is just going
| single-code-point Unicode.

Yes... for western languages (which already do fine with the Latin
variants).

| Ligatures and other display headaches are handled by the toolkits
| these days, so don't loose sleep over those.

We have to do something... we have a cursor to position.

| The trick is to make the job as small and simple as possible, and
| single-code-point unicode is a huge, monotoneous improvement over the
| implicit 8-bit encodings used now, so why make the job harder? There
| is always another release after the next one.

This is the correct time to discuss problems and solutions, and how
complete our first unicode version will be. It would be rather stupid
not to think of combining characters. Just killing the discussion is
not good.

--
Lgb
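
To make the UniChar idea above a bit more concrete, here is one possible sketch, not a worked-out design. Only the names UniChar and uni_char come from the message; ClusterTable, CodePoints, the side table, and the use of negative values as table indices are assumptions made up for illustration. A non-negative value is a plain UCS-4 code point stored inline. A negative value refers to an entry in a side table that holds the rare sequences of several code points (a base character plus its combining marks).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper type: one user-perceived character as code points.
using CodePoints = std::vector<char32_t>;

// Hypothetical side table for multi-code-point sequences
// (e.g. 'e' + combining grave + combining diaeresis).
class ClusterTable {
public:
    // Return the index of seq, adding it to the table on first use.
    std::int32_t intern(const CodePoints& seq) {
        auto it = index_.find(seq);
        if (it != index_.end())
            return it->second;
        auto id = static_cast<std::int32_t>(clusters_.size());
        clusters_.push_back(seq);
        index_.emplace(seq, id);
        return id;
    }

    const CodePoints& get(std::int32_t id) const {
        return clusters_.at(static_cast<std::size_t>(id));
    }

private:
    std::vector<CodePoints> clusters_;
    std::map<CodePoints, std::int32_t> index_;
};

struct UniChar {
    // Assumed encoding: >= 0 is a single UCS-4 code point,
    // < 0 encodes a ClusterTable index as -(index) - 1.
    std::int32_t uni_char;

    // Common case: one code point, stored inline, no table lookup.
    static UniChar single(char32_t cp) {
        assert(cp <= 0x10FFFF);  // Unicode code points need only 21 bits
        return UniChar{static_cast<std::int32_t>(cp)};
    }

    // General case: n code points; falls back to inline storage for n == 1.
    static UniChar cluster(ClusterTable& table, const CodePoints& seq) {
        if (seq.size() == 1)
            return single(seq[0]);
        return UniChar{static_cast<std::int32_t>(-table.intern(seq) - 1)};
    }

    bool isSingle() const { return uni_char >= 0; }

    // Expand back to the underlying code points.
    CodePoints codePoints(const ClusterTable& table) const {
        if (isSingle())
            return CodePoints{static_cast<char32_t>(uni_char)};
        return table.get(static_cast<std::int32_t>(-uni_char - 1));
    }
};
```

With this layout a paragraph stays a flat array of 4-byte cells, so cursor movement can step one UniChar at a time, and only text that actually uses combining sequences pays for the table lookup. Whether the table should be per document, per paragraph, or global is left open here.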