>> Wow, that's great!
>> Maybe asking more at this point isn't the most polite thing,
>> but I can't resist, do you think it's possible to add new
>> pcode to store UTF8 (or U16) strings and some string syntax
>> or cmdline switch to generate such strings in pcode?
>
> The encoding in source code needs more modifications than introducing
> new PCODEs to store UTF8 or U16 strings and I plan to address it
> in the future.
> In fact such new PCODEs are the last and the easiest part of all Unicode
> modifications. More important is moving CP support from RTL to compiler.
> Then we will add command line switches to inform the compiler about source
> code encoding and some code to detect UTF8 or UCS2/UTF16 encoding
> automatically. We will also need a switch to set the encoding in generated
> PCODE to add support for compile time CP translation. And of course
> we will have to add HVM support for Unicode strings in some encoding
> which will be well separated from non core code. Such separation will
> allow us to easily change the internal representation without any
> interaction with other code. At this moment the least expensive in HVM
> management and translations seems to be native machine U16 representation
> (flat version of UCS2/UTF16).
> We will have to define some arithmetic for Unicode and non-Unicode strings
> in math operations or for [] operators, define metalanguage sorting rules
> used with Unicode, non-Unicode and mixed strings, etc.
> Finally we will have to introduce support for UNICODE strings in generated
> PCODE and here we will need new PCODE values.
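
As a rough illustration of the automatic UTF8 vs. UCS2/UTF16 detection mentioned above, here is a minimal C sketch. All names in it (src_encoding, detect_source_encoding, is_utf8_with_multibyte) are hypothetical and not part of Harbour. It only looks for a byte order mark and falls back to a structural UTF-8 check, so treat it as a starting point under those assumptions, not as how the compiler would actually do it.

```c
#include <stddef.h>

/* Hypothetical result type; nothing here is an existing Harbour API. */
typedef enum {
    SRC_ENC_UNKNOWN,   /* no BOM, not recognisably UTF-8: assume a legacy CP */
    SRC_ENC_UTF8,
    SRC_ENC_UTF16LE,
    SRC_ENC_UTF16BE
} src_encoding;

/* Structural check only: verifies lead bytes and continuation bytes.
 * It does not reject every overlong form or encoded surrogates.
 * Returns 1 only if the buffer is well formed AND contains at least one
 * multi-byte sequence, because pure ASCII says nothing about the encoding. */
static int is_utf8_with_multibyte(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    int seen_multibyte = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;

        if (c < 0x80) {              /* plain ASCII byte */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) {
            extra = 1;               /* 2-byte sequence */
        } else if (c >= 0xE0 && c <= 0xEF) {
            extra = 2;               /* 3-byte sequence */
        } else if (c >= 0xF0 && c <= 0xF4) {
            extra = 3;               /* 4-byte sequence */
        } else {
            return 0;                /* stray continuation or invalid lead */
        }

        if (len - i <= extra)
            return 0;                /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= extra; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;            /* expected a 10xxxxxx continuation */

        seen_multibyte = 1;
        i += extra + 1;
    }
    return seen_multibyte;
}

/* Guess the encoding of a source buffer: BOM first, then the UTF-8 check. */
src_encoding detect_source_encoding(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return SRC_ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return SRC_ENC_UTF16LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return SRC_ENC_UTF16BE;
    if (is_utf8_with_multibyte(buf, len))
        return SRC_ENC_UTF8;
    return SRC_ENC_UNKNOWN;
}
```

When this returns SRC_ENC_UNKNOWN, a command line switch naming the source CP would still be needed, since a legacy single-byte codepage cannot be told apart from the bytes alone.
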

Okay, great plans. I'm looking forward to seeing more. Probably one next
step for everyone is to think about how to exploit the new APIs in the
current contrib/core/3rd party code, if there is now room for such
improvements. I wonder if these can replace HB_TCHAR_* conversion usage.

> Please remember that UNICODE does not resolve many problems. In fact using
> Unicode introduces new very serious problems so for some applications it's
> not possible to use it and Unicode isn't and will never be a working
> alternative.

I've yet to write a large Unicode app, so it's quite possible I'm missing
some obstacles. However, in my case (a business app used in multiple
countries) it would most probably solve many more problems than it would
cause. For one thing, I wouldn't have to worry again about adding a new
CP to the mix. Conversions would be done at interface points. Storage of
Unicode strings in .dbfs seems to be somewhat problematic, or at least
unusual.

I'd be glad to read more about the serious problems you have in mind,
and also about some cases where Unicode would actually be a showstopper,
as you say.

Brgds,
Viktor
_______________________________________________
Harbour mailing list (attachment size limit: 40KB)
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour