Re: [Harbour] codepage and RDD

Przemyslaw Czerpak Mon, 03 Nov 2008 09:58:14 -0800

On Mon, 03 Nov 2008, Szakáts Viktor wrote:

Hi Viktor,


>> And what if it's FOR or WHILE clause of some RDD commands
>> like INDEX ON ..., COUNT or SET FILTER?
> I'd think INDEX ON expression is definitely RDD CP
> context, just like FOR/WHILE and every other code
> initiated by the RDD.

What about situations where part of the expression comes
from your source code and another part from the table?
   INDEX ON ... FOR MY_FIELD <= "[text with national character]"
If you start implementing it and resolve all the problems that
appear, you will end up with a CDP pointer attached to each string
item (HB_IT_STRING), which in some cases may not be a bad idea, but you
will also have to define the behavior of the + and - operations for strings
with two different encodings, probably by simply using the encoding of the
first nonempty string. Next it will be necessary to define the behavior of the
<, <=, >, >=, =, ==, $ operators for strings with different encodings
and implement it in HVM. Without attaching CDP information to each string
item you will not be able to realize the above idea, because when a string
item is created the evaluation context may not be known yet.
So as long as we do not have such functionality we have to use a simplified
version which translates all strings to the one encoding used by the application.
This encoding is set by the HB_SETCODEPAGE() function, and if the user wants to
change it at run time then he should know that it may break string operations for
items already created, because they will not be translated to the new encoding.
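To make the "CDP attached to each string item" idea concrete, here is a minimal sketch in Python (not Harbour, and not how HVM is implemented). The class name TaggedStr and the two codepages are assumptions chosen only for illustration; the + rule is the "encoding of the first nonempty string" rule mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedStr:
    """Hypothetical string item: raw bytes plus the codepage they are in."""
    raw: bytes
    cp: str

    def __add__(self, other: "TaggedStr") -> "TaggedStr":
        # The result takes the encoding of the first nonempty operand;
        # the other operand is translated into that encoding first.
        if not self.raw:
            return other
        if not other.raw:
            return self
        translated = other.raw.decode(other.cp).encode(self.cp)
        return TaggedStr(self.raw + translated, self.cp)


# One piece comes from source code (assumed cp852), one from a table (assumed cp1250).
from_source = TaggedStr("ł".encode("cp852"), "cp852")
from_table = TaggedStr("ódź".encode("cp1250"), "cp1250")

joined = from_source + from_table
print(joined.cp, joined.raw.decode(joined.cp))

# Plain byte concatenation, which is all an untagged item allows, mixes encodings.
naive = from_source.raw + from_table.raw
print(naive == joined.raw)
```

Without the tag, the concatenation has no way to know that the second operand needs translating, which is the point made above about the evaluation context being unknown at item creation time.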

> [ Honestly until you see or agree that my raised issue is
> a valid one, I see no point in speculating on implementation
> details.

I do see a point, because you still do not understand what the problem is.
I hoped you would begin to see it when I asked you about it.

> A possible answer could be that the issue is there,
> but implementation is far-reaching or not possible because
> of certain legacy constructs, or else, but first I think
> we should identify the problem. Maybe my thinking is
> completely wrong from start. ]
> [ BTW the other problem is conversion from one CP to
> another from .prg code. But that's really another topic. ]

No, it's not a different topic, because in one expression you can have
string items created from .prg source code and string items created
from a table. If you have
   INDEX ON ... FOR MY_FIELD1=="<string with national character>"
then MY_FIELD1 has to be created with the same encoding as the item created
by "<string with national character>", or during the '==' evaluation you will
have to do the translation dynamically. To do it dynamically you will have
to attach CDP information to each string item, or use some arbitrary
CDP for all comparisons.
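The effect on '==' can be shown with a small Python sketch (an illustration of the byte-level problem only, not Harbour code). The codepages cp852 and cp1250 and the helper name eq_translated are assumptions made for the example.

```python
# The same text as it might arrive from two places.
literal = "Łódź".encode("cp852")   # assumed encoding of the .prg source literal
field = "Łódź".encode("cp1250")    # assumed encoding of the table field


def eq_translated(a: bytes, a_cp: str, b: bytes, b_cp: str) -> bool:
    """Hypothetical '==' that knows the codepage of each operand."""
    return a.decode(a_cp) == b.decode(b_cp)


# A byte-wise comparison sees two different strings.
raw_equal = literal == field

# A comparison that translates both operands sees the same text.
translated_equal = eq_translated(literal, "cp852", field, "cp1250")

print(raw_equal, translated_equal)
```

The second comparison only works because the codepage of each operand is passed in explicitly, which corresponds to either tagging every string item or agreeing on one CDP for all comparisons.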

> I think the WA CP to caller CP conversion should be
> done when evaluating the content of FIELDn. From this
> point the context is of the caller's.

So you have to attach CP information to each string item so that it
can later be used when evaluating string-related operators.

> Another view of this is if we'd say that HVM is internally
> Unicode, and we need to convert everything coming from outside
> to it, and use a selected national collation when doing
> these comparisons.

Yes, keeping string items internally in some kind of Unicode representation
and translating to/from this representation each time a string is
passed to external resources resolves the problems for text data. Of course we
will still have to decide what to do about Clipper compatibility. Should ASC()
and CHR() use ASCII or Unicode values? What to do with functions like l2bin() and
other Clipper code which uses binary strings? What to do with FREAD()/FWRITE()?
If HVM uses Unicode strings internally then all such operations will
have to translate to/from Unicode, also for binary data. It means
that for the reverse translations you will still need to know which CP
should be used, and you will still have the same problem: should this
CP be stored in some SET or attached to each string item?
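The binary-data concern can be sketched in Python (again only an illustration, not Harbour). struct.pack("<l", ...) stands in for an l2bin()-style 4-byte little-endian value; the codepages cp852 and cp1250 are assumptions for the example.

```python
import struct

# A 4-byte little-endian integer, standing in for the result of an l2bin()-style call.
packed = struct.pack("<l", 1234567890)

# A Unicode-internal VM would have to decode these bytes on the way in...
as_text = packed.decode("cp852")

# ...and encode them on the way out. With the same CP the bytes survive.
same_cp = as_text.encode("cp852")

# With a different CP on the way out, the binary value is silently changed.
other_cp = as_text.encode("cp1250")

print(same_cp == packed, other_cp == packed)
```

The round trip is only safe when the reverse translation uses exactly the CP that was used on input, so the question of where that CP is remembered does not go away.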

>> Will enable translations for all open WAs even if user does not need any
> No, it should enable translations for _newly opened_ WAs.
> (in the current thread possibly).

Exactly. So it will break any code which operates on binary data.

> All it does is giving a default to the already existing
> <codepage> parameter.

No, because now the translation is enabled _only_ if I explicitly pass
a CP to the USE command and this CP is different than the HVM
one set by HB_SETCODEPAGE().
With the above extension I will have to add the following to all code which
may operate on binary fields or on text which should not be translated,
e.g. read in raw form from other files:

   cOldRddCP := hb_rddDefaultCP( hb_setcodepage() ) // disable any translations

   use table new
   copy to table2
   close

   hb_rddDefaultCP( cOldRddCP )

I can also update all existing code and add a CODEPAGE clause to USE and
related commands/functions:
   use table new codepage hb_setcodepage()
   copy to table2 codepage hb_setcodepage()
which is exactly what you have to do in new code when you want
to enable automatic translations.
The only difference is that your proposal breaks backward
compatibility, because existing Clipper code will have to be updated.

>> translation and want to open table in raw form, f.e. to extract binary
>> so it will break existing code. The old problems will still exists and we
>> will have new one. User will have to add to existing code saving and
>> restoring hb_rdddefaultcodepage() to make it structural safe.
> No existing code will be broken, as this is a new command,
> and no defaults would be changed.

See above. It will be yet another _SET_EXACT switch :-(

> I still don't understand why isn't it a problem to support
> <codepage> currently as a parameter. Can we say it's a broken
> feature?

Because it will begin to introduce translation unconditionally if
hb_rddDefaultCP() is different than hb_setcodepage(), so I will
have to add protection against such a situation to existing code.
The current solution is very far from perfect and has only limited
use, but at least it does not affect existing Clipper code.

> [ my battery going flat in a minute. See you later ]

:-)

best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour
