Tom Lane writes:

> One thing that's really unclear to me is what's the difference between
> a <character translation> and a <form-of-use conversion>, other than
> that they didn't provide a syntax for defining new conversions.

The standard has this messed up.  In part 1, a form-of-use and an encoding
are two distinct things that can be applied to a character repertoire (see
clause 4.6.2.1), whereas in part 2 the term encoding is used in the
definition of form-of-use (clause 3.1.5 r).

When I sort it out, however, I think that what Tatsuo was describing is
indeed a form-of-use conversion.  Note that in part 2, clause 4.2.2.1, it
says about form-of-use conversions,

    It is intended,
    though not enforced by this part of ISO/IEC 9075, that S2 be
    exactly the same sequence of characters as S1, but encoded
    according some different form-of-use. A typical use might be to
    convert a character string from two-octet UCS to one-octet Latin1
    or vice versa.

This seems to match what we're doing.
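
As a rough illustration only (Python, not taken from the standard or from
the proposed patch), a form-of-use conversion in this sense keeps the
characters and changes only the octets.  The function name is hypothetical,
and UTF-16-BE stands in for "two-octet UCS" as an assumption:

```python
def convert_form_of_use(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode the same character sequence under a different encoding.

    Hypothetical helper: src and dst are Python codec names, used here
    only as stand-ins for the standard's forms-of-use.
    """
    text = data.decode(src)   # recover the character sequence S1
    return text.encode(dst)   # same characters, different octets (S2)


s1 = "Köln".encode("utf-16-be")                         # two octets per character
s2 = convert_form_of_use(s1, "utf-16-be", "latin-1")    # one octet per character

# The octets differ, but both decode to the same characters.
same_characters = s1.decode("utf-16-be") == s2.decode("latin-1")
```

The point of the sketch is that S1 and S2 decode to the same string; a
character that has no Latin1 representation would make the encode step fail
rather than silently change the text.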

A character translation carries no such expectation; the standard explicitly
calls out the possibility of "many-to-one or one-to-one mapping between
two not necessarily distinct character sets".  I imagine the intent is
to let the user create mappings such as ö -> oe (as is common in German
to avoid characters with diacritical marks), or ö -> o (as one might do in
French to achieve the same).  In effect, it's a glorified sed command.
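
A minimal sketch of that kind of mapping, again in Python and purely for
illustration (the table names are made up, and nothing here reflects how
CREATE TRANSLATION would be implemented):

```python
# Hypothetical translation tables: each maps a character to a replacement
# string, so the output characters can differ from the input characters.
GERMAN_STYLE = str.maketrans({"ö": "oe", "ä": "ae", "ü": "ue"})
FRENCH_STYLE = str.maketrans({"ö": "o", "ä": "a", "ü": "u"})


def translate(text: str, table: dict) -> str:
    """Apply a character-by-character mapping to text."""
    return text.translate(table)


german = translate("schön", GERMAN_STYLE)   # ö becomes the two characters "oe"
french = translate("schön", FRENCH_STYLE)   # ö becomes the single character "o"
```

Unlike a form-of-use conversion, the result here is a different sequence of
characters, whatever encoding is used to store it.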

So I withdraw my earlier comment.  But perhaps the syntax of the proposed
command could be aligned with the CREATE TRANSLATION command.

-- 
Peter Eisentraut   [EMAIL PROTECTED]




