INFORMATION_SCHEMA node

Tatsuo Ishii Mon, 01 Jan 2024 22:40:00 -0800

In the following paragraph in information_schema:

 <term>character encoding form</term>
     <listitem>
      <para>
       An encoding of some character repertoire.  Most older character
       repertoires only use one encoding form, and so there are no
       separate names for them (e.g., <literal>LATIN1</literal> is an
       encoding form applicable to the <literal>LATIN1</literal>
       repertoire).  But for example Unicode has the encoding forms
       <literal>UTF8</literal>, <literal>UTF16</literal>, etc. (not
       all supported by PostgreSQL).  Encoding forms are not exposed
       as an SQL object, but are visible in this view.


This claims that the LATIN1 repertoire uses only one encoding form,
but LATIN1 can in fact be encoded in another form: ISO-2022-JP-2, a
7-bit encoding (see RFC 1554,
https://datatracker.ietf.org/doc/html/rfc1554, for details).
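
To make the point concrete, here is a minimal sketch in Python (not
part of the patch) that encodes one LATIN1 character in both forms.
The ISO-2022-JP-2 bytes are built by hand from the escape sequences in
RFC 1554 rather than with a library codec, and the helper name
latin1_high_to_iso2022jp2 is made up for this illustration. It covers
only a single character from the upper half of ISO 8859-1.

```python
ESC = b"\x1b"
# RFC 1554: ESC 2/14 4/1 designates the ISO 8859-1 high part to G2.
DESIGNATE_LATIN1_G2 = ESC + b".A"
# RFC 1554: ESC 4/14 (single shift two) takes the next byte from G2.
SINGLE_SHIFT_2 = ESC + b"N"


def latin1_high_to_iso2022jp2(ch: str) -> bytes:
    """Hypothetical helper: encode one upper-half LATIN1 character in 7 bits."""
    code = ord(ch)
    if not 0xA0 <= code <= 0xFF:
        raise ValueError("expected a character in the range U+00A0..U+00FF")
    # The 96-character set is addressed with the high bit cleared.
    return DESIGNATE_LATIN1_G2 + SINGLE_SHIFT_2 + bytes([code - 0x80])


ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
as_latin1 = ch.encode("latin-1")             # single 8-bit byte
as_2022jp2 = latin1_high_to_iso2022jp2(ch)   # 7-bit escape sequence form

print(as_latin1.hex(), as_2022jp2.hex())
```

The same character from the same repertoire ends up as two different
byte sequences, one 8-bit and one purely 7-bit.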

If we still want to give an example of a repertoire that uses only one
encoding form, we could probably use LATIN2 instead, or another
repertoire that ISO-2022-JP-2 does not support (ISO-2022-JP-2 supports
LATIN1 and LATIN7).

Attached is the patch that does this.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
