Dear Karl, Thank you for reviewing! PSA new version.
> I see a few problems with the English and style of the patches > and am commenting below and have signed up as a reviewer. Your effort is quite helpful for me. > At > commitfest.postgresql.org I have marked the thread > as needing author attention. Hayato, you will need > to mark the thread as needing review when you have > replied to this message. Sure. I will change the status after posting the patch. Before replying your comments, I thought I should show the difference between versions. Regarding old versions (here PG15 was used), non-ASCIIs (like Japanese) are replaced with '?'. ``` psql (15.4) Type "help" for help. postgres=# SET application_name TO 'あああ'; SET postgres=# SHOW application_name ; application_name ------------------ ????????? (1 row) ``` As for the HEAD, as my patch said, non-ASCIIs are replaced with hexadecimal representations. (Were my terminologies correct?). ``` psql (17devel) Type "help" for help. postgres=# SET application_name TO 'あああ'; SET postgres=# SHOW application_name ; application_name -------------------------------------- \xe3\x81\x82\xe3\x81\x82\xe3\x81\x82 (1 row) ``` Note that non-printable ASCIIs are also replaced with the same rule. ``` psql (15.4) Type "help" for help. postgres=# SET application_name TO E'\x03'; SET postgres=# SHOW application_name ; application_name ------------------ ? (1 row) psql (17devel) Type "help" for help. postgres=# SET application_name TO E'\x03'; SET postgres=# SHOW application_name ; application_name ------------------ \x03 (1 row) ``` > Now, to reviewing the patch: > > First, it is now best practice in the PG docs to > put a line break at the end of each sentence. > At least for the sentences on the lines you change. > (No need to update the whole document!) Please > do this in the next version of your patch. I don't > know if this is a requirement for acceptance by > a committer, but it won't hurt. I didn't know that. Could you please tell me if you have a source? Anyway, I put a line break for each sentences for now. > diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml > index e700782d3c..a4ce99ba4d 100644 > --- a/doc/src/sgml/config.sgml > +++ b/doc/src/sgml/config.sgml > @@ -7040,9 +7040,8 @@ local0.* /var/log/postgresql > The name will be displayed in the > <structname>pg_stat_activity</structname> view and included in CSV log > entries. It can also be included in regular log entries via the <xref > linkend="guc-log-line-prefix"/> parameter. > - Only printable ASCII characters may be used in the > - <varname>application_name</varname> value. Other characters > will be > - replaced with question marks (<literal>?</literal>). > + Non-ASCII characters used in the > <varname>application_name</varname> > + will be replaced with hexadecimal strings. > </para> > </listitem> > </varlistentry> > > Don't use the future tense to describe the system's behavior. Instead > of "will be" write "are". (Yes, this change would be an improvement > on the original text. We should fix it while we're working on it > and our attention is focused.) > > It is more accurate, if I understand the issue, to say that characters > are replaced with hexadecimal _representations_ of the input bytes. > Finally, it would be good to know what representation we're talking > about. Perhaps just give the \xhh example and say: the Postgres > C-style escaped hexadecimal byte value. And hyperlink to > https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-ST > RINGS-ESCAPE > > (The docbook would be, depending on text you want to link: > > <link linkend="sql-syntax-strings-escape">C-style escaped hexadecimal > byte value</link>. > > I think. You link to id="someidvalue" attribute values.) IIUC the word " Postgres" cannot be used in the doc. Based on your all comments, I changed as below. How do you think? ``` Characters that are not printable ASCII, like <literal>\x03</literal>, are replaced with the <productname>PostgreSQL</productname> <link linkend="sql-syntax-strings-escape">C-style escaped hexadecimal byte value</link>. ``` > @@ -8037,10 +8036,9 @@ COPY postgres_log FROM > '/full/path/to/logfile.csv' WITH csv; <para> > The name can be any string of less > than <symbol>NAMEDATALEN</symbol> characters (64 characters > in > a standard > - build). Only printable ASCII characters may be used in the > - <varname>cluster_name</varname> value. Other characters will be > - replaced with question marks (<literal>?</literal>). No name > is shown > - if this parameter is set to the empty string > <literal>''</literal> (which is > + build). Non-ASCII characters used in the > <varname>cluster_name</varname> > + will be replaced with hexadecimal strings. No name is shown if > this > + parameter is set to the empty string <literal>''</literal> > (which is the default). This parameter can only be set at server start. > </para> > </listitem> > > Same review as for the first patch hunk. Fixed like above. You can refer the patch. > diff --git a/doc/src/sgml/postgres-fdw.sgml > b/doc/src/sgml/postgres-fdw.sgml index 5062d712e7..98785e87ea 100644 > --- a/doc/src/sgml/postgres-fdw.sgml > +++ b/doc/src/sgml/postgres-fdw.sgml > @@ -1067,9 +1067,8 @@ postgres=# SELECT postgres_fdw_disconnect_all(); > of any length and contain even non-ASCII characters. However > when it's passed to and used as <varname>application_name</varname> > in a foreign server, note that it will be truncated to less than > - <symbol>NAMEDATALEN</symbol> characters and anything other than > - printable ASCII characters will be replaced with question > - marks (<literal>?</literal>). > + <symbol>NAMEDATALEN</symbol> characters and non-ASCII > characters > will be > + replaced with hexadecimal strings. > See <xref linkend="guc-application-name"/> for details. > </para> > > Same review as for the first patch hunk. Fixed like above. > Since the both of you have looked and confirmed that the > actual behavior matches the new documentation I have not > done this. I showed the result again, please see. > But, have either of you checked that we're really talking about > replacing everything outside the 7-bit ASCII encodings? > My reading of the commit referenced in the first email of this > thread says that it's everything outside of the _printable_ > ASCII encodings, ASCII values outside of the range 32 to 127, > inclusive. > > Please check. The docs should probably say "printable ASCII", > or "non-printable ASCII", depending. I think the meaning > of "printable ASCII" is widely enough known to be 32-127. > So "printable" is good enough, right? For me, "non-printable ASCII" sounds like control characters. So that I used the sentence "Characters that are not printable ASCII ... are replaced with...". Please tell me if you have better explanation? Best Regards, Hayato Kuroda FUJITSU LIMITED
v3_doc_fix.patch
Description: v3_doc_fix.patch