On Thu, Sep 28, 2023 at 10:30 AM Karl O. Pinc <k...@karlpinc.com> wrote:
>
> On Thu, 28 Sep 2023 09:49:03 +1000
> Peter Smith <smithpb2...@gmail.com> wrote:
>
> > On Wed, Sep 27, 2023 at 11:59 PM Karl O. Pinc <k...@karlpinc.com>
> > wrote:
> > >
> > > On Wed, 27 Sep 2023 12:58:54 +0000
> > > "Hayato Kuroda (Fujitsu)" <kuroda.hay...@fujitsu.com> wrote:
> > >
> > > > > Should the committer be interested, your patch applies cleanly
> > > > > and the docs build as expected.
> > > >
> > > > Yeah, but cfbot accepted previous version. Did you have anything
> > > > in your mind?
> > >
> > > No. I'm letting the committer know everything I've checked
> > > so that they can decide what they want to check.
> > >
> > > > Hmm, what you said looked right. But as Peter pointed out [1], the
> > > > fix seems too much. So I attached three version of patches. How do
> > > > you think? For me, type C is best.
> > > >
> > > > A. A patch which completely follows your comments. The name is
> > > > "v3-0001-...patch". Cfbot tests it.
> > > > B. A patch which completely follows Peter's comments [1]. The
> > > > name is "Peter_v3-....txt".
> > > > C. A patch which follows both comments. Based on
> > > > b, but some comments (Don't use the future tense, "Other
> > > > characters"->"The bytes of other characters"...) were picked. The
> > > > name is "Both_v3-....txt".
> > >
> > > I also like C. Fewer words is better. So long
> > > as nothing is left unsaid fewer words make for clarity.
> > >
> > > However, in the last hunk, "of other than" does not read well.
> > > Instead of writing
> > > "and the bytes of other than printable ASCII characters"
> > > you want "and the bytes that are not printable ASCII characters".
> > > That would be my suggestion.
> > >
> >
> > I also prefer Option C, but...
> >
> > ~~~
> >
> > + <varname>application_name</varname> value.
> > + The bytes of other characters are replaced with
> > + <link linkend="sql-syntax-strings-escape">C-style escaped
> > hexadecimal
> > + byte values</link>.
> >
> > V
> >
> > + <varname>cluster_name</varname> value.
> > + The bytes of other characters are replaced with
> > + <link linkend="sql-syntax-strings-escape">C-style escaped
> > hexadecimal
> > + byte values</link>.
> >
> > V
> >
> > + <symbol>NAMEDATALEN</symbol> characters and the bytes of other
> > than
> > + printable ASCII characters are replaced with <link
> > + linkend="sql-syntax-strings-escape">C-style escaped
> > hexadecimal byte
> > + values</link>.
> >
> >
> > IIUC all of these 3 places can have exactly the same wording change
> > (e.g. like Karl's last suggestion [1]).
> >
> > SUGGESTION
> > Any bytes that are not printable ASCII characters are replaced with
> > <link linkend="sql-syntax-strings-escape">C-style escaped hexadecimal
> > byte values</link>.
>
> I don't see the utility in having exactly the same phrase everywhere,
> especially since the last hunk is modifying the end of a long
> sentence. (Apologies if I'm mis-reading what Peter wrote above.)
>
> I like short sentences. So I prefer "The bytes of other characters"
> rather than "Any bytes that are not printable ASCII characters"
> for the first 2 hunks. In context I don't see the need to repeat
> the whole "printable ASCII characters" part that appears in the
> preceding sentence of both hunks. "Other" is clear, IMHO.
>

I had in mind something like a Shift-JIS encoding, where a single
"character" may include trail bytes that happen to be in the printable
ASCII range. AFAIK, because the new logic processes bytes, not
characters, the end result could be a mix of escaped and unescaped
bytes for a single SJIS character. In that context, I felt "The bytes
of other characters" was not quite accurate.

But now, looking at the PostgreSQL-supported character sets [1], I see
that SJIS is not supported anyhow. Unfortunately, I am not familiar
enough with other encodings to know whether there is still a chance of
similar printable ASCII trail bytes, so I am fine with whatever wording
is chosen.

> But because I like short sentences I now think that it's a good
> idea to break the long sentence of the last hunk into two.
> Add a period and use the Peter's SUGGESTION above as the
> text for the second sentence.
>
> Is this desirable?
>

+1.

======
[1] https://www.postgresql.org/docs/current/multibyte.html

Kind Regards,
Peter Smith.
Fujitsu Australia
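
To illustrate the byte-wise concern described above, here is a minimal
sketch in Python. It is not PostgreSQL's implementation: the function
name escape_non_printable_ascii is hypothetical, and the lowercase
"\xNN" output format is an assumption based on the "C-style escaped
hexadecimal byte values" wording in the patch. It only shows how
escaping byte by byte can leave part of a multibyte character
unescaped when a trail byte falls in the printable ASCII range.

```python
def escape_non_printable_ascii(data: bytes) -> str:
    """Hypothetical helper: keep printable ASCII bytes (0x20..0x7E) as-is
    and replace every other byte with a C-style \\xNN escape.
    The exact escape format is an assumption, not taken from the patch."""
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            out.append(chr(b))          # printable ASCII byte: unchanged
        else:
            out.append(f"\\x{b:02x}")   # any other byte: hex escape
    return "".join(out)


# Katakana "ソ" is the two bytes 0x83 0x5C in Shift_JIS; the trail byte
# 0x5C is the printable ASCII backslash, so only the lead byte is escaped.
sjis_bytes = "ソ".encode("shift_jis")
print(escape_non_printable_ascii(sjis_bytes))

# In UTF-8 every byte of a multibyte character is >= 0x80, so the whole
# character is escaped and no such mix can occur.
utf8_bytes = "é".encode("utf-8")
print(escape_non_printable_ascii(utf8_bytes))
```

With the Shift_JIS input the result mixes an escaped lead byte with an
unescaped trail byte, which is the case where "The bytes of other
characters" would be imprecise; with UTF-8 input the two wordings
describe the same outcome.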