> We can check for non-ASCII letters in SGML/XML files by preparing an "allowlist"
> that contains the lines which are allowed to have non-ASCII characters,
> although this list will need to be maintained when lines in it are modified.
> I've attached a patch to add a simple Perl script to do this.

I doubt it really works. For example, nbsp can be used for formatting
(that's the purpose of the character in the first place). Whenever a
developer decides to use or not to use nbsp, the "allowlist" needs to
be updated. That's too annoying.

I think it's better to add non-ASCII character checking to the
committing checklist and let committers check for non-ASCII characters
in the patch. Non-ASCII characters are rarely used, so it would not
become a burden.
https://wiki.postgresql.org/wiki/Committing_checklist

Maybe we can add to the wiki page something like this?

git diff origin/master | grep -P '[^\x00-\x7f]'
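
Note that the one-liner above also matches removed lines and unchanged
context lines in the diff. If we only want to flag lines that a patch
adds, a small filter could do it. Below is a rough sketch in Python, not
an existing tool: the function name added_non_ascii_lines and the sample
file name are made up for illustration, and it assumes the input is a
unified diff as produced by "git diff".

```python
import re

# Hunk header of a unified diff, e.g. "@@ -10,2 +10,3 @@".
# A missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_non_ascii_lines(diff_text):
    """Return (file, line_no, text) for each added line with non-ASCII chars."""
    hits = []
    current_file = None
    old_left = new_left = 0  # lines still expected in the current hunk
    new_line_no = 0          # line number in the post-patch file
    # Split on "\n" only, so unusual Unicode line separators stay in the text.
    for line in diff_text.split("\n"):
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                text = line[1:]
                if not text.isascii():
                    hits.append((current_file, new_line_no, text))
                new_line_no += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1  # removed line: ignore its content
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file"
            else:
                # Unchanged context line: counts on both sides.
                old_left -= 1
                new_left -= 1
                new_line_no += 1
            continue
        if line.startswith("+++ "):
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        else:
            m = HUNK_RE.match(line)
            if m:
                old_left = int(m.group(1)) if m.group(1) is not None else 1
                new_line_no = int(m.group(2))
                new_left = int(m.group(3)) if m.group(3) is not None else 1
    return hits


# Hypothetical sample diff: only the second added line should be reported.
sample = (
    "--- a/doc/src/sgml/example.sgml\n"
    "+++ b/doc/src/sgml/example.sgml\n"
    "@@ -10,2 +10,3 @@\n"
    " unchanged line\n"
    "-old \u201cquoted\u201d text\n"
    "+new plain text\n"
    "+caf\u00e9 added\n"
)
for path, line_no, text in added_non_ascii_lines(sample):
    print(f"{path}:{line_no}: {text}")
```

To use it on a real patch, the output of "git diff origin/master" could
be read from standard input and passed to the function in place of the
sample string. Tracking the hunk line counts is what keeps a removed
line such as "-- comment" (which shows up as "--- comment" in the diff)
from being mistaken for a file header.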

> While testing this script, I found that "stylesheet-man.xsl" also has non-ASCII
> characters. I don't know whether these characters are really necessary, though,
> since I don't understand this file well.

They are U+201C (LEFT DOUBLE QUOTATION MARK, formerly "double turned
comma quotation mark") and U+201D (RIGHT DOUBLE QUOTATION MARK,
formerly "double comma quotation mark").

       <l:template name="sect3" text="Section %n, “%t”, in the documentation"/>

I would like to know why they are necessary too.
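
For anyone who wants to confirm which characters a line or file
contains, something like the following Python sketch might help. The
helper name describe_non_ascii is invented for this example; it relies
only on the standard unicodedata module, which reports the current
Unicode character names.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (code point, name) pairs for the distinct non-ASCII characters."""
    chars = sorted({ch for ch in text if ord(ch) > 0x7F})
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in chars
    ]


# The line quoted above from stylesheet-man.xsl.
line = (
    '<l:template name="sect3" '
    'text="Section %n, \u201c%t\u201d, in the documentation"/>'
)
for code, name in describe_non_ascii(line):
    print(code, name)
# prints:
# U+201C LEFT DOUBLE QUOTATION MARK
# U+201D RIGHT DOUBLE QUOTATION MARK
```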

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

