On 5/14/19 11:31 PM, Martin Sebor wrote: > Near the end of every release a bunch of problem reports are > raised for various punctuation, quoting, and spelling issues > in GCC. In GCC 9 a total 28 such issues were submitted.
Hi Martin. Great that you prepared the patchset. > A fair number of them are discovered as new or changed > diagnostics are being translated. A checker was developed > to help find some of these as well many others by scanning > the gcc.pot file: contrib/check-internal-format-escaping.py. > The checker automates the process of finding these issues but > it doesn't prevent them. Being external to GCC the checker > cannot easily distinguish between message strings that are > expected to be translated and those that aren't. > > To help avoid introducing the most glaring subset of these > problems as early as possible while making it possible to > differentiate messages that are expected to conform from those > that need not, the attached patch adds a simple checker to GCC: > -Wformat-diag. My goal is to improve consistency of messages > and relieve the burden at the end of each release both on > translators and those who then have to fix the problems. I like the idea! One limitation compared to the linter is that one needs to have a test-suite coverage for errors and warnings. Moreover, one needs to have that for all target supported by GCC. > > The warning detects some of the following: > > * unquoted operators consisting of two or more characters > (e.g., == or ?:) > * unquoted keywords, types, GCC built-ins and command line > options, or identifiers with underscores > * unquoted apostrophes and contractions like in can't > * informal abbreviations (arg or reg) > * unbalanced paired tokens (brackets, parentheses, or quotes) > * duplicate or trailing spaces (it allows multiple spaces at > the beginning of a diagnostic since that's used by the C++ > front-end as an "indentation") > * unintended control characters > * inconsistent sentence capitalization You implemented very complex checks! Nice. > > To avoid diagnosing direct calls to pretty-printer functions > that often format parts of the same diagnostic in multiple steps, > the solution introduces the "_raw__" suffix to the __gcc_diag__ > attributes that the warning then uses to avoid checking those > calls. Functions with the raw attributes aren't checked. > > I tested the patch kit as a whole in an x86_64-linux build and > fixed all the issues it pointed out except for 43 instances of > the new warning in the Go front-end that I wasn't sure how best > to handle (I will follow up on that with Ian). I also adjusted > the affected tests. > > There were just a few issues specific to the i386 back-end but they > had an impact on 9 tests. The impact of fixing the same kinds of > issues in other back-ends is likely to be similar. Rather than > trying to do the clean up across all supported targets I think it's > better to handle the rest of the issues over time and with the help > of back end maintainers who can more easily verify that tests still > pass. With than in mind, I have prevented the warning from causing > bootstrap failures. Once the cleanup is complete I expect to remove > this and include the new warning in -Werror. The approach works form me. Martin > > Since a few files in GCC "misuse" the diagnostic machinery for > debugging output leading up to internal errors (instead of > calling a pp_xxx function) I suppressed the new warning in those > files via #pragma GCC diagnostic ignored "-Wformat-diag". If we > agree that calling pp_format or some such is the way to go I can > work replacing those calls as a subsequent step. > > The subset should cover the following bug reports (some already > resolved): > > 90176 - diagnostics should generally contain underscore only > inside quotes > 90162 - exclamation mark in diagnostic!!!!!1111!!!! > 90158 - aarch64: wrong quotation in diagnostics > 90157 - aarch64: unnecessary abbreviation in diagnostic > 90156 - add linter check suggesting to replace %<%s%> with %qs > 90149 - diagnostics containing BIT_FIELD_REF don't conform to > diagnostics guideline > 90121 - extra space in error message > 90117 - Replace %<%s%> with %qs > 90011 - [9 Regression] trailing space in diagnostic > 79477 - Please write code more translator-friendly (unquoted options) > 89936 - wrong punctuation in tree-profile.c (%<%s%>) > > Although the changes are mechanical, since I made them all by hand > I broke up the patch into a series to make it easier to review: > > [PATCH 1/12] implement -Wformat-diag to detect quoting and spelling > issues in GCC diagnostics > [PATCH 2/12] fix diagnostic quoting/spelling in ada > [PATCH 3/12] fix diagnostic quoting/spelling in Brig > [PATCH 4/12] fix diagnostic quoting/spelling in the C front-end > [PATCH 5/12] fix diagnostic quoting/spelling in c-family > [PATCH 6/12] fix diagnostic quoting/spelling in C++ > [PATCH 7/12] fix diagnostic quoting/spelling in libgcc > [PATCH 8/12] fix diagnostic quoting/spelling in the middle-end > [PATCH 9/12] adjust tests to quoting/spelling diagnostics fixes > [PATCH 10/12] fix diagnostic quoting/spelling in D > [PATCH 11/12] fix diagnostic quoting/spelling issues in i386 back-end > [PATCH 12/12] fix diagnostic quoting/spelling issues in ObjC > > Martin