At 2023-04-30T03:04:27+0200, Alejandro Colomar wrote:
> On 4/30/23 02:05, G. Branden Robinson wrote:
> > I should have said "_Warn on_ semantic newlines" is a terrible
> > instruction/summary.
>
> That's why I used the phrase (at least I tried to do it consistently
> recently) "warn on S. N. violations".

Alas, it got lost in the most recent thread subject line on this topic
to the groff list...

https://lists.gnu.org/archive/html/groff/2023-04/msg00334.html

Hmm, I see that was Bjarni's doing. Being from Iceland, he perhaps has
more of the spirit of Loki than most...

> > They are what we _don't_ want to warn about upon encountering them.
> >
> > If man-pages(7) or other people continue to call the practice of
> > breaking *roff input lines after sentence-ending punctuation
> > "semantic newlines", I have no complaint. It could also be called
> > "Kernighan breaking", in honor of an early popularizer of the
> > practice.
>
> You could use it for the warning name ;).

Not a chance. :P

As I noted, I want this under the "style" penumbra now, along with some
other bits of weirdness.

https://savannah.gnu.org/bugs/?62776

> > This is categorically not what regular expressions can cope with,
> > formally.
>
> Well, formally yes. And a regex can't find C function definitions in
> a source tree; at least if you try to fool it by writing the most
> horrible code in the universe. But I wrote a relatively small
> script[1] that finds a lot of C code with pcre2grep(1), and works most
> of the time. It has limitations; some of which can be fixed by
> improving the regexes (read: making them even more unreadable); some
> others are likely impossible to fix with a regex. The biggest
> limitation I think I've met is K&R-style functions: I don't think a
> regex can cope with them.

I don't know if you have to cope with "the lexer hack", but you might.

https://en.wikipedia.org/wiki/Lexer_hack

How much grief might have been saved if objects in C had been prefixed
with a sigil like $, or if types had been prefixed with %?

In my imagination, Thompson vetoed this, but when I consider it more
seriously, I reckon the truth is more complicated, and arises from C's
origins in the wholly untyped B language. The dialect of C we see in
Version 6 Unix (q.v. the Lions book) is shockingly loosely typed to
modern eyes. I once ground the productivity of my workplace to a halt
for an entire afternoon by presenting my colleagues with the attached
exhibit of "legal C". (It remained legal in AT&T USG Unix for many,
many years.)

> I believe a regex-based script can be good enough for some purposes,
> even if it's not perfect.

All of this is true, and I like programming languages that are dead
simple to lexically analyze. (But I spend next to no time working in
them.)

I'm strident on this point because I'm opposed to putting a diagnostic
into the formatter that throws false positives. That would disserve
users.

Regards,
Branden
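
Illustrative note on the false-positive concern above: the following
is a minimal sketch, not code from groff or from the pcre2grep(1)
script mentioned in the thread. The pattern, the function name
flag_lines, and the sample lines are all hypothetical; the rule
"punctuation followed by more text on the same line" is an assumed
stand-in for whatever a regex-based check might use.

```python
import re

# Hypothetical heuristic (an assumption, not groff's logic): treat '.',
# '?', or '!' followed by whitespace and more text on the same input
# line as a "semantic newline" violation.
MID_LINE_SENTENCE_END = re.compile(r"[.?!]\s+\S")


def flag_lines(text):
    """Return 1-based numbers of the lines the naive heuristic flags."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MID_LINE_SENTENCE_END.search(line)
    ]


sample = "\n".join([
    # 1: one sentence, ends at the line break; nothing to report.
    "This line ends its sentence.",
    # 2: a genuine violation; a second sentence starts mid-line.
    "Two sentences here. That is a real violation.",
    # 3: false positive; "q.v." is an abbreviation, not a sentence end.
    "The dialect in Version 6 Unix (q.v. the Lions book) is loose.",
    # 4: false positive; "S." is an initial inside a quoted phrase.
    'I tried to say "warn on S. N. violations" instead.',
])

print(flag_lines(sample))  # prints [2, 3, 4]
```

Only line 2 is a real violation; lines 3 and 4 are flagged because
the pattern cannot tell an abbreviation or initial from the end of a
sentence. Whether that error rate is acceptable depends on where the
check lives: it may be tolerable in an optional lint script, but a
diagnostic built into the formatter would show it to every user.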