Re: [groff] 04/05: {g, n}roff.1.man: Give assistance to pager users.

2019-07-01 Thread G. Branden Robinson
At 2019-06-30T18:43:31+0200, Ingo Schwarze wrote:
> Sure, paper teletypes is what backspace encoding historically comes
> from.  But that doesn't mean its usefulness is restricted to
> paper teletypes.  In fact, modern pagers handle it just fine.

Yes, but the simple fact is that groff supports applying attributes to
characters that the backspacing semantic model cannot express.

Now, conversely, the backspacing semantic model supports arbitrary
character composition, which glass TTYs and their emulators never do.
(Almost never?  I'd love to hear of any exceptions.)
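
For readers following along, here is a minimal sketch of the two encodings under discussion. It assumes only a POSIX shell with printf and sed; the strings are hand-written illustrations, not captured grotty output. Overstriking can say "bold" or "underlined", while SGR sequences can also carry things like colour.

```shell
# Backspace (overstrike) encoding: bold is "char, BS, same char";
# underline is "_, BS, char".  These are illustrative strings only.
bs=$(printf '\b')
esc=$(printf '\033')
overstruck=$(printf 'b\bbo\bol\bld\bd _\bu_\bl')

# SGR (ISO 6429) encoding: 1 = bold, 4 = underline, 31 = red foreground,
# 0 = reset.  Colour has no overstrike equivalent.
sgr=$(printf '\033[1mbold\033[0m \033[4mul\033[0m \033[31mred\033[0m')

# Reduce each form to plain text: drop every "char + BS" pair,
# or drop every "ESC [ params m" sequence.
plain_overstruck=$(printf '%s' "$overstruck" | sed "s/.${bs}//g")
plain_sgr=$(printf '%s' "$sgr" | sed "s/${esc}\[[0-9;]*m//g")

printf '%s\n' "$plain_overstruck"
printf '%s\n' "$plain_sgr"
```

Printing "$overstruck" or "$sgr" directly to a terminal or through a pager shows how each is rendered in practice.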

> That's a non sequitur because i didn't say "use backspace encoding
> because nothing except paper teletypes matters"; i intended to say
> "use backspace encoding because it does the job for most use cases

The context here is the full groff typesetting system, not just man
pages.  The \m[] and \M[] escapes exist for those who choose to use
them (and give up portability to many or all non-GNU implementations),
for instance when writing a document with ms macros and embedded images
like our doc/webpage.ms[1].

> and avoids the risks involved in allowing terminal escape sequences
> in your pager".  Additional risks from using UTF-8 exist, too, but
> they are relatively minor.

I think you understate the hazards of confusable codepoints in Unicode,
and overstate the hazards of ISO 6429 escape sequences.  We know which
escapes are trouble: those which might inject characters into the input
stream (which is why modern Bash and XTerm and possibly other programs
now support "bracketed paste mode") and those which can collect
information from the host environment.  grotty produces neither of
these.

I grant, however, that risk assessment is a subjective thing.  I
wouldn't support taking away grotty's -c flag.

[..]
> All that said, and given that SGR encoding was already made the
> default in groff at some point in the past,

If I'm reading the history correctly, it came in with groff 1.18, almost
17 years ago (https://ftp.gnu.org/gnu/groff/old/).

> there may not be
> consensus in the groff community for recommending -c more strongly,
> so just describing both modes neutrally, as you did, is probably
> OK after all.

I'll re-review my changes and see if I left anything in nroff that
better belongs in grotty, but I do recall including some arguably
redundant information in the nroff page _because it's a front-end
program_.  There are few reasons to invoke grotty directly, especially
given groff's -P flag.

Regards,
Branden



Re: [groff] 02/02: nroff.1.man: Make editorial fixes.

2019-07-01 Thread G. Branden Robinson
[redirecting to discussion list]

At 2019-07-01T16:42:10+0200, Ingo Schwarze wrote:
> I know this is a really minor point - but i don't understand this change:
> 
>   $ LC=C printf "a\nA\n" | sort
>   A
>   a
>   $ LC=en_US.UTF-8 printf "a\nA\n" | sort
>   A
>   a

Well, (1) LC is not a POSIX-standard locale-controlling variable, as far
as I know.  Is it a BSDism?

> The above holds independently of the operating system - i tested
> OpenBSD, Debian Linux, and Solaris, and on the latter two also
> with a couple of non-English locales.  Also,

(2) Your test case is insufficiently developed, since you have
characters from only one equivalence class.  Try this:

$ printf 'A\na\nb\nB\n' | LC_ALL=en_US.UTF-8 sort
a
A
b
B

Then try it with LC_ALL=C.

> The only system i was able to find with "small before capital"
> is Solaris/illumos.  Linux appears to have no clear convention:
> most often, ordering is totally random in Linux manual pages.
> 
> So why did you change the order?

To combat the chaos.

Regards,
Branden



[groff] man7/groff.man. Was Make editorial changes.

2019-07-01 Thread Doug McIlroy
I agree with Ingo about proposed descriptions of \& and sentence
spaces. Elaboration is not explanation.

\& is simply a zero-length character. Its primary use is to disguise
sequences that groff would otherwise unwantedly interpret. For example,
"\&." at the beginning of an input line will be taken as text, not a
groff request. Given the general case, further examples are unnecessary.

"Sentence space" is a fraught convention, mentioned in groff(7) but
not defined. It is not revealed that "sentence space" is extra space,
not the whole space between sentences. Nor is the default sentence space
stated. A first cut at a general definition might be:

BUGS
Extra "sentence space", by default one space character, is
inserted after sentences, which are identified by artificial
intelligence. False identifications may be mitigated by judicious
use of \&.

A personal false-identification hazard: in the court of groff I will
be declared innocent if I call myself M. Douglas McIlroy, but will be
sentenced if I call myself Mr. Douglas McIlroy.

Again speaking personally, this discussion has made me aware of the
second argument of .ss. I expect from now on to cut the Gordian
knot by using .ss 12 0, at least in nroff.

Incidentally, groff(7) defines \n[.ss] enigmatically thus: "The value
of the parameters [sic] set by the first argument of the ss request",
and defines \n[.sss] similarly. A more informative definition would be,
"The value N set by .ss N M". This rules out other plausible values,
e.g. \w' '*N/12.

Doug



Re: [groff] 04/05: {g, n}roff.1.man: Give assistance to pager users.

2019-07-01 Thread Tadziu Hoffmann



> Now, conversely, the backspacing semantic model supports arbitrary
> character composition, which glass TTYs and their emulators never do.
> (Almost never?  I'd love to hear of any exceptions.)

Tektronix (storage scope) terminals allowed arbitrary overprinting.
The Tek emulation in xterm still supports this.

(Overprinting also used to be central to generating the full
APL symbol set.)





Re: [groff] 02/02: nroff.1.man: Make editorial fixes.

2019-07-01 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Tue, Jul 02, 2019 at 01:03:04AM +1000:
> At 2019-07-01T16:42:10+0200, Ingo Schwarze wrote:

>> I know this is a really minor point - but i don't understand this change:
>> 
>>   $ LC=C printf "a\nA\n" | sort
>>   A
>>   a
>>   $ LC=en_US.UTF-8 printf "a\nA\n" | sort
>>   A
>>   a

> Well, (1) LC is not a POSIX-standard locale-controlling variable,
> as far as I know.  Is it a BSDism?

Ooops, no, it was just a typo.
My test was indeed quite wrong in more than one respect.
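
A sketch of what presumably went wrong beyond the variable name: an assignment prefixed to the first command of a pipeline is passed only to that command, so it never reaches sort. DEMO_VAR is a hypothetical name used purely for illustration; only a POSIX shell and the standard utilities are assumed.

```shell
# An assignment prefix applies only to the command it precedes.
# DEMO_VAR is a hypothetical variable, not a real locale variable.
unset DEMO_VAR
seen=$(DEMO_VAR=1 printf 'x\n' |
       sh -c 'cat >/dev/null; printf "%s" "${DEMO_VAR-unset}"')
printf 'right-hand side of the pipe sees: %s\n' "$seen"

# So the locale variable has to be attached to sort itself.
# In the C locale, sort compares bytes, so capitals come first.
sorted=$(printf 'A\na\nb\nB\n' | LC_ALL=C sort | tr -d '\n')
printf 'LC_ALL=C sort order: %s\n' "$sorted"
```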

> Try this:
> 
> $ printf 'A\na\nb\nB\n' | LC_ALL=en_US.UTF-8 sort
> a
> A
> b
> B

Confirmed on Debian Linux and on Oracle Solaris.

> Then try it with LC_ALL=C.

A
B
a
b

on both systems.

(OpenBSD invariably gives the LC_ALL=C result, but that's no
surprise because LC_COLLATE intentionally does nothing on OpenBSD.)

This is funny (on Solaris 11):

   > printf 'A\na\nb\nB\n' | LC_ALL=zh_CN.UTF-8 sort
  a
  A
  b
  B
   > printf 'A\na\nb\nB\n' | LC_ALL=ja_JP.UTF-8 sort
  A
  B
  a
  b

even though both are valid according to "locale -a".  So the concept
of "locale collation" doesn't appear to be all that well-defined -
but instead locale-dependent - even for base latin characters.

>> The only system i was able to find with "small before capital"
>> is Solaris/illumos.  Linux appears to have no clear convention:
>> most often, ordering is totally random in Linux manual pages.
>> 
>> So why did you change the order?

> To combat the chaos.

Oh well, so it appears there are three different conventions then:

 - the POSIX / traditional BSD / FreeBSD / AIX / ASCII:  -ABCabc
 - the NetBSD / OpenBSD:                                 -AaBbCc
 - the Solaris / illumos / en_US(?):                     -aAbBcC

I'm probably just too used to the first two such that i considered
them universal...

Given all those different conventions, it's probably fair enough
to leave the bike shed painted in the colour you chose.
Sorry for the noise, then.

Yours,
  Ingo



[groff] man 7 groff; was nroff.1.man Make editorial fixes.

2019-07-01 Thread Doug McIlroy
I agree with Ingo about proposed descriptions of \& and sentence
spaces. Elaboration is not explanation.

\& is simply a zero-length character. Its primary use is to disguise
sequences that groff would otherwise unwantedly interpret. For example,
"\&." at the beginning of an input line will be taken as text, not a
groff request. Given the general case, further examples are unnecessary.

"Sentence space" is a fraught convention, mentioned in groff(7) but
not defined. It is not revealed that "sentence space" is extra space,
not the whole space between sentences. Nor is the default sentence space
stated. A first cut at a general definition might be:

BUGS
Extra "sentence space", by default one space character, is
inserted after sentences, which are identified by artificial
intelligence. False identifications may be mitigated by judicious
use of \&.

A personal false-identification hazard: in the court of groff I will
be declared innocent if I call myself M. Douglas McIlroy, but will be
sentenced if I call myself Mr. Douglas McIlroy.

Incidentally, groff(7) defines \n[.ss] enigmatically thus: "The value
of the parameters [sic] set by the first argument of the ss request",
and defines \n[.sss] similarly. A more informative definition would be,
"The value N set by .ss N M". This rules out other plausible values,
e.g. \w' '*N/12.

Doug