Re: The Technical Committee needs you!

2023-10-14 Thread Kunal Ambasta
Dear Matthew

Thank you so much for your email. I would like to present myself as a
suitable candidate for the position and request that you consider me. I am
from India, and for the last 8 years I have been working closely with
Raspberry Pi and Raspbian OS (headless). My fields of interest are AI, ML
and Robotics. I am working very actively on offline speech recognition and
computer vision. Since 2011, I have been using Debian as my main production
and testing system. Debian is also my primary application deployment
platform.

I request you to kindly go through my LinkedIn profile and, if you find me
suitable, consider my active participation in the Debian team.

https://www.linkedin.com/in/kunal-ambasta-124036179

Waiting eagerly for your response.

Thank you.

Warm Regards.

On Sat, Oct 14, 2023 at 8:03 PM Matthew Vernon wrote:

> Dear Debian Developers,
>
> The Debian Technical Committee (TC) is seeking nominations for new
> members, since Simon McVittie's term on the TC ends at the end of the
> year. I'd like to thank Simon, on behalf of the TC, for his service.
>
> Please consider if you might be a good member, or if you know someone
> who might be, and let us know :) To nominate, please email
> debian-ctte-priv...@debian.org with the subject "TC nomination of
> loginname", where loginname is the nominee's Debian account login[0].
>
> The TC can offer advice, decide matters of technical policy (and where
> Developers' jurisdiction overlaps), and make decisions that people
> refer to it[1]. To do this well, it needs members who are able to
> contribute constructively to written discussions (given this is how
> Debian makes decisions), and are able to help resolve disagreements
> (technical and otherwise).
>
> If this sounds like you, please nominate yourself! If this makes you
> think of another Developer, please nominate them. In either case,
> please include examples and/or links showing the nominee has the
> qualities mentioned in the previous paragraph.
>
> The TC meets monthly, and we also expect members to keep up with email
> discussions regarding any issues before the TC in a timely
> manner. Typically this is not a lot of time, but might be up to 10
> hours a month in busy periods.
>
> Our next meeting is 2023-11-14, so ideally we'd like nominations by
> then, but you can nominate yourself or someone else at any time.
>
> Regards,
>
> Matthew
> for the Debian Technical Committee
>
> [0] If you're not sure, you can look these up at https://db.debian.org/
> [1] For more details - https://www.debian.org/devel/constitution#item-6


Hyphens in man pages

2023-10-14 Thread Antonio Russo
Hello,

I discovered a new pet peeve today: if you search for a command in a manual
page, say -e in man 1 zgrep, it's a crapshoot whether just searching for '-e'
will find the command or not.  The reason is that "-" may have been
accidentally encoded as ‐ instead of -.

Now, depending on your email client and settings, the above will appear to be
the ravings of an unhinged lunatic who wrote the same thing twice, or an
unhinged lunatic who slammed their fists onto the keyboard.

The reason is that man(1) converts bare dashes (0x2D) to hyphens (U+2010).
These are not the same symbol: searching for one does not find the other
without some kind of normalization, and pasting commands with one vs. the
other does different things.  New users who do not understand this will be
discouraged from trying to read manual pages.  Chances are, they will fill
forums with mundane questions that could and should have been addressed by a
simple search of a manual page.
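
A minimal Python sketch of why the search fails, and of the kind of
normalization mentioned above.  The sample line and the helper name
to_ascii_dashes are made up for illustration; they are not taken from zgrep's
actual page.

```python
HYPHEN = "\u2010"        # what a UTF-8 rendering may show for an unescaped "-"
HYPHEN_MINUS = "\u002d"  # what the keyboard and the shell use

# Hypothetical rendered line from a man page (an assumption, not real output).
rendered = f"{HYPHEN}e pattern  Use pattern as the search pattern."


def to_ascii_dashes(text: str) -> str:
    """Map U+2010 HYPHEN back to U+002D so a plain '-e' search matches."""
    return text.translate({ord(HYPHEN): HYPHEN_MINUS})


print("-e" in rendered)                   # False
print("-e" in to_ascii_dashes(rendered))  # True
```

The mapping is explicit because U+2010 and U+002D are distinct code points;
comparing or searching the raw strings treats them as unrelated characters.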

I recently fixed a ton of these in another upstream package with this vim
"one-liner":

:%s/--\([a-z]\+\)\(-[a-z]\+\)*/\=substitute(submatch(0), '-', '\\-', 'g')/g

However, this requires manual review and does not fix the '-e' example from
zgrep.  There are also a whole host of problems of this kind, e.g., dashes in
URLs that get naively pasted into man pages (another live example I just
addressed).
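
For readers who do not use vim, the substitution above can be approximated in
Python.  This is only a sketch of the same idea (escape every dash inside a
GNU-style long option); the function name and the sample input are
hypothetical, and like the vim version it needs manual review and leaves
short options such as '-e' alone.

```python
import re

# Same shape as the vim pattern: "--" followed by lowercase words joined by "-".
LONG_OPTION = re.compile(r"--[a-z]+(?:-[a-z]+)*")


def escape_long_options(line: str) -> str:
    """Rewrite each dash inside a long option as the roff escape \\-."""
    return LONG_OPTION.sub(lambda m: m.group(0).replace("-", r"\-"), line)


print(escape_long_options("Use --no-messages for long-term quiet."))
```

Ordinary hyphenated words such as "long-term" do not match the pattern, so
they pass through unchanged.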

I come here with several questions:

 - Am I off-base thinking this is a problem?
 - Should we really be using troff to typeset anything in this year 2023?
   (In particular, if we can make the source text more human-readable, we might
   be able to leverage LLMs on this wealth of information in the future and
   automate support.  Are LLMs "fluent" in troff? I have not investigated at
   all.)
 - Are there any alternatives that actually produce nice looking man pages?
   (My experience with pandoc is that the source is still awkward, I literally
   just found another example of this bug in my own man page, and it looks
   pretty ugly in man. But maybe I just didn't find good examples/documentation.)
 - Should we try to come up with some lintian rules to flag this behavior?
   (This one: /--\([a-z]\+\)\(-[a-z]\+\)*/ finds long GNU-style commands, I'd
   have to think for at least a little bit about finding short ones.  This would
   ultimately be fragile. For example, the above doesn't find partially broken
   tokens; i.e., only one unescaped dash.)
   Automated tooling around this, more generally, seems fragile.  HTML might
   have been a nice compromise, but writing that appears to be out of vogue
   these days, despite being a pretty OK thing to read and write by hand. But
   seriously, I would love to be writing HTML instead of troff for manual
   pages.
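
To make the lintian idea above concrete, here is a rough Python sketch of such
a check.  It is not an existing lintian rule; the function name and sample
input are hypothetical.  Unlike the bare pattern quoted above, it also flags
partially escaped tokens, but it knows nothing about roff comments, macros or
short options, so it stays fragile in the ways already described.

```python
import re

# A long-option-shaped token whose dashes may or may not be escaped as \-.
OPTION_TOKEN = re.compile(r"(?:\\?-){2}[a-z]+(?:\\?-[a-z]+)*")
# A dash that is not preceded by a backslash.
BARE_DASH = re.compile(r"(?<!\\)-")


def find_unescaped_options(source: str) -> list[tuple[int, str]]:
    """Return (line number, token) for long options with any bare dash."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in OPTION_TOKEN.finditer(line):
            if BARE_DASH.search(match.group(0)):
                hits.append((number, match.group(0)))
    return hits


sample = "\n".join([
    r".B \-\-fully\-escaped",
    r".B --not-escaped",
    r".B \-\-half-escaped",
])
for number, token in find_unescaped_options(sample):
    print(number, token)
```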

Antonio



Re: Hyphens in man pages

2023-10-14 Thread Jochen Sprickerhof

Hi Antonio,

this is discussed in:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052675

Cheers Jochen




Re: Hyphens in man pages

2023-10-14 Thread G. Branden Robinson
At 2023-10-14T20:51:27-0600, Antonio Russo wrote:
> I discovered a new pet peeve today: if you search for a command in a
> manual page, say -e in man 1 zgrep, it's a crapshoot whether just
> searching for '-e' will find the command or not.  The reason is that
> "-" may have been accidentally encoded as ‐ instead of -.

You can blame me for this.

https://git.savannah.gnu.org/cgit/groff.git/tree/NEWS?h=1.23.0#n206

...me, and man page authors who don't think about whether they intend
a hyphen or a minus sign when they strike the '-' key...

Quick background: in the context of Unix usage as documented by
nroff/troff, the dash used at the shell prompt, in text editors, and in
programming language source code is a "minus sign".  troff has also had an
em dash special character since the mid-1970s; groff adds an en dash and
furthermore supports user definition of characters
providing access to any other sort of dash that comes down the Unicode
pike.  (Not that doing so is a good idea in a man page; see below
regarding a "restricted dialect" of man(7).)

> Now, depending on your email client and settings, the above will
> appear to be the ravings of an unhinged lunatic who wrote the same
> thing twice, or an unhinged lunatic who slammed their fists onto the
> keyboard.

This issue does indeed have a history of provoking unhinged lunacy.

Before we proceed, you might wish to be aware of
 and its
proposed remedy.

> The reason is that man(1) converts bare dashes (0x2D) to hyphens
> (U+2010).  These are not the same symbol: searching for one does not
> find the other without some kind of normalization, and pasting commands
> with one vs. the other does different things.  New users who do not
> understand this will be discouraged from trying to read manual pages.
> Chances are, they will fill forums with mundane questions that could
> and should have been addressed by a simple search of a manual page.

I run into this problem, too, since I dogfood my own changes.  When
irritated by this, I try the search again, replacing '-' with '.', which
has yet to fail me (and produces false positives surprisingly rarely).

For example, I've recently been playing with the mg(1) editor, and
observed extremely poor discipline in this area.  So I forked it on
GitHub and have been preparing a bunch of revisions.  I wrote a sed
script to fix its numerous hyphen/dash problems.[1]

> I recently fixed a ton of these in another upstream package with this
> vim "one-liner":
> 
> :%s/--\([a-z]\+\)\(-[a-z]\+\)*/\=substitute(submatch(0), '-', '\\-', 'g')/g

My Vimscript is not very sophisticated, but it looks like you're
replacing only hyphens that appear in long option names here.  That's
good, as you're unlikely to clobber any hyphens that should _not_ become
minus signs.

Such discernment is important.  Many people who want to "solve" this
issue forget (or ignore) that not every '-' is a minus sign.  Some are
actual hyphens, as in "long-term effects" and "word-aligned struct
members".  Trying to infer a distinction from white space adjacency also
won't work.  Consider the phrases "word- or byte-sized caching" and
"object-based vs. -oriented programming".  While sophistication with
compound hyphenated affixes is seldom seen in man pages, we most often
find it where a man page author has taken considerable care with their
technical writing.  Such pages are less likely than most to require
revision with blunt instruments like regular expression-based global
search and replace operations.
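
A small Python illustration of why white space adjacency is not enough.  The
rule encoded here (a dash that follows white space and precedes a word
character starts an option) is a deliberately naive assumption made for this
sketch, not something anyone in this thread proposed.

```python
import re

# Naive rule: dashes at the start of a white-space-delimited word mark an option.
LEADING_DASH_WORD = re.compile(r"(?<!\S)-+\w[\w-]*")


def guess_options(text: str) -> list[str]:
    """Return the words the naive rule would treat as command-line options."""
    return LEADING_DASH_WORD.findall(text)


print(guess_options("use -e to give a pattern"))                # ['-e']
print(guess_options("object-based vs. -oriented programming"))  # ['-oriented']
```

The second result is a false positive: "-oriented" is a hyphenated affix, yet
the rule cannot tell it apart from an option, so a human still has to look.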

> However, this requires manual review

Surprisingly often, the composition of high-quality technical
documentation requires the engagement of a human brain.

> and does not fix the '-e' example from zgrep.

Mapping all hyphens and minus signs to a single character, as people
whose blood pressure spikes over this issue tend to promote as a first
resort, is an ineluctably information-discarding operation.  In my
opinion, man page source documents are not the correct place to discard
that information.

(I acknowledge that you didn't propose such a crude remedy; I write to
anticipate the inevitable follow-ups from people who will.)

Doing so at rendering time is much more defensible, and happens anyway
on devices that do not distinguish these characters in the first place.

> There are also a whole host of problems of this kind, e.g., dashes in
> URLs that get naively pasted into man pages (another live example I
> just addressed).

Yes, people commonly type URLs and email addresses into man page sources
as they would into an MUA or browser navigation bar.  Since U+2010 is
difficult to encode in such things, the man(7) package could help by
performing an automatic character translation in this area.  However,
(1) no one's actually asked for this and (2) it would address only a
tiny part of the problem.  The means of "help" I have in mind is
employment of the groff man(7) extension macros `UR`/`UE` and `MT`/`ME`