On Thu, Jun 25, 2020 at 4:40 AM MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> On 2020-06-24 18:59, Chris Angelico wrote:
> > On Thu, Jun 25, 2020 at 3:51 AM Dennis Lee Bieber <wlfr...@ix.netcom.com>
> > wrote:
> >>
> >> On Tue, 23 Jun 2020 20:49:36 +0000, Tony Kaloki <tkal...@live.co.uk>
> >> declaimed the following:
> >>
> >> >Alexander,
> >> > Thank you so much! It worked! Thank you. One question:
> >> > in your reply, are you saying that Python would have treated the two
> >> > separate underscores the same way as a long underscore i.e. it's a
> >> > stylistic choice rather than a functional necessity?
> >>
> >> There is no "long underscore" in the character set. If there were,
> >> Python would not know what to do with it as it was created back when ASCII
> >> and ISO-Latin-1 were the common character sets. (Interesting: Windows
> >> Character Map utility calls the underscore character "low line").
> >
> > That's what Unicode calls it - charmap is probably using that name.
> >
> >> Many word processors are configured to change sequences of hyphens:
> >> - -- --- into - – — (hyphen, en-dash, em-dash)... But in this case, those
> >> are each single characters in the character map (using Windows-Western,
> >> similar to ISO-Latin-1): hyphen is x2D, en-dash is x96, em-dash is x97
> >> (note that en-/em-dash are >127, hence would not be in pure ASCII)
> >
> > Hyphen is U+002D, en dash is U+2013, em dash is 2014. :)
> >
> Not quite. :-)
>
> Hyphen is U+2010.
>
> U+002D is hyphen-minus; it does double-duty, for historical reasons.

True true, I should have corrected that name. But the point is,
"Windows-Western" is not a good way to describe characters (I think
that probably means "code page 1252"?). Use the Unicode codepoints for
reliability :)

ChrisA
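
For anyone who wants to check the names and code points mentioned in this thread, here is a rough sketch using the standard library's unicodedata module. It assumes that "Windows-Western" really does mean code page 1252 (that is my guess, not something confirmed above), and describe() is just a helper name made up for this example.

```python
import unicodedata

# Characters discussed in the thread, written as Unicode escapes.
CHARS = ["\u005F", "\u002D", "\u2010", "\u2013", "\u2014"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (illustrative helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in CHARS:
    print(describe(ch))

# The byte values quoted for "Windows-Western", ASSUMING that means cp1252.
en_dash = b"\x96".decode("cp1252")
em_dash = b"\x97".decode("cp1252")
print(describe(en_dash))
print(describe(em_dash))
```

The last two lines show why the code page matters: x96 and x97 only mean en dash and em dash once you know which 8-bit encoding the bytes came from, whereas the U+ numbers are unambiguous.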