Extend unicodedata with a name/pattern/regex search for character entity references?
https://mail.python.org/pipermail//python-ideas/2014-October/029630.htm

I wanted to know whether the idea in the link above has been implemented, and whether there is a module that accepts a pattern like 'cap' and gives you all the Unicode 'CAP' characters:

⋂ \bigcap
⊓ \sqcap
∩ \cap
♑ \capricornus
⪸ \succapprox
⪷ \precapprox

(The above are from TeX.)

I found two useful modules in this regard: unicode_tex and unicodedata. However, unicodedata is a builtin that does not do globs or regexes, so it is rather limiting.

It would be nice to be able to search HTML/XML character entity references as well.
-- 
https://mail.python.org/mailman/listinfo/python-list
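For reference, a name search of the kind asked about here can be approximated with the standard library alone. The following is only a minimal sketch, not an existing unicodedata feature: the function name search_unicode_names is hypothetical, and it simply walks every code point and tests its official name against a regular expression. Note that Unicode names differ from the TeX names, so \cap is found under INTERSECTION and \sqcap under SQUARE CAP.

```python
import re
import sys
import unicodedata


def search_unicode_names(pattern, flags=re.IGNORECASE):
    """Yield (character, name) pairs whose Unicode name matches `pattern`.

    Hypothetical helper: unicodedata itself offers only exact lookups
    (unicodedata.name and unicodedata.lookup), so this scans every code point.
    """
    regex = re.compile(pattern, flags)
    for codepoint in range(sys.maxunicode + 1):
        char = chr(codepoint)
        # Code points without a name (controls, surrogates, unassigned)
        # return the default '' and are skipped.
        name = unicodedata.name(char, '')
        if name and regex.search(name):
            yield char, name
```

Calling list(search_unicode_names(r'\bINTERSECTION\b')) returns every character whose name contains that word. A full scan covers about 1.1 million code points, so a tool that searches repeatedly would probably want to build the list of names once and reuse it.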
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
On Sun, 4 Sep 2016 06:47 am, Thomas 'PointedEars' Lahn wrote:

> Your posting is lacking a real name in the “From” header field.

Thomas, if that is really your name, how do we know that:

    Thomas 'PointedEars' Lahn

is a real name? It sounds made up to me. I'm afraid that we're going to have to insist that you scan your birth certificate, driver's licence and other forms of ID and send them to us so that we can see proof that this is your real name.

Please either comply, or give up your stupid and pointless obsession with trying to be the Internet Police for something that isn't even a real rule.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
Re: Pythons for .Net
On Sat, 3 Sep 2016 12:34 pm, Denis Akhiyarov wrote: > Finally if anyone can contact Christian Heimes (Python Core Developer), > then please ask him to reply on request to update the license to MIT: > > https://github.com/pythonnet/pythonnet/issues/234 > > He is the only contributor that prevents updating to MIT license. I have emailed him off-list. Thanks for the information on PythonNet, Denis. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
On Sun, Sep 4, 2016 at 11:51 AM, Steve D'Aprano wrote: > On Sun, 4 Sep 2016 06:47 am, Thomas 'PointedEars' Lahn wrote: > >> Your posting is lacking a real name in the “From” header field. > > > Thomas, if that is really your name, how do we know that: > > Thomas 'PointedEars' Lahn > > is a real name? Is sounds made up to me. I'm afraid that we're going to have > to insist that you scan you birth certificate, drivers licence and other > forms of ID and send them to us so that we can see proof that this is your > real name. > > Please either comply, or give up your stupid and pointless obsession with > trying to be the Internet Police for something that isn't even a real rule. > His posts aren't making it across the news->list gateway any more. Killfile him and move on... ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
On Sun, 4 Sep 2016 12:19 pm, Chris Angelico wrote: [...] >> Please either comply, or give up your stupid and pointless obsession with >> trying to be the Internet Police for something that isn't even a real >> rule. > > His posts aren't making it across the news->list gateway any more. > Killfile him and move on... But but but... I couldn't do that. https://www.xkcd.com/386/ -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
On Sun, Sep 4, 2016 at 12:49 PM, Steve D'Aprano wrote: > On Sun, 4 Sep 2016 12:19 pm, Chris Angelico wrote: > > [...] >>> Please either comply, or give up your stupid and pointless obsession with >>> trying to be the Internet Police for something that isn't even a real >>> rule. >> >> His posts aren't making it across the news->list gateway any more. >> Killfile him and move on... > > But but but... I couldn't do that. > > https://www.xkcd.com/386/ Ah, you got me. Can't force anyone to violate standards documents like RFCs and XKCDs. ChrisA who may or may not have recently pushed a commit to make something XKCD 859 compliant... -- https://mail.python.org/mailman/listinfo/python-list
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
Thomas 'PointedEars' Lahn wrote:

> Veek. M wrote:
>
>> https://mail.python.org/pipermail//python-ideas/2014-October/029630.htm
>>
>> Wanted to know if the above link idea,
>
> … which is 404-compliant; the Internet Archive does not have it either
> …
>
>> had been implemented
>
> Probably not.
>
>> and if there's a module that accepts a pattern like 'cap' and give
>> you all the instances of unicode 'CAP' characters.
>
> I do not know any.
>
>> ⋂ \bigcap
>> ⊓ \sqcap
>> ∩ \cap
>> ♑ \capricornus
>> ⪸ \succapprox
>> ⪷ \precapprox
>>
>> (above's from tex)
>>
>> I found two useful modules in this regard: unicode_tex, unicodedata
>> but unicodedata is a builtin which does not do globs, regexs - so
>> it's kind of limiting in nature.
>
> Quick hack:
>
> #
> from unicode_tex import unicode_to_tex_map
>
> for key, value \
>         in filter(lambda item: "cap" in item[1], unicode_to_tex_map.items()):
>     print(key, value)
> #
>
> (Optimizations are welcome.)
>
> It is easy to come up with methods that take a globbing or a regular
> expression (globbing expressions can be turned into regular
> expressions easily) and returns, perhaps as a dictionary or list of
> tuples, only the matching entries.
>
> Other than that I think you will have to turn the Unicode Character
> Database (which is available via HTTP as one huge text file; see the
> Python Tutorial on “Internet Access” for how to get it dynamically)
> into whatever form suits you for querying it.
>
>> Would be nice if you could search html/xml character entity
>> references as well.
>
> For what purpose?
>
> Your posting is lacking a real name in the “From” header field.
>

Ouch! Sorry for the bad link, Thomas. The link is titled '[Python-ideas] Extend unicodedata with a name search' and I suspect this updated link (http://code.activestate.com/lists/python-ideas/29504/) may work; if it doesn't, you could google the title. I don't want to dump/replicate the existing Unicode data in module 'unicodedata'.

Regarding purpose: I need this for hexchat. I IRC a lot, and I often want to ask a question involving math symbols. I've written some Python (included at the bottom) that translates

    \help filter_word

into a list of symbols and names (which serves as a memory jog). It also translates input like

    A \cap B \epsilon C

to

    A ∩ B ε C

but all this works with only a subset of TeX; it can't do complicated formulas. I wanted to extend it further. I don't think I shall be able to subscript integrals easily, but I could make better use of the available Unicode, which means making it more accessible (hence the pattern-matching feature). HTML/XML entities provide another way of remembering symbols.

---

Regarding the name (From field): my name *is* Veek.M, though I tend to shorten it to Vek.M on Google (I think Veek was taken or some such thing). Just to be clear, my parents call me something closely related to Veek that is NOT Beek or Peek or Squeak or Sneak, and my official name is something really weird. Identity theft being what it is, I am probably lying anyhow about all this, but it sounds funny so :p

import hexchat
import re, unicode_tex, unicodedata

__module_name__ = 'Unicode'
__module_version__ = '0.1'
# Raw string, so that "\w" is not read as an (invalid) escape sequence.
__module_description__ = r'Substitute \whatever with Unicode char in cmdline input'

#re_repl = unicodedata.lookup('N-ARY UNION')

def debug(*args):
    hexchat.prnt('#{}#'.format(*args))

def print_help(*args):
    hexchat.prnt('{}'.format(*args))

def replace_tex(match):
    # Look up one \name; unknown names become 'err'.
    return unicode_tex.tex_to_unicode_map.get(match.group(0), 'err')

def send_message(word, word_eol, userdata):
    # Key value 65293 is Return/Enter; ignore every other key press.
    if not(word[0] == "65293"):
        return

    msg = hexchat.get_info('inputbox')
    if msg is None:
        return

    # "\help <word>": list every TeX name containing <word>, then clear the input box.
    x = re.match(r'(^\\help)\s+(\w+)', msg)
    if x:
        needle = x.groups()[1]
        for key, value in unicode_tex.tex_to_unicode_map.items():
            if needle in key:
                print_help(value + ' ' + key)
        hexchat.command("settext %s" % '')
        return

    # Substitute whole \names in one pass, so that \cap is never replaced
    # inside a longer name such as \capricornus.
    msg = re.sub(r'\\\w+', replace_tex, msg)

    hexchat.command("settext %s" % msg)

hexchat.hook_print('Key Press', send_message)
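The two remaining wishes in this message, glob patterns and HTML/XML entity names, can both be covered with the standard library. The following is a rough sketch rather than an existing API: search_entities is a hypothetical name, the glob is converted to a regular expression with fnmatch.translate (the conversion Thomas alludes to), and the entity table is html.entities.html5, which covers the HTML5 named character references only.

```python
import fnmatch
import re
from html.entities import html5


def search_entities(glob_pattern):
    """Return {entity_name: text} for HTML5 entities matching a glob.

    Hypothetical helper; matching is case-insensitive and must cover the
    whole entity name, so use '*cap*' for a substring search.
    """
    # fnmatch.translate turns a shell-style glob into a regex string.
    regex = re.compile(fnmatch.translate(glob_pattern), re.IGNORECASE)
    matches = {}
    for name, text in html5.items():
        # html5 lists some names both with and without the trailing ';',
        # so normalise to the bare name before matching.
        bare = name.rstrip(';')
        if regex.match(bare):
            matches[bare] = text
    return matches
```

For example, search_entities('*cap*') includes 'bigcap', 'sqcap' and 'cap' among its keys. The same compiled pattern could be applied to the keys of unicode_tex.tex_to_unicode_map, so that \help accepts globs instead of plain substrings.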
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
On Saturday, September 3, 2016 at 5:25:48 PM UTC+5:30, Veek. M wrote:
> https://mail.python.org/pipermail//python-ideas/2014-October/029630.htm
>
> Wanted to know if the above link idea, had been implemented and if
> there's a module that accepts a pattern like 'cap' and give you all the
> instances of unicode 'CAP' characters.
> ⋂ \bigcap
> ⊓ \sqcap
> ∩ \cap
> ♑ \capricornus
> ⪸ \succapprox
> ⪷ \precapprox
>
> (above's from tex)
>
> I found two useful modules in this regard: unicode_tex, unicodedata
> but unicodedata is a builtin which does not do globs, regexs - so it's
> kind of limiting in nature.
>
> Would be nice if you could search html/xml character entity references
> as well.

[Not exactly an answer]

I use a number of things for this:

1. Google
2. Xah Lee’s excellent pages, which often fit my brain better than Wikipedia: http://xahlee.info/comp/unicode_index.html
3. Emacs’ function ucs-insert, recently renamed to insert-char

That is, [in Emacs] type Alt-x insert-char. After that, a kind of TAB-globbing (case-insensitive) works. I won't try it with Cap (because the number of *CAPITAL* characters is in the thousands!). For example, alphaTAB gives nothing. However, *alphaTAB gives a bunch. Narrow to "greek alpha"TAB and you get a bunch.

We should have a series of levels for character input, from the most general and unergonomic (Google) to the most specific and ergonomic (a special-purpose keyboard). I've tried to describe this as 7 levels near the end of http://blog.languager.org/2015/01/unicode-and-universe.html
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
On Sunday, September 4, 2016 at 9:32:28 AM UTC+5:30, Veek. M wrote:
> Regarding the name (From field), my name *is* Veek.M though I tend to
> shorten it to Vek.M on Google (i think Veek was taken or some such
> thing). Just to be clear, my parents call me something closely related
> to Veek that is NOT Beek or Peek or Squeak or Sneak and my official name
> is something really weird. Identity theft being what it is, I probably
> am lying anyhow about all this, but it sounds funny so :p

Please don't take the name-police bait. As far as I am concerned, telling someone “I don't like your name” is in the same bracket as “I don't like your religion/skin-color/gender/nationality/etc.” That is, it's highly offensive.
Re: Extend unicodedata with a name/pattern/regex search for character entity references?
Chris Angelico writes: > On Sun, Sep 4, 2016 at 12:49 PM, Steve D'Aprano > wrote: >> On Sun, 4 Sep 2016 12:19 pm, Chris Angelico wrote: >> >> [...] Please either comply, or give up your stupid and pointless obsession with trying to be the Internet Police for something that isn't even a real rule. >>> >>> His posts aren't making it across the news->list gateway any more. >>> Killfile him and move on... >> >> But but but... I couldn't do that. >> >> https://www.xkcd.com/386/ > > Ah, you got me. Can't force anyone to violate standards documents like > RFCs and XKCDs. > > ChrisA > who may or may not have recently pushed a commit to make something > XKCD 859 compliant... (-: (You were baiting for that.) -- https://mail.python.org/mailman/listinfo/python-list