Re: Letter class in re

2015-03-10 Thread Albert-Jan Roskam
On Tue, 3/10/15, Antoon Pardon wrote: Subject: Re: Letter class in re To: python-list@python.org Date: Tuesday, March 10, 2015, 9:35 AM On 09-03-15 at 17:11, Steven D'Aprano wrote: > Antoon Pardon wrote: > >> I am using PLY for a parsing task which uses re

Re: Letter class in re

2015-03-10 Thread Antoon Pardon
On 09-03-15 at 17:11, Steven D'Aprano wrote: > Antoon Pardon wrote: > >> I am using PLY for a parsing task which uses re for the lexical >> analysis. Does anyone >> know what regular expression to use for a sequence of letters? There is >> a class for alphanumerics but I can't find one for just le

Re: Letter class in re

2015-03-10 Thread Antoon Pardon
On 09-03-15 at 16:17, Tim Chase wrote: > On 2015-03-09 15:29, Antoon Pardon wrote: >> On 09-03-15 at 13:50, Tim Chase wrote: (?:(?!_|\d)\w)\w+ >>> If you don't have to treat it as an atom, you can simplify that to >>> just >>> >>> (?!_|\d)\w+ >>> >>> which just means that the first chara

Re: Letter class in re

2015-03-09 Thread Steven D'Aprano
Antoon Pardon wrote: > I am using PLY for a parsing task which uses re for the lexical > analysis. Does anyone > know what regular expression to use for a sequence of letters? There is > a class for alphanumerics but I can't find one for just letters, which I > find odd. > > I am using python 3.4

Re: Letter class in re

2015-03-09 Thread Tim Chase
On 2015-03-09 15:29, Antoon Pardon wrote: > On 09-03-15 at 13:50, Tim Chase wrote: > >> (?:(?!_|\d)\w)\w+ > > If you don't have to treat it as an atom, you can simplify that to > > just > > > > (?!_|\d)\w+ > > > > which just means that the first character can't be an underscore > > or digit. >

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 14:33, Albert-Jan Roskam wrote: > I was going to make the same remark, but with a slightly different solution: > In [1]: repr(re.search("[a-zA-Z]", "é")) > Out[1]: 'None' > > In [2]: repr(re.search(u"[^\d\W_]+", u"é", re.I | re.U)) > Out[2]: '<_sre.SRE_Match object at 0x027CDB10>

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 15:44, Chris Angelico wrote: > On Tue, Mar 10, 2015 at 1:41 AM, Antoon Pardon > wrote: >> On 09-03-15 at 14:35, Chris Angelico wrote: >>> On Mon, Mar 9, 2015 at 11:26 PM, Antoon Pardon >>> wrote: It seems odd that one should need such an ugly expression for something t

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 15:39, Chris Angelico wrote: > On Tue, Mar 10, 2015 at 1:34 AM, Antoon Pardon > wrote: >>> There is str.isidentifier, which returns True if something is a valid >>> identifier name: >>> >> '℮'.isidentifier() >>> True >> Which is not very useful in a context of lexical analysis

Re: Letter class in re

2015-03-09 Thread Chris Angelico
On Tue, Mar 10, 2015 at 1:41 AM, Antoon Pardon wrote: > On 09-03-15 at 14:35, Chris Angelico wrote: >> On Mon, Mar 9, 2015 at 11:26 PM, Antoon Pardon >> wrote: >>> It seems odd that one should need such an ugly expression for something >>> that is >>> used rather frequently for parsing computer

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 14:35, Chris Angelico wrote: > On Mon, Mar 9, 2015 at 11:26 PM, Antoon Pardon > wrote: >> It seems odd that one should need such an ugly expression for something that >> is >> used rather frequently for parsing computer languages and the like. > Possibly because computer language

Re: Letter class in re

2015-03-09 Thread Chris Angelico
On Tue, Mar 10, 2015 at 1:34 AM, Antoon Pardon wrote: >> There is str.isidentifier, which returns True if something is a valid >> identifier name: >> >> >>> '℮'.isidentifier() >> True > > Which is not very useful in a context of lexical analysis. I don't need to > know > if a particular string i
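
To illustrate the str.isidentifier suggestion quoted above, here is a minimal sketch. It only shows that the method judges a complete string, which is roughly the limitation raised in the reply. The helper name classify and the sample strings are invented for this example and are not from the thread.

```python
def classify(candidates):
    """Map each candidate string to the result of str.isidentifier()."""
    return {text: text.isidentifier() for text in candidates}


if __name__ == "__main__":
    # "\u212e" is the ESTIMATED SYMBOL used in the quoted message.
    samples = ["abc", "_private", "1abc", "a-b", "\u212e"]
    for text, ok in classify(samples).items():
        print(repr(text), ok)
```

Note that the method answers yes or no for the whole string. It does not locate identifiers inside a longer input, so a lexer would still need a pattern of its own.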

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 14:32, Wolfgang Maier wrote: ... > >> It seems odd that one should need such an ugly expression for >> something that is >> used rather frequently for parsing computer languages and the like. >> > > There is str.isidentifier, which returns True if something is a valid > identifier

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 13:50, Tim Chase wrote: > On 2015-03-09 13:26, Antoon Pardon wrote: >> On 09-03-15 at 12:17, Tim Chase wrote: >>> (?:(?!_|\d)\w) >> So if I understand correctly the following should be a regular >> expression for a python3 identifier. >> >> (?:(?!_|\d)\w)\w+ > If you don't have

Re: Letter class in re

2015-03-09 Thread Wolfgang Maier
On 03/09/2015 03:04 PM, Wolfgang Maier wrote: On 03/09/2015 02:33 PM, Albert-Jan Roskam wrote: On Mon, 3/9/15, Tim Chase wrote: "[^\d\W_]+" means something like "one or more (+) of 'not (a digit, a non-word, an underscore)'". interesting (using Py

Re: Letter class in re

2015-03-09 Thread Tim Chase
On 2015-03-09 13:26, Antoon Pardon wrote: > On 09-03-15 at 12:17, Tim Chase wrote: >> (?:(?!_|\d)\w) > > So if I understand correctly the following should be a regular > expression for a python3 identifier. > > (?:(?!_|\d)\w)\w+ If you don't have to treat it as an atom, you can simplify tha
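
The two patterns discussed in this message can be compared with a small sketch like the following. It assumes Python 3.4 or later (for re.fullmatch on compiled patterns). The names WORD, ATOM_FORM and is_word are made up for the illustration.

```python
import re

# Negative lookahead: refuse to start on "_" or a digit, then take word characters.
WORD = re.compile(r"(?!_|\d)\w+")

# The earlier form consumes one checked character, then needs at least one more.
ATOM_FORM = re.compile(r"(?:(?!_|\d)\w)\w+")


def is_word(text):
    """True if the whole string matches the simplified pattern."""
    return WORD.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["abc_1", "_abc", "1abc", "x"]:
        print(repr(sample), is_word(sample), ATOM_FORM.fullmatch(sample) is not None)
```

One thing to keep in mind: the lookahead only constrains the position where the match starts. With an unanchored search, WORD can still match "abc" inside "_abc", so it matters whether the lexer anchors the match at the current position.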

Re: Letter class in re

2015-03-09 Thread Wolfgang Maier
On 03/09/2015 02:33 PM, Albert-Jan Roskam wrote: On Mon, 3/9/15, Tim Chase wrote: "[^\d\W_]+" means something like "one or more (+) of 'not (a digit, a non-word, an underscore)'". interesting (using Python3.4 and U+2188 ROMAN NUMERAL ONE HUNDRED
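
For anyone who wants to try the "[^\d\W_]+" class explained above, a minimal sketch follows. It assumes Python 3, where str patterns are Unicode-aware by default. LETTERS and letter_runs are names invented for this example, not anything from the thread.

```python
import re

# Double negation: "not (non-word, digit, underscore)" leaves the letters.
LETTERS = re.compile(r"[^\W\d_]+")


def letter_runs(text):
    """Return every maximal run of letters found in text."""
    return LETTERS.findall(text)


if __name__ == "__main__":
    # Accented letters are kept; digits, underscores and spaces split the runs.
    print(letter_runs("h\u00e9llo w\u00f6rld_42 foo9bar"))
```

The class is built by subtraction from \w, so what counts as a "letter" here is whatever \w accepts minus \d and the underscore.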

Re: Letter class in re

2015-03-09 Thread Serhiy Storchaka
On 09.03.15 14:26, Antoon Pardon wrote: So if I understand correctly the following should be a regular expression for a python3 identifier. (?:(?!_|\d)\w)\w+ It seems odd that one should need such an ugly expression for something that is used rather frequently for parsing computer languages

Re: Letter class in re

2015-03-09 Thread Albert-Jan Roskam
On Mon, 3/9/15, Tim Chase wrote: Subject: Re: Letter class in re To: python-list@python.org Date: Monday, March 9, 2015, 12:17 PM On 2015-03-09 11:37, Wolfgang Maier wrote: > On 03/09/2015 11:23 AM, Antoon Pardon wrote: >> Does an

Re: Letter class in re

2015-03-09 Thread Chris Angelico
On Mon, Mar 9, 2015 at 11:26 PM, Antoon Pardon wrote: > It seems odd that one should need such an ugly expression for something that > is > used rather frequently for parsing computer languages and the like. Possibly because computer language parsers don't use regular expressions. :) ChrisA --

Re: Letter class in re

2015-03-09 Thread Wolfgang Maier
On 03/09/2015 01:26 PM, Antoon Pardon wrote: On 09-03-15 at 12:17, Tim Chase wrote: On 2015-03-09 11:37, Wolfgang Maier wrote: On 03/09/2015 11:23 AM, Antoon Pardon wrote: Does anyone know what regular expression to use for a sequence of letters? There is a class for alphanumerics but I can't

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 12:17, Tim Chase wrote: > On 2015-03-09 11:37, Wolfgang Maier wrote: >> On 03/09/2015 11:23 AM, Antoon Pardon wrote: >>> Does anyone know what regular expression to use for a sequence of >>> letters? There is a class for alphanumerics but I can't find one >>> for just letters, which

Re: Letter class in re

2015-03-09 Thread Tim Chase
On 2015-03-09 11:37, Wolfgang Maier wrote: > On 03/09/2015 11:23 AM, Antoon Pardon wrote: >> Does anyone know what regular expression to use for a sequence of >> letters? There is a class for alphanumerics but I can't find one >> for just letters, which I find odd. > > how about [a-zA-Z] ? That b

Re: Letter class in re

2015-03-09 Thread Antoon Pardon
On 09-03-15 at 11:37, Wolfgang Maier wrote: > On 03/09/2015 11:23 AM, Antoon Pardon wrote: >> I am using PLY for a parsing task which uses re for the lexical >> analysis. Does anyone >> know what regular expression to use for a sequence of letters? There is >> a class for alphanumerics but I can't

Re: Letter class in re

2015-03-09 Thread Wolfgang Maier
On 03/09/2015 11:23 AM, Antoon Pardon wrote: I am using PLY for a parsing task which uses re for the lexical analysis. Does anyone know what regular expression to use for a sequence of letters? There is a class for alphanumerics but I can't find one for just letters, which I find odd. I am using

Letter class in re

2015-03-09 Thread Antoon Pardon
I am using PLY for a parsing task which uses re for the lexical analysis. Does anyone know what regular expression to use for a sequence of letters? There is a class for alphanumerics but I can't find one for just letters, which I find odd. I am using python 3.4 -- Antoon Pardon -- https://mail