Re: PEP 358 and operations on bytes

2006-10-04 Thread John Machin
Fredrik Lundh wrote: > John Machin wrote: > > > But not on other integer subtypes. If regexps should not be restricted > > to text, they should work on domains whose number of symbols is greater > > than 256, shouldn't they? > > they do: > > import re, array > > data = [0, 1, 1, 2] > > array_type

Re: PEP 358 and operations on bytes

2006-10-04 Thread bearophileHUGS
A simple RE engine written in Python can be short, this is a toy: http://paste.lisp.org/display/24849 If you can't live without the usual syntax: http://paste.lisp.org/display/24872 Paul Rubin: > Yes, I want something like that all the time for file scanning without > having to resort to parser mo

Re: PEP 358 and operations on bytes

2006-10-04 Thread Fredrik Lundh
John Machin wrote: > But not on other integer subtypes. If regexps should not be restricted > to text, they should work on domains whose number of symbols is greater > than 256, shouldn't they? they do: import re, array data = [0, 1, 1, 2] array_type = "IH"[re.sre_compile.MAXCODE == 0x] a

Re: PEP 358 and operations on bytes

2006-10-04 Thread Paul Rubin
[EMAIL PROTECTED] writes: > > I think the underlying regexp C library isn't written that way. I can > > see reasons to want a higher-level regexp library that works on > > arbitrary sequences, calling a user-supplied function to classify > > sequence elements, the way current regexps use the chara

Re: PEP 358 and operations on bytes

2006-10-04 Thread bearophileHUGS
Paul Rubin: > I think the underlying regexp C library isn't written that way. I can > see reasons to want a higher-level regexp library that works on > arbitrary sequences, calling a user-supplied function to classify > sequence elements, the way current regexps use the character code to > classif

Re: PEP 358 and operations on bytes

2006-10-04 Thread Paul Rubin
"John Machin" <[EMAIL PROTECTED]> writes: > But not on other integer subtypes. If regexps should not be restricted > to text, they should work on domains whose number of symbols is greater > than 256, shouldn't they? I think the underlying regexp C library isn't written that way. I can see reason

Re: PEP 358 and operations on bytes

2006-10-04 Thread John Machin
Paul Rubin wrote: > "John Machin" <[EMAIL PROTECTED]> writes: > > So why haven't you been campaigning for regular expression support for > > sequences of int, and for various array.array subtypes? > > regexps work on byte arrays. But not on other integer subtypes. If regexps should not be restric
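
A minimal sketch of the "regexps work on byte arrays" point, assuming a current Python 3 interpreter, where the `re` module accepts bytes patterns against `bytes` and `bytearray` objects. The sample values mirror the `data = [0, 1, 1, 2]` list quoted elsewhere in this thread; the pattern and variable names are illustrative only and are not taken from any of the posts.

```python
import re

# Sample values borrowed from the thread; each must fit in one byte (0..255).
data = bytes([0, 1, 1, 2])

# A bytes pattern (note the b prefix): one or more bytes with value 1.
pattern = re.compile(rb"\x01+")

# Searching an immutable bytes object.
m = pattern.search(data)
span_in_bytes = m.span()

# The same compiled pattern also accepts a mutable bytearray.
m2 = pattern.search(bytearray(data))
span_in_bytearray = m2.span()

print(span_in_bytes, span_in_bytearray)
```

This covers only symbols in the range 0..255. It does not address the objection about larger alphabets, since a value such as 300 cannot be stored in a `bytes` object at all.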

Re: PEP 358 and operations on bytes

2006-10-04 Thread Paul Rubin
"John Machin" <[EMAIL PROTECTED]> writes: > So why haven't you been campaigning for regular expression support for > sequences of int, and for various array.array subtypes? regexps work on byte arrays.

Re: PEP 358 and operations on bytes

2006-10-04 Thread John Machin
Gerrit Holl wrote: > On 2006-10-04 05:10:32 +0200, John Machin wrote: > > > - str methods endswith, find, partition, replace, split(lines), > > > startswith, > > > - Regular expressions > > > > > > I think those can be useful on a bytes type. Perhaps bytes and str could > > > share a

Re: PEP 358 and operations on bytes

2006-10-04 Thread Gerrit Holl
On 2006-10-04 05:10:32 +0200, John Machin wrote: > > - str methods endswith, find, partition, replace, split(lines), > > startswith, > > - Regular expressions > > > > I think those can be useful on a bytes type. Perhaps bytes and str could > > share a common parent class? They certain

Re: PEP 358 and operations on bytes

2006-10-03 Thread Steve Holden
Ben Finney wrote: > Steve Holden <[EMAIL PROTECTED]> writes: > > >>This would just be bloat > > > How would it be bloat? I'm describing a situation where the existing > methods merely move, being implemented in a common ancestor rather > than directly in the concrete sequence classes. > > >>w

Re: PEP 358 and operations on bytes

2006-10-03 Thread John Machin
Gerrit Holl wrote: > Hi, > > In Python 3, reading from a file gives bytes rather than characters. > Some operations currently performed on strings also make sense when > performed on bytes, either if it's binary data or if it's text of > unknown or mixed encoding. Those include of course slicing a

Re: PEP 358 and operations on bytes

2006-10-03 Thread Ben Finney
Steve Holden <[EMAIL PROTECTED]> writes: > This would just be bloat How would it be bloat? I'm describing a situation where the existing methods merely move, being implemented in a common ancestor rather than directly in the concrete sequence classes. > without any use cases being demonstrated.

Re: PEP 358 and operations on bytes

2006-10-03 Thread Steve Holden
Ben Finney wrote: > Gabriel G <[EMAIL PROTECTED]> writes: > > >>At Tuesday 3/10/2006 21:52, Ben Finney wrote: >> >> >>>Gerrit Holl <[EMAIL PROTECTED]> writes: >>> - str methods endswith, find, partition, replace, split(lines), startswith, - Regular expressions >>> >>>Loo

Re: PEP 358 and operations on bytes

2006-10-03 Thread Ben Finney
Gabriel G <[EMAIL PROTECTED]> writes: > At Tuesday 3/10/2006 21:52, Ben Finney wrote: > > >Gerrit Holl <[EMAIL PROTECTED]> writes: > > > - str methods endswith, find, partition, replace, split(lines), > > > startswith, > > > - Regular expressions > > > >Looking at those, I don't see

Re: PEP 358 and operations on bytes

2006-10-03 Thread Gabriel G
At Tuesday 3/10/2006 21:52, Ben Finney wrote: Gerrit Holl <[EMAIL PROTECTED]> writes: > operations that aren't currently defined in PEP 358, like: > > - str methods endswith, find, partition, replace, split(lines), > startswith, > - Regular expressions > > I think those can be use

Re: PEP 358 and operations on bytes

2006-10-03 Thread Ben Finney
Gerrit Holl <[EMAIL PROTECTED]> writes: > operations that aren't currently defined in PEP 358, like: > > - str methods endswith, find, partition, replace, split(lines), > startswith, > - Regular expressions > > I think those can be useful on a bytes type. Perhaps bytes and str > coul
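
For readers on a current Python 3, here is a small sketch of the str-style methods listed in the quoted message, applied to a `bytes` object. It assumes the `bytes` type as it eventually shipped, which is not necessarily what PEP 358 specified at the time of this thread. The sample request text and variable names are made up for illustration.

```python
# Hypothetical sample data: text of unknown encoding read as raw bytes.
data = b"GET /index.html HTTP/1.0\r\nHost: example.org\r\n"

# splitlines() and startswith() behave as they do on str, with bytes arguments.
first_line = data.splitlines()[0]
is_get = first_line.startswith(b"GET")

# partition() splits on the first occurrence of the separator.
method, _, rest = first_line.partition(b" ")

# find() returns a byte offset, or -1 when the subsequence is absent.
host_offset = data.find(b"Host")

# endswith() and replace() take bytes arguments as well.
ends_with_crlf = data.endswith(b"\r\n")
unix_newlines = data.replace(b"\r\n", b"\n")

print(method, rest, host_offset, ends_with_crlf)
```

Every argument has to be a bytes-like object. Passing a `str` such as `"GET"` to these methods raises `TypeError`, which keeps the two types distinct even though their method names overlap.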

PEP 358 and operations on bytes

2006-10-03 Thread Gerrit Holl
Hi, In Python 3, reading from a file gives bytes rather than characters. Some operations currently performed on strings also make sense when performed on bytes, either if it's binary data or if it's text of unknown or mixed encoding. Those include of course slicing and other operators that exist i