any chance for contracts and invariants in Python?

2013-02-14 Thread mrkafk

This PEP seems to be gathering dust:

http://www.python.org/dev/peps/pep-0316/

I was thinking the other day: would contracts and invariants not be better than
unit tests? That is, they could do what unit tests do and more, because they run
at execution time and not just at development time.
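
To make the idea concrete, here is a minimal sketch of call-time checks written as a plain decorator. It is not the docstring syntax that PEP 316 proposes; the names contract, pre, post and isqrt are made up for illustration, and the code assumes Python 3.

```python
import functools

def contract(pre=None, post=None):
    """Hypothetical decorator: check pre(*args, **kwargs) and post(result) on every call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Precondition: validate the arguments before running the function.
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError("precondition failed for %s" % func.__name__)
            result = func(*args, **kwargs)
            # Postcondition: validate the result before handing it back.
            if post is not None and not post(result):
                raise AssertionError("postcondition failed for %s" % func.__name__)
            return result
        return wrapper
    return decorate

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def isqrt(x):
    """Integer part of the square root of a non-negative number."""
    return int(x ** 0.5)
```

Unlike a unit test, the check fires on whatever input the running program happens to pass in, at the cost of doing the work on every call.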

-- 
http://mail.python.org/mailman/listinfo/python-list


s.split() on multiple separators

2007-09-30 Thread mrkafk
Hello everyone,

OK, so I want to split a string c into words using several different
separators from a list (dels).

I can do this the following C-like way:

>>> c=' abcde abc cba fdsa bcd '.split()
>>> dels='ce '
>>> for j in dels:
        cp=[]
        for i in xrange(0,len(c)-1):
            cp.extend(c[i].split(j))
        c=cp


>>> c
['ab', 'd', '', 'ab', '', '']

But surely there is a more Pythonic way to do this?

I cannot do this:

>>> for i in dels:
        c=[x.split(i) for x in c]

because x.split(i) returns a list, so after the first pass c would be a
list of lists rather than a list of strings.
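
For comparison, here is a sketch of two ways around this, assuming Python 3. The function names split_on_any and split_on_each are made up for illustration. The first builds a character class for re.split; the second keeps the loop but flattens with a nested comprehension.

```python
import re

def split_on_any(text, separators):
    """Split text wherever any single character from separators occurs."""
    # re.escape keeps characters such as ']' or '^' literal inside the class.
    pattern = "[" + re.escape(separators) + "]"
    return re.split(pattern, text)

def split_on_each(words, separators):
    """Split every word on each separator in turn, flattening as we go."""
    for sep in separators:
        # The nested comprehension avoids ending up with a list of lists.
        words = [piece for word in words for piece in word.split(sep)]
    return words
```

Note that the two are not equivalent on the raw string: split_on_each is meant to be fed the result of str.split(), which already drops the empty strings at the edges.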



Re: s.split() on multiple separators

2007-09-30 Thread mrkafk

> > ['ab', 'd', '', 'ab', '', '']
>
> Given your original string, I'm not sure how that would be the
> expected result of "split c on the characters in dels".

Oops, the inner loop should be:

for i in xrange(0,len(c)):

Now it works.


>   >>> c=' abcde abc cba fdsa bcd '
>   >>> import re
>   >>> r = re.compile('[ce ]')
>   >>> r.split(c)
>   ['', 'ab', 'd', '', 'ab', '', '', 'ba', 'fdsa', 'b', 'd', '']
>
> given that a regexp object has a split() method.

That's probably the optimal solution. Thanks!

Regards,
Marcin



Re: s.split() on multiple separators

2007-09-30 Thread mrkafk
On 30 Sep, 20:27, William James <[EMAIL PROTECTED]> wrote:
> On Sep 30, 8:53 am, [EMAIL PROTECTED] wrote:

> E:\Ruby>irb
> irb(main):001:0> ' abcde abc cba fdsa bcd '.split(/[ce ]/)
> => ["", "ab", "d", "", "ab", "", "", "ba", "fdsa", "b", "d"]

That's acceptable only if you write a perfect Ruby-to-Python
translator. ;-P

Regards,
Marcin



Re: s.split() on multiple separators

2007-09-30 Thread mrkafk
> >  c=' abcde abc cba fdsa bcd '.split()
> >  dels='ce '
> >  for j in dels:
> >      cp=[]
> >      for i in xrange(0,len(c)-1):
>
> The "-1" looks like a bug; remember in Python 'stop' bounds
> are exclusive. The indexes of c are simply xrange(len(c)).

Yep, I just found that out: I forgot that the high stop bound is
exclusive. It still seems a bit counterintuitive to me, even if it
makes for more elegant code.

From my POV, if I want a sequence from here to there, it should include
both here and there.

I do understand the consequence of making the high bound exclusive, which
is more elegant code: xrange(len(c)). But it does seem a bit
illogical...
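
A small sketch of how the exclusive stop bound works out in practice, written for Python 3 (range instead of xrange). The word list mirrors the one from the original post; the variable names are only for illustration.

```python
words = ['abcde', 'abc', 'cba', 'fdsa', 'bcd']

# With an exclusive stop bound, range(len(words)) yields every valid index.
all_indexes = list(range(len(words)))

# Subtracting 1 silently skips the last element, as the buggy loop did.
short_indexes = list(range(0, len(words) - 1))

# Half-open ranges chain without overlap or gaps: [0, 3) + [3, 5) == [0, 5).
first_part = list(range(0, 3))
second_part = list(range(3, 5))

# The length of a half-open range is simply stop - start.
span = len(range(2, 7))
```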

>  print re.split('[ce ]', c)

Yes, that does the job. Thanks.

Regards,
Marcin



Descriptors and side effects

2007-11-04 Thread mrkafk
Hello everyone,

I'm trying to do a seemingly trivial thing with descriptors: have
another attribute updated on dot access in an object defined using
descriptors.

For example, let's take a simple case where you set an attribute s
to a string and have another attribute l set automatically to its
length.

>>> class Desc(str):
        def __init__(self,val):
            self.s=val
            self.l=len(val)
            print "creating value: ", self.s
            print "id(self.l)", id(self.l)
        def __set__(self, obj, val):
            self.s=val
            self.l=len(val)
            print "setting value:", self.s, "length:", self.l
        def __get__(self, obj, type=None):
            print "getting value:", self.s, "length:", self.l
            return self.l


>>> class some(str):
        m=Desc('abc')
        l=m.l


creating value:  abc
id(self.l) 10049688
>>> ta=some()
>>> ta.m='test string'
setting value: test string length: 11

However, the attribute ta.l didn't get updated:

>>> ta.l
3

What makes this even weirder is that the object id of ta.l is the same
as the id(self.l) printed when the descriptor was created:

>>> id(ta.l)
10049688

A setter function should have updated self.l just like it updated
self.s:

    def __set__(self, obj, val):
        self.s=val
        self.l=len(val)
        print "setting value:", self.s, "length:", self.l

Yet it didn't happen.

From my POV, the main benefit of a descriptor lies in its side effect:
on dot access (getting/setting) I can get other attributes updated
automatically: say, in a class of Squares I get the area automatically
updated on updating the side, etc.

Yet, I'm struggling with getting it done in Python. Descriptors are a
great idea, but I would like to see them implemented in Python in a
way that makes it easier to get desirable side effects.
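
For what it's worth, the class-body line l=m.l runs once, when class some is created, so it binds l to the integer 3 rather than to anything live. The later __set__ updates the descriptor's own self.l, which ta.l never looks at. Below is a minimal sketch of one way to get the side effect, assuming Python 3.6+ (for __set_name__). The names Text and Some are made up for illustration, and the value is stored on the instance rather than on the shared descriptor.

```python
class Text:
    """Descriptor that stores a string per instance and updates a length attribute."""

    def __set_name__(self, owner, name):
        # Remember where to keep the per-instance value, e.g. '_m' for attribute 'm'.
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        return getattr(obj, self.storage, "")

    def __set__(self, obj, value):
        setattr(obj, self.storage, value)
        obj.l = len(value)  # the side effect: keep the length in sync


class Some:
    m = Text()
    l = 0  # default length until m is assigned on an instance


ta = Some()
ta.m = 'test string'
tb = Some()  # a second instance is not affected by ta's assignment
```

Here ta.l follows ta.m, while tb.l stays at the class default because nothing is shared through the descriptor object itself.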



File to dict

2007-12-07 Thread mrkafk
Hello everyone,

I have written this small utility function for transforming a legacy
file into a Python dict:


def lookupdmo(domain):
    lines = open('/etc/virtual/domainowners','r').readlines()
    lines = [ [y.lstrip().rstrip() for y in x.split(':')] for x in lines]
    lines = [ x for x in lines if len(x) == 2 ]
    d = dict()
    for line in lines:
        d[line[0]]=line[1]
    return d[domain]

The /etc/virtual/domainowners file contains colon-separated
entries:
domain1.tld: owner1
domain2.tld: own2
domain3.another: somebody
...

Now, the above lookupdmo function works. However, it's rather tedious
to transform files into dicts this way and I have quite a lot of such
files to transform (like custom 'passwd' files for virtual email
accounts etc).

Is there a more clever / more Pythonic way of parsing files like
this? Say, I would like to transform a file containing entries like
the following into a list of lists with colons treated as
separators, i.e. this:

tm:$1$$:1010:6::/home/owner1/imap/domain1.tld/tm:/sbin/nologin

would get transformed into this:

[ ['tm', '$1$$', '1010', '6', '', '/home/owner1/imap/domain1.tld/tm',
'/sbin/nologin'], [...], [...] ]
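
One way this could look, sketched with the standard csv module and ':' as the delimiter, assuming Python 3. The function name parse_colon_file is made up, and the sample line is the one from above, fed through io.StringIO instead of a real file.

```python
import csv
import io

def parse_colon_file(fileobj):
    """Return one list of fields per non-empty line, splitting on ':'."""
    # QUOTE_NONE: treat quote characters as ordinary data, as in passwd-style files.
    reader = csv.reader(fileobj, delimiter=':', quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]  # blank lines come back as [] and are dropped

sample = io.StringIO(
    "tm:$1$$:1010:6::/home/owner1/imap/domain1.tld/tm:/sbin/nologin\n"
    "\n"
)
rows = parse_colon_file(sample)
```

The empty field between the two adjacent colons comes back as an empty string, so every row keeps the same number of columns.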



Re: File to dict

2007-12-07 Thread mrkafk

> >>> def shelper(line):
> ...     return x.replace(' ','').strip('\n').split(':',1)

Argh, typo, should be def shelper(x) of course.




Re: File to dict

2007-12-07 Thread mrkafk

> I guess Duncan's point wasn't the construction of the dictionary but the
> throw it away part.  If you don't keep it, the loop above is even more
> efficient than building a dictionary with *all* lines of the file, just to
> pick one value afterwards.

Sure, but I have two options here, neither of them nice: either "write C
in Python" or do it in an inefficient and still elaborate way.

Anyway, I found my nirvana at last:

>>> def shelper(line):
...     return x.replace(' ','').strip('\n').split(':',1)
...

>>> ownerslist = [ shelper(x)[1] for x in it if len(shelper(x)) == 2 and
...                shelper(x)[0] == domain ]

>>> ownerslist
['da2']


Python rulez. :-)






Re: File to dict

2007-12-07 Thread mrkafk


> The csv module is your friend.

(slapping forehead) why the Holy Grail didn't I think about this? That
should be much simpler than using SimpleParse or SPARK.

Thx Bruno & everyone.


Re: File to dict

2007-12-07 Thread mrkafk

Duncan Booth wrote:
> Just some minor points without changing the basis of what you have done
> here:

All good points, thanks. Phew, there's nothing like peer review for
your code...

> But why do you construct a dict from that input data simply to throw it
> away?

Because comparing strings for equality in a loop is writing C in
Python, and that's exactly what I'm trying to unlearn.

The proper way to do it is to produce a dictionary and look up a value
using a key.

> If you only want 1 domain from the file just pick it out of the list.

for item in list:
    if item == 'searched.domain':
        return item...

Yuck.


> with open('/etc/virtual/domainowners','r') as infile:
>     pairs = [ line.split(':',1) for line in infile if ':' in line ]

Didn't think about doing it this way. Good point. Thx



Re: File to dict

2007-12-07 Thread mrkafk


Glauco wrote:

> cache = None
>
> def lookup( domain ):
>     if not cache:
>         cache = dict( [map( lambda x: x.strip(), x.split(':'))  for x in
>                        open('/etc/virtual/domainowners','r').readlines()])
>     return cache.get(domain)

Neat solution! It just needs a small correction for empty or badly
formed lines:

dict([map( lambda x: x.strip(), x.split(':'))  for x in
      open('/etc/virtual/domainowners','r') if ':' in x])
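
One caveat with the cached version: assigning to cache inside lookup() makes it a local name unless it is declared global, so the first "if not cache" would raise UnboundLocalError. Below is a sketch of the same idea with that addressed, assuming Python 3. The helper name load_owners and the path parameter are additions for illustration; the default path is the one from the thread.

```python
_cache = None  # filled on first lookup, reused afterwards

def load_owners(lines):
    """Build {domain: owner} from 'domain: owner' lines, skipping lines without ':'."""
    return dict(
        [part.strip() for part in line.split(':', 1)]
        for line in lines
        if ':' in line
    )

def lookup(domain, path='/etc/virtual/domainowners'):
    """Return the owner of domain, reading the file only on the first call."""
    global _cache  # without this, the assignment below would create a local
    if _cache is None:
        with open(path) as infile:
            _cache = load_owners(infile)
    return _cache.get(domain)
```

Splitting with a maximum of one split keeps any further colons inside the owner field, so every surviving line yields exactly one key/value pair.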



Daily WTF with XML, or error handling in SAX

2008-05-03 Thread mrkafk

So I set out to learn handling three-letter-acronym files in Python,
and SAX worked nicely until I encountered badly formed XML, e.g. with
bad characters in it (well, Unicode is supposed to handle it all but
apparently doesn't). I'm using http://dchublist.com/hublist.xml.bz2 as
example data, with the goal of extracting the Users and Address
properties where the number of Users is greater than a given number.

So I extended my First XML Example with an error handler:

# = snip ===
from xml.sax import make_parser
from xml.sax.handler import ContentHandler
from xml.sax.handler import ErrorHandler

class HubHandler(ContentHandler):
    def __init__(self, hublist):
        ContentHandler.__init__(self)
        self.Address = ''
        self.Users = ''
        # keep a reference to the caller's list instead of relying on a global
        self.hl = hublist
    def startElement(self, name, attrs):
        self.Address = attrs.get('Address',"")
        self.Users = attrs.get('Users', "")
    def endElement(self, name):
        if name == "Hub" and int(self.Users) > 2000:
            #print self.Address, self.Users
            self.hl.append({self.Address: int(self.Users)})

class HubErrorHandler(ErrorHandler):
    def __init__(self):
        pass
    def error(self, exception):
        print "Error, exception: %s\n" % exception
    def fatalError(self, exception):
        print "Fatal Error, exception: %s\n" % exception

hl = []

parser = make_parser()

hHandler = HubHandler(hl)
errHandler = HubErrorHandler()

parser.setContentHandler(hHandler)
parser.setErrorHandler(errHandler)

fh = file('hublist.xml')
parser.parse(fh)

# order hubs by user count (each entry is a one-item dict)
def compare(x,y):
    if x.values()[0] > y.values()[0]:
        return 1
    elif x.values()[0] < y.values()[0]:
        return -1
    return 0

hl.sort(cmp=compare, reverse=True)

for h in hl:
    print h.keys()[0], "   ", h.values()[0]
# = snip ===

And then BAM, Pythonwin has hit me:


>>> execfile('ph.py')
Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid
token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid
token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid
token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid
token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid
token)


>>>  RESTART 

Just before the "RESTART" line, Windows announced that it had killed
the pythonw.exe process (I suppose it was a child process).

WTF is happening here? Wasn't the fatalError method in HubErrorHandler
supposed to handle the invalid tokens? And why is the message repeated
many times? My method is apparently called, but something in SAX goes
awry and the interpreter crashes.
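
One possible explanation for the repeated message, assuming the stock expat-based reader: the default ErrorHandler.fatalError raises the exception it is given, and an override that only prints lets the parser keep feeding data into an already failed parse. That would not by itself explain the pythonw.exe crash. Below is a sketch in Python 3 that re-raises in fatalError and catches the exception around parse(). The names StopOnFatal, parse_hubs and min_users are made up for illustration.

```python
import io
from xml.sax import make_parser, SAXParseException
from xml.sax.handler import ContentHandler, ErrorHandler

class HubHandler(ContentHandler):
    """Collect (Address, Users) pairs for Hub elements above a user threshold."""
    def __init__(self, hublist, min_users):
        super().__init__()
        self.hublist = hublist
        self.min_users = min_users

    def startElement(self, name, attrs):
        if name != "Hub":
            return
        users = attrs.get("Users", "")
        if users.isdigit() and int(users) > self.min_users:
            self.hublist.append((attrs.get("Address", ""), int(users)))

class StopOnFatal(ErrorHandler):
    """Report recoverable errors, but abort the parse on fatal ones."""
    def error(self, exception):
        print("Error: %s" % exception)

    def fatalError(self, exception):
        raise exception  # same behaviour as the default handler

def parse_hubs(source, min_users=2000):
    """Parse a file-like object; keep whatever was collected before a fatal error."""
    hubs = []
    parser = make_parser()
    parser.setContentHandler(HubHandler(hubs, min_users))
    parser.setErrorHandler(StopOnFatal())
    try:
        parser.parse(source)
    except SAXParseException as exc:
        print("Stopped at line %s: %s" % (exc.getLineNumber(), exc.getMessage()))
    hubs.sort(key=lambda pair: pair[1], reverse=True)
    return hubs
```

Usage would be along the lines of parse_hubs(open('hublist.xml', 'rb')). Hubs seen before the malformed spot are still returned, since SAX delivers events as it goes.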




Re: dict invert - learning question

2008-05-03 Thread mrkafk
Assuming all the values are unique:

>>> a={1:'a', 2:'b', 3:'c'}

>>> dict(zip(a.keys(), a.values()))

{1: 'a', 2: 'b', 3: 'c'}

The problem is you obviously can't assume that in most cases.

Still, zip() is a very useful function.


Re: dict invert - learning question

2008-05-03 Thread mrkafk
On 4 May, 01:27, [EMAIL PROTECTED] wrote:

> >>> a={1:'a', 2:'b', 3:'c'}

Oops, it should obviously be:

>>> dict(zip(a.values(), a.keys()))
{'a': 1, 'c': 3, 'b': 2}
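
For the case where values are not unique, here is a small sketch (Python 3) that groups the keys per value instead of letting later keys overwrite earlier ones. The function names invert_unique and invert_grouped are made up for illustration.

```python
from collections import defaultdict

def invert_unique(mapping):
    """Swap keys and values; only safe when every value is unique and hashable."""
    return {value: key for key, value in mapping.items()}

def invert_grouped(mapping):
    """Map each value to the list of keys that carried it."""
    inverted = defaultdict(list)
    for key, value in mapping.items():
        inverted[value].append(key)
    return dict(inverted)
```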
