any chance for contracts and invariants in Python?
This PEP seems to be gathering dust: http://www.python.org/dev/peps/pep-0316/

I was thinking the other day: would contracts and invariants not be better than unit tests? That is, they could do what unit tests do and more, because they run at execution time and not just at development time.

-- http://mail.python.org/mailman/listinfo/python-list
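To make the question concrete, here is a minimal sketch of run-time contracts built from plain assertions. It is not the docstring syntax proposed in PEP 316; the names `contract`, `pre`, `post` and `isqrt_floor` are invented for illustration, and the checks disappear when Python runs with `-O`.

```python
import functools

def contract(pre=None, post=None):
    """Hypothetical helper: check a precondition on the arguments and a
    postcondition on the result every time the function is called."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), "precondition failed"
            result = func(*args, **kwargs)
            if post is not None:
                assert post(result), "postcondition failed"
            return result
        return wrapper
    return decorate

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def isqrt_floor(x):
    """Square root of a non-negative number, rounded down."""
    return int(x ** 0.5)

print(isqrt_floor(17))
```

Unlike a unit test, the checks run on whatever inputs the program actually sees, but they only fire for code paths that are executed.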
s.split() on multiple separators
Hello everyone,

OK, so I want to split a string c into words using several different separators from a list (dels).

I can do this the following C-like way:

>>> c=' abcde abc cba fdsa bcd '.split()
>>> dels='ce '
>>> for j in dels:
        cp=[]
        for i in xrange(0,len(c)-1):
            cp.extend(c[i].split(j))
        c=cp

>>> c
['ab', 'd', '', 'ab', '', '']

But surely there is a more Pythonic way to do this?

I cannot do this:

>>> for i in dels:
        c=[x.split(i) for x in c]

because x.split(i) is a list.
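One way around the "x.split(i) is a list" problem is a nested list comprehension that flattens the pieces as it goes. This is only a sketch: `split_on_all` is an invented name, and it starts from the raw string rather than from the whitespace-split list used in the post.

```python
def split_on_all(text, separators):
    """Split text on each single-character separator in turn."""
    parts = [text]
    for sep in separators:
        # The inner "for piece in ..." flattens the list that
        # str.split() returns for every part.
        parts = [piece for part in parts for piece in part.split(sep)]
    return parts

words = split_on_all(' abcde abc cba fdsa bcd ', 'ce ')
print(words)
```

Empty strings show up wherever two separators are adjacent, or where the text starts or ends with a separator.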
Re: s.split() on multiple separators
> > ['ab', 'd', '', 'ab', '', '']
>
> Given your original string, I'm not sure how that would be the
> expected result of "split c on the characters in dels".

Oops, the inner loop should be:

    for i in xrange(0,len(c)):

Now it works.

> >>> c=' abcde abc cba fdsa bcd '
> >>> import re
> >>> r = re.compile('[ce ]')
> >>> r.split(c)
> ['', 'ab', 'd', '', 'ab', '', '', 'ba', 'fdsa', 'b', 'd', '']
>
> given that a regexp object has a split() method.

That's probably the optimum solution. Thanks!

Regards,
Marcin
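If the separators come from a variable such as dels, the character class can be built with re.escape() so that characters like ']' or '-' are taken literally. The sketch below also drops the empty strings; `split_words` is an invented name and it assumes the separator string is not empty.

```python
import re

def split_words(text, separators):
    """Split on any run of the given separator characters, dropping empties."""
    # re.escape() protects characters that are special inside [...].
    pattern = '[' + re.escape(separators) + ']+'
    return [word for word in re.split(pattern, text) if word]

print(split_words(' abcde abc cba fdsa bcd ', 'ce '))
```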
Re: s.split() on multiple separators
On 30 Sep, 20:27, William James <[EMAIL PROTECTED]> wrote:
> On Sep 30, 8:53 am, [EMAIL PROTECTED] wrote:
> E:\Ruby>irb
> irb(main):001:0> ' abcde abc cba fdsa bcd '.split(/[ce ]/)
> => ["", "ab", "d", "", "ab", "", "", "ba", "fdsa", "b", "d"]

That's acceptable only if you write a perfect Ruby-to-Python translator. ;-P

Regards,
Marcin
Re: s.split() on multiple separators
> > c=' abcde abc cba fdsa bcd '.split()
> > dels='ce '
> > for j in dels:
> >     cp=[]
> >     for i in xrange(0,len(c)-1):
>
> The "-1" looks like a bug; remember in Python 'stop' bounds
> are exclusive. The indexes of c are simply xrange(len(c)).

Yep. Just found it out, though this seems a bit counterintuitive to me, even if it makes for more elegant code: I forgot about the high stop bound.

From my POV, if I want a sequence from here to there, it should include both here and there. I do understand the consequence of making the high bound exclusive, which is more elegant code: xrange(len(c)). But it does seem a bit illogical...

> print re.split('[ce ]', c)

Yes, that does the job. Thanks.

Regards,
Marcin
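A few small checks may make the exclusive stop bound feel less arbitrary. They use range(), the Python 3 spelling of xrange(); the list contents are taken from the thread and `mid` is just an illustrative split point.

```python
c = ['abcde', 'abc', 'cba', 'fdsa', 'bcd']

# range(len(c)) yields every valid index of c: 0 .. len(c) - 1.
indexes = list(range(len(c)))

# With an exclusive stop, the length of range(a, b) is simply b - a ...
assert len(range(2, 5)) == 5 - 2

# ... and adjacent ranges meet without overlapping or leaving a gap.
mid = 2
assert list(range(0, mid)) + list(range(mid, len(c))) == indexes

# Slices follow the same convention, so they split a list cleanly.
assert c[:mid] + c[mid:] == c
```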
Descriptors and side effects
Hello everyone,

I'm trying to do a seemingly trivial thing with descriptors: have another attribute updated on dot access in an object defined using descriptors.

For example, let's take a simple case where you set an attribute s to a string and have another attribute l set automatically to its length.

>>> class Desc(str):
        def __init__(self,val):
            self.s=val
            self.l=len(val)
            print "creating value: ", self.s
            print "id(self.l)", id(self.l)
        def __set__(self, obj, val):
            self.s=val
            self.l=len(val)
            print "setting value:", self.s, "length:", self.l
        def __get__(self, obj, type=None):
            print "getting value:", self.s, "length:", self.l
            return self.l

>>> class some(str):
        m=Desc('abc')
        l=m.l

creating value: abc
id(self.l) 10049688
>>> ta=some()
>>> ta.m='test string'
setting value: test string length: 11

However, the attribute ta.l didn't get updated:

>>> ta.l
3

What makes this even weirder is that the id of ta.l is the same as the id(self.l) printed when the descriptor instance was created:

>>> id(ta.l)
10049688

The setter should have updated self.l just like it updated self.s:

    def __set__(self, obj, val):
        self.s=val
        self.l=len(val)
        print "setting value:", self.s, "length:", self.l

Yet it didn't happen.

From my POV, the main benefit of a descriptor lies in its side effect: on dot access (getting/setting) I can get other attributes updated automatically: say, in a class of Squares I get the area automatically updated on updating the side, etc.

Yet I'm struggling to get it done in Python. Descriptors are a great idea, but I would like to see them implemented in Python in a way that makes it easier to get desirable side effects.
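A note on what happens in the code above: `l=m.l` in the class body runs once, while the class is being defined, and binds the plain integer 3 to some.l. The setter later rebinds self.l on the descriptor object, which some.l never looks at again, so ta.l stays 3 and keeps the same id. One possible way to get the intended side effect is to have __set__ write to the instance it receives. The sketch below uses Python 3 syntax, and the names `LengthTracked` and `storage` are invented for illustration.

```python
class LengthTracked:
    """Data descriptor that stores a string on the instance and, as a
    side effect of every assignment, refreshes a companion length."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created (Python 3.6+).
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class itself
        return getattr(obj, self.storage, "")

    def __set__(self, obj, val):
        setattr(obj, self.storage, val)
        obj.l = len(val)  # the side effect: update l on the same instance


class Some:
    m = LengthTracked()
    l = 0  # default until m is first assigned


ta = Some()
ta.m = "test string"
print(ta.m, ta.l)
```

Because the value and its length live on each instance rather than on the shared descriptor object, two instances of Some do not overwrite each other.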
File to dict
Hello everyone,

I have written this small utility function for transforming a legacy file into a Python dict:

def lookupdmo(domain):
    lines = open('/etc/virtual/domainowners','r').readlines()
    lines = [ [y.lstrip().rstrip() for y in x.split(':')] for x in lines]
    lines = [ x for x in lines if len(x) == 2 ]
    d = dict()
    for line in lines:
        d[line[0]]=line[1]
    return d[domain]

The /etc/virtual/domainowners file contains colon-separated entries:

domain1.tld: owner1
domain2.tld: own2
domain3.another: somebody
...

Now, the above lookupdmo function works. However, it's rather tedious to transform files into dicts this way and I have quite a lot of such files to transform (like custom 'passwd' files for virtual email accounts etc).

Is there any more clever / more Pythonic way of parsing files like this? Say, I would like to transform a file containing entries like the following into a list of lists with colons treated as separators, i.e. this:

tm:$1$$:1010:6::/home/owner1/imap/domain1.tld/tm:/sbin/nologin

would get transformed into this:

[ ['tm', '$1$$', '1010', '6', '', '/home/owner1/imap/domain1.tld/tm', '/sbin/nologin'] [...] [...] ]
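Both transformations asked about here can share one small helper. This is a sketch rather than a drop-in replacement: `parse_colon_file` is an invented name, and the in-memory sample stands in for the real /etc/virtual/domainowners file, with made-up contents.

```python
import io

def parse_colon_file(lines, maxsplit=-1):
    """Return one list of stripped fields per non-blank line."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:
            rows.append([field.strip() for field in line.split(':', maxsplit)])
    return rows

# Stand-in for open('/etc/virtual/domainowners'); the contents are invented.
owners_file = io.StringIO("domain1.tld: owner1\ndomain2.tld: own2\n\nbroken line\n")
owners = dict(row for row in parse_colon_file(owners_file, 1) if len(row) == 2)
print(owners)

passwd_line = "tm:$1$$:1010:6::/home/owner1/imap/domain1.tld/tm:/sbin/nologin"
print(parse_colon_file([passwd_line]))
```

Passing maxsplit=1 for the two-column file keeps any later colons inside the owner field instead of producing a third column.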
Re: File to dict
> >>> def shelper(line):
> ...     return x.replace(' ','').strip('\n').split(':',1)

Argh, typo, should be def shelper(x) of course.
Re: File to dict
> I guess Duncan's point wasn't the construction of the dictionary but the
> throw it away part. If you don't keep it, the loop above is even more
> efficient than building a dictionary with *all* lines of the file, just to
> pick one value afterwards.

Sure, but I have two options here, neither of them nice: either "write C in Python" or do it in an inefficient and still elaborate way.

Anyway, I found my nirvana at last:

>>> def shelper(line):
...     return x.replace(' ','').strip('\n').split(':',1)
...
>>> ownerslist = [ shelper(x)[1] for x in it if len(shelper(x)) == 2 and shelper(x)[0] == domain ]
>>> ownerslist
['da2']

Python rulez. :-)
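For a single lookup there is a middle road between an explicit C-style loop and a full dictionary: a generator expression handed to next(), which stops reading at the first match and splits each line only once. A sketch follows; `find_owner` and the sample data are invented.

```python
import io

def find_owner(lines, domain):
    """Return the owner of domain, or None; stops at the first match."""
    pairs = (line.split(':', 1) for line in lines if ':' in line)
    return next(
        (owner.strip() for name, owner in pairs if name.strip() == domain),
        None,
    )

sample = io.StringIO("domain1.tld: owner1\ndomain2.tld: da2\ndomain3.another: somebody\n")
print(find_owner(sample, "domain2.tld"))
```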
Re: File to dict
> The csv module is your friend.

(slapping forehead) Why the Holy Grail didn't I think about this? That should be much simpler than using SimpleParse or SPARK.

Thx Bruno & everyone.
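As a rough illustration of the csv suggestion: csv.reader accepts a delimiter argument, so colon-separated files need no hand-written splitting. The sample lines are copied from the earlier post in this thread and fed from memory so the sketch runs anywhere.

```python
import csv
import io

passwd_like = io.StringIO(
    "tm:$1$$:1010:6::/home/owner1/imap/domain1.tld/tm:/sbin/nologin\n"
)
# QUOTE_NONE: treat quote characters as ordinary data.
rows = list(csv.reader(passwd_like, delimiter=':', quoting=csv.QUOTE_NONE))
print(rows)

# skipinitialspace drops the blank that follows each colon.
owners_like = io.StringIO("domain1.tld: owner1\ndomain2.tld: own2\n")
owners = dict(csv.reader(owners_like, delimiter=':', skipinitialspace=True))
print(owners)
```

The dict() call assumes every line has exactly two fields; a malformed line would raise ValueError and would need filtering first.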
Re: File to dict
Duncan Booth wrote:

> Just some minor points without changing the basis of what you have done
> here:

All good points, thanks. Phew, there's nothing like peer review for your code...

> But why do you construct a dict from that input data simply to throw it
> away?

Because comparing strings for equality in a loop is writing C in Python, and that's exactly what I'm trying to unlearn. The proper way to do it is to produce a dictionary and look up a value using a key.

> If you only want 1 domain from the file just pick it out of the list.

    for item in list:
        if item == 'searched.domain':
            return item...

Yuck.

> with open('/etc/virtual/domainowners','r') as infile:
>     pairs = [ line.split(':',1) for line in infile if ':' in line ]

Didn't think about doing it this way. Good point. Thx
Re: File to dict
Glauco wrote:

> cache = None
>
> def lookup( domain ):
>     if not cache:
>         cache = dict( [map( lambda x: x.strip(), x.split(':')) for x in
>                        open('/etc/virtual/domainowners','r').readlines()])
>     return cache.get(domain)

Neat solution! It just needs a small correction for empty or badly formed lines:

    dict([map( lambda x: x.strip(), x.split(':')) for x in
          open('/etc/virtual/domainowners','r') if ':' in x])
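One detail in the quoted lookup(): because the function assigns to cache, Python treats the name as local, so `if not cache` raises UnboundLocalError unless the function declares `global cache`. A sketch of the same caching idea with that fixed follows. The names `_cache`, `load_owners` and the `opener` parameter are invented, with `opener` there only so the demo can run without the real file.

```python
import io

_cache = None  # filled on the first lookup

def load_owners(lines):
    """Build {domain: owner} from 'domain: owner' lines, skipping others."""
    pairs = (line.split(':', 1) for line in lines if ':' in line)
    return {name.strip(): owner.strip() for name, owner in pairs}

def lookup(domain, opener=lambda: open('/etc/virtual/domainowners')):
    global _cache  # without this, the assignment below would make _cache local
    if _cache is None:
        with opener() as infile:
            _cache = load_owners(infile)
    return _cache.get(domain)

# Demo with an in-memory file (invented contents) so the sketch runs anywhere.
fake = lambda: io.StringIO("domain1.tld: owner1\n\nbad line\ndomain2.tld: own2\n")
print(lookup("domain2.tld", opener=fake))
```

Testing `_cache is None` rather than `not _cache` avoids re-reading the file on every call when it happens to be empty.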
Daily WTF with XML, or error handling in SAX
So I set out to learn handling three-letter-acronym files in Python, and SAX worked nicely until I encountered badly formed XML, like files with bad characters in them (well, Unicode is supposed to handle it all but apparently doesn't), using http://dchublist.com/hublist.xml.bz2 as example data, with the goal of extracting the Users and Address properties where the number of Users is greater than a given number.

So I extended my First XML Example with an error handler:

# = snip ===
from xml.sax import make_parser
from xml.sax.handler import ContentHandler
from xml.sax.handler import ErrorHandler

class HubHandler(ContentHandler):
    def __init__(self, hublist):
        self.Address = ''
        self.Users = ''
        hl = hublist
    def startElement(self, name, attrs):
        self.Address = attrs.get('Address',"")
        self.Users = attrs.get('Users', "")
    def endElement(self, name):
        if name == "Hub" and int(self.Users) > 2000:
            #print self.Address, self.Users
            hl.append({self.Address: int(self.Users)})

class HubErrorHandler(ErrorHandler):
    def __init__(self):
        pass
    def error(self, exception):
        import sys
        print "Error, exception: %s\n" % exception
    def fatalError(self, exception):
        print "Fatal Error, exception: %s\n" % exception

hl = []

parser = make_parser()

hHandler = HubHandler(hl)
errHandler = HubErrorHandler()

parser.setContentHandler(hHandler)
parser.setErrorHandler(errHandler)
fh = file('hublist.xml')
parser.parse(fh)

def compare(x,y):
    if x.values()[0] > y.values()[0]:
        return 1
    elif x.values()[0] < y.values()[0]:
        return -1
    return 0

hl.sort(cmp=compare, reverse=True)
for h in hl:
    print h.keys()[0], " ", h.values()[0]
# = snip ===

And then BAM, Pythonwin has hit me:

>>> execfile('ph.py')
Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid token)

Fatal Error, exception: hublist.xml:2247:11: not well-formed (invalid token)

>>> RESTART

Just before the "RESTART" line, Windows announced that it had killed the pythonw.exe process (I suppose it was a child process).

WTF is happening here? Wasn't the fatalError method in the HubErrorHandler supposed to handle the invalid tokens? And why is the message repeated many times? My method is apparently called, but something in SAX goes awry and the interpreter crashes.
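A likely explanation for the repeated message (an assumption, not something confirmed in this thread) is that the parser keeps being fed the rest of the file after a fatalError() that only prints and returns, so the same error is reported again for each chunk. The default ErrorHandler.fatalError() raises the exception instead. The sketch below does the same and catches it outside the parse. It is written for Python 3, `StopOnFatal` and `collect_hubs` are invented names, and it cannot recover data after the bad token; it only stops cleanly.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler

class HubHandler(ContentHandler):
    """Collect (Address, Users) for Hub elements above a user threshold."""
    def __init__(self, min_users):
        super().__init__()
        self.min_users = min_users
        self.hubs = []

    def startElement(self, name, attrs):
        if name != "Hub":
            return
        users = attrs.get("Users", "")
        if users.isdigit() and int(users) > self.min_users:
            self.hubs.append((attrs.get("Address", ""), int(users)))

class StopOnFatal(ErrorHandler):
    def error(self, exception):
        print("recoverable error:", exception)

    def fatalError(self, exception):
        # Re-raise so that parsing stops at the first fatal error.
        raise exception

def collect_hubs(data, min_users=2000):
    handler = HubHandler(min_users)
    try:
        xml.sax.parseString(data, handler, StopOnFatal())
    except xml.sax.SAXParseException as exc:
        print("stopped at line %d: %s" % (exc.getLineNumber(), exc.getMessage()))
    # Whatever was collected before the error, biggest hubs first.
    return sorted(handler.hubs, key=lambda hub: hub[1], reverse=True)

good = b'<Hublist><Hub Address="a.example" Users="3000"/><Hub Address="b.example" Users="5000"/></Hublist>'
print(collect_hubs(good))
```

Keeping the results on the handler instance, rather than in a module-level list, also removes the dependency on the global hl that the original endElement() relies on.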
Re: dict invert - learning question
Assuming all the values are unique:

>>> a={1:'a', 2:'b', 3:'c'}
>>> dict(zip(a.keys(), a.values()))
{1: 'a', 2: 'b', 3: 'c'}

The problem is you obviously can't assume that in most cases. Still, zip() is a very useful function.
Re: dict invert - learning question
On 4 May, 01:27, [EMAIL PROTECTED] wrote:
> >>> a={1:'a', 2:'b', 3:'c'}

Oops, it should obviously be:

>>> dict(zip(a.values(), a.keys()))
{'a': 1, 'c': 3, 'b': 2}
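When the values are not unique, the zip() inversion silently keeps only one key per value. One possible alternative is to collect all keys for each value in a list. A minimal sketch follows; `invert` is an invented name and the extra entry 4:'a' is added to show the collision.

```python
def invert(mapping):
    """Map each value to the list of keys that had it."""
    inverted = {}
    for key, value in mapping.items():
        inverted.setdefault(value, []).append(key)
    return inverted

a = {1: 'a', 2: 'b', 3: 'c', 4: 'a'}
print(invert(a))
```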