Regular expression match objects - compact syntax?
Hello python-list,

I have a question about the match objects that are returned from the match() method of compiled regular expression objects from the 're' module. To parse Postscript T1 fonts that were disassembled into plaintext, I came up with the following code:

    import re
    rmoveto = re.compile('^\s*(-?\d+)\s+(-?\d+)\s+rmoveto$')
    rlineto = re.compile('^\s*(-?\d+)\s+(-?\d+)\s+rlineto$')
    # ... other expressions with up to six paren groups

    f = open(filename, 'r')
    for line in f.readlines():

        m = rmoveto.match(line)
        if m:
            x = x + int(m.group(1))
            y = y + int(m.group(2))
            glyph.append(('move', (x, y)))
            continue

        m = rlineto.match(line)
        if m:
            x = x + int(m.group(1))
            y = y + int(m.group(2))
            glyph.append(('line', (x, y)))
            continue

        # ... and so forth for the other expressions

Now here is my question: is there a simple way to join the following two python code lines:

    m = rmoveto.match(line)
    if m:

into one single line like in the following:

    if rmoveto.match(line):
        x = x + int(rmoveto.group(1))
        y = y + int(rmoveto.group(2))
        glyph.append(('move', (x, y)))
    elif rlineto.match(line):
        # ...

The above syntax does not work because the compiled regular expression object rmoveto doesn't provide a method called group(), as it comes from module 're'. The obsolete package 'regex' did provide this, if I read the docs correctly.

As a workaround, I also tried to use a nested function like this:

    def match(expr):
        m = expr.match(line)
        return m

    if match(rmoveto):
        x = x + int(m.group(1))
        # ...

This approach failed because the match function has its own local m, so it didn't update the outer m. I couldn't use 'global m' either because the whole thing, including the outer m, happens to be inside a function, too.

How do you people handle this?

Thanks for your time,
Johann C. Rocholl
--
http://mail.python.org/mailman/listinfo/python-list
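For readers on a current interpreter, the match-and-test step asked about above can be written in one line with an assignment expression. This is only a sketch under assumptions: it needs Python 3.8 or newer (which postdates this thread), and the function name parse_glyph and the starting point (0, 0) are made up for illustration.

```python
import re

# Same patterns as in the question, written as raw strings.
RMOVETO = re.compile(r'^\s*(-?\d+)\s+(-?\d+)\s+rmoveto$')
RLINETO = re.compile(r'^\s*(-?\d+)\s+(-?\d+)\s+rlineto$')


def parse_glyph(lines):
    """Turn relative rmoveto/rlineto lines into absolute points.

    Hypothetical helper: the name and the (0, 0) origin are assumptions.
    """
    x = y = 0
    glyph = []
    for line in lines:
        # ":=" binds the match object and tests it in the same expression.
        if m := RMOVETO.match(line):
            kind = 'move'
        elif m := RLINETO.match(line):
            kind = 'line'
        else:
            continue  # not one of the two operators handled here
        x += int(m.group(1))
        y += int(m.group(2))
        glyph.append((kind, (x, y)))
    return glyph
```

On older interpreters the usual alternatives are a small wrapper object that remembers the last match, or a list of (pattern, handler) pairs that is looped over.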
RE: Python and version control
Robert Brewer wrote:
> Peter Hansen wrote:
> > Carl wrote:
> > > What is the ultimate version control tool for Python if you
> > > are working in a Windows environment?
> >
> > I never liked coupling the two together like that. Instead
> > I use tools like TortoiseCVS or (now) TortoiseSVN with a
> > Subversion repository. These things let you access revision
> > control features from context (right-button) menus right in
> > Windows Explorer, as you browse the file system.
>
> Seconded.

Thirded.

Johann
Re: That horrible regexp idiom
Hi,

> import re
> foo_pattern = re.compile('foo')
>
> '>>> m = foo_pattern.search(subject)
> '>>> if m:
> '>>>     pass
> '>>> else:
> '>>>     pass

I agree that is horrible. This is one of my favorite problems with python syntax.

> but it occurred to me today, that it is possible to do it in python
> without the extra line.
> '
> '>>> def xsearch(pattern, subject):
> '>>>     yield pattern.search(subject)
>
> '>>> for m in xsearch(foo_pattern, subject):
> '>>>     pass
> '>>> else:
> '>>>     pass

I think I would rather have it this way, based on a suggestion by Diez B. Roggisch recently:

    import re

    class matcher:
        def __init__(self, regex):
            self.regex = re.compile(regex)
        def match(self, s):
            # Remember the match object so group() can use it later.
            self.m = self.regex.match(s)
            return self.m is not None
        def search(self, s):
            self.m = self.regex.search(s)
            return self.m is not None
        def group(self, n = None):
            if n is None:
                return self.m.group()
            return self.m.group(n)

    m = matcher('(foo)(.*)')
    if m.match('foobar'):
        print(m.group())
    if m.search('barfoobaz'):
        print(m.group(2))

I think one solution that does not need a wrapper class would be to add a group() method to the compiled regular expression objects from module 're'. IIRC, the obsolete package 'regex' provided this, once upon a time.

Cheers,
Johann
Re: ANN: wxPython 2.7.1.3
Hi Robin,

You may want to use a spell checker for announcements and for the wxpython.org website. For example, the first paragraph of your announcement contains the words "plust" and "pacakges", and the word "pacakge" can also be found on the following pages:

www.wxpython.org/download.php
www.wxpython.org/wxPython.spec
www.wxpython.org/CHANGES.html

Oh, it's fun to be a speling fanatic! :-)

Cheers,
Johann
XML-RPC server with xmlrpclib and mod_python
Hi all,

I'm wondering what would be the best way to write an XML-RPC server using mod_python with Apache 2.0. I want the mod_python environment because the rest of my project is web-based, and Apache gives me multi-threading and everything.

My first attempt implements XML-RPC introspection. Please have a look at the following files and give me a rigorous critique. All comments are welcome, even about coding style and module design.

My handler for incoming mod_python requests:
http://trac.browsershots.org/browser/trunk/shotserver/lib/xmlrpc/__init__.py

The introspection module (I plan to have more modules in that folder):
http://trac.browsershots.org/browser/trunk/shotserver/lib/xmlrpc/system.py

A simple standalone test client:
http://trac.browsershots.org/browser/trunk/shotserver/scripts/xmlrpc_help.py

Are there any existing interfaces to use xmlrpclib with mod_python? Are there any security issues that I should be aware of when implementing XML-RPC?

Thanks in advance,
Johann
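One transport-independent way to get XML-RPC with introspection is to let the standard library's dispatcher turn a request body into a response body, and have the web handler only move bytes in and out. The sketch below is not the code behind the links above. It assumes Python 3's xmlrpc package, the add function and the handle_body name are invented for the example, and _marshaled_dispatch is a semi-private method (the stdlib CGI handler uses it the same way), so treat this as an illustration only. The mod_python side, which would read the POST body and write the result, is left out.

```python
from xmlrpc.client import dumps, loads
from xmlrpc.server import SimpleXMLRPCDispatcher


def add(a, b):
    """Return the sum of a and b."""
    return a + b


# The dispatcher knows nothing about HTTP; it maps XML in to XML out.
dispatcher = SimpleXMLRPCDispatcher(allow_none=True, encoding=None)
dispatcher.register_function(add)
# Adds system.listMethods, system.methodHelp and system.methodSignature.
dispatcher.register_introspection_functions()


def handle_body(request_body):
    """Hypothetical glue: raw XML-RPC request bytes in, response bytes out."""
    return dispatcher._marshaled_dispatch(request_body)


# Simulate a client call without any network involved.
request = dumps((), methodname='system.listMethods').encode('utf-8')
methods = loads(handle_body(request))[0][0]

request = dumps((2, 3), methodname='add').encode('utf-8')
total = loads(handle_body(request))[0][0]
```

Only functions that were registered explicitly can be called this way, which keeps the exposed surface small; anything beyond that (authentication, request size limits) would still have to be handled by the surrounding web handler.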
Writing PNG with pure Python
Just in case anybody has the same problem, here's my first attempt at implementing a subset of the PNG spec in pure Python. I license it to you under the terms of the GNU GPL.

http://trac.browsershots.org/browser/trunk/shotfactory/lib/image/png.py

It encodes RGB images with 24 bits per pixel into PNG, using only the modules sys, zlib and struct. These are all included in the base distribution of Python. You don't need gd or imlib.

I have done a little testing, and my implementation processes 8 megs of RGB input in 0.6 seconds. With Adam7 interlacing enabled, it takes 10 times longer.

I would really appreciate any feedback and suggestions for improvement.

Cheers,
Johann
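To give an idea of how little is needed for the non-interlaced 24-bit case, here is a minimal sketch written for Python 3. It is not the linked png.py. The function names are made up for this example, every scanline uses filter type 0 (no filtering), and Adam7 interlacing is not attempted.

```python
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def png_chunk(tag, data):
    """Return one chunk: length, tag, data, then CRC over tag + data."""
    body = tag + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack('>I', len(data)) + body + struct.pack('>I', crc)


def encode_rgb_png(width, height, rgb):
    """Encode packed 8-bit RGB bytes (3 per pixel, row by row) as PNG."""
    if len(rgb) != width * height * 3:
        raise ValueError('rgb must hold width * height * 3 bytes')
    stride = width * 3
    # Each scanline is prefixed with its filter type; 0 means "none".
    raw = b''.join(b'\x00' + rgb[row * stride:(row + 1) * stride]
                   for row in range(height))
    # Bit depth 8, color type 2 (truecolor), default compression and
    # filter methods, no interlacing.
    ihdr = struct.pack('>IIBBBBB', width, height, 8, 2, 0, 0, 0)
    return (PNG_SIGNATURE
            + png_chunk(b'IHDR', ihdr)
            + png_chunk(b'IDAT', zlib.compress(raw))
            + png_chunk(b'IEND', b''))
```

A two-pixel image, one red and one blue, would be `encode_rgb_png(2, 1, bytes([255, 0, 0, 0, 0, 255]))`; writing the result to a file opened in binary mode should give something any PNG viewer can open.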
Re: Writing PNG with pure Python
> You should really also include the alpha channel. Without that, PNG is
> crippled IMHO.

I have now added simple transparency (marking one color as transparent with a tRNS chunk). If anybody wants full alpha channel support, ask kindly or send me a patch. I would like to avoid duplicating all the functions, so maybe we should introduce a parameter to switch between 3 and 4 bytes per pixel.

Cheers,
Johann
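As a rough illustration of both ideas, the sketch below builds the two payloads involved: an IHDR whose color type depends on a bytes-per-pixel parameter, and a tRNS payload that marks one RGB color as transparent. The function names and the bytes_per_pixel parameter are assumptions for this example and are not taken from png.py. The layouts follow the PNG specification: color type 2 is truecolor, color type 6 is truecolor with alpha, and tRNS for color type 2 holds three 16-bit samples.

```python
import struct

# PNG color types for 8-bit samples, keyed by bytes per pixel.
COLOR_TYPES = {3: 2,   # RGB  -> truecolor
               4: 6}   # RGBA -> truecolor with alpha


def ihdr_data(width, height, bytes_per_pixel=3):
    """Build the 13-byte IHDR payload for 8-bit RGB or RGBA input."""
    if bytes_per_pixel not in COLOR_TYPES:
        raise ValueError('bytes_per_pixel must be 3 or 4')
    return struct.pack('>IIBBBBB', width, height, 8,
                       COLOR_TYPES[bytes_per_pixel], 0, 0, 0)


def trns_data(red, green, blue):
    """Build the tRNS payload for color type 2.

    Each sample is stored in two bytes even when the bit depth is 8.
    """
    return struct.pack('>HHH', red, green, blue)
```

With a parameter like this, the scanline code only needs the stride to change from width * 3 to width * 4, so the rest of an encoder would not have to be duplicated.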
Re: Writing PNG with pure Python
Alan Isaac wrote:
> It's your code, so you get to license it.
> But if you wish to solicit patches,
> a more Pythonic license is IMHO more likely
> to prove fruitful.

What license would you suggest? After some reading at [1] and [2] and [3], I found that the Academic Free License (version 2.1) and the Apache License (version 2.0) are considered helpful for contributions to the Python Software Foundation.

So far, I haven't used either of these licenses for my own code, and after a little reading the AFL seems okay for me. I would perhaps consider the LGPL [4] as an alternative because it is less restrictive than the GPL.

Thoughts or links, anybody?

Cheers,
Johann

[1] http://www.python.org/moin/PythonSoftwareFoundationLicenseFaq
[2] http://www.python.org/psf/records/board/minutes/2004-11-09/
[3] http://www.python.org/psf/contrib/
[4] http://www.gnu.org/licenses/lgpl.html
Re: Writing PNG with pure Python
The MIT license is enticingly short and simple, thank you for the tip. I have now decided to license my project (including the pure python PNG library) under the Apache License 2.0 which is less restrictive than the GPL in terms of sublicensing. The Apache License looks modern and well-maintained to me. My project runs on Apache, so there is some context there as well. Also, this change will simplify things if I ever want to contribute some of the code to the Python Software Foundation.

Cheers,
Johann
Re: Writing PNG with pure Python
> Just in case anybody has the same problem, here's my first attempt at
> implementing a subset of the PNG spec in pure Python. I license it to
> you under the terms of the GNU GPL.

Update: the code is now licensed under the Apache License 2.0.

> http://trac.browsershots.org/browser/trunk/shotfactory/lib/image/png.py

Update: the module has moved to its own package, with its own setup.py:

http://trac.browsershots.org/browser/trunk/pypng
http://svn.browsershots.org/trunk/pypng/

Cheers,
Johann
Re: Writing PNG with pure Python
> > I have now decided to license my project (including the pure python PNG
> > library) under the Apache License 2.0 which is less restrictive than
> > the GPL in terms of sublicensing.
>
> But it is also incompatible with the GPL:
>
> http://www.fsf.org/licensing/licenses/index_html#GPLIncompatibleLicenses

Thank you for enlightening me. I was under the wrong impression that the Apache License was compatible with the GPL, perhaps because it is OSI-approved, which means a different thing as I now understand.

> If you're convinced that a permissive licence suits your code best,
> please consider something whose side-effects you understand. If the
> additional patent grant or licence termination clauses (which the FSF
> don't regard as a bad thing, just something incompatible with the
> current GPL/LGPL) are specifically what you want, then the Apache
> Licence may be what you're after; otherwise, you should choose
> something less baroque and better understood, perhaps from this list:
>
> http://www.fsf.org/licensing/licenses/index_html#GPLCompatibleLicenses

I do believe that my code will be useful for more people if it's under a permissive license, but obviously also if it's under a GPL-compatible license. Therefore it's perhaps a good idea to change the license of my software again. Currently, I am considering the following options:

- LGPL
- Modified BSD License
- X11 License (aka MIT License)

I appreciate the simplicity of the BSD and MIT Licenses, except for the names. "BSD License" can be confused with the original BSD License, while "MIT License" according to the FSF "is misleading, since MIT has used many licenses for software." But perhaps these drawbacks are just mentioned on the FSF page to get more people to use the GPL or LGPL. :-)

I don't want to start a holy war about the benefits of the GPL, but I would like some more input about the choices of licensing.

Perhaps I'll put the larger part of my project under the GPL and only some standalone library parts (like the PNG encoder) under the LGPL.

If I ever want to contribute some of the code to the Python Software Foundation, I can still license it to them under the Apache License, right? But how about the parts of the code that others contribute to my software while it's licensed under the LGPL?

Cheers,
Johann
Re: Writing PNG with pure Python
How about this here construct?

    #!/usr/bin/env python
    # png.py - PNG encoder in pure Python
    # Copyright (C) 2006 Johann C. Rocholl <[EMAIL PROTECTED]>
    #
    # This file is licensed alternatively under one of the following:
    # 1. GNU Lesser General Public License (LGPL), Version 2.1 or newer
    # 2. GNU General Public License (GPL), Version 2 or newer
    # 3. Apache License, Version 2.0 or newer
    # 4. The following license (aka MIT License)
    #
    # - start of license -
    # Copyright (C) 2006 Johann C. Rocholl <[EMAIL PROTECTED]>
    #
    # Permission is hereby granted, free of charge, to any person
    # obtaining a copy of this software and associated documentation files
    # (the "Software"), to deal in the Software without restriction,
    # including without limitation the rights to use, copy, modify, merge,
    # publish, distribute, sublicense, and/or sell copies of the Software,
    # and to permit persons to whom the Software is furnished to do so,
    # subject to the following conditions:
    #
    # The above copyright notice and this permission notice shall be
    # included in all copies or substantial portions of the Software.
    #
    # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
    # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
    # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
    # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
    # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
    # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    # SOFTWARE.
    # --- end of license -
    #
    # You may not use this file except in compliance with at least one of
    # the above four licenses.
Re: Writing PNG with pure Python
Ben Finney wrote:
> Simplify. Please don't attempt to write yet another set of license
> terms without expert legal assistance. You've already chosen the Expat
> license as being acceptable; use that, and you grant all the rest
> without even mentioning it.

Sorry for my stubborn ignorance, and thank you for your patient explanations. I think I finally got it: I'll put the bulk of my browsershots project under the GNU GPL again, and the independent library parts like png.py under the Expat License.

Cheers,
Johann
Re: [Python-Dev] Python 3000 PEP: Postfix type declarations
Brilliant!

On 4/1/07, Georg Brandl <[EMAIL PROTECTED]> wrote:
> def foo${LATIN SMALL LETTER LAMBDA WITH STROKE}$(x${DOUBLE-STRUCK
> CAPITAL C}$):
>     return None${ZERO WIDTH NO-BREAK SPACE}$
>
> This is still easy to read and makes the full power of type-annotated Python
> available to ASCII believers.

+1

J
Taint (like in Perl) as a Python module: taint.py
The following is my first attempt at adding a taint feature to Python to prevent os.system() from being called with untrusted input. What do you think of it?

    # taint.py - Emulate Perl's taint feature in Python
    # Copyright (C) 2007 Johann C. Rocholl <[EMAIL PROTECTED]>
    #
    # Permission is hereby granted, free of charge, to any person
    # obtaining a copy of this software and associated documentation files
    # (the "Software"), to deal in the Software without restriction,
    # including without limitation the rights to use, copy, modify, merge,
    # publish, distribute, sublicense, and/or sell copies of the Software,
    # and to permit persons to whom the Software is furnished to do so,
    # subject to the following conditions:
    #
    # The above copyright notice and this permission notice shall be
    # included in all copies or substantial portions of the Software.
    #
    # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
    # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
    # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
    # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
    # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
    # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    # SOFTWARE.

    """
    Emulate Perl's taint feature in Python

    This module replaces all functions in the os module (except stat) with
    wrappers that will raise an Exception called TaintError if any of the
    parameters is a tainted string.

    All strings are tainted by default, and you have to call untaint on a
    string to create a safe string from it.

    Stripping, zero-filling, and changes to lowercase or uppercase don't
    taint a safe string.

    If you combine strings with + or join or replace, the result will be a
    tainted string unless all its parts are safe.

    It is probably a good idea to run some checks on user input before you
    call untaint() on it. The safest way is to design a regex that matches
    legal input only. A regex that tries to match illegal input is very
    hard to prove complete.

    You can run the following examples with the command
    python taint.py -v
    to test if this module works as designed.

    >>> unsafe = 'test'
    >>> tainted(unsafe)
    True
    >>> os.system(unsafe)
    Traceback (most recent call last):
    TaintError
    >>> safe = untaint(unsafe)
    >>> tainted(safe)
    False
    >>> os.system(safe)
    256
    >>> safe + unsafe
    u'testtest'
    >>> safe.join([safe, unsafe])
    u'testtesttest'
    >>> tainted(safe + unsafe)
    True
    >>> tainted(safe + safe)
    False
    >>> tainted(unsafe.join([safe, safe]))
    True
    >>> tainted(safe.join([safe, unsafe]))
    True
    >>> tainted(safe.join([safe, safe]))
    False
    >>> tainted(safe.replace(safe, unsafe))
    True
    >>> tainted(safe.replace(safe, safe))
    False
    >>> tainted(safe.capitalize()) or tainted(safe.title())
    False
    >>> tainted(safe.lower()) or tainted(safe.upper())
    False
    >>> tainted(safe.strip()) or tainted(safe.rstrip()) or tainted(safe.lstrip())
    False
    >>> tainted(safe.zfill(8))
    False
    >>> tainted(safe.expandtabs())
    True
    """

    import os
    import types


    class TaintError(Exception):
        """
        This exception is raised when you try to call a function in the os
        module with a string parameter that isn't a SafeString.
        """
        pass


    class SafeString(unicode):
        """
        A string class that you must use for parameters to functions in
        the os module.
        """

        def __add__(self, other):
            """Create a safe string if the other string is also safe."""
            if tainted(other):
                return unicode.__add__(self, other)
            return untaint(unicode.__add__(self, other))

        def join(self, sequence):
            """Create a safe string if all components are safe."""
            for element in sequence:
                if tainted(element):
                    return unicode.join(self, sequence)
            return untaint(unicode.join(self, sequence))

        def replace(self, old, new, *args):
            """Create a safe string if the replacement text is also safe."""
            if tainted(new):
                return unicode.replace(self, old, new, *args)
            return untaint(unicode.replace(self, old, new, *args))

        def strip(self, *args):
            return untaint(unicode.strip(self, *args))

        def lstrip(self, *args):
            return untaint(unicode.lstrip(self, *args))

        def rstrip(self, *args):
            return untaint(unicode.rstrip(self, *args))

        def zfill(self, *args):
            return untaint(unicode.zfill(self, *args))

        def capitalize(self):
            return untaint(unicode.capitalize(self))
Re: Taint (like in Perl) as a Python module: taint.py
On Feb 6, 3:01 am, Ben Finney <[EMAIL PROTECTED]> wrote:
> "Gabriel Genellina" <[EMAIL PROTECTED]> writes:
> > And tainted() returns False by default?
> > Sorry but in general, this won't work :(
>
> I'm inclined to agree that the default should be to flag an object as
> tainted unless known otherwise.

That's true. For example, my first attempt didn't prevent this:

    os.open(buffer('/etc/passwd'), os.O_RDONLY)

Here's a stricter version:

    def tainted(param):
        """
        Check if a parameter is tainted. If it's a sequence or dict, all
        values will be checked (but not the keys).
        """
        if isinstance(param, unicode):
            return not isinstance(param, SafeString)
        elif isinstance(param, (bool, int, long, float, complex, file)):
            return False
        elif isinstance(param, (tuple, list)):
            for element in param:
                if tainted(element):
                    return True
            # No tainted element found: say so explicitly instead of
            # falling off the end and returning None.
            return False
        elif isinstance(param, dict):
            return tainted(param.values())
        else:
            # Anything not known to be harmless counts as tainted.
            return True
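For comparison, the "tainted unless known otherwise" rule discussed above can be sketched for Python 3 as follows. This is an illustrative port under assumptions, not the taint.py from this thread: str replaces unicode, the long and file types no longer exist and are dropped, only __add__ is carried over from SafeString, and the guard decorator and the run function are hypothetical names standing in for the os wrappers.

```python
import functools


class TaintError(Exception):
    """Raised when a guarded function receives a tainted argument."""


class SafeString(str):
    """Marker subclass for strings that were explicitly vetted."""

    def __add__(self, other):
        result = str.__add__(self, other)
        # The sum stays safe only if the right-hand side is safe too.
        return result if tainted(other) else SafeString(result)


def untaint(value):
    """Mark a string as safe; the caller is expected to validate it first."""
    return SafeString(value)


def tainted(param):
    """Return True unless param is known to be harmless."""
    if isinstance(param, str):
        return not isinstance(param, SafeString)
    if isinstance(param, (bool, int, float, complex)):
        return False
    if isinstance(param, (tuple, list)):
        return any(tainted(element) for element in param)
    if isinstance(param, dict):
        return any(tainted(value) for value in param.values())
    return True  # bytes, buffers and everything else are suspect


def guard(func):
    """Wrap func so that tainted arguments raise TaintError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if tainted(args) or tainted(kwargs):
            raise TaintError(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@guard
def run(command):
    """Stand-in for something like os.system; it only echoes its input."""
    return command
```

With this, `run(untaint('ls'))` passes through, while `run('ls')` and `run(b'/etc/passwd')` raise TaintError, because plain strings and byte strings are tainted by default.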