Regular expression match objects - compact syntax?

2005-02-03 Thread Johann C. Rocholl
Hello python-list,

I have a question about the match objects that are returned from the
match() method of compiled regular expression objects from the 're'
module. To parse Postscript T1 fonts that were disassembled into
plaintext, I came up with the following code:

import re
rmoveto = re.compile('^\s*(-?\d+)\s+(-?\d+)\s+rmoveto$')
rlineto = re.compile('^\s*(-?\d+)\s+(-?\d+)\s+rlineto$')
# ... other expressions with up to six paren groups

f = open(filename, 'r')
for line in f.readlines():

    m = rmoveto.match(line)
    if m:
        x = x + int(m.group(1))
        y = y + int(m.group(2))
        glyph.append(('move', (x, y)))
        continue

    m = rlineto.match(line)
    if m:
        x = x + int(m.group(1))
        y = y + int(m.group(2))
        glyph.append(('line', (x, y)))
        continue

    # ... and so forth for the other expressions

Now here is my question: is there a simple way to join the following
two python code lines:
    m = rmoveto.match(line)
    if m:
into one single line like in the following:
    if rmoveto.match(line):
        x = x + int(rmoveto.group(1))
        y = y + int(rmoveto.group(2))
        glyph.append(('move', (x, y)))
    elif rlineto.match(line):
        # ...

The above syntax does not work because the compiled regular expression
object rmoveto doesn't provide a method called group(), as it comes
from module 're'. The obsolete package 'regex' did provide this, if I
read the docs correctly.

As a workaround, I also tried to use a nested function like this:

def match(expr):
    m = expr.match(line)
    return m

if match(rmoveto):
    x = x + int(m.group(1))
    # ...

This approach failed because the match function has its own local m,
so it didn't update the outer m. I couldn't use 'global m' either
because the whole thing, including the outer m, happens to be inside a
function, too.
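
One way to sidestep the repeated "m = ...; if m:" pairs is to loop over a
table of patterns, so the match object is assigned in exactly one place.
The following is only a sketch in Python 3 syntax, not the code from the
original script; the names PATTERNS and parse_glyph are made up for
illustration, and only the two-argument operators are covered.

```python
import re

# Hypothetical table pairing each compiled pattern with an operation name.
PATTERNS = [
    (re.compile(r'^\s*(-?\d+)\s+(-?\d+)\s+rmoveto$'), 'move'),
    (re.compile(r'^\s*(-?\d+)\s+(-?\d+)\s+rlineto$'), 'line'),
]


def parse_glyph(lines):
    """Turn relative rmoveto/rlineto lines into absolute (op, (x, y)) tuples."""
    x = y = 0
    glyph = []
    for line in lines:
        for pattern, name in PATTERNS:
            m = pattern.match(line)
            if m:
                x += int(m.group(1))
                y += int(m.group(2))
                glyph.append((name, (x, y)))
                break  # first matching pattern wins, like 'continue' above
    return glyph


print(parse_glyph(['10 20 rmoveto\n', '-5 5 rlineto\n']))
```

Operators that take more than two arguments would need a handler function
per table entry instead of a plain name.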

How do you people handle this?

Thanks for your time,
Johann C. Rocholl
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Python and version control

2005-02-09 Thread Johann C. Rocholl
Robert Brewer wrote:
> Peter Hansen wrote:
> > Carl wrote:
> > > What is the ultimate version control tool for Python if you 
> > > are working in a Windows environment? 
> > 
> > I never liked coupling the two together like that.  Instead
> > I use tools like TortoiseCVS or (now) TortoiseSVN with a
> > Subversion repository.  These things let you access revision
> > control features from context (right-button) menus right in
> > Windows Explorer, as you browse the file system.
> 
> Seconded.

Thirded.

Johann


Re: That horrible regexp idiom

2005-02-10 Thread Johann C. Rocholl
Hi,

> import re
> foo_pattern = re.compile('foo')
> 
> '>>> m = foo_pattern.search(subject)
> '>>> if m:
> '>>>     pass
> '>>> else:
> '>>>     pass

I agree that is horrible. This is one of my favorite problems with
python syntax.

> but it occurred to me today, that it is possible to do it in python
> without the extra line.
> '
> '>>> def xsearch(pattern, subject):
> '>>>     yield pattern.search(subject)
> 
> '>>> for m in xsearch(foo_pattern, subject):
> '>>>     pass
> '>>> else:
> '>>>     pass

I think I would rather have it this way, based on a suggestion by
Diez B. Roggisch recently:

import re

class matcher:
    def __init__(self, regex):
        self.regex = re.compile(regex)
    def match(self, s):
        self.m = self.regex.match(s)
        return self.m is not None
    def search(self, s):
        self.m = self.regex.search(s)
        return self.m is not None
    def group(self, n = None):
        if n is None:
            return self.m.group()
        return self.m.group(n)

m = matcher('(foo)(.*)')
if m.match('foobar'):
    print m.group()
if m.search('barfoobaz'):
    print m.group(2)

I think one solution that does not need a wrapper class would be to
add the group() method to the compiled regular expression objects from
module 're'. IIRC, the obsolete package 'regex' provided this, once
upon a time.
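
For readers on Python 3.8 or later: assignment expressions (the ":="
operator from PEP 572) allow the match and the test to share one line
without any wrapper. A minimal sketch, reusing the pattern from the
example above; the function name second_group is hypothetical.

```python
import re

foo_pattern = re.compile('(foo)(.*)')


def second_group(subject):
    """Return group 2 of the first match in subject, or None if no match."""
    # The := operator binds m and tests it in the same expression.
    if (m := foo_pattern.search(subject)) is not None:
        return m.group(2)
    return None


print(second_group('barfoobaz'))
```

This does not exist in the Python versions discussed in this thread, where
the wrapper class remains the practical option.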

Cheers,
Johann


Re: ANN: wxPython 2.7.1.3

2006-10-27 Thread Johann C. Rocholl
Hi Robin,

You may want to use a spell checker for announcements and for the
wxpython.org website. For example, the first paragraph of your
announcement contains the words "plust" and "pacakges", and the word
"pacakge" can also be found on the following pages:

www.wxpython.org/download.php
www.wxpython.org/wxPython.spec
www.wxpython.org/CHANGES.html

Oh, it's fun to be a speling fanatic! :-)

Cheers,
Johann


XML-RPC server with xmlrpclib and mod_python

2006-06-01 Thread Johann C. Rocholl
Hi all,

I'm wondering what would be the best way to write an XML-RPC server
using mod_python with Apache 2.0. I want the mod_python environment
because the rest of my project is web-based, and Apache gives me
multi-threading and everything.

My first attempt implements XML-RPC introspection. Please have a look
at the following files and give me a rigorous critique. All comments
are welcome, even about coding style and module design.

My handler for incoming mod_python requests:
http://trac.browsershots.org/browser/trunk/shotserver/lib/xmlrpc/__init__.py

The introspection module (I plan to have more modules in that folder):
http://trac.browsershots.org/browser/trunk/shotserver/lib/xmlrpc/system.py

A simple standalone test client:
http://trac.browsershots.org/browser/trunk/shotserver/scripts/xmlrpc_help.py

Are there any existing interfaces to use xmlrpclib with mod_python?
Are there any security issues that I should be aware of when
implementing XML-RPC?
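
One possible shape for this, sketched with the standard library of current
Python 3 rather than the xmlrpclib and mod_python of this post: keep the
XML-RPC dispatching in a plain function that maps a request body to a
response body, so that any web handler can call it. The functions add,
make_dispatcher, handle_request_body and call are hypothetical names, and
_marshaled_dispatch is an underscore-prefixed method of
SimpleXMLRPCDispatcher, so treat its use here as an assumption rather than
a stable interface.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher


def add(a, b):
    """Return the sum of two numbers."""
    return a + b


def make_dispatcher():
    """Create a dispatcher that exposes only explicitly registered functions."""
    dispatcher = SimpleXMLRPCDispatcher(allow_none=False, encoding='utf-8')
    dispatcher.register_function(add, 'add')
    # Adds system.listMethods, system.methodHelp and system.methodSignature.
    dispatcher.register_introspection_functions()
    return dispatcher


def handle_request_body(dispatcher, body):
    """Map the raw XML of a request to the raw XML of the response."""
    return dispatcher._marshaled_dispatch(body)


def call(dispatcher, method, *params):
    """Test helper: encode a call, dispatch it in-process, decode the result."""
    request = xmlrpc.client.dumps(params, methodname=method)
    response = handle_request_body(dispatcher, request.encode('utf-8'))
    values, _ = xmlrpc.client.loads(response)
    return values[0]


dispatcher = make_dispatcher()
print(call(dispatcher, 'add', 2, 3))
print(call(dispatcher, 'system.listMethods'))
```

On the security side, registering functions one by one keeps anything that
was not registered out of reach of remote callers.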

Thanks in advance,
Johann



Writing PNG with pure Python

2006-06-09 Thread Johann C. Rocholl
Just in case anybody has the same problem, here's my first attempt at
implementing a subset of the PNG spec in pure Python. I license it to
you under the terms of the GNU GPL.

http://trac.browsershots.org/browser/trunk/shotfactory/lib/image/png.py

It encodes RGB images with 24 bits per pixel into PNG, using only the
modules sys, zlib and struct. These are all included in the base
distribution of Python. You don't need gd or imlib.

I have done a little testing, and my implementation processes 8 megs of
RGB input in 0.6 seconds. With Adam7 interlacing enabled, it takes 10
times longer.
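
For orientation, a non-interlaced 24-bit RGB encoder needs little more
than the file signature and three chunks. The following is a minimal
sketch in Python 3, not the linked png.py; the function names png_chunk
and encode_rgb_png are chosen here for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def png_chunk(tag, data):
    """Build one chunk: 4-byte length, 4-byte type, data, CRC of type+data."""
    crc = zlib.crc32(tag + data) & 0xffffffff
    return struct.pack('>I', len(data)) + tag + data + struct.pack('>I', crc)


def encode_rgb_png(width, height, rgb):
    """Encode raw 24-bit RGB bytes (row-major, no padding) as PNG data."""
    stride = width * 3
    if len(rgb) != stride * height:
        raise ValueError('expected %d bytes of RGB data' % (stride * height))
    # Every scanline is prefixed with filter type 0 (no filtering).
    raw = b''.join(b'\x00' + rgb[y * stride:(y + 1) * stride]
                   for y in range(height))
    # IHDR: width, height, bit depth 8, color type 2 (RGB),
    # compression 0, filter method 0, interlace 0 (no Adam7).
    ihdr = struct.pack('>IIBBBBB', width, height, 8, 2, 0, 0, 0)
    return (PNG_SIGNATURE +
            png_chunk(b'IHDR', ihdr) +
            png_chunk(b'IDAT', zlib.compress(raw)) +
            png_chunk(b'IEND', b''))


# A 2x1 image: one red pixel followed by one blue pixel.
data = encode_rgb_png(2, 1, bytes([255, 0, 0, 0, 0, 255]))
print(len(data), 'bytes')
```

Adam7 interlacing would require splitting the image into seven passes
before filtering, which this sketch leaves out.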

I would really appreciate any feedback and suggestions for improvement.

Cheers,
Johann



Re: Writing PNG with pure Python

2006-06-09 Thread Johann C. Rocholl
> You should really also include the alpha channel. Without that, PNG is
> crippled IMHO.

I have now added simple transparency (marking one color as transparent
with a tRNS chunk). If anybody wants full alpha channel support, ask
kindly or send me a patch. I would like to avoid duplicating all the
functions, so maybe we should introduce a parameter to switch between 3
and 4 bytes per pixel.
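
For reference, marking one color as transparent in an RGB image comes down
to a single small chunk. In the PNG specification, the tRNS chunk for
color type 2 holds one RGB value stored as three 2-byte samples, and it
must appear before the first IDAT chunk. A sketch in Python 3, not the
code from png.py; png_chunk and trns_chunk_rgb are illustrative names.

```python
import struct
import zlib


def png_chunk(tag, data):
    """Build one chunk: 4-byte length, 4-byte type, data, CRC of type+data."""
    crc = zlib.crc32(tag + data) & 0xffffffff
    return struct.pack('>I', len(data)) + tag + data + struct.pack('>I', crc)


def trns_chunk_rgb(red, green, blue):
    """tRNS chunk for color type 2: the one color to treat as transparent.

    Each sample takes two bytes even at bit depth 8, where the high
    byte is simply zero.
    """
    return png_chunk(b'tRNS', struct.pack('>HHH', red, green, blue))


# Treat pure magenta as transparent.
chunk = trns_chunk_rgb(255, 0, 255)
print(len(chunk), 'bytes')
```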

Cheers,
Johann



Re: Writing PNG with pure Python

2006-06-09 Thread Johann C. Rocholl
Alan Isaac schrieb:
> It's your code, so you get to license it.
> But if you wish to solicit patches,
> a more Pythonic license is IMHO more likely
> to prove fruitful.

What license would you suggest? After some reading at [1] and [2] and
[3], I found that the Academic Free License (version 2.1) and the
Apache License (version 2.0) are considered helpful for contributions
to the Python Software Foundation.

So far, I haven't used either of these licenses for my own code, and
after a little reading the AFL seems okay for me. I would perhaps
consider the LGPL [4] as an alternative because it is less restrictive
than the GPL.

Thoughts or links, anybody?

Cheers,
Johann

[1] http://www.python.org/moin/PythonSoftwareFoundationLicenseFaq
[2] http://www.python.org/psf/records/board/minutes/2004-11-09/
[3] http://www.python.org/psf/contrib/
[4] http://www.gnu.org/licenses/lgpl.html



Re: Writing PNG with pure Python

2006-06-09 Thread Johann C. Rocholl
The MIT license is enticingly short and simple, thank you for the tip.

I have now decided to license my project (including the pure python PNG
library) under the Apache License 2.0 which is less restrictive than
the GPL in terms of sublicensing. The Apache License looks modern and
well-maintained to me. My project runs on Apache, so there is some
context there as well. Also, this change will simplify things if I ever
want to contribute some of the code to the Python Software Foundation.

Cheers,
Johann



Re: Writing PNG with pure Python

2006-06-12 Thread Johann C. Rocholl
> Just in case anybody has the same problem, here's my first attempt at
> implementing a subset of the PNG spec in pure Python. I license it to
> you under the terms of the GNU GPL.

Update: the code is now licensed under the Apache License 2.0.

> http://trac.browsershots.org/browser/trunk/shotfactory/lib/image/png.py

Update: the module has moved to its own package, with its own setup.py:

http://trac.browsershots.org/browser/trunk/pypng
http://svn.browsershots.org/trunk/pypng/

Cheers,
Johann



Re: Writing PNG with pure Python

2006-06-12 Thread Johann C. Rocholl
> > I have now decided to license my project (including the pure python PNG
> > library) under the Apache License 2.0 which is less restrictive than
> > the GPL in terms of sublicensing.
>
> But it is also incompatible with the GPL:
>
> http://www.fsf.org/licensing/licenses/index_html#GPLIncompatibleLicenses

Thank you for enlightening me. I was under the wrong impression that
the Apache License was compatible with the GPL, perhaps because it is
OSI-approved, which means a different thing as I now understand.

> If you're convinced that a permissive licence suits your code best,
> please consider something whose side-effects you understand. If the
> additional patent grant or licence termination clauses (which the FSF
> don't regard as a bad thing, just something incompatible with the
> current GPL/LGPL) are specifically what you want, then the Apache
> Licence may be what you're after; otherwise, you should choose
> something less baroque and better understood, perhaps from this list:
>
> http://www.fsf.org/licensing/licenses/index_html#GPLCompatibleLicenses

I do believe that my code will be useful for more people if it's under
a permissive license, but obviously also if it's under a GPL-compatible
license. Therefore it's perhaps a good idea to change the license of my
software again.

Currently, I am considering the following options:
- LGPL
- Modified BSD License
- X11 License (aka MIT License)

I appreciate the simplicity of the BSD and MIT Licenses, except for the
names. "BSD License" can be confused with the original BSD License,
while "MIT License" according to the FSF "is misleading, since MIT has
used many licenses for software." But perhaps these drawbacks are just
mentioned on the FSF page to get more people to use the GPL or LGPL.
:-)

I don't want to start a holy war about the benefits of the GPL, but I
would like some more input about the choices of licensing. Perhaps I'll
put the larger part of my project under the GPL and only some
standalone library parts (like the PNG encoder) under the LGPL.

If I ever want to contribute some of the code to the Python Software
Foundation, I can still license it to them under the Apache License,
right? But how about the parts of the code that others contribute to my
software while it's licensed under the LGPL?

Cheers, Johann



Re: Writing PNG with pure Python

2006-06-13 Thread Johann C. Rocholl
How about this here construct?

#!/usr/bin/env python
# png.py - PNG encoder in pure Python
# Copyright (C) 2006 Johann C. Rocholl <[EMAIL PROTECTED]>
#
# This file is licensed alternatively under one of the following:
# 1. GNU Lesser General Public License (LGPL), Version 2.1 or newer
# 2. GNU General Public License (GPL), Version 2 or newer
# 3. Apache License, Version 2.0 or newer
# 4. The following license (aka MIT License)
#
# - start of license -
# Copyright (C) 2006 Johann C. Rocholl <[EMAIL PROTECTED]>
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# --- end of license -
#
# You may not use this file except in compliance with at least one of
# the above four licenses.



Re: Writing PNG with pure Python

2006-06-14 Thread Johann C. Rocholl
Ben Finney schrieb:
> Simplify. Please don't attempt to write yet another set of license
> terms without expert legal assistance. You've already chosen the Expat
> license as being acceptable; use that, and you grant all the rest
> without even mentioning it.

Sorry for my stubborn ignorance, and thank you for your patient
explanations.

I think I finally got it: I'll put the bulk of my browsershots project
under the GNU GPL again, and the independent library parts like png.py
under the Expat License.

Cheers,
Johann



Re: [Python-Dev] Python 3000 PEP: Postfix type declarations

2007-04-01 Thread Johann C. Rocholl
Brilliant!

On 4/1/07, Georg Brandl <[EMAIL PROTECTED]> wrote:
>  def foo${LATIN SMALL LETTER LAMBDA WITH STROKE}$(x${DOUBLE-STRUCK CAPITAL C}$):
>      return None${ZERO WIDTH NO-BREAK SPACE}$
>
> This is still easy to read and makes the full power of type-annotated Python
> available to ASCII believers.

+1

J


Taint (like in Perl) as a Python module: taint.py

2007-02-05 Thread Johann C. Rocholl
The following is my first attempt at adding a taint feature to Python
to prevent os.system() from being called with untrusted input. What do
you think of it?

# taint.py - Emulate Perl's taint feature in Python
# Copyright (C) 2007 Johann C. Rocholl <[EMAIL PROTECTED]>
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


"""
Emulate Perl's taint feature in Python

This module replaces all functions in the os module (except stat) with
wrappers that will raise an Exception called TaintError if any of the
parameters is a tainted string.

All strings are tainted by default, and you have to call untaint on a
string to create a safe string from it.

Stripping, zero-filling, and changes to lowercase or uppercase don't
taint a safe string.

If you combine strings with + or join or replace, the result will be a
tainted string unless all its parts are safe.

It is probably a good idea to run some checks on user input before you
call untaint() on it. The safest way is to design a regex that matches
legal input only. A regex that tries to match illegal input is very
hard to prove complete.

You can run the following examples with the command
    python taint.py -v
to test if this module works as designed.

>>> unsafe = 'test'
>>> tainted(unsafe)
True
>>> os.system(unsafe)
Traceback (most recent call last):
TaintError
>>> safe = untaint(unsafe)
>>> tainted(safe)
False
>>> os.system(safe)
256
>>> safe + unsafe
u'testtest'
>>> safe.join([safe, unsafe])
u'testtesttest'
>>> tainted(safe + unsafe)
True
>>> tainted(safe + safe)
False
>>> tainted(unsafe.join([safe, safe]))
True
>>> tainted(safe.join([safe, unsafe]))
True
>>> tainted(safe.join([safe, safe]))
False
>>> tainted(safe.replace(safe, unsafe))
True
>>> tainted(safe.replace(safe, safe))
False
>>> tainted(safe.capitalize()) or tainted(safe.title())
False
>>> tainted(safe.lower()) or tainted(safe.upper())
False
>>> tainted(safe.strip()) or tainted(safe.rstrip()) or tainted(safe.lstrip())
False
>>> tainted(safe.zfill(8))
False
>>> tainted(safe.expandtabs())
True
"""

import os
import types


class TaintError(Exception):
    """
    This exception is raised when you try to call a function in the os
    module with a string parameter that isn't a SafeString.
    """
    pass


class SafeString(unicode):
    """
    A string class that you must use for parameters to functions in
    the os module.
    """

    def __add__(self, other):
        """Create a safe string if the other string is also safe."""
        if tainted(other):
            return unicode.__add__(self, other)
        return untaint(unicode.__add__(self, other))

    def join(self, sequence):
        """Create a safe string if all components are safe."""
        for element in sequence:
            if tainted(element):
                return unicode.join(self, sequence)
        return untaint(unicode.join(self, sequence))

    def replace(self, old, new, *args):
        """Create a safe string if the replacement text is also
        safe."""
        if tainted(new):
            return unicode.replace(self, old, new, *args)
        return untaint(unicode.replace(self, old, new, *args))

    def strip(self, *args):
        return untaint(unicode.strip(self, *args))

    def lstrip(self, *args):
        return untaint(unicode.lstrip(self, *args))

    def rstrip(self, *args):
        return untaint(unicode.rstrip(self, *args))

    def zfill(self, *args):
        return untaint(unicode.zfill(self, *args))

    def capitalize(self):
        return untaint(unicode.capitalize(self))
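
The docstring's advice to design a regex that matches legal input only can
be sketched as follows. This is an illustration in Python 3, not part of
taint.py: SafeStr stands in for the SafeString class above, and
LEGAL_FILENAME and untaint_filename are hypothetical names with an
arbitrarily chosen allow-list.

```python
import re


class SafeStr(str):
    """Marker subclass standing in for SafeString (hypothetical name)."""


# Allow-list: starts with a letter, digit or underscore, followed by up to
# 63 letters, digits, dots, underscores or hyphens. No slashes, no spaces,
# no shell metacharacters.
LEGAL_FILENAME = re.compile(r'[A-Za-z0-9_][A-Za-z0-9._-]{0,63}')


def untaint_filename(value):
    """Return a SafeStr if value matches the allow-list, else raise ValueError."""
    if not isinstance(value, str) or LEGAL_FILENAME.fullmatch(value) is None:
        raise ValueError('refusing to untaint %r' % (value,))
    return SafeStr(value)


print(untaint_filename('report-2007.txt'))
```

Because the pattern describes what is allowed instead of what is
forbidden, input the author never thought of is rejected by default.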


Re: Taint (like in Perl) as a Python module: taint.py

2007-02-06 Thread Johann C. Rocholl
On Feb 6, 3:01 am, Ben Finney <[EMAIL PROTECTED]>
wrote:
> "Gabriel Genellina" <[EMAIL PROTECTED]> writes:
> > And tainted() returns False by default?
> > Sorry but in general, this won't work :(
>
> I'm inclined to agree that the default should be to flag an object as
> tainted unless known otherwise.

That's true. For example, my first attempt didn't prevent this:
os.open(buffer('/etc/passwd'), os.O_RDONLY)

Here's a stricter version:

def tainted(param):
    """
    Check if a parameter is tainted. If it's a sequence or dict, all
    values will be checked (but not the keys).
    """
    if isinstance(param, unicode):
        return not isinstance(param, SafeString)
    elif isinstance(param, (bool, int, long, float, complex, file)):
        return False
    elif isinstance(param, (tuple, list)):
        for element in param:
            if tainted(element):
                return True
        # No tainted element found in the sequence.
        return False
    elif isinstance(param, dict):
        return tainted(param.values())
    else:
        # Unknown types are treated as tainted by default.
        return True
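
To show how such a deny-by-default check could be combined with wrapping,
here is a sketch ported to Python 3. It is not the wrapper code from
taint.py; guard is a hypothetical decorator name, and str replaces the
unicode type used above.

```python
import functools


class TaintError(Exception):
    """Raised when a guarded function receives a tainted argument."""


class SafeString(str):
    """Marker type for strings that have been explicitly untainted."""


def tainted(param):
    """Deny by default: only known-harmless types count as untainted."""
    if isinstance(param, str):
        return not isinstance(param, SafeString)
    if isinstance(param, (bool, int, float, complex)):
        return False
    if isinstance(param, (tuple, list)):
        return any(tainted(element) for element in param)
    if isinstance(param, dict):
        return tainted(list(param.values()))
    return True  # bytes, buffers and every other unknown type


def guard(func):
    """Wrap func so it raises TaintError on any tainted argument."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if tainted(args) or tainted(kwargs):
            raise TaintError(func.__name__)
        return func(*args, **kwargs)
    return wrapper


shout = guard(lambda s: s.upper())
print(shout(SafeString('ok')))
```

With this check, a bytes object such as b'/etc/passwd' is rejected because
it falls through to the final return True.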
