A quirk/gotcha of for i, x in enumerate(seq) when seq is empty

2012-02-23 Thread Alex Willmer
This week I was slightly surprised by a behaviour that I've not
considered before. I've long used

for i, x in enumerate(seq):
   # do stuff

as a standard looping-with-index construct. In Python for loops don't
create a scope, so the loop variables are available afterward. I've
sometimes used this to print or return a record count e.g.

for i, x in enumerate(seq):
   # do stuff
print 'Processed %i records' % (i+1)

However as I found out, if seq is empty then i and x are never
created. The above code will raise NameError. So if a record count is
needed, and the loop is not guaranteed to execute the following seems
more correct:

i = 0
for x in seq:
    # do stuff
    i += 1
print 'Processed %i records' % i

Just thought it worth mentioning, curious to hear other options/
improvements/corrections.
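
One possible variation, sketched here on the assumption that only the count is needed after the loop: initialise the counter before the loop and let enumerate() start at 1 (its start argument exists from Python 2.6). The function name count_records is made up for illustration, and the print() calls are written so they also run on Python 3.

```python
def count_records(seq):
    """Return how many items were processed, even when seq is empty."""
    count = 0  # keeps this value if the loop body never runs
    for count, x in enumerate(seq, 1):  # count starts at 1, not 0
        pass  # do stuff
    return count

print('Processed %i records' % count_records([]))
print('Processed %i records' % count_records(['a', 'b', 'c']))
```

The counter is then always bound, so no NameError can occur, and no +1 adjustment is needed at the end.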
-- 
http://mail.python.org/mailman/listinfo/python-list


Benchmarking stripping of Unicode characters which are invalid XML

2012-03-18 Thread Alex Willmer
Last week I was surprised to discover that there are Unicode characters that 
aren't valid in an XML document. That is regardless of escaping (e.g. a numeric 
character reference) and Unicode encoding (e.g. UTF-8) - not every Unicode 
string can be stored in XML. The valid characters are (as of XML 1.0) 
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. 
Others such as #x13 must be stripped, replaced or placed inside a wrapper such 
as base64.

I didn't find an existing function to strip these so I wrote some and 
benchmarked them. I'd be interested for thoughts, suggestions and improvements.
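
For readers on Python 3, the regex approach can be sketched as follows. This is an illustrative rewrite, not one of the benchmarked functions: the name strip_invalid_xml is made up here, and it assumes the input is a Python 3 str.

```python
import re

# Matches any character outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    '[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]')

def strip_invalid_xml(s):
    """Return s with every character that XML 1.0 forbids removed."""
    return _INVALID_XML_CHARS.sub('', s)

print(repr(strip_invalid_xml('a\x13b\x00c\n')))
```

Replacing instead of stripping only needs a different second argument to sub(), for example '\ufffd'.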

regsub_p2 was the fastest on a string containing mostly printable-ascii.

regsub_p1 0.422097921371 True
regsub_p2 0.353546857834 True
regsub_p3 0.697242021561 True
regsub_p4 0.677567005157 True
genexp_p1 6.43633103371 True
genexp_p2 6.43329787254 True
genexp_p3 6.80837488174 True
genexp_p4 6.81470417976 True
filter_p1 7.2116046 True
filter_p2 7.46805095673 True
filter_p3 7.37018704414 True
filter_p4 7.03261303902 True
genexp_f1 12.8470640182 True
genexp_f2 5.43630099297 True
genexp_f3 4.9708840847 True
genexp_f4 12.2384109497 True
genexp_f5 6.95861411095 True
genexp_f6 4.7168610096 True
genexp_f7 20.2065701485 True
genexp_f8 21.1112251282 True

Regards, Alex
#!/usr/bin/python
# Valid XML 1.0 characters are
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# http://www.w3.org/TR/2008/PER-xml-20080205/#charsets
#
# Before passing an arbitrary unicode string to an XML encoder invalid
# characters must be stripped or replaced. Escaping them doesn't help - they're
# simply not allowed in a well formed XML 1.0 document.

# The following script benchmarks several functions that strip them

import re
import string
import timeit


p1 = re.compile(u'[^\x09\x0A\x0D\u0020-\uD7FF'
                u'\uE000-\uFFFD\U00010000-\U0010FFFF]', re.U)

p2 = re.compile(u'[^\u0020-\uD7FF\x09\x0A\x0D'
                u'\uE000-\uFFFD\U00010000-\U0010FFFF]', re.U)

p3 = re.compile(p1.pattern + u'+', re.U)
p4 = re.compile(p2.pattern + u'+', re.U)

def regsub_p1(s): return p1.sub(u'', s)
def regsub_p2(s): return p2.sub(u'', s)
def regsub_p3(s): return p3.sub(u'', s)
def regsub_p4(s): return p4.sub(u'', s)

def genexp_p1(s): return u''.join(c for c in s if not p1.match(c))
def genexp_p2(s): return u''.join(c for c in s if not p2.match(c))
def genexp_p3(s): return u''.join(c for c in s if not p3.match(c))
def genexp_p4(s): return u''.join(c for c in s if not p4.match(c))

def filter_p1(s): return u''.join(filter(lambda c: not p1.match(c), s))
def filter_p2(s): return u''.join(filter(lambda c: not p2.match(c), s))
def filter_p3(s): return u''.join(filter(lambda c: not p3.match(c), s))
def filter_p4(s): return u''.join(filter(lambda c: not p4.match(c), s))


def f1(c):
    i = ord(c)
    return (i in set([0x09, 0x0A, 0x0D]) or 0x0020 <= i <= 0xD7FF
            or 0xE000 <= i <= 0xFFFD or 0x00010000 <= i <= 0x0010FFFF)

def f2(c):
    i = ord(c)
    return (0x0020 <= i <= 0xD7FF or i in set([0x09, 0x0A, 0x0D])
            or 0xE000 <= i <= 0xFFFD or 0x00010000 <= i <= 0x0010FFFF)

def f3(c):
    return (u'\u0020' <= c <= u'\uD7FF' or c in set([u'\x09', u'\x0A', u'\x0D'])
            or u'\uE000' <= c <= u'\uFFFD'
            or u'\U00010000' <= c <= u'\U0010FFFF')

def f4(c):
    return (c in set([u'\x09', u'\x0A', u'\x0D']) or u'\u0020' <= c <= u'\uD7FF'
            or u'\uE000' <= c <= u'\uFFFD'
            or u'\U00010000' <= c <= u'\U0010FFFF')

def f5(c):
    return (c == u'\x09' or c == u'\x0A' or c == u'\x0D'
            or u'\u0020' <= c <= u'\uD7FF'
            or u'\uE000' <= c <= u'\uFFFD'
            or u'\U00010000' <= c <= u'\U0010FFFF')

def f6(c):
    return (u'\u0020' <= c <= u'\uD7FF'
            or c == u'\x09' or c == u'\x0A' or c == u'\x0D'
            or u'\uE000' <= c <= u'\uFFFD'
            or u'\U00010000' <= c <= u'\U0010FFFF')

every_8bit = u''.join(unichr(i) for i in range(256))
valid_8bit = u''.join(c for c in every_8bit if f1(c))
invalid_8bit = u''.join(c for c in every_8bit if not f1(c))
invalid_8bit_iso88591 = invalid_8bit.encode('iso-8859-1')
translator = string.maketrans(invalid_8bit_iso88591,
  '\x00' * len(invalid_8bit_iso88591))

def f7(c):
    return ((c <= u'\xff'
             and ord(string.translate(c.encode('iso-8859-1'), translator)))
            or u'\uE000' <= c <= u'\uFFFD'
            or u'\U00010000' <= c <= u'\U0010FFFF')

def f8(c):
    return ((c <= u'\xff'
             and string.translate(c.encode('iso-8859-1'), None,
                                  invalid_8bit_iso88591))
            or u'\uE000' <= c <= u'\uFFFD'
            or u'\U00010000' <= c <= u'\U0010FFFF')

def genexp_f1(s): return u''.join(c for c in s if f1(c))
def genexp_f2(s): return u''.join(c for c in s if f2(c))
def genexp_f3(s): return u''.join(c for c in s if f3(c))
def genexp_f4(s): return u''.join(c for c in s if f4(c))
def genexp_f5(s): return u''.join(c for c in s if f5(c))
def genexp_f6(s): return u''.join(c for c in s if f6(c))
def genexp_f7(s): return u''.join(c for c in s if f7(c))
def genexp_f8(s): return u''.join(c for c in s if f8(c))

Re: trac.util

2012-04-14 Thread Alex Willmer
On Apr 11, 9:52 pm, cerr  wrote:
> Hi,
>
> I want to install some python driver on my system that requires trac.util 
> (from Image.py) but I can't find that anywhere, any suggestions, anyone?
>
> Thank you very much, any help is appreciated!
>
> Error:
> File "/root/weewx/bin/Image.py", line 32, in 
>     from  trac.util import escape
> ImportError: No module named trac.util

Trac is a bug tracker/wiki. Installation instructions are at
http://trac.edgewall.org/wiki/TracDownload or you can probably install it
through your operating system's package manager.

Regards, Alex


Re: *.sdf database access

2012-04-21 Thread Alex Willmer
On Apr 19, 9:18 pm, Page3D  wrote:
> Hi, I am trying to connect and access data in a *.sdf file on Win7
> system using Python 2.7. I have three questions:
>
> 1. What python module should I use? I have looked at sqlite3 and
> pyodbc. However, I can seem to get the connection to the database file
> setup properly.

I assume you mean SQL Server Compact by *.sdf. However please note
that there are several file formats matching SDF
http://en.wikipedia.org/wiki/SDF#Computing and explicit is better than
implicit.

The sqlite3 module won't help - that's for sqlite files, which are an
entirely different file format. Wikipedia says of SQL Server Compact
"An ODBC driver for SQL CE does not exist, and one is not planned
either. Native applications may use SQL CE via OLE DB"
http://en.wikipedia.org/wiki/SQL_Server_Compact. I believe the
adodbapi module, part of PyWin32 http://sourceforge.net/projects/pywin32/files/
can connect over OLE DB.

> 2. How can I determine the appropriate connection string? I have
> opened database file in Visual Studio and can see the tables. I don't
> understand where to find the connection string in Visual Studio.

These look promising http://www.connectionstrings.com/sql-server-2005-ce

> 3. Assuming a module from (1) above, does anyone have a code snippet
> for connecting to the database and then accessing a varbinary (image)
> in one of the tables of the databese?

Pass, I'm afraid

Regards, Alex


Re: subclass urllib2

2011-02-28 Thread Alex Willmer
On Feb 28, 6:53 pm, monkeys paw  wrote:
> I'm trying to subclass urllib2 in order to mask the
> version attribute. Here's what i'm using:
>
> import urllib2
>
> class myURL(urllib2):
>      def __init__(self):
>          urllib2.__init__(self)
>          self.version = 'firefox'
>
> I get this>
> Traceback (most recent call last):
>    File "", line 1, in 
> TypeError: Error when calling the metaclass bases
> module.__init__() takes at most 2 arguments (3 given)
>
> I don't see where i am supplying 3 arguments. What am i
> missing?

urllib2 is a module, not a class, so you can't subclass it. You could
subclass one of the classes inside urllib2, such as
urllib2.BaseHandler. Whether you want to depends on what you want to
achieve.
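
If the aim is only to change the User-Agent that gets sent, no subclass is needed. A minimal sketch, assuming Python 3 (where urllib2 became urllib.request) and reusing 'firefox' from the question as a placeholder value:

```python
import urllib.request

# build_opener() returns an OpenerDirector instance; the headers listed in
# its addheaders attribute are sent with every request made through it.
opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent', 'firefox')]

# opener.open(some_url) would now send that User-Agent header.
print(opener.addheaders)
```

On Python 2 the same two calls are spelled urllib2.build_opener() and opener.addheaders.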


下载 below Download, in python.org site menu

2011-03-06 Thread Alex Willmer
On the English version of http://python.org I'm seeing 下载 as a menu
item between Download and Community. AFAICT it's Simplified Chinese
for 'download'. Is its appearance intentional, or a leak through from
a translation of the entire page?

Regards, Alex

PS Tested with 10.0.648.114 (75702) and Firefox 3.6.14 on Ubuntu 10.10/
en_GB locale.


Re: error in exception syntax

2011-03-09 Thread Alex Willmer
On Mar 9, 6:12 pm, "Aaron Gray"  wrote:
> On Windows I have installed Python 3.2 and PyOpenGL-3.0.1 and am getting the
> following error :-
>
>     File "c:\Python32\lib\site-packages\OpenGL\platform\win32.py", line 13
>       except OSError, err:
>                 ^
>
> It works okay on my Linux machine running Python 2.6.2.

Python 3.x is a different beast to Python 2.x. It has a number of
backward incompatible changes, including the try/except syntax.
Install the latest Python 2.x (Python 2.7) on your Windows machine and
use that. Then read http://docs.python.org/release/3.2/whatsnew/3.0.html
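
For reference, the specific change behind that SyntaxError is the spelling of the except clause. A small sketch, assuming Python 3; the path is a made-up one that should not exist:

```python
try:
    open('/no/such/file/for/this/demo')
except OSError as err:  # Python 2 also allowed: except OSError, err:
    message = 'caught: %s' % err

print(message)
```

The "as" form is accepted from Python 2.6 onward, so code written this way parses on both 2.6+ and 3.x.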

Regards, Alex



Re: argparse and filetypes

2011-03-22 Thread Alex Willmer
On Mar 22, 2:06 pm, Bradley Hintze 
wrote:
> I just started with argparse. I want to simply check the extension of
> the file that the user passes to the program. I get a ''file' object
> has no attribute 'rfind'' error when I use
> os.path.splitext(args.infile).  Here is my code.
>
> import argparse, sys, os
>
> des = 'Get restraint definitions from probe.'
> parser = argparse.ArgumentParser(description=des)
> parser.add_argument('infile', nargs='?', type=argparse.FileType('r'))
> # parser.add_argument('outfile', nargs='?', type=argparse.FileType('w'),
>                    # default=sys.stdout)
>
> args = parser.parse_args()
> # print args.infile.readlines()
> print basename, extension = os.path.splitext(args.infile)

Because you specified type=argparse.FileType('r') argparse has created
args.infile as a file object (e.g. open('/some/path/data.dat', 'r')),
not as a string containing the path. So type(args.infile) is file
and args.infile.readlines() returns the contents of that
file.

If you wish to check the file extension of the path in question I
suggest you remove type=argparse.FileType('r'); argparse will then create
args.infile as a string containing that path. To open the file call
open(args.infile, 'r'), which will return the file object.
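
A short sketch of the string-based approach, assuming Python 3 syntax. The path passed to parse_args() is a made-up example standing in for sys.argv:

```python
import argparse
import os

des = 'Get restraint definitions from probe.'
parser = argparse.ArgumentParser(description=des)
parser.add_argument('infile')  # no type=, so the value stays a plain string

# An explicit argument list is used here instead of the real command line.
args = parser.parse_args(['results/probe_output.dat'])

basename, extension = os.path.splitext(args.infile)
print(extension)
```

The extension check can then happen before the file is opened at all.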


Re: argparse and filetypes

2011-03-22 Thread Alex Willmer
On Mar 22, 2:06 pm, Bradley Hintze 
wrote:
> Hi,
>
> I just started with argparse. I want to simply check the extension of
> the file that the user passes to the program. I get a ''file' object
> has no attribute 'rfind'' error when I use
> os.path.splitext(args.infile).

Addendum, some file objects have a name attribute (which I hadn't
noticed until today):

file.name
If the file object was created using open(), the name of the file.
Otherwise, some string that indicates the source of the file object,
of the form <...>. This is a read-only attribute and may not be
present on all file-like objects.

http://docs.python.org/library/stdtypes.html#file-objects
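
So with FileType kept in place, the extension can be taken from the name attribute. A sketch assuming Python 3, with a temporary file created only so that FileType has something to open:

```python
import argparse
import os
import tempfile

# Throwaway input file for the demonstration.
fd, path = tempfile.mkstemp(suffix='.dat')
os.close(fd)

parser = argparse.ArgumentParser()
parser.add_argument('infile', type=argparse.FileType('r'))
args = parser.parse_args([path])

# args.infile is an open file object; its .name holds the path it came from.
extension = os.path.splitext(args.infile.name)[1]
print(extension)

args.infile.close()
os.remove(path)
```

As the documentation quoted above warns, name is not guaranteed on every file-like object, and for '-' (stdin) it is a placeholder such as '<stdin>' rather than a real path.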


Re: file print extra spaces

2011-03-22 Thread Alex Willmer
On Mar 23, 1:33 am, monkeys paw  wrote:
> When i open a file in python, and then print the
> contents line by line, the printout has an extra blank
> line between each printed line (shown below):
>
>  >>> f=open('authors.py')
>  >>> i=0
>  >>> for line in f:
>         print(line)
>         i=i+1
>         if i > 14:
>                 break
>
> author_list = {
>
>                    '829337' : {
>
>                                  'author_name' : 'Carter, John',
>
>                                  'count' : 49,
>
>                                  'c2' : '0.102040816326531',
>
> How can i print the file out without having an extra newline between
> printed lines?
>
> Thanks for the help all.

What you are seeing is 1 newline from the original file, and a second
newline from the print() function. To resolve this use either
print(line, end='') or sys.stdout.write(line).
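
A small sketch of the first option, assuming Python 3. io.StringIO stands in for the opened file, and a second StringIO captures the output so the result can be inspected:

```python
import io

f = io.StringIO("author_list = {\n    '829337' : {\n")
shown = io.StringIO()

for line in f:
    # line already ends with '\n', so tell print() not to add another
    print(line, end='', file=shown)

print(shown.getvalue(), end='')
```

Stripping the newline instead, with print(line.rstrip('\n')), gives the same output.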

Regards, Alex


Re: Validating Command Line Options

2011-03-23 Thread Alex Willmer
On Mar 23, 3:20 pm, T  wrote:
> Thanks!  argparse is definitely what I need..unfortunately I'm running
> 2.6 now, so I'll need to upgrade to 2.7 and hope that none of my other
> scripts break.

Argparse was a third-party module before it became part of the std-
lib. You may find it easier to use this version:

http://pypi.python.org/pypi/argparse/


Re: Inconsistency with split() - Script, OS, or Package Problem?

2011-05-09 Thread Alex Willmer
(Direct reply to me, reposted on James's behalf)



Hi Alex,

On Mon, May 9, 2011 at 3:21 PM, Alex Willmer 
wrote:
> On May 9, 8:10 pm, James Wright  wrote:
>> Hello Ian,
>>
>> It does indeed to seem that way.  However the script works just fine
>> on other machines, with the same input file.
>
> How sure of that are you? Post md5sums of the script and the input
> file on a working machine and a non-working (4 md5sums total).
>
> Does the script use any non-stdlib modules? Check the md5sum of those
> too.
>
> Are the platforms/python versions the same? Post the output of uname and
> python -V from each.
>
> Can you post the script and the file online for us? Do so.
>
> Regards, Alex
>

The md5sums match (great idea by the way, I hadn't thought of that).

What do you mean by using a non-stdlib module?  I am not using any
import statements, if that is what you are referring to.

As for kernel and Python versions, I have updated both in case this
was a bug - they no longer match across the machines I am testing on.
The script consistently works with the machines that it always has
been working with.  The script continues to not work on the new VMWare
installs (but does in new VirtualBox installs - odd).

I can post the script (attached), but am not so certain that is at
fault - given that the script is running on other hosts just fine.

Please read the script with a grain of salt, this is my first run at
this, and I am well aware that I am not yet following the conventions
of Python.  I have much to learn.



Thanks,
James


Human readable number formatting

2005-09-27 Thread Alex Willmer
When reporting file sizes to the user, it's nice to print '16.1 MB',
rather than '16123270 B'. This is the behaviour the command 'df -h'
implements. There's no python function that I could find to perform this
formatting, so I've taken a stab at it:

import math
def human_readable(n, suffix='B', places=2):
    '''Return a human friendly approximation of n, using SI prefixes'''
    prefixes = ['','k','M','G','T']
    base, step, limit = 10, 3, 100

    if n == 0:
        magnitude = 0 #cannot take log(0)
    else:
        magnitude = math.log(n, base)

    order = int(round(magnitude)) // step
    return '%.1f %s%s' % (float(n)/base**(order*step), \
                          prefixes[order], suffix)

Example usage
>>> print [human_readable(x) for x in [0, 1, 23.5, 100, 1000/3, 500,
1000000, 12.345e9]]
['0.0 B', '1.0 B', '23.5 B', '100.0 B', '0.3 kB', '0.5 kB', '1.0 MB',
'12.3 GB']

I'd hoped to generalise this to base 2 (eg human_readable(1024, base=2)
== '1 KiB') and to enforce 3 digits at most (ie human_readable(100) ==
'0.1 KB' instead of '100 B'). However I can't get the right results
adapting the above code.

Here's where I'd like to ask for your help.
Am I chasing the right target, in basing my function on log()?
Does this function already exist in some python module?
Any hints, or would anyone care to finish it off/enhance it?
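
One possible direction, offered as a sketch rather than a finished answer: repeated division avoids log() and its rounding surprises, and switching the factor between 1000 and 1024 covers both SI and binary prefixes. The base parameter and the prefix lists are assumptions of this sketch, and it keeps values below the factor unscaled (so 100 stays '100.0 B') rather than enforcing the stricter three-digit rule described above.

```python
def human_readable(n, suffix='B', base=10):
    """Return n scaled to the largest prefix that keeps the value >= 1."""
    if base == 2:
        factor, prefixes = 1024.0, ['', 'Ki', 'Mi', 'Gi', 'Ti']
    else:
        factor, prefixes = 1000.0, ['', 'k', 'M', 'G', 'T']
    value = float(n)
    order = 0
    # Divide down until the value fits, or the prefixes run out.
    while abs(value) >= factor and order < len(prefixes) - 1:
        value /= factor
        order += 1
    return '%.1f %s%s' % (value, prefixes[order], suffix)

print(human_readable(16123270))
print(human_readable(1024, base=2))
```

Values beyond the last prefix are simply reported in that prefix, so nothing indexes past the end of the list.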

With thanks

Alex




CSV like file format featured recently in Daily Python URL?

2005-10-07 Thread Alex Willmer
I'm trying to track down the name of a file format and python module,
that was featured in the Daily Python URL some time in the last month or
two.

The format was ASCII with a multiline header defining types for the
comma separated column data below. It may have had the capability to
store multiple tables in one file. There was news on the homepage that
an alternate 'no data here' syntax was also supported.

An example file had vaguely this structure:

columnname as datatype
columnname as datatype
columnname as datatype
columnname as datatype

data,data,data,data
data,"other data",data,data
data,data,"",data

Can anyone remember this file format/python module?

With thanks

Alex



Re: CSV like file format featured recently in Daily Python URL?

2005-10-08 Thread Alex Willmer
On Fri, 2005-10-07 at 18:56 +0200, Fredrik Lundh wrote:
> Alex Willmer wrote:
> 
> > I'm trying to track down the name of a file format and python module,
> > that was featured in the Daily Python URL some time in the last month or
> > two.
> 
> http://www.netpromi.com/kirbybase.html ?

No I don't think that was it. Although KirbyBase looks like a nice
project, particularly the alternative to SQL it uses to specify queries.

I remember the webpage presenting the format as fairly established and
in active use as an export/import medium, the python module was a
binding to an existing library. It's very possible I've combined the
memories of KirbyBase (for instance) and HDF. My recollection is, to say
the least, foggy.

Thank you for replying and thank you for Daily Python-URL. I'll put this
on a backburner for now. I may remember it or come across it again by
fortune.

Alex



Re: CSV like file format featured recently in Daily Python URL?

2005-10-08 Thread Alex Willmer
On Thu, 2005-10-06 at 07:44 -0800, EP wrote:
> Was it something like ARFF?  http://www.cs.waikato.ac.nz/~ml/weka/arff.html

Yes, that was it, thank you. Although it would seem there isn't a general
python module, rather a Cookbook script to perform conversion to SQL. I
must have confused ARFF with HDF.

> Google "ARFF Python":  http://www.google.com/search?q=arff+python

Curiously, when I just performed that search, this thread was on the
first page of results as a supplemental result.



Coding challenge: Optimise a custom string encoding

2014-08-18 Thread Alex Willmer
A challenge, just for fun. Can you speed up this function?

import string

charset = set(string.ascii_letters + string.digits + '@_-')
byteseq = [chr(i) for i in xrange(256)]
bytemap = {byte: byte if byte in charset else '+' + byte.encode('hex')
           for byte in byteseq}

def plus_encode(s):
    """Encode a unicode string with only ascii letters, digits, _, -, @, +
    """
    bytemap_ = bytemap  # local alias avoids a global lookup per byte
    s_utf8 = s.encode('utf-8')
    return ''.join([bytemap_[byte] for byte in s_utf8])

On my machine (Ubuntu 14.04, CPython 2.7.6, PyPy 2.2.1) this gets

alex@martha:~$ python -m timeit -s 'import plus_encode' \
'plus_encode.plus_encode(u"""qwertyuiop1234567890!"£$%^&*()€""")'
10 loops, best of 3: 2.96 usec per loop

alex@martha:~$ pypy -m timeit -s 'import plus_encode' \
'plus_encode.plus_encode(u"""qwertyuiop1234567890!"£$%^&*()€""")'
100 loops, best of 3: 1.24 usec per loop

Back story:
Last week we needed a custom encoding to store unicode usernames in a config 
file that only allowed mixed case ascii, digits, underscore, dash, at-sign and 
plus sign. We also wanted to keep the encoded usernames somewhat human 
readable.

My design was utf-8 and a variant of %-escaping, using the plus symbol. So 
u'alic€123' would be encoded as b'alic+e2+82+ac123'. This evening as a 
learning exercise I've tried to make it fast. This is the result.

This challenge is just for fun. The chosen solution ended up being

import re

def name_encode(s):
    return '%s_%s' % (s.encode('utf-8').encode('hex'),
                      re.sub('[^A-Za-z0-9]', '', s))

Regards, Alex


Re: Coding challenge: Optimise a custom string encoding

2014-08-18 Thread Alex Willmer
On Monday, 18 August 2014 21:16:26 UTC+1, Terry Reedy  wrote:
> On 8/18/2014 3:16 PM, Alex Willmer wrote:
> > A challenge, just for fun. Can you speed up this function?
> 
> You should give a specification here, with examples. You should perhaps 

Sorry, the (informal) spec was further down.

> > a custom encoding to store unicode usernames in a config file that only 
> > allowed mixed case ascii, digits, underscore, dash, at-sign and plus sign. 
> > We also wanted to keep the encoded usernames somewhat human readable.

> > My design was utf-8 and a variant of %-escaping, using the plus symbol. So 
> > u'alic€123' would be encoded as b'alic+e2+82+ac123'.

Other examples:
>>> plus_encode(u'alice')
'alice'
>>> plus_encode(u'Bacon & eggs only $19.95')
'Bacon+20+26+20eggs+20only+20+2419+2e95'
>>> plus_encode(u'üｎïｃｏԁë')
'+c3+bc+ef+bd+8e+c3+af+ef+bd+83+ef+bd+8f+d4+81+c3+ab'

> You should perhaps be using .maketrans and .translate.

That wouldn't work, maketrans() can only map single bytes to other single 
bytes. To encode 256 possible source bytes with 66 possible symbols requires a 
multi-symbol expansion of some or all source bytes.
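
That limitation is specific to Python 2's string.maketrans(). As a side note for Python 3 readers: str.translate() accepts a dict that maps code points to replacement strings of any length, so a table-driven version becomes possible there. A sketch under that assumption; decoding the UTF-8 bytes as latin-1 is only a trick to get exactly one character per byte, and the names _SAFE and _TABLE are made up here.

```python
import string

_SAFE = string.ascii_letters + string.digits + '@_-'

# Map every unsafe byte value to '+xx'; safe values are left unmapped,
# which makes translate() pass them through unchanged.
_TABLE = {i: '+%02x' % i for i in range(256) if chr(i) not in _SAFE}

def plus_encode(s):
    """Encode a str using only ascii letters, digits, _, -, @ and +."""
    # latin-1 maps byte values 0-255 to code points 0-255, one to one.
    return s.encode('utf-8').decode('latin-1').translate(_TABLE)

print(plus_encode('Bacon & eggs only $19.95'))
```

Whether this is faster than the dict-and-join version would need measuring; it is shown only because the table lookup moves into C.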


Finding x is 1, and x is 'foo' comparisons in a code base

2012-01-17 Thread Alex Willmer
Hello,

I'm looking for a way to find the occurrences of x is y comparisons in
an existing code base. Except for a few special cases (e.g. x is [not]
None) they're usually mistakes, the correct test being x == y.
However they happen to work most of the time on CPython (e.g. when y
is a small integer or string) so they slip into production code
unnoticed.

PyLint and PyFlakes don't check this AFAICT. Any suggestions for such
a tool, or a pointer how to add the check to an existing tool would be
most welcome.
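
One way such a check could be sketched is with the standard ast module: walk the tree, and report every "is" or "is not" whose operand is a number, string or bytes literal. This assumes Python 3.8 or later (where literals parse to ast.Constant); the function name find_literal_is is made up for the example. Comparisons against None, True and False are deliberately left alone.

```python
import ast

_LITERAL_TYPES = (int, float, complex, str, bytes)

def _is_literal(node):
    """True for number/string/bytes constants, but not None or booleans."""
    return (isinstance(node, ast.Constant)
            and isinstance(node.value, _LITERAL_TYPES)
            and not isinstance(node.value, bool))

def find_literal_is(source, filename='<string>'):
    """Return (line, column) pairs for 'is' comparisons against literals."""
    hits = []
    for node in ast.walk(ast.parse(source, filename)):
        if not isinstance(node, ast.Compare):
            continue
        operands = [node.left] + node.comparators
        # Chained comparisons pair each operator with its two neighbours.
        for op, left, right in zip(node.ops, operands, operands[1:]):
            if (isinstance(op, (ast.Is, ast.IsNot))
                    and (_is_literal(left) or _is_literal(right))):
                hits.append((node.lineno, node.col_offset))
    return hits

sample = "a = x is 1\nb = x is None\nc = x is not 'foo'\nd = x == 1\n"
print(find_literal_is(sample))
```

Python 3.8 and later also emit a SyntaxWarning for "is" with a literal when the code is compiled, which catches the same mistake without any extra tool.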

Regards, Alex


Re: Understanding the CPython dict implementation

2010-03-15 Thread Alex Willmer
On Mar 15, 4:06 am, John Nagle  wrote:
>     Is this available as a paper?
>
>                                 John Nagle

It doesn't appear to be; slides are here:

http://us.pycon.org/2010/conference/schedule/event/12/

Alex


EuroPython 2010 - Open for registration and reminder of participation

2010-03-15 Thread Alex Willmer
EuroPython 2010 - 17th to 24th July 2010


EuroPython is a conference for the Python programming language
community, including the Django, Zope and Plone communities. It is
aimed at everyone in the Python community, of all skill levels, both
users and programmers.

Last year's conference was the largest open source conference in the
UK and one of the largest community organised software conferences in
Europe.

This year EuroPython will be held from the 17th to 24th July in
Birmingham, UK. It will include over 100 talks, tutorials, sprints and
social events.

Registration
------------

Registration is open now at: http://www.europython.eu/registration/

For the best registration rates, book as soon as you can! Extra Early
Bird closes soon, after which normal Early Bird rate will apply until
10th May.

Talks, Activities and Events
----------------------------

Do you have something you wish to present at EuroPython? You want to
give a talk, run a tutorial or sprint?

Go to http://www.europython.eu/talks/cfp/ for information and advice!
Go to http://wiki.europython.eu/Sprints to plan a sprint!

Help Us Out
-----------

EuroPython is run by volunteers, like you! We could use a hand, and
any contribution is welcome.

Go to http://wiki.europython.eu/Helping to join us!
Go to http://www.europython.eu/contact/ to contact us directly!

Sponsors
--------

Sponsoring EuroPython is a unique opportunity to affiliate with this
prestigious conference and to reach a large number of Python users
from computing professionals to academics, from entrepreneurs to
motivated and well-educated job seekers.

http://www.europython.eu/sponsors/

Spread the Word
---------------

We are a community-run not-for-profit conference. Please help to
spread the word by distributing this announcement to colleagues,
project mailing lists, friends, your blog, Web site, and through your
social networking connections. Take a look at our publicity resources:

http://wiki.europython.eu/Publicity

General Information
-------------------

For more information about the conference, please visit the official
site: http://www.europython.eu/

Looking forward to seeing you!

The EuroPython Team


Re: Splitting a string

2010-04-02 Thread Alex Willmer
On Apr 2, 11:12 am, Thomas Heller  wrote:
> Maybe I'm just lazy, but what is the fastest way to convert a string
> into a tuple containing character sequences and integer numbers, like this:
>
> 'si_pos_99_rep_1_0.ita'  -> ('si_pos_', 99, '_rep_', 1, '_', 0, '.ita')
>

This is very probably not the fastest execution-wise, but it was the
fastest development-time-wise:

import re

def maybe_int(x):
    try:
        return int(x)
    except ValueError:
        return x

def strings_n_ints(s):
    return tuple(maybe_int(x) for x in re.findall(r'(\d+|\D+)', s))

>>> strings_n_ints('si_pos_99_rep_1_0.ita')
('si_pos_', 99, '_rep_', 1, '_', 0, '.ita')

Alex


Re: Python creates "locked" temp dir

2010-12-08 Thread Alex Willmer
On Dec 7, 9:03 pm, utabintarbo  wrote:
> I am using tempfile.mkdtemp() to create a working directory on a
> remote *nix system through a Samba share. When I use this on a Windows
> box, it works, and I have full access to the created dir. When used on
> a Linux box (through the same Samba share), the created directory
> shows as "locked", and I am unable to access. Obviously, I need
> access. Any clues?

You haven't provided enough details to go on.
1. Please post the actual code, and the trace back (if any).
2. When you say "I am unable to access", do you mean another script/
process is unable to access? If so, that is the point of mkdtemp() -
to make a temporary directory that _only_ the creating process can
access. If you want to share it then tempfile is not the right module
for you.

Regards, Alex


Re: Python creates "locked" temp dir

2010-12-10 Thread Alex Willmer
On Dec 8, 6:26 pm, Christian Heimes  wrote:
> There isn't a way to limit access to a single process. mkdtemp creates
> the directory with mode 0700 and thus limits it to the (effective) user
> of the current process. Any process of the same user is able to access
> the directory.
>
> Christian

Quite right. My apologies for confusing temporary file creation, for
which exclusive access is used, with temporary directory creation, for
which there is no such mode.
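
The mode Christian describes can be checked directly. A small sketch, assuming Python 3 on a POSIX system with a local filesystem (a Samba mount may report permissions differently):

```python
import os
import stat
import tempfile

path = tempfile.mkdtemp()

# S_IMODE() strips the file-type bits, leaving only the permission bits.
mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))

os.rmdir(path)
```

A result of 0700 means read, write and search for the owning user only, so any process running as that user can get in, and other users cannot.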


Re: string u'hyv\xe4' to file as 'hyvä'

2010-12-27 Thread Alex Willmer
On Dec 27, 6:47 am, "Mark Tolonen"  wrote:
> "gintare"  wrote in message
> > In file i find 'hyv\xe4' instead of hyvä.
>
> When you open a file with codecs.open(), it expects Unicode strings to be
> written to the file.  Don't encode them again.  Also, .writelines() expects
> a list of strings.  Use .write():
>
>     import codecs
>     item=u'hyv\xe4'
>     F=codecs.open('/opt/finnish.txt', 'w+', 'utf8')
>     F.write(item)
>     F.close()

Gintare, Mark's code is correct. When you are reading the file back
make sure you understand what you are seeing:

>>> F2 = codecs.open('finnish.txt', 'r', 'utf8')
>>> item2 = F2.read()
>>> item2
u'hyv\xe4'

That might look as though item2 is 7 characters long, and it contains
a backslash followed by x, e, 4. However item2 is identical to item,
they both contain 4 characters - the final one being a-umlaut. Python
has shown the string using a backslash escape, because printing a non-
ascii character might fail. You can see this directly, if your Python
session is running in a terminal (or GUI) that can handle non-ascii
characters:

>>> print item2
hyvä
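
The same round trip can be sketched for Python 3, where the built-in open() takes an encoding argument and ascii() produces the escaped form that Python 2's repr() showed. The temporary directory is only there to keep the example self-contained.

```python
import os
import tempfile

item = 'hyv\xe4'  # four characters; the last one is a-umlaut

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'finnish.txt')
    with open(path, 'w', encoding='utf8') as f:
        f.write(item)
    with open(path, 'r', encoding='utf8') as f:
        item2 = f.read()
    size_on_disk = os.path.getsize(path)

print(len(item2))     # characters in the string
print(size_on_disk)   # bytes in the file: a-umlaut takes two in UTF-8
print(ascii(item2))   # escaped display, the string itself is unchanged
```

The escape sequence is purely a display choice; the length check shows the backslash is not part of the data.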


Re: list 2 dict?

2011-01-02 Thread Alex Willmer
On Sunday, January 2, 2011 3:36:35 PM UTC, T wrote:
> The grouper-way looks nice, but I tried it and it didn't work:
> 
> from itertools import *
> ...
> d = dict(grouper(2, l))
> 
> NameError: name 'grouper' is not defined
> 
> I use Python 2.7. Should it work with this version?

No. As Ian said, grouper() is a recipe in the itertools documentation.

http://docs.python.org/library/itertools.html#recipes

The module doesn't provide it directly.
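
The recipe has to be copied into your own code. A version adapted for this thread, assuming Python 3 (on Python 2.7 the import is izip_longest instead of zip_longest) and keeping the grouper(n, iterable) argument order used in the call above:

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    """Collect data into fixed-length chunks: grouper(2, 'ABCD') -> AB CD"""
    args = [iter(iterable)] * n  # n references to one shared iterator
    return zip_longest(*args, fillvalue=fillvalue)

l = ['a', 1, 'b', 2]
d = dict(grouper(2, l))
print(d)
```

If the list has an odd number of items, the last key is paired with fillvalue (None by default).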


Re: String building using join

2011-01-02 Thread Alex Willmer
On Sunday, January 2, 2011 5:43:38 PM UTC, gervaz wrote:
> Sorry, but it does not work
> 
> >>> def prg3(l):
> ... return "\n".join([x for x in l if x])
> ...
> >>> prg3(t)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 2, in prg3
> TypeError: sequence item 0: expected str instance, Test found

def prg3(l):
    return '\n'.join([str(x) for x in l if x])

That should do it


Re: Where is win32service

2011-01-02 Thread Alex Willmer
On Sunday, January 2, 2011 6:40:45 PM UTC, catalinf...@gmail.com wrote:
> I install Python 2.7 on Windows XP.
> I try use :
> 
> import win32service
> import win32serviceutil
> 
> But I got that error :
> 
> ImportError: No module named win32service
> Where is this module ?

It's part of the pywin32 (aka win32all) package

http://sourceforge.net/projects/pywin32/

The latest download for your Python version is

http://sourceforge.net/projects/pywin32/files/pywin32/Build%20214/pywin32-214.win32-py2.7.exe/download

Regards, Alex


Python comparison matrix

2011-01-03 Thread Alex Willmer
I've created a spreadsheet that compares the built ins, features and modules of 
the CPython releases so far. For instance it shows: 
- basestring was first introduced at version 2.3 then removed in version 3.0
- List comprehensions (PEP 202) were introduced at version 2.0.
- apply() was a built in throughout the 1.x and 2.x series, but was deprecated 
from 2.3 and removed in 3.0
- Generator functions were first introduced in 2.2 with __future__ import, from 
2.3 they were fully supported

https://spreadsheets.google.com/pub?key=0At5kubLl6ri7dHU2OEJFWkJ1SE16NUNvaGg2UFBxMUE

The current version covers CPython 1.5 - 3.2 on these aspects:
- Built in types and functions
- Keywords
- Modules
- Interpreter switches and environment variables
- Platforms, including shipped Python version(s) for major Linux distributions
- Features/PEPs (incomplete)

I gathered the data from the documentation at python.org. It's work in progress 
so there are plenty of rough edges and holes, but I'd like to get your 
opinions, feedback and suggestions.
- Would you find such a document useful? 
- What format(s) would be most useful to you (e.g. spreadsheet, pdf, web 
page(s), database, wall chart, desktop background)?
- Are there other aspects/pages that you'd like to see included?
- Do you know of source(s) for which versions of CPython supported which 
operating systems (e.g. the first and last Python release that works on Windows 
98 or Mac OS 9)? The best I've found so far is PEP 11

Regards and thanks, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python comparison matrix

2011-01-03 Thread Alex Willmer
On Tuesday, January 4, 2011 12:54:24 AM UTC, Malcolm wrote:
> Alex,
> 
> I think this type of documentation is incredibly useful!

Thank you.

> Is there some type of key which explains symbols like !, *, f, etc?

There is a key; it's the second tab from the end. '!' wasn't documented and I 
forgot why I marked bytes() thusly, so I've removed it.
 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python comparison matrix

2011-01-03 Thread Alex Willmer
Thank you Antoine, I've fixed those errors. Going by the docs, I have VMSError 
down as first introduced in Python 2.5.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Trying to decide between PHP and Python

2011-01-04 Thread Alex Willmer
On Jan 4, 8:20 pm, Google Poster  wrote:
> Can any of you nice folks post a snippet of how to perform a listing
> of the current directory and save it in a string?
>
> Something like this:
>
> $ setenv FILES = `ls`
>
> Bonus: Let's say that I want to convert the names of the files to
> lowercase? As 'tolower()'

I'd just like to mention one more python nicety: list comprehension.
If you wanted the filenames as a list of strings, with each made
lowercase then the following would serve well:

import os
filenames = os.listdir('.')
filenames_lower = [fn.lower() for fn in filenames]

You could also combine this into one line:

import os
filenames_lower = [fn.lower() for fn in os.listdir('.')]

Regards, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert unicode escape sequences to unicode in a file

2011-01-11 Thread Alex Willmer
On Jan 11, 8:53 pm, Jeremy  wrote:
> I have a file that has unicode escape sequences, i.e.,
>
> J\u00e9r\u00f4me
>
> and I want to replace all of them in a file and write the results to a new 
> file.  The simple script I've created is copied below.  However, I am getting 
> the following error:
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 
> 947: ordinal not in range(128)
>
> It appears that the data isn't being converted when writing to the file.  Can 
> someone please help?

Are you _sure_ that your file contains the characters '\', 'u', '0',
'0', 'e' and '9'? I expect that actually your file contains a byte
with value 0xe9 and you have inspected the file using Python, which
has printed the byte using a Unicode escape sequence. Open the file
using a text editor or hex editor and look at the value at offset 947
to be sure.

If so, you need to replace 'unicode-escape' with the actual encoding
of the file.
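The two possibilities can be told apart with a short check. This is only a sketch in Python 3, using the name from the quoted message; Latin-1 is an assumed encoding for the second case, not something known about the actual file.

```python
# Case 1: the file really holds the six ASCII characters \ u 0 0 e 9.
escaped_bytes = b'J\\u00e9r\\u00f4me'
from_escapes = escaped_bytes.decode('unicode_escape')

# Case 2: the file holds single bytes such as 0xe9 (Latin-1 assumed here).
raw_bytes = b'J\xe9r\xf4me'
from_latin1 = raw_bytes.decode('latin-1')

# Both decode to the same text, but only the first file contains backslashes.
print(from_escapes == from_latin1)
print(len(escaped_bytes), len(raw_bytes))

# ascii() shows escapes for non-ASCII characters, much like Python 2's repr(),
# which is how a raw 0xe9 byte can look like an escape sequence on screen.
print(ascii(from_latin1))
```

If the byte counts in the real file match the second case, 'unicode-escape' is the wrong codec to open it with.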

> if __name__ == "__main__":
>     f = codecs.open(filename, 'r', 'unicode-escape')
>     lines = f.readlines()
>     line = ''.join(lines)
>     f.close()
>
>     utFound = re.sub('STRINGDECODE\((.+?)\)', r'\1', line)
>     print(utFound[:1000])
>
>     o = open('newDice.sql', 'w')
>     o.write(utFound.decode('utf-8'))
>     o.close()

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to dump a Python 2.6 dictionary with UTF-8 strings?

2011-01-11 Thread Alex Willmer
On Jan 11, 10:40 pm, "W. Martin Borgert"  wrote:
> Hi,
>
> naively, I thought the following code:
>
> #!/usr/bin/env python2.6
> # -*- coding: utf-8 -*-
> import codecs
> d = { u'key': u'我爱中国人' }
> if __name__ == "__main__":
>     with codecs.open("ilike.txt", "w", "utf-8") as f:
>         print >>f, d
>
> would produce a file ilike.txt like this:
>
> {u'key': u'我爱中国人'}
>
> But unfortunately, it results in:
>
> {u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
>
> What's the right way to get the strings in UTF-8?
>
> Thanks in advance!

It has worked, you're just seeing how python presents unicode
characters in the interactive interpreter:

Python 2.7.1+ (r271:86832, Dec 24 2010, 10:04:43)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = {u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
>>> x
{u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
>>> print x
{u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
>>> print x['key']
我爱中国人

That last line only works if your terminal uses a suitable encoding
(e.g. utf-8).
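If the goal is a file that shows the characters themselves, one option is to write the values rather than the dict's repr, or to serialise with JSON. The following is a rough sketch in Python 3 (not the Python 2.6 of the question); the temporary path is hypothetical and ascii() is used only to mimic what Python 2's repr() displays.

```python
import json
import os
import tempfile

d = {'key': '我爱中国人'}

# ascii() mimics what Python 2's repr() shows: escapes, not characters.
print(ascii(d))

# JSON with ensure_ascii=False keeps the characters; the file object
# encodes them as UTF-8 on disk.
path = os.path.join(tempfile.mkdtemp(), 'ilike.txt')  # hypothetical location
with open(path, 'w', encoding='utf-8') as f:
    json.dump(d, f, ensure_ascii=False)

with open(path, encoding='utf-8') as f:
    text = f.read()
print(text)
```

The output format is JSON rather than a Python literal, which may or may not suit the original purpose.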

Regards, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Namespaces

2011-01-21 Thread Alex Willmer
On Jan 21, 10:39 am, sl33k_  wrote:
> What is namespace? And what is built-in namespace?

A namespace is a container for names, like a directory is a container
for files. Names are the labels we use to refer to Python objects
(e.g. int, bool, sys), and each Python object - particularly modules
and classes - provides a separate namespace.

The idea of a namespace is to isolate names from one another - so that
if you import module_a and module_b and both have an object called foo
then module_a.foo doesn't interfere with module_b.foo.

The built-in namespace is where the core objects of Python are named.
When you refer to an object such as int Python first searches the
local scope (was it defined in the current function/method, i.e. the
output of locals()), then module scope (was it defined in the
current .py file, i.e. output of globals()) and finally in the object
__builtins__.
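The lookup order described above can be sketched as follows. This is an illustration in Python 3, where the built-in namespace is importable as the builtins module; the function names are made up for the example.

```python
import builtins

x = 'module'              # lives in the module (global) namespace

def use_local():
    x = 'local'           # a local name shadows the module-level x
    return x

def use_module():
    return x              # no local x, so the module-level one is found

def use_builtin():
    return int            # neither local nor module-level: found in builtins

print(use_local())
print(use_module())
print(use_builtin() is builtins.int)
```

Each function finds the name in the first namespace that has it, which is why defining your own int or list at module level hides the built-in one.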

Hope that makes sense. I realised as I typed this my understanding of
Python namespaces is not as 100% tight as I thought.

Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: interleave string

2011-02-15 Thread Alex Willmer
On Feb 15, 10:09 am, Wojciech Muła
 wrote:
> import re
>
> s = 'xxaabbddee'
> m = re.compile("(..)")
> s1 = m.sub("\\1:", s)[:-1]

One can modify this slightly:

s = 'xxaabbddee'
m = re.compile('..')
s1 = ':'.join(m.findall(s))

Depending on one's taste this could be clearer. The more general
answer, from the itertools docs:

from itertools import izip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

s2 = ':'.join(''.join(pair) for pair in grouper(2, s, ''))

Note that this behaves differently to the previous solutions, for
sequences with an odd length.
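To make the difference concrete, here is a small comparison on an odd-length string. It is a sketch for Python 3, where izip_longest is spelled zip_longest; the sample string is made up.

```python
import re
from itertools import zip_longest  # Python 3 name for izip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

s = 'xxaab'  # odd length

via_sub = re.sub('(..)', '\\1:', s)[:-1]
via_findall = ':'.join(re.findall('..', s))
via_grouper = ':'.join(''.join(pair) for pair in grouper(2, s, ''))

print(via_sub)      # xx:aa:
print(via_findall)  # xx:aa
print(via_grouper)  # xx:aa:b
```

The sub() version loses the last character and keeps a trailing colon, findall() silently drops the unpaired character, and grouper() keeps it.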
-- 
http://mail.python.org/mailman/listinfo/python-list


Reminder: 6 days left for EuroPython 2010 talk submissions

2010-04-24 Thread Alex Willmer
The EuroPython 2010 call for papers closes this Friday on 30th April.
We've already had many submissions covering Python 3, Python 2.7,
IronPython, Game Programming, Testing, Behavior Driven Development,
NoSQL, Accessibility and others.

We still are looking for talks and tutorials on Django, PyPy, Twisted,
HTML5, Unladen Swallow, Testing and whatever you wish to present.

http://www.europython.eu/submission/

EuroPython
--
This year EuroPython will be held from the 17th to 24th July in
Birmingham, UK. It will include over 100 talks, tutorials, sprints and
social events. Confirmed speakers so far include Guido van Rossum,
Raymond Hettinger and Brett Cannon.

http://www.europython.eu

Registration
---
Registration is open now. For the best registration rates, book early!
Early Bird rate is open until 10th May. Speakers can attend at the
discounted Speaker Rate.

http://www.europython.eu/registration/

Help Us Out
---
EuroPython is run by volunteers, like you! We could use a hand, and
any contribution is welcome.
Go to http://wiki.europython.eu/Helping to join us!
Go to http://www.europython.eu/contact/ to contact us directly!

Sponsors
---
Sponsoring EuroPython is a unique opportunity to affiliate with this
prestigious conference and to reach a large number of Python users
from computing professionals to academics, from entrepreneurs to
motivated and well-educated job seekers.
http://www.europython.eu/sponsors/

Spread the Word
---
We are a community-run not-for-profit conference. Please help to
spread the word by distributing this announcement to colleagues,
project mailing lists, friends, your blog, Web site, and through your
social networking connections. Take a look at our publicity resources:
http://wiki.europython.eu/Publicity

General Information
---
For more information about the conference, please visit the official
site: http://www.europython.eu/

Looking forward to seeing you!
The EuroPython Team
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Get name of file from directory into variable

2010-08-03 Thread Alex Willmer
On Aug 3, 11:21 am, loial  wrote:
> In a unix shell script I can do something like this to look in a
> directory and get the name of a file or files into a variable :
>
> MYFILE=`ls /home/mydir/JOHN*.xml`
>
> Can I do this in one line in python?

Depends if you count imports.

import glob
my_files = glob.glob('/home/mydir/JOHN*.xml')

Regards, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The Application cannot locate win32ui.pyd (or Python) (126)

2010-08-04 Thread Alex Willmer
On Aug 4, 2:35 pm, vsoler  wrote:
> Hi all,
>
> I just installed python 3.1.2 where I used to have python 2.6.4. I'm
> working on Win7.
>
> The IDLE GUI works, but I get the following message when trying to
> open *.py files written for py 2.6
>
>         The Application cannot locate win32ui.pyd (or Python) (126)
>

win32ui is part of the PyWin32 package. Most likely you have a version
of PyWin32 for Python 2.6 installed, you should uninstall that and
install PyWin32 for Python 3.1. Downloads are at
http://sourceforge.net/projects/pywin32/files/

You should do the same for any other third party packages that are
installed.

> Moreover, when I try to open an old *.py file, I sometimes get a
> message saying that the file should be converted to UTF-8. What does
> this mean?

Those files contain non-ascii characters (e.g. £, €, æ). Non-ascii
characters must be encoded when saved, using an encoding. UTF-8 is one
such encoding, and it was chosen as the default .py encoding for
Python 3.x. Those files are probably in iso8859, cp432, or perhaps
UTF-16 (aka UCS-2). You can save them in UTF-8 using your favourite
text editor, or declare the encoding so Python 3 knows it. More info:

http://www.joelonsoftware.com/articles/Unicode.html
http://docs.python.org/howto/unicode
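Re-saving a file as UTF-8 can also be done from Python itself. The sketch below assumes the old file was saved as Latin-1 (ISO 8859-1), which is a guess; the file name and its contents are hypothetical and it is written for Python 3.

```python
import os
import tempfile

# Hypothetical legacy file, saved as Latin-1 for this sketch.
src = os.path.join(tempfile.mkdtemp(), 'old_script.py')
with open(src, 'w', encoding='latin-1') as f:
    f.write("price = '£5'\n")

# Re-save as UTF-8: decode with the old encoding, encode with the new one.
with open(src, encoding='latin-1') as f:
    text = f.read()
with open(src, 'w', encoding='utf-8') as f:
    f.write(text)

with open(src, 'rb') as f:
    data = f.read()
print(data)
```

The alternative is to leave the bytes alone and declare the encoding in a comment on the first or second line of the file, e.g. `# -*- coding: latin-1 -*-`.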

> I'm also trying to use the 2to3 converter, but I cannot see where the
> converted files are written to!

I think 2to3 prints a diff of the file changes to the console. The -w
command line option should modify files in place.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The Application cannot locate win32ui.pyd (or Python) (126)

2010-08-04 Thread Alex Willmer
On Aug 4, 5:19 pm, vsoler  wrote:
> On Aug 4, 5:41 pm, Alex Willmer  wrote:
>
> > On Aug 4, 2:35 pm, vsoler  wrote:
>
> > > Hi all,
>
> > > I just installed python 3.1.2 where I used to have python 2.6.4. I'm
> > > working on Win7.
>
> > > The IDLE GUI works, but I get the following message when trying to
> > > open *.py files written for py 2.6
>
> > >         The Application cannot locate win32ui.pyd (or Python) (126)
>
> > win32ui is part of the PyWin32 package. Most likely you have a version
> > of PyWin32 for Python 2.6 installed, you should uninstall that and
> > install PyWin32 for Python 3.1. Downloads are at
> > http://sourceforge.net/projects/pywin32/files/
>
> > You should do the same for any other third party packages that are
> > installed.
>
> > > Moreover, when I try to open an old *.py file, I sometimes get a
> > > message saying that the file should be converted to UTF-8. What does
> > > this mean?
>
> > Those files contain non-ascii characters (e.g. £, €, æ). Non-ascii
> > characters must be encoded when saved using and encoding. UTF-8 is one
> > such encoding, and it was chosen as the default .py encoding for
> > Python 3.x. Those files are probably in iso8859, cp432, or perhaps
> > UTF-16 (aka UCS-2). You can save them in UTF-8 using your favourite
> > text editor, or declare the encoding so Python 3 knows it. More info:
>
> >http://www.joelonsoftware.com/articles/Unicode.htmlhttp://docs.python...
>
> > > I'm also trying to use the 2to3 converter, but I cannot see where the
> > > converted files are written to!
>
> > I think 2to3 prints a diff of the file changes to the console. The -w
> > command line option should modify files in place.
>
> Thank you Alex for your detailes reply.
>
> Before switching to Python 3.1.2 I removed all my Python 2.6 packages
> (python, pywin32, numpy, wxpython). However, the removal was not
> complete since some files could not be removed. Additionally, I still
> see my C:\python26 directory which is suposed not to exist any longer.

It probably contains one or two files the installers weren't aware of.
E.g. a module you added manually, a log, a .pyc

> I would not like to take a lot of your time, but, do you have any
> hints as to what I should do to 'tune' my PC?

Take a backup then either delete the Python26 directory, or rename it.
Any problems, reverse the process.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to match patterns like XX YY XX YY? (regex)

2010-08-07 Thread Alex Willmer
On Aug 7, 4:48 pm, Peng Yu  wrote:
> The problem is that I don't know how to capture pattern that repeat
> itself (like 'a' and 'xy' in the example). I could use 'test\((\w+)
> (\w+)\)(\w) (\w)', but it will capture something like 'test(a b)x y',
> which I don't want to.
>
> I'm wondering if there is way to capture recurring patterns.

Back references can deal with repetition.

Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match(r'test\((\w+) (\w+)\)\1 \2', 'test(xy uv)xy uv').groups()
('xy', 'uv')
>>> re.match(r'test\((\w+) (\w+)\)\1 \2', 'test(a b)x y')
>>>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sequential Object Store

2010-08-07 Thread Alex Willmer
On Aug 7, 5:26 pm, GZ  wrote:
> I am wondering if there is a module that can persist a stream of
> objects without having to load everything into memory. (For this
> reason, I think Pickle is out, too, because it needs everything to be
> in memory.)

From the pickle docs it looks like you could do something like:

try:
    import cPickle as pickle
except ImportError:
    import pickle

file_obj = open('whatever', 'wb')
p = pickle.Pickler(file_obj)

for x in stream_of_objects:
    p.dump(x)
    p.memo.clear()

del p
file_obj.close()

then later

file_obj = open('whatever', 'rb')
p = pickle.Unpickler(file_obj)

while True:
    try:
        x = p.load()
        do_something_with(x)
    except EOFError:
        break

Your loading loop could be wrapped in a generator function, so only
one object should be held in memory at once.
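A generator version might look something like this. It is a sketch for Python 3; the helper names dump_stream and load_stream and the temporary path are invented for the example, and clear_memo() is used in place of clearing the memo attribute directly.

```python
import os
import pickle
import tempfile

def dump_stream(path, objects):
    """Pickle each object in turn, clearing the memo so it does not grow."""
    with open(path, 'wb') as file_obj:
        p = pickle.Pickler(file_obj)
        for obj in objects:
            p.dump(obj)
            p.clear_memo()

def load_stream(path):
    """Yield the pickled objects one at a time until the file is exhausted."""
    with open(path, 'rb') as file_obj:
        p = pickle.Unpickler(file_obj)
        while True:
            try:
                obj = p.load()
            except EOFError:
                return
            yield obj

path = os.path.join(tempfile.mkdtemp(), 'whatever')  # hypothetical file name
dump_stream(path, ({'n': i} for i in range(3)))
loaded = list(load_stream(path))
print(loaded)
```

In real use one would iterate over load_stream(path) directly instead of building a list, so that only one object is alive at a time.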
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 2.7 re.IGNORECASE broken in re.sub?

2010-08-15 Thread Alex Willmer
On Aug 16, 1:07 am, Steven D'Aprano  wrote:
> You're passing re.IGNORECASE (which happens to equal 2) as a count
> argument, not as a flag. Try this instead:
>
> >>> re.sub(r"python\d\d" + '(?i)', "Python27", t)
> 'Python27'

Basically right, but in-line flags must be placed at the start of a
pattern, or the result is undefined. Also in Python 2.7 re.sub() has a
flags argument.

Python 2.7.0+ (release27-maint:83286, Aug 16 2010, 01:25:58)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> t = 'Python26'
>>> re.sub(r'(?i)python\d\d', 'Python27', t)
'Python27'
>>> re.sub(r'python\d\d', 'Python27', t, flags=re.IGNORECASE)
'Python27'
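For anyone puzzled by the original symptom, this sketch shows what happens when the flag ends up in the count position. It passes the value by keyword to make the effect explicit; the sample string is made up.

```python
import re

t = 'python26 Python26 PYTHON26 python26'

# re.IGNORECASE is an integer flag whose value is 2 ...
flag_value = int(re.IGNORECASE)

# ... so used as the count it only limits the number of replacements and
# leaves the matching case-sensitive.
as_count = re.sub(r'python\d\d', 'Python27', t, count=flag_value)

# Passed as flags, it makes every variant match.
as_flag = re.sub(r'python\d\d', 'Python27', t, flags=re.IGNORECASE)

print(as_count)
print(as_flag)
```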

Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 2.7 re.IGNORECASE broken in re.sub?

2010-08-16 Thread Alex Willmer
On Aug 16, 12:23 pm, Steven D'Aprano  wrote:
> On Sun, 15 Aug 2010 17:36:07 -0700, Alex Willmer wrote:
> > On Aug 16, 1:07 am, Steven D'Aprano  > cybersource.com.au> wrote:
> >> You're passing re.IGNORECASE (which happens to equal 2) as a count
> >> argument, not as a flag. Try this instead:
>
> >> >>> re.sub(r"python\d\d" + '(?i)', "Python27", t)
> >> 'Python27'
>
> > Basically right, but in-line flags must be placed at the start of a
> > pattern, or the result is undefined.
>
> Pardon me, but that's clearly not correct, as proven by the fact that the
> above example works.

Undefined includes 'might work sometimes'. I refer you to the Python
documentation:

"Note that the (?x) flag changes how the expression is parsed. It
should be used first in the expression string, or after one or more
whitespace characters. If there are non-whitespace characters before
the flag, the results are undefined."
http://docs.python.org/library/re.html#regular-expression-syntax

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 2.7 re.IGNORECASE broken in re.sub?

2010-08-16 Thread Alex Willmer
On Aug 16, 1:46 pm, Alex Willmer  wrote:
> "Note that the (?x) flag changes how the expression is parsed. It
> should be used first in the expression string, or after one or more
> whitespace characters. If there are non-whitespace characters before
> the flag, the results are undefined.
> "http://docs.python.org/library/re.html#regular-expression-syntax

Hmm, I found a lot of instances that place (?iLmsux) after non-
whitespace characters

http://google.com/codesearch?hl=en&lr=&q=file:\.py[w]%3F$+[^[:space:]%22']%2B\(\%3F[iLmsux]%2B\)

including two from the Python unit tests, re_test.py lines 109-110.
Perhaps the documentation is overly cautious.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: seach for pattern based on string

2010-08-24 Thread Alex Willmer
On Aug 24, 5:33 pm, richie05 bal  wrote:
> i am starting to learn python and I am stuck with query I want to
> generate with python
> File looks something like this
> TRACE: AddNewBookD {bookId 20, noofBooks 6576, authorId 41,
> publishingCompanyId 7}
> TRACE: AddNewBookD {bookId 21, noofBooks 6577, authorId 42,
> publishingCompanyId 8}
>
> I want to first search for AddNewBookD
> if found
>    store bookId, noofBooks, authorId and publishingCompanyId
>
> I know how to search for only AddNewBookD or find the pattern bookId
> 20, noofBooks 6576, authorId 41, publishingCompanyId 7 but I don't
> know how search one based on another.

Using a regular expression I would perform a match against each line.
If the match fails, it will return None. If the match succeeds it
returns a match object with which you can extract the values

>>> import re
>>> pattern = re.compile(r'TRACE: AddNewBookD \{bookId (\d+), noofBooks (\d+), '
...                      r'authorId (\d+), publishingCompanyId (\d+)\}\s*')
>>> s = 'TRACE: AddNewBookD {bookId 20, noofBooks 6576, authorId 41, publishingCompanyId 7} '
>>> pattern.match(s)
<_sre.SRE_Match object at 0xa362f40>  # If the match failed this would be None
>>> m = pattern.match(s)
>>> m.groups()
('20', '6576', '41', '7')
>>>

So your code to store the result would be inside an if m: block
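Put together, the loop could look roughly like this. It is a sketch in Python 3; the list of lines stands in for the real file and the middle line is a made-up example of something that should be skipped.

```python
import re

pattern = re.compile(
    r'TRACE: AddNewBookD \{bookId (\d+), noofBooks (\d+), '
    r'authorId (\d+), publishingCompanyId (\d+)\}\s*'
)

lines = [
    'TRACE: AddNewBookD {bookId 20, noofBooks 6576, authorId 41, publishingCompanyId 7}',
    'TRACE: SomethingElse {bookId 99}',  # hypothetical non-matching line
    'TRACE: AddNewBookD {bookId 21, noofBooks 6577, authorId 42, publishingCompanyId 8}',
]

records = []
for line in lines:
    m = pattern.match(line)
    if m:  # m is None for lines that do not match
        book_id, no_of_books, author_id, company_id = map(int, m.groups())
        records.append((book_id, no_of_books, author_id, company_id))

print(records)
```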

HTH, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with strptime and time zone

2010-08-24 Thread Alex Willmer
On Aug 24, 9:45 pm, m_ahlenius  wrote:
>
> whereas this fails:
> myStrA = 'Sun Aug 22 19:03:06 PDT'
> gTimeA = strptime( myStrA, '%a %b %d %H:%M:%S %Z')
> print "gTimeA = ",gTimeA
>
> ValueError: time data 'Sun Aug 22 19:03:06 PDT' does not match format
> '%a %b %d %H:%M:%S %Z'

Support for the %Z directive is based on the values contained in
tzname and whether daylight is true. Because of this, it is platform-
specific except for recognizing UTC and GMT which are always known
(and are considered to be non-daylight savings timezones).

http://docs.python.org/library/time.html

Dateutil has its own timezone database, so should work reliably
http://labix.org/python-dateutil
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with strptime and time zone

2010-08-25 Thread Alex Willmer
On Aug 25, 8:48 am, Lawrence D'Oliveiro  wrote:
> In message
> <45faa241-620e-42c7-b524-949936f63...@f6g2000yqa.googlegroups.com>, Alex
>
> Willmer wrote:
> > Dateutil has it's own timezone database ...
>
> I hate code which doesn’t just use /usr/share/zoneinfo. How many places do
> you need to patch every time somebody changes their daylight-saving rules?

From reading http://labix.org/python-dateutil it appears dateutil can read
timezone information from several platforms, including /usr/share/zoneinfo.
I don't know whether one chooses the source explicitly, or if it is
detected with fall back to the internal database.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: New implementation of re module

2009-08-04 Thread Alex Willmer
On Jul 27, 5:34 pm, MRAB  wrote:
> Hi all,
>
> I've been working on a new implementation of the re module. The details
> are at http://bugs.python.org/issue2636, specifically 
> from http://bugs.python.org/issue2636#msg90954. I've included a .pyd file for
> Python 2.6 on Windows if you want to try it out.

Firstly Matthew, thank you for all your work on this. It brings some
very nice regex features to Python.

I've used Christopher Arndt's post as a basis and created a package
from your latest upload (issue2636-20090804.zip), which builds for
Python 2.5 and 2.6. I'd like to see this on PyPI, so it's easier to
install the module and your work gets wider exposure. Would this be
alright and would you prefer to have control of the upload, as this is
your work?

Below is the setup.py, the unicodedata_db.h files are taken from the
appropriate branches on svn.python.org

#!/usr/bin/env python

import shutil
import sys
from distutils.core import setup, Extension

MAJOR, MINOR = sys.version_info[:2]

# Copy appropriate unicodedata_db.h, not found in published includes
if (MAJOR, MINOR) == (2, 6):
    shutil.copy('Python26/unicodedata_db.h', './')
elif (MAJOR, MINOR) == (2, 5):
    shutil.copy('Python25/unicodedata_db.h', './')
else:
    print "WARNING: No unicodedata_db.h prepared."

setup(
    name='regex',
    version='20090804',
    description='Alternate regular expression module, to replace re.',
    author='Matthew Barnett',
    author_email='pyt...@mrabarnett.nospam.plus.com', # Obfuscated
    url='http://bugs.python.org/issue2636',
    py_modules=['regex'],
    ext_modules=[Extension('_regex', ['_regex.c'])],
)


Sincerely, Alex Willmer
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Where do I report a bug to the pythonware PIL

2010-09-03 Thread Alex Willmer
On Sep 3, 10:35 am, "jc.lopes"  wrote:
> Does anyone knows what is the proper way to submit a bug report to
> pythonware PIL?
>
> thanks
> JC Lopes

The Python Image SIG list http://mail.python.org/mailman/listinfo/image-sig

"Free Support: If you don't have a support contract, please send your
question to the Python Image SIG mailing list. The same applies for
bug reports and patches." -- http://www.pythonware.com/products/pil/

They don't appear to have a dedicated mailing list or public bug
tracker.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Plz comment on this code

2010-09-19 Thread Alex Willmer
Your code works (assuming digits gets populated fully), but it's the
absolute bare minimum that would.
To be brutally honest it's:
 - unpythonic - you've not used the core features of Python at all,
such as for loops over a sequence
 - poorly formatted - Please read the python style guide and follow it
 - not reusable - Your code can only be called from the command line,
it should be usable as a module
 - not documented - There is no indication what this code does, other
than mentally running it
 - Fragile - There is no error checking on user input

There are other ways to write what you have more concisely (e.g. list
comprehensions, iterators) but those can wait for another time. Here
is a start at improving your code with respect to the above points:

#!/usr/bin/env python3

# bigdigits2.py

import sys

ZERO = ["***",  # NB Constants are by convention ALL_CAPS
        "* *",
        "***"]
ONE = ["** ",
       " * ",
       "***"]

# TODO Define and populate digits 2-9
DIGITS = [ZERO, ONE, ZERO, ONE, ZERO, ONE, ZERO, ONE, ZERO, ONE]

def big_digits(str_of_digits):
    """Return a list of lines representing the digits using 3x3 blocks
    of "*"
    """
    banner = []  # Accumulate results in this

    # Loop over the rows/lines of the result
    # TODO Replace hard coded block size with global constant or
    # measured size
    for row in range(3):
        line_parts = []

        # Assemble the current line from the current row of each big
        # digit
        for digit in str_of_digits:
            big_digit = DIGITS[int(digit)]
            line_parts.append(big_digit[row])

        # Create a string for the current row and add it to the result
        line = " ".join(line_parts)
        banner.append(line)

    return banner

def usage():
    print("Usage: bigdigit.py ", file=sys.stderr)
    sys.exit(1)

if __name__ == "__main__":
    # Check that an argument was passed
    # NB This will ignore additional arguments
    if len(sys.argv) >= 2:
        input_string = sys.argv[1]
    else:
        usage()

    # Check that only digits were passed
    if not input_string.isnumeric():
        usage()

    # All is well, print the output
    for line in big_digits(input_string):
        print(line)

Here are some suggested further improvements:
- Map directly from a digit to its big digit with a dictionary,
rather than indexing into a list:
BIG_DIGITS = {
    "1": ["** ",
          " * ",
          "***"],
    # ...
}
- Is input_string.isnumeric() the right test? Can you find a character
it would not correctly flag as invalid input?
- What if I wanted to use my own 4x4 big digits? Could the
big_digits() function accept it as an argument?
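As a hint for the isnumeric() question, this sketch compares a few candidate strings. The sample characters and the helper is_ascii_digits are my own choices for illustration, not part of the original program.

```python
# '42', superscript two, vulgar fraction one half, Arabic-Indic digit three
candidates = ['42', '\u00b2', '\u00bd', '\u0663']

for text in candidates:
    try:
        value = int(text)
    except ValueError:
        value = None  # int() rejects this string
    print(ascii(text), text.isnumeric(), text.isdecimal(), value)

def is_ascii_digits(text):
    """Stricter check: non-empty and only the characters 0-9."""
    return text != '' and all(ch in '0123456789' for ch in text)

print([is_ascii_digits(text) for text in candidates])
```

All four strings pass isnumeric(), but int() only accepts the ones for which isdecimal() is also true, so a superscript two would get past the check and then fail inside big_digits().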
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Plz comment on this code

2010-09-19 Thread Alex Willmer
On Sep 19, 12:20 pm, Lawrence D'Oliveiro  wrote:
> In message
> , Alex
>
> Willmer wrote:
> > # NB Constants are by convention ALL_CAPS
>
> SAYS_WHO?

Says PEP 8:

Constants

   Constants are usually declared on a module level and written in all
   capital letters with underscores separating words.  Examples include
   MAX_OVERFLOW and TOTAL.

-- http://www.python.org/dev/peps/pep-0008/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: if the else short form

2010-09-29 Thread Alex Willmer
On Sep 29, 12:38 pm, Hrvoje Niksic  wrote:
> Tracubik  writes:
> > Hi all,
> > I'm studying PyGTK tutorial and i've found this strange form:
>
> > button = gtk.Button(("False,", "True,")[fill==True])
>
> > the label of button is True if fill==True, is False otherwise.
>
> The tutorial likely predates if/else expression syntax introduced in
> 2.5, which would be spelled as:
>
> button = gtk.Button("True" if fill else "False")
>
> BTW adding "==True" to a boolean value is redundant and can even break
> for logically true values that don't compare equal to True (such as the
> number 10 or the string "foo").

Totally agreed with one nit. If one chooses to fake

x = true_val if expr else false_val

prior to Python 2.5, with

x = (false_val, true_val)[expr]

then one should ensure that expr evaluates to either 0, 1 or a bool.
If expr evaluates to "fred" or 42 a TypeError or IndexError will
occur. So better to use (in original line)

 button = gtk.Button(("False,", "True,")[bool(fill)])

but still best for readability, to use a full if-else block
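The failure modes are easy to demonstrate. In this sketch the helper names pick and describe are invented for the example.

```python
def pick(expr, true_val='True', false_val='False'):
    """Pre-2.5 style selection by indexing a tuple with expr."""
    return (false_val, true_val)[expr]

def describe(expr):
    """Return the picked value, or the name of the exception raised."""
    try:
        return pick(expr)
    except (TypeError, IndexError) as exc:
        return type(exc).__name__

print(describe(True))    # True
print(describe(0))       # False
print(describe('fred'))  # TypeError
print(describe(42))      # IndexError

# Wrapping in bool() makes any truthy or falsy value safe to use as an index.
print(pick(bool('fred')), pick(bool(42)), pick(bool('')))
```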
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem installing psycopg2 in virtualenv (Ubuntu 10.04, Python 2.5)

2010-10-05 Thread Alex Willmer
On Oct 5, 7:41 am, Pascal Polleunus  wrote:
> On 05/10/10 00:11, Diez B. Roggisch wrote:
> > Install the python-dev-package. It contains the Python.h file, which the
> > above error message pretty clearly says. Usually, it's a good idea to
> > search package descriptions of debian/ubuntu packages for missing header
> > files to know what to install.
>
> It's already installed; at least for 2.6, nor sure it's correct for 2.5.
> python2.5-dev is not available but python-old-doctools replaces it.

Ubuntu 10.04 doesn't have a full Python 2.5 packaged, as evidenced by
the lack of python2.5-dev. You need to use Python 2.6 or if you
absolutely must use Python 2.5 build it from source, try a Debian
package or switch distro. python-old-doctools does not replace python-
dev, it looks like it was bodged to keep some latex tools working.

Regards, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why "flat is better than nested"?

2010-10-25 Thread Alex Willmer
On Oct 25, 11:07 am, kj  wrote:
> In "The Zen of Python", one of the "maxims" is "flat is better than
> nested"?  Why?  Can anyone give me a concrete example that illustrates
> this point?

I take this as a reference to the layout of the Python standard
library and other packages i.e. it's better to have a module hierarchy
of depth 1 or 2 and many top level items, than a depth of 5+ and only
a few top level items.

For instance

import re
import sqlite3
import logging

import something_thirdparty

vs

import java.util.regex
import java.sql
import java.util.logging

import org.example.thirdparty.something

There are of course some Python packages that go deeper than 2
(xml.dom.minidom), but the majority are shallow. I think the
motivation is to make the packages more discoverable, and to avoid
classification arguments (does regex go under util or text). Taken alone,
the statement would imply a single, global namespace à la C, but that of
course is evil, so one must balance it with "Namespaces are one honking
great idea -- let's do more of those!"

I don't think it applies to data structures though. If a deeply nested
tree models your data well, then use it.

Regards, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why "flat is better than nested"?

2010-10-25 Thread Alex Willmer
On Oct 25, 2:56 pm, Robin Becker  wrote:
> On 25/10/2010 11:07, kj wrote:
>
> > In "The Zen of Python", one of the "maxims" is "flat is better than
> > nested"?  Why?  Can anyone give me a concrete example that illustrates
> > this point?
>
> ...
> I believe that the following illustrates the nesting issue (I think this is 
> from
> somewhere in Chomsky)
>
> The rat ate the corn.
> The rat that the cat killed ate the corn.
> The rat that the cat that the dog chased killed ate the corn.
>
> I believe this is called central embedding.
>
> There's also the old schoolboy saying "I know that that that that that boy 
> said
> is wrong!".
>
> The nested nature makes the semantics quite hard. The same will be true of
> nested tuple/list and similar programming structures.

I agree in the case of a souped-up hierarchical record structure that
encourages code like

my_far = the_record.something.something_else.foo[2].keep_going.bar.baz()

A tree of homogeneous nodes that one walks or recurses into (e.g. a
b-tree or r-tree) is a case where I would ignore this maxim.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 'NoneType' object has no attribute 'bind'

2010-10-28 Thread Alex Willmer
On Oct 28, 11:24 am, Alex  wrote:
> hi there, I keep getting the message in the Topic field above.
>
> Here's my code:
>
> self.click2=Button(root,text="Click Me").grid(column=4,row=10)
> self.click2.bind("",self.pop2pop)

From reading the Tkinter docs, grid doesn't itself return a control. So
I think you want this:

self.click2 = Button(root, text="Click Me")
self.click2.grid(column=4, row=10)
self.click2.bind("", self.pop2pop)

However, that's totally untested so don't take it as gospel.
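The underlying issue can be shown without a display. FakeButton below is a hypothetical stand-in for a Tkinter widget, written only to mirror the fact that grid() returns None; the event name "<Button-1>" is an assumption, since the original event string is not shown.

```python
class FakeButton:
    """Hypothetical stand-in for a Tkinter widget (no display needed)."""

    def __init__(self, text):
        self.text = text
        self.position = None
        self.bound = []

    def grid(self, column, row):
        # Places the widget and, like a method with no return, gives None.
        self.position = (column, row)

    def bind(self, sequence, handler):
        self.bound.append((sequence, handler))

# Chained: the name ends up holding grid()'s return value, which is None.
chained = FakeButton(text="Click Me").grid(column=4, row=10)
print(chained)

# Split: keep the widget, then call grid() and bind() on it.
button = FakeButton(text="Click Me")
button.grid(column=4, row=10)
button.bind("<Button-1>", print)  # assumed event name
print(button.position, len(button.bound))
```

Calling chained.bind(...) here raises the same AttributeError about 'NoneType' that appears in the subject line.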

> def pop2pop(self,event):
>
>         print("Adsfadsf")
>         newpop=IntVar()
>         newpop=self.PopSize.get();
>
> what am I doing wrong?
>
> cheers,
>
> Alex

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Would you recommend python as a first programming language?

2010-11-01 Thread Alex Willmer
On Oct 30, 7:16 pm, brad...@hotmail.com wrote:
> I was thinking of recommending this to a friend but what do you all think?
>

I think
 1. Python is a great language, and a good starting point for many
people.
 2. You really haven't given us much to go on.

Regards, Alex
-- 
http://mail.python.org/mailman/listinfo/python-list