Re: CENSORSHIP - Django Project (Schema Evolution Support)

2006-06-07 Thread Simon Willison
Ilias Lazaridis wrote:
> [posted publicly to comp.lang.python, with email notification to 6
> recipients relevant to the topic]
>
> I have implemented a simple schema evolution support for django, due to
> a need for a personal project. Additionally, I've provided an Audit:
>
> http://case.lazaridis.com/wiki/DjangoAudit
>
> As a result, I was censored ('banned' from the development list)

Please see this message for background:

http://groups.google.com/group/django-users/msg/5a96eabf75f2b9c7

To summarise, it was felt that Ilias was deliberately trolling the
mailing list. No further explanation seems necessary.

-- 
http://mail.python.org/mailman/listinfo/python-list


Pythonic API design: detailed errors when you usually don't care

2006-10-02 Thread Simon Willison
Hi all,

I have an API design question. I'm writing a function that can either
succeed or fail. Most of the time the code calling the function won't
care about the reason for the failure, but very occasionally it will.

I can see a number of ways of doing this, but none of them feels
aesthetically pleasing:

1.

try:
  do_something()
except HttpError:
  pass  # An HTTP error occurred
except ApplicationError:
  pass  # An application error occurred
else:
  pass  # It worked!

This does the job fine, but has a couple of problems. The first is that
I anticipate that most people using my function won't care about the
reason; they'll just want a True or False answer. Their ideal API would
look like this:

if do_something():
  pass  # It succeeded
else:
  pass  # It failed

The second is that the common path is success, which is hidden away in
the 'else' clause. This seems unintuitive.
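
One way to offer both styles at once, sketched under assumptions: keep the exception-raising function for callers who want the reason, and add a thin boolean wrapper for everyone else. HttpError and ApplicationError are the names from the example above; the DoSomethingError base class, the try_something() wrapper and the fail_with parameter are hypothetical and exist only for illustration.

```python
class DoSomethingError(Exception):
    """Hypothetical common base class for every failure do_something() reports."""


class HttpError(DoSomethingError):
    pass


class ApplicationError(DoSomethingError):
    pass


def do_something(fail_with=None):
    # Stand-in body: raise the requested error class, otherwise succeed.
    if fail_with is not None:
        raise fail_with("simulated failure")


def try_something(fail_with=None):
    """Boolean wrapper for callers who do not care why the call failed."""
    try:
        do_something(fail_with)
    except DoSomethingError:
        return False
    return True
```

Callers who only want a yes/no answer write "if try_something():", while callers who need the reason call do_something() directly and catch the specific exception.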

2.

Put the method on an object, which stores the reason for a failure:

if obj.do_something():
  pass  # It succeeded
else:
  # It failed; obj.get_error_reason() can be called if you want to
  # know why
  pass

This has an API that is closer to my ideal True/False, but requires me
to maintain error state inside an object. I'd rather not keep extra
state around if I don't absolutely have to.

3.

error = do_something()
if error:
  pass  # It failed
else:
  pass  # It succeeded

This is nice and simple but suffers from cognitive dissonance in that
the function returns True (or an object evaluating to True) for
failure.

4.

The preferred approach works like this:

if do_something():
  pass  # Succeeded
else:
  pass  # Failed

BUT this works too...

ok = do_something()
if ok:
  pass  # Succeeded
else:
  # ok.reason has extra information
  reason = ok.reason

This can be implemented by returning an object from do_something() that
has a __nonzero__ method that makes it evaluate to False. This solves
my problem almost perfectly, but has the disadvantage that it operates
counter to developer expectations (normally an object that evaluates to
False is 'empty').
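
A minimal sketch of option 4, assuming a small result class. The Result name, its ok and reason attributes, the fail parameter and the sample reason string are all hypothetical. Python 2 looks up __nonzero__ for truth testing and Python 3 looks up __bool__, so the sketch defines both.

```python
class Result(object):
    """Hypothetical return value: truthy on success, falsy on failure."""

    def __init__(self, ok, reason=None):
        self.ok = ok
        self.reason = reason  # extra detail, only meaningful on failure

    def __bool__(self):  # truth hook used by Python 3
        return self.ok

    __nonzero__ = __bool__  # same hook under its Python 2 name


def do_something(fail=False):
    # Stand-in body: report a made-up failure reason on request.
    if fail:
        return Result(False, reason="HTTP 503 from upstream")
    return Result(True)
```

With this in place both call styles shown above work: "if do_something():" for the common case, and "ok = do_something()" followed by ok.reason when the detail matters.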

I know I should probably just pick one of the above and run with it,
but I thought I'd ask here to see if I've missed a more elegant
solution.

Thanks,

Simon



Treating a unicode string as latin-1

2008-01-03 Thread Simon Willison
Hello,

I'm using ElementTree to parse an XML file which includes some data
encoded as cp1252, for example:

Bob\x92s Breakfast

If this were a regular bytestring, I would convert it to UTF-8 using the
following:

>>> print 'Bob\x92s Breakfast'.decode('cp1252').encode('utf8')
Bob’s Breakfast

But ElementTree gives me back a unicode string, so I get the following
error:

>>> print u'Bob\x92s Breakfast'.decode('cp1252').encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/encodings/cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeEncodeError: 'ascii' codec can't encode character u'\x92' in position 3: ordinal not in range(128)

How can I tell Python "I know this says it's a unicode string, but I
need you to treat it like a bytestring"?
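
One possible approach, offered as a sketch and not as the only answer: latin-1 maps code points 0-255 to the identical byte values, so encoding the unicode string as latin-1 recovers the original bytes, which can then be decoded as cp1252. The variable names are hypothetical, and the trick assumes every character in the string is below U+0100; anything higher makes the encode step raise UnicodeEncodeError.

```python
# The mislabeled text as ElementTree returned it: a unicode string whose
# code points are really cp1252 byte values.
mislabeled = u'Bob\x92s Breakfast'

# Step 1: latin-1 turns each code point 0-255 back into the same byte.
raw_bytes = mislabeled.encode('latin-1')

# Step 2: decode those bytes with the encoding they were actually in.
fixed = raw_bytes.decode('cp1252')

# Step 3: re-encode as UTF-8 for output or storage.
utf8_bytes = fixed.encode('utf8')
```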

Thanks,

Simon Willison


Is it possible to consume UTF8 XML documents using xml.dom.pulldom?

2008-07-30 Thread Simon Willison
I'm having a horrible time trying to get xml.dom.pulldom to consume a
UTF8 encoded XML file. Here's what I've tried so far:

>>> xml_utf8 = """
Simon\xe2\x80\x99s XML nightmare
"""
>>> from xml.dom import pulldom
>>> parser = pulldom.parseString(xml_utf8)
>>> parser.next()
('START_DOCUMENT', )
>>> parser.next()
('START_ELEMENT', )
>>> parser.next()
...
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
position 21: ordinal not in range(128)

xml.dom.minidom can handle the string just fine:

>>> from xml.dom import minidom
>>> dom = minidom.parseString(xml_utf8)
>>> dom.toxml()
u'Simon\u2019s XML nightmare'

If I pass a unicode string to pulldom instead of a utf8 encoded
bytestring it still breaks:

>>> xml_unicode = u'Simon\u2019s XML nightmare'
>>> parser = pulldom.parseString(xml_unicode)
...
/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xml/dom/pulldom.py in parseString(string, parser)
    346
    347     bufsize = len(string)
--> 348     buf = StringIO(string)
    349     if not parser:
    350         parser = xml.sax.make_parser()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
position 32: ordinal not in range(128)

Is it possible to consume utf8 or unicode using xml.dom.pulldom or
should I try something else?

Thanks,

Simon Willison


Re: Is it possible to consume UTF8 XML documents using xml.dom.pulldom?

2008-07-30 Thread Simon Willison
Follow up question: what's the best way of incrementally consuming XML
in Python that's character encoding aware? I have a very large file to
consume but I'd rather not have to fall back to the raw SAX API.
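
One candidate worth trying is ElementTree's iterparse, which reads the encoding from the XML declaration and yields elements as the file is parsed. The sketch below is written for current Python and rests on assumptions: the sample document, the "items" and "title" element names and the iter_titles() helper are hypothetical stand-ins for the real file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, stored as UTF-8 bytes.
xml_utf8 = (b'<?xml version="1.0" encoding="utf-8"?>'
            b'<items><title>Simon\xe2\x80\x99s XML nightmare</title>'
            b'<title>Second title</title></items>')


def iter_titles(stream):
    """Yield the text of each <title> as soon as its end tag is parsed."""
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "title":
            yield elem.text
            elem.clear()  # drop the element's contents to keep memory flat


titles = list(iter_titles(io.BytesIO(xml_utf8)))
```

For a large file on disk, passing a filename or an open binary file object to iter_titles() in place of the BytesIO wrapper avoids loading the whole document into memory.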


Re: Is it possible to consume UTF8 XML documents using xml.dom.pulldom?

2008-07-30 Thread Simon Willison
On Jul 30, 4:43 pm, Paul Boddie <[EMAIL PROTECTED]> wrote:
> I can't reproduce this on Python 2.3.6 or 2.4.4 on RHEL 4. Instead, I
> get the usual...
>
> ('CHARACTERS', )

I'm using Python 2.5.1 on OS X Leopard:

$ python
Python 2.5.1 (r251:54863, Feb  4 2008, 21:48:13)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin

I just tried it out on Python 2.4.2 on an Ubuntu machine and it worked
fine! I guess this must be an OS X Python bug. How absolutely
infuriating.

Thanks,

Simon


Re: Is it possible to consume UTF8 XML documents using xml.dom.pulldom?

2008-07-30 Thread Simon Willison
On Jul 30, 4:59 pm, Simon Willison <[EMAIL PROTECTED]> wrote:
> I just tried it out on Python 2.4.2 on an Ubuntu machine and it worked
> fine! I guess this must be an OS X Python bug. How absolutely
> infuriating.

Some very helpful people in #python on Freenode pointed out that my bug
occurs because I'm trying to display things interactively in the
console. Saving to a variable instead fixes the problem.
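
A minimal sketch of that workaround, written for current Python where pulldom.parse() accepts a binary stream: keep each event in a variable, pull the text out of the node, and encode explicitly only when output is needed. The "title" element name and the variable names are hypothetical.

```python
import io
from xml.dom import pulldom

# Hypothetical sample document, stored as UTF-8 bytes.
xml_utf8 = (b'<?xml version="1.0" encoding="utf-8"?>'
            b'<title>Simon\xe2\x80\x99s XML nightmare</title>')

events = pulldom.parse(io.BytesIO(xml_utf8))

chunks = []
for event, node in events:
    # Hold the event in variables; nothing is echoed to the console.
    if event == pulldom.CHARACTERS:
        chunks.append(node.data)

# Character data can arrive in several pieces, so join them.
text = u''.join(chunks)

# Encode explicitly when a bytestring is needed for output.
encoded = text.encode('utf-8')
```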

Thanks for your help,

Simon