Re: python3: 'where' keyword

2005-01-08 Thread oren
When I first saw this I thought: "hmmm... this seems as redundant as
adding a repeat/until loop to Python; there's no chance in hell it will
ever be accepted by the community or Guido, but I actually kinda like
it".  It's nice to see mostly positive reactions to this idea so far.

I think it's a really ingenious solution to the anonymous function
problem - don't make it anonymous! A short, throwaway name with a very
localized scope is as good as a truly anonymous function and feels more
Pythonic to me. We thought we wanted a better syntax than lambda for
anonymous functions, but Andrey shows that perhaps that isn't what we
really need. What we need is a solution to quickly and cleanly generate
bits of callable code without polluting the containing namespace,
without having to think too hard about unique names and while making
their temporary and local nature clear from the context. Anonymity
isn't one of the requirements.

I really liked Nick Coghlan's property example. The names 'get' and
'set' are too short and generic to be used without a proper scope but
with this syntax they are just perfect.

Here's another example:

w = Widget(color=Red, onClick=onClick, onMouseOver=onMouseOver) where:
.   def onClick(event): do_this(event.x, event.y, foo)
.   def onMouseOver(event): someotherwidget.do_that()

The "onClick=onClick" part seems a bit redundant, right? So how about
this:

w = Widget(**kw) where:
.   color = Red
.   def onClick(event): do_this(event.x, event.y, blabla)
.   def onMouseOver(event): someotherwidget.do_that()
.   x, y = 100, 200
.   kw = locals()

I'm not really sure myself how much I like this. It has a certain charm
but also feels like abuse of the feature. Note that "w =
Widget(**locals()) where:" would produce the wrong result, as it would
include all the values in the containing scope, not just those defined
in the where block.
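
For comparison, here is a rough sketch of how the same call reads in
today's Python without the proposed syntax (Widget, Red, do_this, foo and
someotherwidget are placeholders from the example above) - the helper
names leak into the containing namespace, which is exactly what "where"
would avoid:

def onClick(event): do_this(event.x, event.y, foo)
def onMouseOver(event): someotherwidget.do_that()

w = Widget(color=Red, onClick=onClick, onMouseOver=onMouseOver)

# clean up by hand if you care about the namespace pollution
del onClick, onMouseOver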

   Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Inserting Records into SQL Server - is there a faster interface than ADO

2005-11-14 Thread Oren Tirosh
> We are using the stored procedure to do a If Exist, update, else Insert 
> processing for
> each record.

Consider loading the data in batches into a temporary table and then
use a single insert statement to insert new records and a single update
statement to update existing ones. This way, you are not forcing the
database to do it one by one, and you give it a chance to aggressively
optimize your queries and update the indexes in bulk. You'd be
surprised at the difference this can make!
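
A rough sketch of the idea, assuming a DB-API connection `conn` with
qmark parameters (the table and column names here are made up - adapt
them to your schema):

rows = [(r.id, r.value) for r in records]      # `records`: your input batch

cur = conn.cursor()
# 1. bulk-load the batch into a staging table
cur.executemany("INSERT INTO staging (id, value) VALUES (?, ?)", rows)
# 2. one UPDATE for every row that already exists
cur.execute("""
    UPDATE t SET t.value = s.value
    FROM target AS t JOIN staging AS s ON t.id = s.id
""")
# 3. one INSERT for every row that does not exist yet
cur.execute("""
    INSERT INTO target (id, value)
    SELECT s.id, s.value FROM staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM target AS t WHERE t.id = s.id)
""")
cur.execute("DELETE FROM staging")
conn.commit()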

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python choice of database

2005-06-20 Thread Oren Tirosh
Philippe C. Martin wrote:
> Hi,
>
> I am looking for a stand-alone (not client/server) database solution for
> Python.
>
> 1) speed is not an issue
> 2) I wish to store less than 5000 records
> 3) each record should not be larger than 16K

How about using the filesystem as a database? For the number of records
you describe it may work surprisingly well. A bonus is that the
database is easy to manage manually. One tricky point is updating: you
probably want to create a temporary file and then use os.rename to
replace a record in one atomic operation.
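
A minimal sketch of that update path (the directory layout is up to you;
one file per record, keyed by file name):

import os
import tempfile

def put_record(dbdir, key, data):
    # write to a temporary file in the same directory, then atomically
    # replace the old record - readers never see a half-written file
    fd, tmpname = tempfile.mkstemp(dir=dbdir)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    os.rename(tmpname, os.path.join(dbdir, key))

def get_record(dbdir, key):
    f = open(os.path.join(dbdir, key), 'rb')
    try:
        return f.read()
    finally:
        f.close()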

For very short keys and records (e.g. email addresses) you can use
symbolic links instead of files. The advantage is that you have a
single system call (readlink) to retrieve the contents of a link. No
need to open, read and close.

This works only on posix systems, of course. The actual performance
depends on your filesystem but on linux and BSDs I find that
performance easily rivals that of berkeleydb and initialization time is
much faster. This "database" also supports reliable concurrent access
by multiple threads or processes.

See http://www.tothink.com/python/linkdb

Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: voicemail program written with python

2005-06-22 Thread Oren Tirosh
It is relatively easy to write voice applications for the Asterisk
software PBX using the CGI-like AGI (Asterisk Gateway Interface).

The following document describes the AGI and has some examples in
Python:

http://home.cogeco.ca/~camstuff/agi.html

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: JBUS and Python which way

2005-08-02 Thread Oren Tirosh
If you can't find any JBUS/Modbus modules specific for Python it's
possible to use one of the many C/C++ modules available and make a
Python wrapper for it with an interface generator like SWIG or SIP. You
say that you don't have much technical background so you may consider
hiring someone to do it. It's not a big project so it shouldn't be too
expensive.

  Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get a unique id for bound methods?

2005-08-22 Thread Oren Tirosh
Russell E. Owen wrote:
> I have several situations in my code where I want a unique identifier
> for a method of some object (I think this is called a bound method). I
> want this id to be both unique to that method and also stable (so I can
> regenerate it later if necessary).

>>> def persistent_bound_method(m):
...     return m.im_self.__dict__.setdefault(m.im_func.func_name, m)
...
>>> class A:
...     def x(self):
...         return
...
>>> a=A()
>>> a.x is a.x
False
>>> persistent_bound_method(a.x) is persistent_bound_method(a.x)
True
>>>

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Email client in Python

2005-08-25 Thread Oren Tirosh
> IIRC, many of the mailbox modules (such as mailbox and
> mhlib) are read-only, but they should provide a good starting point.

The mailbox module has recently been upgraded for full read-write
access by a student participating in Google's Summer of Code. It is
currently under review for inclusion in the standard library.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Virtual Slicing

2005-08-27 Thread Oren Tirosh
Bryan Olson wrote:
> I recently wrote a module supporting value-shared slicing. I
> don't know if this functionality already existed somewhere,

In the numarray module, slices are views into the underlying array
rather than copies.

http://www.stsci.edu/resources/software_hardware/numarray

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to detect if files on a drive were changed without scanning the drive?

2005-09-12 Thread Oren Tirosh
> After connecting a drive to the system (via USB
> or IDE) I would like to be able to see within seconds
> if there were changes in the file system of that drive
> since last check (250 GB drive with about four million
> files on it).

Whenever a file is modified, the last modification time of the directory
containing it is also updated. I'm not sure if the root directory itself
has a last modification time field, but you can just store and compare
the last mod time of all subdirectories directly under the root
directory.
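
A rough sketch of that check (assuming the drive is mounted at `root`):

import os

def subdir_mtimes(root):
    # last-modification time of each subdirectory directly under root
    times = {}
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            times[name] = os.stat(path).st_mtime
    return times

# Store the snapshot after each check; on the next check, any added,
# removed or changed entry points at a subtree that needs a rescan.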

If you trust that the clocks of all machines mounting this drive are set
correctly (including time zone) you can store just a single timestamp
and check for files or directories modified after that time.
Otherwise you will need to store the timestamps and compare for any
change, not just changes going forward.

  Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Word for a non-iterator iterable?

2005-02-07 Thread Oren Tirosh
Leif K-Brooks <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> Is there a word for an iterable object which isn't also an iterator, and 
> therefor can be iterated over multiple times without being exhausted? 
> "Sequence" is close, but a non-iterator iterable could technically 
> provide an __iter__ method without implementing the sequence protocol, 
> so it's not quite right.

"reiterable". I think I was the first to use this word on
comp.lang.python.

If you have code that requires this property you might want to use this
function:

.def reiter(x):
.    i = iter(x)
.    if i is x:
.        raise TypeError, "Object is not re-iterable"
.    return i

example:

.for outer in x:
.    for inner in reiter(y):
.        do_something_with(outer, inner)

This will raise an exception when an iterator is used for y instead of
silently failing after the first time through the outer loop and
making it look like an empty container.

When iter() returns a new iterator object it is a good hint but not a
100% guarantee that the object is reiterable. For example, python 2.2
returned a new xreadlines object for iterating over a file but it
messed up the underlying file object's state so it still wasn't
reiterable. But when iter() returns the same object - well, that's a
sign that the object is definitely not reiterable.

   Oren
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: managing multiple subprocesses

2005-02-07 Thread Oren Tirosh
"Marcos" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> ...
> os.systems / commands etc etc. I realise the subprocess module may have
> what I need but I can't get python 2.4 on the Mac so I need a 2.3 based
> solution. Any help is much appreciated. Cheers.

The Python 2.4 subprocess module by Peter Astrand is pure python for
posix systems (a small C extension module is required only for win32
systems). You can bundle it with your code and use it with older
Python versions. It even contains backward compatibility code to
replace True and False if not found so it can run on Python 2.2.

  Oren
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Turn off globals in a function?

2005-03-26 Thread Oren Tirosh
Ron_Adam <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> Is there a way to hide global names from a function or class?
> 
> I want to be sure that a function doesn't use any global variables by
> mistake.  So hiding them would force a name error in the case that I
> omit an initialization step.  This might be a good way to quickly
> catch some hard to find, but easy to fix, errors in large code blocks.

def noglobals(f):
.   import new
.   return new.function(
.       f.func_code,
.       {'__builtins__': __builtins__},
.       f.func_name,
.       f.func_defaults,
.       f.func_closure
.   )

You can use it with the Python 2.4 @decorator syntax:

@noglobals
def a(...):
.   # code here

Doing this for a class is a little more work. You will need to dig inside
to perform this treatment on each method separately, and handle new- and
old-style classes a bit differently.

Note that this kind of function may declare globals. They will be
persistent but private to the function.

  Oren
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Suggesting a new feature - "Inverse Generators"

2005-03-26 Thread Oren Tirosh
"Jordan Rastrick" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL 
PROTECTED]>...
> Hmmm, I like the terminology consumers better than acceptors.

Here's an implementation of Python consumers using generators:
http://groups.google.co.uk/[EMAIL PROTECTED]

  Oren
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pre-PEP: Suite-Based Keywords

2005-04-16 Thread Oren Tirosh
Take a look at Nick Coghlan's "with" proposal:

http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list%40python.org

It addresses many of the same issues (e.g. easy definition of
properties). It is more general, though: while your proposal only
applies to keyword arguments in a function call, this one can be used
to name any part of a complex expression and define it in a suite.

I think that a good hybrid solution would be to combine the "with"
block with optional use of the ellipsis to mean "all names defined in
the with block".

See also the thread resulting from Andrey Tatarinov's original
proposal (using the keyword "where"):

http://groups.google.co.uk/groups?selm=3480qqF46jprlU1%40individual.net


  Oren
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Removing dictionary-keys not in a set?

2005-04-18 Thread Oren Tirosh
"Tim N. van der Leeuw" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL 
PROTECTED]>...
> Hi,
> 
> I'd like to remove keys from a dictionary, which are not found in a
> specific set. 

Here's my magic English-to-Python translator:

"I'd like to ... keys which ..."  ->  "for key in"
"keys from a dictionary"  ->  "set(dictionary)"
"not found in a specific set" ->  "-specificset"
"... remove keys ..."     ->  "del dictionary[key]"

Putting it all together:

>>> for key in set(dictionary)-specificset:
...     del dictionary[key]

Oren
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ElemenTree and namespaces

2005-05-16 Thread oren . tirosh
Matthew Thorley wrote:
> Does any one know if there a way to force the ElementTree module to
> print out name spaces 'correctly' rather than as ns0, ns1 etc? Or is
> there at least away to force it to include the correct name spaces in
> the output of tostring?
>
> I didn't see anything in the api docs or the list archive, but before I
> set off to do it myself I thought I should ask, because it seemed like
> the kind of thing that has already been done.

There's a way, but it requires access to undocumented internal
stuff. It may not be compatible with other implementations of the
ElementTree API like lxml.

The ElementTree module has a _namespace_map dictionary of "well known"
namespace prefixes, mapping namespace URIs to prefixes. By default it
contains the xml:, html:, rdf: and wsdl: prefixes. You can add your own
namespace to that dictionary to get your preferred prefix.
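
For example (the URI and prefix here are made up, and _namespace_map is
an undocumented internal, so this may change between versions):

from elementtree import ElementTree

# map your namespace URI to the prefix you want to see on output
ElementTree._namespace_map['http://example.com/myns'] = 'my'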

In theory, namespace prefixes are entirely arbitrary and only serve as
a temporary link to the namespace URI. In practice, people tend to get
emotionally attached to their favorite prefixes. XPath also breaks this
theory because it refers to prefixes rather than URIs.


Take a look at http://www.tothink.com/python/ElementBuilder. It's a
module to provide a friendly syntax for building and populating
Elements:

Example:

>>> import ElementBuilder
>>> from elementtree import ElementTree
>>> ns = ElementBuilder.Namespace('http://some.uri', 'ns')
>>> e = ns.tag(
...   ns.tag2('content'),
...   ns.tag3(attr='value'),
...   ns.tag4({ns.attr: 'othervalue'}),
...   ns.x(
... ns.y('y'),
... ns.z('z'),
... 'some text',
...   )
... )
>>> ElementTree.dump(e)
<ns:tag xmlns:ns="http://some.uri"><ns:tag2>content</ns:tag2><ns:tag3 attr="value" /><ns:tag4 ns:attr="othervalue" /><ns:x><ns:y>y</ns:y><ns:z>z</ns:z>some text</ns:x></ns:tag>

Note that the namespace prefix on output is not "ns0". The second
argument to the Namespace constructor is the prefix hint and unless it
collides with any other namespace or prefix it will be added to
_namespace_map dictionary and used on output.


  Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: For review: PEP 343: Anonymous Block Redux and Generator Enhancements

2005-06-04 Thread oren . tirosh
Ilpo Nyyssönen wrote:
> Nicolas Fleury <[EMAIL PROTECTED]> writes:
> > def foo():
> > with locking(someMutex)
> > with opening(readFilename) as input
> > with opening(writeFilename) as output
> > ...
>
> How about this instead:
>
> with locking(mutex), opening(readfile) as input:
> ...

+1, and add PEP-328-like parentheses for multiline.

  Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ElementTree Namespace Prefixes

2005-06-13 Thread Oren Tirosh
Fredrik Lundh wrote:
> Chris Spencer wrote:
>
> > If an XML parser reads in and then writes out a document without having
> > altered it, then the new document should be the same as the original.
>
> says who?

Good question. There is no One True Answer even within the XML
standards.

It all boils down to how you define "the same". Which parts of the XML
document are meaningful content that needs to be preserved and which
ones are mere encoding variations that may be omitted from the internal
representation?

Some relevant references which may be used as guidelines:

* http://www.w3.org/TR/xml-infoset
The XML infoset defines 11 types of information items including
document type declaration, notations and other features. It does not
appear to be suitable for a lightweight API like ElementTree.

* http://www.w3.org/TR/xpath-datamodel
The XPath data model uses a subset of the XML infoset with "only" seven
node types.

* http://www.w3.org/TR/xml-c14n
The canonical XML recommendation is meant to describe a process but it
also effectively defines a data model: anything preserved by the
canonicalization process is part of the model. Anything not preserved
is not part of the model.

In theory, this definition should be equivalent to the XPath data model
since canonical XML is defined in terms of the XPath data model. In
practice, the XPath data model defines properties not required for
producing canonical XML (e.g. unparsed entities associated with the
document node). I like this alternative "black box" definition because
it provides a simple touchstone for determining what is or isn't part of
the model.

I think it would be a good goal for ElementTree to aim for compliance
with the canonical XML data model. It's already quite close.

It's possible to use the canonical XML data model without being a
canonical XML processor but it would be nice if parse() followed by
write() actually passed the canonical XML test vectors. It's the
easiest way to demonstrate compliance conclusively.

So what changes are required to make ElementTree canonical?

1. PI nodes are already supported for output. Need an option to
preserve them on parsing
2. Comment nodes are already supported for output. Need an option to
preserve them on parsing (canonical XML also defines a "no comments"
canonical form)
3. Preserve Comments and PIs outside the root element (store them as
children of the ElementTree object?)
4. Sorting of attributes by canonical order
5. Minor formatting and spacing issues in opening tags

oh, and one more thing...

6. preserve namespace prefixes ;-)
(see http://www.w3.org/TR/xml-c14n#NoNSPrefixRewriting for rationale)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ElementTree Namespace Prefixes

2005-06-14 Thread Oren Tirosh
> you forgot
>
>http://effbot.org/zone/element-infoset.htm
>
> which describes the 3-node XML infoset subset used by ElementTree.

No, I did not forget your infoset subset. I was comparing it with other
infoset subsets described in various XML specifications.

I agree 100% that prefixes were not *supposed* to be part of the
document's meaning back when the XML namespace specification was
written, but later specifications broke that.

Please take a look at http://www.w3.org/TR/xml-c14n#NoNSPrefixRewriting

"... there now exist a number of contexts in which namespace prefixes
can impart information value in an XML document..."

"...Moreover, it is possible to prove that namespace rewriting is
harmful, rather than simply ineffective."

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Set of Dictionary

2005-06-17 Thread Oren Tirosh
See the frozendict recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/414283

It was written exactly for this purpose: a dictionary that can be a
member in a set.
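
The gist of it, in a very stripped-down sketch (this is not the recipe
itself, just the idea):

class frozendict(dict):
    # an immutable, hashable dict - all mutating methods are blocked
    def _blocked(self, *args, **kw):
        raise TypeError("frozendict is immutable")
    __setitem__ = __delitem__ = clear = update = _blocked
    pop = popitem = setdefault = _blocked
    def __hash__(self):
        return hash(frozenset(self.items()))

s = set([frozendict(a=1), frozendict(b=2)])   # a plain dict would raise here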

  Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 1980's Home Computer-style Package.

2005-06-17 Thread Oren Tirosh
http://tothink.com/python/progman/

This module implements BASIC-like NEW, LOAD, RUN (sorry, no SAVE...).

   Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Parsing XML with ElementTree (unicode problem?)

2007-07-23 Thread oren . tsur
(this question was also posted in the devshed python forum:
http://forums.devshed.com/python-programming-11/parsing-xml-with-elementtree-unicode-problem-461518.html
).
-

(it's a bit longish but I hope I give all the information)

1. Here is my problem: I'm trying to parse an XML file (saved locally)
using elementtree.parse but I get the following error:
xml.parsers.expat.ExpatError: not well-formed (invalid token): line
13, column 327
Apparently, the problem is caused by the token 'Saunière', due to the
accented character.

The thing is that I'm sure that Python (the ElementTree module and the
parse() function) can handle this type of encoding, since I obtain my
XML file from the web by opening it with:

from elementtree import ElementTree
from urllib import urlopen
query = r'http://ecs.amazonaws.com/onca/xml?
Service=AWSECommerceService&AWSAccessKeyId=189P5TE3VP7N9MN0G302&Operation=ItemLookup&ItemId=1400079179&ResponseGroup=Reviews&ReviewPage=166'
root = ElementTree.parse(urlopen(query))

where query is a query to the AWS, and this specific query has the
'Saunière' in the response. (you could simply open the query with a
web browser and see the xml).

I create a local version of the XML file, containing only the tags
that are of interest. My file looks something like this (I replaced
some of the content with a 'bla bla' string in order to make it fit
here):


805 3
5 6
2004-04-03
Not as good as Angels and Demons
I found that this book was not as good and thrilling as
Angels and Demons. bla bla.



827 4
2 8
2004-04-01
The Da Vinci Code, a master piece of words
The Da Vinci Code by Dan Brown is a well-written bla bla. The
story starts out in Paris, France with a murder of Jacque Saunière,
the head curator at Le Louvre.bla bla 



BUT, then trying:

fIn = open(file, 'r')    # or even 'import codecs' and opening with
                         # 'fIn = codecs.open(file, encoding='utf-8')'
tree = ElementTree.parse(fIn)



where file is the saved file, I get the error above
(xml.parsers.expat.ExpatError: not well-formed (invalid token): line
13, column 327). So what's the difference? How come parsing is fine
in the first case but fails in the second case? Please advise.

2. There is another problem that might be related: I get a similar
error if the content of the (locally saved) XML has special
characters such as '&', for example in 'Angels & Demons' (vs. 'Angels
and Demons'). Is it the same problem? Same solution?

thanks!

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parsing XML with ElementTree (unicode problem?)

2007-07-23 Thread oren . tsur
On Jul 23, 4:46 pm, "Richard Brodie" <[EMAIL PROTECTED]> wrote:
> <[EMAIL PROTECTED]> wrote in message
>
> news:[EMAIL PROTECTED]
>
> > so what's the difference? how comes parsing is fine
> > in the first case but erroneous in the second case?
>
> You may have guessed the encoding wrong. It probably
> wasn't utf-8 to start with but iso8859-1 or similar.
> What actual byte value is in the file?

I tried it with different encodings and it didn't work. Anyway, I
would expect it to be utf-8, since the XML response to the amazon query
indicates utf-8 (check it with
http://ecs.amazonaws.com/onca/xml?Service=AWSECommerceService&AWSAccessKeyId=189P5TE3VP7N9MN0G302&Operation=ItemLookup&ItemId=1400079179&ResponseGroup=Reviews&ReviewPage=166
in your browser; the first line in the source is the XML declaration
specifying utf-8)

but the thing is that the parser parses it all right from the web (the
amazon response) but fails to parse the locally saved file.

> > 2. there is another problem that might be similar I get a similar
> > error if the content of the (locally saved) xml have special
> > characters such as '&'
>
> Either the originator of the XML has messed up, or whatever
> you have done to save a local copy has mangled it.

I think I made a mess. I changed the '&' in the original response to
'and' because the parser failed to parse the '&' (in the locally saved
file), just like it failed with the French characters. Again, parsing
the original response was just fine.

Thanks again,

Oren


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parsing XML with ElementTree (unicode problem?)

2007-07-24 Thread oren . tsur

> How about trying
> root = ElementTree.parse(urlopen(query), encoding ='utf-8')
>

This specific thing is not working. However, parsing the URL is not the
problem; the problem is that after parsing the XML at the URL I save
some of the fields to a local file, and the local file is not being
parsed properly due to the non-ASCII characters Sauni\xc3\xa8re (the
French name Saunière).

An example of the file can be found in the first posting; you could
copy+paste+save it to your machine and then try to parse it.

I'm quite new to XML and Python, so I guess there must be something
wrong or dumb in the way I save the file (maybe I miss some important
tags?) or in the way I re-open it, but I can't find what's wrong.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread oren . tsur
On Jul 26, 3:13 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Jul 26, 9:24 pm, [EMAIL PROTECTED] wrote:
>
> > OK, I solved the problem but I still don't get what went wrong.
> > Solution - use tree builder in order to create the new xml file
> > (previously I was  "manually" creating it).
>
> > I'm still curious so I'm adding a link to a short and very simple
> > script that gets an xml (containing non ascii chars) from the web and
> > saves some of the elements to 2 different local xml files - one is
> > created by XMLWriter and the other is created manually. you could see
> > that parsing of the first local file is OK while parsing of the
> > "manually" created xml file fails. obviously I'm doing something wrong
> > and I'd love to learn what.
>
> > the toy script:http://staff.science.uva.nl/~otsur/code/xmlConversions.py
>
> Simple file comparison:
>
> File 1: ... Modern Church.  The book ...
> File 2: ... Modern Church. The book ...
>
> Firefox:
>
> XML Parsing Error: mismatched tag. Expected: .
> Location: file:///C:/junk/myDeVinciCode166_2.xml
> Line Number 3, Column 1153:
>
> The...Church. The...thrill.
> --^

yup, but why does this happen? On the script side I write the exact same
strings of content with, supposedly, the same encoding - so why is the
encoding different?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread oren . tsur
OK, I solved the problem but I still don't get what went wrong.
Solution - use tree builder in order to create the new xml file
(previously I was  "manually" creating it).
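
Something along these lines (a rough sketch; the wrapper and element tag
names are guesses - adapt them to the actual response, and note that if
the feed uses a namespace the tags will look like '{uri}Review'):

from urllib import urlopen
from elementtree import ElementTree

query = '...'   # the AWS ItemLookup URL from the first posting
root = ElementTree.parse(urlopen(query)).getroot()

out = ElementTree.Element('Reviews')            # hypothetical wrapper tag
for review in root.getiterator('Review'):       # hypothetical element tag
    out.append(review)

# letting ElementTree serialize the tree writes well-formed, properly
# escaped and encoded markup - no hand-written (and possibly unclosed) tags
ElementTree.ElementTree(out).write('reviews_local.xml', encoding='utf-8')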

I'm still curious so I'm adding a link to a short and very simple
script that gets an xml (containing non ascii chars) from the web and
saves some of the elements to 2 different local xml files - one is
created by XMLWriter and the other is created manually. you could see
that parsing of the first local file is OK while parsing of the
"manually" created xml file fails. obviously I'm doing something wrong
and I'd love to learn what.

the toy script:
http://staff.science.uva.nl/~otsur/code/xmlConversions.py

Thanks for all your help,

Oren

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread oren . tsur
On Jul 26, 4:34 pm, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > On Jul 26, 3:13 pm, John Machin <[EMAIL PROTECTED]> wrote:
> >> On Jul 26, 9:24 pm, [EMAIL PROTECTED] wrote:
>
> >>> OK, I solved the problem but I still don't get what went wrong.
> >>> Solution - use tree builder in order to create the new xml file
> >>> (previously I was  "manually" creating it).
> >>> I'm still curious so I'm adding a link to a short and very simple
> >>> script that gets an xml (containing non ascii chars) from the web and
> >>> saves some of the elements to 2 different local xml files - one is
> >>> created by XMLWriter and the other is created manually. you could see
> >>> that parsing of the first local file is OK while parsing of the
> >>> "manually" created xml file fails. obviously I'm doing something wrong
> >>> and I'd love to learn what.
> >>> the toy script:http://staff.science.uva.nl/~otsur/code/xmlConversions.py
> >> Simple file comparison:
>
> >> File 1: ... Modern Church.  The book ...
> >> File 2: ... Modern Church. The book ...
> >>
> >> Firefox:
> >>
> >> XML Parsing Error: mismatched tag. Expected: .
> >> Location: file:///C:/junk/myDeVinciCode166_2.xml
> >> Line Number 3, Column 1153:
> >>
> >> The...Church. The...thrill.
> >> --^
> >
> > yup, but why does this happen - on the script side - I write the exact
> > same strings, of content with supposedly, same encoding, so why the
> > encoding is different?
>
> Read the mail. It's not the encoding, it's the "" which does not get
> through as a tag in the first file.
>
> Stefan

Thanks. I guess it was a dumb question after all. Thanks again :)

-- 
http://mail.python.org/mailman/listinfo/python-list


Interest check in some delicious syntactic sugar for "except:pass"

2010-03-03 Thread Oren Elrad
Howdy all, longtime appreciative user, first time mailer-inner.

I'm wondering if there is any support (tepid better than none) for the
following syntactic sugar:

silence:
 block

->

try:
block
except:
pass

The logic here is that there are a ton of "except: pass" statements[1]
floating around in code that do not need to be there. Meanwhile, the
potential keyword 'silence' does not appear to be in significant use
as a variable[2]. An alternative keyword might be imagined
('quiet', 'hush', 'stfu'), but I somewhat like the verbiness of
'silence', since that is precisely what it does to the block (that is,
you have to inflect it as a verb, not a noun -- you are telling the
block to be silent). Finally, since this is the purest form of
syntactic sugar, I cannot fathom any parsing, interpreting or other
complications that would arise.
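
For what it's worth, much of the effect is already expressible without
new syntax via a context manager - a rough sketch (the name and shape are
mine, not an existing stdlib feature):

import os
from contextlib import contextmanager

@contextmanager
def silence(*exceptions):
    # suppress the given exception types (any Exception if none given)
    try:
        yield
    except exceptions or Exception:
        pass

with silence(OSError):
    os.remove('/tmp/some-file-that-may-not-exist')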

I appreciate any feedback, including frank statements that you'd
rather not trifle with such nonsense.

~Oren

[1] http://www.google.com/codesearch?q=except%3A\spass&hl=en
[2] http://www.google.com/codesearch?hl=en&lr=&q=silence+lang%3Apy
-- 
http://mail.python.org/mailman/listinfo/python-list


Re Interest check in some delicious syntactic sugar for "except:pass"

2010-03-03 Thread Oren Elrad
To all that responded, thanks for the prompt response folks, your
criticisms are well taken. Coming from C-land, one is inculcated with
the notion that if the programmer wants to shoot himself in the foot
the language ought not to prevent that (or even should return him a
loaded magnum with the safety off and the hair-trigger pulled). My
apologies for not immediately grokking the cultural difference in
pytown.

With that said, let me at least offer a token defense of my position.
By way of motivation, I wrote that email after copying/pasting the
following a few times around a project until I wrote it into def
SilentlyDelete() and its cousin SilentlyRmdir()

""" code involving somefile """
try:
os.remove(somefile)
except:
...pass # The bloody search indexer has got the file and I
can't delete it. Nothing to be done.

Certainly the parade of horribles (bad files! corrupt data! syntax
errors!) is a tad melodramatic. Either os.remove() succeeds or it
doesn't and the execution path (in the estimation of this programmer,
at least) is not at all impacted by whether it succeeds or fails. I
know with certainty at compile time what exceptions might be raised
and what the consequences of passing them are and there is no sense
pestering the user or sweating over it. Nor can I see the logic, as
was suggested, in writing "except OSError:", since it is (seems to me)
mere surplusage -- it neither causes a semantic difference in the way
the program runs nor provides anything useful to the reader.

Now, perhaps this is a special case that is not nearly special enough
to warrant its own syntactic sugar, I granted that much, but >30,000
examples in Google Code cannot be considered to be a complete corner
case either. Briefly skimming those results, most of them seem to be
of this flavor, not the insane programmer that wants to write
"silence: CommitDBChangesEmailWifeAndAdjustBankAccount()" nor novices
that aren't aware of what they might be ignoring.

At any rate (and since this is far more words than I had intended), I
want to reiterate that the criticism is well-taken as a cultural
matter. I just don't want everyone to think I'm bloody insane or that
I'm not aware this is playing with fire. Maybe we can put it in a module
"YesImSureJustBloodyDoItAlreadyGoddamnit" that prints an ASCII skull
and crossbones to the console when imported? :-P

~ Oren

PS. I did like Dave's suggestion that one might want to write
"silence Type1 Type2:" which I suppose goes a long way towards
alleviating the concern that the programmer doesn't know what he's
missing. Doesn't quite meet my desire (both syntaxes would be nice, of
course) to avoid the verbiage involved with explaining to the compiler
(or the next reader) something that it knows well enough by now (or
ought to know, at least).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: why del is not a function or method?

2017-10-16 Thread Oren Ben-Kiki
That doesn't explain why `del` isn't a method though. Intuitively,
`my_dict.delete(some_key)` makes sense as a method. Of course, you could
also make the same case for `len` being a method... and personally I think
it would have been cleaner that way in both cases. But it is a minor issue,
if at all.

I guess the answer is a combination of "historical reasons" and "Guido's
preferences"?


On Mon, Oct 16, 2017 at 6:58 PM, Stefan Ram  wrote:

> Xue Feng  writes:
> >I wonder why 'del' is not a function or method.
>
>   Assume,
>
> x = 2.
>
>   When a function »f« is called with the argument »x«,
>   this is written as
>
> f( x )
>
>   . The function never gets to see the name »x«, just
>   its boundee (value) »2«. So, it cannot delete the
>   name »x«.
>
>   Also, the function has no access to the scope of »x«,
>   and even more so, it cannot make any changes in it.
>
>   Therefore, even a call such as
>
> f( 'x' )
>
>   will not help much.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: why del is not a function or method?

2017-10-16 Thread Oren Ben-Kiki
True... technically, "Deletion of a name removes the binding of that name
from the local or global namespace". Using `x.del()` can't do that.

That said, I would hazard to guess that `del x` is pretty rare (I have
never felt the need for it myself). Ruby doesn't even have an equivalent
operation, and doesn't seem to suffer as a result. If Python used methods
instead of global functions for `len` and `del`, and provided a
`delete_local_variable('x')` for these rare cases, that could have been a
viable solution.

So I still think it was a matter of preference rather than a pure technical
consideration. But that's all second-guessing, anyway. You'd have to ask
Guido what his reasoning was...


On Mon, Oct 16, 2017 at 7:36 PM, Ned Batchelder 
wrote:

> On 10/16/17 12:16 PM, Oren Ben-Kiki wrote:
>
>> That doesn't explain why `del` isn't a method though. Intuitively,
>> `my_dict.delete(some_key)` makes sense as a method. Of course, you could
>> also make the same case for `len` being a method... and personally I think
>> it would have been cleaner that way in both cases. But it is a minor
>> issue,
>> if at all.
>>
>> I guess the answer is a combination of "historical reasons" and "Guido's
>> preferences"?
>>
>
> It would still need to be a statement to allow for:
>
> del x
>
> since "x.del()" wouldn't affect the name x, it would affect the value x
> refers to.
>
> --Ned.
>
>
>>
>> On Mon, Oct 16, 2017 at 6:58 PM, Stefan Ram 
>> wrote:
>>
>> Xue Feng  writes:
>>>
>>>> I wonder why 'del' is not a function or method.
>>>>
>>>Assume,
>>>
>>> x = 2.
>>>
>>>When a function »f« is called with the argument »x«,
>>>this is written as
>>>
>>> f( x )
>>>
>>>. The function never gets to see the name »x«, just
>>>its boundee (value) »2«. So, it cannot delete the
>>>name »x«.
>>>
>>>Also, the function has no access to the scope of »x«,
>>>and even more so, it cannot make any changes in it.
>>>
>>>Therefore, even a call such as
>>>
>>> f( 'x' )
>>>
>>>will not help much.
>>>
>>> --
>>> https://mail.python.org/mailman/listinfo/python-list
>>>
>>>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: why del is not a function or method?

2017-10-16 Thread Oren Ben-Kiki
The first line says "The major reason is history." :-) But it also gives an
explanation: providing functionality for types that, at the time, didn't
have methods.

On Mon, Oct 16, 2017 at 8:33 PM, Lele Gaifax  wrote:

> Oren Ben-Kiki  writes:
>
> > So I still think it was a matter of preference rather than a pure
> technical
> > consideration. But that's all second-guessing, anyway. You'd have to ask
> > Guido what his reasoning was...
>
> A rationale is briefly stated in the design FAQs, see
> https://docs.python.org/3/faq/design.html#why-does-python-
> use-methods-for-some-functionality-e-g-list-index-
> but-functions-for-other-e-g-len-list
> and the next one.
>
> ciao, lele.
> --
> nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
> real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
> l...@metapensiero.it  | -- Fortunato Depero, 1929.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does __ne__ exist?

2018-01-08 Thread Oren Ben-Kiki
I don't see a case in IEEE where (x == y) != !(x != y).
There _is_ a case where (x != x) is true (when x is NaN), but for such an
x, (x == x) will be false.

I am hard pressed to think of a case where __ne__ is actually useful.

That said, while it is true you only need one of (__eq__, __ne__), you
could make the same claim about (__lt__, __ge__) and (__le__, __gt__).
That is, in principle you could get by with only (__eq__, __le__, and
__ge__) or, if you prefer, (__ne__, __lt__, __gt__), or any other
combination you prefer.
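
For what it's worth, Python 3 already derives != from == by default: the
default __ne__ negates __eq__ unless __eq__ returns NotImplemented, which
is why defining __eq__ alone is usually enough. A small illustration:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

print(Point(1, 2) != Point(1, 2))   # False - derived from __eq__
print(Point(1, 2) != Point(3, 4))   # True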

Or you could go the way C++ is going and say that _if_ one specifies a
single __cmp__ method, it should return one of LT, EQ, GT, and this will
automatically give rise to all the comparison operators.

"Trade-offs... trade-offs as far as the eye can see" ;-)


On Mon, Jan 8, 2018 at 4:01 PM, Thomas Nyberg  wrote:

> On 01/08/2018 12:36 PM, Thomas Jollans wrote:
> >
> > Interesting sentence from that PEP:
> >
> > "3. The == and != operators are not assumed to be each other's
> > complement (e.g. IEEE 754 floating point numbers do not satisfy this)."
> >
> > Does anybody here know how IEE 754 floating point numbers need __ne__?
>
> That's very interesting. I'd also like an answer to this. I can't wrap
> my head around why it would be true. I've just spent 15 minutes playing
> with the interpreter (i.e. checking operations on 0, -0, 7,
> float('nan'), float('inf'), etc.) and then also reading a bit about IEEE
> 754 online and I can't find any combination of examples where == and !=
> are not each others' complement.
>
> Cheers,
> Thomas
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does __ne__ exist?

2018-01-08 Thread Oren Ben-Kiki
Ugh, right, for NaN (x < y) and (x >= y) are not each other's complement -
both would be false if one of x and y is a NaN.

But __ne__ is still useless ;-)

On Mon, Jan 8, 2018 at 4:36 PM, Thomas Nyberg  wrote:

> On 01/08/2018 03:25 PM, Oren Ben-Kiki wrote:
> > I am hard pressed to think of a case where __ne__ is actually useful.
>
> Assuming you're talking about a case specifically for IEEE 754, I'm
> starting to agree. In general, however, it certainly is useful for some
> numpy objects (as mentioned elsewhere in this thread).
>
> > That said, while it is true you only need one of (__eq__, __ne__), you
> > could make the same claim about (__lt__, __ge__) and (__le__, __gt__).
> > That is, in principle you could get by with only (__eq__, __le__, and
> > __ge__) or, if you prefer, (__ne__, __lt__, __gt__), or any other
> > combination you prefer.
>
> This isn't true for IEEE 754. For example:
>
> >>> float('nan') < 0
> False
> >>> float('nan') > 0
> False
> >>> float('nan') == 0
> False
>
> Also there are many cases where you don't have a < b OR a >= b. For
> example, subsets don't follow this.
>
> > "Trade-offs... trafe-offs as far as the eye can see" ;-)
>
> Yes few things in life are free. :)
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does __ne__ exist?

2018-01-08 Thread Oren Ben-Kiki
Good points. Well, this is pretty academic at this point - I don't think
anyone would seriously choose to obsolete __ne__, regardless of whether it
is absolutely necessary or not.

On Mon, Jan 8, 2018 at 4:51 PM, Thomas Jollans  wrote:

> On 2018-01-08 15:25, Oren Ben-Kiki wrote:
> > I don't see a case in IEEE where (x == y) != !(x != y).
> > There _is_ a case where (x != x) is true (when x is NaN), but for such an
> > x, (x == x) will be false.
> >
> > I am hard pressed to think of a case where __ne__ is actually useful.
>
> See my earlier email and/or PEP 207. (tl;dr: non-bool return values)
>
> >
> > That said, while it is true you only need one of (__eq__, __ne__), you
> > could make the same claim about (__lt__, __ge__) and (__le__, __gt__).
> > That is, in principle you could get by with only (__eq__, __le__, and
> > __ge__) or, if you prefer, (__ne__, __lt__, __gt__), or any other
> > combination you prefer.
>
> PEP 207: "The above mechanism is such that classes can get away with not
> implementing either __lt__ and __le__ or __gt__ and __ge__."
>
>
> >
> > Or you could go where C++ is doing and say that _if_ one specifies a
> single
> > __cmp__ method, it should return one of LT, EQ, GT, and this will
> > automatically give rise to all the comparison operators.
>
> This used to be the case. (from version 2.1 to version 2.7, AFAICT)
>
>
> >
> > "Trade-offs... trafe-offs as far as the eye can see" ;-)
> >
> >
> > On Mon, Jan 8, 2018 at 4:01 PM, Thomas Nyberg  wrote:
> >
> >> On 01/08/2018 12:36 PM, Thomas Jollans wrote:
> >>>
> >>> Interesting sentence from that PEP:
> >>>
> >>> "3. The == and != operators are not assumed to be each other's
> >>> complement (e.g. IEEE 754 floating point numbers do not satisfy this)."
> >>>
> >>> Does anybody here know how IEE 754 floating point numbers need __ne__?
> >>
> >> That's very interesting. I'd also like an answer to this. I can't wrap
> >> my head around why it would be true. I've just spent 15 minutes playing
> >> with the interpreter (i.e. checking operations on 0, -0, 7,
> >> float('nan'), float('inf'), etc.) and then also reading a bit about IEEE
> >> 754 online and I can't find any combination of examples where == and !=
> >> are not each others' complement.
> >>
> >> Cheers,
> >> Thomas
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Behavior of auto in Enum and Flag.

2017-04-02 Thread Oren Ben-Kiki
The current behavior of `auto` is to pick a value which is one plus the
previous value.

It would probably be better if `auto` instead picked a value that is not
used by any named member (either the minimal unused value, or the minimal
unused value higher than the previous value). That is, in this simple case:

class MyEnum(Enum):
FOO = 1
BAR = auto()
BAZ = 2

It would be far better for BAR to get the value 3 rather than today's value
2.
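
To make the conflict concrete (Python 3.6 - BAZ silently becomes an alias
of BAR instead of being a distinct member):

from enum import Enum, auto

class MyEnum(Enum):
    FOO = 1
    BAR = auto()   # picks 2 = previous value + 1 ...
    BAZ = 2        # ... so BAZ collides and becomes an alias of BAR

print(list(MyEnum))               # [<MyEnum.FOO: 1>, <MyEnum.BAR: 2>]
print(MyEnum.BAZ is MyEnum.BAR)   # True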

In the less simple case of:

class MyEnum(Enum):
FOO = 2
BAR = auto()
BAZ = 3

Then BAR could be either 1 or 4 - IMO, 1 would be better, but 4 works as
well.

After all, `auto` is supposed to be used when:

"If the exact value is unimportant you may use auto instances and an
appropriate value will be chosen for you."

Choosing a value that conflicts with BAZ in above cases doesn't seem
"appropriate" for a value that is "unimportant".

The docs also state "Care must be taken if you mix auto with other values."
- fair enough. But:

First, why require "care" if the code can take care of the issue for us?

Second, the docs don't go into further detail about what exactly to avoid.
In particular, the docs do not state that the automatic value will only
take into account the previous values, and will ignore following values.

However, this restriction is baked into the current implementation:
It is not possible to just override `_generate_next_value_` to skip past
named values which were not seen yet, because the implementation only
passes it the list of previous values.

I propose that:

1. The documentation will be more explicit about the way `auto` behaves in
the presence of following values.

2. The default behavior of `auto` would avoid generating a conflict with
following values.

3. Whether `auto` chooses (A) the minimal unused value higher than the
previous value, or (B) the minimal overall unused value, or (C) some other
strategy, would depend on the specific implementation.

4. To allow for this, the implementation will include a
`_generate_auto_value_` which will take both the list of previous ("last")
values (including auto values) and also a second list of the following
("next") values (excluding auto values).

5. If the class implements `_generate_next_value_`, then
`_generate_auto_value_` will invoke `_generate_next_value_` with the
concatenation of both lists (following values first, preceding values
second), to maximize compatibility with existing code.

Thanks,

Oren Ben-Kiki
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Behavior of auto in Enum and Flag.

2017-04-02 Thread Oren Ben-Kiki
While "the current behaviour is compliant with what the docs say" is true,
saying "as such, I would be disinclined to change the code" misses the
point.

The current documentation allows for multiple behaviors. The current
implementation has chosen to add an arbitrary undocumented restriction
on the behavior, which has a usability issue. Even worse, for no clear
reason, the current implementation forces _all_ implementations to suffer
from the same usability issue.

The proposed behavior is _also_ compliant with the current documentation,
and does not suffer from this usability issue. The proposed implementation
is compatible with existing code bases, and allows for "any" other
implementation to avoid this issue.

That is, I think that instead of enshrining the current implementation's
undocumented and arbitrary restriction, by explicitly adding it to the
documentation, we should instead remove this arbitrary restriction from the
implementation, and only modify the documentation to clarify this
restriction is gone.

Oren.

On Mon, Apr 3, 2017 at 8:38 AM, Chris Angelico  wrote:

> On Mon, Apr 3, 2017 at 2:49 PM, Oren Ben-Kiki 
> wrote:
> > "If the exact value is unimportant you may use auto instances and an
> > appropriate value will be chosen for you."
> >
> > Choosing a value that conflicts with BAZ in above cases doesn't seem
> > "appropriate" for a value that is "unimportant".
> >
> > The docs also state "Care must be taken if you mix auto with other
> values."
> > - fair enough. But:
> >
> > First, why require "care" if the code can take care of the issue for us?
> >
> > Second, the docs don't go into further detail about what exactly to
> avoid.
> > In particular, the docs do not state that the automatic value will only
> > take into account the previous values, and will ignore following values.
>
> Sounds to me like the current behaviour is compliant with what the
> docs say, and as such, I would be disinclined to change the code.
> Perhaps a documentation clarification would suffice?
>
> """Care must be taken if you mix auto with other values. In
> particular, using auto() prior to explicitly-set values may result in
> conflicts."""
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Behavior of auto in Enum and Flag.

2017-04-03 Thread Oren Ben-Kiki
On Mon, Apr 3, 2017 at 11:03 AM, Ethan Furman  wrote:

> Python code is executed top-down.  First FOO, then BAR, then BAZ.  It is
> not saved up and executed later in random order.  Or, put another way, the
> value was appropriate when it was chosen -- it is not the fault of auto()
> that the user chose a conflicting value (hence why care should be taken).


This is not to say that there's no possible workaround for this - the code
could pretty easily defer invocation of _generate_next_value_ until after
the whole class was seen. It would still happen in order (since members are
an ordered dictionary these days).

So it is a matter of conflicting values - what would be more "Pythonic":
treating auto as executed immediately, or avoiding conflicts between auto
and explicit values.


> 1. The documentation will be more explicit about the way `auto` behaves in
> the presence of following values.



> I can do that.
>

Barring changing the way auto works, that would be best ("explicit is
better than implicit" and all that ;-)


> 2. The default behavior of `auto` would avoid generating a conflict with
>> following values.
>>
>
> I could do that, but I'm not convinced it's necessary, plus there would be
> backwards compatibility constraints at this point.


"Necessity" depends on the judgement call above.

As for backward compatibility, the docs are pretty clear about "use auto
when you don't care about the value"... and Enum is pretty new, so there's
not _that_ much code that relies on "implementation specific" details.

*If* backward compatibility is an issue here, then the docs might as well
specify "previous value plus 1, or 1 if this is the first value" as the
"standard" behavior, and be done.

This has the advantage of being deterministic and explicit, so people would
be justified in relying on it. It would still have to be accompanied by
saying "auto() can only consider previous values, not following ones".


> This might work for you (untested):
>
> def _generate_next_value_(name, start, count, previous_values):
> if not count:
> return start or 1
> previous_values.sort()
> last_value = previous_values[-1]
> if last_value < 1000:
> return 1001
> else:
> return last_value + 1


This assumes no following enum values have values > 1000 (or some
predetermined constant), which doesn't work in my particular case, or in
the general case. But yes, this might solve the problem for some people.


> 3. To allow for this, the implementation will include a
>> `_generate_auto_value_` which will take both the list of previous ("last")
>> values (including auto values) and also a second list of the following
>> ("next") values (excluding auto values).
>>
>
> No, I'm not interested in doing that.  I currently have that kind of code
> in aenum[1] for 2.7 compatibility, and it's a nightmare to maintain.
>

Understood. Another alternative would be to have something like
_generate_next_value_ex_ with the additional argument (similar to
__reduce_ex__), which isn't ideal either.

Assuming you buy into my "necessity" claim, that is...

Thanks,

Oren.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Behavior of auto in Enum and Flag.

2017-04-03 Thread Oren Ben-Kiki
On Mon, Apr 3, 2017 at 7:43 PM, Chris Angelico  wrote:

> Here's a counter-example that supports the current behaviour:
>
> >>> from enum import IntFlag, auto
> >>> class Spam(IntFlag):
> ... FOO = auto()
> ... BAR = auto()
> ... FOOBAR = FOO | BAR
> ... SPAM = auto()
> ... HAM = auto()
> ... SPAMHAM = SPAM | HAM
> ...
>

Ugh, good point - I didn't consider that use case, I see how it would be
nasty to implement.

I guess just improving the documentation is called for, then...

Thanks,

Oren.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Appending data to a json file

2017-04-03 Thread Oren Ben-Kiki
You _can_ just extend a JSON file without loading it, but it will not be
"fun".

Say the JSON file contains a top-level array. The final significant
character in it would be a ']'. So, you can read just a reasonably-sized
block from the end of the file, find the location of the final ']',
overwrite it with a ',' followed by your additional array entry/entries,
with a final ']'.

If the JSON file contains a top-level object, the final significant
character would be a '}'. Overwrite it with a ',' followed by your
additional object key/value pairs, with a final '}'.

Basically, if what you want to append is of the same kind as the content of
the file (array appended to array, or object to object):

- Locate final significant character in the file
- Locate first significant character in your appended data, replace it with
a ','
- Overwrite the final significant character in the file with your patched
data

It isn't elegant or very robust, but if you want to append to a very large
JSON array (for example, some log file?), then it could be very efficient
and effective.
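
A rough sketch of those steps for the top-level-array case (it assumes
the file already holds a non-empty array, `new_items` is a non-empty
list, and the closing ']' sits within the last 64 bytes of the file):

import json
import os

def append_to_json_array(path, new_items):
    patch = json.dumps(new_items)           # e.g. '[{"a": 1}, {"b": 2}]'
    with open(path, 'r+b') as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - 64))           # read a small tail block
        tail = f.read()
        closing = tail.rindex(b']')         # final significant ']'
        f.seek(size - len(tail) + closing)
        # overwrite the ']' with ',' + the new entries + ']'
        f.write(b',' + patch[1:].encode('utf-8'))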

Or, you could use YAML ;-)


On Tue, Apr 4, 2017 at 8:31 AM, dieter  wrote:

> Dave  writes:
>
> > I created a python program that gets data from a user, stores the data
> > as a dictionary in a list of dictionaries.  When the program quits, it
> > saves the data file.  My desire is to append the new data to the
> > existing data file as is done with purely text files.
>
> Usually, you cannot do that:
> "JSON" stands for "JavaScript Object Notation": it is a text representation
> for a single (!) JavaScript object. The concatenation of two
> JSON representations is not a valid JSON representation.
> Thus, you cannot expect that after such a concatenation, a single
> call to "load" will give you back complete information (it might
> be that a sequence of "load"s works).
>
> Personally, I would avoid concatenated JSON representations.
> Instead, I would read in (i.e. "load") the existing data,
> construct a Python object from the old and the new data (likely in the form
> of a list) and then write it out (i.e. "dump") again.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Difference in behavior of GenericMeta between 3.6.0 and 3.6.1

2017-07-16 Thread Oren Ben-Kiki
TL;DR: We need improved documentation of the way meta-classes behave for
generic classes, and possibly reconsider the way "__setattr__" and
"__getattribute__" behave for such classes.

I am using meta-programming pretty heavily in one of my projects.
It took me a while to figure out the dance between meta-classes and generic
classes in Python 3.6.0.

I couldn't find good documentation for any of this (if anyone has a good
link, please share...), but with a liberal use of "print" I managed to
reverse engineer how this works. The behavior isn't intuitive but I can
understand the motivation (basically, "type annotations shall not change
the behavior of the program").

For the uninitiated:

* It turns out that there are two kinds of instances of generic classes:
the "unspecialized" class (basically ignoring type parameters), and
"specialized" classes (created when you write "Foo[Bar]", which know the
type parameters, "Bar" in this case).

* This means the meta-class "__new__" method is called sometimes to create
the unspecialized class, and sometimes to create a specialized one - in the
latter case, it is called with different arguments...

* No object is actually an instance of the specialized class; that is, the
"__class__" of an instance of "Foo[Bar]" is actually the unspecialized
"Foo" (which means you can't get the type parameters by looking at an
instance of a generic class).

So far, so good, sort of. I implemented my meta-classes to detect whether
they are creating a "specialized" or "unspecialized" class and behave
accordingly.

However, these meta-classes stopped working when switching to Python 3.6.1.
The reason is that in Python 3.6.1, a "__setattr__" implementation was
added to "GenericMeta", which redirects the setting of an attribute of a
specialized class instance to set the attribute of the unspecialized class
instance instead.

This causes code such as the following (inside the meta-class) to behave in
a mighty confusing way:

if is_not_specialized:
    cls._my_attribute = False
else:  # Is specialized:
    cls._my_attribute = True
    assert cls._my_attribute  # Fails!

As you can imagine, this caused us some wailing and gnashing of teeth,
until we figured out (1) that this was the problem and (2) why it was
happening.

Looking into the source code in "typing.py", I see that I am not the only
one who had this problem. Specifically, the implementers of the "abc"
module had the exact same problem. Their solution was simple: the
"GenericMeta.__setattr__" code explicitly tests whether the attribute name
starts with "_abc_", in which case it maintains the old behavior.

Obviously, I should not patch the standard library typing.py to preserve
"_my_attribute". My current workaround is to derive from GenericMeta,
define my own "__setattr__", which preserves the old behavior for
"_my_attribute", and use that instead of the standard GenericMeta
everywhere.
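
In rough outline, the workaround looks like this (Python 3.6.1; the
"_my_" prefix is just this project's convention, and since typing is
provisional the details may well change):

from typing import GenericMeta

class MyGenericMeta(GenericMeta):
    def __setattr__(cls, name, value):
        if name.startswith('_my_'):
            # keep the pre-3.6.1 behavior: set the attribute on this
            # (possibly specialized) class object itself
            type.__setattr__(cls, name, value)
        else:
            # everything else keeps GenericMeta's redirection to the
            # unspecialized class
            super().__setattr__(name, value)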

My code now works in both 3.6.0 and 3.6.1. However, I think the following
points are worth fixing and/or discussion:

* This is a breaking change, but it isn't listed in
https://www.python.org/downloads/release/python-361/ - it should probably
be listed there.

* In general it would be good to have some documentation on the way that
meta-classes and generic classes interact with each other, as part of the
standard library documentation (apologies if it is there and I missed it...
link?)

* I'm not convinced the new behavior is a better default. I don't recall
seeing a discussion about making this change, possibly I missed it (link?)

* There is a legitimate need for the old behavior (normal per-instance
attributes). For example, it is needed by the "abc" module (as well as my
project). So, some mechanism should be recommended (in the documentation)
for people who need the old behavior.

* Separating between "really per instance" attributes and "forwarded to the
unspecialized instance" attributes based on their prefix seems to violate
"explicit is better than implicit". For example, it would have been
explicit to say "cls.__unspecialized__.attribute" (other explicit
mechanisms are possible).

* Perhaps the whole notion of specialized vs. unspecialized class instances
needs to be made more explicit in the GenericMeta API...

* Finally, and IMVHO most importantly, it is *very* confusing to override
"__setattr__" and not override "__getattribute__" to match. This gives rise
to code like "cls._foo = True; assert cls._foo" failing. This feels
wrong. And presumably fixing the implementation so that
"__getattribute__" forwards the same set of attributes to the
"unspecialized" instance wouldn't break any code... other than code that
is already broken due to the new functionality, that is.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Difference in behavior of GenericMeta between 3.6.0 and 3.6.1

2017-07-16 Thread Oren Ben-Kiki
Yes, it sort-of makes sense... I'll basically re-post my question there.

Thanks for the link!

Oren.


On Sun, Jul 16, 2017 at 4:29 PM, Peter Otten <__pete...@web.de> wrote:

> Oren Ben-Kiki wrote:
>
> > TL;DR: We need improved documentation of the way meta-classes behave for
> > generic classes, and possibly reconsider the way "__setattr__" and
> > "__getattribute__" behave for such classes.
>
> The typing module is marked as "provisional", so you probably have to live
> with the incompatibilities.
>
> As to your other suggestions/questions, I'm not sure where the actual
> discussion is taking place -- roughly since the migration to github python-
> dev and bugs.python.org are no longer very useful for outsiders to learn
> what's going on.
>
> A random walk over the github site found
>
> https://github.com/python/typing/issues/392
>
> Maybe you can make sense of that?
>
> Personally, I'm not familiar with the evolving type system and still
> wondering whether I should neglect or reject...
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list