Re: No trees in the stdlib?

2009-06-25 Thread João Valverde

Aahz wrote:

In article ,
Tom Reed   wrote:
  
Why no trees in the standard library, if not as a built in? I searched 
the archive but couldn't find a relevant discussion. Seems like a 
glaring omission considering the batteries included philosophy, 
particularly balanced binary search trees. No interest, no good 
implementations, something other reason? Seems like a good fit for the 
collections module. Can anyone shed some light?



What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree implementations
available from either PyPI or the Cookbook.
  
A hash table is very different to a BST.  They are both useful. The 
bisect module I'm not familiar with, I'll have to look into that, thanks.


I have found pyavl on the web, it does the job ok, but there no 
implementations for python3 that I know of.


Simple example usage case: Insert string into data structure in sorted 
order if it doesn't exist, else retrieve it.


--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-25 Thread João Valverde

João Valverde wrote:

Aahz wrote:

In article ,
Tom Reed   wrote:
 
Why no trees in the standard library, if not as a built in? I 
searched the archive but couldn't find a relevant discussion. Seems 
like a glaring omission considering the batteries included 
philosophy, particularly balanced binary search trees. No interest, 
no good implementations, something other reason? Seems like a good 
fit for the collections module. Can anyone shed some light?



What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree 
implementations

available from either PyPI or the Cookbook.
  
A hash table is very different to a BST.  They are both useful. The 
bisect module I'm not familiar with, I'll have to look into that, thanks.


I have found pyavl on the web, it does the job ok, but there no 
implementations for python3 that I know of.


Simple example usage case: Insert string into data structure in sorted 
order if it doesn't exist, else retrieve it.


Crap, sorry about the mixed identities, I'm not using my own machine. 
The original post is mine also.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-25 Thread João Valverde

João Valverde wrote:

Aahz wrote:

In article ,
Tom Reed   wrote:
 
Why no trees in the standard library, if not as a built in? I 
searched the archive but couldn't find a relevant discussion. Seems 
like a glaring omission considering the batteries included 
philosophy, particularly balanced binary search trees. No interest, 
no good implementations, something other reason? Seems like a good 
fit for the collections module. Can anyone shed some light?



What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree 
implementations

available from either PyPI or the Cookbook.
  
A hash table is very different to a BST.  They are both useful. The 
bisect module I'm not familiar with, I'll have to look into that, thanks.


I have found pyavl on the web, it does the job ok, but there no 
implementations for python3 that I know of.
The main problem with pyavl by the way is that it doesn't seem to be 
subclassable (?). Besides some interface glitches, like returning None 
on delete if I recall correctly.


There's also rbtree, which I didn't try. And I think that's it. On the 
whole not a lot of choice and not as practical for such a common data 
structure.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

João Valverde wrote:

Aahz wrote:

In article ,
Tom Reed   wrote:
 
Why no trees in the standard library, if not as a built in? I 
searched the archive but couldn't find a relevant discussion. Seems 
like a glaring omission considering the batteries included 
philosophy, particularly balanced binary search trees. No interest, 
no good implementations, something other reason? Seems like a good 
fit for the collections module. Can anyone shed some light?



What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree 
implementations

available from either PyPI or the Cookbook.
  
A hash table is very different to a BST.  They are both useful. The 
bisect module I'm not familiar with, I'll have to look into that, thanks.


I have found pyavl on the web, it does the job ok, but there no 
implementations for python3 that I know of.


Simple example usage case: Insert string into data structure in sorted 
order if it doesn't exist, else retrieve it.


After browsing the bisect module I don't think it is the complete 
answer. Please correct me if I'm mistaken but...


Ignoring for a moment that subjectively I feel this is not very pythonic 
for my use case, if I get back the insertion position, doesn't that mean 
I have to go over on average N/2 items on a linked list to insert the 
item in position? Maybe less for sophisticated implementations but still 
O(n)? That doesn't compare favorably to the cost of (possibly) having to 
rebalance a tree on insertion.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

Jason Scheirer wrote:

On Jun 25, 10:32 pm, a...@pythoncraft.com (Aahz) wrote:
  

In article ,
Tom Reed   wrote:





Why no trees in the standard library, if not as a built in? I searched
the archive but couldn't find a relevant discussion. Seems like a
glaring omission considering the batteries included philosophy,
particularly balanced binary search trees. No interest, no good
implementations, something other reason? Seems like a good fit for the
collections module. Can anyone shed some light?
  

What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree implementations
available from either PyPI or the Cookbook.
--
Aahz (a...@pythoncraft.com)   <*>http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha



...And heapq is more-or-less an emulation of a tree structure in its
underlying model. I once wrote a binary sorted tree data structure for
Python in C. It performed anywhere from 15-40% worse than dicts. I
think with optimization it will only perform 10% worse than dicts at
best.

Oh hey maybe that is why trees aren't an emphasized part of the
standard. They are going to be much slower than the ultra-optimized
dicts already in the standard lib.
  

But a dict can't be used to implement a (sorted) table ADT.
--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

Stefan Behnel wrote:

João Valverde wrote:
  

Besides some interface glitches, like returning None
on delete if I recall correctly.



That's actually not /that/ uncommon. Operations that change an object are
not (side-effect free) functions, so it's just purity if they do not have a
return value.

Although practicality beats purity, sometimes... ;)

Stefan
  
I didn't know that. But in this case I think purity gets pummeled every 
time :) It's still not making sense to force a lookup to fetch something 
before deleting (another lookup operation). If that were imposed by 
python's internal machinery I'd suggest fixing that instead.


--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

João Valverde wrote:

Stefan Behnel wrote:

João Valverde wrote:
 

Besides some interface glitches, like returning None
on delete if I recall correctly.



That's actually not /that/ uncommon. Operations that change an object 
are
not (side-effect free) functions, so it's just purity if they do not 
have a

return value.

Although practicality beats purity, sometimes... ;)

Stefan
  
I didn't know that. But in this case I think purity gets pummeled 
every time :) It's still not making sense to force a lookup to fetch 
something before deleting (another lookup operation). If that were 
imposed by python's internal machinery I'd suggest fixing that instead.


To be clear what I mean by that is that it's just reference passing so 
it can't generate programmatic errors.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

João Valverde wrote:

João Valverde wrote:

Aahz wrote:

In article ,
Tom Reed   wrote:
 
Why no trees in the standard library, if not as a built in? I 
searched the archive but couldn't find a relevant discussion. Seems 
like a glaring omission considering the batteries included 
philosophy, particularly balanced binary search trees. No interest, 
no good implementations, something other reason? Seems like a good 
fit for the collections module. Can anyone shed some light?



What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree 
implementations

available from either PyPI or the Cookbook.
  
A hash table is very different to a BST.  They are both useful. The 
bisect module I'm not familiar with, I'll have to look into that, 
thanks.


I have found pyavl on the web, it does the job ok, but there no 
implementations for python3 that I know of.


Simple example usage case: Insert string into data structure in 
sorted order if it doesn't exist, else retrieve it.


After browsing the bisect module I don't think it is the complete 
answer. Please correct me if I'm mistaken but...


Ignoring for a moment that subjectively I feel this is not very 
pythonic for my use case, if I get back the insertion position, 
doesn't that mean I have to go over on average N/2 items on a linked 
list to insert the item in position? Maybe less for sophisticated 
implementations but still O(n)? That doesn't compare favorably to the 
cost of (possibly) having to rebalance a tree on insertion.
I was indeed corrected on this shameful blunder, but in my defense array 
based lists are implementation dependent and the case for trees can be 
made on deletion.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

Aahz wrote:

In article <006078f0$0$9721$c3e8...@news.astraweb.com>,
Steven D'Aprano   wrote:
  
Hash tables (dicts) are useful for many of the same things that trees are 
useful for, but they are different data structures with different 
strengths and weaknesses, and different uses. To argue that we don't need 
trees because we have dicts makes as much sense as arguing that we don't 
need dicts because we have lists.



The problem is that trees are like standards: there are so many to choose
from.  Each has its own tradeoffs, and because Python dicts and lists can
substitute for many of the traditionals uses of trees, the tradeoffs are
less clear.  I think nobody would object to adding trees to the standard
library, but it will certainly require a clear PEP, preferably with a
pointer to an existing PyPI library that has acquired real-world use.


  

Wish I had asked this before this year's GSoC started.

What's lacking is an associative array that preserves ordering, doesn't 
require a hash function and has fast insertions and deletions in 
O(log(n)). The particular algorithm to achieve this is a secondary 
issue. It's a BST for sure, AVL vs RBT vs something else. It's my fault 
for having opened the topic with simply "trees" instead, it would have 
avoided this vagueness problem, but I'm genuinely surprised to know 
there are no data structures that efficiently support such a common need 
in Python. And I really enjoy the using this language.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

greg wrote:

João Valverde wrote:

What's lacking is an associative array that preserves ordering, 
doesn't require a hash function and has fast insertions and deletions 
in O(log(n)).


Careful here -- you can't get away from the need for
hashability just by using a tree. Even if you don't
need to actually hash the values, it's still important
that the criterion for ordering between objects doesn't
change while they're in the tree, otherwise they'll
be in the wrong place and won't be found by subsequent
lookups.


I'm aware :) Technically it's necessary to define a total ordering on 
the set of keys.


> I'm genuinely surprised to know
there are no data structures that efficiently support such a common 
need in Python.


Is it really all that common? If it truly were common,
there probably *would* be something for it in the
stdlib by now.

Obviously my experience differs, but those were my expectations.


What sort of things are you doing that you want such
a structure for? Maybe we can suggest a way of using
the existing data structures to achieve the same
goal.

To answer the question of what I need the BSTs for, without getting into 
too many boring details it is to merge and sort IP blocklists, that is, 
large datasets of ranges in the form of (IP address, IP address, 
string). Originally I was also serializing them in a binary format (but 
no more after a redesign). I kept the "merge and sort" part as a helper 
script, but that is considerably simpler to implement.


Please note that I'm happy with my code, it works well. I intended to 
implement it in C all along, even before trying Python. The python code 
was a side thing for testing/curiosity/fun. It prompted my original 
question but that was really about Python and the standard library 
itself, and I don't wish to waste anyone's time.


As an anecdotal data point (honestly not trying to raise the "Python is 
slow" strawman), I implemented the same algorithm in C and Python, using 
pyavl. Round numbers were 4 mins vs 4 seconds, against Python (plus 
pyavl). Even considering I'm a worse Python programmer than C 
programmer, it's a lot. I know many will probably think I tried to do "C 
in Python" but that's not the case, at least I don' t think so. Anyway 
like I said, not really relevant to this discussion.


--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

João Valverde wrote:

greg wrote:

João Valverde wrote:

What's lacking is an associative array that preserves ordering, 
doesn't require a hash function and has fast insertions and 
deletions in O(log(n)).


Careful here -- you can't get away from the need for
hashability just by using a tree. Even if you don't
need to actually hash the values, it's still important
that the criterion for ordering between objects doesn't
change while they're in the tree, otherwise they'll
be in the wrong place and won't be found by subsequent
lookups.


I'm aware :) Technically it's necessary to define a total ordering on 
the set of keys.


> I'm genuinely surprised to know
there are no data structures that efficiently support such a common 
need in Python.


Is it really all that common? If it truly were common,
there probably *would* be something for it in the
stdlib by now.

Obviously my experience differs, but those were my expectations.


What sort of things are you doing that you want such
a structure for? Maybe we can suggest a way of using
the existing data structures to achieve the same
goal.

To answer the question of what I need the BSTs for, without getting 
into too many boring details it is to merge and sort IP blocklists, 
that is, large datasets of ranges in the form of (IP address, IP 
address, string). Originally I was also serializing them in a binary 
format (but no more after a redesign). I kept the "merge and sort" 
part as a helper script, but that is considerably simpler to implement.
Crap, this sentence is totally confusing. I meant kept the merge code as 
a helper script and moved the rest to C, see next paragraph.


Please note that I'm happy with my code, it works well. I intended to 
implement it in C all along (for a system daemon), even before trying 
Python. The python code was a side thing for testing/curiosity/fun. It 
prompted my original question but that was really about Python and the 
standard library itself, and I don't wish to waste anyone's time.


As an anecdotal data point (honestly not trying to raise the "Python 
is slow" strawman), I implemented the same algorithm in C and Python, 
using pyavl. Round numbers were 4 mins vs 4 seconds, against Python 
(plus pyavl). Even considering I'm a worse Python programmer than C 
programmer, it's a lot. I know many will probably think I tried to do 
"C in Python" but that's not the case, at least I don' t think so. 
Anyway like I said, not really relevant to this discussion.




--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

Aahz wrote:

In article ,
=?ISO-8859-1?Q?Jo=E3o_Valverde?=   wrote:
  
What's lacking is an associative array that preserves ordering, doesn't 
require a hash function and has fast insertions and deletions in 
O(log(n)). The particular algorithm to achieve this is a secondary 
issue. It's a BST for sure, AVL vs RBT vs something else. It's my fault 
for having opened the topic with simply "trees" instead, it would have 
avoided this vagueness problem, but I'm genuinely surprised to know 
there are no data structures that efficiently support such a common need 
in Python. And I really enjoy the using this language.



Why AVL/RBT instead of B*?  It's not that simple  Another problem is
that unless the tree is coded in C, the constant factors are going to
swamp algorithmic complexity for many use cases -- that in turn makes it
more difficult to deploy a PyPI library for real-world testing.
  


I wouldn't consider anything other than C for such a module on 
efficiency alone, unless it was a prototype of course. But I have little 
knowledge about the Python C API.


About B* trees, again not an expert but I don't see how the added 
complexity brings any noticeable advantage to implement the associative 
array data structure I mentioned. Simple is better.



Anyway, I'm *not* trying to discourage you, just explain some of the
roadblocks to acceptance that likely are why it hasn't already happened.

If you're serious about pushing this through, you have two options:

* Write the code and corresponding PEP yourself (which leads to the
second option, anyway)

* Lobby on the python-ideas mailing list
  


Currently I don't have a strong need for this. I just believe it would 
be a benefit to a language I like a lot.  Lobbying isn't my thing. I'd 
rather write code, but neither am I the most qualified person for the 
job. It would certainly be interesting and fun and challenging in a good 
way and a great way to learn some new stuff. But I would definitely need 
mentoring or asking some silly questions on the mailing list. Maybe I'll 
seriously consider it some other time.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-26 Thread João Valverde

João Valverde wrote:

Aahz wrote:

In article ,
=?ISO-8859-1?Q?Jo=E3o_Valverde?=   wrote:   
Anyway, I'm *not* trying to discourage you, just explain some of the

roadblocks to acceptance that likely are why it hasn't already happened.

If you're serious about pushing this through, you have two options:

* Write the code and corresponding PEP yourself (which leads to the
second option, anyway)

* Lobby on the python-ideas mailing list
  


Currently I don't have a strong need for this. I just believe it would 
be a benefit to a language I like a lot.  Lobbying isn't my thing. I'd 
rather write code, but neither am I the most qualified person for the 
job. It would certainly be interesting and fun and challenging in a 
good way and a great way to learn some new stuff. But I would 
definitely need mentoring or asking some silly questions on the 
mailing list. Maybe I'll seriously consider it some other time.
There's also another issue raise by Paul Rubin I wasn't even aware of, 
that the LGPL is not suitable for the standard library. Having to write 
a complete BST implementation in C is a drag. There are already good C 
libraries available.

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-27 Thread João Valverde

alex23 wrote:

João Valverde  wrote:
  

Currently I don't have a strong need for this.



And clearly neither has anyone else, hence the absence from the
stdlib. As others have pointed out, there are alternative approaches,
and plenty of recipes on ActiveState, which seem to have scratched
whatever itch there is for the data structure you're advocating.
  


Propose such alternative then. There are none that offer the same 
performance. At best they're workarounds.


I don't care about recipes. That's called research.

If people don't find it to be useful, that's fine. Surprising, but fine.

And I don't have a need because I'm not using Python for my project. If 
I wanted to I couldn't, without implementing myself or porting to Python 
3 a basic computer science data structure.



While Python's motto is "batteries included" I've always felt there
was an implicit "but not the kitchen sink" following it. Just because
something "could" be useful shouldn't be grounds for inclusion. That's
what pypi & the recipes are for. Ideally, little should be created
wholesale for the stdlib, what should be added are the existing 3rd
party modules that have become so ubiquitous that their presence on
any python platform is just expected.
  


Agreed.
--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-27 Thread João Valverde

Miles Kaufmann wrote:

João Valverde wrote:
To answer the question of what I need the BSTs for, without getting 
into too many boring details it is to merge and sort IP blocklists, 
that is, large datasets of ranges in the form of (IP address, IP 
address, string). Originally I was also serializing them in a binary 
format (but no more after a redesign). I kept the "merge and sort" 
part as a helper script, but that is considerably simpler to implement.


...

As an anecdotal data point (honestly not trying to raise the "Python 
is slow" strawman), I implemented the same algorithm in C and Python, 
using pyavl. Round numbers were 4 mins vs 4 seconds, against Python 
(plus pyavl). Even considering I'm a worse Python programmer than C 
programmer, it's a lot. I know many will probably think I tried to do 
"C in Python" but that's not the case, at least I don' t think so. 
Anyway like I said, not really relevant to this discussion.


What format were you using to represent the IP addresses?  (Is it a 
Python class?)  And why wouldn't you use a network address/subnet mask 
pair to represent block ranges?  (It seems like being able to 
represent ranges that don't fit into a subnet's 2^n block wouldn't be 
that common of an occurrence, and that it might be more useful to make 
those ranges easier to manipulate.)
I was using a bytes subclass. I'm not free to choose CIDR notation, 
range boundaries must be arbitrary.




One of the major disadvantages of using a tree container is that 
usually multiple comparisons must be done for every tree operation.  
When that comparison involves a call into Python bytecode (for custom 
cmp/lt methods) the cost can be substantial.  Compare that to Python's 
hash-based containers, which only need to call comparison methods in 
the event of hash collisions (and that's hash collisions, not hash 
table bucket collisions, since the containers cache each object's hash 
value).  I would imagine that tree-based containers would only be 
worth using with objects with comparison methods implemented in C.
I would flip your statement and say one of the advantages of using trees 
is that they efficiently keep random input sorted. Obviously no 
algorithm can do that with single comparisons. And not requiring a hash 
function is a desirable quality for non-hashable objects. There's a 
world beyond dicts. :)


I profiled the code and indeed the comparisons dominated the execution 
time. Trimming the comparison function to the bare minimum, a single 
python operation, almost doubled the program's speed.




Not that I'm trying to be an apologist, or reject your arguments; I 
can definitely see the use case for a well-implemented, fast 
tree-based container for Python.  And so much the better if, when you 
need one, there was a clear consensus about what package to use (like 
PIL for image manipulation--it won't meet every need, and there are 
others out there, but it's usually the first recommended), rather than 
having to search out and evaluate a dozen different ones.



Thanks, and I'm not trying to start a religion either. ;)

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-28 Thread João Valverde

Paul Rubin wrote:

a...@pythoncraft.com (Aahz) writes:
  

(In particular, WRT the bisect module, although insertion and deletion
are O(N), the constant factor for doing a simple memory move at C speed
swamps bytecode until N gets very large -- and we already have
collections.deque() for some other common use cases.)



Again, at least in my case, I'd hope for an immutable structure.

  
Could you clarify what you mean by immutable? As in... not mutable? As 
in without supporting insertions and deletions? That's has the same 
performance as using binary search on a sorted list. What's the point of 
using a tree for that?

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-28 Thread João Valverde

Paul Rubin wrote:

João Valverde  writes:
  

Could you clarify what you mean by immutable? As in... not mutable? As
in without supporting insertions and deletions?  



Correct.  

  

That's has the same performance as using binary search on a sorted
list.  What's the point of using a tree for that?



The idea is you can accomplish the equivalent of insertion or deletion
by allocating a new root, along with the path down to the place you
want to insert, i.e. O(log n) operations.  So instead of mutating an
existing tree, you create a new tree that shares most of its structure
with the old tree, and switch over to using the new tree.  This
trivially lets you maintain snapshots of old versions of the tree,
implement an "undo" operation, have a background thread do a complex
operation on a snapshot while the foreground thread does any number of
update-and-replace operations, etc.


  
Interesting, thanks. The concept is not difficult to understand but I'm 
not sure it would be preferable. A copy operation should have the same 
cost as a "snapshot", undo is kind of redundant and multithreading... 
don't see a compelling use that would justify it. Also the interface is 
a mapping so it'd be rather nice to emulate dict's.


Have you considered how the syntax would work in Python by the way? This:

new_tree = old_tree.insert(object)

Just looks wrong. The interface should support non functional idioms too.
--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-28 Thread João Valverde

João Valverde wrote:

Paul Rubin wrote:

João Valverde  writes:
 

Could you clarify what you mean by immutable? As in... not mutable? As
in without supporting insertions and deletions?  


Correct. 
 

That's has the same performance as using binary search on a sorted
list.  What's the point of using a tree for that?



The idea is you can accomplish the equivalent of insertion or deletion
by allocating a new root, along with the path down to the place you
want to insert, i.e. O(log n) operations.  So instead of mutating an
existing tree, you create a new tree that shares most of its structure
with the old tree, and switch over to using the new tree.  This
trivially lets you maintain snapshots of old versions of the tree,
implement an "undo" operation, have a background thread do a complex
operation on a snapshot while the foreground thread does any number of
update-and-replace operations, etc.


  
Interesting, thanks. The concept is not difficult to understand but 
I'm not sure it would be preferable. A copy operation should have the 
same cost as a "snapshot", undo is kind of redundant and 
multithreading... don't see a compelling use that would justify it. 
Also the interface is a mapping so it'd be rather nice to emulate dict's.


Have you considered how the syntax would work in Python by the way? This:

new_tree = old_tree.insert(object)



Heh, that's a poor example for a mapping. But:

bst[key] = object

is even dicier for immutable structures no?
--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-28 Thread João Valverde

Paul Rubin wrote:

João Valverde  writes:
  

Interesting, thanks. The concept is not difficult to understand but
I'm not sure it would be preferable. A copy operation should have the
same cost as a "snapshot", 



You mean a deep-copy?  That is unnecessarily expensive; with a
functional structure you can snapshot (or copy) by copying a single
pointer.

  


Shallow copy...


undo is kind of redundant and multithreading... don't see a
compelling use that would justify it. 



Here is one:
 http://groups.google.com/group/comp.lang.python/msg/1fbe66701e4bc65b

  


I just skimmed that but if someone really needs multithreading for such 
intensive processing without wanting a database, fair enough I guess.



Have you considered how the syntax would work in Python by the way? This:
new_tree = old_tree.insert(object)
Just looks wrong. 



It looks fine to me.  Obviously you could support a wrapper with
a mutating slot that holds a pointer to the tree.
  
I didn't get the last part, sorry. But I think you'd have a lot of users 
annoyed that the interface is similar to a list yet their objects 
mysteriously disappear. To me, tree.insert() implies mutability but I 
defer that to others like yourself with more experience in Python than me.


--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-28 Thread João Valverde

João Valverde wrote:

Paul Rubin wrote:

João Valverde  writes:
 

Interesting, thanks. The concept is not difficult to understand but
I'm not sure it would be preferable. A copy operation should have the
same cost as a "snapshot", 


You mean a deep-copy?  That is unnecessarily expensive; with a
functional structure you can snapshot (or copy) by copying a single
pointer.

  


Shallow copy...


Actually I meant whatever that snapshot operation is, minus the 
insertion, if that makes sense.


--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-29 Thread João Valverde

João Valverde wrote:

Paul Rubin wrote:

João Valverde  writes:
 

Interesting, thanks. The concept is not difficult to understand but
I'm not sure it would be preferable. A copy operation should have the
same cost as a "snapshot", 


You mean a deep-copy?  That is unnecessarily expensive; with a
functional structure you can snapshot (or copy) by copying a single
pointer.

  


Shallow copy...


undo is kind of redundant and multithreading... don't see a
compelling use that would justify it. 


Here is one:
 http://groups.google.com/group/comp.lang.python/msg/1fbe66701e4bc65b

  


I just skimmed that but if someone really needs multithreading for 
such intensive processing without wanting a database, fair enough I 
guess.


Have you considered how the syntax would work in Python by the way? 
This:

new_tree = old_tree.insert(object)
Just looks wrong. 


It looks fine to me.  Obviously you could support a wrapper with
a mutating slot that holds a pointer to the tree.
  
I didn't get the last part, sorry. But I think you'd have a lot of 
users annoyed that the interface is similar to a list yet their 
objects mysteriously disappear. To me, tree.insert() implies 
mutability but I defer that to others like yourself with more 
experience in Python than me.




Rereading this I got what you meant by "wrapper with mutating slot".  
But that is (like I think you implied) functionally equivalent to a 
mutating data structure, with worse performance because of additional 
memory allocation and such. Is it faster to rebalance the tree with a 
persistent data structure?

--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-06-29 Thread João Valverde

alex23 wrote:

João Valverde  wrote:
  

Currently I don't have a strong need for this.



And clearly neither has anyone else, hence the absence from the
stdlib. As others have pointed out, there are alternative approaches,
and plenty of recipes on ActiveState, which seem to have scratched
whatever itch there is for the data structure you're advocating.

  


I can't resist quoting Sedgewick here. Then I'll shut up about it.

[quote=http://www.cs.princeton.edu/~rs/talks/LLRB/LLRB.pdf]
Abstract
The red-black tree model for implementing balanced search trees, 
introduced by Guibas and Sedge-
wick thirty years ago, is now found throughout our computational 
infrastructure. Red-black trees
are described in standard textbooks and are the underlying data 
structure for symbol-table imple-
mentations within C++, Java, Python, BSD Unix, and many other modern 
systems.

[/quote]

You'd think so, but no. You should correct him that in Python a balanced 
search tree is the useless cross between a dict and a database.


--
http://mail.python.org/mailman/listinfo/python-list


Re: No trees in the stdlib?

2009-07-01 Thread João Valverde

Lawrence D'Oliveiro wrote:
In message , João 
Valverde wrote:


  

Simple example usage case: Insert string into data structure in sorted
order if it doesn't exist, else retrieve it.



the_set = set( ... )

if str in the_set :
... "retrieval" case ...
else :
the_set.add(str)
#end if

Want sorted order?

sorted(tuple(the_set))

What could be simpler?

  


Try putting that inside a loop with thousands of iterations and you'll 
see what the problem is.

--
http://mail.python.org/mailman/listinfo/python-list


Re: ISO library ref in printed form

2009-07-08 Thread João Valverde

kj wrote:

Does anyone know where I can buy the Python library reference in
printed form?  (I'd rather not print the whole 1200+-page tome
myself.)  I'm interested in both/either 2.6 and 3.0.

TIA!

kj
  
Why not download the documentation, take it to a local copy shop and 
have it printed and bound there?


http://docs.python.org/download.html
http://docs.python.org/3.1/download.html
--
http://mail.python.org/mailman/listinfo/python-list