Re: [Pharo-users] [GSoC blog post] Binary Search Trees

Ben Coman Wed, 10 Jul 2019 22:10:20 -0700

On Thu, 11 Jul 2019 at 00:53, Alexandre Bergel via Pharo-users
<pharo-users@lists.pharo.org> wrote:
>
> Hello Smiljana,
>
> Thanks for having written down this document. I am not expert in algorithm, 
> so I would consider myself a simple user. I have developed complex software 
> for some times and I have never seen the need of having a binary search tree. 
> I guess this is probably partly because of my lack of expertise in binary 
> search tree and partly because experts in binary search trees assume that 
> people know what it is about and in what it is useful.
>
> My question is, when should a programmer ever need to use binary search tree? 
> Can you add some examples on what these trees are good for, and how an 
> average programmer should look into it. I think this will be a valuable and 
> easy way to expand your blog.

My "computer science" is a bit rusty but Binary Search Trees and their
generalisation to B-Trees is Databases 101.
Almost guaranteed you are "using" them with any database you use,
you're just not directly exposed to them.

If you skim "Redesigning String Data Structures to Exploit Cache"
(https://people.eng.unimelb.edu.au/jzobel/fulltext/acmjea10.pdf)
looking at the performance graphs, for "unordered searching" the best
bet for many applications is probably our hash-based Dictionary if
that covers all your needs.
BSTs are more memory efficient than Dictionarys, but less cache-friendly.

A few applications where BSTs can be useful:
* doing range searches efficiently [1]
* traversing elements in order, which is more difficult with hash tables [1] [2]
* good for "order" statistics [2] like:
- inexact search - finding closest lower and greater elements.
- determining maximum & minimum elements
- determining successor & predecessor elements
- range queries - easy to do with BSTs and not a natural operation
with Hash Tables.
* Self-Balancing BSTs guarantee operations work in O(Logn) time.
Hashing has Θ(1) is average time but some particular operations are
costly, especially when table resizing happen [2]
but also they are slower so if the data is sufficiently random
simple BSTs may do.

[1]
https://stackoverflow.com/questions/371136/binary-trees-vs-linked-lists-vs-hash-tables
[2] https://www.geeksforgeeks.org/advantages-of-bst-over-hash-table/

This side-related article...
"Number crunching: Why you should never, ever, EVER use linked-list
in your code again"

(https://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again)
indicates that for many In-Memory operations you are better using
sorted-Arrays (e.g. SequenceableCollection>>#findBinary:)
than cache-invalidating pointer-base structures like Lists & Trees -
thus the strong association of BSTs & BTrees with slower access
disk-based databases.
Also notice that additional BST features closely match common database
operations.

cheers -ben

Re: [Pharo-users] [GSoC blog post] Binary Search Trees

Reply via email to