On Thu, 11 Jul 2019 at 00:53, Alexandre Bergel via Pharo-users <pharo-users@lists.pharo.org> wrote: > > Hello Smiljana, > > Thanks for having written down this document. I am not expert in algorithm, > so I would consider myself a simple user. I have developed complex software > for some times and I have never seen the need of having a binary search tree. > I guess this is probably partly because of my lack of expertise in binary > search tree and partly because experts in binary search trees assume that > people know what it is about and in what it is useful. > > My question is, when should a programmer ever need to use binary search tree? > Can you add some examples on what these trees are good for, and how an > average programmer should look into it. I think this will be a valuable and > easy way to expand your blog.
My "computer science" is a bit rusty but Binary Search Trees and their generalisation to B-Trees is Databases 101. Almost guaranteed you are "using" them with any database you use, you're just not directly exposed to them. If you skim "Redesigning String Data Structures to Exploit Cache" (https://people.eng.unimelb.edu.au/jzobel/fulltext/acmjea10.pdf) looking at the performance graphs, for "unordered searching" the best bet for many applications is probably our hash-based Dictionary if that covers all your needs. BSTs are more memory efficient than Dictionarys, but less cache-friendly. A few applications where BSTs can be useful: * doing range searches efficiently [1] * traversing elements in order, which is more difficult with hash tables [1] [2] * good for "order" statistics [2] like: - inexact search - finding closest lower and greater elements. - determining maximum & minimum elements - determining successor & predecessor elements - range queries - easy to do with BSTs and not a natural operation with Hash Tables. * Self-Balancing BSTs guarantee operations work in O(Logn) time. Hashing has Θ(1) is average time but some particular operations are costly, especially when table resizing happen [2] but also they are slower so if the data is sufficiently random simple BSTs may do. [1] https://stackoverflow.com/questions/371136/binary-trees-vs-linked-lists-vs-hash-tables [2] https://www.geeksforgeeks.org/advantages-of-bst-over-hash-table/ This side-related article... "Number crunching: Why you should never, ever, EVER use linked-list in your code again" (https://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again) indicates that for many In-Memory operations you are better using sorted-Arrays (e.g. SequenceableCollection>>#findBinary:) than cache-invalidating pointer-base structures like Lists & Trees - thus the strong association of BSTs & BTrees with slower access disk-based databases. Also notice that additional BST features closely match common database operations. cheers -ben