The Judy algorithm

Tim Bunce Mon, 10 Mar 2003 02:36:49 -0800

I think this might be interesting to some of you...

  "Judy is a general purpose dynamic array implemented as a C callable
  library. Judy's speed and memory usage are typically better than
  other data storage models and improves with very large data sets."


http://judy.sourceforge.net/application/10minutes.htm
http://judy.sourceforge.net/application/
http://sourceforge.net/projects/judy

I've appended a few extracts from the "10minutes.htm" url given above.

Tim.

A (CPU) cache-line fill is additional time required to do a read
reference from RAM when a word is not found in cache. In today's
computers the time for a cache-line fill is in the range of 50..2000
machine instructions. Therefore a cache-line fill should be avoided
when fewer than 50 instructions can do the same job. (Modern machines
tend to pipeline writes to RAM. They often take no additional time
in the Judy design.)

Some of the reasons Judy outperforms binary trees, b-trees, and skip-lists:

    * Judy rarely compromises speed/space performance for simplicity
      (Judy will never be called simple except at the API).

    * Judy is designed to avoid cache-line fills wherever possible.
      (This is the main design criteria for Judy.)

    * A b-tree requires a search of each node (branch), resulting
      in more cache-line fills.

    * A binary-tree has many more levels (about 8X), resulting in
      more cache-line fills.

    * A skip-list is roughly equivalent to a degree-4 (4-ary) tree,
      resulting in more cache-line fills

>From a speed point of view Judy is chiefly a 256-ary digital tree
or trie (per D. Knuth Volume 3 definitions). A degree of 256-ary
is a somewhat "magic" N-ary for a variety of reasons -- but mostly
because a byte (the least addressable memory unit) is 8 bits. Also
a higher degree means reduced cache-line fills per access. You see
the theme here -- avoid cache-line fills like the plague.

The presence of a CPU cache in modern machines has changed many of
the ways to write a performance algorithm. To take advantage of a
cache, it is important to leverage as much as possible. In a Judy
tree, the presence of a cache results in 1..3 (or more) fewer
cache-line fills per lookup than would be possible without a cache.

Judy adapts efficiently to a wide range of populations and data set
densities. Since the Judy data structure is a tree of trees, each
sub-tree is a static expanse that is optimized to match the "character"
or density of the keys it contains.

Notice that to insert or delete a key is almost as simple as setting
or clearing a bit.

The Judy algorithm

Reply via email to