Re: Software Engineering Practices for Marking Algorithms?

CuppoJava Wed, 16 Mar 2011 13:22:22 -0700

Thank you for your reply Alan. I am still eager to hear for more
solutions. I would be even very happy with just a nice organizational
way of thinking about the marker fields instead of a whole new
algorithmic solution.


As for why I thought that hashtables are not appropriate for my use
case, the following is my understanding of its performance
characteristics. Please let me know if I misunderstand anything.

  - For a Hashtable to be performant, there needs to be few
collisions, otherwise you resort to iterating through a list, or tree
every time there is one.

  - In order for there to be few collisions, there are 2 approaches.
(1) Have a very very good hash function. or (2) If you don't have a
great hash function, then allocate a larger table.

  - Mark's idea of storing the numeric id in each node can be thought
of as an ideal hash function. Every distinct object maps to a distinct
integer value within a very small numeric range.

  - An ideal hash function is not typically attainable, so we deal
with non-ideal hash functions. The average loading ratio for a
hashtable, that I use, is 60/40. That is, at maximum capacity, only
60% of the slots are used.

  - Therefore, using a hashtable to implement this marking algorithm
will take up 40% more space than using the marker fields. There is
also the performance aspects of having to compute the hash function
for every access. This is typically very fast, but it still cannot
compare to a pointer deference.

That is why, I think for my problem, the marker fields are the best
way to go. What I think is still possible, however, is a nice way to
organize these fields, such that as Mark pointed out, multithreaded
stuff don't become a nightmare.

The *binary* tree as well, is just a toy example for illustrative
purposes. I agree that storing the children in an array is not a
practical implementation.

  -Patrick

On Mar 16, 3:33 pm, Alan <[email protected]> wrote:
> You are asking on the wrong list. Nobody in the Clojure list will ever
> tell you that monkey-patching and mutating your data structure is the
> right approach in order to traverse it. And that's totally fine: ask
> away, if you're willing to accept other solutions. But you've rejected
> all ideas aside from the one you had before you got here as non-
> performant, when you don't seem to have a clear understanding of the
> performance characteristics of either.
>
> Space waste? Really? You can't afford a pointer to each Node object
> during the traversal, but you can afford an extra boolean field in
> each Node, even when you're not traversing them? Hint: objects are
> typically allocated on pointer-sized boundaries, so an extra boolean
> at the end will take up as much "real" space as a whole pointer.
>
> And you have a *binary* tree, storing a Node[] instead of Node left,
> Node right? Instead of two pointers and two data objects, you're
> storing two data objects, two pointers, a pointer to Node[], and a
> length property. Throw away that extra crap and you have more than
> enough room for a temporary hashtable.
>
> On Mar 16, 11:17 am, CuppoJava <[email protected]> wrote:
>
>
>
> > It sounds like hashing is the only solution that can really compete
> > with these markers. My particular problem cannot use hashing because
> > the space waste and extra compute time is unacceptable. I'll just have
> > to be particularly careful for multithreading my app.
>
> > Thanks for the replies
> >   -Patrick
>
> > On Mar 16, 10:31 am, Armando Blancas <[email protected]>
> > wrote:
>
> > > > However, the visited field has nothing to do with the actual Node
> > > > class. It's simply for other functions to use as a marker.
>
> > > > This solution is kludgy, but I cannot see any other *performant* way
> > > > of doing this.
>
> > > I don't think markers are a kludge. Besides modeling, data structures
> > > must support stuff like performance requirements. This is no different
> > > than, say, reference counting in GC's, COM, inodes, etc.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Software Engineering Practices for Marking Algorithms?

Reply via email to