> -----Original Message-----
> From: Bruce Momjian [mailto:[EMAIL PROTECTED]]
> Sent: Friday, June 21, 2002 9:52 AM
> To: Tom Lane
> Cc: Neil Conway; [EMAIL PROTECTED]; Dann Corbit;
> [EMAIL PROTECTED]
> Subject: Re: [HACKERS] What is wrong with hashed index usage?
> 
> 
> Tom Lane wrote:
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > I remember three problems:  build time, index size, and 
> concurrency
> > > problems.  I was wondering about the equal key case 
> myself, and assumed
> > > hash may be a win there, but with the concurrency 
> problems, is that even
> > > possible?
> > 
> > Sure.  Many-equal-keys are a problem for btree whether you have any
> > concurrency or not.
> > 
> > > OK, I have reworded it.  Is that better?
> > 
> > It's better, but you've still discarded the original's 
> explicit mention
> > of concurrency problems.  Why do you want to remove information?
> 
> OK, concurrency added.  How is that?
> 
> > 
> > > How about an elog(NOTICE) for hash use?
> > 
> > I don't think that's appropriate.
> 
> I was thinking of this during CREATE INDEX ... hash:
> 
>       NOTICE:  Hash index use is discouraged.  See the CREATE INDEX
>       reference page for more information.
> 
> Does anyone else like/dislike that?

I think it might be OK temporarily, to show that there is some work that
needs done.  When hashed indexes are fixed, the notice should be
removed.

I have not looked at the hash code.  Here is a strategy (off the top of
my head) that seems that it should work:

Use Bob Jenkins' 64 bit generic hash from here (totally free for use and
fast as blazes):
http://burtleburtle.net/bob/hash/index.html

Specifically:
http://burtleburtle.net/bob/c/lookup8.c and routine: hash( k, length,
level)

Now, with a 64 bit hash, there is very tiny probability of a collision
(but you could have duplicate data).
The hash index would consist of nothing more than this:
[long long hash=64 bit hash code][unsigned nmatches=count of matching
hashes][array of {nmatches} pointers directly to the records with that
hash]

This is probably grotesqely oversimplified.  But maybe it will spur an
indea in the person who writes the indexing code.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

Reply via email to