On Mon, Jun 21, 2021 at 02:08:12PM +0100, Simon Riggs wrote:
> New chapter for Hash Indexes, designed to help users understand how
> they work and when to use them.
>
> Mostly newly written, but a few paras lifted from README when they were
> helpful.
+ <para>
+  PostgreSQL includes an implementation of persistent on-disk hash indexes,
+  which are now fully crash recoverable. Any data type can be indexed by a

I don't see any need to mention that they're "now" crash safe.

+  Each hash index tuple stores just the 4-byte hash value, not the actual
+  column value. As a result, hash indexes may be much smaller than B-trees
+  when indexing longer data items such as UUIDs, URLs etc.. The absence of

comma: URLs, etc.

+  the column value also makes all hash index scans lossy. Hash indexes may
+  take part in bitmap index scans and backward scans.

Isn't it more correct to say that it must use a bitmap scan?

+  through the tree until the leaf page is found. In tables with millions
+  of rows this descent can increase access time to data. The equivalent

rows comma

+  that hash value. When scanning a hash bucket during queries we need to

queries comma

+ <para>
+  As a result of the overflow cases, we can say that hash indexes are
+  most suitable for unique, nearly unique data or data with a low number
+  of rows per hash bucket will be suitable for hash indexes. One

The beginning and end of the sentence duplicate "suitable".

+  Each row in the table indexed is represented by a single index tuple in
+  the hash index. Hash index tuples are stored in the bucket pages, and if
+  they exist, the overflow pages.

"the overflow pages" didn't sound right, but I was confused by the comma.
I think it should say ".. in bucket pages and overflow pages, if any."

--
Justin
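
P.S. The size claim quoted above is easy to see for yourself. A rough
sketch (assumes PostgreSQL 13 or later for the built-in gen_random_uuid();
table and index names are made up for illustration):

  -- Compare on-disk size of btree vs. hash indexes on a uuid column
  CREATE TABLE uuid_test (id uuid);
  INSERT INTO uuid_test
      SELECT gen_random_uuid() FROM generate_series(1, 1000000);

  CREATE INDEX uuid_test_btree ON uuid_test USING btree (id);
  CREATE INDEX uuid_test_hash  ON uuid_test USING hash  (id);

  -- Report the size of each index
  SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
    FROM pg_class
   WHERE relname IN ('uuid_test_btree', 'uuid_test_hash');

As I understand it, the hash index should come out noticeably smaller,
since it stores only the 4-byte hash value per index tuple rather than
the 16-byte uuid.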