On Mon, Jun 21, 2021 at 02:08:12PM +0100, Simon Riggs wrote:
> New chapter for Hash Indexes, designed to help users understand how
> they work and when to use them.
>
> Mostly newly written, but a few paras lifted from README when they were
> helpful.
+ <para>
+  PostgreSQL includes an implementation of persistent on-disk hash indexes,
+  which are now fully crash recoverable. Any data type can be indexed by a

I don't see any need to mention that they're "now" crash safe.

+  Each hash index tuple stores just the 4-byte hash value, not the actual
+  column value. As a result, hash indexes may be much smaller than B-trees
+  when indexing longer data items such as UUIDs, URLs etc.. The absence of

comma: URLs, etc.

+  the column value also makes all hash index scans lossy. Hash indexes may
+  take part in bitmap index scans and backward scans.

Isn't it more correct to say that it must use a bitmap scan?

+  through the tree until the leaf page is found. In tables with millions
+  of rows this descent can increase access time to data. The equivalent

rows comma

+  that hash value. When scanning a hash bucket during queries we need to

queries comma

+ <para>
+  As a result of the overflow cases, we can say that hash indexes are
+  most suitable for unique, nearly unique data or data with a low number
+  of rows per hash bucket will be suitable for hash indexes. One

The beginning and end of the sentence duplicate "suitable".

+  Each row in the table indexed is represented by a single index tuple in
+  the hash index. Hash index tuples are stored in the bucket pages, and if
+  they exist, the overflow pages.

"the overflow pages" didn't sound right, but I was confused by the comma.
I think it should say ".. in bucket pages and overflow pages, if any."

--
Justin
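
P.S. The size claim quoted above is easy to see for yourself. A rough
sketch (assumes PostgreSQL 13 or later for the built-in gen_random_uuid();
table and index names are made up for illustration):

  -- Compare on-disk size of btree vs. hash indexes on a uuid column
  CREATE TABLE uuid_test (id uuid);
  INSERT INTO uuid_test
      SELECT gen_random_uuid() FROM generate_series(1, 1000000);

  CREATE INDEX uuid_test_btree ON uuid_test USING btree (id);
  CREATE INDEX uuid_test_hash  ON uuid_test USING hash  (id);

  -- Report the size of each index
  SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
    FROM pg_class
   WHERE relname IN ('uuid_test_btree', 'uuid_test_hash');

As I understand it, the hash index should come out noticeably smaller,
since it stores only the 4-byte hash value per index tuple rather than
the 16-byte uuid.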