On Thu, Aug 24, 2017 at 11:26 AM, Jesper Pedersen <jesper.peder...@redhat.com> wrote: > Based on the feedback in this thread, I have moved the patch to "Ready for > Committer".
Reviewing 0001: _hash_readpage gets the page LSN to see if we can apply LP_DEAD hints, but if the table is unlogged or temporary, the LSN will never change, so the test in _hash_kill_items will always think that it's OK to apply the hints. (This seems like it might be a pretty serious problem, because I'm not sure what would be a viable workaround.) The logic that tries to ensure that so->currPos.{buf,currPage,lsn} get updated is not, to my eyes, obviously correct. Generally, the logic for this stuff seems unnaturally spread out to me. For example, _hash_next() updates currPos.buf, but leaves it to _hash_readpage to set currPage and lsn. That function also sets all three fields when it advances to the next page by calling _hash_readnext(), but when it tries to advance to the next page and doesn't find one it sets buf but not currPage or lsn. It seems to me that this logic should be more centralized somehow. Can we arrange things so that at least buf, currPage, and lsn, and maybe also nextPage and prevPage, get updated at the same time and as soon after reading the buffer as possible? It would be bad if a primary bucket page's hasho_prevblkno field got copied into so->currPos.prevpage, because the value that appears for the primary bucket is not a real block number. But _hash_readpage seems like it can bring about this exact situation, because of this code: + if (!BlockNumberIsValid(opaque->hasho_nextblkno)) + prev_blkno = opaque->hasho_prevblkno; ... + so->currPos.prevPage = prev_blkno; If we're reading the primary bucket page and there are no overflow pages, hasho_nextblkno will not be valid and hasho_prevblkno won't be a real block number. Incidentally, the "if" statement in the above block of code is probably not saving anything; I would suggest for clarity that you do the assignment unconditionally (but this is just a matter of style, so I don't feel super-strongly about it). + return (so->currPos.firstItem <= so->currPos.lastItem); Can this ever return false? It looks to me like we should never reach this code unless that condition holds, and that it would be a bug if we did. If that's correct, maybe this should be an Assert() and the return statement should just return true. + buf = _hash_getbuf(rel, blkno, HASH_READ, LH_OVERFLOW_PAGE); + + /* It might not exist anymore; in which case we can't hint it. */ + if (!BufferIsValid(buf)) + return; This is dead code, because _hash_getbuf always returns a valid buffer. If there's actually a risk of the buffer disappearing, then some other handling is needed for this case. But I suspect that, since a scan always holds a pin on the primary bucket, there's actually no way for this to happen and this is just dead code. The comment in hashgetbitmap claims that _hash_first() or _hash_next() never returns dead tuples. If that were true, it would be a bug, because then scans started during recovery would return wrong answers. A more accurate statement would be something like: _hash_first and _hash_next handle eliminate dead index entries whenever scan->ignored_killed_tuples is true. Therefore, there's nothing to do here except add the results to the TIDBitmap. _hash_readpage contains unnecessary "continue" statements inside the loops. The reason that they're unnecessary is that there's no code below that in the loop anyway, so the loop is already going to go back around to the top. Whether to change this is a matter of style, so I won't complain too much if you want to leave it the way it is, but personally I'd favor removing them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers