On Tue, Apr 4, 2017 at 11:43 PM, Pavan Deolasee <pavan.deola...@gmail.com> wrote:
> Well, better than causing a deadlock ;-)

Yep.

> Lets see if we want to go down the path of blocking WARM when tuples have
> toasted attributes. I submitted a patch yesterday, but having slept over it,
> I think I made mistakes there. It might not be enough to look at the caller
> supplied new tuple because that may not have any toasted values, but the
> final tuple that gets written to the heap may be toasted.

Yes, you have to make whatever decision you're going to make here
after any toast-ing has been done.

> We could look at
> the new tuple's attributes to find if any indexed attributes are toasted,
> but that might suck as well. Or we can simply block WARM if the old or the
> new tuple has external attributes i.e. HeapTupleHasExternal() returns true.
> That could be overly restrictive because irrespective of whether the indexed
> attributes are toasted or just some other attribute is toasted, we will
> block WARM on such updates. May be that's not a problem.

Well, I think that there's some danger of whittling down this
optimization to the point where it still incurs most of the costs --
in bit-space if not in CPU cycles -- but no longer yields much of the
benefit. Even though the speed-up might still be substantial in the
cases where the optimization kicks in, if a substantial number of
users doing things that are basically pretty normal sometimes fail to
get the optimization, this isn't going to be very exciting outside of
synthetic benchmarks.

Backing up a little bit, it seems like the root of the issue here is
that, at a certain point in what was once a HOT chain, you make a WARM
update, and you make a decision about which indexes to update at that
point. Now, later on, when you traverse that chain, you need to be
able to figure out what decision you made before; otherwise, you might
make a bad decision about whether an index pointer applies to a
particular tuple.
If the index tuple is WARM, then the answer is "yes" if the heap
tuple is also WARM, and "no" if the heap tuple is CLEAR (which is an
odd antonym to WARM, but leave that aside). If the index tuple is
CLEAR, then the answer is "yes" if the heap tuple is also CLEAR, and
"maybe" if the heap tuple is WARM. In that "maybe" case, we are
trying to reconstruct the decision that we made when we did the
update. If, at the time of the update, we decided to insert a new
index entry, then the answer is "no"; if not, it's "yes".

From an integrity point of view, it doesn't really matter how we make
the decision; what matters is that we're consistent. More
specifically, if we sometimes insert a new index tuple even when the
value has not changed in any user-visible way, I think that would be
fine, provided that later chain traversals can tell that we did that.
As an extreme example, suppose that the WARM update inserted in some
magical way a bitmap of which attributes had changed into the new
tuple. Then, when we are walking the chain following a CLEAR index
tuple, we test whether the index columns overlap with that bitmap; if
they do, then that index got a new entry; if not, then it didn't. It
would actually be fine (apart from efficiency) to set extra bits in
this bitmap; extra indexes would get updated, but chain traversal
would know exactly which ones, so no problem. This is of course just
a gedankenexperiment, but the point is that as long as the insert
itself and later chain traversals agree on the rule, there's no
integrity problem. I think.

The first idea I had for an actual solution to this problem was to
make the decision as to whether to insert new index entries based on
whether the indexed attributes in the final tuple (post-TOAST) are
byte-for-byte identical with the original tuple.
If somebody injects a new compression algorithm into the system, or
just changes the storage parameters on a column, or we re-insert an
identical value into the TOAST table when we could have reused the old
TOAST pointer, then you might have some potentially-WARM updates that
end up being done as regular updates, but that's OK. When you are
walking the chain, you will KNOW whether you inserted new index
entries or not, because you can do the exact same comparison that was
done before and be sure of getting the same answer. But that's
actually not really a solution, because it doesn't work if all of the
CLEAR tuples are gone -- all you have is the index tuple and the new
heap tuple; there's no old heap tuple with which to compare.

The only other idea that I have for a really clean solution here is to
support this only for index types that are amcanreturn, and actually
compare the value stored in the index tuple with the one stored in the
heap tuple, ensuring that new index tuples are inserted whenever they
don't match and then using the exact same test to determine the
applicability of a given index pointer to a given heap tuple.

I'm not sure how viable that is either, but hopefully you see my
underlying point here: it would be OK for there to be cases where we
fall back to a non-WARM update because a logically equal value changed
at the physical level, especially if those cases are likely to be rare
in practice, but it can never be allowed to happen that chain
traversal gets confused about which indexes actually got touched by a
particular WARM update.

By the way, the "Converting WARM chains back to HOT chains" section of
README.WARM seems to be out of date. Any chance you could update that
to reflect the current state and thinking of the patch?
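
For illustration only, the applicability rule and the bitmap
gedankenexperiment described above can be sketched in a few lines of
Python. This is a toy model under stated assumptions, not how the
patch works: the names HeapTuple, IndexPointer, warm_update,
index_needs_new_entry and index_pointer_applies are hypothetical, and
the "bitmap of changed attributes" is modelled as a plain set of
column numbers.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class HeapTuple:
    # Hypothetical model: WARM vs. CLEAR flag on the heap tuple.
    is_warm: bool
    # Hypothetical bitmap recorded by the WARM update (empty for CLEAR).
    changed_attrs: FrozenSet[int] = frozenset()


@dataclass(frozen=True)
class IndexPointer:
    # Hypothetical model: WARM vs. CLEAR flag on the index tuple.
    is_warm: bool
    # Column numbers covered by the index this pointer belongs to.
    index_attrs: FrozenSet[int]


def warm_update(really_changed: Iterable[int],
                extra: Iterable[int] = ()) -> HeapTuple:
    """Create the new WARM tuple, recording which columns count as changed.

    Setting extra bits is allowed: it costs efficiency, not integrity.
    """
    return HeapTuple(True, frozenset(really_changed) | frozenset(extra))


def index_needs_new_entry(index_attrs: Iterable[int], tup: HeapTuple) -> bool:
    """Update-time rule: insert iff the index columns overlap the bitmap."""
    return bool(frozenset(index_attrs) & tup.changed_attrs)


def index_pointer_applies(ptr: IndexPointer, tup: HeapTuple) -> bool:
    """Chain-traversal rule, mirroring the yes/no/maybe cases above."""
    if ptr.is_warm:
        return tup.is_warm      # WARM pointer: yes for WARM, no for CLEAR
    if not tup.is_warm:
        return True             # CLEAR pointer, CLEAR tuple: yes
    # CLEAR pointer, WARM tuple: the "maybe" case. Re-run the exact rule
    # the update used, so both sides always agree.
    return not index_needs_new_entry(ptr.index_attrs, tup)
```

The point of the sketch is that the "maybe" branch calls the very same
function the update used, so insert and traversal cannot disagree,
even when warm_update() is given extra bits.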

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company