Hey Amit,

On Thu, Jul 9, 2020 at 12:16 AM Amit Langote <amitlangot...@gmail.com> wrote:
> By the way, what happens today if you do INSERT INTO a_zedstore_table
> ... RETURNING xmin? Do you get an error "xmin is unrecognized" or
> some such in slot_getsysattr() when trying to project the RETURNING
> list?
>

We get garbage values for xmin and cmin. If we request cmax/xmax, we get
an ERROR from slot_getsysattr()->tts_zedstore_getsysattr():

"zedstore tuple table slot does not have system attributes (except xmin
and cmin)"

A ZedstoreTupleTableSlot only stores xmin and cmin. Also,
zedstoream_insert(), which is the tuple_insert() implementation, does
not supply the xmin/cmin, which is why those values are garbage.

For context, Zedstore has its own UNDO log implementation to act as
storage for transaction information (which is intended to be replaced
with the upstream UNDO log in the future).

The above behavior is not restricted to INSERT..RETURNING right now. If
we do a select <tx_column> from foo in Zedstore, the behavior is the
same. The transaction information is never returned from Zedstore in
tableam calls that don't demand that transactional information be
used/returned. If you ask it to do a tuple_satisfies_snapshot(), OTOH,
it will use the transactional information correctly. It will also
populate TM_FailureData, which contains xmax and cmax, in the APIs
where it is demanded.

I really wonder what other AMs are doing about these issues.

I think we should either:

1. Demand transactional information from AMs for all APIs that involve
a projection of transactional information.

2. Have some other component of Postgres supply the transactional
information. This is what I think the upstream UNDO log can probably
provide.

3. (Least elegant) Transform tuple table slots into heap tuple table
slots (since it is the only kind of tuple storage that can supply
transactional info) and explicitly fill in the transactional values
depending on the context, whenever transactional information is
projected.

For this bug report, I am not sure what is right.
Perhaps, to stop the bleeding temporarily, we could use the
pi_PartitionTupleSlot and assume that the AM needs to provide the
transactional info in the respective insert AM API calls, as well as
demand a heap slot for partition roots and interior nodes. Later on,
we would need a larger effort to make all of these APIs not really
demand transactional information. Perhaps the UNDO framework will come
to the rescue.

Regards,
Soumyadeep (VMware)