On Mon, Aug 19, 2019 at 2:04 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > Currently, In UnpackedUndoRecord we store all members directly which > are set by the caller. We store pointers to some header which are > allocated internally by the undo layer and the caller need not worry > about setting them. So now you are suggesting to put other headers > also as structures in UnpackedUndoRecord. I as such don't have much > problem in doing that but I think initially Robert designed > UnpackedUndoRecord structure this way so it will be good if Robert > provides his opinion on this.
I don't believe that's what is being suggested. It seems to me that the thing Andres is complaining about here has roots in the original sketch that I did for this code. The oldest version I can find is here: https://github.com/EnterpriseDB/zheap/commit/7d194824a18f0c5e85c92451beab4bc6f044254c In this version, and I think still in the current version, there is a two-stage marshaling strategy. First, the individual fields from the UnpackedUndoRecord get copied into global variables (yes, that was my fault, too, at least in part!) which are structures. Then, the structures get copied into the target buffer. The idea of that design was to keep the code simple, but it didn't really work out, because things got a lot more complicated between the time I wrote those 3244 lines of code and the >3000 lines of code that live in this patch today. One thing that change was that we moved more and more in the direction of considering individual fields as separate objects to be separately included or excluded, whereas when I wrote that code I thought we were going to have groups of related fields that stood or fell together. That idea turned out to be wrong. (There is the even-larger question here of whether we ought to take Heikki's suggestion and make this whole thing a lot more generic, but let's start by discussing how the design that we have today could be better-implemented.) If I understand Andres correctly, he's arguing that we ought to get rid of the two-stage marshaling strategy. During decoding, he wants data to go directly from the buffer that contains it to the UnpackedUndoRecord without ever being stored in the UnpackUndoContext. During insertion, he wants data to go directly from the UnpackedUndoRecord to the buffer that contains it. Or, at least, if there has to be an intermediate format, he wants it to be just a chunk of raw bytes, rather than a bunch of individual fields like we have in UndoPackContext currently. I think that's a reasonable goal. I'm not as concerned about it as he is from a performance point of view, but I think it would make the code look nicer, and that would be good. If we save CPU cycles along the way, that is also good. In broad outline, what this means is: 1. Any field in the UndoPackContext that starts with urec_ goes away. 2. Instead of something like InsertUndoBytes((char *) &(ucontext->urec_fxid), ...) we'd write InsertUndobytes((char *) &uur->uur_fxid, ...). 3. Similarly instead of ReadUndoBytes((char *) &ucontext->urec_fxid, ...) we'd write ReadUndoBytes((char *) &uur->uur_fxid, ...). 4. It seems slightly trickier to handle the cases where we've got a structure instead of individual fields, like urec_hd. But those could be broken down into field-by-field reads and writes, e.g. in this case one call for urec_type and a second for urec_info. 5. For uur_group and uur_logswitch, the code would need to allocate those subsidiary structures before copying into them. To me, that seems like it ought to be a pretty straightforward change that just makes things simpler. We'd probably need to pass the UnpackedUndoRecord to BeginUnpackUndo instead of FinishUnpackUndo, and keep a pointer to it in the UnpackUndoContext, but that seems fine. FinishUnpackUndo would end up just about empty, maybe entirely empty. Is that a reasonable idea? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company