On 2013-08-27 11:32:30 -0400, Alvaro Herrera wrote: > Andres Freund wrote: > > The git tree is at: > > git://git.postgresql.org/git/users/andresfreund/postgres.git branch > > xlog-decoding-rebasing-cf4 > > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4 > > I gave this recently rebased branch a skim. In general, the separation > between decode.c/reorderbuffer.c/snapbuild.c seems a lot nicer now than > on previous iterations -- good job there.
Thanks for having a look! > Here are some quick notes I took while reading the patch itself. I > haven't gone through it really carefully, yet. > > > - I wonder whether DecodeCommit and DecodeAbort should really be a single > routine. Right now, the former might call the later; and the latter is > aware of this. Seems awkward. Yes, I am not happy with that either. I'll play with combining them and check whether that looks beter. > - We skip insert/update/delete if not my database Id; however, we don't skip > commit in the same case. If there are two walrecvrs on a cluster, on > different databases, does this lead to us trying to remove files > twice, if a xact commits which deleted some files? Is this a problem? > Should we try to skip such database-specific actions in global > WAL records? Hm. We should be able to skip it for long commit records at least. I think I lost that code along the unification. There's no danger of removing anything global afaics since we're not replaying using the original replay routines and all the slot/sender specific stuff has unique names. > - There's rmgr-specific knowledge in decode.c. I wonder if, similar to > redo and desc routines, that shouldn't instead be pluggable functions > for each rmgr. I don't think that's a good idea. I've quickly played with it before and it doesn't seem to end happy. It would require opening up more semi-public interfaces and in the end, we're only interested of in-core stuff. Even if it were possible to add new indexes by plugging new rmgrs, we wouldn't care. > - What's with ReorderBufferRestoreCleanup()? Shouldn't it be in logical.c? No, that's just for removing ondisk data at the end of a transaction. I'll improve the comment. > - reorderbuffer.c does several different things. Can it be split? > Perhaps in pieces such as > * stuff to manage memory (slab cache thingies) > * TXN iterator > * other logically separate parts? > * the rest Hm. I don't really see much point in splitting it along those lines. None of those really makes sense without the other parts and the file isn't *that* huge. > - Having to expose LocalExecuteInvalidationMessage() looks awkward. Is there > another way? Hm. I don't immediately see any way. We need to execute invalidation messages just within one backend. There just is no exposed functionality for that yet since it wasn't needed so far. We could expose something like LocalExecuteInvalidationMessage*s*() instead of doing the loop in reorderbuffer.c, but that's about it. > - I think we need a better name for "treat_as_catalog_table" (and > RelationIsTreatedAsCatalogTable). Maybe replication_catalog or > something similar? I think we're going to end up needing that for more than just replication, so I'd like to keep replication out of the name. I don't like the current name either though, so any other ideas? > - Don't do this: > + * RecentGlobal(Data)?Xmin is initialized to InvalidTransactionId, to > ensure that no > because later greps for RecentGlobalDataXmin and RecentGlobalXmin will > fail to find it. It seems better to spell both names, so > "RecentGlobalDataXmin and RecentGlobalXmin are initialized to ..." Ok. > - the pg_receivellog command line is strange. Apparently I need one or > more of --start,--init,--stop, but if stop, then the other two must > not be present; and if startpos, then init and stop cannot be > specified. (There's a typo there that says "cannot combine with > --start" when it really means "cannot combine with --stop", BTW). I > think this would make more sense to have init,start,stop be commands, > in pg_ctl's spirit; so there would be no double-dash. IOW > SOMEPATH/pg_receivellog --startpos=123 start > and so on. The reasoning here is somewhat complex and I am not happy with the status quo, so I like getting input here. The individual verbs mean: * init: create a replication slot * start: continue streaming in an existing replication slot * stop: remove replication slot The reason you cannot specify anything with --stop is that a) --start streams until you abort the utility. So there's no chance of running --stop after it. b) --init and --stop seems like a pointless combination since you cannot actually do anything with the slot. --init and --start combined, on the other hand are useful for testing, which is why I allow them so far, but I wouldn't have problems removing that capability. The reason you cannot combine --init or --init --start with --startpos is that --startpos has to refer to a location that could have actually streamed to the client. Before a replication slot is established the client doesn't know anything about such an address, so --init --start cannot know any useful --startpos, that's why it's forbidden to pass one. The idea behind startpos is that you can tell the server "I have replayed transactions up to this LSN" and the server will only give you only transactions that have commited after this. > Also, we need SGML docs for this new utility. And a lot more than only for this utility :( > Any particular reason for removing this line: > -/* Get a new XLogReader */ > + > extern XLogReaderState *XLogReaderAllocate(XLogPageReadCB pagereadfunc, > void *private_data); Hrmpf. Merge error. I've integrated too many different versions of too different xlogreaders ;) > I don't see the point of XLogRecordBuffer.record_data; we already have a > pointer to the XLogRecord, and the data can readily be obtained using > XLogRecGetData. So why provide the same thing twice? It seems to me > that if instead of passing the XLogRecordBuffer we just provide the > XLogRecord, and separately the "origptr" where needed, we could avoid > having to expose the XLogRecordBuffer stuff unnecessarily. By now we also need the end location of a wal record. So we have to pass three addresses around for everything which isn't very convenient. If you vastly prefer passing around three parameters I can do that, but I'd rather not. The original reason for doing so was, to be honest, that my own xlogreader's API was different... > In this comment: > + * FIXME: We need something resembling the real SnapshotNow to handle things > + * like enum lookups from indices correctly. > what do we need consider in light of the new comment proposed by Robert > ca+tgmobvtjrj_doxxq0wga1a1jlypvyqtr3m+cou_ousabn...@mail.gmail.com I did most of the code changes for this, but this made me realize that there are quite some more comments and even a function name to be adapted. Will work on that. Thanks! Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers