On Wed, Jun 1, 2011 at 2:28 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, May 26, 2011 at 12:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Alvaro Herrera <alvhe...@commandprompt.com> writes:
>>> Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:05 -0400 2011:
>>>> I think we should emit the real cause in those cases, if possible (not too much overhead). The message would be "Unlogged table content is not available in standby server".
>>
>>> I guess what it should do is create an empty file in the slave.
>>
>> Probably it should, because won't the table malfunction after the slave is promoted to master, if there's no file at all there? Or will the process of coming live create an empty file even if there was none?
>
> Coming live creates an empty file.
>
>> But Euler is pointing out a different issue, which is usability. If the slave just acts like the table is present but empty, we are likely to get bug reports about that too. An error telling you that you aren't allowed to access such a table on slaves would be more user-friendly, if we can do it without too much pain.
>
> I looked into this a bit. A few observations:
>
> (1) This problem is actually not confined to unlogged tables; temporary tables have the same issue. For example, if you create a temporary table on the master and then, on the slave, do SELECT * FROM pg_temp_3.hi_mom (or whatever the temp schema containing the table happens to be named), you get the same error. In fact, I suspect that if you took a base backup that included the temporary relation and matched the backend ID, you could even manage to read out the old contents (modulo any fun and exciting XID wraparound issues). But the problem is of course more noticeable for unlogged tables, since they're not hidden away in a special funny schema.
>
> (2) It's somewhat difficult to get a clean fix for this error, because it comes from deep down in the bowels of the system, inside mdnblocks(). At that point we haven't got a Relation available, just an SMgrRelation, so the information that the relation is unlogged is not really available, at least not without some sort of gross modularity violation like calling smgrexists() on the init fork. It doesn't particularly seem like a good idea to conditionalize the behavior of mdnblocks() on relpersistence anyway; that's a pretty gross modularity violation all by itself.
>
> (3) mdnblocks() gets called twice during a simple SELECT on a relation: once from the planner, via estimate_rel_size(), and again from the executor, via initscan(). A fairly obvious fix is to skip the call to estimate_rel_size() if we're dealing with a temporary or unlogged relation and are in recovery, and to make heap_beginscan_internal() throw a suitable error under similar circumstances (or do we want to throw the error at plan time? Tossing the error out from all the way down in estimate_rel_size() seems a bit wacky). Patch taking this approach attached.
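For concreteness, here is a rough sketch (not the attached patch) of the kind of recovery-time check point (3) describes; the helper name and its exact placement are assumptions made purely for illustration:

    /*
     * Sketch only: refuse to read a temporary or unlogged relation while
     * the server is still in recovery (i.e. running as a standby).
     * Something along these lines could be invoked from
     * heap_beginscan_internal(), per point (3).
     */
    #include "postgres.h"
    #include "access/xlog.h"        /* RecoveryInProgress() */
    #include "catalog/pg_class.h"   /* RELPERSISTENCE_PERMANENT */
    #include "utils/rel.h"          /* Relation */

    static void
    reject_nonpermanent_rel_in_recovery(Relation relation)
    {
        if (RecoveryInProgress() &&
            relation->rd_rel->relpersistence != RELPERSISTENCE_PERMANENT)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("cannot access temporary or unlogged relations during recovery")));
    }

On the planner side, the same condition could be used in get_relation_info() to skip the estimate_rel_size() call entirely, which is what the message proposes.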
> (4) It strikes me that it might be possible to address this problem a bit more cleanly by allowing mdnblocks(), smgrnblocks(), and RelationGetNumberOfBlocksInFork() to take a boolean argument indicating whether or not an error should be thrown if the underlying physical file happens not to exist. When no error is to be signaled, we simply return 0 when the main fork doesn't exist, rather than throwing an error. We could then make estimate_rel_size() use this variant, eliminating the need for an explicit test in get_relation_info(). I'm sort of inclined to go this route even though it would require touching a bit more code. We've already whacked around RelationGetNumberOfBlocks() a bit in this release (the function that formerly had that name is now RelationGetNumberOfBlocksInFork(), and RelationGetNumberOfBlocks() is now a macro that calls that function with MAIN_FORKNUM), so it doesn't seem like a bad time to get any other API changes that might be useful out of our system.
>
> Thoughts?
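Not part of the original mail, but to make the idea in point (4) concrete, here is a hypothetical sketch of the smgrnblocks() variant; the missing_ok parameter name and the changed signature are assumptions for illustration only (the real function takes no such argument):

    /*
     * Hypothetical variant of smgrnblocks() along the lines of point (4):
     * an extra flag (called "missing_ok" here purely for illustration)
     * makes a nonexistent fork read as zero blocks instead of raising
     * an error.
     */
    BlockNumber
    smgrnblocks(SMgrRelation reln, ForkNumber forknum, bool missing_ok)
    {
        /* The caller can tolerate a missing file: report an empty relation. */
        if (missing_ok && !smgrexists(reln, forknum))
            return 0;

        /*
         * Otherwise dispatch to the storage manager as before, which will
         * error out if the underlying file does not exist.
         */
        return (*(smgrsw[reln->smgr_which].smgr_nblocks)) (reln, forknum);
    }

estimate_rel_size() could then ask for the block count with missing_ok set and simply see zero blocks for a never-created unlogged or temporary relation on the standby, rather than needing an explicit persistence test in get_relation_info().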
Anyone? Anyone? Bueller?

If we don't want to gum this up with the above-mentioned cruft, the other obvious alternative here is to do nothing, and live with the non-beauty of the resulting error message.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company