Re: [HACKERS] Drastic performance loss in assert-enabled build in HEAD

Tom Lane Wed, 03 Apr 2013 14:49:56 -0700

Kevin Grittner <[email protected]> writes:
> Tom Lane <[email protected]> wrote:
>>> In fact, I'm going to go further and say that I do not like the
>>> entire concept of scannability, either as to design or
>>> implementation, and I think we should just plain rip it out.


> To be honest, I don't think I've personally seen a single use case
> for matviews where they could be used if you couldn't count on an
> error if attempting to use them without the contents reflecting a
> materialization of the associated query at *some* point in time.

Well, if we remove the WITH NO DATA clause from CREATE MATERIALIZED
VIEW, that minimum requirement is satisfied, no?

I wouldn't actually want to remove that option, because pg_dump will
need it to avoid circular-reference problems.  But if you simply don't
use it then you have the minimum guarantee.  And I do not see where
the current implementation is giving you any more guarantees.

What it *is* doing is setting a rather dubious behavioral precedent that
we will no doubt hear Robert complaining about when (not if) we decide
we don't want to be backward-compatible with it anymore.

Granting that throwing an error is actually of some use to some people,
I would not think that people would want to turn it on via a command
that throws away the existing view contents altogether, nor turn it off
with a full-throated REFRESH.  There are going to need to be ways to
incrementally update matviews, and ways to disable/enable access that
are not tied to a complete rebuild, not to mention being based on
user-determined rather than hard-wired criteria for what's too stale.
So I don't think this is a useful base to build on.

If you feel that scannability disable is an absolute must for version 0,
let's invent a matview reloption or some such to implement it and let
users turn it on and off as they wish.  That seems a lot more likely
to still be useful two years from now.  And if you're absolutely
convinced that unlogged matviews mustn't work as I suggest, we can
lose those from 9.3, too.

What I'd actually rather see us spending time on right now is making
some provision for incremental updates, which I will boldly propose
could be supported by user-written triggers on the underlying tables
if we only diked out the prohibitions against INSERT/UPDATE/DELETE on
matviews, and allowed them to operate on a matview's contents just as if
it were a table.  Now admittedly that would foreclose allowing matviews
to be updatable in the updatable-view sense, but that's a feature I
would readily give up if it meant users could build incremental update
mechanisms this year and not two years down the road.
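
To make the trigger idea above concrete, here is a minimal sketch of
trigger-maintained summary contents. It is an illustration only, not
PostgreSQL code and not part of the original message. It uses Python's
sqlite3 module as a stand-in, and an ordinary table (order_totals)
plays the role of the matview's contents, since matviews as discussed
here reject INSERT/UPDATE/DELETE. The names orders, order_totals,
build_demo, totals, and full_rebuild are all hypothetical.

```python
import sqlite3


def build_demo():
    """Create a base table plus a trigger-maintained summary table.

    Assumption: order_totals stands in for the stored contents of a
    matview defined as
        SELECT customer, sum(amount), count(*) FROM orders GROUP BY customer
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            id       INTEGER PRIMARY KEY,
            customer TEXT    NOT NULL,
            amount   INTEGER NOT NULL
        );

        CREATE TABLE order_totals (
            customer TEXT PRIMARY KEY,
            total    INTEGER NOT NULL,
            n        INTEGER NOT NULL
        );

        -- New base row: make sure a summary row exists, then add to it.
        CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
            INSERT OR IGNORE INTO order_totals VALUES (NEW.customer, 0, 0);
            UPDATE order_totals
               SET total = total + NEW.amount, n = n + 1
             WHERE customer = NEW.customer;
        END;

        -- Removed base row: subtract it, drop the summary row if empty.
        CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
            UPDATE order_totals
               SET total = total - OLD.amount, n = n - 1
             WHERE customer = OLD.customer;
            DELETE FROM order_totals
             WHERE customer = OLD.customer AND n = 0;
        END;

        -- Changed base row: treat it as a delete followed by an insert.
        CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
            UPDATE order_totals
               SET total = total - OLD.amount, n = n - 1
             WHERE customer = OLD.customer;
            DELETE FROM order_totals
             WHERE customer = OLD.customer AND n = 0;
            INSERT OR IGNORE INTO order_totals VALUES (NEW.customer, 0, 0);
            UPDATE order_totals
               SET total = total + NEW.amount, n = n + 1
             WHERE customer = NEW.customer;
        END;
        """
    )
    return conn


def totals(conn):
    """Return the incrementally maintained summary rows."""
    return conn.execute(
        "SELECT customer, total, n FROM order_totals ORDER BY customer"
    ).fetchall()


def full_rebuild(conn):
    """Recompute the summary from scratch, as a full refresh would."""
    return conn.execute(
        "SELECT customer, sum(amount), count(*) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    conn.execute("INSERT INTO orders (customer, amount) VALUES ('a', 10)")
    conn.execute("INSERT INTO orders (customer, amount) VALUES ('b', 7)")
    conn.execute("UPDATE orders SET amount = 20 WHERE id = 1")
    # The two results should agree without any full recomputation.
    print(totals(conn), full_rebuild(conn))
```

The point of the sketch is that each base-table change touches only the
affected summary row, so the stored contents stay in step with the
defining query without a complete rebuild. Whether this pattern carries
over to matviews depends on lifting the DML prohibitions discussed
above.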

> (1) On the public web site for circuit court data, visibility is
> based on supreme court rules and the advice of a committee
> consisting of judges, representatives of the press, defense
> attorneys, prosecuting attorneys, etc.  There are cases in the
> database which, for one reason or another, should not show up on
> the public web site.  On a weekly basis, where monitoring shows the
> lowest usage, the table of cases which are "too old" to be shown
> according to the rules thus determined is regenerated.  If there
> was the possibility that a dump and load could fail to fill this,
> and the queries would run without error, they could not move from
> ad hoc matview techniques to the new feature without excessive
> risk.

Why exactly do you think that what I'm suggesting would cause a dump and
reload to not regenerate the data?  (Personally, I think pg_dump ought
to have an option selecting whether or not to repopulate matviews, but
again, if that's not what you want *don't use it*.)

> (2) Individual judges have a "dashboard" showing such things as the
> number of court cases which have gone beyond certain thresholds
> without action.  They can "drill down" to detail so that cases
> which have "slipped through the cracks" can be scheduled for some
> appropriate action.  "Justice delayed..." and all of that.  It
> would be much better to get an error which would result in
> "information currently unavailable" than to give the impression
> that there are no such cases.

If you need 100% accuracy, which is what this scenario appears to be
demanding, matviews are not for you (yet).  The existing implementation
certainly is nowhere near satisfying this scenario.

> ... Making sure that
> the heap has at least one page if data has been generated seems
> like a not-entirely-unreasonable way to do that, although there
> remains at least one vacuum bug to fix if we keep it, in addition
> to Tom's concerns.

No.  This is an absolute disaster.  It's taking something we have always
considered to be an irrelevant implementation detail and making it into
user-visible DDL state, despite the fact that it doesn't begin to satisfy
basic transactional behaviors.  We *need* to get rid of that aspect of
things.  If you must have scannability state in version 0, okay, but
it has to be a catalog property, not this.

> It has the advantage of playing nicely with
> unlogged tables.  If this is not going to be what we use long term,
> do we have a clue what is?

I don't, but I don't think I'm required to propose something, given
that (a) we can certainly ship 9.3 without unlogged matviews, and
(b) you still haven't convinced me that the current semantics for
unlogged matviews are a requirement anyway.  Again, somebody who
doesn't want his matviews to go to empty on crash simply shouldn't
use an unlogged matview.

                        regards, tom lane


