On Fri, 31 May 2019 at 11:26, Ivan Voras <ivo...@gmail.com> wrote:

> Hello,
>
> The reason why we are using temp tables is to allow concurrent runs on
> some very large reports which involve creating and heavily churning dozens
> of very large tables.
>
> The problem we're facing is that if we DON'T use temp tables, the reports
> finish in a couple of hours. If we DO use temp tables, they finish in a
> couple of weeks. While using regular tables, autovacuum runs pretty much
> continuously.
>
> We'll try manually vacuuming the tables after major operations, but that's
> kind of kludgy. What would help is a GUC setting which toggles this
> behaviour.
>
> Since AFAIK temp tables are very close to unlogged ordinary tables, what
> would stop this feature from being implemented?
>
>
I do hit situations similar to this; they do NOT lead me to want to
VACUUM a lot. Rather, after any query that populates a lot of data in a
temporary table, I need to add an explicit ANALYZE, which tends to be more
than sufficient.
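
A minimal sketch of that pattern (table and column names invented, and
generate_series used so the example is self-contained):

    CREATE TEMP TABLE report_scratch AS
        SELECT g % 1000 AS customer_id, count(*) AS n_orders
        FROM   generate_series(1, 1000000) g
        GROUP  BY 1;

    -- Without this, the planner has no column statistics for the temp
    -- table, so selectivity estimates fall back to hardwired defaults.
    ANALYZE report_scratch;

    -- The Immediately Following Query now gets a sane row estimate:
    SELECT * FROM report_scratch WHERE n_orders > 990;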

It is problematic for autovacuum to do this for two reasons:

1. Frequently, the moment that the ANALYZE *needs* to be run is the instant
after the table is populated; if it waits any longer than that, it is too
late to help the Immediately Following Query that uses the freshly
populated temp table's data.

In effect, the only "safe" answer for those processes is for the SQL code
to include an ANALYZE in the crucial spot. If autovacuum tried to do this
work, it might get to the table 5 seconds too late, and thereby fail to do
the good that you need.
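
You can see the size of the problem with a quick EXPLAIN comparison on a
freshly populated temp table (a throwaway example; the exact estimates
will vary):

    CREATE TEMP TABLE t AS
        SELECT g AS id FROM generate_series(1, 1000000) g;

    EXPLAIN SELECT * FROM t WHERE id < 100;
    -- With no column statistics yet, the planner falls back to a default
    -- selectivity and estimates roughly a third of the table.

    ANALYZE t;
    EXPLAIN SELECT * FROM t WHERE id < 100;
    -- Now the histogram yields an estimate close to the actual ~99 rows.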

2. Temp tables are associated with (and visible in) only the session in
which you are doing the work. autovacuum operates inside an ordinary
session context, in a separate connection, so it can't see your temp
tables; their data lives in backend-local buffers, not in shared memory
that crosses over to the context in which those temp tables exist.
Changing that would be way more difficult than meets the eye.
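
You can observe this directly: from a second session, the temp table's
catalog entry is visible but the data is not (the pg_temp_N schema number
varies per backend, so substitute session A's actual temp schema):

    -- session A:
    CREATE TEMP TABLE scratch (x int);

    -- session B, even as a superuser:
    SELECT relname, relpersistence     -- 't' marks a temporary table
    FROM   pg_class
    WHERE  relname = 'scratch';

    SELECT * FROM pg_temp_3.scratch;
    -- ERROR:  cannot access temporary tables of other sessions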

I would actually suggest that VACUUM likely never needs to be done, as
there's something wrong if a lot of garbage is being generated in temp
tables. All that should be needed are ANALYZE requests, and it makes
sense to annotate the code with an ANALYZE any time a table is massively
updated, as in the sketch below.
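
As a pattern, that annotation might look like this inside a reporting
function (names invented; note that ANALYZE, unlike VACUUM, is allowed
inside a transaction, so it works fine in plpgsql):

    CREATE OR REPLACE FUNCTION run_report() RETURNS bigint AS $$
    DECLARE
        result bigint;
    BEGIN
        -- bulk-load a staging temp table
        CREATE TEMP TABLE stage1 ON COMMIT DROP AS
            SELECT g AS id, g % 100 AS bucket
            FROM   generate_series(1, 1000000) g;

        -- the crucial spot: refresh statistics right after the load
        ANALYZE stage1;

        SELECT count(*) INTO result FROM stage1 WHERE bucket = 7;
        RETURN result;
    END;
    $$ LANGUAGE plpgsql;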
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
