On 2020/10/02 10:06, Kyotaro Horiguchi wrote:
At Thu, 1 Oct 2020 08:14:42 +0000, "osumi.takami...@fujitsu.com" 
<osumi.takami...@fujitsu.com> wrote in
Hi, Horiguchi-San and Fujii-San.


Thank you so much both of you.
the table needs to be rewriitten. One idea for that is to improve that
command so that it skips the table rewrite if wal_level=minimal.
Of course, also you can change wal_level after marking the table as
unlogged.

tablecmd.c:
The idea is really interesting.
I didn't come up with getting rid of the whole copy of
the ALTER TABLE UNLOGGED/LOGGED commands
only when wal_level='minimal'.

* There are two reasons for requiring a rewrite when changing
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly.  On the other hand, since rewriting creates
* and assigns a new relfilenode, we automatically create or drop an
* init fork for the relation as appropriate.
Thanks for sharing concrete comments in the source code.

According to this comment, perhaps we can do that at least for
wal_level=minimal.
When I compare the 2 ideas,
one of the benefits of this ALTER TABLE 's improvement
is that we can't avoid the downtime
while that of wal_level='none' provides an easy and faster
major version up via output file of pg_dumpall.

The speedup has already been achieved with higher durability by
wal_level=minimal in that case.

I was thinking the same, i.e., wal_level=minimal + wal_skip_threshold would
speed up that initial data loading.


 Or maybe you should consider using
pg_upgrade instead.  Even inducing the time to take a backup copy of
the whole cluster, running pg_upgrade would be far faster than
pg_dumpall then loading.

Both ideas have good points.
However, actually to modify ALTER TABLE's copy
looks far more difficult than wal_level='none' and
beyond my current ability.
So, I'd like to go forward with the direction of wal_level='none'.
Did you have strong objections for this direction ?

No, I have no strong objection against your trial. But I was thinking
that it's not so easy to design and implement wal_level=none.
For example, there are some functions and commands depending on
the existence of WAL, like pg_switch_wal(), PREPARE TRANSACTION
and COMMIT PREPARED. Probably you need to define how they should
work in wal_level=none, e.g., emit an error.


For fuel(?) of the discussion, I tried a very-quick PoC for in-place
ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some
trials of several ways, I drifted to the following way after poking
several ways.

Nice!


1. Flip BM_PERMANENT of active buffers
2. adding/removing init fork
3. sync files,
4. Flip pg_class.relpersistence.

It always skips table copy in the SET UNLOGGED case,

Even in wal_level != minimal?
What happens in the standby side when SET UNLOGGED is executed without
the table rewrite in the primary? The table data should be truncated
in the standby?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply via email to