On Fri, Apr 9, 2021 at 23:49, Kohei KaiGai <kai...@heterodb.com> wrote:
>
> On Fri, Apr 9, 2021 at 22:51, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
> >
> > On 2021/04/09 12:33, Kohei KaiGai wrote:
> > > On Thu, Apr 8, 2021 at 22:14, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
> > >>
> > >> On 2021/04/08 22:02, Kohei KaiGai wrote:
> > >>>> Anyway, attached is the updated version of the patch. This is still
> > >>>> based on Kazutaka-san's latest patch. That is, the extra list for ONLY
> > >>>> is still passed to the FDW. What about committing this version first?
> > >>>> Then we can continue the discussion and change the behavior later if
> > >>>> necessary.
> > >>
> > >> Pushed! Thanks to everyone involved in this development!!
> > >> For the record, I attached the final patch I committed.
> > >>
> > >>
> > >>> Ok, it's fair enough for me.
> > >>>
> > >>> I'll try to sort out my thoughts, then raise a follow-up discussion if
> > >>> necessary.
> > >>
> > >> Thanks!
> > >>
> > >> The following are the open items and discussion points that I'm
> > >> thinking of.
> > >>
> > >> 1. Currently the extra information (TRUNCATE_REL_CONTEXT_NORMAL,
> > >> TRUNCATE_REL_CONTEXT_ONLY or TRUNCATE_REL_CONTEXT_CASCADING) about how a
> > >> foreign table was specified as the target in the TRUNCATE command is
> > >> collected and passed to the FDW. Does this really need to be passed to
> > >> the FDW? It seems Stephen, Michael and I think that's necessary, but
> > >> Kaigai-san does not. I also think that TRUNCATE_REL_CONTEXT_CASCADING
> > >> can be removed because there seems to be no use case for it.
> > >>
> > >> 2. Currently, when the same foreign table is specified multiple times in
> > >> the command, the extra information is collected only for the occurrence
> > >> found first. For example, when "TRUNCATE ft, ONLY ft" is
> > >> executed, TRUNCATE_REL_CONTEXT_NORMAL is collected and _ONLY is ignored
> > >> because "ft" is found first. Is this OK? Or should we collect all, e.g.,
> > >> both _NORMAL and _ONLY in that example? I think that
> > >> the current approach (i.e., collect the extra info about the table found
> > >> first if the same table is specified multiple times) is good because
> > >> local tables are treated the same way. But Kaigai-san does not.
> > >>
> > >> 3. Currently postgres_fdw specifies the ONLY clause in the TRUNCATE
> > >> command that it constructs. That is, if the foreign table is specified
> > >> with ONLY, postgres_fdw also issues the TRUNCATE command for the
> > >> corresponding remote table with ONLY to the remote server. Then only the
> > >> root table is truncated on the remote server, and the tables inheriting
> > >> from it are not truncated. Is this behavior desirable? It seems Michael
> > >> and I think this behavior is OK, but Kaigai-san does not.
> > >>
> > > Prior to the discussion of 1-3, I'd like to clarify the role of
> > > foreign tables.
> > > (It will likely lead to a natural conclusion for the above open items.)
> > >
> > > As the name SQL/MED (Management of External Data) suggests, a foreign
> > > table is a representation of external data in PostgreSQL.
> > > It allows us to read and (optionally) write the external data wrapped by
> > > FDW drivers, just as we usually read / write heap tables.
> > > Through the FDW APIs, the core PostgreSQL does not care about the
> > > structure, location, volume and other characteristics of
> > > the external data itself. It expects that invoking the FDW APIs will
> > > behave as if we were accessing a regular heap table.
> > >
> > > On the other hand, we can say local tables are a representation of
> > > "internal" data in PostgreSQL.
> > > A heap table consists of one or more files (one per BLCKSZ *
> > > RELSEG_SIZE), and the table-am mediates between
> > > the on-disk data and the in-memory structure (TupleTableSlot).
> > > There are no big differences in the concept. Ok?
> > >
> > > As you know, the ONLY clause controls whether the TRUNCATE command shall
> > > also run on child tables, not only on the parent.
> > > If "ONLY parent_table" is given, its child tables are not picked up by
> > > ExecuteTruncate(), unless the child tables are
> > > listed individually.
> > > Then, once ExecuteTruncate() has picked up the relations, it makes the
> > > relations empty using the table-am
> > > (relation_set_new_filenode), and the callee
> > > (heapam_relation_set_new_filenode) does not care about whether the
> > > table was specified with ONLY or not. It just makes the data
> > > represented by the table empty (in a transactional way).
> > >
> > > So, how should foreign tables behave?
> > >
> > > Once ExecuteTruncate() has picked up a foreign table according to the
> > > ONLY clause, should the FDW driver consider
> > > the context in which the foreign table was specified? And what behavior
> > > is consistent?
> > > I think that the FDW driver shall make the external data represented by
> > > the foreign table empty, regardless of its
> > > structure, location, volume and so on.
> > >
> > > Therefore, if we follow the above assumption, we don't need to inform
> > > the FDW of the context in which foreign tables are
> > > picked up (TRUNCATE_REL_CONTEXT_*), and postgres_fdw shall not adjust
> > > the remote TRUNCATE query
> > > according to the flags. It should always truncate the entire set of
> > > remote tables (if multiple) behind the foreign table.
> >
> > This makes me wonder if the information about CASCADE/RESTRICT (maybe also
> > RESTART/CONTINUE) also should not be passed to the FDW. Is that what you're
> > thinking? Or should only the ONLY clause be ignored for a foreign table?
> >
> I think the above information (DropBehavior and restart_seqs) is
> valuable to pass.
>
> The CASCADE/RESTRICT clause controls whether the truncate command also
> eliminates
> the rows that block the deletion (FKs in an RDBMS). Only the FDW driver can
> know whether the
> external data has such a "removal-blocker", thus we need to pass the
> DropBehavior to the callback.
>
> The RESTART/CONTINUE clause also controls whether the truncate command restarts
> the relevant resources that are associated with the target table
> (sequences in an RDBMS).
> Only the FDW driver can know whether the external data has relevant
> resources to reset,
> thus we need to pass "restart_seqs" to the callback.
>
> Unlike the above two parameters, the role of the ONLY clause is already
> finished at the time
> when ExecuteTruncate() has picked up the target relations, from the
> standpoint of the above
> understanding of foreign tables and external data.
>
> Thoughts?
>
Let me bring the discussion back to the design level.

If postgres_fdw (and other FDW drivers) need to consider whether the
ONLY clause is given
on their foreign tables, what does a foreign table represent in the
PostgreSQL system?

My assumption is that a foreign table provides a view of external data
that behaves as if it were a table.
The TRUNCATE command eliminates all the segment files of a table, even if
the table consists of multiple
underlying files; it never eliminates them partially.
If a foreign table is equivalent to a table at the SQL operation level,
then the ONLY clause controls
which tables are picked up by the TRUNCATE command, but never controls
which portion of
the data shall be eliminated. So, I conclude that
ExecForeignTruncate() shall eliminate the entire
external data on behalf of a foreign table, regardless of the ONLY clause.
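
As an illustration only, the proposal can be sketched in plain C, independent of the PostgreSQL source tree. This is not postgres_fdw code: sketch_build_remote_truncate() and SketchDropBehavior are hypothetical names invented for this sketch, and the relation names are assumed to be already quoted and schema-qualified. The point it shows is that the remote statement depends on the CASCADE/RESTRICT and RESTART/CONTINUE options only, and there is no input through which an ONLY context could reach it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's DropBehavior; not the real header. */
typedef enum
{
    SKETCH_DROP_RESTRICT,
    SKETCH_DROP_CASCADE
} SketchDropBehavior;

/*
 * Build the TRUNCATE statement to send to the remote side.
 *
 * Note what is absent from the argument list: there is no per-relation
 * "ONLY" flag, so the statement always targets the whole remote table.
 * Returns 0 on success, -1 if no relation is given or buf is too small.
 */
static int
sketch_build_remote_truncate(char *buf, size_t buflen,
                             const char *const *relnames, size_t nrels,
                             SketchDropBehavior behavior, bool restart_seqs)
{
    size_t      used;
    int         n;

    assert(buf != NULL && relnames != NULL);

    if (nrels == 0)
        return -1;

    n = snprintf(buf, buflen, "TRUNCATE ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* Comma-separated list of (assumed pre-quoted) remote relation names. */
    for (size_t i = 0; i < nrels; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s%s",
                     i > 0 ? ", " : "", relnames[i]);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* These two options are the ones only the FDW side can interpret. */
    n = snprintf(buf + used, buflen - used, " %s IDENTITY %s",
                 restart_seqs ? "RESTART" : "CONTINUE",
                 behavior == SKETCH_DROP_CASCADE ? "CASCADE" : "RESTRICT");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;

    return 0;
}
```

Under this sketch, "TRUNCATE ft" and "TRUNCATE ONLY ft" on the local side would produce the same remote statement for the table behind ft; which foreign tables reach the callback at all is still decided by ExecuteTruncate().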

I think it is more important to clarify this before the implementation details.
What are your opinions?

Best regards,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kai...@heterodb.com>

