Alexey-san, Sawada-san,
cc: Fujii-san,

From: Fujii Masao <masao.fu...@oss.nttdata.com>
> But if we
> implement 2PC as the improvement on FDW independently from PostgreSQL
> sharding, I think that it's necessary to support other FDW. And this is our
> direction, isn't it?

I understand it the same way as Fujii-san.  2PC for FDW is useful by itself, so I think 
we should pursue a tidy FDW interface and good performance within the FDW 
framework.  "Tidy" means that many other FDWs should be able to implement it.  
I guess XA/JTA is the only material we can use to judge whether the FDW 
interface is good.
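To make the "tidy interface" point concrete, here is a minimal sketch of what such a per-FDW transaction callback set might look like, loosely modeled on XA's xa_prepare/xa_commit/xa_rollback. All names here (FdwTransactionAPI, ToyFdw, the method names) are illustrative assumptions, not anything in Sawada-san's or Alexey-san's patch:

```python
from abc import ABC, abstractmethod

class FdwTransactionAPI(ABC):
    """Hypothetical callback set each FDW would implement; the names
    mirror XA's xa_prepare/xa_commit/xa_rollback and are assumptions."""

    @abstractmethod
    def prepare(self, xid: str) -> bool:
        """Phase 1: persist the transaction so it survives a crash,
        and vote yes (True) or no (False)."""

    @abstractmethod
    def commit_prepared(self, xid: str) -> None:
        """Phase 2: make the prepared transaction durable and visible."""

    @abstractmethod
    def rollback_prepared(self, xid: str) -> None:
        """Phase 2 alternative: discard the prepared transaction."""

class ToyFdw(FdwTransactionAPI):
    """In-memory stand-in for a remote server (no real FDW involved)."""
    def __init__(self):
        self.prepared = set()
        self.committed = set()

    def prepare(self, xid):
        self.prepared.add(xid)
        return True

    def commit_prepared(self, xid):
        self.prepared.discard(xid)
        self.committed.add(xid)

    def rollback_prepared(self, xid):
        self.prepared.discard(xid)
```

An FDW that cannot persist a prepared state (say, a file wrapper) simply could not implement prepare() honestly, which is one quick test of whether the interface is implementable beyond postgres_fdw.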


> Sawada-san's patch supports that case by implementing some components
> for that also in PostgreSQL core. For example, with the patch, all the remote
> transactions that participate in the transaction are managed by PostgreSQL
> core instead of the postgres_fdw layer.
> 
> Therefore, at least regarding the difference 2), I think that Sawada-san's
> approach is better. Thought?

I think so.  Sawada-san's patch needs to address the design issues I raised 
earlier, though, before we dig into the code for a thorough review.

BTW, is there something Sawada-san can take from Alexey-san's patch?  I'm 
concerned about the performance for practical use.  Do your two patches differ 
on the following points, for instance?  The first two items are often cited to 
evaluate a 2PC algorithm's performance, as you know.

* The number of round trips to remote nodes.
* The number of disk I/Os on each node and all nodes in total (WAL, two-phase 
file, pg_subtrans file, CLOG?).
* Are prepare and commit executed in parallel on remote nodes? (serious DBMSs 
do so)
* Is there any serialization point in the processing? (Sawada-san's has one)
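On the third bullet, the difference between serial and parallel phase execution can be sketched with toy participants. This is only an illustration of the protocol shape, assuming in-memory Node objects rather than real foreign servers issuing PREPARE TRANSACTION / COMMIT PREPARED:

```python
import concurrent.futures

class Node:
    """Toy remote node; stands in for a foreign server."""
    def __init__(self, name, will_prepare=True):
        self.name = name
        self.will_prepare = will_prepare
        self.state = "active"

    def prepare(self, xid):          # phase 1 (cf. PREPARE TRANSACTION)
        self.state = "prepared" if self.will_prepare else "aborted"
        return self.will_prepare

    def commit_prepared(self, xid):  # phase 2 (cf. COMMIT PREPARED)
        self.state = "committed"

    def rollback_prepared(self, xid):
        self.state = "aborted"

def atomic_commit(nodes, xid):
    """Issue phase 1 to all nodes concurrently; commit only if every
    node votes yes, otherwise roll back the prepared transactions.
    With a thread pool, total latency is roughly two round trips
    instead of two round trips per node."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        votes = list(pool.map(lambda n: n.prepare(xid), nodes))
        if all(votes):
            list(pool.map(lambda n: n.commit_prepared(xid), nodes))
            return True
        for node, voted in zip(nodes, votes):
            if voted:
                node.rollback_prepared(xid)
        return False
```

In a real implementation the parallelism would come from asynchronous libpq requests rather than threads, but the latency argument is the same: serial prepare/commit costs O(number of nodes) round trips per phase.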

I'm sorry to repeat myself, but I don't think we can compromise on 2PC 
performance.  Of course, we recommend that users design a schema that co-locates 
the data each transaction accesses so as to avoid 2PC, but that's not always 
possible (e.g., when secondary indexes are used.)

Plus, as the following quote from the TPC-C specification shows, TPC-C requires 
15% of (Payment?) transactions to do 2PC.  (I learned this from Microsoft's, 
CockroachDB's, or Citus Data's site.)


--------------------------------------------------
Independent of the mode of selection, the customer resident warehouse is the 
home warehouse 85% of the time and is a randomly selected remote warehouse 15% 
of the time.  This can be implemented by generating two random numbers x and y 
within [1 .. 100];

. If x <= 85 a customer is selected from the selected district number 
(C_D_ID = D_ID) and the home warehouse number (C_W_ID = W_ID).  The customer 
is paying through his/her own warehouse.

. If x > 85 a customer is selected from a random district number (C_D_ID is 
randomly selected within [1 .. 10]), and a random remote warehouse number 
(C_W_ID is randomly selected within the range of active warehouses (see 
Clause 4.2.2), and C_W_ID ≠ W_ID).  The customer is paying through a warehouse 
and a district other than his/her own.
--------------------------------------------------
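The selection rule quoted above is short enough to sketch directly. This is a plain reading of that clause, not code from any TPC-C driver; the function name and parameters are my own:

```python
import random

def choose_payment_warehouse(w_id, d_id, active_warehouses, rng=random):
    """Pick (C_W_ID, C_D_ID) per the quoted clause: x <= 85 selects the
    home warehouse and district, x > 85 selects a random remote
    warehouse (C_W_ID != W_ID) and a random district in [1 .. 10]."""
    x = rng.randint(1, 100)
    if x <= 85 or len(active_warehouses) < 2:
        # Local transaction: no remote node touched, so no 2PC.
        return w_id, d_id
    c_d_id = rng.randint(1, 10)
    c_w_id = w_id
    while c_w_id == w_id:
        c_w_id = rng.choice(active_warehouses)
    return c_w_id, c_d_id
```

Run over many Payment transactions, roughly 15% of the calls return a remote C_W_ID, and each of those is a distributed transaction that must pay the full 2PC cost, which is why that cost shows up directly in the benchmark result.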


Regards
Takayuki Tsunakawa

