On Tue, Sep 1, 2015 at 6:55 AM, Bruce Momjian <br...@momjian.us> wrote:
> I assumed these queries were going to be solved by sending data that is
> as digested as possible to the coordinator, and having the coordinator
> complete any remaining processing. I think we are going to need to
> decide if such "sending data back to shards" is something we are ever
> going to implement. I can see FDWs _not_ working well for that
> use-case.
I do think we are going to want to support that. All the people I've
talked to about parallel and distributed query processing agree that you
need to do that sort of thing to get really good and scalable
performance.

I think that we could make a lot of headway compared with the status quo
just by implementing more pushdown optimizations than we have today.
Right now, SELECT COUNT(*) FROM table will suck back the whole remote
table and count the rows locally, and that's stupid. We can fix that
case with better pushdown logic (there's a sketch of the difference at
the end of this mail). We can also fix the case of N-way join nests
where the joins are either on the partitioning key or to replicated
tables.

But suppose you have a join between two tables which are sharded across
the cluster but not on the partitioning key. There's no way to push the
join down, so all the work comes back to the coordinator, which is
possibly OK if such queries are rare, but not so hot if they are
frequent. To leverage the full CPU power of the cluster in such a case,
you need to be able to shuffle data around between the nodes: you pick
one of the two tables being joined, and based on the partitioning key of
that table, each node examines the other table and, for each row, sends
it to the machine where it MIGHT have one or more join partners. Then
each node can join its shard of the first table against the rows from
the second table that were sent to it (again, sketched at the end of
this mail).

Now the question is, where should the code that does all of this live?
postgres_fdw? Some new, sharding-specific FDW? In core? I don't know for
sure, but what I do know is that we could make a lot of progress over
where we are today just by improving postgres_fdw, and I don't think
those improvements are even all that difficult. If we decide we need to
implement something new, it's going to be a huge project that will take
years to complete, with uncertain results. I'd rather have a
postgres_fdw-based implementation that is imperfect and can't handle
some kinds of queries in 9.6 than a promise that by 9.9 we'll have
something really great that handles MPP perfectly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
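
To make the COUNT(*) case concrete, here is a rough sketch of the remote
SQL involved. The exact queries postgres_fdw deparses differ in detail,
and remote_table is just an illustrative name:

    -- Today: the shard is asked for the rows themselves, every one of
    -- which crosses the wire so the coordinator can count them locally.
    SELECT * FROM remote_table;

    -- With aggregate pushdown: the aggregate runs on the shard and a
    -- single row comes back.
    SELECT count(*) FROM remote_table;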
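
And here is one way the shuffle step for the non-partition-key join
could look, written as plain SQL against a hypothetical four-node
cluster. None of this machinery exists today: a_local, b_local, and
b_shuffled are made-up per-node tables (a sharded on a.id, b sharded on
b.id, join on a.x = b.y), and a real implementation would stream rows
between nodes rather than materializing them like this:

    -- Shuffle step, run on every node holding a shard of b: compute the
    -- slice of b destined for node 3 of 4, i.e. the rows whose join key
    -- hashes there. (One such query per target node; masking with
    -- 2147483647 keeps the hash value non-negative.)
    INSERT INTO b_shuffled
    SELECT *
    FROM b_local
    WHERE (hashtext(b_local.y::text) & 2147483647) % 4 = 3;

    -- After the shuffle, each node joins its shard of a against exactly
    -- the rows of b that could have join partners there, and the
    -- coordinator only has to concatenate the per-node results.
    SELECT a_local.id, b_shuffled.id
    FROM a_local
    JOIN b_shuffled ON a_local.x = b_shuffled.y;

Which of the two tables gets redistributed is a costing decision; all
else being equal you want to ship the smaller side.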