> On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> > Umm... I'm now missing the direction towards my goal.
> > What approach is the best way to glue PostgreSQL and PGStrom?
>
> I haven't really paid any attention to PGStrom. Perhaps it's just that I
> missed it, but I would find it useful if you could direct me towards a
> benchmark or something like that, that demonstrates a representative
> scenario in which the facilities that PGStrom offers are compelling
> compared to traditional strategies already implemented in Postgres and
> other systems.
>
Implementation of Hash-Join on the GPU side is still under development.
The only use case available right now is an alternative scan path to a
full table scan, for cases where a table contains a massive number of
records and the qualifiers are complicated enough.

The EXPLAIN (ANALYZE) output below shows a sequential scan on a table
that contains 80M records (all of them in memory; no disk access during
execution). NVIDIA's GT640 has an advantage over the single-threaded
Core i5 4570S, at least:

postgres=# explain (analyze) select count(*) from t1 where sqrt((x-20.0)^2 + (y-20.0)^2) < 10;
                                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=10003175757.67..10003175757.68 rows=1 width=0) (actual time=46648.635..46648.635 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=10000000000.00..10003109091.00 rows=26666667 width=0) (actual time=0.047..46351.567 rows=2513814 loops=1)
         Filter: (sqrt((((x - 20::double precision) ^ 2::double precision) + ((y - 20::double precision) ^ 2::double precision))) < 10::double precision)
         Rows Removed by Filter: 77486186
 Planning time: 0.066 ms
 Total runtime: 46648.668 ms
(6 rows)

postgres=# set pg_strom.enabled = on;
SET
postgres=# explain (analyze) select count(*) from t1 where sqrt((x-20.0)^2 + (y-20.0)^2) < 10;
                                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1274424.33..1274424.34 rows=1 width=0) (actual time=1784.729..1784.729 rows=1 loops=1)
   ->  Custom (GpuScan) on t1  (cost=10000.00..1207757.67 rows=26666667 width=0) (actual time=179.748..1567.018 rows=2513699 loops=1)
         Host References:
         Device References: x, y
         Device Filter: (sqrt((((x - 20::double precision) ^ 2::double precision) + ((y - 20::double precision) ^ 2::double precision))) < 10::double precision)
         Total time to load: 0.231 ms
         Avg time in send-mq: 0.027 ms
         Max time to build kernel: 1.064 ms
         Avg time of DMA send: 3.050 ms
         Total time of DMA send: 933.318 ms
         Avg time of kernel exec: 5.117 ms
         Total time of kernel exec: 1565.799 ms
         Avg time of DMA recv: 0.086 ms
         Total time of DMA recv: 26.289 ms
         Avg time in recv-mq: 0.011 ms
 Planning time: 0.094 ms
 Total runtime: 1784.793 ms
(17 rows)
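If you want to try a similar comparison yourself, a setup along these
lines should reproduce the shape of the test. The exact schema and data
generation here are illustrative, not necessarily the ones used above;
with x and y uniform over [0, 100), about 3.14% of the 80M rows fall
inside the radius-10 circle around (20, 20), which is consistent with the
row counts in the plans.

-- assumed schema: two float8 columns with uniform values in [0, 100)
CREATE TABLE t1 (id int, x float8, y float8);
INSERT INTO t1
    SELECT i, random() * 100, random() * 100
      FROM generate_series(1, 80000000) i;
ANALYZE t1;

-- run once with the stock plan, then with PG-Strom's scan path enabled
EXPLAIN (ANALYZE)
    SELECT count(*) FROM t1
     WHERE sqrt((x - 20.0)^2 + (y - 20.0)^2) < 10;
SET pg_strom.enabled = on;
EXPLAIN (ANALYZE)
    SELECT count(*) FROM t1
     WHERE sqrt((x - 20.0)^2 + (y - 20.0)^2) < 10;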
> If I wanted to make joins faster, personally, I would look at
> opportunities to optimize our existing hash joins to take better
> advantage of modern CPU characteristics. A lot of the research suggests
> that it may be useful to implement techniques that take better advantage
> of available memory bandwidth through techniques like prefetching and
> partitioning, perhaps even (counter-intuitively) at the expense of
> compute bandwidth. It's possible that it just needs to be explained to
> me, but, with respect, intuitively I have a hard time imagining that
> offloading joins to the GPU will help much in the general case. Every
> paper on joins from the last decade talks a lot about memory bandwidth
> and memory latency. Are you concerned with some specific case that I may
> have missed? In what scenario might a cost-based optimizer reasonably
> prefer a custom join node implemented by PgStrom, over any of the
> existing join node types? It's entirely possible that I simply missed
> relevant discussions here.
>
If our purpose were to consume 100% of the GPU device's capacity, memory
bandwidth would indeed be troublesome. But I'm not interested in GPU
benchmarking.
What I want to do is accelerate complicated query processing beyond what
the existing RDBMS can offer, in a way that is cheap in cost and
transparent to existing applications.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>