2014-03-04 23:10 GMT+09:00 Stephen Frost <sfr...@snowman.net>:
>> The "cache_scan" module that I and Haribabu are discussing in another
>> thread also might be a good demonstration for custom-scan interface,
>> however, its code scale is a bit larger than ctidscan.
>
> That does sound interesting though I'm curious about the specifics...
>
This module caches only a subset of the columns, not all of them, so it
can hold a much larger number of records in a given amount of RAM than
the standard buffer cache. It is built on top of the custom-scan node,
and it also uses a new hook that is called back on page vacuuming so the
module can invalidate the corresponding cache entries.
(By the way, I originally designed this module as a demonstration of the
on-vacuum hook, because I had already written ctidscan and the
postgres_fdw enhancement for the custom-scan node.)
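
As a rough illustration of the idea above (this is not the cache_scan
code), a cache that keeps only selected columns together with each row's
location, and drops entries when their page is vacuumed, could be
sketched in plain C as below. All names (ToyCache, ToyItemPointer,
toy_cache_put, and so on) are hypothetical, the fixed-size array is an
assumption made for brevity, and none of this is PostgreSQL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a tuple's physical location (block, offset). */
typedef struct {
    uint32_t block;
    uint16_t offset;
} ToyItemPointer;

/* One cached row: its location plus only the columns chosen for caching. */
typedef struct {
    bool           valid;
    ToyItemPointer ctid;
    int32_t        col_a;
    int32_t        col_b;
} ToyCacheEntry;

#define TOY_CACHE_SIZE 8

typedef struct {
    ToyCacheEntry slots[TOY_CACHE_SIZE];
} ToyCache;

/* Mark every slot as empty. */
void toy_cache_init(ToyCache *cache)
{
    assert(cache != NULL);
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++)
        cache->slots[i].valid = false;
}

/* Store the cached columns of one row; returns false if the cache is full. */
bool toy_cache_put(ToyCache *cache, ToyItemPointer ctid, int32_t a, int32_t b)
{
    assert(cache != NULL);
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++) {
        if (!cache->slots[i].valid) {
            cache->slots[i] = (ToyCacheEntry){ true, ctid, a, b };
            return true;
        }
    }
    return false;
}

/* Find the entry for a given location, or NULL if it is not cached. */
const ToyCacheEntry *toy_cache_lookup(const ToyCache *cache, ToyItemPointer ctid)
{
    assert(cache != NULL);
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++) {
        const ToyCacheEntry *e = &cache->slots[i];
        if (e->valid && e->ctid.block == ctid.block &&
            e->ctid.offset == ctid.offset)
            return e;
    }
    return NULL;
}

/*
 * Meant to be called from an on-vacuum callback: drop every entry that
 * lives on the vacuumed block. Returns the number of entries dropped.
 */
size_t toy_cache_invalidate_block(ToyCache *cache, uint32_t block)
{
    assert(cache != NULL);
    size_t dropped = 0;
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++) {
        if (cache->slots[i].valid && cache->slots[i].ctid.block == block) {
            cache->slots[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

Keeping the location in each entry is what allows both the per-page
invalidation and returning a row with the right item pointer later.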

>> > For one thing, an example where you could have this CustomScan node
>> > calling other nodes underneath would be interesting.  I realize the
>> > CTID scan can't do that directly but I would think your GPU-based
>> > system could; after all, if you're running a join or an aggregate
>> > with the GPU, the rows could come from nearly anything.  Have you
>> > considered that, or is the expectation that users will just go off
>> > and access the heap and/or whatever indexes directly, like ctidscan
>> > does?  How would such a requirement be handled?
>> >
>> In case when custom-scan node has underlying nodes, it shall be invoked
>> using ExecProcNode as built-in node doing, then it will be able to
>> fetch tuples come from underlying nodes. Of course, custom-scan
>> provider can perform the tuples come from somewhere as if it came from
>> underlying relation. It is responsibility of extension module. In some
>> cases, it shall be required to return junk system attribute, like ctid,
>> for row-level locks or table updating.
>> It is also responsibility of the extension module (or, should not add
>> custom-path if this custom-scan provider cannot perform as required).
>
> Right, tons of work to do to make it all fit together and play nice-
> what I was trying to get at is: has this actually been done?  Is the GPU
> extension that you're talking about as the use-case for this been
> written?
>
It's a chicken-and-egg problem, because the implementation of the
extension module fully depends on the interface provided by the backend.
Unlike the commit-fest, there is no deadline for my extension module, so
I put a higher priority on submitting the custom-scan node than on the
extension.
However, the GPU extension is not purely theoretical. I implemented a
prototype using the FDW APIs, and it was able to accelerate sequential
scans when the query has sufficiently complicated qualifiers.

See the movie (from 2:45). The table t1 is a regular table, and t2 is a
foreign table.
Both of them have the same contents; however, the response time of the
query is much shorter when GPU acceleration is working.
    http://www.youtube.com/watch?v=xrUBffs9aJ0

So, I'm confident that GPU acceleration will give a performance gain
once it can run on regular tables, not only on foreign tables.

> How does it handle all of the above?  Or are we going through
> all these gyrations in vain hope that it'll actually all work when
> someone tries to use it for something real?
>
I'm not talking about anything difficult. If a junk attribute requires
returning the "ctid" of the tuple, the custom-scan provider reads a tuple
from the underlying relation and includes the correct item pointer. If
the custom-scan is designed to run on the cache, all it needs to do is
reconstruct a tuple with the correct item pointer (so the cache needs to
hold the ctid as well). That is all I did in the cache_scan module.

Thanks,
-- 
KaiGai Kohei <kai...@kaigai.gr.jp>