On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:
> I've been working on a new pgstattuple function to allow
> block sampling [1] in order to reduce block reads while
> scanning a table. A PoC patch is attached.

Take a look at all of the messages linked in
https://commitfest.postgresql.org/action/patch_view?id=778

Jaime and I tried to do what you're working on back then, including a
random block sampling mechanism modeled on the stats_target mechanism.
We didn't do that as part of pgstattuple though, which was a mistake.
Noah created some test cases as part of his thorough review that were
not computing the correct results. Getting the results correct for all
of the various types of PostgreSQL tables and indexes ended up being
much harder than the sampling part. See
http://www.postgresql.org/message-id/20120222052747.ge8...@tornado.leadboat.com
in particular for that.

> This new function, pgstattuple2(), samples only 3,000 blocks
> (which accounts for 24MB) from the table randomly, and estimates
> several parameters of the entire table.

There should be an input parameter to the function controlling how much
sampling to do, and if it's possible to make its scale look like the one
ANALYZE uses, that's helpful too.
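To make that concrete, here is roughly the interface I have in mind.
Only the function name comes from your patch; the second parameter, its
default, and the zero-means-full-scan convention are just my sketch, not
anything your PoC actually implements:

  -- Sketch only: sample_blocks, its default, and the 0 = full scan
  -- convention are hypothetical, not the PoC patch's actual API.
  SELECT * FROM pgstattuple2('pgbench_accounts');         -- default, e.g. 3,000 blocks
  SELECT * FROM pgstattuple2('pgbench_accounts', 30000);  -- sample 30,000 blocks
  SELECT * FROM pgstattuple2('pgbench_accounts', 0);      -- full scan, like pgstattuple

For comparison, ANALYZE samples enough blocks to collect
300 * statistics_target rows, so exposing this knob in similar terms
would make the two easier to reason about together.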
I have a project for this summer that includes reviving this topic and
making sure it works on some real-world systems. If you want to work on
this too, I can easily combine that project into what you're doing.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com