(2011/11/18 16:25), Etsuro Fujita wrote: > Thank you for your testing. I updated the patch according to your > comments. Attached is the updated version of the patch.
I'd like to share result of my review even though it's not fully finished. So far I looked from viewpoint of API design, code formatting, and documentation. I'll examine effectiveness of the patch and details of implementation next week, and hopefully try writing ANALYZE handler for pgsql_fdw :) New patch has correct format, and it could be applied to HEAD of master branch cleanly. All regression tests including those of contrib modules have passed. It contains changes of codes and regression tests related to the issue, and they have enough comments. IMO the document in this patch is not enough to show how to write analyze handler to FDW authors, but it can be enhanced afterward. With this patch, FDW author can provide optional ANALYZE handler which updates statistics of foreign tables. Planner would be able to generate better plan by using statistics. > Yes. But in the updated version, I've refactored analyze.c a little bit > to allow FDW authors to simply call do_analyze_rel(). <snip> > The updated version enables FDW authors to just write their own > acquire_sample_rows(). On the other hand, by retaining to hook > AnalyzeForeignTable routine at analyze_rel(), higher level than > acquire_sample_rows() as before, it allows FDW authors to write > AnalyzeForeignTable routine for foreign tables on a remote server to ask > the server for its current stats instead, as pointed out earlier by Tom > Lane. IIUC, this patch offers three options to FDWs: a) set AnalyzeForeignTable to NULL to indicate lack of capability, b) provide AnalyzeForeignTable which calls do_analyze_rel with custom sample_row_acquirer, and c) create statistics data from scratch by FDW itself by doing similar things to do_analyze_rel with given argument or copying statistics from remote PostgreSQL server. ISTM that this design is well-balanced between simplicity and flexibility. Maybe these three options would suit web-based wrappers, file-based or RDBMS wrappers, and pgsql_fdw respectively. I think that adding more details of FdwRoutine, such as purpose of new callback function and difference from required ones, would help FDW authors, including me :) I have some random comments: - I think separated typedef of sample_acquire_rows would make codes more readable. In addition, parameters of the function should be described explicitly. - I couldn't see the reason why file_fdw sets ctid of sample tuples, though I guess it's for Vitter's random sampling algorithm. If every FDW must set valid ctid to sample tuples, it should be mentioned in document of AnalyzeForeignTable. Exporting some functions from analyze.c relates this issue? - Why file_fdw skips sample tuples which have NULL value? AFAIS original acquire_sample_rows doesn't do so. - Some comment lines go past 80 columns. - Some headers included in file_fdw.c seems unnecessary. Regards, -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers