Wow.. thanks guys, really appreciate the detailed analysis.

Tim
On Wed, Jul 24, 2013 at 4:08 AM, Noah Misch <n...@leadboat.com> wrote:
> On Tue, Jul 23, 2013 at 01:06:26PM +0100, Tim Kane wrote:
> > I haven't given this a lot of thought, but it struck me that when
> > rebuilding tables (be it for a restore process, or some other
> > operational activity) there is more often than not a need to build an
> > index or two, sometimes many indexes, against the same relation.
> >
> > It strikes me that in order to build just one index, we probably need
> > to perform a full table scan (in a lot of cases).  If we are building
> > multiple indexes sequentially against that same table, then we're
> > probably performing multiple sequential scans in succession, once for
> > each index.
>
> Check.
>
> > Could we architect a mechanism that allowed multiple index creation
> > statements to execute concurrently, with all of their inputs fed
> > directly from a single sequential scan against the full relation?
> >
> > From a language construct point of view, this may not be trivial to
> > implement for raw/interactive SQL - but possibly this is a candidate
> > for the custom format restore?
>
> As Greg Stark mentioned, pg_restore can already issue index build
> commands in parallel.  Where applicable, that's probably superior to
> having one backend build multiple indexes during a single heap scan.
> Index builds are CPU-intensive, and the pg_restore approach takes
> advantage of additional CPU cores in addition to possibly saving I/O.
>
> However, the pg_restore method is not applicable if you want CREATE
> INDEX CONCURRENTLY, and it's not applicable for implicit index building
> such as happens for ALTER TABLE rewrites and for VACUUM FULL.
> Backend-managed concurrent index builds could shine there.
>
> > I presume this would substantially increase the memory overhead
> > required to build those indexes, though the performance gains may be
> > advantageous.
>
> The multi-index-build should respect maintenance_work_mem overall.
> Avoiding cases where that makes concurrent builds slower than
> sequential builds is a key challenge for such a project:
>
> - If the index builds each fit in maintenance_work_mem when run
>   sequentially and some spill to disk when run concurrently, expect
>   concurrency to lose.
> - If the heap is small enough to stay in cache from one index build to
>   the next, performing the builds concurrently is probably a wash or a
>   loss.
> - Concurrency should help when a wide-row table large enough to exhaust
>   the OS cache has narrow indexes that all fit in maintenance_work_mem.
>   I don't know whether concurrency would help for a huge-table scenario
>   where the indexes do overspill maintenance_work_mem.  You would have
>   N indexes' worth of external merge files competing for disk
>   bandwidth; that could cancel out the heap I/O savings.
>
> Overall, it's easy to end up with a loss.  We could punt by having an
> index_build_concurrency GUC, much like pg_restore relies on the user to
> discover a good "-j" value.  But if finding cases where concurrency
> helps is too hard, leaving the GUC at one would become the standard
> advice.
>
> > Apologies in advance if this is not the correct forum for
> > suggestions..
>
> It's the right forum.
>
> Thanks,
> nm
>
> --
> Noah Misch
> EnterpriseDB                       http://www.enterprisedb.com
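
For reference, the parallel-restore approach Noah describes looks
roughly like this in practice (database, table, and index names below
are illustrative, not from the thread):

    # Dump in custom format, then restore with 4 parallel jobs;
    # pg_restore issues independent CREATE INDEX commands concurrently.
    pg_dump -Fc -f mydb.dump mydb
    pg_restore -j 4 -d mydb_restored mydb.dump

    # A manual equivalent: plain CREATE INDEX takes a SHARE lock, which
    # does not conflict with itself, so separate sessions can build
    # indexes on the same table at once -- though each session still
    # performs its own heap scan, which is exactly the cost the
    # single-scan proposal aims to avoid.
    psql -d mydb -c "CREATE INDEX idx_a ON big_table (col_a);" &
    psql -d mydb -c "CREATE INDEX idx_b ON big_table (col_b);" &
    wait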
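
On the memory point: each build's sort is capped by
maintenance_work_mem, so a single budget shared across N concurrent
builds can push sorts to disk that would have fit in memory when run
sequentially. A sketch of the sequential case, with an illustrative
setting:

    # Run sequentially, each build gets the full budget; under the
    # proposed scheme, four concurrent builds would share it, so a sort
    # that fits at 1GB might spill to disk at roughly 256MB apiece.
    psql -d mydb -c "SET maintenance_work_mem = '1GB'; CREATE INDEX idx_a ON big_table (col_a);"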