On Fri, Feb 29, 2008 at 8:19 PM, Florian G. Pflug <[EMAIL PROTECTED]> wrote:
> Pavan Deolasee wrote:
> > What I am thinking is if we can read ahead these blocks in the shared
> > buffers and then apply redo changes to them, it can potentially
> > improve things a lot. If there are multiple read requests, kernel (or
> > controller ?) can probably schedule the reads more efficiently.
>
> The same holds true for index scans, though. Maybe we can find a
> solution that benefits both cases - something along the line of a
> bgreader process

I agree. Something like a bgreader process would make good sense as a
general solution. ISTM that this would be an easy first step towards
making recovery faster, without adding too much complexity to the
recovery code path.

> > Btw, isn't our redo recovery completely physical in nature ? I mean,
> > can we replay redo logs related to a block independent of other
> > blocks ? The reason I am asking because if thats the case, ISTM we
> > can introduce parallelism in recovery by splitting and reordering the
> > xlog records and then run multiple processes to do the redo
> > recovery.
>
> I'd say its "physical" on the tuple level (We just log the new tuple on an
> update, not how to calculate it from the old one), but "logical" on the
> page level (We log the fact that a tuple was inserted on a page, but
> e.g. the physical location of the tuple on the page can come out
> differently upon replay).

I think it would be OK if the recovery is logical at the page level. As
long as we can apply redo records in order for a given page, but out of
order with respect to other pages, there is great scope for introducing
parallelism. Though I would agree with Tom that we need to be extremely
cautious before we do anything like this. I remember Heikki caught a few
bugs in HOT redo recovery during code review that had escaped the manual
crash recovery testing I did, proving Tom's point that it's hard to catch
such bugs.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
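
P.S. To make the "in order per page, out of order across pages" idea a
bit more concrete, here is a rough sketch in Python. It is only an
illustration of the ordering argument, not PostgreSQL code: the record
shape (lsn, page_id, change), the function names, and the "append the
change to a list" stand-in for a redo routine are all assumptions made
up for this example. It also assumes every record touches exactly one
page; a record that touches more than one page would need extra
coordination, which the sketch ignores.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical record shape: (lsn, page_id, change).
# This is NOT the real xlog record format, just enough to show the ordering.


def partition_by_page(records, n_workers):
    """Route each record to a worker chosen by its page id.

    Records are assumed to arrive in LSN order, so every per-worker queue
    keeps the original order of the records for any given page.
    """
    queues = [[] for _ in range(n_workers)]
    for rec in records:
        _lsn, page_id, _change = rec
        queues[hash(page_id) % n_workers].append(rec)
    return queues


def replay_queue(queue):
    """Apply one worker's records in queue order and return its pages."""
    pages = defaultdict(list)
    for _lsn, page_id, change in queue:
        pages[page_id].append(change)  # stand-in for a real redo routine
    return dict(pages)


def parallel_replay(records, n_workers=4):
    """Replay all records using n_workers independent queues."""
    queues = partition_by_page(records, n_workers)
    result = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for pages in pool.map(replay_queue, queues):
            # A page is owned by exactly one worker, so the dicts are disjoint.
            result.update(pages)
    return result


if __name__ == "__main__":
    log = [
        (1, (1, 0), "insert a"),
        (2, (1, 1), "insert b"),
        (3, (1, 0), "update a"),
        (4, (2, 0), "insert c"),
    ]
    # With single-page records, parallel replay matches serial replay.
    assert parallel_replay(log, n_workers=3) == replay_queue(log)
```

Because a page always hashes to the same worker, the relative order of
that page's records is preserved, while records for different pages
carry no ordering constraint between them. Threads are used here only
to keep the sketch short; the point is the partitioning, not the
concurrency primitive.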