2010/11/11 Robert Haas <robertmh...@gmail.com>
>
> But thinking over what you've written here, I'm reminded of something
> Peter said years ago, also about the optimizer. He was discussing the
> ratio of the estimated cost to the actual cost and made an off-hand
> remark that efforts had been made over the years to make that ratio
> more consistent (i.e. improve the quality of the cost estimates) but
> that they'd been abandoned because they didn't necessarily produce
> better plans. Applying that line of thinking to this problem, maybe
> we should give up on trying to make the estimates truly model reality,
> and focus more on assigning them values which work well in practice.
> For example, in your case, it would be sufficient to estimate the
> amount of data that a given query is going to grovel through and then
> apply some heuristic to choose values for random_page_cost and
> seq_page_cost based on the ratio of that value to, I don't know,
> effective_cache_size.
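Today, as far as I know, the closest you can get to that heuristic is to apply it by hand: when you know a particular query is going to grovel through much more data than effective_cache_size, adjust the costs for that session only. A rough sketch (the table name is made up):

-- Suppose random_page_cost has been lowered globally because the hot data
-- is mostly cached; for a query that scans a large archive table, put it
-- back to a disk-realistic value for this session only.
SET random_page_cost = 4.0;
SELECT count(*) FROM archive_orders WHERE order_date < '2009-01-01';
RESET random_page_cost;

That works, but it puts the burden on the application to know which queries are "big", which is exactly what an automatic heuristic would avoid.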
As for me, the simplest solution would be to allow setting costs on a per-relation basis. E.g. I know that this relation lives in memory most of the time, while another one (an archive) sits on disk. This could work like a charm together with buffer pools (portions of the shared cache): tables (or indexes) that must stay cached are assigned to a buffer pool large enough to hold all of their data, and the archive ones to a small buffer pool. That would guarantee that after a query over the archive data the cached tables are still cached.

This solution, however, does not help for tables where only some portion is actively used. There the answer could be to allow setting costs via partial indexes, e.g. "for any table access using this index, use these cost values" (a rough sketch is in the P.S. below). This, BTW, would also make table access via that index more attractive to the planner.

--
Best regards,
Vitalii Tymchyshyn
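P.S. To make the proposal a bit more concrete, here is a rough sketch. The per-tablespace settings in the first part should, if I recall correctly, already work in 9.0; the per-relation and per-index options in the second part are purely hypothetical syntax, and all object names are made up.

-- What exists today: cost settings attached to a tablespace, so hot and
-- archive relations can at least be split across tablespaces.
CREATE TABLESPACE hot_ssd LOCATION '/mnt/ssd/pgdata';
ALTER TABLESPACE hot_ssd SET (seq_page_cost = 0.5, random_page_cost = 0.7);
ALTER TABLE orders SET TABLESPACE hot_ssd;

-- What I am proposing (hypothetical, does not exist): per-relation costs,
-- optionally attached to a partial index covering the hot portion of a table.
-- ALTER TABLE archive_orders SET (seq_page_cost = 2.0, random_page_cost = 8.0);
-- CREATE INDEX orders_recent_idx ON orders (order_date)
--   WITH (random_page_cost = 0.5)
--   WHERE order_date >= '2010-10-01';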