On 8/8/23 23:03, Peter Geoghegan wrote:
> On Tue, Aug 8, 2023 at 1:49 PM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
>> So we expect 1250 rows. If that was accurate, the index scan would have
>> to do 1250 heap fetches. It's just luck the index scan doesn't need to
>> do that. I don't think there's a chance to improve this costing - if the
>> inputs are this off, it can't do anything.
>
> Well, that depends. If we can find a way to make the bitmap index scan
> capable of doing something like the same trick through other means, in
> some other patch, then this particular problem (involving a simple
> inequality) just goes away. There may be other cases that look a
> little similar, with a more complicated expression, where it just
> isn't reasonable to expect a bitmap index scan to compete. Ideally,
> bitmap index scans will only be at a huge disadvantage when it just
> makes sense, due to the particulars of the expression.
>
> I'm not trying to make this your problem. I'm just trying to establish
> the general nature of the problem.
>
>> Also, I think this is related to the earlier discussion about maybe
>> costing it according to the worst case - i.e. as if we still needed to
>> fetch the same number of heap tuples as before. Which will inevitably
>> lead to similar issues, with worse plans looking cheaper.
>
> Not in those cases where it just doesn't come up, because we can
> totally avoid visibility checks. As I said, securing that guarantee
> has the potential to make the costing a lot more reliable/easier to
> implement.
>
But in the example you shared yesterday, the problem is not really about
visibility checks. In fact, the index scan costing completely ignores
the VM checks - it didn't matter before, and the patch did not change
this. The problem is the number of rows the index scan is expected to
produce - each of those rows means a heap fetch, i.e. a random I/O, and
we can't skip those.

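To put rough numbers on it, here is a toy sketch of how the row estimate
drives the heap-fetch part of the cost. It assumes one random page read
per expected row and uses the default random_page_cost of 4.0. This is
not what cost_index() actually computes (the real model also considers
correlation, caching and CPU costs), and the function name and the
"actual" row count are made up for illustration.

```python
# Toy model only: NOT PostgreSQL's cost_index(). It assumes every
# expected row needs its own random heap page read.
RANDOM_PAGE_COST = 4.0  # PostgreSQL's default random_page_cost


def heap_fetch_cost(rows, random_page_cost=RANDOM_PAGE_COST):
    """I/O cost if each of `rows` rows triggers one random heap fetch."""
    return rows * random_page_cost


estimated_rows = 1250  # the planner's estimate from the example
actual_rows = 1        # hypothetical: the scan gets lucky at run time

print("costed heap fetches:", heap_fetch_cost(estimated_rows))
print("heap fetches really needed:", heap_fetch_cost(actual_rows))
```

The planner only ever sees the first number, so with an estimate that
far off the index scan looks expensive no matter how the VM checks are
costed.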
>> That is certainly true - I'm trying to keep the scope somewhat close to
>> the original goal. Obviously, there may be additional things the patch
>> really needs to consider, but I'm not sure this is one of those cases
>> (perhaps I just don't understand what the issue is - the example seems
>> like a run-of-the-mill case of poor estimate / costing).
>
> I'm not trying to impose any particular interpretation here. It's
> early in the cycle, and my questions are mostly exploratory. I'm still
> trying to develop my own understanding of the trade-offs in this area.
>
Understood. I think this whole discussion is about figuring out these
trade-offs, and also how to divide the various improvements into "minimum
viable" changes.

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company