On Tue, Jun 13, 2023 at 10:28 PM Patrick O'Toole <patrick.oto...@sturdy.ai> wrote:
> Hi all, > > Questions: > > 1. In Plan A, what factors are causing the planner to select a > substantially slower plan despite having recent stats about number of rows? > > Estimated overall cost. For Plan A it is ~200k. For plans B/C (haven't noticed any differences in these two) it is ~250k. The planner uses a less expensive plan. Also, in the plans you can see that Pg estimates the number of rows correctly. > > 1. Is there a substantial difference between the on-the-fly hash done > in Plan B and Plan C compared to the hash-index used in Plan A? Can I > assume they are essentially the same? Perhaps there are there differences > in how they're applied? > > I don't see any difference in plans B and C, but you report timing changes. To me this looks like just a fluctuation in measurements. So I wouldn't trust any measurements for plan A either. I'm not a big expert, but can not say that plan A and B are essentially the same. Plan A: DB scans item_text table and for every record looks into the index of conversation_item table, then looks into the table itself. Plan B/C: DB scans conversation_item table without looking into its indexes building a hash table on the fly. > 1. Is it common to see values for random_page_cost set as high as 8.0? > We would of course need to investigate whether we see a net positive or net > negative impact on other queries, to adopt this as a general setting, but > is it a proposal we should actually consider? > > No idea. > > 1. Maybe we are barking up the wrong tree with the previous questions. > Are there other configuration parameters we should consider first to > improve performance in situations like the one illustrated? > > Recheck your numbers. > > 1. Are there other problems with our schema, query, or plans shown > here? Other approaches (or tools/analyses) we should consider? > > You can try the following index: CREATE INDEX conversation_item_ruz1 ON conversation_item(item_uuid, conversation_uuid, tenant_id); I believe this index would allow Pg to use "index only scan" as variation of Plan A and avoid touching the conversation_item table completely. -- Best regards, Ruslan.