Jared Rulison <ja...@affinity.co> writes: > Hello, we have some confusion over the planner's use of an index. > ... > 1. How is postgres able to use parades_city_id_description_tsv_index in the > first explain result without any filter on "city_id"?
GIN indexes don't have any particular bias towards earlier or later columns (unlike btrees). So this isn't any harder than if you'd put the index columns in the other order. > 2. Why does the planner in the first query decide not to simply use > parades_city_id_description_tsv_index (as in the second explain result) > when the cardinality of the set of "city_id"s is high? [ shrug... ] It thinks it's cheaper. Whether it's correct is impossible to say from the given data, but there is a moderately complex cost model in there. The comments for gincost_scalararrayopexpr note * A ScalarArrayOpExpr will give rise to N separate indexscans at runtime, * each of which involves one value from the RHS array, plus all the * non-array quals (if any). I haven't checked the actual execution code, but this seems to be saying that the GIN indexscan executor always does ANDs before ORs. That means that doing everything in the same GIN indexscan would require executing the to_tsvector part 50 times, so I can definitely believe that shoving the IN part to a different index and AND'ing afterwards is a better idea. (Whether the GIN executor should be made smarter to avoid that is a separate question ;-)) regards, tom lane