I am not a PostgreSQL expert, however I think that the following algorithm should be possible and fast:
1. find the bitmap of all subscriptions in a project that are not trashed (it can use the index and takes only ~500ms) 2. find the bitmap of all subscriptions that match the above condition and HAVE the tag (~7s) 3. calculate [bitmap 1] - [bitmap 2] to find the subscriptions of the project that DON'T HAVE the tag On Tue, Nov 12, 2019 at 9:50 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > Marco Colli <collimarc...@gmail.com> writes: > > 3) Here's the query plan that I get after disabling the seq scan: > > > Finalize Aggregate (cost=2183938.89..2183938.90 rows=1 width=8) (actual > > time=94972.253..94972.254 rows=1 loops=1) > > So, this is slower than the seqscan, which means the planner made the > right choice. > > You seem to be imagining that there's some way the index could be used > with the NOT clause, but there isn't. Indexable clauses are of the form > indexed_column indexable_operator constant > and there's no provision for a NOT in that. If we had a "not contained > in" array operator, the NOT could be folded to be of this syntactic form, > but it's highly unlikely that any index operator class would consider such > an operator to be a supported indexable operator. It doesn't lend itself > to searching an index. > > So the planner is doing the best it can, which in this case is a > full-table scan. > > A conceivable solution, if the tags array is a lot smaller than > the table as a whole and the table is fairly static, is that you could > make a btree index on the tags array and let the planner fall back > to an index-only scan that is just using the index as a cheaper > source of the array data. (This doesn't work for your existing GIST > index because GIST can't reconstruct the original arrays on-demand.) > I suspect though that this wouldn't win much, even if you disregard > the maintenance costs for the extra index. The really fundamental > problem here is that a large fraction of the table satisfies the > NOT-in condition, and no index is going to beat a seqscan by all that > much when that's true. Indexes are good at retrieving small portions > of tables. > > regards, tom lane >