Hi, On 2018-12-05 12:24:54 -0500, Tom Lane wrote: > The core of the problem, I think, is that we're unable to convert the > condition on table_name into an indexscan on pg_class.relname, because > the view has cast pg_class.relname to the sql_identifier domain. > > There are two different issues in that. One is that the domain might > have constraints (though in reality it does not), so the planner can't > throw away the CoerceToDomain node, and thus can't match the expression > to the index. Even if we did throw away the CoerceToDomain, it still > would not work because the domain is declared to be over varchar, and > so there's a cast-to-varchar underneath the CoerceToDomain.
Couldn't we make expression simplification replace CoerceToDomain with a RelabelType if the constraint is simple enough? That's not entirely trivial because we'd have to look into the typecache to get the constraints, but that doesn't sound too bad. > The most straightforward way to fix that would be to add some cross-type > comparison operators to the name_ops operator family. That seems reasonable. > While we only > directly need 'name = text' to make this case work, the opr_sanity > regression test will complain if we don't supply a fairly complete set of > operators, and I think not without reason. So I experimented with doing > that, as in the very-much-WIP 0001 patch below. A name index can only > cope with strcmp-style comparison semantics, not strcoll-style, so while > it's okay to call the equality operator = I felt we'd better call the > inequality operators ~<~ and so on. While the patch as presented fixes > the case shown above, it's not all good: the problem with having a new > 'text = name' operator is that it will also capture cases where you would > like to have a text index working with a comparison value of type "name". > If 'text = name' isn't part of the text_ops opclass then that doesn't > work. I think that the reason for the join.sql regression test failure > shown in patch 0001 is likewise that since 'text = name' isn't part of the > text_ops opclass, the join elimination stuff is unable to prove that it > can eliminate a join to a unique text column. This could probably be > fixed by creating yet another dozen cross-type operators that implement > text vs name comparisons using strcoll semantics (hence, using the normal > '<' operator names), and adding that set to the text_ops opfamily. > I didn't do that here (yet), because it seems pretty tedious. Is there a way we could make that less laborious by allowing a bit more casting? > 0002 is a really preliminary POC for eliminating CoerceToDomain nodes > at plan time if the domain has no constraints to check. While this is > enough to check the effects on plan selection, it's certainly not > committable as-is, because the resulting plan is broken if someone then > adds a constraint to the domain. (There is a regression test failure, > not shown in 0002, for a case that tests exactly that scenario.) Hah. > What we would have to do to make 0002 committable is to add sinval > signaling to invalidate any cached plan that's dependent on an > assumption of no constraints on a domain. This is something that > we probably want anyway, since it would be necessary if we ever want > to compile domain constraint expressions as part of the plan tree > rather than leaving them to run-time. Yea, that seems good. I don't like the fact that expression initialization does the lookups for constraints right now. > While the sinval additions per se would be fairly straightforward, > I wonder whether anyone is likely to complain about the race conditions > that will ensue from not taking any locks associated with the domain > type; i.e. it's possible that a query would execute with slightly > out-of-date assumptions about what constraints a domain has. I suspect > that we have race conditions like that already, but they might be worse > if we inspect the domain during planning rather than at executor > startup. Is anyone worried enough about that to want to add locking > overhead to domain usage? (I'm not.) I'm not either. > 0001+0002 still don't get the job done: now we strip off the > CoerceToDomain successfully, but we still have "relname::varchar" > being compared to the comparison value, so we still can't match > that to the index. 0003 is a somewhat klugy solution to that part, > allowing indxpath.c to construct a name equality test from a texteq > restriction condition. (This is semantically OK because equality > is equality in both name and text domains. We could not convert > inequalities, at least not as freely; maybe it could be done in > C locale, but I've not done anything about that here.) > > With all three patches in place, we get > > regression=# explain select * from pg_class where > relname::information_schema.sql_identifier = 'foo'::text; > QUERY PLAN > --------------------------------------------------------------------------------------------- > Index Scan using pg_class_relname_nsp_index on pg_class (cost=0.28..8.30 > rows=7 width=781) > Index Cond: (relname = 'foo'::text) > Filter: ((relname)::text = 'foo'::text) > (3 rows) > > so we've fixed the lack of indexscan but we're still a bit off about the > statistics. I don't have any immediate proposal about how to fix the > latter, but anyway this is enough to make Pavel's case a lot better. > 0003 essentially converts "namecol::text texteq textvalue" into > "namecol nameeqtext textvalue", relying on the new equality > operator introduced by 0001. Ugh, that's indeed a bit kludgy. It'd be nice to have an approach that's usable outside of one odd builtin type. I was wondering for a bit whether we could have logic to move the cast to the other side of an operator, but I don't see how we could make that generally safe. Greetings, Andres Freund