On Thu, Feb 09, 2012 at 12:02:29PM -0500, Robert Haas wrote: > When Heikki worked up his original index-only scan patches (which > didn't end up looking much like what eventually got committed), he had > the notion of an index-only qual. That is, given a query like this: > > select sum(1) from foo where substring(a,1,3) = 'abc'; > > We could evaluate the substring qual before performing the heap fetch, > and fetch the tuple from the heap only if the qual passes.
> Now, there's a fly in the ointment here, which is that applying > arbitrary user-defined functions to tuples that might not be visible > doesn't sound very safe. The user-defined function in question might > perform some action based on those invisible tuples that has side > effects, which would be bad, because now we're violating MVCC > semantics. Or it might throw an error, killing the scan dead on the > basis of the contents of some tuple that the scan shouldn't even see. > However, there is certainly a class of functions for which this type > of optimization would be safe, and it's an awful lot like the set of > functions that can be safely pushed down through a security_barrier > view - namely, things that don't have side effects or throw errors. > So it's possible that the proleakproof flag KaiGai is proposing to add > to pg_proc could do double duty, serving also to identify when it's > safe to apply a qual to an index tuple when the corresponding heap > tuple might be invisible. However, I have some reservations about > assuming that the two concepts are exactly the same. For one thing, > people are inevitably going to want to cheat a little bit more here > than is appropriate for security views, and encouraging people to mark > things LEAKPROOF when they're really not is a security disaster > waiting to happen. The similarity is indeed tempting, but I find the concepts sufficiently distinct to not make one device serve both. Adding to the reservations you offer, LEAKPROOF is superuser-only. This other marker would not entail any special privilege. > For another thing, there are some important cases > that this doesn't cover, like: > > select * from foo where substring(a,1,3) like '%def%'; > > The like operator doesn't seem to be leakproof in the security sense, > because it can throw an error if the pattern is something like a > single backslash (ERROR: LIKE pattern must not end with escape > character) and indeed it doesn't seem like it would be safe here > either if the pattern were stored in the table. But if the pattern > were constant, it'd be OK, or almost OK: there's still the edge case > where the table contains invisible rows but no visible ones - whether > or not we complain about the pattern there ought to be the same as > whether or not we complain about it on a completely empty table. If > we got to that point, then we might as well consider the qual > leakproof for security purposes under the same set of circumstances > we'd consider it OK to apply to possibly-invisible tuples. This sort of thing implicates substring(), too, when you call it as substring(a, 1, b); b < 0 produces an error. To handle these, I think we'd need a facility along the lines of protransform. Have a function inspecting call nodes for a particular other function and determining whether each is ok-for-index-only-quals. You could even force protransform itself into that role. Create an additional pg_proc entry identical to the ordinary substring() but for a different name and having the ok-for-index-only-quals flag. Add a protransform to the main pg_proc entry that inspects the argument nodes and, when they're safe, replaces the call with a call to that errorfree_substring() at plan time. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers