On Wed, Nov 28, 2018 at 6:17 PM Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
> > Comparing with overhead of setting snapshot before evaluating every row
> > and considering this kind of usage is not frequent it seems to me the
> > behavior is acceptable
>
> I'm not really buying the argument that this behavior is acceptable
> simply because using the feature like this will be uncommon. That seems
> like a rather weak reason to accept it.
>
> I however agree we don't want to make COPY less efficient, at least not
> unless absolutely necessary. But I think we can handle this simply by
> restricting what's allowed to appear in the FILTER clause.
>
> It should be fine to allow IMMUTABLE and STABLE functions, but not
> VOLATILE ones. That should fix the example I shared, because f() would
> not be allowed.

I don't think that's a very good solution. It's perfectly sensible for
someone to want to do WHERE/FILTER random() < 0.01 to load a smattering
of rows, and this would rule that out for no very good reason.

I think it would be fine to just document that if the filter condition
examines the state of the database, it will not see the results of the
COPY which is in progress. I'm pretty sure there are other cases - for
example with triggers - where you can get code to run that can't see the
results of the command currently in progress, so I don't really buy the
idea that having a feature which works that way is categorically
unacceptable.

I agree that we can't justify flagrantly wrong behavior on the grounds
that correct behavior is expensive or on the grounds that the incorrect
cases will be rare. However, when there's more than one sensible
behavior, it's not unreasonable to pick one that is easier to implement
or cheaper or whatever, and that's the category into which I would put
this decision.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
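
To make the behavior described above concrete, here is a rough toy model
in Python. It is only a sketch, not PostgreSQL code: copy_from, the list
standing in for a table, and both filter functions are hypothetical names
invented for illustration. It assumes the filter is evaluated against a
single snapshot taken when the load starts, so it never sees rows added
by the load in progress, while a volatile filter such as a random sample
still works.

```python
import random


def copy_from(table, incoming_rows, row_filter):
    """Toy model of COPY with a filter (hypothetical, not PostgreSQL code).

    The filter is given a snapshot of the table taken once, before any
    row is loaded, so it cannot see rows added by this same load.
    """
    snapshot = tuple(table)  # fixed view of the table for the whole load
    loaded = 0
    for row in incoming_rows:
        if row_filter(row, snapshot):
            table.append(row)
            loaded += 1
    return loaded


def fewer_than_two_visible(row, snapshot):
    """A filter that examines 'database state': the visible row count."""
    return len(snapshot) < 2


# The filter keeps seeing an empty table, so every row passes even though
# the table grows past two rows while the load is running.
table = []
loaded = copy_from(table, range(5), fewer_than_two_visible)

# A volatile filter in the spirit of random() < 0.01: it loads a small
# random sample of the input. The seed is fixed only for repeatability.
rng = random.Random(0)
sample = []
copy_from(sample, range(10000), lambda row, snap: rng.random() < 0.01)
```

Under these assumptions the first load accepts all five rows, which is
the documented-limitation behavior argued for above, and the second load
shows the sampling use case that a ban on VOLATILE functions would rule
out.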