On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas <robertmh...@gmail.com> wrote: > On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut > <peter.eisentr...@2ndquadrant.com> wrote: >> Regarding the patch that ended up being committed, I wonder if it is >> intentional that PL/pgSQL overwrites the context from the parallel worker. >> Shouldn't the context effectively look like >> >> ERROR: message >> CONTEXT: parallel worker >> CONTEXT: PL/pgSQL function >> >> Elsewhere in this thread I suggested getting rid of the parallel worker >> context by default (except for debugging), but if we do want to keep it, >> then it seems to be a bug that a PL/pgSQL function can just eliminate it. > > Several people have suggested getting rid of that now, so maybe we > should just go ahead and do it. > > How would we make it available for debugging, though? That seems like > something people will want.
This is currently listed as an open item, but it doesn't seem very actionable to me. The fact that PL/plpgsql chucks the existing context instead of appending to it is presumably a property of PL/plpgsql, not parallel query, and changing that seems like it ought to be out of scope for 9.6. As far as the context is concerned, the reason why that's there is because I felt (and still feel, to some extent) that users might experience errors that happen inside a parallel worker and it might be important for debugging purposes to know that. Suppose, for example, that you mislabel a function as PARALLEL SAFE when in fact it relies on some backend-local state (and should therefore be PARALLEL RESTRICTED). When it runs in a worker, it might generate an error message like this: ERROR: can't generate random numbers because you haven't specified a seed ...to which the user will reply, "oh yes I did; in fact I ran SELECT magic_setseed(42) just before I ran the offending query!". They'll then contact an expert (hopefully) who will very possibly spend a lot of time troubleshooting the wrong thing. If the message says: ERROR: can't generate random numbers because you haven't specified a seed CONTEXT: parallel worker, PID 62413 ...then the expert has a very good chance of guessing what has gone wrong right away. Now, against this scenario, there is every possibility that this message will just annoy a lot of people. It is certainly annoying for regression test authors because the PID changes and, even if we took the PID out, any given error might sometimes happen in the leader rather than the worker, depending on timing. I am not aware of a concrete reason why it will annoy people in other situations, but there may well be one (or more). The simplest thing we could do here is nothing. We can wait to see what happens with PostgreSQL 9.6 and then decide whether to make a change. Or we can do something now. We can eliminate the context entirely and take our chances with scenarios like the one mentioned above. Or we can add a new GUC, something like hide_parallel_context, which suppresses it and turn that off when running regression tests, or even by default. It's all just a matter of how much it bothers you to sometimes get an extra context line vs. how much annoyance you think will be caused by people wasting time troubleshooting because they didn't realize parallel query was involved. Personally, I'm very doubtful about our ability to paper over artifacts like this. Sophisticated users are going to end up having to understand that there are parallel workers and that they propagate their messages back to the leader, which appends its own context. We often seem to assume users won't find out about internal details like that and so we don't make an effort to expose and document them, but I don't think that often works out. There are a whole lot of people who know that they need to run EXPLAIN 6 times to find out what's really going on. I don't expect this case to be any different. I'm actually sort of flattered that parallel query is working well enough that multiple people think they might not ever need to know whether an error comes from the worker or from the leader, but I'm rather doubtful that will be true in practice. All that having been said, I don't know everything and this is just a judgement call. I exercised my judgement and that's how we got here. If there's a consensus that my judgement should be overruled, I'm fine with that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers