On Fri, Sep 12, 2014 at 12:07 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Sep 11, 2014 at 7:34 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > Don't we need some way to prohibit changing GUC by launching process,
> > once it has shared the existing GUC?
>
> Nope.  I mean, eventually, for true parallelism ... we absolutely will
> need that.  But pg_background itself doesn't need that; it's perfectly
> fine for configuration settings to get changed in the background
> worker.  So it's a different piece of infrastructure from this patch
> set.
Okay, but as there is no way to predict which GUC values the background
worker will use (they can be the same as what the launching process had
at the time it launched the background worker, or different values if
the settings were later changed due to sighup), it might be good to
clarify this in the pg_background docs (which are not there currently,
but I assume they will eventually be part of this patch).

> > 3.
> > Can't we handle all these or other changes inside exec_simple_query()
> > based on some parameter or may be a use a hook similar to what we
> > do in ProcessUtility?
> >
> > Basically it looks bit odd to have a duplicate (mostly) copy of
> > exec_simple_query().
>
> It is.  But I didn't think hacking up exec_simple_query() was a better
> option.  We could do that if most people like that approach, but to me
> it seemed there were enough differences to make it unappealing.

Okay, but I think in that case we need to carefully evaluate the
differences, else they might lead to behaviour differences in statement
execution.  A few observations related to the differences are as
follows (I have put a small sketch of the ordering I have in mind for
points 1 to 3 further below):

1.
Keeping transaction control (start/commit) outside the function
execute_sql_string() could lead to the EndCommand() message being sent
before transaction end, which could be a problem in case the
transaction commit fails for any reason.  exec_simple_query() takes
care of this by calling finish_xact_command() before reporting
command-complete for the last parse tree.  It even has a comment
indicating that we should follow this protocol.

2.
+static void
+execute_sql_string(const char *sql)
{
..
+	/* Be sure to advance the command counter after the last script command */
+	CommandCounterIncrement();
}

Isn't CommandCounterIncrement() required after every command, as we do
in exec_simple_query()?

3.
+static void
+execute_sql_string(const char *sql)
{
..
+		/*
+		 * Send a CommandComplete message even if we suppressed the query
+		 * results.  The user backend will report these in the absence of
+		 * any true query results.
+		 */
+		EndCommand(completionTag, DestRemote);
+
+		/* Clean up the portal. */
+		PortalDrop(portal, false);
..
}

What's the need to reverse the order of execution of EndCommand() and
PortalDrop()?  Any error in PortalDrop() will lead to a wrong message
being sent to the client.

4.
What is the reason for not logging statements executed via
pg_background, even though it allows reporting the SQL statement to
various monitoring tools by setting debug_query_string?

5.
Isn't it better to add a comment explaining why execute_sql_string()
uses binary format for the result?

> > 4.
> > Won't it be better if pg_background_worker_main() can look more
> > like PostgresMain() (in terms of handling different kind of messages),
> > so that it can be extended in future to handle parallel worker.
>
> I don't think that a parallel worker will look like pg_background in
> much more than broad outline.  Some of the same constructs will get
> reused, but overall I think it's a different problem that I'd rather
> not conflate with this patch.

No issues.

> > 5.
> > pg_background_result()
> > {
> > ..
> > /* Read and processes messages from the shared memory queue. */
> > }
> >
> > Shouldn't the processing of messages be a separate function as
> > we do for pqParseInput3().
>
> I guess we could.  It's not all that much code, though.

Sure, please take a call based on what you feel is right here; I
mentioned it because I felt it might make it a little easier for other
people to understand that code.
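Coming back to points 1 to 3 above, just to make the suggestion
concrete, here is a rough sketch of the per-statement ordering that
exec_simple_query() follows and which execute_sql_string() could
mirror.  This is not the patch's code; in particular
finish_xact_command() is static to postgres.c, so pg_background would
need its own equivalent:

	foreach(parsetree_item, parsetree_list)
	{
		bool	is_last = (lnext(parsetree_item) == NULL);

		/* ... parse analysis, planning, portal setup, PortalRun() ... */

		/*
		 * Drop the portal before reporting command completion, so an
		 * error here can't follow a success report.
		 */
		PortalDrop(portal, false);

		if (is_last)
		{
			/*
			 * Close the transaction before sending the final
			 * CommandComplete; a commit failure must be reported to the
			 * client instead of a bogus success.
			 */
			finish_xact_command();
		}
		else
		{
			/*
			 * Advance the command counter after every command, not just
			 * after the last one.
			 */
			CommandCounterIncrement();
		}

		/* Only now is it safe to send CommandComplete. */
		EndCommand(completionTag, DestRemote);
	}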
Some other comments are as follows:

1.
+pg_background_result(PG_FUNCTION_ARGS)
{
..
..
+	/*
+	 * Whether we succeed or fail, a future invocation of this function
+	 * may not try to read from the DSM once we've begun to do so.
+	 * Accordingly, make arrangements to clean things up at end of query.
+	 */
+	dsm_unkeep_mapping(info->seg);

There can be certain scenarios where the user might like to invoke this
again.  Assume the user calls pg_background_launch('select count(*)
from t1') and this statement's execution via background worker is going
to take a very long time before it returns anything.  After some time
the user tries to retrieve the data via pg_background_result(), but the
call doesn't return as it is waiting for results, so the user presses
cancel; on trying again after some time, he won't get any data.  I
think this behaviour is a bit annoying.

I am able to reproduce this by halting the background worker via
debugger.

postgres=# select pg_background_launch('select count(*) from t1');
 pg_background_launch
----------------------
                  656
(1 row)

postgres=# select * from pg_background_result(656) as (cnt int);
Cancel request sent
ERROR:  canceling statement due to user request
postgres=# select * from pg_background_result(656) as (cnt int);
ERROR:  PID 656 is not attached to this session

2.
To avoid making the user wait longer than he wants, the function
pg_background_result() could take an additional parameter where the
user can specify whether to WAIT in case results are not yet available.

3.
+			case 'E':
+			case 'N':
+				{
+					ErrorData	edata;
+
+					/* Parse ErrorResponse or NoticeResponse. */
+					pq_parse_errornotice(&msg, &edata);
+
+					/*
+					 * Limit the maximum error level to ERROR.  We don't want
+					 * a FATAL inside the background worker to kill the user
+					 * session.
+					 */
+					if (edata.elevel > ERROR)
+						edata.elevel = ERROR;

Why is a FATAL inside the background worker not propagated at the same
level by the launching process?  If a PANIC in the background worker
can kill other backends and restart the server, then ideally a FATAL in
the background worker should also be treated at the same level by the
client backend.

4.
+void
+pg_background_worker_main(Datum main_arg)
+{
..
+	responseq = shm_mq_attach(mq, seg, NULL);
+
+	/* Redirect protocol messages to responseq. */
+	pq_redirect_to_shm_mq(mq, responseq);

Any error ("unable to map dynamic shared memory segment") before
pq_redirect_to_shm_mq() will not reach the launcher.  The launcher's
client will just get "ERROR: lost connection to worker process with PID
4020".  I think it is very difficult for the user to know why such an
error has occurred and what he can do to proceed.  I am not sure if
there is any sensible way to report such an error, but OTOH it seems we
should provide the client some information about what has happened.

5.
Commands not supported in pg_background should produce a proper error
message.

postgres=# select pg_background_launch('copy t1 from stdin');
 pg_background_launch
----------------------
                 4672
(1 row)

postgres=# select * from pg_background_result(4672) as (result TEXT);
WARNING:  unknown message type: G (6 bytes)
ERROR:  there is no client connection
CONTEXT:  COPY t1, line 1: ""

Something similar to what is returned for transaction statements
("transaction control statements are not allowed in pg_background")
would be better; a possible sketch follows.
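As one possible way to do that (just a sketch; I am assuming this goes
in the message-type switch of pg_background_result()'s read loop, and
the message wording is only illustrative), the copy-protocol responses
could be turned into a proper error instead of falling through to the
"unknown message type" warning:

+			case 'G':		/* CopyInResponse */
+			case 'H':		/* CopyOutResponse */
+			case 'W':		/* CopyBothResponse */
+				/*
+				 * COPY needs a client connection to transfer data, which
+				 * a background worker doesn't have.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("COPY protocol not allowed in pg_background")));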
6.
+	/*
+	 * Tuples returned by any command other than the last are simply
+	 * discarded; but those returned by the last (or only) command are
+	 * redirected to the shared memory queue we're using for communication
+	 * with the launching backend.  If the launching backend is gone or has
+	 * detached us, these messages will just get dropped on the floor.
+	 */
+	--commands_remaining;
+	if (commands_remaining > 0)
+		receiver = CreateDestReceiver(DestNone);
+	else
+	{
+		receiver = CreateDestReceiver(DestRemote);
+		SetRemoteDestReceiverParams(receiver, portal);
+	}

If the results of statements other than the last are discarded anyway,
why allow multiple statements in the first place?  In a case like the
one below, how will the user identify whether the result is for the
first statement or the second?

postgres=# select pg_background_launch('select count(*) from t1;select count(*) from t1');
 pg_background_launch
----------------------
                 3996
(1 row)

postgres=# select * from pg_background_result(3996) as (result bigint);
 result
--------
     11
(1 row)


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com