On 2017-04-03 14:10:47 +1000, Vaishnavi Prabakaran wrote: > > The CF has been extended until April 7 but time is still growing short. > > Please respond with a new patch by 2017-04-04 00:00 AoE (UTC-12) or this > > submission will be marked "Returned with Feedback". > > > > > Thanks for the information, attached the latest patch resolving one > compilation warning. And, please discard the test patch as it will be > re-implemented later separately.
Hm. If the tests aren't ready yet, it seems we'll have to move this to the next CF. > + <sect1 id="libpq-batch-mode"> > + <title>Batch mode and query pipelining</title> > + > + <indexterm zone="libpq-batch-mode"> > + <primary>libpq</primary> > + <secondary>batch mode</secondary> > + </indexterm> > + > + <indexterm zone="libpq-batch-mode"> > + <primary>libpq</primary> > + <secondary>pipelining</secondary> > + </indexterm> > + > + <para> > + <application>libpq</application> supports queueing up multiple queries > into > + a pipeline to be executed as a batch on the server. Batching queries > allows > + applications to avoid a client/server round-trip after each query to get > + the results before issuing the next query. > + </para> "queueing .. into a pipeline" sounds weird to me - but I'm not a native speaker. Also batching != pipelining. > + <sect2> > + <title>When to use batching</title> > + > + <para> > + Much like asynchronous query mode, there is no performance disadvantage > to > + using batching and pipelining. It increases client application complexity > + and extra caution is required to prevent client/server deadlocks but > + offers considerable performance improvements. > + </para> s/offers/can sometimes offer/ > + <sect2 id="libpq-batch-using"> > + <title>Using batch mode</title> > + > + <para> > + To issue batches the application must switch > + <application>libpq</application> into batch mode. s/libpq/a connection/? > Enter batch mode with <link > + > linkend="libpq-PQbatchBegin"><function>PQbatchBegin(conn)</function></link> > or test > + whether batch mode is active with <link > + > linkend="libpq-PQbatchStatus"><function>PQbatchStatus(conn)</function></link>. > In batch mode only <link > + linkend="libpq-async">asynchronous operations</link> are permitted, and > + <literal>COPY</literal> is not recommended as it most likely will > trigger failure in batch processing. > + (The restriction on <literal>COPY</literal> is an implementation > + limit; the PostgreSQL protocol and server can support batched > <literal>COPY</literal>). Hm, I'm unconvinced that that's a useful parenthetical in the libpq docs. > + <para> > + The client uses libpq's asynchronous query functions to dispatch work, > + marking the end of each batch with <function>PQbatchQueueSync</function>. > + Concurrently, it uses <function>PQgetResult</function> and > + <function>PQbatchQueueProcess</function> to get results. "Concurrently" imo is a dangerous word, somewhat implying multi-threading. > + <note> > + <para> > + It is best to use batch mode with <application>libpq</application> in > + <link linkend="libpq-pqsetnonblocking">non-blocking mode</link>. If > used in > + blocking mode it is possible for a client/server deadlock to occur. The > + client will block trying to send queries to the server, but the server > will > + block trying to send results from queries it has already processed to > the > + client. This only occurs when the client sends enough queries to fill > its > + output buffer and the server's receive buffer before switching to > + processing input from the server, but it's hard to predict exactly when > + that'll happen so it's best to always use non-blocking mode. > + </para> > + </note> Such deadlocks are possible just as well with non-blocking mode, unless one can avoid sending queries and switching to receiving results anytime in the application code. > + <para> > + Batched operations will be executed by the server in the order the > client > + sends them. The server will send the results in the order the statements > + executed. The server may begin executing the batch before all commands > + in the batch are queued and the end of batch command is sent. If any > + statement encounters an error the server aborts the current transaction > and > + skips processing the rest of the batch. Query processing resumes after > the > + end of the failed batch. > + </para> What if a batch contains transaction boundaries? > + <sect3 id="libpq-batch-results"> > + <title>Processing results</title> > + > + <para> > + The client <link linkend="libpq-batch-interleave">interleaves result > + processing with sending batch queries</link>, or for small batches may > + process all results after sending the whole batch. > + </para> That's a very long <link> text, is it not? > + <para> > + To get the result of the first batch entry the client must call <link > + > linkend="libpq-PQbatchQueueProcess"><function>PQbatchQueueProcess</function></link>. > It must then call What does 'QueueProcess' mean? Shouldn't it be 'ProcessQueue'? You're not enquing a process or processing, right? > + <function>PQgetResult</function> and handle the results until > + <function>PQgetResult</function> returns null (or would return null if > + called). What is that parenthetical referring to? IIRC we don't provide any external way to determine PQgetResult would return NULL. Have you checked how this API works with PQsetSingleRowMode()? > + <para> > + To enter single-row mode, call <function>PQsetSingleRowMode</function> > immediately after a > + successful call of <function>PQbatchQueueProcess</function>. This mode > selection is effective > + only for the query currently being processed. For more information on > the use of <function>PQsetSingleRowMode > + </function>, refer to <xref linkend="libpq-single-row-mode">. Hah ;). > + <para> > + The client must not assume that work is committed when it > + <emphasis>sends</emphasis> a <literal>COMMIT</literal>, only when the > + corresponding result is received to confirm the commit is complete. > + Because errors arrive asynchronously the application needs to be able to > + restart from the last <emphasis>received</emphasis> committed change and > + resend work done after that point if something goes wrong. > + </para> That seems like a batch independent thing, right? If so, maybe make it a <note>? > + <listitem> > + <para> > + Causes a connection to enter batch mode if it is currently idle or > + already in batch mode. > + > +<synopsis> > +int PQbatchBegin(PGconn *conn); > +</synopsis> > + > + </para> > + <para> > + Returns 1 for success. Returns 0 and has no > + effect if the connection is not currently idle, i.e. it has a > result > + ready, is waiting for more input from the server, etc. This > function > + does not actually send anything to the server, it just changes the > + <application>libpq</application> connection state. > + > + </para> > + </listitem> > + </varlistentry> That function name sounds a bit too much like it'd be relevant for a single batch, not something that can send many batches. enterBatchMode? > + <varlistentry id="libpq-PQbatchEnd"> > + <term> > + <function>PQbatchEnd</function> > + <indexterm> > + <primary>PQbatchEnd</primary> > + </indexterm> > + </term> > + > + <listitem> > + <para> > + Causes a connection to exit batch mode if it is currently in batch mode > + with an empty queue and no pending results. > +<synopsis> > +int PQbatchEnd(PGconn *conn); > +</synopsis> > + </para> > + <para>Returns 1 for success. > + Returns 1 and takes no action if not in batch mode. If the connection > has > + pending batch items in the queue for reading with > + <function>PQbatchQueueProcess</function>, the current statement isn't > finished > + processing or there are results pending for collection with > + <function>PQgetResult</function>, returns 0 and does nothing. > + > + </para> > + </listitem> > + </varlistentry> "" > + <varlistentry id="libpq-PQbatchQueueSync"> > + <term> > + <function>PQbatchQueueSync</function> > + <function>PQbatchQueueProcess</function> As said above, I'm not a fan of these, because it sounds like you're queueing a sync/process. > /* ---------------- > @@ -1108,7 +1113,7 @@ pqRowProcessor(PGconn *conn, const char **errmsgp) > conn->next_result = conn->result; > conn->result = res; > /* And mark the result ready to return */ > - conn->asyncStatus = PGASYNC_READY; > + conn->asyncStatus = PGASYNC_READY_MORE; > } Uhm, isn't that an API/ABI breakage issue? > /* > - * Common startup code for PQsendQuery and sibling routines > + * PQmakePipelinedCommand > + * Get a new command queue entry, allocating it if required. Doesn't add > it to > + * the tail of the queue yet, use PQappendPipelinedCommand once the > command has > + * been written for that. If a command fails once it's called this, it > should > + * use PQrecyclePipelinedCommand to put it on the freelist or release it. "command fails once it's called this"? > +/* > + * PQrecyclePipelinedCommand > + * Push a command queue entry onto the freelist. It must be a dangling > entry > + * with null next pointer and not referenced by any other entry's next > pointer. > + */ > +static void > +PQrecyclePipelinedCommand(PGconn *conn, PGcommandQueueEntry * entry) > +{ > + if (entry == NULL) > + return; > + if (entry->next != NULL) > + { > + fprintf(stderr, "tried to recycle non-dangling command queue > entry"); > + abort(); Needs a libpq_gettext()? > +/* > + * PQbatchEnd > + * End batch mode and return to normal command mode. > + * > + * Has no effect unless the client has processed all results > + * from all outstanding batches and the connection is idle. > + * > + * Returns true if batch mode ended. > + */ > +int > +PQbatchEnd(PGconn *conn) > +{ > + if (!conn) > + return false; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return true; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_IDLE: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, IDLE in > batch mode")); > + break; > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, COPY in > batch mode")); > + break; Why aren't you returning false here, > + case PGASYNC_READY: > + case PGASYNC_READY_MORE: > + case PGASYNC_BUSY: > + /* can't end batch while busy */ > + return false; but are here? > + case PGASYNC_QUEUED: > + break; > + } > + > +int > +PQbatchQueueSync(PGconn *conn) > +{ > + PGcommandQueueEntry *entry; > + > + if (!conn) > + return false; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return false; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_IDLE: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, IDLE in > batch mode")); > + break; > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, COPY in > batch mode")); > + break; > + case PGASYNC_READY: > + case PGASYNC_READY_MORE: > + case PGASYNC_BUSY: > + case PGASYNC_QUEUED: > + /* can send sync to end this batch of cmds */ > + break; > + } Uhm, what is that switch actually achieving? We're not returning an error code, so ...? > + /* Should try to flush immediately if there's room */ > + PQflush(conn); "room"? Also, don't we need to process PQflush's return value? > +/* > + * PQbatchQueueProcess > + * In batch mode, start processing the next query in the queue. > + * > + * Returns true if the next query was popped from the queue and can > + * be processed by PQconsumeInput, PQgetResult, etc. > + * > + * Returns false if the current query isn't done yet, the connection > + * is not in a batch, or there are no more queries to process. > + */ > +int > +PQbatchQueueProcess(PGconn *conn) > +{ > + PGcommandQueueEntry *next_query; > + > + if (!conn) > + return false; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return false; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, COPY in > batch mode")); > + break; > + case PGASYNC_READY: > + case PGASYNC_READY_MORE: > + case PGASYNC_BUSY: > + /* client still has to process current query or results > */ > + return false; > + break; > + case PGASYNC_IDLE: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, IDLE in > batch mode")); > + break; > + case PGASYNC_QUEUED: > + /* next query please */ > + break; > + } Once more, I'm very unconvinced by the switch. Unless you do anything with the errors, this seems pointless. > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != > PGQUERY_SYNC) > + { > + /* > + * In an aborted batch we don't get anything from the server > for each > + * result; we're just discarding input until we get to the next > sync > + * from the server. The client needs to know its queries got > aborted > + * so we create a fake PGresult to return immediately from > + * PQgetResult. > + */ > + conn->result = PQmakeEmptyPGresult(conn, > + > PGRES_BATCH_ABORTED); > + if (!conn->result) > + { > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext("out of > memory")); > + pqSaveErrorResult(conn); > + } > + conn->asyncStatus = PGASYNC_READY; So we still return true in the OOM case? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers