From: iwata....@fujitsu.com <iwata....@fujitsu.com> > This patch includes these items:
Is there anything else in this revision that is not handled? Below are my comments. Also, why don't you try running the regression tests with a temporary modification to PQtrace() to output the trace to a file? The sole purpose is to confirm that this patch makes the test crash (core dump). (1) - conn->Pfdebug = debug_port; + if (pqTraceInit(conn)) + { + conn->Pfdebug = debug_port; + if (conn->Pfdebug != NULL) + setlinebuf(conn->Pfdebug); + } + else + { + /* XXX report ENOMEM? */ + fprintf(conn->Pfdebug, "Failed to initialize trace support\n"); + fflush(conn->Pfdebug); + } } * When the passed debug_port is NULL, the function should return after calling PQuntrace(). * Is setlinebuf() available on Windows? Maybe you should use setvbuf() instead. * I don't see the need for separate pqTraceInit() function, because it is only called here. (2) +bool +pqTraceInit(PGconn *conn) +{ + /* already done? */ + if (conn->be_msg != NULL) + { What's this code for? I think it's sufficient that PQuntrace() frees b_msg and PQtrace() always allocates b_msg. (3) + conn->fe_msg = malloc(sizeof(pqFrontendMessage) + + MAX_FRONTEND_MSGS * sizeof(pqFrontendMessageField)); + conn->fe_msg->field_num = 0; + if (conn->fe_msg == NULL) + { + free(conn->be_msg); + /* NULL out for the case that fe_msg malloc fails */ + conn->be_msg = NULL; + return false; + } + conn->fe_msg->field_num = 0; The memory for the fields array is one entry larger than necessary. The allocation code would be: + conn->fe_msg = malloc(offsetof(pqFrontendMessage, fields) + + MAX_FRONTEND_MSGS * sizeof(pqFrontendMessageField)); Also, the line with "field_num = 0" appears twice. The first one should be removed. (4) +static void +pqLogMessageByte1(PGconn *conn, char v, PGCommSource commsource) +{ ... + if (PG_PROTOCOL_MAJOR(conn->pversion) >= 3) + { I think you can remove the protocol version check in various logging functions, because PQtrace() disables logging when the protocol version is 2. (5) @@ -966,10 +966,6 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value) pgParameterStatus *pstatus; pgParameterStatus *prev; - if (conn->Pfdebug) - fprintf(conn->Pfdebug, "pqSaveParameterStatus: '%s' = '%s'\n", - name, value); - Where is this information output instead? (6) + * Set up state so that we can trace. NB -- this might be called mutiple mutiple -> multiple (7) + LOG_FIRST_BYTE, /* logging the first byte identifing the + protocol message type */ identifing -> identifying (8) +#define pqLogMsgByte1(CONN, CH, SOURCE) \ +((CONN)->Pfdebug ? pqLogMessageByte1(CONN, CH, SOURCE) : 0) For functions that return void, follow the style of CHECK_FOR_INTERRUPTS() defined in miscadmin.h. (9) + currtime = time(NULL); + tmp = localtime(&currtime); + strftime(timestr, sizeof(timestr), "%Y-%m-%d %H:%M:%S %Z", tmp); + + fprintf(conn->Pfdebug, "%s %s ", timestr, message_direction); It's better to have microsecond accuracy, because knowing the accumulation of such fine-grained timings may help to troubleshoot heavily-loaded client cases. You can mimic setup_formatted_log_time() in src/backend/utils/error/elog.c. This is used for the %m marker in log_line_prefix. (10) I think it's better to use tabs (or any other character that is less likely to appear in the log field) as a field separator, because it makes it easier to process the log with a spreadsheet or some other tools. (11) + /* + * True type of message tagged '\0' is known when next 4 bytes is + * checked. So we delay printing message type to pqLogMsgInt() + */ + if (v != '\0') + fprintf(conn->Pfdebug, "%s ", + pqGetProtocolMsgType((unsigned char) v, commsource)); In what cases does '\0' appear as a message type? (12) +static const char * +pqGetProtocolMsgType(unsigned char c, PGCommSource commsource) +{ + if (commsource == FROM_BACKEND && c < lengthof(protocol_message_type_b)) + return protocol_message_type_b[c]; + else if (commsource == FROM_FRONTEND && c < lengthof(protocol_message_type_f)) + return protocol_message_type_f[c]; + else + return "UnknownMessage"; +} This function returns NULL (=0) when protocol_message_type_b/f[c] is 0. That crashes the caller: + fprintf(conn->Pfdebug, "%s ", + pqGetProtocolMsgType((unsigned char) v, commsource)); Plus, you may as well print the invalid message type. Why don't you do something like this: +static const char * +pqGetProtocolMsgType(unsigned char c, PGCommSource commsource) +{ + static char unknown_message[64]; + char *msg = NULL; + if (commsource == FROM_BACKEND && c < lengthof(protocol_message_type_b)) + msg = protocol_message_type_b[c]; + else if (commsource == FROM_FRONTEND && c < lengthof(protocol_message_type_f)) + msg = protocol_message_type_f[c]; + + if (msg == NULL) + { + sprintf(unknown_message, "Unknown message %02x", c); + msg = unknown_message; + } + + return msg; +} (13) @@ -156,7 +156,12 @@ pqParseInput3(PGconn *conn) { /* If not IDLE state, just wait ... */ if (conn->asyncStatus != PGASYNC_IDLE) + { + /* Terminate a half-finished logging message */ + if (conn->Pfdebug) + pqTraceForcelyBreakLine(msgLength, conn); return; + } /* * Unexpected message in IDLE state; need to recover somehow. What's this situation like? Why output a new line and reset the trace status? (14) +/* pqLogInvalidProtocol: Print that the protocol message is invalid */ +static void +pqLogInvalidProtocol(PGconn *conn) +{ + fprintf(conn->Pfdebug, ":::Invalid Protocol\n"); + conn->be_msg->state = LOG_FIRST_BYTE; +} Is it sufficient to just reset the state field? Isn't it necessary to call pqTraceResetBeMsg() instead? (15) @@ -212,15 +368,8 @@ pqSkipnchar(size_t len, PGconn *conn) ... conn->inCursor += len; - return 0; } This is an unnecessary removal of an empty line. (16) +static void +pqLogMessageByte1(PGconn *conn, char v, PGCommSource commsource) +{ + char *message_direction = commsource == FROM_BACKEND ? "<" : ">"; ... + switch (conn->be_msg->state) This function handles messages in both directions. But the switch condition is based on the backend message state (b_msg). How is this correct? (17) What's the difference between pqLogMsgString() and pqLogMsgnchar()? They both take a length argument. Do they have to be separate functions? (18) if (conn->Pfdebug) - fprintf(conn->Pfdebug, "To backend> %c\n", c); + pqStoreFrontendMsg(conn, LOG_BYTE1, 1); To match the style for backend messages with pqLogMsgByte1 etc., why don't you wrap the conn->Pfdebug check in the macro? (19) @@ -520,8 +667,6 @@ pqPutMsgStart(char msg_type, bool force_len, PGconn *conn) /* allow room for message length */ endPos += 4; } - else - lenPos = -1; Why is this change necessary? (20) +static void +pqLogFrontendMsg(PGconn *conn) ... + for (i = 0; i < conn->fe_msg->field_num; i++) + { + message_addr = conn->fe_msg->fields[i].offset_in_buffer; + length = conn->fe_msg->fields[i].length; + + switch (conn->fe_msg[i].type) + { + case LOG_BYTE1: When I saw this switch condition, the confusion exploded. Shouldn't the type should be the attribute of each field as follows? + switch (conn->fe_msg->fields[i].type) I also found the function names for frontend -> backend messages hard to grasp. Could you show the flow of function calls when sending messages to the backend? (21) + uint16 result16; + memcpy(&result16, conn->outBuffer + message_addr, length); You should have an empty line between the declaration part and execution part. (22) I couldn't understand pqLogBinaryMsg() at all. Could you explain what it does? Particularly, if all bytes in the string is printable, the function seems to do nothing: + if (isprint(v[i])) + continue; Should the following part be placed outside the for loop? What does 'pin' mean? (I want the variable name to be more intuitive.) + if (pin < length) + fwrite(v + pin, 1, length - pin, conn->Pfdebug); Regards Takayuki Tsunakawa