RE: libpq debug log

tsunakawa.ta...@fujitsu.com Sat, 16 Jan 2021 21:23:57 -0800

From: iwata....@fujitsu.com <iwata....@fujitsu.com>
> This patch includes these items:


Is there anything else in this revision that is not handled?

Below are my comments.

Also, why don't you try running the regression tests with a temporary 
modification to PQtrace() to output the trace to a file?  The sole purpose is 
to confirm that this patch makes the test crash (core dump).


(1)
-       conn->Pfdebug = debug_port;
+       if (pqTraceInit(conn))
+       {
+               conn->Pfdebug = debug_port;
+               if (conn->Pfdebug != NULL)
+                       setlinebuf(conn->Pfdebug);
+       }
+       else
+       {
+               /* XXX report ENOMEM? */
+               fprintf(conn->Pfdebug, "Failed to initialize trace support\n");
+               fflush(conn->Pfdebug);
+       }
 }

* When the passed debug_port is NULL, the function should return after calling 
PQuntrace().

* Is setlinebuf() available on Windows?  Maybe you should use setvbuf() instead.

* I don't see the need for separate pqTraceInit() function, because it is only 
called here.


(2)
+bool
+pqTraceInit(PGconn *conn)
+{
+       /* already done? */
+       if (conn->be_msg != NULL)
+       {

What's this code for?  I think it's sufficient that PQuntrace() frees b_msg and 
PQtrace() always allocates b_msg.


(3)
+       conn->fe_msg = malloc(sizeof(pqFrontendMessage) +
+                                                       MAX_FRONTEND_MSGS * 
sizeof(pqFrontendMessageField));
+       conn->fe_msg->field_num = 0;
+       if (conn->fe_msg == NULL)
+       {
+               free(conn->be_msg);
+               /* NULL out for the case that fe_msg malloc fails */
+               conn->be_msg = NULL;
+               return false;
+       }
+       conn->fe_msg->field_num = 0;

The memory for the fields array is one entry larger than necessary.  The 
allocation code would be:

+       conn->fe_msg = malloc(offsetof(pqFrontendMessage, fields) +
+                                                       MAX_FRONTEND_MSGS * 
sizeof(pqFrontendMessageField));

Also, the line with "field_num = 0" appears twice.  The first one should be 
removed.


(4)
+static void
+pqLogMessageByte1(PGconn *conn, char v, PGCommSource commsource)
+{
...
+       if (PG_PROTOCOL_MAJOR(conn->pversion) >= 3)
+       {

I think you can remove the protocol version check in various logging functions, 
because PQtrace() disables logging when the protocol version is 2.


(5)
@@ -966,10 +966,6 @@ pqSaveParameterStatus(PGconn *conn, const char *name, 
const char *value)
        pgParameterStatus *pstatus;
        pgParameterStatus *prev;
 
-       if (conn->Pfdebug)
-               fprintf(conn->Pfdebug, "pqSaveParameterStatus: '%s' = '%s'\n",
-                               name, value);
-

Where is this information output instead?


(6)
+ * Set up state so that we can trace. NB -- this might be called mutiple

mutiple -> multiple


(7)
+       LOG_FIRST_BYTE,                         /* logging the first byte 
identifing the
+                                                                  protocol 
message type */

identifing -> identifying


(8)
+#define pqLogMsgByte1(CONN, CH, SOURCE) \
+((CONN)->Pfdebug ? pqLogMessageByte1(CONN, CH, SOURCE) : 0)

For functions that return void, follow the style of CHECK_FOR_INTERRUPTS() 
defined in miscadmin.h.


(9)
+                               currtime = time(NULL);
+                               tmp = localtime(&currtime);
+                               strftime(timestr, sizeof(timestr), "%Y-%m-%d 
%H:%M:%S %Z", tmp);
+
+                               fprintf(conn->Pfdebug, "%s %s ", timestr, 
message_direction);

It's better to have microsecond accuracy, because knowing the accumulation of 
such fine-grained timings may help to troubleshoot heavily-loaded client cases. 
 You can mimic setup_formatted_log_time() in src/backend/utils/error/elog.c.  
This is used for the %m marker in log_line_prefix.


(10)
I think it's better to use tabs (or any other character that is less likely to 
appear in the log field) as a field separator, because it makes it easier to 
process the log with a spreadsheet or some other tools.


(11)
+                               /*
+                                * True type of message tagged '\0' is known 
when next 4 bytes is
+                                * checked. So we delay printing message type 
to pqLogMsgInt()
+                                */
+                               if (v != '\0')
+                                       fprintf(conn->Pfdebug, "%s ",
+                                                       
pqGetProtocolMsgType((unsigned char) v, commsource));

In what cases does '\0' appear as a message type?


(12)
+static const char *
+pqGetProtocolMsgType(unsigned char c, PGCommSource commsource)
+{
+       if (commsource == FROM_BACKEND && c < lengthof(protocol_message_type_b))
+               return protocol_message_type_b[c];
+       else if (commsource == FROM_FRONTEND && c < 
lengthof(protocol_message_type_f))
+               return protocol_message_type_f[c];
+       else
+               return "UnknownMessage";
+}

This function returns NULL (=0) when protocol_message_type_b/f[c] is 0.  That 
crashes the caller:

+                                       fprintf(conn->Pfdebug, "%s ",
+                                                       
pqGetProtocolMsgType((unsigned char) v, commsource));

Plus, you may as well print the invalid message type.  Why don't you do 
something like this:

+static const char *
+pqGetProtocolMsgType(unsigned char c, PGCommSource commsource)
+{
+       static char unknown_message[64];
+       char *msg = NULL;

+       if (commsource == FROM_BACKEND && c < lengthof(protocol_message_type_b))
+               msg = protocol_message_type_b[c];
+       else if (commsource == FROM_FRONTEND && c < 
lengthof(protocol_message_type_f))
+               msg = protocol_message_type_f[c];
+
+       if (msg == NULL)
+       {
+               sprintf(unknown_message, "Unknown message %02x", c);
+               msg = unknown_message;
+       }
+
+       return msg;
+}


(13)
@@ -156,7 +156,12 @@ pqParseInput3(PGconn *conn)
                {
                        /* If not IDLE state, just wait ... */
                        if (conn->asyncStatus != PGASYNC_IDLE)
+                       {
+                               /* Terminate a half-finished logging message */
+                               if (conn->Pfdebug)
+                                       pqTraceForcelyBreakLine(msgLength, 
conn);
                                return;
+                       }
 
                        /*
                         * Unexpected message in IDLE state; need to recover 
somehow.

What's this situation like?  Why output a new line and reset the trace status?


(14)
+/* pqLogInvalidProtocol: Print that the protocol message is invalid */
+static void
+pqLogInvalidProtocol(PGconn *conn)
+{
+       fprintf(conn->Pfdebug, ":::Invalid Protocol\n");
+       conn->be_msg->state = LOG_FIRST_BYTE;
+}

Is it sufficient to just reset the state field?  Isn't it necessary to call 
pqTraceResetBeMsg() instead?


(15)
@@ -212,15 +368,8 @@ pqSkipnchar(size_t len, PGconn *conn)
...
        conn->inCursor += len;
-
        return 0;
 }

This is an unnecessary removal of an empty line.


(16)
+static void
+pqLogMessageByte1(PGconn *conn, char v, PGCommSource commsource)
+{
+       char       *message_direction = commsource == FROM_BACKEND ? "<" : ">";
...
+               switch (conn->be_msg->state)

This function handles messages in both directions.  But the switch condition is 
based on the backend message state (b_msg).  How is this correct?


(17)
What's the difference between pqLogMsgString() and pqLogMsgnchar()?  They both 
take a length argument.  Do they have to be separate functions?


(18)
        if (conn->Pfdebug)
-               fprintf(conn->Pfdebug, "To backend> %c\n", c);
+               pqStoreFrontendMsg(conn, LOG_BYTE1, 1);

To match the style for backend messages with pqLogMsgByte1 etc., why don't you 
wrap the conn->Pfdebug check in the macro?


(19)
@@ -520,8 +667,6 @@ pqPutMsgStart(char msg_type, bool force_len, PGconn *conn)
                /* allow room for message length */
                endPos += 4;
        }
-       else
-               lenPos = -1;
 
Why is this change necessary?


(20)
+static void
+pqLogFrontendMsg(PGconn *conn)
...
+       for (i = 0; i < conn->fe_msg->field_num; i++)
+       {
+               message_addr = conn->fe_msg->fields[i].offset_in_buffer;
+               length = conn->fe_msg->fields[i].length;
+
+               switch (conn->fe_msg[i].type)
+               {
+                       case LOG_BYTE1:

When I saw this switch condition, the confusion exploded.  Shouldn't the type 
should be the attribute of each field as follows?

+               switch (conn->fe_msg->fields[i].type)

I also found the function names for frontend -> backend messages hard to grasp. 
 Could you show the flow of function calls when sending messages to the backend?


(21)
+                               uint16          result16;
+                               memcpy(&result16, conn->outBuffer + 
message_addr, length);

You should have an empty line between the declaration part and execution part.


(22)
I couldn't understand pqLogBinaryMsg() at all.  Could you explain what it does? 
 Particularly, if all bytes in the string is printable, the function seems to 
do nothing:

+               if (isprint(v[i]))
+                       continue;

Should the following part be placed outside the for loop?  What does 'pin' 
mean? (I want the variable name to be more intuitive.)

+               if (pin < length)
+                       fwrite(v + pin, 1, length - pin, conn->Pfdebug);


Regards
Takayuki Tsunakawa

RE: libpq debug log

Reply via email to