Hi, first of all, it's great that this is being tackled. Thanks a lot!
I don't have a very strong opinion on most of the details, but my thoughts are somewhere along these lines: - If possible, the most detailed information should be made available, not neccessarily by default, but at least if one wishes to see it. So in this case, having the option of using %Th and %Ti in addition to what's already there seems great (to me, which may not be representative) :) - While it's great to have all the information to figure out the nitty- gritty details, one should always consider that many people will only care for less detail, but the values should be "what they expect". In that fashion, I would love to see the "accept date" of the default log output be replaced with the start time of the actual HTTP request. I think this makes the most sense with regards to keep-alive and preconnect. Also, it gives consistency with regards to HTTP vs. HTTPS: the dates would then be comparable, and if I use SSL I should make sure I measure that _in addition_ to what I am already measuring. Also, we ingest our logs into an ElasticSearch cluster and would _love_ to just use that date as the timestamp (but it makes no sense with the accept date). Also, it can be hard to correlate downstream events to the log output, because the time can be quite a bit off. So with that reasoning, I guess I would love to see: - all new variables created that you mentioned (%tr, %Ti, %Th, not sure about %TR) - make %tr the default as logged date Just my two cents :) Thanks again for taking this on, Conrad On 08/23/2016 11:34 AM, Willy Tarreau wrote: > Hi all, > > For quite some time now I've been annoyed by the browsers' preconnect > feature causing unexploitable %Tq in the logs. But to defend the browsers, > we're having the same issue with the keep-alive feature as well. > > So I planned to change %Tq to %Ti+%Tq where Ti would be the idle time > before the first byte of request and Tq would be (as it used to be a > decade ago) the time to receive the complete request. > > Recently Thierry discussed with me about the same issue and wanted also > to handle %Th (the time to complete the connection handshake, eg: SSL). > > He initially proposed to have a new field in addition to the existing > ones to report the request time only, and not to change Tq. I asked him > to modify this to have Tq replaced with what it should really be but now > after having seen some such logs I'm having a doubt about the relevance > of this choice. > > While it's appealing to know that every user of httplog will find again > usable logs (regarding the Tq field), I realized that this will offset > all dates by an invisible Ti value which will not be reported by default > if we don't change the default log format. Example : > > Aug 23 10:55:12 haproxy[3848]: 127.0.0.1:47034 [23/Aug/2016:10:55:09.818] > test test/test1 246/0/0/1/2365 200 79 - - ---- 1/1/0/0/3 0/0 "GET / HTTP/1.1" > > Here one wouldbe tempted to think that the complete request was received at > about 10:55:10.064 (10:55:09.818+.246) and that the server received the > connection at about the same time. But in fact this is wrong because if we > log the new %Th and %Ti, we get this : > > Aug 23 10:55:12 haproxy[3848]: 127.0.0.1:47034 [23/Aug/2016:10:55:09.818] > test test/test1 0/2118/246/0/0/1/2365 200 79 - - ---- 1/1/0/0/3 0/0 "GET / > HTTP/1.1" > > And it clearly shows that it took me 2 seconds to type this request so all > dates are off by 2 seconds. Thus I'm tempted to roll back to Thierry's > initialy proposal but I'd like to get other opinions so that we try to do > the best thing. > > Among the other possible choices, I thought that maybe we could have a new > date field which reports the date of end the start of the request and use > this one in conjunction with Tq in the HTTP logs. But then this new date > field should have the same name as the existing one in order not to confuse > people using their own modified format... which in turn causes an > inconsistency with the TCP log format, so this doesn't seem like a very > good idea. > > Maybe something between all of this could consist in this : > > - we create a new date field (eg: %tr) to note the date the request > starts to arrive (end of the idle period) > - we use a new field for the request-only time (eg: %TR) > - we change the default HTTP log format to use both. > > That would give us %Tq = %Th + %Ti + %TR, and %tr = %t + %Th + %Ti. > > This way people who use "option httplog" will suddenly have much more > exploitable logs without changing anything, as the dates will now reflect > the start of receipt of requests and not the end of last response, and the > other ones will not be harmed by any change. > > Any opinion on this is welcome. Note: the main work is done, it's only a > matter of defining what we want to see in the logs, it's easy to adapt > once we know what we want. BTW, I'm also thinking about adding a field to > indicate if a request is the first one of a connection or not. > > I'm attaching Thierry's revised patch for reference (the one implementing > my initial proposal that I'm not happy with anymore). > > Thanks, > Willy > -- Conrad Hoffmann Traffic Engineer SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany Managing Director: Alexander Ljung | Incorporated in England & Wales with Company No. 6343600 | Local Branch Office | AG Charlottenburg | HRB 110657B

