On 14/12/2011, at 10:33 PM, James Peach wrote:

> Hi all,
>
> I've been looking at an issue where TSFetchUrl does not deliver the EOS
> event to the caller until long after the origin server closes the read
> socket. I made a small change to read_from_net() so that it notices when
> read(2) returns 0 even if it returned > 0 a previous time round the loop.
> This causes it to deliver the EOS event up to the HTTP state machine.
>
> In the end, however, the EOS event is not delivered to the FetchSM. Here's
> what happens ...
>
> We receive the response from the origin server:
>
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler [http server VC_EVENT_READ_READY]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler_dechunked [http server VC_EVENT_READ_READY]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_chunk) creating a chunk of size 3296 bytes
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_redirect) [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 100
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: reenable Write
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) attempting to read 32768 bytes
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) read 0 bytes
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) triggering EOS
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler [http server VC_EVENT_EOS]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler_dechunked [http server VC_EVENT_EOS]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_redirect) [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 104
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http) [0] [&HttpSM::tunnel_handler_server, VC_EVENT_EOS]
>
> At this point, we know that the read socket closed and we are trying to
> deliver the EOS event up the stack.
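For reference, the read_from_net() tweak described above is along the lines
of this sketch (a paraphrase with a hypothetical helper name, not the actual
patch): keep draining the socket, but remember an EOF from read(2) even when
earlier iterations already returned data, so EOS can be signalled on the same
wakeup that delivered the final bytes.

    #include <unistd.h>     /* read(2), ssize_t */
    #include <cstddef>      /* size_t */

    // Returns the total bytes read; sets *saw_eos when read(2) hit EOF,
    // even if previous iterations of the loop already returned data.
    ssize_t
    drain_socket(int fd, char *buf, size_t len, bool *saw_eos)
    {
      ssize_t total = 0;
      *saw_eos = false;
      while ((size_t)total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n > 0) {
          total += n;       // got data; go round the loop again
          continue;
        }
        if (n == 0)
          *saw_eos = true;  // peer closed: report EOS alongside the data
        break;              // n < 0: EAGAIN or error; caller checks errno
      }
      return total;
    }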
> Then this happens:
>
> [Dec 14 20:36:31.338] Server {0x7fff7b5f9960} DEBUG: (http_ss) [0] session closed
> [Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: reenable Write
> [Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Passive: Received event 1
> [Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side
> [Dec 14 20:36:31.363] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side; act_on 3499
> [Dec 14 20:36:31.386] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side; added 3499
> [Dec 14 20:36:31.386] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] consumer_handler [user agent VC_EVENT_WRITE_COMPLETE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0] [&HttpSM::tunnel_handler_ua, VC_EVENT_WRITE_COMPLETE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_cs) [0] session released by sm [0]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_cs) [0] initiating io for next header
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: do_io_read for -1 bytes
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0] [HttpSM::main_handler, HTTP_TUNNEL_EVENT_DONE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0] [&HttpSM::tunnel_handler, HTTP_TUNNEL_EVENT_DONE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_redirect) [HttpTunnel::deallocate_postdata_copy_buffers]
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (http_seq) [HttpStateMachineGet::update_stats] Logging transaction
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (http) [0] dellocating sm
>
> So now the HttpSM knows that it's all over, because it is deallocating
> itself in HttpSM::kill_this().
>
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side; act_on 3499
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side; added 3499
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM) [fetch_handler] calling fetch_plugin
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM) [process_fetch_read] I am here read
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM) [process_fetch_read] number of bytes in read ready 3499
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: reenable Read
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: reenable Write
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Passive: Received event 1
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_read_side
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_read_side; act_on 0
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: closed? 0 shutdown? 0
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Active: Received event 1
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side; act_on 0
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: closed? 0 shutdown? 0
>
> proxy/PluginVC.cc:
>
>   624     if (act_on <= 0) {
>   625       Debug("pvc", "[%u] %s: closed? %u shutdown? %u", PVC_ID, PVC_TYPE,
>   626             other_side->closed, other_side->write_state.shutdown);
>   627       if (other_side->closed || other_side->write_state.shutdown) {
>   628         read_state.vio._cont->handleEvent(VC_EVENT_EOS, &read_state.vio);
>   629       }
>   630       return;
>   631     }
>
> So the PluginVC does not think its other_side is closed (because no one
> has called do_io_close()?). But eventually there is an HttpClientSession
> timeout, which closes the right PluginVC and wakes it. Now the FetchSM is
> kicked and the TSFetchUrl event triggers:
>
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc_event) [0] Passive: Received event 2
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] [&HttpClientSession::state_keep_alive, VC_EVENT_ACTIVE_TIMEOUT]
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Passive: do_io_close
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] session closed
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] session destroy
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Passive: process_close
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc_event) [0] Active: Received event 1
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: process_read_side
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: process_read_side; act_on 0
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: closed? 1 shutdown? 0
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (FetchSM) [fetch_handler] calling fetch_plugin
>
> After beating my head against this for a while, I don't see what the
> right way to deliver the EOS event up from the HTTP layer to the PluginVC
> is. Can anyone hold my hand a little and give me some clues about how
> this is supposed to work?
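To answer part of my own question: the PluginVC.cc snippet above implies the
following handshake, shown here as a self-contained toy model (hypothetical
types, not the actual ATS classes). When there is no data pending, EOS can
only reach the reader after the peer half calls do_io_close() -- which, in
the trace above, only happened at the keep-alive timeout.

    #include <cstdio>

    // Toy model of one half of a PluginVC pair. EOS propagates only via
    // the peer's `closed` flag, mirroring PluginVC.cc:624-631 above.
    struct Side {
      Side *other_side = nullptr;
      bool  closed = false;
      bool  write_shutdown = false;
      long  bytes_buffered = 0;

      void do_io_close() {
        closed = true;                    // what flips "closed? 1" in the log
        other_side->process_read_side();  // wake the reader on the far end
      }

      void process_read_side() {
        long act_on = bytes_buffered;
        if (act_on <= 0) {
          // No data pending: EOS is deliverable only if the peer closed or
          // shut down its write side. Otherwise we silently return, which
          // is why the FetchSM sat idle until the session timed out.
          if (other_side->closed || other_side->write_shutdown)
            std::puts("deliver VC_EVENT_EOS to the read VIO continuation");
          return;
        }
        bytes_buffered = 0;
        std::puts("deliver VC_EVENT_READ_READY");
      }
    };

    int main() {
      Side active, passive;
      active.other_side = &passive;
      passive.other_side = &active;

      active.process_read_side();  // act_on 0, peer still open: nothing happens
      passive.do_io_close();       // peer closes: EOS finally goes up
    }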
This appears to do the trick:

diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 624637b..b450755 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -2842,6 +2842,7 @@ HttpSM::tunnel_handler_server(int event, HttpTunnelProducer * p)
   if (close_connection) {
     p->vc->do_io_close();
     p->read_vio = NULL;
+    t_state.client_info.keep_alive = HTTP_NO_KEEPALIVE;
   } else {
     server_session->attach_hostname(t_state.current.server->name);
     server_session->server_trans_stat--;

I don't understand the HTTP state machine well enough to know whether this
is a generally correct fix, so I'd love it if anyone could give this a
thumbs up or down :)
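My reading of why the one-liner helps, again as a toy model (hypothetical
names; the real decision lives in HttpSM::tunnel_handler_ua): forcing the
client side out of keep-alive means the client VC gets do_io_close() at the
end of the transaction, instead of the session being parked for the next
request. For a PluginVC client, that close is exactly what sets
other_side->closed and lets the EOS through promptly.

    #include <cstdio>

    enum KeepAlive { HTTP_KEEPALIVE, HTTP_NO_KEEPALIVE };

    struct ClientVC {
      void do_io_close() {
        std::puts("client VC closed -> peer sees closed == 1 -> EOS delivered");
      }
    };

    // Hypothetical stand-in for the user-agent-side end-of-transaction
    // choice: keep-alive sessions are released for reuse, anything else
    // is closed outright.
    void finish_ua_side(KeepAlive ka, ClientVC &vc) {
      if (ka == HTTP_KEEPALIVE)
        std::puts("session released for next request; a PluginVC peer never wakes");
      else
        vc.do_io_close();  // the path the patch forces once the origin closed
    }

    int main() {
      ClientVC vc;
      finish_ua_side(HTTP_KEEPALIVE, vc);     // pre-patch: EOS stuck until timeout
      finish_ua_side(HTTP_NO_KEEPALIVE, vc);  // patched: EOS reaches the FetchSM
    }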