I've spent some time looking at this and it's quite puzzling. There is no 
check for server_vc being NULL because the only place I can find where it is 
set to nullptr is HttpServerSession::do_io_close, which also destroys the 
HttpServerSession itself. So it should never be the case that server_vc is 
NULL while the enclosing HttpServerSession is still valid. I can only think 
that I've missed a place where server_vc gets cleared without destroying the 
HttpServerSession. Still investigating.
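
To make the invariant concrete, here is a minimal sketch. It is not the actual 
Traffic Server code: FakeNetVC, FakeServerSession, clear_vc_only() and 
invariant_holds() are all hypothetical names made up for illustration. It only 
models the assumption that server_vc is cleared in the same step that tears 
down the session, and shows what a path that clears it on its own would look 
like.

```cpp
#include <cassert>

// Hypothetical stand-in for the network VC; not the real Traffic Server class.
struct FakeNetVC {
  bool closed = false;
  void do_io_close() { closed = true; }
};

// Hypothetical stand-in for HttpServerSession, reduced to the one invariant
// discussed above: a session that is still valid has a non-null server_vc.
struct FakeServerSession {
  FakeNetVC *server_vc;
  bool destroyed = false;

  explicit FakeServerSession(FakeNetVC *vc) : server_vc(vc) {}

  // Models the described behaviour: the pointer is cleared and the session
  // is torn down in the same call, so the two can never disagree.
  void do_io_close() {
    if (server_vc != nullptr) {
      server_vc->do_io_close();
      server_vc = nullptr;
    }
    destroyed = true;
  }

  // Hypothetical rogue path: clears the pointer but leaves the session alive.
  // This is the kind of code path being searched for, not a known one.
  void clear_vc_only() { server_vc = nullptr; }

  // True when the session is either gone or still has its VC.
  bool invariant_holds() const { return destroyed || server_vc != nullptr; }
};
```

If something like clear_vc_only() exists in the real code, an assertion on the 
equivalent of invariant_holds() at that point would catch it at the source 
rather than at the later dereference.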
 

    On Tuesday, February 21, 2017 10:43 AM, Alan Carroll 
<[email protected]> wrote:
 

 I will try to look at this this week.
 

    On Friday, February 17, 2017 10:15 AM, Dale Ghent <[email protected]> 
wrote:
 

 
Ping!

I'd like to try to get to the bottom of this cause; is there anything I can do 
to help figure this one out? Disabling the gzip plugin has eliminated the 
occurrences of this happening.

/dale

> On Feb 9, 2017, at 1:32 PM, Dale Ghent <[email protected]> wrote:
> 
> 
> Hey all;
> 
> For the past several days, I've been tracking down a crash in traffic_server 
> 6.2.1 that is happening fairly regularly on our cache 
> nodes. The basic pathology is: server_vc is NULL under some circumstances, 
> which has caused segfaults in code paths which do not guard for this 
> condition. This issue was raised last year in ticket TS-5046 and reported to 
> be fixed with the release of 6.2.1:
> 
> https://issues.apache.org/jira/browse/TS-5046
> 
> The fix was in PR 1222, by backporting the fix for a different, related 
> issue reported in TS-4938, also reported as fixed in 6.2.1:
> 
> https://issues.apache.org/jira/browse/TS-4938
> 
> In the stack trace given in TS-5046, the person reported the segfault 
> happening in HttpSM::tunnel_handler_server(). While their path to get there 
> was different than ours, the effect was the same. In my case, the segfaults 
> always happen during processing in the gzip plugin:
> 
> https://paste.ec/paste/ZJnXEDC6#Rd-AeEkmmbsqM6E4rA/WuFdcgnUrv9By+rKO6eaxWmY
> 
> As a work-around, I applied the bandaid fix to HttpSM::tunnel_handler_server() 
> that Masa mentioned in TS-5046. This of course avoids the null ptr deref there, but 
> now it occurs at the next point down the stack:
> 
> https://paste.ec/paste/+JwWMtuB#ew-87NnW1hQeFzwOCZ5GC+93EPF45YeJut/MtZwsdro
> 
> Obviously we can continue playing whack-a-mole and put guards in for a NULL 
> server_vc in HttpServerSession::release(), however I feel that this probably 
> isn't the desired route. Questions such as "Is server_vc *ever* supposed to 
> be NULL?" and "Why would it be null in the first place?" persist, so I'm 
> turning to more seasoned eyes here for your thoughts. At any rate, 
> TS-5046/4938 do not appear to have completely addressed the situation.
> 
> /dale
> 
> 


   

   
