Hi Wei Chiu,

We also observed the same issue when the NN replays large editlogs from the JN.
It looks like in Jetty 6 the default max idle timeout is 200 seconds.

public abstract class AbstractConnector extends AbstractBuffers implements Connector
{
    ....
    protected int _maxIdleTime=200000;
    ....
}
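
To illustrate why the idle timeout value matters when one side is slow, here is a minimal sketch using plain java.net sockets. It is not Jetty or Hadoop code: the class name IdleTimeoutSketch, the method serverTimesOut, and the use of setSoTimeout as a stand-in for a connector idle timeout are all assumptions made for illustration. The server side gives up on the connection if the peer stays silent longer than the configured timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not taken from Jetty or Hadoop.
public class IdleTimeoutSketch {

    /**
     * Returns true if the server side gave up because the peer stayed silent
     * longer than idleTimeoutMs, false if the peer's byte arrived in time.
     * idleTimeoutMs must be greater than 0 (0 means "wait forever").
     */
    static boolean serverTimesOut(int idleTimeoutMs, long clientDelayMs)
            throws IOException, InterruptedException {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Slow peer: connects, stays idle for clientDelayMs, then sends one byte.
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(server.getInetAddress(),
                                           server.getLocalPort())) {
                    Thread.sleep(clientDelayMs);
                    OutputStream out = s.getOutputStream();
                    out.write(1);
                    out.flush();
                } catch (IOException | InterruptedException e) {
                    // The server may already have closed the connection; ignore.
                }
            });
            client.start();
            try (Socket conn = server.accept()) {
                // Stand-in for an idle timeout: a read blocks at most this long.
                conn.setSoTimeout(idleTimeoutMs);
                InputStream in = conn.getInputStream();
                in.read();
                return false; // data arrived before the timeout
            } catch (SocketTimeoutException e) {
                return true;  // peer was idle too long; connection is dropped
            } finally {
                client.join();
            }
        }
    }
}
```

With a short timeout and a slow peer, for example serverTimesOut(200, 1000), the connection is dropped; with a generous timeout, for example serverTimesOut(2000, 0), the same exchange succeeds.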

Thanks,
Jason

On 12/7/20, 9:51 PM, "Wei-Chiu Chuang" <weic...@apache.org> wrote:

    Hi community,

    I want to share with you this observation.

    We received several case reports that users sometimes experience
    JournalNode timeouts when the NN requests edits from a JN. The end result is
    that (both!) NNs crash after the timeout (10 seconds).

    It seems to only happen to Hadoop 3 users (CDH6 and HDP3). While
    HADOOP-15696 <https://issues.apache.org/jira/browse/HADOOP-15696> offered a
    configurable switch for you to increase hadoop.http.idle_timeout.ms, it
    looks like a regression in Hadoop 3 and NN shouldn't simply crash because
    JN is slightly slow. It looks to me like a 10-second timeout for fetching edits
    from JN is simply too low.

    I believe this is a regression caused when we updated Jetty from 6 to 9 in
    Hadoop 3 (HADOOP-10075 <https://issues.apache.org/jira/browse/HADOOP-10075>).
    We replaced SelectChannelConnector.setLowResourceMaxIdleTime()
    with ServerConnector.setIdleTimeout() but they aren't the same.

    
    http://archive.eclipse.org/jetty/7.0.0.RC0/apidocs/org/eclipse/jetty/server/nio/SelectChannelConnector.html#getLowResourcesMaxIdleTime()
 

    
    https://www.eclipse.org/jetty/javadoc/9.4.26.v20200117/org/eclipse/jetty/server/AbstractConnector.html#setIdleTimeout(long)
 

    Does anyone know the behavior back in Hadoop 2/Jetty 6? Does it use Jetty's
    default idle time, which is 300 seconds?
