[
https://issues.apache.org/jira/browse/SOLR-9401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886272#comment-15886272
]
Steve Rowe commented on SOLR-9401:
----------------------------------
This has been showing up on Jenkins-s, including mine, every once in a while.
I beasted it a few hundred times and it happens on my box about 5% of the time.
Looking at the beasting failures, I noticed that the following is always
present in the logs just before the NPE triggers a test failure (below is from
[https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/750/consoleText]):
{noformat}
[junit4] 2> 403449 ERROR
(TEST-TestPKIAuthenticationPlugin.test-seed#[D2BF175262C5311]) [ ]
o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp: 1488029294206 ,
received timestamp: 1488029300727 , TTL: 5000
{noformat}
In short, the NPEs are triggered because the 5 second TTL has expired (which is
itself weird - what the hell is taking so long???), causing
{{PKIAuthenticationPlugin.doAuthenticate()}} to skip the code path that would
pull the user principal from the headers and populate it on the request (line
#134 below). As a result, when test code attempts to verify that the user
principal on the request has the appropriate value, the NPE is triggered.
{code:java|title=PKIAuthenticationPlugin.java}
92: public boolean doAuthenticate(ServletRequest request, ServletResponse
response, FilterChain filterChain) throws Exception {
[...]
99: long receivedTime = System.currentTimeMillis();
100: String header = ((HttpServletRequest) request).getHeader(HEADER);
[...]
108: List<String> authInfo = StrUtils.splitWS(header, false);
[...]
115: String nodeName = authInfo.get(0);
116: String cipher = authInfo.get(1);
117:
118: PKIHeaderData decipher = decipherHeader(nodeName, cipher);
[...]
124: if ((receivedTime - decipher.timestamp) > MAX_VALIDITY) {
125: log.error("Invalid key request timestamp: {} , received timestamp:
{} , TTL: {}", decipher.timestamp, receivedTime, MAX_VALIDITY);
126: filterChain.doFilter(request, response);
127: return true;
128: }
129:
130: final Principal principal = "$".equals(decipher.userName) ?
131: SU :
132: new BasicUserPrincipal(decipher.userName);
133:
134: filterChain.doFilter(getWrapper((HttpServletRequest) request,
principal), response);
135: return true;
136: }
{code}
Why doesn't the overridden {{doAuthenticate()}} follow the contract laid out in
its superclass's javadocs, i.e. throw an exception on error?:
{code:java|title:AuthenticationPlugin.java}
/**
* This method attempts to authenticate the request. Upon a successful
authentication, this
* must call the next filter in the filter chain and set the user principal
of the request,
* or else, upon an error or an authentication failure, throw an exception.
[...]
* @throws Exception any exception thrown during the authentication, e.g.
PrivilegedActionException
public abstract boolean doAuthenticate(ServletRequest request,
ServletResponse response,
FilterChain filterChain) throws Exception;
{code}
MAX_VALIDITY could maybe be modified by the test class (via sysprop
{{pikiauth.ttl}}) to wait around longer:
{code:java|title=PKIAuthenticationPlugin.java}
63: private final int MAX_VALIDITY =
Integer.parseInt(System.getProperty("pkiauth.ttl", "5000"));
{code}
But that ^^ feels hackish. Seems like client code should react to timeouts
like this by retrying at least once?
I'm not sure which path is correct. Ping [~noble.paul] , who AFAICT wrote most
of this code, and [~janhoy], who was asking on the dev list about one of these
failures the other day.
> TestPKIAuthenticationPlugin NPE
> -------------------------------
>
> Key: SOLR-9401
> URL: https://issues.apache.org/jira/browse/SOLR-9401
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Steve Rowe
>
> Failure from my Jenkins, doesn't reproduce for me (this is
> {{tests-failures.txt}}):
> {noformat}
> 2> Creating dataDir:
> /var/lib/jenkins/jobs/Lucene-Solr-tests-master/workspace/solr/build/solr-core/test/J7/temp/solr.security.TestPKIAuthenticationPlugi
> n_7AC33B2240CB767D-001/init-core-data-001
> 2> 14521 INFO
> (SUITE-TestPKIAuthenticationPlugin-seed#[7AC33B2240CB767D]-worker) [ ]
> o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (fal
> se) via: @org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN,
> clientAuth=NaN)
> 2> 14540 INFO
> (TEST-TestPKIAuthenticationPlugin.test-seed#[7AC33B2240CB767D]) [ ]
> o.a.s.SolrTestCaseJ4 ###Starting test
> 2> 15553 ERROR
> (TEST-TestPKIAuthenticationPlugin.test-seed#[7AC33B2240CB767D]) [ ]
> o.a.s.s.PKIAuthenticationPlugin No SolrAuth header present
> 2> 15843 ERROR
> (TEST-TestPKIAuthenticationPlugin.test-seed#[7AC33B2240CB767D]) [ ]
> o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp: 9 ,
> received timestamp: 1470760833176 , TTL: 5000
> 2> 15843 INFO
> (TEST-TestPKIAuthenticationPlugin.test-seed#[7AC33B2240CB767D]) [ ]
> o.a.s.SolrTestCaseJ4 ###Ending test
> 2> NOTE: download the large Jenkins line-docs file by running 'ant
> get-jenkins-line-docs' in the lucene directory.
> 2> NOTE: reproduce with: ant test -Dtestcase=TestPKIAuthenticationPlugin
> -Dtests.method=test -Dtests.seed=7AC33B2240CB767D -Dtests.slow=true -Dtests.li
> nedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt
> -Dtests.locale=cs -Dtests.timezone=Europe/Tirane -Dtests.asserts=true
> -Dtests.file.encoding=U
> TF-8
> [12:40:32.094] ERROR 1.35s J7 | TestPKIAuthenticationPlugin.test <<<
> > Throwable #1: java.lang.NullPointerException
> > at
> __randomizedtesting.SeedInfo.seed([7AC33B2240CB767D:F29704F8EE371B85]:0)
> > at
> org.apache.solr.security.TestPKIAuthenticationPlugin.test(TestPKIAuthenticationPlugin.java:144)
> [...]
> 2> 15867 INFO
> (SUITE-TestPKIAuthenticationPlugin-seed#[7AC33B2240CB767D]-worker) [ ]
> o.a.s.SolrTestCaseJ4 ###deleteCore
> 2> NOTE: leaving temporary files on disk at:
> /var/lib/jenkins/jobs/Lucene-Solr-tests-master/workspace/solr/build/solr-core/test/J7/temp/solr.security.TestPKIAuthenticationPlugin_7AC33B2240CB767D-001
> 2> NOTE: test params are: codec=Asserting(Lucene62): {}, docValues:{},
> maxPointsInLeafNode=752, maxMBSortInHeap=5.390190554185364,
> sim=RandomSimilarity(queryNorm=true): {}, locale=cs, timezone=Europe/Tirane
> 2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_77
> (64-bit)/cpus=16,threads=1,free=255922760,total=336592896
> 2> NOTE: All tests run in this JVM: [TestIndexingPerformance,
> TestPKIAuthenticationPlugin]
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]