[ https://issues.apache.org/jira/browse/TIKA-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782619#comment-17782619 ]
ASF GitHub Bot commented on TIKA-4165:
--------------------------------------

lxb007981 closed pull request #1436: [TIKA-4165] fix a flaky test.
URL: https://github.com/apache/tika/pull/1436

> Fix a flaky test, caused by nondeterministic iteration order of HashMap
> -----------------------------------------------------------------------
>
>                 Key: TIKA-4165
>                 URL: https://issues.apache.org/jira/browse/TIKA-4165
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Xinbo Lu
>            Priority: Minor
>         Attachments: diff
>
>
> Fix a flaky test, caused by nondeterministic iteration order of HashMap
> h3. Related Test
> [org.apache.tika.parser.microsoft.ooxml.xps.XPSParserTest.testVarious|https://github.com/lxb007981/tika/blob/971f0cbd9b46c1d7fb96b0a3732c3fc870920aba/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/xps/XPSParserTest.java#L59]
> h3. Root Cause
>
> [tika/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xps/XPSExtractorDecorator.java|https://github.com/lxb007981/tika/blob/d466492e0a01c8ee28c108bc3022f1a03ff530de/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xps/XPSExtractorDecorator.java#L119]
> Line 119 in
> [d466492|https://github.com/lxb007981/tika/commit/d466492e0a01c8ee28c108bc3022f1a03ff530de]
> ||for (Map.Entry<String, Metadata> embeddedImage : embeddedImages.entrySet()) {||
>
> When recursively extracting metadata from an XPS file, a plain {{HashMap}}
> is used and iterated. However, {{HashMap}} does not guarantee its
> iteration order, so the extracted metadata have no guaranteed order in the
> resulting metadata list.
> Later, the test
> {{org.apache.tika.parser.microsoft.ooxml.xps.XPSParserTest.testVarious}}
> assumes that the order of metadata in the list is the same as the
> {{HashMap}} insertion order, which makes the test flaky.
> h3. Fix
> We sort the metadata list before doing the comparison.
> h3. How to reproduce the test
> *Java version*: 11.0.20.1
> *Maven version*: 3.6.3
> # Build the module
> {{mvn clean install -DskipTests -pl tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module -am}}
> # Test without shuffling
> {{mvn -pl tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module test -Dtest=org.apache.tika.parser.microsoft.ooxml.xps.XPSParserTest#testVarious}}
> This test passed.
> # Test with shuffling using
> [NonDex|https://github.com/TestingResearchIllinois/NonDex]
> {{mvn -pl tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module edu.illinois:nondex-maven-plugin:2.1.1:nondex -Dtest=org.apache.tika.parser.microsoft.ooxml.xps.XPSParserTest#testVarious}}
> This test passed with the proposed fix but failed without it.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
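
For readers of the archive, the root cause and fix described in the issue can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: the class name `SortedComparisonSketch` and the helpers `collectNames` and `sortedNames` are hypothetical, and a `Map<String, String>` stands in for Tika's `Map<String, Metadata>`. The message does not say which key the actual fix sorts by, so sorting by entry name here is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortedComparisonSketch {

    // Hypothetical stand-in for the extraction loop: collects one name per
    // map entry, in whatever order the map happens to iterate.
    public static List<String> collectNames(Map<String, String> embeddedImages) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> embeddedImage : embeddedImages.entrySet()) {
            names.add(embeddedImage.getKey());
        }
        return names;
    }

    // Returns a sorted copy, so that assertions on the result no longer
    // depend on the map's iteration order.
    public static List<String> sortedNames(Map<String, String> embeddedImages) {
        List<String> names = collectNames(embeddedImages);
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Illustrative data only; these file names are not taken from the Tika test.
        Map<String, String> embeddedImages = new HashMap<>();
        embeddedImages.put("image2.jpeg", "image/jpeg");
        embeddedImages.put("image1.png", "image/png");
        embeddedImages.put("image3.tiff", "image/tiff");

        // Order-dependent: HashMap makes no guarantee, and NonDex shuffles it.
        System.out.println(collectNames(embeddedImages));
        // Order-independent: sorted lexicographically before any comparison.
        System.out.println(sortedNames(embeddedImages));
    }
}
```

A test that compares against `sortedNames(...)` rather than `collectNames(...)` should give the same result however the map is iterated, which is the property NonDex checks. If the order is meant to be part of the parser's contract, an insertion-ordered map such as `LinkedHashMap` would be another option, but that would change the production code rather than the test.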