[ 
https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862220#comment-13862220
 ] 

Hoss Man commented on SOLR-5605:
--------------------------------

Filing this because I encountered it in randomized testing. It sounded 
familiar, but I was surprised not to be able to find an issue about it.

Easy to reproduce:

{code}
ant test -Dtestcase=MapReduceIndexerToolArgumentParserTest 
-Dtests.method=testArgsParserHelp -Dtests.slow=true -Dtests.locale=hi_IN 
-Dtests.file.encoding=UTF-8
...
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=MapReduceIndexerToolArgumentParserTest 
-Dtests.method=testArgsParserHelp -Dtests.seed=90EEAEBDB08626A8 
-Dtests.slow=true -Dtests.locale=hi_IN -Dtests.timezone=Pacific/Apia 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.25s | 
MapReduceIndexerToolArgumentParserTest.testArgsParserHelp <<<
   [junit4]    > Throwable #1: java.util.UnknownFormatConversionException: 
Conversion = '१'
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([90EEAEBDB08626A8:C3C04CAF7E84AE5]:0)
   [junit4]    >        at java.util.Formatter.checkText(Formatter.java:2547)
   [junit4]    >        at java.util.Formatter.parse(Formatter.java:2523)
   [junit4]    >        at java.util.Formatter.format(Formatter.java:2469)
   [junit4]    >        at java.io.PrintWriter.format(PrintWriter.java:905)
   [junit4]    >        at 
net.sourceforge.argparse4j.helper.TextHelper.printHelp(TextHelper.java:206)
   [junit4]    >        at 
net.sourceforge.argparse4j.internal.ArgumentImpl.printHelp(ArgumentImpl.java:247)
   [junit4]    >        at 
net.sourceforge.argparse4j.internal.ArgumentParserImpl.printArgumentHelp(ArgumentParserImpl.java:253)
   [junit4]    >        at 
net.sourceforge.argparse4j.internal.ArgumentParserImpl.printHelp(ArgumentParserImpl.java:279)
   [junit4]    >        at 
org.apache.solr.hadoop.MapReduceIndexerTool$MyArgumentParser$1.run(MapReduceIndexerTool.java:187)
{code}

Analysis from Uwe on the list when Jenkins hit this a while back:

{quote}
Locale problem with the argument parser.

The sperm-like symbol (१) is DEVANAGARI DIGIT ONE (U+0967). It looks like, 
while testing, some foreign (non-Lucene) code converts the digit "1" to this 
small creature, maybe through the use of the default locale. As the Lucene code 
is forbidden-api checked, this seems to be a bug somewhere else. The stack 
trace shows the bug: net.sourceforge.argparse4j.helper.TextHelper calls 
String.format without a Locale!
{quote}

...and...

{quote}
The problem is in argparse4j:

http://grepcode.com/file/repo1.maven.org/maven2/net.sourceforge.argparse4j/argparse4j/0.3.2/net/sourceforge/argparse4j/helper/TextHelper.java#197

The code does the following:

String fmt = String.format("%%%ds%%s\n", indentWidth);
writer.format(fmt,....)

So it uses the first String.format (without a locale) to produce the format 
string for the second one. The %d becomes the indentWidth, so the output is 
right-aligned. But the indent-width pattern is formatted using the default 
locale, so the first line produces something like the following:
"%१s%s" instead of "%1s%s"

This will fail format parsing in the second call. In my opinion the whole code 
is a bug by itself. Creating a format pattern with another format pattern is 
slow and, as shown, buggy!

{quote}
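
For illustration, here is a minimal sketch of the two-step formatting Uwe describes, next to two locale-independent ways of building the intermediate pattern. This is not the argparse4j source: the class and method names are made up for the example. Whether hi_IN actually yields Devanagari digits depends on the JVM's locale data, so on some JVMs the first method may simply return a plain ASCII pattern and nothing fails.

```java
import java.util.IllegalFormatException;
import java.util.Locale;

// Hypothetical illustration only; names are not taken from argparse4j.
public class IndentPatternSketch {

    // Mirrors the pattern from the analysis: no explicit Locale, so the
    // digits produced by %d follow the JVM's default format locale.
    public static String defaultLocalePattern(int indentWidth) {
        return String.format("%%%ds%%s\n", indentWidth);
    }

    // Same template, but Locale.ROOT pins %d to ASCII digits.
    public static String rootLocalePattern(int indentWidth) {
        return String.format(Locale.ROOT, "%%%ds%%s\n", indentWidth);
    }

    // Plain concatenation avoids the nested format call altogether.
    public static String concatPattern(int indentWidth) {
        return "%" + indentWidth + "s%s\n";
    }

    public static void main(String[] args) {
        Locale.setDefault(Locale.forLanguageTag("hi-IN"));

        String risky = defaultLocalePattern(4);
        try {
            // May throw if the pattern contains non-ASCII digits.
            System.out.print(String.format(risky, "", "help text"));
        } catch (IllegalFormatException e) {
            System.out.println("default-locale pattern rejected: " + e);
        }

        // Both of these produce a usable pattern regardless of the default locale.
        System.out.print(String.format(rootLocalePattern(4), "", "help text"));
        System.out.print(String.format(concatPattern(4), "", "help text"));
    }
}
```

Either alternative would have to be applied inside the third-party library itself, so from the Solr side this is mostly useful for a patch or bug report upstream.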

> MapReduceIndexerTool fails in some locales -- seen in random failures of 
> MapReduceIndexerToolArgumentParserTest
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5605
>                 URL: https://issues.apache.org/jira/browse/SOLR-5605
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest 
> which is reproducible with any seed -- all that matters is the locale.
> The problem sounded familiar, and a quick search verified that jenkins has in 
> fact hit this a couple of times in the past -- Uwe commented on the list that 
> this is due to a real problem in one of the third-party dependencies (that 
> does the argument parsing) that will affect usage on some systems.
> If working around the bug in the arg parsing lib isn't feasible, 
> MapReduceIndexerTool should fail cleanly if the locale isn't one we know is 
> "supported"
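
If failing cleanly ends up being the fallback, one possible approach is to probe how the default locale renders digits instead of maintaining a list of "supported" locales. A rough sketch follows, assuming a hypothetical helper class; nothing like this is known to exist in MapReduceIndexerTool, and the names and error message are illustrative only.

```java
import java.util.Locale;

// Hypothetical helper; not part of Solr or argparse4j.
public class LocaleDigitCheck {

    // True if %d under the given locale produces plain ASCII digits.
    public static boolean formatsAsciiDigits(Locale locale) {
        return String.format(locale, "%d", 1234567890).equals("1234567890");
    }

    // Fails early with a readable message instead of letting a format
    // exception surface later from inside the argument parser.
    public static void requireAsciiDigits(Locale locale) {
        if (!formatsAsciiDigits(locale)) {
            throw new IllegalStateException(
                "Locale " + locale + " formats numbers with non-ASCII digits, "
                + "which the argument parser cannot handle; "
                + "try running with a different default locale");
        }
    }

    public static void main(String[] args) {
        // String.format without a Locale uses the default FORMAT locale.
        Locale formatLocale = Locale.getDefault(Locale.Category.FORMAT);
        requireAsciiDigits(formatLocale);
        System.out.println("default format locale is usable: " + formatLocale);
    }
}
```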



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
