[ https://issues.apache.org/jira/browse/PIG-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15490439#comment-15490439 ]
Adam Szita commented on PIG-5025:
---------------------------------

I agree [~kexianda], thanks for the tip. See the revised patch, PIG-5025.1.patch.

A little more detail on what's happening under the hood:
When Pig asks HDFS to get files by pattern, HDFS checks ALL the filenames in the directory. This means that any filename containing a colon will break this mechanism. The file doesn't even have to match the requested pattern, it just has to be in the same dir.
I guess the pattern matching functionality (see org.apache.hadoop.fs.Globber.glob()) is only activated if we supply patterns, and in TestLoad.java that's true for these 2 test cases only.

> Improve TestLoad.java: use own separated folder under /tmp
> ----------------------------------------------------------
>
>                 Key: PIG-5025
>                 URL: https://issues.apache.org/jira/browse/PIG-5025
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Minor
>         Attachments: PIG-5025.1.patch, PIG-5025.patch
>
>
> Test cases testCommaSeparatedString2 and testGlobChars may fail if, for some
> reason, files (from any other source) in /tmp have a : (colon) in their
> filenames. This is because HDFS doesn't support colons in filenames, since it
> has its own URI handling. Exception below.
> I propose we separate the working dir of these tests to use their own folder
> in /tmp.
> Failed to parse: java.net.URISyntaxException: Relative path in absolute URI: t:2sTest.txt
>     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:198)
>     at org.apache.pig.test.TestLoad.checkLoadPath(TestLoad.java:317)
>     at org.apache.pig.test.TestLoad.checkLoadPath(TestLoad.java:299)
>     at org.apache.pig.test.TestLoad.testCommaSeparatedString2(TestLoad.java:189)
> Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: t:2sTest.txt
>     at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:172)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:94)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:260)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:151)
>     at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1637)
>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asCollection(HDataStorage.java:215)
>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asCollection(HDataStorage.java:41)
>     at org.apache.pig.builtin.JsonMetadata.findMetaFile(JsonMetadata.java:119)
>     at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:191)
>     at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:518)
>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
>     at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
>     at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:866)
>     at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
>     at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
>     at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
>     at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
>     at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
>     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: t:2sTest.txt
>     at java.net.URI.checkPath(URI.java:1823)
>     at java.net.URI.<init>(URI.java:745)
>     at org.apache.hadoop.fs.Path.initialize(Path.java:203)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
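
For anyone who wants to see the innermost "Caused by" without a Hadoop setup, here is a minimal sketch that uses only java.net.URI. It assumes that Path's string constructor treats everything before the first colon as a URI scheme when no slash precedes it. This is a simplified approximation, not the actual Hadoop code. The class name ColonPathDemo and the method parseFailure are hypothetical names made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ColonPathDemo {

    // Returns null if the string can be turned into a URI,
    // otherwise the reason reported by java.net.URI.
    // ASSUMPTION: the scheme-splitting rule below only approximates
    // what org.apache.hadoop.fs.Path does with a plain string.
    static String parseFailure(String pathString) {
        String scheme = null;
        int start = 0;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        // A colon that comes before any slash is read as "scheme:".
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            start = colon + 1;
        }
        String path = pathString.substring(start);
        try {
            // A non-null scheme combined with a path that does not start
            // with '/' is rejected by the URI constructor.
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // The filename from the stack trace: "t" is taken as the scheme.
        System.out.println(parseFailure("t:2sTest.txt"));
        // A filename without a colon parses fine (prints "null").
        System.out.println(parseFailure("/tmp/plain.txt"));
    }
}
```

Under that assumption, a file named t:2sTest.txt fails during parsing whether or not it matches the glob pattern, which fits the failures going away once the tests use their own folder.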