I'm trying to read a Parquet file in Pig, using parquet-mr jars built from
master. Should I be building from a release tag instead?
I'm running the Pig 0.14 binary release.
grunt> register /home/akm/parquet-mr/parquet-*/target/parquet-*-1.8.0-SNAPSHOT.jar;
grunt> a = load '/home/akm/record.parquet' using org.apache.parquet.pig.ParquetLoader;
2015-05-07 15:39:41,860 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-05-07 15:39:41,878 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
Details at logfile: /home/akm/pig_1431036955635.log
The logfile contains:
Pig Stack Trace
---------------
ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
Failed to parse: Can not retrieve schema from loader org.apache.parquet.pig.ParquetLoader@1be72d8
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1063)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:558)
at org.apache.pig.Main.main(Main.java:170)
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader org.apache.parquet.pig.ParquetLoader@1be72d8
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:91)
at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
... 10 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: Cannot get schema from loadFunc org.apache.parquet.pig.ParquetLoader
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:179)
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
... 17 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
at org.apache.pig.ResourceSchema$ResourceFieldSchema.throwInvalidSchemaException(ResourceSchema.java:216)
at org.apache.pig.impl.logicalLayer.schema.Schema.getPigSchema(Schema.java:1916)
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:176)
... 18 more
================================================================================