Dear all, I found the reason.
After enabling the "spark.sql.parquet.useDataSourceApi" in sqlContext, the partition of parquet works correctly. example code: ``` sqlContext.setConf("spark.sql.parquet.useDataSourceApi", "true") val ecrtb20150622 = sqlContext.parquetFile("hdfs:///bwlogs/beta/archive/EC.RTB/_year=2015/_month=06/_day=22") ``` Hope this might help others in the future. Best, Wush 2015-06-23 10:00 GMT+08:00 Wush Wu <w...@bridgewell.com>: > Dear all, > > Today we try to load parquet file with partition as instructed in < > https://spark.apache.org/docs/1.3.1/sql-programming-guide.html#partition-discovery> > : > > ``` > > sqlContext.parquetFile("hdfs:///bwlogs/beta/archive/EC.Buy/_year=2015/_month=06/_day=11") > ``` > > but we got `java.lang.IllegalArgumentException: Could not find Parquet > metadata at path > hdfs://bwhdfscluster/bwlogs/beta/archive/EC.Buy/_year=2015/_month=06/_day=11` > > However, if I new a HiveContext by myself: > > ``` > val hc = new org.apache.spark.sql.hive.HiveContext(sc) > > hc.parquetFile("hdfs:///bwlogs/beta/archive/EC.Buy/_year=2015/_month=06/_day=11") > ``` > > It works. > > Is this a bug? Or did I make a mistake in configuration my hdfs cluster? > > Thanks, > Wush >