What's the Spark version? Could you please also attach the output of explain(extended = true)?
On Fri, Jul 8, 2016 at 4:33 PM, Sea <261810...@qq.com> wrote:
> I have a problem reading parquet files.
>
> sql:
> select count(1) from omega.dwd_native where year='2016' and month='07'
> and day='05' and hour='12' and appid='6';
>
> The hive partition is (year,month,day,appid)
>
> There are only two tasks, and it lists all directories in my table, not only
> /user/omega/events/v4/h/2016/07/07/12/appid=6
>
> [Stage 1:> (0 + 0) / 2]
>
> 16/07/08 16:16:51 INFO sources.HadoopFsRelation: Listing
> hdfs://mycluster-tj/user/omega/events/v4/h/2016/05/31/21/appid=1
> 16/07/08 16:16:51 INFO sources.HadoopFsRelation: Listing
> hdfs://mycluster-tj/user/omega/events/v4/h/2016/06/28/20/appid=2
> 16/07/08 16:16:51 INFO sources.HadoopFsRelation: Listing
> hdfs://mycluster-tj/user/omega/events/v4/h/2016/07/22/21/appid=65537
> 16/07/08 16:16:51 INFO sources.HadoopFsRelation: Listing
> hdfs://mycluster-tj/user/omega/events/v4/h/2016/08/14/05/appid=65536
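For reference, the requested plan can be produced in spark-shell roughly as follows. This is a minimal sketch assuming a Spark 1.x-era sqlContext (as suggested by the HadoopFsRelation log lines above); only the query itself comes from the original mail.

```scala
// In spark-shell (Spark 1.x-style API, matching the HadoopFsRelation logs).
// The table and partition column names come from the original query.
val df = sqlContext.sql(
  """select count(1) from omega.dwd_native
    |where year='2016' and month='07' and day='05'
    |  and hour='12' and appid='6'""".stripMargin)

// Prints the parsed, analyzed, optimized, and physical plans.
// Whether partition pruning happened (or every partition directory is
// being listed, as in the logs above) is visible in the physical plan.
df.explain(extended = true)
```

Pasting that extended plan into the thread would show whether the partition filters are being pushed into the relation or evaluated only after all directories have been listed.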