Setting spark.sql.hive.verifyPartitionPath=true didn’t help. Still getting the same error.
I tried copying a file with a _ prefix and I am no longer getting the error; the file is also ignored by Spark SQL. But when the job is scheduled in prod, if there is no data to be processed during one execution the query will fail again. How do I deal with this scenario?

From: Sea <261810...@qq.com>
Date: Sunday, May 21, 2017 at 8:04 AM
To: Steve Loughran <ste...@hortonworks.com>, "Bajpai, Amit X. -ND" <amit.x.bajpai....@disney.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: SparkSQL not able to read a empty table location

please try spark.sql.hive.verifyPartitionPath true

------------------ Original ------------------
From: "Steve Loughran" <ste...@hortonworks.com>
Date: Sat, May 20, 2017 09:19 PM
To: "Bajpai, Amit X. -ND" <amit.x.bajpai....@disney.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: SparkSQL not able to read a empty table location

On 20 May 2017, at 01:44, Bajpai, Amit X. -ND <amit.x.bajpai....@disney.com> wrote:

Hi,

I have a Hive external table whose S3 location contains no files (although the S3 location "directory" does exist). When I try to use Spark SQL to count the number of records in the table, it throws an error saying "File s3n://data/xyz does not exist. null/0".

select * from tablex limit 10

Can someone let me know how we can fix this issue?

Thanks

There isn't really a "directory" in S3, just a set of objects whose paths begin with a string. Try creating an empty file with an _ prefix in the directory; it should be ignored by Spark SQL but will cause the "directory" to come into being.
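For anyone hitting the same issue: below is a minimal sketch of Steve's suggestion that could also be run as a pre-step of the scheduled prod job, so the marker object always exists before the query runs. It assumes a spark-shell style SparkSession named `spark`, and the path `s3n://data/xyz/_placeholder` is only an illustrative stand-in for the table's real location.

```scala
// Minimal sketch: materialize the table's S3 "directory" by writing a zero-byte
// object whose name starts with "_" (such files are skipped by Spark SQL readers).
// `spark` is assumed to be an existing SparkSession; the path is a placeholder.
import org.apache.hadoop.fs.Path

val marker = new Path("s3n://data/xyz/_placeholder")
val fs = marker.getFileSystem(spark.sparkContext.hadoopConfiguration)
if (!fs.exists(marker)) {
  fs.create(marker).close()   // zero-byte file; just enough to make the prefix exist
}
```

Running something like this before the SELECT on each execution should keep the query from failing even when a run produces no data, since Spark SQL ignores files whose names begin with an underscore.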