Hello,
We have a number of non-partitioned Hive tables whose data is stored in
subdirectories (the result of union queries run on the Tez execution
engine). For example, the table location is:

    s3://table1/

with the actual data residing in:

    s3://table1/1/data1
    s3://table1/2/data2
    s3://table1/3/data3

When querying the table with SparkSession and spark.sql (sql/hiveContext
shows the same behavior), no records are returned because of these
subdirectories, e.g.:

    val df = spark.sql("select * from db.table1")
    df.show()

I've tried a number of setConf properties, e.g.
spark.hive.mapred.supports.subdirectories=true and
mapreduce.input.fileinputformat.input.dir.recursive=true, but it does not
look like any of these properties are honored.

Has anyone run into a similar problem, or found a way to resolve it?

Our current alternative is to read the input path directly, e.g.:

    spark.read.csv("s3://bucket-name/table1/bullseye_segments/*/*")

but this requires prior knowledge of the path, or an extra step to
determine it.

Thanks,
Matt
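P.S. One way we're considering to avoid hard-coding the path: look the
table's location up from the metastore first, then read it with a
wildcard. A rough sketch, assuming a SparkSession named spark with Hive
support enabled and data sitting exactly one directory level deep, as in
our layout above (illustrative only, not a confirmed fix for the
spark.sql behavior itself):

    import org.apache.spark.sql.functions.col

    // DESCRIBE FORMATTED returns rows of (col_name, data_type, comment);
    // the table's storage path appears in the "Location" row.
    val location = spark.sql("describe formatted db.table1")
      .filter(col("col_name") === "Location")
      .select("data_type")
      .first()
      .getString(0)

    // Read one level of subdirectories under the table location,
    // bypassing the Hive table path that returns no rows for us.
    val df = spark.read.csv(s"$location/*/*")
    df.show()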