When using native data sources (e.g. Parquet, ORC, JSON, ...), input files
are automatically packed into partitions so that each partition adds up to
roughly a target size, configurable via spark.sql.files.maxPartitionBytes.
spark.sql.files.openCostInBytes specifies the estimated cost, expressed in
bytes, of opening each file.
That is, an empty file will be
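
As a rough illustration of how the two settings interact, here is a simplified sketch in plain Python. It is not Spark's actual code: the function name `pack_files` is made up for this example, and it ignores details such as splitting files larger than the target size. The defaults mirror the documented defaults of 128 MB for spark.sql.files.maxPartitionBytes and 4 MB for spark.sql.files.openCostInBytes.

```python
MB = 1024 * 1024


def pack_files(file_sizes, max_partition_bytes=128 * MB, open_cost_bytes=4 * MB):
    """Greedily group file sizes (in bytes) into partitions.

    Hypothetical helper for illustration only; it does not split
    files that are larger than max_partition_bytes.
    """
    partitions = []
    current = []
    current_size = 0
    # Place the largest files first.
    for size in sorted(file_sizes, reverse=True):
        # Close the current partition if adding this file would overflow it.
        if current and current_size + size > max_partition_bytes:
            partitions.append(current)
            current = []
            current_size = 0
        current.append(size)
        # Every file is charged its own length plus the fixed open cost,
        # so even empty files take up room in a partition.
        current_size += size + open_cost_bytes
    if current:
        partitions.append(current)
    return partitions


if __name__ == "__main__":
    # Three files of 100 MB, 60 MB and 20 MB.
    for part in pack_files([100 * MB, 60 * MB, 20 * MB]):
        print([s // MB for s in part])
```

Under these assumptions the open cost caps how many tiny files end up in one partition, and raising it pushes towards more, smaller partitions when the input consists of many small files.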
Reynold made the comment that he thinks this was resolved by another
change; maybe he can comment.
On Thu, Jul 7, 2016 at 7:53 AM, Ajay Srivastava wrote:
> Hi,
>
> This JIRA, https://issues.apache.org/jira/browse/SPARK-8813, is fixed in
> Spark 2.0, but the resolution is not mentioned there.
>