Hi all,

Many Spark users at my company are asking for a way to control the number of output files in Spark SQL. There are use cases for both reducing and increasing the number. The users prefer not to use the functions *repartition*(n) or *coalesce*(n, shuffle), which require them to write and deploy Scala/Java/Python code.
Could we introduce a query hint for this purpose, similar to Broadcast Join Hints?

/*+ *COALESCE*(n, shuffle) */

In general, is a query hint the best way to bring DataFrame functionality to SQL without extending SQL syntax? Any suggestions are highly appreciated.

This requirement is not the same as SPARK-6221, which asked for auto-merging of output files.

Thanks,
John Zhuge
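P.S. To make the request a bit more concrete, here is a rough, standalone Python sketch of how the proposed hint text could be read and mapped to the DataFrame call it would stand in for. Everything in it is an assumption for illustration only: the hint syntax is just the one proposed above, the names `parse_coalesce_hint`, `CoalesceHint`, and `equivalent_dataframe_call` are hypothetical, and defaulting `shuffle` to false when it is omitted is my own guess. It is not how Spark's parser works.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical hint syntax from this proposal: /*+ COALESCE(n, shuffle) */
# This regex only illustrates the idea; it is not Spark's hint grammar.
_HINT = re.compile(
    r"/\*\+\s*COALESCE\s*\(\s*(\d+)\s*(?:,\s*(true|false)\s*)?\)\s*\*/",
    re.IGNORECASE,
)


class CoalesceHint(NamedTuple):
    num_partitions: int
    shuffle: bool


def parse_coalesce_hint(sql: str) -> Optional[CoalesceHint]:
    """Return the first COALESCE hint found in the query text, or None."""
    match = _HINT.search(sql)
    if match is None:
        return None
    n = int(match.group(1))
    # Assumption: an omitted shuffle argument means "no shuffle".
    shuffle = (match.group(2) or "false").lower() == "true"
    return CoalesceHint(n, shuffle)


def equivalent_dataframe_call(hint: CoalesceHint) -> str:
    """Describe the DataFrame call that the hint would replace."""
    if hint.shuffle:
        # A shuffle lets the partition count go up as well as down.
        return f"df.repartition({hint.num_partitions})"
    # Without a shuffle the partition count can only be reduced.
    return f"df.coalesce({hint.num_partitions})"


if __name__ == "__main__":
    query = "INSERT OVERWRITE TABLE t SELECT /*+ COALESCE(10, true) */ * FROM src"
    hint = parse_coalesce_hint(query)
    if hint is not None:
        print(equivalent_dataframe_call(hint))
```

The point of the mapping is that a single hint with a shuffle flag would cover both use cases: merging output files without a shuffle, and increasing their number with one.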