[ https://issues.apache.org/jira/browse/SPARK-18021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-18021:
------------------------------------
Assignee: Reynold Xin (was: Apache Spark)
> Refactor file name specification for data sources
> -------------------------------------------------
>
> Key: SPARK-18021
> URL: https://issues.apache.org/jira/browse/SPARK-18021
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Reynold Xin
>
> Currently each data source's OutputWriter is responsible for specifying the
> entire name of every file it writes. This is problematic because Spark SQL
> relies on the file name for certain behaviors, e.g. the bucket id. The
> current approach therefore allows an individual data source to break the
> implementation of bucketing.
> We also don't want to move file naming entirely out of the data sources,
> because different data sources need to specify different extensions.
> A good compromise is for the OutputWriter to take in a prefix for the file
> name and append its own suffix.
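
The compromise described above could be sketched roughly as follows. This is an illustrative Python model only, not Spark's actual Scala API: the names task_file_prefix, file_name, ParquetLikeWriter, CsvLikeWriter, and bucket_id_of, as well as the exact name layout, are assumptions made for the example. The framework owns the prefix (including the bucket id), and each writer contributes only its extension.

```python
import re
from typing import Optional


def task_file_prefix(split_id: int, job_id: str, bucket_id: Optional[int] = None) -> str:
    """Framework-owned part of the file name (hypothetical layout).

    Assumes job_id contains no underscore, so the only "_<digits>" run
    in the name is the bucket id.
    """
    bucket = "" if bucket_id is None else f"_{bucket_id:05d}"
    return f"part-{split_id:05d}-{job_id}{bucket}"


class OutputWriter:
    """Base writer: receives the prefix and may only append a suffix."""

    extension = ""

    def file_name(self, prefix: str) -> str:
        return prefix + self.extension


class ParquetLikeWriter(OutputWriter):
    extension = ".snappy.parquet"


class CsvLikeWriter(OutputWriter):
    extension = ".csv"


# The bucket id sits at the end of the prefix, just before any extension.
_BUCKET_PATTERN = re.compile(r"_(\d+)(?:\..*)?$")


def bucket_id_of(file_name: str) -> Optional[int]:
    """Recover the bucket id regardless of which writer added the suffix."""
    match = _BUCKET_PATTERN.search(file_name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    prefix = task_file_prefix(split_id=3, job_id="a1b2", bucket_id=7)
    for writer in (ParquetLikeWriter(), CsvLikeWriter()):
        name = writer.file_name(prefix)
        print(name, bucket_id_of(name))
```

Because a writer never sees or rewrites the prefix, a data source cannot accidentally drop or reformat the bucket id, while it remains free to choose its own extension.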
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)