Re: Inconsistent file extensions and omitting file extensions written by CSV, TEXT and JSON data sources.

Sean Owen Wed, 09 Mar 2016 01:31:44 -0800

From your JIRA, it seems like you're referring to the "part-*" files.
These files are effectively an internal representation, and I would
not expect them to have such an extension. For example, you're not
really guaranteed that the way the data breaks up leaves each file a
valid JSON doc.
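
A minimal sketch of that last point, assuming a part file holds one JSON
record per line. The sample contents and the helper names below are made
up for illustration and are not taken from Spark itself. Parsing the
whole file as a single JSON document fails, while parsing it line by
line works:

```python
import json

# Hypothetical contents of one "part-*" file: one JSON object per line.
part_file_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'


def is_single_json_doc(text):
    """Return True if the whole text parses as one JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def parse_records(text):
    """Parse one JSON value per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


whole_file_ok = is_single_json_doc(part_file_text)
records = parse_records(part_file_text)

print("valid as one JSON doc:", whole_file_ok)
print("records parsed line by line:", len(records))
```

So a ".json" extension on such a file could mislead tools that expect a
single JSON document per file.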


On Wed, Mar 9, 2016 at 5:49 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:
> Hi all,
>
> Currently, the output from the CSV, TEXT and JSON data sources does not
> have file extensions such as .csv, .txt and .json (except for compression
> extensions such as .gz, .deflate and .bz2).
>
> In addition, it looks like Parquet output has extensions such as
> .gz.parquet or .snappy.parquet depending on the compression codec, whereas
> ORC output has no such extensions and is just .orc.
>
> I searched for JIRAs related to this but could not find any. I did not
> open a JIRA directly because I feel this may already have been considered.
>
> Could I open a JIRA for these inconsistent file extensions?
>
> I would be grateful for any feedback.
>
> Thanks!

