Re: Merging of parquet file schemas

Wes McKinney Tue, 10 Jul 2018 10:46:19 -0700

hi Dan,

Not yet -- the relevant JIRA issue is
https://issues.apache.org/jira/browse/ARROW-843. We would appreciate
some help with this.


Thanks

On Tue, Jul 10, 2018 at 10:54 AM, Dan Amner <amne...@hotmail.com> wrote:
> Hi,
>
> I am attempting to read a number of smaller parquet files and merge them into 
> a larger parquet file.
>
> The files are created by Spark jobs that run periodically throughout the day.
>
> The issue I have is that the small parquet files can have slightly different
> schemas, and when I create the Dataset it complains that the schemas aren’t
> the same.
>
> Spark handles this by merging the schemas together. Is there functionality in
> pyarrow that can do the same?
>
> Thanks,
> Dan
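
For illustration only, here is a minimal sketch of what "merging the schemas
together" could mean, written in plain Python rather than against the pyarrow
API (per the reply above, pyarrow did not offer this at the time). It assumes
a merge is the union of all fields in first-seen order, that a field present
in several files must have the same type in each, and that rows from files
lacking a field get a null for it. The names merge_schemas and conform_row,
and the example fields, are hypothetical.

```python
def merge_schemas(schemas):
    """Union the fields of several schemas, keeping first-seen order.

    Each schema is a list of (name, type) pairs. A field that appears
    in more than one schema must have the same type in all of them.
    """
    merged = {}
    for schema in schemas:
        for name, dtype in schema:
            if name in merged and merged[name] != dtype:
                raise ValueError(
                    f"field {name!r} has conflicting types: "
                    f"{merged[name]} vs {dtype}"
                )
            merged.setdefault(name, dtype)
    return list(merged.items())


def conform_row(row, merged_schema):
    """Return the row (a dict) with every merged field, None where missing."""
    return {name: row.get(name) for name, _ in merged_schema}


# Hypothetical schemas from two periodic jobs; the later one adds a column.
morning = [("id", "int64"), ("value", "double")]
evening = [("id", "int64"), ("value", "double"), ("source", "string")]

merged = merge_schemas([morning, evening])
padded = conform_row({"id": 1, "value": 2.0}, merged)
```

With a merged schema in hand, each small file's rows can be padded to the
same set of columns before being written out as one larger file.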
