Thanks everyone for your inputs here! Really helpful information!
From: Chamikara Jayalath
Reply-To: "user@beam.apache.org"
Date: Thursday, January 28, 2021 at 10:54 AM
To: user
Subject: Re: Overwrite support from ParquetIO
On Thu, Jan 28, 2021 at 9:14 AM Alexey Romanenko wrote:
Please let me know if this makes sense to you. Thanks!
>
>
> *From: *Alexey Romanenko
> *Reply-To: *"user@beam.apache.org"
> *Date: *Wednesday, January 27, 2021 at 9:10 AM
> *To: *"user@beam.apache.org"
> *Subject: *Re: Overwrite support from ParquetIO
>
> What do you mean by “wipe out all existing parquet files before a write
> operation”? Are these all files that already exist in
r this deletion operation, or maybe a composite
>>> PTransform that does deletion first followed by ParquetIO.Write.
>>>
>>>
>>>
>>> *From: *Chamikara Jayalath
>>> *Reply-To: *"user@beam.apache.org"
>>>
, this can
>> be done by performing it in a side-input step (to a ParDo that precedes
>> sink) or by adding a GBK/Reshuffle between the two steps.
>>
>>
>>
>> Thanks,
>>
>> Cham
>>
>>
>>
>>
>>
>>
>>1.
>
er
> *Cc: *Alexey Romanenko
> *Subject: *Re: Overwrite support from ParquetIO
>
>
>
>
>
>
>
> On Wed, Jan 27, 2021 at 12:06 PM Tao Li wrote:
>
> @Alexey Romanenko thanks for your response.
> Regarding your questions:
>
>
>
>1. Yes I can p
Date: Wednesday, January 27, 2021 at 3:45 PM
To: user
Cc: Alexey Romanenko
Subject: Re: Overwrite support from ParquetIO
On Wed, Jan 27, 2021 at 12:06 PM Tao Li wrote:
@Alexey Romanenko thanks for your response.
Regarding your
n the two steps.
Thanks,
Cham
>
>1.
>
>
>
> Please let me know if this makes sense to you. Thanks!
>
>
>
>
>
> *From: *Alexey Romanenko
> *Reply-To: *"user@beam.apache.org"
> *Date: *Wednesday, January 27, 2021 at 9:10 AM
> *To: *"user@beam.apache.org"
files
from previous run that won’t get overwritten in the current run.
Please let me know if this makes sense to you. Thanks!
From: Alexey Romanenko
Reply-To: "user@beam.apache.org"
Date: Wednesday, January 27, 2021 at 9:10 AM
To: "user@beam.apache.org"
Subject: Re: Overwrite support from ParquetIO
What do you mean by “wipe out all existing parquet files before a write
operation”? Are these all files that already exist in the same output
directory? Can you purge this directory before or just use a new output
directory for every pipeline run?
To write Parquet files you need to use ParquetI
Hi Beam community,
Does ParquetIO support an overwrite behavior when saving files? More
specifically, I would like to wipe out all existing parquet files before a
write operation. Is there a ParquetIO API to support that? Thanks!
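
The replies in this thread suggest purging the output directory before the write. Below is a minimal sketch of such a purge step, assuming the output lives in a local directory and using only the Java standard library. This is not a ParquetIO API; the class and method names are hypothetical, and a remote file system (GCS, S3, HDFS) would need its own client instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: removes old Parquet output
// from a local directory before a new pipeline run writes to it.
final class ParquetOutputPurger {

    private ParquetOutputPurger() {}

    /**
     * Deletes every regular file matching "*.parquet" directly inside
     * outputDir and returns the deleted paths. Other files and
     * subdirectories are left untouched.
     */
    static List<Path> purgeParquetFiles(Path outputDir) throws IOException {
        List<Path> matches = new ArrayList<>();
        if (!Files.isDirectory(outputDir)) {
            return matches; // nothing to purge, e.g. on the first run
        }
        // Collect first, then delete, so the listing is not modified mid-iteration.
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(outputDir, "*.parquet")) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)) {
                    matches.add(file);
                }
            }
        }
        for (Path file : matches) {
            Files.delete(file);
        }
        return matches;
    }
}
```

This would be called once before the pipeline is launched, with whatever directory the write targets. If the deletion is moved inside the pipeline instead, it has to finish before the write starts, which is what the side-input or GBK/Reshuffle suggestion in the replies is about.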