Hi Roman,
Thank you.

I'm familiar with FLIP-27 and I was analyzing the new File Source.

From there I saw that there are two FileEnumerators: one that allows a
file to be split and one that does not, BlockSplittingRecursiveEnumerator
and NonSplittingRecursiveEnumerator respectively.
I was wondering if BlockSplittingRecursiveEnumerator can be used for
Parquet files.
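
To show what I understand the difference to be, here is a rough sketch in
plain Java. It is not Flink's enumerator code: SplitSketch, Split,
nonSplitting and blockSplitting are names I made up, and the block size is
simply passed in as a parameter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only, not Flink code. All names here are hypothetical.
class SplitSketch {

    // A byte range of one file that is handed to a single reader.
    static final class Split {
        final String path;
        final long offset;
        final long length;

        Split(String path, long offset, long length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return path + "[offset=" + offset + ", length=" + length + "]";
        }
    }

    // Non-splitting behaviour: the whole file becomes exactly one split.
    static List<Split> nonSplitting(String path, long fileLength) {
        return Collections.singletonList(new Split(path, 0, fileLength));
    }

    // Block-splitting behaviour: one split per block, the last one may be shorter.
    static List<Split> blockSplitting(String path, long fileLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            long length = Math.min(blockSize, fileLength - offset);
            splits.add(new Split(path, offset, length));
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(nonSplitting("data.parquet", 250));
        System.out.println(blockSplitting("data.parquet", 250, 100));
    }
}
```

With the block-splitting variant, several readers could each get a
different byte range of the same file, which is what leads to my question
about Parquet.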

More generally, does the Parquet format support reading a file in blocks
by different threads? Do those blocks have to be "merged" later, or can I
just read them row by row?
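
To make the question concrete, this is roughly what I imagine, assuming
that a Parquet file consists of row groups and that each reader takes only
the row groups whose midpoint falls inside its byte range. That assignment
rule is my assumption, not something I have confirmed for Flink, and
RowGroupSketch, rowGroupsForSplit and all offsets below are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only. Names and offsets are hypothetical.
class RowGroupSketch {

    // Returns the indices of the row groups whose midpoint lies in
    // [splitStart, splitEnd). Every row group has exactly one midpoint,
    // so with non-overlapping ranges that cover the file it is owned
    // by exactly one split.
    static List<Integer> rowGroupsForSplit(
            long[] rowGroupStart, long[] rowGroupLength, long splitStart, long splitEnd) {
        List<Integer> owned = new ArrayList<>();
        for (int i = 0; i < rowGroupStart.length; i++) {
            long midpoint = rowGroupStart[i] + rowGroupLength[i] / 2;
            if (midpoint >= splitStart && midpoint < splitEnd) {
                owned.add(i);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        // Three made-up row groups in a made-up file.
        long[] start = {4, 104, 204};
        long[] length = {100, 100, 40};

        // Two byte ranges, as two parallel readers might receive them.
        System.out.println(rowGroupsForSplit(start, length, 0, 128));
        System.out.println(rowGroupsForSplit(start, length, 128, 256));
    }
}
```

If it worked like this, each reader would read its own row groups row by
row and nothing would have to be merged afterwards. Is that how it is
meant to work?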

Regards,
Krzysztof Chmielewski

On Fri, 10 Dec 2021 at 09:27, Roman Khachatryan <ro...@apache.org> wrote:

> Hi,
>
> Yes, file source does support DoP > 1.
> And in general, a single file can be read in parallel after FLIP-27.
> However, parallel reading of a single Parquet file is currently not
> supported AFAIK.
>
> Maybe Arvid or Fabian could shed more light here.
>
> Regards,
> Roman
>
> On Thu, Dec 9, 2021 at 12:03 PM Krzysztof Chmielewski
> <krzysiek.chmielew...@gmail.com> wrote:
> >
> > Hi,
> > can I have a File DataStream Source that works with the Parquet format
> > and has a parallelism level higher than one?
> >
> > Is it possible to read a Parquet file in chunks by multiple threads?
> >
> > Regards,
> > Krzysztof Chmielewski
>
