Hi

I tested it on my side and also got the same error. This should be a
limitation of Parquet.

```
java.lang.IllegalArgumentException: maxCapacityHint can't be less than
initialSlabSize 64 1
at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:57)
~[flink-sql-parquet-1.17.1.jar:1.17.1]
at org.apache.parquet.bytes.CapacityByteArrayOutputStream.<init>(
CapacityByteArrayOutputStream.java:153) ~[flink-sql-parquet-1.17.1.jar:1.17.
1]
at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridEncoder
.<init>(RunLengthBitPackingHybridEncoder.jav
```


So I think the current minimum page size that can be set for parquet is 64B.

Best,
Feng


On Tue, Sep 19, 2023 at 6:06 PM Kamal Mittal <kamal.mit...@ericsson.com>
wrote:

> Hello,
>
>
>
> If given page size as 1 byte then encountered exception as  -
> ‘maxCapacityHint can't be less than initialSlabSize %d %d’.
>
>
>
> This is coming from class CapacityByteArrayOutputStream and contained in
> parquet-common library.
>
>
>
> Rgds,
>
> Kamal
>
>
>
> *From:* Feng Jin <jinfeng1...@gmail.com>
> *Sent:* 19 September 2023 01:01 PM
> *To:* Kamal Mittal <kamal.mit...@ericsson.com>
> *Cc:* user@flink.apache.org
> *Subject:* Re: About Flink parquet format
>
>
>
> Hi Kamal
>
>
>
> What exception did you encounter? I have tested it locally and it works
> fine.
>
>
>
>
>
> Best,
>
> Feng
>
>
>
>
>
> On Mon, Sep 18, 2023 at 11:04 AM Kamal Mittal <kamal.mit...@ericsson.com>
> wrote:
>
> Hello,
>
>
>
> Checkpointing is enabled and works fine if configured parquet page size is
> at least 64 bytes as otherwise there is exception thrown at back-end.
>
>
>
> Looks to be an issue which is not handled by file sink bulk writer?
>
>
>
> Rgds,
>
> Kamal
>
>
>
> *From:* Feng Jin <jinfeng1...@gmail.com>
> *Sent:* 15 September 2023 04:14 PM
> *To:* Kamal Mittal <kamal.mit...@ericsson.com>
> *Cc:* user@flink.apache.org
> *Subject:* Re: About Flink parquet format
>
>
>
> Hi Kamal
>
>
>
> Check if the checkpoint of the task is enabled and triggered correctly. By
> default, write parquet files will roll a new file when checkpointing.
>
>
>
>
>
> Best,
>
> Feng
>
>
>
> On Thu, Sep 14, 2023 at 7:27 PM Kamal Mittal via user <
> user@flink.apache.org> wrote:
>
> Hello,
>
>
>
> Tried parquet file creation with file sink bulk writer.
>
>
>
> If configured parquet page size as low as 1 byte (allowed configuration)
> then flink keeps on creating multiple ‘in-progress’ state files and with
> content only as ‘PAR1’ and never closed the file.
>
>
>
> I want to know what is the reason of not closing the file and creating
> multiple ‘in-progress’ part files or why no error is given if applicable?
>
>
>
> Rgds,
>
> Kamal
>
>

Reply via email to