Re: Flink File Source: File read strategy

2023-09-24 Thread Shammon FY
Hi Kirti, I think you can refer to doc [1] and create a table in your S3 file system (put your S3 path in the `path` field), then submit jobs to write data to and read data from S3. You can refer to [2] if your jobs use the `DataStream` API. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/connecto

RE: Flink File Source: File read strategy

2023-09-24 Thread Kirti Dhar Upadhyay K via user
Thanks Shammon. Is there any way to verify that File Source reads files directly from S3? Regards, Kirti Dhar From: Shammon FY Sent: 25 September 2023 06:27 To: Kirti Dhar Upadhyay K Cc: user@flink.apache.org Subject: Re: Flink File Source: File read strategy Hi Kirti, I think the default fil

Re: Side outputs documentation

2023-09-24 Thread Yunfeng Zhou
Hi Alexis, If you create OutputTag with the constructor `OutputTag(String id)`, you need to make it anonymous for Flink to analyze the type information. But if you use the constructor `OutputTag(String id, TypeInformation typeInfo)`, you need not make it anonymous as you have provided the type inf
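
The point about anonymous subclasses can be illustrated in plain Java, without Flink on the classpath. The sketch below is only an illustration under stated assumptions: `Tag`, `capturedType`, and `TagDemo` are hypothetical names invented here and are not Flink's `OutputTag` or its internals. It shows the general Java mechanism that makes an anonymous subclass useful: generic type arguments are erased on a plain instance, but they are recorded in the superclass signature of an anonymous subclass, where reflection can read them back.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical stand-in for a generic tag class; this is NOT Flink's OutputTag.
class Tag<T> {
    private final String id;

    Tag(String id) {
        this.id = id;
    }

    String id() {
        return id;
    }

    // Returns the type argument recorded by a direct subclass that fixed T,
    // or null when the instance was created without subclassing.
    Type capturedType() {
        Type superType = getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null;
    }
}

class TagDemo {
    public static void main(String[] args) {
        // Plain instance: T is erased, so nothing can be recovered at runtime.
        Tag<List<String>> plain = new Tag<>("late-data");

        // Anonymous subclass (note the trailing {}): T is stored in the
        // generic superclass signature of the generated class.
        Tag<List<String>> anonymous = new Tag<List<String>>("late-data") {};

        System.out.println(plain.capturedType());     // prints "null"
        System.out.println(anonymous.capturedType()); // prints "java.util.List<java.lang.String>"
    }
}
```

When the type is passed explicitly, as with the `OutputTag(String id, TypeInformation typeInfo)` constructor mentioned above, no reflection of this kind is needed, which is consistent with the advice that the anonymous form is then unnecessary.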

Re: About Flink parquet format

2023-09-24 Thread Feng Jin
Hi Kamal, Indeed, Flink does not handle this exception. When this exception occurs, the Flink job fails immediately and keeps restarting internally, continuously creating new files. Personally, I think this logic can be optimized. When this exception occurs, the file with the exception should be d

Re: Re: Re: How to read flinkSQL job state

2023-09-24 Thread Hangxiang Yu
Hi, Yifan. Unfortunately, IIUC, we can currently get the key and value types only by reading the related SQL code. I think it would be useful if we could support SQL semantics for the Processor API, but it will indeed take a lot of effort. On Thu, Sep 21, 2023 at 12:05 PM Yifan He via user wrote: > Hi Han

RE: About Flink parquet format

2023-09-24 Thread Kamal Mittal via user
Hello, Can you please explain why Flink is not able to handle the exception and keeps creating files continuously without closing them? Rgds, Kamal From: Kamal Mittal via user Sent: 21 September 2023 07:58 AM To: Feng Jin Cc: user@flink.apache.org Subject: RE: About Flink parquet format Yes. D

After using the jemalloc memory allocator for a period of time, checkpoint timeout occurs and tasks are stuck

2023-09-24 Thread rui chen
After using the jemalloc memory allocator for a period of time, checkpoint timeouts occur and tasks get stuck. Has anyone encountered this? Flink version: 1.13.2, jemalloc version: 5.3.0

Re: Flink File Source: File read strategy

2023-09-24 Thread Shammon FY
Hi Kirti, I think the default file `Source` does not download files locally in Flink, but reads them directly from S3. However, Flink also supports configuring temporary directories through `io.tmp.dirs`. If it is a user-defined source, it can be obtained from FlinkS3FileSystem. After the Flink jo