Parquet schema per bucket in Streaming File Sink

2021-11-29 Thread Zack Loebel
Hey all, I have a job which writes data of a similar shape to a location in S3. Currently it writes a map of data with each row. I want to customize this job to "explode" the map into column names and values, which are consistent for a single bucket. Is there any way to do this? Provide a custo
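The "explode" step itself can be sketched without any Flink dependencies: given the bucket's fixed column list, turn each row's map into an ordered value array so every row in that bucket has the same columns. This is a minimal plain-Java illustration (the `ExplodeDemo` class and its names are hypothetical); wiring the result into a per-bucket Parquet schema via a writer factory is the Flink-specific part the thread is asking about.

```java
import java.util.*;

public class ExplodeDemo {
    // Turn one row's map into a value array ordered by the bucket's column names.
    // Keys missing from the row become null so every row has the same width.
    static Object[] explode(List<String> columns, Map<String, Object> row) {
        Object[] values = new Object[columns.size()];
        for (int i = 0; i < columns.size(); i++) {
            values[i] = row.get(columns.get(i));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("a", "b", "c");
        Map<String, Object> row = new HashMap<>();
        row.put("a", 1);
        row.put("c", 3);
        System.out.println(Arrays.toString(explode(cols, row))); // [1, null, 3]
    }
}
```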

Re: Flink-13 table API wrongly adding padding for IN clause elements

2021-11-29 Thread Caizhi Weng
Hi! Thanks for raising this; it is a known issue [1]. Currently, I would recommend creating a UDF as a workaround. [1] https://issues.apache.org/jira/browse/FLINK-24708 Ayush Chauhan wrote on Tue, Nov 30, 2021 at 12:15 PM: > Hi, > > In Flink 1.13, while using a filter/where condition in the Table API I am g

Flink-13 table API wrongly adding padding for IN clause elements

2021-11-29 Thread Ayush Chauhan
Hi, In Flink 1.13, while using a filter/where condition in the Table API I am getting wrong results. Upon debugging I found that it adds padding to the IN clause elements according to the first element in the IN clause. Here's the sample code: tEnv.toAppendStream(input.where($("ename").in("O2CartPag
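What the reported behavior amounts to, illustrated in plain Java (not Flink code): if the IN-list elements are implicitly cast to a CHAR type sized to the first element, shorter elements get right-padded with spaces and no longer equal the unpadded column value. The literals below are guesses based on the truncated sample.

```java
public class PaddingDemo {
    // Right-pad s with spaces to length n, mimicking a cast to CHAR(n).
    static String padToChar(String s, int n) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < n) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int charLen = "O2CartPage".length();            // length of the first IN element (10)
        String padded = padToChar("HomePage", charLen); // "HomePage  "
        // The padded literal no longer matches the unpadded column value:
        System.out.println(padded.equals("HomePage"));  // false
    }
}
```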

Re: How do you configure setCommitOffsetsOnCheckpoints in Flink 1.12 when using KafkaSourceBuilder?

2021-11-29 Thread Hang Ruan
Hi, Maybe you can write it like this: builder.setProperty(KafkaSourceOptions.COMMIT_OFFSETS_ON_CHECKPOINT.key(), "true"); Other additional properties can be found here: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#additional-properties Marco Villa
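Assuming `KafkaSourceOptions.COMMIT_OFFSETS_ON_CHECKPOINT.key()` resolves to the string `commit.offsets.on.checkpoint` (worth verifying against your Flink version), the builder call above boils down to forwarding a string property to the source. A dependency-free sketch of that, using only `java.util.Properties`:

```java
import java.util.Properties;

public class KafkaPropsDemo {
    // Assumed value of KafkaSourceOptions.COMMIT_OFFSETS_ON_CHECKPOINT.key().
    static final String COMMIT_OFFSETS_KEY = "commit.offsets.on.checkpoint";

    static Properties buildProps() {
        Properties props = new Properties();
        // Equivalent of builder.setProperty(COMMIT_OFFSETS_KEY, "true")
        props.setProperty(COMMIT_OFFSETS_KEY, "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(buildProps().getProperty(COMMIT_OFFSETS_KEY)); // true
    }
}
```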

Re: How do you configure setCommitOffsetsOnCheckpoints in Flink 1.12 when using KafkaSourceBuilder?

2021-11-29 Thread Marco Villalobos
Thank you for the information. That still does not answer my question, though. How do I configure Flink 1.12, using the KafkaSourceBuilder, so that the consumer commits offsets back to Kafka on checkpoints? FlinkKafkaConsumer has the method setCommitOffsetsOnCheckpoints(boolean). But now tha

Re: Time attribute will be lost after two(or more) table joining

2021-11-29 Thread Caizhi Weng
Hi! As this mail is written in English I'm also forwarding it to the user mailing list. Streaming joins do not retain the row time attribute, and this is the expected behavior. As you're windowing the results of joins, I guess you're enriching the records of one stream with that join. Lookup joins

Re: REST API for detached minicluster based integration test

2021-11-29 Thread Caizhi Weng
Hi! I believe metrics are enabled by default even for a mini cluster. Which Flink version are you using, and how do you set your watermark strategy? Could you share your user code showing how you create the datastream / SQL and get the job graph? I'm also curious why you need to extract the

Re: How do you configure setCommitOffsetsOnCheckpoints in Flink 1.12 when using KafkaSourceBuilder?

2021-11-29 Thread Caizhi Weng
Hi! The Flink 1.14 release notes cover this. See [1]. [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/release-notes/flink-1.14/#deprecate-flinkkafkaconsumer Marco Villalobos wrote on Tue, Nov 30, 2021 at 7:12 AM: > Hi everybody, > > I am using Flink 1.12 and migrating my code from using

How do you configure setCommitOffsetsOnCheckpoints in Flink 1.12 when using KafkaSourceBuilder?

2021-11-29 Thread Marco Villalobos
Hi everybody, I am using Flink 1.12 and migrating my code from FlinkKafkaConsumer to the KafkaSourceBuilder. FlinkKafkaConsumer has the method: /** > * Specifies whether or not the consumer should commit offsets back to > Kafka on checkpoints. > * This setting will only have effect

Re: REST API for detached minicluster based integration test

2021-11-29 Thread Jin Yi
bump. a more general question: what do people do for more end-to-end, full integration tests of event-time-based jobs with timers? On Tue, Nov 23, 2021 at 11:26 AM Jin Yi wrote: > i am writing an integration test where i execute a streaming flink job > using faked, "unbounded" input wher
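One common answer to this question is to drive event time deterministically in the test: derive watermarks from the faked elements' timestamps with a bounded-out-of-orderness rule, so timers fire at predictable points. The core rule is small enough to sketch without Flink (class and method names here are hypothetical, mirroring the semantics of Flink's bounded-out-of-orderness watermarks):

```java
public class BoundedOutOfOrderness {
    private final long maxDelayMs;
    private long maxTimestamp = Long.MIN_VALUE;

    public BoundedOutOfOrderness(long maxDelayMs) {
        this.maxDelayMs = maxDelayMs;
    }

    // Called once per element, as a per-event watermark generator would be.
    public void onEvent(long eventTimestamp) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
    }

    // The watermark trails the maximum timestamp seen by the allowed lateness.
    public long currentWatermark() {
        return maxTimestamp - maxDelayMs - 1;
    }

    public static void main(String[] args) {
        BoundedOutOfOrderness wm = new BoundedOutOfOrderness(1000);
        wm.onEvent(5000);
        wm.onEvent(4000); // a late element does not move the watermark backwards
        System.out.println(wm.currentWatermark()); // 3999
    }
}
```

In a test this lets you assert that an event-time timer registered for, say, t=3500 must have fired once the watermark passes it.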

Re: REST service for flinkSQL

2021-11-29 Thread Lu Niu
Sure. The requirement is to create a playground so that users can quickly prototype Flink SQL queries. We have a unified UI for all SQL engines, including Presto and Spark SQL. We want to integrate Flink SQL into it. Deploying to prod is not a hard requirement. Best, Lu On Tue, Nov 23, 2021 at 11:54 PM Marti

Re: How to Fan Out to 100s of Sinks

2021-11-29 Thread SHREEKANT ANKALA
Hi, Here is our scenario: we have a system that generates data in a JSONL file for all customers together. We now need to process this JSONL data and conditionally distribute it to individual customers, based on their preferences, as Iceberg tables. So every line in the JSONL file
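The conditional branching described here can be sketched as routing each JSONL line into a per-customer bucket before any sink is involved. A minimal plain-Java sketch, assuming a flat `"customer":"..."` field in each line (the field name and the naive string-based extraction are illustrative only; a real job would parse JSON properly and route via side outputs or a keyed stream):

```java
import java.util.*;

public class FanOutDemo {
    // Group lines by a naively extracted "customer" field.
    static Map<String, List<String>> route(List<String> jsonlLines) {
        Map<String, List<String>> byCustomer = new HashMap<>();
        for (String line : jsonlLines) {
            String customer = extract(line, "customer");
            byCustomer.computeIfAbsent(customer, k -> new ArrayList<>()).add(line);
        }
        return byCustomer;
    }

    // Minimal extraction of "field":"value" from a flat JSON line (illustration only).
    static String extract(String line, String field) {
        String marker = "\"" + field + "\":\"";
        int start = line.indexOf(marker) + marker.length();
        return line.substring(start, line.indexOf('"', start));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "{\"customer\":\"a\",\"v\":1}",
                "{\"customer\":\"b\",\"v\":2}",
                "{\"customer\":\"a\",\"v\":3}");
        System.out.println(route(lines).get("a").size()); // 2
    }
}
```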

[DISCUSS] Deprecate Java 8 support

2021-11-29 Thread Chesnay Schepler
Hello, we recently had a discussion on the dev mailing list about deprecating support for Java 8 in 1.15, with a general consensus in favor of it. I now wanted to check in with you, our users, to see what you have to say about it. Why are we interested in deprecating Java 8 supp

Re: Query regarding exceptions API(/jobs/:jobid/exceptions)

2021-11-29 Thread Mahima Agarwal
Hi Matthias, We have created a Jira ticket for this issue. Please find the Jira ID below: https://issues.apache.org/jira/browse/FLINK-25096 Thanks, Mahima On Mon, Nov 29, 2021 at 2:24 PM Matthias Pohl wrote: > Thanks Mahima, > could you create a Jira ticket and, if possible, add the Flink logs?

November 29 Flink training (today)

2021-11-29 Thread ivan.ros...@agilent.com
Hello, sorry to spam everyone, but I am hoping there's a way to get into Ververica's upcoming Flink developer training, starting today, November 29, 2021. I got last-minute approval at work to register, but registration on the website is closed :/ Thanks in advance, Ivan Rosero Agilent

Re: Re: Checkpoints aborted - Job is not in state RUNNING but FINISHED

2021-11-29 Thread Yun Gao
Hi Jonas, In previous versions, the checkpoint would be aborted as soon as any task finished, no matter whether it belonged to the same vertex. As for Kinesis, sorry, I could not find an environment to run a test, but if the task does indeed finish when there are no shards, I think it would in

Re: How to Fan Out to 100s of Sinks

2021-11-29 Thread Fabian Paul
Hi, What do you mean by "fan out" to 100 different sinks? Do you want to replicate the data into all buckets, or is there some conditional branching logic? In general, Flink can easily support 100 different sinks, but I am not sure this is the right approach for your use case. Can you clarify your

Re: Query regarding exceptions API(/jobs/:jobid/exceptions)

2021-11-29 Thread Matthias Pohl
Thanks Mahima, could you create a Jira ticket and, if possible, attach the Flink logs? That would make it easier to investigate the problem. Best, Matthias On Sun, Nov 28, 2021 at 7:29 AM Mahima Agarwal wrote: > Thanks Matthias > > But we have observed the below 2 exceptions are coming in root-exc