Re: Order of events in Broadcast State

2021-12-03 Thread Alexey Trenikhun
[1] - https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/broadcast_state/

Log level for insufficient task slots message

2021-12-03 Thread Mason Chen
Hi all, "java.util.concurrent.CompletionException: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not acquire the minimum required resources." is an exception/message that is thrown when a user misconfigures the job with insufficient task slots. Currently, t

Order of events in Broadcast State

2021-12-03 Thread Alexey Trenikhun
Hello, I'm trying to understand what the statement "Order of events in Broadcast State may differ across tasks" in [1] means. Let's say I have a keyed function "A" which broadcasts a stream of rules, and a KeyedBroadcastProcessFunction "B" which receives the rules and updates broadcast state, like the example in [1]. Let's

enable.auto.commit=true and checkpointing turned on

2021-12-03 Thread Vishal Santoshi
Hello folks, 2 questions: 1. If we have enabled enable.auto.commit and checkpointing, and we restart a Flink application (without a checkpoint or savepoint), would the Kafka consumer start consuming from the last offset committed to Kafka? 2. What if, in the above scenario, we have "auto.o

Re: Parquet schema per bucket in Streaming File Sink

2021-12-03 Thread Zack Loebel
Unfortunately this does not solve my use case, because I want to be able to create and change the various outputs at runtime (the partition keys would be dynamic), and as such the SQL/extraction would have to change during execution, which I did not believe to be supported. I'm also operating at the

Re: Stateful functions module configurations (module.yaml) per deployment environment

2021-12-03 Thread Deniz Koçak
Hi Igal, We are using the official images from Ververica as the Flink installation. Actually, I was hoping to specify the file names to use at runtime via `mainArgs` in the deployment configuration (or in some other way, maybe). This way we could specify the target YAML files, but I think

Re: Stateful function endpoint self-signed certificate problem

2021-12-03 Thread Deniz Koçak
Hi Igal, Thanks for the response. We sorted it out by deploying the required certs to our images. Thanks, Deniz On Fri, Dec 3, 2021 at 3:15 PM Igal Shilman wrote: > > Hi Deniz, > My apologies for the late reply, I assume that by now you have figured this > out since I've seen your followup qu

Re: Stateful function endpoint self-signed certificate problem

2021-12-03 Thread Igal Shilman
Hi Deniz, My apologies for the late reply. I assume that by now you have figured this out, since I've seen your follow-up question :-) StateFun uses the trust store configured in the JVM, so if you can install your certificate there, StateFun should transparently pick it up. Good luck, Igal. On Fr

Re: Stateful functions module configurations (module.yaml) per deployment environment

2021-12-03 Thread Igal Shilman
Hi Deniz, StateFun looks for module.yaml(s) on the classpath. If you are submitting the job to an existing Flink cluster, this means that it needs to be either: 1. packaged with the jar (like you are already doing), or 2. present on the classpath, which means that you can place your

Re: Cannot consum from Kinesalite using FlinkKinesisConsumer

2021-12-03 Thread Mika Naylor
Hey Jonas, May I ask what version of Kinesalite you're targeting? With 3.3.3 and STREAM_INITIAL_POSITION = "LATEST", I received a "The timestampInMillis parameter cannot be greater than the currentTimestampInMillis" error, which may be a misconfiguration in my setup, but with STREAM_INITIAL_POSITION = "

Re: custom metrics in elasticsearch ActionRequestFailureHandler

2021-12-03 Thread Lars Bachmann
Hi Alexander, yes, in the first iteration the use case is to get visibility into failed ES requests. Usually we expose metrics to count failures, integrate them into dashboards, and set up alerting rules which fire when they hit a certain threshold. In non-Flink-based applications which index

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-03 Thread Till Rohrmann
Thanks for starting this discussion, Yingjie. How will our tests be affected by these changes? Will Flink require more resources and thus risk destabilizing our testing infrastructure? I would propose to create a FLIP for these changes since you propose to change the default behaviour. I

Re: PyFlink import internal packages

2021-12-03 Thread Shuiqiang Chen
Hi, Actually, you can develop your app in the clean Python way. It's fine to split the code into multiple files, and there is no need to call `env.add_python_file()` explicitly. When submitting the PyFlink job, you can specify the Python files and the entry main module with the options --pyFiles and --p

PyFlink import internal packages

2021-12-03 Thread Королькевич Михаил
Hi Flink Team, I'm trying to implement an app in PyFlink. I would like to structure the directory as follows: flink_app/ data_service/ s3.py filesystem.py validator/ validator.py metrics/ statictic.py quality.py common/ constants.py main.py <- e

[DISCUSS] Change some default config values of blocking shuffle

2021-12-03 Thread Yingjie Cao
Hi dev & users, We propose to change some default values of blocking shuffle to improve the out-of-the-box user experience (streaming is not affected). The default values we want to change are as follows: 1. Data compression (taskmanager.network.blocking-shuffle.compression.enabled): Currently, the def

Re: Re: how to run streaming process after batch process is completed?

2021-12-03 Thread Joern Kottmann
Hello, Are there plans to support checkpoints for batch mode? I currently load the state back via the DataStream API, but this gets more and more complicated and doesn't always lead to a perfect state restore (as Flink could have done). This is one of my most wanted Flink features these days. Re

Re: Pyflink/Flink Java parquet streaming file sink for a dynamic schema stream

2021-12-03 Thread Georg Heiler
Hi, the schema of the "after" part depends on each table, i.e. it holds different columns for each table. So do you receive Debezium changelog statements for all/>1 tables? I.e. is the schema in the "after" part different? Best, Georg On Fri, Dec 3, 2021 at 08:35, Kamil ty wrote: > Yes the gener

Re: custom metrics in elasticsearch ActionRequestFailureHandler

2021-12-03 Thread Alexander Preuß
Hi Lars, What is your use case for the failure handler? Just collecting metrics? We want to remove the configurable failure handler in the new Sink API implementation of the Elasticsearch connector in Flink 1.15 because it can be a huge footgun with regard to delivery guarantees. Best Regards, A