Re: RocksDB Statebackend

2016-04-12 Thread Maxim
would also be big win for jobs with a large number of keys. Thanks, Maxim. On Tue, Apr 12, 2016 at 10:39 AM, Stephan Ewen wrote: > Concerning the size of RocksDB snapshots - I am wondering if RocksDB > simply does not compact for a long time, thus having a lot of stale data in > the

Re: How to perform this join operation?

2016-04-13 Thread Maxim
You could simulate the Samza approach by having a RichFlatMapFunction over cogrouped streams that maintains the sliding window in its ListState. As I understand the drawback is that the list state is not maintained in the managed memory. I'm interested to hear about the right way to implement this.

Task Slots and Heterogeneous Tasks

2016-04-14 Thread Maxim
cross all tasks running on it or each task gets its quota. Could you clarify it? Thanks, Maxim.

Re: Task Slots and Heterogeneous Tasks

2016-04-15 Thread Maxim
such subtasks are IO bound and require no shared memory? On Fri, Apr 15, 2016 at 5:31 AM, Till Rohrmann wrote: > Hi Maxim, > > concerning your second part of the question: The managed memory of a > TaskManager is first split among the available slots. Each slot portion of > the ma

Control triggering on empty window

2016-04-20 Thread Maxim
(); } } I could implement the same logic using global window and custom trigger and evictor, but it looks like ugly workaround to me. Is there any better way to solve this use case? Thanks, Maxim.

Re: Control triggering on empty window

2016-04-21 Thread Maxim
tem has no reference point about what windows there should > exist because there is no knowledge about time except when looking at > elements. > > Cheers, > Aljoscha > > On Thu, 21 Apr 2016 at 01:31 Maxim wrote: > >> I have the following use case: >> >> Inp

Re: Long blocking call in UserFunction triggers HA leader lost?

2020-11-12 Thread Maxim Parkachov
implement background loading in separate thread, but this solution was more complicated, we needed to create shadow copy of cache and then quickly switch them. And with spark streaming there were additional problems. Hope this helps, Maxim.

New kafka producer on each checkpoint

2020-04-06 Thread Maxim Parkachov
fine, but when I look to the log on INFO level, I see that with each checkpoint, new kafka producer is created and then closed again. 1. Is this how it is supposed to work ? 2. Is checkpoint interval 10 second too often ? Thanks, Maxim.

Re: New kafka producer on each checkpoint

2020-04-13 Thread Maxim Parkachov
rrent checkpoints accordingly. I assumed that this is also true for post 011 producers as well. I expected to have 5 (default) producers created and used without re-instantiating producer each time. In my case checkpoint is so fast that I will never have concurrent checkpoints. Regards, Maxim. On Wed,

Wait for cancellation event with CEP

2020-04-30 Thread Maxim Parkachov
with putting "forward" events in state with timer. If "cancel" event is coming I will delete "forward" event from state. My question: Is there more simple way to implement the same logic, possibly with CEP ? Thanks, Maxim.

Re: Wait for cancellation event with CEP

2020-05-04 Thread Maxim Parkachov
Hi Till, thank you for very detailed answer, now it is absolutely clear. Regards, Maxim. On Thu, Apr 30, 2020 at 7:19 PM Till Rohrmann wrote: > Hi Maxim, > > I think your problem should be solvable with the CEP library: > > So what we are doing here is to define a pattern forw

Flink streaming job logging reserves space

2020-07-30 Thread Maxim Parkachov
reserves the space. After I restart Flink job, space is immediately returned back. I'm sure that flink job is the problem, I have re-produces issue on a cluster where only 1 filnk job was running. Below is my log4 config. Any help or idea is appreciated. Thanks in advance,

Re: Flink streaming job logging reserves space

2020-08-03 Thread Maxim Parkachov
s to files and then I could roll them, but then I don't see logs in the GUI. So my question would be, how to make them roll ? Regards, Maxim. On Tue, Aug 4, 2020 at 4:48 AM Yang Wang wrote: > Hi Maxim, > > First, i want to confirm with you that do you have checked all the > &quo

Re: Flink streaming job logging reserves space

2020-08-04 Thread Maxim Parkachov
Hi Yang, Thanks for your advice, now I have a good reason to upgrade to 1.11. Regards, Maxim. On Tue, Aug 4, 2020 at 9:39 AM Yang Wang wrote: > AFAIK, there is no way to roll the *.out/err files except we hijack the > stdout/stderr in Flink code. However, it is a temporary hack. >

Re: Exactly-once semantics in Flink Kafka Producer

2019-08-02 Thread Maxim Parkachov
Hi Vasily, as far as I know, by default console-consumer reads uncommited. Try setting isolation.level to read_committed in console-consumer properties. Hope this helps, Maxim. On Fri, Aug 2, 2019 at 12:42 PM Vasily Melnik < vasily.mel...@glowbyteconsulting.com> wrote: > Hi, Eduardo

Flink 1.9, MapR secure cluster, high availability

2019-08-27 Thread Maxim Parkachov
? (trying to do now, but getting some errors during compilation). Can I somehow force flink to use MapR zookeeper even with HA mode ? Thanks in advance, Maxim.

Re: Flink 1.9, MapR secure cluster, high availability

2019-08-30 Thread Maxim Parkachov
initialize BLOB and org.apache.zookeeper (which is as well MapR) for HA recovery. It works, but, I was expecting it to work without compiling MapR dependencies. Hope this helps, Maxim. On Thu, Aug 29, 2019 at 7:00 PM Stephan Ewen wrote: > Hi Maxim! > > The change of the MapR dependen

Re: Flink 1.9, MapR secure cluster, high availability

2019-09-16 Thread Maxim Parkachov
Hi Stephan, sorry for the late answer, didn't have access to cluster. Here is log and stacktrace. Hope this helps, Maxim. - 2019-09-16 18:00:31,804

Re: Initialization of broadcast state before processing main stream

2019-11-14 Thread Maxim Parkachov
ared lock in zookeeper. Hope this helps, Maxim. On Thu, Nov 14, 2019 at 7:42 AM vino yang wrote: > Hi Vasily, > > Currently, Flink did not do the coordination between a general stream and > broadcast stream, they are both streams. Your scene of using the broadcast > state is a spec

Flink 1.10 on MapR secure cluster with high availability

2020-02-05 Thread Maxim Parkachov
uses shaded zookeeper (org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn) which doesn't have MapR specific changes and fails to authenticate. I would really appreciate any help in resolving this issue.I'm ready to provide any required details. Regards, Maxim.

Re: Flink 1.10 on MapR secure cluster with high availability

2020-02-05 Thread Maxim Parkachov
Hi Chesnay, thanks for advise. Will it work if I include MapR specific zookeeper in job dependencies and still use out-of-box Flink binary distribution ? Regards, Maxim. On Wed, Feb 5, 2020 at 3:25 PM Chesnay Schepler wrote: > You must rebuild Flink while overriding zookeeper.version prope

Re: Flink 1.10 on MapR secure cluster with high availability

2020-02-07 Thread Maxim Parkachov
Hi Chesnay, I managed to re-compile with MapR zookeeper and can confirm that it works with HA as well. Still I find it strange that HA uses shadow version of zookeeper instead of version from classpath how it is done for hadoop. Thanks, Maxim. On Wed, Feb 5, 2020 at 3:43 PM Chesnay Schepler

[Flink 1.10] Classpath doesn't include custom files in lib/

2020-02-14 Thread Maxim Parkachov
Hi everyone, I'm trying to run my job with flink 1.10 with YARN cluster per-job mode. In the previous versions all files in lib/ folder were automatically included in classpath. Now, with 1.10 I see only *.jar files are included in classpath. but not "other" files. Is this deliberate change or bug

Re: [Flink 1.10] Classpath doesn't include custom files in lib/

2020-02-17 Thread Maxim Parkachov
was overwritten with environment specific settings in lib/job.properties of flink distribution. Now this doesn't seem to work. I'm using: getClass.getClassLoader.getResource("lib/job.properties") to get file. Could it be the problem ? Thanks, Maxim. On Mon, Feb 17, 2020

Re: [Flink 1.10] Classpath doesn't include custom files in lib/

2020-02-17 Thread Maxim Parkachov
ace, instead of getting it as resource, which I do in local mode. Regards, Maxim. On Mon, Feb 17, 2020 at 3:03 PM Yang Wang wrote: > Hi Maxim, > > I have verified that the following two ways could both work. > > getClass().getClassLoader().getResource("lib/job.properties") &

Ship compiled code with broadcast stream ?

2018-10-09 Thread Maxim Parkachov
ow to achieve this. Did someone implement system like this ? Is this possible at all ? Any help is greatly appreciated, Maxim.

Re: Ship compiled code with broadcast stream ?

2018-10-09 Thread Maxim Parkachov
ble exception, but I'll check this. Referencing existing jar will not solve problem as I would need to re-submit job, which I want to avoid in the first place. I actually wanted exactly first scenario, send newly compiled objects. Regards, Maxim. >

How autoscaling works on Kinesis Data Analytics for Java ?

2019-01-28 Thread Maxim Parkachov
on AWS Data Analytics for Java give a short explanation ? Thanks in advance. Maxim.

Flink on MapR

2019-03-13 Thread Maxim Parkachov
secured cluster. Regards, Maxim.

Re: How autoscaling works on Kinesis Data Analytics for Java ?

2019-04-22 Thread Maxim Parkachov
snapshot and restarts streaming job, so no magic here, but nicely automated. Regards, Maxim. On Tue, Jan 29, 2019 at 5:23 AM Maxim Parkachov wrote: > Hi, > > I had impression, that in order to change parallelism, one need to stop > Flink streaming job and re-start with

Automatic deployment of new version of streaming stateful job

2019-07-15 Thread Maxim Parkachov
ly my question, what is best practice of automatically restarting or deploying new version of stateful streaming application ? Every tip is greatly appreciated. Thanks, Maxim.

Re: Automatic deployment of new version of streaming stateful job

2019-07-17 Thread Maxim Parkachov
Hi Marc, thanks a lot for the tool. Unfortunately, I could not direcly use it, but I will take couple of ideas and will implement my own script. Nevertherless, I'm really surprised that such functionality doesn't exist out of the box. Regards, Maxim. On Tue, Jul 16, 2019 at 9:

yarn-session vs cluster per job for streaming jobs

2019-07-17 Thread Maxim Parkachov
on mode. 2. there are often network interruptions/slowdowns. 3. I'm trying to minimise time to restart job to have as much as possible continious processing. Thanks in advance, Maxim.

Re: yarn-session vs cluster per job for streaming jobs

2019-07-17 Thread Maxim Parkachov
mode. Thanks, Maxim. On Thu, Jul 18, 2019 at 4:57 AM Haibo Sun wrote: > Hi, Maxim > > For the concern talking on the first point: > If HA and checkpointing are enabled, AM (the application master, that is > the job manager you said) will be restarted by YARN after it dies,

Re: Providing external files to flink classpath

2019-07-18 Thread Maxim Parkachov
Hi Vishwas, took me some time to find out as well. If you have your properties file under lib following will work: val kafkaPropertiesInputStream = getClass.getClassLoader.getResourceAsStream("lib/config/kafka.properties") Hope this helps, Maxim. On Wed, Jul 17, 2019 at 7:23

Initialise side input state

2017-11-02 Thread Maxim Parkachov
e and push all slow dimension data downstream, but I could not find how to hold processing fast data until state is initialised. I realise that FLIP-17 (Side Inputs) is what I need, but is there some other way to implement it now ? Thanks, Maxim.

Fwd: Initialise side input state

2017-11-02 Thread Maxim Parkachov
Hi Xingcan, On Fri, Nov 3, 2017 at 3:38 AM, Xingcan Cui wrote: > Hi Maxim, > > if I understand correctly, you actually need to JOIN the fast stream with > the slow stream. Could you please share more details about your problem? > Sure I can explain more, with some example of

Re: Forcing consuming one stream completely prior to another starting

2018-01-20 Thread Maxim Parkachov
still not sure if it’s better to read it from compacted topic every time or have additional cache in source function or state in CoProcessFunction is enough. Hope this helps and would be interested to hear your experience. Regards, Maxim.

Multiple Async IO

2018-04-03 Thread Maxim Parkachov
query inside Taking into account that each event needs all queries. Reduce amount of queries for each record is not an option. In this case I would like to minimise processing time of event, even if throughput will suffer. Any advice or consideration is greatly appreciated. Thanks, Maxim.

Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Maxim Senin via user
before it picks the right values? Does `pekko.ask.timeout` need to be sufficient for task managers to get into running state with all the restarts? Cheers, Maxim COGILITY SOFTWARE CORPORATION LEGAL DISCLAIMER: The information in this email is confidential and is

Job goes into FINISHED state after rescaling - link operator

2024-04-22 Thread Maxim Senin via user
Hi. My Flink Deployment is set to use savepoint for upgrades and for taking savepoint before stopping. When rescaling happens, for some reason it scales the JobManager to zero (“Scaling JobManager Deployment to zero with 300 seconds timeout”) and the job goes into FINISHED state. It doesn’t seem

Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-25 Thread Maxim Senin via user
tor.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:145) ... 13 more How to fix this? Why is the deployment not coming back up after this exception? Is there an configuration property to set a number of retires? Thanks, Maxim

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-25 Thread Maxim Senin via user
(PartialFunction.scala:127) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) at org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) I can’t find any information on how to interpret this. Please advise.. Cheers, Maxim From

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Maxim Senin via user
still a mystery. Thanks, Maxim From: Gyula Fóra Date: Friday, April 26, 2024 at 1:10 AM To: Maxim Senin Cc: Maxim Senin via user Subject: Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0 Hi Maxim! Regarding the status update error, it could be related to a

Re: Regarding java.lang.IllegalStateException

2024-04-26 Thread Maxim Senin via user
tor 1.9.0 . . . . . . . . . . . . . . . . . . . . . Maxim Senin Senior Backend Engineer COGILITY<http://cogility.com/> From: prashant parbhane Date: Tuesday, April 23, 2024 at 11:09 PM To: user@flink.apache.org Subject: Regarding java.lang.IllegalStateException Hello, We have been facing this weird issu

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Maxim Senin via user
Deployment [INFO ][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Deleting Kubernetes HA metadata Any ideas? Thanks, Maxim From: Gyula Fóra Date: Friday, April 26, 2024 at 1:10 AM To: Maxim Senin Cc: Maxim Senin via user Subject: Re: [External] Exception during autoscaling operation - Flink

Re: [External] Regarding java.lang.IllegalStateException

2024-04-26 Thread Maxim Senin via user
My guess it’s a major known issue. Need a workaround. https://issues.apache.org/jira/browse/FLINK-32212 /Maxim From: prashant parbhane Date: Tuesday, April 23, 2024 at 11:09 PM To: user@flink.apache.org Subject: [External] Regarding java.lang.IllegalStateException Hello, We have been facing

Operator/Autoscaler/Autotuner tuning behavior question

2024-05-08 Thread Maxim Senin via user
available resources periodically, which should prevent or reduce inadvertent greedy requests. But what’s the strategy used by the operator if the request is already too large to handle? Thanks a lot! /Maxim COGILITY SOFTWARE CORPORATION LEGAL DISCLAIMER: The

S3 schema for jar location?

2024-08-01 Thread Maxim Senin via user
When will Flink Operator support schemas other than `local` for application deployment jar files? I just tried flink operator 1.9 and it’s still not working with `s3` locations. If s3 is good for savepoints and checkpoints, why can’t the jar also be on s3? Thanks, Maxim