k8s operator - clearing operator state

2023-09-04 Thread Krzysztof Chmielewski
Hi community, I would like to ask how one can clear Flink's k8s operator state. I have a sandbox k8s cluster with the Flink k8s operator where I've deployed a Flink session cluster with a few Session jobs. After some playing around, and breaking a few things here and there, I see this log: 023-0

Re: Re: Operator state in New Source API

2021-12-23 Thread Yun Gao
Hi Krzysztof, Sorry, there is indeed no document that says the operator state is only kept in memory, but based on the current implementation this is indeed the case. I also need to correct one point: the SplitEnumerator is executed on the JM side inside the OperatorCoordinator

Re: Operator state in New Source API

2021-12-23 Thread Krzysztof Chmielewski
Thank you both. Yes, it seems that the only option on a non-keyed operator would be List State, my bad. Yun Gao, I'm wondering where you got the information that "Flink only support in-memory operator state"; can you point me to the documentation that says that? I cannot find any

Re: Operator state in New Source API

2021-12-22 Thread Yun Gao
Hi Krzysztof, If I understand correctly, I think managed operator state might not help here, since Flink currently only supports in-memory operator state. Would it be possible to first have a customized SplitEnumerator skip the processed files in some other way? For example, if these files

Re: Operator state in New Source API

2021-12-22 Thread Yun Tang
Hi Krzysztof, Non-keyed operator state only supports list-like state [1], as there exists no primary key in operator state. That is to say, you cannot use map state in a source operator. [1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/state

Operator state in New Source API

2021-12-22 Thread Krzysztof Chmielewski
Hi, Is it possible to use managed operator state like MapState in an implementation of the new unified source interface [1]? I'm especially interested in using Managed State in a SplitEnumerator implementation. I have a use case that is a variation of File Source where I will have a great numb

Re: Storing Operator state in RocksDb during runtime - plans

2020-04-10 Thread Congxian Qiu
> Fabian > > Am So., 5. Apr. 2020 um 22:33 Uhr schrieb KristoffSC < > krzysiek.chmielew...@gmail.com>: > >> Hi, >> according to [1] operator state and broadcast state (which is a "special" >> type of operator state) are not stored in RocksDb during runt

Re: Is it possible to emulate keyed state with operator state?

2020-04-10 Thread David Anderson
-each-record-to-a-processor-task-slot-td16483.html On Wed, Apr 8, 2020 at 9:10 AM Salva Alcántara wrote: > > Just for the sake of experimenting and learning. Let's assume that we have a > keyed process function using keyed state and we want to rewrite it using > operator state

Is it possible to emulate keyed state with operator state?

2020-04-08 Thread Salva Alcántara
Just for the sake of experimenting and learning. Let's assume that we have a keyed process function using keyed state and we want to rewrite it using operator state. The question is, would it be possible to keep the exact same behaviour? For example, one could use operator union list stat
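The idea raised in this question can be pictured with a small plain-Python simulation (not Flink's API). It assumes that on restore every subtask receives the full union of all entries and then keeps only the keys it is responsible for. The `owner` function and its CRC32-based routing are hypothetical, chosen only to have a stable hash; Flink's keyed streams route by key groups, so matching their behaviour exactly would need more care.

```python
import zlib


def owner(key: str, parallelism: int) -> int:
    """Hypothetical routing rule: stable hash of the key modulo parallelism."""
    return zlib.crc32(key.encode("utf-8")) % parallelism


def snapshot(per_subtask_state):
    """Flatten each subtask's dict into a list of (key, value) entries."""
    return [list(state.items()) for state in per_subtask_state]


def restore_union(snapshots, new_parallelism):
    """Union restore: each subtask sees all entries, then keeps only its own keys."""
    all_entries = [entry for entries in snapshots for entry in entries]
    restored = []
    for subtask in range(new_parallelism):
        restored.append(
            {k: v for k, v in all_entries if owner(k, new_parallelism) == subtask}
        )
    return restored


# Three subtasks before rescaling, two after.
old_state = [{"a": 1, "b": 2}, {"c": 3}, {"d": 4, "e": 5}]
new_state = restore_union(snapshot(old_state), new_parallelism=2)
```

The filtering step is what keeps each key on exactly one subtask; without it, every subtask would hold (and later snapshot) every key.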

Re: Storing Operator state in RocksDb during runtime - plans

2020-04-06 Thread Fabian Hueske
Hi Kristoff, I'm not aware of any concrete plans for such a feature. Best, Fabian Am So., 5. Apr. 2020 um 22:33 Uhr schrieb KristoffSC < krzysiek.chmielew...@gmail.com>: > Hi, > according to [1] operator state and broadcast state (which is a "special" > type of

Storing Operator state in RocksDb during runtime - plans

2020-04-05 Thread KristoffSC
Hi, according to [1] operator state and broadcast state (which is a "special" type of operator state) are not stored in RocksDb during runtime when RocksDb is chosen as state backend. Are there any plans to change this? I'm working around a case where I will have a quite large

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-12-04 Thread Aaron Levin
then JM >> > may encounter OOM because each TDD will contains all the state of all >> > subtasks[1] >> > >> > [1] >> > https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-operator-state >> > B

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-27 Thread Gyula Fóra
wrote: > > > > Hi > > > > Do you use UNION state in your scenario, when using UNION state, then JM > may encounter OOM because each TDD will contains all the state of all > subtasks[1] > > > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/d

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-27 Thread Aaron Levin
tasks[1] > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-operator-state > Best, > Congxian > > > On Wed, Nov 27, 2019 at 3:55 AM, Aaron Levin wrote: >> >> Hi, >> >> Some context: after a refactoring, we were unabl

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-27 Thread Congxian Qiu
Hi, Do you use UNION state in your scenario? When using UNION state, the JM may encounter OOM because each TDD will contain all the state of all subtasks [1] [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-operator-state Best, Congxian Aaron
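To illustrate why union state can become large, here is a plain-Python simulation (not Flink's API, and not a diagnosis of the case in this thread). It assumes that on each restore every subtask receives the entries of all subtasks and then writes all of them back, unfiltered, in its next snapshot. All names are hypothetical.

```python
def union_restore(snapshots, parallelism):
    """Union redistribution: every subtask receives the entries of all subtasks."""
    all_entries = [entry for entries in snapshots for entry in entries]
    return [list(all_entries) for _ in range(parallelism)]


def total_entries(per_subtask):
    """Total number of entries held across all subtasks."""
    return sum(len(entries) for entries in per_subtask)


parallelism = 4
state = [[f"offset-{i}"] for i in range(parallelism)]  # one entry per subtask
sizes = [total_entries(state)]
for _ in range(3):
    # Assumption: each subtask snapshots everything it was handed, unfiltered.
    state = union_restore(state, parallelism)
    sizes.append(total_entries(state))
```

Under that assumption the total grows by a factor of `parallelism` per restore cycle, which is why union state is usually filtered or deduplicated after restore.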

What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-26 Thread Aaron Levin
file for the checkpoint was 1.3GB (usually 11MB). Inside the `_metadata` file we saw `- 1402496 offsets: com.stripe.flink.backfill.kafka-archive-file-progress`. This happened to be the operator state we were no longer initializing or snapshotting after the refactoring. Before I dig further into thi

Re: Per Operator State Monitoring

2019-11-26 Thread Yu Li
Hi Aaron, I don't think we have such fine-grained metrics on per-operator state size, but from your description that "YARN kills containers who are exceeding their memory limits", I think the root cause is not the state size but related to the memory consumption of the state backend. My guess is

Re: Question about to modify operator state on Flink1.7 with State Processor API

2019-11-26 Thread Kaihao Zhao
hen configuring the source[1, 2]. When resuming your job pass the > --allowNonRestoredState flag and your offsets will reset but all other state > will be retained. > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#matching-operator-state > >

Re: Question about to modify operator state on Flink1.7 with State Processor API

2019-11-25 Thread vino yang
:39 wrote: > Hi, > > We are running Flink 1.7 and recently due to Kafka cluster migration, we > need to find a way to modify kafka offset in FlinkKafkaConnector's state, > and we found Flink 1.9's State Processor API is the exactly tool we need, > we are able to modify the opera

Question about to modify operator state on Flink1.7 with State Processor API

2019-11-25 Thread Kaihao Zhao
Hi, We are running Flink 1.7 and recently, due to a Kafka cluster migration, we need to find a way to modify the Kafka offsets in FlinkKafkaConnector's state. We found that Flink 1.9's State Processor API is exactly the tool we need; we are able to modify the opera

Re: Per Operator State Monitoring

2019-11-25 Thread Piotr Nowojski
Hi, I’m not sure if there is some simple way of doing that (maybe some other contributors will know more). There are two potential ideas worth exploring: - use periodically triggered save points for monitoring? If I remember correctly save points are never incremental - use save point input/out

Per Operator State Monitoring

2019-11-22 Thread Aaron Langford
Hey Flink Community, I'm working on a Flink application where we are implementing operators that extend the RichFlatMap and RichCoFlatMap interfaces. As a result, we are working directly with Flink's state API (ValueState, ListState, MapState). Something that appears to be extremely valuable is ha

Re: Operator state

2019-08-08 Thread Yun Tang
From: Navneeth Krishnan Sent: Thursday, August 8, 2019 13:50 To: user Subject: Operator state Hi All, Is there a way to share operator state among operators? For example, I have an operator which has union state and the same state data is required in a downstream operator. If not, is

Operator state

2019-08-07 Thread Navneeth Krishnan
Hi All, Is there a way to share operator state among operators? For example, I have an operator which has union state and the same state data is required in a downstream operator. If not, is there a recommended way to share the state? Thanks

Re: flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-16 Thread Andrey Zagrebin
The stack trace shows that the state is being restored which has probably already happened after job restart. I am wondering why it has been restarted after some time of running. Could you share full job/task manager logs? On Thu, May 16, 2019 at 6:26 AM anaray wrote: > Thank You Andrey. Arity o

Re: flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-15 Thread anaray
Thank you, Andrey. The arity of the job has not changed. The issue here is that the job will run for some time (with checkpointing enabled) and then run into the above exception. The job keeps restarting afterwards. One thing that I want to point out here is that we have a custom *serialization sche

Re: flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-15 Thread Andrey Zagrebin
y of some Tuple type changed in some operator state. The number of tuple fields could have increased after job restart. In that case Flink expects tuples with more fields stored in checkpoint and fails. Such change would require an explicit state migration. Could it be the case? When did the failure

flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-14 Thread anaray
options. Please advise. java.lang.IllegalStateException: Could not initialize operator state backend. at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initOperatorState(AbstractStreamOperator.java:301) at

Re: Flink can't initialize operator state backend when starting from checkpoint

2018-09-21 Thread Marvin777
Hi, I do not use custom serializers, and my job contains only source and sink(BucketingSink). What causes this phenomenon in general? I suggest that you also update to a newer version, at least the latest > bugfix release Which version does this sentence refer to? And could you please help l

Re: Flink can't initialize operator state backend when starting from checkpoint

2018-09-21 Thread Stefan Richter
Hi, that is correct. If you are using custom serializers you should double check their correctness, maybe using our test base for type serializers. Another reason could be that you modified the job in a way that silently changed the schema somehow. Concurrent use of serializers across different

Re: Flink can't initialize operator state backend when starting from checkpoint

2018-09-21 Thread vino yang
Hi Qingxiang, Several days ago, Stefan described the causes of this anomaly in a problem similar to this: Typically, these problems have been observed when something was wrong with a serializer or a stateful serializer was used from multiple threads. Thanks, vino. On Fri, Sep 21, 2018 at 3:20 PM, Marvin777

Flink can't initialize operator state backend when starting from checkpoint

2018-09-21 Thread Marvin777
Hi all, When a Flink (1.4.2) job starts, it can find the checkpoint files on HDFS, but an exception occurs during deserialization: [image: image.png] Do you have any insight on this? Thanks, Qingxiang Ma

Re: Need a map-like state in an operator state

2018-06-21 Thread xsheng
Solved it by using a key selector that returns a constant, which creates a "pseudo" keyedStream with only one partition.
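The effect of a constant key selector can be sketched with a plain-Python simulation (not Flink's API). It assumes a simplified routing rule, hash of the key modulo parallelism; Flink itself routes through key groups, but the consequence of a constant key is the same: every record lands on a single parallel instance. The `key_by` helper is hypothetical.

```python
def key_by(records, key_selector, parallelism):
    """Route each record to a partition chosen from the hash of its key."""
    partitions = [[] for _ in range(parallelism)]
    for record in records:
        partitions[hash(key_selector(record)) % parallelism].append(record)
    return partitions


records = [{"id": i} for i in range(10)]

# Keyed by a real field: records spread over the partitions.
by_id = key_by(records, lambda r: r["id"], parallelism=4)

# Keyed by a constant: all records end up in one partition.
pseudo = key_by(records, lambda r: 0, parallelism=4)
```

The trade-off is that the keyed operator effectively runs with a parallelism of one, so this workaround gives access to map-like keyed state at the cost of scalability for that operator.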

Need a map-like state in an operator state

2018-06-21 Thread xsheng
Hi All, I'm sorry if I'm double posting, but I posted it before subscribing and I don't see it in my post lists. So I'm posting it again. The Flink app we are trying to build is as such: read from kafka, sort the messages according to some dependency rules, and only send messages that have satis

Re: Managed operator state treating state of all parallel operators as the same

2018-02-01 Thread m@xi
Hello Gyula & Stefan, Below I attach a similar situation that I am trying to resolve, [1] I am also using *managed operator state*, but I have some trouble with the flink documentation. I believe it is not that clear. So, I have the following questions: 1 -- Can I concatenate all the par

Re: Questions about managed operator state

2018-01-14 Thread Boris Lublinsky
> > 2018-01-14 2:39 GMT+01:00 Boris Lublinsky: > Documentation > https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html#using-managed-operator-state

Re: Questions about managed operator state

2018-01-14 Thread Fabian Hueske
9 GMT+01:00 Boris Lublinsky : > Documentation https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html#using-managed-operator-state > Refers to CheckpointedRestoring interface. > *Which jar defines this interface - can’t find it* > > *Al

Questions about managed operator state

2018-01-13 Thread Boris Lublinsky
Documentation https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html#using-managed-operator-state refers to CheckpointedRestoring int

Re: Managed operator state treating state of all parallel operators as the same

2017-07-04 Thread Gyula Fóra
e this functionality? I think I don't have enough knowledge about > Flink's internals to implement this easily. > > Gerard > > > > -- > View this message in context: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Managed-operator-state-treating-s

Re: Managed operator state treating state of all parallel operators as the same

2017-07-04 Thread Stefan Richter
s (e.g. snapshot and restore) to > achieve this functionality? I think I don't have enough knowledge about > Flink's internals to implement this easily. > > Gerard > > > > -- > View this message in context: > http://apache-flink-user-mailing-list-archiv

Re: Managed operator state treating state of all parallel operators as the same

2017-07-04 Thread gerardg
ternals to implement this easily. Gerard -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Managed-operator-state-treating-state-of-all-parallel-operators-as-the-same-tp14102p14111.html Sent from the Apache Flink User Mailing List archive. mai

Re: Managed operator state treating state of all parallel operators as the same

2017-07-03 Thread Stefan Richter
> > 2017-07-03 17:50 GMT+02:00 gerardg: > Hello, > > Is it possible to have managed operator state where all the parallel > operators know that they have the same state? Therefore, it would be only > snapshotted by one of them and in case

Re: Managed operator state treating state of all parallel operators as the same

2017-07-03 Thread Fabian Hueske
e managed operator state where all the parallel > operators know that they have the same state? Therefore, it would be only > snapshotted by one of them and in case of failure (or rescaling) all would > use that snapshot. > > For the rescaling case I guess I could use union redistributio

Managed operator state treating state of all parallel operators as the same

2017-07-03 Thread gerardg
Hello, Is it possible to have managed operator state where all the parallel operators know that they have the same state? Therefore, it would be only snapshotted by one of them and in case of failure (or rescaling) all would use that snapshot. For the rescaling case I guess I could use union

Re: A way to control redistribution of operator state?

2017-02-15 Thread Stefan Richter
list? > > Best regards, > Dmitry > > On Tue, Feb 14, 2017 at 9:33 AM, Stefan Richter wrote: > Hi, > > there is something that we call "raw keyed" operator state, which might > exactly serve your purpose. It is already u

Re: A way to control redistribution of operator state?

2017-02-14 Thread Dmitry Golubets
results in correct behavior. Can I rely on Flink distributing state evenly and in the order I return it in the list? Best regards, Dmitry On Tue, Feb 14, 2017 at 9:33 AM, Stefan Richter wrote: > Hi, > > there is something that we call "raw keyed“ operator state, which might > exa

Re: A way to control redistribution of operator state?

2017-02-14 Thread Stefan Richter
Hi, there is something that we call "raw keyed" operator state, which might exactly serve your purpose. It is already used internally by Flink’s window operator, but there currently exists no public API for this feature. The way it works currently is that you obtain input and output streams

Re: A way to control redistribution of operator state?

2017-02-13 Thread Tzu-Li (Gordon) Tai
to the user and making it configurable. Looping in Stefan (in cc) who mostly worked on this part and see if he can provide more info. - Gordon On February 14, 2017 at 2:30:27 AM, Dmitry Golubets (dgolub...@gmail.com) wrote: Hi, It looks impossible to implement a keyed state with operator state

A way to control redistribution of operator state?

2017-02-13 Thread Dmitry Golubets
Hi, It looks impossible to implement a keyed state with operator state now. I know it sounds like "just use a keyed state", but the latter requires updating it on every value change, as opposed to operator state, and thus can be expensive (especially if you have to deal with mutable structu

Re: Partitioning operator state

2016-12-08 Thread Dominik Safaric
? Dominik > On 8 Dec 2016, at 10:04, Stefan Richter wrote: > > Hi Dominik, > > as Gordon’s response only covers keyed-state, I will briefly explain what > happens for non-keyed operator state. In contrast to Flink 1.1, Flink 1.2 > checkpointing does not write a single blac

Re: Partitioning operator state

2016-12-08 Thread Stefan Richter
Hi Dominik, as Gordon’s response only covers keyed-state, I will briefly explain what happens for non-keyed operator state. In contrast to Flink 1.1, Flink 1.2 checkpointing does not write a single blackbox object (e.g. ONE object that is a set of all kafka offsets is emitted), but a list of
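The point about snapshotting a list of entries rather than a single opaque object can be pictured with a plain-Python simulation (not Flink's API). It assumes the entries of all old subtasks are pooled and dealt out round-robin to the new subtasks; the round-robin order is an assumption for illustration, and the exact assignment Flink uses is not stated in this snippet.

```python
def redistribute_even(snapshots, new_parallelism):
    """Pool the list entries of all old subtasks and split them evenly."""
    all_entries = [entry for entries in snapshots for entry in entries]
    new_state = [[] for _ in range(new_parallelism)]
    for i, entry in enumerate(all_entries):
        # Hypothetical round-robin assignment of entries to new subtasks.
        new_state[i % new_parallelism].append(entry)
    return new_state


# Three old subtasks, each holding two entries (e.g. Kafka partition offsets).
old = [["p0", "p1"], ["p2", "p3"], ["p4", "p5"]]

scaled_out = redistribute_even(old, 4)  # rescale 3 -> 4 subtasks
scaled_in = redistribute_even(old, 2)   # rescale 3 -> 2 subtasks
```

Because the unit of redistribution is a single list entry, rescaling works only if each entry is meaningful on its own; a single black-box object could not be split this way.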

Re: Partitioning operator state

2016-12-07 Thread Tzu-Li (Gordon) Tai
-LuBCyPUWyFd9l3T9WyssQ63w/edit# On December 8, 2016 at 4:40:18 AM, Dominik Safaric (dominiksafa...@gmail.com) wrote: Hi everyone, In the case of scaling out a Flink cluster, how does Flink handle operator state partitioning of a staged topology? Regards, Dominik

Partitioning operator state

2016-12-07 Thread Dominik Safaric
Hi everyone, In the case of scaling out a Flink cluster, how does Flink handle operator state partitioning of a staged topology? Regards, Dominik

Re: Apache Flink Operator State as Query Cache

2015-11-17 Thread Aljoscha Krettek
resources and as such recover a bit faster. There are > use cases for that as well. > > Storing the model in OperatorState is a good idea, if you can. On the roadmap > is to migrate the operator state to managed memory as well, so that should > take care of the GC issues. > >

Re: Apache Flink Operator State as Query Cache

2015-11-16 Thread Welly Tambunan
t better >>> throughput/latency/consistency and have one less system to worry about >>> (external k/v store). State outside means that the Flink processes can be >>> slimmer and need fewer resources and as such recover a bit faster. There >>> are use cases for that as

Re: Apache Flink Operator State as Query Cache

2015-11-16 Thread Stephan Ewen
a bit faster. There >> are use cases for that as well. >> >> Storing the model in OperatorState is a good idea, if you can. On the >> roadmap is to migrate the operator state to managed memory as well, so that >> should take care of the GC issues. >> >> We ar

Re: Apache Flink Operator State as Query Cache

2015-11-16 Thread Anwar Rizal
aster. There > are use cases for that as well. > > Storing the model in OperatorState is a good idea, if you can. On the > roadmap is to migrate the operator state to managed memory as well, so that > should take care of the GC issues. > > We are just adding functionality to make t

Re: Apache Flink Operator State as Query Cache

2015-11-15 Thread Welly Tambunan
.0/ >>>> >>>> On Fri, Nov 13, 2015 at 9:50 AM, Welly Tambunan >>>> wrote: >>>> >>>>> Hi Aljoscha, >>>>> >>>>> Thanks for this one. Looking forward for 0.10 release version. >>>>> >>>>>

Re: Apache Flink Operator State as Query Cache

2015-11-15 Thread Kostas Tzoumas
lly Tambunan >>> wrote: >>> >>>> Hi Aljoscha, >>>> >>>> Thanks for this one. Looking forward for 0.10 release version. >>>> >>>> Cheers >>>> >>>> On Thu, Nov 12, 2015 at 5:34 PM, Aljoscha Krettek

Re: Apache Flink Operator State as Query Cache

2015-11-13 Thread Welly Tambunan
>>> Cheers >>> >>> On Thu, Nov 12, 2015 at 5:34 PM, Aljoscha Krettek >>> wrote: >>> >>>> Hi, >>>> I don’t know yet when the operator state will be transitioned to >>>> managed memory but it could happen for 1.0 (which will come af

Re: Apache Flink Operator State as Query Cache

2015-11-13 Thread Welly Tambunan
>> >> On Thu, Nov 12, 2015 at 5:34 PM, Aljoscha Krettek >> wrote: >> >>> Hi, >>> I don’t know yet when the operator state will be transitioned to managed >>> memory but it could happen for 1.0 (which will come after 0.10). The good >>> thing

Re: Apache Flink Operator State as Query Cache

2015-11-13 Thread Robert Metzger
: > Hi Aljoscha, > > Thanks for this one. Looking forward for 0.10 release version. > > Cheers > > On Thu, Nov 12, 2015 at 5:34 PM, Aljoscha Krettek > wrote: > >> Hi, >> I don’t know yet when the operator state will be transitioned to managed >> memory

Re: Apache Flink Operator State as Query Cache

2015-11-13 Thread Welly Tambunan
Hi Aljoscha, Thanks for this one. Looking forward for 0.10 release version. Cheers On Thu, Nov 12, 2015 at 5:34 PM, Aljoscha Krettek wrote: > Hi, > I don’t know yet when the operator state will be transitioned to managed > memory but it could happen for 1.0 (which will come after 0

Re: Apache Flink Operator State as Query Cache

2015-11-12 Thread Aljoscha Krettek
Hi, I don’t know yet when the operator state will be transitioned to managed memory but it could happen for 1.0 (which will come after 0.10). The good thing is that the interfaces won’t change, so state can be used as it is now. For 0.10, the release vote is winding down right now, so you can

Re: Apache Flink Operator State as Query Cache

2015-11-11 Thread Welly Tambunan
Hi Stephan, >Storing the model in OperatorState is a good idea, if you can. On the roadmap is to migrate the operator state to managed memory as well, so that should take care of the GC issues. Is this using off-heap memory? In which version can we expect this to be available? Anot

Re: Apache Flink Operator State as Query Cache

2015-11-11 Thread Stephan Ewen
for that as well. Storing the model in OperatorState is a good idea, if you can. On the roadmap is to migrate the operator state to managed memory as well, so that should take care of the GC issues. We are just adding functionality to make the Key/Value operator state usable in CoMap/CoFlatMap as

Re: Apache Flink Operator State as Query Cache

2015-11-08 Thread Welly Tambunan
e, so I'm thinking that instead of using an in-memory cache like redis, ignite etc., I can just use operator state for this one. I just want to gauge whether I need to use a memory cache or whether operator state would be just fine. However, I'm concerned about the Gen 2 Garbage Collection for caching our own stat

Re: Apache Flink Operator State as Query Cache

2015-11-06 Thread Anwar Rizal
Let me understand your case better here. You have a stream of model and stream of data. To process the data, you will need a way to access your model from the subsequent stream operations (map, filter, flatmap, ..). I'm not sure in which case Operator State is a good choice, but I think yo

Apache Flink Operator State as Query Cache

2015-11-06 Thread Welly Tambunan
subsequent query. We are considering using Flink Operator state for that one. Is it the right approach to use that as a memory cache? Or is it preferable to use a memory cache like redis etc.? Any comments will be appreciated. Cheers -- Welly Tambunan Triplelands http://weltam.wordpress.com http