Re: Flink state statistics

2024-12-02 Thread Zakelly Lan
Hi Christian, I assume you're using the RocksDB state backend. You can enable some metrics from RocksDB; please refer to the doc [1]. Please note that `State#clear` only removes the key/value for the currently active key, rather than removing all keys. And the deletion will be reflected in the checkpoint
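The per-key scoping of `State#clear` described above can be illustrated outside Flink. This is a plain-Python sketch of the semantics only, not Flink's actual API; the class and key names are hypothetical:

```python
# Toy model of Flink's keyed ValueState. In Flink, state access is implicitly
# scoped to the current key; here the key is made explicit for illustration.

class KeyedValueState:
    def __init__(self):
        self._store = {}                  # key -> value

    def update(self, key, value):
        self._store[key] = value

    def value(self, key):
        return self._store.get(key)       # like ValueState#value(): None if never set

    def clear(self, key):
        # clear() removes only the entry for the current key...
        self._store.pop(key, None)

state = KeyedValueState()
state.update("user-1", 10)
state.update("user-2", 20)
state.clear("user-1")                     # ...so "user-2" survives
assert state.value("user-1") is None
assert state.value("user-2") == 20
```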

Flink state statistics

2024-12-02 Thread Christian Lorenz via user
Hi, is there a way to receive some basic statistics of a named value state in Flink? Besides, is the assumption correct that a call to org.apache.flink.api.common.state.State#clear also removes the state from the checkpoint data? We had to use a long TTL and am uncertain if the data

Re: Flink state

2024-07-22 Thread Saurabh Singh
Hi Banu, RocksDB is intelligently built to clear any no-longer-useful state from its purview. So you should be good, and any required cleanup will be done automatically by RocksDB itself. From the current documentation, it looks quite hard to relate Flink internal DS activity to RocksDB DS activity. In m

Re: Flink state

2024-07-21 Thread banu priya
Dear Community, Gentle reminder about my below email. Thanks Banu On Sat, 20 Jul, 2024, 4:37 pm banu priya, wrote: > Hi All, > > I have a flink job with RMQ Source, filters, tumbling window(uses > processing time fires every 2s), aggregator, RMQ Sink. > > I am trying to understand about states

Flink state

2024-07-20 Thread banu priya
Hi All, I have a flink job with RMQ Source, filters, tumbling window(uses processing time fires every 2s), aggregator, RMQ Sink. I am trying to understand about states and checkpoints(enabled incremental rocksdb checkpoints). In local rocks db directory, I have .sst files, log, lock, options fil

Flink State and Filesystem sink

2024-07-05 Thread Alexandre KY
Hello, I am trying to implement a satellite image processing chain. Satellite images are stored as rasters which are heavy, (several GBs) in a FileSystem (I am currently using HDFS for testing purpose but will move on S3 when I'll deploy it on the cloud). So in order to reduce the processing ti

Re: How to read flink state data without setting uid?

2022-09-22 Thread Chesnay Schepler
…11:02, BIGO wrote: I didn't set the uid for my flink operator, is there any way to read the flink state data? State Processor API requires uid. Thanks.

Re: How to read flink state data without setting uid?

2022-09-22 Thread Chesnay Schepler
Currently the state processor API does not support that. On 22/09/2022 11:02, BIGO wrote: I didn't set the uid for my flink operator, is there any way to read the flink state data? State Processor API requires uid. Thanks.

How to read flink state data without setting uid?

2022-09-22 Thread BIGO
I didn't set the uid for my flink operator, is there any way to read the flink state data? State Processor API requires uid. Thanks.

Re: Best Practice for Querying Flink State

2022-08-29 Thread Chen Qin
the idea is to emit CDC events (e.g. add/remove/update) of a Flink state to a side output, where a conditional update to serving systems could be implemented to handle restarts of the Flink job. Chen On Mon, Aug 29, 2022 at 7:15 PM Ken Krugler wrote: > Hi Lu, > > I

Re: Best Practice for Querying Flink State

2022-08-29 Thread Ken Krugler
processing Flink state (option 3) then this is going to be large. As a final take-away, in my experience I’ve always wound up shoving data into a separate system (Pinot is my current favorite) for queries. — Ken > On Aug 29, 2022, at 3:19 PM, Lu Niu wrote: > > Hi, Flink Users > > We h

Best Practice for Querying Flink State

2022-08-29 Thread Lu Niu
Hi, Flink Users We have a use case that requires running ad hoc queries against Flink state. There are several options: 1. Dump flink state to external data systems, like Kafka, S3 etc., and query the data from there. This is a very straightforward approach, but adds system complexity and

RE: how to connect to the flink-state store and use it as cache to serve APIs.

2022-07-06 Thread Schwalbe Matthias
…6:29 AM To: Yuan Mei Cc: Hangxiang Yu; user Subject: Re: how to connect to the flink-state store and use it as cache to serve APIs. Hi Folks, I just wanted to double check if there is any way to expose REST APIs using Flink SQL tables

Re: how to connect to the flink-state store and use it as cache to serve APIs.

2022-07-05 Thread laxmi narayan
Hi Folks, I just wanted to double check if there is any way to expose REST APIs using Flink SQL tables? Thank you. On Thu, Jun 30, 2022 at 12:15 PM Yuan Mei wrote: > That's definitely something we want to achieve in the future term, and > your input is very valuable. > > One problem with

Re: how to connect to the flink-state store and use it as cache to serve APIs.

2022-06-29 Thread Yuan Mei
That's definitely something we want to achieve in the future, and your input is very valuable. One problem with the current queryable state setup is that the service is bound to the life cycle of the Flink job, which limits the usage of the state store/service. Thanks for your insights. Best

Re: how to connect to the flink-state store and use it as cache to serve APIs.

2022-06-29 Thread laxmi narayan
Hi Hangxiang, I was thinking: since we already store the entire state in the checkpoint dir, why can't we expose it as a service through Flink queryable state? In this way I can easily avoid introducing a cache and serve realtime APIs via this state itself, and I can go to the database for the h

Re: how to connect to the flink-state store and use it as cache to serve APIs.

2022-06-28 Thread Hangxiang Yu
Hi, laxmi. There are two ways that users can access the state store currently: 1. Queryable state [1] which you could access states in runtime. 2. State Processor API [2] which you could access states (snapshot) offline. But we have marked the Queryable state as "Reaching End-of-Life". We are also

how to connect to the flink-state store and use it as cache to serve APIs.

2022-06-28 Thread laxmi narayan
Hi Team, I am not sure if this is the right use case for the state-store but I wanted to serve the APIs using queryable-state, what are the different ways to achieve this ? I have come across a version where we can use Job_Id to connect to the state, but is there any other way to expose a specific

Re: Flink state migration from 1.9 to 1.13

2022-04-15 Thread XU Qinghui
Thanks Martijn. Now the issue is clearer to me. I'll try to see if the State Processor API could help to convert the 1.9 savepoints into a 1.13-compatible format. BR, On Thu, Apr 14, 2022 at 11:47, Martijn Visser wrote: > Hi Qinghui, > > If you're using SQL, please be aware that there are unfor

Re: Flink state migration from 1.9 to 1.13

2022-04-14 Thread Martijn Visser
Hi Qinghui, If you're using SQL, please be aware that there are unfortunately no application state compatibility guarantees. You can read more about this in the documentation [1]. This is why the community has been working on FLIP-190 to support this in future versions [2] Best regards, Martijn

Re: Flink state migration from 1.9 to 1.13

2022-04-14 Thread XU Qinghui
Hello Yu'an, Thanks for the reply. I'm using the SQL API, not the `DataStream` API, in the job. So there's no `keyBy` call directly in our code, but we do have some `group by` and joins in the SQL. (We are using the deprecated table planner both before and after migration.) Do you know what could

Re: Flink state migration from 1.9 to 1.13

2022-04-13 Thread yu'an huang
Hi Qinghui, Did you use a different keyBy() for your KeyedCoProcessOperator? For example, did you use a field name (keyBy("id")) in 1.9 while using a lambda (keyBy(e -> e.getId())) in 1.13? This will make the key serializer incompatible. You may reference this link for how to use Apache Flink

Flink state migration from 1.9 to 1.13

2022-04-13 Thread XU Qinghui
Hello dear community, We are trying to upgrade our flink from 1.9 to 1.13, but it seems the same job running in 1.13 cannot restore the checkpoint / savepoint created by 1.9. The stacktrace looks like: java.lang.Exception: Exception while creating StreamOperatorStateContext. at org.apache.fli

Re: Save app-global cache used by RichAsyncFunction to Flink State?

2022-02-15 Thread Chesnay Schepler
I'm not sure if this would work, but you could try implementing the CheckpointedFunction interface and getting access to state that way. On 14/02/2022 16:25, Clayton Wohl wrote: Is there any way to save a custom application-global cache into Flink state so that it is used with checkp
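The suggestion above — snapshotting a cache on checkpoint and rebuilding it on restore, as Flink's `CheckpointedFunction` hooks (`snapshotState`/`initializeState`) allow — can be sketched in plain Python. This is an illustration of the pattern only; the class and method names are hypothetical, not Flink's API:

```python
# Toy model: a function keeps an in-memory cache and persists it on checkpoint.
# Flink would invoke snapshot_state on a checkpoint barrier and
# initialize_state on job restore.

class CachingFunction:
    def __init__(self):
        self.cache = {}

    def snapshot_state(self):
        # copy the cache into "checkpointed state"
        return dict(self.cache)

    def initialize_state(self, restored):
        # on restore, rebuild the cache from the checkpoint
        # instead of re-querying the external database
        self.cache = dict(restored) if restored else {}

f = CachingFunction()
f.cache["k"] = "v"
snapshot = f.snapshot_state()

g = CachingFunction()            # a "restored" instance after failure
g.initialize_state(snapshot)
assert g.cache == {"k": "v"}
```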

Save app-global cache used by RichAsyncFunction to Flink State?

2022-02-14 Thread Clayton Wohl
Is there any way to save a custom application-global cache into Flink state so that it is used with checkpoints + savepoints? This cache is used by a RichAsyncFunction that queries an external database, and RichAsyncFunction doesn't support the Flink state functionality directly. I asked

How to examine Flink state in RocksDB with sst_dump

2021-07-13 Thread Eleanore Jin
Hi experts, I am running the flink application in local execution mode for testing. I have configured RocksDB as the state backend, and I would like to use RocksDB tools such as ldb or sst_dump to examine how exactly the state is stored. However, I encountered the below error; can you please advise me h

Re: Flink State Processor API Example - Java

2021-07-02 Thread Roman Khachatryan
Hi Sandeep, Could you provide the error stack trace and Flink version you are using? Regards, Roman On Fri, Jul 2, 2021 at 6:42 PM Sandeep khanzode wrote: > > Hi Guowei, > > I followed the document, but somehow, I am unable to get a working Java > example for Avro state. > > So, I tried to sim

Re: Flink State Processor API Example - Java

2021-07-02 Thread Sandeep khanzode
Hi Guowei, I followed the document, but somehow, I am unable to get a working Java example for Avro state. So, I tried to simply use the Java SpecificRecords created by Avro Maven Plugin and inject. Now, that works correctly, but I use Avro 1.7.7 since it is the last version that I saw which d

Re: Flink State Processor API Example - Java

2021-06-24 Thread Guowei Ma
Hi Sandeep What I understand is that you want to manipulate the state. So I think you could use the old schema to read the state first, and then write it to a new schema, instead of using a new schema to read an old schema format data. In addition, I would like to ask, if you want to do "State Sch
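The "read with the old schema, write with the new" advice above is the usual Avro evolution pattern: a newly added field gets its declared default when old records are read. A plain-Python sketch of the idea (field names hypothetical, dicts standing in for Avro records):

```python
# Toy migration: read records in the old shape, emit them in the new shape,
# filling the newly added field with a default (what Avro field defaults do).

def migrate(old_record):
    new_record = dict(old_record)
    new_record.setdefault("new_field", 0)   # default for the added field
    return new_record

old = {"id": 42, "name": "sensor-a"}
new = migrate(old)
assert new == {"id": 42, "name": "sensor-a", "new_field": 0}
assert old == {"id": 42, "name": "sensor-a"}   # old record untouched
```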

Flink State Processor API Example - Java

2021-06-24 Thread Sandeep khanzode
Hello, 1.] Can someone please share a working example of how to read ValueState and MapState from a checkpoint and update it? I tried to assemble a working Java example but there are bit and pieces of info around. 2.] I am using Avro 1.7.7 with Flink for state entities since versions belong A

Re: Flink State Query Server threads stuck in infinite loop with high GC activity on CopyOnWriteStateMap get

2021-03-30 Thread Till Rohrmann
…minute window basis. The state has been made queryable so that external services can query the state via the Flink State Query API. I am using the memory state backend with a keyed process function and map state. > I've a simple job running on a 6 node flink standalone clust

Flink State Query Server threads stuck in infinite loop with high GC activity on CopyOnWriteStateMap get

2021-03-29 Thread Aashutosh Swarnakar
Hi Folks, I've recently started using Flink for a pilot project where I need to aggregate event counts on per minute window basis. The state has been made queryable so that external services can query the state via Flink State Query API. I am using memory state backend with a keyed pr

Re: Error querying flink state

2021-01-20 Thread Till Rohrmann
Hi Falak, it is hard to tell what is going wrong w/o the debug logs. Could you check whether they contain anything specific? You can also share them with us. Cheers, Till On Wed, Jan 20, 2021 at 1:04 PM Falak Kansal wrote: > Hi, > > Thank you so much for the response. I am using the 1.12 versi

Re: Error querying flink state

2021-01-18 Thread Till Rohrmann
Hi Falak, Which version of Flink are you using? Providing us with the debug logs could also help understanding what's going wrong. I guess that you have copied the flink-queryable-state-runtime jar into the lib directory and set queryable-state.enable: true in the configuration, right? Here is th

Fwd: Error querying flink state

2021-01-14 Thread Falak Kansal
Hi, I have set up a flink cluster on my local machine. I created a flink job ( TrackMaximumTemperature) and made the state queryable. I am using *github/streamingwithflink/chapter7/QueryableState.scala* example from *https://github.com/streaming-with-flink

Re: Flink State Processor API - Bootstrap One state

2020-11-16 Thread Tzu-Li (Gordon) Tai
Hi, Using the State Processor API, modifying the state in an existing savepoint results in a new savepoint (new directory) with the new modified state. The original savepoint remains intact. The API allows you to only touch certain operators, without having to touch any other state and have them r
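The behavior described above — modifying a savepoint writes a new savepoint and leaves the original intact, touching only the operators you name — can be sketched as a copy-then-transform over immutable snapshots. Plain Python, not the actual State Processor API; operator and state names are hypothetical:

```python
# Toy model: "savepoints" are dicts mapping operator id -> state dict.
# A transform copies everything, rewrites one operator, and returns a new savepoint.

def transform_savepoint(savepoint, operator_id, fn):
    new_sp = {op: dict(state) for op, state in savepoint.items()}  # copy all operators
    new_sp[operator_id] = fn(new_sp[operator_id])                  # touch one operator only
    return new_sp

original = {"o1": {"s1": 1}, "o2": {"s2": 2}}
updated = transform_savepoint(original, "o1", lambda st: {"s1": st["s1"] + 100})

assert updated["o1"] == {"s1": 101}
assert updated["o2"] == {"s2": 2}        # untouched operators carried over
assert original["o1"] == {"s1": 1}       # original savepoint intact
```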

Flink State Processor API - Bootstrap One state

2020-11-16 Thread ApoorvK
Currently my flink application has a state size of 160GB (around 50 operators), where a few operators' state is much larger. I am planning to use the State Processor API to bootstrap, let's say, one particular state with operator id o1 containing a ValueState s1 as ID. I have planned the following steps to

Re: Flink state reconciliation

2020-07-30 Thread Seth Wiesman
…is written using Akka Streams and guarantees the "at least once" semantics, while the rule handling inside the Flink Job is implemented in an idempotent manner, but: we have to manage some cases when we need to execute a reconciliation between the current rules s

Re: Flink state reconciliation

2020-07-30 Thread Arvid Heise
…e Flink guarantees the "exactly once" delivery semantics, while a service, which provides the rules publishing mechanism to Kafka, is written using Akka Streams and guarantees the "at least once" semantics while the rule handling inside Flink Jo

Re: Flink state reconciliation

2020-07-29 Thread Александр Сергеенко
…rule handling inside the Flink Job implemented in an idempotent manner, but: we have to manage some cases when we need to execute a reconciliation between the current rules stored at the master DBMS and the existing Flink State. We've looked at Flink's tooling a

Re: Flink state reconciliation

2020-07-24 Thread Kostas Kloudas
…rules publishing mechanism to Kafka is written using Akka Streams and guarantees the "at least once" semantics while the rule handling inside the Flink Job is implemented in an idempotent manner, but: we have to manage some cases when we need to execute a reconciliation

Flink state reconciliation

2020-07-23 Thread Александр Сергеенко
…manner, but: we have to manage some cases when we need to execute a reconciliation between the current rules stored at the master DBMS and the existing Flink State. We've looked at Flink's tooling and found out that the State Processor API can possibly solve our problem, so we basicall

Re: How long Flink state default TTL,if I don't config the state ttl config?

2020-01-05 Thread LakeShen
Ok, got it, thank you. Zhu Zhu wrote on Mon, Jan 6, 2020 at 10:30 AM: > Yes. State TTL is by default disabled. > Thanks, > Zhu Zhu > LakeShen wrote on Mon, Jan 6, 2020 at 10:09 AM: >> I saw the flink source code, and I find the flink state TTL default is >> to never expire; is that right? >

Re: How long Flink state default TTL,if I don't config the state ttl config?

2020-01-05 Thread Zhu Zhu
Yes. State TTL is by default disabled. Thanks, Zhu Zhu LakeShen wrote on Mon, Jan 6, 2020 at 10:09 AM: > I saw the flink source code, and I find the flink state TTL default is > to never expire; is that right? > On Mon, Jan 6, 2020 at 9:58 AM, LakeShen wrote: >> Hi community, I have a question about flink
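The answer above — no TTL unless you configure one via `StateTtlConfig` — has simple semantics that can be sketched in plain Python (an illustration only, with timestamps passed in explicitly for determinism; not Flink's API):

```python
# Toy model of state TTL: a value expires ttl_ms after its last update.
# ttl_ms=None models Flink's default: no TTL configured, state retained forever.

class TtlValueState:
    def __init__(self, ttl_ms=None):
        self.ttl_ms = ttl_ms
        self._value = None
        self._updated_at = None

    def update(self, value, now_ms):
        self._value, self._updated_at = value, now_ms

    def value(self, now_ms):
        if self.ttl_ms is not None and self._updated_at is not None:
            if now_ms - self._updated_at >= self.ttl_ms:
                return None              # expired
        return self._value

no_ttl = TtlValueState()                 # default: never expires
no_ttl.update("x", now_ms=0)
assert no_ttl.value(now_ms=10**12) == "x"

with_ttl = TtlValueState(ttl_ms=1000)
with_ttl.update("x", now_ms=0)
assert with_ttl.value(now_ms=500) == "x"
assert with_ttl.value(now_ms=1500) is None
```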

Re: How long Flink state default TTL,if I don't config the state ttl config?

2020-01-05 Thread LakeShen
I saw the flink source code, and I find the flink state TTL default is to never expire; is that right? LakeShen wrote on Mon, Jan 6, 2020 at 9:58 AM: > Hi community, I have a question about flink state TTL. If I don't configure the > flink state ttl config, > how long is the flink state retained? Is it forever

How long Flink state default TTL,if I don't config the state ttl config?

2020-01-05 Thread LakeShen
Hi community, I have a question about flink state TTL. If I don't configure the flink state ttl config, how long is the flink state retained? Is it retained forever in HDFS? Thanks for your reply.

Re: How to estimate the memory size of flink state

2019-11-20 Thread 刘建刚
Thank you. Your suggestion is good and I benefit a lot from it. For my case, I want to know the state memory size for other reasons. When the GC pressure gets bigger, I need to limit the source or discard some data from the source to keep the job running. If the state size is bigger, I ne

How to estimate the memory size of flink state

2019-11-19 Thread 刘建刚
We are using flink 1.6.2. For the filesystem backend, we want to monitor the state size in memory. Once the state size becomes bigger, we can be notified and take measures such as rescaling the job, or the job may fail because of the memory. We have tried to get the memory usage for the jvm

Re: Flink State Migration Version 1.8.2

2019-10-18 Thread ApoorvK
I am not using any custom serialization, but the POJO is a composite type; the POJO I am trying to modify has variables which are other POJOs defined by me. Is there any example of a TypeSerializer for this kind? Please share. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabbl

Re: Flink State Migration Version 1.8.2

2019-10-17 Thread Paul Lam
Hi, Could you confirm that you're using the POJOSerializer before and after migration? Best, Paul Lam > On Oct 17, 2019, at 21:34, ApoorvK wrote: > > It is throwing the below error; > the class I am adding variables to has other variables that are objects of classes > which are also in state. > > Caused by: org.apach

Re: Flink State Migration Version 1.8.2

2019-10-17 Thread ApoorvK
It is throwing the error below; the class I am adding variables to has other variables that are objects of classes which are also in state. Caused by: org.apache.flink.util.StateMigrationException: The new state typeSerializer for operator state must not be incompatible. at org.apache.flink.runtime.st

Re: Flink State Migration Version 1.8.2

2019-10-16 Thread ApoorvK
Yes, I have tried giving it as an Option; the case class also has a default constructor (this()), but I am still unable to migrate. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink State Migration Version 1.8.2

2019-10-16 Thread miki haiat
Can you try to add the new variables as an Option? On Wed, Oct 16, 2019, 17:17 ApoorvK wrote: > I have been trying to alter the current state case class (scala) which has > 250 variables; now when I add 10 more variables to the class, and when I > run > my flink application from the save point ta

Flink State Migration Version 1.8.2

2019-10-16 Thread ApoorvK
I have been trying to alter the current state case class (Scala), which has 250 variables. Now when I add 10 more variables to the class and run my flink application from the savepoint taken before (some of the variables are objects which are also maintained as state), it fails to migrate the

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Mohammad Hosseinian
…wondering if there is any 'synchronous' way of accessing the Flink state. BR, Moe On 06/08/2019 13:25, Протченко Алексей wrote: Hi Mohammad, which types of applications do you mean? Streaming or batch ones? In terms of streaming ones, queues like Kafka or RabbitMQ between applications should

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Oytun Tez
…nodes), then the order of messages is not preserved and it might lead to incorrect results in your application. That was the reason why I was wondering if there is any 'synchronous' way of accessing the Flink state. > BR, Moe > On 06/08/2019 13:25, Протченко Алек

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Mohammad Hosseinian
…for bringing 'stated/current' objects to desired processing nodes), then the order of messages is not preserved and it might lead to incorrect results in your application. That was the reason why I was wondering if there is any 'synchronous' way of accessing the Flink state.

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Протченко Алексей
…object-ids from the previous Flink application. We do not propagate the actual state of the objects since not all types of the objects are relevant to all processes in the cluster, so we saved some network/storage overhead there. The question is: for such a scenario, w

Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Mohammad Hosseinian
…Flink application. We do not propagate the actual state of the objects since not all types of the objects are relevant to all processes in the cluster, so we saved some network/storage overhead there. The question is: for such a scenario, what is the best way to expose the Flink state to another Fli

Re: Flink state: complex value state pojos vs explicitly managed fields

2019-06-17 Thread Congxian Qiu
Hi, If you use RocksDBStateBackend, one state per member will get better performance, because RocksDBStateBackend needs to de/serialize the key/value on every put/get; with one POJO value, you need to de/serialize the whole POJO value on each put/get. Best, Congxian Timothy Victor wrote on Mon, Jun 17, 2019, 8:0
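The point above is about serialization amplification: with RocksDB, every access to a composite value round-trips the whole object, while one state per member only touches that member. A rough Python illustration, using pickle as a stand-in serializer (the field names are hypothetical):

```python
import pickle

# One big value state: updating one field still serializes the whole object.
whole = {"id": 1, "foos": list(range(1000))}
whole_bytes = pickle.dumps(whole)

# One state per member: touching `id` only serializes `id`.
id_bytes = pickle.dumps(whole["id"])

# Per-member access moves far fewer bytes per put/get.
assert len(id_bytes) < len(whole_bytes)
```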

Re: Flink state: complex value state pojos vs explicitly managed fields

2019-06-17 Thread Timothy Victor
I would choose encapsulation if the fields are indeed related and it makes sense for your model. In general, I feel it is not a good thing to let Flink's (or any other framework's) internal mechanics dictate your data model. Tim On Mon, Jun 17, 2019, 4:59 AM Frank Wilson wrote: > Hi, > > Is it be

Flink state: complex value state pojos vs explicitly managed fields

2019-06-17 Thread Frank Wilson
Hi, Is it better to have one POJO value state with a collection inside, or an explicit state declaration for each member? e.g. MyPojo { long id; List<Foo> foos; // getters / setters omitted } Or two managed state declarations in my process function (a value for the long and a list fo

Re: Null Flink State

2018-09-25 Thread Taher Koitawala
Hi Dawid, Thanks for the answer, how do I get the state of the Window then? I do understand that elements are going to the state as window in itself is a stateful operator. How do I get access to those elements? Regards, Taher Koitawala GS Lab Pune +91 8407979163 On Tue, Sep 25, 20

Re: Null Flink State

2018-09-25 Thread Dawid Wysakowicz
Hi Taher, As long as you don't put something into the state, ValueState#value() will return null. The point of having ctx.globalState(1) and ctx.windowState(2) is to allow users to store some of their own state, scoped to the key (1) and to the key & window (2) respectively. If you want to access all elements ass
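The scoping distinction above — `ctx.globalState()` keyed by the key alone, `ctx.windowState()` keyed by key and window — can be sketched as two maps with different composite keys. Plain Python, an illustration of the scoping only (the key and window values are hypothetical):

```python
# Toy model: global state is indexed by key only; window state by (key, window).

global_state = {}    # key -> value
window_state = {}    # (key, window) -> value

def put_global(key, value):
    global_state[key] = value

def put_window(key, window, value):
    window_state[(key, window)] = value

put_global("user-1", "seen")                 # survives across windows of the key
put_window("user-1", (0, 2000), 5)           # scoped to one window
put_window("user-1", (2000, 4000), 7)        # a later window has its own entry

assert window_state[("user-1", (0, 2000))] == 5
assert window_state[("user-1", (2000, 4000))] == 7
assert global_state["user-1"] == "seen"
```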

Null Flink State

2018-09-25 Thread Taher Koitawala
Hi All, I am trying to access elements stored in the state of the window. As the window itself is a stateful operator, I think I should be able to get records in the process function after the window is triggered. Can someone tell me why, in the following code, the state of the window is null? Below

Re: Flink State monitoring

2018-05-11 Thread Juho Autio
…stuff that we should have at some point. One thing that comes to mind is checking the size of checkpoints, which gives you an indirect way of figuring out how big the state is, but that's not very exact, i.e. doesn't

Re: Flink State monitoring

2018-04-24 Thread Juho Autio
…Aljoscha > On 20. Dec 2017, at 08:09, Netzer, Liron wrote: > Ufuk, Thanks for replying! > Aljoscha, can you please assist with the questions below? > Thanks,

Re: Flink State monitoring

2018-04-20 Thread Stefan Richter
…> Ufuk, Thanks for replying! > Aljoscha, can you please assist with the questions below? > Thanks, > Liron > -Original Message- > From: Ufuk Celebi [mailto:u...@apache.org]

Re: Flink State monitoring

2018-04-20 Thread Juho Autio
…> Best, > Aljoscha > On 20. Dec 2017, at 08:09, Netzer, Liron wrote: > Ufuk, Thanks for replying! > Aljoscha, can you please assist with the questions below? > Thanks, > Liron

Re: Flink State monitoring

2018-04-20 Thread Stefan Richter
…below? > Thanks, > Liron > -Original Message- > From: Ufuk Celebi [mailto:u...@apache.org] > Sent: Friday, December 15, 2017 3:06 PM > To: Netzer, Liron [ICG-IT] > Cc: user@flink.apach

Re: Flink State monitoring

2018-04-20 Thread Stefan Richter
…replying! > Aljoscha, can you please assist with the questions below? > Thanks, > Liron > -Original Message- > From: Ufuk Celebi [mailto:u...@apache.org] > Sent: Friday, December 1

Re: Flink State monitoring

2018-04-20 Thread Juho Autio
…> Aljoscha > On 20. Dec 2017, at 08:09, Netzer, Liron wrote: > Ufuk, Thanks for replying! > Aljoscha, can you please assist with the questions below? > Thanks, > Liron

Re: Flink State monitoring

2018-01-04 Thread Steven Wu
…the questions below? > Thanks, > Liron > -Original Message- > From: Ufuk Celebi [mailto:u...@apache.org] > Sent: Friday, December 15, 2017 3:06 PM > To: Netzer, Liron [ICG-IT] > Cc: user@flink.apache.org

Re: Flink State monitoring

2018-01-04 Thread Aljoscha Krettek
…below? > Thanks, > Liron > -Original Message- > From: Ufuk Celebi [mailto:u...@apache.org] > Sent: Friday, December 15, 2017 3:06 PM > To: Netzer, Liron [ICG-IT] > Cc: user@flink.apache.org > Subject: Re: Flink State monitoring > Hey Liron,

RE: Flink State monitoring

2017-12-19 Thread Netzer, Liron
Ufuk, Thanks for replying ! Aljoscha, can you please assist with the questions below? Thanks, Liron -Original Message- From: Ufuk Celebi [mailto:u...@apache.org] Sent: Friday, December 15, 2017 3:06 PM To: Netzer, Liron [ICG-IT] Cc: user@flink.apache.org Subject: Re: Flink State

Re: Flink State monitoring

2017-12-15 Thread Ufuk Celebi
Hey Liron, unfortunately, there are no built-in metrics related to state. In general, exposing the actual values as metrics is problematic, but exposing summary statistics would be a good idea. I'm not aware of a good workaround at the moment that would work in the general case (taking into accou

Flink State monitoring

2017-12-14 Thread Netzer, Liron
Hi group, We are using Flink keyed state in several operators. Is there an easy way to expose the data that is stored in the state, i.e. the keys and the values? This is needed for both monitoring and debugging. We would like to understand how many key+value pairs are stored in each state and a

Re: Reprocessing data in Flink / rebuilding Flink state

2016-10-26 Thread Konstantin Gregor
Hi Ufuk, thanks for this information, this is good news! Updating Flink to 1.1 is not really in our hands, but will hopefully happen soon :-) Thank you and best regards Konstantin On 26.10.2016 16:07, Ufuk Celebi wrote: > On Wed, Oct 26, 2016 at 3:06 PM, Konstantin Gregor > wrote: >> We are st

Re: Reprocessing data in Flink / rebuilding Flink state

2016-10-26 Thread Ufuk Celebi
On Wed, Oct 26, 2016 at 3:06 PM, Konstantin Gregor wrote: > We are still using 1.0.1 so this is an expected behavior, but I just > wondered whether there are any news concerning this topic. Yes, we will add an option to ignore this while restoring. This will be added to the upcoming 1.1.4 and 1.2

Re: Reprocessing data in Flink / rebuilding Flink state

2016-10-26 Thread Konstantin Gregor
…break when trying to restore the job with a different source? Would I need to assign the replay source a UID, and when switching from replay to live, remove the replay source and

Re: Reprocessing data in Flink / rebuilding Flink state

2016-08-02 Thread Gyula Fóra
…something quick running the hacky way; then, if we decide to make this a permanent solution, maybe I can work on the proper solution. I was wondering about your suggestion for "warming up" the state

Re: Reprocessing data in Flink / rebuilding Flink state

2016-08-02 Thread Stephan Ewen
…taking a savepoint and switching sources - since the Kafka sources are stateful and are part of Flink's internal state, wouldn't this break when trying to restore the job with a different source? Would I need to assign

Re: Reprocessing data in Flink / rebuilding Flink state

2016-08-02 Thread Ufuk Celebi
…this break when trying to restore the job with a different source? Would I need to assign the replay source a UID, and when switching from replay to live, remove the replay source and replace it with a dummy operator with the same UID?

Re: Reprocessing data in Flink / rebuilding Flink state

2016-08-01 Thread Aljoscha Krettek
…trying to restore the job with a different source? Would I need to assign the replay source a UID, and when switching from replay to live, remove the replay source and replace it with a dummy operator with the same UID? @Jason - I see what

Re: Reprocessing data in Flink / rebuilding Flink state

2016-08-01 Thread Josh
…source? Would I need to assign the replay source a UID, and when switching from replay to live, remove the replay source and replace it with a dummy operator with the same UID? @Jason - I see what you mean now, with the historical and live Flink jobs. That

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-29 Thread Aljoscha Krettek
…an interesting approach - I guess it's solving a slightly different problem to my 'rebuilding Flink state upon starting a job', as you're rebuilding state as part of the main job when it comes across events that require historical data. Actually I think we'

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-29 Thread Josh
…to live, remove the replay source and replace it with a dummy operator with the same UID? @Jason - I see what you mean now, with the historical and live Flink jobs. That's an interesting approach - I guess it's solving a slightly different problem to my 'rebuilding Flink state upon st

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-29 Thread Jason Brelloch
…g events for that key and just process them as normal. Keep in mind that this is the dangerous part that I was talking about, where memory in the main job would continue to build until the "historical" events are all processed. > In my case I would want the Flink state to always contain

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-29 Thread Jagat Singh
…somehow block the live stream source and detect when the historical data source is no longer emitting new elements? > So in your case it looks like what you could do is send a request to the "historical" job whenever you get an item that you don't

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-29 Thread Aljoscha Krettek
…data source is no longer emitting new elements? > So in your case it looks like what you could do is send a request to the "historical" job whenever you get an item that you don't yet have the current state of. In my case I would want the Flink state to al

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-29 Thread Josh
…emitting new elements? > So in your case it looks like what you could do is send a request to the "historical" job whenever you get an item that you don't yet have the current state of. In my case I would want the Flink state to always contain the latest state of every item (except

Re: Reprocessing data in Flink / rebuilding Flink state

2016-07-28 Thread Jason Brelloch
Hey Josh, The way we replay historical data is we have a second Flink job that listens to the same live stream, and stores every single event in Google Cloud Storage. When the main Flink job that is processing the live stream gets a request for a specific data set that it has not been processing

Reprocessing data in Flink / rebuilding Flink state

2016-07-28 Thread Josh
Hi all, I was wondering what approaches people usually take with reprocessing data with Flink - specifically the case where you want to upgrade a Flink job, and make it reprocess historical data before continuing to process a live stream. I'm wondering if we can do something similar to the 'simpl