Re:Re: Flink caching mechanism

2024-01-11 Thread Xuyang
before. You can see more here[1]. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/ -- Best! Xuyang 在 2024-01-12 12:37:43,"Junrui Lee" 写道: Hi Вова In Flink, there is no built-in mechanism for caching SQL query results; ev

Re: Flink caching mechanism

2024-01-11 Thread Junrui Lee
Hi Вова In Flink, there is no built-in mechanism for caching SQL query results; every query execution is independent, and results are not stored for future queries. The StateBackend's role is to maintain operational states within jobs, such as aggregations or windowing, which is critica

Flink caching mechanism

2024-01-11 Thread Вова Фролов
Hi Everyone, I'm currently looking to understand the caching mechanism in Apache Flink in general. As part of this exploration, I have a few questions related to how Flink handles data caching, both in the context of SQL queries and more broadly. When I send a SQL query for examp

Re: [Table API] [JDBC CDC] Caching Configuration from MySql instance needed in multiple Flink Jobs

2022-02-21 Thread Leonard Xu
Hello, Dan > 2022年2月21日 下午9:11,Dan Serb 写道: > 1.Have a processor that uses Flink JDBC CDC Connector over the table that > stores the information I need. (This is implemented currently - working) You mean you’ve implemented a Flink JDBC Connector? Maybe the Flink CDC Connectors[1] would help yo

[Table API] [JDBC CDC] Caching Configuration from MySql instance needed in multiple Flink Jobs

2022-02-21 Thread Dan Serb
Hello all, I kind of need the community’s help with some ideas, as I’m quite new with Flink and I feel like I need a little bit of guidance in regard to an implementation I’m working on. What I need to do, is to have a way to store a mysql table in Flink, and expose that data to other jobs, as

Re: Caching

2020-11-27 Thread Dongwon Kim
Hi Navneeth, I didn't quite understand how async io can be used here. It would be great > if you can share some info on it. You need to add an async operator in the middle of your pipeline in order to enrich your input data. [1] and [2] will help you. Also how are you propagating any changes to

Re: Caching

2020-11-26 Thread Navneeth Krishnan
Thanks Dongwon. It was extremely helpful. I didn't quite understand how async io can be used here. It would be great if you can share some info on it. Also how are you propagating any changes to values? Regards, Navneeth On Thu, Nov 26, 2020 at 6:26 AM Dongwon Kim wrote: > Oops, I forgot to me

Re: Caching

2020-11-26 Thread Dongwon Kim
Oops, I forgot to mention that when doing bulk insert into Redis, you'd better open a pipeline with a 'transaction' property set to False [1]. Otherwise, API calls from your Flink job will be timeout. [1] https://github.com/andymccurdy/redis-py#pipelines On Thu, Nov 26, 2020 at 11:09 PM Dongwon

Re: Caching

2020-11-26 Thread Dongwon Kim
Hi Navneeth, I reported a similar issue to yours before [1] but I took the broadcasting approach at first. As you already anticipated, broadcasting is going to use more memory than your current approach based on a static object on each TM . And the broadcasted data will be treated as operator st

Re: Caching

2020-11-26 Thread Prasanna kumar
Navneeth, Thanks for posting this question. This looks like our future scenario where we might end up with. We are working on a Similar problem statement with two differences. 1) The cache items would not change frequently say max of once per month or few times per year and the number of entiti

Caching

2020-11-26 Thread Navneeth Krishnan
Hi All, We have a flink streaming job processing around 200k events per second. The job requires a lot of less frequently changing data (sort of static but there will be some changes over time, say 5% change once per day or so). There are about 12 caches with some containing approximately 20k entr

Re: Caching Mechanism in Flink

2020-11-19 Thread Andrey Zagrebin
nk/flink-docs-release-1.11/ops/memory/mem_setup_tm.html#detailed-memory-model >> >> On Tue, Nov 10, 2020 at 3:57 AM Jack Kolokasis >> wrote: >> >>> Thank you Xuannan for the reply. >>> >>> Also I want to ask about how Flink uses the off-heap memor

Re: Caching Mechanism in Flink

2020-11-11 Thread Jack Kolokasis
handle by the programmer? Best, Iacovos On 10/11/20 4:42 π.μ., Xuannan Su wrote: Hi Jack, At the moment, Flink doesn't support caching the intermediate result. However, there is some ongoing effort to support caching in Flink.

Re: Caching Mechanism in Flink

2020-11-11 Thread Matthias Pohl
off-heap memory. If I set >> taskmanager.memory.task.off-heap.size then which data does Flink allocate >> off-heap? This is handle by the programmer? >> >> Best, >> Iacovos >> On 10/11/20 4:42 π.μ., Xuannan Su wrote: >> >> Hi Jack, >> >> At the

Re: Caching Mechanism in Flink

2020-11-11 Thread Jack Kolokasis
mory.task.off-heap.size then which data does Flink allocate off-heap? This is handle by the programmer? Best, Iacovos On 10/11/20 4:42 π.μ., Xuannan Su wrote: Hi Jack, At the moment, Flink doesn't support caching the intermediate result. However, there is some o

Re: Caching Mechanism in Flink

2020-11-11 Thread Matthias Pohl
ate > off-heap? This is handle by the programmer? > > Best, > Iacovos > On 10/11/20 4:42 π.μ., Xuannan Su wrote: > > Hi Jack, > > At the moment, Flink doesn't support caching the intermediate result. > However, there is some ongoing effort to support caching in Flink.

Re: Caching Mechanism in Flink

2020-11-09 Thread Jack Kolokasis
the moment, Flink doesn't support caching the intermediate result. However, there is some ongoing effort to support caching in Flink. FLIP-36[1] propose to add the caching mechanism at the Table API. And it is planned for 1.13. Best, Xuannan On Nov 10, 2020, 4:29 AM +0800, Jack Kolo

Re: Caching Mechanism in Flink

2020-11-09 Thread Xuannan Su
Hi Jack, At the moment, Flink doesn't support caching the intermediate result. However, there is some ongoing effort to support caching in Flink. FLIP-36[1] propose to add the caching mechanism at the Table API. And it is planned for 1.13. Best, Xuannan On Nov 10, 2020, 4:29 AM +0800,

Caching Mechanism in Flink

2020-11-09 Thread Jack Kolokasis
Hello all, I am new to Flink and I want to ask if the Flink supports a caching mechanism to store intermediate results in memory for machine learning workloads. If yes, how can I enable it and how can I use it? Thank you, Iacovos

Re: How to add caching to async function?

2019-02-05 Thread William Saar
Ah, thanks, missed it when I only looked at the slides. Yes, have heard deadlocks are a problem with iterations but hoped it had been fixed. Pity, had been hoping to replace an external service with the Flink job, but will keep the service around for the caching, - Original Message - From

Re: How to add caching to async function?

2019-02-05 Thread Lasse Nedergaard
.22 skrev William Saar : > > Thanks! Looks like iterations is indeed the way to go for now then... > > > > - Original Message - > From: "Lasse Nedergaard" > To:"Fabian Hueske" > Cc:"William Saar" , "user" > Sent

Re: How to add caching to async function?

2019-02-05 Thread William Saar
Thanks! Looks like iterations is indeed the way to go for now then... - Original Message - From: "Lasse Nedergaard" To:"Fabian Hueske" Cc:"William Saar" , "user" Sent:Mon, 4 Feb 2019 20:20:30 +0100 Subject:Re: How to add caching to async f

Re: How to add caching to async function?

2019-02-04 Thread Lasse Nedergaard
Hi William We have created a solution that do it. Please take a look at my presentation from Flink forward. https://www.slideshare.net/mobile/FlinkForward/flink-forward-berlin-2018-lasse-nedergaard-our-successful-journey-with-flink Hopefully you can get inspired. Med venlig hilsen / Best regar

Re: How to add caching to async function?

2019-02-04 Thread Fabian Hueske
Hi William, Does the cache need to be fault tolerant? If not you could use a regular in-memory map as cache (+some LRU cleaning). Or do you expect the cache to group too large for the memory? Best, Fabian Am Mo., 4. Feb. 2019 um 18:00 Uhr schrieb William Saar : > Hi, > I am trying to implement

How to add caching to async function?

2019-02-04 Thread William Saar
Hi, I am trying to implement an async function that looks up a value in a cache or, if the value doesn't exist in the cache, queries a web service, but I'm having trouble creating the cache. I've tried to create a RichAsyncFunction and add a map state as cache, but I'm getting: State is not suppor

Re: Regarding caching the evicted elements and re-emitting them to the next window

2017-01-16 Thread Shaoxuan Wang
Hi Abdul, You may want to check out FLIP13 "side output" https://goo.gl/6KSYd0 . Once we have this feature, you should be able to collect the data to the external distributed storage, and use these data later on demand. BTW, can you explain your use case in more details, such that people here may h

Re: Regarding caching the evicted elements and re-emitting them to the next window

2017-01-13 Thread Aljoscha Krettek
Hi, I'm afraid there is no functionality for this in Flink. What you can do, however, is to not evict these elements from the window buffer but instead ignore them when processing your elements in the WindowFunction. This way they will be preserved for the next firing. You have to make sure to even

Re: Caching collected objects in .apply()

2017-01-09 Thread Aljoscha Krettek
Hi, I think your approach with two window() operations is fine. There is no way to retrieve the result from a previous window because it is not strictly defined what the previous window is. Also, keeping data inside your user functions (in fields) is problematic because these function instances are

Regarding caching the evicted elements and re-emitting them to the next window

2017-01-08 Thread Abdul Salam Shaikh
Hi, I am using 1.2-Snapshot version of Apache Flink which provides the new enhanced Evictor functionality and using customized triggers for Global Window. I have a use case where I am evicting the unwanted event(element) for the current window before it is evaluated. However, I am looking for opti

Re: Caching collected objects in .apply()

2017-01-05 Thread Fabian Hueske
Hi Matt, I think your approach should be fine. Although the second keyBy is logically a shuffle, the data will not be sent of the wire to a different machine if the parallelism of the first and second window operator are identical. It only cost one serialization / deserialization step. I would be

Re: Caching collected objects in .apply()

2017-01-05 Thread Matt
I'm still looking for an answer to this question. Hope you can give me some insight! On Thu, Dec 22, 2016 at 6:17 PM, Matt wrote: > Just to be clear, the stream is of String elements. The first part of the > pipeline (up to the first .apply) receives those strings, and returns > objects of anoth

Re: Caching collected objects in .apply()

2016-12-22 Thread Matt
Just to be clear, the stream is of String elements. The first part of the pipeline (up to the first .apply) receives those strings, and returns objects of another class ("A" let's say). On Thu, Dec 22, 2016 at 6:04 PM, Matt wrote: > Hello, > > I have a window processing 10 objects at a time, and

Caching collected objects in .apply()

2016-12-22 Thread Matt
Hello, I have a window processing 10 objects at a time, and creating 1 as a result. The problem is in order to create that object I need the object from the previous window. I'm doing this: stream .keyBy(...some key...) .countWindow(10, 1) .apply(...creates an element A...) .keyBy(...sam

Re: Intermediate Data Caching

2016-07-19 Thread Saliya Ekanayake
look into the code for this. Most of the relevant implementations > > are found in the "org.apache.flink.runtime.iterative.task" package. > > > > Hope this helps... > > > > Ufuk > > > > > > On Sun, Jul 17, 2016 at 9:36 PM, Saliya Ekanayake > wrote: > >> Hi, >

Re: Intermediate Data Caching

2016-07-19 Thread Ufuk Celebi
> Ufuk > > > On Sun, Jul 17, 2016 at 9:36 PM, Saliya Ekanayake wrote: >> Hi, >> >> I am trying to understand what's the intermediate caching support in Flink. >> For example, when there's an iterative dataset what's being cached between &

Re: Intermediate Data Caching

2016-07-18 Thread Ufuk Celebi
s are found in the "org.apache.flink.runtime.iterative.task" package. Hope this helps... Ufuk On Sun, Jul 17, 2016 at 9:36 PM, Saliya Ekanayake wrote: > Hi, > > I am trying to understand what's the intermediate caching support in Flink. > For example, when there

Intermediate Data Caching

2016-07-17 Thread Saliya Ekanayake
Hi, I am trying to understand what's the intermediate caching support in Flink. For example, when there's an iterative dataset what's being cached between iterations. Is there some documentation on this? Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assi