Hi all,
To minimize recovery time after a failure, we employ incremental, retained
checkpoints with `state.checkpoints.num-retained` set to 10 in our Flink apps.
With this setting, Flink automatically creates new checkpoints regularly and
keeps only the latest 10 checkpoints. Besides, for app upgra
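For illustration, a minimal sketch of how such retained, incremental checkpoints could be configured programmatically (in practice these keys usually live in flink-conf.yaml; the state backend, checkpoint directory, and interval below are assumptions, not the poster's actual values):

```
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Sketch only: retain the latest 10 completed checkpoints and use incremental
// RocksDB checkpoints. All values are illustrative.
Configuration conf = new Configuration();
conf.setString("state.backend", "rocksdb");
conf.setString("state.backend.incremental", "true");
conf.setString("state.checkpoints.num-retained", "10");
conf.setString("state.checkpoints.dir", "hdfs:///flink/checkpoints"); // assumed path

StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment(conf);
env.enableCheckpointing(60_000); // e.g. one checkpoint per minute
```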
>
> Again, thank you for your input.
You are welcome.
I want the stream element to define the window.
Got it, that was the missing bit of detail. That is also doable - not with
the Window API, but with the lower-level ProcessFunction.
Check out my blog post [1], especially its third part [
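For readers without access to the post, here is a minimal sketch of the ProcessFunction approach (not taken from the blog; the Event/Result types, field names, and the sum aggregation are assumptions): each element carries its own window end, state is accumulated per window end, and an event-time timer fires when that window closes.

```
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Event(key, value, windowEnd) and Result(key, windowEnd, sum) are assumed POJOs.
public class ElementDefinedWindowFunction
        extends KeyedProcessFunction<String, Event, Result> {

    private transient MapState<Long, Double> sumPerWindowEnd;

    @Override
    public void open(Configuration parameters) {
        sumPerWindowEnd = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("sums", Long.class, Double.class));
    }

    @Override
    public void processElement(Event e, Context ctx, Collector<Result> out) throws Exception {
        long windowEnd = e.windowEnd;                 // the element defines its window
        Double sum = sumPerWindowEnd.get(windowEnd);
        sumPerWindowEnd.put(windowEnd, (sum == null ? 0.0 : sum) + e.value);
        ctx.timerService().registerEventTimeTimer(windowEnd);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Result> out) throws Exception {
        Double sum = sumPerWindowEnd.get(timestamp);
        if (sum != null) {
            out.collect(new Result(ctx.getCurrentKey(), timestamp, sum));
            sumPerWindowEnd.remove(timestamp);        // drop state for the finished window
        }
    }
}
```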
Hi Alexander,
Thank you for responding. The solution you proposed uses statically defined
windows. What I need are dynamically created windows determined by metadata
in the stream element.
I want the stream element to define the window.
That’s what I’m trying to research, or an alternate sol
Hi Marco,
Not sure if I get your problem correctly, but you can process those windows
on data "split" from the same input within the same Flink job.
Something along these lines:
DataStream stream = ...
DataStream a = stream.filter( /* time series name == "a" */);
a.keyBy(...).window(TumblingEvent
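A hedged continuation of that sketch (the Event POJO with name/value/timestamp fields, the 5-minute window, and the sum aggregation are assumptions, not from the thread):

```
DataStream<Event> a = stream.filter(e -> "a".equals(e.name));
DataStream<Event> b = stream.filter(e -> "b".equals(e.name));

DataStream<Event> aSums = a
        .keyBy(e -> e.name)
        .window(TumblingEventTimeWindows.of(Time.minutes(5)))
        .reduce((x, y) -> new Event(x.name, x.value + y.value,
                                    Math.max(x.timestamp, y.timestamp)));
// apply the same pattern to 'b'; the per-series results can then be unioned or joined
```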
Hi,
I need to calculate elapsed times between steps of a transaction.
Each step is an event. All steps belonging to a single transaction have the
same transaction id. Every event has a handling time.
All information is part of a large JSON structure.
But I can have the incoming source supply trans
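One possible shape for this, once the transaction id and handling time have been extracted from the JSON, is a keyed process function that remembers the previous step's handling time (the Step/Elapsed POJOs are hypothetical, and this sketch assumes steps arrive in order per transaction):

```
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Step(transactionId, handlingTime) and Elapsed(transactionId, millis) are assumed POJOs.
public class ElapsedBetweenSteps
        extends KeyedProcessFunction<String, Step, Elapsed> {

    private transient ValueState<Long> lastHandlingTime;

    @Override
    public void open(Configuration parameters) {
        lastHandlingTime = getRuntimeContext().getState(
                new ValueStateDescriptor<>("lastHandlingTime", Long.class));
    }

    @Override
    public void processElement(Step step, Context ctx, Collector<Elapsed> out) throws Exception {
        Long previous = lastHandlingTime.value();
        if (previous != null) {
            // elapsed time since the previous step of the same transaction
            out.collect(new Elapsed(step.transactionId, step.handlingTime - previous));
        }
        lastHandlingTime.update(step.handlingTime);
    }
}

// usage: steps.keyBy(s -> s.transactionId).process(new ElapsedBetweenSteps());
```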
Hi,
I am working with time series data in the form of (timestamp, name, value).
The event time is the timestamp at which the data was published onto Kafka.
I have a business requirement in which each stream element
becomes enriched, and then processing requires different time series names
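For reference, a sketch of assigning event time from such elements (the Tuple3 layout and the 10-second out-of-orderness bound are assumptions):

```
WatermarkStrategy<Tuple3<Long, String, Double>> strategy =
        WatermarkStrategy
                .<Tuple3<Long, String, Double>>forBoundedOutOfOrderness(Duration.ofSeconds(10))
                .withTimestampAssigner((element, recordTimestamp) -> element.f0);

DataStream<Tuple3<Long, String, Double>> withEventTime =
        stream.assignTimestampsAndWatermarks(strategy);
```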
Hi Yun and Oran,
Thanks for your time. Much appreciated!
Below are my configs:
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setParallelism(1)
env.enableCheckpointing(2000)
//env.setDefaultSavepointDirectory("file:home/siddhesh/Desktop/savepoints/")
env.getCheck
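For context, configuration usually continues through the checkpoint config object; a rough Java sketch is below (the values and the storage path are purely illustrative, not the poster's actual settings):

```
CheckpointConfig cc = env.getCheckpointConfig();
cc.setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); // processing guarantee
cc.setMinPauseBetweenCheckpoints(500);                    // ms between checkpoints
cc.setCheckpointTimeout(60_000);                          // fail a checkpoint after 60 s
cc.setCheckpointStorage("file:///tmp/checkpoints/");      // assumed path
```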
Hi Jasmin,
From my knowledge, it seems no big company would adopt a pure file system source
as the main data source of Flink. We would in general choose a message queue,
e.g. Kafka, as the data source.
Best
Yun Tang
From: Jasmin Redžepović
Sent: Wednesday, Janu
Also, what would you recommend? I have both options available:
* Kafka - protobuf messages
* S3 - messages copied from Kafka for persistence with the Kafka
Connect service
On 26.01.2022, at 14:43, Jasmin Redžepović wrote:
Hello Flink commit
DataStream API
--
Sender: Guowei Ma
Sent At: 2022 Jan. 26 (Wed.) 21:51
Recipient: Shawn Du
Cc: user
Subject: Re: create savepoint on bounded source in streaming mode
Hi, Shawn
Thank you for your sharing. Unfortunately I do not
Hi, Shawn
Thank you for sharing. Unfortunately, I do not think there is an easy
way to achieve this now.
Actually, we have a customer with the same requirement, but the scenario
is a little different. The bounded and unbounded pipelines have some
differences, but the customer wants to reuse some s
Hello Flink committers :)
Just one short question:
How is the performance of reading from a Kafka source compared to reading from
a FileSystem source? I would be very grateful if you could provide a short
explanation.
I saw in the documentation that both provide exactly-once semantics for streaming,
but t
right!
--
Sender: Guowei Ma
Sent At: 2022 Jan. 26 (Wed.) 19:50
Recipient: Shawn Du
Cc: user
Subject: Re: create savepoint on bounded source in streaming mode
Hi, Shawn
You want to use the correct state(n-1) for day n-1 and the fu
Hi, Shawn
You want to use the correct state(n-1) for day n-1 and the full amount of
data for day n to produce the correct state(n) for day n.
Then use state(n) to initialize a job to process the data for day n+1.
Am I understanding this correctly?
Best,
Guowei
Shawn Du wrote on Wednesday, January 26, 2022 at 7:15 PM:
Hi everyone,
I'm trying to migrate from the old set of CatalogTable-related APIs
(CatalogTableImpl, TableSchema, DescriptorProperties) to the new ones
(CatalogBaseTable, Schema and ResolvedSchema, CatalogPropertiesUtil), in a
custom catalog.
The catalog stores table definitions, and the current l
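For anyone following this migration, a rough sketch of defining a table with the newer APIs is below (the column names, types, and options are made up for illustration):

```
Schema schema = Schema.newBuilder()
        .column("id", DataTypes.BIGINT())
        .column("name", DataTypes.STRING())
        .build();

CatalogTable table = CatalogTable.of(
        schema,
        "example table",                          // comment
        Collections.emptyList(),                  // partition keys
        Collections.singletonMap("connector", "datagen"));

// Once a ResolvedCatalogTable is available inside the catalog, the flat property
// map can (as far as I know) be produced and consumed with CatalogPropertiesUtil:
// Map<String, String> props = CatalogPropertiesUtil.serializeCatalogTable(resolvedTable);
// CatalogTable restored = CatalogPropertiesUtil.deserializeCatalogTable(props);
```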
Hi Guowei,
Consider this case:
We have one streaming application built with Flink, but for various reasons
the events may be badly out of order or delayed.
We want to replay the data day by day (so the data gets processed as if it were
reordered). It looks like a batch job but with
We will need more of the log contents to help you (preferably the whole
thing).
On 25/01/2022 23:55, John Smith wrote:
I'm using: final StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
But no go.
On Mon, 24 Jan 2022 at 16:35, John Smith wrote:
Hi u
Hi,
I want to sink the model data (the coefficients from the logistic regression model
in my case) from the flink.ml.api.Model to print or to a file. I figured out how to
sink it in batch training mode, but I face the following exception when the
Estimator takes an UNBOUNDED datastream.
```
Caused by:
Hi Siddhesh,
The root cause is that the group.id configuration is missing for the Flink
program. The restart strategy configuration has no relationship with this.
I think you should pay attention to the Kafka-related configurations.
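For illustration, setting group.id with the newer KafkaSource builder looks like the sketch below (broker address, topic, and group name are placeholders); with the legacy FlinkKafkaConsumer, group.id would instead go into the Properties passed to its constructor.

```
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("input-topic")
        .setGroupId("my-flink-consumer-group")            // the missing group.id
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();

DataStream<String> stream =
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");
```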
Best
Yun Tang
From: S
Cool! HybridSource seems very close to my requirements.
Thanks Dawid, I will give it a try.
Shawn
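For reference, a rough sketch of a HybridSource that reads bounded history from files and then switches to Kafka (the paths, topic, and deserialization are placeholders, and the exact file-format class name may differ between Flink versions):

```
FileSource<String> history = FileSource
        .forRecordStreamFormat(new TextLineInputFormat(), new Path("s3://bucket/history/"))
        .build();

KafkaSource<String> live = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("events")
        .setGroupId("replay")
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();

HybridSource<String> source = HybridSource.builder(history)
        .addSource(live)
        .build();

DataStream<String> stream =
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "history-then-kafka");
```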
--
Sender: Dawid Wysakowicz
Sent At: 2022 Jan. 26 (Wed.) 15:49
Recipient: user
Subject: Re: create savepoint on bound