website is still in 2023, or can I only compile from the
> source code?
>
>
> On 27 Nov 2024, at 17:56, William Wallace wrote:
>
>
hi,
It seems similar to the issue described here:
https://lists.apache.org/thread/g8yb4rlj0mlf1vgjl71815nts8r1w51p
where we were not able to restore state because of the high number of S3
reads (in your case it might encounter the connection limitation
first).
Have a look at https://issues.apache.
r to us.
>>
> Copy from the PR:
> Flink state restore from S3 is super slow because the skip function
> consumes ~15 seconds for ~6 MB of data.
> ...
> In this PR, skip is only going to be called in the compression case,
> because otherwise the stream is seekable.
>
> G
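For context, a minimal sketch of the behaviour the PR quote describes — not the actual patch; the method and its caller are hypothetical. An uncompressed checkpoint stream is seekable, so restore can jump straight to an offset, while a compressed (non-seekable) stream has to be skipped through byte by byte:

    import org.apache.flink.core.fs.FSDataInputStream

    // Hypothetical sketch of the PR's idea: seek when possible, skip only
    // when compression makes the stream non-seekable.
    def positionStream(in: FSDataInputStream, offset: Long, compressed: Boolean): Unit = {
      if (compressed) {
        // skip() has to decompress and discard every intervening byte: O(offset)
        var remaining = offset
        while (remaining > 0) {
          val skipped = in.skip(remaining)
          if (skipped <= 0) return // defensive: avoid spinning at EOF
          remaining -= skipped
        }
      } else {
        // seek() repositions without reading the intervening bytes
        in.seek(offset)
      }
    }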
to cherry-pick this PR [1] on top of your Flink
> distro when possible.
> Additionally turn off state compression. These should do the trick...
>
> [1] https://github.com/apache/flink/pull/25509
>
> G
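A sketch of the second suggestion, assuming compression was switched on through ExecutionConfig (snapshot compression is off by default in Flink):

    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Turn off snapshot compression so restore can seek instead of skip
    env.getConfig.setUseSnapshotCompression(false)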
>
>
> On Tue, Oct 15, 2024 at 1:03 PM William Wallace <
>
't say how that will actually work for us.
Thank you again for looking into this. I'm looking forward to your
thoughts. Please let me know if I missed or misunderstood something, and
please let us know your recommendation.
On Tue, Oct 15, 2024 at 8:35 AM Gabor Somogyi
wrote:
> Hi William
Context
We have recently upgraded from Flink 1.13.6 to Flink 1.19. We consume data
from ~40k Kafka topic partitions in some environments. We are using
aligned checkpoints. We set state.storage.fs.memory-threshold: 500kb.
Problem
At the point when the state for the operator using
topic-partition-off
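As an aside, a hedged sketch of the configuration described above; the key name matches Flink 1.19, though it normally lives in flink-conf.yaml rather than in code, and building the environment from a Configuration here is purely illustrative:

    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment

    val conf = new Configuration()
    // Checkpoint state below this threshold is inlined into the metadata file
    // instead of being written as a separate file on S3
    conf.setString("state.storage.fs.memory-threshold", "500kb")
    val env = StreamExecutionEnvironment.getExecutionEnvironment(conf)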
Hello
on 2020/1/8 11:31, RKandoji wrote:
I'm running my job on an EC2 instance with 32 cores, and according to the
documentation I tried to use as many task slots as the number of cores:
numOfTaskSlots=32 and parallelism=32. But I noticed that the performance
degrades slightly when I'm using 32
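A side note on this tuning experiment, as a sketch only (the numbers are illustrative, not a recommendation): the slot count caps how many parallel subtasks the TaskManager accepts, while job parallelism can be varied independently to find the sweet spot.

    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // taskmanager.numberOfTaskSlots stays at 32 in flink-conf.yaml; try a few
    // parallelism values below the core count and compare throughput
    env.setParallelism(24)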
Can you enable debug logging to check that?
Regards.
on 2020/1/8 6:36, RKandoji wrote:
But I'm curious whether there is a way to verify if the checkpoints are
happening asynchronously or synchronously.
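Besides reading the debug logs, one way to sidestep the guessing, sketched against the FsStateBackend constructor of that Flink 1.x era (the path is a placeholder): the second argument explicitly requests asynchronous snapshots.

    import org.apache.flink.runtime.state.filesystem.FsStateBackend
    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // true => snapshots are taken asynchronously, so processing is not blocked
    env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints", true))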
Hello
Is there a Flink template for deployment on Bitnami?
Thanks
would be interesting if anyone has the same experience as I have.
The pipeline is currently running on Flink 1.7.2
Best regards and wish you a pleasant day,
William
From: Yun Tang
Date: Friday, 30 August 2019 at 11:42
To: William Jonsson, "user@flink.apache.org"
Cc: Fleet Perc
String input and a “histogram” output class. Do you have any input or ideas on
how the state could be manageable in the Heap case but totally unmanageable
with the RocksDB version?
Best regards,
William
class Histogram extends WindowFunction[String, Histogram, TimeWindow] {
def process (key : T
Hi,
Is there a way to specify an exponential backoff strategy for when
async function calls fail?
I have an async function that does web requests to a rate-limited API.
Can you handle that with settings on the async function call?
Thanks,
William
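As far as I know there is no built-in exponential-backoff setting on the async function call in that version, so a hand-rolled sketch (callApi is a stand-in for the real rate-limited request; the retry count and delays are arbitrary assumptions):

    import java.util.concurrent.{Executors, TimeUnit}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.Future
    import scala.util.{Failure, Success}
    import org.apache.flink.streaming.api.scala.async.{AsyncFunction, ResultFuture}

    class BackoffAsyncFunction extends AsyncFunction[String, String] {
      @transient private lazy val scheduler = Executors.newScheduledThreadPool(1)

      override def asyncInvoke(input: String, resultFuture: ResultFuture[String]): Unit =
        attempt(input, resultFuture, retriesLeft = 5, delayMs = 100L)

      private def attempt(input: String, rf: ResultFuture[String],
                          retriesLeft: Int, delayMs: Long): Unit =
        callApi(input).onComplete {
          case Success(v) => rf.complete(Iterable(v))
          case Failure(_) if retriesLeft > 0 =>
            // double the delay on every failed attempt: 100ms, 200ms, 400ms, ...
            scheduler.schedule(new Runnable {
              override def run(): Unit = attempt(input, rf, retriesLeft - 1, delayMs * 2)
            }, delayMs, TimeUnit.MILLISECONDS)
          case Failure(e) => rf.completeExceptionally(e)
        }

      // Stand-in for the real HTTP call to the rate-limited API
      private def callApi(s: String): Future[String] = Future.successful(s)
    }

Keep the total retry time below the async operator's timeout, otherwise the operator will time the request out before the last retry fires.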
gpg: Total number processed: 16
gpg: imported: 16
gpg: no ultimately trusted keys found
+ gpg --batch --verify flink.tgz.asc flink.tgz
gpg: Signature made Fri Feb 15 13:13:29 2019 UTC
gpg: using RSA key
8FEA1EE9D0048C0CCC70B7573211B0703B79EA0E
gpg: Can't check signature: No public key
Thanks,
William
guessing I'm not the
only one doing async functions in Scala, so figured I'd ask :)
Thanks,
William Saar
: "Lasse Nedergaard"
To:"William Saar"
Cc:"Fabian Hueske" , "user"
Sent:Tue, 5 Feb 2019 10:41:41 +0100
Subject:Re: How to add caching to async function?
Hi William
No, iterations aren't the solution, as you can (will) end up in a
deadlock. We conclude
Thanks! Looks like iterations is indeed the way to go for now then...
- Original Message -
From: "Lasse Nedergaard"
To:"Fabian Hueske"
Cc:"William Saar" , "user"
Sent:Mon, 4 Feb 2019 20:20:30 +0100
Subject:Re: How to add caching to async f
getting: State is not supported in rich async functions.
What is the best practice for doing this? I guess I could have a
previous step with state and send the responses from the rich function
back as an iteration, but I would guess that's the wrong approach...
Thanks,
William
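Since keyed state isn't available there, the usual workaround is a plain per-task cache that is never checkpointed. A minimal sketch under that assumption (lookup is a stand-in for the real async call; losing the cache on restart is assumed to be acceptable):

    import java.util.Collections
    import java.util.concurrent.ConcurrentHashMap
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.Future
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.functions.async.{ResultFuture, RichAsyncFunction}

    class CachingAsyncFunction extends RichAsyncFunction[String, String] {
      // plain JVM map, not Flink state: rebuilt empty after every restore
      @transient private var cache: ConcurrentHashMap[String, String] = _

      override def open(parameters: Configuration): Unit =
        cache = new ConcurrentHashMap[String, String]()

      override def asyncInvoke(input: String, resultFuture: ResultFuture[String]): Unit = {
        val hit = cache.get(input)
        if (hit != null) {
          resultFuture.complete(Collections.singletonList(hit))
        } else {
          lookup(input).foreach { v =>
            cache.put(input, v)
            resultFuture.complete(Collections.singletonList(v))
          }
        }
      }

      // Stand-in for the real async request
      private def lookup(s: String): Future[String] = Future.successful(s.reverse)
    }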
What is the easiest way to use S3 from a Flink job deployed in a session
cluster on Kubernetes?
I've tried including the flink-s3-fs-hadoop dependency in the sbt file
for my job; can I programmatically set the properties to point to it?
Is there a ready-made Docker image for a Flink with S3 dependencies
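On the programmatic-properties part, a sketch under the assumption that flink-s3-fs-hadoop is already on the classpath (key names follow the standard s3 filesystem options; the credentials are placeholders). The usual route is flink-conf.yaml plus the plugins directory; this is just the in-code variant:

    import org.apache.flink.configuration.Configuration
    import org.apache.flink.core.fs.FileSystem

    val fsConf = new Configuration()
    fsConf.setString("s3.access-key", "<access-key>")
    fsConf.setString("s3.secret-key", "<secret-key>")
    // must run before the first s3:// path is resolved
    FileSystem.initialize(fsConf)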
I'm running Flink 1.7 in ECS; is this a known issue or should I create
a JIRA?
The web console doesn't show anything when trying to list logs or
stdout for task managers, and the job manager log has stack traces for
the errors
2018-12-19 15:35:53,498 ERROR
org.apache.flink.runtime.rest.handler.
Thanks, works great! This should be very useful for real-time
dashboards that want to compute in event time, especially for
multi-tenant systems or other specialized Kafka topics that can have
gaps in the traffic.
- Original Message -
From: "Aljoscha Krettek"
To:"Wi
Are there any standardized components to generate watermarks based on
processing time in an event-time stream when there is no data from a source?
The docs for event time [1] indicate that people are doing this, but
the only suggestion on Stack Overflow [2] is to make every window
operator in the stream have a
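A sketch of one such component, using the periodic-watermark API of that era (the 5-second bound is an arbitrary assumption): event time drives the watermark while data flows, and wall-clock time takes over during gaps.

    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks
    import org.apache.flink.streaming.api.watermark.Watermark

    class ProcessingTimeBackedWatermarks[T] extends AssignerWithPeriodicWatermarks[T] {
      private val maxDelayMs = 5000L
      private var maxSeenTs = Long.MinValue

      override def extractTimestamp(element: T, previousTs: Long): Long = {
        // keep the event timestamp attached upstream
        maxSeenTs = math.max(maxSeenTs, previousTs)
        previousTs
      }

      override def getCurrentWatermark(): Watermark =
        // advance with wall-clock time when no events arrive, bounded by maxDelayMs
        new Watermark(math.max(maxSeenTs, System.currentTimeMillis() - maxDelayMs))
    }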
I'm trying to stream log messages (syslog fed into Kafka) into Parquet
files on HDFS via Flink. I'm able to read, parse, and construct objects for
my messages in Flink; however, writing to Parquet is tripping me up. I do
*not* need to have this be real-time; a delay of a few minutes, even up to
an
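A sketch of one common route for this, under two assumptions: flink-parquet with Avro is on the classpath, and the parsed message type is a hypothetical LogEvent. Bulk formats roll a new file on every checkpoint, which matches the few-minutes-of-delay tolerance:

    import org.apache.flink.core.fs.Path
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink

    case class LogEvent(host: String, message: String, ts: Long)

    val sink: StreamingFileSink[LogEvent] = StreamingFileSink
      .forBulkFormat(new Path("hdfs:///logs/parquet"),
                     ParquetAvroWriters.forReflectRecord(classOf[LogEvent]))
      .build()
    // parsedStream.addSink(sink) — attach to the stream of parsed messages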
data used to come from two
different topics).
William
- Original Message -
From: "Gary Yao"
To: "William Saar"
Cc: "user"
Sent: Thu, 18 Jan 2018 11:11:17 +0100
Subject: Re: Far too few watermarks getting generated with Kafka source
Hi William,
How often
through the pipeline, we're just not
getting watermarks...
Thanks,
William
Hi,
I have added the code below to the start of processElement2 in a
CoProcessFunction. It prints timestamps and watermarks for the first 3
elements for each new watermark. Shouldn't the timestamp always be
lower than the next watermark? The 3 timestamps before the last
watermark are all larger than
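The quoted code did not survive the archive; a minimal sketch of the kind of check described, using the standard CoProcessFunction API: ctx.timestamp() is the element's timestamp and currentWatermark() the operator's watermark at that moment, which elements may legitimately run ahead of, since watermarks trail the data.

    import org.apache.flink.streaming.api.functions.co.CoProcessFunction
    import org.apache.flink.util.Collector

    class WatermarkDebug extends CoProcessFunction[String, String, String] {
      override def processElement1(v: String,
          ctx: CoProcessFunction[String, String, String]#Context,
          out: Collector[String]): Unit = out.collect(v)

      override def processElement2(v: String,
          ctx: CoProcessFunction[String, String, String]#Context,
          out: Collector[String]): Unit = {
        // element timestamp vs. the watermark the operator has seen so far
        println(s"ts=${ctx.timestamp()} wm=${ctx.timerService().currentWatermark()}")
        out.collect(v)
      }
    }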
Hi,
That looks perfect! I realized I could probably use an Evictor
together with my WindowProcessFunction to prevent the window from
preserving the whole state, but ditching the window looks even better.
Thanks a lot!
William
- Original Message -
From: "Nico Kruber"
To:
C
timestamp meets the
expiration condition, or if the elements iterable parameter does not
contain any new elements (deducing that the processing must have been
triggered by a timer invocation and not a new element). Is there a
better way to do this?
Thanks,
William
Flink.
- Original Message -
From: "Gyula Fóra"
To: "William Saar",
Cc:
Sent: Tue, 30 May 2017 13:56:08 +
Subject: Re: Porting batch percentile computation to streaming window
I think you could actually do a window operation to get the
tDigestStre
watermark information or something on
metrics objects in line 1 and emit T digests more often in line 2?
Finally, how do I access the watermark/window information in my fold
operation in line 1?
Thanks!
- Original Message -
From: "Gyula Fóra"
To:"William Saar" ,
Cc:
Sent:Tue, 30
after seeing all stream values).
Thanks!
William
the end of the window + allowedLateness you have set.
On Tue, Nov 22, 2016 at 11:08 PM, William Saar
wrote:
> Thanks!
> One difference is that my topology had 2 sources. I have updated
your
> example to also use 2 sources and that breaks the co-group
operation in the
> e
.
William
- Original Message -
From: user@flink.apache.org
To: "user@flink.apache.org"
Cc:
Sent: Tue, 22 Nov 2016 11:50:52 +0100
Subject: Re: ContinuousEventTimeTrigger breaks coGrouped windowed
streams?
Hi William,
I've reproduced your example locally for some toy data and
would need to keep up to
3 copies of all events (for at least the smallest window size) to
compute the same type of results.
Greetings!
William
- Original Message -
From: user@flink.apache.org
To:
Cc:
Sent: Mon, 21 Nov 2016 08:22:16 +
Subject: Re: ContinuousEventTimeTrigger breaks coGroupe
0)))
  .fold(...);
stream1.coGroup(stream2).where(...).equalTo(...)
  .window(TumblingEventTimeWindows.of(Time.seconds(30)))
  .trigger(ContinuousEventTimeTrigger.of(Time.seconds(10)))
  .print()
Thanks, William
result =
localAndRemoteStream.select("local").map(process).union(remoteStream).broadcast();
// Broadcast all fully processed results to all machines
globalResult.fold().addSink(globalWindowOutputSink) // fold/reduce, I want a
result based on the full contents of the window
Any help would be greatly appreciated!
Thanks,
William