I need to make some slower external requests in parallel, so Async I/O
helps greatly with that. However, I also need to make the requests in a
certain order per key. Is that possible with Async I/O?
The documentation[1] talks about preserving the stream order of
results, but it doesn't discuss the
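For reference, a minimal sketch of the ordered mode the documentation describes (the function name, type parameters, and the callExternalService helper are illustrative, not from the thread). Note that orderedWait only preserves the order in which results are emitted downstream; it does not by itself serialize in-flight requests for the same key.

    import java.util.Collections;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    import org.apache.flink.streaming.api.datastream.AsyncDataStream;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.async.ResultFuture;
    import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

    // Hypothetical async enrichment; the external call is a stand-in for the slow request.
    public class EnrichFunction extends RichAsyncFunction<String, String> {

        @Override
        public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
            CompletableFuture
                .supplyAsync(() -> callExternalService(key))
                .thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
        }

        private String callExternalService(String key) {
            return key + "-enriched";   // placeholder for the external request
        }

        // Wiring: orderedWait emits results in input order, unorderedWait does not.
        public static DataStream<String> enrich(DataStream<String> input) {
            return AsyncDataStream.orderedWait(
                    input, new EnrichFunction(), 30, TimeUnit.SECONDS, 100);
        }
    }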
ic)? Is the count reported there correct (no missing data)?
>
> Cheers,
>
> Konstantin
>
>
>
>
> On Wed, Feb 13, 2019 at 3:19 PM Gyula Fóra wrote:
>
>> Sorry not posting on the mail list was my mistake :/
>>
>>
>> On Wed, 13 Feb 2019 at 15:01,
Stefan (or anyone!), please, could I have some feedback on the findings
that I reported on Dec 21, 2018? This is still a major blocker..
On Thu, Jan 31, 2019 at 11:46 AM Juho Autio wrote:
> Hello, is there anyone that could help with this?
>
> On Fri, Jan 11, 2019 at 8:14 AM Juho Aut
Hello, is there anyone that could help with this?
On Fri, Jan 11, 2019 at 8:14 AM Juho Autio wrote:
> Stefan, would you have time to comment?
>
> On Wednesday, January 2, 2019, Juho Autio wrote:
>
>> Bump – does anyone know if Stefan will be available to comment the latest
Stefan, would you have time to comment?
On Wednesday, January 2, 2019, Juho Autio wrote:
> Bump – does anyone know if Stefan will be available to comment the latest
> findings? Thanks.
>
> On Fri, Dec 21, 2018 at 2:33 PM Juho Autio wrote:
>
>> Stefan, I managed to analyze
Bump – does anyone know if Stefan will be available to comment the latest
findings? Thanks.
On Fri, Dec 21, 2018 at 2:33 PM Juho Autio wrote:
> Stefan, I managed to analyze savepoint with bravo. It seems that the data
> that's missing from output *is* found in savepoint.
>
>
me "side effect kafka output" on individual
operators. This should allow tracking more closely at which point the data
gets lost. However, maybe this would have to be in some Flink's internal
components, and I'm not sure which those would be.
Cheers,
Juho
On Mon, Nov 19, 2018 at 1
18 at 3:32 PM Juho Autio wrote:
> I was glad to find that bravo had now been updated to support installing
> bravo to a local maven repo.
>
> I was able to load a checkpoint created by my job, thanks to the example
> provided in bravo README, but I'm still missing the es
window-contents"
threw – obviously there's no operator by that name).
Cheers,
Juho
On Mon, Oct 15, 2018 at 2:25 PM Juho Autio wrote:
> Hi Stefan,
>
> Sorry but it doesn't seem immediately clear to me what's a good way to use
> https://github.com/king/bravo.
>
> How
t (locally) and run it from there (using an IDE, for
example)? Also it doesn't seem like the bravo gradle project supports
building a flink job jar, but if it does, how do I do it?
Thanks.
On Thu, Oct 4, 2018 at 9:30 PM Juho Autio wrote:
> Good then, I'll try to analyze the savepo
with most of Flink's
internals. Anyway, high backpressure is not seen on this job after it has
caught up the lag, so I thought it would be worth mentioning.
On Thu, Oct 4, 2018 at 6:24 PM Stefan Richter
wrote:
> Hi,
>
> Am 04.10.2018 um 16:08 schrieb Juho Autio :
>
nder if it would not make sense to use a batch job instead?
>
> Best,
> Stefan
>
> [1] https://github.com/king/bravo
> [2]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-start-position-configuration
>
> Am 04.10.2018 um 14
e records as input, e.g. tuples of primitive types saved as
>> csv
>> - minimal deduplication job which processes them and misses records
>> - check if it happens for shorter windows, like 1h etc
>> - setup which you use for the job, ideally locally reproducible or cloud
data that is missing from window output.
On Mon, Oct 1, 2018 at 11:56 AM Juho Autio wrote:
> Hi Andrey,
>
> To rule out for good any questions about sink behaviour, the job was
> killed and started with an additional Kafka sink.
>
> The same number of ids were missed in
9 PM Juho Autio wrote:
> Thanks, Andrey.
>
> > so it means that the savepoint does not lose at least some dropped
> records.
>
> I'm not sure what you mean by that? I mean, it was known from the
> beginning, that not everything is lost before/after restoring a savepoin
> the sink from the state.
>
> Another suggestion is to try to write records to some other sink or to
> both.
> E.g. if you can access file system of workers, maybe just into local files
> and check whether the records are also dropped there.
>
> Best,
> Andrey
>
> O
> is key) it can be that you do not get it in the list or exists (HEAD)
> returns false and you risk rewriting the previous part.
>
> The BucketingSink was designed for a standard file system. s3 is used over
> a file system wrapper atm but does not always provide normal file system
>
skManager? From the log extracts you sent, I cannot really draw any
> conclusions.
>
> Best,
> Gary
>
>
> On Wed, Aug 15, 2018 at 10:38 AM, Juho Autio wrote:
>
>> Thanks Gary..
>>
>> What could be blocking the RPC threads? Slow checkpoin
og message indicates a border between the events that should
> be included into the savepoint (logged before) or not:
> “{} ({}, synchronous part) in thread {} took {} ms” (template)
> Also check if the savepoint has been overall completed:
> "{} ({}, asynchronous part) in thread {}
] introduced in Flink 1.6.0
> instead of the previous 'BucketingSink’?
>
> Cheers,
> Andrey
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
>
> On 24 Aug 2018, at 18:03, Juho Autio wrote:
>
> Yes, sorry
Kafka but
> before the window result is written into s3.
>
> Allowed lateness should not affect it, I am just saying that the final
> result in s3 should include all records after it.
> This is what should be guaranteed but not the contents of the intermediate
> savepoint.
>
>
oint but it should be behind the snapshotted offset in Kafka.
> Then it should just come later after the restore and should be reduced
> within the allowed lateness into the final result which is saved into s3.
>
> Also, is this `DistinctFunction.reduce` just an example or the actual
I changed to allowedLateness=0, no change, still missing data when
restoring from savepoint.
On Tue, Aug 21, 2018 at 10:43 AM Juho Autio wrote:
> I realized that BucketingSink must not play any role in this problem. This
is because only when the 24-hour window triggers, BucketingSink
first and then cancel
> the job. Thus, the later savepoints might complete or not depending on the
> correct timing. Since savepoint can flush results to external systems, I
> would recommend not calling the API multiple times.
>
> Cheers,
> Till
>
> On Wed, Aug 22, 2018 at
First, I couldn't find anything about State TTL in the Flink docs. Is there
anything like that? I can manage based on Javadocs & source code, but just
wondering.
Then to the main question: why doesn't the TTL support event time, and is
there any sensible use case for the TTL if the streaming character
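For context, a minimal processing-time TTL sketch against the Flink 1.6 State TTL API (the function, state name, and 24-hour TTL value are assumptions for illustration); as noted above, event-time TTL is not supported by this API.

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    // Hypothetical keyed deduplication function; only the TTL wiring matters here.
    public class DedupFunction extends RichFlatMapFunction<String, String> {

        private transient ValueState<Boolean> seen;

        @Override
        public void open(Configuration parameters) {
            // TTL is based on processing time; event-time TTL is not available.
            StateTtlConfig ttlConfig = StateTtlConfig
                    .newBuilder(Time.hours(24))
                    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                    .build();

            ValueStateDescriptor<Boolean> descriptor =
                    new ValueStateDescriptor<>("seen", Boolean.class);
            descriptor.enableTimeToLive(ttlConfig);   // enable TTL before acquiring the handle
            seen = getRuntimeContext().getState(descriptor);
        }

        @Override
        public void flatMap(String value, Collector<String> out) throws Exception {
            if (seen.value() == null) {
                seen.update(true);
                out.collect(value);   // emit only the first occurrence per key within the TTL
            }
        }
    }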
>> Because cancelWithSavepoint actually waits for the savepoint to complete
>> synchronously, and then executes the cancel command.
>>
>> We do not use the CLI. I think since you are going through the CLI, you can observe
>> whether the savepoint is complete by combining the log
data when restoring a savepoint?
On Fri, Aug 17, 2018 at 4:23 PM Juho Autio wrote:
> Some data is silently lost on my Flink stream job when state is restored
> from a savepoint.
>
> Do you have any debugging hints to find out where exactly the data gets
> dropped?
>
> My job
Some data is silently lost on my Flink stream job when state is restored
from a savepoint.
Do you have any debugging hints to find out where exactly the data gets
dropped?
My job gathers distinct values using a 24-hour window. It doesn't have any
custom state management.
When I cancel the job wi
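A rough sketch of the job shape described above (field types, the keyBy field, and the reduce body are assumptions; the actual DistinctFunction mentioned later in the thread may differ):

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.time.Time;

    // Hypothetical shape: one record kept per key within each 24-hour window.
    public class DistinctFunction implements ReduceFunction<Tuple2<String, String>> {

        @Override
        public Tuple2<String, String> reduce(Tuple2<String, String> a, Tuple2<String, String> b) {
            return a;   // duplicates within the window collapse to the first record
        }

        // Wiring, given a stream of (id, value) pairs with timestamps/watermarks assigned.
        public static DataStream<Tuple2<String, String>> distinctPerDay(
                DataStream<Tuple2<String, String>> events) {
            return events
                    .keyBy(0)                        // key by the id field
                    .timeWindow(Time.days(1))        // the 24-hour window described above
                    .reduce(new DistinctFunction());
        }
    }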
/ops/config.html#web-timeout
Cheers,
Juho
On Wed, Aug 15, 2018 at 11:43 AM Juho Autio wrote:
> Vishal, from which version did you upgrade to 1.5.1? Maybe from 1.5.0
> (release)? Knowing that might help narrow down the source of this.
>
> On Wed, Aug 15, 2018 at 11:38 AM Juho Aut
Vishal, from which version did you upgrade to 1.5.1? Maybe from 1.5.0
(release)? Knowing that might help narrow down the source of this.
On Wed, Aug 15, 2018 at 11:38 AM Juho Autio wrote:
> Thanks Gary..
>
> What could be blocking the RPC threads? Slow checkpointing?
>
> In p
c84dfe69ea0/flink-runtime/src/main/java/org/apache/flink/runtime/heartbeat/HeartbeatManagerSenderImpl.java#L64
>
> On Mon, Aug 13, 2018 at 9:52 AM, Juho Autio wrote:
>
>> I also have jobs failing on a daily basis with the error "Heartbeat of
>> TaskManager with id time
I also have jobs failing on a daily basis with the error "Heartbeat of
TaskManager with id timed out". I'm using Flink 1.5.2.
Could anyone suggest how to debug possible causes?
I already set these in flink-conf.yaml, but I'm still getting failures:
heartbeat.interval: 1
heartbeat.timeout: 10
t and multiple users on the mailing list have
> encountered similar problems.
>
> In our environment, it seems that JM shows that the save point is complete
> and JM has stopped itself, but the client will still connect to the old JM
> and report a timeout exception.
>
> Thanks, v
I was trying to cancel a job with savepoint, but the CLI command failed
with "akka.pattern.AskTimeoutException: Ask timed out".
The stack trace reveals that ask timeout is 10 seconds:
Caused by: akka.pattern.AskTimeoutException: Ask timed out on
[Actor[akka://flink/user/jobmanager_0#106635280]] a
that the Flink Kafka Consumer does not rely on the committed
>>> offsets for fault tolerance guarantees. The committed offsets are only a
>>> means to expose the consumer’s progress for monitoring purposes.
>>>
>>> Can you post full logs from all TaskManagers/JobM
ion
> /Cannot-auto-commit-offsets-for-group-console-consumer-79720/m-p/51188
> https://stackoverflow.com/questions/42362911/kafka-high-level-consumer-error-code-15/42416232#42416232
> Especially that in your case this offsets committing is superseded by
> Kafka coordinator failure.
Hi,
We have a Flink stream job that uses Flink kafka consumer. Normally it
commits consumer offsets to Kafka.
However, this stream ended up in a state where it's otherwise working just
fine, but it isn't committing offsets to Kafka any more. The job keeps
writing correct aggregation results to the
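As a point of reference, a minimal sketch of how offset committing is wired when checkpointing is enabled (topic, group id, and interval are placeholders); setCommitOffsetsOnCheckpoints(true) is the default in that mode, so the call below only makes the behaviour explicit.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

    public class OffsetCommitExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // With checkpointing enabled, offsets are committed back to Kafka on checkpoint completion.
            env.enableCheckpointing(60_000);

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "kafka:9092");   // placeholder
            props.setProperty("group.id", "my-consumer-group");     // placeholder

            FlinkKafkaConsumer010<String> consumer =
                    new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props);
            consumer.setCommitOffsetsOnCheckpoints(true);

            env.addSource(consumer).print();
            env.execute("offset-commit-example");
        }
    }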
unnecessary components Etc.
>> Either such process help you figure out what’s wrong on your own and if
>> not, if you share us such minimal program that reproduces the issue, it
>> will allow us to debug it.
>>
>> Piotrek
>>
>>
>> On 11 May 2018,
only. One thing which we have to
> fix first is that also the jar file upload goes through REST.
>
> [1] https://issues.apache.org/jira/browse/FLINK-9478
>
> Cheers,
> Till
>
> On Wed, May 30, 2018 at 9:07 AM, Juho Autio wrote:
>
>> Hi, I tried to search Flink Jira
at this
> improvement will make it into the 1.6 release.
>
> Cheers,
> Till
>
> On Thu, Apr 5, 2018 at 4:45 PM, Juho Autio wrote:
>
>> Thanks for the answer. Wrapping with GET sounds good to me. You said next
>> version; do you mean that Flink 1.5 would already i
er.handleWatermark(StreamInputProcessor.java:262)
... 7 more
On Fri, May 18, 2018 at 11:06 AM, Juho Autio wrote:
> Thanks Sihua, I'll give that RC a try.
>
> On Fri, May 18, 2018 at 10:58 AM, sihua zhou wrote:
>
>> Hi Juho,
>> would you like to try out the latest RC(http://people
t RC includes a fix for the potential silent data
> loss. If that's the reason, you will see a different exception when you
> try to recover your job.
>
> Best, Sihua
>
>
>
> On 05/18/2018 15:02,Juho Autio
> wrote:
>
> I see. I appreciate keeping this optio
single existing checkpoint or also creating other problematic
> checkpoints? I am asking because maybe a log from the job that produces the
> problematic checkpoint might be more helpful. You can create a ticket if
> you want.
>
> Best,
> Stefan
>
>
> Am 18.05.2018 um 09:02 schrie
I see. I appreciate keeping this option available even if it's "beta". The
current situation could be documented better, though.
As long as rescaling from checkpoint is not officially supported, I would
put it behind a flag similar to --allowNonRestoredState. The flag could be
called --allowRescal
> I think you're asking the question I have asked in
> https://github.com/apache/flink/pull/5490, you can refer to it and find
> the comments there.
>
> @Stefan, PR(https://github.com/apache/flink/pull/6020) has been prepared.
>
> Best, Sihua
>
>
> On 05/16/2018 17:
I'd like to dig it out because currently we also
> use the checkpoint like the way you are) ...
>
> Best, Sihua
>
> On 05/16/2018 01:46,Juho Autio
> wrote:
>
> I was able to reproduce this error.
>
> I just happened to notice an important detail about the original fa
cli.properties
log4j-console.properties
log4j.properties
log4j-yarn-session.properties
logback-console.xml
logback.xml
logback-yarn.xml
On Tue, May 15, 2018 at 11:49 AM, Stefan Richter <
s.rich...@data-artisans.com> wrote:
> Hi,
>
> Am 15.05.2018 um 10:34 schrieb Juho Autio :
>
> Ok
e written for checkpoints/savepoints.
> - completed checkpoints/savepoints ids.
> - the restored checkpoint/savepoint id.
> - files that are loaded on restore.
>
> Am 15.05.2018 um 10:02 schrieb Juho Autio :
>
> Thanks all. I'll have to see about sharing the logs & config
, e.g. if you are using incremental checkpoints or not.
> Are you using the local recovery feature? Are you restarting the job from a
> checkpoint or a savepoint? Can you provide logs for both the job that
> failed and the restarted job?
>
> Best,
> Stefan
>
>
> Am 14.05.2
We have a Flink streaming job (1.5-SNAPSHOT) that uses timers to clear old
state. After restoring state from a checkpoint, it seems like a timer had
been restored, but not the data that was expected to be in a related
MapState if such a timer had been added.
The way I see this is that there's a bug,
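A minimal sketch of the pattern being described, assuming a keyed stream with event timestamps assigned, where state is written and a cleanup timer registered in the same call (names, types, and the 24-hour TTL are placeholders):

    import org.apache.flink.api.common.state.MapState;
    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;

    // Must be applied on a keyed stream; state and the cleanup timer are created together,
    // so on restore both are expected to exist together.
    public class TimedStateFunction extends ProcessFunction<String, String> {

        private static final long TTL_MS = 24 * 60 * 60 * 1000L;

        private transient MapState<String, Long> entries;

        @Override
        public void open(Configuration parameters) {
            entries = getRuntimeContext().getMapState(
                    new MapStateDescriptor<>("entries", String.class, Long.class));
        }

        @Override
        public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
            entries.put(value, ctx.timestamp());
            // Timer that should later find the entry written above.
            ctx.timerService().registerEventTimeTimer(ctx.timestamp() + TTL_MS);
            out.collect(value);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
            // Clear expired entries; the report above describes this firing after a restore
            // while the corresponding MapState was unexpectedly empty.
            entries.clear();
        }
    }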
>
> https://gist.github.com/pnowojski/8cd650170925cf35be521cf236f1d97a
>
> Prints only ONE number to the standard err:
>
> > 1394
>
> And there is nothing on the side output.
>
> Piotrek
>
> On 11 May 2018, at 12:32, Juho Autio wrote:
>
> Thanks. What I still don't get is
would end up in can
> also be tricky if records are assigned to multiple windows (e.g., sliding
> windows).
> In this case, a side-outputted record could still be in some windows and
> not in others.
>
> @Aljoscha (CC) Might have an explanation for the current behavior.
>
I don't understand why I'm getting some data discarded as late on my Flink
stream job a long time before the window even closes.
I cannot be 100% sure, but to me it seems like the Kafka consumer is
basically causing the data to be dropped as "late", not the window. I
didn't expect this to ever ha
Bump this. I can create a ticket if it helps?
On Tue, Apr 24, 2018 at 4:47 PM, Juho Autio wrote:
> Anything to add? Is there a Jira ticket for this yet?
>
>
> On Fri, Apr 20, 2018 at 1:03 PM, Stefan Richter <
> s.rich...@data-artisans.com> wrote:
>
>> If estimates
feature request to include the client logs though.
> Would you mind opening a JIRA issue for this?
>
> Thanks, Fabian
>
> 2018-04-27 11:27 GMT+02:00 Juho Autio :
>
>> Ah, found the place! In my case they seem to be going to
>> /home/hadoop/flink-1.5-SNAPSHOT/log/flin
onstrate the
> configuration for partition discovery.
> Could you open a JIRA for that?
>
> Cheers,
> Gordon
>
> On Tue, Apr 10, 2018, 8:44 AM Juho Autio wrote:
>
>> Ahhh looks like I had simply misunderstood where that property should go.
>>
>> The docs
Ah, found the place! In my case they seem to be going to
/home/hadoop/flink-1.5-SNAPSHOT/log/flink-hadoop-client-ip-10-0-10-29.log
(for example).
Any reason why these can't be shown in Flink UI, maybe in jobmanager log?
On Fri, Apr 27, 2018 at 12:13 PM, Juho Autio wrote:
> The logs l
The logs logged by my job jar before env.execute can't be found in
jobmanager log. I can't find them anywhere else either.
I can see all the usual logs by Flink components in the jobmanager log,
though. And in taskmanager log I can see both Flink's internal & my
application's logs from the executi
could be misleading.
>
>
> Am 20.04.2018 um 11:59 schrieb Juho Autio :
>
> Thanks. At least for us it doesn't matter how exact the number is. I would
> expect most users to be only interested in monitoring if the total state
> size keeps growing (rapidly), or remains abo
Thanks. At least for us it doesn't matter how exact the number is. I would
expect most users to be only interested in monitoring if the total state
size keeps growing (rapidly), or remains about the same. I suppose all of
the options that you suggested would satisfy this need?
On Fri, Apr 20, 2018
Hi Aljoscha & co.,
Is there any way to monitor the state size yet? Maybe a ticket in Jira?
When using incremental checkpointing, the total state size can't be seen
anywhere. For example the checkpoint details only show the size of the
increment. It would be nice to add the total size there as wel
A possible workaround while waiting for FLINK-5479, if someone is hitting
the same problem: we chose to send "heartbeat" messages periodically to all
topics & partitions found on our Kafka. We do that through the service that
normally writes to our Kafka. This way every partition always has some
~r
ree Juho!
>
> Do you want to contribute the docs fix?
> If yes, we should update FLINK-5479 to make sure that the warning is
> removed once the bug is fixed.
>
> Thanks, Fabian
>
> 2018-04-12 9:32 GMT+02:00 Juho Autio :
>
>> Looks like the bug https://issues.apache.
ht be a window (such as a session window) that is
> open much longer than all other windows and which would hold back the
> offset. Other applications might not use the built-in windows at all but
> custom ProcessFunctions.
>
> Have you considered tracking progress using watermarks?
Thanks!
On Wed, Apr 11, 2018 at 12:59 PM, Chesnay Schepler
wrote:
> Data that arrives within the allowed lateness should not be written to the
> side output.
>
>
> On 11.04.2018 11:12, Juho Autio wrote:
>
> If I use a non-zero value for allowedLateness and also sideOutpu
If I use a non-zero value for allowedLateness and also sideOutputLateData,
does the late data output contain also the events that were triggered in
the bounds of allowed lateness? By looking at the docs I can't be sure
which way it is.
Code example:
.timeWindow(Time.days(1))
.allowedLateness(Time
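A more complete sketch of the wiring the truncated snippet above starts (key selector, lateness value, and reduce body are placeholders); per the answer later in this thread, records arriving within the allowed lateness still go to the window result, not to the side output:

    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.util.OutputTag;

    public class LateDataExample {

        public static void build(DataStream<String> events) {
            // Tag for records that arrive later than watermark + allowed lateness.
            final OutputTag<String> lateTag = new OutputTag<String>("late-data") {};

            SingleOutputStreamOperator<String> windowed = events
                    .keyBy(new KeySelector<String, String>() {
                        @Override
                        public String getKey(String value) {
                            return value;             // placeholder key
                        }
                    })
                    .timeWindow(Time.days(1))
                    .allowedLateness(Time.hours(1))   // placeholder lateness
                    .sideOutputLateData(lateTag)
                    .reduce((a, b) -> a);             // placeholder reduce

            // Only records later than watermark + allowed lateness end up here.
            DataStream<String> late = windowed.getSideOutput(lateTag);
            late.print();
            windowed.print();
        }
    }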
this mistake. The docs are clear though, I just
had become blind to this detail as I thought I had already read it.
On Thu, Apr 5, 2018 at 10:26 AM, Juho Autio wrote:
> Still not working after I had a fresh build from
> https://github.com/apache/flink/tree/release-1.5.
>
> When the
se/YARN-2084
>
> Cheers,
> Till
>
> On Wed, Apr 4, 2018 at 4:31 PM, Fabian Hueske wrote:
>
>> Hi Juho,
>>
>> Thanks for raising this point!
>>
>> I'll add Chesnay and Till to the thread who contributed to the REST API.
>>
>> Best, Fabi
ing about discovering a new partition. We should probably add this.
>
> And yes, it would be great if you can report back on this using either the
> latest master, release-1.5 or release-1.4 branches.
>
> On 22 March 2018 at 10:24:09 PM, Juho Autio (juho.au...@rovio.com) wrote:
>
> Th
I just learned that Flink savepoints API was refactored to require using
HTTP POST.
That's fine otherwise, but makes life harder when Flink is run on top of
YARN.
I've added example calls below to show how POST is declined by
the hadoop-yarn-server-web-proxy*, which only supports GET and PUT.
Ca
d '{"cancel-job": true}'
>
> Let me know if it works for you.
>
> Best,
> Gary
>
> On Thu, Mar 29, 2018 at 10:39 AM, Juho Autio wrote:
>
>> Thanks Gary. And what if I want to match the old behaviour ie. have the
>> job cancelled after savepoi
Sorry, my bad. I checked the persisted jobmanager logs and can see that job
was still being restarted at 15:31 and then at 15:36. If I wouldn't have
terminated the cluster, I believe the flink job / yarn app would've
eventually exited as failed.
On Thu, Mar 29, 2018 at 4:49 PM, Juho Au
> As a side note, beginning from Flink 1.5, you do not need to specify -yn
> -ys
> because resource allocations are dynamic by default (FLIP-6). The
> parameter -yst
> is deprecated and should not be needed either.
>
> Best,
> Gary
>
> On Thu, Mar 29, 2018 at 8:59 AM, Ju
org/jira/browse/FLINK-9104
> [4] https://github.com/apache/flink/blob/release-1.5/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.java#L59
>
> On Thu, Mar 29, 2018 at 10:04 AM, Juho Autio wrote:
>
>> With a fresh build fro
With a fresh build from release-1.5 branch, calling /cancel-with-savepoint
fails with 404 Not Found.
The snapshot docs still mention /cancel-with-savepoint:
https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html#cancel-job-with-savepoint
1. How can I achieve the same res
I built a new Flink distribution from release-1.5 branch yesterday.
The first time I tried to run a job with it, the job ended up in some stalled state
so that it didn't manage to (re)start, but what makes it worse is that
it didn't exit as failed either.
Next time I tried running the same job (but ne
Never mind, I'll post this new problem as a new thread.
On Wed, Mar 28, 2018 at 6:35 PM, Juho Autio wrote:
> Thank you. The YARN job was started now, but the Flink job itself is in
> some bad state.
>
> Flink UI keeps showing status CREATED for all sub-tasks and nothing seems
> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/hadoop.html#configuring-flink-with-hadoop-classpaths
>
>
> On Wed, Mar 28, 2018 at 4:26 PM, Juho Autio wrote:
>
>> I built a new Flink distribution from release-1.5 branch today.
>>
>> I tried runni
I built a new Flink distribution from release-1.5 branch today.
I tried running a job but get this error:
java.lang.NoClassDefFoundError:
com/sun/jersey/core/util/FeaturesAndProperties
I use yarn-cluster mode.
The jersey-core jar is found in the hadoop lib on my EMR cluster, but seems
like it's
s://issues.apache.org/jira/browse/FLINK-8419
>
> This issue should have been fixed in the recently released 1.4.2 version.
>
> Cheers,
> Gordon
>
> On 22 March 2018 at 8:02:40 PM, Juho Autio (juho.au..
According to the docs*, flink.partition-discovery.interval-millis can be
set to enable automatic partition discovery.
I'm testing this, but apparently it doesn't work.
I'm using Flink Version: 1.5-SNAPSHOT Commit: 8395508
and FlinkKafkaConsumer010.
I had my flink stream running, consuming an existin
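For reference, a sketch of where the property is expected to go: into the Kafka consumer's Properties rather than flink-conf.yaml (bootstrap servers, group id, topic, and the interval value are placeholders). As another message in this thread notes, the property's location turned out to be the actual issue.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

    public class PartitionDiscoveryExample {

        public static FlinkKafkaConsumer010<String> buildConsumer() {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "kafka:9092");   // placeholder
            props.setProperty("group.id", "my-consumer-group");     // placeholder
            // The discovery interval goes into the *consumer* properties, not flink-conf.yaml.
            props.setProperty("flink.partition-discovery.interval-millis", "10000");

            return new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props);
        }
    }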
Hi, have there been any changes to state handling with Flink SQL? Anything
planned?
I didn't find it at
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql.html.
Recently I ran into problems when trying to restore the state after changes
that I thought wouldn't change the executi
Is it possible to add new fields to the object type of a stream, and then
restore from savepoint?
I tried to add a new field "private String" to my java class. It previously
had "private String" and a "private final Map". When trying
to restore an old savepoint after this code change, it failed wi
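A hypothetical reconstruction of the class shape being described (field names are placeholders, not from the original code); the point is only to show where the new field was added relative to the state captured in the savepoint:

    import java.util.HashMap;
    import java.util.Map;

    public class Event {

        // Fields that existed when the savepoint was taken.
        private String id;
        private final Map<String, String> attributes = new HashMap<>();

        // Field added after the savepoint; restoring the old savepoint with this
        // class is the step reported above as failing.
        private String label;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        public Map<String, String> getAttributes() { return attributes; }

        public String getLabel() { return label; }
        public void setLabel(String label) { this.label = label; }
    }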
also seems a bit risky to me, like
the above attempt shows. Maybe it's best to wait for this to be supported
properly. As I said I don't seem to really need early firing right now,
because writing out all distinct values once window closes is not too slow
for us at the moment.
Thanks ag
OWTIME(rowtime, INTERVAL '10' SECOND) AS rowtime
> FROM events
> WHERE s_aid1 IS NOT NULL
> GROUP BY
> s_aid1,
> s_cid,
> TUMBLE(rowtime, INTERVAL '10' SECOND)
> )
>
> Early triggering is not yet supported for SQL queries.
>
> B
oduces the
> results in real-time. If the batching is not required, you should be good
> by adding a filter on occurrence = 1.
> Otherwise, you could add the filter and wrap it by 10 secs tumbling window.
>
> Hope this helps,
> Fabian
>
>
> 2018-02-14 15:30 GMT+01:00 Juho
I'm joining a tumbling & hopping window in Flink 1.5-SNAPSHOT. The result
is unexpected. Am I doing something wrong? Maybe this is just not a
supported join type at all? Any way here goes:
I first register these two tables:
1. new_ids: a tumbling window of seen ids within the last 10 seconds:
SE
I'm triggering nightly savepoints at 23:59:00 with crontab on the flink
cluster.
For example last night's savepoint has this information:
Trigger Time: 23:59:14
Latest Acknowledgement: 00:00:59
What are the min/max boundaries for the data contained by the savepoint?
Can I deduce from this either
t;
>
> On 15.01.2018 14:09, Juho Autio wrote:
>
> Thanks for the explanation. Did you meant that process() would return a
> SingleOutputWithSideOutputOperator?
>
> Any way, that should be enough to avoid the problem that I hit (and it
> also seems like the best & only way).
explicitly define the code as below, which makes the
> behavior unambiguous:
>
> processed = stream
> .process(...)
>
> filtered = processed
> .filter(...)
>
> filteredSideOutput = processed
> .getSideOutput(...)
> .filter(...)
>
>
> On 15.01.2
uld get resource and how much
> sharing same hardware resources, I guess it will open gate to lots of edge
> cases -> strategies-> more edge cases :)
>
> Chen
>
> On Fri, Jan 12, 2018 at 6:34 AM, Juho Autio wrote:
>
>> Maybe I could express it in a slightly different
Thanks, the window operator is just:
.timeWindow(Time.seconds(10))
We haven't changed key types.
I tried debugging this issue in IDE and found the root cause to be this:
!this.keyDeserializer.equals(keySerializer) -> true
=> throw new IllegalStateException("Tried to initialize restored
TimerS
t; Aljoscha would be the expert here, maybe he’ll have more insights. I’ve
> looped him in cc’ed.
>
> Cheers,
> Gordon
>
>
> On 12 January 2018 at 4:05:13 PM, Juho Autio (juho.au...@rovio.com) wrote:
>
> When I run the code below (Flink 1.4.0 or 1.3.1), only "a"
I'm trying to restore savepoints that were made with Flink 1.3.1 but
getting this exception. The few code changes that had to be done to switch
to 1.4.0 don't seem to be related to this, and it seems like an internal
issue of Flink. Is 1.4.0 supposed to be able to restore a savepoint that
was made
When I run the code below (Flink 1.4.0 or 1.3.1), only "a" is printed. If I
switch the position of .process() & .filter() (ie. filter first, then
process), both "a" & "b" are printed, as expected.
I guess it's a bit hard to say what the side output should include in this
case: the stream before fi
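A hedged sketch of the unambiguous wiring suggested elsewhere in this thread: take getSideOutput from the process() result itself, then filter each stream separately (tags and predicates are placeholders):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    public class SideOutputOrdering {

        public static void build(DataStream<String> stream) {
            final OutputTag<String> sideTag = new OutputTag<String>("side") {};

            SingleOutputStreamOperator<String> processed = stream
                    .process(new ProcessFunction<String, String>() {
                        @Override
                        public void processElement(String value, Context ctx, Collector<String> out) {
                            ctx.output(sideTag, value);   // goes to the side output
                            out.collect(value);           // goes to the main output
                        }
                    });

            // Main output, filtered after the process operator.
            DataStream<String> filtered = processed.filter(v -> v.startsWith("a"));

            // Side output taken from the process() result itself; taking it after
            // filter() is what the thread reports as not yielding the expected records.
            DataStream<String> sideFiltered = processed
                    .getSideOutput(sideTag)
                    .filter(v -> v.startsWith("b"));

            filtered.print();
            sideFiltered.print();
        }
    }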
l catching up.
> The watermarks are moving at the speed of the bigger topic, but all
> "early" events of the smaller topic are stored in stateful operators and
> are checkpointed as well.
>
> So, you lose neither early nor late data.
>
> Best, Fabian
>
>
Thanks for the answers, I still don't understand why I can see the offsets
being quickly committed to Kafka for the "small topic"? Are they committed
to Kafka before their watermark has passed on Flink's side? That would be
quite confusing.. Indeed when Flink handles the state/offsets internally,
t
I would like to understand how FlinkKafkaConsumer treats "unbalanced"
topics.
We're using FlinkKafkaConsumer010 with 2 topics, say "small_topic" &
"big_topic".
After restoring from an old savepoint (4 hours before), I checked the
consumer offsets on Kafka (Flink commits offsets to kafka for refer
Is there any way to set akka.client.timeout (or other flink config) when
calling bin/flink run instead of editing flink-conf.yaml? I tried to add it
as a -yD flag but couldn't get it working.
Related: https://issues.apache.org/jira/browse/FLINK-3964
Related issue: https://issues.apache.org/jira/browse/FLINK-2672
On Wed, May 25, 2016 at 9:21 AM, Juho Autio wrote:
> Thanks, indeed the desired behavior is to flush if bucket size exceeds a
> limit but also if the bucket has been open long enough. Contrary to the
> current RollingSink