I’m curious why it seems like a TimelyFlatMapFunction can’t be used with a
regular DataStream, but it can be used with a KeyedStream.
Or maybe I’m missing something obvious (this is with 1.2-SNAPSHOT, pulled
today).
Also the documentation of TimelyFlatMapFunction
(https://ci.apache.org/project
I've set the metric reporting frequency to InfluxDB as 10s. In the
screenshot, I'm using a Grafana query interval of 1s. I've tried 10s and more
too; the graph shape changes a bit, but the incorrect negative values are
still plotted (it makes no difference).
Something to add: If the subtasks are less tha
Hmm. I can't recreate that behavior here. I have seen some issues like
this if you're grouping by a time interval different from the metrics
reporting interval you're using, though. How often are you reporting
metrics to Influx? Are you using the same interval in your Grafana
queries? I see in
Yes, thanks, Stephan.
Regards,
Anchit
--
View this message in context:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-on-YARN-Fault-Tolerance-use-case-supported-or-not-tp9776p9817.html
Sent from the Apache Flink User Mailing List archive at Nabbl
Hi Jamie,
Thank you so much for your response.
The below query:
SELECT derivative(sum("count"), 1s) FROM "numRecordsIn" WHERE "task_name" =
'Sink: Unnamed' AND $timeFilter GROUP BY time(1s)
behaves the same as with the use of the templating variable in the 'All'
case, i.e., it shows plots of junk
Hi Manu, Aljoscha,
I had been interested in implementing FLIP-2, but I haven't been able to
make time for it. There is no implementation yet that I'm aware of, and
I'll gladly step aside (or help out however I can) if you or anyone is
interested in taking charge of it.
That said, I'm also not sure if d
Hi, Till:
I think the multiple input should include the more general case where
redistribution happens between subtasks, right? Since in this case we also
need to align the checkpoint barriers.
Till Rohrmann wrote on Tue, Nov 1, 2016, at 11:05 PM:
> The tuples are not buffered until the snapshot is globally complete (a
>
Thanks. The ideal case is to fire after the watermark passes each element in
the window, but that requires a custom trigger and FLIP-2 as well. The
enhanced window evictor will help to avoid the last firing.
Are the discussions on FLIP-2 still going on?
Are there any open JIRAs or PRs? (The propo
Ahh.. I haven’t used templating all that much, but this also works for your
subtask variable so that you don’t have to enumerate all the possible
values:
Template Variable Type: query
query: SHOW TAG VALUES FROM numRecordsIn WITH KEY = "subtask_index"
On Tue, Nov 1, 2016 at 2:51 PM, Jamie Grie
Another note. In the example the template variable type is "custom" and
the values have to be enumerated manually. So in your case you would have
to configure all the possible values of "subtask" to be 0-49.
On Tue, Nov 1, 2016 at 2:43 PM, Jamie Grier wrote:
> This works well for me. This will
This works well for me. This will aggregate the data across all sub-task
instances:
SELECT derivative(sum("count"), 1s) FROM "numRecordsIn" WHERE "task_name" =
'Sink: Unnamed' AND $timeFilter GROUP BY time(1s)
You can also plot each sub-task instance separately on the same graph by
doing:
SELECT
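One possible shape for such a per-sub-task query (this is an assumption built from the aggregate query above, not the query that was cut off; it presumes a "subtask_index" tag like the one used in the templating example) would group by that tag in addition to time:

```sql
SELECT derivative(sum("count"), 1s) FROM "numRecordsIn"
WHERE "task_name" = 'Sink: Unnamed' AND $timeFilter
GROUP BY time(1s), "subtask_index"
```

Grouping by the tag makes InfluxDB return one series per sub-task, which Grafana then plots as separate lines on the same graph.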
Hi Dominik,
out of curiosity, how come you receive timestamps from the future? ;)
Depending on the semantics of these future events, it might also make
sense to already "floor" the timestamp to processing time in the
extractTimestamp() method.
I am not sure if I understand your follow-up q
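The flooring idea above can be sketched independently of the Flink API; the class and method names below are hypothetical, and the real logic would live inside your timestamp extractor:

```java
// Sketch: cap event timestamps that lie in the future at processing time,
// so they cannot run further ahead than "now". floorToNow is a
// hypothetical helper, not part of the Flink API.
public class TimestampFloor {
    static long floorToNow(long eventTimestamp) {
        // A future timestamp is clamped to the current wall-clock time;
        // a past timestamp passes through unchanged.
        return Math.min(eventTimestamp, System.currentTimeMillis());
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long future = now + 60_000; // one minute in the future
        long past = now - 60_000;   // a normal, slightly old timestamp
        System.out.println(floorToNow(future) <= System.currentTimeMillis()); // true
        System.out.println(floorToNow(past) == past);                         // true
    }
}
```

With this capping in place, a future timestamp can never exceed the processing-time clock, so it also cannot outrun a `maxOutOfOrderness` bound derived from it.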
Hi Niklas,
I don't know exactly what is going wrong there, but I have a few pointers
for you:
1) in cluster setups, Flink redirects println() to ./log/*.out files, i.e.,
you have to search for the task manager that ran the DirReader and check
its ./log/*.out file
2) you are using Java's File class
Hello,
the main issue that prevented us from writing batches is that there is a
server-side limit on how big a batch may be; however, there was no way to
tell how big the batch that you are currently building up actually is.
Regarding locality, I'm not sure if a partitioner alone solves thi
Hi!
I do not know the details of how Cassandra supports batched writes, but
here are some thoughts:
- Grouping writes that go to the same partition together into one batch
write request makes sense. If you have some sample code for that, it should
be not too hard to integrate into the Flink Cas
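As a sketch of the partition-grouping idea: the partitioner, the key format, and the size limit below are all placeholder assumptions (Cassandra's actual partitioner and the real server-side byte limit are not modeled):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: group pending writes by partition key so that each batch only
// targets a single partition. The batch size is capped by a record count
// here, standing in for the server-side limit mentioned above.
public class PartitionBatcher {
    static final int MAX_BATCH_SIZE = 3; // placeholder for the real limit

    static Map<String, List<List<String>>> batchByPartition(List<String> writes) {
        Map<String, List<List<String>>> batches = new HashMap<>();
        for (String write : writes) {
            String partition = partitionOf(write);
            List<List<String>> partitionBatches =
                batches.computeIfAbsent(partition, k -> new ArrayList<>());
            List<String> current = partitionBatches.isEmpty()
                ? null
                : partitionBatches.get(partitionBatches.size() - 1);
            if (current == null || current.size() >= MAX_BATCH_SIZE) {
                current = new ArrayList<>(); // start a new batch
                partitionBatches.add(current);
            }
            current.add(write);
        }
        return batches;
    }

    // Hypothetical partitioner: uses the prefix before ':' as the key.
    static String partitionOf(String write) {
        return write.substring(0, write.indexOf(':'));
    }

    public static void main(String[] args) {
        List<String> writes = List.of("a:1", "a:2", "b:1", "a:3", "a:4");
        Map<String, List<List<String>>> batches = batchByPartition(writes);
        System.out.println(batches.get("a").size()); // 2 batches for partition "a"
        System.out.println(batches.get("b").size()); // 1 batch for partition "b"
    }
}
```

In a real sink the size check would compare accumulated serialized bytes against the server's limit rather than counting records.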
Hi Justin,
thank you for sharing the classpath of the Flink container with us. It
contains what Till was already expecting: An older version of the AWS SDK.
If you have some spare time, could you quickly try to run your program with
a newer EMR version, just to validate our suspicion?
If the erro
Hi there,
We're using EMR 4.4.0 -> I suppose this is a bit old, and I can migrate
forward if you think that would be best.
I've appended the classpath that the Flink cluster was started with at the
end of this email (with a slight improvement to the formatting to make it
readable).
Willing to po
Ah, I finally understand it. You would need a way to query the current
watermark in the window function to only emit those elements whose timestamp
is lower than the watermark.
When the window fires again, do you want to emit elements that you emitted
during the last firing again? If not, I think y
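The "emit only what the watermark covers" idea could be sketched like this; this is plain Java, not the Flink window API, and the names are hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch: from a window's buffered elements, keep only those whose
// timestamp is already covered by the current watermark.
public class WatermarkFilter {
    record Element(String value, long timestamp) {}

    static List<Element> emittable(List<Element> window, long watermark) {
        return window.stream()
            .filter(e -> e.timestamp() < watermark)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Element> window = List.of(
            new Element("a", 10), new Element("b", 20), new Element("c", 30));
        // With watermark 25, only "a" and "b" are safe to emit;
        // "c" has to wait for a later firing.
        System.out.println(emittable(window, 25).size()); // 2
    }
}
```

The open question from the thread remains whether a later firing should re-emit "a" and "b" or only the newly covered elements; that is exactly the deduplication the sender would need to add.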
By 'loop' do you refer to an iteration? The output of a bulk iteration is
processed as the input of the following iteration. Values updated in an
iteration are available in the next iteration just as values updated by an
operator are available to the following operator.
Your chosen algorithm may n
The tuples are not buffered until the snapshot is globally complete (a
snapshot is globally complete iff all operators have successfully taken a
snapshot). They are only buffered until the corresponding checkpoint
barrier on the second input is received. Once this is the case, the
checkpoint barrie
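The alignment described above can be sketched as a toy two-input operator; this is an illustration of the buffering behavior, not Flink's actual implementation:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of checkpoint barrier alignment for a two-input operator:
// once a barrier arrives on one input, further records on that input
// are buffered until the matching barrier arrives on the other input.
public class BarrierAlignment {
    private final boolean[] barrierSeen = new boolean[2];
    private final Queue<String> buffered = new ArrayDeque<>();
    final List<String> processed = new ArrayList<>();

    void onRecord(int input, String record) {
        if (barrierSeen[input]) {
            buffered.add(record); // hold back until alignment completes
        } else {
            processed.add(record);
        }
    }

    void onBarrier(int input) {
        barrierSeen[input] = true;
        if (barrierSeen[0] && barrierSeen[1]) {
            // Aligned: here the snapshot would be taken and the barrier
            // emitted downstream, then the buffered records are replayed.
            while (!buffered.isEmpty()) {
                processed.add(buffered.poll());
            }
            barrierSeen[0] = barrierSeen[1] = false;
        }
    }

    public static void main(String[] args) {
        BarrierAlignment op = new BarrierAlignment();
        op.onRecord(0, "r1");
        op.onBarrier(0);      // input 0 is now blocked
        op.onRecord(0, "r2"); // buffered, not processed yet
        op.onRecord(1, "r3"); // input 1 still flows freely
        op.onBarrier(1);      // alignment complete, r2 is replayed
        System.out.println(op.processed); // [r1, r3, r2]
    }
}
```

Note that nothing here waits for the snapshot to be globally complete: buffering ends as soon as the second barrier arrives, which is the point the original reply makes.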
Hi, Till:
By operator with multiple inputs, do you mean inputs from multiple
subtasks?
On Tue, Nov 1, 2016 at 8:56 PM Till Rohrmann wrote:
> Hi Li,
>
> the statement refers to operators with multiple inputs (two in this case).
> With the current implementation you will indeed block one of the in
Sorry for the incorrect reply; please ignore this.
On Tue, Nov 1, 2016 at 8:47 PM Renjie Liu wrote:
> Essentially you are right, but the snapshot commit process is
> asynchronous. That's what you have to pay for exactly once semantics.
>
> Li Wang wrote on Tue, Nov 1, 2016, at 3:05 PM:
>
> Hi all,
>
> I have a que
Essentially you are right, but the snapshot commit process is asynchronous.
That's what you have to pay for exactly once semantics.
Li Wang wrote on Tue, Nov 1, 2016, at 3:05 PM:
> Hi all,
>
> I have a question regarding the state checkpoint mechanism in Flink. I
> find the statement "Once the last stream h
Hey,
I'm using a BoundedOutOfOrdernessTimestampExtractor for assigning my
timestamps and discarding too-old events (which happens sometimes).
Now my problem is that some events, by accident, have timestamps in the
future. If the timestamps are further in the future than my
`maxOutOfOrderness`, I'm
Hi Till,
Thanks for your prompt reply. I understand that input streams should be aligned
such that a consistent state snapshot can be generated. In my opinion, that
statement indicates that an operator will buffer its output tuples until the
snapshot is committed. I am wondering if my understa
Hi Pedro,
this looks like a version mismatch. Could you check which version of
Elasticsearch you have in your classpath or uber jar, respectively? It
should be version 2.3.5.
Cheers,
Till
On Fri, Oct 28, 2016 at 6:59 PM, PedroMrChaves
wrote:
> Hello,
>
> I am using Flink to write data to elastic
Hi Li,
the statement refers to operators with multiple inputs (two in this case).
With the current implementation you will indeed block one of the inputs
after receiving a checkpoint barrier n until you've received the
corresponding checkpoint barrier n on the other input as well. This is what
we
Hi Justin,
I think this might be a problem in Flink's Kinesis consumer. The Flink
Kinesis consumer uses the aws-java-sdk version 1.10.71 which indeed
contains the afore mentioned methods. However, already version 1.10.46 no
longer contains this method. Thus, I suspect, that Yarn puts some older
ve
Hi all,
I have a question regarding the state checkpoint mechanism in Flink. I find
the statement "Once the last stream has received barrier n, the operator emits
all pending outgoing records, and then emits snapshot n barriers itself” on the
document
https://ci.apache.org/projects/flink/f