Hi Till,
Thank you for the reply; I have posted some logs with the initial email chain.
I think the issue has more to do with the Docker private registry when
authorization is involved. I can run the Docker JobManager and TaskManager
containers as separate Marathon tasks and connect via the RPC port. I wa
Sorry, I neglected to include the stack trace for JMX failing to instantiate
from a TaskManager:
2017-08-05 00:59:09,388 INFO
org.apache.flink.runtime.metrics.MetricRegistry -
Configuring JMXReporter with {port=8789,
class=org.apache.flink.metrics.jmx.JMXReporter}.
2017-08-05 00:59:09,4
I searched in Flink (and HBase) for GeneratedMessageV3 but didn't find any
reference.
Which version of protobuf did you use to generate the class?
Please copy user@ in the future so that more people can help.
On Fri, Aug 4, 2017 at 8:27 AM, Sridhar Chellappa
wrote:
> public final class Custom
Hi, I'm running Flink JobManagers/TaskManagers on YARN. I've turned on
the JMX reporter in my flink-conf.yaml as follows:
metrics.reporters: jmx
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
I was wondering:
Is there a JMX server with the aggregated stats across all jo
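A note on the config above: the JMX reporter also accepts an optional port
setting, e.g. a range so that several TaskManagers on one host don't collide.
A minimal sketch, assuming ports 8789-8799 are free on each machine:

    metrics.reporters: jmx
    metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
    metrics.reporter.jmx.port: 8789-8799

As far as I know, each JobManager/TaskManager JVM then starts its own JMX
server on a free port from that range; there is no single aggregated endpoint.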
Thanks Fabian. I do have one more question.
When we connect the two streams and perform a co-process function, there are
two separate methods, one for each stream. Which stream's state do we need to
store, and will the co-process function trigger automatically once the other
stream's data arrives, or should we set some tim
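For reference, a minimal sketch of such a CoProcessFunction, with hypothetical
Stat, Count and Alert types and assuming both streams are keyed the same way:

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
    import org.apache.flink.util.Collector;

    public class CombineOnArrival extends CoProcessFunction<Stat, Count, Alert> {
        // one value of keyed state per input stream
        private transient ValueState<Stat> latestStat;
        private transient ValueState<Count> latestCount;

        @Override
        public void open(Configuration parameters) {
            latestStat = getRuntimeContext().getState(
                new ValueStateDescriptor<>("latestStat", Stat.class));
            latestCount = getRuntimeContext().getState(
                new ValueStateDescriptor<>("latestCount", Count.class));
        }

        @Override
        public void processElement1(Stat stat, Context ctx, Collector<Alert> out)
                throws Exception {
            latestStat.update(stat);
            Count count = latestCount.value();
            if (count != null) {
                // the other stream's record has already arrived: emit immediately
                out.collect(new Alert(stat, count));
            }
        }

        @Override
        public void processElement2(Count count, Context ctx, Collector<Alert> out)
                throws Exception {
            latestCount.update(count);
            Stat stat = latestStat.value();
            if (stat != null) {
                out.collect(new Alert(stat, count));
            }
        }
    }

Note that nothing fires on its own: processElement1/processElement2 are only
called when a record arrives on the respective input, so reacting to a quiet
stream would require registering a timer via ctx.timerService().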
TimeWindow.getStart() or TimeWindow.getEnd()
->
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html#incremental-window-aggregation-with-reducefunction
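As a sketch of that pattern, assuming a keyed stream `counts` of
Tuple2<String, Long> records: the ReduceFunction aggregates incrementally,
and the WindowFunction only attaches window.getStart() to the result.

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.tuple.Tuple;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    DataStream<Tuple2<Long, Long>> withWindowStart = counts
        .keyBy(0)
        .timeWindow(Time.minutes(15))
        .apply(
            new ReduceFunction<Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
                                                   Tuple2<String, Long> b) {
                    return new Tuple2<>(a.f0, a.f1 + b.f1); // incremental aggregation
                }
            },
            new WindowFunction<Tuple2<String, Long>, Tuple2<Long, Long>, Tuple, TimeWindow>() {
                @Override
                public void apply(Tuple key, TimeWindow window,
                                  Iterable<Tuple2<String, Long>> reduced,
                                  Collector<Tuple2<Long, Long>> out) {
                    // stamp the single pre-aggregated result with the window start
                    out.collect(new Tuple2<>(window.getStart(),
                                             reduced.iterator().next().f1));
                }
            });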
2017-08-04 22:43 GMT+02:00 Raj Kumar :
> Thanks Fabian.
>
> The incoming events have the timestamps. Once I aggregate in
Thanks Fabian.
The incoming events have timestamps. Once I aggregate in the first
stream to get counts and calculate the mean/standard deviation in the second,
should the new timestamps be the window start time? How do I tackle this issue?
Hi Robert,
That's right.
The counts are on a per-operator level. I think you can get down to the
task level, but counts per bucket are not tracked.
Maybe Chesnay (in CC) can help here. He knows the metrics system the best.
@Chesnay, is there a way to expire metric counters?
Alternatively, you cou
Hi Maxim,
you could inject an AssignerWithPunctuatedWatermarks into your plan which
emits a watermark for every record it sees.
That way you can increment the logical time for every record.
Best, Fabian
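A minimal sketch of such an assigner, assuming a hypothetical MyEvent type
that carries its own timestamp:

    import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    public class PerRecordWatermarks
            implements AssignerWithPunctuatedWatermarks<MyEvent> {

        @Override
        public long extractTimestamp(MyEvent element, long previousElementTimestamp) {
            return element.getTimestamp(); // event time carried in the record
        }

        @Override
        public Watermark checkAndGetNextWatermark(MyEvent lastElement,
                                                  long extractedTimestamp) {
            // emit a watermark for every record, advancing logical time per record
            return new Watermark(extractedTimestamp);
        }
    }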
2017-08-04 16:27 GMT+02:00 Maksym Parkachov :
> Hi,
>
> I'm evaluating Flink as alternative
The directory referred to by `blob.storage.directory` is best described as
a local cache. For recovery purposes the JARs are also stored in
`high-availability.storageDir`. At least that's my reading of the code in
1.2. Maybe there's some YARN-specific behavior too, sorry if this
information
Hi Raj,
you have to combine two streams. The first stream has the running avg +
std-dev over the last 6 hours, the second stream has the 15 minute counts.
Both streams emit one record every 15 minutes. What you want to do is to
join the two records of both streams with the same timestamp.
You do th
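One possible shape for that join, as a sketch with hypothetical Stat, Count
and Alert types where each record carries the start timestamp of its
15-minute window:

    import org.apache.flink.api.common.functions.JoinFunction;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    DataStream<Alert> alerts = statStream
        .join(countStream)
        .where(new KeySelector<Stat, Long>() {
            @Override
            public Long getKey(Stat s) { return s.windowStart; } // same key both sides
        })
        .equalTo(new KeySelector<Count, Long>() {
            @Override
            public Long getKey(Count c) { return c.windowStart; }
        })
        .window(TumblingEventTimeWindows.of(Time.minutes(15)))
        .apply(new JoinFunction<Stat, Count, Alert>() {
            @Override
            public Alert join(Stat stat, Count count) {
                return new Alert(stat, count); // one joined record per interval
            }
        });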
Hi Peter,
function objects (such as an instance of a class that implements MapFunction)
that are used to construct a plan are serialized using Java serialization
and shipped to the workers for execution.
Therefore, function classes must be Serializable. In general it is
recommended to configure funct
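For example, a sketch of that recommended style, with a made-up threshold
parameter passed in via the constructor so the function object stays cleanly
serializable:

    import org.apache.flink.api.common.functions.MapFunction;

    public class ThresholdMapper implements MapFunction<Long, Boolean> {
        // primitive field, so the function is trivially serializable
        private final long threshold;

        public ThresholdMapper(long threshold) {
            this.threshold = threshold;
        }

        @Override
        public Boolean map(Long value) {
            return value > threshold;
        }
    }

It would then be used as stream.map(new ThresholdMapper(42)).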
Stephan,
Regarding your last reply to
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/blob-store-defaults-to-tmp-and-files-get-deleted-td11720.html
You mention "Flink (via the user code class loader) actually holds a reference
to the JAR files in "/tmp", so even if "/tmp" ge
We're using an AssignerWithPunctuatedWatermarks that extracts a timestamp from
the data. It keeps and returns the highest timestamp it has ever seen and
returns a new Watermark every time the value grows.
I know it's bad for performance, but for the moment it's not the issue, I want
the most poss
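Reconstructed from that description (not the actual code, and with a
hypothetical MyEvent type), the assigner presumably looks roughly like this:

    import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    public class MaxSeenAssigner implements AssignerWithPunctuatedWatermarks<MyEvent> {
        private long maxTimestamp = Long.MIN_VALUE;
        private long lastWatermark = Long.MIN_VALUE;

        @Override
        public long extractTimestamp(MyEvent element, long previousElementTimestamp) {
            // keep and return the highest timestamp ever seen
            maxTimestamp = Math.max(maxTimestamp, element.getTimestamp());
            return maxTimestamp;
        }

        @Override
        public Watermark checkAndGetNextWatermark(MyEvent lastElement,
                                                  long extractedTimestamp) {
            if (extractedTimestamp > lastWatermark) {
                lastWatermark = extractedTimestamp;
                return new Watermark(lastWatermark); // emit only when the maximum grows
            }
            return null; // null means: no new watermark for this record
        }
    }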
Can you show how CustomerMessage is defined?
Thanks
On Fri, Aug 4, 2017 at 7:22 AM, Sridhar Chellappa
wrote:
> Folks,
>
> I wrote a custom data source to test my CEP logic. The custom data source
> looks like:
>
> public class CustomerDataSource extends RichParallelSourceFunction {
> priv
Hello,
Have you seen these two blog posts? They explain the relationship
between Apache Flink, Apache Beam, and Google Cloud Dataflow.
https://data-artisans.com/blog/why-apache-beam
https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
Best,
Gábor
On Mon, Jul 31,
Hi,
I'm evaluating Flink as an alternative to Spark Streaming for a test project
reading from Kafka and saving to Cassandra. Everything works, but I'm
struggling with integration tests. I could not figure out how to manually
advance time in Flink. Basically, I write a message to Kafka with an event
time in the
Folks,
I wrote a custom data source to test my CEP logic. The custom data source
looks like:
public class CustomerDataSource extends RichParallelSourceFunction {
    private boolean running = true;
    private final Random random;

    public CustomerDataSource() {
        this.random = new Rand
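For anyone reading along, a runnable skeleton of such a source could look as
follows; the run/cancel bodies and the CustomerMessage constructor are
guesses, not the poster's actual code:

    import java.util.Random;
    import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

    public class CustomerDataSource extends RichParallelSourceFunction<CustomerMessage> {
        private volatile boolean running = true;
        private final Random random;

        public CustomerDataSource() {
            this.random = new Random();
        }

        @Override
        public void run(SourceContext<CustomerMessage> ctx) throws Exception {
            while (running) {
                // emit a synthetic message per loop iteration
                ctx.collect(new CustomerMessage(random.nextLong()));
                Thread.sleep(100);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }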
Hi,
Is it possible to synchronize two Kafka sources so that they consume from
different Kafka topics at close-enough event times?
My use case: I have two Kafka topics, A (very large) and B (large). There
is a one-to-one-or-zero mapping between A and B. The topology simply joins
A and B in a tum
Hi everyone,
I always get a FileNotFoundException when following the Kubernetes setup guide
[1].
I moved my jar and my input file onto the JobManager pod.
After that I attach to the JobManager pod using: kubectl exec -it --
/bin/bash
With ls I can see both files. The WordCount example worked well.
Hi,
Could you please provide a snippet of code or some minimal example that would
help us reproduce your problem?
Best,
Aljoscha
> On 3. Aug 2017, at 17:41, aitozi wrote:
>
>
> Hi,
>
> I have encountered a problem: I generate and assign watermarks on the
> datastream, and then keyBy, a
Hi,
How are you defining the watermark, i.e. what kind of watermark extractor are
you using?
Best,
Aljoscha
> On 3. Aug 2017, at 17:45, Gwenhael Pasquiers
> wrote:
>
> We're not using a Window but a more basic ProcessFunction to handle sessions.
> We made this choice because we have to hand
Hi Biswajit,
are there any Mesos logs which might help us pinpointing the problem? I've
actually never run Flink on Mesos with Docker images. But it could be that
Flink does not set things up properly for running Docker images. I'll try
to run Flink based on Docker images over the weekend in order
Hi Konstantin,
If you can at all wait, I would suggest to skip updating to 1.3.1 and go
directly to (the not yet released) 1.3.2. Flink 1.3.0 and 1.3.1 had a few
critical bugs that are not fixed there. Most notably, there was a problem in the
Kafka consumer that could lead to state corruption/data du
Hi Konstantin,
I just checked the code and the configuration option is still there and should
be working. Somehow, the backport for the 1.2 release branch did contain the
documentation while the actual commit on master did not.
Thanks for the info, let me create a hotfix to fix that.
Nico
On T
Hi,
if the question is, if there are certain requirements for the filesystem that
you use with the state backends, then I think there might be a small
misconception. Currently, all state backends in Flink operate local to the
task, i.e. either in memory (e.g. FsStateBackend) or also on the loc
Hi,
in Flink 1.2.x the restore will not succeed because it was mapping states on a
task level, not at the operator level. This makes it impossible to add stateful
operators somewhere in an operator chain, because Flink could not figure out
which state belongs to which operator after such a modi
Hi Björn,
You are correct that the CEP library buffers events until a watermark with a
greater timestamp arrives. This is because the order of events is crucial for
CEP.
Imagine a pattern like "A next B" and the sequence a(t=1) c(t=10) b(t=2). If we
do not wait for the Watermark and sort the even
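For illustration, such an "A next B" pattern is written roughly like this in
the CEP API, assuming a hypothetical Event type with a getName() accessor:

    import org.apache.flink.cep.pattern.Pattern;
    import org.apache.flink.cep.pattern.conditions.SimpleCondition;

    Pattern<Event, ?> pattern = Pattern.<Event>begin("A")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                return "a".equals(event.getName());
            }
        })
        .next("B") // strict contiguity: no other event may occur between A and B
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                return "b".equals(event.getName());
            }
        });

Sorted by timestamp the sequence becomes a(t=1) b(t=2) c(t=10) and the strict
`next` contiguity matches; in arrival order a, c, b it would wrongly fail,
which is why CEP buffers until the watermark.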
Hi,
I am writing a small application which monitors a couple of directories for
files which are read by Kafka and later consumed by Flink. Flink then
performs some operations on the records (such as extracting the embedded
timestamp) and tries to find a pattern using CEP. Since the data can be out