Robert,
Regarding the event QPS: 4500 events/sec may not be large, no, but I am
seeing some issues in processing the events due to the processing power I
am using. I have deployed the Flink app on a 3-node YARN cluster: one node
is the master, and the 2 slave nodes run the TaskManagers. Each machine is
Given FLINK-3311 & FLINK-3332, I am wondering whether it would be possible,
without idempotent counters in Cassandra, to deliver an exactly-once sink
into Cassandra. I do note that the wording/discussion in FLINK-3332 does
warn the user that this is not strictly an "exactly once" sink.
However my question has to do with
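To make the idempotence point concrete, here is the kind of write I have in
mind (a sketch with the DataStax Java driver against a hypothetical events
table, not the FLINK-3332 sink itself):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class IdempotentUpsertSketch {
    public static void main(String[] args) {
        // Hypothetical schema:
        //   CREATE TABLE events (event_id text PRIMARY KEY, value text);
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("demo");
        // Replaying this exact statement after a failure leaves the row in
        // the same state, so at-least-once delivery looks exactly-once to a
        // reader. A counter update (value = value + 1) replayed twice would
        // double-count, which is the caveat the ticket warns about.
        session.execute(
            "INSERT INTO events (event_id, value) VALUES ('evt-42', 'payload')");
        session.close();
        cluster.close();
    }
}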
I am new to Flink. I am wondering what the best practice is for looking up
additional values for entities embedded in an event.
For example, an event might carry an engine model #. A metadata store has
the typical RPM characteristics for that engine model #, which can be used
in a subsequent processi
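What I had in mind is roughly the following (just a sketch with made-up
names, the input simplified to the engine model # as a String, and the
metadata assumed small enough to load into each parallel task):

import java.util.HashMap;
import java.util.Map;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

public class RpmEnricher extends RichMapFunction<String, Tuple2<String, Double>> {

    private transient Map<String, Double> typicalRpmByModel;

    @Override
    public void open(Configuration parameters) {
        // Load the metadata once per parallel task instance; a real job
        // would query the metadata store here instead.
        typicalRpmByModel = new HashMap<>();
        typicalRpmByModel.put("engine-x100", 2400.0);
    }

    @Override
    public Tuple2<String, Double> map(String engineModel) {
        Double rpm = typicalRpmByModel.getOrDefault(engineModel, Double.NaN);
        return Tuple2.of(engineModel, rpm);
    }
}

If the metadata were too large for that, I assume some kind of join against
a second input would be the alternative.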
answering my own question:
testing a streaming environment should be done with StreamingProgramTestBase
& TestStreamEnvironment, which are present in the test package
of the flink-streaming-java project, so they're not directly available.
Project owners, why not move the above two to flink-test-utils? Or I don't
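In the meantime, a quick smoke test is possible with just the public API
(a rough sketch: a local environment plus a static collecting sink, instead
of the test-base classes):

import java.util.ArrayList;
import java.util.List;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class PipelineSmokeTest {

    // Static so that the sink instances, which are serialized and run in
    // the local mini-cluster's task threads, all write to the same list.
    private static final List<Integer> RESULTS = new ArrayList<>();

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1);

        env.fromElements(1, 2, 3)
                .map(new MapFunction<Integer, Integer>() {
                    @Override
                    public Integer map(Integer value) {
                        return value * 2;
                    }
                })
                .addSink(new SinkFunction<Integer>() {
                    @Override
                    public void invoke(Integer value) {
                        RESULTS.add(value);
                    }
                });

        env.execute("pipeline smoke test");
        System.out.println(RESULTS); // expect [2, 4, 6] with parallelism 1
    }
}

Not as convenient as the test bases, but it has no dependency on
test-scoped artifacts.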
Dear Sir or Madam,
Can you tell me why I have a problem with Elasticsearch in a local cluster?
I analysed this example:
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/elasticsearch2.html
My Flink and Elasticsearch configs are default (I only changed node.name to
"no
Hi
Thanks for the answer. Then how can I measure the performance of Flink?
I want to run my application with both Spark and Flink and measure the
performance, so I can check how fast Flink processes my data compared
to Spark.
Regards
prateek
On Mon, May 9, 2016 at 2:17 AM, Ufuk Cele
Hi,
Regarding 1) Thanks a lot for the ParallelSourceFunction, I completely
missed that I was using a SourceFunction instead.
Regarding 2) The example works and I can see what is happening there; now
when I increase the parallelism I understand the corresponding change as to
how the data is fed ba
Any idea how to handle the following? (The message is clear, but I'm not
sure what I need to do.)
I'm opening a "generic" environment in my code
(StreamExecutionEnvironment.getExecutionEnvironment())
and JavaProgramTestBase configures a TestEnvironment...
so what should I do to support custom tests?
Erro
Hello Malte,
As Simone said, there is currently no Java support for FlinkML, unfortunately.
Regards,
Theodore
On Mon, May 9, 2016 at 3:05 PM, Simone Robutti wrote:
> To my knowledge FlinkML does not support a unified API and most things
> must be used exclusively with Scala DataSets.
>
> 2016-0
Hi Bruce,
you're right, taking the job down and restarting it (from a savepoint) with
the updated software is the only way of doing it. I'm not aware of any work
being done in this area right now, but it is an important topic that we
certainly have to tackle in the not-so-far future.
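The mechanics that exist today look roughly like this on the CLI (job IDs
and paths are placeholders):

bin/flink savepoint <jobID>                       # snapshot the running job
bin/flink cancel <jobID>                          # take the old version down
bin/flink run -s <savepointPath> new-version.jar  # resume from the savepoint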
Cheers,
Aljoscha
This is now solved, thank you. :-)
Thanks and Regards,
Piyush Shrivastava
http://webograffiti.com
On Monday, 9 May 2016 3:47 PM, Piyush Shrivastava
wrote:
Hello all,
I have time-series-based logic written with Flink. Due to the parallelism, I
am not getting the output in a proper se
To my knowledge FlinkML does not support a unified API and most things
must be used exclusively with Scala DataSets.
2016-05-09 13:31 GMT+02:00 Malte Schwarzer :
> Hi folks,
>
> I tried to get the FlinkML SVM running - but it didn't really work. The
> SVM.fit() method requires a DataSet paramete
>- You wrote you'd like to "instantiate an H2O node in every task
manager". This reads a bit like you want to start H2O in the TM's JVM, but
I would assume that an H2O node runs as a separate process. So should it be
started inside the TM JVM or as an external process next to each TM? Also,
do you
Hi folks,
I tried to get the FlinkML SVM running - but it didn't really work. The
SVM.fit() method requires a DataSet parameter, but there is no
such class/interface in Flink Java. Or am I mixing something up with
Scala? Also, I couldn't find a FlinkML example for Java (there is only
Scala).
Is
Hi Simone,
sorry for the delayed answer. I have a few questions regarding your
requirements and some ideas that might be helpful (depending on the
requirements).
1) Starting / stopping of H2O nodes from Flink
- You wrote you'd like to "instantiate an H2O node in every task manager".
This reads
Hello all,
I have time-series-based logic written with Flink. Due to the parallelism, I
am not getting the output in a proper series. For example,
3> (12:00:00, "value")
appears before
1> (11:59:00, "value")
while the timestamp of the latter is smaller than that of the former. I am
using TimeWindow and
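What I am experimenting with now is event time plus watermarks, roughly
like this (a sketch over a made-up Tuple2<String, Long> stream, where f1 is
the event timestamp):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<Tuple2<String, Long>> events = env.fromElements(
                Tuple2.of("value", 1462788540000L),   // earlier event (e.g. 11:59:00)
                Tuple2.of("value", 1462788600000L));  // one minute later (e.g. 12:00:00)

        events
            .assignTimestampsAndWatermarks(
                new AssignerWithPeriodicWatermarks<Tuple2<String, Long>>() {
                    private long maxTs = Long.MIN_VALUE;

                    @Override
                    public long extractTimestamp(
                            Tuple2<String, Long> element, long previousTs) {
                        maxTs = Math.max(maxTs, element.f1);
                        return element.f1;
                    }

                    @Override
                    public Watermark getCurrentWatermark() {
                        // No lateness allowed in this sketch.
                        return new Watermark(maxTs);
                    }
                })
            .keyBy(0)
            .timeWindow(Time.minutes(1))
            .maxBy(1)
            .print();

        env.execute("event time sketch");
    }
}

With event time, the windows are then ordered by the element timestamps
rather than by whichever parallel subtask prints first.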
Perfect👍
On Mon, May 9, 2016 at 3:12 PM, Ufuk Celebi wrote:
> Yes, I did just that and I used the relevant Flink terminology instead
> of #cores and #machines:
>
> #cores => #slots per TM
> #machines => #TMs
>
> On Mon, May 9, 2016 at 11:33 AM, Punit Naik
> wrote:
> > Yeah, thanks a lot for tha
I do not know if I understand completely, but I would create a new DataSet
by filtering on the condition and then persist that DataSet.
So:
DataSet ds2 = dataSet1.filter(condition);
ds2.output(...);
On Mon, May 9, 2016 at 11:09 AM, Ufuk Celebi wrote:
> Flink has support for spillable in
Hi,
regarding 1) the source needs to implement the ParallelSourceFunction or
RichParallelSourceFunction interface to allow it to have a higher
parallelism than 1.
regarding 2) I wrote a small example that showcases how to do it:
final StreamExecutionEnvironment env =
StreamExecutionEnvironment.g
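For 1), a minimal parallel source looks roughly like this (a sketch; each
of the parallel instances runs its own run() loop):

import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

public class CountingSource extends RichParallelSourceFunction<Long> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Long> ctx) throws Exception {
        // Offset and stride so each parallel instance emits distinct values.
        long next = getRuntimeContext().getIndexOfThisSubtask();
        long stride = getRuntimeContext().getNumberOfParallelSubtasks();
        while (running) {
            // Hold the checkpoint lock while emitting so that records and
            // checkpoints do not interleave.
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(next);
                next += stride;
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

You would then use env.addSource(new CountingSource()).setParallelism(4).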
Hey Nirmalya,
this is not your fault, but a shortcoming of the docs. ;) I've created
an issue for it here: https://issues.apache.org/jira/browse/FLINK-3884
On Mon, May 2, 2016 at 8:01 PM, nsengupta wrote:
> Fantastic! Many thanks for clarifying, Aljoscha!
>
> I blindly followed what that page sa
Hey John!
It looks like the task managers are not picking up the correct
configuration. Can you please verify that all nodes (JobManager and
TaskManagers) use the same configuration?
The task managers use ZooKeeper to look up the JobManager and not the
configuration.
From the docs
(https://ci
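For reference, the HA entries in flink-conf.yaml (1.0.x key names, from
memory, with placeholder hostnames) have to be identical on every node:

recovery.mode: zookeeper
recovery.zookeeper.quorum: zkhost1:2181,zkhost2:2181,zkhost3:2181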
Yes, I did just that and I used the relevant Flink terminology instead
of #cores and #machines:
#cores => #slots per TM
#machines => #TMs
On Mon, May 9, 2016 at 11:33 AM, Punit Naik wrote:
> Yeah, thanks a lot for that. Also if you could, please write the formula,
> #cores^2 * #machines * 4,
On Thu, May 5, 2016 at 1:59 AM, Bajaj, Abhinav
wrote:
> Or can we resume a stopped streaming job?
You can use savepoints [1] to take a snapshot of a streaming program from
which you can restart the job at a later point in time. This is independent
of whether you cancel or stop the program afte
Yeah, thanks a lot for that. Also, if you could, please write the formula,
#cores^2 * #machines * 4, in a different form so that it's more
readable and understandable.
On 09-May-2016 2:54 PM, "Ufuk Celebi" wrote:
> On Mon, May 9, 2016 at 11:05 AM, Punit Naik
> wrote:
> > Thanks for the deta
On Mon, May 9, 2016 at 11:05 AM, Punit Naik wrote:
> Thanks for the detailed answer. I will definitely try this and get back to
> you.
OK, looking forward to it. ;)
In the meantime I've updated the docs with a more concise version of
what to do when you see this exception.
Hey Prateek,
On Fri, May 6, 2016 at 6:40 PM, prateekarora wrote:
> I have the below information from Spark. Can I get similar information
> from Flink as well? If yes, then how can I get that?
You can get GC time via the task manager overview.
The other metrics don't necessarily translate to Flink
Flink has support for spillable intermediate results. Currently they
are only used when necessary to avoid pipeline deadlocks.
You can force this via
env.getConfig().setExecutionMode(ExecutionMode.BATCH);
This will write shuffles to disk, but you don't get the fine-grained
control you probably need
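A toy example to make that concrete (a sketch; the shuffle in front of the
grouping is then written to disk):

import org.apache.flink.api.common.ExecutionMode;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class BatchShuffleSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // Write shuffles to disk instead of pipelining them.
        env.getConfig().setExecutionMode(ExecutionMode.BATCH);

        env.fromElements(
                Tuple2.of("a", 1), Tuple2.of("b", 1), Tuple2.of("a", 1))
           .groupBy(0)   // the shuffle in front of this is now materialized
           .sum(1)
           .print();     // prints (a,2) and (b,1); print() triggers execution
    }
}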
Hello Flink team,
I am currently playing around with Storm and Flink in the context of a smart
home. The primary functional requirement is to react quickly to certain
properties in stream tuples.
I was looking at some benchmarks of the two systems, and generally Flink
has the upper hand, in bo
Hi Ufuk,
Thanks for the detailed answer. I will definitely try this and get back to
you.
On 09-May-2016 2:08 PM, "Ufuk Celebi" wrote:
> Hey Punit,
>
> you need to give the task managers more network buffers as Robert
> suggested. Using the formula from the docs, can you please use 147456
> (96^2*
Hi,
I think it should (theoretically) work. You would have to provide a custom
serializer that can serialize/deserialize your different window subclasses.
Also, you will probably have to provide a Trigger that can deal with the
different types of windows.
Cheers,
Aljoscha
On Mon, 9 May 2016 at 05
Hey Punit,
you need to give the task managers more network buffers, as Robert
suggested. Using the formula from the docs, can you please use 147456
(96^2 * 4 * 4) for the number of network buffers? Each buffer is 32 KB,
meaning that you give 4.5 GB of memory to the network stack. You might
have to adju
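Spelled out with those numbers (reading 96 as #slots per TM and 4 as #TMs):

96^2 * 4 * 4 = 9216 * 16 = 147456 buffers
147456 buffers * 32 KB/buffer ≈ 4.5 GB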
Robert, what do you think about adding a note about this to the Kafka
consumer docs? This has come up a couple of times on the mailing list
already.
– Ufuk
On Fri, May 6, 2016 at 12:07 PM, Balaji Rajagopalan
wrote:
> Thanks Robert appreciate your help.
>
> On Fri, May 6, 2016 at 3:07 PM, Robert
Hey Dominique!
Are you running the job in HA mode?
– Ufuk
On Thu, May 5, 2016 at 1:49 PM, Robert Metzger wrote:
> Hi Dominic,
> I'm sorry that you ran into this issue.
> What do you mean by "flink streaming routes" ?
>
> Regarding the second question: "Now I want to restart these routes to
> co