Hi Jamie,
Thanks a lot for the response. Appreciate your help.
Regards,
Govind
On Mon, Sep 26, 2016 at 3:26 AM, Jamie Grier wrote:
> Hi Govindarajan,
>
> Typically the way people do this is to create a stream of configuration
> changes and consume this like any other stream. For the specific case of
> filtering for example you may have a data stream and a stream of filters
> that you want to run the data through.
Hello Fabian,
Thanks a lot for the explanation. When the operators (including source / sink)
are chained, what is the method of communication between them?
We use Kafka as the data source, and I was interested to learn the mechanism
of communication from Kafka to the next operator, say flatMap… So
Thanks Stephan. I don't see a "graph" in the JM's "Dashboard" when I click on
the running job. I see a box like below with Parallelism = 512, which is what
I have set as the parallelism degree in my code: options.setParallelism(512);
Does this mean the cluster is now fully running at its max capacity
Hi Buvana,
A TaskManager runs as a single JVM process. A TaskManager provides a
certain number of processing slots. Slots do not guard CPU time, IO, or JVM
memory. At the moment they only isolate managed memory, which is only used
for batch processing. For streaming applications their only purpose
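As a side note, the number of slots a TaskManager offers is configured in
flink-conf.yaml; a minimal sketch (the value 4 is just an example):

taskmanager.numberOfTaskSlots: 4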
Hello,
I would like to understand the following better:
https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/config.html#configuring-taskmanager-processing-slots
Fundamental question: what is the notion of a Task Slot? Does it correspond to
one JVM? Or the Task Manager itself corres
Hi Markus,
Thanks for the stack traces!
The client is indeed stuck in the optimizer. I have to look a bit more into
this.
Did you try to set JoinHints in your plan? That should reduce the plan
space that is enumerated and therefore reduce the optimization time (maybe
enough to run your application
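For illustration, a minimal sketch of setting a JoinHint on the Scala DataSet
API (the inputs and key fields here are made-up placeholders, not Markus's
actual plan):

import org.apache.flink.api.scala._
import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint

object JoinHintExample {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    // Hypothetical inputs, both keyed on tuple field 0.
    val orders = env.fromElements((1, "order-a"), (2, "order-b"))
    val customers = env.fromElements((1, "alice"), (2, "bob"))

    // Fixing the strategy (broadcast the small second input) removes this
    // choice from the plan space the optimizer has to enumerate.
    val joined = orders
      .join(customers, JoinHint.BROADCAST_HASH_SECOND)
      .where(0)
      .equalTo(0)

    joined.print()
  }
}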
I am not sure about that; I will run the pipeline on the cluster and share
the details.
Since the window is a stateful operator, it will store only the key part in
the state backend and not the value, right?
Regards,
Vinay Patil
On Mon, Sep 26, 2016 at 12:13 PM, Stephan Ewen [via Apache Flink User Mail
@vinay - Is it large state that causes the slower checkpoints in your case?
On Mon, Sep 26, 2016 at 6:17 PM, vinay patil wrote:
> Hi,
>
> I am also facing this issue. In my case the data is flowing continuously
> from the Kafka source; when I increase the checkpoint interval to 6,
> the data gets written to the S3 sink.
Thanks, the logs were very helpful!
TL;DR - Committing offsets to ZooKeeper is very slow and prevents
checkpoints from starting properly.
Here is what is happening in detail:
- Between the point when the TaskManager receives the "trigger
checkpoint" message and the point when the KafkaSour
Hi,
I am also facing this issue. In my case the data is flowing continuously
from the Kafka source; when I increase the checkpoint interval to 6,
the data gets written to the S3 sink.
Is it because some operator is taking more time for processing? For example,
in my case I am using a time window of 1 sec.
Hi to all,
I have a use case where I need to tell a Flink cluster to give me a sample
of X records using parametrizable sampling functions. Is there any best
practice or advice to do that?
Should I create a Remote ExecutionEnvironment or should I use the Flink
client (I don't know if it uses REST
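One possible approach, sketched under the assumption that the DataSet
sampling utilities plus a remote environment fit the use case (host, port,
jar path, and the sample size are placeholders):

import org.apache.flink.api.scala._
import org.apache.flink.api.scala.utils._

object SampleJob {
  def main(args: Array[String]): Unit = {
    // Submit directly to a running cluster from the client program.
    val env = ExecutionEnvironment.createRemoteEnvironment(
      "jobmanager-host", 6123, "/path/to/job.jar")
    val data = env.readTextFile("hdfs:///input")
    // Draw X records without replacement (X = 100 here); the sampling
    // parameters can be passed in as program arguments.
    val sample = data.sampleWithSize(withReplacement = false, numSamples = 100)
    sample.print() // executes the job and prints the sample on the client
  }
}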
Hi Fabian,
first of all, sorry for the late answer. The given execution plan was created
after 20 minutes; only one vertex-centric iteration is missing. I can
optimize the program because some operators are only needed to create
intermediate debug results; still, it's not enough to run as one Flink j
Thank you!
On Mon, Sep 26, 2016 at 2:07 PM, Maximilian Michels wrote:
> Hi Luis,
>
> With your feedback I was able to find the problem. I have created an
> issue and a fix is available which will be in Flink 1.1.3 and Flink
> 1.2.0.
>
> Thanks,
>
> Max
>
> [1] https://issues.apache.org/jira/browse/FLINK-4677
Hi Luis,
With your feedback I was able to find the problem. I have created an
issue and a fix is available which will be in Flink 1.1.3 and Flink
1.2.0.
Thanks,
Max
[1] https://issues.apache.org/jira/browse/FLINK-4677
On Tue, Sep 20, 2016 at 2:00 PM, Luis Mariano Guerra wrote:
> On Tue, Sep 2
Hello,
You can't call map on the sink, but instead you can continue from the
stream that you have just before the sink:
val stream = datastream.filter(new Myfilter())
val sink1 = stream.addSink(new Mysink())
val sink2 = stream.map(new MyMap()).addSink(new MySink2())
Best,
Gábor
2016-09-26 12:47 G
Hello,
I am new to Flink, and I want to build a topology with the
streaming API like this:
datastream.filter(new Myfilter()).addSink(new Mysink()).map(new
MyMap()).addSink(new MySink2())
Can Flink support operations like this?
Looking forward to your reply. Thanks.
Hi Govindarajan,
Typically the way people do this is to create a stream of configuration
changes and consume this like any other stream. For the specific case of
filtering for example you may have a data stream and a stream of filters
that you want to run the data through. The typical approach
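To make the pattern concrete, a minimal sketch (the Event and FilterRule
types and the threshold semantics are made up for illustration):

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.util.Collector

case class Event(key: String, value: Long)
case class FilterRule(minValue: Long)

// Keeps the most recently received rule and filters events against it.
class DynamicFilter extends CoFlatMapFunction[Event, FilterRule, Event] {
  private var rule: Option[FilterRule] = None

  // Data side: emit the event unless the current rule rejects it.
  override def flatMap1(event: Event, out: Collector[Event]): Unit =
    if (rule.forall(r => event.value >= r.minValue)) out.collect(event)

  // Configuration side: replace the active rule.
  override def flatMap2(newRule: FilterRule, out: Collector[Event]): Unit =
    rule = Some(newRule)
}

// Usage: events.connect(rules).flatMap(new DynamicFilter)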
Hi Govindarajan,
I've put some answers in-line below.
On Sat, Sep 24, 2016 at 7:32 PM, Govindarajan Srinivasaraghavan <govindragh...@gmail.com> wrote:
> Hi,
>
> I'm working on apache flink for data streaming and I have few questions.
> Any help is greatly appreciated. Thanks.
>
> 1) Are there
Okay, thanks for the update, Robert.
On Mon, Sep 26, 2016 at 3:08 PM, Robert Metzger wrote:
> The RawSchema was once part of Flink's Kafka connector. I've removed it
> because its implementation is trivial and I didn't expect that there are
> many people who need the schema (also, I think I saw people using a map()
> operator after the consumer to deserialize the byte[] into their formats).
The RawSchema was once part of Flink's Kafka connector. I've removed it
because its implementation is trivial and I didn't expect that there are
many people who need the schema (also, I think I saw people using a map()
operator after the consumer to deserialize the byte[] into their formats).
As S
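For anyone who still needs it, a minimal sketch of such a trivial schema
(the class name is made up here, and the package paths are as of Flink 1.1):

import org.apache.flink.api.common.typeinfo.{PrimitiveArrayTypeInfo, TypeInformation}
import org.apache.flink.streaming.util.serialization.DeserializationSchema

// Hands the raw Kafka record bytes through unchanged.
class RawSchema extends DeserializationSchema[Array[Byte]] {
  override def deserialize(message: Array[Byte]): Array[Byte] = message
  // The stream never ends based on an element's content.
  override def isEndOfStream(nextElement: Array[Byte]): Boolean = false
  override def getProducedType: TypeInformation[Array[Byte]] =
    PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO
}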
Hi Robert, may I know your inputs on the same?
Thanks,
Swapnil
On Thu, Sep 22, 2016 at 7:12 PM, Stephan Ewen wrote:
> /cc Robert, he is looking into extending the Kafka Connectors to support
> more of Kafka's direct utilities
>
> On Thu, Sep 22, 2016 at 3:17 PM, Swapnil Chougule wrote:
>
>> It
You do not need to create any JSON.
Just click on "Running Jobs" in the UI, and then on the job. The
parallelism is shown as a number in the boxes of the graph.
On Sat, Sep 24, 2016 at 6:28 PM, amir bahmanyari wrote:
> Thanks Felix.
> Interesting. I tried to create the JSON but it didn't work acc