The new default is equivalent to the previous "streaming mode". The
community decided to remove this distinction because it was confusing
to users.
The difference between "streaming mode" and "batch mode" was how
Flink's managed memory was allocated: either lazily when required
('streaming mode') or eagerly up front at startup ('batch mode').
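For illustration, a minimal sketch of how the old eager behaviour can
still be requested in 1.0 (assuming the 'taskmanager.memory.preallocate'
configuration key; treat the key name as an assumption):

  import org.apache.flink.configuration.Configuration

  val conf = new Configuration()
  // true  = pre-allocate managed memory at startup (old "batch mode")
  // false = allocate lazily on demand (old "streaming mode", the new default)
  conf.setBoolean("taskmanager.memory.preallocate", true)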
Hi Fabian,
Previously, when using Flink 0.9-0.10, we started the cluster in either
streaming mode or batch mode. I see that this distinction is gone in the
Flink 1.0 snapshot? So this is now taken care of and optimized by the
runtime?
On Mon, Feb 22, 2016 at 5:26 PM, Fabian Hueske wrote:
> Hi Welly
Hi Fabian,
Thanks a lot for your response.
- How many task managers do you start? I assume more than one TM per
machine given that you assign only 4GB of memory out of 128GB to each TM.
Currently we start one TM per machine with 48 task slots.
- What is the maximum parallelism of your job?
Hello,
I'm not sure if this is related, but we recently started seeing this when
using `1.0-SNAPSHOT` in the `snapshots` repository:
[error] Modules were resolved with conflicting cross-version suffixes
in {file:/home/ubuntu/bt/}flinkproject:
[error] org.apache.kafka:kafka _2.10, _2.11
java.lang.
Hi Shikhar,
you're right that including a connector dependency would have let us spot
the problem earlier. In fact, any project building a fat jar with SBT would
have failed without setting the flink dependencies to provided.
The problem is that the template is a general-purpose template. Thus, i
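For illustration, a minimal build.sbt fragment along these lines (a
sketch; the module names and version are examples, not the template's
actual contents):

  libraryDependencies ++= Seq(
    // core Flink dependencies are already on the cluster's classpath,
    // so they must not be bundled into the fat jar
    "org.apache.flink" %% "flink-scala" % "1.0-SNAPSHOT" % "provided",
    // connector dependencies are NOT on the cluster's classpath, so
    // they stay in compile scope and end up in the fat jar
    "org.apache.flink" %% "flink-connector-kafka" % "1.0-SNAPSHOT"
  )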
Hi Till,
Thanks so much for sorting this out!
One suggestion: can the Flink template depend on a connector (Kafka?) --
this would verify that assembly works smoothly for the very common use
case where you need to include connector JARs.
Cheers,
Shikhar
At the moment, the system can only deal with lost slots (nodes) if either
there are some excess slots which have not been used before or if the dead
node is restarted. The latter is the case for YARN applications, for
example: there the application master will restart containers which have
died.
I
Thank you, Till!
Does the current (in-progress) implementation also consider the problem
of losing the task slots of the failed node(s), something related to
[2]?
[2] https://issues.apache.org/jira/browse/FLINK-3047
Best,
Ovidiu
> On 22 Feb 2016, at 18:13, Till Rohrmann wrote:
>
Hi Ovidiu,
at the moment Flink's batch fault tolerance restarts the whole job in case
of a failure. However, parts of the logic needed for partial backtracking,
such as intermediate result partitions and the backtracking algorithm, are
already implemented or exist as a PR [1]. So we hope to complete the
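As a quick sketch of what today's whole-job restart looks like from the
API side (the retry count is arbitrary; this uses the ExecutionConfig
retry setting):

  import org.apache.flink.api.scala._

  val env = ExecutionEnvironment.getExecutionEnvironment
  // on a failure the whole job is re-executed from the data sources;
  // without retries a failure simply fails the job
  env.getConfig.setNumberOfExecutionRetries(3)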
Hi
In case of failure of a node, what does 'Fault tolerance for programs in
the DataSet API works by retrying failed executions' [1] mean?
- work already done by the rest of the nodes is not lost, only the work
of the lost node is recomputed, and job execution continues
or
- the entire job execution is
Hi Shikhar,
I just wanted to let you know that we've found the problem with the failing
assembly plugin. It was caused by incompatible classes [1, 2]. Once these
PRs are merged, the merge problems should be resolved.
By the way, we've now also added an SBT template for Flink projects using
giter8
Hi Welly,
I have to correct the formula I posted before:
taskmanager.network.numberOfBuffers: p^2 * t * 4
p is NOT the parallelism of the job, BUT the number of slots per task
manager.
So if you configure one TM for each machine with 48 slots, you get:
48^2 * 16 * 4 = 147,456 buffers, with 3
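Spelled out as arithmetic (a quick sketch; t is taken here to be the
number of task managers, i.e. 16 machines in the quoted setup):

  // p = slots per task manager, t = number of task managers
  val p = 48
  val t = 16
  val buffers = p * p * t * 4 // = 147456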
Hi Welly,
sorry for the late response.
The number of network buffers primarily depends on the maximum parallelism
of your job.
The given formula assumes a specific cluster configuration (1 task manager
per machine, one parallel task per CPU).
The formula can be translated to:
taskmanager.network
Note that the method to call in the example should be `conf.setInteger`,
and the second argument should be an int, not a String.
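For illustration, a minimal sketch of the corrected call (the config key
used here is just an example):

  import org.apache.flink.configuration.Configuration

  val conf = new Configuration()
  // the second argument must be an int, not a String such as "4"
  conf.setInteger("taskmanager.numberOfTaskSlots", 4)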
On Sun, Feb 21, 2016 at 1:41 PM, Márton Balassi
wrote:
> Dear Ana,
>
> If you are using a single machine with multiple cores, but need convenient
> access to the configuration
Hi,
Following up on the earlier example: you have to aggregate the data from
2 partitions (e.g. in one you count objects of type 1 and in the other
objects of type 2). Then you need to pair the two counts together, emit
that at iteration N you had (object T1 - Y, object T2 - X), and finally
evict th
Hi Robert
I just checked the settings on my task managers (they were configured
separately); they were misconfigured.
My job now runs correctly after I reconfigured them.
Thanks!
Andrew
> On 22 Feb 2016, at 09:41, Robert Metzger wrote:
>
> Hi,
>
> how is your cluster setup? Do you have multiple ma
Hi Thomas,
To avoid having jobs forever restarting, you have to cancel them manually
(from the web interface or the /bin/flink client).
Also, you can set an appropriate restart strategy (in 1.0-SNAPSHOT), which
limits the number of retries. This way the retrying will eventually stop.
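For example, a minimal sketch using the fixed-delay strategy (assuming
the RestartStrategies API from 1.0-SNAPSHOT; the numbers are arbitrary):

  import org.apache.flink.api.common.restartstrategy.RestartStrategies
  import org.apache.flink.streaming.api.scala._

  val env = StreamExecutionEnvironment.getExecutionEnvironment
  // retry at most 3 times, waiting 10 s between attempts, then let the
  // job fail for good
  env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 10000))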
On Fri, Feb
Hi,
how is your cluster setup? Do you have multiple machines, or only one?
Did you copy the configuration to all machines?
On Fri, Feb 19, 2016 at 6:08 PM, Andrew Ge Wu
wrote:
> Hi All,
>
> I have been experiencing an error stopping my HA standalone setup.
>
> The cluster starts up just fine, b