Yes, the task manager continues running. I have put together a test app to
demonstrate the problem, and in doing so noticed some oddities. The problem
manifests itself on a simple join (I originally believed it was the
distinct; I was wrong).
- When the source is generated via fromCollection()
Hi Flinksters,
just a minor note:
http://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html
the second code sample should read
./bin/flink run -m yarn-cluster -yn 4 -yjm 1024 -ytm 4096
./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar
instead of:
./bin/flink -m yarn-cluster
Dear Hawin,
We do not have out of the box support for that, it is something you would
need to implement yourself in a custom SinkFunction.
Best,
Marton
On Mon, Jun 22, 2015 at 11:51 PM, Hawin Jiang wrote:
> Hi Marton
>
> if we receive a huge amount of data from Kafka and write it to HDFS immediately. W
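Marton's suggestion above (writing your own SinkFunction) can be sketched conceptually. The interface below is a simplified stand-in for Flink's SinkFunction so that the sketch is self-contained; BufferingHdfsSink, its batchSize parameter, and the flushed list are all hypothetical, standing in for real HDFS writes:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Flink's SinkFunction interface (the real one lives in
// org.apache.flink.streaming.api.functions.sink and is called once per record).
interface SinkFunction<IN> {
    void invoke(IN value);
}

// Hypothetical sketch of a batching sink: instead of writing every record to
// HDFS immediately, buffer records and flush them in batches. The 'flushed'
// list is a placeholder for real HDFS writes.
class BufferingHdfsSink implements SinkFunction<String> {
    private final int batchSize;
    final List<String> buffer = new ArrayList<>();
    final List<List<String>> flushed = new ArrayList<>();

    BufferingHdfsSink(int batchSize) {
        this.batchSize = batchSize;
    }

    @Override
    public void invoke(String value) {
        buffer.add(value);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    void flush() {
        if (!buffer.isEmpty()) {
            flushed.add(new ArrayList<>(buffer)); // real code would write a file here
            buffer.clear();
        }
    }
}
```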
Dear Hawin,
Sorry, I have managed to link to a pom that has been changed in the
meantime. But we have added a section to our doc clarifying your question.
[1] Since then Stephan has proposed an even nicer solution that did not
make it into the doc yet, namely if you start from our quickstart pom a
I think it is a valid concern that the POM should be written such that the
given command is valid.
It should be only a simple change to the root POM: make sure the Hadoop 2
profile is activated not only in the absence of a profile selection, but
also when the selected profile is "2".
Stephan
On T
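Stephan's suggested change could look roughly like the following. This is a hypothetical sketch, not the actual Flink root POM: a Maven profile takes only a single activation block and its conditions cannot be OR-ed, so one way to cover both "no profile given" and "-Dhadoop.profile=2" is two profiles sharing the same Hadoop 2 settings:

```xml
<!-- Hypothetical sketch: two profiles so the Hadoop 2 settings apply both
     when hadoop.profile is absent and when it is explicitly set to "2". -->
<profiles>
  <profile>
    <id>hadoop-2-default</id>
    <activation>
      <property><name>!hadoop.profile</name></property>
    </activation>
    <!-- Hadoop 2 dependencies go here -->
  </profile>
  <profile>
    <id>hadoop-2-explicit</id>
    <activation>
      <property><name>hadoop.profile</name><value>2</value></property>
    </activation>
    <!-- same Hadoop 2 dependencies -->
  </profile>
</profiles>
```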
Thanks! I'm still used to 0.7 :)
Cheers,
Max
On Tue, Jun 23, 2015 at 2:18 PM, Maximilian Michels wrote:
> Hi Max!
>
> Nowadays, the default target when building from source is Hadoop 2. So a
> simple mvn clean package -DskipTests should do it. You only need the flag
> when you build for Hadoop 1: -Dhadoop.profile=1.
On 23 Jun 2015, at 13:53, Stephan Ewen wrote:
> Currently, Flink does not cache anything across runs, except JAR files on the
> workers.
>
> The reason the first run is slower may be:
> - Because in the first run, code is distributed in the cluster. In
> subsequent runs, the JAR files need not be redistributed.
Hi Max!
Nowadays, the default target when building from source is Hadoop 2. So a
simple mvn clean package -DskipTests should do it. You only need the flag
when you build for Hadoop 1: -Dhadoop.profile=1.
Cheers,
The other Max
On Tue, Jun 23, 2015 at 2:03 PM, Maximilian Alber <
alber.maximil...@g
Hi Flinksters,
I just tried to build the current YARN version of Flink. The second error
is probably because my Maven version is older, but the first one seems to
be a real error.
albermax@hadoop1:~/bumpboost/working/programs/flink/incubator-flink$ mvn
clean package -DskipTests -Dhadoop.profile=
Currently, Flink does not cache anything across runs, except JAR files on
the workers.
The reason the first run is slower may be:
- Because in the first run, code is distributed in the cluster. In
subsequent runs, the JAR files need not be redistributed.
- Because the JIT takes a bit to kick in
Thank you!
Still, I cannot guarantee the size of each partition, or can I?
Something like randomSplit in Spark.
Cheers,
Max
On Mon, Jun 15, 2015 at 5:46 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Hi,
>
> using partitionCustom, the data distribution depends only on your
> proba
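Max's concern can be illustrated in plain Java (this is not the Flink API): a probability-based partitioner, such as one you might pass to partitionCustom, controls partition sizes only in expectation, not exactly. RandomSplit and its parameters are made up for this sketch:

```java
import java.util.Random;

// Illustration: assign each record to partition 0 with probability p,
// otherwise partition 1. The resulting sizes are only approximately
// p * numRecords and (1 - p) * numRecords; exact sizes are not guaranteed.
class RandomSplit {
    // Returns counts[0] = records sent to partition 0, counts[1] = partition 1.
    static int[] split(int numRecords, double p, long seed) {
        Random rnd = new Random(seed);
        int[] counts = new int[2];
        for (int i = 0; i < numRecords; i++) {
            counts[rnd.nextDouble() < p ? 0 : 1]++;
        }
        return counts;
    }
}
```

With 1000 records and p = 0.7, partition 0 receives roughly, but rarely exactly, 700 records; guaranteeing exact sizes would require counting records up front rather than deciding independently per record.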
No clue. I used the current branch aka 0.9-SNAPSHOT.
Or is this something related to Scala?
On Mon, Jun 22, 2015 at 4:45 PM, Stephan Ewen wrote:
> Actually, the closure cleaner is supposed to take care of the "anonymous
> inner class" situation.
>
> Did you deactivate that one, by any chance?
>
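The "anonymous inner class" situation Stephan mentions can be shown without Flink at all. In the plain-Java sketch below (all class names hypothetical), an anonymous inner class that touches enclosing state holds a hidden reference to its outer instance, so serializing it fails when the outer class is not serializable; Flink's closure cleaner exists to null out exactly this kind of reference:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// A serializable "function" interface, analogous to Flink's user functions.
interface SerializableFunction extends Serializable {
    void apply();
}

// Deliberately NOT Serializable.
class Outer {
    private int counter = 0;

    SerializableFunction makeFunction() {
        // Anonymous inner class: using 'counter' forces it to keep a
        // reference to Outer.this, which cannot be serialized.
        return new SerializableFunction() {
            @Override
            public void apply() { counter++; }
        };
    }
}

class StandaloneFunctions {
    // Created in a static context: no enclosing instance is captured.
    static SerializableFunction make() {
        return new SerializableFunction() {
            @Override
            public void apply() { }
        };
    }
}

class ClosureDemo {
    // Returns true if the object can be Java-serialized.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

The instance-bound function fails to serialize because Outer rides along; the one built in a static context serializes fine.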
hi flink community,
from time to time i test my flink app with a benchmark on a hadoop cluster
(flink on yarn).
my results show that flink needs more time for the first round than for all
other rounds. maybe flink caches something in memory? and if i run the
benchmark for 100 rounds, my system freezes, i think the me