Hi,
What jar am I missing?
The error is:
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.flink.api.scala.ExecutionEnvironment.readCsvFile$default$4()Z
Hi Robert,
thanks for your reply. It got me digging into my setup and I discovered
that one TM was scheduled next to the JM. When specifying -yn 7, the
documentation suggests that this is the number of TMs (I wanted 7),
and I thought an additional container would be used for the JM (my YAR
Hi Robert,
the problem here is that YARN's scheduler (there are different schedulers
in YARN: FIFO, CapacityScheduler, ...) is not giving Flink's
ApplicationMaster/JobManager all the containers it is requesting. By
increasing the size of the AM/JM container, there is probably no memory
left to fit
I should say I'm running the current Flink master branch.
On Wed, Sep 30, 2015 at 5:02 PM, Robert Schmidtke
wrote:
> It's me again. This is a strange issue, I hope I managed to find the right
> keywords. I got 8 machines, 1 for the JM, the other 7 are TMs with 64G of
> memory each.
>
> When runn
It's me again. This is a strange issue, I hope I managed to find the right
keywords. I got 8 machines, 1 for the JM, the other 7 are TMs with 64G of
memory each.
When running my job like so:
$FLINK_HOME/bin/flink run -m yarn-cluster -yjm 16384 -ytm 40960 -yn 7 .
The job completes without any
Hi Max,
thanks for your quick reply. I found the relevant code and commented it out
for testing, seems to be working. Happily waiting for the fix. Thanks again.
Robert
On Wed, Sep 30, 2015 at 1:42 PM, Maximilian Michels wrote:
> Hi Robert,
>
> This is a regression on the current master due to
Hi Robert,
This is a regression on the current master due to changes in the way
Flink calculates the memory and sets the maximum direct memory size.
We introduced these changes when we merged support for off-heap
memory. This is not a problem in the way Flink deals with managed
memory, just -XX:Ma
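Until the fix lands, a stopgap is to set the JVM option explicitly via Flink's configuration. A hedged sketch, assuming the flag being discussed is the direct-memory cap; `env.java.opts` is a real flink-conf.yaml key, but the value shown is purely illustrative:

```yaml
# flink-conf.yaml -- hedged workaround sketch, not the merged fix.
# Assumption: the truncated flag above is the JVM direct-memory cap.
# env.java.opts passes JVM options to JobManager and TaskManager alike;
# the 2g value is illustrative only.
env.java.opts: -XX:MaxDirectMemorySize=2g
```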
Hi everyone,
I'm constantly running into OutOfMemoryErrors and for the life of me I
cannot figure out what's wrong. Let me describe my setup. I'm running the
current master branch of Flink on YARN (Hadoop 2.7.0). My job is an
unfinished implementation of TPC-H Q2 (
https://github.com/robert-schmid
Hi Gabor, Fabian,
thank you for your suggestions. I am intending to scale up so that I'm sure
that both A and B won't fit in memory. I'll see if I can come up with a
nice way to partition the datasets but if that will take too much time I'll
just have to accept that it won't work on large datasets.
Hello,
Alternatively, if dataset B fits in memory, but dataset A doesn't,
then you can do it with broadcasting B to a RichMapPartitionFunction
on A:
In the open method of mapPartition, you sort B. Then, for each element
of A, you do a binary search in B, and look at the index found by the
binary s
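The per-element lookup above can be sketched with plain Java arrays (the Flink `RichMapPartitionFunction` and broadcast wiring are omitted). Assumption: the goal is, for each element of A, the number of elements of B strictly smaller than it, which is exactly the insertion index a binary search returns on the sorted copy of B:

```java
import java.util.Arrays;

public class BinarySearchLookup {

    // Index of the first element of sortedB >= a,
    // i.e. the count of elements of B strictly smaller than a.
    static int countSmaller(int[] sortedB, int a) {
        int lo = 0, hi = sortedB.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedB[mid] < a) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] b = {5, 1, 9, 3};
        Arrays.sort(b);                 // done once, e.g. in open()
        for (int x : new int[]{4, 10, 1}) {
            System.out.println(x + " -> " + countSmaller(b, x));
        }
        // prints: 4 -> 2, 10 -> 4, 1 -> 0
    }
}
```

Sorting B once per partition in `open()` keeps the per-element cost at O(log |B|) instead of a full scan.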
The idea is to partition both datasets by range.
Assume your dataset A is [1,2,3,4,5,6]. You create two partitions: p1:
[1,2,3] and p2: [4,5,6].
Each partition is given to a different instance of a MapPartition operator
(this is a bit tricky, because you cannot use broadcastSet. You could load
the c
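The range-partitioning step can be sketched with plain Java (the Flink side, e.g. a custom partitioner, is omitted). Assumption: the split boundaries are known up front; here a single boundary of 3 splits [1..6] into the two partitions from the example:

```java
import java.util.ArrayList;
import java.util.List;

public class RangePartitionSketch {

    // Partition index: the first boundary that x does not exceed,
    // or the last partition if x is beyond every boundary.
    static int partitionOf(int[] boundaries, int x) {
        for (int i = 0; i < boundaries.length; i++) {
            if (x <= boundaries[i]) return i;
        }
        return boundaries.length;
    }

    public static void main(String[] args) {
        int[] boundaries = {3};           // two partitions: <=3 and >3
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i <= boundaries.length; i++) {
            parts.add(new ArrayList<>());
        }
        for (int x : new int[]{1, 2, 3, 4, 5, 6}) {
            parts.get(partitionOf(boundaries, x)).add(x);
        }
        System.out.println(parts);        // [[1, 2, 3], [4, 5, 6]]
    }
}
```

With matching boundaries on both A and B, corresponding partitions can then be processed pairwise on the same instance.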
Hi Lydia,
Till already pointed you to the documentation. If you want to run the
WordCount example, you can do so by executing the following command:
./bin/flink run -c com.dataartisans.flink.dataflow.examples.WordCount
/path/to/dataflow.jar --input /path/to/input --output /path/to/output
If you
Hi Fabian,
thanks for your tips!
Do you have some pointers for getting started with the 'tricky range
partitioning'? I am quite keen to get this working with large datasets ;-)
Cheers,
Pieter
2015-09-30 10:24 GMT+02:00 Fabian Hueske:
> Hi Pieter,
>
> cross is indeed too expensive for this ta
Hi Pieter,
cross is indeed too expensive for this task.
If dataset A fits into memory, you can do the following: Use a
RichMapPartitionFunction to process dataset B and add dataset A as a
broadcastSet. In the open method of mapPartition, you can load the
broadcasted set and sort it by a.propertyX
It's described here:
https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/setup_quickstart.html#run-example
Cheers,
Till
On Wed, Sep 30, 2015 at 8:24 AM, Lydia Ickler
wrote:
> Hi all,
>
> I want to run the data-flow Wordcount example on a Flink Cluster.
> The local execution w