In the early days of Flink, when Flink operators were still called PACTs (short
for Parallelization Contracts), the system supported so-called
"output contracts" via custom annotations that could be attached to the UDF
(the ForwardedFields annotation is a descendant of that concept).
Amongst others…
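For illustration, the modern descendant looks roughly like this (a minimal sketch; the function and field names are made up):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields;
import org.apache.flink.api.java.tuple.Tuple2;

// Declares that field 0 is forwarded unmodified from input to output, so the
// optimizer may preserve existing partitioning/sorting on that field.
@ForwardedFields("f0")
public class AddOne implements MapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {
    @Override
    public Tuple2<Long, Long> map(Tuple2<Long, Long> value) {
        return new Tuple2<>(value.f0, value.f1 + 1);
    }
}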
You are not getting the issue in your local environment because there
everything is running in one JVM and the needed class is available.
In the distributed case, we have a special usercode classloader which can
load classes from the user's jar.
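If you are submitting from your own program against a remote setup, the jar containing your classes has to be shipped along, e.g. (a sketch; host, port, and jar path are placeholders):

import org.apache.flink.api.java.ExecutionEnvironment;

// The listed jar files are shipped to the cluster, so the usercode
// classloader on the workers can load your classes from them.
ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment(
        "jobmanager-host", 6123, "/path/to/your-job.jar");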
On Mon, May 18, 2015 at 9:16 PM, Michele Bert… wrote:
Got it!
Thank you!
Best,
Michele
On 18 May 2015, at 19:29, Robert Metzger <rmetz...@apache.org> wrote:
Hi,
./yarn-session.sh -n 3 -s 1 -jm 1024 -tm 4096
will start a YARN session with 4 containers (3 workers, 1 master). Once the
session is running, you can submit a…
Hi! Thanks for the answer!
For 1) you are right! I am using the collect method on a custom object; tomorrow I
will try moving back to the snapshot.
Why is it working correctly in my local environment (which is also milestone-1)?
On the other side, I am not collecting large results: I am running on a test
dataset…
Hi Michele!
I cannot tell you what the problem is at first glance, but here are some
pointers that may help you find the problem:
Input split creation determinism
- The number of input splits is not really deterministic. It depends on
the parallelism of the source (this tells the system how many splits to
create, at a minimum)…
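For example, you can pin the source parallelism (and with it the minimum number of splits the system requests) like this (a sketch; the path is a placeholder):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// The source parallelism determines how many splits the InputFormat is asked
// to create (at a minimum), so fixing it makes split creation repeatable.
DataSet<String> lines = env.readTextFile("hdfs:///path/to/input").setParallelism(2);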
Hi,
./yarn-session.sh -n 3 -s 1 -jm 1024 -tm 4096
will start a YARN session with 4 containers (3 workers, 1 master). Once the
session is running, you can submit as many jobs as you want to this
session, using
./bin/flink run ./path/to/jar
The YARN session will create a hidden file in conf/ which…
Hi Michele!
It looks like there are quite a few things going wrong in your case. Let me
see what I can deduce from the output you are showing me:
1) You seem to run into a bug that exists in 0.9-milestone-1 and has
been fixed in the master:
As far as I can tell, you call "collect()" on a DataSet…
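For reference, the pattern in question looks roughly like this (a sketch; MyCustomType is a stand-in for your own POJO):

import java.util.List;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<MyCustomType> data = env.fromElements(new MyCustomType());

// collect() pulls the distributed result back into the client program, so the
// elements (and their type) must be serializable across the cluster boundary.
List<MyCustomType> result = data.collect();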
Hi Mustafa,
I'm afraid this is not possible.
Although you can annotate DataSources with partitioning information, this
is not enough to avoid repartitioning for a CoGroup. The reason is that
CoGroup requires co-partitioning of both inputs, i.e., both inputs
must be equally partitioned (s…
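For context, a CoGroup between two datasets looks roughly like this (a sketch; both inputs are repartitioned by the key unless the optimizer already knows a matching co-partitioning):

import org.apache.flink.api.common.functions.CoGroupFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<Tuple2<Long, String>> left = env.fromElements(Tuple2.of(1L, "a"));
DataSet<Tuple2<Long, Integer>> right = env.fromElements(Tuple2.of(1L, 42));

DataSet<String> result = left.coGroup(right)
    .where(0)     // key of the first input
    .equalTo(0)   // key of the second input
    .with(new CoGroupFunction<Tuple2<Long, String>, Tuple2<Long, Integer>, String>() {
        @Override
        public void coGroup(Iterable<Tuple2<Long, String>> l,
                            Iterable<Tuple2<Long, Integer>> r,
                            Collector<String> out) {
            // all elements of both groups share the same key here
            out.collect("grouped");
        }
    });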
Hi,
I have a problem running my app on a YARN cluster.
I developed it on my computer and everything is working fine; then we set up
the environment on Amazon EMR, reading data from HDFS, not S3.
We run it with these commands:
./yarn-session.sh -n 3 -s 1 -jm 1024 -tm 4096
./flink run -m yarn-cluster …
Hi,
I am writing a Flink job in which I have three datasets. I have
hash-partitioned (partitionByHash) the first two before coGrouping them.
My plan is to spill the result of the coGrouping to disk, and then re-read it
before coGrouping with the third dataset.
My question is: is there any way to inform Flink…
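In code, the plan looks roughly like this (a sketch; note that, as explained elsewhere in the thread, partitioning information does not survive the write/read round trip):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<Tuple2<Long, String>> first = env.fromElements(Tuple2.of(1L, "a"));
DataSet<Tuple2<Long, String>> second = env.fromElements(Tuple2.of(1L, "b"));

// hash-partition both inputs on the key field before the CoGroup
DataSet<Tuple2<Long, String>> firstPart = first.partitionByHash(0);
DataSet<Tuple2<Long, String>> secondPart = second.partitionByHash(0);

// ... coGroup firstPart and secondPart, write the result to disk, read it
// back, and coGroup with the third dataset; once the data has gone through
// the file system, Flink no longer knows how it is partitioned.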
Hi,
I would really recommend putting your Flink and Spark dependencies into
different Maven modules.
Having them both in the same project will be very hard, if not impossible:
both projects depend on similar libraries with slightly different versions.
I would suggest a Maven module structure like the sketch below…
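For example, something along these lines in the parent pom (just a sketch; the module names are made up):

<modules>
  <module>common</module>      <!-- shared code, no Flink or Spark dependencies -->
  <module>flink-jobs</module>  <!-- depends on common and Flink -->
  <module>spark-jobs</module>  <!-- depends on common and Spark -->
</modules>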
Hello,
I want to cluster geo data (lat, long, timestamp) with k-means. Now I am searching for
a good core function, but I cannot find a good paper or other sources for that.
Currently I multiply the time and the space distance:
public static double dis(GeoData input1, GeoData input2)
{
double timeDis = Math…
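A complete version of such a combined distance could look like the sketch below (the GeoData field names are assumptions; the spatial part uses the haversine formula). Note that multiplying the two distances makes the result zero whenever either one is zero; a weighted sum is a common alternative:

// assumed POJO: GeoData { double lat; double lon; long timestamp; }
public static double dis(GeoData input1, GeoData input2) {
    // time distance in seconds (timestamps assumed to be in milliseconds)
    double timeDis = Math.abs(input1.timestamp - input2.timestamp) / 1000.0;

    // spatial distance in kilometers (haversine formula, earth radius ~6371 km)
    double dLat = Math.toRadians(input2.lat - input1.lat);
    double dLon = Math.toRadians(input2.lon - input1.lon);
    double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
             + Math.cos(Math.toRadians(input1.lat)) * Math.cos(Math.toRadians(input2.lat))
             * Math.sin(dLon / 2) * Math.sin(dLon / 2);
    double spaceDis = 2 * 6371.0 * Math.asin(Math.sqrt(a));

    // combined distance: product of time and space distance, as described above;
    // alternatively: w1 * timeDis + w2 * spaceDis with tuned weights w1, w2
    return timeDis * spaceDis;
}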
Hi,
If I add your dependency I get over 100 errors; now I have changed the version
number:

<dependency>
  <groupId>com.fasterxml.jackson.module</groupId>
  <artifactId>jackson-module-scala_2.10</artifactId>
  <version>2.4.4</version>
</dependency>
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  …
</dependency>

Now the pom is fine…