Any insight about this?
On Tue, May 26, 2015 at 5:58 PM, Flavio Pompermaier
wrote:
> Hi to all,
>
> In my program I'd like to infer from a MySQL table the list of directories I
> have to write to HDFS (status=0).
> Once my job finishes, I'd like to clean each directory and update the status value of
> my sql
Switching the sides worked (I tried that shortly after sending the mail).
Thanks for the fast response :)
On 26.05.2015 22:26, Stephan Ewen wrote:
If you have this case, giving more memory is fighting a symptom, rather
than a cause.
If you really have that many duplicates in the data set (and
Hi,
What can I do to give Flink more memory when running it from my IDE? I'm
getting the following exception:
Caused by: java.lang.RuntimeException: Hash join exceeded maximum number
of recursions, without reducing partitions enough to be memory resident.
Probably cause: Too many duplicate k
If you have this case, giving more memory is fighting a symptom, rather
than a cause.
If you really have that many duplicates in the data set (and you have not
just a bad implementation of "hashCode()"), then try the following:
1) Reverse hash join sides. Duplicates hurt only on the build-side, n
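Point 1 can be illustrated with a toy in-memory hash join (plain Java; a sketch of the build/probe idea only, not Flink's actual hybrid hash join): the build side is loaded into a hash table, so heavy key duplication there inflates memory, while the same duplicates on the probe side are merely streamed past the table.

```java
import java.util.*;

// Toy hash join: build side -> hash table, probe side -> streamed lookups.
// Keys that repeat heavily on the BUILD side grow the table (memory cost);
// the same duplicates on the PROBE side cost no extra memory at all.
public class BuildSideDemo {

    static Map<String, List<Integer>> buildTable(List<Map.Entry<String, Integer>> buildSide) {
        Map<String, List<Integer>> table = new HashMap<>();
        for (Map.Entry<String, Integer> e : buildSide) {
            table.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return table;
    }

    static int probe(Map<String, List<Integer>> table, List<Map.Entry<String, Integer>> probeSide) {
        int joined = 0;
        for (Map.Entry<String, Integer> e : probeSide) {
            joined += table.getOrDefault(e.getKey(), List.of()).size();
        }
        return joined;
    }

    public static void main(String[] args) {
        // Keep the duplicate-free data set on the build side ...
        Map<String, List<Integer>> table = buildTable(List.of(Map.entry("x", 1)));
        // ... and stream the duplicate-heavy one past it.
        System.out.println(probe(table, List.of(Map.entry("x", 2), Map.entry("x", 3)))); // prints 2
    }
}
```

Reversing the sides in a real Flink job is what join hints are for; the point here is only why the build side is the one that suffers from duplicates.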
Great, thanks...Is there a roadmap somewhere that can be consulted with a
timeline for the next RC or milestone?
Thanks.
Date: Sat, 23 May 2015 23:10:59 +0200
Subject: Re: bug? open() not getting called with RichWindowMapFunction
From: gyf...@apache.org
To: user@flink.apache.org
Hi,
This was a k
Pushed a fix to the master and will open a PR to programmatically fix this.
On Tue, May 26, 2015 at 4:22 PM, Flavio Pompermaier
wrote:
> Yep, that definitely solves the problem! Could you make a PR to fix
> that?
>
> Thank you in advance,
> Flavio
>
> On Tue, May 26, 2015 at 3:20 PM, Maximi
Hi to all,
In my program I'd like to infer from a MySQL table the list of directories I
have to write to HDFS (status=0).
Once my job finishes, I'd like to clean each directory and update the status
value in my SQL table.
How can I do that in Flink? Is there any callback for when dataset.output()
finishes?
Best,
Thanks ;)
On Tue, May 26, 2015 at 5:08 PM, Fabian Hueske wrote:
> Done
>
> 2015-05-26 16:59 GMT+02:00 Flavio Pompermaier :
>
>> Could you do that so I can avoid making a PR just for that, please?
>>
>> On Tue, May 26, 2015 at 4:58 PM, Fabian Hueske wrote:
>>
>>> Definitely! Much better than us
Done
2015-05-26 16:59 GMT+02:00 Flavio Pompermaier :
> Could you do that so I can avoid making a PR just for that, please?
>
> On Tue, May 26, 2015 at 4:58 PM, Fabian Hueske wrote:
>
>> Definitely! Much better than using the String value.
>>
>> 2015-05-26 16:55 GMT+02:00 Flavio Pompermaier :
>>
Could you do that so I can avoid making a PR just for that, please?
On Tue, May 26, 2015 at 4:58 PM, Fabian Hueske wrote:
> Definitely! Much better than using the String value.
>
> 2015-05-26 16:55 GMT+02:00 Flavio Pompermaier :
>
>> Hi to all,
>>
>> in my program I need to set "recursive.file.
Definitely! Much better than using the String value.
2015-05-26 16:55 GMT+02:00 Flavio Pompermaier :
> Hi to all,
>
> in my program I need to set "recursive.file.enumeration" to true and I
> discovered that there's a constant for that variable in FileInputFormat
> (ENUMERATE_NESTED_FILES_FLAG) bu
Hi to all,
in my program I need to set "recursive.file.enumeration" to true and I
discovered that there's a constant for that variable in FileInputFormat
(ENUMERATE_NESTED_FILES_FLAG) but it's private. Do you think it could be a
good idea to change its visibility to public?
Best,
Flavio
Yep, that definitely solves the problem! Could you make a PR to fix
that?
Thank you in advance,
Flavio
On Tue, May 26, 2015 at 3:20 PM, Maximilian Michels wrote:
> Yes, there is a loop to recursively search for files in directory but that
> should be ok. The code fails when adding a new In
Hi community,
my k-means works fine now. Thanks a lot for your help.
Now I want to test something: what is the best way in Flink to count
the iterations?
Best regards,
Paul
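One way to see iteration counting, sketched outside Flink in plain Java (data and names are made up for illustration): run Lloyd's algorithm and increment a counter per pass until the centroids stop moving. Inside a Flink bulk iteration, the comparable number would be the superstep counter available to rich functions.

```java
public class KMeansIterations {
    // Lloyd's algorithm on 1-D points: assign each point to its nearest
    // centroid, recompute the centroid means, and count passes until no
    // centroid moves more than tol. Returns the number of iterations run.
    static int run(double[] points, double[] centroids, double tol) {
        int iterations = 0;
        while (true) {
            iterations++;
            double[] sum = new double[centroids.length];
            int[] cnt = new int[centroids.length];
            for (double p : points) {
                int best = 0;
                for (int c = 1; c < centroids.length; c++) {
                    if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) best = c;
                }
                sum[best] += p;
                cnt[best]++;
            }
            double moved = 0;
            for (int c = 0; c < centroids.length; c++) {
                double next = cnt[c] == 0 ? centroids[c] : sum[c] / cnt[c];
                moved = Math.max(moved, Math.abs(next - centroids[c]));
                centroids[c] = next;
            }
            if (moved < tol) return iterations;
        }
    }

    public static void main(String[] args) {
        double[] centroids = {0, 10};
        int it = run(new double[] {0, 1, 9, 10}, centroids, 1e-9);
        System.out.println(it + " iterations"); // prints "2 iterations"
    }
}
```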
In the src/main/resources folder of the project
On Tue, May 26, 2015 at 2:42 PM, Hilmi Yildirim
wrote:
> Where do you put the hbase-site.xml? In the resource folder of the
> project or on the cluster?
>
>
> On 26.05.2015 14:12, Flavio Pompermaier wrote:
>
> I usually put those connection pa
Yes, there is a loop to recursively search for files in a directory, but that
should be ok. The code fails when adding a new InputSplit to an ArrayList.
This is a standard operation.
Oh, I think I found a bug in `addNestedFiles`. It does not pick up the
length of the recursively found files in line 5
Where do you put the hbase-site.xml? In the resource folder of the
project or on the cluster?
On 26.05.2015 14:12, Flavio Pompermaier wrote:
I usually put those connection params inside the hbase-site.xml that
will be included in the generated jar..
On Tue, May 26, 2015 at 2:07 PM, Hilmi
I have 10 files. I debugged the code and it seems that there's a loop in
the FileInputFormat when files are nested far below the root directory
of the scan.
On Tue, May 26, 2015 at 2:14 PM, Robert Metzger wrote:
> Hi Flavio,
>
> how many files are in the directory?
> You can count with "find
Hi Flavio,
how many files are in the directory?
You can count with "find /tmp/myDir | wc -l"
Flink running out of memory while creating input splits indicates to me
that there are a lot of files in there.
On Tue, May 26, 2015 at 2:10 PM, Flavio Pompermaier
wrote:
> Hi to all,
>
> I'm trying to
I usually put those connection params inside the hbase-site.xml that will
be included in the generated jar..
On Tue, May 26, 2015 at 2:07 PM, Hilmi Yildirim
wrote:
> I want to add that it is strange that the client wants to establish a
> connection to localhost but I have defined another machine
Hi to all,
I'm trying to recursively read a directory, but it seems that the
totalLength value in FileInputFormat.createInputSplits() is not
computed correctly.
I have files organized as:
/tmp/myDir/A/B/cunk-1.txt
/tmp/myDir/A/B/cunk-2.txt
..
If I try to do the following:
Configuration
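For a nested layout like the one above, the bookkeeping createInputSplits has to get right is essentially "walk the tree, collect every regular file, sum the lengths". A plain-JDK sketch of that accounting (class and method names here are ours, not Flink's):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NestedFileStats {
    // Walk root recursively, count regular files, and sum their byte lengths,
    // i.e. the totalLength an input format must compute over nested files.
    static long[] countAndTotalLength(Path root) throws IOException {
        List<Path> regular;
        try (Stream<Path> walk = Files.walk(root)) {
            regular = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        long files = 0, totalLength = 0;
        for (Path p : regular) {
            files++;
            totalLength += Files.size(p);
        }
        return new long[] { files, totalLength };
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("myDir");
        Files.createDirectories(root.resolve("A/B"));
        Files.write(root.resolve("A/B/chunk-1.txt"), new byte[7]);
        long[] stats = countAndTotalLength(root);
        System.out.println(stats[0] + " files, " + stats[1] + " bytes"); // 1 files, 7 bytes
    }
}
```

Missing the lengths of the recursively found files in this sum is exactly the kind of bug that makes totalLength come out wrong.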
I want to add that it is strange that the client wants to establish a
connection to localhost even though I have configured another machine.
On 26.05.2015 14:05, Hilmi Yildirim wrote:
Hi,
I implemented a job which reads data from HBase with the following code (I
replaced the real address with m1.example.
Hi,
I implemented a job which reads data from HBase with the following code (I
replaced the real address with m1.example.com):
protected Scan getScanner() {
    Scan scan = new Scan();
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.zookeeper.q
Thank you very much
On 26.05.2015 11:26, Robert Metzger wrote:
Hi,
you need to include dependencies not part of the main flink jar into
your usercode jar file.
Therefore, you have to build a "fat jar".
The easiest way to do that is like this:
http://stackoverflow.com/questions/574594/how
Sure,
here is some pseudo code:
public class CentroidMover implements GroupReduceFunction<Point, Centroid> {
    public void reduce(Iterable<Point> points, Collector<Centroid> out) {
        int cnt = 0;
        Centroid sum = new Centroid(0, 0);
        for (Point p : points) {
            sum = sum + p; // (your addition logic goes here)
            cnt++;
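The pseudo code above can be completed into a runnable sketch in plain Java (Point and Centroid are stand-in record types for this sketch; in Flink this logic would live inside a GroupReduceFunction's reduce method):

```java
import java.util.List;

// Runnable version of the pseudo code: average a group of points into one
// centroid. Point and Centroid are stand-in types for illustration.
public class CentroidAverager {
    record Point(double x, double y) {}
    record Centroid(double x, double y) {}

    static Centroid reduce(Iterable<Point> points) {
        int cnt = 0;
        double sumX = 0, sumY = 0;
        for (Point p : points) { // the "sum = sum + p" addition logic
            sumX += p.x();
            sumY += p.y();
            cnt++;
        }
        return new Centroid(sumX / cnt, sumY / cnt);
    }

    public static void main(String[] args) {
        System.out.println(reduce(List.of(new Point(0, 0), new Point(2, 4)))); // Centroid[x=1.0, y=2.0]
    }
}
```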
Thanks for your message.
Maybe you can give me an example for the GroupReduceFunction?
2015-05-22 23:29 GMT+02:00 Fabian Hueske :
> There are two ways to do that:
>
> 1) You use a GroupReduceFunction, which gives you an iterator over all
> points similar to Hadoop's ReduceFunction.
> 2) You use
Hi,
you need to include dependencies not part of the main flink jar into your
usercode jar file.
Therefore, you have to build a "fat jar".
The easiest way to do that is like this:
http://stackoverflow.com/questions/574594/how-can-i-create-an-executable-jar-with-dependencies-using-maven
But that ap
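For reference, the fat-jar approach from the linked answer boils down to a maven-shade-plugin entry along these lines (a minimal sketch; versions and filters vary per project):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

mvn package then produces a jar that bundles the non-provided dependencies, such as the HBase addon, alongside your user code.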
Hi,
I want to deploy my job on a cluster. Unfortunately, the job does not
run because I used org.apache.flink.addons.hbase.TableInputFormat, which
is not included in the lib folder of the Flink distribution. Therefore, I wanted
to build my own Flink version with mvn package -DskipTests. Again, the
libra
Optimization happens automatically when you submit a batch program (DataSet
API).
2015-05-26 10:27 GMT+02:00 hagersaleh :
> Is this optimizer automatic, or do I have to control it myself?
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/how-Flin
Is this optimizer automatic, or do I have to control it myself?
It's similar to a query optimizer in a relational database system.
2015-05-26 10:10 GMT+02:00 hagersaleh :
> Many thanks.
> What is the meaning of the Optimizer in Flink?
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/how-Flink-Optimizer-work-and
Many thanks.
What is the meaning of the Optimizer in Flink?