Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
ng? Maybe you need to build Flink against > that specific Hadoop version yourself. > > On Mon, Feb 8, 2016 at 5:50 PM, Pieter Hameete wrote: > >> After downloading and building the 1.0-SNAPSHOT from the master branch I >> do run into another problem when starting a YARN

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
:34,855 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1 Any clue what could've gone wrong? I used all-default for building with Maven. - Pieter 2016-02-08 17:07 GMT+01:00 Pieter Hameete : > Matter of RTFM eh ;-) thx and sorry for the bother. >

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
in the wrong place? Or is there another way to enforce this property? Cheers, Pieter 2016-02-07 20:04 GMT+01:00 Pieter Hameete : > I found the relevant information on the website. I'll consult with the > cluster admin tomorrow, thanks for the help :-) > > - Pieter > > 2016-02-

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
Matter of RTFM eh ;-) thx and sorry for the bother. 2016-02-08 17:06 GMT+01:00 Robert Metzger : > You said earlier that you are using Flink 0.10. The feature is only > available in 1.0-SNAPSHOT. > > On Mon, Feb 8, 2016 at 4:53 PM, Pieter Hameete wrote: > >>

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-07 Thread Pieter Hameete
to specify a single port or a range of ports for the > JobManager to allocate when running on YARN. > Note that when using this with a single port, the JMs may collide. > > > > On Sun, Feb 7, 2016 at 7:25 PM, Pieter Hameete wrote: > >> Hi Stephan, >> >> surely i
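The port setting discussed in this reply goes in flink-conf.yaml. A minimal sketch, assuming the `yarn.application-master.port` key as documented for Flink 1.0 (the concrete range is an invented example):

```yaml
# flink-conf.yaml -- restrict the port the YARN ApplicationMaster
# (which hosts the JobManager) binds to, so a firewall can allow it.
# A range avoids collisions when several sessions run on one host.
yarn.application-master.port: 50100-50200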

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-07 Thread Pieter Hameete
h HDFS and the YARN resource manager may be > whitelisted or forwarded, so you can submit the YARN session, but then not > connect to the JobManager afterwards. > > > > On Sat, Feb 6, 2016 at 2:11 PM, Pieter Hameete wrote: > >> Hi Max! >> >> I'm usi

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-06 Thread Pieter Hameete
Hi Pieter, > > Which version of Flink are you using? It appears you've created a > Flink YARN cluster but you can't reach the JobManager afterwards. > > Cheers, > Max > > On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete wrote: > > Hi Robert, > > > >

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-06 Thread Pieter Hameete
Robert Metzger : > Hi, > > did you check the logs of the JobManager itself? Maybe it'll tell us > already what's going on. > > On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete > wrote: > >> Hi Guys! >> >> I'm attempting to run Flink on YARN, but I run into

Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-06 Thread Pieter Hameete
Hi Guys! I'm attempting to run Flink on YARN, but I run into an issue. I'm starting the Flink YARN session from an Ubuntu 14.04 VM. All goes well until after the JobManager web UI is started: JobManager web interface address http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532

Re: Imbalanced workload between workers

2016-01-27 Thread Pieter Hameete
there any changes/fixes between Flink 0.9.1 and 0.10.1 that could cause this to be better for me now? Thanks, Pieter 2016-01-27 14:10 GMT+01:00 Pieter Hameete : > > Cheers for the quick reply Till. > > That would be very useful information to have! I'll upgrade my project to >

Re: Imbalanced workload between workers

2016-01-27 Thread Pieter Hameete
Cheers, > Till > > On Wed, Jan 27, 2016 at 1:35 PM, Pieter Hameete > wrote: > >> Hi guys, >> >> Currently I am running a job in the GCloud in a configuration with 4 task >> managers that each have 4 CPUs (for a total parallelism of 16). >> >> Howev

Imbalanced workload between workers

2016-01-27 Thread Pieter Hameete
Hi guys, Currently I am running a job in the GCloud in a configuration with 4 task managers that each have 4 CPUs (for a total parallelism of 16). However, I noticed my job is running much slower than expected and after some more investigation I found that one of the workers is doing a majority o
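An imbalance like the one described here is typical of key skew under hash partitioning: one hot key sends most records to a single worker. A standalone Scala sketch with plain collections (no Flink; the key values are invented for illustration):

```scala
// Simulate hash partitioning of skewed keys over 4 workers.
// "hot" and the user-N keys are made-up illustration data.
val keys = Seq.fill(90)("hot") ++ (1 to 10).map(i => s"user-$i")

val partitionOf: String => Int = k => math.abs(k.hashCode % 4)

// Records per worker: one partition receives at least the 90 "hot"
// records, while the other three share the remaining 10.
val counts: Map[Int, Int] =
  keys.groupBy(partitionOf).map { case (p, ks) => (p, ks.size) }

val maxLoad = counts.values.max
```

In the DataSet API, calling `rebalance()` on the skewed dataset redistributes records round-robin across workers, at the cost of a network shuffle; choosing a less skewed key, where possible, avoids the problem at the source.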

Re: Task Manager metrics per job on Flink 0.9.1

2016-01-27 Thread Pieter Hameete
; >> *Ritesh Kumar Singh,* >> *https://riteshtoday.wordpress.com/* <https://riteshtoday.wordpress.com/> >> >> On Tue, Jan 26, 2016 at 8:22 PM, Pieter Hameete >> wrote: >> >>> Hi Ritesh, >>> >>> thanks for the response! The metrics are al

Re: Task Manager metrics per job on Flink 0.9.1

2016-01-26 Thread Pieter Hameete
s.com/* <https://riteshtoday.wordpress.com/> > > On Tue, Jan 26, 2016 at 7:16 PM, Pieter Hameete > wrote: > >> Hi people! >> >> A lot of metrics are gathered for each TaskManager every few seconds. The >> web UI shows nice graphs for some of these metrics too.

Task Manager metrics per job on Flink 0.9.1

2016-01-26 Thread Pieter Hameete
Hi people! A lot of metrics are gathered for each TaskManager every few seconds. The web UI shows nice graphs for some of these metrics too. I would like to make graphs of the memory and cpu usage, and the time spent on garbage collection for each job. Because of this I am wondering if the metric

Re: Reading multiple datasets with one read operation

2015-10-22 Thread Pieter Hameete
> streamed directly to the two filter operators. >>> - The input format and the two filters will probably end up on the >>> same machine, because of chaining, so there won't be >>> serialization/deserialization between them. >>> >>> Best, >>>

Reading multiple datasets with one read operation

2015-10-22 Thread Pieter Hameete
Good morning! I have the following usecase: My program reads nested data (in this specific case XML) based on projections (path expressions) of this data. Often multiple paths are projected onto the same input. I would like each path to result in its own dataset. Is it possible to generate more
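The pattern suggested in the replies (read the input once, then split it with filters) can be sketched with plain Scala collections; the records and path expressions below are invented stand-ins, and in Flink the same shape applies to a `DataSet` produced by one input format:

```scala
// Stand-in for a single read of the nested input (records invented).
val records = Seq("/site/people/person", "/site/regions/europe",
                  "/site/people/person", "/site/closed_auctions")

// One "read", two projections: each path expression becomes its
// own collection by filtering the shared input.
val people  = records.filter(_.startsWith("/site/people"))
val regions = records.filter(_.startsWith("/site/regions"))
```

As noted in the reply above, in the DataSet API the filters chain directly onto the input format on the same machine, so records are streamed to both filters without serialization between them.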

Re: data sink stops method

2015-10-08 Thread Pieter Hameete
Hi Florian, I believe that when you call *JoinPredictionAndOriginal.collect* the environment will execute your program up until that point. The Csv writes are after this point, so in order to execute these steps I think you would have to call *.execute()* after the Csv writes to trigger the execut
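The execution-order point can be sketched as pseudocode over the DataSet API (variable and path names are illustrative, not from the original program):

```
// collect() is an eager action: it executes the plan built so far
val joined = JoinPredictionAndOriginal.collect()  // runs job #1

// sinks defined afterwards are only registered, not yet run
result.writeAsCsv("hdfs:///out/result.csv")

// an explicit execute() is needed to run the CSV sinks
env.execute()                                     // runs job #2
```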

Re: Reading from multiple input files with fewer task slots

2015-10-05 Thread Pieter Hameete
maybe create a new SimpleInputProjection as well) > > On Mon, Oct 5, 2015 at 12:41 PM, Pieter Hameete > wrote: > >> Hi Stephen, >> >> it concerns the DataSet API. >> >> The program im running can be found at >> https://github.com/PHameete/dawn-flink/blob/development

Re: Reading from multiple input files with fewer task slots

2015-10-05 Thread Pieter Hameete
/scala/wis/dawnflink/parsing/xml/XML2DawnInputFormat.java Cheers! 2015-10-05 12:38 GMT+02:00 Stephan Ewen : > I assume this concerns the streaming API? > > Can you share your program and/or the custom input format code? > > On Mon, Oct 5, 2015 at 12:33 PM, Pieter Hameete >

Reading from multiple input files with fewer task slots

2015-10-05 Thread Pieter Hameete
Hello Flinkers! I run into some strange behavior when reading from a folder of input files. When the number of input files in the folder exceeds the number of task slots I noticed that the size of my datasets varies with each run. It seems as if the transformations don't wait for all input files

Re: For each element in a dataset, do something with another dataset

2015-10-05 Thread Pieter Hameete
element a in A count the number of elements in B that are smaller than a and output a tuple in a map operation. This would also save me a step in aggregating the results? Kind regards, Pieter 2015-09-30 12:44 GMT+02:00 Pieter Hameete : > Hi Gabor, Fabian, > > thank you for your suggesti

Re: For each element in a dataset, do something with another dataset

2015-09-30 Thread Pieter Hameete
you have about equally sized > > partitions of dataset B with the constraint that the corresponding > > partitions of A fit into memory. > > > > As I said, it's a bit cumbersome. I hope you could follow my explanation. > > Please ask if something is not clear ;-) > >

Re: For each element in a dataset, do something with another dataset

2015-09-30 Thread Pieter Hameete
dataset A doesn't fit into memory, things become more cumbersome and we > need to play some tricks with range partitioning... > > Let me know, if you have questions, > Fabian > > 2015-09-29 16:59 GMT+02:00 Pieter Hameete : > >> Good day everyone, >> >> I a

For each element in a dataset, do something with another dataset

2015-09-29 Thread Pieter Hameete
-> Filter probably because there is insufficient memory for the cross to be completed? Does anyone have a suggestion on how I could make this work, especially with datasets that are larger than memory available to a separate Task? Thank you in advance for your time :-) Kind regards, Pieter Hameete
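When B fits in memory (for example, shipped to each task as a broadcast set), the per-element count can avoid the memory-hungry cross entirely: sort B once, then binary-search the insertion point for each element of A. A standalone Scala sketch with invented data:

```scala
// Invented example data standing in for datasets A and B.
val a = Seq(5, 1, 9)
val b = Seq(2, 4, 4, 8)

// Sort B once; the lower-bound insertion point of x in the sorted
// array equals the number of elements of B smaller than x.
val sortedB = b.sorted.toArray

def countSmaller(x: Int): Int = {
  var lo = 0
  var hi = sortedB.length
  while (lo < hi) {               // lower-bound binary search
    val mid = (lo + hi) / 2
    if (sortedB(mid) < x) lo = mid + 1 else hi = mid
  }
  lo
}

val counts = a.map(x => (x, countSmaller(x)))
```

In Flink terms, the sketch corresponds to broadcasting the (small) sorted B to a `map` over A via `withBroadcastSet`, producing one `(a, count)` tuple per element, as suggested in this thread.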

Re: Issue with parallelism when CoGrouping custom nested data type

2015-09-16 Thread Pieter Hameete
g. > > Please let me know, if that solved your problem. > > Cheers, Fabian > > [1] > https://squarecog.wordpress.com/2011/02/20/hadoop-requires-stable-hashcode-implementations/ > > 2015-09-16 14:12 GMT+02:00 Pieter Hameete : > >> Hi, >> >> I haven't been
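The linked article's point is that a custom key type used for grouping must derive `hashCode` from its field values, so that equal keys hash identically on every parallel instance; the default identity hash breaks this. A minimal Scala sketch (the `AuctionKey` class is invented for illustration):

```scala
// A custom key type: equal field values must yield equal hash codes,
// otherwise equal keys can be routed to different parallel instances
// and a CoGroup sees one side as empty.
class AuctionKey(val id: Long) {
  override def equals(o: Any): Boolean = o match {
    case k: AuctionKey => k.id == id
    case _             => false
  }
  override def hashCode: Int = id.hashCode  // value-based and stable
}

val k1 = new AuctionKey(42L)
val k2 = new AuctionKey(42L)
val sameBucket = k1 == k2 && k1.hashCode == k2.hashCode
```

Scala case classes get value-based `equals`/`hashCode` for free, which is usually the simpler fix for key types.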

Re: Issue with parallelism when CoGrouping custom nested data type

2015-09-16 Thread Pieter Hameete
enchmarkSuite* class out of the box to run the query from my first email. In this class you can also change the query that's being run or run multiple queries. If this does not work please let me know! Kind regards and cheers again! - Pieter 2015-09-16 11:24 GMT+02:00 Pieter Hameete : >

Re: Issue with parallelism when CoGrouping custom nested data type

2015-09-16 Thread Pieter Hameete
with data (also possible to >> include it in the code) to reproduce your problem? >> >> Cheers, >> Till >> >> On Wed, Sep 16, 2015 at 10:31 AM, Pieter Hameete >> wrote: >> >>> Dear fellow Flinkers, >>> >>> I am implementing qu

Issue with parallelism when CoGrouping custom nested data type

2015-09-16 Thread Pieter Hameete
non-empty auction iterators passed to the cogroup function where the persons iterator is empty, but this is impossible because all buyers exist in the persons database! If anyone has some pointers for me why this code starts producing strange results when parallelism is set above 1 this would be greatly appreciated :-) Kind regards. Pieter Hameete