Re: Reading worker-local input files

2017-01-04 Thread Robert Schmidtke
Hi Fabian, thanks for your directions! They worked flawlessly. I am aware of the reduced robustness, but then again my input is only available on each worker and not replicated. In case anyone is wondering, here is how I did it: https://github.com/robert-schmidtke/hdfs-statistics-adapter/tree/2a4

Re: Reading worker-local input files

2016-12-27 Thread Fabian Hueske
Hi Robert, this is indeed a bit tricky to do. The problem is mostly with the generation of the input splits, setup of Flink, and the scheduling of tasks. 1) you have to ensure that on each worker at least one DataSource task is scheduled. The easiest way to do this is to have a bare metal setup (
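
To make the input split part more concrete, here is a minimal sketch of the idea, under a few assumptions: every split is tagged with the host that holds the file, and a split is only ever handed to a task running on that same host. The class and method names (LocalOnlySplitAssigner, LocalSplit, nextSplitFor) and the file names are made up for illustration and are not Flink API. In Flink this role would be played by a custom input split assigner, which this sketch does not implement. It only shows the "local splits only" logic in plain Java.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: hands out splits only to the host that stores them. */
public class LocalOnlySplitAssigner {

    /** One unit of work: a file that exists on exactly one worker. */
    public static final class LocalSplit {
        public final String host;
        public final String path;

        public LocalSplit(String host, String path) {
            this.host = host;
            this.path = path;
        }
    }

    // Pending splits, grouped by the host that physically holds the file.
    private final Map<String, Deque<LocalSplit>> pendingByHost = new HashMap<>();

    public LocalOnlySplitAssigner(List<LocalSplit> splits) {
        for (LocalSplit split : splits) {
            pendingByHost
                .computeIfAbsent(split.host, h -> new ArrayDeque<>())
                .add(split);
        }
    }

    /**
     * Returns the next split stored on the requesting host, or null if
     * that host has no splits left. A remote split is never returned,
     * because a task on another worker could not read the file.
     */
    public synchronized LocalSplit nextSplitFor(String requestingHost) {
        Deque<LocalSplit> queue = pendingByHost.get(requestingHost);
        return (queue == null) ? null : queue.poll();
    }

    public static void main(String[] args) {
        // Host names and file names are assumptions for the demo.
        LocalOnlySplitAssigner assigner = new LocalOnlySplitAssigner(Arrays.asList(
            new LocalSplit("worker-1", "/local/a.log"),
            new LocalSplit("worker-2", "/local/b.log")));

        for (String host : Arrays.asList("worker-2", "worker-1", "worker-3")) {
            LocalSplit split = assigner.nextSplitFor(host);
            System.out.println(host + " -> " + (split == null ? "nothing to read" : split.path));
        }
    }
}
```

The trade-off is the reduced robustness that comes with unreplicated input: if no source task ever runs on a given worker, or that worker fails, its splits are never read, since no other host is allowed to take them over.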

Reading worker-local input files

2016-12-27 Thread Robert Schmidtke
Hi everyone, I'm using Flink and/or Hadoop on my cluster, and I'm having them generate log data in each worker node's /local folder (regular mount point). Now I would like to process these files using Flink, but I'm not quite sure how I could tell Flink to use each worker node's /local folder as i