Hi Dulaj!

Okay, the logs give us some insight. Both setups look fine in terms of TaskManager and JobManager startup.
In one of the logs (127.0.0.1) you submit a job. The job fails because the
TaskManager cannot grab the JAR file from the JobManager. I think the
problem is that the BLOB server binds to 0.0.0.0 - it should bind to the
same address as the JobManager actor system. That should definitely be
changed...

On Thu, Mar 5, 2015 at 10:08 AM, Dulaj Viduranga <vidura...@icloud.com> wrote:

> Hi,
> This is the log with setting “localhost”:
> flink-Vidura-jobmanager-localhost.log
> <https://gist.github.com/viduranga/e9d43521587697de3eb5#file-flink-vidura-jobmanager-localhost-log>
>
> And this is the log with setting “127.0.0.1”:
> flink-Vidura-jobmanager-localhost.log
> <https://gist.github.com/viduranga/5af6b05f204e1f4b344f#file-flink-vidura-jobmanager-localhost-log>
>
>> On Mar 5, 2015, at 2:23 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>> What does the jobmanager log say? I think Stephan added some more
>> logging output which helps us to debug this problem.
>>
>> On Thu, Mar 5, 2015 at 9:36 AM, Dulaj Viduranga <vidura...@icloud.com>
>> wrote:
>>
>>> Using start-local.sh.
>>> I’m using the original config yaml. I also tried changing the jobmanager
>>> address in the config to “127.0.0.1” but no luck. With my changes it
>>> works ok. The conf file follows.
>>>
>>> ################################################################################
>>> # Licensed to the Apache Software Foundation (ASF) under one
>>> # or more contributor license agreements.  See the NOTICE file
>>> # distributed with this work for additional information
>>> # regarding copyright ownership.  The ASF licenses this file
>>> # to you under the Apache License, Version 2.0 (the
>>> # "License"); you may not use this file except in compliance
>>> # with the License.  You may obtain a copy of the License at
>>> #
>>> #     http://www.apache.org/licenses/LICENSE-2.0
>>> #
>>> # Unless required by applicable law or agreed to in writing, software
>>> # distributed under the License is distributed on an "AS IS" BASIS,
>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>> # See the License for the specific language governing permissions and
>>> # limitations under the License.
>>> ################################################################################
>>>
>>> #==============================================================================
>>> # Common
>>> #==============================================================================
>>>
>>> jobmanager.rpc.address: 127.0.0.1
>>>
>>> jobmanager.rpc.port: 6123
>>>
>>> jobmanager.heap.mb: 256
>>>
>>> taskmanager.heap.mb: 512
>>>
>>> taskmanager.numberOfTaskSlots: 1
>>>
>>> parallelization.degree.default: 1
>>>
>>> #==============================================================================
>>> # Web Frontend
>>> #==============================================================================
>>>
>>> # The port under which the web-based runtime monitor listens.
>>> # A value of -1 deactivates the web server.
>>>
>>> jobmanager.web.port: 8081
>>>
>>> # The port under which the standalone web client
>>> # (for job upload and submit) listens.
>>>
>>> webclient.port: 8080
>>>
>>> #==============================================================================
>>> # Advanced
>>> #==============================================================================
>>>
>>> # The number of buffers for the network stack.
>>> #
>>> # taskmanager.network.numberOfBuffers: 2048
>>>
>>> # Directories for temporary files.
>>> #
>>> # Add a delimited list for multiple directories, using the system directory
>>> # delimiter (colon ':' on unix) or a comma, e.g.:
>>> # /data1/tmp:/data2/tmp:/data3/tmp
>>> #
>>> # Note: Each directory entry is read from and written to by a different I/O
>>> # thread. You can include the same directory multiple times in order to create
>>> # multiple I/O threads against that directory. This is for example relevant for
>>> # high-throughput RAIDs.
>>> #
>>> # If not specified, the system-specific Java temporary directory
>>> # (java.io.tmpdir property) is taken.
>>> #
>>> # taskmanager.tmp.dirs: /tmp
>>>
>>> # Path to the Hadoop configuration directory.
>>> #
>>> # This configuration is used when writing into HDFS. Unless specified otherwise,
>>> # HDFS file creation will use HDFS default settings with respect to block-size,
>>> # replication factor, etc.
>>> #
>>> # You can also directly specify the paths to hdfs-default.xml and hdfs-site.xml
>>> # via keys 'fs.hdfs.hdfsdefault' and 'fs.hdfs.hdfssite'.
>>> #
>>> # fs.hdfs.hadoopconf: /path/to/hadoop/conf/
>>>
>>>> On Mar 5, 2015, at 2:03 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>
>>>> How did you start the flink cluster? Using start-local.sh,
>>>> start-cluster.sh, or starting the job manager and task managers
>>>> individually using taskmanager.sh/jobmanager.sh? Could you maybe post
>>>> the flink-conf.yaml file you're using?
>>>>
>>>> With your changes, everything works, right?
>>>>
>>>> On Thu, Mar 5, 2015 at 8:55 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>> wrote:
>>>>
>>>>> Hi Till,
>>>>> I’m sorry. It doesn’t seem to solve the problem. The taskmanager still
>>>>> tries a 10.0.0.0/8 IP.
>>>>>
>>>>> Best regards.
>>>>>
>>>>>> On Mar 5, 2015, at 1:00 PM, Till Rohrmann <till.rohrm...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Dulaj,
>>>>>>
>>>>>> I looked through your commit and noticed that the JobClient might not
>>>>>> be listening on the right network interface. Your commit seems to fix
>>>>>> it. I just want to understand the problem properly and therefore I
>>>>>> opened a branch with a small change. Could you try out whether this
>>>>>> change would also fix your problem? You can find the code here [1].
>>>>>> Would be awesome if you checked it out and let it run on your cluster
>>>>>> setting. Thanks a lot Dulaj!
>>>>>>
>>>>>> [1] https://github.com/tillrohrmann/flink/tree/fixLocalFlinkMiniClusterJobClient
>>>>>>
>>>>>> On Thu, Mar 5, 2015 at 4:21 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Not every change in commit b7da22a is required, but I thought they
>>>>>>> were appropriate.
>>>>>>>
>>>>>>>> On Mar 5, 2015, at 8:11 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I found many other places where “localhost” is hard coded. I changed
>>>>>>>> them in a better way, I think. I made a pull request. Please review.
>>>>>>>> b7da22a
>>>>>>>> <https://github.com/viduranga/flink/commit/b7da22a562d3da5a9be2657308c0f82e4e2f80cd>
>>>>>>>>
>>>>>>>>> On Mar 4, 2015, at 8:17 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>> If I recall correctly, we only hardcode "localhost" in the local
>>>>>>>>> mini cluster - do you think it is problematic there as well?
>>>>>>>>>
>>>>>>>>> Have you found any other places?
>>>>>>>>>
>>>>>>>>> On Mon, Mar 2, 2015 at 10:26 AM, Dulaj Viduranga
>>>>>>>>> <vidura...@icloud.com> wrote:
>>>>>>>>>
>>>>>>>>>> In some places of the code, "localhost" is hard coded.
>>>>>>>>>> When it is resolved by the DNS, it is possible to be directed to a
>>>>>>>>>> different IP other than 127.0.0.1 (like the private range
>>>>>>>>>> 10.0.0.0/8). I changed those places to 127.0.0.1 and it works like
>>>>>>>>>> a charm.
>>>>>>>>>> But hard coding 127.0.0.1 is not a good option because when the
>>>>>>>>>> jobmanager ip is changed, this becomes an issue again. I'm thinking
>>>>>>>>>> of setting the jobmanager ip from the config.yaml in these places.
>>>>>>>>>> If you have a better idea on doing this with your experience,
>>>>>>>>>> please let me know.
>>>>>>>>>>
>>>>>>>>>> Best.
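For readers landing on this thread later, the two points discussed above can be illustrated with a minimal, self-contained Java sketch. This is not Flink's actual code; the only name taken from the thread is the jobmanager.rpc.address setting in flink-conf.yaml, and the class and method names here are made up for illustration. It shows (a) that "localhost" goes through the resolver (/etc/hosts, DNS) and is not guaranteed to come back as 127.0.0.1, and (b) binding a server socket explicitly to a configured address instead of the wildcard 0.0.0.0, so clients using that same configured address can always reach it:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BindAddressDemo {

    // "localhost" is resolved via /etc/hosts or DNS; on a misconfigured
    // host it may map to something other than 127.0.0.1 (e.g. 10.x.x.x).
    static InetAddress resolve(String host) throws Exception {
        return InetAddress.getByName(host);
    }

    public static void main(String[] args) throws Exception {
        InetAddress viaName = resolve("localhost");
        InetAddress loopback = InetAddress.getLoopbackAddress();
        System.out.println("\"localhost\" resolved to: " + viaName.getHostAddress());
        System.out.println("loopback address is:     " + loopback.getHostAddress());

        // Bind explicitly to the configured address (standing in for
        // jobmanager.rpc.address), not to the wildcard 0.0.0.0.
        // Port 0 lets the OS pick a free port.
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(loopback, 0));
            int port = server.getLocalPort();

            // A client that uses the same configured address can connect,
            // regardless of what "localhost" happens to resolve to.
            try (Socket client = new Socket(loopback, port)) {
                System.out.println("connected: " + client.isConnected());
            }
        }
    }
}
```

The design point is simply that both the server's bind address and the clients' connect address should come from the same configuration value, rather than one side using a hard-coded hostname that the resolver may interpret differently.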