Re: Checking for existance of output directory/files before running a batch job

2016-08-24 Thread Maximilian Michels
Forgot to mention, this is on the master. For Flink < 1.2.x, you will have to use GlobalConfiguration.get(); On Wed, Aug 24, 2016 at 12:23 PM, Maximilian Michels wrote: > Hi Niels, > > The problem is that such a method only works reliably if the cluster > configuration, e.g. Flink and Hadoop config

Re: Checking for existance of output directory/files before running a batch job

2016-08-24 Thread Maximilian Michels
Hi Niels, The problem is that such a method only works reliably if the cluster configuration, e.g. the Flink and Hadoop config files, is present on the client machine. Also, the environment variables have to be set correctly. This is usually not the case when working from the IDE. But it seems like your c

Re: Checking for existance of output directory/files before running a batch job

2016-08-22 Thread Niels Basjes
Yes, that did the trick. Thanks. I was using a relative path without any FS specification. So my path was "foo", and on the cluster this resolves to "hdfs:///user/nbasjes/foo". Locally it resolved to "file:///home/nbasjes/foo", hence the mismatch I was looking at. For now I can work with this f
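The mismatch described above can be sketched with plain `java.net.URI`. This is only a simplified model of how an unqualified path ends up on different file systems. It is not Flink's or Hadoop's actual resolution code, and the class `PathQualifier`, the method `qualify`, and its parameters are hypothetical names introduced for illustration.

```java
import java.net.URI;

// Hypothetical helper: a simplified model of path qualification,
// NOT the logic Flink or Hadoop actually use internally.
class PathQualifier {

    /**
     * Returns the path unchanged if it already carries a scheme (hdfs://, file://, ...).
     * Otherwise prefixes it with the assumed default file system and, for relative
     * paths, the assumed working directory on that file system.
     */
    static String qualify(String path, String defaultFs, String workingDir) {
        if (URI.create(path).getScheme() != null) {
            return path; // already fully qualified, the environment does not matter
        }
        String absolute = path.startsWith("/") ? path : workingDir + "/" + path;
        return defaultFs + absolute;
    }

    public static void main(String[] args) {
        // Same relative path, two environments (values taken from the message above).
        System.out.println(qualify("foo", "hdfs://", "/user/nbasjes")); // cluster
        System.out.println(qualify("foo", "file://", "/home/nbasjes")); // local client

        // A fully qualified path resolves identically in both environments.
        System.out.println(qualify("hdfs:///user/nbasjes/foo", "file://", "/home/nbasjes"));
    }
}
```

Under these assumptions, the relative path "foo" yields a different location per environment, while the fully qualified form is stable. That is why an existence check on the client can look at a different directory than the one the job writes to.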

Re: Checking for existance of output directory/files before running a batch job

2016-08-19 Thread Robert Metzger
Oops. Looks like Google Mail / Apache / the internet needs 13 minutes to deliver an email. Sorry for double answering. On Fri, Aug 19, 2016 at 3:07 PM, Maximilian Michels wrote: > Hi Niels, > > Have you tried specifying the fully-qualified path? The default is the > local file system. > > For e

Re: Checking for existance of output directory/files before running a batch job

2016-08-19 Thread Robert Metzger
Hi Niels, I assume the directoryName you are passing doesn't have the file system prefix (hdfs://, s3://, ...) specified. In those cases, Path.getFileSystem() looks up the default file system prefix from the configuration. Probably the environment where you are submitting the job from doesn

Re: Checking for existance of output directory/files before running a batch job

2016-08-19 Thread Maximilian Michels
Hi Niels, Have you tried specifying the fully-qualified path? The default is the local file system. For example, hdfs:///path/to/foo If that doesn't work, do you have the same Hadoop configuration on the machine where you test? Cheers, Max On Thu, Aug 18, 2016 at 2:02 PM, Niels Basjes wrote:

Checking for existance of output directory/files before running a batch job

2016-08-18 Thread Niels Basjes
Hi, I have a batch job that I run on YARN that creates files in HDFS. I want to avoid running this job at all if the output already exists. So in my code (before submitting the job into yarn-session) I do this: String directoryName = "foo"; Path directory = new Path(directoryName); FileS