Re: Processing S3 data with Apache Flink

2015-10-06 Thread Robert Metzger
Hi Kostia, I understand your concern. I am going to propose to the Flink developers to remove the S3 File System support in Flink. Also, regarding these annotations, we are actually planning to add them for the 1.0 release so that users know which interfaces they can rely on. Which other component

Re: Processing S3 data with Apache Flink

2015-10-06 Thread KOSTIANTYN Kudriavtsev
Hi Robert, you are right, I just misspelled the name of the file :( Everything works fine! Basically, I'd suggest moving this workaround into the official docs and marking the custom S3FileSystem as @Deprecated... In fact, I like the idea of marking all untested functionality with a specific annotation, for example @Be

Re: Processing S3 data with Apache Flink

2015-10-06 Thread Robert Metzger
Mh. I tried out the code I posted yesterday and it was working immediately. The security settings of AWS are sometimes a bit complicated. I think there are some logs for S3 buckets, maybe they contain some more information. Maybe there are other users facing the same issue. Since the S3FileSyst

Re: Processing S3 data with Apache Flink

2015-10-06 Thread KOSTIANTYN Kudriavtsev
Hi Robert, thank you very much for your input! Have you tried that? With org.apache.hadoop.fs.s3native.NativeS3FileSystem I moved forward, and now got a new exception: Caused by: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/***.csv' - ResponseCode=403, ResponseMessage=For
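A 403 from S3 usually means the request was not authorized, which most often comes down to credentials not being picked up. When going through org.apache.hadoop.fs.s3native.NativeS3FileSystem, Hadoop reads the AWS keys from core-site.xml. A minimal sketch of that configuration (the key values are placeholders; the file location depends on your Hadoop setup):

```xml
<!-- core-site.xml: credentials for NativeS3FileSystem (s3n://) -->
<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```

If the keys are set but the 403 persists, the bucket policy or IAM permissions for the key are the next place to look.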

Re: Running Flink on an Amazon Elastic MapReduce cluster

2015-10-06 Thread Hanen Borchani
Hi Max, You are right, the problem is related to the Hadoop configuration: both the HADOOP_HOME and HADOOP_CONF_DIR environment variables were empty. Executing export HADOOP_CONF_DIR=/etc/hadoop/conf solved the problem, and everything works fine now! Many thanks for the help :) Best regards, Hanen

Re: ExecutionEnvironment setConfiguration API

2015-10-06 Thread Flavio Pompermaier
That makes sense: what can be configured should be differentiated between local and remote environments (obviously this is a minor issue/improvement). Thanks again, Flavio On Tue, Oct 6, 2015 at 11:25 AM, Stephan Ewen wrote: > We can think about that, but I think it may be quite confusing. The > configu

Re: ExecutionEnvironment setConfiguration API

2015-10-06 Thread Stephan Ewen
We can think about that, but I think it may be quite confusing. The configurations actually mean something different for local and remote environments: - For the local environment, the config basically describes the entire Flink cluster setup (for the local execution cluster in the background)

Re: source binary file

2015-10-06 Thread Fabian Hueske
Hi Lydia, you need to implement a custom InputFormat to read binary files. Usually you can extend the FileInputFormat. The implementation depends a lot on your use case, for example whether each binary file is read into a single or multiple records and how records are delimited if there are more t
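The record-delimiting decision Fabian describes is the heart of any binary InputFormat: where does one record end and the next begin? It can be sketched without Flink at all, using plain java.io and assuming fixed-length records (the class name and the int-plus-double record layout are made up for illustration; a real FileInputFormat subclass would put this logic in its nextRecord()/readRecord() methods):

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

public class FixedRecordReader {

    // Parse fixed-length records (one int id + one double value each)
    // from a binary stream -- the same record-splitting decision a
    // custom FileInputFormat has to make.
    static List<String> readRecords(InputStream raw) throws IOException {
        List<String> records = new ArrayList<>();
        DataInputStream in = new DataInputStream(raw);
        // a full record is 4 bytes (int) + 8 bytes (double)
        while (in.available() >= Integer.BYTES + Double.BYTES) {
            int id = in.readInt();
            double value = in.readDouble();
            records.add(id + "," + value);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("records", ".bin");
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(tmp))) {
            out.writeInt(1); out.writeDouble(0.5);   // record 1
            out.writeInt(2); out.writeDouble(1.5);   // record 2
        }
        try (FileInputStream in = new FileInputStream(tmp)) {
            for (String rec : readRecords(in)) {
                System.out.println(rec);             // prints "1,0.5" then "2,1.5"
            }
        }
        tmp.delete();
    }
}
```

For variable-length records the loop would instead read a length prefix or scan for a delimiter, which is exactly the use-case-dependent part Fabian points out.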

source binary file

2015-10-06 Thread Lydia Ickler
Hi, how would I read a binary file from HDFS with the Flink Java API? I can only find the Scala way… All the best, Lydia

Re: LDBC Graph Data into Flink

2015-10-06 Thread Martin Junghanns
Hi Vasia, No problem. Sure, Gelly is just a map() call away :) Best, Martin On 06.10.2015 10:53, Vasiliki Kalavri wrote: > Hi Martin, > > thanks a lot for sharing! This is a very useful tool. > I only had a quick look, but if we merge label and payload inside a Tuple2, > then it should also be

Re: ExecutionEnvironment setConfiguration API

2015-10-06 Thread Flavio Pompermaier
However, it could be a good idea to also overload getExecutionEnvironment() to be able to pass a custom configuration... what do you think? Otherwise I have to know a priori whether I'm working in a local deployment or a remote one, or check whether getExecutionEnvironment() returned an instance of Local

Re: ExecutionEnvironment setConfiguration API

2015-10-06 Thread Flavio Pompermaier
Yes Stephan! I usually work with the master version, at least in development ;) Thanks for the quick support! Best, Flavio On Tue, Oct 6, 2015 at 10:48 AM, Stephan Ewen wrote: > Hi! > > Are you on the SNAPSHOT master version? > > You can pass the configuration to the constructor of the executio

Re: LDBC Graph Data into Flink

2015-10-06 Thread Vasiliki Kalavri
Hi Martin, thanks a lot for sharing! This is a very useful tool. I only had a quick look, but if we merge label and payload inside a Tuple2, then it should also be Gelly-compatible :) Cheers, Vasia. On 6 October 2015 at 10:03, Martin Junghanns wrote: > Hi all, > > For our benchmarks with Flink

Re: ExecutionEnvironment setConfiguration API

2015-10-06 Thread Stephan Ewen
Hi! Are you on the SNAPSHOT master version? You can pass the configuration to the constructor of the execution environment, or create one via ExecutionEnvironment.createLocalEnvironment(config) or via createRemoteEnvironment(host, port, configuration, jarFiles); The change of the signature was p
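Putting Stephan's answer together with Flavio's original question, the replacement for the removed setConfiguration() looks roughly like this (a sketch based on the signatures quoted in the thread; host, port, jar path, and directory values are placeholders):

```java
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.ConfigConstants;
import org.apache.flink.configuration.Configuration;

public class ConfiguredEnv {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // the keys Flavio asked about, with hypothetical values
        config.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, "/tmp/flink-tmp");
        config.setString(ConfigConstants.BLOB_STORAGE_DIRECTORY_KEY, "/tmp/flink-blobs");

        // local: the config describes the embedded Flink cluster
        ExecutionEnvironment localEnv =
                ExecutionEnvironment.createLocalEnvironment(config);

        // remote: the config is the client-side configuration
        ExecutionEnvironment remoteEnv =
                ExecutionEnvironment.createRemoteEnvironment(
                        "jobmanager-host", 6123, config, "/path/to/job.jar");
    }
}
```

As Stephan notes, the same Configuration object means different things in the two cases, which is why a single getExecutionEnvironment(config) overload could be confusing.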

ExecutionEnvironment setConfiguration API

2015-10-06 Thread Flavio Pompermaier
Hi to all, today my code doesn't compile anymore because ExecutionEnvironment doesn't have setConfiguration() anymore... how can I set the following parameters in my unit tests? - ConfigConstants.TASK_MANAGER_TMP_DIR_KEY - ConfigConstants.BLOB_STORAGE_DIRECTORY_KEY - ConfigConstants.TASK_MANAGER_NE

LDBC Graph Data into Flink

2015-10-06 Thread Martin Junghanns
Hi all, For our benchmarks with Flink, we are using a data generator provided by the LDBC project (Linked Data Benchmark Council) [1][2]. The generator uses MapReduce to create directed, labeled, attributed graphs that mimic properties of real online social networks (e.g., degree distribution,

Re: kryo exception due to race condition

2015-10-06 Thread Till Rohrmann
Hi Stefano, we'll definitely look into it once Flink Forward is over and we've finished the current release work. Thanks for reporting the issue. Cheers, Till On Tue, Oct 6, 2015 at 9:21 AM, Stefano Bortoli wrote: > Hi guys, I could manage to complete the process crossing byte arrays I > deser

Re: kryo exception due to race condition

2015-10-06 Thread Stefano Bortoli
Hi guys, I managed to complete the process by crossing byte arrays that I deserialize within the group function. However, I think this workaround is feasible only for relatively simple processes. Any idea/plan for fixing the serialization problem? saluti, Stefano Stefano Bortoli, PhD *ENS Techni
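Stefano's workaround, shipping opaque byte arrays through the cross and only deserializing inside the user function, can be sketched in plain Java. Here standard java.io serialization stands in for whatever serializer the actual job uses (the class and method names are made up for illustration):

```java
import java.io.*;

public class ByteArrayWorkaround {

    // Serialize a value to a byte[] BEFORE handing it to the operator,
    // so the runtime only ever ships plain byte arrays.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(value);
        }
        return bos.toByteArray();
    }

    // Deserialize INSIDE the user function, so the (possibly
    // non-thread-safe) serializer is never shared between operator threads.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] shipped = toBytes("some record");      // before the cross
        String record = (String) fromBytes(shipped);  // inside the group function
        System.out.println(record);                   // prints "some record"
    }
}
```

The cost is an extra serialization round-trip per record, which is why, as Stefano notes, this only pays off for relatively simple processes.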