RollingSink with APIs requring fs+path

2016-03-20 Thread Lasse Dalegaard
Hello, I'm working on a project where I stream in data from Kafka, massage it a bit, and then want to write it into HDFS using the RollingSink. This works just fine using the provided examples, but I would like the data to be stored in ORC on HDFS, rather than sequence files. I am however un

RE: RollingSink with APIs requring fs+path

2016-03-19 Thread Lasse Dalegaard
To: user@flink.apache.org
Subject: Re: RollingSink with APIs requring fs+path

Hi, you are right, it is currently only possible to write to a FSDataOutputStream. It could be generified as you mentioned. One thing that needs to be taken care of, however, is that the write offsets are correctly checkpointed to e

Re: RollingSink with APIs requring fs+path

2016-03-18 Thread Aljoscha Krettek
Hi, you are right, it is currently only possible to write to a FSDataOutputStream. It could be generified as you mentioned. One thing that needs to be taken care of, however, is that the write offsets are correctly checkpointed to ensure exactly-once semantics in case of failure. Right now, we d
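
The point about checkpointing write offsets can be illustrated with a minimal sketch. This is not Flink's RollingSink code: it assumes a local file instead of HDFS, and the class `OffsetCheckpointSketch` and its methods `snapshotOffset` and `restoreOffset` are hypothetical names chosen for illustration. The idea is only that the sink records how many bytes were valid at checkpoint time and, on recovery, discards everything written after that point so that replayed records are not duplicated.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not part of Flink: shows only the offset-checkpointing idea.
public class OffsetCheckpointSketch {
    private final FileChannel channel;

    public OffsetCheckpointSketch(Path file) throws IOException {
        this.channel = FileChannel.open(
                file, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        // Continue after whatever the file already contains.
        channel.position(channel.size());
    }

    // Append one record as a UTF-8 line.
    public void write(String record) throws IOException {
        ByteBuffer buf =
                ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    // At checkpoint time: force data to disk and return the valid length.
    // This number is what would be stored in the checkpoint.
    public long snapshotOffset() throws IOException {
        channel.force(true);
        return channel.position();
    }

    // On recovery: drop all bytes written after the checkpointed offset,
    // so records replayed from the source do not appear twice.
    public void restoreOffset(long offset) throws IOException {
        channel.truncate(offset);
        channel.position(offset);
    }

    public void close() throws IOException {
        channel.close();
    }
}
```

A writer that only accepts a filesystem and a path, as in the subject of this thread, does not expose a byte position in this way, which is presumably why the offset bookkeeping needs separate attention if the sink is generified.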

Re: RollingSink with APIs requring fs+path

2016-03-18 Thread Aljoscha Krettek
>
> Best regards,
> Lasse
>
> From: Aljoscha Krettek
> Sent: Friday, March 18, 2016 1:56 PM
> To: user@flink.apache.org
> Subject: Re: RollingSink with APIs requring fs+path
>
> Hi,
> you are right, it is currently only possible to writ