Hi Niraj,

Pleased to hear you want to start contributing to Flink :)

In terms of security, there are some open issues. Like Robert mentioned, it would be great if you could implement proper HDFS Kerberos authentication. Basically, the HDFS Delegation Token needs to be transferred to the workers so that they don't have to be authenticated themselves. Also, proper renewal of the Token needs to be taken care of.

Another security task could be to implement authentication support in Flink. This could be done by asking the user for some kind of shared secret. Alternatively, authentication could also be performed by Kerberos' authentication method.

Let me know what you find interesting.

Best regards,
Max

On Fri, Feb 27, 2015 at 2:31 AM, Rai, Niraj <niraj....@intel.com> wrote:
> Hi Robert,
> Thanks for the detailed response. I worked on the encryption of HDFS as well
> as the crypto file system in HDFS, so I am aware of how it is done in
> Hadoop. Let me sync up with Max to get started on it.
> I will also start looking into the current implementations.
> Niraj
>
>
> From: Robert Metzger [mailto:rmetz...@apache.org]
> Sent: Thursday, February 26, 2015 3:11 PM
> To: dev@flink.apache.org
> Cc: Rai, Niraj
> Subject: Re: Contributing to Flink
>
> Hi Niraj,
>
> Welcome to the Flink community ;)
> I'm really excited that you want to contribute to our project, and since
> you've asked for something in the security area, I actually have something
> very concrete in mind.
> We recently added some support for accessing (Kerberos) secured HDFS clusters
> in Flink: https://issues.apache.org/jira/browse/FLINK-1504.
> However, the implementation is very simple because it assumes that every
> worker of Flink (TaskManager) is authenticated with Kerberos (kinit). It's not
> very practical for large setups because you have to ssh to all machines to
> log into Kerberos.
>
> What I would really like to have in Flink would be a way to transfer the
> authentication tokens from the JobManager (master) to the TaskManagers.
> This way, users only have to be authenticated with Kerberos at the JobManager, and
> Flink is taking care of the rest.
> As far as I understood it, Hadoop already has all the utilities in place for
> getting and transferring the delegation tokens.
> Max Michels, another committer in our project, has quite a good understanding
> of the details there. It would be great if you (Max) could chime in if I
> forgot something.
>
> If you are interested in working on this, you can file a JIRA
> (https://issues.apache.org/jira/browse/FLINK) for tracking the progress and
> discussing the details.
> If not, I'm sure we'll come up with more interesting ideas.
>
>
> Robert
>
>
> On Thu, Feb 26, 2015 at 11:07 PM, Henry Saputra
> <henry.sapu...@gmail.com> wrote:
> Hi Niraj,
>
> Thanks for your interest in Apache Flink. The quickest is to just give
> Flink a spin and figure out how it works.
> This would get you started on how it works before actually doing work on Flink
> =)
>
> Please do visit the Flink how to contribute page [1] and subscribe to the dev
> mailing list [2] to start following up.
>
> Welcome =)
>
> [1] http://flink.apache.org/how-to-contribute.html
> [2] http://flink.apache.org/community.html#mailing-lists
>
> On Thu, Feb 26, 2015 at 1:45 PM, Rai, Niraj
> <niraj....@intel.com> wrote:
>> Hi Flink Dev,
>> I am looking to contribute to Flink, especially in the area of security. In
>> the past, I have contributed to Pig, Hive and HDFS. I would really
>> appreciate it if I can get some work assigned to me. Looking forward to hearing
>> back from the development community of Flink.
>> Thanks
>> Niraj