Re: Student looking to contribute to Stratosphere

Ufuk Celebi Wed, 15 Jul 2015 05:53:10 -0700

Hey Rohit,

it's best to discuss a specific issue *in* the issue itself rather than on
the mailing list.

In general, it's better to ask specific questions. But a general pointer
would be to look into the existing ML algorithm implementations and Stephan's
approximate PageRank implementation linked in the issue, and then think
about how to translate it into the ML library. This would also be a first
step toward asking more specific questions.

– Ufuk

On Wed, Jul 15, 2015 at 2:42 PM, Rohit Shinde <[email protected]>
wrote:

> I intend to solve this issue:
> https://issues.apache.org/jira/browse/FLINK-1748
>
> Could someone give me some pointers on how to approach this?
>
> On Wed, Jul 15, 2015 at 4:58 PM, Kostas Tzoumas <[email protected]>
> wrote:
>
> > IDE choice is up to you, with some limitations. See here for IDE setup
> > instructions:
> >
> > https://ci.apache.org/projects/flink/flink-docs-release-0.9/internals/ide_setup.html
> >
> > Scala IDE is not limited to Scala; it is based on Eclipse, so you can also
> > develop in Java. Most committers are using IntelliJ as far as I know.
> >
> > On Wed, Jul 15, 2015 at 1:24 PM, Rohit Shinde <[email protected]>
> > wrote:
> >
> > > What IDE should I use? There are various options and I already have
> > > Eclipse Luna. The IDE page lists that the Scala IDE is the best. So should
> > > I go with the Scala IDE? Will I be able to develop in Java later?
> > >
> > > On Wed, Jul 15, 2015 at 4:44 PM, Kostas Tzoumas <[email protected]>
> > > wrote:
> > >
> > > > Hi Rohit,
> > > >
> > > > If you are just working on your laptop, I personally find it much easier
> > > > to work without Hadoop and use the local file system or just Java
> > > > collections for testing and trying out ideas.
> > > >
> > > > When you move to a cluster, it is common to use a Hadoop installation to
> > > > store large files in HDFS. There, you can run Flink jobs using Flink's
> > > > YARN mode.
> > > >
> > > > Kostas
> > > >
> > > > On Wed, Jul 15, 2015 at 8:22 AM, Márton Balassi <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Hadoop is not a necessity for running Flink, but rather an option. Try
> > > > > the steps of the setup guide. [1]
> > > > > If you really need HDFS, though, to get the best IO performance I would
> > > > > suggest having Hadoop on all your machines running Flink.
> > > > >
> > > > > [1]
> > > > > https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/setup_quickstart.html
> > > > >
> > > > > On Jul 15, 2015 5:27 AM, "Rohit Shinde" <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Sorry for the brief hiatus. I was preparing for my GRE exam, but I am
> > > > > > back. I am starting to build Flink, and one question I had was: is a
> > > > > > single-node cluster configuration of Hadoop enough? I assume Hadoop is
> > > > > > needed since it is listed on the build page.
> > > > > >
> > > > > > On Sat, Jun 27, 2015 at 8:02 PM, Chiwan Park <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, you can choose any unassigned issue about the Flink Machine
> > > > > > > Learning Library (flink-ml) in JIRA. [1]
> > > > > > > There are some starter issues in flink-ml, such as FLINK-1737 [2],
> > > > > > > FLINK-1748 [3], and FLINK-1994 [4].
> > > > > > >
> > > > > > > First, it would be better to read some articles about contributing to
> > > > > > > Flink. [5][6]
> > > > > > > Once you decide on an issue to work on, please assign it to yourself.
> > > > > > > If you don’t have permission to assign, just comment on the issue.
> > > > > > > Someone will then give you permission and assign the issue to you.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Chiwan Park
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/
> > > > > > > [2] https://issues.apache.org/jira/browse/FLINK-1737
> > > > > > > [3] https://issues.apache.org/jira/browse/FLINK-1748
> > > > > > > [4] https://issues.apache.org/jira/browse/FLINK-1994
> > > > > > > [5] http://flink.apache.org/how-to-contribute.html
> > > > > > > [6] http://flink.apache.org/coding-guidelines.html
> > > > > > >
> > > > > > > > On Jun 27, 2015, at 11:20 PM, Rohit Shinde <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hello everyone,
> > > > > > > >
> > > > > > > > > I came across Stratosphere while looking for GSOC organisations
> > > > > > > > > working in Machine Learning. I got to know that it had become
> > > > > > > > > Apache Flink.
> > > > > > > >
> > > > > > > > > I am interested in this project:
> > > > > > > > >
> > > > > > > > > https://github.com/stratosphere/stratosphere/wiki/Google-Summer-of-Code-2014#implement-one-or-multiple-machine-learning-algorithms-for-stratosphere
> > > > > > > >
> > > > > > > > > Background: I am proficient in C++, Java, Python and Scheme. I have
> > > > > > > > > taken undergrad courses in machine learning and data mining. How can
> > > > > > > > > I contribute to the above project?
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Rohit Shinde.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
