It would be great to get more contributions!  If you're new to
contributing, it's best to start with some small contributions and to
read the contributor guide:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

If those build up to a larger contribution, the top JIRAs I'd pick out
are:

SPARK-6442 (local linear algebra): This could be done incrementally, and
should be coordinated on that JIRA since I believe others may be working on
it.

SPARK-3703 (Ensemble algorithms): It would be great to get a generic
boosting algorithm under the Pipelines API (probably AdaBoost).

SPARK-5992 (LSH): I believe there is active work on this, so it would be
important to coordinate via JIRA on that.

The other JIRAs which Feynman & I did not comment on either have some
active work or are likely lower priority.  However, if you're interested in
one of those algorithms, you could publish it as a Spark package:
http://spark-packages.org/

Good luck!
Joseph

On Thu, Jul 9, 2015 at 1:20 PM, Feynman Liang <fli...@databricks.com> wrote:

> Exciting, thanks for the contribution! I'm currently aware of:
>
>    - SPARK-8499 is currently in progress (in a duplicate issue); I
>    updated the JIRA to reflect that.
>    - SPARK-5992 has a Spark package
>    <http://spark-packages.org/package/mrsqueeze/spark-hash> linked, but
>    I'm unclear on whether there is any progress there.
>
> Feynman
>
> On Thu, Jul 9, 2015 at 1:04 PM, emrehan <emrehan.tu...@gmail.com> wrote:
>
>> Hi all,
>>
>> We could contribute a feature to Spark MLlib by May 2016 and make it
>> count as our undergraduate senior project. The following issues seem
>> interesting to us:
>>
>> *  https://issues.apache.org/jira/browse/SPARK-2273 – Online learning algorithms: Passive Aggressive
>> *  https://issues.apache.org/jira/browse/SPARK-2335 – K-Nearest Neighbor classification and regression for MLLib
>> *  https://issues.apache.org/jira/browse/SPARK-2401 – AdaBoost.MH, a multi-class multi-label classifier
>> *  https://issues.apache.org/jira/browse/SPARK-4251 – Add Restricted Boltzmann machine(RBM) algorithm to MLlib
>> *  https://issues.apache.org/jira/browse/SPARK-4752 – Classifier based on artificial neural network
>> *  https://issues.apache.org/jira/browse/SPARK-5575 – Artificial neural networks for MLlib deep learning
>> *  https://issues.apache.org/jira/browse/SPARK-5992 – Locality Sensitive Hashing (LSH) for MLlib
>> *  https://issues.apache.org/jira/browse/SPARK-6425 – Add parallel Q-learning algorithm to MLLib
>> *  https://issues.apache.org/jira/browse/SPARK-6442 – Local Linear Algebra Package
>> *  https://issues.apache.org/jira/browse/SPARK-8499 – NaiveBayes implementation for MLPipeline
>>
>> All of these tickets are marked unassigned but have some work done on
>> them. Are any of these issues unsuitable for us as a senior project?
>>
>> Kind regards,
>> Can Giracoglu, Emrehan Tuzun, Remzi Can Aksoy, Saygin Dogu
>>
>
