@Riccardo

Spark does not do the training part of the DL pipeline (afaik), so it is
limited to data ingestion and transforms (ETL). It is therefore optional,
and other ETL options might be better for you.

Most of the technologies @Gourav mentions scale through their own compute
engines, specialized for their DL implementations. Be aware that Spark
scaling has nothing to do with scaling most of the DL engines; they have
their own solutions.
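
A minimal sketch of that split, assuming PySpark for the ETL stage and
Horovod's horovodrun launcher for training. The paths, the "label" column,
train.py and its --data flag are all hypothetical placeholders, not something
from this thread:

```python
from typing import List


def spark_etl(spark, raw_path: str, out_path: str, label_col: str = "label") -> str:
    """Spark stage: ingest raw JSON, clean it, write Parquet for the trainer.

    `spark` is an existing pyspark.sql.SparkSession; the column name and
    paths are assumptions made for illustration only.
    """
    df = spark.read.json(raw_path)
    # Drop rows without a label and remove exact duplicate rows.
    cleaned = df.dropna(subset=[label_col]).dropDuplicates()
    cleaned.write.mode("overwrite").parquet(out_path)
    return out_path


def training_command(data_path: str, num_procs: int, script: str = "train.py") -> List[str]:
    """Build the launch command for the DL stage, which runs outside Spark.

    `horovodrun -np N` starts N training processes; the script name and its
    --data flag are hypothetical.
    """
    return ["horovodrun", "-np", str(num_procs), "python", script, "--data", data_path]


# Usage (needs pyspark and horovod installed):
#   from pyspark.sql import SparkSession
#   import subprocess
#   spark = SparkSession.builder.appName("dl-etl").getOrCreate()
#   out = spark_etl(spark, "/data/raw", "/data/train.parquet")
#   spark.stop()
#   subprocess.run(training_command(out, 4), check=True)
```

The only contract between the two stages here is the Parquet output, so the
number of Spark executors and the number of training processes are sized
independently.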

From: Gourav Sengupta <gourav.sengu...@gmail.com>
Reply: Gourav Sengupta <gourav.sengu...@gmail.com>
Date: May 4, 2019 at 10:24:29 AM
To: Riccardo Ferrari <ferra...@gmail.com>
Cc: User <user@spark.apache.org>
Subject: Re: Deep Learning with Spark, what is your experience?

Try using MXNet and Horovod directly as well (I think MXNet is worth a
try):
1. https://medium.com/apache-mxnet/distributed-training-using-apache-mxnet-with-horovod-44f98bf0e7b7
2. https://docs.nvidia.com/deeplearning/dgx/mxnet-release-notes/rel_19-01.html
3. https://aws.amazon.com/mxnet/
4. https://aws.amazon.com/blogs/machine-learning/aws-deep-learning-amis-now-include-horovod-for-faster-multi-gpu-tensorflow-training-on-amazon-ec2-p3-instances/


Of course, TensorFlow is backed by Google's advertisement team as well:
https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-training-with-tensorflow/


Regards,




On Sat, May 4, 2019 at 10:59 AM Riccardo Ferrari <ferra...@gmail.com> wrote:

> Hi list,
>
> I am trying to understand if it makes sense to leverage Spark as an
> enabling platform for Deep Learning.
>
> My open questions to you are:
>
>    - Do you use Apache Spark in your DL pipelines?
>    - How do you use Spark for DL? Is it just a stand-alone stage in the
>    workflow (i.e. a data preparation script), or is it more integrated?
>
> I see a major advantage in leveraging Spark as a unified entry point:
> for example, you can easily abstract data sources and leverage existing
> team skills for data pre-processing and training. On the flip side, you may
> hit some limitations, including supported versions and so on.
> What is your experience?
>
> Thanks!
>
