MLlib is the old RDD-based API. Since Apache Spark 2 it is recommended to use
the Dataset-based APIs to get good performance, and ML was introduced for this.
ML contains the new API built around Dataset and ML Pipelines; MLlib is slowly
being deprecated (this has already happened in the case of linear regression).
MLlib currently enter
Our main challenge has been the general lack of support for missing values.
On Sat, Sep 23, 2017 at 3:41 AM, Irfan Kabli wrote:
> Dear All,
>
> We are looking to position MLlib in our organisation for machine learning
> tasks and are keen to understand if there are any challenges that you might
This is something I wrote specifically about the challenges that we faced
when taking Spark ML models to production:
http://www.tothenew.com/blog/when-you-take-your-machine-learning-models-to-production-for-real-time-predictions/
On Sat, Sep 23, 2017 at 1:33 PM, Jörn Franke wrote:
> As far as I kno
As far as I know, there is currently no in-memory encryption in Spark. There are
some research projects to create secure in-memory enclaves based on Intel SGX,
but there is still a lot to do in terms of performance and security objectives.
The more interesting question is why would you need this f
Dear All,
We are looking to position MLlib in our organisation for machine learning
tasks and are keen to understand if there are any challenges that you might
have seen with MLlib in production. We will be going with the pure
open-source approach here, rather than using one of the Hadoop
distribu