Re: Custom Data Source for getting data from Rest based services

2017-11-26 Thread shankar.roy
This would be a useful feature. We can leverage it while doing bulk provisioning.

[Spark ML] Compatibility between features and models

2017-11-26 Thread Ming Ma
Hi, does anyone know how Spark's model serialization format can support feature evolution with backward compatibility? Specifically, when new data that includes both old features and newly added features is fed into old models trained with the old set of features, the old models should be able

Re: NLTK with Spark Streaming

2017-11-26 Thread ashish rawat
Thanks, Holden and Chetan. Holden: have you tried it out, and do you know the right way to do it? Chetan: yes, if we use a Java NLP library, integrating it with Spark Streaming should not be an issue, but as I pointed out earlier, we want to give flexibility to data scientists to use the language

Re: NLTK with Spark Streaming

2017-11-26 Thread Chetan Khatri
But you can still use the Stanford NLP library and distribute it through Spark, right? On Sun, Nov 26, 2017 at 3:31 PM, Holden Karau wrote: > So it’s certainly doable (it’s not super easy mind you), but until the > arrow udf release goes out it will be rather slow. > > On Sun, Nov 26, 2017 at 8:01 AM a

Re: NLTK with Spark Streaming

2017-11-26 Thread Holden Karau
So it’s certainly doable (it’s not super easy mind you), but until the arrow udf release goes out it will be rather slow. On Sun, Nov 26, 2017 at 8:01 AM ashish rawat wrote: > Hi, > > Has someone tried running NLTK (python) with Spark Streaming (scala)? I > was wondering if this is a good idea a