Re: MLlib vs Madlib

2014-12-14 Thread Brian Dolan
> I need to perform large scale text analytics and I can store data on HDFS or
> on Pivotal Greenplum/Hawq.
>
> Regards,
> Venkat Ankam
>
> From: Brian Dolan [mailto:buddha_...@yahoo.com]
> Sent: Sunday, December 14, 2014 10:02 AM
> To: Venkat, Ankam
> Cc: '

RE: MLlib vs Madlib

2014-12-14 Thread Venkat, Ankam
orm large scale text analytics and I can store data on HDFS or on Pivotal Greenplum/Hawq.

Regards,
Venkat Ankam

From: Brian Dolan [mailto:buddha_...@yahoo.com]
Sent: Sunday, December 14, 2014 10:02 AM
To: Venkat, Ankam
Cc: 'user@spark.apache.org'
Subject: Re: MLlib vs Madlib

MADlib (http:

Re: MLlib vs Madlib

2014-12-14 Thread Brian Dolan
MADlib (http://madlib.net/) was designed to bring large-scale ML techniques to a relational database, primarily PostgreSQL. MLlib assumes the data exists in some Spark-compatible data format. I would suggest you pick the library that matches your data platform first. DISCLAIMER: I am the origi