Re: MLlib vs Madlib

2014-12-14 Thread Brian Dolan
> I need to perform large-scale text analytics, and I can store data on HDFS or
> on Pivotal Greenplum/Hawq.
>
> Regards,
> Venkat Ankam
>
> From: Brian Dolan [mailto:buddha_...@yahoo.com]
> Sent: Sunday, December 14, 2014 10:02 AM
> To: Venkat, Ankam
> Cc: '

Re: MLlib vs Madlib

2014-12-14 Thread Brian Dolan
MADlib (http://madlib.net/) was designed to bring large-scale ML techniques to a relational database, primarily PostgreSQL. MLlib assumes the data exists in some Spark-compatible data format. I would suggest you first pick the library that matches your data platform. DISCLAIMER: I am the origi

Setting network variables in spark-shell

2014-11-30 Thread Brian Dolan
Howdy Folks,
What is the correct syntax in 1.0.0 to set networking variables in the Spark shell? Specifically, I'd like to set spark.akka.frameSize. I'm attempting this:
    spark-shell -Dspark.akka.frameSize=1 --executor-memory 4g
Only to get this within the session:
    System.getProperty("spark