Re: [mllib] State of Multi-Model training

2014-09-17 Thread Burak Yavuz
I believe it will be in the main repo. Burak - Original Message - From: "Kyle Ellrott" To: "Burak Yavuz" Cc: dev@spark.apache.org Sent: Wednesday, September 17, 2014 9:48:54 AM Subject: Re: [mllib] State of Multi-Model training This sounds like a pretty major re

Re: [mllib] State of Multi-Model training

2014-09-17 Thread Kyle Ellrott
dback from you and the rest of the > community! > > Best, > Burak > > - Original Message - > From: "Kyle Ellrott" > To: "Burak Yavuz" > Cc: dev@spark.apache.org > Sent: Tuesday, September 16, 2014 9:41:45 PM > Subject: Re: [mllib] Sta

Re: [mllib] State of Multi-Model training

2014-09-16 Thread Burak Yavuz
ny feedback from you and the rest of the community! Best, Burak - Original Message - From: "Kyle Ellrott" To: "Burak Yavuz" Cc: dev@spark.apache.org Sent: Tuesday, September 16, 2014 9:41:45 PM Subject: Re: [mllib] State of Multi-Model training I'd be intereste

Re: [mllib] State of Multi-Model training

2014-09-16 Thread Kyle Ellrott
I'd be interested in helping to test your code as soon as it's available. The version I wrote used a paired RDD and combined by key; it worked best with a custom partitioner that put all the samples in the same area. Running things in batched matrices would probably speed things up greatly. Yo
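The keyed approach Kyle mentions can be illustrated roughly as follows. This is a plain-Python stand-in, not Spark code: `combine_by_key`, the partitioner, and the sample values are all hypothetical, and it assumes "same area" means that every record for a given model key is routed to one partition.

```python
def combine_by_key(pairs, partitioner, num_partitions,
                   create, merge_value, merge_combiners):
    """Fold (key, value) pairs per key, partition by partition.

    Hypothetical helper that imitates a keyed combine; it is not a Spark API.
    """
    partitions = [dict() for _ in range(num_partitions)]
    for key, value in pairs:
        part = partitions[partitioner(key) % num_partitions]
        # First value for a key creates an accumulator, later ones merge in.
        part[key] = merge_value(part[key], value) if key in part else create(value)
    result = {}
    for part in partitions:
        for key, acc in part.items():
            # Because the partitioner depends only on the key, each key lives
            # in one partition and this cross-partition merge is never needed.
            result[key] = merge_combiners(result[key], acc) if key in result else acc
    return result


# Each record is (model id, per-sample contribution); values are made up.
pairs = [("a", 1.0), ("b", 2.0), ("a", 3.0), ("b", 4.0)]
result = combine_by_key(
    pairs,
    partitioner=lambda key: sum(key.encode()),   # deterministic toy partitioner
    num_partitions=2,
    create=lambda v: (v, 1),                     # accumulator is (sum, count)
    merge_value=lambda acc, v: (acc[0] + v, acc[1] + 1),
    merge_combiners=lambda x, y: (x[0] + y[0], x[1] + y[1]),
)
```

Here `result` maps each model id to a running (sum, count) pair, which is the kind of per-model aggregate a keyed combine would produce.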

Re: [mllib] State of Multi-Model training

2014-09-16 Thread Burak Yavuz
Hi Kyle, I'm actively working on it now. It's pretty close to completion; I'm just trying to figure out bottlenecks and optimize as much as possible. As Phase 1, I implemented multi-model training on Gradient Descent. Instead of performing Vector-Vector operations on rows (examples) and weights,
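The preview is cut off, so what replaces the vector-vector operations is not shown here. One plausible reading, in line with the remark about batched matrices elsewhere in the thread, is that the weights of all models are stacked into a single matrix so that each pass is a matrix-matrix product. A minimal pure-Python sketch under that assumption follows; the least-squares loss, the step sizes, and every name in it are hypothetical and not taken from the MLlib code.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def train_multi_model(X, y, step_sizes, iterations=500):
    """Batch gradient descent for least squares, one model per step size.

    W has shape (num_features, num_models); column j holds model j's weights.
    """
    n, d, k = len(X), len(X[0]), len(step_sizes)
    W = [[0.0] * k for _ in range(d)]
    Xt = transpose(X)
    for _ in range(iterations):
        preds = matmul(X, W)  # (n, k): predictions of every model at once
        resid = [[preds[i][j] - y[i] for j in range(k)] for i in range(n)]
        grad = matmul(Xt, resid)  # (d, k): one gradient column per model
        W = [[W[f][j] - step_sizes[j] * grad[f][j] / n for j in range(k)]
             for f in range(d)]
    return W


# Toy data generated from y = 2*x1 + 3*x2 (made-up example values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, 3.0, 5.0, 7.0]
W = train_multi_model(X, y, step_sizes=[0.1, 0.3])
models = transpose(W)  # one weight vector per step size
```

Both step sizes are small enough for this toy problem to converge, so each row of `models` ends up close to the generating weights. The point of the layout is that the data matrix is traversed once per iteration regardless of how many models are trained.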