Re: Feedback: Feature request

2015-08-28 Thread Manish Amde
... though. E.g. "lhs":0,"op":"<=","rhs":-35.0. On Aug 28, 2015, 12:03 AM, "Manish Amde" wrote: "Hi James, It's a good idea. A JSON format is more convenient for visualization ..."

Re: Feedback: Feature request

2015-08-27 Thread Manish Amde
Hi James, It's a good idea. A JSON format is more convenient for visualization, though a little inconvenient to read. How about a toJson() method? It might make the MLlib API inconsistent across models, though. You should probably create a JIRA for this. CC: dev list. -Manish. On Aug 26, 2015, ...
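As a sketch of the idea only (MLlib has no such method at this point), a recursive serializer over a simplified node type could emit nodes of roughly the shape quoted in the follow-up above; the TreeNode/Leaf/Internal types and field names here are hypothetical, not the actual MLlib classes:

// Hypothetical simplified tree node types; not the MLlib Node/Split classes.
sealed trait TreeNode
case class Leaf(prediction: Double) extends TreeNode
case class Internal(feature: Int, threshold: Double, left: TreeNode, right: TreeNode) extends TreeNode

// Emit nodes of the form {"lhs":<featureIndex>,"op":"<=","rhs":<threshold>, ...}
def toJson(node: TreeNode): String = node match {
  case Leaf(p) =>
    s"""{"predict":$p}"""
  case Internal(f, t, l, r) =>
    s"""{"lhs":$f,"op":"<=","rhs":$t,"left":${toJson(l)},"right":${toJson(r)}}"""
}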

Re: Welcoming three new committers

2015-02-03 Thread Manish Amde
Congratulations Cheng, Joseph, and Sean. On Tuesday, February 3, 2015, Zhan Zhang wrote: "Congratulations!" On Feb 3, 2015, at 2:34 PM, Matei Zaharia wrote: "Hi all, The PMC recently voted to add three new committers: Cheng Lian, Joseph Bradley, and Sean Owen. All three have ..."

Re: Quantile regression in tree models

2014-11-18 Thread Manish Amde
... Is there a design doc explaining how the gradient boosting algorithm is laid out in MLlib? I tried reading the code, but without a "Rosetta stone" it's impossible to make sense of it. Alex. On Mon, Nov 17, 2014 at 8:25 PM, Manish Amde wrote: ...

Re: Quantile regression in tree models

2014-11-17 Thread Manish Amde
... "weak hypothesis weights". Does this refer to the weights of the leaves of the trees? Alex. On Mon, Nov 17, 2014 at 2:24 PM, Manish Amde wrote: "Hi Alessandro, MLlib v1.1 supports variance for regression and gini impurity ..."

Re: Quantile regression in tree models

2014-11-17 Thread Manish Amde
Hi Alessandro, MLlib v1.1 supports variance for regression, and gini impurity and entropy for classification: http://spark.apache.org/docs/latest/mllib-decision-tree.html. If the information gain calculation can be performed by distributed aggregation, then it might be possible to plug it into the e...
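For intuition on the distributed-aggregation point: impurity-based information gain only needs per-class label counts on each side of a candidate split, and those counts are exactly the kind of statistic RDD.aggregate can collect. A minimal sketch of the impurity math (not the MLlib internals):

// Gini impurity from per-class label counts; the counts themselves can be
// gathered with a distributed aggregation (e.g. RDD.aggregate).
def gini(counts: Map[Double, Long]): Double = {
  val total = counts.values.sum.toDouble
  if (total == 0.0) 0.0
  else 1.0 - counts.values.map { c => val p = c / total; p * p }.sum
}

// Information gain of a split = parent impurity minus weighted child impurities.
def infoGain(parent: Map[Double, Long], left: Map[Double, Long], right: Map[Double, Long]): Double = {
  val n = parent.values.sum.toDouble
  val (nl, nr) = (left.values.sum.toDouble, right.values.sum.toDouble)
  gini(parent) - (nl / n) * gini(left) - (nr / n) * gini(right)
}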

Re: Decision forests don't work with non-trivial categorical features

2014-10-13 Thread Manish Amde
Sean, sorry for missing out on the discussion. Evan, you are correct: we are using the heuristic Sean suggested during the multiclass PR for ordering high-arity categorical variables, using the impurity values for each categorical feature. Joseph, thanks for fixing the bug, which I think was a regr...
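For context, the heuristic in question orders the values of a high-arity categorical feature by a per-category statistic and then searches splits as if the feature were ordered, avoiding enumeration of all 2^(k-1) subsets. A rough sketch, with names invented for illustration and mean label used as the ordering statistic:

// Order the categories of one categorical feature by their mean label, so the
// split search only has to scan a sorted sequence of categories.
def orderCategories(categoryAndLabel: Seq[(Int, Double)]): Seq[Int] = {
  categoryAndLabel
    .groupBy { case (category, _) => category }
    .map { case (category, rows) => (category, rows.map(_._2).sum / rows.size) }
    .toSeq
    .sortBy { case (_, meanLabel) => meanLabel }
    .map { case (category, _) => category }
}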

Re: reduce, transform, combine

2014-05-04 Thread Manish Amde
... https://www.linkedin.com/in/dbtsai. On Sun, May 4, 2014 at 1:12 AM, Manish Amde wrote: "I am currently using the RDD aggregate operation to reduce (fold) per partition and then combine ... def aggregate[U: ClassTag](zeroValue: U)(seqO..."

reduce, transform, combine

2014-05-04 Thread Manish Amde
I am currently using the RDD aggregate operation to reduce (fold) per partition and then combine the per-partition results: def aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U): U. I need to perform a transform operation after the seqOp and before the combOp. Th...
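One way to get such a transform step with core RDD operations is to split aggregate into its pieces: fold each partition with mapPartitions, apply the transform to each partition's result, then combine with reduce. A minimal sketch, assuming an RDD[Double] and a made-up sqrt transform:

import org.apache.spark.{SparkConf, SparkContext}

object AggregateWithTransform {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("agg-transform").setMaster("local[*]"))
    val data = sc.parallelize(1 to 100).map(_.toDouble)

    val perPartition = data.mapPartitions { iter =>
      val partialSum = iter.foldLeft(0.0)(_ + _)   // plays the role of seqOp
      Iterator.single(math.sqrt(partialSum))       // the extra transform step
    }
    val result = perPartition.reduce(_ + _)        // plays the role of combOp

    println(s"result = $result")
    sc.stop()
  }
}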