Re: Quantile regression in tree models

2014-11-18 Thread Manish Amde
Hi Alex, Here is the ticket for refining tree predictions. Let's discuss this further on the JIRA. https://issues.apache.org/jira/browse/SPARK-4240 There is no ticket yet for quantile regression. It would be great if you could create one and note down the corresponding loss function and gradient c

Re: Quantile regression in tree models

2014-11-18 Thread Alessandro Baretta
Manish, My use case for (asymmetric) absolute error is quite trivially quantile regression. In other words, I want to use Spark to learn conditional cumulative distribution functions. See R's GBM quantile regression option. If you either find or create a Jira ticket, I would be happy to give it a
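
For readers following the thread, the asymmetric absolute error mentioned above is usually written as the quantile (pinball) loss. The sketch below is only an illustration in plain Python, not Spark or R GBM code; the function names and the parameter `tau` (the target quantile, assumed to lie strictly between 0 and 1) are hypothetical choices made for this example.

```python
def pinball_loss(y, pred, tau):
    """Quantile (pinball) loss for one observation, with 0 < tau < 1."""
    diff = y - pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def pinball_gradient(y, pred, tau):
    """A subgradient of the pinball loss with respect to pred."""
    if y > pred:
        return -tau
    if y < pred:
        return 1.0 - tau
    # At y == pred the loss has a kink; 0.0 is one valid subgradient.
    return 0.0


def best_constant(ys, tau):
    """Return the element of ys with the smallest total pinball loss."""
    return min(ys, key=lambda c: sum(pinball_loss(y, c, tau) for y in ys))


if __name__ == "__main__":
    ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    print(best_constant(ys, 0.5))  # the constant minimizing the tau = 0.5 loss
```

With `tau = 0.5` the loss is half the absolute error, so the best constant prediction is a median; other values of `tau` pull the minimizer toward the corresponding quantile, which is why this loss is used for quantile regression.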

Re: Quantile regression in tree models

2014-11-17 Thread Manish Amde
Hi Alessandro, I think absolute error as splitting criterion might be feasible with the current architecture -- I think the sufficient statistics we collect currently might be able to support this. Could you let us know scenarios where absolute error has significantly outperformed squared error fo

Re: Quantile regression in tree models

2014-11-17 Thread Alessandro Baretta
Manish, Thanks for pointing me to the relevant docs. It is unfortunate that absolute error is not supported yet. I can't seem to find a Jira for it. Now, here's what the comments say in the current master branch: /** * :: Experimental :: * A class that implements Stochastic Gradient Boostin

Re: Quantile regression in tree models

2014-11-17 Thread Manish Amde
Hi Alessandro, MLlib v1.1 supports variance for regression and Gini impurity and entropy for classification. http://spark.apache.org/docs/latest/mllib-decision-tree.html If the information gain calculation can be performed by distributed aggregation then it might be possible to plug it into the e
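
To illustrate what "performed by distributed aggregation" can mean for the variance criterion, here is a minimal sketch in plain Python. It is not MLlib's implementation: the helper names and the choice of (count, sum, sum of squares) as the per-partition summary are assumptions made for this example. The point is only that such summaries can be computed per partition, added together, and then turned into a variance and a split gain without revisiting the data.

```python
from functools import reduce


def stats(ys):
    """Per-partition summary: (count, sum, sum of squares)."""
    return (len(ys), sum(ys), sum(y * y for y in ys))


def merge(a, b):
    """Combine two summaries by element-wise addition."""
    return tuple(x + y for x, y in zip(a, b))


def variance(s):
    """Population variance recovered from a summary: E[y^2] - E[y]^2."""
    n, total, total_sq = s
    if n == 0:
        return 0.0
    mean = total / n
    return total_sq / n - mean * mean


def variance_gain(parent, left, right):
    """Variance reduction of a split, weighting children by their counts."""
    n = parent[0]
    return (variance(parent)
            - left[0] / n * variance(left)
            - right[0] / n * variance(right))


if __name__ == "__main__":
    # Two hypothetical partitions; here each one happens to be a child node.
    left = stats([1.0, 2.0])
    right = stats([3.0, 4.0])
    parent = reduce(merge, [left, right])
    print(variance_gain(parent, left, right))
```

Because `merge` is associative and commutative, the summaries can be combined in any order across workers, which is the property a criterion needs in order to fit this aggregation pattern.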