Re: [SparkML] Random access in SparseVector will slow down inference stage for some tree based models

2018-07-04 Thread Vincent Wang
> without changing any APIs or slowing anything else down? If so this could be worth a pull request.
> On Sun, Jul 1, 2018 at 9:21 PM Vincent Wang wrote:
> >
> > Hi there,
> >
> > I'm using GBTClassifier to do some classification jobs and find the performance of

Fwd: [SparkML] Random access in SparseVector will slow down inference stage for some tree based models

2018-07-01 Thread Vincent Wang
Hi there, I'm using *GBTClassifier* to do some classification jobs and find the performance of the scoring stage unsatisfactory. The trained model has about 160 trees, and the input feature vector is sparse, with a size of 20+. After some digging, I found the model will repeatedly and rand
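
To make the cost concrete, here is a rough, self-contained sketch (plain Python, not Spark code) of why per-node random access into a sparse vector can add up across many trees. Everything in it is an assumption for illustration: `SparseVec`, `predict_tree`, `predict_ensemble`, and the tuple-based tree layout are hypothetical names, and the lookup is modeled as a binary search over sorted indices. Densifying the vector once per row is shown only as one possible way to get constant-time lookups, not as the change discussed in this thread.

```python
from bisect import bisect_left


class SparseVec:
    """Hypothetical stand-in for a sparse vector: sorted indices plus values."""

    def __init__(self, size, indices, values):
        self.size = size
        self.indices = list(indices)  # must be sorted ascending
        self.values = list(values)

    def __getitem__(self, i):
        # Random access costs a binary search over the stored indices,
        # O(log nnz) per lookup instead of O(1) for a dense array.
        j = bisect_left(self.indices, i)
        if j < len(self.indices) and self.indices[j] == i:
            return self.values[j]
        return 0.0  # index not stored, so the value is an implicit zero

    def to_dense(self):
        # One linear pass; afterwards every lookup is a plain list index.
        dense = [0.0] * self.size
        for i, v in zip(self.indices, self.values):
            dense[i] = v
        return dense


def predict_tree(tree, features):
    """Walk one tree. Internal nodes are (feature_index, threshold, left, right)
    tuples and leaves are floats (an assumed layout for this sketch)."""
    node = tree
    while not isinstance(node, float):
        feature_index, threshold, left, right = node
        # One random access into `features` per visited node.
        node = left if features[feature_index] <= threshold else right
    return node


def predict_ensemble(trees, features):
    # Every tree repeats the per-node lookups against the same vector.
    return sum(predict_tree(t, features) for t in trees)


sparse = SparseVec(20, [1, 7, 15], [0.5, 2.0, 3.0])
trees = [
    (7, 1.0, -1.0, (15, 2.5, 0.25, 0.75)),
    (3, 0.0, 1.0, 2.0),
]

# Same prediction either way; only the cost of each lookup differs.
score_sparse = predict_ensemble(trees, sparse)
score_dense = predict_ensemble(trees, sparse.to_dense())
```

With roughly 160 trees, each visiting several nodes, the number of lookups per row grows quickly, so even a small per-lookup overhead may become noticeable. Whether that overhead dominates in a real Spark job would need to be measured.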