Morning, Adam, Denis! Let me describe the current status:
1. https://issues.apache.org/jira/browse/IGNITE-10810 is related to MLeap,
not to XGBoost. The right ticket for XGBoost is
https://issues.apache.org/jira/browse/IGNITE-10289
2. Currently we have no plans to add XGBoost or any other external ML
library for distributed training (inference can be supported now, with a
few limitations; see the XGBoost or H2O examples).
3. We have models storage and partitioned dataset primitives that keep the
data and support MapReduce-like operations, but each algorithm has to be
implemented manually as a sequence of MR operations (we have no MR code
generation here).

I have a few questions, could you please answer them?

1. Are you a member of the XGBoost project, and do you have permission to
commit to it? (In many cases the collaboration involves changes in both
integrated frameworks.)
2. What primitives or integration points are accessible in XGBoost? Could
you share a paper/article/link to give me a chance to read more?
3. What is the planned architecture with the native C++ libraries? Could
you share it with me and the Ignite community?

P.S. I need to go deeper to understand which capabilities of Ignite ML
could be used to make it the platform for distributed training; your
answers will be helpful.

Sincerely yours,
Alexey Zinoviev

Fri, 27 Mar 2020 at 01:04, Carbone, Adam <adam.carb...@bottomline.com>:

> Good afternoon Denis,
>
> Nice to meet you, and hello to you too, Alexey. I'm not sure whether it
> will be me or another member of our team, but I wanted to start the
> discussion. We are investigating/integrating Ignite into our ML platform.
> In addition, we have already done a separate TensorFlow implementation
> for neural networks using the C++ libraries, and we were about to take
> the same approach for XGBoost when we saw the 2.8 announcement. So before
> we went that route, I wanted to do a more proper investigation of where
> things are and where they might head.
> Regards
>
> Adam
>
> Adam Carbone | Director of Innovation – Intelligent Platform Team |
> Bottomline Technologies
> Office: 603-501-6446 | Mobile: 603-570-8418
> www.bottomline.com
>
> On 3/26/20, 5:20 PM, "Denis Magda" <dma...@apache.org> wrote:
>
> Hi Adam, thanks for starting the thread. The contributions are highly
> appreciated, and we'll be glad to see you among our contributors,
> especially if it helps to make our ML library stronger.
>
> But first things first, let me introduce you to @Alexey Zinoviev
> <zaleslaw....@gmail.com>, who is our main ML maintainer.
>
> -
> Denis
>
> On Thu, Mar 26, 2020 at 1:49 PM Carbone, Adam
> <adam.carb...@bottomline.com> wrote:
>
> > Good afternoon all,
> >
> > I was asked to forward this here by Denis Magda. I see in the 2.8
> > release that you implemented importing of XGBoost models for
> > distributed inference:
> > https://issues.apache.org/jira/browse/IGNITE-10810?focusedCommentId=16728718&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16728718
> > Are there any plans to add distributed training? We are at a
> > crossroads: we could build an XGBoost solution on top of the C++
> > libraries, but if this is on the roadmap, maybe we will go the Ignite
> > direction instead of pure C++, and maybe we might even be able to help
> > and contribute.
> >
> > Regards
> >
> > Adam Carbone
> >
> > Adam Carbone | Director of Innovation – Intelligent Platform Team |
> > Bottomline Technologies
> > Office: 603-501-6446 | Mobile: 603-570-8418
> > www.bottomline.com
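[Editor's note for readers following the thread: the "partitioned dataset
primitives" with "MapReduce-like operations" that Alexey describes boil
down to running a mapper on every data partition and folding the partial
results with a reducer; each training algorithm is hand-written as a
sequence of such steps. The toy sketch below illustrates only the pattern
in plain Java — all names are hypothetical and this is not the actual
Ignite ML API.]

```java
import java.util.Arrays;
import java.util.function.BiFunction;
import java.util.function.Function;

public class PartitionMapReduce {
    /**
     * One MR step: apply `map` to every partition independently,
     * then fold the per-partition results with `reduce`.
     * (In a real cluster the map runs node-locally on each partition.)
     */
    static <R> R compute(double[][] partitions,
                         Function<double[], R> map,
                         BiFunction<R, R, R> reduce) {
        return Arrays.stream(partitions)
            .map(map)
            .reduce(reduce::apply)
            .orElseThrow(IllegalStateException::new);
    }

    /**
     * Example algorithm expressed as a single MR step: a distributed mean.
     * Each partition emits a partial (sum, count); the reducer adds them.
     */
    static double mean(double[][] partitions) {
        double[] sumAndCount = compute(partitions,
            p -> new double[] {Arrays.stream(p).sum(), p.length},
            (a, b) -> new double[] {a[0] + b[0], a[1] + b[1]});
        return sumAndCount[0] / sumAndCount[1];
    }

    public static void main(String[] args) {
        double[][] parts = {{1, 2}, {3, 4, 5}};
        System.out.println(mean(parts)); // 3.0
    }
}
```

An iterative algorithm (e.g. gradient descent) is then a loop over such
compute/reduce steps, which is what "implemented as a sequence of MR
operations manually" means in practice.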