I think this same question was asked just a week ago, from the same company and with the same setup:

https://mail-archives.apache.org/mod_mbox/spark-user/202104.mbox/%3CLNXP123MB2604758548BE38E8D3F369EC8A7B9%40LNXP123MB2604.GBRP123.PROD.OUTLOOK.COM%3E

On Wed, Apr 7, 2021 at 11:17 AM SRITHALAM, ANUPAMA (Risk Value Stream)
<[email protected]> wrote:

> Hi Team,
>
> We are trying to use a gradient boosting classification algorithm. In
> Python we used the scikit-learn library, and in PySpark we are using the
> ML library.
>
> Our training dataset has around 45k records. Training takes around 3 to 4
> hours in Python, but more than 18 hours in PySpark with the same
> hyperparameters.
>
> We tried repartitioning the DataFrame in PySpark and saw a small
> improvement in performance, but we still cannot get timings close to
> Python.
>
> Our live run needs to evaluate predictions for more than 40 million
> records, and the data resides in Hadoop. It is therefore difficult to move
> that much data to a different system, convert it to a pandas DataFrame,
> and run it against Python.
>
> So we are trying to train the same model in PySpark so that we can run the
> evaluation against the trained model there. Our concern is that the
> training time is very high, and we would like to know what general
> approach is followed in this kind of scenario.
>
> Thanks,
>
> Anupama.