Re: Keeping track of how long something has been in a queue

2020-09-06 Thread Jungtaek Lim
You may want to search for "session window" and "duration", and check whether the concept fits your requirements. Adding some custom logic on top of the session window would probably work for you; that requires implementing a custom function for flatMapGroupsWithState. Hope this helps. Th
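
The idea above (per-key state that remembers when an item entered the queue and when it was last seen) can be sketched in plain Python. This is only a model of the state-update logic one might put inside a custom flatMapGroupsWithState function; it does not use the Spark API, and the names QueueSession, update_sessions, and gap are hypothetical. It also assumes a session closes only when a later event for the same key arrives after the gap, whereas a real streaming job would more likely rely on a state timeout.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class QueueSession:
    """Per-key state: when the item was first and last seen in the queue."""
    first_seen: float
    last_seen: float

    @property
    def duration(self) -> float:
        # How long the item has been in the queue during this session.
        return self.last_seen - self.first_seen


def update_sessions(
    state: Dict[str, QueueSession],
    events: Iterable[Tuple[str, float]],
    gap: float,
) -> List[Tuple[str, float]]:
    """Fold (key, timestamp) events into `state` and return closed sessions.

    Assumption: a session for a key closes when the next event for that key
    arrives more than `gap` seconds after the previous one.
    """
    closed: List[Tuple[str, float]] = []
    # Process events in timestamp order so gaps are measured correctly.
    for key, ts in sorted(events, key=lambda e: e[1]):
        session = state.get(key)
        if session is not None and ts - session.last_seen > gap:
            # The gap was exceeded: emit the finished session's duration.
            closed.append((key, session.duration))
            session = None
        if session is None:
            state[key] = QueueSession(first_seen=ts, last_seen=ts)
        else:
            session.last_seen = ts
    return closed


# Hypothetical events: (item id, seconds since some reference point).
state: Dict[str, QueueSession] = {}
closed = update_sessions(
    state,
    [("job-1", 0.0), ("job-1", 30.0), ("job-2", 10.0), ("job-1", 200.0)],
    gap=60.0,
)
# closed == [("job-1", 30.0)]; "job-1" has started a new session at 200.0.
```

Keeping the state as a small record per key mirrors how a grouped-state function sees one key's state at a time; the durations of still-open sessions can be read from `state` at any point.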

Re: Query about Spark

2020-09-06 Thread Ankur Das
Thanks, I'll check it out. On Sun, Sep 6, 2020 at 7:15 PM ☼ R Nair wrote: > Or use MLflow's PySpark UDF. First create an mlflow.pyfunc. > > Best, Ravion > > On Sun, Sep 6, 2020, 9:43 AM ☼ R Nair wrote: > >> The question is not clear. Use accumulators, if I understood it correctly. >> >> Best, Ravion >> >>

Re: Query about Spark

2020-09-06 Thread ☼ R Nair
Or use MLflow's PySpark UDF. First create an mlflow.pyfunc. Best, Ravion On Sun, Sep 6, 2020, 9:43 AM ☼ R Nair wrote: > The question is not clear. Use accumulators, if I understood it correctly. > > Best, Ravion > > On Sun, Sep 6, 2020, 9:41 AM Ankur Das wrote: > >> >> Good Evening Sir/Madam, >> Hope you

Re: Query about Spark

2020-09-06 Thread ☼ R Nair
The question is not clear. Use accumulators, if I understood it correctly. Best, Ravion On Sun, Sep 6, 2020, 9:41 AM Ankur Das wrote: > > Good Evening Sir/Madam, > Hope you are doing well. I am experimenting with some ML techniques that I > need to test in a distributed environment. > For example a par

Query about Spark

2020-09-06 Thread Ankur Das
Good Evening Sir/Madam, Hope you are doing well. I am experimenting with some ML techniques that I need to test in a distributed environment. For example, I want to run a particular algorithm on different nodes at the same time and collect the results at the end in one single node or the parent