Running averages -- but global window state might not be enough

2021-04-20 Thread Raman Gupta
I have a running average problem. As I understand it, the traditional Beam solution is state in a global window, but I'm not quite sure how to approach it for my use case, which is a bit more complex. I have a "score" output every 5 minutes based on a timer, up to a maximum of 1 hour after some ti
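For readers skimming the thread, here is a minimal sketch of what "state" means for a running average: a (sum, count) pair that is updated on every new score. This is plain Python rather than the Beam state API. The `RunningAverage` class and the sample scores are hypothetical and only illustrate the accumulator. The 5-minute timer and 1-hour cap mentioned in the message are not modelled.

```python
from dataclasses import dataclass


@dataclass
class RunningAverage:
    """Hypothetical per-key accumulator: the (sum, count) pair a stateful step would persist."""
    total: float = 0.0
    count: int = 0

    def add(self, score: float) -> float:
        """Fold one new score into the state and return the updated mean."""
        self.total += score
        self.count += 1
        return self.total / self.count


# Made-up scores, one per timer firing (assumption for illustration only).
state = RunningAverage()
averages = [state.add(s) for s in [10.0, 20.0, 30.0]]
print(averages)
```

Keeping only the sum and count means the state stays constant-size no matter how many scores arrive, which is why this shape is a common fit for per-key state.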

Re: [EXT] Re: [EXT] Re: Beam Dataframe - sort and grouping

2021-04-20 Thread Brian Hulette
Hi Wenbing, Sorry for taking so long to get back to you on this. I discussed this with Robert offline and we came up with a potential workaround: you could try writing out the Parquet file from within the groupby.apply method. You can use Beam's FileSystems abstraction to open a Python file object

Re: [EXT] Re: [EXT] Re: Beam Dataframe - sort and grouping

2021-04-20 Thread Robert Bradshaw
It would also be helpful to understand what your overall objective is with this output. Is there a reason you need it sorted/partitioned in a certain way?

On Tue, Apr 20, 2021 at 4:51 PM Brian Hulette wrote:
> Hi Wenbing,
> Sorry for taking so long to get back to you on this.
> I discussed this