Hi,

The main puzzle here is how to deliver the priority change of a row during
a given window, my best shot would be to have a side input
(PCollectionView) containing the change of priority, then in the slow
worker beam transform extract this side input and update the corresponding
row with the new priority.

Again is just an idea.

JC

Chad Dombrova <[email protected]> schrieb am Sa., 24. Aug. 2019, 03:32:

> Hi all,
> Our team is brainstorming how to solve a particular type of problem with
> Beam, and it's a bit beyond our experience level, so I thought I'd turn to
> the experts for some advice.
>
> Here are the pieces of our puzzle:
>
>    - a data source with the following properties:
>       - rows represent work to do
>       - each row has an integer priority
>       - rows can be added or deleted
>       - priorities of a row can be changed
>       - <10k rows
>    - a slow worker Beam transform (Map/ParDo) that consumes a row at a
>    time
>
>
> We want a streaming pipeline that delivers rows from our data store to the
> worker transform,  resorting the source based on priority each time a new
> row is delivered.  The goal is that last second changes in priority can
> affect the order of the slowly yielding read.  Throughput is not a major
> concern since the worker is the bottleneck.
>
> I have a few questions:
>
>    - is the sort of problem that BeamSQL can solve? I'm not sure how
>    sorting and resorting are handled there in a streaming context...
>    - I'm unclear on how back-pressure in Flink affects streaming reads.
>    It's my hope that data/messages are left in the data source until
>    back-pressure subsides, rather than read eagerly into memory.  Can someone
>    clarify this for me?
>    - is there a combination of windowing and triggering that can solve
>    this continual resorting plus slow yielding problem?  It's not unreasonable
>    to keep all of our rows in memory on Flink, as long as we're snapshotting
>    state.
>
>
> Any advice on how an expert Beamer would solve this is greatly appreciated!
>
> Thanks,
> chad
>
>

Reply via email to