Hi,
I have a Dataflow pipeline with a PubSub UnboundedSource. The pipeline
transforms the data and writes to another PubSub topic. I have a question
regarding exceptions in DoFns. If I choose to ignore an exception while
processing an element, does it ACK the bundle?
Also, what happens if I were to just throw the exception?
is a chance of data loss if you have a pipeline that needs to
> checkpoint data to disk and you shut down the pipeline without draining the
> data. In that case, messages may have been ack'd to pubsub and held durably
> only in the checkpoint state in Dataflow, so shutting down the pipeline
API affordances that Beam
> provides for exception handling.
>
> [0]
> https://beam.apache.org/releases/javadoc/2.21.0/org/apache/beam/sdk/transforms/MapElements.html#exceptionsVia-org.apache.beam.sdk.transforms.InferableFunction-
>
>
> On Mon, Jun 1, 2020 at 1:56 PM KV 59 wrote:
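The exceptionsVia approach linked at [0] routes failing elements to a separate output instead of failing the bundle. A plain-Java sketch of that catch-and-route idea (no Beam dependency; the parse step and class names here are hypothetical stand-ins):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element processor illustrating the catch-and-route
// ("dead letter") pattern that exceptionsVia generalizes.
public class DeadLetterSketch {
    static List<String> failures = new ArrayList<>();

    static List<Integer> processAll(List<String> inputs) {
        List<Integer> ok = new ArrayList<>();
        for (String in : inputs) {
            try {
                ok.add(Integer.parseInt(in)); // stand-in for the real transform
            } catch (NumberFormatException e) {
                // Swallowing the exception: this element is dropped from the
                // main output, but processing of the batch still succeeds.
                failures.add(in);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of("1", "oops", "3"))); // [1, 3]
        System.out.println(failures);                              // [oops]
    }
}
```

As I understand Dataflow's behavior, PubSub messages are only ACK'd after the bundle that contains them is durably committed, so catching and routing the exception lets the bundle commit, while rethrowing causes the bundle to be retried.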
Hi All,
I'm building a pipeline to process events as they come and do not really
care about the event time and watermark. I'm more interested in not
discarding the events and reducing the latency. The downstream pipeline has
a stateful DoFn. I understand that the default window strategy is Global
ion for GBK is per key and window pane. Note that
>> elementCountAtLeast means that the runner can buffer as many as it wants
>> and can decide to offer a low latency pipeline by triggering often or
>> better throughput through the use of buffering.
Hi,
I'm using AvroCoder for my custom type. If I use the AvroCoder and
perform these operations: Object1 -> encode -> bytes -> decode -> Object2,
Object1 and Object2 are not equal. The same round trip works locally.
What could be the problem? I'm using Beam 2.23.0, Avro 1.9.1, and Java 1.8.
Do the versions matter?
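One common cause of this symptom (an assumption here, not a diagnosis of the case above) is the custom type relying on Object.equals(), which compares references, so a decoded copy never equals the original even when every field matches. A minimal plain-Java illustration, with hypothetical class names:

```java
import java.util.Objects;

// Two versions of a simple value type, to show why a decode round trip
// can produce an "unequal" object even when every field matches.
class NoEquals {
    final String name;
    NoEquals(String name) { this.name = name; }
    // Inherits Object.equals(): reference identity only.
}

class WithEquals {
    final String name;
    WithEquals(String name) { this.name = name; }
    @Override public boolean equals(Object o) {
        return o instanceof WithEquals && ((WithEquals) o).name.equals(name);
    }
    @Override public int hashCode() { return Objects.hash(name); }
}

public class CoderEqualitySketch {
    public static void main(String[] args) {
        // Simulate encode -> decode by building a second instance
        // from the same data, as a coder round trip does.
        System.out.println(new NoEquals("a").equals(new NoEquals("a")));     // false
        System.out.println(new WithEquals("a").equals(new WithEquals("a"))); // true
    }
}
```

A coder round trip always yields a new instance, so the type needs a value-based equals()/hashCode() for the comparison to succeed; "works locally" could just mean the local path never actually serialized the object.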
Hi,
I have a Beam streaming pipeline processing live data from PubSub using
sliding windows on event timestamps. I want to recompute the metrics for
historical data in BigQuery. What are my options?
I have looked at
https://stackoverflow.com/questions/56702878/how-to-use-apache-beam-to-process-hi
in and restart so has the same problem while being more complicated.
>
> Thanks,
> Ankur
>
> On Tue, Aug 24, 2021 at 7:44 AM KV 59 wrote:
>
>> Hi,
>>
>> I have a Beam streaming pipeline processing live data from PubSub using
>> sliding windows on event timestamps.
Hi,
I have a question about how windows are assigned to elements, or vice versa.
(I thought I understood this all along, but I'm a little confused now.)
My understanding:
For a sliding window configuration of size S and period P
1. For every element E with timestamp T, it is assigned to each of the S/P
sliding windows that contain T.
Adding to my question: if there is no element E with a timestamp T, then
there is no window [T-S, T).
On Tue, Sep 28, 2021 at 6:25 AM KV 59 wrote:
> Hi,
>
> I have a question about how windows are assigned to elements or vice
> versa. ( I thought I understood this all this while but
or equality (to see if two elements are in the same window)
> and it knows the timestamp when the window ends.
>
> Back to your question - your assessment is mostly correct, except that the
> windows are always being created each time.
>
> Reuven
>
> On Tue, Sep 28, 2021
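The assignment rule discussed in this thread can be written out as plain arithmetic. A sketch below mirrors what SlidingWindows does conceptually, not Beam's actual implementation, and assumes the size is a multiple of the period with no offset:

```java
import java.util.ArrayList;
import java.util.List;

// For sliding windows of size S and period P, an element with timestamp T
// falls into every window [start, start + S) where start <= T < start + S
// and start is a multiple of P.
public class SlidingWindowSketch {
    static List<long[]> windowsFor(long t, long size, long period) {
        List<long[]> windows = new ArrayList<>();
        long lastStart = t - Math.floorMod(t, period); // latest start <= t
        for (long start = lastStart; start > t - size; start -= period) {
            windows.add(new long[] {start, start + size});
        }
        return windows;
    }

    public static void main(String[] args) {
        // Size 30, period 10: an element at t = 25 lands in 3 windows.
        for (long[] w : windowsFor(25, 30, 10)) {
            System.out.println("[" + w[0] + ", " + w[1] + ")");
        }
        // Prints [20, 50), [10, 40), [0, 30)
    }
}
```

This also matches the point in the reply above that windows are computed rather than stored: a window like [T-S, T) "exists" only in the sense that some element's timestamp maps to it when this arithmetic runs.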
Hi,
I have a question about how Dataflow drains a job. I have a job which
reads from PubSub and uses sliding windows to compute aggregates for each
window.
When I update the job and the new job is not compatible, I have the option
to either cancel or drain the job.
I want to understand how the drain works.