Hi all,
I have created a simple snippet as such:
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions
import logging
logging.basicConfig(level=logging.WARNING)
opts = direct_opts
with beam.Pipeline(options=Pip
steps. This is tracked in
> https://issues.apache.org/jira/browse/BEAM-11998.
>
> Thanks,
> Cham
>
> On Wed, Jun 2, 2021 at 8:28 AM Ahmet Altay wrote:
>
>> /cc @Boyuan Zhang for kafka @Chamikara Jayalath
>> for multi language might be able to help.
>&g
CC-ing Chamikara as he got omitted from the reply all I did earlier.
On Thu, Jun 3, 2021 at 12:43 AM Alex Koay wrote:
> Yeah, I figured it wasn't supported correctly on DirectRunner. Stumbled
> upon several threads saying so.
>
> On Dataflow, I've encountered a few dif
ort>.
>
> Thanks,
> Cham
>
> On Wed, Jun 2, 2021 at 9:45 AM Alex Koay wrote:
>
>> CC-ing Chamikara as he got omitted from the reply all I did earlier.
>>
>> On Thu, Jun 3, 2021 at 12:43 AM Alex Koay wrote:
>>
>>> Yeah, I figured it wasn
Several questions:
1. Is there any way to set the log level for the Java workers via a Python
Dataflow pipeline?
2. What is the easiest way to debug an external transform in Java? My main
pipeline code is in Python.
3. Are there any edge cases with the UnboundedSourceWrapperFn SDF that I
should
share the implementation of the source and the pipeline, that
> would be really helpful.
>
> +Lukasz Cwik for awareness.
>
> On Tue, Jun 15, 2021 at 9:50 AM Chamikara Jayalath
> wrote:
>
>>
>>
>> On Tue, Jun 15, 2021 at 3:20 AM Alex Koay wrote:
>
ad version even when using 2.24.0 (I'm not entirely sure why this is the
case).
I feel like I'm almost at the verge of fixing the problem, but at this
point I'm still far from it.
On Wed, Jun 16, 2021 at 11:24 AM Alex Koay wrote:
> 1. I'm building a streaming pipeline.
&
process and instead reusing existing
sources over and over
I'd be happy to send pull requests to help fix this issue but perhaps will
need some direction as to how I should fix this.
On Wed, Jun 16, 2021 at 8:32 PM Alex Koay wrote:
> Alright, some updates.
>
> Using DirectRunner hel
to do two things:
> 1) Finalize the current bundle
> 2) Schedule the continuation for the checkpoint mark
>
> Based upon your description it looks like for some reason the runner is
> unable to complete the current bundle.
>
> On Thu, Jun 17, 2021 at 2:48 AM Alex Koay wrote:
&g
> 2:
> https://github.com/apache/beam/blob/7494d1450f2fc19344c385435496634e4b28a997/runners/direct-java/src/main/java/org/apache/beam/runners/direct/EvaluationContext.java#L157
>
> On Thu, Jun 17, 2021 at 10:42 AM Alex Koay wrote:
>
>> Could you be referring to this part
rt of this
PR (https://github.com/apache/beam/pull/13592), do you know if there was
something behind this?
Thanks!
Cheers
Alex
On Thu, Jun 24, 2021 at 4:01 PM Alex Koay wrote:
> Good news for anybody who's following this.
> I finally had some time today to look into the problem again w
Can any Dataflow experts / Googlers enlighten me as to why this happens?
I shimmed the FnHarness for a Python pipeline with a Java external
transform, and it seems that the ProcessBundleHandler receives different
process-bundle-descriptor-%d for the same processor.
This leads to the system defeatin
>From my understanding, you need the Pipeline for mainly two things:
1. Marking the start of any processing flows (it serves as the PBegin
"PCollection") so any sources that follows it will run.
2. Running / executing / deploying the pipeline -- this happens
automatically with the context manager i
13 matches
Mail list logo