Re: [ANNOUNCE] New PMC Member: Danny McCormick

2025-01-22 Thread Reza Rokni via dev
Woohoo! Congrats :-)

On Fri, Jan 10, 2025 at 6:35 PM Kenneth Knowles  wrote:

> Congrats!
>
> On Thu, Jan 9, 2025 at 10:15 AM Yi Hu via dev  wrote:
>
>> Congrats, Danny!
>>
>> On Wed, Jan 8, 2025 at 8:40 PM Austin Bennett <
>> whatwouldausti...@gmail.com> wrote:
>>
>>> Congrats and Thanks, Danny!
>>>
>>> On Fri, Dec 27, 2024 at 5:51 AM Ahmed Abualsaud via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Well deserved! Thanks for all your hard work Danny
>>>>
>>>> On Fri, Dec 20, 2024 at 7:58 PM LDesire  wrote:
>>>>
>>>>> Congratulations Danny! 😀




Beam High Priority Issue Report (35)

2025-01-22 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need 
attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and 
expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/33698 The finalize_release job is flaky
https://github.com/apache/beam/issues/33569 [Task]: Remove Google Analytics 
from Beam Website
https://github.com/apache/beam/issues/33425 [Bug]: 
beam_Publish_Beam_SDK_Snapshots and beam_PostCommit_Python_Arm are extremely 
flaky due to failing to build wheels
https://github.com/apache/beam/issues/33407 [Bug]: tfrecordio does not work 
with snappy >= 0.7
https://github.com/apache/beam/issues/33220 The PreCommit Flink Container job 
is flaky
https://github.com/apache/beam/issues/33064 The PostCommit Python 
ValidatesContainer Dataflow job is flaky
https://github.com/apache/beam/issues/32997 [Bug]: Non Retained Messages 
missing after MqttIO.Read checkpoint restore 
https://github.com/apache/beam/issues/32949 The PostCommit Java ValidatesRunner 
Flink Java8 job is flaky
https://github.com/apache/beam/issues/32509 [Bug]: Unable to Restart Google 
Spanner Change Streams Consumer due to tableExists(table_name) bug
https://github.com/apache/beam/issues/32161 The Publish Beam SDK Snapshots job 
is flaky
https://github.com/apache/beam/issues/32144 The PerformanceTests WordCountIT 
PythonVersions job is flaky
https://github.com/apache/beam/issues/31846 The Clean Up GCP Resources job is 
flaky
https://github.com/apache/beam/issues/31254 [Failing Test]: Onnx inference unit 
tests are failing.
https://github.com/apache/beam/issues/30799 The PostCommit Python Dependency 
job is flaky
https://github.com/apache/beam/issues/30519 The PostCommit XVR GoUsingJava 
Dataflow job is flaky
https://github.com/apache/beam/issues/29971 [Bug]: FixedWindows not working for 
large Kafka topic
https://github.com/apache/beam/issues/29515 [Bug]: WriteToFiles in python leave 
few records in temp directory when writing to large number (100+) of files
https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java SDK Harness 
doesn't update user counters in OnTimer callback functions
https://github.com/apache/beam/issues/28760 [Bug]: EFO Kinesis IO reader 
provided by apache beam does not pick the event time for watermarking
https://github.com/apache/beam/issues/26329 [Bug]: BigQuerySourceBase does not 
propagate a Coder to AvroSource
https://github.com/apache/beam/issues/26041 [Bug]: Unable to create 
exactly-once Flink pipeline with stream source and file sink
https://github.com/apache/beam/issues/25946 [Task]: Support more Beam portable 
schema types as Python types
https://github.com/apache/beam/issues/24776 [Bug]: Race condition in Python SDK 
Harness ProcessBundleProgress
https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder 
will drop message id and orderingKey
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for 
dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/21643 FnRunnerTest with non-trivial 
(order 1000 elements) numpy input flakes in non-cython environment
https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table 
destinations returns wrong tableId
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit 
data at GC time
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit 
empty pane when it should


P1 Issues with no update in the last week:

https://github.com/apache/beam/issues/31931 The IcebergIO Integration Tests job 
is flaky
https://github.com/apache/beam/issues/30606 The PostCommit Java Nexmark 
Dataflow job is flaky
https://github.com/apache/beam/issues/30527 The PostCommit Java IO Performance 
Tests job is flaky
https://github.com/apache/beam/issues/30525 The PostCommit Python 
ValidatesContainer Dataflow With RC job is flaky
https://github.com/apache/beam/issues/30507 The LoadTests Go GBK Flink Batch 
job is flaky
https://github.com/apache/beam/issues/25975 [Bug]: KinesisIO processing-time 
watermarking can cause data loss




[Bug] Problem with subscription

2025-01-22 Thread Enrique Calderon
Hi!
I am trying to subscribe to both the user and developer mailing lists. I 
have already sent several emails to both dev-subscr...@beam.apache.org and 
user-subscr...@beam.apache.org, but I have not received anything confirming 
the subscription, nor any new emails from the mailing lists.
My email address is ksobrena...@ks32.dev.
Greetings
- Enrique (ksobrenat32)

Re: [Bug] Problem with subscription

2025-01-22 Thread Pablo Estrada
Hello team!
I have been working with Enrique and tried to help him figure out whether there 
was something wrong with his subscribe requests, to no avail. Does anybody know 
how to escalate or debug where his subscription request is failing or getting stuck?
Best
-P.

On 2025/01/22 19:37:45 Enrique Calderon wrote:
> Hi!
> I am trying to subscribe to both the mailing lists of users and developers. I 
> have already sent several emails to both dev-subscr...@beam.apache.org and 
> user-subscr...@beam.apache.org but I am never receiving anything that 
> notifies me about the subscription neither the new emails on the mailing list.
> My email is this, ksobrena...@ks32.dev.
> Greetings
> - Enrique (ksobrenat32)


Re: [Bug] Problem with subscription

2025-01-22 Thread Enrique Calderon
I have checked everything and even tried again, but with no success. I can also 
confirm that subscribing works in general, because I receive the confirmation 
email on another email account, just not on this one.
On Wednesday, January 22nd, 2025 at 3:32 PM, XQ Hu  wrote:

> Can you check your spam folder? I just tested it with my personal gmail and I 
> got the confirmation email and just replied to it.
>
> On Wed, Jan 22, 2025 at 4:31 PM Enrique Calderon  wrote:
>
>> No, I haven't received any emails at all, including any subscription emails.
>> Re-sending this because I did not reply to all
>> On Wednesday, January 22nd, 2025 at 3:13 PM, XQ Hu  wrote:
>>
>>> Did you get the email like confirm subscribe to u...@beam.apache.org to 
>>> confirm your subscription?
>>>
>>> On Wed, Jan 22, 2025 at 3:44 PM Robert Bradshaw via dev 
>>>  wrote:
>>>
>>>> Welcome to the community, Enrique!
>>>>
>>>> I have no idea why the subscriptions aren't working, or how to debug
>>>> this. Apache infra would probably have people who would be better at
>>>> looking into this, as they run the mailing lists.
>>>>
>>>> On Wed, Jan 22, 2025 at 11:45 AM Pablo Estrada  wrote:
>>>> >
>>>> > Hello team!
>>>> > I have been working with Enrique and I tried to help him figure out if 
>>>> > there was something wrong with his subscribe requests to no avail. 
>>>> > Anybody knows how to escalate/debug where his subscription request is 
>>>> > failing/getting stuck?
>>>> > Best
>>>> > -P.
>>>> >
>>>> > On 2025/01/22 19:37:45 Enrique Calderon wrote:
>>>> > > Hi!
>>>> > > I am trying to subscribe to both the mailing lists of users and 
>>>> > > developers. I have already sent several emails to both 
>>>> > > dev-subscr...@beam.apache.org and user-subscr...@beam.apache.org but I 
>>>> > > am never receiving anything that notifies me about the subscription 
>>>> > > neither the new emails on the mailing list.
>>>> > > My email is this, ksobrena...@ks32.dev.
>>>> > > Greetings
>>>> > > - Enrique (ksobrenat32)

Re: [Bug] Problem with subscription

2025-01-22 Thread Robert Bradshaw via dev
Welcome to the community, Enrique!

I have no idea why the subscriptions aren't working, or how to debug
this. Apache infra would probably have people who would be better at
looking into this, as they run the mailing lists.

On Wed, Jan 22, 2025 at 11:45 AM Pablo Estrada  wrote:
>
> Hello team!
> I have been working with Enrique and I tried to help him figure out if there 
> was something wrong with his subscribe requests to no avail. Anybody knows 
> how to escalate/debug where his subscription request is failing/getting stuck?
> Best
> -P.
>
> On 2025/01/22 19:37:45 Enrique Calderon wrote:
> > Hi!
> > I am trying to subscribe to both the mailing lists of users and developers. 
> > I have already sent several emails to both dev-subscr...@beam.apache.org 
> > and user-subscr...@beam.apache.org but I am never receiving anything that 
> > notifies me about the subscription neither the new emails on the mailing 
> > list.
> > My email is this, ksobrena...@ks32.dev.
> > Greetings
> > - Enrique (ksobrenat32)


Re: [Bug] Problem with subscription

2025-01-22 Thread XQ Hu via dev
CC'ing the user list to test my recent subscription from my private
Gmail account.

On Wed, Jan 22, 2025 at 3:44 PM Robert Bradshaw via dev 
wrote:

> Welcome to the community, Enrique!
>
> I have no idea why the subscriptions aren't working, or how to debug
> this. Apache infra would probably have people who would be better at
> looking into this, as they run the mailing lists.
>
> On Wed, Jan 22, 2025 at 11:45 AM Pablo Estrada  wrote:
> >
> > Hello team!
> > I have been working with Enrique and I tried to help him figure out if
> there was something wrong with his subscribe requests to no avail. Anybody
> knows how to escalate/debug where his subscription request is
> failing/getting stuck?
> > Best
> > -P.
> >
> > On 2025/01/22 19:37:45 Enrique Calderon wrote:
> > > Hi!
> > > I am trying to subscribe to both the mailing lists of users and
> developers. I have already sent several emails to both
> dev-subscr...@beam.apache.org and user-subscr...@beam.apache.org but I am
> never receiving anything that notifies me about the subscription neither
> the new emails on the mailing list.
> > > My email is this, ksobrena...@ks32.dev.
> > > Greetings
> > > - Enrique (ksobrenat32)
>


Re: [Bug] Problem with subscription

2025-01-22 Thread XQ Hu via dev
Did you get an email like "confirm subscribe to u...@beam.apache.org" asking
you to confirm your subscription?

On Wed, Jan 22, 2025 at 3:44 PM Robert Bradshaw via dev 
wrote:

> Welcome to the community, Enrique!
>
> I have no idea why the subscriptions aren't working, or how to debug
> this. Apache infra would probably have people who would be better at
> looking into this, as they run the mailing lists.
>
> On Wed, Jan 22, 2025 at 11:45 AM Pablo Estrada  wrote:
> >
> > Hello team!
> > I have been working with Enrique and I tried to help him figure out if
> there was something wrong with his subscribe requests to no avail. Anybody
> knows how to escalate/debug where his subscription request is
> failing/getting stuck?
> > Best
> > -P.
> >
> > On 2025/01/22 19:37:45 Enrique Calderon wrote:
> > > Hi!
> > > I am trying to subscribe to both the mailing lists of users and
> developers. I have already sent several emails to both
> dev-subscr...@beam.apache.org and user-subscr...@beam.apache.org but I am
> never receiving anything that notifies me about the subscription neither
> the new emails on the mailing list.
> > > My email is this, ksobrena...@ks32.dev.
> > > Greetings
> > > - Enrique (ksobrenat32)
>


Re: [Bug] Problem with subscription

2025-01-22 Thread XQ Hu via dev
Can you check your spam folder? I just tested it with my personal Gmail
account, got the confirmation email, and replied to it.

On Wed, Jan 22, 2025 at 4:31 PM Enrique Calderon 
wrote:

> No, I haven't received any emails at all, including any subscription
> emails.
> Re-sending this because I did not reply to all
> On Wednesday, January 22nd, 2025 at 3:13 PM, XQ Hu 
> wrote:
>
> Did you get the email like confirm subscribe to u...@beam.apache.org to
> confirm your subscription?
>
> On Wed, Jan 22, 2025 at 3:44 PM Robert Bradshaw via dev <
> dev@beam.apache.org> wrote:
>
>> Welcome to the community, Enrique!
>>
>> I have no idea why the subscriptions aren't working, or how to debug
>> this. Apache infra would probably have people who would be better at
>> looking into this, as they run the mailing lists.
>>
>> On Wed, Jan 22, 2025 at 11:45 AM Pablo Estrada 
>> wrote:
>> >
>> > Hello team!
>> > I have been working with Enrique and I tried to help him figure out if
>> there was something wrong with his subscribe requests to no avail. Anybody
>> knows how to escalate/debug where his subscription request is
>> failing/getting stuck?
>> > Best
>> > -P.
>> >
>> > On 2025/01/22 19:37:45 Enrique Calderon wrote:
>> > > Hi!
>> > > I am trying to subscribe to both the mailing lists of users and
>> developers. I have already sent several emails to both
>> dev-subscr...@beam.apache.org and user-subscr...@beam.apache.org but I
>> am never receiving anything that notifies me about the subscription neither
>> the new emails on the mailing list.
>> > > My email is this, ksobrena...@ks32.dev.
>> > > Greetings
>> > > - Enrique (ksobrenat32)
>>
>
>


Re: Using resource hints or annotations for transform expansion

2025-01-22 Thread Chamikara Jayalath via dev
On Tue, Jan 21, 2025 at 4:51 PM Robert Bradshaw via dev 
wrote:

> On Tue, Jan 21, 2025 at 7:26 AM Kenneth Knowles  wrote:
> >
> > On Tue, Jan 21, 2025 at 2:35 AM Jan Lukavský  wrote:
> >>
> >>
> >> On 1/20/25 18:18, Kenneth Knowles wrote:
> >>
> >> This all sounds good. I will add my standard comment that this hint is
> a property of the data, not the pipeline logic. So it is a different type
> of hint than key invariance and fanout ratio.
> >>
> >> This is not a problem for the proposed approach, in my opinion.
> Obviously, almost always there will be some pipeline code that is written
> specifically for the data in mind.
> >>
> >> A couple other examples of "hints" that you can keep in mind are
> Combine.withHotkeyFanout, Redistribute (both variants), and
> GroupByKey.fewKeys. These were chosen to be expressed as transforms, even
> though they are more like hints. I bring up these examples to say that we
> don't have to be too pedantic here, because it is already too late :-)
> >>
> >> And anyhow a runner is always allowed to implement any piece of a
> pipeline with anything that has the same "behavior", whether or not it is
> expressed as a hint or some other way (that's the whole point of Beam, and
> how we have fusion, combiner lifting, flatten unzipping, multiple runners,
> etc).
> >>
> >> It is probably less appropriate for reusable transforms that are
> expected to be used in more than one context. That can be up to
> transform/pipeline authors.
> >>
> >> And to bring it back around and connect to the above: having an API
> like GroupByKey.biggerThanMemory() as an API choice is just as fine with me
> as GroupByKey.fewKeys() and it can just be a composite that adds the
> hint/annotation to the primitive node. No need to combine API design with
> model design / no need to force users to express things in terms of the
> lowest level parts of the model.
> >>
> >> I was thinking about that as well. But there is a problem. The GBK is
> often part of some other transform (e.g. FileIO, but can be any other). We
> need a way to (optionally) change the behavior of a transform that is part
> of some outer composite. Therefore this should work for
> >>
> >> FileIO.write(...).addAnnotation(GroupByKey.HUGE)
> >
> > It is a very good point that library transforms should probably not be
> annotated but they do need to be adjusted when executed. FWIW this is also
> why windowing strategy is on PCollection and automatically propagated.
> >
> > But also another good example: FileIO has GBKs that are small even if
> the data incoming is huge. In the analogy with windowing strategy, the
> library transform has to own the re-windowing / re-sizing.
> >
> > So maybe PCollection.addAnnotation(SizeEstimate.HUGE) could make more
> sense.
>
> This doesn't solve the problem, as the operation you're trying to
> modify may be entirely internal to the composite. (Unless we have
> annotations that get attached to inputs and "follow" through like
> windowing, with operations that can add/remove/modify these
> annotations.)
>
> Being able to annotate a composite and having it apply (per runner
> semantics) to all subtransforms doesn't seem too bad. If you really
> need to have part of the transform be executed one way, and part
> another, that feels like you need to break apart (re-implement) the
> transform itself.
>

Or a transform (GBK in this case) should be able to perform a call to get
the aggregated set of annotations of composites that surround it. That way
you can just add the annotation to the outermost transform (FileIO in this
case) and the GBK should be able to change the behavior based on that.
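
A minimal sketch of that lookup, assuming a hypothetical Node type that stands in for a transform in the pipeline graph (it is not a Beam class, and the annotation key "gbk.size" is made up for illustration). It also assumes that when the same key is set at several levels, the innermost value wins; the thread does not settle that question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these types exist in Beam. */
public class AnnotationSketch {

  /** Stand-in for a transform node that knows its enclosing composite. */
  public static final class Node {
    private final String name;
    private final Node parent; // enclosing composite, or null for the outermost transform
    private final Map<String, String> annotations = new LinkedHashMap<>();

    public Node(String name, Node parent) {
      this.name = name;
      this.parent = parent;
    }

    public Node annotate(String key, String value) {
      annotations.put(key, value);
      return this;
    }

    /**
     * Aggregates annotations from the outermost composite down to this node.
     * Assumption: inner annotations override outer ones with the same key.
     */
    public Map<String, String> effectiveAnnotations() {
      List<Node> chain = new ArrayList<>();
      for (Node n = this; n != null; n = n.parent) {
        chain.add(0, n); // prepend, so the outermost node ends up first
      }
      Map<String, String> merged = new LinkedHashMap<>();
      for (Node n : chain) {
        merged.putAll(n.annotations);
      }
      return merged;
    }

    @Override
    public String toString() {
      return name;
    }
  }

  public static void main(String[] args) {
    // Annotate only the outermost transform; the inner GBK sees the annotation.
    Node fileIo = new Node("FileIO.write", null).annotate("gbk.size", "HUGE");
    Node innerComposite = new Node("InnerComposite", fileIo);
    Node gbk = new Node("GroupByKey", innerComposite);
    System.out.println(gbk + " sees " + gbk.effectiveAnnotations());
  }
}
```

With this shape the user annotates FileIO once, and the GBK decides how to execute based on the merged map.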

- Cham


>
> > But then it starts to look like a lot of manual propagation of
> annotations (if we don't make it default) or a lot of manually undoing
> annotations (if we do make it default).
>
> And that too.
>


Re: [Bug] Problem with subscription

2025-01-22 Thread XQ Hu via dev
Perhaps your email provider is filtering it? Moving dev to Bcc.

On Wed, Jan 22, 2025 at 4:43 PM Enrique Calderon 
wrote:

> I have checked everything and even tried again but with no success. I can
> also confirm it works because I am able to receive the confirmation email
> on another email account but not on this one.
> On Wednesday, January 22nd, 2025 at 3:32 PM, XQ Hu 
> wrote:
>
> Can you check your spam folder? I just tested it with my personal gmail
> and I got the confirmation email and just replied to it.
>
> On Wed, Jan 22, 2025 at 4:31 PM Enrique Calderon 
> wrote:
>
>> No, I haven't received any emails at all, including any subscription
>> emails.
>> Re-sending this because I did not reply to all
>> On Wednesday, January 22nd, 2025 at 3:13 PM, XQ Hu 
>> wrote:
>>
>> Did you get the email like confirm subscribe to u...@beam.apache.org to
>> confirm your subscription?
>>
>> On Wed, Jan 22, 2025 at 3:44 PM Robert Bradshaw via dev <
>> dev@beam.apache.org> wrote:
>>
>>> Welcome to the community, Enrique!
>>>
>>> I have no idea why the subscriptions aren't working, or how to debug
>>> this. Apache infra would probably have people who would be better at
>>> looking into this, as they run the mailing lists.
>>>
>>> On Wed, Jan 22, 2025 at 11:45 AM Pablo Estrada 
>>> wrote:
>>> >
>>> > Hello team!
>>> > I have been working with Enrique and I tried to help him figure out if
>>> there was something wrong with his subscribe requests to no avail. Anybody
>>> knows how to escalate/debug where his subscription request is
>>> failing/getting stuck?
>>> > Best
>>> > -P.
>>> >
>>> > On 2025/01/22 19:37:45 Enrique Calderon wrote:
>>> > > Hi!
>>> > > I am trying to subscribe to both the mailing lists of users and
>>> developers. I have already sent several emails to both
>>> dev-subscr...@beam.apache.org and user-subscr...@beam.apache.org but I
>>> am never receiving anything that notifies me about the subscription neither
>>> the new emails on the mailing list.
>>> > > My email is this, ksobrena...@ks32.dev.
>>> > > Greetings
>>> > > - Enrique (ksobrenat32)
>>>
>>
>>
>


Re: [ANNOUNCE] New PMC Member: Danny McCormick

2025-01-22 Thread Ahmet Altay via dev
Congratulations Danny!

On Wed, Jan 22, 2025 at 2:39 AM Reza Rokni via dev 
wrote:

> Woohoo! Congrats :-)
>
> On Fri, Jan 10, 2025 at 6:35 PM Kenneth Knowles  wrote:
>
>> Congrats!
>>
>> On Thu, Jan 9, 2025 at 10:15 AM Yi Hu via dev 
>> wrote:
>>
>>> Congrats, Danny!
>>>
>>> On Wed, Jan 8, 2025 at 8:40 PM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
>>>> Congrats and Thanks, Danny!
>>>>
>>>> On Fri, Dec 27, 2024 at 5:51 AM Ahmed Abualsaud via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Well deserved! Thanks for all your hard work Danny
>>>>>
>>>>> On Fri, Dec 20, 2024 at 7:58 PM LDesire  wrote:
>>>>>
>>>>>> Congratulations Danny! 😀
>>>>>
>>>>>


Urgent: Action Required for Iceberg Production

2025-01-22 Thread Ling Li
Hi team,

Happy New Year! I have a question regarding
"org.apache.beam:beam-sdks-java-io-iceberg:2.61.0".

Does the method org.apache.beam.sdk.schemas.Schema getDataSchema(String
destination) already exist on DynamicDestinations?

Our team tried to use DynamicDestinations to decide at runtime which Iceberg
table to write to, but failed.

Attached is my class DynamicIcebergDestinations. Our problem is that we
don't have a universal schema matching all events that we could return from
getDataSchema(). We need a parameter to select the correct schema for each
event, so we want to use getDataSchema(String destination). But it seems
getDataSchema(String destination) is not implemented for
org.apache.beam.sdk.io.iceberg.DynamicDestinations. Since getDataSchema()
is called automatically and cannot return a universal schema, there is no
way for us to use
IcebergIO.writeRows(icebergCatalogConfig).to(dynamicDestinations). I also
tried the Managed I/O connector, but I don't think Managed.write supports
DynamicDestinations. Could you give some urgent guidance?
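
To illustrate the gap described above, here is a minimal self-contained sketch. It uses hypothetical stand-in interfaces rather than Beam's or Iceberg's classes, represents a schema simply as a list of field names, and uses made-up table names. It only contrasts a single-schema method with a per-destination lookup; it is not a statement of what the Beam API offers.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these types are Beam or Iceberg classes. */
public class PerDestinationSchemaSketch {

  /** The shape described in the email: one schema shared by every destination. */
  public interface SingleSchemaDestinations {
    List<String> getDataSchema();
  }

  /** The shape being asked for: the schema is chosen per destination. */
  public interface PerDestinationSchemas {
    List<String> getDataSchema(String destination);
  }

  /** Simple map-backed lookup from destination table name to its schema. */
  public static final class MapBackedSchemas implements PerDestinationSchemas {
    private final Map<String, List<String>> schemas;

    public MapBackedSchemas(Map<String, List<String>> schemas) {
      this.schemas = new HashMap<>(schemas);
    }

    @Override
    public List<String> getDataSchema(String destination) {
      List<String> schema = schemas.get(destination);
      if (schema == null) {
        throw new IllegalArgumentException("No schema registered for " + destination);
      }
      return schema;
    }
  }

  public static void main(String[] args) {
    // Table names and fields are invented for the example.
    PerDestinationSchemas schemas =
        new MapBackedSchemas(
            Map.of(
                "db.clicks", List.of("user_id", "url"),
                "db.orders", List.of("order_id", "amount")));
    System.out.println(schemas.getDataSchema("db.clicks"));
    System.out.println(schemas.getDataSchema("db.orders"));
  }
}
```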

Thank you so much.

Sincerely,
Luyao


DynamicIcebergDestinations.java
Description: Binary data