Re: Inquiry About GSoC Project - Beam ML Vector DB/Feature Store Integrations

2025-03-17 Thread A Gardner
Hi Need to unsubscribe, my gmail is rather full - how do I achieve this? Alex On Mon, 17 Mar 2025 at 13:18, Danny McCormick via dev wrote: > Hey Aditya, there is not necessarily a single set of benchmarks which we > can use to evaluate an IO, and defining exactly what/how we should be > measur

[GSoC] Project Proposal - Enhancing Data Lineage Support in Beam

2025-03-17 Thread Charles Nguyen
Hi all, I'm Charles and I'm interested in Beam's GSoC project this year on enhancing data lineage support (GitHub issues 33980 and 33981). I've written up the proposal here

Re: Testing Beam YAML pipelines

2025-03-17 Thread Joey Tran
Got an access denied on the doc. Is it intended to be shared publicly? On Mon, Mar 17, 2025 at 6:17 PM Robert Bradshaw via dev wrote: > I've been thinking a bit about productionizing yaml pipelines, and a > large part of that involves being able to write and run tests. > > I've put my thoughts up a

Re: Testing Beam YAML pipelines

2025-03-17 Thread Robert Bradshaw via dev
Sorry, yes, that was the intent. Try it now. Happy to make anyone who wants to contribute an editor as well. On Mon, Mar 17, 2025 at 3:32 PM Joey Tran wrote: > > Got an access denied on the doc. Is it intended to be shared publicly? > > On Mon, Mar 17, 2025 at 6:17 PM Robert Bradshaw via dev > wro

Beam - Google Summer of Code Contributor Inquiry

2025-03-17 Thread Jerry Liang
Hello, I am a recent EECS graduate from UC Berkeley interested in databases and distributed systems. I came across the Google Summer of Code project at this link (https://issues.apache.org/jira/browse/GSOC-279) and was curious as to how I should get started in developing a proposal. How should I

Testing Beam YAML pipelines

2025-03-17 Thread Robert Bradshaw via dev
I've been thinking a bit about productionizing YAML pipelines, and a large part of that involves being able to write and run tests. I've put my thoughts up at https://s.apache.org/beam-yaml-testing ; comments welcome. - Robert

Re: [python] is merge_accumulators called with stream of accumulators?

2025-03-17 Thread Joey Tran
Trying the dev list instead. I can't find any mention in the documentation of whether `merge_accumulators` is called with all the accumulators at once or with an iterator. Perusing some of the CombineFns in the Python SDK, it looks like all of them work with an iterator and don't try to iterate over
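To make the question concrete, here is a minimal sketch that is not taken from the thread. It assumes a hypothetical `MeanFn` class written in plain Python that only mirrors the method names of a Beam `CombineFn` (it does not import or subclass anything from Beam). It shows a `merge_accumulators` that makes a single pass over its argument, so it behaves the same whether it is handed a list or a one-shot iterator. Whether the SDK guarantees one or the other is exactly what the message asks, and this sketch does not answer that.

```python
class MeanFn:
    """Hypothetical stand-in that mirrors CombineFn method names (no Beam import)."""

    def create_accumulator(self):
        # Accumulator is a (running sum, count) pair.
        return (0.0, 0)

    def add_input(self, accumulator, element):
        total, count = accumulator
        return (total + element, count + 1)

    def merge_accumulators(self, accumulators):
        # Single pass only: no len(), no indexing, no second loop.
        # This works for a list and for a one-shot iterator alike.
        total, count = 0.0, 0
        for part_total, part_count in accumulators:
            total += part_total
            count += part_count
        return (total, count)

    def extract_output(self, accumulator):
        total, count = accumulator
        return total / count if count else float("nan")


fn = MeanFn()
parts = [(3.0, 2), (9.0, 1), (0.0, 0)]

# Same partial accumulators, once as a list and once as a generator.
from_list = fn.merge_accumulators(parts)
from_generator = fn.merge_accumulators(p for p in parts)
```

Writing the merge as one loop is the conservative choice here: it makes no assumption about being able to traverse the input twice.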

Re: Inquiry About GSoC Project - Beam ML Vector DB/Feature Store Integrations

2025-03-17 Thread Danny McCormick via dev
You can email dev-unsubscr...@beam.apache.org. https://beam.apache.org/community/contact-us/#:~:text=After%20you%20subscribe%2C%20you'll,%40beam.apache.org. On Mon, Mar 17, 2025 at 10:55 AM A Gardner wrote: > Hi > > Need to unsubscribe, my gmail is rather full - how do I achieve this? > > Alex

Re: Inquiry About GSoC Project - Beam ML Vector DB/Feature Store Integrations

2025-03-17 Thread Danny McCormick via dev
Hey Aditya, there is not necessarily a single set of benchmarks which we can use to evaluate an IO, and defining exactly what/how we should be measuring completeness and performance is part of the work to be done here. I think this is a good thing for you to try to initially define in your project