Hello! My name is Nivaldo, and I'd like to express my interest in joining this year's GSoC to add real-world use cases for Beam's MLTransform/Enrichment transforms <https://issues.apache.org/jira/browse/GSOC-259>.

***About me***

I am a Senior Data Engineer from Brazil, with 6-7 years of experience helping companies make the most of their data. I've contributed to Beam in the past (see [1] <https://github.com/apache/beam/pull/23879> and [2] <https://github.com/apache/beam/issues/21089>), but I think I still fit the criteria for a beginner in open source development (using [3] <https://developers.google.com/open-source/gsoc/faq#how_do_i_know_if_i_am_considered_a_beginner_in_open_source_development> as a reference).

Most notably, I spent 2-3 months contributing to the creation of a Rust SDK for Beam, but due to unfortunate events I abruptly stopped contributing. I was happy to see that some amazing members of the community were able to fork the code I wrote and continue from there. Part of my motivation for that contribution was to prepare a career transition into Software Engineering, but I had to put that goal on hold at the time. Recently, my circumstances have changed, and I have been preparing to pursue a more domain-specific version of this goal, directed towards machine learning. Working on this project would be an excellent way to build my portfolio, learn relevant skills, and contribute to the Beam community.

I learned a lot about Beam's internals and fundamental concepts while working on the Rust SDK (see [4] <https://github.com/apache/beam/compare/master...nivaldoh:beam:rust_sdk> for my commits), and I think this knowledge would give me a good head start on the ML transforms.

Briefly, I also have some experience working with Beam professionally (see [5] <https://github.com/google/megalista/pull/12>), and I hold two official Google Cloud certifications (Professional Data Engineer and Professional ML Engineer). I have a bachelor's degree in CS, and there's a chance I might start a Master's degree program in CS/AI this summer/fall (pending university decisions).

***Questions***

1. Would I actually be eligible to apply to GSoC for this project, or do I no longer count as an open source beginner? As far as I'm aware, the total number of PRs and issues I've ever opened on GitHub is below 10. I've never worked formally as a Software Engineer, so I'd have a lot to learn from a mentor and would be looking forward to that.
2. I'd like to understand the scope and exact purpose of the use cases a bit better. Are they meant to serve as standalone tutorials with purely mock data, or as reusable/adaptable examples where users can plug in their own data? Additionally, is my assessment correct that the implementation would consist essentially of actual code, testing, and documentation?
3. Would it be possible to define exactly what would count as a "slowly changing source" for the purposes of the Enrichment use cases to be implemented?
4. Regarding the implementation of one or more additional Enrichment handlers for currently unsupported sources: would we be looking into adding something like a BigQueryEnrichmentHandler, for instance?

Thank you for reading this.

***References***

[1]: https://github.com/apache/beam/pull/23879
[2]: https://github.com/apache/beam/issues/21089
[3]: https://developers.google.com/open-source/gsoc/faq#how_do_i_know_if_i_am_considered_a_beginner_in_open_source_development
[4]: https://github.com/apache/beam/compare/master...nivaldoh:beam:rust_sdk
[5]: https://github.com/google/megalista/pull/12