Respected Sir,
Is a strong understanding of deep learning fundamentals essential as a 
project prerequisite? What about machine learning fundamentals? In short, 
out of the three nested fields (AI, ML, DL), which one would you suggest I 
have the strongest grip on, especially for this project?

Thanking you,
Siddharth

From: SIDDHARTH SALIAN <siddharthsalia...@gmail.com>
Date: Tuesday, 4 March 2025 at 1:53 AM
To: Danny McCormick <dannymccorm...@google.com>, Danny McCormick via user 
<user@beam.apache.org>
Subject: Re: Regarding the GSOC 2025 Project
Respected Sir,
Thank you for the email. I have understood. I'll continue the conversation 
about this project on the user mailing list. For anything involving ideas and 
opinions, I shall move to the dev list.

Thanking you,
Siddharth

From: Danny McCormick <dannymccorm...@google.com>
Date: Tuesday, 4 March 2025 at 1:50 AM
To: SIDDHARTH SALIAN <siddharthsalia...@gmail.com>
Cc: Danny McCormick via user <user@beam.apache.org>
Subject: Re: Regarding the GSOC 2025 Project
I'd recommend using the dev@ list; both are fine, but dev@ is more likely to 
have folks with ideas/opinions.

Thanks,
Danny

On Mon, Mar 3, 2025 at 3:17 PM SIDDHARTH SALIAN 
<siddharthsalia...@gmail.com> wrote:
Respected Sir,
Thank you for the email. I have understood. Should I move to the dev mailing 
list for further conversation on this project, or shall I continue future 
conversations here on the user mailing list?

Thanking you,
Siddharth

From: Danny McCormick <dannymccorm...@google.com>
Date: Tuesday, 4 March 2025 at 1:43 AM
To: SIDDHARTH SALIAN <siddharthsalia...@gmail.com>
Cc: Danny McCormick via user <user@beam.apache.org>
Subject: Re: Regarding the GSOC 2025 Project
I think that should be plenty for now, thanks!

On Mon, Mar 3, 2025 at 3:11 PM SIDDHARTH SALIAN 
<siddharthsalia...@gmail.com> wrote:
Respected Sir,
Thank you for the email. I shall go through the mentioned link. Is there 
anything more to read at present as part of the project prerequisites, or is 
this enough for now?

Thanking you,
Siddharth

From: Danny McCormick <dannymccorm...@google.com>
Date: Tuesday, 4 March 2025 at 1:37 AM
To: SIDDHARTH SALIAN <siddharthsalia...@gmail.com>
Cc: user@beam.apache.org <user@beam.apache.org>, 
damcc...@apache.org <damcc...@apache.org>
Subject: Re: Regarding the GSOC 2025 Project
> Sir, regarding the point about Python: apart from learning the Python 
> language itself, is there any other important topic to learn (such as Python 
> with ML pipelines, etc.) as part of the project prerequisites?

I think that knowing how to write good python code is the most important thing. 
It might be useful, but not required, to understand how to generate embeddings 
using python and more generally to understand how embeddings work [1].

Thanks,
Danny
[1] https://stackoverflow.blog/2023/11/09/an-intuitive-introduction-to-text-embeddings/
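
As a rough illustration of the idea behind text embeddings mentioned above, 
here is a minimal sketch. It assumes a toy word-count embedding over a fixed 
vocabulary rather than a real embedding model; the names `VOCAB`, `embed`, and 
`cosine_similarity` are hypothetical and are not part of Beam or any library.

```python
import math
from collections import Counter

# Hypothetical fixed vocabulary; a real embedding model learns dense vectors.
VOCAB = ["beam", "pipeline", "vector", "database", "cat", "dog"]


def embed(text):
    """Toy embedding: count how often each vocabulary word appears in text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


query = embed("beam pipeline")
related = embed("a beam pipeline writes to a vector database")
unrelated = embed("the cat chased the dog")

# Texts that share vocabulary with the query score higher than texts that don't.
print(cosine_similarity(query, related) > cosine_similarity(query, unrelated))
```

The point of the sketch is only that similar texts map to nearby vectors; a 
real model captures meaning, not just shared words.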

On Mon, Mar 3, 2025 at 3:02 PM SIDDHARTH SALIAN 
<siddharthsalia...@gmail.com> wrote:
Respected Sir,

With reference to the previous email:

  1.  Thank you for the email. I shall communicate through the mailing lists.

  2.  Sir, regarding the point about Python: apart from learning the Python 
language itself, is there any other important topic to learn (such as Python 
with ML pipelines, etc.) as part of the project prerequisites?


Best Regards,
Thanking you,
Siddharth Salian

From: Danny McCormick via user <user@beam.apache.org>
Date: Tuesday, 4 March 2025 at 1:25 AM
To: user@beam.apache.org <user@beam.apache.org>
Cc: Danny McCormick <dannymccorm...@google.com>
Subject: Re: Regarding the GSOC 2025 Project
> Sir, apart from strong fundamentals of vector DBs, Python fundamentals, the 
> Beam docs, and writing a sink, is there any other important topic to be 
> covered/learnt as part of the project prerequisites?

I think those are the main pieces to consider here.

> Sir, I also wanted to ask what other topics have to be covered in Python, 
> other than the main coding language, as part of the project prerequisites.

I don't understand what you're asking - could you try rephrasing?

> Sir, the GSOC 2025 organization list and the project list have both been 
> released. As I am interested in this project and you are the potential 
> mentor for it, could you please tell me which mode of communication would be 
> better: Slack or the mailing lists? I am asking because I would want to seek 
> help multiple times while I am understanding the project/codebase, as it is 
> a new concept and environment for me. Also, continuous emails may not be 
> appealing. Whatever you prefer, sir, we can follow.

Let's keep the conversation on the mailing list - that way anyone who is 
interested in the project can benefit.

Thanks,
Danny

On Sun, Mar 2, 2025 at 11:48 AM SIDDHARTH SALIAN 
<siddharthsalia...@gmail.com> wrote:
Respected Sir,

With reference to the previous mails:

  1.  Sir, apart from strong fundamentals of vector DBs, Python fundamentals, 
the Beam docs, and writing a sink, is there any other important topic to be 
covered/learnt as part of the project prerequisites?

  2.  Sir, I also wanted to ask what other topics have to be covered in 
Python, other than the main coding language, as part of the project 
prerequisites.

  3.  Sir, the GSOC 2025 organization list and the project list have both been 
released. As I am interested in this project and you are the potential mentor 
for it, could you please tell me which mode of communication would be better: 
Slack or the mailing lists? I am asking because I would want to seek help 
multiple times while I am understanding the project/codebase, as it is a new 
concept and environment for me. Also, continuous emails may not be appealing. 
Whatever you prefer, sir, we can follow.

Best Regards,
Thanking you,
Siddharth Salian

From: SIDDHARTH SALIAN <siddharthsalia...@gmail.com>
Date: Friday, 21 February 2025 at 12:59 AM
To: user@beam.apache.org <user@beam.apache.org>
Subject: Re: Regarding the GSOC 2025 Project
Hello Sir,
Thank you for the email. I have understood.

Thanks,
Siddharth Salian

From: Danny McCormick via user <user@beam.apache.org>
Date: Thursday, 20 February 2025 at 9:51 PM
To: user@beam.apache.org <user@beam.apache.org>
Cc: Danny McCormick <dannymccorm...@google.com>
Subject: Re: Regarding the GSOC 2025 Project
> Sir, as you mentioned in the mail, Python is a must for this project. I just 
> wanted to ask about the Java and Golang SDKs. I know it's an AI/ML 
> pipeline-based project, but if you could tell me, it would add to my 
> clarity.

I would expect this project to pretty much exclusively be in Python. The only 
exception is if some vector DB or feature store only offers a Go or Java client 
(but this seems unlikely).

> Sir, I also wanted to ask: as Retrieval Augmented Generation (RAG) is 
> closely related to this project, do you think RAG is still limited to 
> capturing historical data, or can it capture the latest data too?

I'm not sure I understand the question, but I can try to give an overview of 
how I think Beam and RAG work together. Basically, I think Beam can be used to:


  1.  Ingest data -> generate embeddings -> write to a vector DB. This can 
include very recent data; it just depends on how you configure your source 
(e.g. you could ingest data continuously with PubSub or Kafka).
  2.  Ingest incoming query -> enrich with embedding data from a vector DB -> 
perform inference with the additional relevant context -> write result 
somewhere.

So I think this can handle reasonably tight data freshness requirements.
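
The two flows above can be sketched in plain Python. This is only an 
illustration under stated assumptions, not Beam code: `toy_embed`, 
`InMemoryVectorDB`, `ingest`, and `build_prompt` are hypothetical stand-ins 
for an embedding model, a vector DB, and the pipeline steps, and the final 
inference step is left out. In a real Beam pipeline each step would be a 
transform, and the source could be streaming.

```python
import math


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: unit letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryVectorDB:
    """Hypothetical stand-in for a vector DB: stores (text, vector) rows in a list."""

    def __init__(self):
        self.rows = []

    def write(self, text, vector):
        self.rows.append((text, vector))

    def nearest(self, vector, k=1):
        # Rank stored rows by dot product with the query vector (vectors are unit length).
        def score(row):
            return sum(a * b for a, b in zip(row[1], vector))

        ranked = sorted(self.rows, key=score, reverse=True)
        return [text for text, _ in ranked[:k]]


def ingest(documents, db):
    """Flow 1: ingest data -> generate embeddings -> write to the vector DB."""
    for doc in documents:
        db.write(doc, toy_embed(doc))


def build_prompt(query, db, k=1):
    """Flow 2 (without inference): embed the query, look up context, build a prompt."""
    context = db.nearest(toy_embed(query), k)
    return f"Context: {' | '.join(context)}\nQuestion: {query}"


db = InMemoryVectorDB()
ingest(["apache beam pipeline", "zzz qqq"], db)
print(build_prompt("beam pipeline", db))
```

Because `ingest` can be called at any time, newly arrived documents become 
available to later queries, which is the data-freshness point made above.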

On Tue, Feb 18, 2025 at 11:01 AM SIDDHARTH SALIAN 
<siddharthsalia...@gmail.com> wrote:
Respected Sir,

  1.  Thank you for the email. With reference to the previous mail, I have 
understood all the points, and I shall go through the I/O page in the 
documentation as well as vector DBs and feature stores.

  2.  Sir, as you mentioned in the mail, Python is a must for this project. I 
just wanted to ask about the Java and Golang SDKs. I know it's an AI/ML 
pipeline-based project, but if you could tell me, it would add to my clarity.

  3.  Sir, I also wanted to ask: as Retrieval Augmented Generation (RAG) is 
closely related to this project, do you think RAG is still limited to 
capturing historical data, or can it capture the latest data too?


Best regards,
Thanking you,
Siddharth Salian

From: Danny McCormick via user <user@beam.apache.org>
Date: Tuesday, 18 February 2025 at 8:36 PM
To: user@beam.apache.org <user@beam.apache.org>
Cc: Danny McCormick <dannymccorm...@google.com>
Subject: Re: Regarding the GSOC 2025 Project
Hey Siddharth, thanks for reaching out. I'm glad you're interested in the 
project. In general, I would expect there to be more details about projects 
once we know which ones have been accepted.

> Sir, if you could tell me the prerequisite knowledge (such as the major 
> programming languages used, etc.) for this project, it would bring more 
> clarity to me.

I would expect it to be primarily done in Python, though it depends on what 
connectors are available for each vector DB/feature store. Other than that, the 
main things you'd want to learn about are Beam itself, especially how to write 
a sink (the IO standards 
<https://beam.apache.org/documentation/io/io-standards> can help here), and 
also, at a high level, how vector DBs and feature stores work.

Thanks,
Danny



On Thu, Feb 13, 2025 at 10:55 PM SIDDHARTH SALIAN 
<siddharthsalia...@gmail.com> wrote:
Hello Sir,

  1.  I am writing this email with reference to the GSOC 2025 mail: 
https://lists.apache.org/thread/o3mwncq0k4c58c630n49l7bvhq74o2wj

  2.  I'm Siddharth Salian, an undergraduate student, and I have just joined 
the Apache Beam community. After going through the GSOC 2025 idea list and the 
project descriptions, I found this project interesting: 
https://issues.apache.org/jira/browse/GSOC-279. I would like to contribute to 
this project in GSOC 2025, since AI/ML is my area of interest. Since you are 
the mentor, I'm letting you know, sir.

  3.  Sir, if you could tell me the prerequisite knowledge (such as the major 
programming languages used, etc.) for this project, it would bring more 
clarity to me.

  4.  Sir, I also wanted to ask whether there is any other project that you 
are thinking about for GSOC 2025; I would like to contribute to it as well.


Best Regards,

Thanking You
Siddharth Salian

