Dear Marcin,

Thank you for your suggestion. I think it's a great idea.
What do you think about providing the attachment details to users, including filename, type, and data (as a byte array)?

Thank you for your valuable feedback.

Best regards,
LDesire

> On Nov 12, 2024, at 9:49 PM, Marcin Stańczak <m.stanczak...@gmail.com> wrote:
>
> Hello,
>
> One additional scenario that might be useful is the ability to fetch
> and process email attachments, such as CSV files, from specific
> senders who send automated reports with a consistent schema. This
> would allow for seamless integration of recurring email-based data
> into data pipelines.
>
> Looking forward to hearing about your progress!
>
> Best regards,
> Marcin
>
> On Tue, Nov 12, 2024 at 1:42 PM Piotr Wiśniowski
> <contact.wisniowskipi...@gmail.com> wrote:
>>
>> Hi,
>> I had a theoretical PoC project in the past that needed quite similar
>> functionality.
>> Bounded read makes sense, and can be treated as a special case of unbounded
>> read. The second use I could imagine is doing the same (reading emails for
>> downstream processing, such as logic triggers or ML categorization, and then
>> sending them to different departments).
>> From my perspective, write is way more complicated, and I am not sure if
>> Beam/streaming applications are the best pick for this task. I see two
>> potential problems. The first is that it needs distributed throttling out of
>> the box for sending emails. This can be done by using fixed parallelism (for
>> example, a fixed number of keys) and adaptive throttling (there is some
>> out-of-the-box code for that already). The second problem is that even the
>> exactly-once processing options in runners (Dataflow/Flink) do not guarantee
>> that sending will be executed only once in all cases (they only guarantee
>> that a single output will be seen downstream). To get around that, double
>> locking would probably be required, but this together with throttling might
>> be challenging to get right at the same time.
>> Regarding potential use cases for write: definitely distributed notification
>> systems. I have seen ideas for such projects in at least 3 corporations
>> already. Some features they required (as far as my memory is correct):
>> - templating messages for output (Jinja-like), but this could technically be
>> pushed upstream
>> - priority queue - so that a more urgent message in the priority queue is
>> sent before messages in the normal queue, while still respecting throttling.
>> - single destination throttling - so a single email address will get at most
>> x msgs per week.
>> - channel configuration - so that a user receiving notifications can
>> configure which channel they want to get msgs on (email, Slack, mobile push,
>> SMS, etc.).
>> But the above are typical requirements for whole notification apps, not only
>> for the mail IO, but I guess you could extract some use cases from this.
>>
>> For the unbounded read, emails could definitely be used as a kind of
>> interface that users could use to trigger asynchronous tasks (GDPR data
>> deletion, for example). Having a dedicated mail IO read would avoid the need
>> for a separate backend app to fetch the emails, or additional broker
>> configuration for email systems (sometimes this is not possible because of
>> security policies in corporations).
>>
>> Let me know if this is helpful. Happy to see such initiatives 🙂
>> Best,
>> Wiśniowski Piotr
>>
>>
>> On Tue, Nov 12, 2024 at 13:03, LDesire <two_som...@icloud.com> wrote:
>>>
>>> Hello,
>>>
>>> I am currently working on developing a MailIO connector for Apache Beam.
>>>
>>> While I have made progress implementing bounded read functionality, I'm
>>> somewhat uncertain about the practical use cases where users would need the
>>> MailIO connector.
>>>
>>> The use cases I've considered are:
>>>
>>> - Bounded Read:
>>> Email folder archiving - For example, archiving all messages from specific
>>> folders to storage systems like GCS, HDFS, or S3.
>>>
>>> - Write:
>>> Integrating with messaging systems like Pub/Sub to collect user behavior
>>> data, generating AI-powered messages based on these behaviors, and then
>>> using MailIO.write to compose and send emails.
>>>
>>> I haven't considered implementing Unbounded Read yet.
>>>
>>> I'm wondering if there might be other valuable use cases that I haven't
>>> thought of?
>>>
>>> Thank you.
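
For the attachment proposal at the top of this thread (filename, type, and data as a byte array), here is a minimal sketch of what such a record might look like. It uses only Python's standard-library `email` package and is not based on any existing MailIO API; the names `Attachment`, `extract_attachments`, and `build_sample_message` are hypothetical and only illustrate the shape of the data.

```python
from dataclasses import dataclass
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List


@dataclass(frozen=True)
class Attachment:
    """Hypothetical record exposing the three fields proposed in the thread."""
    filename: str
    content_type: str
    data: bytes


def extract_attachments(raw_message: bytes) -> List[Attachment]:
    """Parse a raw RFC 5322 message and return its attachments."""
    # policy.default makes the parser return an EmailMessage,
    # which provides iter_attachments().
    msg = message_from_bytes(raw_message, policy=policy.default)
    attachments = []
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (e.g. base64) and yields bytes.
        payload = part.get_payload(decode=True) or b""
        attachments.append(
            Attachment(
                filename=part.get_filename() or "",
                content_type=part.get_content_type(),
                data=payload,
            )
        )
    return attachments


def build_sample_message() -> bytes:
    """Build a small message with one CSV attachment (illustrative data only)."""
    msg = EmailMessage()
    msg["From"] = "reports@example.com"
    msg["To"] = "pipeline@example.com"
    msg["Subject"] = "Daily report"
    msg.set_content("Report attached.")
    msg.add_attachment(
        b"id,value\n1,42\n",
        maintype="text",
        subtype="csv",
        filename="report.csv",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for attachment in extract_attachments(build_sample_message()):
        print(attachment.filename, attachment.content_type, len(attachment.data))
```

Keeping the data as raw bytes leaves decoding (for example, parsing the CSV against a known schema) to downstream transforms, which seems to fit the recurring-report scenario Marcin described.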