I'm glad you think it's generally a good idea!

I will mention, though, that with these better docs I've almost finished,
I'm hoping that Structured Streaming no longer stays a specialist topic
that requires "trench warfare." With good pedagogy, I think that it's very
approachable. The Knowledge Sharing Hub could be useful for e2e real-world
use-cases, but I think that operator semantics, stream configurations, etc.
have a better home in the official documentation.

Thanks for your engagement, Mich. Looking forward to hearing others'
opinions.

Neil

On Mon, Mar 25, 2024 at 2:50 PM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi,
>
> Your intended work on improving the Structured Streaming documentation is
> great! Clear and well-organized instructions are important for everyone
> using Spark, beginners and experts alike.
> Having said that, Spark Structured Streaming, much like other specialist
> topics in Spark (k8s, say) or elsewhere, cannot be mastered by
> documentation alone. These topics require a considerable amount of practice
> and trench warfare, so to speak, to master them. Suffice to say that I agree
> with the proposal to add examples. However, it is an area that many try
> to master but fail (judging by typical issues brought up in the user group
> and elsewhere). Perhaps a section such as the proposed "Knowledge
> Sharing Hub" may become more relevant. Moreover, the examples have to
> reflect real-life scenarios; otherwise they will be of limited use.
>
> HTH
>
> Mich Talebzadeh,
>
> Technologist | Data | Generative AI | Financial Fraud
>
> London
> United Kingdom
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> Disclaimer: The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Wernher von Braun)".
>
>
>
> On Mon, 25 Mar 2024 at 21:19, Neil Ramaswamy <n...@ramaswamy.org> wrote:
>
>> Hi all,
>>
>> I recently started an effort to improve the Structured Streaming
>> documentation. I thought that the current documentation, while very
>> comprehensive, could be improved in terms of organization, clarity, and
>> presence of examples.
>>
>> You can view the repo here
>> <https://github.com/neilramaswamy/structured-streaming>, and you can see
>> a preview of the site here <https://structured-streaming.vercel.app/>.
>> It's almost at full parity with the programming guide, and it also has
>> additional content, like a guide on unit testing and an in-depth
>> explanation of watermarks. I think it's at a point where we can bring this
>> to completion if it's something that the community wants.
>>
>> I'd love to hear feedback from everyone: is this something that we would
>> want to move forward with? As it borrows certain parts from the programming
>> guide, it has an Apache License, so I'd be more than happy if it is adopted
>> by an official Spark repo.
>>
>> Best,
>> Neil
>>
>
