Thanks for the thoughtful comments, Michael and Shivaram. From what I’ve
seen in this thread and on JIRA, it looks like the current plan with regard
to application-facing APIs for sinks is roughly:
1. Rewrite incremental query compilation for Structured Streaming.
2. Redesign Structured Streaming's source and sink APIs so that they do not
depend on RDDs.
3. Allow the new APIs to stabilize.
4. Open these APIs to use by application code.

Is there a way for those of us who aren’t involved in the first two steps
to get some idea of the current plans and progress? I get asked a lot about
when Structured Streaming will be a viable replacement for Spark Streaming,
and I like to be able to give accurate advice.

Fred

On Tue, Oct 4, 2016 at 3:02 PM, Michael Armbrust <mich...@databricks.com>
wrote:

> I don't quite understand why exposing it indirectly through a typed
>> interface should be delayed before finalizing the API.
>>
>
> Spark has a long history
> <https://spark-project.atlassian.net/browse/SPARK-1094> of maintaining
> binary compatibility in its public APIs.  I strongly believe this is one of
> the things that has made the project successful.  Exposing internals that
> we know are going to change in the primary user facing API for creating
> Streaming DataFrames seems directly counter to this goal.  I think the
> argument that "you can do it anyway" fails to capture the expectations of
> users who probably aren't closely following this discussion.
>
> If advanced users want to dig through the code and experiment, great.  I
> hope they report back on what's good and what can be improved.  However, if
> you add the function suggested in the PR to DataStreamReader, you are
> giving them a bad experience by leaking internals that don't even show up
> in the published documentation.
>