[ https://issues.apache.org/jira/browse/SPARK-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553301#comment-15553301 ]
Yuval Itzchakov commented on SPARK-15406:
-----------------------------------------
As someone using Spark Streaming and Kafka in production, I really don't
understand the need for a "speedy" 2.0.1 release. People today are generally
confused about the availability of Kafka support in Structured Streaming. I've
answered numerous StackOverflow questions where people ask "how do I use Kafka
as a source?", not understanding that this is still a missing piece of the
puzzle. I would appreciate a design process that can be split into smaller
sets of features, one released with every minor version, but there still needs
to be a thought process behind this. For example, not being able to specify
offsets in a production environment is a show stopper for me when I'm expected
to deal with exactly-once semantics, and this may be a big deal for other users
as well. I think releasing a small subset of features would only confuse people
even more, especially if things are about to break in the future.
Having said that, I would be happy to take part in such a design process and
help with the implementation where needed. I think Tathagata's document is a
good start, and perhaps we should move all questions and remarks to the document.
> Structured streaming support for consuming from Kafka
> -----------------------------------------------------
>
> Key: SPARK-15406
> URL: https://issues.apache.org/jira/browse/SPARK-15406
> Project: Spark
> Issue Type: New Feature
> Reporter: Cody Koeninger
>
> This is the parent JIRA to track all the work for building a Kafka source
> for Structured Streaming. Here is the design doc for an initial version of
> the Kafka Source.
> https://docs.google.com/document/d/19t2rWe51x7tq2e5AOfrsM9qb8_m7BRuv9fel9i0PqR8/edit?usp=sharing
> ================== Old description =========================
> Structured streaming doesn't have support for Kafka yet. I personally feel
> like time-based indexing would make for a much better interface, but it's
> been pushed back to Kafka 0.10.1
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index