Hi Tathagata Das,

I was trying to use Event Hubs with Spark Structured Streaming. It looks like I was able to make the connection successfully, but I cannot see any data on the console. I am not sure whether Event Hubs is supported or not.
https://github.com/Azure/spark-eventhubs/blob/master/examples/src/main/scala/com/microsoft/spark/sql/examples/EventHubsStructuredStreamingExample.scala is the code snippet I have used to connect to Event Hubs.

Thanks,
Asmath

On Thu, Oct 26, 2017 at 9:39 AM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:

> Thanks TD.
>
> On Wed, Oct 25, 2017 at 6:42 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>
>> Please do not confuse old Spark Streaming (DStreams) with Structured Streaming. Structured Streaming's offset and checkpoint management is far more robust than DStreams.
>> Take a look at my talk - https://spark-summit.org/2017/speakers/tathagata-das/
>>
>> On Wed, Oct 25, 2017 at 9:29 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>>
>>> Thanks Subhash.
>>>
>>> Have you ever used the zero-data-loss concept with streaming? I am a bit worried about using streaming when it comes to data loss.
>>>
>>> https://blog.cloudera.com/blog/2017/06/offset-management-for-apache-kafka-with-apache-spark-streaming/
>>>
>>> Does structured streaming handle it internally?
>>>
>>> On Wed, Oct 25, 2017 at 3:10 PM, Subhash Sriram <subhash.sri...@gmail.com> wrote:
>>>
>>>> No problem! Take a look at this:
>>>>
>>>> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovering-from-failures-with-checkpointing
>>>>
>>>> Thanks,
>>>> Subhash
>>>>
>>>> On Wed, Oct 25, 2017 at 4:08 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>>>>
>>>>> Hi Sriram,
>>>>>
>>>>> Thanks. This is what I was looking for.
>>>>>
>>>>> One question: where do we need to specify the checkpoint directory in the case of structured streaming?
>>>>>
>>>>> Thanks,
>>>>> Asmath
>>>>>
>>>>> On Wed, Oct 25, 2017 at 2:52 PM, Subhash Sriram <subhash.sri...@gmail.com> wrote:
>>>>>
>>>>>> Hi Asmath,
>>>>>>
>>>>>> Here is an example of using structured streaming to read from Kafka:
>>>>>>
>>>>>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredKafkaWordCount.scala
>>>>>>
>>>>>> In terms of parsing the JSON, there is a from_json function that you can use. The following might help:
>>>>>>
>>>>>> https://databricks.com/blog/2017/02/23/working-complex-data-formats-structured-streaming-apache-spark-2-1.html
>>>>>>
>>>>>> I hope this helps.
>>>>>>
>>>>>> Thanks,
>>>>>> Subhash
>>>>>>
>>>>>> On Wed, Oct 25, 2017 at 2:59 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Could anyone provide suggestions on how to parse JSON data from Kafka and load it back into Hive?
>>>>>>>
>>>>>>> I have read about structured streaming but didn't find any examples. Is there any best practice on how to read it and parse it with structured streaming for this use case?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Asmath
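
For anyone who lands on this thread later, below is a minimal sketch of the Kafka -> from_json -> Hive path discussed above, with the checkpoint directory supplied through the checkpointLocation option on the sink. The broker address, topic name, JSON schema, and paths are hypothetical placeholders rather than values from this thread, and writing parquet files that a Hive external table points at is just one common way to land the data.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{StructType, StringType, DoubleType, TimestampType}

object KafkaJsonToHiveSketch {
  def main(args: Array[String]): Unit = {
    // Requires the spark-sql-kafka-0-10 connector on the classpath.
    val spark = SparkSession.builder()
      .appName("KafkaJsonToHiveSketch")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Hypothetical schema of the JSON payload in the Kafka "value" column.
    val schema = new StructType()
      .add("id", StringType)
      .add("amount", DoubleType)
      .add("event_ts", TimestampType)

    // Read the topic as an unbounded streaming DataFrame (broker/topic are placeholders).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "events")
      .load()

    // Kafka delivers the value as binary; cast it to string and parse it with from_json.
    val parsed = raw
      .selectExpr("CAST(value AS STRING) AS json")
      .select(from_json($"json", schema).as("data"))
      .select("data.*")

    // Write parquet files that a Hive external table can be defined over.
    // checkpointLocation is where Structured Streaming keeps its offsets and
    // metadata, which is what enables recovery without data loss.
    val query = parsed.writeStream
      .format("parquet")
      .option("path", "hdfs:///warehouse/events")
      .option("checkpointLocation", "hdfs:///checkpoints/kafka_json_to_hive")
      .start()

    query.awaitTermination()
  }
}

With the checkpoint location set, restarting the query against the same checkpoint directory resumes from the recorded Kafka offsets, which is how Structured Streaming addresses the zero-data-loss concern raised earlier in the thread.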