How can I find out how long a particular RDD remained in the
pipeline?

On Fri, Jun 19, 2015 at 7:59 AM, Tathagata Das <t...@databricks.com> wrote:

> Why do you need to uniquely identify the message? All you need is the time
> when the message was inserted by the receiver and the time when it is
> processed, isn't it?
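
The point above, that an ingest timestamp carried with each record is enough and no message ID is needed, can be sketched roughly as follows. This is plain Python rather than Spark code; `stamp` and `latency_ms` are hypothetical helper names, and the sketch assumes the clocks at the receiver and at the final step are reasonably in sync.

```python
import time


def stamp(payload, now_ms=None):
    """Attach the ingest time to a record, as a custom receiver might."""
    # Hypothetical helper: falls back to the wall clock if no time is given.
    ingest_ms = int(time.time() * 1000) if now_ms is None else now_ms
    return (ingest_ms, payload)


def latency_ms(stamped_record, now_ms=None):
    """Latency of one record: time at the final step minus ingest time."""
    done_ms = int(time.time() * 1000) if now_ms is None else now_ms
    ingest_ms, _payload = stamped_record
    return done_ms - ingest_ms


# Explicit clock values keep the example deterministic.
record = stamp("tuple-1", now_ms=1000)
print(latency_ms(record, now_ms=1450))  # prints 450
```

Because the timestamp travels inside the record, it survives any intermediate transformations that keep the first tuple element intact.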
>
>
> On Thu, Jun 18, 2015 at 2:28 PM, anshu shukla <anshushuk...@gmail.com>
> wrote:
>
>> Thanks a lot, but I have already tried the second way. The problem with
>> that is how to identify a particular RDD from source to sink (as we
>> can do by passing a msg ID in Storm). For that I updated the RDD and
>> added a msgID (as a static variable), but while dumping the records to a
>> file some of the tuples of the RDD failed or went missing (approx. 3000,
>> at a data rate of approx. 1500 tuples/sec).
>>
>> On Fri, Jun 19, 2015 at 2:50 AM, Tathagata Das <t...@databricks.com>
>> wrote:
>>
>>> A couple of ways.
>>>
>>> 1. Easy but approximate way: Find the scheduling delay and processing time
>>> using the StreamingListener interface, and then calculate "end-to-end delay
>>> = 0.5 * batch interval + scheduling delay + processing time". The 0.5 *
>>> batch interval term is the approximate average batching delay across all
>>> the records in the batch.
>>>
>>> 2. Hard but precise way: You could build a custom receiver that embeds
>>> the current timestamp in the records, and then compare it with the
>>> timestamp at the final step of processing. Assuming the executor and
>>> driver clocks are reasonably in sync, this will measure the latency between
>>> the time a record is received by the system and the time the result from
>>> that record is available.
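
The arithmetic of the first, approximate approach can be sketched as below. This is plain Python, not Spark code: the function name and the sample numbers are made up for illustration, and in a real job the scheduling delay and processing time would be the per-batch values reported through a StreamingListener.

```python
def approx_end_to_end_delay_ms(batch_interval_ms, scheduling_delay_ms,
                               processing_time_ms):
    """Approximate average end-to-end delay for the records of one batch."""
    # On average a record arrives halfway through the batch interval,
    # so it waits about half an interval before its batch is formed.
    batching_delay_ms = 0.5 * batch_interval_ms
    return batching_delay_ms + scheduling_delay_ms + processing_time_ms


# Hypothetical numbers: 2 s batches, 150 ms scheduling delay,
# 800 ms processing time.
print(approx_end_to_end_delay_ms(2000, 150, 800))  # prints 1950.0
```

The result is an average over the batch, so individual records can be off by up to half a batch interval in either direction.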
>>>
>>> On Thu, Jun 18, 2015 at 2:12 PM, anshu shukla <anshushuk...@gmail.com>
>>> wrote:
>>>
>>>> Sorry, I missed the word LATENCY. For a large streaming query, how can I
>>>> find the time taken by a particular RDD to travel from the initial
>>>> DStream to the final DStream?
>>>> Please help!
>>>>
>>>> On Fri, Jun 19, 2015 at 12:40 AM, Tathagata Das <t...@databricks.com>
>>>> wrote:
>>>>
>>>>> It's not clear what you are asking. Find "what" among RDD?
>>>>>
>>>>> On Thu, Jun 18, 2015 at 11:24 AM, anshu shukla <anshushuk...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Is there any fixed way to find among RDD in stream processing
>>>>>> systems, in the distributed set-up?
>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Anshu Shukla
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Anshu Shukla
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Anshu Shukla
>>
>
>


-- 
Thanks & Regards,
Anshu Shukla
