Yes. I am running this in local mode, and both SSCs run in the same JVM. So if I deploy this on a cluster, would that behavior go away? Also, is there any way to start the SSCs on a local machine but in different JVMs? I couldn't find anything about this in the documentation.
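Because both contexts share one JVM in local mode, anything held in a Scala object is a single instance visible to both of them. The sketch below is one possible way to keep per-stream record names in such an object without the streams interfering: the map is immutable, and each stream looks up its name by an explicit id. All names in it (RecordNames, TaggingSketch, tagRecords, the ids "ssc1" and "ssc2") are hypothetical, and a plain Seq stands in for the DStream so that it runs without Spark or Kafka. It is an illustration under those assumptions, not the code from this thread.

```scala
// Hypothetical names throughout: RecordNames, TaggingSketch, tagRecords and
// the ids "ssc1"/"ssc2" are illustrative, not part of the Spark API.

// A Scala `object` is a per-JVM singleton, initialised once on first access.
// Keeping its contents immutable means concurrent readers cannot interfere.
object RecordNames {
  private val byPipeline: Map[String, String] = Map(
    "ssc1" -> "RECORD_NAME_1",
    "ssc2" -> "RECORD_NAME_2"
  )

  // Look the name up by an explicit pipeline id rather than by mutable state.
  def nameFor(pipelineId: String): String =
    byPipeline.getOrElse(pipelineId, "UNKNOWN")
}

object TaggingSketch {
  // Stand-in for the per-record function one would pass to DStream.map;
  // a plain Seq is used so the sketch runs without Spark or Kafka.
  def tagRecords(pipelineId: String, records: Seq[String]): Seq[String] = {
    val name = RecordNames.nameFor(pipelineId) // resolved once, then captured
    records.map(record => s"$name|$record")
  }

  def main(args: Array[String]): Unit = {
    println(tagRecords("ssc1", Seq("a", "b")))
    println(tagRecords("ssc2", Seq("c")))
  }
}
```

An immutable map inside an object is built once per JVM and only read afterwards, so the lookups themselves should be cheap. Whether that is fast enough for a particular workload is something to measure rather than assume.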
The intermingling of data seems to have gone away after I turned some of those external classes into Scala objects that keep static maps. Is that a good idea as far as performance is concerned?

Thanks
Gagan B Mishra

On Tue, Apr 22, 2014 at 1:59 AM, Tathagata Das [via Apache Spark User List] <ml-node+s1001560n4556...@n3.nabble.com> wrote:

> Are you by any chance starting two StreamingContexts in the same JVM? That
> could explain a lot of the weird mixing of data that you are seeing. It's
> not a supported usage scenario to start multiple StreamingContexts
> simultaneously in the same JVM.
>
> TD
>
>
> On Thu, Apr 17, 2014 at 10:58 PM, gaganbm <[hidden email]> wrote:
>
>> It happens with a normal data rate, i.e., let's say 20 records per second.
>>
>> Apart from that, I am also getting some more strange behavior. Let me
>> explain.
>>
>> I establish two SSCs and start them one after another. In the SSCs I get
>> the streams from Kafka sources and do some manipulations, such as adding
>> a "Record_Name" to each of the incoming records. This Record_Name is
>> different for the two SSCs, and I get this field from some other class,
>> not relevant to the streams.
>>
>> The expected behavior is that all records in SSC1 get the field
>> RECORD_NAME_1 and all records in SSC2 get the field RECORD_NAME_2. As far
>> as I can tell, the two SSCs have nothing to do with each other.
>>
>> However, strangely enough, I find many records in SSC1 get added with
>> RECORD_NAME_2 and vice versa. Is it some kind of serialization issue?
>> That is, the class which provides this RECORD_NAME gets serialized and is
>> reconstructed, and then some weird thing happens inside? I am unable to
>> figure it out.
>>
>> So, apart from the skewed frequency and volume of records in both
>> streams, I am getting this intermingling of data among the streams.
>>
>> Can you help me with how to use some external data to manipulate the RDD
>> records?
>>
>> Thanks and regards
>>
>> Gagan B Mishra
>>
>> *Programmer*
>> *560034, Bangalore*
>> *India*
>>
>>
>> On Tue, Apr 15, 2014 at 4:09 AM, Tathagata Das [via Apache Spark User
>> List] <[hidden email]> wrote:
>>
>>> Does this happen at low event rate for that topic as well, or only for a
>>> high volume rate?
>>>
>>> TD
>>>
>>>
>>> On Wed, Apr 9, 2014 at 11:24 PM, gaganbm <[hidden email]> wrote:
>>>
>>>> I am really at my wits' end here.
>>>>
>>>> I have different Streaming contexts, let's say 2, both listening to the
>>>> same Kafka topics. I establish the KafkaStream by setting a different
>>>> consumer group for each of them.
>>>>
>>>> Ideally, I should be seeing the Kafka events in both streams. But what
>>>> I am getting is really unpredictable. One stream gets a lot of events
>>>> and the other gets almost nothing, or far fewer than the first. The
>>>> frequency is also very skewed: I get a lot of events in one stream
>>>> continuously, and after some duration I get a few events in the other
>>>> one.
>>>>
>>>> I don't know where I am going wrong. I can see consumer fetcher threads
>>>> for both the streams that listen to the Kafka topics.
>>>>
>>>> I can give further details if needed. Any help will be great.
>>>>
>>>> Thanks
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Strange-behaviour-of-different-SSCs-with-same-Kafka-topic-tp4050.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Strange-behaviour-of-different-SSCs-with-same-Kafka-topic-tp4050p4582.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.