Figured it out. All I was doing wrong was testing it on a pseudo-node VM
with 1 core, so the tasks were left waiting for CPU.
In the production cluster this works just fine.
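For anyone hitting the same thing, a minimal sketch of a local-mode setup that
leaves enough cores free (the app name, batch interval and local[*] are
illustrative, not taken from the original job):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Give local mode more than one core; with local[1] the batch tasks can
// sit in the scheduler queue waiting for a free slot and the batch never finishes.
val sparkConf = new SparkConf()
  .setAppName("KafkaJsonTest")   // illustrative name
  .setMaster("local[*]")         // or local[2]; avoid plain local / local[1]
val ssc = new StreamingContext(sparkConf, Seconds(10))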
On Thu, Aug 11, 2016 at 12:45 AM, Diwakar Dhanuskodi <
diwakar.dhanusk...@gmail.com> wrote:
> Checked executor logs and UI. There
Checked executor logs and UI. There is no error message or anything like
that. When there is any action, it just waits.
There is data in the partitions. I could use simple-consumer-shell and print
all the data in the console. Am I doing anything wrong in foreachRDD?
This just works fine with a single partition.
zookeeper.connect is irrelevant.
Did you look at your executor logs?
Did you look at the UI for the (probably failed) stages?
Are you actually producing data into all of the kafka partitions?
If you use kafka-simple-consumer-shell.sh to read that partition, do
you get any data?
On Wed, Aug 10, 20
Hi Cody,
Just added zookeeper.connect to kafkaParams. It couldn't come out of the batch
window, and other batches were queued. I could see foreach(println) of the
DataFrame printing one partition's data but not the other's.
Couldn't see any errors in the log.
val brokers = "localhost:9092,localhost:9093"
val
Those logs you're posting are from right after your failure; they don't
include what actually went wrong when attempting to read the JSON. Look at your
logs more carefully.
On Aug 10, 2016 2:07 AM, "Diwakar Dhanuskodi"
wrote:
> Hi Siva,
>
> With the below code, it gets stuck at
> * sqlContext.read.json(
I am testing with one partition now. I am using Kafka 0.9 and Spark 1.6.1
(Scala 2.11). Just start with one topic first and then add more. I am not
partitioning the topic.
HTH,
Regards,
Sivakumaran
> On 10-Aug-2016, at 5:56 AM, Diwakar Dhanuskodi
> wrote:
>
> Hi Siva,
>
> Does the topic have
Hi Siva,
With the below code, it gets stuck at
*sqlContext.read.json(rdd.map(_._2)).toDF()*
There are two partitions in the topic.
I am running Spark 1.6.2.
val topics = "topic.name"
val brokers = "localhost:9092"
val topicsSet = topics.split(",").toSet
val sparkConf = new SparkConf().setAppName("Kafka
Hi Siva,
Does the topic have partitions? Which version of Spark are you using?
On Wed, Aug 10, 2016 at 2:38 AM, Sivakumaran S wrote:
> Hi,
>
> Here is a working example I did.
>
> HTH
>
> Regards,
>
> Sivakumaran S
>
> val topics = "test"
> val brokers = "localhost:9092"
> val topicsSet = topics.split(",").toSet
It stops working at sqlContext.read.json(rdd.map(_._2)). Topics without
partitions work fine. Do I need to set any other configs?
val kafkaParams =
Map[String,String]("bootstrap.servers"->"localhost:9093,localhost:9092",
"group.id" -> "xyz", "auto.offset.reset"->"smallest")
Spark version is 1
No, you don't need a conditional. read.json on an empty rdd will
return an empty dataframe. Foreach on an empty dataframe or an empty
rdd won't do anything (a task will still get run, but it won't do
anything).
Leave the conditional out. Add one thing at a time to the working
rdd.foreach example.
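For concreteness, a minimal sketch of that loop without the emptiness check
(assuming Spark 1.6's SQLContext and the kafkaStream direct stream from the
earlier snippet; df.show() is just a stand-in for the real per-batch work):

import org.apache.spark.sql.SQLContext

kafkaStream.foreachRDD { rdd =>
  // _._2 is the message body; read.json on an empty RDD just yields an
  // empty DataFrame, so no isEmpty guard is needed.
  val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
  val df = sqlContext.read.json(rdd.map(_._2))
  df.show()
}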
Hi Cody,
Without the conditional it runs fine. But any processing inside the
conditional gets stuck waiting (or something).
I am facing this issue with partitioned topics. I would need the conditional
to skip processing when the batch is empty.
kafkaStream.foreachRDD(
rdd => {
val dataFrame = sqlContex
Hi,
Here is a working example I did.
HTH
Regards,
Sivakumaran S
val topics = "test"
val brokers = "localhost:9092"
val topicsSet = topics.split(",").toSet
val sparkConf = new SparkConf().setAppName("KafkaWeatherCalc").setMaster("local")
//spark://localhost:7077
val sc = new SparkContext(sparkConf)
Take out the conditional and the sqlContext and just do
rdd => {
rdd.foreach(println)
as a baseline to see if you're reading the data you expect.
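Spelled out, that baseline is roughly (a sketch; the stream variable name
kafkaStream is assumed):

// Bare-bones sanity check: print every (key, message) pair per batch,
// with no SQLContext and no emptiness conditional in the way.
kafkaStream.foreachRDD { rdd =>
  rdd.foreach(println)
}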
On Tue, Aug 9, 2016 at 3:47 PM, Diwakar Dhanuskodi
wrote:
> Hi,
>
> I am reading JSON messages from Kafka. The topic has 2 partitions. When
> runnin