Hi Cody,
It worked after moving the parameter to sparkConf; I don't see that error anymore.
But now I'm seeing the count for each RDD return 0, even though there are
records in the topic I'm reading.
Do you see anything wrong with how I'm creating the Direct Stream?
Thanks
Jagadish
On Wed, Nov 15, 2017 at
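For reference, a minimal sketch of how a direct stream is typically created with spark-streaming-kafka-0-10; the broker address, group id, topic name and batch interval below are placeholders rather than values from the original post:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val conf = new SparkConf()
  .setAppName("kafka-direct-example")
  // Spark-side setting, set on the SparkConf rather than in kafkaParams.
  .set("spark.streaming.kafka.consumer.poll.ms", "10000")
val ssc = new StreamingContext(conf, Seconds(30))

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",
  // With the default "latest", a new consumer group sees no existing records,
  // so per-batch counts stay at 0 until fresh data arrives.
  "auto.offset.reset" -> "earliest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Seq("example-topic"), kafkaParams))

stream.foreachRDD(rdd => println(s"records in batch: ${rdd.count()}"))

ssc.start()
ssc.awaitTermination()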
Hello,
I'm wondering if it's possible to get access to the detailed job/stage/task
level metrics via the metrics system (JMX, Graphite, &c). I've enabled the
wildcard sink and I do not see them. It seems these values are only
available over http/json and to SparkListener instances; is this the case?
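For what it's worth, a minimal sketch of the SparkListener route mentioned above: collect task-level metrics as tasks finish and forward them to whatever sink you like. The listener API below is the public org.apache.spark.scheduler one; where the values end up (here just println) is a placeholder.

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

class TaskMetricsListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val m = taskEnd.taskMetrics
    if (m != null) {
      // Forward these to Graphite/JMX yourself instead of printing them.
      println(s"stage=${taskEnd.stageId} task=${taskEnd.taskInfo.taskId} " +
        s"runTimeMs=${m.executorRunTime} shuffleReadBytes=${m.shuffleReadMetrics.totalBytesRead}")
    }
  }
}

// Register on an existing SparkContext:
//   sc.addSparkListener(new TaskMetricsListener)
// or via configuration:
//   spark.extraListeners=com.example.TaskMetricsListener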
Hey,
I am currently using Spark 2.2.0 for Hadoop 2.7.x in a standalone
cluster for testing. I want to access some files and share them on the
nodes of the cluster using addFiles. As local directories are not
supported for this, I want to use S3 to do the job.
In contrast to nearly everythi
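As a rough sketch of that approach, assuming the s3a connector (hadoop-aws) is on the classpath and credentials are configured; the bucket and file names are placeholders:

import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

val sc = new SparkContext(new SparkConf().setAppName("addFile-s3-example"))

// Distribute the S3 object to every node in the standalone cluster.
sc.addFile("s3a://my-bucket/config/lookup.txt")

// On the executors, resolve the local copy by file name.
val localPath = SparkFiles.get("lookup.txt")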
Hi,
When I try to read Parquet data that was generated by Spark in Cascading,
it throws the following error:
Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read
value at 0 in block -1 in file ""
at
org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(Intern
spark.streaming.kafka.consumer.poll.ms is a spark configuration, not
a kafka parameter.
see http://spark.apache.org/docs/latest/configuration.html
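For example, a short sketch of where each kind of setting goes (values here are illustrative):

import org.apache.spark.SparkConf

// Spark configuration: set on the SparkConf or via spark-submit --conf.
val conf = new SparkConf()
  .setAppName("streaming-app")
  .set("spark.streaming.kafka.consumer.poll.ms", "10000")

// Kafka consumer properties, by contrast, go into the kafkaParams map passed to
// ConsumerStrategies.Subscribe, e.g. "bootstrap.servers" and "group.id".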
On Tue, Nov 14, 2017 at 8:56 PM, jkagitala wrote:
> Hi,
>
> I'm trying to add spark-streaming to our kafka topic. But, I keep getting
> this error
>
Thanks Steve and Vadim for the feedback.
@Steve, are you suggesting creating a custom receiver and somehow piping it
through Spark Streaming/Spark SQL? Or are you suggesting creating smaller
datasets from the stream and using my original code to process smaller
datasets? It'd be very helpful for a
Hi,
I am new to Spark Streaming. I have developed one Spark
Streaming job which runs every 30 minutes with a checkpointing directory.
I have to implement a minor change; shall I kill the Spark Streaming job once
the batch is completed, using the yarn application -kill command, and update the
j
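For context, a minimal sketch of how checkpoint-based recovery is usually wired up; the checkpoint path and batch interval are placeholders. Note that a checkpoint generally cannot be reused after the application jar changes, since it contains serialized application code.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}

val checkpointDir = "hdfs:///user/example/checkpoints/my-streaming-app"

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("my-streaming-app")
  val ssc = new StreamingContext(conf, Minutes(30))
  ssc.checkpoint(checkpointDir)
  // ... define the DStream graph here ...
  ssc
}

// Recover from the checkpoint if it exists, otherwise build a fresh context.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()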
There's a lot of off-heap memory involved in decompressing Snappy and
compressing ZLib.
Since you're running with `local[*]`, you process multiple tasks
simultaneously, so they all might consume memory.
I don't think that increasing the heap will help, since it looks like you're
hitting system memory l
On 14 Nov 2017, at 15:32, Alec Swan <alecs...@gmail.com> wrote:
But I wonder if there is a way to stream/batch the content of the JSON file in
order to convert it to ORC piecemeal and avoid reading the whole JSON file into
memory in the first place?
That is what you'll need to do; you'd
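As a rough sketch of one piecemeal approach, assuming line-delimited JSON and an explicitly supplied schema so Spark can split the input instead of reading it whole; the paths and schema are placeholders:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("json-to-orc").getOrCreate()

val schema = new StructType()
  .add("id", LongType)
  .add("payload", StringType)

spark.read
  .schema(schema)                       // skip the full schema-inference pass
  .json("/data/input/big-file.json")
  .write
  .orc("/data/output/orc")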
Greetings,
I am running a unit test designed to stream a folder into which I am manually
copying CSV files. The files do not always get picked up; they only get
detected when the job starts with the files already in the folder.
I even tried using the fileNameOnly option newly included in 2.2.0. Have
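For reference, a minimal sketch of a file-source stream over a folder of CSVs, including the fileNameOnly option mentioned above (available since 2.2.0); the path, schema, and console sink are placeholders:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("csv-folder-stream").getOrCreate()

val schema = new StructType()
  .add("id", IntegerType)
  .add("value", StringType)

val df = spark.readStream
  .schema(schema)                    // file streams require an explicit schema
  .option("header", "true")
  .option("fileNameOnly", "true")    // track new files by name only, not full path
  .csv("/data/incoming")

val query = df.writeStream
  .format("console")
  .outputMode("append")
  .start()

query.awaitTermination()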