My Kafka broker is running in a Docker container.
How do I read this Kafka data in my Spark Streaming app?
I also need to write data from Spark Streaming to a Cassandra database, which
is in a Docker container as well.
I appreciate any help.
Thanks.
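A minimal sketch of one way to wire this up, assuming the spark-streaming-kafka-0-10 and spark-cassandra-connector dependencies are on the classpath, and that the Kafka (9092) and Cassandra (9042) ports are published from their containers and reachable as localhost; the topic, keyspace, table and column names are placeholders:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import com.datastax.spark.connector.SomeColumns
import com.datastax.spark.connector.streaming._   // adds saveToCassandra on DStreams

object KafkaToCassandra {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kafka-to-cassandra")
      .set("spark.cassandra.connection.host", "localhost")   // host where the Cassandra container is reachable
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",               // host:port where the Kafka container is reachable
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "spark-streaming-demo",
      "auto.offset.reset"  -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("my_topic"), kafkaParams))

    // Extract (key, value) pairs from each Kafka record and append them to a
    // Cassandra table; the target schema here is purely illustrative.
    stream
      .map(record => (record.key, record.value))
      .saveToCassandra("my_keyspace", "my_table", SomeColumns("key", "value"))

    ssc.start()
    ssc.awaitTermination()
  }
}

If the containers are on a user-defined Docker network, the host names above would instead be the container or service names, and Kafka's advertised.listeners must point at an address the Spark driver and executors can actually resolve.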
Hi Ayan,
What I mean by incremental load from HBase: a weekly batch job takes rows
from the HBase table and dumps them out to Hive. The next time the job runs, it
should pick up only the newly arrived rows.
This is the same as using Sqoop for an incremental load from an RDBMS to Hive
with a command like the one below:
sqoop job --create myssb1 -
Hi all,
the following code will run with Spark 2.0.2 but not with Spark 2.1.0:
//
case class Data(id: Int, param: Map[String, InnerData])
case class InnerData(name: String, value: Int)
import spark.implicits._
val e = Data(1, Map("key" -> InnerData("name", 123)))
val data = Seq(e)
val d =
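For reference, a hedged reconstruction of the full snippet; the last assignment is an assumption about how the Dataset was being created, since the message is cut off at "val d =". It assumes an active SparkSession named spark (e.g. in the spark-shell):

// Nested case class inside a Map value; reported above to work on Spark 2.0.2
// but to fail on Spark 2.1.0 when an encoder for Data is derived.
case class InnerData(name: String, value: Int)
case class Data(id: Int, param: Map[String, InnerData])

import spark.implicits._

val e = Data(1, Map("key" -> InnerData("name", 123)))
val data = Seq(e)
val d = spark.createDataset(data)   // assumed completion of the truncated line
d.show()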
IMHO you should not "think" of HBase in RDBMS terms, but you can use
ColumnFilters to filter out new records.
On Fri, Jan 6, 2017 at 7:22 PM, Chetan Khatri wrote:
> Hi Ayan,
>
> What I mean by incremental load from HBase: a weekly batch job takes
> rows from the HBase table and dumps them out to Hive. The
Ayan, thanks.
Correct, I am not thinking in RDBMS terms; I am wearing NoSQL glasses!
On Fri, Jan 6, 2017 at 3:23 PM, ayan guha wrote:
> IMHO you should not "think" of HBase in RDBMS terms, but you can use
> ColumnFilters to filter out new records.
>
> On Fri, Jan 6, 2017 at 7:22 PM, Chetan Khatri > wr
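One way to make the "only new records" idea from this thread concrete is a time-range scan rather than a ColumnFilter: a minimal sketch, assuming hbase-site.xml is on the classpath and that HBase cell timestamps can serve as the marker for newly arrived rows. The table names, column family/qualifier and the watermark value are placeholders; in practice the watermark would be persisted somewhere between runs, and a column-value filter would be the alternative if a dedicated column marks new rows.

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hbase-incremental-to-hive")
  .enableHiveSupport()
  .getOrCreate()

// Placeholder watermark: the timestamp of the previous successful run.
val lastRunTs: Long = 1483660800000L

val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_hbase_table")
hbaseConf.set(TableInputFormat.SCAN_TIMERANGE_START, lastRunTs.toString)
hbaseConf.set(TableInputFormat.SCAN_TIMERANGE_END, Long.MaxValue.toString)

// Scan only the cells written since the last run.
val rdd = spark.sparkContext.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

// Project the needed cells into a DataFrame and append to the Hive table.
import spark.implicits._
val df = rdd.map { case (_, result) =>
  val rowKey = Bytes.toString(result.getRow)
  val value  = Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col")))
  (rowKey, value)
}.toDF("row_key", "col_value")

df.write.mode("append").saveAsTable("my_hive_table")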
My Spark 2.0 + Kafka 0.8 streaming job fails with a partition leader-set
error. When I check the Kafka topic, the partition is indeed in error,
with Leader = -1 and an empty ISR. I did a lot of googling, and everything
points to either restarting or deleting the topic. To do any of those
Kafka is designed to only allow reads from leaders. You need to fix
this at the Kafka level, not the Spark level.
On Fri, Jan 6, 2017 at 7:33 AM, Raghu Vadapalli wrote:
>
> My Spark 2.0 + Kafka 0.8 streaming job fails with a partition leader-set
> error. When I check the Kafka topic, the p
On 5 Jan 2017, at 20:07, Manohar Reddy <manohar.re...@happiestminds.com> wrote:
Hi Steve,
Thanks for the reply; below is the follow-up help I need from you.
Do you mean we can set up two native file systems on a single SparkContext, so
that based on the URL prefix (gs://bucket/path and dest s3a
On 5 Jan 2017, at 21:10, Ankur Srivastava <ankur.srivast...@gmail.com> wrote:
Yes, I did try it out, and it chooses the local file system because my checkpoint
location starts with s3n://
I am not sure how I can make it load the S3FileSystem.
set fs.default.name to s3n://whatever, or, in spar
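A minimal sketch of that suggestion, assuming Spark 2.x with the relevant S3 connector jars on the classpath and credentials available in environment variables; the bucket names are placeholders. Making S3 the default filesystem lets an s3n:// checkpoint path resolve against S3 rather than the local filesystem, and fully qualified URIs with other schemes (e.g. gs:// or s3a://) can still be used side by side as long as each connector is configured.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("s3-checkpoint-demo")
  // fs.defaultFS is the current name of the older fs.default.name key
  .config("spark.hadoop.fs.defaultFS", "s3n://my-bucket")
  .config("spark.hadoop.fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
  .config("spark.hadoop.fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))
  .getOrCreate()

// Checkpoints now land in S3 instead of the local filesystem.
spark.sparkContext.setCheckpointDir("s3n://my-bucket/checkpoints")

// Paths with an explicit scheme, e.g. gs://other-bucket/path or s3a://my-bucket/path,
// still go through their own connectors when those are configured.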
I have two separate but similar issues that I've narrowed down to a pretty good
level of detail. I'm using Spark 1.6.3, particularly Spark SQL.
I'm concerned with a single dataset for now, although the details apply to
other, larger datasets. I'll call it "table". It's around 160 M records,
ave