Spark streaming multiple kafka topic doesn't work at-least-once

2017-01-23 Thread hakanilter
Hi everyone, I have a Spark (1.6.0-cdh5.7.1) streaming job which receives data from multiple Kafka topics. After starting the job, everything works fine at first (around 700 req/sec), but after a while (a couple of days or a week) it starts processing only part of the data (around 350 req/sec). When I

Re: Problem with loading files: Loss was due to java.io.EOFException java.io.EOFException

2014-05-21 Thread hakanilter
The problem was solved after adding the hadoop-core dependency. But I think there is a misunderstanding about local files. I found this one: "Note that if you've connected to a Spark master, it's possible that it will attempt to load the file on one of the different machines in the cluster, so make sure

Problem with loading files: Loss was due to java.io.EOFException java.io.EOFException

2014-05-20 Thread hakanilter
Hi everyone, I'm having problems loading files. With either Java code or spark-shell, I get the same errors when I try to load a text file. I added hadoop-client and hadoop-common 2.0.0-cdh4.6.0 as dependencies, and maven-shade-plugin is configured. I have CDH 4.6.0, spark-0.9.1-bin-cdh4 and JD