Currently I am using Spark 0.9 on my data, and I wrote code in Java for
Spark SQL. Now I want to use Spark 1.4, so how do I do that and what changes do
I have to make for the tables? I have a .sql file, a pom file, and a .py file. I am
using S3 for storage.
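A minimal sketch of the Spark 1.4 entry points (shown in Scala; the equivalent Java
API exists), assuming a hypothetical JSON dataset on S3 and a hypothetical table
name. In 1.4 the old SchemaRDD-based API is replaced by SQLContext and DataFrame:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext

  object MigrationSketch {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("migration-sketch"))
      val sqlContext = new SQLContext(sc)

      // The DataFrame reader/writer API replaces manual RDD-to-table plumbing.
      val df = sqlContext.read.json("s3n://my-bucket/events")  // hypothetical S3 path
      df.registerTempTable("events")                           // hypothetical table name

      val result = sqlContext.sql("SELECT COUNT(*) FROM events")
      result.show()
    }
  }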
Hi Akhil,
Think of the scenario as running a piece of code in normal Java with
multiple threads. Let's say there are 4 threads spawned by a Java process to
handle reading from a database, some processing, and storing back to the database. In
this process, while a thread is performing a database I/O, the CPU co
Last time I checked, Camus doesn't support storing data as parquet, which
is a deal breaker for me. Otherwise it works well for my Kafka topics with
low data volume.
I am currently using Spark Streaming to ingest data, generate semi-real-time
stats and publish them to a dashboard, and dump the full dataset
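A minimal sketch of dumping each streaming batch as Parquet from Spark Streaming,
assuming Spark 1.4+ (DataFrame reader/writer API); the socket source and S3 output
path are hypothetical placeholders:

  import org.apache.spark.sql.SQLContext
  import org.apache.spark.streaming.StreamingContext

  // Dump each non-empty batch as a Parquet directory keyed by batch time.
  def dumpAsParquet(ssc: StreamingContext): Unit = {
    val sqlContext = new SQLContext(ssc.sparkContext)
    val lines = ssc.socketTextStream("localhost", 9999)  // hypothetical source of JSON strings
    lines.foreachRDD { (rdd, time) =>
      if (!rdd.isEmpty()) {
        sqlContext.read.json(rdd)                         // infer schema from the JSON strings
          .write.parquet(s"s3n://my-bucket/dumps/batch-${time.milliseconds}")  // hypothetical path
      }
    }
  }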
Reposting my question from SO:
http://stackoverflow.com/questions/32161865/elasticsearch-analyze-not-compatible-with-spark-in-python
I'm using the elasticsearch-py client within PySpark using Python 3 and I'm
running into a problem using the analyze() function with ES in conjunction
with an RDD. I
Hi All,
We have a Spark standalone cluster running 1.4.1 and we are setting
spark.io.compression.codec to lzf.
I have a long-running interactive application which behaves normally,
but after a few days I get the following exception in multiple jobs. Any
ideas on what could be causing this?
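For reference, a sketch of setting the codec programmatically (the same key can also
go in spark-defaults.conf); in 1.4.x the supported values include lzf, lz4 and snappy:

  import org.apache.spark.{SparkConf, SparkContext}

  // Hypothetical app name; only the codec setting matters here.
  val conf = new SparkConf()
    .setAppName("lzf-codec-example")
    .set("spark.io.compression.codec", "lzf")
  val sc = new SparkContext(conf)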
https://www.youtube.com/watch?v=umDr0mPuyQc
On Sat, Aug 22, 2015 at 8:01 AM, Ted Yu wrote:
> See http://spark.apache.org/community.html
>
> Cheers
>
> On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes <
> li...@hermes-it-consulting.de> wrote:
>
>> subscribe
>>
To be perfectly clear, the direct Kafka stream will also recover from any
failures, because it does the simplest thing possible: fail the task and
let Spark retry it.
If you're consistently having socket-closed problems on one task after
another, there's probably something else going on in your e
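A small sketch of the retry knob involved: spark.task.maxFailures (default 4)
controls how many times a failed task is re-attempted before the job is aborted.
The value 8 below is only illustrative:

  import org.apache.spark.SparkConf

  // Give transient Kafka socket errors more chances to succeed on retry.
  val conf = new SparkConf()
    .setAppName("kafka-direct-retries")   // hypothetical app name
    .set("spark.task.maxFailures", "8")   // illustrative value, not a recommendation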
I think you can also give this consumer a try in your environment:
http://spark-packages.org/package/dibbhatt/kafka-spark-consumer
This has been running fine for topics with a large number of
Kafka partitions (> 200) like yours without any issue; no issue with
connections as this consumer re-use
When trying the consumer without external connections, or with a low
number of external connections, it works fine,
so my doubt is how the socket got closed:
15/08/21 08:54:54 ERROR executor.Executor: Exception in task 262.0 in
stage 130.0 (TID 16332)
java.io.EOFException: Received -1 when reading from
Can you try some other consumer and see if the issue still exists?
On Aug 22, 2015 12:47 AM, "Shushant Arora"
wrote:
> Exception comes when client has so many connections to some another
> external server also.
> So I think Exception is coming because of client side issue only- server
> side ther
Thanks Akhil. Does this mean that the executor running in the VM can spawn
two concurrent jobs on the same core? If this is the case, this is what we
are looking for. Also, which version of Spark is this flag in?
Thanks,
Sateesh
On Sat, Aug 22, 2015 at 1:44 AM, Akhil Das
wrote:
> You can look a
Interesting. TD, can you please throw some light on why this is and point
to the relevant code in the Spark repo? It will help in a better understanding
of the things that can affect a long-running streaming job.
On Aug 21, 2015 1:44 PM, "Tathagata Das" wrote:
> Could you periodically (say every 10 mins
In Spark 1.4, there was considerable refactoring around the interaction with
Hive, such as SPARK-7491.
It would not be straightforward to port ORC support to 1.3.
FYI
On Fri, Aug 21, 2015 at 10:21 PM, dong.yajun wrote:
> hi Ted,
>
> thanks for your reply, are there any other way to do this with sp
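For reference, a minimal sketch of ORC I/O as it looks in Spark 1.4, where ORC goes
through the DataFrame reader/writer and requires HiveContext (the support lives in
the spark-hive module); the paths are hypothetical:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  val sc = new SparkContext(new SparkConf().setAppName("orc-sketch"))
  val hiveContext = new HiveContext(sc)

  // Read and write ORC via the generic format("orc") reader/writer.
  val df = hiveContext.read.format("orc").load("s3n://my-bucket/input-orc")
  df.write.format("orc").save("s3n://my-bucket/output-orc")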
See http://spark.apache.org/community.html
Cheers
On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes
wrote:
> subscribe
>
subscribe
When trying the consumer without external connections, or with a low number of
external connections, it works fine,
so my doubt is how the socket got closed:
java.io.EOFException: Received -1 when reading from channel, socket
has likely been closed.
On Sat, Aug 22, 2015 at 7:24 PM, Akhil Das
wrote:
Hmm, for a single-core VM you will have to run it in local mode (specifying
master=local[4]). The flag is available in all versions of Spark, I
guess.
On Aug 22, 2015 5:04 AM, "Sateesh Kavuri" wrote:
> Thanks Akhil. Does this mean that the executor running in the VM can spawn
> two concurrent jo
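A minimal sketch of what that looks like when building the context: local[4] runs
the driver and executor in one JVM with 4 worker threads, so up to 4 tasks can run
concurrently even on a single-core VM (the threads share that core):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("local-concurrency-example")  // hypothetical app name
    .setMaster("local[4]")                    // 4 task threads in local mode
  val sc = new SparkContext(conf)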
Hi Rishitesh,
We are not using any RDDs to parallelize the processing, and the whole
algorithm runs on a single core (and in a single thread). The parallelism
is done at the user level.
The disk read can be started as a separate I/O, but then the executor will not be
able to take up more jobs, since th
1. How do I work with partitions in Spark Streaming from Kafka?
2. How do I create partitions in Spark Streaming from Kafka?
When I send messages from a Kafka topic having three partitions,
Spark will listen to the messages when I say KafkaUtils.createStream or
createDirectStream with local[4].
Now I want
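A rough sketch of the direct approach (Spark 1.3+), where each Kafka partition of
the topic becomes one RDD partition in every batch, so a topic with three partitions
yields three partitions per batch; the broker address and topic name are hypothetical:

  import kafka.serializer.StringDecoder
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  val conf = new SparkConf().setMaster("local[4]").setAppName("kafka-direct-sketch")
  val ssc = new StreamingContext(conf, Seconds(10))

  val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")  // hypothetical broker
  val topics = Set("my-topic")                                       // hypothetical topic

  // One RDD partition per Kafka partition in each batch.
  val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc, kafkaParams, topics)

  stream.foreachRDD { rdd =>
    println(s"partitions in this batch: ${rdd.partitions.length}")
  }

  ssc.start()
  ssc.awaitTermination()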
The exception comes when the client also has many connections to some other
external server.
So I think the exception is coming from a client-side issue only; on the
server side there is no issue.
I want to understand whether the executor (simple consumer) is not making a new
connection to the Kafka broker at the start of each tas
Hi All,
Currently using DSE 4.7 and Spark version 1.2.2.
Regards,
Satish
On Fri, Aug 21, 2015 at 7:30 PM, java8964 wrote:
> What version of Spark you are using, or comes with DSE 4.7?
>
> We just cannot reproduce it in Spark.
>
> yzhang@localhost>$ more test.spark
> val pairs = sc.makeRDD(Seq((0