Hi! Can you split the init code from the current command? I think that is the
main problem in your code.
On 16 Sep 2016 8:26 PM, "Benjamin Kim"
wrote:
> Has anyone using Spark 1.6.2 encountered very slow responses from pulling
> data from PostgreSQL using JDBC? I can get to the table and see the
> gitbooks.io/mastering-apache-spark/content/yarn/)
> that Spark on YARN increases data locality because YARN tries to place
> tasks next to HDFS blocks.
>
> Can anyone verify/support one side or the other?
>
> Thank you,
> Jestin
>
> On Mon, Aug 1, 2016 at 1:15 AM, Nikolay Zhebet wrote:
You can check whether the master port is open with "nmap 127.0.0.1 | grep 7077".
Try the Windows equivalents of these commands and check whether the Spark
master is reachable from your running environment.
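If it helps, here is a minimal sketch (plain Scala, no Spark dependency; the
host and port are assumed to be the standalone defaults) that probes whether
anything is listening on the master port:

import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object PortProbe {
  def main(args: Array[String]): Unit = {
    // 127.0.0.1:7077 is the assumed default standalone-master address
    val reachable = Try {
      val s = new Socket()
      s.connect(new InetSocketAddress("127.0.0.1", 7077), 2000) // 2-second timeout
      s.close()
    }.isSuccess
    println(s"Spark master port 7077 reachable: $reachable")
  }
}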
2016-08-01 14:35 GMT+03:00 ayan guha :
> No, I confirmed the master is running via the Spark UI at localhost:8080
> On 1 Aug 2016 18:22, "Nikolay Zhebet" wrote:
> I could have them all in parallel
> in one app / jar run.
>
> Thanks,
>
> On Mon, Aug 1, 2016 at 1:08 PM, Nikolay Zhebet wrote:
>
>> Hi, if you want to read several Kafka topics in a Spark Streaming job, you can
>> set the topic names separated by commas, and after that you can read all
>> messages from all topics in one stream.
I think you haven't started the Spark master yet, or maybe 7077 is not the
default port for your Spark master.
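As a hedged illustration (the URL below assumes the default standalone setup;
it must match what the master UI at localhost:8080 reports), an application
would point at the standalone master like this:

import org.apache.spark.{SparkConf, SparkContext}

// spark://127.0.0.1:7077 is the assumed default standalone master URL
val conf = new SparkConf()
  .setAppName("standalone-check")
  .setMaster("spark://127.0.0.1:7077")
val sc = new SparkContext(conf)
println(s"Connected, default parallelism = ${sc.defaultParallelism}")
sc.stop()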
2016-08-01 4:24 GMT+03:00 ayan guha :
> Hi
>
> I just downloaded Spark 2.0 on my Windows 7 machine to check it out. However,
> I am not able to set up a standalone cluster:
>
>
> Step 1: master set up (Successful)
Hi.
Maybe "data locality" is the issue here.
If you use groupBy and joins, then most likely you will see a lot of network
operations (shuffles). These can be very slow. You can try to prepare and
transform your data in a way that minimizes shipping intermediate
data between worker nodes (see the sketch below).
Try googling it.
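As a small sketch of that idea (the sample data is made up), reduceByKey
combines values on each partition before the shuffle, while groupByKey ships
every value across the network first:

import org.apache.spark.{SparkConf, SparkContext}

object ShuffleDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("shuffle-demo").setMaster("local[*]"))
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4)))

    // groupByKey moves every value across the network before aggregating
    val viaGroup = pairs.groupByKey().mapValues(_.sum)

    // reduceByKey combines values locally first, so far less intermediate
    // data crosses the network between worker nodes
    val viaReduce = pairs.reduceByKey(_ + _)

    viaReduce.collect().foreach(println)
    sc.stop()
  }
}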
Hi, if you want to read several Kafka topics in a Spark Streaming job, you can
set the topic names separated by commas, and after that you can read all
messages from all topics in one stream:
val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
// ssc and kafkaParams are assumed to be defined earlier in the job
val lines = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topicMap, StorageLevel.MEMORY_AND_DISK_SER_2).map(_._2)
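For context, a self-contained version of that receiver-based (Kafka 0.8)
consumer might look like this; the topic names, ZooKeeper address, and group
id are placeholders, not values from this thread:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object MultiTopicStream {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("multi-topic"), Seconds(10))

    // ZooKeeper address and group id are placeholders
    val kafkaParams = Map(
      "zookeeper.connect" -> "localhost:2181",
      "group.id" -> "demo-group")

    // comma-separated topic names, one receiver thread per topic
    val topicMap = "orders,clicks,logs".split(",").map((_, 1)).toMap

    val lines = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicMap, StorageLevel.MEMORY_AND_DISK_SER_2).map(_._2)

    lines.print()
    ssc.start()
    ssc.awaitTermination()
  }
}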
You should specify the classpath for your JDBC connection.
For example, if you want to connect to Impala, you can try this snippet:
import java.util.Properties
import org.apache.spark._
import org.apache.spark.sql.SQLContext
import java.sql.Connection
import java.sql.DriverManager
// Load the Cloudera Impala JDBC driver; com.cloudera.impala.jdbc41.Driver is
// the class name shipped with the Cloudera JDBC 4.1 driver
Class.forName("com.cloudera.impala.jdbc41.Driver")
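Building on that, a sketch of reading a table through the DataFrame JDBC
source; the host, port, and table name are placeholders, and the driver jar
is assumed to be on the classpath (e.g. passed with --jars):

val sc = new SparkContext(new SparkConf().setAppName("impala-jdbc"))
val sqlContext = new SQLContext(sc)

val props = new Properties()
props.setProperty("driver", "com.cloudera.impala.jdbc41.Driver")

// impala-host and my_table are placeholders; 21050 is the usual Impala JDBC port
val df = sqlContext.read.jdbc("jdbc:impala://impala-host:21050", "my_table", props)
df.show()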