Hi,
I have a 4-node Spark 1.3.1 cluster. All four nodes have 4 cores and 64 GB of RAM.
I have around 600,000+ JSON files on HDFS. Each file is small, around 1 KB in size.
The total data is around 16 GB, and the Hadoop block size is 256 MB.
My application reads these files with sc.textFile() (or sc.jsonFile() tri
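For illustration, a minimal Scala sketch of this read path (the directory path, partition count and table name are assumptions, not from the post):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("small-json-files"))
    val sqlContext = new SQLContext(sc)

    // Read the whole directory of small JSON files as text, then coalesce to
    // tame the huge number of tiny partitions produced by ~600,000 1 KB files.
    val lines = sc.textFile("hdfs:///data/json/").coalesce(64)
    val records = sqlContext.jsonRDD(lines)
    records.registerTempTable("records")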
Hello,
I want to use Spark SQL to aggregate some columns of the data.
For example, I have a large dataset with columns:
time, src, dst, val1, val2
I want to calculate sum(val1) and sum(val2) for all unique pairs of src and
dst.
I tried forming a SQL query:
SELECT a.time, a.src, a.dst, sum(
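A sketch of one way to phrase that aggregation, assuming a SQLContext named sqlContext and a registered table name "data" (both assumptions); since the goal is one row per unique (src, dst) pair, time is left out of the grouping:

    val sums = sqlContext.sql(
      "SELECT src, dst, SUM(val1) AS sum_val1, SUM(val2) AS sum_val2 " +
      "FROM data GROUP BY src, dst")
    sums.collect().foreach(println)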
Hi SM,
Apologies for the delayed response.
No, the issue is with Spark 1.2.0; there is a bug in that release.
Spark recently put out the 1.3.0 release, so it may have been fixed there.
I am not planning to test it soon; maybe after some time.
You can give it a try.
Regards,
Shailesh
ng "spark.shuffle.blockTransferService" to "nio".
Can anyone please let me know?
I don't want to open all ports on the network, so I am interested in the property
by which I can configure this new port.
Shailesh
Hello,
Recently I upgraded my setup from Spark 1.1 to Spark 1.2.
I have a 4-node Ubuntu Spark cluster.
With Spark 1.1, I used to write my Spark Scala program in Eclipse on my Windows
development host and submit the job to the Ubuntu cluster from Eclipse (the
Windows machine).
As on my network not all
0.1 is
> guaranteed to work, as should any other version from the past few years).
>
> On Tue, Jan 20, 2015 at 6:16 PM, Shailesh Birari
> wrote:
>
>> Hi Frank,
>>
>> It's a normal Eclipse project where I added the Scala and Spark libraries as
>> user libraries.
>
are using Maven (or what) to build, but if you can pull up
> your build's dependency tree, you will likely find com.google.guava being
> brought in by one of your dependencies.
>
> Regards,
>
> Frank Austin Nothaft
> fnoth...@berkeley.edu
> fnoth...@eecs.berkeley.edu
> 202-3
are mixing versions
> of Spark then, with some that still refer to unshaded Guava. Make sure
> you are not packaging Spark with your app and that you don't have
> other versions lying around.
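For what it's worth, a minimal sbt sketch of that advice (the thread mentions Maven as another possibility; the build tool and version numbers here are assumptions): mark Spark as "provided" so it is compiled against but not packaged into the application jar.

    // build.sbt -- version numbers are examples only
    scalaVersion := "2.10.4"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"  % "1.2.0" % "provided",
      "org.apache.spark" %% "spark-mllib" % "1.2.0" % "provided"
    )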
Hello,
I recently upgraded my setup from Spark 1.1 to Spark 1.2.
My existing applications are working fine on the Ubuntu cluster.
But when I try to execute a Spark MLlib application from Eclipse (the Windows
node), it gives a java.lang.NoClassDefFoundError:
com/google/common/base/Preconditions exception.
Not
Have you tried setting the host name/port to your Windows machine?
Also specify the following ports for Spark, and make sure the ports you specify
are not blocked (on the Windows machine); a sketch follows the list.
spark.fileserver.port
spark.broadcast.port
spark.replClassServer.port
spark.blockManager.port
spark.executor.
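For illustration, a sketch of how these could be set on the SparkConf; the host name and port numbers below are placeholders, not values from this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("remote-submit")
      .set("spark.driver.host", "windows-dev-host")   // the Windows machine running the driver
      .set("spark.driver.port", "51800")
      .set("spark.fileserver.port", "51801")
      .set("spark.broadcast.port", "51802")
      .set("spark.replClassServer.port", "51803")
      .set("spark.blockManager.port", "51804")
    val sc = new SparkContext(conf)

With fixed ports like these, only a known set of ports has to be opened between the Windows host and the cluster.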
Yes, I am using Spark 1.1.0 and have used rdd.registerTempTable().
I tried adding sqlContext.cacheTable(), but it took 59 seconds (more than
before).
I also tried changing the schema to use the Long data type in some fields, but
the conversion seems to take more time.
Is there any way to specify an index?
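For reference, a minimal sketch of the calls mentioned above (rdd and sqlContext are the ones from the post; the table name is made up):

    // Register the SchemaRDD as a temporary table, then cache it so that
    // queries after the first one read from the in-memory columnar cache.
    rdd.registerTempTable("records")
    sqlContext.cacheTable("records")
    sqlContext.sql("SELECT COUNT(*) FROM records").collect()  // first action materialises the cache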
Hello,
I have written a Spark SQL application which reads data from HDFS and queries it.
The data size is around 2 GB (30 million records). The schema and the query I am
running are as below.
The query takes around 5+ seconds to execute.
I tried adding
rdd.persist(StorageLevel.MEMORY_AND
Thanks. By setting the driver host to the Windows machine and specifying some
ports (driver, fileserver, broadcast, etc.) it worked perfectly. I need to specify
those ports, as not all ports are open on my machine.
For the driver host name, I was assuming Spark should get it, since in the case of
Linux we are not settin
Yes, this is doable.
I am submitting the Spark job using:

    // Constructor arguments: master URL, application name, Spark home on the
    // cluster, and the jar(s) to ship to the executors.
    JavaSparkContext spark = new JavaSparkContext(sparkMaster,
        "app name", System.getenv("SPARK_HOME"),
        new String[] {"application JAR"});

To run this, first you have to create the application jar, and in the above API
specify
Can anyone please help me here?
One more update:
Now I tried setting spark.driver.host to the Spark master node and
spark.driver.port to 51800 (an available open port), but it is failing with a bind
error. I was hoping that it would start the driver on the supplied host:port, and
since it is a Unix node there should not be any issue.
Can you
Hello,
I am able to submit a job to the Spark cluster from my Windows desktop, but the
executors are not able to run.
When I check the Spark UI (which is on Windows, as the driver is there) it shows
me JAVA_HOME, CLASS_PATH and other environment variables related to Windows.
I tried setting spark.executor.e
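The property name above is cut off, but one knob in this area is SparkConf.setExecutorEnv (the spark.executorEnv.* properties); a sketch with placeholder paths, not necessarily the setting tried in the post:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("remote-submit")
      // Explicitly set environment variables for the executors on the Linux
      // workers (paths here are examples only).
      .setExecutorEnv("JAVA_HOME", "/usr/lib/jvm/java-7-openjdk-amd64")
      .setExecutorEnv("SPARK_HOME", "/opt/spark")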
Thanks Sameer for the quick reply.
I will try to implement it.
Shailesh
Hello,
Spark 1.1.0, Hadoop 2.4.1
I have written a Spark Streaming application, and I am getting a
FileAlreadyExistsException from rdd.saveAsTextFile(outputFolderPath).
Here, briefly, is what I am trying to do.
My application creates a text file stream using the Java streaming context. The
input file is on
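Sameer's suggestion is not quoted in these snippets, but one common way to avoid FileAlreadyExistsException in a streaming job is to give every batch its own output path, for example via DStream.saveAsTextFiles, which suffixes the prefix with the batch time; a sketch with assumed paths and batch interval:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("text-stream")
    val ssc = new StreamingContext(conf, Seconds(30))

    // Watch an HDFS directory for newly arriving text files.
    val lines = ssc.textFileStream("hdfs:///input/")

    // Each batch is written under a distinct, time-suffixed directory
    // (prefix-<batch time in ms>.txt), so batches never collide.
    lines.saveAsTextFiles("hdfs:///output/batch", "txt")

    ssc.start()
    ssc.awaitTermination()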
Hi Xianguri,
After setting the SVD rank to a smaller value (200), it's working.
Thanks,
Shailesh
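For context, a rough sketch of the MLlib call with the smaller rank; the input matrix here is made-up random data, and an existing SparkContext sc is assumed:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    // A matrix of random doubles, standing in for the data from the thread.
    val rows = sc.parallelize(Seq.fill(10000)(Vectors.dense(Array.fill(1000)(math.random))))
    val mat = new RowMatrix(rows)

    // Asking for only the top 200 singular values keeps the factors small
    // enough to avoid the driver-side OutOfMemoryError.
    val svd = mat.computeSVD(200, computeU = true)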
Note: the data is random numbers (doubles).
Any suggestions/pointers will be highly appreciated.
Thanks,
Shailesh