SparkContext not being created due to Logger initialization

2016-12-08 Thread Adnan Ahmed
304) Thanks Adnan Ahmed

Re: Storing Compressed data in HDFS into Spark

2015-10-22 Thread Adnan Haider
serialized data. Although Parquet is an option, I believe it will only make sense to use it when running Spark SQL. However, if I am using GraphX or MLlib, will it help? Thanks, Adnan Haider B.S. Candidate, Computer Science, Illinois Institute of Technology On Thu, Oct 22, 2015 at 7:15 AM, Igor Berman
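[For reference, a minimal sketch of both approaches mentioned in the thread; the paths, NameNode address, and SQLContext setup are illustrative assumptions, not from the original messages:]

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("CompressedInput"))

    // Gzip-compressed text files are decompressed transparently by the Hadoop input format.
    val lines = sc.textFile("hdfs://namenode:8020/data/logs/*.gz")

    // Parquet is read through Spark SQL, but the resulting DataFrame can be
    // converted back to an RDD for use with GraphX or MLlib.
    val sqlContext = new SQLContext(sc)
    val rows = sqlContext.read.parquet("hdfs://namenode:8020/data/table.parquet").rdd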

Re: SparkPi performance-3 cluster standalone mode

2014-04-24 Thread Adnan
Hi, I'm relatively new to Spark and have tried running the SparkPi example on a standalone 12-core, three-machine cluster. What I'm failing to understand is that running this example with a single slice gives better performance than using 12 slices. The same was the case when I was using parallelize
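[For context, the core of the SparkPi example looks roughly like the sketch below, assuming an existing SparkContext sc (e.g. in spark-shell); the sample and slice counts are illustrative. With such a tiny workload, the scheduling overhead of many slices can outweigh the gain from parallelism:]

    import scala.math.random

    val slices = 12                 // number of partitions ("slices")
    val n = 100000 * slices         // total random samples
    val count = sc.parallelize(1 until n, slices).map { _ =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)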

Re: how to set spark.executor.memory and heap size

2014-04-24 Thread Adnan Yaqoob
Sorry, wrong format: file:///home/wxhsdp/spark/example/standalone/README.md An extra / is needed at the start. On Thu, Apr 24, 2014 at 1:46 PM, Adnan Yaqoob wrote: > You need to use the proper URL format: > > file://home/wxhsdp/spark/example/standalone/README.md > > > On Thu, Ap
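[A minimal sketch of the corrected usage, assuming a spark-shell session where sc is already defined:]

    // "file://" needs a third slash because the local path itself starts with "/".
    val logFile = "file:///home/wxhsdp/spark/example/standalone/README.md"
    val logData = sc.textFile(logFile).cache()
    println(logData.count())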

Re: how to set spark.executor.memory and heap size

2014-04-24 Thread Adnan Yaqoob
You need to use the proper URL format: file://home/wxhsdp/spark/example/standalone/README.md On Thu, Apr 24, 2014 at 1:29 PM, wxhsdp wrote: > i think maybe it's a problem with reading the local file > > val logFile = "/home/wxhsdp/spark/example/standalone/README.md" > val logData = sc.textFile(logFile).c

Re: Access Last Element of RDD

2014-04-23 Thread Adnan Yaqoob
This function returns a Scala List; you can use the List's last method to get the last element. For example: RDD.take(RDD.count()).last On Thu, Apr 24, 2014 at 10:28 AM, Sai Prasanna wrote: > Adnan, but RDD.take(RDD.count()) returns all the elements of the RDD. > > I want only
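[A sketch of the approach discussed here; note that take expects an Int while count() returns a Long, so a toInt is needed, and this pulls the entire RDD back to the driver:]

    val rdd = sc.parallelize(Seq(1, 2, 3, 4, 5))
    // Collects everything to the driver, then keeps the last element.
    val last = rdd.take(rdd.count().toInt).last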

Re: Access Last Element of RDD

2014-04-23 Thread Adnan Yaqoob
You can use the following code: RDD.take(RDD.count()) On Thu, Apr 24, 2014 at 9:51 AM, Sai Prasanna wrote: > Hi All, some help! > RDD.first or RDD.take(1) gives the first item; is there a straightforward > way to access the last element in a similar way? > > I couldn't find a tail/last method for

Re: how to set spark.executor.memory and heap size

2014-04-23 Thread Adnan Yaqoob
When I was testing Spark, I faced this issue. It is not related to a memory shortage; it is because your configuration is not correct. Try passing your current jar to the SparkContext with SparkConf's setJars function and try again. On Thu, Apr 24, 2014 at 8:38 AM, wxhsdp wrote: > by t
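[A minimal sketch of passing the application jar through SparkConf.setJars; the app name, master URL, and jar path are placeholders:]

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MyApp")
      .setMaster("spark://master:7077")                      // placeholder master URL
      .setJars(Seq("target/scala-2.10/myapp_2.10-1.0.jar"))  // jar containing your job classes
    val sc = new SparkContext(conf)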

Re: Executing spark jobs with predefined Hadoop user

2014-04-10 Thread Adnan
Then the problem is not on the Spark side. You have three options; choose any one of them: 1. Change permissions on the /tmp/Iris folder from a shell on the NameNode with the "hdfs dfs -chmod" command. 2. Run your Hadoop service as the hdfs user. 3. Disable dfs.permissions in conf/hdfs-site.xml. Regards, Adn

Re: Executing spark jobs with predefined Hadoop user

2014-04-10 Thread Adnan
You need to use a proper HDFS URI with saveAsTextFile. For example: rdd.saveAsTextFile("hdfs://NameNode:Port/tmp/Iris/output.tmp") Regards, Adnan Asaf Lahav wrote > Hi, > > We are using Spark with data files on HDFS. The files are stored as files > for predefined Hadoop u
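[A sketch of the suggested usage; the NameNode host, port, and paths are placeholders:]

    // Read from and write back to HDFS using fully qualified URIs.
    val rdd = sc.textFile("hdfs://namenode:8020/tmp/Iris/input")
    rdd.saveAsTextFile("hdfs://namenode:8020/tmp/Iris/output.tmp")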

How to execute a function from class in distributed jar on each worker node?

2014-04-08 Thread Adnan
Spark API function to do it. Can somebody help me with it or point me in the right direction? Regards, Adnan -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-execute-a-function-from-class-in-distributed-jar-on-each-worker-node-tp3870.html Sent fro
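[The answer is not included in this preview, but one common pattern, sketched under the assumption that the class ships inside the application jar, is to run the function once per partition on the executors:]

    object MyWorkerTask {    // hypothetical stand-in for a class from the distributed jar
      def initialize(): Unit =
        println("initialized on " + java.net.InetAddress.getLocalHost.getHostName)
    }

    // Run the function once per partition; with one partition per available core this
    // reaches executors on every worker, though Spark does not strictly guarantee
    // exactly one invocation per node.
    sc.parallelize(0 until sc.defaultParallelism, sc.defaultParallelism)
      .foreachPartition { _ => MyWorkerTask.initialize() }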