Hi,
I have some CSV files in HDFS with headers like col1, col2, col3. I want to
add a column named id, so a record would be
How can I do this using Spark SQL? Can id be auto increment?
Thanks,
Xiaohe
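One common approach (a sketch, assuming a Spark version where `monotonically_increasing_id` exists in `org.apache.spark.sql.functions`, and assuming the CSV is already loaded into a DataFrame named `df` — the names here are hypothetical):

```scala
import org.apache.spark.sql.functions.monotonically_increasing_id

// Adds a unique (but NOT consecutive) 64-bit id to each row.
val withId = df.withColumn("id", monotonically_increasing_id())

// For a strictly sequential 0, 1, 2, ... id, zipWithIndex on the
// underlying RDD works, at the cost of an extra pass over the data:
val sequential = df.rdd.zipWithIndex.map { case (row, idx) => (idx, row) }
```

Note that `monotonically_increasing_id` only guarantees uniqueness and monotonic growth per partition, so it is not a true auto-increment in the database sense.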
Hi,
I have a Hadoop 2.4 cluster running on some remote VMs. Can I start spark-shell
or spark-submit from my laptop? For example:
bin/spark-shell --master yarn-client
If this is possible, how can I do it?
I have copied the same Hadoop distribution to my laptop (but I don't run Hadoop
on my laptop), and I have also set:
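The usual way to make this work (an editorial sketch, not the poster's actual settings; the paths are hypothetical) is to point Spark at the cluster's client configuration files copied from the remote Hadoop installation:

```shell
# Tell Spark where the copied YARN/HDFS client configs live
export HADOOP_CONF_DIR=$HOME/hadoop-2.4/etc/hadoop
export YARN_CONF_DIR=$HADOOP_CONF_DIR

# Then launch the shell against the remote cluster
bin/spark-shell --master yarn-client
```

No Hadoop daemons need to run on the laptop; Spark only reads the configs to find the remote ResourceManager and NameNode.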
is provided, you need to change
> it to compile to run SparkPi in IntelliJ. As I remember, you also need to
> change the guava and jetty related libraries to compile scope too.
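In a Maven project, the scope change the reply describes would look roughly like this (a sketch; the artifact and version are illustrative):

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.3.1</version>
  <!-- The examples pom marks this "provided"; switching to "compile"
       puts Spark (and its transitive deps) on IntelliJ's run classpath -->
  <scope>compile</scope>
</dependency>
```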
>
> On Mon, Aug 17, 2015 at 2:14 AM, xiaohe lan
> wrote:
>
>> Hi,
>>
>> I am trying to run Spark
Hi,
I am trying to run SparkPi in IntelliJ and getting NoClassDefFoundError.
Has anyone seen this issue before?
Exception in thread "main" java.lang.NoClassDefFoundError:
scala/collection/Seq
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0
Changing the JDK from 1.8.0_45 to 1.7.0_79 solved this issue.
I saw https://issues.apache.org/jira/browse/SPARK-6388
But that issue does not appear to be the problem here.
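For reference, `NoClassDefFoundError: scala/collection/Seq` generally means the Scala runtime library is missing from the run classpath. In an sbt build, one way to make sure it is present (a sketch; the versions are illustrative):

```scala
// build.sbt
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // The default "compile" scope (rather than "provided") keeps Spark and
  // the scala-library jar on the classpath when running from the IDE
  "org.apache.spark" %% "spark-core" % "1.3.1"
)
```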
On Thu, Jul 2, 2015 at 1:30 PM, xiaohe lan wrote:
> Hi Expert,
>
> Hadoop version: 2.4
> Spark version: 1.3.1
>
> I am running t
Hi Expert,
Hadoop version: 2.4
Spark version: 1.3.1
I am running the SparkPi example application.
bin/spark-submit --class org.apache.spark.examples.SparkPi --master
yarn-client --executor-memory 2G lib/spark-examples-1.3.1-hadoop2.4.0.jar
2
The same command sometimes gets WARN ReliableDeli
> Awesome!
>
> It's documented here:
> https://spark.apache.org/docs/latest/submitting-applications.html
>
> -Sandy
>
> On Mon, May 18, 2015 at 8:03 PM, xiaohe lan
> wrote:
>
>> Hi Sandy,
>>
>> Thanks for your information. Yes, spark-submit --master y
, Sandy Ryza
wrote:
> Hi Xiaohe,
>
> All Spark options must go before the jar or they won't take effect.
>
> -Sandy
>
> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan
> wrote:
>
>> Sorry, both of them are assigned tasks actually.
>>
>> Aggreg
[Spark UI executor metrics table, garbled in the archive: executors including host2:62072, ~21.7 min task time, input and shuffle read/write sizes in MB]
On Sun, May 17, 2015 at 11:50 PM, xiaohe lan wrote:
> bash-4.1$ ps aux | grep SparkSubmit
> xilan 1704 13.2 1.2 5275520 380244 pts/0 Sl+ 08:39 0:13
> /scratch/xilan/jdk1.8.0_45/bin/java -cp
executor-cores param? While you submit the job, do a ps aux
> | grep spark-submit and see the exact command parameters.
>
> Thanks
> Best Regards
>
> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan
> wrote:
>
>> Hi,
>>
>> I have a 5 nodes yarn cluster, I used
Hi,
When I start spark-shell with yarn as the master option, println does not
print the elements of an RDD:
bash-4.1$ spark-shell --master yarn
15/05/17 01:50:08 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Welcome to
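This is expected behavior rather than a bug: on a YARN (or any non-local) master, `rdd.foreach(println)` runs on the executors, so the output goes to the executor logs instead of the shell. A sketch of how to see the elements from the driver instead (using the shell's built-in `sc`):

```scala
val rdd = sc.parallelize(1 to 10)

// Runs on the executors; output lands in executor stdout, not the shell:
rdd.foreach(println)

// Bring the data back to the driver first (safe only for small RDDs):
rdd.collect().foreach(println)

// Or inspect just a few elements:
rdd.take(5).foreach(println)
```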
Hi,
I have a 5-node YARN cluster, and I used spark-submit to submit a simple app.
spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar
--class scala.SimpleApp --num-executors 5
I have set the number of executors to 5, but from the Spark UI I could see only
two executors and it ran ve
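The likely cause, as the replies in this thread point out, is that spark-submit treats everything after the application jar as arguments to the application itself, so flags placed after the jar are silently ignored. A sketch of the corrected ordering:

```shell
# Wrong: --class and --num-executors come after the jar, so they are
# passed to SimpleApp as its own arguments and Spark never sees them:
#   spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar \
#     --class scala.SimpleApp --num-executors 5

# Right: all Spark options before the jar, app arguments (if any) after it:
spark-submit --master yarn \
  --class scala.SimpleApp \
  --num-executors 5 \
  target/scala-2.10/simple-project_2.10-1.0.jar
```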
> http://mbonaci.github.io/mbo-spark/
> You don't need to install Spark on every node. Just install it on one node,
> or you can install it on a remote system as well and make a Spark cluster.
> Thanks
> Madhvi
>
> On Thursday 30 April 2015 09:31 AM, xiaohe lan wrote:
>
>> Hi experts
Hi experts,
I see Spark on YARN has yarn-client and yarn-cluster modes. I also have a
5-node Hadoop cluster (Hadoop 2.4). How do I install Spark if I want to try
the Spark-on-YARN mode?
Do I need to install Spark on each node of the Hadoop cluster?
Thanks,
Xiaohe
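As the replies note, installing Spark on a single machine that has the cluster's client configuration is enough, because YARN ships the Spark assembly to the nodes at submit time. A rough sketch (paths and version numbers are illustrative):

```shell
# Unpack a Spark build matching the cluster's Hadoop version on one node
tar xzf spark-1.3.1-bin-hadoop2.4.tgz
cd spark-1.3.1-bin-hadoop2.4

# Point Spark at the Hadoop 2.4 cluster's client configs
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Submit; YARN distributes the Spark jars to the worker nodes itself
bin/spark-submit --master yarn-cluster \
  --class org.apache.spark.examples.SparkPi \
  lib/spark-examples-1.3.1-hadoop2.4.0.jar 10
```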