customized
hadoop jar and the related pom.xml to a Nexus repository. Check this link for
reference:
https://books.sonatype.com/nexus-book/reference/staging-deployment.html
fightf...@163.com
From: Lu, Yingqi
Date: 2016-03-08 15:23
To: fightf...@163.com; user
Subject: RE: How to compile Spark
I think you can set up your own Maven repository and deploy your modified
Hadoop binary jar
with a custom version number. Then you can add your repository to the Spark
pom.xml and build with
mvn -Dhadoop.version=
fightf...@163.com
From: Lu, Yingqi
Date: 2016-03-08 15:09
To: user
I think this may be a permission issue. Check your Spark conf for the Hadoop-related
settings.
fightf...@163.com
From: Arunkumar Pillai
Date: 2016-02-23 14:08
To: user
Subject: spark 1.6 Not able to start spark
Hi,
When I try to start spark-shell, I'm getting the following error:
Exception in thread
Oh, thanks. Makes sense to me.
Best,
Sun.
fightf...@163.com
From: Takeshi Yamamuro
Date: 2016-02-04 16:01
To: fightf...@163.com
CC: user
Subject: Re: Re: About cache table performance in spark sql
Hi,
Parquet data are column-wise and highly compressed, so the size of deserialized
rows in
? From Impala I get that the overall
parquet file size is about 24.59GB. Would be good to have some correction on
this.
Best,
Sun.
fightf...@163.com
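As an aside, here is a minimal sketch of caching a parquet-backed table in Spark SQL; the parquet path and table name are hypothetical, and the first action is what actually materializes the cache:
val df = sqlContext.read.parquet("/path/to/parquet")   // hypothetical parquet path
df.registerTempTable("t")                              // hypothetical table name
sqlContext.cacheTable("t")                             // mark the table for in-memory caching
sqlContext.sql("select count(1) from t").show()        // first action materializes the cache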
From: Prabhu Joseph
Date: 2016-02-04 14:35
To: fightf...@163.com
CC: user
Subject: Re: About cache table performance in spark sql
Sun,
When
age
cannot hold the 24.59GB+ table size in memory. But why is the performance so
different, and even so bad?
Best,
Sun.
fightf...@163.com
From: Ted Yu
Date: 2016-02-04 11:49
To: fightf...@163.com
CC: user
Subject: Re: Re: clear cache using spark sql cli
In spark-shell, I can do:
scala> sqlContext.clearCache()
Is that not the case for you?
On Wed, Feb 3, 2016 at 7:35 PM, fightf...@163.com wrote:
Hi, Ted
Yes. I had seen that issue. But it seems that in the spark-sql CLI I cannot run a
command like:
sqlContext.clearCache()
Is this right? In the spark-sql CLI I can only run SQL queries, so I want to
see if there
are any available options to achieve this.
Best,
Sun.
fightf...@163.com
Hi,
How can I clear the cache (execute a SQL query without any cache) using the spark-sql
CLI?
Is there any command available?
Best,
Sun.
fightf...@163.com
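For reference, a minimal sketch assuming the CACHE TABLE / UNCACHE TABLE SQL statements; the same statements can also be typed directly at the spark-sql CLI prompt (the table name is hypothetical):
sqlContext.sql("CACHE TABLE video_test")                  // cache the table in memory
sqlContext.sql("select count(1) from video_test").show()  // runs against the cached data
sqlContext.sql("UNCACHE TABLE video_test")                // drop it from the cache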
377,769 milliseconds
ago. The last packet sent successfully to the server was 377,790 milliseconds
ago.
Do I need to increase the partitions? Or shall I write a parquet file for each
partition in an iterative way?
Thanks a lot for your advice.
Best,
Sun.
fightf...@163.com
From: 刘虓
Date
sfully. Do I need to increase the partitions? Or are
there any other
alternatives I can choose to tune this?
Best,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2016-01-20 15:06
To: 刘虓
CC: user
Subject: Re: Re: spark dataframe jdbc read/write using dbcp connection pool
Hi,
Thanks a lot
4")
The added_year column in the mysql table contains the range (1985-2015), and I pass
the numPartitions property
for partitioning purposes. Is this what you recommend? Can you advise a
little more on the implementation of this?
Best,
Sun.
fightf...@163.com
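For illustration, a minimal sketch of a partitioned jdbc read on such a column, assuming a hypothetical mysql URL, credentials, and table name:
val props = new java.util.Properties()
props.setProperty("user", "dbuser")                  // hypothetical credentials
props.setProperty("password", "dbpass")
props.setProperty("driver", "com.mysql.jdbc.Driver")
val df = sqlContext.read.jdbc(
  "jdbc:mysql://db-host:3306/mydb",                  // hypothetical URL
  "video_table",                                     // hypothetical table name
  "added_year",                                      // numeric partition column
  1985L,                                             // lowerBound
  2015L,                                             // upperBound
  16,                                                // numPartitions
  props)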
From: 刘虓
Date: 2016-01-20 11:26
1 in stage 0.0
(TID 2)
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link
failure
fightf...@163.com
rTempTable("video_test")
sqlContext.sql("select count(1) from video_test").show()
Overall the load process would get stuck and hit a connection timeout. The mysql table
holds about 100 million records.
Would be happy to provide more usable info.
Best,
Sun.
fightf...@163.com
Hi, Vivek M
I have tried the 1.5.x spark-cassandra connector and indeed encountered some
classpath issues, mainly with the guava dependency.
I believe that can be solved by some maven config, but I have not tried that yet.
Best,
Sun.
fightf...@163.com
From: vivek.meghanat...@wipro.com
Date
Hmm... I think you can do a df.map and store each column value into your list (a sketch follows below).
fightf...@163.com
From: zml张明磊
Date: 2015-12-25 15:33
To: user@spark.apache.org
CC: dev-subscr...@spark.apache.org
Subject: How can I get the column data based on a specific column name and then store
these data in an array
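For illustration, a minimal sketch of collecting one column into a local array; the column name and sample data are hypothetical:
import sqlContext.implicits._
case class Record(name: String, value: Int)
val df = sc.parallelize(Seq(Record("a", 1), Record("b", 2))).toDF()
// select the column, map each Row to its value, and collect into an array
val names: Array[String] = df.select("name").map(_.getString(0)).collect()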
Agree with you that the assembly jar is not good to publish. However, what he
really needs is to fetch
an updatable maven jar file.
fightf...@163.com
From: Mark Hamstra
Date: 2015-12-11 15:34
To: fightf...@163.com
CC: Xiaoyong Zhu; Jeff Zhang; user; Zhaomin Xu; Joe Zhang (SDE)
Subject: Re: RE
Using maven to download the assembly jar is fine. I would recommend deploying
this
assembly jar to your local maven repo, i.e. a nexus repo, or more likely a
snapshot repository.
fightf...@163.com
From: Xiaoyong Zhu
Date: 2015-12-11 15:10
To: Jeff Zhang
CC: user@spark.apache.org; Zhaomin Xu
t and got the
daily distinct count. However, I am not sure whether this implementation is an
efficient workaround.
Hope someone can shed a little light on this.
Best,
Sun.
fightf...@163.com
using this.
fightf...@163.com
From: censj
Date: 2015-12-09 15:44
To: fightf...@163.com
CC: user@spark.apache.org
Subject: Re: About Spark On Hbase
So, how can I get this jar? I use sbt to package the project, but I did not find it in the sbt libs.
On 2015-12-09, at 15:42, fightf...@163.com wrote:
I don't think it really needs the CDH component. Just use the API.
fightf...@163.com
From: censj
Date: 2015-12-09 15:31
To: fightf...@163.com
CC: user@spark.apache.org
Subject: Re: About Spark On Hbase
But this depends on CDH. I have not installed CDH.
On 2015-12-09, at 15:18, fightf...@163.com wrote:
Actually you can refer to https://github.com/cloudera-labs/SparkOnHBase
Also, HBASE-13992 already integrates that feature into the hbase side, but
that feature has not been released.
Best,
Sun.
fightf...@163.com
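For illustration, a rough sketch of writing to HBase from Spark with the plain client API (assuming the HBase 1.x client; the table name, column family, and sample data are hypothetical):
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
val rows = sc.parallelize(Seq(("row1", "a"), ("row2", "b")))
rows.foreachPartition { iter =>
  // one connection per partition, created on the executor
  val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = conn.getTable(TableName.valueOf("my_table"))
  iter.foreach { case (rowKey, value) =>
    val put = new Put(Bytes.toBytes(rowKey))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(value))
    table.put(put)
  }
  table.close()
  conn.close()
}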
From: censj
Date: 2015-12-09 15:04
To: user@spark.apache.org
Subject: About
Well, sorry for the late response, and thanks a lot for pointing out the clue.
fightf...@163.com
From: Akhil Das
Date: 2015-12-03 14:50
To: Sahil Sareen
CC: fightf...@163.com; user
Subject: Re: spark sql cli query results written to file ?
Oops 3 mins late. :)
Thanks
Best Regards
On Thu, Dec 3
Hi,
How can I save the results of queries run in the spark-sql CLI and write the
results to some local file?
Is there any available command?
Thanks,
Sun.
fightf...@163.com
hive config, that would help to locate the root cause of
the problem.
Best,
Sun.
fightf...@163.com
From: Ashok Kumar
Date: 2015-12-01 18:54
To: user@spark.apache.org
Subject: New to Spark
Hi,
I am new to Spark.
I am trying to use spark-sql with SPARK CREATED and HIVE CREATED tables.
I have
Could you provide your hive-site.xml file info?
Best,
Sun.
fightf...@163.com
From: Chandra Mohan, Ananda Vel Murugan
Date: 2015-11-27 17:04
To: fightf...@163.com; user
Subject: RE: error while creating HiveContext
Hi,
I verified and I could see hive-site.xml in spark conf directory
Hi,
I think you just need to put the hive-site.xml in the spark/conf directory and
it will be loaded
onto the Spark classpath.
Best,
Sun.
fightf...@163.com
From: Chandra Mohan, Ananda Vel Murugan
Date: 2015-11-27 15:04
To: user
Subject: error while creating HiveContext
Hi,
I am building a
I think the exception info makes it clear that you may be missing some tez-related
jar on the
Spark Thrift Server classpath.
fightf...@163.com
From: DaeHyun Ryu
Date: 2015-11-11 14:47
To: user
Subject: Spark Thrift doesn't start
Hi folks,
I configured tez as execution engine of Hive. After
Hi,
Have you ever considered cassandra as a replacement? We now have almost the
same usage as your engine, e.g. using mysql to store
the initial aggregated data. Can you share more about your kind of cube queries?
We are very interested in that architecture too :)
Best,
Sun.
fightf...@163.com
prompt response.
fightf...@163.com
From: tsh
Date: 2015-11-10 02:56
To: fightf...@163.com; user; dev
Subject: Re: OLAP query using spark dataframe with cassandra
Hi,
I'm in the same position right now: we are going to implement something like
OLAP BI + Machine Learning explorations on the
of OLAP architecture.
And we are happy to hear more use cases from this community.
Best,
Sun.
fightf...@163.com
From: Jörn Franke
Date: 2015-11-09 14:40
To: fightf...@163.com
CC: user; dev
Subject: Re: OLAP query using spark dataframe with cassandra
Is there any distributor supporting
-apache-cassandra-and-spark
fightf...@163.com
Hi,
I notice that you configured the following:
configuration.set("hbase.master", "192.168.1:6");
Did you mistype the host IP?
Best,
Sun.
fightf...@163.com
From: jinhong lu
Date: 2015-10-27 17:22
To: spark users
Subject: spark to hbase
Hi,
I write my result to hd
Gateway s3 rest api, agreed on such inconvenience and
some incompatibilities. However, we have not
yet researched and tested radosgw much. But we have some small
requirements for using the gateway in some use cases.
Hoping for more considerations and discussions.
Best,
Sun.
fightf...@163.com
From: Jerry
progress ?
Best,
Sun.
fightf...@163.com
Hi, Sarath
Did you try to use and increase spark.executor.extraJavaOptions -XX:PermSize=
-XX:MaxPermSize=
fightf...@163.com
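For illustration, a minimal sketch of setting those options via SparkConf; the sizes are hypothetical and should be tuned, and the driver-side option usually has to be passed through spark-submit --conf or spark-defaults.conf because the driver JVM is already running by the time the conf is built:
val conf = new org.apache.spark.SparkConf()
  .set("spark.executor.extraJavaOptions", "-XX:PermSize=128m -XX:MaxPermSize=512m")
  .set("spark.driver.extraJavaOptions", "-XX:PermSize=128m -XX:MaxPermSize=512m")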
From: Sarath Chandra
Date: 2015-07-29 17:39
To: user@spark.apache.org
Subject: PermGen Space Error
Dear All,
I'm using -
=> Spark 1.2.0
=> Hive 0.13.
Hi, there
I tested with sqlContext.sql("select funcName(param1,param2,...) from tableName")
and it just worked fine.
Would you like to paste your test code here? And which version of Spark are you
using?
Best,
Sun.
fightf...@163.com
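For illustration, a minimal sketch of registering a UDF and calling it through SQL; the function name, table, and columns are hypothetical:
import sqlContext.implicits._
val df = sc.parallelize(Seq((1, 2), (3, 4))).toDF("col1", "col2")
df.registerTempTable("tableName")
// register a simple UDF and call it from a SQL statement
sqlContext.udf.register("addTwo", (a: Int, b: Int) => a + b)
sqlContext.sql("select addTwo(col1, col2) from tableName").show()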
From: vinod kumar
Date: 2015-07-27 15:04
To: User
Subject
suggest you
first deploy a spark standalone cluster to run some integration tests; you
can also consider running spark on yarn for
the later development use cases.
Best,
Sun.
fightf...@163.com
From: Jeetendra Gangele
Date: 2015-07-23 13:39
To: user
Subject: Re: Need help in setting
Hi, there
Which version are you using? Actually the problem seems to be gone after we changed
our spark version from 1.2.0 to 1.3.0.
Not sure what internal changes did it.
Best,
Sun.
fightf...@163.com
From: Night Wolf
Date: 2015-05-12 22:05
To: fightf...@163.com
CC: Patrick Wendell; user; dev
Hi, there
you may need to add:
import sqlContext.implicits._
Best,
Sun
fightf...@163.com
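For illustration, a minimal sketch in the Spark 1.3 style showing where the implicits import fits; the input path is hypothetical:
import sqlContext.implicits._
case class Person(name: String, age: Int)
val people = sc.textFile("examples/src/main/resources/people.txt")   // hypothetical path
  .map(_.split(","))
  .map(p => Person(p(0), p(1).trim.toInt))
  .toDF()                                                             // needs the implicits import
people.registerTempTable("people")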
From: java8964
Date: 2015-04-03 10:15
To: user@spark.apache.org
Subject: Cannot run the example in the Spark 1.3.0 following the document
I tried to check out Spark SQL 1.3.0. I installed it
sqlContext.cacheTable operation,
we can see the cache results. Not sure what's happening here. If anyone can
reproduce this issue, please let me know.
Thanks,
Sun
fightf...@163.com
From: Sean Owen
Date: 2015-04-01 15:54
To: Yuri Makhno
CC: fightf...@163.com; Taotao.Li; user
Subject: R
Hi
Still no luck with your suggestion.
Best.
Sun.
fightf...@163.com
From: Yuri Makhno
Date: 2015-04-01 15:26
To: fightf...@163.com
CC: Taotao.Li; user
Subject: Re: Re: rdd.cache() not working ?
The cache() method returns a new RDD, so you have to use something like this:
val person
Hi
That is just the issue. After running person.cache we then run person.count;
however, there is still no caching shown on the web UI Storage tab.
Thanks,
Sun.
fightf...@163.com
From: Taotao.Li
Date: 2015-04-01 14:02
To: fightfate
CC: user
Subject: Re: rdd.cache() not working
for a little.
Best,
Sun.
case class Person(id: Int, col1: String)
val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt").map(_.split(",")).map(p => Person(p(0).trim.toInt, p(1)))
person.cache
person.count
fightf...@163.com
Looks like an authentication/authorization issue. Can you check that your current user
has the authority to operate (maybe r/w/x) on /user/hive/warehouse?
Thanks,
Sun.
fightf...@163.com
From: smoradi
Date: 2015-03-18 09:24
To: user
Subject: saveAsTable fails to save RDD in Spark SQL 1.3.0
Hi,
Basically
,
Sun.
fightf...@163.com
From: Ranga
Date: 2015-03-18 06:45
To: user@spark.apache.org
Subject: StorageLevel: OFF_HEAP
Hi
I am trying to use the OFF_HEAP storage level in my Spark (1.2.1) cluster. The
Tachyon (0.6.0-SNAPSHOT) nodes seem to be up and running. However, when I try
to persist th
Hi,
If you use maven, what are the actual compile errors?
fightf...@163.com
From: Su She
Date: 2015-03-16 13:20
To: user@spark.apache.org
Subject: Running Scala Word Count Using Maven
Hello Everyone,
I am trying to run the Word Count from here:
https://github.com/holdenk/learning-spark
Hi, Sandeep
From your error log I can see that the jdbc driver was not found on your classpath. Did
you have your mysql
jdbc jar correctly configured on the specific classpath? Can you establish a
hive jdbc connection using
the url jdbc:hive2://localhost:1 ?
Thanks,
Sun.
fightf...@163.com
Thanks haoyuan.
fightf...@163.com
From: Haoyuan Li
Date: 2015-03-16 12:59
To: fightf...@163.com
CC: Shao, Saisai; user
Subject: Re: RE: Building spark over specified tachyon
Here is a patch: https://github.com/apache/spark/pull/4867
On Sun, Mar 15, 2015 at 8:46 PM, fightf...@163.com wrote
Thanks, Jerry
I got it that way. Just wanted to make sure whether there is some option to directly
specify the tachyon version.
fightf...@163.com
From: Shao, Saisai
Date: 2015-03-16 11:10
To: fightf...@163.com
CC: user
Subject: RE: Building spark over specified tachyon
I think you could change the
,
Sun.
fightf...@163.com
Hi,
You may want to check your Spark environment config in spark-env.sh,
specifically SPARK_LOCAL_IP, and check whether you modified
that value, which may default to localhost.
Thanks,
Sun.
fightf...@163.com
From: sara mustafa
Date: 2015-03-14 15:13
To: user
Subject: deploying
Hi, there
You may want to check your hbase config,
e.g. the following property, whose current value /hbase-unsecure may need to be changed to /hbase:
zookeeper.znode.parent = /hbase-unsecure
fightf...@163.com
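For illustration, a minimal sketch of setting the znode parent on the client side; the zookeeper quorum host is hypothetical, and the value should match whatever your HBase cluster actually uses (e.g. /hbase or /hbase-unsecure):
import org.apache.hadoop.hbase.HBaseConfiguration
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set("hbase.zookeeper.quorum", "zk-host")         // hypothetical quorum host
hbaseConf.set("zookeeper.znode.parent", "/hbase-unsecure") // match your cluster's znode parent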
From: HARIPRIYA AYYALASOMAYAJULA
Date: 2015-03-14 10:47
To: user
Subject: Problem connecting to HBase
Hello
Hi,
You can first set up a Scala IDE to develop and debug your Spark program,
say IntelliJ IDEA or Eclipse.
Thanks,
Sun.
fightf...@163.com
From: Xi Shen
Date: 2015-03-06 09:19
To: user@spark.apache.org
Subject: Spark code development practice
Hi,
I am new to Spark. I see every
application? Does Spark provide such configs for achieving that goal?
We know that this is tricky to get working. Just want to know how
this could be resolved, or whether there is some other possible channel
we did not cover.
Looking forward to your kind advice.
Thanks,
Sun.
fightf...@163.com
Hi,
We really have not found an adequate solution for this issue. Hoping for any available
analysis rules or hints.
Thanks,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2015-02-09 11:56
To: user; dev
Subject: Re: Sort Shuffle performance issues about using AppendOnlyMap for
large data
supporting modifying this?
Many thanks,
fightf...@163.com
Hi,
The problem still exists. Would any experts take a look at this?
Thanks,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2015-02-06 17:54
To: user; dev
Subject: Sort Shuffle performance issues about using AppendOnlyMap for large
data sets
Hi, all
Recently we had caught performance
)
- java.lang.Thread.run() @bci=11, line=744 (Interpreted frame)
fightf...@163.com
Hi, Siddharth
You can rebuild Spark with maven by specifying -Dhadoop.version=2.5.0
Thanks,
Sun.
fightf...@163.com
From: Siddharth Ubale
Date: 2015-01-30 15:50
To: user@spark.apache.org
Subject: Hi: hadoop 2.5 for spark
Hi,
I am a beginner with Apache Spark.
Can anyone let me know if it
val kv = new KeyValue(rowkeyBytes,colfam,qual,value)
List(kv)
}
Thanks,
Sun
fightf...@163.com
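For illustration, a rough sketch of preparing KeyValues and writing HFiles with saveAsNewAPIHadoopFile; the column family, qualifier, sample data, and staging path are hypothetical, and a real bulk load would also need HFileOutputFormat2.configureIncrementalLoad plus a completebulkload step afterwards:
import org.apache.spark.SparkContext._
import org.apache.hadoop.hbase.{HBaseConfiguration, KeyValue}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
import org.apache.hadoop.hbase.util.Bytes
val colFam = Bytes.toBytes("cf")   // hypothetical column family
val qual   = Bytes.toBytes("q")    // hypothetical qualifier
// (rowKey, value) pairs; HFiles require rows sorted by row key
val kvs = sc.parallelize(Seq(("row1", "a"), ("row2", "b")))
  .sortByKey()
  .map { case (rowKey, value) =>
    val rk = Bytes.toBytes(rowKey)
    (new ImmutableBytesWritable(rk), new KeyValue(rk, colFam, qual, Bytes.toBytes(value)))
  }
kvs.saveAsNewAPIHadoopFile(
  "/tmp/hfiles_staging",           // hypothetical staging directory
  classOf[ImmutableBytesWritable],
  classOf[KeyValue],
  classOf[HFileOutputFormat2],
  HBaseConfiguration.create())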
From: Jim Green
Date: 2015-01-28 04:44
To: Ted Yu
CC: user
Subject: Re: Bulk loading into hbase using saveAsNewAPIHadoopFile
I used the code below, and it still failed with