Re: Spark 0.9.1 core dumps on Mesos 0.18.0

2014-07-31 Thread qingyang li
hi, Dale, yes, that works. But a new problem has come up: tasks are always lost. 14/07/31 16:46:29 INFO TaskSetManager: Starting task 0.0:1 as TID 20 on executor 20140731-154806-1694607552-5050-4716-7: bigdata008 (PROCESS_LOCAL) 14/07/31 16:46:29 INFO TaskSetManager: Serialized task 0.0:1

Re: Spark 0.9.1 core dumps on Mesos 0.18.0

2014-06-17 Thread qingyang li
I am using spark 0.9.1, mesos 0.19.0 and tachyon 0.4.1. Is spark 0.9.1 compatible with mesos 0.19.0? 2014-06-17 15:50 GMT+08:00 qingyang li : > hi, steven, have you resolved this problem? i encounter the same > problem, too. > > > 2014-04-18 3:48 GMT+08:00 Sean Owen : >

Re: Spark 0.9.1 core dumps on Mesos 0.18.0

2014-06-17 Thread qingyang li
hi, Steven, have you resolved this problem? I have encountered the same problem, too. 2014-04-18 3:48 GMT+08:00 Sean Owen : > Oh dear I read this as a build problem. I can build with the latest > Java 7, including those versions of Spark and Mesos, no problem. I did > not deploy them. > > Mesos does

how to improve sharkserver2's parallelism performance?

2014-06-09 Thread qingyang li
If I submit a job that takes 3 seconds, how many jobs can sharkserver2 handle at the same time? How can I improve sharkserver2's parallelism performance?
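A rough way to reason about the question above is a back-of-envelope capacity model (a sketch, not a measurement; it assumes each query holds a fixed number of cores for its whole 3-second run and that queries are CPU-bound):

```python
# Back-of-envelope capacity model for a shared query server such as sharkserver2.
# Assumes each query holds its cores for the full run and does not queue on I/O.
def max_concurrent_queries(total_cores: int, cores_per_query: int) -> int:
    """How many queries can run at once without time-slicing cores."""
    return total_cores // cores_per_query

def queries_per_second(concurrent: int, seconds_per_query: float) -> float:
    """Steady-state throughput when all concurrent slots stay busy."""
    return concurrent / seconds_per_query

# Example: 5 machines x 4 cores (the cluster described elsewhere in these
# threads), a hypothetical 2 cores per query, 3-second queries.
slots = max_concurrent_queries(5 * 4, 2)
print(slots)                           # 10
print(queries_per_second(slots, 3.0))  # about 3.33 queries/sec
```

The cores-per-query figure is an assumption for illustration; the real limit also depends on scheduler overhead and memory pressure.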

Re: how to control task number?

2014-05-27 Thread qingyang li
its splitting by setting some configuration, such as "map.split.size=64M"? 2014-05-27 16:59 GMT+08:00 qingyang li : > when i using "create table bigtable002 tblproperties('shark.cache'=' > tachyon') as select * from bigtable001 limit 40;" , ther
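For reference, the Hadoop/Hive-style knobs that usually govern split size are mapred.min.split.size and mapred.max.split.size (together with the HDFS block size), not a literal map.split.size. A sketch of setting them from the shark CLI, using the 64 MB value from this thread; whether they take effect depends on the InputFormat in use:

```sql
-- Sketch only: 67108864 bytes = 64 MB.
SET mapred.min.split.size=67108864;
SET mapred.max.split.size=67108864;
```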

Re: how to set task number?

2014-05-27 Thread qingyang li
on') as select * from bigtable001 ;", there will be 35 files created on tachyon. So I think spark/shark knows how to split files when creating a table; could I control its splitting by setting some configuration, such as "map.split.size=64M"? 2014-05-26 12:14

Re: how to set task number?

2014-05-25 Thread qingyang li
MT+08:00 Aaron Davidson : > What is the format of your input data, prior to insertion into Tachyon? > > > On Sun, May 25, 2014 at 7:52 PM, qingyang li wrote: > >> i tried "set mapred.map.tasks=30" , it does not work, it seems shark >> does not support this settin

Re: how to set task number?

2014-05-25 Thread qingyang li
son : > You can try setting "mapred.map.tasks" to get Hive to do the right thing. > > > On Sun, May 25, 2014 at 7:27 PM, qingyang li wrote: > >> Hi, Aaron, thanks for sharing. >> >> I am using shark to execute query , and table is created on tachyon. I >&

Re: how to set task number?

2014-05-25 Thread qingyang li
interface. The Tachyon master also has a > useful web interface, available at port 1. > > > On Sun, May 25, 2014 at 5:43 PM, qingyang li wrote: > >> hi, Mayur, thanks for replying. >> I know spark application should take all cores by default. My question >> is

Re: how to set task number?

2014-05-25 Thread qingyang li
gt; > > On Thu, May 22, 2014 at 4:07 PM, qingyang li wrote: > >> my aim of setting task number is to increase the query speed,and I >> have also found " mapPartitionsWithIndex at >> Operator.scala:333<http://192.168.1.101:4040/stages/stage?id=17&

Re: how to set task number?

2014-05-22 Thread qingyang li
/192.168.1.101:4040/stages/stage?id=17> to make the costing time down? 2014-05-22 18:09 GMT+08:00 qingyang li : > i have added SPARK_JAVA_OPTS+="-Dspark. > default.parallelism=40 " in shark-env.sh, > but i find there are only10 tasks on the cluster and 2 tasks each

Re: how to set task number?

2014-05-22 Thread qingyang li
I have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 " in shark-env.sh, but I find there are only 10 tasks on the cluster and 2 tasks on each machine. 2014-05-22 18:07 GMT+08:00 qingyang li : > i have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 &q
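For what it's worth, spark.default.parallelism only sets the default number of reduce-side (shuffle) tasks; the map-side task count follows the number of input splits, which would explain seeing 10 tasks regardless of the setting. The shark-env.sh line itself, as a sketch:

```shell
# shark-env.sh (sketch). Note: no space inside the property name, and this
# affects shuffle/reduce tasks only; map tasks track the input split count.
export SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 "
```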

Re: how to set task number?

2014-05-22 Thread qingyang li
I have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 " in shark-env.sh 2014-05-22 17:50 GMT+08:00 qingyang li : > i am using tachyon as storage system and using to shark to query a table > which is a bigtable, i have 5 machines as a spark cluster, there are 4 > co

how to set task number?

2014-05-22 Thread qingyang li
I am using tachyon as the storage system and shark to query a big table. I have 5 machines as a spark cluster, with 4 cores on each machine. My questions are: 1. how to set the task number on each core? 2. where to see how many partitions one RDD has?
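On question 2, per-stage task counts are visible on the application web UI (port 4040, as mentioned later in this thread), and the partition count of a table scan generally equals the number of input splits, roughly the ceiling of file size over split size. A toy sketch; the 2.2 GB figure is an assumption chosen to match the 35 files reported elsewhere in these threads:

```python
import math

def num_splits(file_size_bytes: int, split_size_bytes: int) -> int:
    """One task per input split: the ceiling of size / split size."""
    return math.ceil(file_size_bytes / split_size_bytes)

MB = 1024 * 1024
# A ~2.2 GB table with a 64 MB split size gives 35 splits, consistent
# with the 35 files reported on Tachyon in these threads.
print(num_splits(2200 * MB, 64 * MB))  # 35
```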

Re: Shark Direct insert into table value (?)

2014-04-02 Thread qingyang li
For now, it does not support direct insert. 2014-04-03 10:52 GMT+08:00 abhietc31 : > Hi, > I'm trying to run script in SHARK(0.81) " insert into emp (id,name) > values (212,"Abhi") " but it doesn't work. > I urgently need direct insert as it is show stopper. > > I know that we can do " inser

Re: Shark does not give any results with SELECT count(*) command

2014-03-26 Thread qingyang li
result on bigdata001, but fail on bigdata003. So does spark choose one node randomly to store the result? If I did not state the problem clearly, please let me know. Thanks. 2014-03-26 16:55 GMT+08:00 qingyang li : > hi, Praveen, I can start server on bigdata001 using "/bin/shark -

Re: Shark does not give any results with SELECT count(*) command

2014-03-26 Thread qingyang li
m bigdata003 to check if port is accessible. Hope that > helps. > > > On Wed, Mar 26, 2014 at 12:57 PM, qingyang li wrote: > >> hi, Praveen, thanks for replying. >> >> I am using hive-0.11 which comes from amplab, at the begining , the >> hive-site.xml of ampla

Re: Shark does not give any results with SELECT count(*) command

2014-03-26 Thread qingyang li
bigdata003(master is bigdata001)? 2014-03-25 18:41 GMT+08:00 Praveen R : > Hi Qingyang Li, > > Shark-0.9.0 uses a patched version of hive-0.11 and using > configuration/metastore of hive-0.12 could be incompatible. > > May I know the reason you are using hive-site.xml from previous

Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread qingyang li
Egor, I have encountered the same problem that you asked about in this thread: http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E Have you fixed this problem? I am using shark to read a table which i have created on h

Re: Shark does not give any results with SELECT count(*) command

2014-03-25 Thread qingyang li
d, I deleted some attributes from hive-site.xml. When running select count(*) from xxx, there is no result and no error output. Can someone give me some suggestions for debugging? 2014-03-20 11:27 GMT+08:00 qingyang li : > have found the cause , my problem is : > the format of the slaves file is not corr

Re: Shark does not give any results with SELECT count(*) command

2014-03-19 Thread qingyang li
I have found the cause. My problem was that the format of the slaves file was not correct, so the tasks only ran on the master. I explain it here to help others who encounter a similar problem. 2014-03-20 9:57 GMT+08:00 qingyang li : > Hi, i install spark0.9.0 and shark0.9 on 3 nodes , when i run sel

Shark does not give any results with SELECT count(*) command

2014-03-19 Thread qingyang li
Hi, I installed spark 0.9.0 and shark 0.9 on 3 nodes. When I run select * from src, I get a result, but when I run select count(*) from src or select * from src limit 1, there is no result output. I have found a similar problem on google groups: https://groups.google.com/forum/#!searchin/spark-use

Re: how to config worker HA

2014-03-12 Thread qingyang li
each partition on two cluster nodes. 1. Is this a form of fault tolerance? 2. Will replicating each partition on two cluster nodes help worker node HA? 3. Is there a MEMORY_ONLY_3 which could replicate each partition on three cluster nodes? 2014-03-12 12:11 GMT+08:00 qingyang li : > i

how to config worker HA

2014-03-11 Thread qingyang li
I have one table in memory; when one worker dies, I cannot query data from that table. Here is its storage status (RDD Name / Storage Level / Cached Partitions / Fraction Cached / Size in Memory / Size on Disk): table01 Memory Deserialized 1x Replicated 11
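On question 2 above, a replicated storage level (e.g. MEMORY_ONLY_2) keeps each partition on two nodes, so a single worker failure leaves a readable copy. A toy model of that property (the node names are borrowed from this thread; the round-robin placement policy is an illustration, not Spark's actual placement):

```python
from itertools import combinations

# Place each partition's two replicas on distinct nodes (round-robin over node
# pairs) and verify that losing any one node leaves every partition readable.
nodes = ["bigdata001", "bigdata002", "bigdata003"]
pairs = list(combinations(nodes, 2))
partitions = {i: pairs[i % len(pairs)] for i in range(6)}

def all_readable_after(dead_node: str) -> bool:
    """True if every partition still has a replica off the dead node."""
    return all(any(n != dead_node for n in replicas)
               for replicas in partitions.values())

print(all(all_readable_after(n) for n in nodes))  # True
```

With only 1x replication (as in the storage status shown above), the partitions on a dead worker have no surviving copy, which matches the failure described.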

Re: is spark 0.9.0 HA?

2014-03-10 Thread qingyang li
tions-inside-the-cluster > > Please let me know if you have further questions. > > > On Mon, Mar 10, 2014 at 6:57 PM, qingyang li wrote: > >> is spark 0.9.0 HA? we only have one master server , i think is is not . >> so, Does anyone know how to support HA for spark? >> > >

is spark 0.9.0 HA?

2014-03-10 Thread qingyang li
Is spark 0.9.0 HA? We only have one master server, so I think it is not. Does anyone know how to support HA for spark?
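The standalone master can be made highly available with standby masters coordinated through ZooKeeper. A sketch of the relevant setting; the ZooKeeper hosts zk1/zk2 are placeholders, not hosts from this thread:

```shell
# spark-env.sh on every master node (sketch; zk1,zk2 are hypothetical
# ZooKeeper hosts). Start one or more extra masters with the same setting;
# ZooKeeper elects a leader and a standby takes over if the active master dies.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"
```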

if there is shark 0.9 build can be download?

2014-03-10 Thread qingyang li
Does anyone know if there is a shark 0.9 build that can be downloaded? If not, when will there be a shark 0.9 build?

what is shark's mailing list?

2014-03-10 Thread qingyang li
Does anyone know what shark's mailing list is? I have tried shark-uesr@googlegroups, but that is not it. It is also very slow to open groups.google.com/forum/#!forum/shark-users. Is there any other way to communicate with shark developers?

Re: how to get size of rdd in memory

2014-03-07 Thread qingyang li
0.9.0-incubating-bin-hadoop2]# free -g
             total   used   free   shared   buffers   cached
Mem:            15      6      8        0         0        4
-/+ buffers/cache:      2     13
Swap:            7      0      7
2014-03-07 16:51 GMT+08:00 qing

Re: how to get size of rdd in memory

2014-03-07 Thread qingyang li
t; > > > > On Fri, Mar 7, 2014 at 12:09 AM, qingyang li wrote: > >> dear community, can anyone tell me : how to get size of rdd in memery ? >> thanks. >> > >

how to get size of rdd in memory

2014-03-07 Thread qingyang li
Dear community, can anyone tell me how to get the size of an RDD in memory? Thanks.

Re: need someone to help clear some questions.

2014-03-06 Thread qingyang li
more helpful. > Hope this helps > > > On Thu, Mar 6, 2014 at 3:25 AM, qingyang li wrote: > >> just a addition for #3, i have such configuration in shark-env.sh: >> >> export HADOOP_HOME=/usr/lib/hadoop >> export HADOOP_CONF_DIR=/etc/hadoop/conf >> expo

Re: need someone to help clear some questions.

2014-03-06 Thread qingyang li
. > hdfs://namenode/user2/vols.csv, see this thread > https://groups.google.com/forum/#!topic/tachyon-users/3Da4zcHKBbY > > Lastly as your questions are more shark than spark related there is a > separate shark user group that might be more helpful. > Hope this helps > > &g

Re: need someone to help clear some questions.

2014-03-06 Thread qingyang li
Just an addition for #3, I have this configuration in shark-env.sh:
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HIVE_HOME=/usr/lib/hive/
#export HIVE_CONF_DIR=/etc/hive/conf
export MASTER=spark://bigdata001:7077
- 2014-03-06 16:20 GMT+08:00 qingyang

need someone to help clear some questions.

2014-03-06 Thread qingyang li
Hi, spark community, I have set up a 3-node cluster using spark 0.9 and shark 0.9. My questions are: 1. Is it necessary to install shark on every node, since it is just a client of the spark service? 2. When I run shark-withinfo, I get this warning: WARN shark.SharkEnv: Hive Hadoop shims detecte