executor running time vs getting result from jupyter notebook

2016-04-14 Thread Patcharee Thongtra
Hi, I am running a Jupyter notebook with PySpark. I noticed from the History Server UI that some tasks spend a lot of time on either executor running time or getting result, while other tasks finish both steps very quickly. All tasks, however, have very similar input sizes. What can be the f

custom inputformat recordreader

2015-11-26 Thread Patcharee Thongtra
Hi, in Python, how do I use an InputFormat / custom RecordReader? Thanks, Patcharee - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
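A custom InputFormat/RecordReader must be written in Java or Scala and placed on the executor classpath; PySpark only references the classes by name. A minimal sketch of how the call might look, using PySpark's `newAPIHadoopFile` (the class names below are placeholders — `TextInputFormat` stands in for your custom class):

```python
# Sketch: reading data with a custom Hadoop InputFormat from PySpark.
# Swap INPUT_FORMAT for your own class's fully qualified name; the key
# and value classes must match what your RecordReader emits.
INPUT_FORMAT = "org.apache.hadoop.mapreduce.lib.input.TextInputFormat"
KEY_CLASS = "org.apache.hadoop.io.LongWritable"
VALUE_CLASS = "org.apache.hadoop.io.Text"

def read_with_custom_format(sc, path):
    """Load `path` via the InputFormat named above (sc is a SparkContext)."""
    return sc.newAPIHadoopFile(
        path,
        inputFormatClass=INPUT_FORMAT,
        keyClass=KEY_CLASS,
        valueClass=VALUE_CLASS,
    )
```

The custom JAR would typically be shipped with `--jars` when submitting the job.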

data local read counter

2015-11-25 Thread Patcharee Thongtra
Hi, is there a counter for data-local reads? I assumed it was the locality level counter, but it seems not. Thanks, Patcharee

locality level counter

2015-11-25 Thread Patcharee Thongtra
Hi, I do not understand how this locality level counter works. I have an application working on unsplittable binary files on a 6-node cluster. One file = 3 data blocks. The application reads the whole file into an RDD. Why is the Locality Level of all tasks (in the History Server UI) NODE_LOCAL? Tha
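One plausible explanation, sketched below (the read pattern is an assumption, since the thread does not show the code): when unsplittable files are read whole, there is one task per file, and the scheduler marks a task NODE_LOCAL if it runs on a node holding at least one of that file's blocks, even if the remaining blocks are fetched remotely.

```python
# Sketch: reading unsplittable binary files whole in PySpark.
def load_whole_files(sc, path):
    # binaryFiles yields one (filename, bytes) record per file, so each
    # task consumes an entire file. A task placed on a node that holds
    # any one of the file's 3 blocks is reported as NODE_LOCAL, even
    # though the other 2 blocks may be read over the network.
    return sc.binaryFiles(path)
```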

Re: sql query orc slow

2015-10-13 Thread Patcharee Thongtra
driver side didn’t set the predicate at all, then somewhere else is broken. Can you please file a JIRA with a simple reproduce step, and let me know the JIRA number? Thanks. Zhan Zhang On Oct 13, 2015, at 1:01 AM, Patcharee Thongtra <patcharee.thong...@uni.no> wrote: Hi Zhan Zhan

Re: sql query orc slow

2015-10-13 Thread Patcharee Thongtra
Hi Zhan Zhang, could my problem (the ORC predicate is not generated from the WHERE clause even though spark.sql.orc.filterPushdown=true) be related to the factors below? - ORC file version (File Version: 0.12 with HIVE_8732) - Hive version (using Hive 1.2.1.2.3.0.0-2557) - orc table is no
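For context, a minimal sketch of how the pushdown flag is set and used; the table and columns (`coordinate`, `date`, `x`, `y`) come from this thread, but the predicate itself is a made-up example:

```python
# Sketch: enabling ORC predicate pushdown (Spark 1.x, HiveContext assumed).
PUSHDOWN_KEY = "spark.sql.orc.filterPushdown"

def query_with_pushdown(sqlContext):
    sqlContext.setConf(PUSHDOWN_KEY, "true")
    # Even with the flag on, pushdown depends on file-side factors like
    # those listed in the question: the ORC writer must have recorded
    # statistics, and the file/Hive versions must support them.
    return sqlContext.sql("SELECT date, x, y FROM coordinate WHERE x > 100")
```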

No assemblies found in assembly/target/scala-2.10

2015-03-13 Thread Patcharee Thongtra
Hi, I am trying to build Spark 1.3 from source. After I executed `mvn -DskipTests clean package` I tried to use the shell but got this error: [root@sandbox spark]# ./bin/spark-shell Exception in thread "main" java.lang.IllegalStateException: No assemblies found in '/root/spark/assembly/target/scal

bad symbolic reference. A signature in SparkContext.class refers to term conf in value org.apache.hadoop which is not available

2015-03-11 Thread Patcharee Thongtra
Hi, I have built Spark version 1.3 and tried to use it in my Spark Scala application. When I tried to compile and build the application with SBT, I got the error: bad symbolic reference. A signature in SparkContext.class refers to term conf in value org.apache.hadoop which is not available. It se
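This error usually means the Hadoop classes that SparkContext's signature references are missing from the compile classpath. A hypothetical build.sbt fragment that typically resolves it (the versions here are assumptions — they should match the Spark build and cluster in question):

```scala
// Assumed versions: Spark 1.3.0 and Hadoop 2.6.0; adjust to your cluster.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.3.0" % "provided",
  // Pulls in org.apache.hadoop.conf.Configuration and friends,
  // which SparkContext.class refers to.
  "org.apache.hadoop" % "hadoop-client" % "2.6.0"
)
```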

java.lang.RuntimeException: Couldn't find function Some

2015-03-09 Thread Patcharee Thongtra
Hi, in my Spark application I queried a Hive table and tried to take only one record, but got java.lang.RuntimeException: Couldn't find function Some val rddCoOrd = sql("SELECT date, x, y FROM coordinate where order by date limit 1") val resultCoOrd = rddCoOrd.take(1)(0) Any ideas? I
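One thing worth noting: the quoted query has a dangling WHERE with no predicate before ORDER BY, which is invalid SQL and a plausible trigger for the confusing error. A sketch of the likely fix (shown in PySpark for consistency with the rest of the thread; the original was Scala):

```python
# The query as quoted in the question -- note "where" with no condition:
broken = "SELECT date, x, y FROM coordinate where order by date limit 1"
# Likely fix: drop the dangling WHERE (or give it a real predicate):
fixed = "SELECT date, x, y FROM coordinate ORDER BY date LIMIT 1"

def first_coordinate(sqlContext):
    # take(1) returns a list; index [0] gives the single Row.
    return sqlContext.sql(fixed).take(1)[0]
```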