Spark SQL doesn't produce output while hive does

2019-03-03 Thread mayangyang02
Hi, we have a SQL query. When we ran it with Hive, it produced the result normally. But when we ran it with Spark, it didn’t produce any output. We found that the WHERE clause is what causes the problem. The WHERE clause is as follows: where 1=1 and user_id <> '未知' and user_mobile <> -99 Can an
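The preview above is cut off and does not say what the cause turned out to be. As a general point worth checking with a filter like this, a `<>` comparison that evaluates to NULL drops the row. The following is a minimal sketch using Python's built-in sqlite3 module, not Spark or Hive; the table name `users`, the column types, and the sample rows are assumptions made up for illustration, and it does not reproduce the Spark/Hive difference itself.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, user_mobile INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        ("u1", 100),     # passes both filters
        ("u2", None),    # NULL <> -99 evaluates to NULL, so the row is dropped
        ("未知", 200),    # dropped by the user_id filter
        ("u3", -99),     # dropped by the user_mobile filter
    ],
)

rows = conn.execute(
    "SELECT user_id FROM users "
    "WHERE 1=1 AND user_id <> '未知' AND user_mobile <> -99 "
    "ORDER BY user_id"
).fetchall()
print(rows)
```

If a comparison in the WHERE clause evaluates to NULL for every row (for example after an implicit cast fails), the query returns nothing, so checking how each engine casts `user_mobile` against `-99` may be a reasonable first step.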

Shuffle service with more than one executor

2019-03-03 Thread Bruno Faria
Hi, I have a Spark standalone cluster running on Kubernetes with anti-affinity for network performance. I’d like to enable Spark dynamic allocation, and for this I need to enable the shuffle service, but it looks like I can’t do that when running more than one worker instance on the same worker. Is there a
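For reference, the two settings the question refers to are the standard Spark properties `spark.dynamicAllocation.enabled` and `spark.shuffle.service.enabled`. The sketch below only renders them as `--conf` arguments; the helper `to_submit_args` is a hypothetical name introduced here, and the snippet does not address the multiple-workers limitation being asked about.

```python
# Standard Spark property names; the values shown enable both features.
conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
}


def to_submit_args(settings):
    """Hypothetical helper: turn a settings dict into spark-submit --conf flags."""
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


submit_args = to_submit_args(conf)
print(" ".join(submit_args))
```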

Re: disable spark disk cache

2019-03-03 Thread Hien Luu
Hi Andrey, Below is the description of MEMORY_ONLY from https://spark.apache.org/docs/latest/rdd-programming-guide.html: "Store RDD as deserialized Java objects in the JVM. If the RDD does not fit in memory, some partitions will not be cached and will be recomputed on the fly each time they're nee
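The behaviour quoted above (partitions that do not fit are simply not cached and are recomputed on each access, with no spill to disk) can be illustrated with a toy model in plain Python. This is only a sketch of the documented semantics, not Spark's block manager: the class `ToyMemoryOnlyCache`, its partition-count capacity, and the lack of any eviction policy are simplifying assumptions.

```python
class ToyMemoryOnlyCache:
    """Toy model of MEMORY_ONLY semantics; not Spark's implementation."""

    def __init__(self, capacity, compute):
        self.capacity = capacity    # assumed limit: number of partitions that fit
        self.compute = compute      # function that (re)computes a partition
        self.cached = {}
        self.compute_calls = 0

    def get(self, partition_id):
        if partition_id in self.cached:
            return self.cached[partition_id]
        # Cache miss: compute the partition from scratch.
        self.compute_calls += 1
        value = self.compute(partition_id)
        # Keep it only if there is room; otherwise it is neither cached
        # nor written to disk, and will be recomputed next time.
        if len(self.cached) < self.capacity:
            self.cached[partition_id] = value
        return value


cache = ToyMemoryOnlyCache(capacity=2, compute=lambda pid: pid * 10)
for _ in range(2):
    results = [cache.get(pid) for pid in range(3)]

print(results, cache.compute_calls, sorted(cache.cached))
```

With room for two of three partitions, the third partition is computed again on the second pass, which is the recomputation cost the documentation describes.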

disable spark disk cache

2019-03-03 Thread Andrey Dudin
Hello everyone, Is there a way to prevent caching data to disk even if the memory (RAM) runs out? As far as I know, Spark will use disk even if I use MEMORY_ONLY. How can I disable this mechanism? I want to get something like an out-of-memory exception if the memory (RAM) runs out. Thanks, Andrey