RE: configure number of cached partition in memory on SparkSQL

2015-03-18 Thread Judy Nash
Hi Judy, In the case of HadoopRDD and NewHadoopRDD, partition number is actually decided by the InputFormat used. And spark.sql.inMemoryColumnarStorage.batchSize is not related to partition number, it

Re: configure number of cached partition in memory on SparkSQL

2015-03-16 Thread Cheng Lian
Hi Judy, In the case of HadoopRDD and NewHadoopRDD, the partition number is actually decided by the InputFormat used. And spark.sql.inMemoryColumnarStorage.batchSize is not related to the partition number; it controls the in-memory columnar batch size within a single partition. Also, what d

configure number of cached partition in memory on SparkSQL

2015-03-04 Thread Judy Nash
Hi, I am tuning a Hive dataset on Spark SQL deployed via the Thrift server. How can I change the number of partitions after caching the table on the Thrift server? I have tried the following but am still getting the same number of partitions after caching: spark.default.parallelism spark.sql.inMemoryColu