andygrove commented on code in PR #1525:
URL: https://github.com/apache/datafusion-comet/pull/1525#discussion_r1996139134
##########
docs/source/user-guide/tuning.md:
##########
@@ -17,18 +17,96 @@
 specific language governing permissions and limitations
 under the License.
 -->

-# Tuning Guide
+# Comet Tuning Guide

 Comet provides some tuning options to help you get the best performance from your queries.

 ## Memory Tuning

-### Unified Memory Management with Off-Heap Memory
+It is necessary to specify how much memory Comet can use in addition to the memory already allocated to Spark. In
+some cases, it may be possible to reduce the amount of memory allocated to Spark so that the overall memory
+allocation is the same as, or lower than, the original configuration. In other cases, enabling Comet may require
+allocating more memory than before. See the [Determining How Much Memory to Allocate] section for more details.

-The recommended way to share memory between Spark and Comet is to set `spark.memory.offHeap.enabled=true`. This allows
-Comet to share an off-heap memory pool with Spark. The size of the pool is specified by `spark.memory.offHeap.size`. For more details about Spark off-heap memory mode, please refer to Spark documentation: https://spark.apache.org/docs/latest/configuration.html.
+[Determining How Much Memory to Allocate]: #determining-how-much-memory-to-allocate

-The type of pool can be specified with `spark.comet.exec.memoryPool`.
+Comet supports Spark's on-heap memory mode (the default) and off-heap mode. However, we strongly recommend using
+off-heap mode, because Comet has some limitations when running in on-heap mode, such as requiring more memory
+overall and requiring shuffle memory to be configured separately.
+
+### Configuring Comet Memory in Off-Heap Mode
+
+The recommended way to allocate memory for Comet is to set `spark.memory.offHeap.enabled=true`. This allows
+Comet to share an off-heap memory pool with Spark, reducing the overall memory overhead. The size of the pool is
+specified by `spark.memory.offHeap.size`. For more details about Spark's off-heap memory mode, please refer to the
+Spark documentation: https://spark.apache.org/docs/latest/configuration.html
+
+### Configuring Comet Memory in On-Heap Mode
+
+When running in on-heap mode, Comet memory can be allocated by setting `spark.comet.memoryOverhead`. If this setting
+is not provided, it will be calculated by multiplying the current Spark executor memory by
+`spark.comet.memory.overhead.factor` (default value `0.2`), which may or may not result in enough memory for
+Comet to operate. It is not recommended to rely on this behavior; it is better to specify
+`spark.comet.memoryOverhead` explicitly.
+
+Comet supports native shuffle and columnar shuffle (these terms are explained in the [shuffle] section below).
+In on-heap mode, columnar shuffle memory must be allocated separately using `spark.comet.columnar.shuffle.memorySize`.
+If this setting is not provided, it will be calculated by multiplying `spark.comet.memoryOverhead` by
+`spark.comet.columnar.shuffle.memory.factor` (default value `1.0`). If a shuffle exceeds this amount of memory,
+the query will fail.
+
+[shuffle]: #shuffle
+
+### Determining How Much Memory to Allocate
+
+Generally, increasing the amount of memory allocated to Comet will improve query performance by reducing the
+amount of time spent spilling to disk, especially for aggregate, join, and shuffle operations. Allocating
+insufficient memory can result in out-of-memory errors.
+This is no different from allocating memory in Spark, and the right amount of memory will vary for different
+workloads, so some experimentation will be required.
+
+Here is a real-world example, based on benchmarks derived from TPC-H, run on a single executor against local
+Parquet files using the 100 GB data set.
+
+Baseline Spark performance:
+
+- Spark completes the benchmark in 632 seconds with 8 cores and 8 GB of RAM
+- With less than 8 GB of RAM, performance degrades due to spilling
+- Spark can complete the benchmark with as little as 3 GB of RAM, but with worse performance (744 seconds)
+
+Comet performance:
+
+- Comet requires at least 5 GB of RAM in off-heap mode and 6 GB of RAM in on-heap mode, but performance at this
+  level is around 340 seconds, which is significantly faster than Spark with any amount of RAM
+- Comet running in off-heap mode with 8 cores completes the benchmark in 295 seconds, more than 2x faster than Spark
+- It is worth noting that running Comet with only 4 cores and 4 GB of RAM completes the benchmark in 520 seconds,
+  providing better performance than Spark with half the resources
+
+It may be possible to reduce Comet's memory overhead by reducing batch sizes or increasing the number of partitions.
+
+### SortExec
+
+Comet's SortExec implementation spills to disk when under memory pressure, but there are some known issues in the
+underlying DataFusion SortExec implementation that could cause out-of-memory errors during spilling. See
+https://github.com/apache/datafusion/issues/14692 for more information.
+
+Workarounds for this problem include:
+
+- Allocating more off-heap memory
+- Disabling native sort by setting `spark.comet.exec.sort.enabled=false`

Review Comment:
   @kazuyukitanimura I forget where we discussed this, but I added the info here to explain the current limitation.
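For reference, the recommended off-heap configuration described in the diff looks something like this in
`spark-defaults.conf`. The sizes are illustrative placeholders rather than values from the PR, and the
`spark.plugins` line (the usual way to enable the Comet plugin) is an assumption here, not taken from the diff:

```properties
# Recommended: off-heap mode, sharing a memory pool between Spark and Comet.
# Sizes are illustrative and should be tuned per workload.
spark.plugins                   org.apache.spark.CometPlugin
spark.memory.offHeap.enabled    true
spark.memory.offHeap.size       8g
```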
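Similarly, a sketch of the on-heap alternative, using only the settings named in the diff; the comments restate the
default calculations the new text describes:

```properties
# On-heap mode (not recommended, per the guide): Comet memory is allocated separately.
# If spark.comet.memoryOverhead is unset, it is derived as
# executor memory * spark.comet.memory.overhead.factor (default 0.2),
# which may not be enough for Comet to operate.
spark.executor.memory                     8g
spark.comet.memoryOverhead                2g
# Columnar shuffle memory must also be sized explicitly in on-heap mode.
# If unset, it is derived as
# spark.comet.memoryOverhead * spark.comet.columnar.shuffle.memory.factor (default 1.0),
# and a shuffle that exceeds this amount will fail the query.
spark.comet.columnar.shuffle.memorySize   2g
```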
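Finally, the SortExec workaround from the end of the diff as a single config line; disabling Comet's native sort
should cause sorts to fall back to Spark's implementation, trading some performance for stability:

```properties
# Workaround for the DataFusion SortExec spilling issue
# (https://github.com/apache/datafusion/issues/14692):
spark.comet.exec.sort.enabled   false
```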