alamb commented on code in PR #17275:
URL: https://github.com/apache/datafusion/pull/17275#discussion_r2336569903


##########
datafusion/core/tests/parquet/filter_pushdown.rs:
##########
@@ -601,3 +602,99 @@ fn get_value(metrics: &MetricsSet, metric_name: &str) -> usize {
         }
     }
 }
+
+#[tokio::test]
+async fn predicate_cache_default() -> datafusion_common::Result<()> {
+    let ctx = SessionContext::new();
+    // The cache is on by default, but not used unless filter pushdown is enabled
+    PredicateCacheTest {
+        expected_inner_records: 0,
+        expected_records: 0,
+    }
+    .run(&ctx)
+    .await
+}
+
+#[tokio::test]
+async fn predicate_cache_pushdown_default() -> datafusion_common::Result<()> {
+    let mut config = SessionConfig::new();
+    config.options_mut().execution.parquet.pushdown_filters = true;
+    let ctx = SessionContext::new_with_config(config);
+    // The cache is on by default, and used when filter pushdown is enabled
+    PredicateCacheTest {
+        expected_inner_records: 8,
+        expected_records: 4,
+    }
+    .run(&ctx)
+    .await
+}
+
+#[tokio::test]
+async fn predicate_cache_pushdown_disable() -> datafusion_common::Result<()> {
+    // Can disable the cache even with filter pushdown by setting the size to 0. In this case we
+    // expect the inner records are reported but no records are read from the cache
+    let mut config = SessionConfig::new();
+    config.options_mut().execution.parquet.pushdown_filters = true;
+    config
+        .options_mut()
+        .execution
+        .parquet
+        .max_predicate_cache_size = Some(0);
+    let ctx = SessionContext::new_with_config(config);
+    PredicateCacheTest {
+        // file has 8 rows, which need to be read twice, one for filter, one for
+        // final output
+        expected_inner_records: 16,
+        // Expect this to be 0 records read as the cache is disabled. However, it is
+        // non zero due to https://github.com/apache/arrow-rs/issues/8307

Review Comment:
   I verified in the debugger that the cache is not being used. However,
   this metric is very confusing, so I filed a ticket to track it:
   -  https://github.com/apache/arrow-rs/issues/8307



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


