comphead commented on code in PR #16332:
URL: https://github.com/apache/datafusion/pull/16332#discussion_r2134795185


##########
datafusion-cli/tests/sql/integration/glob_test.sql:
##########
@@ -0,0 +1,15 @@
+-- Test glob function with files available in CI
+-- Test 1: Single CSV file - verify basic functionality
+SELECT COUNT(*) AS cars_count FROM glob('../datafusion/core/tests/data/cars.csv');
+
+-- Test 2: Data aggregation from CSV file - verify actual data reading
+SELECT car, COUNT(*) as count FROM glob('../datafusion/core/tests/data/cars.csv') GROUP BY car ORDER BY car;
+
+-- Test 3: JSON file with explicit format parameter - verify format specification
+SELECT COUNT(*) AS json_count FROM glob('../datafusion/core/tests/data/1.json', 'json');
+
+-- Test 4: Single specific CSV file - verify another CSV works
+SELECT COUNT(*) AS example_count FROM glob('../datafusion/core/tests/data/example.csv');
+
+-- Test 5: Glob pattern with wildcard - test actual glob functionality
+SELECT COUNT(*) AS glob_pattern_count FROM glob('../datafusion/core/tests/data/exa*.csv');

Review Comment:
   Should we introduce a new function? Can we reuse the current model? What should the behavior be if the folder contains a mix of CSV, JSON, and Parquet files?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org