Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
parthchandra commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608725526 @kazuyukitanimura : https://github.com/apache/datafusion-comet/pull/1327 -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608633437 > These are four different cases aren't they? Microsecond/Millisecond, with and without Tz. Could we do something like ``` DataType::Timestamp(a, Some(_)) i

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608630732 > native_datafusion does not implement v2 yet. But I suppose we can enable it for the test and ensure it falls back to Spark correctly. Ideally we should keep the

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
parthchandra commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608591081 ``` DataType::Timestamp(TimeUnit::Microsecond, Some(_)) => { ... DataType::Timestamp(TimeUnit::Millisecond, Some(_)) => { ...

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
parthchandra commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608586806 > spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala > > ``` > Seq("parquet").foreach { v1List => > withSQLConf( > SQLCo

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608579793 Duplicated case match @andygrove @parthchandra native/spark-expr/src/utils.rs ``` DataType::Timestamp(TimeUnit::Microsecond, Some(_)) => {

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2608505481 spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala ``` Seq("parquet").foreach { v1List => withSQLConf( SQLConf.USE_V1_SO

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
andygrove commented on code in PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#discussion_r1926112409 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -1204,12 +1250,21 @@ object CometSparkSessionExtensions extends Logging {

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
andygrove commented on code in PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#discussion_r1926111345 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -1204,12 +1250,21 @@ object CometSparkSessionExtensions extends Logging {

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-22 Thread via GitHub
andygrove commented on code in PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#discussion_r1926110126 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -944,6 +974,14 @@ class CometSparkSessionExtensions } override

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
parthchandra commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606277066 Sorry about the merge issues guys. git merge seemed to add a couple of files that had been moved/renamed every time I merged from main (and I am pretty sure I removed them

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606146267 > @andygrove @parthchandra Is the change in `docs/source/contributor-guide/benchmarking.md` relevant? This was a merge issue. Will be fixed in https://github.com/apache

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606143968 > This looks `CometScanExec` gets used regardless `COMET_EXEC_ENABLED` and `COMET_NATIVE_SCAN_IMPL` checks The CometScanExec gets replaced later on: ```scala

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606145521 > @andygrove @parthchandra Why are we adding files like `native/spark-expr/src/strings.rs`? There was some refactoring in the past few days in main that resulted in the

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606141084 @andygrove @parthchandra @comphead I think we should revert this change and redo the merge at this point -- This is an automated message from the Apache Git Servic

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606140765 Follow on PR to fix merge issues - https://github.com/apache/datafusion-comet/pull/1320 -- This is an automated message from the Apache Git Service. To respond to the messag

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606140490 @andygrove @parthchandra ``` // data source V1 case scanExec @ FileSourceScanExec( HadoopFsRelation(_, partitionSchema, _

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606127743 @andygrove @parthchandra Why are we adding files like `native/spark-expr/src/strings.rs`? -- This is an automated message from the Apache Git Service. To respond t

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606119962 @andygrove @parthchandra Is the change in `docs/source/contributor-guide/benchmarking.md` relevant? -- This is an automated message from the Apache Git Service. To

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove merged PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@