Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
huaxingao commented on code in PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#discussion_r2167873990 ## common/src/main/java/org/apache/comet/parquet/FileReader.java: ## @@ -128,6 +134,48 @@ public FileReader(InputFile file, ParquetReadOptions options, Read

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
hsiang-c commented on code in PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#discussion_r2167782872 ## common/src/main/java/org/apache/comet/parquet/FileReader.java: ## @@ -128,6 +134,48 @@ public FileReader(InputFile file, ParquetReadOptions options, ReadO

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
hsiang-c commented on code in PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#discussion_r2167786141 ## common/src/main/java/org/apache/comet/parquet/FileReader.java: ## @@ -209,6 +257,55 @@ public void setRequestedSchema(List projection) { } } +

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
hsiang-c commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3006454989 > In my local copy of Iceberg, I updated SparkBatchQueryScan to implement SupportsComet. @andygrove You can apply the diff to Iceberg 1.8.1 for Comet support. I'll

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
andygrove commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3006159227 I am now able to get this working end-to-end :tada: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
andygrove commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3005044455 The `NoSuchMethodError` error was my mistake. I was building Comet for Spark 3.4 but using the jar for Spark 3.5 when testing. I no longer see any errors, but Comet is not acc

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
andygrove commented on code in PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#discussion_r2167562231 ## common/pom.xml: ## @@ -26,7 +26,7 @@ under the License. org.apache.datafusion comet-parent-spark${spark.version.short}_${scala.binary.versi

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
snmvaughan commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3005663410 I'm surprised we don't have a Comet interface which provides the access needed by Comet, in combination with an Iceberg implementation of that interface that wraps the Iceber

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
andygrove commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3005415617 In my local copy of Iceberg, I updated `SparkBatchQueryScan` to implement `SupportsComet`. When using both the Comet and Iceberg jars on the classpath, Comet is unable to reco

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
andygrove commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3004952533 I added the following method to `FileReader` locally: ```scala /** Sets the projected columns to be read later via {@link #readNextRowGroup()} */ public void s

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-25 Thread via GitHub
andygrove commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3004865649 I am testing this locally now. There is still one API call that references a Parquet class, causing Iceberg to fail to compile: ``` /home/andy/git/apache/iceberg/par

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-24 Thread via GitHub
huaxingao commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3002056893 cc @andygrove @parthchandra @hsiang-c Could you please review this PR? Thanks a lot!

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-24 Thread via GitHub
huaxingao commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3002050043 I have a draft iceberg [PR](https://github.com/apache/iceberg/pull/13378)

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-24 Thread via GitHub
andygrove commented on code in PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#discussion_r2164143550 ## common/src/main/java/org/apache/comet/parquet/ColumnReader.java: ## @@ -126,6 +126,13 @@ public void setPageReader(PageReader pageReader) throws IOExcept

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-23 Thread via GitHub
huaxingao commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-2997669574 I somehow got some strange errors: ``` [info] ParquetV1QuerySuite: [info] - simple select queries (635 milliseconds) [info] - appending (254 milliseconds) [info]

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-23 Thread via GitHub
andygrove commented on code in PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#discussion_r2162206037 ## common/src/main/java/org/apache/comet/parquet/ColumnReader.java: ## @@ -126,6 +126,13 @@ public void setPageReader(PageReader pageReader) throws IOExcept

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-20 Thread via GitHub
huaxingao closed pull request #1920: feat: Encapsulate Parquet objects URL: https://github.com/apache/datafusion-comet/pull/1920

Re: [PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-20 Thread via GitHub
codecov-commenter commented on PR #1920: URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-2992325126 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1920?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[PR] feat: Encapsulate Parquet objects [datafusion-comet]

2025-06-20 Thread via GitHub
huaxingao opened a new pull request, #1920: URL: https://github.com/apache/datafusion-comet/pull/1920 ## Which issue does this PR close? Closes #. ## Rationale for this change Iceberg shades Parquet. We can't pass Parquet objects from Iceberg to Comet. In order to ge