Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-07-06 Thread via GitHub
kosiew closed pull request #15295: Enhance Schema adapter to accommodate evolving struct URL: https://github.com/apache/datafusion/pull/15295 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-07-06 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-3043339058 Reimplementation completed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-22 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2901313004 Thanks @alamb , @adriangb for the review. Working on the smaller PRs.. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-21 Thread via GitHub
adriangb commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2100981813 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -1178,6 +1207,31 @@ impl ListingTable { } } +/// Extension trait for FileSource to allow schema

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-21 Thread via GitHub
alamb commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2100961322 ## datafusion/datasource/src/nested_schema_adapter.rs: ## @@ -0,0 +1,943 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-21 Thread via GitHub
alamb commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2898922232 > This PR is large and cumbersome to review. > I propose to close it and re-implement as: The break up as you suggest sounds reasonable to me @mbutrovich and @adrian

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-20 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2895687205 @kosiew that sounds good but please keep this working branch around on the `schema-adapter` branch. Maybe you can cherry-pick the changes onto a new branch when you break these

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-20 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2893829674 I propose to close this large PR and re-implement as: ## PR 1: Extract and test core SchemaAdapter helpers **Description** Pull the field‐mapping logic out of `Defaul

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-16 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2887827231 Ok I verified it works. Can you update the PR to remove all the extra logging and put it up for review? This is amazing work. Thank you! cc @alamb any chance you can take

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-16 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2879192184 @kosiew sounds good, I can’t get to a computer for the next few days but can try afterwards. In the meantime can you try one last thing - in your test order the files backwards.

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-13 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2875527091 @kosiew in case it's helpful I ended up building your datafusion fork (on branch schema-adapter) locally so you can get more useful stack traces. See below: ``` Error

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-13 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2875480275 @kosiew did you update the code or just add a test? I'm getting the same error ``` Error fetching table metadata: Failed to collect data frame results: Shared(ArrowErr

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-12 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2871799892 hi @TheBuilderJR , I pushed a simpler test_datafusion_schema_evolution- https://github.com/apache/datafusion/pull/15295/files#diff-f73022e8850396e8f5595b58e1b24a7fd08499dc6ac

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-08 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2865192764 @kosiew thanks I tried that but am getting this error: ``` Error fetching table metadata: Failed to collect data frame results: Shared(ArrowError(ExternalError(Executio

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-08 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2864869245 > there's 10 v1 files, 5 v2 files, 3 v3 files and 1 v4 files. Ideally ListingTableConfig could just derive the mapping from each. Is that possible with your abstraction? # Here

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-07 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2861173475 @kosiew so I think the tricky part is that there are actually multiple evolutions. Basically my code currenty looks like this ``` let con

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-07 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2857995748 hi @TheBuilderJR , In the [2 schemas you quoted](https://github.com/apache/datafusion/pull/15295#issuecomment-2851094063), there are: 1. file_schema (from the parquet files)

[PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-07 Thread via GitHub
kosiew opened a new pull request, #15295: URL: https://github.com/apache/datafusion/pull/15295 ## Which issue does this PR close? - Closes #14757. ## Rationale for this change [arrow-rs suggests that SchemaAdapter is better approach for handling evolving struct](https:/

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-06 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2856949489 @TheBuilderJR , Thanks for providing the schemas. I'll look at this again and update later. -- This is an automated message from the Apache Git Service. To respond to the messa

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-05 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2851094063 Actually this is the two schemas: Error fetching table metadata: Failed to collect data frame results: Shared(Plan("Cannot cast file schema field additionalInfo of type St

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-05 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2851086728 @kosiew sorry I just saw this, I don't remember the schemas but happy to test again. Is this PR ready? -- This is an automated message from the Apache Git Service. To respond

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-04-28 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2837154356 Closing, no update from @TheBuilderJR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-04-28 Thread via GitHub
kosiew closed pull request #15295: Enhance Schema adapter to accommodate evolving struct URL: https://github.com/apache/datafusion/pull/15295 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-04-05 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2743256589 hi @TheBuilderJR I haven't tidied up the PR yet. Here's how I used the NestedSchema in test_datafusion_schema_evolution_with_compaction ```rust use datafusion:

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-30 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2764961704 @TheBuilderJR , Thanks for testing this. `Unexpected batch schema from file, expected 1 cols but got 12` Can you share the 2 schemas? -- This is an automated

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-30 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2764720923 I tried patching this in and getting these failures. Any idea why? Unfortunately I can't share the full source. ``` Error fetching table metadata: Failed to collect dat

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-25 Thread via GitHub
kosiew commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2013310548 ## datafusion/datasource/src/nested_schema_adapter.rs: ## @@ -0,0 +1,868 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-25 Thread via GitHub
TheBuilderJR commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2012299357 ## datafusion/datasource/src/nested_schema_adapter.rs: ## @@ -0,0 +1,868 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-25 Thread via GitHub
kosiew commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2750776919 hi @TheBuilderJR , I looked at #15259 and wanted to try incorporating its `struct_coercion`. But the amended `struct_coercion` caused `sqllogictests` to fail. So I did not

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-22 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2745948813 Nice! Fwiw another edge case I found recently that's probably worth testing is a List where the Struct evolves. I ended up solving it by updating list_coersion but curious if yo

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-20 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2739119123 @kosiew any chance you can try running the test case in https://github.com/apache/datafusion/issues/14757? It's a real world example of schema evolution that I hope can be solve

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-18 Thread via GitHub
kosiew commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2000486806 ## datafusion/datasource/src/nested_schema_adapter.rs: ## @@ -0,0 +1,582 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-18 Thread via GitHub
kosiew commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2000486806 ## datafusion/datasource/src/nested_schema_adapter.rs: ## @@ -0,0 +1,582 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

[PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-18 Thread via GitHub
kosiew opened a new pull request, #15295: URL: https://github.com/apache/datafusion/pull/15295 ## Which issue does this PR close? - Closes #14757. ## Rationale for this change This PR introduces a `NestedStructSchemaAdapter` to improve schema evolution handling in DataFu