Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-07 Thread via GitHub
rkrishn7 commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2644048701 @jonahgao Thank you for calling this out. I think you're right! In fact, I think we could say more generally, this issue arises when the schema of the source of an `INSER

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-07 Thread via GitHub
jayzhan211 closed issue #14394: DataSink::write_all given invalid RecordBatchStream URL: https://github.com/apache/datafusion/issues/14394 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-06 Thread via GitHub
jonahgao commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2639143085 I am concerned whether union will also lead to similar behavior when the nullability of the two inputs is different. -- This is an automated message from the Apache Git Serv

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-03 Thread via GitHub
rkrishn7 commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2632997864 Poked around and it looks like the specific issue here is due to the fact that the schema assigned to `LogicalPlan::Values` during planning defaults all its fields to nullable.

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-03 Thread via GitHub
gatesn commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2631324037 Is this overhead a problem if it's only a debug_assertion? It will be compiled out in release builds. Or are you looking for a release assertion to put somewhere? -- This is a

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-03 Thread via GitHub
zhuqi-lucas commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2631299645 ```rust LogicalPlan::Dml(DmlStatement { table_name, op: WriteOp::Insert(insert_op), .. }) => {

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-03 Thread via GitHub
ozankabak commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2630849990 The only thing I would be worried about is the potential overhead. I get the feeling that this is something that should be done "once" but I'm not sure how we can do it. Maybe

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-03 Thread via GitHub
alamb commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2630622523 > I wonder if it's worth adding that as a debug assertion? I agree this would be good to add -- most similar check in DataFusion do a check and raise an `DataFusionError;:In

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-02 Thread via GitHub
gatesn commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2629380503 Yes, this code fails: https://github.com/apache/datafusion/compare/main...gatesn:datafusion:ngates/record-batch-stream-schema?expand=1#diff-9b1672adeba35025e24d21f8d7da2f0

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-02 Thread via GitHub
zhuqi-lucas commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2629374509 Thank you @gatesn for the report. Can you provide the full code or sql to reproduce the it? So we can solve it more quickly. -- This is an automated message from t

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-02 Thread via GitHub
gatesn commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2629360656 I imagine it's downstream, and of course, the debug assert only catches bad RecordBatchStream impls that used the Adapter. This specific bug may not even be caught by the

Re: [I] DataSink::write_all given invalid RecordBatchStream [datafusion]

2025-02-01 Thread via GitHub
ozankabak commented on issue #14394: URL: https://github.com/apache/datafusion/issues/14394#issuecomment-2629258009 Is this a bug in `DataSink` code, or is it a downstream bug that becomes somewhat hard to notice because there is no `debug_assert`? -- This is an automated message from the