Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-28 Thread via GitHub
alamb commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2761888196 Turns out this is a bug in the generator -- https://github.com/clflushopt/tpchgen-rs/issues/73#issuecomment-2761885245 -- This is an automated message from the Apache Git Servic

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-28 Thread via GitHub
alamb closed issue #15456: [Bug] datafusion-cli may fail to read csv files URL: https://github.com/apache/datafusion/issues/15456 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-28 Thread via GitHub
alamb commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2761675489 Nice find @chenkovsky -- so looks like there is some bug in the data generator after all.

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-28 Thread via GitHub
chenkovsky commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760736404
```
grep -n "p_partkey" part.csv
```
why there are two head rows
```
1:p_partkey,p_name,p_mfgr,p_brand,p_type,p_size,p_container,p_retailprice,p_commen
```
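
The `grep` check above can also be scripted. The following is a minimal sketch, not anything taken from the thread: the helper name `find_repeated_headers` and the sample rows are hypothetical, and it assumes the header is the first line of the file and that a repeated header is an exact textual copy of it.

```python
def find_repeated_headers(lines):
    """Return 1-based line numbers of lines that repeat the first (header) line.

    `lines` is any iterable of text lines, for example an open file object.
    Assumption: a repeated header is byte-for-byte identical to the first line.
    """
    it = iter(lines)
    header = next(it, None)
    if header is None:
        return []  # empty input: nothing to compare against
    header = header.rstrip("\r\n")
    # Line 1 is the header itself, so the remaining lines start at 2.
    return [n for n, line in enumerate(it, start=2)
            if line.rstrip("\r\n") == header]


# Hypothetical sample data with the header repeated on line 3.
sample = ["id,name\n", "1,a\n", "id,name\n", "2,b\n"]
print(find_repeated_headers(sample))
```

To run it against a real file, pass an open file handle, e.g. `with open("part.csv") as f: find_repeated_headers(f)`.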

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-28 Thread via GitHub
niebayes commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760766577 > why there are two head rows I didn't find this. You might find the cause.

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-28 Thread via GitHub
niebayes commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760761409 The line number in the error message is the row index of a certain record batch, not the line number in the csv file. I have filed an issue to arrow-rs for making this error me
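
To illustrate the distinction between a row index inside a record batch and a line number in the file, here is a rough sketch of how one could translate between the two. Everything in it is an assumption for illustration: the helper name, the 0-based row index, a fixed number of rows per batch, a single header line, and no records that span multiple lines. It is not how arrow-rs reports or computes positions.

```python
def file_line_from_batch_row(batch_index, row_in_batch, batch_size, header_lines=1):
    """Estimate the 1-based file line for a row reported relative to a batch.

    Assumptions (all hypothetical): batches are read in order, every earlier
    batch holds exactly `batch_size` rows, `row_in_batch` is 0-based, and each
    record occupies exactly one line of the file.
    """
    rows_before = batch_index * batch_size
    # Skip the header line(s), then the rows of earlier batches, then convert
    # the 0-based offset inside the current batch into a 1-based line number.
    return header_lines + rows_before + row_in_batch + 1


# Hypothetical example: row 5 of the third batch (index 2), 8192 rows per batch.
print(file_line_from_batch_row(batch_index=2, row_in_batch=5, batch_size=8192))
```

If any of the assumptions does not hold (for instance quoted fields containing newlines), the estimate will be off, which is one reason a file-level line number in the error message itself is more useful.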

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-27 Thread via GitHub
chenkovsky commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760334856 take

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-27 Thread via GitHub
alamb commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2759119415 The problem can be reproduced without the tpchdbgen CLI:
Step 1: download [part.zip](https://github.com/user-attachments/files/19492616/part.zip)
Step 2: unzip ```shel

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-27 Thread via GitHub
alamb commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2759121460 I also tried disabling file scan repartition and it still happens
```sql
> set datafusion.optimizer.repartition_file_scans = false;
0 row(s) fetched.
Elapsed 0.001 s
```

Re: [I] [Bug] datafusion-cli may fail to read csv files [datafusion]

2025-03-27 Thread via GitHub
niebayes commented on issue #15456: URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2758144809 This https://github.com/apache/arrow-rs/issues/7344 may help debugging.