alamb commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2761888196
Turns out this is a bug in the generator --
https://github.com/clflushopt/tpchgen-rs/issues/73#issuecomment-2761885245
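--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.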
alamb closed issue #15456: [Bug] datafusion-cli may fail to read csv files
URL: https://github.com/apache/datafusion/issues/15456
alamb commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2761675489
Nice find @chenkovsky -- so it looks like there is a bug in the data
generator after all.
chenkovsky commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760736404
```
grep -n "p_partkey" part.csv
```
Why are there two header rows?
```
1:p_partkey,p_name,p_mfgr,p_brand,p_type,p_size,p_container,p_retailprice,p_commen
```
niebayes commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760766577
> Why are there two header rows?
I didn't notice this. You may have found the cause.
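For anyone who wants to repeat the `grep -n "p_partkey" part.csv` check without grep, here is a minimal Python sketch. The function name `find_matching_lines` is made up for illustration and is not part of DataFusion or tpchgen-rs; it only assumes that a header row can be recognized by the column name `p_partkey` appearing in the line.

```python
def find_matching_lines(path, needle="p_partkey"):
    """Return the 1-based line numbers of every line that contains `needle`.

    Hypothetical helper: mirrors `grep -n` for a plain substring and does
    no CSV parsing, so quoted fields spanning several lines are not handled.
    """
    with open(path, encoding="utf-8") as f:
        return [number for number, line in enumerate(f, start=1) if needle in line]
```

Calling `find_matching_lines("part.csv")` on the file from this issue and getting more than one line number back would suggest the header row was written more than once.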
niebayes commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760761409
The line number in the error message is the row index within a particular
record batch, not the line number in the CSV file. I have filed an issue with
arrow-rs for making this error me
chenkovsky commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2760334856
take
alamb commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2759119415
The problem can be reproduced without the tpchdbgen CLI:
Step 1: download
[part.zip](https://github.com/user-attachments/files/19492616/part.zip)
Step 2: unzip
alamb commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2759121460
I also tried disabling file scan repartitioning, and it still happens:
```sql
> set datafusion.optimizer.repartition_file_scans = false;
0 row(s) fetched.
Elapsed 0.001 s
```
niebayes commented on issue #15456:
URL: https://github.com/apache/datafusion/issues/15456#issuecomment-2758144809
This arrow-rs issue may help with debugging:
https://github.com/apache/arrow-rs/issues/7344