[ 
https://issues.apache.org/jira/browse/ARROW-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566543#comment-17566543
 ] 

Will Jones commented on ARROW-17068:
------------------------------------

My guess is that it should work if you pass {{use_legacy_dataset=False}}.
This option will become the default in 9.0.0, and we are eventually removing
the legacy dataset implementation, so we might not fix this.

If you can, it would be preferable to use the dataset writer in 
{{pyarrow.dataset}}:

{code:python}
import pyarrow.dataset as ds

# format is required when writing an in-memory Table
ds.write_dataset(table, base_dir="tests", format="parquet",
                 partitioning=["col2"],
                 file_visitor=lambda x: written_files.append(x.path))
{code}

> [Python] "pyarrow.parquet.write_to_dataset", option "file_visitor" nothing 
> happen
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-17068
>                 URL: https://issues.apache.org/jira/browse/ARROW-17068
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 8.0.0
>            Reporter: Alejandro Marco Ramos
>            Priority: Minor
>
> When I try to use the "file_visitor" callback, nothing happens.
>  
> Example:
> {code:python}
> import pyarrow as pa
> from pyarrow import parquet as pa_parquet
> table = pa.table([
>         pa.array([1, 2, 3, 4, 5]),
>         pa.array(["a", "b", "c", "d", "e"]),
>         pa.array([1.0, 2.0, 3.0, 4.0, 5.0])
>     ], names=["col1", "col2", "col3"])
> written_files = []
> pa_parquet.write_to_dataset(table, partition_cols=["col2"], 
> root_path="tests", file_visitor=lambda x: written_files.append(x.path))
> assert len(written_files) > 0  # This raises, length is 0{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
