Hi Experts,

I have been using dataset.scanner to read data with specific filter
conditions and a batch_size of 1000:

ds.scanner(filter=pc.field('a') != 3, batch_size=1000).to_batches()
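
For reference, a self-contained version of what I'm running; the path and
column name are just placeholders for my actual data:

import pyarrow.dataset as pads
import pyarrow.compute as pc

# "/data/my_table" is a placeholder for the real Parquet dataset location
ds = pads.dataset("/data/my_table", format="parquet")

# Filtered scan that yields RecordBatches of up to 1000 rows each
for batch in ds.scanner(filter=pc.field("a") != 3, batch_size=1000).to_batches():
    print(batch.num_rows)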

I would like to know if it is possible to skip a specific set of batches,
for example, skip the first 10 batches and start reading from the 11th batch.
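
To clarify what I mean: today I can get that effect by simply iterating and
throwing the first 10 batches away, as in the sketch below, but that still
scans and decodes the skipped batches, so I'm wondering whether the scanner
can skip them natively:

import itertools

import pyarrow.dataset as pads
import pyarrow.compute as pc

ds = pads.dataset("/data/my_table", format="parquet")  # placeholder path
batches = ds.scanner(filter=pc.field("a") != 3, batch_size=1000).to_batches()

# Discard the first 10 batches and start consuming from the 11th onward
for batch in itertools.islice(batches, 10, None):
    print(batch.num_rows)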

https://arrow.apache.org/docs/python/generated/pyarrow.dataset.Dataset.html#pyarrow.dataset.Dataset.scanner
Also, what is the fragment_scan_options argument of the dataset scanner, and
how do we make use of it?

I'd really appreciate any input. Thanks!

Regards,
Alex
