Hi HK,

On Mon, 7 Mar 2022 10:16:07 -0800
HK Verma <hkve...@gmail.com> wrote:
> I am integrating Arrow with another C++ library. For this, I wrote an input
> stream which feeds CSV data into the streaming reader. It fails for very
> large files with the error messages like - "CSV parser got out of sync with
> chunker".

This probably means your CSV data embeds newlines in values, so the
naive (but extremely fast) CSV chunking doesn't correspond to the
actual row boundaries as detected by the full-blown CSV parser.
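
To illustrate the mismatch, here is a minimal standalone sketch. It is
not Arrow's actual chunker or parser code; the two function names are
made up for this example. It only shows how counting every newline as
a row end can disagree with a quote-aware count.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the "naive chunker" view, where every '\n'
// ends a row regardless of quoting.
std::size_t CountRowsNaive(const std::string& data) {
  std::size_t rows = 0;
  for (char c : data) {
    if (c == '\n') ++rows;
  }
  return rows;
}

// Hypothetical helper: the "parser" view, where a '\n' inside a
// double-quoted field belongs to the value and does not end the row.
std::size_t CountRowsQuoteAware(const std::string& data) {
  std::size_t rows = 0;
  bool in_quotes = false;
  for (char c : data) {
    if (c == '"') {
      // An escaped quote ("") toggles twice, leaving the state unchanged.
      in_quotes = !in_quotes;
    } else if (c == '\n' && !in_quotes) {
      ++rows;
    }
  }
  return rows;
}
```

For the input "id,comment\n1,\"first line\nsecond line\"\n2,plain\n" the
naive count is 4 while the quote-aware count is 3, which is the kind of
disagreement the error message is reporting.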

Can you set newlines_in_values to true in your arrow::csv::ParseOptions
and try again?
https://arrow.apache.org/docs/cpp/api/formats.html#_CPPv4N5arrow3csv12ParseOptions18newlines_in_valuesE
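
A rough sketch of what that could look like with the streaming reader,
assuming a reasonably recent Arrow C++ release where
StreamingReader::Make takes an IOContext. The MakeReader wrapper is a
made-up name, and "input" stands for your custom input stream; the
other options are left at their defaults here.

```cpp
#include <arrow/csv/api.h>
#include <arrow/io/api.h>
#include <arrow/result.h>

#include <memory>
#include <utility>

// Hypothetical wrapper: builds a streaming CSV reader on top of a
// caller-provided input stream, with newlines allowed inside values.
arrow::Result<std::shared_ptr<arrow::csv::StreamingReader>> MakeReader(
    std::shared_ptr<arrow::io::InputStream> input) {
  auto read_options = arrow::csv::ReadOptions::Defaults();
  auto parse_options = arrow::csv::ParseOptions::Defaults();
  auto convert_options = arrow::csv::ConvertOptions::Defaults();

  // Tell the chunker that values may contain newlines, so that it
  // uses the slower quote-aware chunking.
  parse_options.newlines_in_values = true;

  return arrow::csv::StreamingReader::Make(
      arrow::io::default_io_context(), std::move(input), read_options,
      parse_options, convert_options);
}
```

Expect chunking to be somewhat slower with this option enabled, since
it can no longer rely on the fast newline scan alone.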

Regards

Antoine.

