Hi Joris,

Thanks a lot for pointing out where the doc sources are! I will start the
PR and share it with you so that we can work on it together. I can
document the ORC reader & writer, including their options, and you can
document the dataset integration that you worked on.

Best,
Ian

On Thursday, November 25, 2021, Joris Van den Bossche <
jorisvandenboss...@gmail.com> wrote:

> Hi Ian,
>
> Yes, more documentation regarding ORC would be very welcome! I think
> your list of missing docs is correct:
>
> - It's briefly mentioned in the Python API docs
> (https://arrow.apache.org/docs/python/api/formats.html#orc-files), but
> incomplete
> - The C++ reference docs list the OrcFileFormat for the dataset API
> (https://arrow.apache.org/docs/cpp/api/dataset.html#_CPPv4N5arrow7dataset13OrcFileFormatE),
> but not the direct ORC interface (as is done for Parquet at
> https://arrow.apache.org/docs/cpp/api/formats.html, for which the
> source lives at
> https://github.com/apache/arrow/blob/master/docs/source/cpp/api/formats.rst)
> - There is indeed no user guide. The Parquet python doc page lives at
> https://github.com/apache/arrow/blob/master/docs/source/python/parquet.rst
>
> Best,
> Joris
>
> On Wed, 24 Nov 2021 at 04:55, Ian Joiner <iajoiner...@gmail.com> wrote:
> >
> > Hi,
> >
> > Today I found that pretty much none of our ORC-related work (e.g. ORC
> > writer in C++ & Python, Arrow Dataset with ORC) has ever been documented.
> > This is something we have to fix or users won’t even be aware that ORC
> > support exists, let alone how to use it.
> >
> > From my understanding, we are missing the following docs:
> > 1. C++ and Python API reference (partially missing)
> > 2. User Guide (entirely absent)
> >
> > As the person who created and self-assigned
> > https://issues.apache.org/jira/browse/ARROW-13231, I’d like to spend the
> > next couple of days fixing it. Could you please point me towards
> > what actually needs to be revised? In particular, where is the source of
> > https://arrow.apache.org/docs/python/parquet.html ?
> > Thanks a lot!
> >
> > Ian
>
