Hello,

I can second that we should move the documentation to a central one. As someone who contributes to both C++ and Python, I always find it hard to decide where a specific piece should be documented. We have very small C++ documentation and somewhat larger Python documentation. For some features, though, it would make sense to have them in both. IPC and in-process sharing are also a central part of the Arrow project. Documenting them separately for each language would be a lot of work and would probably leave blind spots in each language.
Not everything in each language ecosystem can be directly included in Sphinx, but as Sphinx is becoming a very broadly used documentation system, there are many nice converters available, like Breathe [1] (Doxygen to Sphinx).

To directly answer the questions:

- Should we do this at all (i.e. build up a central documentation system)?

Yes.

- Should we use Sphinx for it?

Very much in favour. Some people probably tend to prefer Markdown (I do), but given the feature set of Sphinx, I would very much argue in favour of it.

- To what extent should our current docs be migrated to Sphinx (apart from the Python docs, which already use Sphinx)? For example, should the specs (currently standalone pages written in Markdown) be migrated to Sphinx for better cross-referencing and navigation? What about the C++ tutorial pages? etc.

I would definitely migrate the C++ documentation fully into it, given that the C++ / Python relation is very tight. There are a lot of topics that either touch two languages or are general to the project; these should also go in there.

- Should we preferably have a single Sphinx doctree, or several independent per-topic / per-language doctrees?

I'm not 100% sure what the definition of a "Sphinx doctree" is, but as we will have many shared topics between the different implementations, I would expect that we should have a single documentation set with well-organized sections. We will probably also face the situation where we have documentation on a specific topic and only a small part differs between two implementations/setups/... I really like the Scala/Python tabs in the Spark docs [2]. There is a Sphinx extension that seems to do something similar [3]. This could be used either for documentation on how to construct things, where one switches between Ruby and Python, or for the main case where I would need it: setting up the build with slightly different package managers (e.g. conda vs pip in Python).
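To make the single-documentation idea a bit more concrete, here is a rough sketch of what a shared `conf.py` might look like, pulling in the C++ API docs via Breathe and enabling intersphinx for cross-site references. This is only an illustration under assumptions: the project name `arrow_cpp` and the path `../cpp/apidoc/xml` are hypothetical placeholders and not the actual layout of our repository.

```python
# conf.py: minimal sketch of a single Sphinx project.
# The Breathe project name and the XML path below are hypothetical placeholders.

project = "Apache Arrow"

extensions = [
    "sphinx.ext.intersphinx",  # cross-references to other Sphinx sites
    "breathe",                 # renders Doxygen XML output inside Sphinx
]

# Breathe: map a project name to the directory holding Doxygen's XML output.
breathe_projects = {"arrow_cpp": "../cpp/apidoc/xml"}
breathe_default_project = "arrow_cpp"

# intersphinx: name -> (base URL, inventory location).
# None means "fetch objects.inv from the base URL".
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
}
```

With something like this, Doxygen would still have to be run separately with XML output enabled before the Sphinx build, so that Breathe has something to read.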
Uwe

[1]: https://breathe.readthedocs.io/en/latest/
[2]: http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
[3]: http://sphinxcontrib-contentui.readthedocs.io/en/latest/tabs.html

On Sat, May 12, 2018, at 6:03 PM, Antoine Pitrou wrote:
>
> Hi,
>
> In the following PR discussion it was mentioned that we currently lack a
> central documentation system for cross-language topics:
> https://github.com/apache/arrow/pull/1575#issuecomment-364062240
>
> Sphinx looks like a reasonable contender for that purpose. For those who
> don't know it, Sphinx is a documentation system initially developed for
> the Python language, which quickly became widely used amongst Python
> projects, and is now being used by non-Python projects as well. For
> example, the LLVM docs (https://llvm.org/docs/) and even the Linux
> kernel online docs are now written using Sphinx
> (https://www.kernel.org/doc/html/latest/index.html).
>
> Sphinx uses reStructuredText (a.k.a. "reST") as its basic markup
> language, but with many extensions. It allows for structured
> documentation with extensive cross-referencing (even between independent
> Sphinx sites, using the "intersphinx" extension).
>
> The questions here are:
>
> - Should we do this at all (i.e. build up a central documentation system)?
>
> - Should we use Sphinx for it?
>
> - To what extent should our current docs be migrated to Sphinx (apart
> from the Python docs, which already use Sphinx)? For example, should
> the specs (currently standalone pages written in Markdown) be migrated
> to Sphinx for better cross-referencing and navigation? What about the
> C++ tutorial pages? etc.
>
> - Should we preferably have a single Sphinx doctree, or several
> independent per-topic / per-language doctrees?
>
> Regards
>
> Antoine.