Hi,

> I really like the Scala/Python tabs in the Spark docs [2].

> [2]:
> http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations

Oh, I also like it.

> - Should we do this at all (i.e. build up a central documentation system)?

Yes.

> - Should we use Sphinx for it?

I'm neutral.

If we choose Sphinx or something similar, we'll need some work for
Apache Arrow C. It uses GTK-Doc as its documentation system. We'll
need to create a tool like https://github.com/pygobject/pgi-docgen .
(It's a tool for Sphinx.)

Apache Arrow C needs to keep using GTK-Doc style for API
documentation because it's also used by GObject Introspection.
GObject Introspection is very important in Apache Arrow C. For
example, the Ruby bindings need GObject Introspection support. So we
shouldn't drop GObject Introspection support.

Other documentation such as tutorials (they don't exist yet :<)
doesn't need to use GTK-Doc style.

We'll need to create a similar tool for Apache Arrow Ruby. Most of
the Apache Arrow Ruby API is generated automatically by GObject
Introspection support. We can reuse the GTK-Doc style documentation
in Apache Arrow C for Apache Arrow Ruby. We may be able to use
https://github.com/ruby-gnome2/yard-gobject-introspection for Apache
Arrow Ruby. It's not complete yet but we can improve it. (I'm one of
its developers.)


Thanks,
--
kou

In <1526586273.3156930.1375962952.57744...@webmail.messagingengine.com>
  "Re: Language-independent and cross-language docs" on Thu, 17 May 2018 21:44:33 +0200,
  "Uwe L. Korn" <uw...@xhochy.com> wrote:

> Hello,
>
> I can second that we should move the documentation to a central one. As a C++
> and Python contributor at the same time, it is always hard to think of where you
> should document a specific piece. We have very small C++ documentation and
> a bit larger Python one. For some features it would, though, make sense to have
> them in both. IPC and in-process sharing are also a main part of the Arrow
> project.
> Documenting this separately for each language will be a lot of work
> and probably leave blind spots in each language.
>
> Not everything in each language ecosystem can be directly included in Sphinx
> but as Sphinx is becoming a very broadly used documentation system, there are
> many nice converters like Breathe [1] (Doxygen to Sphinx) available.
>
> To directly answer the questions:
>
> - Should we do this at all (i.e. build up a central documentation system)?
>
> Yes
>
> - Should we use Sphinx for it?
>
> Very much in favour. There is probably also a tendency that some people
> prefer Markdown (I do) but given the feature set of Sphinx, I would very much
> argue in favour of it.
>
> - To which extent our current docs should be migrated to Sphinx (apart
> from the Python docs, which already use Sphinx)? For example, should
> the specs (currently standalone pages written in Markdown) be migrated
> to Sphinx for better cross-referencing and navigation? What about the
> C++ tutorial pages? etc.
>
> I would migrate C++ documentation definitely fully into that but the C++ /
> Python relation is very tight. There are a lot of topics that either touch
> two languages or are general to the project; these should also go in there.
>
> - Should we preferably have a single Sphinx doctree, or several
> independent per-topic / per-language doctrees?
>
> I'm not 100% sure what the definition of a "Sphinx doctree" is but as we will
> have many shared topics between the different implementations, I would
> expect that we should have a single documentation with well organized
> sections.
>
> Also we probably will face the issue that we have documentation on a specific
> topic and only a small part is different between two
> implementations/setups/... I really like the Scala/Python tabs in the Spark
> docs [2]. There is a Sphinx extension that seems to do something similar to this
> [3].
> This could either be used to have documentation on how to construct
> things where one switches between Ruby and Python or the main issue where I
> would need it: Setting up the build with slightly different package managers
> (e.g. conda vs pip in Python).
>
> Uwe
>
> [1]: https://breathe.readthedocs.io/en/latest/
> [2]:
> http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
> [3]: http://sphinxcontrib-contentui.readthedocs.io/en/latest/tabs.html
>
>
> On Sat, May 12, 2018, at 6:03 PM, Antoine Pitrou wrote:
>>
>> Hi,
>>
>> In the following PR discussion it was mentioned that we currently lack a
>> central documentation system for cross-language topics:
>> https://github.com/apache/arrow/pull/1575#issuecomment-364062240
>>
>> Sphinx looks like a reasonable contender for that purpose. For those who
>> don't know it, Sphinx is a documentation system initially developed for
>> the Python language, which quickly became widely-used amongst Python
>> projects, and is now being used by non-Python projects as well. For
>> example, the LLVM docs (https://llvm.org/docs/) and even the Linux
>> kernel online docs are now written using Sphinx
>> (https://www.kernel.org/doc/html/latest/index.html).
>>
>> Sphinx uses reStructuredText (a.k.a. "reST") as its basic markup
>> language, but with many extensions. It allows for structured
>> documentation with extensive cross-referencing (even between independent
>> Sphinx sites, using the "intersphinx" extension).
>>
>> The questions here are:
>>
>> - Should we do this at all (i.e. build up a central documentation system)?
>>
>> - Should we use Sphinx for it?
>>
>> - To which extent our current docs should be migrated to Sphinx (apart
>> from the Python docs, which already use Sphinx)? For example, should
>> the specs (currently standalone pages written in Markdown) be migrated
>> to Sphinx for better cross-referencing and navigation? What about the
>> C++ tutorial pages? etc.
>>
>> - Should we preferably have a single Sphinx doctree, or several
>> independent per-topic / per-language doctrees?
>>
>> Regards
>>
>> Antoine.