Among other things, the columnar format specification files should probably make their way into this new documentation project.
On Mon, May 21, 2018 at 5:19 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> I don't think we should attempt to create a documentation "super
> project" that includes the generated API reference for all the
> libraries in Apache Arrow. I do think that creating a documentation
> "hub" project (with the low-level API docs being the "spokes") is a
> good idea. Currently, the Jekyll project website serves as a very
> crude hub. It would be better to build something more suited for
> writing developer documentation.
>
> So in other words, the subprojects would continue to generate API docs
> using the current tools (Javadoc, GTK-Doc, Doxygen, Sphinx, etc.) but
> the objective of the "top level docs" is to make the entire project
> easier to navigate than it is now.
>
> On Sun, May 20, 2018 at 3:15 AM, Kouhei Sutou <k...@clear-code.com> wrote:
>> Hi,
>>
>>> I really like the Scala/Python tabs in the Spark docs [2].
>>
>>> [2]:
>>> http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
>>
>> Oh, I also like it.
>>
>>> > - Should we do this at all (i.e. build up a central documentation system)?
>>
>> Yes.
>>
>>> > - Should we use Sphinx for it?
>>
>> I'm neutral.
>>
>> If we choose Sphinx or something, we need some work for
>> Apache Arrow C. It uses GTK-Doc as its documentation
>> system. We'll need to create a tool like
>> https://github.com/pygobject/pgi-docgen . (It's a tool for
>> Sphinx.)
>>
>> Apache Arrow C needs to keep using GTK-Doc style for API
>> documentation, because it's also used by GObject
>> Introspection. GObject Introspection is very important in
>> Apache Arrow C. For example, the Ruby bindings need GObject
>> Introspection support. So we shouldn't drop GObject
>> Introspection support.
>>
>> Other documentation such as tutorials (they don't exist
>> yet :<) doesn't need to use GTK-Doc style.
>>
>>
>> We'll need to create a similar tool for Apache Arrow Ruby.
>> Most of the API of Apache Arrow Ruby is generated
>> automatically by GObject Introspection support. We can reuse
>> GTK-Doc style documentation in Apache Arrow C for Apache
>> Arrow Ruby.
>>
>> We may be able to use
>> https://github.com/ruby-gnome2/yard-gobject-introspection
>> for Apache Arrow Ruby. It's not completed yet but we can
>> improve it. (I'm one of the developers of it.)
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In <1526586273.3156930.1375962952.57744...@webmail.messagingengine.com>
>> "Re: Language-independent and cross-language docs" on Thu, 17 May 2018
>> 21:44:33 +0200,
>> "Uwe L. Korn" <uw...@xhochy.com> wrote:
>>
>>> Hello,
>>>
>>> I can second that we should move the documentation to a central one. As a
>>> C++ and Python contributor at the same time it is always hard to think of
>>> where you should document a specific piece. We have a very small C++
>>> documentation and a bit larger Python one. For some features it would
>>> though make sense to have them in both. IPC and in-process sharing is also
>>> a main part of the Arrow project. Documenting this separately for each
>>> language will be a lot of work and probably leave blind spots in each
>>> language.
>>>
>>> Not everything in each language ecosystem can be directly included in
>>> Sphinx but as Sphinx is becoming a very broadly used documentation system,
>>> there are many nice converters like Breathe [1] (Doxygen to Sphinx)
>>> available.
>>>
>>> To directly answer the questions:
>>>
>>> - Should we do this at all (i.e. build up a central documentation system)?
>>>
>>> Yes
>>>
>>> - Should we use Sphinx for it?
>>>
>>> Very much in favour. There is probably also a tendency that some people
>>> prefer Markdown (I do) but given the feature set of Sphinx, I would very
>>> much argue in favour of it.
>>>
>>> - To what extent should our current docs be migrated to Sphinx (apart
>>> from the Python docs, which already use Sphinx)?
>>> For example, should
>>> the specs (currently standalone pages written in Markdown) be migrated
>>> to Sphinx for better cross-referencing and navigation? What about the
>>> C++ tutorial pages? etc.
>>>
>>> I would definitely migrate the C++ documentation fully into that but the
>>> C++ / Python relation is very tight. There are a lot of topics that either
>>> touch two languages or are general to the project; these should also go in
>>> there.
>>>
>>> - Should we preferably have a single Sphinx doctree, or several
>>> independent per-topic / per-language doctrees?
>>>
>>> I'm not 100% sure what the definition of a "Sphinx doctree" is but as we
>>> will have many shared topics between the different implementations, I
>>> would expect that we should have a single documentation with well organized
>>> sections.
>>>
>>> Also we probably will face the issue that we have documentation on a
>>> specific topic and only a small part is different between two
>>> implementations/setups/... I really like the Scala/Python tabs in the Spark
>>> docs [2]. There is a Sphinx extension that seems to do something similar to
>>> this [3]. This could either be used to have documentation on how to
>>> construct things where one switches between Ruby and Python or the main
>>> issue where I would need it: Setting up the build with slightly different
>>> package managers (e.g. conda vs pip in Python).
>>>
>>> Uwe
>>>
>>> [1]: https://breathe.readthedocs.io/en/latest/
>>> [2]:
>>> http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
>>> [3]: http://sphinxcontrib-contentui.readthedocs.io/en/latest/tabs.html
>>>
>>>
>>> On Sat, May 12, 2018, at 6:03 PM, Antoine Pitrou wrote:
>>>>
>>>> Hi,
>>>>
>>>> In the following PR discussion it was mentioned that we currently lack a
>>>> central documentation system for cross-language topics:
>>>> https://github.com/apache/arrow/pull/1575#issuecomment-364062240
>>>>
>>>> Sphinx looks like a reasonable contender for that purpose.
>>>> For those who
>>>> don't know it, Sphinx is a documentation system initially developed for
>>>> the Python language, which quickly became widely used amongst Python
>>>> projects, and is now being used by non-Python projects as well. For
>>>> example, the LLVM docs (https://llvm.org/docs/) and even the Linux
>>>> kernel online docs are now written using Sphinx
>>>> (https://www.kernel.org/doc/html/latest/index.html).
>>>>
>>>> Sphinx uses reStructuredText (a.k.a. "reST") as its basic markup
>>>> language, but with many extensions. It allows for structured
>>>> documentation with extensive cross-referencing (even between independent
>>>> Sphinx sites, using the "intersphinx" extension).
>>>>
>>>> The questions here are:
>>>>
>>>> - Should we do this at all (i.e. build up a central documentation system)?
>>>>
>>>> - Should we use Sphinx for it?
>>>>
>>>> - To what extent should our current docs be migrated to Sphinx (apart
>>>> from the Python docs, which already use Sphinx)? For example, should
>>>> the specs (currently standalone pages written in Markdown) be migrated
>>>> to Sphinx for better cross-referencing and navigation? What about the
>>>> C++ tutorial pages? etc.
>>>>
>>>> - Should we preferably have a single Sphinx doctree, or several
>>>> independent per-topic / per-language doctrees?
>>>>
>>>> Regards
>>>>
>>>> Antoine.
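
For reference, the cross-site linking that the original proposal attributes to the "intersphinx" extension could be set up roughly as below. This is only a minimal sketch of a top-level conf.py, not something proposed in the thread: the project name and the second URL are placeholders assumed for illustration, and only the extension name and the shape of `intersphinx_mapping` come from Sphinx itself.

```python
# conf.py: a hypothetical top-level Sphinx configuration (illustrative only).

# Project name shown in the generated pages (assumed for this sketch).
project = "Apache Arrow"

# Enable cross-references into other, independently built Sphinx sites.
extensions = [
    "sphinx.ext.intersphinx",
]

# Map a short label to the base URL of another Sphinx site. The second
# tuple element is None, which tells Sphinx to fetch the inventory file
# (objects.inv) from that base URL.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    # Placeholder URL standing in for a separately built per-language site.
    "pyarrow": ("https://example.org/arrow/python", None),
}
```

With a mapping like this, a role such as :py:class:`dict` that has no local target would be looked up in the Python docs inventory, so a "hub" doctree could link into per-language "spoke" sites without containing their API reference.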