=begin INTRO

Mongers,

I must say, I am a bit disappointed that the discussion about the future of documentation in Perl has died. Or was everyone fully occupied by YAPC::NA? I spent last week with my family on a stormy island, without sufficient internet access, so I was unable to stir things up again, but maybe this email will bring the focus back.

Damian challenged me by asking how I think Perl6's documentation should be done. When I think about documentation, I do not (immediately) think about some mark-up language; that is just a minor component. My focus is on the whole process: from writer to reader. Although Damian says he has studied OODoc, the most important features of that system were ignored: simplifying the documentation process and improving the documentation quality.

Each time I re-read the list below, there are things I wish to change or add: it is neither complete nor final. But I cannot wait longer to post it. It's not my wish to extend this into a detailed requirements document, just to set a focus for discussion.

Maybe someone wants to comment on it?

MarkOv

=end INTRO

========
Documentation of code (i.e. Perl6)

In this text, we try to determine the environment for the optimal documentation system for Perl6, although it applies to any other programming language as well. Everyone will weigh the different aspects differently, and you may even totally disagree with some of the listed remarks: it is open to discussion. Some of the wishes contradict one another as well.

=== Goal

The sole goal: the best documentation for Perl6

It is very important to keep in mind that documentation is made to be read by some target community. Write-only texts are useless.

=== Target communities

Documentation is added to code to give some target community additional information about that code. There can be different target communities for the same piece of code, and all of them should be served as well as possible.

Traditionally, you see
  1) code comments, for maintainers of the software
  2) manual-pages, for everyone else

But more specific user groups can be defined:
  1) code comments, for maintainers
  2) developer manuals (the complete interface)
  3) user manuals (distribution external interfaces)
  4) selective look-up (for perldoc -f or IDE)

=== Fundamentals

There are a few fundamentals for good documentation:
  - it must be written
  - it must be correct
  - it must be consistent in structure
  - it must be consistent in content
  - it must be accessible (easy to find, pleasant to read)

[Writing] The best way to get people to write documentation is to make it as simple as possible. This means:
  - reduce the need to read man-pages or books to be able to create it,
  - reduce the amount of text to be typed,
  - avoid the need for additional tools to be installed,
  - reduce the need for configuration.
All for the sake of laziness. The less time people need to manage their documentation environment, the more time they have to write quality texts.

[Writing] To ease the burden of writing docs, documentation generating tools should use as much information from the code as is useful for the target community. Replication between code and docs makes every change a double effort. Manual replication of text between files (like the inclusion of the license text in each file) during programming is an avoidable burden.

[Writing] Each documentation fragment belongs to some part of the implementation. This may be a distribution, a file, a class or grammar or package, a method, rule or sub, a positional or named parameter, and so on. This relation either comes naturally (because code and doc are mixed) or is enforced (via some reference syntax).

[Writing] The documentation fragments need some markup. Many mark-up languages exist, which have more or less the same features. Two of those are POD and PDD S26. Within one distribution, it is useful to use the same kind of markup syntax.
Document generators should only get a minimal abstract interface to collect the results of the markup parser, for instance a
    $markup->produceHtml($fragment, ...)

[Writing] The markup language used should be capable of addressing the things that a document writer wishes to express, and should not be limited to what a certain output back-end can handle (back-ends can always ignore things they cannot handle).

[Correctness] The documentation and the related code must be cross-checkable where they overlap. It is better to avoid replication, in which case there is no overlap to be checked.

[Correctness] The user should be stimulated to write in a good style. One of the ways to achieve this is to avoid the need to write the same sentence over and over again. For instance: "This method returns a boolean, to indicate success" is a sentence to avoid. (Template based) auto-generation could be used to introduce abbreviations for often used constructs.

[Correctness] Produced manuals should by default be checked for completeness (like Pod::Coverage), for correctness in syntax (like Pod::Checker), and for the validity of the references they use. If possible, spell-checking (like Pod::Spell) should be invoked automatically.

[Structure] There is a set of components we will always find in (UNIX) manual pages: the one-line purpose (name), the synopsis, extended description, the subs and methods, the "see also", authors, and license. The order and location of these documentation fragments, and their exact names, are arbitrary. Only the back-end can decide how, whether, and in which order these components appear.

[Structure] The doc-writers should have general information about which documentation components are minimally needed by the back-ends, for instance the name and the license. A short-list of chapter names suffices.

[Structure] The documentation generating back-ends shall have the same idea about the structure and meaning of the contributed documentation.
The back-ends only generate end-user texts, without any need for interpretation of the doc fragments. Only this way can systems like search.cpan.org be of value.

[Content] On the documentation writer's side, there usually is a serious problem: writers do not know enough about the readers: their level of education, their actual interests, and the media they use to read the documentation. This results in inconsistent documentation between distributions. For instance, some people put internal interfaces into the manual-pages, whereas others do not want to bother the readers and hence include them as comments. In Perl6, we have scoped subs and private methods, so it is much clearer whether a component is available to everyone or not. We may produce different manual pages for the different audiences.

[Content] If the markup language permits the author to specify commands which change the back-end's output (in the anarchistic tradition of Perl), thereby endangering the consistency in output style or frustrating the automatic processing of the content by other back-ends, then there must be a simple way for the back-ends to protect themselves. There should be a standard way to remove this cruft from the documentation fragments.

[Accessible] The back-end, which produces the document the user will read, can produce traditional UNIX manual-pages, HTML web-pages, a printed book, whatever. Of course, you want to produce documents which make the best possible use of what a certain output medium offers. POD(5) can be used to produce web-pages, but its features to link between document elements are far below the level we are used to for HTML web-pages. It should be very simple to create anchors and references to very specific locations in the text, like a single option description. Preferably this works without the need to define destination anchor points: the documentation you point to can be in a different package, not under your control.
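
As an illustration of references that need no hand-written destination anchors, here is a minimal sketch in Python (used only for illustration; nothing here is part of POD, S26 or OODoc). It assumes that every documented item can be named by its place in the implementation (package, then method, then option), so that both the page that defines an item and the page that refers to it can compute the same anchor independently. The names anchor_for and link_to, the package Some::Package and the file name are all hypothetical.

```python
import re


def anchor_for(*path):
    """Derive an anchor id from an implementation path.

    Example path: ("Some::Package", "new", "folder"), meaning
    package, method, named option. Anything that is not a letter
    or digit is folded into "-". Distinct paths could collide
    after this folding; a real tool would have to guard against that.
    """
    parts = [re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-") for part in path]
    return "_".join(parts)


def link_to(label, *path, page=""):
    """Render an HTML link to the anchor computed for `path`.

    `page` is the (hypothetical) file that documents the package;
    the author of that page never had to declare the anchor.
    """
    return f'<a href="{page}#{anchor_for(*path)}">{label}</a>'


print(link_to("the folder option", "Some::Package", "new", "folder",
              page="Some-Package.html"))
```

Because the anchor is computed from the code structure, a writer in another distribution can point at a single option description without asking the maintainer of the target package to add anything.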

[Accessible] Documentation fragments are usually written in coding order. The programmer's activities are often quite chaotic: functions and methods are written in the order in which the programmer needs them. Related components are often close together in the file, but the order of the manual plays no role during this development process. What is the optimal order for the user to consume the fragments? A very workable solution is to group the items (for instance, constructors, accessors, ...) into text sections, and within those groups use alphabetic sorting. Each section may need some introductory text and examples. Making doc fragments groupable is a requirement; the back-ends will work out how that grouping is used.

=== Generating documentation

Schematically, the documentation generation process could look something like this:

  for each file in the MANIFEST
      parse Perl into its AST
      extract doc and code info from AST into doc-tree

  for each doc-fragment in doc-tree
      do spell-check

  perform consistency, structural checks on doc-tree
  collect inheritance information into doc-tree

  for each package, class, grammar, pod in doc-tree
      call generator back-end(s)
      syntax check produced man-pages
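
To make the later steps of this scheme a bit more concrete, here is a minimal sketch in Python (chosen only for illustration). It assumes that the parsing and extraction steps have already produced a flat list of doc fragments, so it covers only checking, grouping and calling a back-end; spell-checking and inheritance are left out. The Fragment record, the function names and the trivial plain-text back-end are all hypothetical and are not taken from any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Fragment:
    package: str  # package or class the fragment belongs to
    name: str     # sub or method being documented
    group: str    # section name, e.g. "Constructors" or "Accessors"
    text: str     # documentation text, markup already processed


def check(fragments):
    """Stand-in for the consistency checks: return a list of problems."""
    problems, seen = [], set()
    for frag in fragments:
        key = (frag.package, frag.name)
        if not frag.text.strip():
            problems.append(f"{frag.package}::{frag.name}: empty documentation")
        if key in seen:
            problems.append(f"{frag.package}::{frag.name}: documented twice")
        seen.add(key)
    return problems


def build_doc_tree(fragments):
    """Group fragments per package and section, alphabetic within a section."""
    tree = defaultdict(lambda: defaultdict(list))
    for frag in fragments:
        tree[frag.package][frag.group].append(frag)
    for groups in tree.values():
        for items in groups.values():
            items.sort(key=lambda frag: frag.name)
    return tree


def render_text(package, groups):
    """A trivial plain-text back-end; it alone decides the section order."""
    lines = [package]
    for group in sorted(groups):
        lines.append(f"  {group}")
        lines.extend(f"    {frag.name}: {frag.text}" for frag in groups[group])
    return "\n".join(lines)


def generate(fragments, backend=render_text):
    """Check the fragments, build the doc-tree, call the back-end per package."""
    problems = check(fragments)
    if problems:
        raise ValueError("; ".join(problems))
    tree = build_doc_tree(fragments)
    return {package: backend(package, groups) for package, groups in tree.items()}


fragments = [
    Fragment("Shape", "width", "Accessors", "Returns the width."),
    Fragment("Shape", "new", "Constructors", "Creates a shape."),
    Fragment("Shape", "height", "Accessors", "Returns the height."),
]
print(generate(fragments)["Shape"])
```

The fragments are given in coding order, yet the output lists them per section and alphabetically within each section. The back-end receives only the grouped doc-tree, so a different back-end could order or lay out the sections differently without touching the fragments.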