I read Simon Parent's thesis, "How Programmers Comment When They Think Nobody's Watching". Simon is analyzing comments in source files.
Simon quotes two other sources about comments to try to find a classification scheme. I've quoted the summaries Simon quoted from those sources [1] and [2]. I've included a summary of Simon's criteria [3] as well as the "extreme" criteria I outlined in a previous post [4]. I've also included Val's original post [0], which laid out the original criteria. Perhaps these can be taken as starting points for rational dialog.

==========================================
Val[0]

I've been thinking for a while that the Clojure community could benefit a lot from a more sophisticated and ergonomic documentation system. I have seen some existing plugins like lein-sphinx, but I think it would be really good to have documentation that would be written in Clojure, for the following reasons:

* We're all very fond of Clojure data structures and their syntax. (I don't know about you, but I find that even HTML looks better in Clojure than in HTML.) Plus, Clojure programmers already know how to edit them.
* (Better reason) The facts that Vars are first-class citizens and that symbols can be referred to explicitly with hardly any ceremony (macros) are an exceptional opportunity to make smart and highly structured documentation very easily.
* If it's in Clojure, Clojure programmers can seamlessly build ad hoc documentation functionality on top of it to suit their own particular needs.

I haven't found anything of the like yet, and if it exists, I would be grateful if someone would point me to it.

Here are my thoughts on this:

1. Clojure doc-strings, although they are quite handy as reminders and for indexing documentation, are too raw as content. Even when they are done right, they tend to be cumbersome, and it's too bad to have such concise code drown in the middle of so much documentation. What's more, I believe that when programmers write a function (or anything), they tend to think more about the implementation than about usage by someone uninformed, so they have little incentive to get the documentation right.

2. Building on 1, having a system where documentation and programs live in separate files, in the same way as tests, would enforce a healthy separation of concerns. Importantly, it would make life much easier from the version control perspective.

3. Documentation should probably be made differently from what people have become accustomed to in classical languages. Because you seldom find types, and because IMHO Clojure programs are formed more by factoring out recurring mechanisms in code than by implementing intellectual abstractions, the relevant concepts tend not to be obvious in the code. Since in Clojure we program with verbs, not nouns, I think documentation is best made by example. Documentation of a Var should not be a formal description of what it is and what it does with some cryptically named variables. Every bit of documentation should be a micro-tutorial. Emphasis should be put on usage, examples, tips, pitfalls, and howtos.

4. There should be structure in the documentation, and it shouldn't be just :see-also links; there should be semantics in it. For example, some functions/macros are really meant to be nothing but shorthands for calling other functions: that kind of relationship should be explicitly documented. Documentation should not be just information about each separate Var in a namespace. There should be a hierarchy to make the most useful elements of an API more obvious. Also, adding cross-Var documentation elements such as tags and topics could make it easier to navigate and understand.

5. Documentation in the REPL is great; it was one of the very good surprises when I started learning Clojure. However, a rich and good-looking presentation like in Javadocs would be welcome too.

Of course, all of the above are just vague principles. Here is some functionality I suggest for a start:

* Documentation content elements could be written in a Clojure DSL emulating some kind of docbook-like markup language.
* On the user side, the documentation would be accessible through a generated web interface, a REPL interface, and maybe other formats like Wiki.
* Documentation could be programmed anywhere in a project by simply referring to the relevant Vars and calling the documentation API. Ideally, there would be a dedicated folder for documentation files, and a Leiningen plugin to compile them and generate the HTML from them.
* I often find myself lost because I have no idea what shape some arguments to a function should have, such as config maps and maps representing application-specific models. To address this, I propose to explicitly declare and describe "stereotypes" in the documentation. Such stereotypes could be, for instance, "JDBC connection" or "Ring middleware". From what I have seen, some good work has already been done in that direction, but it would be good to make room for it in documentation.
* Weigh the documentation contents by importance, to allow for displaying the documentation with several levels of detail.
* Cross-Var, semantic documentation with topics, tags, and links. Topics would group several API elements together to explain a technique or concept; they could have a :prerequisite relationship to help the reader navigate them. I imagine tags giving hints on various aspects of a Var, such as :curried for a function, or :utility, or :use-with-caution, etc. Links could be such things as the famous :see-also, but could also represent more precise relationships, such as :calls-to, :often-used-with, :similar-to, etc.
* In addition to small, Var-specific, self-contained code samples, there could be larger examples (e.g. sample applications), and pointers from the documentation to specific points in these examples.
* There could be other types of documentation than just static description, such as exercises, koans, quizzes, etc.
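To make the importance weighting, tags, and typed links in Val's list a little more concrete, here is a minimal sketch of one possible data model. It is written in Python purely for illustration; Val's proposal is for Clojure data, and nothing here is an existing API. The names VarDoc, at_detail_level, and related, and the my.lib Vars, are all made up for this note.

```python
from dataclasses import dataclass, field


@dataclass
class VarDoc:
    """Documentation for one Var, kept apart from the source code (hypothetical model)."""
    var: str                                     # qualified name, e.g. "my.lib/query"
    summary: str
    importance: int = 1                          # higher values survive coarser views
    tags: set = field(default_factory=set)       # e.g. {"utility", "use-with-caution"}
    links: dict = field(default_factory=dict)    # relation name -> list of Var names
    examples: list = field(default_factory=list)


def at_detail_level(docs, min_importance):
    """Keep only the entries important enough for the requested level of detail."""
    return [d for d in docs if d.importance >= min_importance]


def related(docs, var, relation):
    """Follow one typed link (such as "calls-to") from a Var to its documented targets."""
    by_name = {d.var: d for d in docs}
    targets = by_name[var].links.get(relation, [])
    return [by_name[t] for t in targets if t in by_name]


# Made-up entries: "my.lib/q" is documented as a shorthand that calls "my.lib/query".
docs = [
    VarDoc("my.lib/query", "Run a query against a JDBC connection.", importance=3,
           tags={"use-with-caution"},
           links={"often-used-with": ["my.lib/connect"]},
           examples=['(query conn "select 1")']),
    VarDoc("my.lib/connect", "Open a connection from a config map.", importance=3),
    VarDoc("my.lib/q", "Shorthand for query.", importance=1,
           tags={"utility"}, links={"calls-to": ["my.lib/query"]}),
]

overview = at_detail_level(docs, 2)          # coarse view: only the important Vars
callee = related(docs, "my.lib/q", "calls-to")
```

The point of the sketch is only that once the documentation is plain data, views such as "show me the important Vars" or "what does this shorthand call" become small functions over that data.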
==========================================
[3] p24:

"McConnell[1] has a classification scheme that is normative; it is designed for writing code, and specifically for deciding what kinds of comments should be written. These categories are about the value of comments, and McConnell presents them from worst to best, excluding the last category which is a catch-all. Indeed, McConnell says that only summary, intent, and the last category are acceptable in completed code.

* Repeat of the code: States what the code does in different words. Just more to read
* Explanation of the code: Explains complicated, tricky, or sensitive code. Make the code clearer instead
* Marker in the code: Identifies unfinished work. Not intended to be left in the completed code
* Summary of the code: Distills a block of code into one or two sentences. Such comments are useful for quick scanning
* Description of the code's intent: Explains the purpose of a section of code, more at the level of the problem than at the level of the solution
* Information that cannot possibly be expressed by the code itself: Copyright notices, confidentiality notices, pointers to external documentation, etc."

==========================================
[3] p24:

"Baecker and Marcus[2] are concerned with typesetting programs, and recognize that different kinds of comments deserve to be formatted differently. This is their motivation for a preliminary taxonomy of comments. In this taxonomy they provide a list of communication objectives; their definitions of these communication objectives are summarized"

* Identification: Calls attention to the existence of a section of code
* Emphasis: Calls attention to some aspect of additional significance about the code.
* Description: Makes explicit intuitable attributes of the code
* Explanation: Clarifies some aspects of the code
* Amusement: Secondary text to help the reader through long or difficult code (jokes, anecdotes, epigrams, illustrations, etc.)
* Summary and Review: Reflects upon the reader's progress
* Announcements or Warnings: Informs of recent changes, or provides cautionary remarks
* Testing, Gaming, or Simulation: Quizzes may be useful for long code documents to test the reader's understanding.
* Measurement or Indexing: Metrics of the code which may be useful.
* Analogies, Metaphors, Parables: Aids in understanding otherwise impenetrable concepts
* Informal Remarks: Spontaneous graffiti from past programmers

==========================================
[3] p27:

Simon Parent[3] tries to classify existing comments in projects from a class at the University of Waterloo.

* Execution Narrative: Comments that describe the execution of the program largely fall into two types: those which describe the current state of the program, and those which describe actions
* Clarification: Help the reader understand the meaning of a tricky piece of code. These comments explain an aspect of the code that is particular to its form.
* Data Definition: Comments which refer to a data definition. The typical example is a comment which elaborates a variable name.
* Sectioning: These comments divide the code into logical units.
* Development Narrative: These comments describe the development of the program's source code. A typical example is a reminder that work is unfinished. This is where the programmers criticize the code, give advice to those who will follow, and express their wishes for the future.
* Prologue: These comments give introductory remarks before a major section of code. A typical example can include the function's purpose, return value, constraints on input, or even implementation details.
* Unclassified

==========================================
Tim Daly[4] tries to find documentation criteria that include in-file comments as well as higher level organization of needed information, representing "the extreme" case.

Consider Clojure's primary data structure implementation. It is basically an immutable, log32, red-black tree. For some people that is more than sufficient, especially if they have been working in the code base for years. For others, especially a developer new to the project, there is a lot to know. Without this information it is very difficult to contribute.

A new developer needs an introduction to the IDEA of immutable data structures with a reference to Okasaki's thesis which is online at http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf (a bibliography).

A new developer needs to know that the DESIGN of Clojure relies on these immutable data structures so they don't introduce any "quick and efficient" hacks that violate the design. (a Clojure overview)

A new developer needs to know WHAT a red-black tree is, WHY it was chosen, and HOW Clojure maps user-visible data structures to them. (the chapter on this particular data structure)

A new developer needs to know the IMPLICATIONS of the choice of log32 since it defines the efficiency. (the design constraints section and the algorithmic analysis section)

A new developer needs to know HOW to update a log32 immutable red-black tree. (a pseudocode explanation with pictures)

A new developer needs to know HOW the log32 red-black tree is implemented. It is not immediately obvious how the 5-bit chunks are mapped into a 32 bit word. If the task was to re-implement it on a 64 bit word they'd have to know the details to understand the code. (the actual code with explanations of the variables)

If the new developer's task is to modify the code for a 64 bit architecture they would need a way to find the code (the table of contents) and places where this information is mentioned (an index).
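As a rough illustration of the 5-bit chunk mapping mentioned above, here is a minimal Python sketch. It assumes a plain 32-way tree indexed by an integer and shows only the index arithmetic; it does not model balancing or immutability, and it is not taken from the Clojure source. The names BITS, chunks, lookup, and full_chunks are invented for this note.

```python
BITS = 5              # bits of the index consumed at each tree level
WIDTH = 1 << BITS     # 32 children per node
MASK = WIDTH - 1      # 0b11111, selects one 5-bit chunk


def chunks(index, depth):
    """Split index into `depth` child slots of 5 bits each, root level first."""
    return [(index >> (BITS * level)) & MASK for level in reversed(range(depth))]


def lookup(root, index, depth):
    """Walk nested 32-wide lists from the root down, one 5-bit chunk per level."""
    node = root
    for slot in chunks(index, depth):
        node = node[slot]
    return node


def full_chunks(word_bits):
    """Number of whole 5-bit chunks that fit in a word of the given size."""
    return word_bits // BITS


# A two-level tree holding the numbers 0..1023, so each leaf value equals its index.
tree = [[i * WIDTH + j for j in range(WIDTH)] for i in range(WIDTH)]
path = chunks(1000, 2)          # 1000 = 31 * 32 + 8
value = lookup(tree, 1000, 2)
```

With 5 bits per level, a 32-bit word has room for six whole chunks and a 64-bit word for twelve. That is the kind of detail someone porting the code to a 64 bit word would want to find written down.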
All of the places where this information is written need to be properly updated.

Even if we focus strictly on what a new developer needs to know we end up with something that smells a lot like a book. From the above we see the need for

1) a bibliography
2) a Clojure overview
3) a chapter focused on this data structure
4) sections on design constraints and algorithmic analysis
5) a section of pseudocode with pictures
6) a section with code and details of the actual implementation
7) a table of contents
8) an index

[0] Val Waeselynck clojure@googlegroups.com Wed, 30 Apr 2014 16:08:33 -0700 (PDT)
[1] Steve McConnell "Code Complete, Second Edition" Microsoft Press, Redmond, WA, USA, 2004
[2] Ronald M. Baecker and Aaron Marcus "Human factors and typography for more readable programs" ACM, New York, NY, USA, 1989
[3] Simon Benjamin Orion Parent "How Programmers Comment When They Think Nobody's Watching" Master's Thesis, University of Waterloo, Waterloo, Ontario, Canada 2014
[4] Tim Daly clojure@googlegroups.com Wed Apr 30 03:09:05 2014

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en