Hi, I am very interested in completing this project. I applied to GSoC last time with this project and was rejected (there were better students, though). I would be happy to take it on this time. Let me know your thoughts.
On 3/7/13, Kevin Horn <kevin.h...@gmail.com> wrote:
> Sorry it's taken me so long to get back to this. But it's gotten to be a
> Looong email.
>
> On Sat, Mar 2, 2013 at 3:14 AM, Glyph <gl...@twistedmatrix.com> wrote:
>
>>
>> On Mar 1, 2013, at 9:35 PM, Kevin Horn <kevin.h...@gmail.com> wrote:
>>
>> That "never-ending" series of Lore source fixes took place over the course
>> of a couple of weeks. Doing things that way was not my idea, though it
>> seemed reasonable at the time because the idea was that we would do the
>> cutover at the end of it.
>>
>>
>> Well, let's go to the video tape. Based on this comment -
>> <http://twistedmatrix.com/trac/ticket/4500#comment:12> - these tickets
>> were closed over a period ranging from 2010/07 to 2011/03. 6 months isn't
>> quite "weeks", but okay I guess it wasn't "never-ending" either :).
>>
>>
> Hmmm. I recall it as being much shorter. Probably most of the work took
> place in two "spurts" around the beginning and end of that time, and that's
> why I remember it that way. But I'm not interested in digging through a
> bunch of old dates to find out for sure.
>
>
>> (As an aside, lore2sphinx is in no way a "broken pile of regexes". Not to
>> say that it isn't broken in some really significant ways, because it is,
>> but it doesn't use regexes at all. Just sayin'.)
>>
>>
>> Actually yeah, "regex" is just a curse-word here :). It's the emitter I'm
>> complaining about, anyway, not the parser, so deriding it as a "regex" is
>> in no way accurate.
>>
>
> I figured that was the case, I just wanted to say something so others
> reading this didn't get the wrong impression about how lore2sphinx is
> implemented. I mean it's not code I'm very proud of, but it's not _that_
> bad :)
>
>
> <<< snip a bunch of stuff about who said what when, why I thought what I
> thought, etc. >>>
>
> It boils down to the fact that a bunch of the conversations happened either
> in person or on IRC.
> This was mostly because I was in a hurry at the time,
> usually because I wanted to do something before additions were made to the
> documentation, which was in a somewhat "known" state (as in I knew how it
> was going to behave when run through lore2sphinx) at the time.
>
>> Also, please elaborate on what you mean by "do *everything* in one big
>> bang". My intention was never to do anything but get the SphinxBuilder
>> working on that branch. Was there something else you thought I was doing?
>> Was there something else I should (or should not) have been doing?
>>
>>
>> My reasoning goes like this: the ticket for the release tools is still not
>> in review, so you must be waiting for something to re-submit it. It looks
>> like you responded to the code, so the only thing I could think you were
>> still waiting for would be for the lore sources themselves to be ready.
>>
>>
> It's been long enough that I can't fully recall my reasoning on this. But
> _probably_ I decided that if I finished the release tools ticket, someone
> might use it. Which would be great, except that I think I had decided that
> before that actually happened I needed to figure out a way to emit nicer
> output from lore2sphinx. So I left it alone until I had figured out how to
> do that.
>
> At least, that _might_ have been part of my thought process. It really was
> ages ago.
>
>> [the fixed-up Lore sources] got left alone because of the release tools
>> hangup. Ideally the release tools would have been done before the whole
>> lore-source-tweaking process, but they weren't. I'll admit my frustration
>> played a part in this, but so did the deafening silence I got when I asked
>> for anyone to comment on the ticket.
>>
>>
>> Where and how did you ask people to comment on the ticket? I don't recall
>> being asked, and I tend to be pretty good about leaving prompts like that
>> in my inbox until I've done what was asked.
>> (Not *perfect*, of course, and
>> if you asked a list then there might have been some bystander effect.) It
>> seems like we might have avoided this whole mess if you had just attached
>> the 'review' keyword :).
>>
>
> On IRC.
>
>
>>
>> My perception has been that I would say "what do we need to do to make
>> this happen"? There would be some hemming and hawing (and at least several
>> times long discussions about how documentation didn't really fit the
>> regular UQDS process) and a sort of plan would be invented. I would
>> proceed according to the plan as I understood it. I would then say "OK,
>> we're ready"! And then be told that some other thing not in the plan
>> needed to be done. The cycle would then repeat.
>>
>>
>> The only "cycle" I can either see on the tickets or recall here is where
>> the release tools didn't come in to the initial plan.
>>
>
> This was the latest of several (3 or 4) according to my
> recollection/perception. It doesn't really matter now.
>
>
>> No [the need for release automation] was not brought up until well into
>> the process. I (sort of) understand the desire for this, but it seems
>> pretty weird to be building what is essentially a wrapper for an existing
>> tool, along with tests for said wrapper,
>>
>>
>> OK. I can believe that this did not happen. One problem is that we (the
>> inner-circle old-school Twisted developers) tend to engage in conversations
>> about how a thing might be done while at the same time we discuss what must
>> be done. And we also tend to discuss what policy is (or what all or some
>> of us believe it *ought to be* in some case, further confusing the issue)
>> without making explicit what the *purpose* of that requirement is.
>>
>> I would ask the community to help us with this by doing a couple of
>> things.
>>
>> If somebody says "X is policy", always ask for a link to it. If there is
>> a link, it'll help you understand it better.
>> If there *isn't* a link,
>> then the authority telling you it's "policy" might just be remembering that
>> it's the way we've done things since forever and of course it's a good
>> idea. There are definitely things that I have thought were in the coding
>> standard that are not actually written down anywhere, on more than one
>> occasion.
>>
>> If a meandering discussion is happening - here, on the mailing list, on
>> the ticket - never be afraid to break it up and separate out the different
>> concerns which are being discussed: what is necessary for compliance with
>> our development process, what would be a good idea from a design point of
>> view, how the work might be broken up to get through review more
>> manageably, what other concerns are in play.
>>
>> Especially, if you ever see a code review where a reviewer says "I
>> think..." without making it clear what you should *do*, you should always
>> ask, 'is this a requirement of the review or just some thoughts you
>> have'.
>>
>>
> And when we ask, we should ask on the ticket, and put it back into review,
> yes? Because I think this was the part (or at least _A_ part) I was really
> missing here.
>
>
>> There's also the problem of "I think you should..." being interpreted as
>> "You must...". It is *very* hard to consistently separate design
>> feedback from code review, although we try very hard; but, it's hard to
>> separate it out when reading it as well. So one important point to keep in
>> mind is that, as the author of a proposed change, outside the things that
>> are agreed upon policy consensus, you always have some degree of discretion
>> to disagree with a reviewer. And you should freely do so when submitting
>> anything for re-review.
>> It's best to just do this as quickly as possible,
>> so that it gets back to the reviewer without a whole lot of delay, and they
>> can respond with either "I still disagree, but you're doing the work, so OK
>> go ahead" or "No, you really have to do this, it's required by policy
>> document X, here's a link" ;-).
>>
>>
>>> 1. The documentation itself needs to be able to be generated from any
>>> version of trunk. While one or two formatting snafus are acceptable to be
>>> fixed after the fact, the documentation needs to be in a comprehensible
>>> state in every revision of trunk, which means that in order to land on
>>> trunk, the ReST output.
>>>
>> So...you didn't finish that sentence. I realize you apologized for
>> errors at the end of your mail, but I have a feeling you were going to say
>> something rather important there...
>>
>>
>> Well yes, that was the point of the apology. That was a rather important
>> thing. What I was probably going to say was just:
>>
>> The ReST output needs to be in good enough shape to be generally readable,
>> with a manageable number of errors. But, we need to be able to *verify*
>> that it has not too many errors.
>>
>>
>> And I'd already discussed that somewhat above.
>>
>> Now that I've replied to all of that, let me give you a rundown of what
>> I've been thinking and planning, so that you have an idea of where I'm
>> coming from.
>>
>> Here are the various things that I have perceived to be necessary/required
>> in order to get the conversion to happen:
>>
>> a) The conversion process needs to be able to be run concurrently with
>> Lore for an extended period of time. In other words, Lore would be the
>> "official" version of the docs, and the Sphinx docs would be built in some
>> form of automated fashion until everyone was happy with them and/or ready
>> to deprecate/abandon Lore.
>>
>>
>> Your understanding of this requirement is slightly off, I think, although
>> possibly the consequences are the same. As per the difficulties I laid out
>> above, about separating the requirements from the strategies for satisfying
>> said requirements.
>>
>
> I've been told that almost verbatim, several times. This is basically what
> led to the Sphinx buildbot happening. Perhaps I wasn't clear about what I
> meant.
>
>
>> The thing that we weren't going to tolerate was any message saying that
>> people should hold off on writing documentation, even for "a little while"
>> while we fixed up the lore conversion, because without a contractual
>> obligation for someone to finish this work, there's really no telling how
>> long "a little while" would be :).
>>
>
> Well, when I originally was pushing it, my plan was for that little while
> to be "today" (this was at PyCon during the only day of sprints I was able
> to attend), and if it didn't get done, we'd abandon that particular
> attempt. You and exarkun managed to convince me that even this was
> probably not a very good idea though.
>
>
>> Since the whole point of this sphinx conversion is to appeal to
>> documentation authors who prefer the ReST format as input (it's definitely
>> not to make the docs look nicer, writing a new stylesheet for Lore would
>> have taken 1/100th of the effort and nobody has expressed interest in doing
>> that), creating a period where things were even *less* appealing to
>> documentation authors would defeat the purpose.
>>
>
> I actually considered the stylesheet thing, but it was really only a
> passing thought. My personal motivation started with not being able to
> find things in the documentation. So I started looking at the various Lore
> tickets to see whether there was something to clean up that would help.
> And a bunch of them seemed to be asking for things that Sphinx already
> did.
> Sphinx was starting to become a common tool, and I had used it on
> several other projects, and found it pleasant to work with. Also, when I
> asked about Lore on IRC, I got a lot of "I'm not sure anyone knows how that
> works these days" and "oh man, I wish we didn't have to support that any
> more", etc. So I started looking into how to convert the docs over to use
> Sphinx.
>
>
>> Another possible solution to this problem would be to modify Lore so it
>> could process ReST sources, so that we could convert the documentation
>> within the repository piecemeal, and start writing any new docs in ReST,
>> but still have a coherent whole of documentation produced, eventually
>> switching the documentation processor from Lore to Sphinx.
>>
>
> This would require someone smarter than me. Or at least more versed in
> formal parsing theory/techniques. Or something. And that would be just to
> read the docutils sources. I find them...alien. (though less so than when
> I first started looking at them...I'm not sure if they've improved, or I
> have)
>
>
>> Yet another possible solution would be to modify Sphinx, adding a plugin
>> to process the Lore sources.
>>
>
> This is more reasonable, but still has problems. Actually the reasonable
> thing would be to create a docutils piece to process Lore sources, and then
> maybe some Sphinx extensions on top of that. Or something. Still, it
> might have been doable. However, I think Lore would have had to be
> modified as well, and possibly the Lore format expanded
> to accommodate certain constructs that it just doesn't do right now (mostly
> I'm thinking of the toctree directive and related stuff).
>
>
>> As an aside: this is the part of the process which has been so frustrating
>> to me, personally. The two alternate solutions I proposed here (and have
>> proposed before) seem far saner and more manageable in terms of effort, to
>> me.
>> But, everyone I have spoken to about docutils and ReST has told me in
>> no uncertain terms that they are both a pile of heinous hacks that resist
>> any attempt at sensible software-engineering solutions to problems, so we
>> need to resort to hackish system-integration stuff like what we've done.
>> This worries me.
>>
>
> Ooookaaaaay....I don't know how to respond to that exactly.
>
>
>> I know that Sphinx's output is well-loved by the Python community, but if
>> it's so hard to call into that we can't reasonably modify it to get an XML
>> DOM that looks like Lore source to Lore, and it's so hard to plug in to it
>> that we can't give it a data structure that it likes from Lore's XML DOM,
>> then how the heck is it being maintained? And if it actually *isn't* that
>> bad, then why haven't I managed to find someone that knows its code well
>> enough to do one or the other of these things?
>>
>
> It would be possible to make Sphinx emit Lore sources, though I'm not sure
> what that buys. You could do this either through a custom Sphinx
> "builder", or possibly even just using a custom html template with the html
> builder. But you'd need ReST sources to feed into Sphinx, so...
>
> You could write a docutils "parser" which parses a document and returns a
> "nodetree" data structure. This would get you as far as docutils, but
> AFAIK there is no existing way to get Sphinx to use any parser other than
> the default ReST one. You could probably create such a thing, which would
> almost certainly involve modifications to Sphinx, though that's not
> necessarily a big deal. It might not even be hard. I think this would
> actually be a lot easier now than when I started down this path, mostly
> because docutils seems to have better documentation on the nodes that can
> go in the "nodetree" I mentioned above.
> Note that I said "seems" because
> I'm not sure if it's that docutils documentation has gotten more complete,
> or just that I've bounced around in it enough times to find things. The
> Docutils docs have the same problem that the Twisted docs have, which is
> that they are nigh un-navigable. (I also think that the docutils docs
> should start using Sphinx, but I'm not sure how well that would go over in
> that camp...)
>
> The main problem with creating such a parser, is that Sphinx uses a bunch
> of docutils extensions to tie together the disparate documents in your
> project, and Lore, like vanilla docutils, doesn't have much of a concept of
> being one document among many (at least not from within a document). For
> example, it has things to handle tables of contents, cross document links
> (with the ability to link to a document section, rather than a specific
> document, so if it gets moved to a different document, the link gets
> adjusted), compilation for glossaries and index entries from across the
> docs project, etc. So you'd need to add some stuff to Lore to account for
> this (some is already there). And then we'd have to go through and modify
> a bunch of the Lore sources anyway.
>
> Like I said, this looks a lot more feasible now than it did when I first
> looked at it, though I'm not sure whether it's me or docutils/Sphinx that's
> changed. Probably some of each.
>
> At any rate, back then it seemed awfully difficult, and less interesting.
>
> Hmmm. And you'd also need to make some changes to the way Sphinx picks up
> files. And probably some other stuff I haven't thought of.
>
>> I have no direct knowledge of any of this stuff, because my main interest
>> here is improving the experience of working on Twisted, both for you,
>> Kevin, and for the people who will arguably be helped by the use of Sphinx.
>> Maybe I'm completely wrong and Sphinx is beautifully architected and we
>> could have done this from day 1.
>> But I faintly hope that some Docutils and
>> Sphinx contributor hears that I said "sphinx is garbage" and makes a fool
>> of me by contributing either a lore modification or a sphinx plugin which
>> solves this whole problem so we can do the format or tool migration
>> incrementally :).
>>
>> b) Because of a), there needs to be tooling to run lore2sphinx (or
>> whatever) on a regular basis. (This was sort of being done via the
>> Sphinx-building buildbot, but in a very ad-hockery sort of way, which was
>> brittle, broke a couple of times, and needed to be improved.)
>>
>>
>> Hmm. I wasn't aware of that. But it seems like it's running by a charm
>> now.
>>
>
> I think this is because a) exarkun fixed it a couple of times, and b) I
> stopped making changes to the lore2sphinx repo (which the buildbot pulls
> from). I'm also referring here to something which is completely
> non-obvious to anyone who hasn't actually run lore2sphinx by hand, which is
> that the command line tool was fairly terrible in several ways. This made
> it harder to use for development than it should have been.
>
>
>>
>> c) There needs to be release management tooling to build the Sphinx docs
>> from ReST into whatever formats we want to publish (HTML and PDF to start,
>> maybe others later on)
>>
>>
>> Yup. (ePub? PDF is so last-century... :))
>>
>> d) Convert the Lore sources to better ReST documents without all the
>> problems that the current lore2sphinx output has.
>>
>>
>> So, this wasn't *necessary*. If we had gotten through the release
>> automation stuff - and I still don't understand why that's stuck - we could
>> have merged it.
>>
>
> Well, I decided it was. Or at least really really desirable.
>
>
>> I at one time thought this was pretty impractical. My first attempt at a
>> conversion tool tried to use an intermediate object model, but I ran into
>> trouble when trying to combine the various objects.
>> So I abandoned the
>> effort and created what became lore2sphinx, which basically just combined a
>> bunch of strings. I then figured out a way of making the intermediate
>> object thing work, and that was lore2sphinx-ng. Then it became convenient
>> to split out the intermediate object model from the document processing
>> code, so I put all of that into a library and that became rstgen.
>>
>> It seems the saving grace here is that rstgen might be a generally useful
>> tool in its own right, with more of a long-term future than lore2sphinx
>> would have had.
>>
>
> I admit that I have become more interested in the actual problem of
> "generating ReST" than I once was. And I hope that it will become a
> generally useful tool.
>
> And probably one of the reasons I have been making such relatively slow
> progress on it is _because_ I'm trying to solve a more general problem
> than I once was. The original lore2sphinx (the one running on the buildbot
> now) was very much a minimal-thing-that-could-possibly-work kind of
> solution. It tried to do just enough to get the job done. It sort of did
> get the job done, but I was never very satisfied with it.
>
>
>> (For anyone who is curious, the lore2sphinx-ng repo is forked off from the
>> lore2sphinx repo, primarily because I didn't want to break the Sphinx
>> buildbot by making drastic changes.)
>>
>>
>> Have a link?
>>
>
> I've posted it a couple of times in this thread, though I can hardly blame
> you for either missing it or losing track of it.
>
> original: https://bitbucket.org/khorn/lore2sphinx
> extra-crispy: https://bitbucket.org/khorn/lore2sphinx-ng
>
>
>>
>> Here's what my plan was prior to this whole discussion getting started
>> again.
>>
>> 1) Finish rstgen, where "finished" in this instance is defined as "is
>> capable of generating all the vanilla docutils and sphinx-specific ReST
>> elements that we need for converting the
>> Twisted documentation".
>>
>>
>> Sounds like a worthy goal, although I don't think this is necessarily
>> required. Have you been working on it for the last 2 years? Do you have
>> any idea when it might be done? It might be worthwhile to write a
>> *smaller* .
>>
>
> I started on rstgen a bit more than a year ago. I was hung up on the
> problem of how to combine various parts of a document for a while without
> having the crazy space-handling issues. And also I've been trying to come
> up with a relatively friendly API, and enough generality that it will end
> up useful outside of the lore2sphinx context.
>
> I really started on l2s-ng last July during "Julython". I've been working
> on it in fits and starts a few times since then.
>
>
>>
>> 2) Finish lore2sphinx-ng (which would probably have ended with merging it
>> back into the lore2sphinx repo), where "finished" means that it would be
>> capable of processing all the XHTML Lore tags that were defined in the Lore
>> documentation and used in the Twisted documentation, and generating a tree
>> of rstgen elements, which could then be rendered into ReST.
>>
>>
>> Cool.
>>
>> While this would be handy, especially for people working on documentation
>> branches, it's not necessarily necessary.
>>
>> (this would also serve to satisfy b) above, as the CLI in lore2sphinx-ng
>> is less...well, let's just call it broken than lore2sphinx's was/is.)
>>
>>
>> OK.
>>
>> 3) Go back and finish SphinxBuilder (release tooling for building a sphinx
>> project, which is basically a wrapper for sphinx-build, plus some vague
>> "version feature").
>>
>>
>> This is really the crux; this is the thing you should work on first, I
>> think, even if you're going to keep working on lore2sphinx-ng. Basically
>> the only reason that I was keen to get the lore to sphinx conversion
>> improved in the first place was that creating this tool seemed to be
>> dragging on for quite a while after the "chunk tickets" were done.
>> But now, this tool is almost done, and we could re-do the lore-source
>> review if
>> you wanted to do that. The current lore2sphinx might well be good enough
>> to just go with, especially if the next-generation version is going to take
>> another six months to finish.
>>
>
> I'll take a look at this again soonish (a week? this month? don't know.).
> Probably it's a matter of:
>
> - merge forward (it has been a while)
> - figure out how the other tools guess/determine the Twisted version in the
>   checkout, and make SphinxBuilder do that.
> - get it reviewed
> - commit
>
> But I'll have to remember how to use combinator again (which will be much
> easier now that the combinator "docs" are on the Twisted wiki...thanks to
> whoever did that!)
>
> Yes, I could probably use Bazaar, but so far every time I've tried that,
> I've ended up spending waaaaaay too much time just on the VCS. I guess I
> have some kind of mental block with bzr. I'll get over it someday I
> suppose.
>
>
>>
>> 4) Get someone to use something less hackish than what's currently
>> building the Sphinx docs on the buildbot, and preferably in such a way that
>> the results of those builds could be published somewhere and have
>> persistent links. Currently the results of what the Sphinx buildbot does
>> are stored for a time, and then go away, so you'll see links to build
>> results in some trac tickets that go nowhere, which is decidedly unhelpful.
>> My plan was that we'd set up something where the Sphinx docs would get
>> generated and published someplace for every buildbot build so that we could
>> always have the current results for the lore to sphinx conversion for the
>> tip of each branch. I have no idea whether this is actually feasible or
>> practical, but it seemed like it would be useful.
>>
>>
>> OK, *this* sounds like really unnecessary turd-polishing ;-).
>> This builder is an interim step; the more interesting step is the builder
>> that
>> just builds the sphinx docs, in the same way that the current builder
>> builds the lore docs. Furthermore, it seems to be working fine. Build
>> results links that go nowhere are a known problem with buildbot, since it
>> does eventually lose most history, and this type of history takes up a fair
>> bit of disk space.
>>
>
> Well, it was mostly motivated by the fact that we were doing a lot of
> linking to build results that would then cease to exist for a while, and it
> really annoyed me. It doesn't seem nearly as "necessary" to me now as it
> once did.
>
>
>>
>> 5) Proceed with Sphinx docs being built from lore sources, making tweaks
>> as necessary to lore2sphinx(ng) for as long as it took for the generated
>> docs to be good enough to justify switching to Sphinx entirely.
>> 6) Switch to Sphinx entirely.
>>
>> I really wasn't planning on trying to get people excited about switching
>> to Sphinx again until 1) and 2) were at least "mostly" done (for certain
>> values of done) and I had gone back to finish 3).
>>
>> So. I guess at this point the question is whether to try and go with
>> what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng +
>> rstgen). I think 3-6 in my above plan need to happen in any case, and I
>> think those will be much easier with lore2sphinx-ng+rstgen.
>>
>>
>> This decision is really determined by time estimates.
>>
>> In any case, work out the sphinx release automation tool first, since we
>> need that regardless of how we switch over
>>
>
> Got it.
>
>
>>
>> IIRC, rstgen has support for most of the vanilla docutils elements, with
>> the notable exception of tables (and maybe definition lists...can't recall
>> whether I finished those).
>> It has a basic level of test coverage (of
>> course you can never have too many tests) for rendering the elements
>> individually, and some tests for elements in combination (particularly
>> nested lists). Footnotes and Citations I think also need some work, which
>> I have a plan for, but haven't implemented yet (I don't think).
>>
>>
>> The "new" lore2sphinx CLI tool needs more work, but is relatively
>> straightforward. Like the old tool, it's basically an elementtree
>> processor, except instead of spitting out strings that get joined together
>> (which granted was an unholy mess), it generates rstgen elements, which all
>> have a .render() method. After processing a Lore document, you should end
>> up with a rstgen.Document object. You call its render() method, which
>> calls its children's render() methods, etc. and it's turtles all the way
>> down.
>>
>> The framework is there for the new CLI tool, it's mostly a matter of
>> writing a bunch of short methods that take elementtree elements as input
>> and return appropriate rstgen objects.
>>
>> Obviously these tools aren't finished, but they produce much better output
>> than the old version of lore2sphinx w.r.t. whitespace handling, paragraph
>> wrapping, etc.
>>
>>
>> Aesthetically, this appeals to me a lot more than going with the messiness
>> of lore2sphinx.
>>
>
> Me too.
>
>
>
>> But it is _not_ a requirement.
>>
>
> Understood. Though I think it might be a practical requirement, even if it
> isn't a policy requirement. If that makes sense.
>
>
>
>> Some of the code is still pretty messy, but nowhere near the train wreck
>> that the current/old version of lore2sphinx is. By which I mean it _can_
>> be cleaned up, it just hasn't been yet. In particular there's some places
>> in rstgen where the API is (to me at least) obviously awful, but I haven't
>> gotten around to fixing it yet.
>>
>> Please review the code. Please feel free to ask questions if you're
>> interested.
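
To make the render-tree idea described above a bit more concrete (an elementtree pass over a Lore document that returns objects with a render() method, instead of joined strings), here is a small sketch. It is NOT rstgen's actual API: the class names (Text, Paragraph, Title, Document), the HANDLERS table, and the sample Lore snippet are all hypothetical and only illustrate how a document's render() can delegate to its children.

```python
import xml.etree.ElementTree as ET

# Hypothetical element classes, invented for illustration only.
# They are not the real rstgen classes.


class Text:
    def __init__(self, text):
        self.text = text

    def render(self):
        # Collapse runs of whitespace so source line wrapping does not leak out.
        return " ".join(self.text.split())


class Paragraph:
    def __init__(self, children):
        self.children = children

    def render(self):
        return "".join(child.render() for child in self.children)


class Title:
    def __init__(self, text):
        self.text = text

    def render(self):
        title = " ".join(self.text.split())
        # ReST section title: the text, then an underline of the same length.
        return title + "\n" + "=" * len(title)


class Document:
    def __init__(self, children):
        self.children = children

    def render(self):
        # The document renders itself by asking each child to render.
        return "\n\n".join(child.render() for child in self.children) + "\n"


# One short handler per Lore (XHTML) tag: elementtree element in, object out.
HANDLERS = {
    "h1": lambda el: Title(el.text or ""),
    "p": lambda el: Paragraph([Text(el.text or "")]),
}


def convert(lore_source):
    body = ET.fromstring(lore_source)
    return Document([HANDLERS[el.tag](el) for el in body if el.tag in HANDLERS])


SAMPLE = (
    "<body><h1>Using Deferreds</h1>"
    "<p>A Deferred   is a\n promise of a result.</p></body>"
)

if __name__ == "__main__":
    print(convert(SAMPLE).render())
```

Under these assumptions, supporting another Lore tag would mean adding one more short handler that returns the matching element, which is the "bunch of short methods" shape mentioned above.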
>>
>> Personally, I've gotten over being in a hurry about all this, and I think
>> a robust tool is more likely to succeed in the long run, though finishing
>> it may make the run a bit longer. So I'm for finishing
>> lore2sphinx-ng+rstgen.
>>
>>
>> I think a little false urgency might not hurt here :-). I'm not going to
>> work on the tool - just writing these emails probably blew my Twisted
>> development budget for the next two months ;-)
>>
>
> I can relate... :)
>
>
>> - but I will do my best to quickly clear up any procedural
>> what-needs-to-be-done questions unambiguously. Please ping if anything
>> gets you stuck.
>>
>
> I'll let you know.
>
> --
> Kevin Horn
>

--
Cordially
Abdul Rauf (haseeb)

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python