On Fri, Jun 14, 2013 at 6:40 PM, Glyph <gl...@twistedmatrix.com> wrote: > > On Jun 14, 2013, at 3:43 PM, Glyph <gl...@twistedmatrix.com> wrote: > > I think that we can satisfy all of these requirements by expanding the > notion of an "identifier" a bit. > > > In our last episode, I mentioned that an identifier should have several > properties: > > human readable (and potentially non-unique) for ease of writing messages and > reading them to get a general sense of the behavior of a system > globally unique (and potentially hard to read) for ease of filtering and > monitoring > long-term (and potentially unique to humans) for ease of writing tools to > deal with particular messages, whose exact details may change over time > > > I think we can achieve all of these things by making there be multiple ways > to identify a log message. Here's how that could work: > > When you are first writing a log message, just do this, as you usually > would; toss off some crummy english message: > > log.info("hey check it out I just {verb}ed a {noun}", verb=request.method, > noun=request.uri) > > > Asking the author to write a log message with more detail and care than this > is basically just a recipe for disappointment :). > > Later, someone might want to instrument these messages. We've all agreed > that re.findall("hey check it out.*") is a crummy way to do this. The > obvious thing would be something like this: > > tapEventsLike(myObserver, publisher, log_format="hey check it out I just > {verb}ed a {noun}") > > > That's fine to start with, of course, but it's not very resilient to changes > in the code; the developer might later change this line to read: > > log.info("hey check it out I just {verb}ed a {noun} and got a {result}", > verb=request.method, noun=request.uri, result=request.response.status) > > > Now the log tap is broken, and there's no reasonable way to fix it; the > format string doesn't have a place that it could go. > > But, the 'log' in the above example is a logger object associated with a > class or module, and so has 3 persistent pieces of information it can use to > build a long-term identifier that the event tap can use. We could reflect > that in an API like this: > > from example.module import log > tapEventsLike(myObserver, publisher, > log_id=log.idForOriginalFormat("hey check it out I just > {verb}ed a {noun}")) > > > which could be computed offline, and then specified like this, to be even > more resilient to API changes: > > tapEventsLike(myObserver, publisher, > log_id='604e45b5-2d41-5ba5-9ae1-7205163c3e3f') > > > That string, of course, is computed like this: > >>>> import uuid >>>> from urllib import urlencode >>>> uri = "http://twistedmatrix.com/ns/log/?" + urlencode(dict(format="hey >>>> check it out I just {verb}ed a {noun}", module="example.module")) >>>> uuid.uuid5(uuid.NAMESPACE_URL, uri) > UUID('604e45b5-2d41-5ba5-9ae1-7205163c3e3f') > > > (Although if we decide we like this system, we could of course go with a > helpful command-line tool to generate these for you.) > > As you evolve your codebase to take logging more seriously, you could change > the original 'log.info' call to instead look like this: > > verbANoun = log.declare( > > "hey check it out I just {verb}ed a {noun} and got a {result}", > > log_id='604e45b5-2d41-5ba5-9ae1-7205163c3e3f', > > ) > > # ... > > verbANoun(verb=request.method, noun=request.uri, > result=request.response.status) > > > and then monitoring can be accomplished via public API: > > from example.module import verbANoun > verbANoun.tap(myObserver, theLogPublisher) > > > So, after some evolution, our message has 3 identifiers: > > the original format string, which was easy to come up with on the spot, and > need not be unique > a UUID, that was automatically generated but can be carried around to allow > for cleaning up the codebase and the message without breaking any code doing > monitoring > an API name that the logger and the logee can use to agree on the semantics > of a particular message > > > Best of all, you can get this even for the plethora of existing log messages > within Twisted that aren't even using structured values, as we evolve them > to have more metadata. You also now have the ability to localize the format > string, if you want to start presenting these log messages in contexts other > than log files for system administrators. > > Thoughts?
There's... a lot here. I'm overwhelmed by how complex this system would be, and I think that end users would be as well. I don't really want to put UUIDs into my source code, whether I type them or paste them, and I don't want to have to invoke command line tools to figure out what those UUIDs in order to be a good developer. I think there's basically no practical benefit over the hierarchical "system" + event-identifier system, where: 1. it's trivial to specify a hierarchical, easy-to-read "system" key that has a small-ish scope 2. you think of a unique event name in that small-ish scope and type it out. I'm beginning to think #1 is the most important of all, but I think I'll continue to use event-names to describe all my log statements. As I said in my response to Wilfredo, I think it's really important to stop specifying the "system" for application code in Twisted. The application's code should not be considered a part of the HTTP protocol's system. There are also other problems with the way we use system, like putting extra random data like request ID in there. -- Christopher Armstrong http://radix.twistedmatrix.com/ http://planet-if.com/ _______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python