On Sun, 18 Nov 2012 12:53:50 -0500, Roy Smith wrote:

> I've got a script which trolls our log files looking for python stack
> dumps. For each dump it finds, it computes a signature (basically, a
> call sequence which led to the exception) and uses this signature as a
> dictionary key. Here's the relevant code (abstracted slightly for
> readability):
>
> def main(args):
>     crashes = {}
>     [...]
>     for line in open(log_file):
>         if does_not_look_like_a_stack_dump(line):
>             continue
>         lines = traceback_helper.unfold(line)
>         header, stack = traceback_helper.extract_stack(lines)
>         signature = tuple(stack)
>         if signature in crashes:
>             count, header = crashes[signature]
>             crashes[signature] = (count + 1, header)
>         else:
>             crashes[signature] = (1, header)
>
> You can find traceback_helper at
> https://bitbucket.org/roysmith/python-tools/src/4f8118d175ed/logs/traceback_helper.py
>
> The stack that's returned is a list. It's inherently a list, per the
> classic definition:

Er, no, it's inherently a blob of multiple text lines. Sure, you've built
it a line at a time by using a list, but I've already covered that case.
Once you've identified a stack, you never append to it, sort it, delete
lines in the middle of it... none of these list operations are meaningful
for a Python stack trace. The stack becomes a fixed string, and not just
because you use it as a dict key, but because inherently it counts as a
single, immutable blob of lines.

A tuple of individual lines is one reasonable data structure for a blob of
lines. Another would be a single string:

    signature = '\n'.join(stack)

Depending on what you plan to do with the signatures, one or the other
implementation might be better. I'm sure that there are other data
structures as well.


> * It's variable length. Different stacks have different depths.

Once complete, the stack trace is fixed length, but that fixed length is
different from one stack to the next. Deleting a line would make it
incomplete, and adding a line would make it invalid.


> * It's homogeneous. There's nothing particularly significant about each
>   entry other than it's the next one in the stack.
>
> * It's mutable. I can build it up one item at a time as I discover
>   them.

The complete stack trace is inhomogeneous and immutable. I've already
covered immutability above: removing, adding or moving lines will
invalidate the stack trace. Inhomogeneity comes from the structure of a
stack trace. The mere fact that each line is a string does not mean that
any two lines are equivalent. Different lines represent different things.

Traceback (most recent call last):
  File "./prattle.py", line 873, in select
    selection = self.do_callback(cb, response)
  File "./prattle.py", line 787, in do_callback
    raise callback
ValueError: what do you mean?

is a valid stack. But:

Traceback (most recent call last):
    raise callback
    selection = self.do_callback(cb, response)
  File "./prattle.py", line 787, in do_callback
ValueError: what do you mean?
  File "./prattle.py", line 873, in select

is not. A stack trace has structure. The equivalent here is the difference
between:

ages = [23, 42, 19, 67,  # age, age, age, age
        17, 94, 32, 51,  # ...
        ]

values = [23, 1972, 1, 34500,  # age, year, number of children, income
          35, 1985, 0, 67900,  # age, year, number of children, income
          ]

A stack trace is closer to the second example than the first: each item
may be the same type, but the items don't represent the same *kind of
thing*.

You could make a stack trace homogeneous with a little work:

- drop the Traceback line and the final exception line;
- parse the File lines to extract the useful fields;
- combine them with the source code.

Now you have a blob of homogeneous records, here shown as lines of text
with ! as field separator:

./prattle.py ! 873 ! select ! selection = self.do_callback(cb, response)
./prattle.py ! 787 ! do_callback ! raise callback

But there's really nothing you can do about the immutability. There isn't
any meaningful reason why you might want to take a complete stack trace
and add or delete lines from it.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
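
P.S. The three homogenising steps listed above can be sketched roughly as
follows. This is only an illustration, not the code from traceback_helper:
the names FILE_LINE, homogenise and SAMPLE are made up for the example, and
the regex assumes the default CPython traceback layout shown earlier (a
File line followed by an indented source line). Chained exceptions and
other variations are not handled.

```python
import re
from collections import Counter

# Hypothetical pattern for lines like:   File "./prattle.py", line 873, in select
FILE_LINE = re.compile(
    r'\s*File "(?P<path>[^"]+)", line (?P<lineno>\d+), in (?P<func>.+)'
)


def homogenise(trace_lines):
    """Return a tuple of (path, lineno, function, source) records.

    The Traceback header and the final exception line are dropped,
    because they are neither File lines nor indented source lines
    that follow a File line.
    """
    records = []
    for line in trace_lines:
        match = FILE_LINE.match(line)
        if match:
            # Start a new record; the source text is filled in below.
            records.append((match.group("path"),
                            int(match.group("lineno")),
                            match.group("func").strip(),
                            ""))
        elif records and line[:1].isspace() and not records[-1][3]:
            # First indented line after a File line: treat it as the source.
            records[-1] = records[-1][:3] + (line.strip(),)
    # A tuple of tuples is immutable and hashable, so it can be a dict key.
    return tuple(records)


SAMPLE = [
    'Traceback (most recent call last):',
    '  File "./prattle.py", line 873, in select',
    '    selection = self.do_callback(cb, response)',
    '  File "./prattle.py", line 787, in do_callback',
    '    raise callback',
    'ValueError: what do you mean?',
]

signature = homogenise(SAMPLE)

# Count crashes per signature, as in the quoted main() loop.
crashes = Counter()
crashes[signature] += 1

# Show the records with ! as field separator.
for record in signature:
    print(" ! ".join(str(field) for field in record))
```

Every record now has the same four fields, so the collection is
homogeneous, and because the result is a tuple it stays immutable and works
directly as a dictionary key.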