Hello Edwin,

Edwin Grubbs [2010-09-16 17:47 -0500]:
> Launchpad.net is looking into whether to use the problem_report python
> module to store website errors or even to use the apport python module
> to help collect system data for the problem report. Currently, each
> exception is stored in a separate "oops" file with a bunch of extra
> data, such as the cgi request variables, and it is formatted like an
> rfc822 email message to take advantage of modules for formatting and
> parsing.

That indeed is what Apport .crash reports have as well.

> The oops-tools project, which analyzes and displays the oops files in
> a web page, is planned to be open sourced soon. Therefore, I have two
> main questions.
> 1. Is there interest in having the problem_report format be extended
> to handle more complex data structures that will be parsed and
> analyzed by a tool such as oops-tools?

Not from my side. So far we have gotten along well with just a
single-layer dictionary. The convention for lists as values is to have
one element per line, e. g.:

  Dependencies:
   libfoo1
   libbar2

Can you point out an example of what else you need?

> 2. Would apport be interested in receiving other features of
> oops-tools, such as the django based web interface for viewing oopses?

Is this read-only, or can you also update the data there? We have used
Launchpad Bugs as a "crash database" backend so far, because a bug
tracker provides all the functionality that we need, except that it is
sometimes hard to tell crashes and regular bugs apart, which makes it
hard to get a clean view for triagers.

It sounds like an interesting option, though, if it can represent the
structure of Ubuntu, like distros/packages/package versions, etc.

> The second question is probably hard to answer right now, so I'll
> focus on the limitations of the problem_report format that we would
> either extend in a wrapper class or in problem_report itself.
>
> * problem_report doesn't provide a standard format for complex data.

Right, it currently uses standard RFC822, which doesn't define any more
complex data types.

> Even adding another level of name/value pairs inside a field is not
> well supported, since you have to use a StringIO object to get the
> data from ProblemReport object to put it in a field of another
> ProblemReport. Lists of dictionaries would also require their own
> format. Here is an example of recursive ProblemReports.

This works fine if you hardcode assumptions about the syntax of
particular field names, which we generally have to do for such
post-processing scripts anyway. But if we need complex data structures,
then I'd rather use a standard format like JSON for this, as you
suggested.

The problem_report module is not conceptually limited to RFC822 only.
For example, it also has the ability to output its data in
Multipart/MIME format (for uploading data to Launchpad). So it wouldn't
be a problem at all to add reading/writing JSON.

However, the module currently _is_ conceptually limited to a
single-level dictionary structure, since API users can (and do) pretty
much treat it as a dictionary with extra features, and can currently
rely on the data types of the values (strings). We could allow more,
and then just fix the existing write() and write_mime() to throw an
exception if they encounter an unrepresentable data type; this would
mean you could never upload such a report to Launchpad bugs.

> * problem_report only allows field names to contain letters, numbers,
> ".", "_", and "-". That could cause problems when dumping a bunch of
> name/value pairs from an application in order to analyze it later.

That's not a problem in Apport and package hooks, since (as pointed out
before) the set of key names is pretty much static. In the cases where
it isn't, hookutils provides a helper for cleaning up key names.

I'd like to avoid arbitrary strings here, since they can easily lead to
problems, break the RFC822 format, or cause unexpected errors in
scripts which process those reports.

> * problem_report really supports text or compressed text files. There
> is no ability to specify a content-type even when using
> problem_report's write_mime() method.

In general we know what content type a field has. If not, then you
could always specify it in another field, like:

  Data: blob0xDEADBEEF
  DataType: image/jpeg

?

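To make the flat-dictionary conventions above concrete, here is a minimal sketch in plain Python. It is not the problem_report API: clean_key() and write_flat() are hypothetical names made up for illustration, the allowed key characters are taken from the list quoted above, and the indented layout for multi-line values is an assumption based on the "one element per line" convention.

```python
import re

# Assumption: allowed key characters as quoted in this thread
# (letters, numbers, ".", "_", "-").
_KEY_RE = re.compile(r"[A-Za-z0-9._-]+")


def clean_key(name):
    """Hypothetical helper: replace each disallowed character with "_"."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def write_flat(report):
    """Serialize a single-level dict of strings in an RFC822-like layout.

    Multi-line values are written on indented continuation lines, one
    element per line. Anything the flat format cannot represent raises
    an exception instead of being written in some ad-hoc way.
    """
    lines = []
    for key in sorted(report):
        value = report[key]
        if not _KEY_RE.fullmatch(key):
            raise ValueError("invalid key name: %r" % key)
        if not isinstance(value, str):
            raise TypeError("unrepresentable value type for %s: %s"
                            % (key, type(value).__name__))
        if "\n" in value:
            lines.append(key + ":")
            lines.extend(" " + item for item in value.split("\n"))
        else:
            lines.append(key + ": " + value)
    return "\n".join(lines) + "\n"


report = {
    "Dependencies": "libfoo1\nlibbar2",
    # The content type lives in a sibling field, with a cleaned-up key name.
    clean_key("Data Type"): "image/jpeg",
}
print(write_flat(report), end="")
```

With this approach a nested value such as {"Nested": {"a": "b"}} fails loudly with a TypeError, which is the behaviour suggested above for write() and write_mime() if richer data types were ever allowed in the dictionary.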
> * The write_mime() method even encodes the single-line name/value
> pairs as base64, so it is not at all human readable.

Only if a value is longer than 5 lines or has non-ASCII characters;
otherwise it lands in the "short values" text section (where it is
readable).

But why do you care? This format is supposed to be nothing more than a
transport vehicle from client computers to Launchpad. It's not really
supposed to be looked at by humans.

Thanks,

Martin
-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
-- Ubuntu-devel-discuss mailing list Ubuntu-devel-discuss@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss