Re: improving access to telemetry data

Justin Lebar Thu, 28 Feb 2013 09:15:25 -0800

It sounds to me like people want both

1) Easier access to aggregated data so they can build their own
dashboards roughly comparable in features to the current dashboards.


2) Easier access to raw databases so that people can build up more
complex analyses, either by exporting the raw data from the db, or by
analyzing it in the db.

That is, I don't think we can or should export JSON with all the data
in our databases.  That is a lot of data.
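
To illustrate the difference between (1) and (2), here is a minimal Python
sketch of collapsing raw submissions into small per-bucket counts that a
dashboard could fetch as JSON. Everything in it is an assumption made for
illustration: the record layout, the STARTUP_MS measure name, and the 1000 ms
bucket width are hypothetical and do not describe the actual telemetry schema
or pipeline.

```python
import json
from collections import Counter, defaultdict

# Hypothetical raw submissions; real payloads are far larger and richer.
RAW_SUBMISSIONS = [
    {"channel": "nightly", "measure": "STARTUP_MS", "value": 820},
    {"channel": "nightly", "measure": "STARTUP_MS", "value": 1340},
    {"channel": "release", "measure": "STARTUP_MS", "value": 2950},
    {"channel": "nightly", "measure": "STARTUP_MS", "value": 910},
]

BUCKET_MS = 1000  # assumed bucket width, chosen only for this example


def aggregate(submissions, bucket=BUCKET_MS):
    """Collapse raw rows into per-(channel, measure) bucket counts."""
    histograms = defaultdict(Counter)
    for row in submissions:
        key = (row["channel"], row["measure"])
        # Lower bound of the bucket this value falls into.
        lower = (row["value"] // bucket) * bucket
        histograms[key][lower] += 1
    return histograms


def to_json(histograms):
    """Serialize the aggregates; the output size depends on the number of
    buckets, not on the number of raw submissions."""
    out = [
        {
            "channel": channel,
            "measure": measure,
            "buckets": {str(k): v for k, v in sorted(counts.items())},
        }
        for (channel, measure), counts in sorted(histograms.items())
    ]
    return json.dumps(out, indent=2)


if __name__ == "__main__":
    print(to_json(aggregate(RAW_SUBMISSIONS)))
```

The aggregated output stays about the same size however many submissions come
in, which is why it is reasonable to export as JSON. Anything that needs the
individual rows falls under (2) and is better served by querying the database.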

On Thu, Feb 28, 2013 at 12:08 PM, Benjamin Smedberg
<benja...@smedbergs.us> wrote:
> On 2/28/2013 10:59 AM, Benoit Jacob wrote:
>>>
>>> Because the raw crash files do not include new metadata fields, this has
>>> led to weird engineering practices like shoving interesting metadata into
>>> the freeform app notes field, and then parsing that data back out later.
>>> I'm worried about perpetuating this kind of behavior, which is hard on the
>>> database and leads to very arcane queries in many cases.
>>>
>> I don't agree with the notion that freeform fields are bad. Freeform plain
>> text is an amazing file format. It allows adding any kind of data without
>> administrative overhead and is still easy to parse (if the data that was
>> added was formatted with easy parsing in mind).
>
> The obvious disadvantage is that it is much more difficult to
> machine-process. For example elasticsearch can't index on it (at least not
> without lots of custom parsing), and in general you can't ask tools like
> hbase or elasticsearch to filter on that without a user defined function.
> (Regexes might work for some kinds of text processing.)
>
>>
>> But if one considers it a bad thing that people use it, then one should
>> address the issues that are causing people to use it. As you mention, raw
>> crash files may not include newer metadata fields. So maybe that can be
>> fixed by making it easier or even automatable to include new fields in raw
>> crash files?
>
> Yes, that is all filed. We can't automatically include the fields, because we
> don't know whether they are supposed to be public or private, but we should
> soon be able to have a dynamically updatable list.
>
> Note that if mcmanus is correct, we're going to be dealing with 1M fields
> per day here. That's a lot more than the 250k from crash-stats, especially
> because the payload is bigger. I believe that the flat files from
> crash-stats are a really useful kludge because we couldn't figure out a
> better way to expose the raw data. But that kludge will start to fall over
> pretty quickly, and perhaps we should just expose a better way to do queries
> using the databases, which are surprisingly good at doing these kinds of
> queries efficiently.
>
>
> --BDS
>
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
