On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <[email protected]> wrote:
> Thanks, Robert!
>
> I think my lack of clarity is around the MetricSpec. Maybe what's in my
> head and what's being proposed are the same thing. When I read that the
> MetricSpec describes the proto structure, that sound kind of complicated to
> me. But I may be misinterpreting it. What I picture is something like a
> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
> :):
>
> {
> name: "my_timer"
>
name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
requiring namespaces). Or "beam:metric:[some non-user designation]"
labels: { "ptransform" }
>
How does an SDK act on this information?
> type: GAUGE
> value_type: int64
>
I was lumping type and value_type into the same field, as a urn for
possibly extensibility, as they're tightly coupled (e.g. quantiles,
distributions).
> units: SECONDS
> description: "Times my stuff"
>
Are both of these optional metadata, in the form of key-value field, for
flattened into the field itself (along with every other kind of metadata
you may want to attach)?
> }
>
> Then metrics submitted would look like:
> {
> name: "my_timer"
> labels: {"ptransform": "MyTransform"}
> int_value: 100
> }
>
Yes, or value could be a bytes field that is encoded according to
[value_]type above, if we want that extensibility (e.g. if we want to
bundle the pardo sub-timings together, we'd need a proto for the value, but
that seems to specific to hard code into the basic structure).
> The simplicity coming from the fact that there's only one proto format for
> the spec and for the value. The only thing that varies are the entries in
> the map and the value field set. It's pretty easy to establish contracts
> around this type of spec and even generate protos for use the in SDK that
> make the expectations explicit.
>
>
> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <[email protected]>
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <[email protected]> wrote:
>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>
>> I proposed keeping the "user" in there to avoid possible clashes with the
>> system namespaces. (No preference on counter vs. metric, I wasn't trying to
>> imply counter = SumInts)
>>
>>
>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <[email protected]>
>> wrote:
>>
>>> I like the generalization from entity -> labels. I view the purpose of
>>> those fields to provide context. And labels feel like they supports a
>>> richer set of contexts.
>>>
>>
>> If we think such a generalization provides value, I'm fine with doing
>> that now, as sets or key-value maps, if we have good enough examples to
>> justify this.
>>
>>
>>> The URN concept gets a little tricky. I totally agree that the context
>>> fields should not be embedded in the name.
>>> There's a "name" which is the identifier that can be used to communicate
>>> what context values are supported / allowed for metrics with that name (for
>>> example, element_count expects a ptransform ID). But then there's the
>>> context. In Stackdriver, this context is a map of key-value pairs; the
>>> type is considered metadata associated with the name, but not communicated
>>> with the value.
>>>
>>
>> I'm not quite following you here. If context contains a ptransform id,
>> then it cannot be associated with a single name.
>>
>>
>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>> key-value pairs for context?
>>>
>>
>> The URN is the name. Something like
>> "beam:metric:ptransform_execution_times:v1."
>>
>>
>>> Not sure where this fits in the discussion or if this is handled
>>> somewhere, but allowing for a metric configuration that's provided
>>> independently of the value allows for configuring "type", "units", etc in a
>>> uniform way without having to encode them in the metric name / value.
>>> Stackdriver expects each metric type has been configured ahead of time with
>>> these annotations / metadata. Then values are reported separately. For
>>> system metrics, the definitions can be packaged with the SDK. For user
>>> metrics, they'd be defined at runtime.
>>>
>>
>> This feels like the metrics spec, that specifies that the metric with
>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>> they're not implicit in the type? This gets into whether the type should be
>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>> units metadata).
>>
>