Re: Follow up ideas, to simplify creating MonitoringInfos.

Alex Amato Mon, 29 Oct 2018 17:20:02 -0700

Hi Robert and community, :)

I was starting to code this up, but I wasn't sure exactly how to do some of
the proto syntax. Would you mind taking a look at this doc
<https://docs.google.com/document/d/1SB59MMVZXO0Aa6w0gf4m0qM4oYt4SiofDq3QxnpQaK4/edit?usp=sharing>
and let me know if you know how to resolve any of these issues:


   - Using a map in an option.
   - Referring to string "constants" from other enum options in .proto
   files.

Please see the comments I have listed in the doc
<https://docs.google.com/document/d/1SB59MMVZXO0Aa6w0gf4m0qM4oYt4SiofDq3QxnpQaK4/edit?usp=sharing>,
and a few alternative suggestions.
Thanks

On Wed, Oct 24, 2018 at 10:08 AM Alex Amato <[email protected]> wrote:

> Okay. That makes sense. Using runtime validation and protos is what I was
> thinking as well.
> I'll include you as a reviewer in my PRs.
>
> As for the choice of a builder/constructor/factory. If we go with factory
> methods/constructor then we will need a method for each metric type
> (intCounter, latestInt64, ...). Plus, then I don't think we can have any
> constructor parameters for labels, we will just need to accept a dictionary
> for label keys+values like you say. This is because we cannot offer a
> method for each URN and its combination of labels, and we should avoid such
> static detection, as you say.
>
> The builder however, allows us to add a method for setting each label.
> Since there are a small number of labels I think this should be fine. A
> specific metric URN will have a specific set of labels which we can
> validate being set. Any variant of this should use a different label (or a
> new version in the label). Thus, the builder can give an advantage, making
> it easier to set label values without needing to provide the correct key
> for the label, when its set. You just need to call the correct method.
>
> Perhaps it might be best to leave this open to each SDK based on its
> language style (Builder, Factory, etc.) , but make sure we use the
> protos+runtime validation.
>
> On Wed, Oct 24, 2018 at 7:02 AM Robert Bradshaw <[email protected]>
> wrote:
>
>> Thanks for bringing this to the list; it's a good question.
>>
>> I think the difficulty comes from trying to statically define a lists
>> of possibilities that should instead be runtime values. E.g. we
>> currently we're up to about a dozen distinct types, and having a
>> setter for each is both verbose and not very extensible (especially
>> replicated cross language). I'm not sure the set of possible labels is
>> fixed either. Generally in the FnAPI we've been using shared constants
>> for this kind of thing instead. (I was wary about the protos for the
>> same reasons; it would be good to avoid leaking this through to each
>> of the various SDKs.) The amount of static typing/validation one gets
>> depends on how much logic you build into each of these methods (e.g.
>> does it (almost?) always "metric" which is assumed to already be
>> encoded correctly, or a specific type that has tradeoffs with the
>> amount you can do generically (e.g. we have code that first loops over
>> counters, then over distributions, then over gauges, and I don't think
>> we want continue that pattern all M places for all N types)).
>>
>> I would probably lean towards mostly doing runtime validation here.
>> Specifically, one would have a data file that defines, for each known
>> URN, its type along with the set of permitted/expected/required
>> labels. On construction a MonitoringInfo data point, one would provide
>> a URN + value + labelMap, and optionally a type. On construction, the
>> constructor (method, factory, whatever) would look up the URN to
>> determine the type (or throw an error if it's both not known and not
>> provided), which could be then used to fetch an encoder of sorts (that
>> can go from value <-> proto encoding, possibly with some validation).
>> If labels and/or annotations are provided and the URN is known, we can
>> validate these as well.
>>
>> As for proto/enums vs. yaml, the former is nice because code
>> generation comes for free, but has turned out to be much more verbose
>> (both the definition and the use) and I'm still on the fence whether
>> it's a net win.
>>
>> (Unfortunately AutoValue won't help much here, as the ultimate goal is
>> to wrap a very nested proto OneOf, ideally with some validation.
>> (Also, in Python, generally, having optional, named arguments makes
>> this kind of builder pattern less needed.))
>>
>> On Wed, Oct 24, 2018 at 4:12 AM Kenneth Knowles <[email protected]> wrote:
>> >
>> > FWIW AutoValue will build most of that class for you, if it is as you
>> say.
>> >
>> > Kenn
>> >
>> > On Tue, Oct 23, 2018 at 6:04 PM Alex Amato <[email protected]> wrote:
>> >>
>> >> Hi Robert + beam dev list,
>> >>
>> >> I was thinking about your feedback in PR#6205, and agree that this
>> monitoring_infos.py became a bit big.
>> >>
>> >> I'm working on the Java Implementation of this now, and want to
>> incorporate some of these ideas and improve on this.
>> >>
>> >> I that that I should make something like a MonitoringInfoBuilder
>> class. With a few methods
>> >>
>> >> setUrn
>> >> setTimestamp
>> >> setting the value (One method for each Type we support. Setting this
>> will also set the type string)
>> >>
>> >> setInt64CounterValue
>> >> setDoubleCounterValue
>> >> setLatestInt64
>> >> setTopNInt64
>> >> setMonitoringDataTable
>> >> setDistributionInt64
>> >> ...
>> >>
>> >> setting labels (will set the key and value properly for the label)
>> >>
>> >> setPTransform(value)
>> >> setPcollection(value)
>> >> ...
>> >>
>> >>
>> >> I think this will make building a metric much easier, you would just
>> call the 4 methods and the .build(). These builders are common in Java. (I
>> guess there is a similar thing way we could do in python? I'd like to go
>> back and refactor that as well when I am done)
>> >>
>> >> -------------
>> >>
>> >> As for your other suggestion to define metrics in the proto/enum file
>> instead of the yaml file. I am not too sure about the best strategy for
>> this. My initial thoughts are:
>> >>
>> >> Make a proto extension allowing you to describe/define a
>> MonitoringInfo's (the same info as the metric_definitions.yaml file):
>> >>
>> >> URN
>> >> Type
>> >> Labels required
>> >> Annotations: Description, Units, etc.
>> >>
>> >> Make the builder read in that MonitoringInfo definision/description
>> assert everything is set properly? I think this would be a decent data
>> driven approach.
>> >>
>> >> I was wondering if you had something else in mind?
>> >>
>> >> Thanks
>> >> Alex
>> >>
>> >>
>>
>

Re: Follow up ideas, to simplify creating MonitoringInfos.

Reply via email to