On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <[email protected]> wrote:
> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <[email protected]> wrote: > >> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <[email protected]> >> wrote: >> >>> Also, the only use for payloads is because "User Counter" is currently a >>> single URN, rather than using the namespacing characteristics of URNs to >>> map user names onto distinct metric names. >>> >> >> Can they be URNs? I don't see value in having a "user metric" URN where >> you then have to look elsewhere for what the real name is. >> > > Yes, that was my point with the parenthetical statement. I would rather > have "beam:counter:user:use_provide_namespace:user_provide_name" than use > the payload field for this. So if we're going to keep the payload field, we > need more compelling usecases. > Or just "beam:counter:<namespace>:<name>" or even "beam:metric:<namespace>:<name>" since metrics have a type separate from their name. Kenn > A payload avoids the messiness of having to pack (and parse) arbitrary >> parameters into a name though.) If we're going to choose names that the >> system and sdks agree to have specific meanings, and to avoid accidental >> collisions, making them full-fledged documented URNs has value. >> > >> Value is the "payload". Likely worth changing the name to avoid confusion >> with the payload above. It's bytes because it depends on the type. I would >> try to avoid nesting it too deeply (e.g. a payload within a payload). If we >> thing the types are generally limited, another option would be a oneof >> field (with a bytes option just in case) for transparency. There are pros >> and cons going this route. >> >> Type is what I proposed we add, instead of it being implicit in the name >> (and unknowable if one does not recognize the name). This makes things more >> open-ended and easier to evolve and work with. >> >> Entity could be generalized to Label, or LabelSet if desired. But as >> mentioned I think it makes sense to pull this out as a separate field, >> especially when it makes sense to aggregate a single named counter across >> labels as well as for a single label (e.g. execution time of composite >> transforms). >> >> - Robert >> >> >> >> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <[email protected]> >> wrote: >> >>> Hi folks - >>> >>> Before we totally go down the path of highly structured metric protos, >>> I'd like to propose considering a simple metrics interface between the SDK >>> and the runner. Something more generic and closer to what most monitoring >>> systems would use. >>> >>> To use Spark as an example, the Metric system uses a simple metric >>> format of name, value and type to report all metrics in a single structure, >>> regardless of the source or context of the metric. >>> >>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html >>> >>> The subsystems have contracts for what metrics they will expose and how >>> they are calculated: >>> >>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html >>> >>> Codifying the system metrics in the SDK seems perfectly reasonable - no >>> reason to make the notion of metric generic at that level. But at the >>> point the metric is leaving the SDK and going to the runner, a simpler, >>> generic encoding of the metrics might make it easier to adapt and maintain >>> system. The generic format can include information about downstream >>> consumers, if that's useful. >>> >>> Spark supports a number of Metric Sinks - external monitoring systems. >>> If runners receive a simple list of metrics, implementing any number of >>> Sinks for Beam would be straightforward and would generally be a one time >>> implementation. If instead all system metrics are sent embedded in a >>> highly structured, semantically meaningful structure, runner code would >>> need to be updated to support exporting the new metric. We seem to be >>> heading in the direction of "if you don't understand this metric, you can't >>> use it / export it". But most systems seem to assume metrics are really >>> simple named values that can be handled a priori. >>> >>> So I guess my primary question is: Is it necessary for Beam to treat >>> metrics as highly semantic, arbitrarily complex data? Or could they >>> possibly be the sort of simple named values as they are in most monitoring >>> systems and in Spark? With the SDK potentially providing scaffolding to >>> add meaning and structure, but simplifying that out before leaving SDK >>> code. Is the coupling to a semantically meaningful structure between the >>> SDK and runner and necessary complexity? >>> >>> Andrea >>> >>> >>> >>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <[email protected]> >>> wrote: >>> >>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <[email protected]> wrote: >>>> >>>>> >>>>> *Thank you for this clarification. I think the table of files fits >>>>> into the model as one of type string-set (with union as aggregation). * >>>>> Its not a list of files, its a list of metadata for each file, several >>>>> pieces of data per file. >>>>> >>>>> Are you proposing that there would be separate URNs as well for each >>>>> entity being measured then, so the the URN defines the type of entity >>>>> being >>>>> measured. >>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for >>>>> PCollection entities >>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for >>>>> PTransform entities >>>>> >>>> >>>> Yes. FWIW, it may not even be needed to put this in the name, e.g. >>>> execution times are never for PCollections, and even if they were it'd be >>>> semantically a very different beast (which should not re-use the same URN). >>>> >>>> *message MetricSpec {* >>>>> * // (Required) A URN that describes the accompanying payload.* >>>>> * // For any URN that is not recognized (by whomever is inspecting* >>>>> * // it) the parameter payload should be treated as opaque and* >>>>> * // passed as-is.* >>>>> * string urn = 1;* >>>>> >>>>> * // (Optional) The data specifying any parameters to the URN. If* >>>>> * // the URN does not require any arguments, this may be omitted.* >>>>> * bytes parameters_payload = 2;* >>>>> >>>>> * // (Required) A URN that describes the type of values this metric* >>>>> * // records (e.g. durations that should be summed).* >>>>> *}* >>>>> >>>>> *message Metric[Values] {* >>>>> * // (Required) The original requesting MetricSpec.* >>>>> * MetricSpec metric_spec = 1;* >>>>> >>>>> * // A mapping of entities to (encoded) values.* >>>>> * map<string, bytes> values;* >>>>> This ignores the non-unqiueness of entity identifiers. This is why in >>>>> my doc, I have specified the entity type and its string identifier >>>>> @Ken, I believe you have pointed this out in the past, that uniqueness >>>>> is only guaranteed within a type of entity (all PCollections), but not >>>>> between entities (A Pcollection and PTransform may have the same >>>>> identifier). >>>>> >>>> >>>> See above for why this is not an issue. The extra complexity (in protos >>>> and code), the inability to use them as map keys, and the fact that they'll >>>> be 100% redundant for all entities for a given metric convinces me that >>>> it's not worth creating and tracking an enum for the type alongside the id. >>>> >>>> >>>>> *}* >>>>> >>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <[email protected]> >>>>> wrote: >>>>> >>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> To Robert's proto: >>>>>>> >>>>>>> // A mapping of entities to (encoded) values. >>>>>>>> map<string, bytes> values; >>>>>>>> >>>>>>> >>>>>>> Are the keys here the names of the metrics, aka what is used for >>>>>>> URNs in the doc? >>>>>>> >>>>>>>> >>>>>> They're the entities to which a metric is attached, e.g. a >>>>>> PTransform, a PCollection, or perhaps a process/worker. >>>>>> >>>>>> >>>>>>> } >>>>>>>> >>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I was >>>>>>>>>> going to bring here. Let's keep it simple and use concrete use cases >>>>>>>>>> to >>>>>>>>>> drive additional abstraction if/when it becomes compelling. >>>>>>>>>> >>>>>>>>>> Kenn >>>>>>>>>> >>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of >>>>>>>>>>> supported type" didn't include new ways of aggregating ints. As >>>>>>>>>>> long as >>>>>>>>>>> that means we have a fixed set of aggregations (that align with >>>>>>>>>>> what what >>>>>>>>>>> users want and metrics back end support) it seems like we are doing >>>>>>>>>>> user >>>>>>>>>>> metrics right. >>>>>>>>>>> >>>>>>>>>>> - Ben >>>>>>>>>>> >>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are >>>>>>>>>>>> used a lot but others are less mainstream so being too fine from >>>>>>>>>>>> the start >>>>>>>>>>>> can just add complexity and bugs in impls IMHO. >>>>>>>>>>>> >>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <[email protected]> >>>>>>>>>>>> a écrit : >>>>>>>>>>>> >>>>>>>>>>>>> By "type" of metric, I mean both the data types (including >>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a >>>>>>>>>>>>> type, as >>>>>>>>>>>>> would double-distribution. >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> When you say type do you mean accumulator type, result type, >>>>>>>>>>>>>> or accumulator strategy? Specifically, what is the "type" of >>>>>>>>>>>>>> sumint, >>>>>>>>>>>>>> sumlong, meanlong, etc? >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and >>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road >>>>>>>>>>>>>>> (and may >>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom >>>>>>>>>>>>>>> metrics of >>>>>>>>>>>>>>> standard type. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric >>>>>>>>>>>>>>>> types based on the fact they just weren't used enough to >>>>>>>>>>>>>>>> justify support. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back? >>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus >>>>>>>>>>>>>>>> any special >>>>>>>>>>>>>>>> system metrivs? >>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks. I think this has simplified things. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating >>>>>>>>>>>>>>>>> the idea of custom metrics and custom metric types. I would >>>>>>>>>>>>>>>>> propose >>>>>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field >>>>>>>>>>>>>>>>> "type" which is >>>>>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents >>>>>>>>>>>>>>>>> of its >>>>>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or >>>>>>>>>>>>>>>>> maxing over ints >>>>>>>>>>>>>>>>> would be a typical example. Though we could pursue making >>>>>>>>>>>>>>>>> this opaque to >>>>>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and >>>>>>>>>>>>>>>>> difficult) >>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least >>>>>>>>>>>>>>>>> aggregate and >>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself >>>>>>>>>>>>>>>>> understand the >>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the >>>>>>>>>>>>>>>>> specialization >>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as >>>>>>>>>>>>>>>>> well.) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for >>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and >>>>>>>>>>>>>>>>> it spec would >>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Robert >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato < >>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far. >>>>>>>>>>>>>>>>>> I have made a revision today which is to make all metrics >>>>>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of >>>>>>>>>>>>>>>>>> the protos a >>>>>>>>>>>>>>>>>> little bit. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The point of this change was to futureproof the >>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom >>>>>>>>>>>>>>>>>> aggregation >>>>>>>>>>>>>>>>>> functions for its metric updates. >>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated >>>>>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an >>>>>>>>>>>>>>>>>> approach which >>>>>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without >>>>>>>>>>>>>>>>>> deserializing them. These >>>>>>>>>>>>>>>>>> are forwarded to user provided code which then would >>>>>>>>>>>>>>>>>> deserialize the metric >>>>>>>>>>>>>>>>>> update payloads and perform the custom aggregations. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric >>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform >>>>>>>>>>>>>>>>>> names inside >>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the >>>>>>>>>>>>>>>>>> metrics as the >>>>>>>>>>>>>>>>>> entities are pulled outside of the metric. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw >>>>>>>>>>>>>>>>>> attention to this particular revision. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato < >>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make >>>>>>>>>>>>>>>>>>> a decision by Friday, and begin working on related PRs next >>>>>>>>>>>>>>>>>>> week. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before >>>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this >>>>>>>>>>>>>>>>>>> thread Friday >>>>>>>>>>>>>>>>>>> afternoon. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía < >>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it >>>>>>>>>>>>>>>>>>>> easily in >>>>>>>>>>>>>>>>>>>> future discussions, website, etc. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks for sharing. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw < >>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments. >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato < >>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> Hello beam community, >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this >>>>>>>>>>>>>>>>>>>> proposal so far. I >>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. >>>>>>>>>>>>>>>>>>>> There were some larger >>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of >>>>>>>>>>>>>>>>>>>> these I have added a >>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my >>>>>>>>>>>>>>>>>>>> recommendation as well >>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered. >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised >>>>>>>>>>>>>>>>>>>> proposal. Please take >>>>>>>>>>>>>>>>>>>> >> another look and let me know >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please >>>>>>>>>>>>>>>>>>>> take another look after >>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well. >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> Thanks again, >>>>>>>>>>>>>>>>>>>> >> Alex >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>
