Sounds like one of the first decision points is whether to use a framework with distributed tracing or not. I think I would opt for not requiring distributed tracing.
Most of Iceberg is a self-contained library, so there are few points at which distributed tracing would make sense. Is there much value in tracing the metadata swap that happens in a metastore? I'm not sure there is.

I think it would probably be sufficient to use a simpler metrics library. I've used DropWizard before, which I thought was trying to be the SLF4J of metrics. Is that still the case?

I'd prefer to go with an established project that is likely to have broad support and that has a reasonable dependency set.

On Mon, Feb 18, 2019 at 2:33 PM filip <filip....@gmail.com> wrote:

> Both these solutions provide support for collecting metrics and distributed tracing independent of the platform of choice. They seem to overlap quite a lot though.
>
> OpenCensus [1] provides bindings for Go, Java, C++ and more [2], and it also seems to support OOB backends as well as custom ones [3]. Looking over the troubleshooting section [4], I could see reasonable value in collecting performance metrics for measures around operation retries, latencies, error rates, etc., though I guess that distributed tracing is their main selling point. The documentation advertises a low footprint too.
>
> OpenTracing focuses on providing a standard for distributed tracing at both the service and application level. No backend is provided OOB afaik, but it seems to be covered quite extensively by existing backends such as Zipkin, CNCF Jaeger and more [5]. Their specification documentation [6] is very comprehensive.
>
> Oh, and there is OpenMetrics [7] too, which aims to standardize how we expose metrics. I am learning a lot of interesting things from their issues page [8].
>
> Then there is the good old codahale/dropwizard metrics library [9] that we could leverage just as well to expose internal metrics from the library, with no potential distributed tracing support though.
> I don't think that DW metrics supports tags though; reading [10], it seems they're treating it as a breaking change, and the engineering team is looking to add tag support in version 5.0.
>
> I am thinking that distributed tracing might prove very useful for troubleshooting operations that require atomic guarantees.
> I am thinking/hoping that, should any backend we'd use for implementing Iceberg be using either OpenCensus or OpenTracing, we might get support for distributed tracing; it'd be really interesting to see traces spanning process boundaries.
>
> I am saying a lot of "hoping" and "thinking" because I haven't used either one in a real-world implementation, but I thought I might get folks interested in the topic and that something good might come out of this.
>
> [1] https://opencensus.io/introduction/
>     https://opensource.google.com/projects/opencensus
> [2] https://opencensus.io/language-support/
> [3] https://opencensus.io/introduction/#backend-support
> [4] https://opencensus.io/advanced-concepts/troubleshooting/
> [5] https://opentracing.io/docs/supported-tracers/
> [6] https://opentracing.io/specification/
> [7] https://openmetrics.io/
> [8] https://github.com/OpenObservability/OpenMetrics/issues
> [9] https://metrics.dropwizard.io/4.0.0/
> [10] https://github.com/dropwizard/metrics/issues/1175
>
> On Mon, Feb 18, 2019 at 11:03 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
>
>> I don't know. Can you elaborate on what opencensus and opentracing are?
>>
>> On Mon, Feb 18, 2019 at 12:51 PM filip <filip....@gmail.com> wrote:
>>
>>> /Filip
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>
> --
> Filip Bocse

--
Ryan Blue
Software Engineer
Netflix