I don't understand what you mean by "not sustainable".
Not something you can keep in your codebase forever.
It really depends on who does it, I would guess. It is most likely a couple
of months to get something that is bulletproof and mergeable.
Is it on the roadmap for a future version of LTTng?
Jérémie Galarneau <jeremie.galarn...@efficios.com> wrote:
On 29 May 2018 at 09:47, Loïc Gelle <loic.ge...@polymtl.ca> wrote:
I agree that integrating C code into a Go codebase is somewhat inelegant.
Not only that, but it's not sustainable. It is more a hack than a feature.
However, I'm not sure what you mean by "implementation issues that are
specific to the language itself".
I mean that if you put static calls to C tracepoints from a Go program,
you always have a function call (and the ~50ns overhead) triggered each
time you hit the tracepoint, whether the tracepoint is actually enabled or
not. So basically you can't count on the compiler (specific to the
language) to do clever branch prediction for you, which makes
instrumenting your code less attractive.
In the case of the first solution, note that the agent tracks which events
are enabled or not. In that sense, the check is performed within the Go
code, as is currently done for Python and Java.
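
A minimal sketch of such a Go-side check, assuming a hypothetical Tracepoint type whose enabled flag would be kept up to date by the agent. The emit field stands in for the cgo call into the C tracepoint; none of these names come from LTTng.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Tracepoint is a hypothetical Go-side handle for one event. In a real
// agent the enabled flag would be updated when the session daemon enables
// or disables the event; here it is toggled by hand.
type Tracepoint struct {
	name    string
	enabled atomic.Bool
	emit    func(a, b int) // stand-in for the cgo call into the C tracepoint
}

// Hit costs one atomic load when the event is disabled and only crosses
// into the (expensive) emit function when the event is enabled.
func (t *Tracepoint) Hit(a, b int) {
	if !t.enabled.Load() {
		return
	}
	t.emit(a, b)
}

func main() {
	calls := 0
	tp := &Tracepoint{name: "demo:event", emit: func(a, b int) { calls++ }}

	tp.Hit(1, 2) // disabled: emit is not called
	tp.enabled.Store(true)
	tp.Hit(3, 4) // enabled: emit is called once

	fmt.Println("emit calls:", calls)
}
```

With this shape, a disabled tracepoint never pays for the Go-to-C transition; whether an atomic load is cheap enough on the hot path would still have to be measured.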
I like the third solution that you propose. I think that the first one is
definitely not ideal and that the second one is too much work to build and
maintain. How much time do you estimate is necessary for the development
of the third solution?
By the way, I am currently working on instrumenting the Go runtime to
capture information on the goroutines. I am using dyntrace (
https://github.com/charpercyr/dyntrace) for that, which kind of works but
is really hacky.
That sounds great. What kind of information are you capturing?
Thanks,
Jérémie
Jérémie Galarneau <jeremie.galarn...@efficios.com> wrote:
On 28 May 2018 at 10:30, Loïc Gelle <loic.ge...@polymtl.ca> wrote:
Hi Jeremie,
Thanks for your answer. I roughly estimated the overhead of calling an
empty C function (passing two integer arguments) from Go at 50ns per call.
Maybe not a big deal for a lot of use cases, but more problematic if you
want to trace performance-critical parts of Go such as the runtime itself.
The overhead could be even bigger when it involves passing strings or arrays
that have different memory layouts in Golang and C. What was the overhead
that you observed for Python and Java?
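
For anyone wanting to reproduce that kind of rough estimate, here is a small timing harness. It is only a sketch: PerCall is a made-up helper, and the no-op closure stands in for the empty C function, so a real measurement would pass a wrapper around the cgo call instead.

```go
package main

import (
	"fmt"
	"time"
)

// PerCall runs f n times and returns the average wall-clock time per call.
// The loop and the indirect call add a little overhead of their own, so
// the result is only good for rough numbers.
func PerCall(n int, f func(a, b int)) time.Duration {
	if n <= 0 {
		return 0
	}
	start := time.Now()
	for i := 0; i < n; i++ {
		f(i, i+1)
	}
	return time.Since(start) / time.Duration(n)
}

func main() {
	// noop stands in for the empty C function taking two integers.
	noop := func(a, b int) {}
	fmt.Println("per call:", PerCall(10_000_000, noop))
}
```

Comparing the figure for the pure-Go no-op with the figure for the cgo wrapper gives the cost that is attributable to crossing into C.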
50ns per call doesn't sound too bad honestly.
You have to ask yourself if you could get within 50ns of lttng-ust's
performance with a custom ring buffer implemented in Go.
To use some very rough numbers, lttng-ust takes around 250ns per event for
that payload. With Mathieu's work on restartable sequences, that number
will be reduced quite a bit (by half, if I remember correctly), and I'm
not sure you'll be able to use that kind of mechanism from Go code.
I don't have numbers on hand for Python and Java. In both cases, we are
hooking into logging frameworks so the overhead of calling into C code
probably pales in comparison to the time spent formatting strings.
That's another problem in using the current "agent" mechanism; it really
only accommodates a very specific tracepoint signature that takes a string
payload.
From what I understand, it will always be a problem to have agents for
languages other than C, especially if you want to keep relying on
existing C code. Even if the sessiond part is independent from the agent
itself, there are tons of implementation issues that are specific to the
language itself. The problem with Go is that calling C functions is really
a hack that does not integrate well with the build system that was designed
for Go.
The solutions I see:
1) Replicate the current "agent" scheme and serialize all Go events to strings
Not ideal as you lose the events' typing, you have to serialize to strings
on the fast path, and you can hardly filter on event payloads.
2) Write a native Go ring-buffer that can be consumed by LTTng
In essence, all the tracing would happen in Go. Events would be serialized
by Go code and the Go "agent" would produce the CTF metadata that describes
their layout.
From an integration standpoint, that's probably the most elegant solution
as you have no hard dependency on native code in your Go projects. However,
it's a _lot_ of work.
First, you have to re-implement a ring-buffer that needs to perform within
50ns of lttng-ust's ring-buffer to be useful. You also need to port the
event filtering bytecode interpreter to Go.
Then, we need to find a way to consume that ring-buffer's content from a
form of consumer daemon within lttng-tools.
3) Add an lttng-ust API to allow dynamic event declaration
This is something we have been considering for a while.
Basically, we would like to introduce an API that allows applications to
dynamically declare tracepoints.
Then, those events would be serialized from Go, but the ring-buffer logic
would remain in C.
On each event, we would:
- Obtain a memory area from lttng-ust (reserve phase, C code called from Go)
- Write the event's content to that area (from Go code)
- Commit the event (C code called from Go)
With this, you don't have to manually declare tracepoints and integrate
them into a build system to generate providers; the Go application just
needs to link to lttng-ust at runtime.
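
The three steps above can be sketched in Go as follows. This is not an existing lttng-ust API: Buffer, Reserve, Commit and TraceTwoInts are hypothetical names, and the buffer is a plain in-memory stand-in so the sketch runs without cgo. In the proposed scheme, Reserve and Commit would be C functions called from Go.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Buffer is a hypothetical stand-in for the lttng-ust ring buffer.
type Buffer struct {
	data      []byte
	reserved  int // end of the last reserved area
	committed int // end of the last committed area
}

// ErrFull is returned when no room is left for a reservation.
var ErrFull = errors.New("buffer full")

// Reserve hands out a writable area of n bytes (reserve phase).
func (b *Buffer) Reserve(n int) ([]byte, error) {
	if b.reserved+n > len(b.data) {
		return nil, ErrFull
	}
	area := b.data[b.reserved : b.reserved+n]
	b.reserved += n
	return area, nil
}

// Commit makes everything reserved so far visible to a consumer.
func (b *Buffer) Commit() { b.committed = b.reserved }

// Committed returns the bytes a consumer would be allowed to read.
func (b *Buffer) Committed() []byte { return b.data[:b.committed] }

// TraceTwoInts serializes a two-integer payload from Go code between the
// reserve and commit steps.
func TraceTwoInts(b *Buffer, x, y uint32) error {
	area, err := b.Reserve(8)
	if err != nil {
		return err
	}
	binary.LittleEndian.PutUint32(area[0:4], x)
	binary.LittleEndian.PutUint32(area[4:8], y)
	b.Commit()
	return nil
}

func main() {
	buf := &Buffer{data: make([]byte, 16)}
	if err := TraceTwoInts(buf, 1, 2); err != nil {
		fmt.Println("event dropped:", err)
	}
	fmt.Printf("% x\n", buf.Committed())
}
```

Note that each event still crosses the Go/C boundary twice (reserve and commit), so the per-call overhead discussed earlier in the thread would apply to both calls.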
It's not a perfect solution, but it seems like an interesting compromise.
What do you think?
Jérémie
Did I provide more context?
Cheers,
Loïc.
Jérémie Galarneau <jeremie.galarn...@efficios.com> wrote:
On 4 May 2018 at 06:03, Loïc Gelle <loic.ge...@polymtl.ca> wrote:
Hi,
There was a previous discussion on the mailing list about porting
LTTng to Golang, about a year ago:
https://lists.lttng.org/pipermail/lttng-dev/2017-June/027203.html . This new
topic is meant to discuss implementation possibilities more precisely.
Currently, one has to use the C UST agent from LTTng in order to
instrument Golang programs, and to compile the whole thing using custom
Makefiles and cgo. Here is a recent example that I wrote:
https://github.com/loicgelle/jaeger-go-lttng-instr
As you can guess, there are a lot of drawbacks to that approach. It is
actually a hack and cannot be integrated into more complex Golang programs
that use a more complex build process (e.g. the Golang runtime itself),
because of the compiler instructions that you have to include at the top of
the Golang files. There is also a big concern about the performance of this
solution, as calling a C function from Go requires a full context
switch on the stack, because the calling conventions in C and Golang are
different.
I think a more integrated and performant solution is needed. We can't
really ignore a language such as Golang that is now widely adopted for
cloud applications. LTTng is really the best solution out there in terms of
overhead per tracepoint, and could benefit from being made available to
such a large community. My question to the experts on this mailing list:
how much would it take to write a Golang agent for LTTng?
Hi Loïc,
Without having performed any measurements myself, it does seem like calling
C from Go is very expensive. In that context, I can see that LTTng would
probably lose its performance advantage over any native Go solution.
However, it wouldn't hurt to measure the impact and see if it really is a
deal breaker.
We faced the same dilemma when implementing the Java and Python support in
lttng-ust. In those cases, we ended up calling C code, with the performance
penalties it implies. The correlation with other applications' and the
kernel's events, along with the rest of LTTng's features, provided enough
value to make that solution worthwhile.
There aren't a ton of solutions if we can't call existing C code. We
basically have to reimplement a ring-buffer and the setup/communication
infrastructure to interact with the lttng-sessiond. The communication with
the session daemon is not a big concern as the protocol is fairly
straightforward.
The "hairy" part is that lttng-ust and lttng-consumerd use a shared memory
map to produce and consume the tracing buffers. This means that all changes
to that memory layout would need to be replicated in the Go tracer, making
future evolution more difficult. Also, I don't know how easy it would be to
synchronize C and Go applications interacting in a shared memory map given
those languages have different memory models. My knowledge of Go doesn't go
that far.
A more viable solution could be to introduce a Go-native consumer daemon
implementing its own synchronization with Go applications. This way, that
implementation could evolve on its own and could also start with a simpler
ring buffer than lttng-ust's.
Still, it is not a small undertaking and it basically means maintaining a
third tracer implementation.
What do you think?
Thanks!
Jérémie
Cheers,
Loïc.
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
--
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com