[lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
Hello,

I already started a thread over at xenomai.org [1], but I guess it's
more efficient to ask here as well.
The basic concept is that Xenomai threads run *below* Linux (threads
and irg handlers), which means that Xenomai threads must not use any
Linux services like the futex syscall or socket communication.

## tracepoints

Assuming that tracepoints are the only thing used from the Xenomai
threads, is there anything in them that relies on Linux services?
The "bulletproof" urcu apparently does not need anything for the
reader lock (as long as the thread is already registered), but I don't
know how the write-buffers are prepared.

You can call Linux syscalls from Xenomai threads (the thread will switch to its
Linux shadow thread for that and lose its realtime characteristics), so a
one-time setup/shutdown like registering the threads is not an issue.

## membarrier syscall

I haven't got an explanation yet, but I believe this syscall does
nothing to Xenomai threads (each has a shadow Linux thread that is
*idle* while the Xenomai thread is active).
liburcu has configure options that allow forcing the use of this syscall,
but none for disabling it, which is likely what Xenomai needs.

Any input is welcome.
Kind regards, Norbert

[1] - https://xenomai.org/pipermail/xenomai/2019-November/042027.html
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com wrote:

> Hello,
> 
> I already started a thread over at xenomai.org [1], but I guess its
> more efficient to ask here aswell.
> The basic concept is that xenomai thread run *below* Linux (threads
> and irg handlers), which means that xenomai threads must not use any

I guess you mean "irq handlers" here.

> linux services like the futex syscall or socket communication.
> 
> ## tracepoints
> 
> expecting that tracepoints are the only thing that should be used from
> the xenomai threads, is there anything using linux services.
> the "bulletproof" urcu apparently does not need anything for the
> reader lock (aslong as the thread is already registered),

Indeed the first time the urcu-bp read-lock is encountered by a thread,
the thread registration is performed, which requires locks, memory allocation,
and so on. After that, the thread can use urcu-bp read-side lock without
requiring any system call.

> but I dont know how the write-buffers are prepared.

LTTng-UST prepares the ring buffers from lttng-ust's "listener" thread,
which is injected into the process by a lttng-ust constructor.

What you will care about is how the tracepoint call-site (within a Xenomai
thread) interacts with the ring buffers.

The "default" setup for lttng-ust ring buffers is not suitable for Xenomai
threads. The lttng-ust ring buffer is split into sub-buffers, each sub-buffer
corresponding to a CTF trace "packet". When a sub-buffer is filled, lttng-ust
invokes "write(2)" to a pipe to let the consumer daemon know there is data
available in that ring buffer. You will want to get rid of that write(2) system
call from a Xenomai thread.

The proper configuration is to use lttng-enable-channel(1) "--read-timer"
option (see https://lttng.org/docs/v2.11/#doc-channel-read-timer). This will
ensure that the consumer daemon uses a polling approach to check periodically
whether data needs to be consumed within each buffer, thus removing the
use of the write(2) system call on the application-side.
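
For example, a channel created along the following lines will rely on the
read timer only (the session, channel, and provider names are placeholders
here, and the read-timer period, in microseconds, should be tuned to your
needs):

  lttng create rt-session
  lttng enable-channel --userspace --read-timer=2000 rt-channel
  lttng enable-event --userspace --channel=rt-channel 'my_provider:*'
  lttng start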

> 
> You can call linux sycalls from xenomai threads (it will switch to the
> linux shadow thread for that and lose realtime characteristics), so a
> one time setup/shutdown like registering the threads is not an issue.

OK, good, so you can actually do the initial setup when launching the thread.
You need to remember to invoke a liburcu-bp read-side lock/unlock pair,
or call urcu_bp_read_ongoing() at thread startup within that initialization
phase to ensure urcu-bp registration has been performed.
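
As a rough sketch (the tracepoint provider/event names are placeholders),
such an initialization phase in a thread entry point could look like:

  #include <urcu/urcu-bp.h>

  static void *rt_thread(void *arg)
  {
      /* Setup phase, still allowed to take locks / allocate / do syscalls:
       * a read-side lock/unlock pair forces urcu-bp thread registration. */
      urcu_bp_read_lock();
      urcu_bp_read_unlock();

      /* ... switch to the real-time part of the thread ... */
      for (;;) {
          /* tracepoint(my_provider, my_event, ...);  now syscall-free */
      }
      return NULL;
  }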

> 
> ## membarrier syscall
> 
> I haven't got an explanation yet, but I believe this syscall does
> nothing to xenomai threads (each has a shadow linux thread, that is
> *idle* when the xenomai thread is active).

That's indeed a good point. I suspect membarrier may not send any IPI
to Xenomai threads (that would have to be confirmed). I suspect the
latency introduced by this IPI would be unwanted.

> liburcu has configure options allow forcing the usage of this syscall
> but not disabling it, which likely is necessary for Xenomai.

I suspect what you'd need there is a way to allow a process to tell
liburcu-bp (or liburcu) to always use the fall-back mechanism which does
not rely on sys_membarrier. This could be allowed before the first use of
the library. I think extending the liburcu APIs to allow this should be
straightforward enough. This approach would be more flexible than requiring
liburcu to be specialized at configure time. This new API would return an error
if invoked with a liburcu library compiled with 
--disable-sys-membarrier-fallback.

If you have control over your entire system's kernel, you may want to try
just configuring the kernel with CONFIG_MEMBARRIER=n in the meantime.

Another thing to make sure is to have a glibc and Linux kernel which perform
clock_gettime() as vDSO for the monotonic clock, because you don't want a
system call there. If that does not work for you, you can alternatively
implement your own lttng-ust and lttng-modules clock plugin .so/.ko to override
the clock used by lttng, and for instance use TSC directly. See for instance
the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
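
As a rough idea of what such a plugin looks like (modelled after the
clock-override example shipped with lttng-ust; the setter names and the
read_rt_clock() source below are assumptions to verify against your
lttng-ust version), the .so only needs to provide one entry point:

  #include <lttng/ust-clock.h>
  #include <stdint.h>
  #include <string.h>

  static uint64_t read_rt_clock(void)
  {
      /* Placeholder: return nanoseconds from a syscall-free monotonic
       * source (e.g. TSC + interpolation, or whatever Cobalt uses). */
      return 0;
  }

  static uint64_t rt_clock_freq(void)
  {
      return 1000000000ULL;  /* returned values are in nanoseconds */
  }

  static const char *rt_clock_name(void)
  {
      return "rt_example_clock";
  }

  static const char *rt_clock_description(void)
  {
      return "Syscall-free example clock for RT threads";
  }

  static int rt_clock_uuid(char *uuid)
  {
      /* Any unique identifier; the buffer holds a textual UUID. */
      strcpy(uuid, "2b75ff49-6f05-4df1-96ef-5e5f7d68e105");
      return 0;
  }

  void lttng_ust_clock_plugin_init(void)
  {
      lttng_ust_trace_clock_set_read64_cb(read_rt_clock);
      lttng_ust_trace_clock_set_freq_cb(rt_clock_freq);
      lttng_ust_trace_clock_set_name_cb(rt_clock_name);
      lttng_ust_trace_clock_set_description_cb(rt_clock_description);
      lttng_ust_trace_clock_set_uuid_cb(rt_clock_uuid);
      lttng_ust_enable_trace_clock_override();
  }

  /* Then run the traced application with
   * LTTNG_UST_CLOCK_PLUGIN=/path/to/this/plugin.so */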

Thanks,

Mathieu


> 
> Any input is welcome.
> Kind regards, Norbert
> 
> [1] - https://xenomai.org/pipermail/xenomai/2019-November/042027.html
> ___
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Jan Kiszka

On 22.11.19 16:42, Mathieu Desnoyers wrote:

- On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com wrote:


Hello,

I already started a thread over at xenomai.org [1], but I guess its
more efficient to ask here aswell.
The basic concept is that xenomai thread run *below* Linux (threads
and irg handlers), which means that xenomai threads must not use any


I guess you mean "irq handlers" here.


linux services like the futex syscall or socket communication.

## tracepoints

expecting that tracepoints are the only thing that should be used from
the xenomai threads, is there anything using linux services.
the "bulletproof" urcu apparently does not need anything for the
reader lock (aslong as the thread is already registered),


Indeed the first time the urcu-bp read-lock is encountered by a thread,
the thread registration is performed, which requires locks, memory allocation,
and so on. After that, the thread can use urcu-bp read-side lock without
requiring any system call.


So, we will probably want to perform such a registration unconditionally 
(in case lttng usage is enabled) for our RT threads during their setup.





but I dont know how the write-buffers are prepared.


LTTng-UST prepares the ring buffers from lttng-ust's "listener" thread,
which is injected into the process by a lttng-ust constructor.

What you will care about is how the tracepoint call-site (within a Xenomai
thread) interacts with the ring buffers.

The "default" setup for lttng-ust ring buffers is not suitable for Xenomai
threads. The lttng-ust ring buffer is split into sub-buffers, each sub-buffer
corresponding to a CTF trace "packet". When a sub-buffer is filled, lttng-ust
invokes "write(2)" to a pipe to let the consumer daemon know there is data
available in that ring buffer. You will want to get rid of that write(2) system
call from a Xenomai thread.

The proper configuration is to use lttng-enable-channel(1) "--read-timer"
option (see https://lttng.org/docs/v2.11/#doc-channel-read-timer). This will
ensure that the consumer daemon uses a polling approach to check periodically
whether data needs to be consumed within each buffer, thus removing the
use of the write(2) system call on the application-side.



You can call linux sycalls from xenomai threads (it will switch to the
linux shadow thread for that and lose realtime characteristics), so a
one time setup/shutdown like registering the threads is not an issue.


OK, good, so you can actually do the initial setup when launching the thread.
You need to remember to invoke use a liburcu-bp read-side lock/unlock pair,
or call urcu_bp_read_ongoing() at thread startup within that initialization
phase to ensure urcu-bp registration has been performed.



## membarrier syscall

I haven't got an explanation yet, but I believe this syscall does
nothing to xenomai threads (each has a shadow linux thread, that is
*idle* when the xenomai thread is active).


That's indeed a good point. I suspect membarrier may not send any IPI
to Xenomai threads (that would have to be confirmed). I suspect the
latency introduced by this IPI would be unwanted.


Is an "IPI" a POSIX signal here? Or are real IPI that delivers an 
interrupt to Linux on another CPU? The latter would still be possible, 
but it would be delayed until all Xenomai threads on that core eventual 
took a break (which should happen a couple of times per second under 
normal conditions - 100% RT load is an illegal application state).





liburcu has configure options allow forcing the usage of this syscall
but not disabling it, which likely is necessary for Xenomai.


I suspect what you'd need there is a way to allow a process to tell
liburcu-bp (or liburcu) to always use the fall-back mechanism which does
not rely on sys_membarrier. This could be allowed before the first use of
the library. I think extending the liburcu APIs to allow this should be
straightforward enough. This approach would be more flexible than requiring
liburcu to be specialized at configure time. This new API would return an error
if invoked with a liburcu library compiled with 
--disable-sys-membarrier-fallback.

If you have control over your entire system's kernel, you may want to try
just configuring the kernel within CONFIG_MEMBARRIER=n in the meantime.

Another thing to make sure is to have a glibc and Linux kernel which perform
clock_gettime() as vDSO for the monotonic clock, because you don't want a
system call there. If that does not work for you, you can alternatively
implement your own lttng-ust and lttng-modules clock plugin .so/.ko to override
the clock used by lttng, and for instance use TSC directly. See for instance
the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.


clock_gettime & Co for a Xenomai application is syscall-free as well.

Thanks,
Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

[lttng-dev] [RFC PATCH liburcu] urcu-bp: introduce urcu_bp_disable_sys_membarrier()

2019-11-22 Thread Mathieu Desnoyers
Real-time applications with Xenomai threads wishing to use urcu-bp
read-side within real-time threads require to disable use of the
membarrier system call, relying on the fall-back based on regular
memory barriers on the read-side instead. Allow disabling use of
sys_membarrier before liburcu-bp's first use.

Signed-off-by: Mathieu Desnoyers 
---
 include/urcu/urcu-bp.h | 12 
 src/urcu-bp.c  | 38 +++---
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/include/urcu/urcu-bp.h b/include/urcu/urcu-bp.h
index 2ea17e6..bfab965 100644
--- a/include/urcu/urcu-bp.h
+++ b/include/urcu/urcu-bp.h
@@ -157,6 +157,18 @@ extern void urcu_bp_after_fork_child(void);
 extern void urcu_bp_register_thread(void);
 
 /*
+ * Require liburcu-bp to use the fallback (based on memory barriers on
+ * the read-side) rather than pairing the sys_membarrier system call in
+ * synchronize_rcu() with compiler barriers on the read-side. Should
+ * be invoked when there are no RCU reader threads present.
+ * Return 0 on success.
+ * Return -1, errno = EBUSY if there are RCU reader threads present.
+ * Return -1, errno = EINVAL if the library has been configured without
+ * the membarrier fallback support.
+ */
+extern int urcu_bp_disable_sys_membarrier(void);
+
+/*
  * In the bulletproof version, the following functions are no-ops.
  */
 static inline void urcu_bp_unregister_thread(void)
diff --git a/src/urcu-bp.c b/src/urcu-bp.c
index 05efd97..4aaa3d6 100644
--- a/src/urcu-bp.c
+++ b/src/urcu-bp.c
@@ -123,6 +123,8 @@ void __attribute__((destructor)) urcu_bp_exit(void);
 int urcu_bp_has_sys_membarrier;
 #endif
 
+static bool urcu_bp_sys_membarrier_is_disabled;
+
 /*
  * rcu_gp_lock ensures mutual exclusion between threads calling
  * synchronize_rcu().
@@ -607,6 +609,11 @@ void urcu_bp_thread_exit_notifier(void *rcu_key)
 
 #ifdef CONFIG_RCU_FORCE_SYS_MEMBARRIER
 static
+bool urcu_bp_force_sys_membarrier(void)
+{
+   return true;
+}
+static
 void urcu_bp_sys_membarrier_status(bool available)
 {
if (!available)
@@ -614,20 +621,45 @@ void urcu_bp_sys_membarrier_status(bool available)
 }
 #else
 static
+bool urcu_bp_force_sys_membarrier(void)
+{
+   return false;
+}
+static
 void urcu_bp_sys_membarrier_status(bool available)
 {
-   if (!available)
-   return;
-   urcu_bp_has_sys_membarrier = 1;
+   urcu_bp_has_sys_membarrier = available;
 }
 #endif
 
+int urcu_bp_disable_sys_membarrier(void)
+{
+   mutex_lock(&rcu_registry_lock);
+   if (!cds_list_empty(&registry)) {
+   mutex_unlock(&rcu_registry_lock);
+   errno = EBUSY;
+   return -1;
+   }
+   mutex_unlock(&rcu_registry_lock);
+   if (urcu_bp_force_sys_membarrier()) {
+   errno = EINVAL;
+   return -1;
+   }
+   mutex_lock(&init_lock);
+   urcu_bp_sys_membarrier_is_disabled = true;
+   urcu_bp_sys_membarrier_status(false);
+   mutex_unlock(&init_lock);
+   return 0;
+}
+
 static
 void urcu_bp_sys_membarrier_init(void)
 {
bool available = false;
int mask;
 
+   if (urcu_bp_sys_membarrier_is_disabled)
+   return;
mask = membarrier(MEMBARRIER_CMD_QUERY, 0);
if (mask >= 0) {
if (mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED) {
-- 
2.11.0

___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 10:52 AM, Jan Kiszka jan.kis...@siemens.com wrote:

> On 22.11.19 16:42, Mathieu Desnoyers wrote:
>> - On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com wrote:
>> 
>>> Hello,
>>>
>>> I already started a thread over at xenomai.org [1], but I guess its
>>> more efficient to ask here aswell.
>>> The basic concept is that xenomai thread run *below* Linux (threads
>>> and irg handlers), which means that xenomai threads must not use any
>> 
>> I guess you mean "irq handlers" here.
>> 
>>> linux services like the futex syscall or socket communication.
>>>
>>> ## tracepoints
>>>
>>> expecting that tracepoints are the only thing that should be used from
>>> the xenomai threads, is there anything using linux services.
>>> the "bulletproof" urcu apparently does not need anything for the
>>> reader lock (aslong as the thread is already registered),
>> 
>> Indeed the first time the urcu-bp read-lock is encountered by a thread,
>> the thread registration is performed, which requires locks, memory 
>> allocation,
>> and so on. After that, the thread can use urcu-bp read-side lock without
>> requiring any system call.
> 
> So, we will probably want to perform such a registration unconditionally
> (in case lttng usage is enabled) for our RT threads during their setup.

Yes. I'm currently doing a slight update to liburcu master branch to
allow urcu_bp_register_thread() calls to invoke urcu_bp_register() if
the thread is not registered yet. This seems more expected than implementing
urcu_bp_register_thread() as a no-op.

If you care about older liburcu versions, you will want to stick to using
rcu read lock/unlock pairs or rcu_read_ongoing to initialize urcu-bp, but
with future liburcu versions, urcu_bp_register_thread() will be another
option. See:

commit 5b46e39d0e4d2592853c7bfc11add02b1101c04b
Author: Mathieu Desnoyers 
Date:   Fri Nov 22 11:02:36 2019 -0500

urcu-bp: perform thread registration on urcu_bp_register_thread

> 
>> 
>>> but I dont know how the write-buffers are prepared.
>> 
>> LTTng-UST prepares the ring buffers from lttng-ust's "listener" thread,
>> which is injected into the process by a lttng-ust constructor.
>> 
>> What you will care about is how the tracepoint call-site (within a Xenomai
>> thread) interacts with the ring buffers.
>> 
>> The "default" setup for lttng-ust ring buffers is not suitable for Xenomai
>> threads. The lttng-ust ring buffer is split into sub-buffers, each sub-buffer
>> corresponding to a CTF trace "packet". When a sub-buffer is filled, lttng-ust
>> invokes "write(2)" to a pipe to let the consumer daemon know there is data
>> available in that ring buffer. You will want to get rid of that write(2) 
>> system
>> call from a Xenomai thread.
>> 
>> The proper configuration is to use lttng-enable-channel(1) "--read-timer"
>> option (see https://lttng.org/docs/v2.11/#doc-channel-read-timer). This will
>> ensure that the consumer daemon uses a polling approach to check periodically
>> whether data needs to be consumed within each buffer, thus removing the
>> use of the write(2) system call on the application-side.
>> 
>>>
>>> You can call linux sycalls from xenomai threads (it will switch to the
>>> linux shadow thread for that and lose realtime characteristics), so a
>>> one time setup/shutdown like registering the threads is not an issue.
>> 
>> OK, good, so you can actually do the initial setup when launching the thread.
>> You need to remember to invoke use a liburcu-bp read-side lock/unlock pair,
>> or call urcu_bp_read_ongoing() at thread startup within that initialization
>> phase to ensure urcu-bp registration has been performed.
>> 
>>>
>>> ## membarrier syscall
>>>
>>> I haven't got an explanation yet, but I believe this syscall does
>>> nothing to xenomai threads (each has a shadow linux thread, that is
>>> *idle* when the xenomai thread is active).
>> 
>> That's indeed a good point. I suspect membarrier may not send any IPI
>> to Xenomai threads (that would have to be confirmed). I suspect the
>> latency introduced by this IPI would be unwanted.
> 
> Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
> interrupt to Linux on another CPU? The latter would still be possible,
> but it would be delayed until all Xenomai threads on that core eventual
> took a break (which should happen a couple of times per second under
> normal conditions - 100% RT load is an illegal application state).

I'm talking about a real in-kernel IPI (as in inter-processor interrupt).
However, the way sys_membarrier detects which CPUs should receive that IPI
is by iterating over all CPU runqueues and figuring out which CPU is currently
running a thread which uses the same mm as the sys_membarrier caller
(for the PRIVATE membarrier commands).

So I suspect that the Xenomai thread is really not within the Linux scheduler
runqueue when it runs.

> 
>> 
>>> liburcu has configure options allow forcing the usage of this syscall
>>> but not disabling it, which likely is necessary for Xenomai.

Re: [lttng-dev] [RFC PATCH liburcu] urcu-bp: introduce urcu_bp_disable_sys_membarrier()

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 12:00 PM, Mathieu Desnoyers 
mathieu.desnoy...@efficios.com wrote:

> Real-time applications with Xenomai threads wishing to use urcu-bp
> read-side within real-time threads require to disable use of the
> membarrier system call, relying on the fall-back based on regular
> memory barriers on the read-side instead. Allow disabling use of
> sys_membarrier before liburcu-bp's first use.

This last sentence should actually read:

Allow disabling use of sys_membarrier when there are no urcu-bp reader
threads present.

Thanks,

Mathieu

> 
> Signed-off-by: Mathieu Desnoyers 
> ---
> include/urcu/urcu-bp.h | 12 
> src/urcu-bp.c  | 38 +++---
> 2 files changed, 47 insertions(+), 3 deletions(-)
> 
> diff --git a/include/urcu/urcu-bp.h b/include/urcu/urcu-bp.h
> index 2ea17e6..bfab965 100644
> --- a/include/urcu/urcu-bp.h
> +++ b/include/urcu/urcu-bp.h
> @@ -157,6 +157,18 @@ extern void urcu_bp_after_fork_child(void);
> extern void urcu_bp_register_thread(void);
> 
> /*
> + * Require liburcu-bp to use the fallback (based on memory barriers on
> + * the read-side) rather than pairing the sys_membarrier system call in
> + * synchronize_rcu() with compiler barriers on the read-side. Should
> + * be invoked when there are no RCU reader threads present.
> + * Return 0 on success.
> + * Return -1, errno = EBUSY if there are RCU reader threads present.
> + * Return -1, errno = EINVAL if the library has been configured without
> + * the membarrier fallback support.
> + */
> +extern int urcu_bp_disable_sys_membarrier(void);
> +
> +/*
>  * In the bulletproof version, the following functions are no-ops.
>  */
> static inline void urcu_bp_unregister_thread(void)
> diff --git a/src/urcu-bp.c b/src/urcu-bp.c
> index 05efd97..4aaa3d6 100644
> --- a/src/urcu-bp.c
> +++ b/src/urcu-bp.c
> @@ -123,6 +123,8 @@ void __attribute__((destructor)) urcu_bp_exit(void);
> int urcu_bp_has_sys_membarrier;
> #endif
> 
> +static bool urcu_bp_sys_membarrier_is_disabled;
> +
> /*
>  * rcu_gp_lock ensures mutual exclusion between threads calling
>  * synchronize_rcu().
> @@ -607,6 +609,11 @@ void urcu_bp_thread_exit_notifier(void *rcu_key)
> 
> #ifdef CONFIG_RCU_FORCE_SYS_MEMBARRIER
> static
> +bool urcu_bp_force_sys_membarrier(void)
> +{
> + return true;
> +}
> +static
> void urcu_bp_sys_membarrier_status(bool available)
> {
>   if (!available)
> @@ -614,20 +621,45 @@ void urcu_bp_sys_membarrier_status(bool available)
> }
> #else
> static
> +bool urcu_bp_force_sys_membarrier(void)
> +{
> + return false;
> +}
> +static
> void urcu_bp_sys_membarrier_status(bool available)
> {
> - if (!available)
> - return;
> - urcu_bp_has_sys_membarrier = 1;
> + urcu_bp_has_sys_membarrier = available;
> }
> #endif
> 
> +int urcu_bp_disable_sys_membarrier(void)
> +{
> + mutex_lock(&rcu_registry_lock);
> + if (!cds_list_empty(&registry)) {
> + mutex_unlock(&rcu_registry_lock);
> + errno = EBUSY;
> + return -1;
> + }
> + mutex_unlock(&rcu_registry_lock);
> + if (urcu_bp_force_sys_membarrier()) {
> + errno = EINVAL;
> + return -1;
> + }
> + mutex_lock(&init_lock);
> + urcu_bp_sys_membarrier_is_disabled = true;
> + urcu_bp_sys_membarrier_status(false);
> + mutex_unlock(&init_lock);
> + return 0;
> +}
> +
> static
> void urcu_bp_sys_membarrier_init(void)
> {
>   bool available = false;
>   int mask;
> 
> + if (urcu_bp_sys_membarrier_is_disabled)
> + return;
>   mask = membarrier(MEMBARRIER_CMD_QUERY, 0);
>   if (mask >= 0) {
>   if (mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED) {
> --
> 2.11.0

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Jan Kiszka

On 22.11.19 18:01, Mathieu Desnoyers wrote:


## membarrier syscall

I haven't got an explanation yet, but I believe this syscall does
nothing to xenomai threads (each has a shadow linux thread, that is
*idle* when the xenomai thread is active).


That's indeed a good point. I suspect membarrier may not send any IPI
to Xenomai threads (that would have to be confirmed). I suspect the
latency introduced by this IPI would be unwanted.


Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
interrupt to Linux on another CPU? The latter would still be possible,
but it would be delayed until all Xenomai threads on that core eventual
took a break (which should happen a couple of times per second under
normal conditions - 100% RT load is an illegal application state).


I'm talking about a real in-kernel IPI (as in inter-processor interrupt).
However, the way sys_membarrier detects which CPUs should receive that IPI
is by iterating on all cpu runqueues, and figure out which CPU is currently
running a thread which uses the same mm as the sys_membarrier caller
(for the PRIVATE membarrier commands).

So I suspect that the Xenomai thread is really not within the Linux scheduler
runqueue when it runs.


True. Xenomai first suspends the RT thread's Linux shadow and then kicks 
the Xenomai scheduler to interrupt Linux (and schedule in the RT 
thread). So, from a remote Linux perspective, something else will be 
running at this point.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
Am Fr., 22. Nov. 2019 um 16:52 Uhr schrieb Jan Kiszka :
>
> On 22.11.19 16:42, Mathieu Desnoyers wrote:
> > - On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com wrote:
> >
> >> Hello,
> >>
> >> I already started a thread over at xenomai.org [1], but I guess its
> >> more efficient to ask here aswell.
> >> The basic concept is that xenomai thread run *below* Linux (threads
> >> and irg handlers), which means that xenomai threads must not use any
> >
> > I guess you mean "irq handlers" here.
> >
> >> linux services like the futex syscall or socket communication.
> >>
> >> ## tracepoints
> >>
> >> expecting that tracepoints are the only thing that should be used from
> >> the xenomai threads, is there anything using linux services.
> >> the "bulletproof" urcu apparently does not need anything for the
> >> reader lock (aslong as the thread is already registered),
> >
> > Indeed the first time the urcu-bp read-lock is encountered by a thread,
> > the thread registration is performed, which requires locks, memory 
> > allocation,
> > and so on. After that, the thread can use urcu-bp read-side lock without
> > requiring any system call.
>
> So, we will probably want to perform such a registration unconditionally
> (in case lttng usage is enabled) for our RT threads during their setup.

Who is "we"? Do you plan to add automatic support to the Xenomai mainline?

But yes, some setup is likely needed if one wants to use lttng.


> >
> > That's indeed a good point. I suspect membarrier may not send any IPI
> > to Xenomai threads (that would have to be confirmed). I suspect the
> > latency introduced by this IPI would be unwanted.
>
> Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
> interrupt to Linux on another CPU? The latter would still be possible,
> but it would be delayed until all Xenomai threads on that core eventual
> took a break (which should happen a couple of times per second under
> normal conditions - 100% RT load is an illegal application state).

Not POSIX, some inter-thread interrupts. The point is that the syscall waits
for the set of registered *running* Linux threads. I doubt Xenomai threads can
be reached that way; the shadow Linux thread will be idle and it won't block.
I don't think it's worth extending this syscall (it seems rather dangerous actually,
given that I had some deadlocks with other "lazy schemes", see below).

>
> >
> >> liburcu has configure options allow forcing the usage of this syscall
> >> but not disabling it, which likely is necessary for Xenomai.
> >
> > I suspect what you'd need there is a way to allow a process to tell
> > liburcu-bp (or liburcu) to always use the fall-back mechanism which does
> > not rely on sys_membarrier. This could be allowed before the first use of
> > the library. I think extending the liburcu APIs to allow this should be
> > straightforward enough. This approach would be more flexible than requiring
> > liburcu to be specialized at configure time. This new API would return an 
> > error
> > if invoked with a liburcu library compiled with 
> > --disable-sys-membarrier-fallback.
> >
> > If you have control over your entire system's kernel, you may want to try
> > just configuring the kernel within CONFIG_MEMBARRIER=n in the meantime.
> >
> > Another thing to make sure is to have a glibc and Linux kernel which perform
> > clock_gettime() as vDSO for the monotonic clock, because you don't want a
> > system call there. If that does not work for you, you can alternatively
> > implement your own lttng-ust and lttng-modules clock plugin .so/.ko to 
> > override
> > the clock used by lttng, and for instance use TSC directly. See for instance
> > the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
>
> clock_gettime & Co for a Xenomai application is syscall-free as well.

Yes, and that gave me a deadlock already: if a library is not compiled
for Xenomai, it will either use the syscall (and you detect that immediately),
or it will work most of the time and lock up once in a while if a Linux thread
took the "writer lock" of the vDSO structures and your high-priority Xenomai
thread busy-waits on it forever.

The only sane approach would be to either use the Xenomai function directly,
or recreate the function (rdtsc + interpolation on x86).
That means either compiling/patching lttng for Cobalt (which I really would
not want to do) or using a clock plugin.
If the latter is supposed to be minimal, I would have to get the
interpolation factors Cobalt uses (without bringing in libcobalt).

Btw. the Xenomai and Linux monotonic clocks aren't synchronized at all
AFAIK, so timestamps will be different from the rest of Linux.
On my last platform I did some tracing using an internal timestamp and
regularly wrote a block with internal and external timestamps so those
could be converted "offline".
Is there anything similar with lttng or the tools handling the traces?
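
For reference, such a correlation block only needs pairs of timestamps taken
back to back; a minimal sketch, assuming both clock reads are safe at the
call-site and using a placeholder tracepoint, could be:

  #include <stdint.h>
  #include <time.h>

  /* Emit a (trace clock, CLOCK_REALTIME) pair, e.g. once per second from a
   * non-RT housekeeping thread, so traces can be re-based offline. */
  static void emit_clock_correlation(void)
  {
      struct timespec mono, real;

      clock_gettime(CLOCK_MONOTONIC, &mono);  /* or the Xenomai/trace clock */
      clock_gettime(CLOCK_REALTIME, &real);

      uint64_t mono_ns = (uint64_t)mono.tv_sec * 1000000000ULL + mono.tv_nsec;
      uint64_t real_ns = (uint64_t)real.tv_sec * 1000000000ULL + real.tv_nsec;

      /* tracepoint(my_provider, clock_correlation, mono_ns, real_ns); */
      (void)mono_ns;
      (void)real_ns;
  }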

regards, Norbert

Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Jan Kiszka

On 22.11.19 18:44, Norbert Lange wrote:

Am Fr., 22. Nov. 2019 um 16:52 Uhr schrieb Jan Kiszka :


On 22.11.19 16:42, Mathieu Desnoyers wrote:

- On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com wrote:


Hello,

I already started a thread over at xenomai.org [1], but I guess its
more efficient to ask here aswell.
The basic concept is that xenomai thread run *below* Linux (threads
and irg handlers), which means that xenomai threads must not use any


I guess you mean "irq handlers" here.


linux services like the futex syscall or socket communication.

## tracepoints

expecting that tracepoints are the only thing that should be used from
the xenomai threads, is there anything using linux services.
the "bulletproof" urcu apparently does not need anything for the
reader lock (aslong as the thread is already registered),


Indeed the first time the urcu-bp read-lock is encountered by a thread,
the thread registration is performed, which requires locks, memory allocation,
and so on. After that, the thread can use urcu-bp read-side lock without
requiring any system call.


So, we will probably want to perform such a registration unconditionally
(in case lttng usage is enabled) for our RT threads during their setup.


Who is we? Do you plan to add automatic support at xenomai mainline?

But yes, some setup is likely needed if one wants to use lttng


I wouldn't refuse patches to make this happen in mainline, if that is where 
they are best applied. We could use a deterministic and fast 
application tracing framework that people can build upon, and that they can 
smoothly combine with system-level traces.







That's indeed a good point. I suspect membarrier may not send any IPI
to Xenomai threads (that would have to be confirmed). I suspect the
latency introduced by this IPI would be unwanted.


Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
interrupt to Linux on another CPU? The latter would still be possible,
but it would be delayed until all Xenomai threads on that core eventual
took a break (which should happen a couple of times per second under
normal conditions - 100% RT load is an illegal application state).


Not POSIX, some inter-thread interrupts. point is the syscall waits
for the set of
registered *running* Linux threads. I doubt Xenomai threads can be reached that
way, the shadow Linux thread will be idle and it won't block.
I dont think its worth extending this syscall (seems rather dangerous actually,
given that I had some deadlocks with other "lazy schemes", see below)


Ack. It sounds like this will become messy at best, fragile at worst.








liburcu has configure options allow forcing the usage of this syscall
but not disabling it, which likely is necessary for Xenomai.


I suspect what you'd need there is a way to allow a process to tell
liburcu-bp (or liburcu) to always use the fall-back mechanism which does
not rely on sys_membarrier. This could be allowed before the first use of
the library. I think extending the liburcu APIs to allow this should be
straightforward enough. This approach would be more flexible than requiring
liburcu to be specialized at configure time. This new API would return an error
if invoked with a liburcu library compiled with 
--disable-sys-membarrier-fallback.

If you have control over your entire system's kernel, you may want to try
just configuring the kernel within CONFIG_MEMBARRIER=n in the meantime.

Another thing to make sure is to have a glibc and Linux kernel which perform
clock_gettime() as vDSO for the monotonic clock, because you don't want a
system call there. If that does not work for you, you can alternatively
implement your own lttng-ust and lttng-modules clock plugin .so/.ko to override
the clock used by lttng, and for instance use TSC directly. See for instance
the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.


clock_gettime & Co for a Xenomai application is syscall-free as well.


Yes, and that gave me a deadlock already, if a library us not compiled
for Xenomai,
it will either use the syscall (and you detect that immediatly) or it
will work most of the time,
and lock up once in a while if a Linux thread took the "writer lock"
of the VDSO structures
and your high priority xenomai thread is busy waiting infinitely.

Only sane approach would be to use either the xenomai function directly,
or recreate the function (rdtsc + interpolation on x86).


rdtsc is not portable, thus a no-go.


Either compiling/patching lttng for Cobalt (which I really would not
want to do) or using a
clock plugin.


I suspect you will want to have at least a plugin that was built against 
Xenomai libs.



If the later is supposed to be minimal, then that would mean I would
have to get the
interpolation factors cobalt uses (without bringing in libcobalt).

Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
AFAIK, so timestamps will
be different to the rest of Linux.


CLOCK_HOST_REALTIME is synchronized.



Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
>
> LTTng-UST prepares the ring buffers from lttng-ust's "listener" thread,
> which is injected into the process by a lttng-ust constructor.
>
> What you will care about is how the tracepoint call-site (within a Xenomai
> thread) interacts with the ring buffers.
>
> The "default" setup for lttng-ust ring buffers is not suitable for Xenomai
> threads. The lttng-ust ring buffer is split into sub-buffers, each sub-buffer
> corresponding to a CTF trace "packet". When a sub-buffer is filled, lttng-ust
> invokes "write(2)" to a pipe to let the consumer daemon know there is data
> available in that ring buffer. You will want to get rid of that write(2) 
> system
> call from a Xenomai thread.
>
> The proper configuration is to use lttng-enable-channel(1) "--read-timer"
> option (see https://lttng.org/docs/v2.11/#doc-channel-read-timer). This will
> ensure that the consumer daemon uses a polling approach to check periodically
> whether data needs to be consumed within each buffer, thus removing the
> use of the write(2) system call on the application-side.

Ah thanks.

But that's configuration outside of the RT app, if I understand this correctly.
So if someone configures a tracer wrong, the app will suddenly misbehave.
It would be nice to be able to somehow declare that only read-timer channels are allowed.


>
> > liburcu has configure options allow forcing the usage of this syscall
> > but not disabling it, which likely is necessary for Xenomai.
>
> I suspect what you'd need there is a way to allow a process to tell
> liburcu-bp (or liburcu) to always use the fall-back mechanism which does
> not rely on sys_membarrier. This could be allowed before the first use of
> the library. I think extending the liburcu APIs to allow this should be
> straightforward enough. This approach would be more flexible than requiring
> liburcu to be specialized at configure time. This new API would return an 
> error
> if invoked with a liburcu library compiled with 
> --disable-sys-membarrier-fallback.

I was under the impression that you counted clock cycles for every operation ;)
Not sure, maybe a separate lib for realtime is the better way. Having no option
can be considered foolproof, and side effects of the syscall not working would be
a real pain.

regards, Norbert
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
Am Fr., 22. Nov. 2019 um 18:52 Uhr schrieb Jan Kiszka :
>
> On 22.11.19 18:44, Norbert Lange wrote:
> > Am Fr., 22. Nov. 2019 um 16:52 Uhr schrieb Jan Kiszka 
> > :
> >>
> >> On 22.11.19 16:42, Mathieu Desnoyers wrote:
> >>> - On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com 
> >>> wrote:
> >>>
>  Hello,
> 
>  I already started a thread over at xenomai.org [1], but I guess its
>  more efficient to ask here aswell.
>  The basic concept is that xenomai thread run *below* Linux (threads
>  and irg handlers), which means that xenomai threads must not use any
> >>>
> >>> I guess you mean "irq handlers" here.
> >>>
>  linux services like the futex syscall or socket communication.
> 
>  ## tracepoints
> 
>  expecting that tracepoints are the only thing that should be used from
>  the xenomai threads, is there anything using linux services.
>  the "bulletproof" urcu apparently does not need anything for the
>  reader lock (aslong as the thread is already registered),
> >>>
> >>> Indeed the first time the urcu-bp read-lock is encountered by a thread,
> >>> the thread registration is performed, which requires locks, memory 
> >>> allocation,
> >>> and so on. After that, the thread can use urcu-bp read-side lock without
> >>> requiring any system call.
> >>
> >> So, we will probably want to perform such a registration unconditionally
> >> (in case lttng usage is enabled) for our RT threads during their setup.
> >
> > Who is we? Do you plan to add automatic support at xenomai mainline?
> >
> > But yes, some setup is likely needed if one wants to use lttng
>
> I wouldn't refuse patches to make this happen in mainline. If patches
> are best applied there. We could use a deterministic and fast
> application tracing frame work people can build upon, and that they can
> smoothly combine with system level traces.

Sure (good to hear), I just don't think enabling it automatically/unconditionally
is a good thing.


>
> >
> >>
> >>>
>  liburcu has configure options allow forcing the usage of this syscall
>  but not disabling it, which likely is necessary for Xenomai.
> >>>
> >>> I suspect what you'd need there is a way to allow a process to tell
> >>> liburcu-bp (or liburcu) to always use the fall-back mechanism which does
> >>> not rely on sys_membarrier. This could be allowed before the first use of
> >>> the library. I think extending the liburcu APIs to allow this should be
> >>> straightforward enough. This approach would be more flexible than 
> >>> requiring
> >>> liburcu to be specialized at configure time. This new API would return an 
> >>> error
> >>> if invoked with a liburcu library compiled with 
> >>> --disable-sys-membarrier-fallback.
> >>>
> >>> If you have control over your entire system's kernel, you may want to try
> >>> just configuring the kernel within CONFIG_MEMBARRIER=n in the meantime.
> >>>
> >>> Another thing to make sure is to have a glibc and Linux kernel which 
> >>> perform
> >>> clock_gettime() as vDSO for the monotonic clock, because you don't want a
> >>> system call there. If that does not work for you, you can alternatively
> >>> implement your own lttng-ust and lttng-modules clock plugin .so/.ko to 
> >>> override
> >>> the clock used by lttng, and for instance use TSC directly. See for 
> >>> instance
> >>> the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
> >>
> >> clock_gettime & Co for a Xenomai application is syscall-free as well.
> >
> > Yes, and that gave me a deadlock already, if a library us not compiled
> > for Xenomai,
> > it will either use the syscall (and you detect that immediatly) or it
> > will work most of the time,
> > and lock up once in a while if a Linux thread took the "writer lock"
> > of the VDSO structures
> > and your high priority xenomai thread is busy waiting infinitely.
> >
> > Only sane approach would be to use either the xenomai function directly,
> > or recreate the function (rdtsc + interpolation on x86).
>
> rdtsc is not portable, thus a no-go.

It's not portable, but you have equivalents on ARM and PowerPC.
I.e. "do the same thing as Xenomai".

> > Either compiling/patching lttng for Cobalt (which I really would not
> > want to do) or using a
> > clock plugin.
>
> I suspect you will want to have at least a plugin that was built against
> Xenomai libs.

That will then do a lot of other stuff, like spawning a printf thread.

>
> > If the later is supposed to be minimal, then that would mean I would
> > have to get the
> > interpolation factors cobalt uses (without bringing in libcobalt).
> >
> > Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
> > AFAIK, so timestamps will
> > be different to the rest of Linux.
>
> CLOCK_HOST_REALTIME is synchronized.

That's not monotonic?

>
> > On my last plattform I did some tracing using internal stamp and
> > regulary wrote a
> > block with internal and external timestamps so those could be
> > converted "offline".
>
> 

Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Jan Kiszka

On 22.11.19 19:01, Norbert Lange wrote:

Am Fr., 22. Nov. 2019 um 18:52 Uhr schrieb Jan Kiszka :


On 22.11.19 18:44, Norbert Lange wrote:

Am Fr., 22. Nov. 2019 um 16:52 Uhr schrieb Jan Kiszka :


On 22.11.19 16:42, Mathieu Desnoyers wrote:

- On Nov 22, 2019, at 4:14 AM, Norbert Lange nolang...@gmail.com wrote:


Hello,

I already started a thread over at xenomai.org [1], but I guess its
more efficient to ask here aswell.
The basic concept is that xenomai thread run *below* Linux (threads
and irg handlers), which means that xenomai threads must not use any


I guess you mean "irq handlers" here.


linux services like the futex syscall or socket communication.

## tracepoints

expecting that tracepoints are the only thing that should be used from
the xenomai threads, is there anything using linux services.
the "bulletproof" urcu apparently does not need anything for the
reader lock (aslong as the thread is already registered),


Indeed the first time the urcu-bp read-lock is encountered by a thread,
the thread registration is performed, which requires locks, memory allocation,
and so on. After that, the thread can use urcu-bp read-side lock without
requiring any system call.


So, we will probably want to perform such a registration unconditionally
(in case lttng usage is enabled) for our RT threads during their setup.


Who is we? Do you plan to add automatic support at xenomai mainline?

But yes, some setup is likely needed if one wants to use lttng


I wouldn't refuse patches to make this happen in mainline. If patches
are best applied there. We could use a deterministic and fast
application tracing frame work people can build upon, and that they can
smoothly combine with system level traces.


Sure (good to hear), I just dont think enabling it automatic/unconditionally
is a good thing.


I don't disagree. Whether it requires build-time control or could also be 
enabled during application setup is something to be seen later.














liburcu has configure options allow forcing the usage of this syscall
but not disabling it, which likely is necessary for Xenomai.


I suspect what you'd need there is a way to allow a process to tell
liburcu-bp (or liburcu) to always use the fall-back mechanism which does
not rely on sys_membarrier. This could be allowed before the first use of
the library. I think extending the liburcu APIs to allow this should be
straightforward enough. This approach would be more flexible than requiring
liburcu to be specialized at configure time. This new API would return an error
if invoked with a liburcu library compiled with 
--disable-sys-membarrier-fallback.

If you have control over your entire system's kernel, you may want to try
just configuring the kernel within CONFIG_MEMBARRIER=n in the meantime.

Another thing to make sure is to have a glibc and Linux kernel which perform
clock_gettime() as vDSO for the monotonic clock, because you don't want a
system call there. If that does not work for you, you can alternatively
implement your own lttng-ust and lttng-modules clock plugin .so/.ko to override
the clock used by lttng, and for instance use TSC directly. See for instance
the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.


clock_gettime & Co for a Xenomai application is syscall-free as well.


Yes, and that gave me a deadlock already, if a library us not compiled
for Xenomai,
it will either use the syscall (and you detect that immediatly) or it
will work most of the time,
and lock up once in a while if a Linux thread took the "writer lock"
of the VDSO structures
and your high priority xenomai thread is busy waiting infinitely.

Only sane approach would be to use either the xenomai function directly,
or recreate the function (rdtsc + interpolation on x86).


rdtsc is not portable, thus a no-go.


Its not portable, but you have equivalents on ARM, powerpc.
ie. "Do the same think as Xenomai"


If you use existing code, I'm fine. Just don't invent something "new" here.




Either compiling/patching lttng for Cobalt (which I really would not
want to do) or using a
clock plugin.


I suspect you will want to have at least a plugin that was built against
Xenomai libs.


That will then do alot other stuff like spwaning a printf thread.




If the later is supposed to be minimal, then that would mean I would
have to get the
interpolation factors cobalt uses (without bringing in libcobalt).

Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
AFAIK, so timestamps will
be different to the rest of Linux.


CLOCK_HOST_REALTIME is synchronized.


Thats not monotonic?


Yeah, it's REALTIME, in sync with CLOCK_REALTIME of Linux. 
CLOCK_MONOTONIC should have a static offset at worst. I think that could 
be resolved if it wasn't yet.







On my last plattform I did some tracing using internal stamp and
regulary wrote a
block with internal and external timestamps so those could be
converted "offline".


That does not sound like something we want to promote.



Re: [lttng-dev] [RFC PATCH liburcu] urcu-bp: introduce urcu_bp_disable_sys_membarrier()

2019-11-22 Thread Norbert Lange
I still would like to propose having a compile-time "off switch";
I can't see how this would work with preloaded libraries, for one
(lttng itself ships with multiple).

Can't tell for sure now, but a setup where "bulletproof Xenomai"
libraries are used seems better than special setup magic.
If you have to do a lot of measurements to find rare hiccups,
the last thing you want is figuring out you forgot one config option.

Thanks for the speedy support, Norbert
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 12:44 PM, Norbert Lange nolang...@gmail.com wrote:

> Am Fr., 22. Nov. 2019 um 16:52 Uhr schrieb Jan Kiszka 
> :
>>
>> On 22.11.19 16:42, Mathieu Desnoyers wrote:

[...]

> 
> 
>> >
>> > That's indeed a good point. I suspect membarrier may not send any IPI
>> > to Xenomai threads (that would have to be confirmed). I suspect the
>> > latency introduced by this IPI would be unwanted.
>>
>> Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
>> interrupt to Linux on another CPU? The latter would still be possible,
>> but it would be delayed until all Xenomai threads on that core eventual
>> took a break (which should happen a couple of times per second under
>> normal conditions - 100% RT load is an illegal application state).
> 
> Not POSIX, some inter-thread interrupts. point is the syscall waits
> for the set of
> registered *running* Linux threads.

Just a small clarification: the PRIVATE membarrier command does not *wait*
for other threads, but it rather ensures that all other running threads
have had IPIs that issue memory barriers before it returns.

This is just a building block that can be used to speed up stuff like liburcu
and JIT memory reclaim.
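
For reference, a minimal sketch of that building block as used from plain
Linux threads (not Xenomai-aware; glibc has no wrapper, so it goes through
syscall(2)):

  #include <linux/membarrier.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int membarrier_cmd(int cmd, unsigned int flags)
  {
      return syscall(__NR_membarrier, cmd, flags);
  }

  int setup_membarrier(void)
  {
      /* Declare intent once per process before using PRIVATE_EXPEDITED. */
      return membarrier_cmd(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0);
  }

  void updater_side_barrier(void)
  {
      /* IPIs every CPU currently running a thread of this mm and returns
       * once those CPUs have executed a memory barrier; it does not wait
       * for threads that are not currently running. */
      membarrier_cmd(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
  }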

[...]

>> >
>> > Another thing to make sure is to have a glibc and Linux kernel which 
>> > perform
>> > clock_gettime() as vDSO for the monotonic clock, because you don't want a
>> > system call there. If that does not work for you, you can alternatively
>> > implement your own lttng-ust and lttng-modules clock plugin .so/.ko to 
>> > override
>> > the clock used by lttng, and for instance use TSC directly. See for 
>> > instance
>> > the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
>>
>> clock_gettime & Co for a Xenomai application is syscall-free as well.
> 
> Yes, and that gave me a deadlock already, if a library us not compiled
> for Xenomai,
> it will either use the syscall (and you detect that immediatly) or it
> will work most of the time,
> and lock up once in a while if a Linux thread took the "writer lock"
> of the VDSO structures
> and your high priority xenomai thread is busy waiting infinitely.
> 
> Only sane approach would be to use either the xenomai function directly,
> or recreate the function (rdtsc + interpolation on x86).
> Either compiling/patching lttng for Cobalt (which I really would not
> want to do) or using a
> clock plugin.
> If the later is supposed to be minimal, then that would mean I would
> have to get the
> interpolation factors cobalt uses (without bringing in libcobalt).
> 
> Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
> AFAIK, so timestamps will
> be different to the rest of Linux.
> On my last plattform I did some tracing using internal stamp and
> regulary wrote a
> block with internal and external timestamps so those could be
> converted "offline".
> Anything similar with lttng or tools handling the traces?

Can a Xenomai thread issue clock_gettime(CLOCK_MONOTONIC) ?

AFAIK we don't have tooling to do what you describe out of the box,
but it could probably be implemented as a babeltrace 2 filter plugin.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 12:55 PM, Norbert Lange nolang...@gmail.com wrote:

>>
>> LTTng-UST prepares the ring buffers from lttng-ust's "listener" thread,
>> which is injected into the process by a lttng-ust constructor.
>>
>> What you will care about is how the tracepoint call-site (within a Xenomai
>> thread) interacts with the ring buffers.
>>
>> The "default" setup for lttng-ust ring buffers is not suitable for Xenomai
>> threads. The lttng-ust ring buffer is split into sub-buffers, each sub-buffer
>> corresponding to a CTF trace "packet". When a sub-buffer is filled, lttng-ust
>> invokes "write(2)" to a pipe to let the consumer daemon know there is data
>> available in that ring buffer. You will want to get rid of that write(2) 
>> system
>> call from a Xenomai thread.
>>
>> The proper configuration is to use lttng-enable-channel(1) "--read-timer"
>> option (see https://lttng.org/docs/v2.11/#doc-channel-read-timer). This will
>> ensure that the consumer daemon uses a polling approach to check periodically
>> whether data needs to be consumed within each buffer, thus removing the
>> use of the write(2) system call on the application-side.
> 
> Ah thanks.
> 
> But that's configuration outside of the RT app if I understand this correctly.
> So if one configures a tracer wrong, then the app will suddenly misbehave.
> Would be nice to be able to somehow tell that there is only read-timer 
> allowed.

So an RT application would prohibit tracing to non-RT ring buffers ? IOW, if a
channel is configured without the --read-timer option, nothing would appear from
the RT threads in those buffers.

Should this be per-process or per-thread ?

> 
> 
>>
>> > liburcu has configure options allow forcing the usage of this syscall
>> > but not disabling it, which likely is necessary for Xenomai.
>>
>> I suspect what you'd need there is a way to allow a process to tell
>> liburcu-bp (or liburcu) to always use the fall-back mechanism which does
>> not rely on sys_membarrier. This could be allowed before the first use of
>> the library. I think extending the liburcu APIs to allow this should be
>> straightforward enough. This approach would be more flexible than requiring
>> liburcu to be specialized at configure time. This new API would return an 
>> error
>> if invoked with a liburcu library compiled with
>> --disable-sys-membarrier-fallback.
> 
> I was under the impression, that you counted clock-cycles for every operation 
> ;)

Well, it's just a new API that allows tweaking the state of a boolean which
controls branches that are already there on the fast-path. ;)
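
Assuming the RFC urcu_bp_disable_sys_membarrier() posted earlier in this
thread is merged as-is, the process-side usage would be roughly:

  #include <stdio.h>
  #include <urcu/urcu-bp.h>

  int main(void)
  {
      /* Must run before any urcu-bp reader thread has registered. */
      if (urcu_bp_disable_sys_membarrier())
          perror("urcu_bp_disable_sys_membarrier");

      /* ... only now create the Xenomai/RT threads, register them with
       * urcu-bp and start tracing ... */
      return 0;
  }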

> Not sure, maybe a separate lib for realtime is the better way. Having no 
> option
> can be considered foolproof, and sideeffects of the syscall not working would 
> be
> a real pain.

e.g. a liburcu-bp-rt.so? That would bring interesting integration challenges
with lttng-ust though. Should we then build a liblttng-ust-rt.so as well?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] [RFC PATCH liburcu] urcu-bp: introduce urcu_bp_disable_sys_membarrier()

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 1:37 PM, Norbert Lange nolang...@gmail.com wrote:

> I still would like to propose having a compile time "off switch",
> cant see how this would work with preloaded libraries for one
> (lttng itself whips with multiple)

One option would be to introduce an environment variable that would
control this.

> 
> Cant tell for sure now, but a setup where "bulletproof Xenomai"
> libraries are used seems to better than special setup magic.
> If you have to do do alot measurements for finding rare hiccups,
> the last thing you want it figuring out you forget one config.

Indeed. I'm not sure what the best way forward is, however.

Thanks,

Mathieu

> 
> Thanks for the speedy support, Norbert

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
Am Fr., 22. Nov. 2019 um 20:00 Uhr schrieb Mathieu Desnoyers
:
>
> - On Nov 22, 2019, at 12:44 PM, Norbert Lange nolang...@gmail.com wrote:
>
> > Am Fr., 22. Nov. 2019 um 16:52 Uhr schrieb Jan Kiszka 
> > :
> >>
> >> On 22.11.19 16:42, Mathieu Desnoyers wrote:
>
> [...]
>
> >
> >
> >> >
> >> > That's indeed a good point. I suspect membarrier may not send any IPI
> >> > to Xenomai threads (that would have to be confirmed). I suspect the
> >> > latency introduced by this IPI would be unwanted.
> >>
> >> Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
> >> interrupt to Linux on another CPU? The latter would still be possible,
> >> but it would be delayed until all Xenomai threads on that core eventual
> >> took a break (which should happen a couple of times per second under
> >> normal conditions - 100% RT load is an illegal application state).
> >
> > Not POSIX, some inter-thread interrupts. point is the syscall waits
> > for the set of
> > registered *running* Linux threads.
>
> Just a small clarification: the PRIVATE membarrier command does not *wait*
> for other threads, but it rather ensures that all other running threads
> have had IPIs that issue memory barriers before it returns.

Ok, normal linux IRQs have to wait till Xenomai gives the cores back,
hence the waiting.

>
> >> >
> >> > Another thing to make sure is to have a glibc and Linux kernel which 
> >> > perform
> >> > clock_gettime() as vDSO for the monotonic clock, because you don't want a
> >> > system call there. If that does not work for you, you can alternatively
> >> > implement your own lttng-ust and lttng-modules clock plugin .so/.ko to 
> >> > override
> >> > the clock used by lttng, and for instance use TSC directly. See for 
> >> > instance
> >> > the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
> >>
> >> clock_gettime & Co for a Xenomai application is syscall-free as well.
> >
> > Yes, and that gave me a deadlock already, if a library us not compiled
> > for Xenomai,
> > it will either use the syscall (and you detect that immediatly) or it
> > will work most of the time,
> > and lock up once in a while if a Linux thread took the "writer lock"
> > of the VDSO structures
> > and your high priority xenomai thread is busy waiting infinitely.
> >
> > Only sane approach would be to use either the xenomai function directly,
> > or recreate the function (rdtsc + interpolation on x86).
> > Either compiling/patching lttng for Cobalt (which I really would not
> > want to do) or using a
> > clock plugin.
> > If the later is supposed to be minimal, then that would mean I would
> > have to get the
> > interpolation factors cobalt uses (without bringing in libcobalt).
> >
> > Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
> > AFAIK, so timestamps will
> > be different to the rest of Linux.
> > On my last plattform I did some tracing using internal stamp and
> > regulary wrote a
> > block with internal and external timestamps so those could be
> > converted "offline".
> > Anything similar with lttng or tools handling the traces?
>
> Can a Xenomai thread issue clock_gettime(CLOCK_MONOTONIC) ?

Yes it can; if the call goes through the vDSO, then it mostly works.
And once in a while it deadlocks the system, if a Xenomai thread waits for a
spinlock that the Linux kernel owns and doesn't give back, as said thread will
not let the Linux kernel run (as described above).
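
To illustrate the failure mode, here is a toy model of a seqcount-protected
vDSO clock read (not the actual glibc/kernel code): the count is odd while
the kernel updates the clock data, and the reader retries until it sees a
stable even value, so a Xenomai thread that preempts Linux in the middle of
an update spins forever:

#include <stdatomic.h>
#include <stdint.h>

/* Toy model of vDSO clock data guarded by a sequence counter. */
struct vdso_clk {
	_Atomic unsigned int seq;	/* odd while the kernel writer is active */
	uint64_t base_ns;
	uint64_t base_cycles;
	uint64_t mult, shift;
};

static uint64_t vdso_clk_read(const struct vdso_clk *c, uint64_t cycles_now)
{
	unsigned int seq;
	uint64_t ns;

	do {
		seq = atomic_load_explicit(&c->seq, memory_order_acquire);
		ns = c->base_ns +
			(((cycles_now - c->base_cycles) * c->mult) >> c->shift);
		atomic_thread_fence(memory_order_acquire);
		/*
		 * If a higher-priority Xenomai thread reaches this loop while
		 * the Linux writer is suspended mid-update, seq stays odd and
		 * the loop never exits: the deadlock described above.
		 */
	} while ((seq & 1) ||
		 atomic_load_explicit(&c->seq, memory_order_relaxed) != seq);

	return ns;
}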

>
> AFAIK we don't have tooling to do what you describe out of the box,
> but it could probably be implemented as a babeltrace 2 filter plugin.

There are a lot of ways to do that, I hoped for some standardized way.
regards, Norbert
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
On Fri, Nov 22, 2019 at 8:03 PM Mathieu Desnoyers wrote:
>
> - On Nov 22, 2019, at 12:55 PM, Norbert Lange nolang...@gmail.com wrote:
>
> >>
> >> LTTng-UST prepares the ring buffers from lttng-ust's "listener" thread,
> >> which is injected into the process by a lttng-ust constructor.
> >>
> >> What you will care about is how the tracepoint call-site (within a Xenomai
> >> thread) interacts with the ring buffers.
> >>
> >> The "default" setup for lttng-ust ring buffers is not suitable for Xenomai
> >> threads. The lttng-ust ring buffer is split into sub-buffers, each 
> >> sub-buffer
> >> corresponding to a CTF trace "packet". When a sub-buffer is filled, 
> >> lttng-ust
> >> invokes "write(2)" to a pipe to let the consumer daemon know there is data
> >> available in that ring buffer. You will want to get rid of that write(2) 
> >> system
> >> call from a Xenomai thread.
> >>
> >> The proper configuration is to use lttng-enable-channel(1) "--read-timer"
> >> option (see https://lttng.org/docs/v2.11/#doc-channel-read-timer). This 
> >> will
> >> ensure that the consumer daemon uses a polling approach to check 
> >> periodically
> >> whether data needs to be consumed within each buffer, thus removing the
> >> use of the write(2) system call on the application-side.
> >
> > Ah thanks.
> >
> > But that's configuration outside of the RT app if I understand this 
> > correctly.
> > So if one configures a tracer wrong, then the app will suddenly misbehave.
> > Would be nice to be able to somehow tell that there is only read-timer 
> > allowed.
>
> So an RT application would prohibit tracing to non-RT ring buffers? IOW, if a
> channel is configured without the --read-timer option, nothing would appear from
> the RT threads in those buffers.
>
> Should this be per-process or per-thread ?

I don't know lttng internals; I'd give this as an option to the lttng
control-thread for the whole process?

> >> > liburcu has configure options allow forcing the usage of this syscall
> >> > but not disabling it, which likely is necessary for Xenomai.
> >>
> >> I suspect what you'd need there is a way to allow a process to tell
> >> liburcu-bp (or liburcu) to always use the fall-back mechanism which does
> >> not rely on sys_membarrier. This could be allowed before the first use of
> >> the library. I think extending the liburcu APIs to allow this should be
> >> straightforward enough. This approach would be more flexible than requiring
> >> liburcu to be specialized at configure time. This new API would return an 
> >> error
> >> if invoked with a liburcu library compiled with
> >> --disable-sys-membarrier-fallback.
> >
> > I was under the impression, that you counted clock-cycles for every 
> > operation ;)
>
> Well it's just a new API that allows tweaking the state of a boolean which 
> controls
> branches which are already there on the fast-path. ;)
>
> > Not sure, maybe a separate lib for realtime is the better way. Having no 
> > option
> > can be considered foolproof, and sideeffects of the syscall not working 
> > would be
> > a real pain.
>
> e.g. a liburcu-bp-rt.so ? That would bring interesting integration challenges 
> with
> lttng-ust though. Should we then build a liblttng-ust-rt.so as well ?

For my use case, there is a Xenomai system with everything compiled from scratch,
and that would be a compile-time option, no new names.

If you want something more generic, think of such a layout:

/usr/lib/liblttng-ust.so
/usr/lib/liblttng-ust-*.so
/usr/lib/liburcu-bp.so
/usr/xenomai/lib/liburcu-bp.so

Then compile your app with RUNPATH=/usr/xenomai/lib and the
xenomai-flavour of liburcu-bp.so should be picked up (I believe that
works even for preloaded libs).

Norbert
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Mathieu Desnoyers
- On Nov 22, 2019, at 2:57 PM, Norbert Lange nolang...@gmail.com wrote:

> On Fri, Nov 22, 2019 at 8:00 PM Mathieu Desnoyers wrote:
>>
>> - On Nov 22, 2019, at 12:44 PM, Norbert Lange nolang...@gmail.com wrote:
>>
>> > On Fri, Nov 22, 2019 at 4:52 PM Jan Kiszka wrote:
>> >>
>> >> On 22.11.19 16:42, Mathieu Desnoyers wrote:
>>
>> [...]
>>
>> >
>> >
>> >> >
>> >> > That's indeed a good point. I suspect membarrier may not send any IPI
>> >> > to Xenomai threads (that would have to be confirmed). I suspect the
>> >> > latency introduced by this IPI would be unwanted.
>> >>
>> >> Is an "IPI" a POSIX signal here? Or are real IPI that delivers an
>> >> interrupt to Linux on another CPU? The latter would still be possible,
>> >> but it would be delayed until all Xenomai threads on that core eventual
>> >> took a break (which should happen a couple of times per second under
>> >> normal conditions - 100% RT load is an illegal application state).
>> >
>> > Not POSIX, some inter-thread interrupts. point is the syscall waits
>> > for the set of
>> > registered *running* Linux threads.
>>
>> Just a small clarification: the PRIVATE membarrier command does not *wait*
>> for other threads, but it rather ensures that all other running threads
>> have had IPIs that issue memory barriers before it returns.
> 
> Ok, normal linux IRQs have to wait till Xenomai gives the cores back,
> hence the waiting.

In the case of membarrier, IPIs are only sent to CPUs whose runqueues
show that the currently running thread belongs to the same process
(for the PRIVATE command). So in this case we would not be sending
any IPI to the cores running Xenomai threads.

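For reference, this is roughly the call involved (constants from
linux/membarrier.h, available since kernel 4.14; the expedited private
command needs a one-time registration):

#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

static int membarrier(int cmd, unsigned int flags)
{
	return syscall(__NR_membarrier, cmd, flags);
}

/* One-time setup, e.g. at process start. */
static int membarrier_setup(void)
{
	return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0);
}

/*
 * Issue memory barriers on every CPU whose runqueue currently runs a
 * thread of this process; no IPI is sent to other CPUs.
 */
static int membarrier_all_threads(void)
{
	return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
}
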
> 
>>
>> >> >
>> >> > Another thing to make sure is to have a glibc and Linux kernel which 
>> >> > perform
>> >> > clock_gettime() as vDSO for the monotonic clock, because you don't want 
>> >> > a
>> >> > system call there. If that does not work for you, you can alternatively
>> >> > implement your own lttng-ust and lttng-modules clock plugin .so/.ko to 
>> >> > override
>> >> > the clock used by lttng, and for instance use TSC directly. See for 
>> >> > instance
>> >> > the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
>> >>
>> >> clock_gettime & Co for a Xenomai application is syscall-free as well.
>> >
>> > Yes, and that gave me a deadlock already, if a library us not compiled
>> > for Xenomai,
>> > it will either use the syscall (and you detect that immediatly) or it
>> > will work most of the time,
>> > and lock up once in a while if a Linux thread took the "writer lock"
>> > of the VDSO structures
>> > and your high priority xenomai thread is busy waiting infinitely.
>> >
>> > Only sane approach would be to use either the xenomai function directly,
>> > or recreate the function (rdtsc + interpolation on x86).
>> > Either compiling/patching lttng for Cobalt (which I really would not
>> > want to do) or using a
>> > clock plugin.
>> > If the later is supposed to be minimal, then that would mean I would
>> > have to get the
>> > interpolation factors cobalt uses (without bringing in libcobalt).
>> >
>> > Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
>> > AFAIK, so timestamps will
>> > be different to the rest of Linux.
>> > On my last plattform I did some tracing using internal stamp and
>> > regulary wrote a
>> > block with internal and external timestamps so those could be
>> > converted "offline".
>> > Anything similar with lttng or tools handling the traces?
>>
>> Can a Xenomai thread issue clock_gettime(CLOCK_MONOTONIC) ?
> 
> Yes it can; if the call goes through the vDSO, then it mostly works.
> And once in a while it deadlocks the system, if a Xenomai thread waits for a
> spinlock that the Linux kernel owns and doesn't give back, as said thread will
> not let the Linux kernel run (as described above).

Ah, yes, read seqlock can be tricky in that kind of scenario indeed.

Then what we'd need is the nmi-safe monotonic clock that went into the
Linux kernel a while ago. It's called "monotonic fast", but really what
it does is to remove the need to use a read-seqlock. AFAIK it's not
exposed through the vDSO at the moment though.

Thanks,

Mathieu

> 
>>
>> AFAIK we don't have tooling to do what you describe out of the box,
>> but it could probably be implemented as a babeltrace 2 filter plugin.
> 
> There are a lot of ways to do that, I hoped for some standardized way.
> regards, Norbert

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
> >> >> > Another thing to make sure is to have a glibc and Linux kernel which 
> >> >> > perform
> >> >> > clock_gettime() as vDSO for the monotonic clock, because you don't 
> >> >> > want a
> >> >> > system call there. If that does not work for you, you can 
> >> >> > alternatively
> >> >> > implement your own lttng-ust and lttng-modules clock plugin .so/.ko 
> >> >> > to override
> >> >> > the clock used by lttng, and for instance use TSC directly. See for 
> >> >> > instance
> >> >> > the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
> >> >>
> >> >> clock_gettime & Co for a Xenomai application is syscall-free as well.
> >> >
> >> > Yes, and that gave me a deadlock already, if a library us not compiled
> >> > for Xenomai,
> >> > it will either use the syscall (and you detect that immediatly) or it
> >> > will work most of the time,
> >> > and lock up once in a while if a Linux thread took the "writer lock"
> >> > of the VDSO structures
> >> > and your high priority xenomai thread is busy waiting infinitely.
> >> >
> >> > Only sane approach would be to use either the xenomai function directly,
> >> > or recreate the function (rdtsc + interpolation on x86).
> >> > Either compiling/patching lttng for Cobalt (which I really would not
> >> > want to do) or using a
> >> > clock plugin.
> >> > If the later is supposed to be minimal, then that would mean I would
> >> > have to get the
> >> > interpolation factors cobalt uses (without bringing in libcobalt).
> >> >
> >> > Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
> >> > AFAIK, so timestamps will
> >> > be different to the rest of Linux.
> >> > On my last plattform I did some tracing using internal stamp and
> >> > regulary wrote a
> >> > block with internal and external timestamps so those could be
> >> > converted "offline".
> >> > Anything similar with lttng or tools handling the traces?
> >>
> >> Can a Xenomai thread issue clock_gettime(CLOCK_MONOTONIC) ?
> >
> > Yes it can, if the calls goes through the VDSO, then it mostly works.
> > And once in a while deadlocks the system if a Xenomai thread waits for a
> > spinlock that the Linux kernel owns and doesnt give back as said thread will
> > not let the Linux Kernel run (as described above).
>
> Ah, yes, read seqlock can be tricky in that kind of scenario indeed.
>
> Then what we'd need is the nmi-safe monotonic clock that went into the
> Linux kernel a while ago. It's called "monotonic fast", but really what
> it does is to remove the need to use a read-seqlock. AFAIK it's not
> exposed through the vDSO at the moment though.

An easy-to-use, consistent clock between Linux and Xenomai? Should be
the ultimate goal.
But I think it's way less intrusive to just make the existing vDSO reads/writes
safe by using the same scheme of atomic modification-count +
alternating buffers.
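
A toy sketch of that scheme (simplified memory ordering, full fences instead
of the kernel's exact barrier pairing): the writer bumps the count, rewrites
the buffer readers are currently not selecting, and repeats; a reader that
preempts the writer always finds a stable buffer and never has to spin:

#include <stdatomic.h>
#include <stdint.h>

struct clk_data {
	uint64_t base_ns, base_cycles, mult, shift;
};

struct clk_latch {
	_Atomic unsigned int seq;	/* modification count */
	struct clk_data buf[2];		/* alternating buffers */
};

/* Writer (Linux side). */
static void clk_latch_step(struct clk_latch *c, unsigned int newseq,
		const struct clk_data *d)
{
	atomic_store_explicit(&c->seq, newseq, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* count before data */
	c->buf[(newseq & 1) ^ 1] = *d;	/* the slot readers are NOT using */
	atomic_thread_fence(memory_order_seq_cst);	/* data before next bump */
}

static void clk_publish(struct clk_latch *c, const struct clk_data *d)
{
	unsigned int seq = atomic_load_explicit(&c->seq, memory_order_relaxed);

	clk_latch_step(c, seq + 1, d);	/* readers keep the old snapshot */
	clk_latch_step(c, seq + 2, d);	/* readers switch to the new one */
}

/* Reader (possibly a Xenomai thread). */
static uint64_t clk_read_ns(const struct clk_latch *c, uint64_t cycles_now)
{
	struct clk_data d;
	unsigned int seq;

	do {
		seq = atomic_load_explicit(&c->seq, memory_order_acquire);
		d = c->buf[seq & 1];
		atomic_thread_fence(memory_order_acquire);
		/*
		 * Retry only if the writer made progress meanwhile; a writer
		 * suspended mid-update cannot keep us spinning, because
		 * buf[seq & 1] is not the slot it is modifying.
		 */
	} while (atomic_load_explicit(&c->seq, memory_order_relaxed) != seq);

	return d.base_ns +
		(((cycles_now - d.base_cycles) * d.mult) >> d.shift);
}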

The vDSO is weird anyway, CLOCK_MONOTONIC_RAW was missing for a long
time (or still is?).

Norbert
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Using lttng-ust with xenomai

2019-11-22 Thread Norbert Lange
> >> liburcu has configure options allow forcing the usage of this syscall
> >> but not disabling it, which likely is necessary for Xenomai.
> >
> > I suspect what you'd need there is a way to allow a process to tell
> > liburcu-bp (or liburcu) to always use the fall-back mechanism which does
> > not rely on sys_membarrier. This could be allowed before the first use 
> > of
> > the library. I think extending the liburcu APIs to allow this should be
> > straightforward enough. This approach would be more flexible than 
> > requiring
> > liburcu to be specialized at configure time. This new API would return 
> > an error
> > if invoked with a liburcu library compiled with 
> > --disable-sys-membarrier-fallback.
> >
> > If you have control over your entire system's kernel, you may want to 
> > try
> > just configuring the kernel within CONFIG_MEMBARRIER=n in the meantime.
> >
> > Another thing to make sure is to have a glibc and Linux kernel which 
> > perform
> > clock_gettime() as vDSO for the monotonic clock, because you don't want 
> > a
> > system call there. If that does not work for you, you can alternatively
> > implement your own lttng-ust and lttng-modules clock plugin .so/.ko to 
> > override
> > the clock used by lttng, and for instance use TSC directly. See for 
> > instance
> > the lttng-ust(3) LTTNG_UST_CLOCK_PLUGIN environment variable.
> 
>  clock_gettime & Co for a Xenomai application is syscall-free as well.
> >>>
> >>> Yes, and that gave me a deadlock already, if a library us not compiled
> >>> for Xenomai,
> >>> it will either use the syscall (and you detect that immediatly) or it
> >>> will work most of the time,
> >>> and lock up once in a while if a Linux thread took the "writer lock"
> >>> of the VDSO structures
> >>> and your high priority xenomai thread is busy waiting infinitely.
> >>>
> >>> Only sane approach would be to use either the xenomai function directly,
> >>> or recreate the function (rdtsc + interpolation on x86).
> >>
> >> rdtsc is not portable, thus a no-go.
> >
> > It's not portable, but you have equivalents on ARM, powerpc.
> > i.e. "Do the same thing as Xenomai"
>
> If you use existing code, I'm fine. Just not invent something "new" here.

The idea is to build the lttng plugin from the same code,
just using the things that are necessary for reading the monotonic clock.
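
As a starting point, here is a trimmed-down sketch of such a plugin, modeled
on the clock-override example shipped with lttng-ust (error checking elided;
the lttng_ust_trace_clock_set_*_cb names should be verified against the
installed lttng-ust headers). The clock read below is a plain
CLOCK_MONOTONIC_RAW placeholder; on the real system it would be replaced by
the Cobalt-safe counter read and Cobalt's interpolation factors:

#define _GNU_SOURCE
#include <lttng/ust-clock.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

static uint64_t rt_clock_read64(void)
{
	struct timespec ts;

	/* Placeholder: replace with the Xenomai/Cobalt-safe clock read. */
	clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
	return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}

static uint64_t rt_clock_freq(void)
{
	return 1000000000ULL;	/* nanoseconds */
}

static int rt_clock_uuid(char *uuid)
{
	/* Any fixed UUID string identifying this clock domain. */
	const char myuuid[] = "5b1f0b82-7f6d-4a69-b4f3-3d2b9a3d8c01";

	memcpy(uuid, myuuid, LTTNG_UST_UUID_STR_LEN);
	return 0;
}

static const char *rt_clock_name(void)
{
	return "xenomai_monotonic_raw";
}

static const char *rt_clock_description(void)
{
	return "Monotonic raw clock readable from Xenomai threads";
}

void lttng_ust_clock_plugin_init(void)
{
	lttng_ust_trace_clock_set_read64_cb(rt_clock_read64);
	lttng_ust_trace_clock_set_freq_cb(rt_clock_freq);
	lttng_ust_trace_clock_set_uuid_cb(rt_clock_uuid);
	lttng_ust_trace_clock_set_name_cb(rt_clock_name);
	lttng_ust_trace_clock_set_description_cb(rt_clock_description);
	lttng_ust_enable_trace_clock_override();
}

Built as a shared object and pointed to via
LTTNG_UST_CLOCK_PLUGIN=/path/to/plugin.so, it would override the trace clock
for the whole process without patching lttng-ust itself.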

> >>> Either compiling/patching lttng for Cobalt (which I really would not
> >>> want to do) or using a
> >>> clock plugin.
> >>
> >> I suspect you will want to have at least a plugin that was built against
> >> Xenomai libs.
> >
> > That will then do a lot of other stuff like spawning a printf thread.
> >
> >>
> >>> If the later is supposed to be minimal, then that would mean I would
> >>> have to get the
> >>> interpolation factors cobalt uses (without bringing in libcobalt).
> >>>
> >>> Btw. the Xenomai and Linux monotonic clocks arent synchronised at all
> >>> AFAIK, so timestamps will
> >>> be different to the rest of Linux.
> >>
> >> CLOCK_HOST_REALTIME is synchronized.
> >
> > Thats not monotonic?
>
> Yeah, it's REALTIME, in synch with CLOCK_REALTIME of Linux.
> CLOCK_MONOTONIC should have a static offset at worst. I think that could
> be resolved if it wasn't yet.

Linux CLOCK_MONOTONIC is skew corrected to increment at the same rate as
CLOCK_REALTIME.
You might have a chance with Linux CLOCK_MONOTONIC_RAW,
if you use the identical scaling method.

>
> >
> >>
> >>> On my last plattform I did some tracing using internal stamp and
> >>> regulary wrote a
> >>> block with internal and external timestamps so those could be
> >>> converted "offline".
> >>
> >> Sounds not like something we want to promote.
> >
> > This was a question to lttng and its tool environment. I suppose we
> > weren't the first ones with multiple clocks in a system.
> > If anything needs to be done in Xenomai it might be a concurrent
> > readout of Linux/cobalt time(s),
> > the rest would be done offline, potentially on another system.
>
> Sure, doable, but I prefer not having to do that.

It might offer a lot of flexibility you don't get otherwise (without a ton of work).
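
A minimal sketch of that correlation approach (two POSIX clocks stand in for
the trace clock and the Linux clock here; on the target, one of the reads
would be the Cobalt clock, and the records could themselves be emitted as
tracepoints):

#define _GNU_SOURCE
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static uint64_t read_ns(clockid_t id)
{
	struct timespec ts;

	clock_gettime(id, &ts);
	return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}

int main(void)
{
	/*
	 * Periodically record back-to-back samples of both clock domains.
	 * An offline tool can fit an offset/drift model to these pairs and
	 * rewrite timestamps from one domain into the other.
	 */
	for (;;) {
		uint64_t a = read_ns(CLOCK_MONOTONIC_RAW);	/* "trace" clock stand-in */
		uint64_t b = read_ns(CLOCK_MONOTONIC);		/* Linux clock */

		printf("clock_corr raw=%" PRIu64 " mono=%" PRIu64 "\n", a, b);
		sleep(10);
	}
}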

Norbert
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] [PATCH lttng-tools v2] Fix: update apps on untrack only when session is active

2019-11-22 Thread Jérémie Galarneau
Merged in master and stable-2.11.

Thanks!
Jérémie

On Mon, Nov 18, 2019 at 03:12:20PM -0500, Jonathan Rajotte wrote:
> This mimics what is done on the track side.
> 
> Fixes #1210
> 
> Signed-off-by: Jonathan Rajotte 
> ---
> 
> Used wrong issue number.
> 
> ---
>  src/bin/lttng-sessiond/trace-ust.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/src/bin/lttng-sessiond/trace-ust.c 
> b/src/bin/lttng-sessiond/trace-ust.c
> index 486b53d30..a6c0c04ad 100644
> --- a/src/bin/lttng-sessiond/trace-ust.c
> +++ b/src/bin/lttng-sessiond/trace-ust.c
> @@ -922,6 +922,7 @@ end:
>  int trace_ust_untrack_pid(struct ltt_ust_session *session, int pid)
>  {
>   int retval = LTTNG_OK;
> + bool should_update_apps = false;
>  
>   if (pid == -1) {
>   /* Create empty tracker, replace old tracker. */
> @@ -938,7 +939,7 @@ int trace_ust_untrack_pid(struct ltt_ust_session 
> *session, int pid)
>   fini_pid_tracker(&tmp_tracker);
>  
>   /* Remove session from all applications */
> - ust_app_global_update_all(session);
> + should_update_apps = true;
>   } else {
>   int ret;
>   struct ust_app *app;
> @@ -957,9 +958,12 @@ int trace_ust_untrack_pid(struct ltt_ust_session 
> *session, int pid)
>   /* Remove session from application. */
>   app = ust_app_find_by_pid(pid);
>   if (app) {
> - ust_app_global_update(session, app);
> + should_update_apps = true;
>   }
>   }
> + if (should_update_apps && session->active) {
> + ust_app_global_update_all(session);
> + }
>  end:
>   return retval;
>  }
> -- 
> 2.17.1
> 
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] [PATCH lttng-tools] Require automake >= 1.12

2019-11-22 Thread Jérémie Galarneau
Merged in master, stable-2.11, and stable-2.10.

Thanks!
Jérémie

On Thu, Nov 07, 2019 at 02:02:55PM -0500, Michael Jeanson wrote:
> The test suite LOG_DRIVER statement requires that automake >= 1.12 be used
> during bootstrap.
> 
> Signed-off-by: Michael Jeanson 
> ---
>  configure.ac | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/configure.ac b/configure.ac
> index d8ab1e0ac..10b338420 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -8,7 +8,7 @@ AC_CONFIG_MACRO_DIR([m4])
>  AC_CANONICAL_TARGET
>  AC_CANONICAL_HOST
>  
> -AM_INIT_AUTOMAKE([foreign dist-bzip2 no-dist-gzip tar-pax nostdinc])
> +AM_INIT_AUTOMAKE([1.12 foreign dist-bzip2 no-dist-gzip tar-pax nostdinc])
>  AM_MAINTAINER_MODE([enable])
>  
>  # Enable silent rules if available (Introduced in AM 1.11)
> -- 
> 2.17.1
> 
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] [PATCH lttng-tools 1/3] Fix: relayd: tracefile rotation: viewer opening missing index file

2019-11-22 Thread Jérémie Galarneau
All three patches of this series were merged in master and
stable-2.11.

Thanks!
Jérémie

On Fri, Nov 01, 2019 at 04:23:03PM -0400, Mathieu Desnoyers wrote:
> Moving the head position of the tracefile array when the data is
> received opens a window where a viewer attaching to the session could
> try to open a missing index file (which has not been received yet).
> 
> However, we want to bump the tail position as soon as we receive
> data, because the prior tail is not valid anymore.
> 
> Solve this by introducing two head positions: the "read" head
> and the "write" head. The "write" head is the position of the
> newest data file (equivalent to the prior "head" position). We
> also introduce a "read" head position, which is only moved
> forward when the index is received.
> 
> The viewer now uses the "read" head position as upper bound, which
> ensures it never attempts to open a non-existing index file.
> 
> Signed-off-by: Mathieu Desnoyers 
> ---
>  src/bin/lttng-relayd/stream.c  |  4 +-
>  src/bin/lttng-relayd/tracefile-array.c | 58 --
>  src/bin/lttng-relayd/tracefile-array.h | 21 --
>  src/bin/lttng-relayd/viewer-stream.c   |  2 +-
>  4 files changed, 58 insertions(+), 27 deletions(-)
> 
> diff --git a/src/bin/lttng-relayd/stream.c b/src/bin/lttng-relayd/stream.c
> index 94698f8d..4d3d37a2 100644
> --- a/src/bin/lttng-relayd/stream.c
> +++ b/src/bin/lttng-relayd/stream.c
> @@ -958,7 +958,7 @@ int stream_init_packet(struct relay_stream *stream, 
> size_t packet_size,
>   stream->stream_handle,
>   stream->tracefile_size_current, packet_size,
>   stream->tracefile_current_index, 
> new_file_index);
> - tracefile_array_file_rotate(stream->tfa);
> + tracefile_array_file_rotate(stream->tfa, 
> TRACEFILE_ROTATE_WRITE);
>   stream->tracefile_current_index = new_file_index;
>  
>   if (stream->stream_fd) {
> @@ -1095,6 +1095,7 @@ int stream_update_index(struct relay_stream *stream, 
> uint64_t net_seq_num,
>  
>   ret = relay_index_try_flush(index);
>   if (ret == 0) {
> + tracefile_array_file_rotate(stream->tfa, TRACEFILE_ROTATE_READ);
>   tracefile_array_commit_seq(stream->tfa);
>   stream->index_received_seqcount++;
>   *flushed = true;
> @@ -1188,6 +1189,7 @@ int stream_add_index(struct relay_stream *stream,
>   }
>   ret = relay_index_try_flush(index);
>   if (ret == 0) {
> + tracefile_array_file_rotate(stream->tfa, TRACEFILE_ROTATE_READ);
>   tracefile_array_commit_seq(stream->tfa);
>   stream->index_received_seqcount++;
>   stream->pos_after_last_complete_data_index += index->total_size;
> diff --git a/src/bin/lttng-relayd/tracefile-array.c 
> b/src/bin/lttng-relayd/tracefile-array.c
> index 20b760c0..3d62317a 100644
> --- a/src/bin/lttng-relayd/tracefile-array.c
> +++ b/src/bin/lttng-relayd/tracefile-array.c
> @@ -62,7 +62,8 @@ void tracefile_array_destroy(struct tracefile_array *tfa)
>   free(tfa);
>  }
>  
> -void tracefile_array_file_rotate(struct tracefile_array *tfa)
> +void tracefile_array_file_rotate(struct tracefile_array *tfa,
> + enum tracefile_rotate_type type)
>  {
>   uint64_t *headp, *tailp;
>  
> @@ -70,24 +71,37 @@ void tracefile_array_file_rotate(struct tracefile_array 
> *tfa)
>   /* Not in tracefile rotation mode. */
>   return;
>   }
> - /* Rotate to next file.  */
> - tfa->file_head = (tfa->file_head + 1) % tfa->count;
> - if (tfa->file_head == tfa->file_tail) {
> - /* Move tail. */
> - tfa->file_tail = (tfa->file_tail + 1) % tfa->count;
> - }
> - headp = &tfa->tf[tfa->file_head].seq_head;
> - tailp = &tfa->tf[tfa->file_head].seq_tail;
> - /*
> -  * If we overwrite a file with content, we need to push the tail
> -  * to the position following the content we are overwriting.
> -  */
> - if (*headp != -1ULL) {
> - tfa->seq_tail = tfa->tf[tfa->file_tail].seq_tail;
> + switch (type) {
> + case TRACEFILE_ROTATE_READ:
> + /*
> +  * Rotate read head to write head position, thus allowing
> +  * reader to consume the newly rotated head file.
> +  */
> + tfa->file_head_read = tfa->file_head_write;
> + break;
> + case TRACEFILE_ROTATE_WRITE:
> + /* Rotate write head to next file, pushing tail if needed.  */
> + tfa->file_head_write = (tfa->file_head_write + 1) % tfa->count;
> + if (tfa->file_head_write == tfa->file_tail) {
> + /* Move tail. */
> + tfa->file_tail = (tfa->file_tail + 1) % tfa->count;
> + }
> + headp = &tfa->tf[tfa->file_head_write].seq_head;
> + tailp = &tfa->tf[tfa->file_hea