Re: staging/wilc1000: wrong conversion to completion?

2016-07-19 Thread Binoy Jayan
On 11 July 2016 at 13:38, Arnd Bergmann  wrote:
> On Monday, July 11, 2016 9:41:15 AM CEST Jiri Slaby wrote:
>> Hi,
>>
>> while looking at this commit:
>>
>> commit b27a6d5e636ac80b223a18ca2b3c892f1caef9e3
>> Author: Binoy Jayan 
>> Date:   Wed Jun 15 11:00:34 2016 +0530
>>
>> staging: wilc1000: Replace semaphore txq_event with completion
>>
>> The semaphore 'txq_event' is used as completion, so convert it
>> to a struct completion type.
>>
>> Signed-off-by: Binoy Jayan 
>> Reviewed-by: Arnd Bergmann 
>> Signed-off-by: Greg Kroah-Hartman 
>>
>> diff --git a/drivers/staging/wilc1000/linux_wlan.c
>> b/drivers/staging/wilc1000/linux_wlan.c
>> index 274c390d17cd..baf932681362 100644
>> --- a/drivers/staging/wilc1000/linux_wlan.c
>> +++ b/drivers/staging/wilc1000/linux_wlan.c
>> @@ -316,7 +316,7 @@ static int linux_wlan_txq_task(void *vp)
>>
>> complete(&wl->txq_thread_started);
>> while (1) {
>> -   down(&wl->txq_event);
>> +   wait_for_completion(&wl->txq_event);
>>
>> if (wl->close) {
>> complete(&wl->txq_thread_started);
>> @@ -650,7 +650,7 @@ void wilc1000_wlan_deinit(struct net_device *dev)
>> mutex_unlock(&wl->hif_cs);
>> }
>> if (&wl->txq_event)
>> -   up(&wl->txq_event);
>> +   wait_for_completion(&wl->txq_event);
>>
>>
>> I wonder: is this correct? Should that be complete() instead?
>>
>
> Yes, I agree, sorry for missing that in my review.
>
> Arnd

Sorry for the typo. Just saw the email after coming back from
vacation. Will send the patch soon.

Binoy
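
To illustrate why the deinit path needs complete() rather than
wait_for_completion(), here is a minimal single-threaded model in C. It is
not the driver code. The names completion_model, complete_model(),
try_wait_model() and txq_thread_wakes_up() are made up for this sketch;
try_wait_model() follows the non-blocking semantics of the kernel's
try_wait_for_completion(), and a return value of false stands in for
"the caller would sleep".

```c
#include <stdbool.h>

/* Hypothetical userspace model of a completion: only the 'done' counter. */
struct completion_model {
	unsigned int done;
};

/* Signalling side, the counterpart of up() on the old semaphore. */
void complete_model(struct completion_model *c)
{
	c->done++;
}

/*
 * Waiting side, the counterpart of down(). Returns true if the waiter
 * may proceed, false if a real wait_for_completion() would sleep.
 */
bool try_wait_model(struct completion_model *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/*
 * Model of the deinit path. With use_complete == true it signals the
 * txq thread (the fixed code); otherwise it waits itself (the bug).
 * Returns true if the txq thread's pending wait can finish.
 */
bool txq_thread_wakes_up(bool use_complete)
{
	struct completion_model txq_event = { .done = 0 };

	if (use_complete)
		complete_model(&txq_event);
	else
		(void)try_wait_model(&txq_event); /* would block: nobody signals */

	/* The txq thread is parked in its own wait at this point. */
	return try_wait_model(&txq_event);
}
```

In this model txq_thread_wakes_up(true) reports that the thread can leave
its wait and see the close flag, while txq_thread_wakes_up(false) reports
that both sides would be left waiting on each other.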


[PATCH] staging: wilc1000: txq_event: Fix coding error

2016-07-21 Thread Binoy Jayan
Fix incorrect usage of the completion interface by replacing
'wait_for_completion' with 'complete'. This error was introduced
accidentally while replacing the semaphore 'txq_event' with a completion.

Reported-by: Jiri Slaby 
Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/linux_wlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c 
b/drivers/staging/wilc1000/linux_wlan.c
index 3a66255..3221511 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -648,7 +648,7 @@ void wilc1000_wlan_deinit(struct net_device *dev)
mutex_unlock(&wl->hif_cs);
}
if (&wl->txq_event)
-   wait_for_completion(&wl->txq_event);
+   complete(&wl->txq_event);
 
wlan_deinitialize_threads(dev);
deinit_irq(dev);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[RFC PATCH v7 0/5] *** Latency histograms ***

2016-09-20 Thread Binoy Jayan
Latency Histograms
==================

1. Introduction

This series consists of patches that capture latencies which occur in
a realtime system and render them as histograms of meaningful data. A
few of these patches were originally part of the PREEMPT_RT patchset
and the rt linux tree. They have been adapted to make use of the kernel
event infrastructure and histogram triggers. This patch series aims at
preparing these patches for mainlining. The individual changelogs
describe the specific patches. To check the feasibility of mainlining
this series, please refer to section 5. The original patch set (Latency
hist) may be found under section 6. The patch series also includes an
additional histogram feature and a bug fix.

v6: https://lkml.org/lkml/2016/9/7/253

2. Changes v6 -> v7:
--------------------
 - Patch to capture additional per process data added as part of 
   wakeup latency patch
   tracing: wakeup latency events and histograms
 - Removed tracepoint condition '*_enabled' and let tracer core handle it
 - Adjusted order of the new struct member 'struct hrtimer' to reduce footprint
 - Documented new member 'tim_expiry' in 'struct hrtimer'
 - For patch 3, use the timestamp from the tracer itself to avoid repeated
   requests to hardware
 - Removed unwanted type casts 
 - Added comments for new tracepoints

3. Questions which were unanswered in v6
----------------------------------------
- How is this facility useful on its own?
To capture potential sources of latency.

- What is the value of adding this?
The goal is not to document the deterministic execution time of an irq
(the "IRQ latency"), but to provide evidence that both the scheduler
and the context switch guarantee that a waiting user-space real-time
application will start no later than the maximum acceptable latency.
This makes it necessary to determine the "preemption latency", sometimes
also referred to as "total latency".

- How is it used and what valuable debug information can be collected?
The information collected is the IRQ latency, the preemption latency and
the total latency.

For more information, please refer to the 'Latency Hist article' in the
reference section below.

4. Switchtime latency
---------------------

The existing two on-line methods to record system latency (timer and
wakeup latency) do not completely reflect the user-space wakeup time,
since the duration of the context switch is not taken into account. This
patch adds two new histograms - one that records the duration of the
context switch and another one that records the sum of the timer delay,
the wakeup latency and the newly available duration of the context
switch. The latter histogram is probably best suited to determine the
worst-case total preemption latency of a given system.

NB: This patch (refer to 'Switchtime latency' under section 6) is presently
not part of the PREEMPT_RT patchset and has not yet been adapted to work
with the kernel event infrastructure. That can be done if the other patches
are considered for mainlining.

5. Mainlining and alternatives
------------------------------
As of now, the preemptirqsoff, timer and wakeup latencies are part of
the PREEMPT_RT patch series. The switchtime latency is not; it is
maintained separately in an off-tree repository. The preemptirqsoff
and timer latency histograms alone may not make sense, for the reasons
mentioned in section 3. Their implementation includes modifying
'task_struct', which might not sound feasible to many stakeholders.

The switchtime histogram has not been adapted for mainlining
for the same reason. Refer to the 'Latency Hist article' for more details.

An alternative to keeping the timestamp in 'task_struct', as suggested
by Daniel Wagner (for example in the wakeup histogram), is to create kprobe
events for 'wakeup' and 'wakeup_sched_switch' from userland and use the
'hist' triggers to somehow find the difference between the timestamps in
each of these events. This could involve using tracing_maps by
creating a global variable to keep the process's context, but that seems
like too much overhead at runtime. Moreover, it would require
considerable work from user space, as with many of the tracing
utilities, and would defeat the purpose of the simplicity of the
in-kernel histograms.

6. Reference
------------
Latency hist:
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb
Latency Hist article: https://www.osadl.org/Single-View.111+M538f5aa49d6.0.html
Switchtime latency: http://www.spinics.net/lists/linux-rt-users/msg14377.html

-Binoy

Binoy Jayan (3):
  tracing: Add preemptirqsoff timing events
  tracing: Measure delayed hrtimer offset latency
  tracing: wakeup latency events and histograms

Daniel Wagner

[RFC PATCH v7 2/5] tracing: Add hist trigger support for generic fields

2016-09-20 Thread Binoy Jayan
From: Daniel Wagner 

Whenever a trace is printed the generic fields (CPU, COMM) are
reconstructed (see trace_print_context()). CPU is taken from the
trace_iterator and COMM is extracted from the savedcmd map (see
__trace_find_cmdline()).

We can't reconstruct this information for hist events. Therefore this
information needs to be stored when a new event is added to the hist
buffer.

There is already support for extracting the COMM for the common_pid
field. For this the tracing_map_ops infrastructure is used. Unfortunately, we
can't reuse it because it extends an existing hist_field. That means we
first need to add a hist_field before we are able to reuse
tracing_map_ops.

Furthermore, it is quite easy to extend the current code to support
those two fields by adding hist_field_cpu() and hist_field_comm().

Signed-off-by: Daniel Wagner 
---
 kernel/trace/trace_events.c  | 13 +++--
 kernel/trace/trace_events_hist.c | 36 ++--
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 03c0a48..ea8da30 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -150,9 +150,10 @@ int trace_define_field(struct trace_event_call *call, 
const char *type,
 }
 EXPORT_SYMBOL_GPL(trace_define_field);
 
-#define __generic_field(type, item, filter_type)   \
+#define __generic_field(type, item, filter_type, size) \
ret = __trace_define_field(&ftrace_generic_fields, #type,   \
-  #item, 0, 0, is_signed_type(type),   \
+  #item, 0, size,  \
+  is_signed_type(type),\
   filter_type);\
if (ret)\
return ret;
@@ -170,10 +171,10 @@ static int trace_define_generic_fields(void)
 {
int ret;
 
-   __generic_field(int, CPU, FILTER_CPU);
-   __generic_field(int, cpu, FILTER_CPU);
-   __generic_field(char *, COMM, FILTER_COMM);
-   __generic_field(char *, comm, FILTER_COMM);
+   __generic_field(int, CPU, FILTER_CPU, sizeof(int));
+   __generic_field(int, cpu, FILTER_CPU, sizeof(int));
+   __generic_field(char *, COMM, FILTER_COMM, TASK_COMM_LEN + 1);
+   __generic_field(char *, comm, FILTER_COMM, TASK_COMM_LEN + 1);
 
return ret;
 }
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index f3a960e..7ed6743 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -75,6 +75,16 @@ static u64 hist_field_log2(struct hist_field *hist_field, 
void *event)
return (u64) ilog2(roundup_pow_of_two(val));
 }
 
+static u64 hist_field_cpu(struct hist_field *hist_field, void *event)
+{
+   return (u64) smp_processor_id();
+}
+
+static u64 hist_field_comm(struct hist_field *hist_field, void *event)
+{
+   return (u64) (unsigned long) current->comm;
+}
+
 #define DEFINE_HIST_FIELD_FN(type) \
 static u64 hist_field_##type(struct hist_field *hist_field, void *event)\
 {  \
@@ -119,6 +129,8 @@ enum hist_field_flags {
HIST_FIELD_FL_SYSCALL   = 128,
HIST_FIELD_FL_STACKTRACE= 256,
HIST_FIELD_FL_LOG2  = 512,
+   HIST_FIELD_FL_CPU   = 1024,
+   HIST_FIELD_FL_COMM  = 2048,
 };
 
 struct hist_trigger_attrs {
@@ -374,7 +386,13 @@ static struct hist_field *create_hist_field(struct 
ftrace_event_field *field,
if (WARN_ON_ONCE(!field))
goto out;
 
-   if (is_string_field(field)) {
+   if (field->filter_type == FILTER_CPU) {
+   flags |= HIST_FIELD_FL_CPU;
+   hist_field->fn = hist_field_cpu;
+   } else if (field->filter_type == FILTER_COMM) {
+   flags |= HIST_FIELD_FL_COMM;
+   hist_field->fn = hist_field_comm;
+   } else if (is_string_field(field)) {
flags |= HIST_FIELD_FL_STRING;
 
if (field->filter_type == FILTER_STATIC_STRING)
@@ -748,7 +766,8 @@ static int create_tracing_map_fields(struct 
hist_trigger_data *hist_data)
 
if (hist_field->flags & HIST_FIELD_FL_STACKTRACE)
cmp_fn = tracing_map_cmp_none;
-   else if (is_string_field(field))
+   else if (is_string_field(field) ||
+hist_field->flags & HIST_FIELD_FL_COMM)
cmp_fn = tracing_map_cmp_string;
else
cmp_fn = tracing_map_cmp_num(field->size,
@@ -856,11 +875,9 @@ static inline void add_to_key(char *compound_key, void 
*key,
   

[RFC PATCH v7 1/5] tracing: Dereference pointers without RCU checks

2016-09-20 Thread Binoy Jayan
From: Daniel Wagner 

The tracepoint can't be used in code sections where we are in the
middle of a state transition.

For example, if we place a tracepoint inside start/stop_critical_section(),
lockdep complains with

[0.035589] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3560 check_flags.part.36+0x1bc/0x210()
[0.036000] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[0.036000] Kernel panic - not syncing: panic_on_warn set ...
[0.036000]
[0.036000] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc7+ #460
[0.036000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[0.036000]  81f2463a 88007c93bb98 81afb317 0001
[0.036000]  81f212b3 88007c93bc18 81af7bc2 88007c93bbb8
[0.036000]  0008 88007c93bc28 88007c93bbc8 0093bbd8
[0.036000] Call Trace:
[0.036000]  [] dump_stack+0x4f/0x7b
[0.036000]  [] panic+0xc0/0x1e9
[0.036000]  [] ? _raw_spin_unlock_irqrestore+0x38/0x80
[0.036000]  [] warn_slowpath_common+0xc0/0xc0
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] warn_slowpath_fmt+0x46/0x50
[0.036000]  [] check_flags.part.36+0x1bc/0x210
[0.036000]  [] lock_is_held+0x78/0x90
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] ? __do_softirq+0x3db/0x500
[0.036000]  [] trace_preempt_on+0x255/0x260
[0.036000]  [] preempt_count_sub+0xab/0xf0
[0.036000]  [] __local_bh_enable+0x36/0x70
[0.036000]  [] __do_softirq+0x3db/0x500
[0.036000]  [] run_ksoftirqd+0x1f/0x60
[0.036000]  [] smpboot_thread_fn+0x193/0x2a0
[0.036000]  [] ? SyS_setgroups+0x150/0x150
[0.036000]  [] kthread+0xf2/0x110
[0.036000]  [] ? wait_for_completion+0xc3/0x120
[0.036000]  [] ? preempt_count_sub+0xab/0xf0
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000]  [] ret_from_fork+0x58/0x90
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000] ---[ end Kernel panic - not syncing: panic_on_warn set ...

PeterZ was kind enough to explain to me what is happening:

"__local_bh_enable() tests if this is the last SOFTIRQ_OFFSET, if so it
tells lockdep softirqs are enabled with trace_softirqs_on() after that
we go an actually modify the preempt_count with preempt_count_sub().
Then in preempt_count_sub() you call into trace_preempt_on() if this
was the last preempt_count increment but you do that _before_ you
actually change the preempt_count with __preempt_count_sub() at this
point lockdep and preempt_count think the world differs and *boom*"

So the simplest way to avoid this is by disabling the consistency
checks.

We also need to take care of the list iteration in trace_events_trigger.c
to avoid a similar splat in conjunction with the hist trigger.

Signed-off-by: Daniel Wagner 
Signed-off-by: Binoy Jayan 
---
 include/linux/rculist.h | 36 
 include/linux/tracepoint.h  |  4 ++--
 kernel/trace/trace_events_filter.c  |  4 ++--
 kernel/trace/trace_events_trigger.c |  6 +++---
 4 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 8beb98d..bee836b 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -279,6 +279,24 @@ static inline void list_splice_tail_init_rcu(struct 
list_head *list,
container_of(lockless_dereference(ptr), type, member)
 
 /**
+ * list_entry_rcu_notrace - get the struct for this entry (for tracing)
+ * @ptr:the &struct list_head pointer.
+ * @type:   the type of the struct this is embedded in.
+ * @member: the name of the list_head within the struct.
+ *
+ * This primitive may safely run concurrently with the _rcu list-mutation
+ * primitives such as list_add_rcu() as long as it's guarded by 
rcu_read_lock().
+ *
+ * This is the same as list_entry_rcu() except that it does
+ * not do any RCU debugging or tracing.
+ */
+#define list_entry_rcu_notrace(ptr, type, member) \
+({ \
+   typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
+   container_of((typeof(ptr))rcu_dereference_raw_notrace(__ptr), type, 
member); \
+})
+
+/**
  * Where are list_empty_rcu() and list_first_entry_rcu()?
  *
  * Implementing those functions following their counterparts list_empty() and
@@ -391,6 +409,24 @@ static inline void list_splice_tail_init_rcu(struct 
list_head *list,
 pos = list_entry_lockless(pos->member.next, typeof(*pos), member))
 
 /**
+ * list_for_each_entry_rcu_notrace -   iterate over rcu list of given 
type (for tracing)
+ * @pos:   the type * to use as a loop cursor.
+ * @head:  the head for your list.
+ * @member:the name of the list_head within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as list_add_rcu()
+ * as long 

[RFC PATCH v7 4/5] tracing: Measure delayed hrtimer offset latency

2016-09-20 Thread Binoy Jayan
Measure, in nanoseconds, the latency caused by delayed timer offsets, i.e.
by a timer expiry event that is handled late. This happens, for example,
when a timer misses its deadline because interrupts are disabled. A process
that is scheduled as a result of the timer expiration suffers this latency.
It is used to calculate the total wakeup latency of a process, which is the
sum of the delayed timer offset and the wakeup latency.
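
As a worked illustration of the arithmetic described above, the following
userspace C sketch models the start/stop bookkeeping with plain nanosecond
counters. It is only a model: struct timer_model and the three functions
are hypothetical names, times are int64_t instead of ktime_t, and where the
patch leaves the stored offset untouched for a non-positive latency, the
sketch simply returns 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two hrtimer fields the patch looks at. */
struct timer_model {
	int64_t expires_ns;	/* programmed expiry time */
	int64_t tim_expiry_ns;	/* "now" if armed already expired, else 0 */
};

/* Arm the timer: remember "now" when the requested expiry is in the past. */
void timing_start(struct timer_model *t, int64_t tim_ns, int64_t now_ns)
{
	t->expires_ns = tim_ns;
	t->tim_expiry_ns = (tim_ns < now_ns) ? now_ns : 0;
}

/* Timer fires at basenow_ns: how late is it relative to its reference? */
int64_t timing_stop(const struct timer_model *t, int64_t basenow_ns)
{
	int64_t ref = t->tim_expiry_ns ? t->tim_expiry_ns : t->expires_ns;
	int64_t latency = basenow_ns - ref;

	/* Simplification: report 0 instead of skipping the store. */
	return latency > 0 ? latency : 0;
}

/* Total wakeup latency = delayed timer offset + wakeup latency. */
int64_t total_wakeup_latency_ns(int64_t timer_offset_ns, int64_t wakeup_lat_ns)
{
	return timer_offset_ns + wakeup_lat_ns;
}
```

For example, a timer armed at t=500ns for t=1000ns that fires at t=1300ns
has an offset of 300ns; with a wakeup latency of 700ns the total is 1000ns.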

[
Initial work and idea by Carsten
Link: 
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb
]

Cc: Carsten Emde 
Signed-off-by: Binoy Jayan 
---
 include/linux/hrtimer.h |  4 
 include/linux/sched.h   |  3 +++
 kernel/time/Kconfig |  8 
 kernel/time/hrtimer.c   | 47 +++
 4 files changed, 62 insertions(+)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 5e00f80..05d8086 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -90,6 +90,7 @@ enum hrtimer_restart {
  * @is_rel:Set if the timer was armed relative
  * @start_pid:  timer statistics field to store the pid of the task which
  * started the timer
+ * @tim_expiry: hrtimer expiry time or 0 in case already expired
  * @start_site:timer statistics field to store the site where the timer
  * was started
  * @start_comm: timer statistics field to store the name of the process which
@@ -104,6 +105,9 @@ struct hrtimer {
struct hrtimer_clock_base   *base;
u8  state;
u8  is_rel;
+#ifdef CONFIG_TRACE_DELAYED_TIMER_OFFSETS
+   ktime_t tim_expiry;
+#endif
 #ifdef CONFIG_TIMER_STATS
int start_pid;
void*start_site;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 62c68e5..7bf67f8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1891,6 +1891,9 @@ struct task_struct {
/* bitmask and counter of trace recursion */
unsigned long trace_recursion;
 #endif /* CONFIG_TRACING */
+#ifdef CONFIG_TRACE_DELAYED_TIMER_OFFSETS
+   long timer_offset;
+#endif /* CONFIG_TRACE_DELAYED_TIMER_OFFSETS */
 #ifdef CONFIG_KCOV
/* Coverage collection mode enabled for this task (0 if disabled). */
enum kcov_mode kcov_mode;
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 4008d9f..de4793c 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -193,5 +193,13 @@ config HIGH_RES_TIMERS
  hardware is not capable then this option only increases
  the size of the kernel image.
 
+config TRACE_DELAYED_TIMER_OFFSETS
+   depends on HIGH_RES_TIMERS
+   select GENERIC_TRACER
+   bool "Delayed Timer Offsets"
+   help
+ Capture offsets of delayed hrtimer in nanoseconds. It is used
+ to construct wakeup latency histogram.
+
 endmenu
 endif
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9ba7c82..7048f86 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -56,6 +56,8 @@
 
 #include "tick-internal.h"
 
+static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer);
+
 /*
  * The timer bases:
  *
@@ -960,6 +962,47 @@ static inline ktime_t hrtimer_update_lowres(struct hrtimer 
*timer, ktime_t tim,
return tim;
 }
 
+#ifdef CONFIG_TRACE_DELAYED_TIMER_OFFSETS
+static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
+struct hrtimer_clock_base *new_base,
+ktime_t tim)
+{
+   ktime_t now = new_base->get_time();
+
+   if (ktime_to_ns(tim) < ktime_to_ns(now))
+   timer->tim_expiry = now;
+   else
+   timer->tim_expiry = ktime_set(0, 0);
+}
+
+static inline void latency_hrtimer_timing_stop(struct hrtimer *timer,
+   ktime_t basenow)
+{
+   long latency;
+   struct task_struct *task;
+
+   latency = ktime_to_ns(ktime_sub(basenow,
+ ktime_to_ns(timer->tim_expiry) ?
+ timer->tim_expiry : hrtimer_get_expires(timer)));
+   task = timer->function == hrtimer_wakeup ?
+   container_of(timer, struct hrtimer_sleeper,
+timer)->task : NULL;
+   if (task && latency > 0)
+   task->timer_offset = latency;
+}
+#else
+static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
+struct hrtimer_clock_base *new_base,
+ktime_t tim)
+{
+}
+static inline void latency_hrtimer_timing_stop(struct hrtimer *timer,
+   ktime_t basenow)
+{

[RFC PATCH v7 5/5] tracing: wakeup latency events and histograms

2016-09-20 Thread Binoy Jayan
These latencies usually occur during the wakeup of a process. To
determine this latency, the kernel stores the time stamp when a process
is scheduled to be woken up, and determines the duration of the wakeup
time shortly before control is passed over to this process. Note that the
apparent latency in user space may be somewhat longer, since the process
may be interrupted after control is passed over to it but before the
execution in user space takes place. Simply measuring the interval between
enqueuing and wakeup may also not be appropriate in cases where a process is
scheduled as a result of a timer expiration. The timer may have missed its
deadline, e.g. due to disabled interrupts, but this latency would not be
registered. Therefore, the offsets of missed hrtimers are recorded in the
same histogram. The missed hrtimer offsets and the wakeup latency together
contribute to the total latency. With the histogram triggers in place,
plots may be generated with a per-cpu breakdown of the events captured and
the latency measured in nanoseconds.

The following histogram triggers may be used:

'hist:key=cpu,ccomm:val=wakeup_lat,total_lat:sort=cpu'
'hist:key=cpu,ccomm:val=timeroffset,total_lat:sort=cpu'
'hist:key=cpu,ccomm:val=total_lat:sort=cpu'
'hist:key=ccomm:val=total_lat if cpu==0'
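
To make the aggregation behind such a trigger concrete, here is a small
userspace C sketch of a histogram keyed by cpu only that sums total_lat
and counts hits per key. It is a simplified model, not the
tracing_map-based implementation: the triggers above also key on ccomm,
and NR_CPUS_MODEL, struct cpu_bucket and hist_add_wakeup() are
hypothetical names chosen for the example.

```c
#include <stdint.h>

#define NR_CPUS_MODEL 4	/* assumption: small fixed CPU count for the sketch */

/* One histogram entry per key (here: per cpu). */
struct cpu_bucket {
	uint64_t hitcount;	/* number of events seen for this key */
	uint64_t total_lat;	/* sum of the 'total_lat' value field */
};

/*
 * Account one wakeup event. total_lat is wakeup_lat + timeroffset, as in
 * the latency_wakeup event. Returns 0 on success, -1 for a bad cpu.
 */
int hist_add_wakeup(struct cpu_bucket *hist, int cpu,
		    uint64_t wakeup_lat, uint64_t timeroffset)
{
	if (cpu < 0 || cpu >= NR_CPUS_MODEL)
		return -1;

	hist[cpu].hitcount++;
	hist[cpu].total_lat += wakeup_lat + timeroffset;
	return 0;
}
```

Dividing total_lat by hitcount for a key then gives the average total
latency observed on that cpu.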

Enable the tracer 'wakeup' or 'wakeup_rt' to capture wakeup latencies of
the respective processes.

In '/sys/kernel/debug/tracing'

echo wakeup > current_tracer

[
Initial work and idea by Carsten
Link: 
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb
]

Cc: Carsten Emde 
Signed-off-by: Binoy Jayan 
---
 include/linux/sched.h |  3 +++
 include/trace/events/sched.h  | 34 ++
 kernel/trace/Kconfig  | 10 ++
 kernel/trace/trace_sched_wakeup.c | 35 ---
 4 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7bf67f8..82f3b62 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1894,6 +1894,9 @@ struct task_struct {
 #ifdef CONFIG_TRACE_DELAYED_TIMER_OFFSETS
long timer_offset;
 #endif /* CONFIG_TRACE_DELAYED_TIMER_OFFSETS */
+#ifdef CONFIG_TRACE_EVENTS_WAKEUP_LATENCY
+   u64 wakeup_timestamp_start;
+#endif
 #ifdef CONFIG_KCOV
/* Coverage collection mode enabled for this task (0 if disabled). */
enum kcov_mode kcov_mode;
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 9b90c57..c8b81d0 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -562,6 +562,40 @@ TRACE_EVENT(sched_wake_idle_without_ipi,
 
TP_printk("cpu=%d", __entry->cpu)
 );
+
+#ifdef CONFIG_TRACE_EVENTS_WAKEUP_LATENCY
+/**
+ * latency_wakeup - Called when process is woken up
+ * @next:  task to be woken up
+ * @wakeup_lat:process wakeup latency in nano seconds
+ */
+TRACE_EVENT(latency_wakeup,
+
+   TP_PROTO(struct task_struct *next, u64 wakeup_latency),
+   TP_ARGS(next, wakeup_latency),
+
+   TP_STRUCT__entry(
+   __array(char,   ccomm,  TASK_COMM_LEN)
+   __field(int,cprio)
+   __field(unsigned long,  wakeup_lat)
+   __field(unsigned long,  timeroffset)
+   __field(unsigned long,  total_lat)
+   ),
+
+   TP_fast_assign(
+   memcpy(__entry->ccomm, next->comm, TASK_COMM_LEN);
+   __entry->cprio  = next->prio;
+   __entry->wakeup_lat = wakeup_latency;
+   __entry->timeroffset = next->timer_offset;
+   __entry->total_lat = wakeup_latency + next->timer_offset;
+   ),
+
+   TP_printk("curr=%s[%d] wakeup_lat=%lu timeroffset=%ld total_lat=%lu",
+   __entry->ccomm, __entry->cprio, __entry->wakeup_lat,
+   __entry->timeroffset, __entry->total_lat)
+);
+#endif
+
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index f4b86e8..20cf135 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -634,6 +634,16 @@ config RING_BUFFER_STARTUP_TEST
 
 If unsure, say N
 
+config TRACE_EVENTS_WAKEUP_LATENCY
+   bool "Trace wakeup latency events"
+   depends on TRACE_DELAYED_TIMER_OFFSETS
+   depends on SCHED_TRACER
+   help
+Generate the total wakeup latency of a process. It includes the
+wakeup latency and the timer offset latency. Wakeup latency is the
+difference in the time when a process is scheduled to be woken up
+and when it is actually woken up. It depends on the wakeup tracer.
+
 config TRACE_ENUM_MAP_FILE
bool "Show enum mappings f

[RFC PATCH v7 3/5] tracing: Add preemptirqsoff timing events

2016-09-20 Thread Binoy Jayan
Potential sources of latencies are code segments where interrupts,
preemption or both are disabled (aka critical sections). To create
histograms of potential sources of latency, the kernel stores the time
stamp at the start of a critical section, determines the time elapsed
when the end of the section is reached, and increments the frequency
counter of that latency value - irrespective of whether any concurrently
running process is affected by latency or not.

With the tracepoints for irqs off, preempt off and critical timing added
at the end of the critical sections, the potential sources of latencies
which occur in these sections can be measured in nanoseconds. With the
hist triggers in place, the histogram plots may be generated, with per-cpu
breakdown of events captured. It is based on linux kernel's event
infrastructure.

The following filter(s) may be used with the histograms

'hist:key=latency.log2:val=hitcount:sort=latency'
'hist:key=ltype,latency:val=hitcount:sort=latency if cpu==1'
'hist:key=ltype:val=latency:sort=ltype if ltype==0 && cpu==2'

Where ltype is
1: IRQSOFF latency
2: PREEMPTOFF Latency
3: Critical Timings
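
For readers unfamiliar with the '.log2' modifier used in the first
trigger, the sketch below shows the bucketing in userspace C. It mirrors
the expression ilog2(roundup_pow_of_two(val)) from hist_field_log2(); the
helper names are made up for the example, and the handling of 0 and of
values above 2^63 is an assumption of this sketch, not something the
kernel helpers define.

```c
#include <stdint.h>

/* floor(log2(v)) for v > 0; returns 0 for v == 0 (sketch assumption). */
unsigned int ilog2_u64(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Smallest power of two >= v, valid for v <= 2^63; 0 is treated as 1. */
uint64_t roundup_pow_of_two_u64(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/* Bucket index for a latency value, as used by a 'latency.log2' key. */
unsigned int latency_log2_bucket(uint64_t latency_ns)
{
	/* Clamp values that cannot be rounded up within 64 bits. */
	if (latency_ns > (UINT64_C(1) << 63))
		return 64;
	return ilog2_u64(roundup_pow_of_two_u64(latency_ns));
}
```

With this rounding, a latency of 1000ns and one of 1024ns both land in
bucket 10, while 1025ns lands in bucket 11.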

Enable one or more of the following tracers to capture the associated
latencies i.e. irq/preempt/critical timing

In '/sys/kernel/debug/tracing'

echo irqsoff > current_tracer - irq and critical time latencies
echo preemptoff > current_tracer - preempt and critical time latencies
echo preemptirqsoff > current_tracer - irq, preempt and critical timing

[
- Initial work and idea by Carsten as part of PREEMPT_RT patch series
  Link: 
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb
  No code was taken from the RT patch.
- RFC using hist infrastructure code by Daniel.
- Rewritten after Daniel suggested that I take over authorship.
]

Cc: Carsten Emde 
Cc: Daniel Wagner 
Signed-off-by: Binoy Jayan 
---
 include/trace/events/latency.h | 62 ++
 kernel/trace/trace_irqsoff.c   | 43 +++--
 2 files changed, 91 insertions(+), 14 deletions(-)
 create mode 100644 include/trace/events/latency.h

diff --git a/include/trace/events/latency.h b/include/trace/events/latency.h
new file mode 100644
index 000..66442d5
--- /dev/null
+++ b/include/trace/events/latency.h
@@ -0,0 +1,62 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM latency
+
+#if !defined(_TRACE_HIST_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HIST_H
+
+#include 
+
+#ifndef __TRACE_LATENCY_TYPE
+#define __TRACE_LATENCY_TYPE
+
+enum latency_type {
+   LT_NONE,
+   LT_IRQ,
+   LT_PREEMPT,
+   LT_CRITTIME,
+   LT_MAX
+};
+
+TRACE_DEFINE_ENUM(LT_IRQ);
+TRACE_DEFINE_ENUM(LT_PREEMPT);
+TRACE_DEFINE_ENUM(LT_CRITTIME);
+
+#define show_ltype(type)   \
+   __print_symbolic(type,  \
+   { LT_IRQ,   "IRQ" },\
+   { LT_PREEMPT,   "PREEMPT" },\
+   { LT_CRITTIME,  "CRIT_TIME" })
+#endif
+
+DECLARE_EVENT_CLASS(latency_template,
+   TP_PROTO(int ltype, u64 latency),
+
+   TP_ARGS(ltype, latency),
+
+   TP_STRUCT__entry(
+   __field(int,ltype)
+   __field(u64,latency)
+   ),
+
+   TP_fast_assign(
+   __entry->ltype  = ltype;
+   __entry->latency= latency;
+   ),
+
+   TP_printk("ltype=%s(%d), latency=%lu", show_ltype(__entry->ltype),
+ __entry->ltype, (unsigned long) __entry->latency)
+);
+
+/**
+ * latency_preemptirqsoff - called when a cpu exits state of preemption / irq
+ * @ltype: type of the critical section. Refer 'show_ltype'
+ * @latency:   latency in nano seconds
+ */
+DEFINE_EVENT(latency_template, latency_preemptirqsoff,
+   TP_PROTO(int ltype, u64 latency),
+   TP_ARGS(ltype, latency));
+
+#endif /* _TRACE_HIST_H */
+
+/* This part must be outside protection */
+#include 
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 03cdff8..4f7442d 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -13,9 +13,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "trace.h"
 
+#define CREATE_TRACE_POINTS
+#include 
+
 static struct trace_array  *irqsoff_trace __read_mostly;
 static int tracer_enabled __read_mostly;
 
@@ -298,11 +303,18 @@ static bool report_latency(struct trace_array *tr, 
cycle_t delta)
return true;
 }
 
+static inline void latency_preemptirqsoff_timing(enum latency_type type,
+cycle_t delta)
+{
+   trace_latency_preemptirqsoff(type, (u64) delta);
+}
+
 static void
 ch

Re: [RFC PATCH v7 4/5] tracing: Measure delayed hrtimer offset latency

2016-09-20 Thread Binoy Jayan
On 20 September 2016 at 19:49, Thomas Gleixner  wrote:
> On Tue, 20 Sep 2016, Binoy Jayan wrote:
>> +#ifdef CONFIG_TRACE_DELAYED_TIMER_OFFSETS
>> +static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
>> +  struct hrtimer_clock_base *new_base,
>> +  ktime_t tim)
>> +{
>> + ktime_t now = new_base->get_time();
>> +
>> + if (ktime_to_ns(tim) < ktime_to_ns(now))
>> + timer->tim_expiry = now;
>> + else
>> + timer->tim_expiry = ktime_set(0, 0);
>
> You still fail to explain why this get_time() magic is required.
>
> This is executed unconditionally when the config switch is enabled and does
> not depend on whether the trace functionality is enabled or not. So you are
> imposing the extra get_time() call, which can be expensive depending on the
> underlying hardware, on every hrtimer start invocation.
>
> Tracing is supposed to have ZERO impact when it is not used and even when
> it's in use then the impact should be kept as low as possible. The above
> does none of that.
>
> Neither did you provide a proper rationale for this infrastructure in the
> changelog.
>
> You can repost that over and over and it will not go anywhere if you don't
> start to address the review comments I give you.

Hi Thomas,

Sorry, I failed to address this comment of yours. From what I understand,
get_time() is needed to get a more accurate current time when the
hrtimer base is changed (from the cpu on which it fired to the cpu on
which it is currently made to run or is restarted), wherein the hrtimer
base needs to be switched to the new cpu, provided that it is not running
in pinned mode.

Carsten,

Could you please comment on that?

Thanks,
Binoy


Re: [RFC PATCH v7 4/5] tracing: Measure delayed hrtimer offset latency

2016-09-22 Thread Binoy Jayan
Hi Thomas,

Thank you for the reply and sharing your insights.

On 21 September 2016 at 21:28, Thomas Gleixner  wrote:

> Sorry. This has nothing to do with changing the hrtimer_base, simply
> because the time base is the same on all cpus.

The condition 'ktime_to_ns(tim) < ktime_to_ns(now)' checks if the timer
has already expired w.r.t. 'soft timeout' value as it does not include
the slack value 'delta_ns'. In that case 'tim_expiry' is normalized to
the current time. (I was under the impression that this inaccuracy
could be because timer was initially running on a different cpu. If that
is not the case, I guess we can use the code mentioned below).

Otherwise it postpones the decision of storing the expiry value
until the hrtimer interrupt. In this case, it calculates the latency
using the hard timeout (which includes the fuzz) as returned by a call
to 'hrtimer_get_expires' in 'latency_hrtimer_timing_stop'.

Since for calculating latency, we use hard timeout value which includes
the slack, and since the actual timeout might have happened in between
the soft and hard timeout, the actual expiry time could be less than
the hard expiry time. This is why latency is checked for negative value
before storing when the trace point is hit.

static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
						ktime_t tim)
{
	/* Remember the expiry value the latency is later measured against */
	timer->tim_expiry = tim;
}

static inline void latency_hrtimer_timing_stop(struct hrtimer *timer,
					       ktime_t basenow)
{
	long latency;
	struct task_struct *task;

	latency = ktime_to_ns(basenow) - hrtimer_get_softexpires_tv64(timer);

	task = timer->function == hrtimer_wakeup ?
			container_of(timer, struct hrtimer_sleeper,
				     timer)->task : NULL;
	/* Now the check for latency may not be needed */
	if (task && latency > 0)
		task->timer_offset = latency;
}

I am using 'hrtimer_get_softexpires_tv64' instead of 'hrtimer_get_expires'
so that 'latency' is never negative. Please let me know if this looks ok.
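
The difference between the two reference points can be worked through with
plain numbers. The sketch below is a userspace illustration only, not kernel
code: 'struct fake_timer', 'latency_vs_soft()' and 'latency_vs_hard()' are
hypothetical names, and the nanosecond values are made up for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two expiry values an hrtimer carries. */
struct fake_timer {
	int64_t softexpires_ns;	/* earliest time the timer may fire */
	int64_t expires_ns;	/* soft expiry plus slack (hard expiry) */
};

/* Latency measured against the soft expiry. */
static inline int64_t latency_vs_soft(const struct fake_timer *t,
				      int64_t now_ns)
{
	return now_ns - t->softexpires_ns;
}

/* Latency measured against the hard expiry (includes the slack). */
static inline int64_t latency_vs_hard(const struct fake_timer *t,
				      int64_t now_ns)
{
	return now_ns - t->expires_ns;
}
```

With a soft expiry of 1000 ns, a slack of 50 ns and the handler running at
1020 ns, the soft-expiry reference yields 20 ns while the hard-expiry
reference yields -30 ns, which is the negative case the 'latency > 0' check
guards against.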

> It's not Carsten's responsibility to explain the nature of the change.
> You are submitting that code and so it's your job to provide proper
> explanations and justifications. If you can't do that, then how do you
> think that the review process, which is a feedback loop between the
> reviewer and the submitter, should work?
>
> Answer: It cannot work that way. I hope I don't have to explain why.

Sure, I'll avoid this confusion in the future. I think I should have talked
to him only offline and not here.

Thanks,
Binoy


[PATCH] staging: unisys: visorbus: Replace semaphore with mutex

2016-06-19 Thread Binoy Jayan
The semaphore 'visordriver_callback_lock' is a simple mutex, so
it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/unisys/include/visorbus.h   |  3 ++-
 drivers/staging/unisys/visorbus/visorbus_main.c | 14 +++---
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/unisys/include/visorbus.h b/drivers/staging/unisys/include/visorbus.h
index 9baf1ec..38edca8 100644
--- a/drivers/staging/unisys/include/visorbus.h
+++ b/drivers/staging/unisys/include/visorbus.h
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "periodic_work.h"
 #include "channel.h"
@@ -159,7 +160,7 @@ struct visor_device {
struct list_head list_all;
struct periodic_work *periodic_work;
bool being_removed;
-   struct semaphore visordriver_callback_lock;
+   struct mutex visordriver_callback_lock;
bool pausing;
bool resuming;
u32 chipset_bus_no;
diff --git a/drivers/staging/unisys/visorbus/visorbus_main.c b/drivers/staging/unisys/visorbus/visorbus_main.c
index 3a147db..93996a5 100644
--- a/drivers/staging/unisys/visorbus/visorbus_main.c
+++ b/drivers/staging/unisys/visorbus/visorbus_main.c
@@ -544,10 +544,10 @@ dev_periodic_work(void *xdev)
struct visor_device *dev = xdev;
struct visor_driver *drv = to_visor_driver(dev->device.driver);
 
-   down(&dev->visordriver_callback_lock);
+   mutex_lock(&dev->visordriver_callback_lock);
if (drv->channel_interrupt)
drv->channel_interrupt(dev);
-   up(&dev->visordriver_callback_lock);
+   mutex_unlock(&dev->visordriver_callback_lock);
if (!visor_periodic_work_nextperiod(dev->periodic_work))
put_device(&dev->device);
 }
@@ -588,7 +588,7 @@ visordriver_probe_device(struct device *xdev)
if (!drv->probe)
return -ENODEV;
 
-   down(&dev->visordriver_callback_lock);
+   mutex_lock(&dev->visordriver_callback_lock);
dev->being_removed = false;
 
res = drv->probe(dev);
@@ -598,7 +598,7 @@ visordriver_probe_device(struct device *xdev)
fix_vbus_dev_info(dev);
}
 
-   up(&dev->visordriver_callback_lock);
+   mutex_unlock(&dev->visordriver_callback_lock);
return res;
 }
 
@@ -614,11 +614,11 @@ visordriver_remove_device(struct device *xdev)
 
dev = to_visor_device(xdev);
drv = to_visor_driver(xdev->driver);
-   down(&dev->visordriver_callback_lock);
+   mutex_lock(&dev->visordriver_callback_lock);
dev->being_removed = true;
if (drv->remove)
drv->remove(dev);
-   up(&dev->visordriver_callback_lock);
+   mutex_unlock(&dev->visordriver_callback_lock);
dev_stop_periodic_work(dev);
 
put_device(&dev->device);
@@ -778,7 +778,7 @@ create_visor_device(struct visor_device *dev)
POSTCODE_LINUX_4(DEVICE_CREATE_ENTRY_PC, chipset_dev_no, chipset_bus_no,
 POSTCODE_SEVERITY_INFO);
 
-   sema_init(&dev->visordriver_callback_lock, 1);  /* unlocked */
+   mutex_init(&dev->visordriver_callback_lock);
dev->device.bus = &visorbus_type;
dev->device.groups = visorbus_channel_groups;
device_initialize(&dev->device);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 1/2] staging: wilc1000: message_queue: Move code to host interface

2016-06-20 Thread Binoy Jayan
Move the contents of wilc_msgqueue.c and wilc_msgqueue.h into
host_interface.c, remove 'wilc_msgqueue.c' and 'wilc_msgqueue.h'.
This is done so as to restructure the implementation of the kthread
'hostIFthread' using a work queue.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/Makefile |   1 -
 drivers/staging/wilc1000/host_interface.c | 163 +-
 drivers/staging/wilc1000/wilc_msgqueue.c  | 144 --
 drivers/staging/wilc1000/wilc_msgqueue.h  |  28 -
 4 files changed, 162 insertions(+), 174 deletions(-)
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.c
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.h

diff --git a/drivers/staging/wilc1000/Makefile b/drivers/staging/wilc1000/Makefile
index acc3f3e..d226283 100644
--- a/drivers/staging/wilc1000/Makefile
+++ b/drivers/staging/wilc1000/Makefile
@@ -6,7 +6,6 @@ ccflags-y += -DFIRMWARE_1002=\"atmel/wilc1002_firmware.bin\" \
 ccflags-y += -I$(src)/ -DWILC_ASIC_A0 -DWILC_DEBUGFS
 
 wilc1000-objs := wilc_wfi_cfgoperations.o linux_wlan.o linux_mon.o \
-   wilc_msgqueue.o \
coreconfigurator.o host_interface.o \
wilc_wlan_cfg.o wilc_debugfs.o \
wilc_wlan.o
diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 9535842..494345b 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -3,11 +3,13 @@
 #include 
 #include 
 #include 
+#include 
 #include "host_interface.h"
+#include 
+#include 
 #include "coreconfigurator.h"
 #include "wilc_wlan.h"
 #include "wilc_wlan_if.h"
-#include "wilc_msgqueue.h"
 #include 
 #include "wilc_wfi_netdevice.h"
 
@@ -57,6 +59,20 @@
 #define TCP_ACK_FILTER_LINK_SPEED_THRESH   54
 #define DEFAULT_LINK_SPEED 72
 
+struct message {
+   void *buf;
+   u32 len;
+   struct list_head list;
+};
+
+struct message_queue {
+   struct semaphore sem;
+   spinlock_t lock;
+   bool exiting;
+   u32 recv_count;
+   struct list_head msg_list;
+};
+
 struct host_if_wpa_attr {
u8 *key;
const u8 *mac_addr;
@@ -264,6 +280,151 @@ static struct wilc_vif *join_req_vif;
 static void *host_int_ParseJoinBssParam(struct network_info *ptstrNetworkInfo);
 static int host_int_get_ipaddress(struct wilc_vif *vif, u8 *ip_addr, u8 idx);
 static s32 Handle_ScanDone(struct wilc_vif *vif, enum scan_event enuEvent);
+static int wilc_mq_create(struct message_queue *mq);
+static int wilc_mq_send(struct message_queue *mq,
+const void *send_buf, u32 send_buf_size);
+static int wilc_mq_recv(struct message_queue *mq,
+void *recv_buf, u32 recv_buf_size, u32 *recv_len);
+static int wilc_mq_destroy(struct message_queue *mq);
+
+/*!
+ *  @authorsyounan
+ *  @date  1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_create(struct message_queue *mq)
+{
+   spin_lock_init(&mq->lock);
+   sema_init(&mq->sem, 0);
+   INIT_LIST_HEAD(&mq->msg_list);
+   mq->recv_count = 0;
+   mq->exiting = false;
+   return 0;
+}
+
+/*!
+ *  @authorsyounan
+ *  @date  1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_destroy(struct message_queue *mq)
+{
+   struct message *msg;
+
+   mq->exiting = true;
+
+   /* Release any waiting receiver thread. */
+   while (mq->recv_count > 0) {
+   up(&mq->sem);
+   mq->recv_count--;
+   }
+
+   while (!list_empty(&mq->msg_list)) {
+   msg = list_first_entry(&mq->msg_list, struct message, list);
+   list_del(&msg->list);
+   kfree(msg->buf);
+   }
+
+   return 0;
+}
+
+/*!
+ *  @authorsyounan
+ *  @date  1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_send(struct message_queue *mq,
+const void *send_buf, u32 send_buf_size)
+{
+   unsigned long flags;
+   struct message *new_msg = NULL;
+
+   if (!mq || (send_buf_size == 0) || !send_buf)
+   return -EINVAL;
+
+   if (mq->exiting)
+   return -EFAULT;
+
+   /* construct a new message */
+   new_msg = kmalloc(sizeof(*new_msg), GFP_ATOMIC);
+   if (!new_msg)
+   return -ENOMEM;
+
+   new_msg->len = send_buf_size;
+   INIT_LIST_HEAD(&new_msg->list);
+   new_msg->buf = kmemdup(send_buf, send_buf_size, GFP_ATOMIC);
+   

[PATCH v2 0/2] *** staging: wilc1000: Replace semaphores ***

2016-06-20 Thread Binoy Jayan
This is the second patch series for 'wilc1000'. The original patch series
consisted of 7 patches, of which only the first 5 are good. Patches 6 and 7
are reworked in this series in a different way.

This patch series removes the semaphore 'sem' in 'wilc1000' and also
restructures the implementation of kthread / message_queue logic with
a create_singlethread_workqueue() / queue_work() setup.

These are part of a bigger effort to eliminate all semaphores
from the linux kernel.

They build correctly (individually and as a whole).

NB: The changes are untested

Rework on the review comments by Arnd w.r.t. v1 for patch 2:

struct message_queue can be removed since
 - after the workqueue conversion, mq->sem is no longer needed
 - recv_count is not needed, it just counts the number of entries in the list
 - the 'struct wilc' pointer can be retrieved from the host_if_msg (vif->wilc)
 - the message list is not needed because we always look only at the
   first entry, except in wilc_mq_destroy(), but it would be better
   to just call destroy_workqueue(), which also drains the remaining work.
 - the exiting flag is also handled by destroy_workqueue()   
 - with everything else gone, the spinlock is also not needed any more.

Do 'kfree' only at the end of 'host_if_work' 

wilc_initialized is always '1' so the conditional 'wilc_mq_send'
in 'hostIFthread' can be removed.

A connect command (HOST_IF_MSG_CONNECT) does not complete while a scan is
ongoing. So, the special handling of this command needs to be preserved.

Use create_singlethread_workqueue() instead of alloc_workqueue(), so that
we stay closer to the current behavior by having the thread run only
on one CPU at a time and not having a 'dedicated' thread for each.

Split the patch to separate the interface changes to 'wilc_mq_send':
no easy way was found to split out the interface change from
'wilc_mq_send' to 'wilc_enqueue_cmd', as the parameters 'mq', 'send_buf'
and 'send_buf_size' are themselves part of the message queue
implementation.
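
The idea of embedding the work item in the message, so that no separate
queue structure is needed, can be illustrated outside the kernel. The
following is a userspace sketch under stated assumptions: 'struct work_item',
'struct cmd_msg', 'cmd_msg_alloc()', 'run_work()' and the local
'container_of' macro are hypothetical stand-ins for 'struct work_struct',
'struct host_if_msg', the enqueue helper, the workqueue core and the kernel
macro. It is not the driver code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for 'struct work_struct'. */
struct work_item {
	void (*func)(struct work_item *work);
};

/* Hypothetical stand-in for 'struct host_if_msg' with the work embedded. */
struct cmd_msg {
	int id;
	int *result;		/* where the handler stores its outcome */
	struct work_item work;
};

/* Handler: recover the enclosing message from the embedded work item. */
static void cmd_msg_work(struct work_item *work)
{
	struct cmd_msg *msg = container_of(work, struct cmd_msg, work);

	*msg->result = msg->id * 2;
	free(msg);		/* free only at the end of the handler */
}

/* Allocate a message and prepare its embedded work item. */
static struct cmd_msg *cmd_msg_alloc(int id, int *result)
{
	struct cmd_msg *msg = malloc(sizeof(*msg));

	if (!msg)
		return NULL;
	msg->id = id;
	msg->result = result;
	msg->work.func = cmd_msg_work;
	return msg;
}

/* Stand-in for the workqueue core running one queued item. */
static void run_work(struct work_item *work)
{
	work->func(work);
}
```

Because the handler gets back to its message through the embedded member,
the list, the counter and the lock of a hand-written queue have nothing left
to do, which is the reasoning behind dropping 'struct message_queue'.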


Binoy Jayan (2):
  staging: wilc1000: message_queue: Move code to host interface
  staging: wilc1000: Replace kthread with workqueue for host interface

 drivers/staging/wilc1000/Makefile |   1 -
 drivers/staging/wilc1000/TODO |   5 +
 drivers/staging/wilc1000/host_interface.c | 417 +++---
 drivers/staging/wilc1000/wilc_msgqueue.c  | 144 ---
 drivers/staging/wilc1000/wilc_msgqueue.h  |  28 --
 5 files changed, 216 insertions(+), 379 deletions(-)
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.c
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.h




[PATCH v2 2/2] staging: wilc1000: Replace kthread with workqueue for host interface

2016-06-20 Thread Binoy Jayan
Deconstruct the kthread / message_queue logic, replacing it with
create_singlethread_workqueue() / queue_work() setup, by adding a
'struct work_struct' to 'struct host_if_msg'. The current kthread
hostIFthread() is converted to a work queue helper with the name
'host_if_work'.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/TODO |   5 +
 drivers/staging/wilc1000/host_interface.c | 542 +++---
 2 files changed, 198 insertions(+), 349 deletions(-)

diff --git a/drivers/staging/wilc1000/TODO b/drivers/staging/wilc1000/TODO
index 95199d8..ec93b2e 100644
--- a/drivers/staging/wilc1000/TODO
+++ b/drivers/staging/wilc1000/TODO
@@ -4,6 +4,11 @@ TODO:
 - remove custom debug and tracing functions
 - rework comments and function headers(also coding style)
 - replace all semaphores with mutexes or completions
+- Move handling for each individual members of 'union message_body' out
+  into a separate 'struct work_struct' and completely remove the multiplexer
+  that is currently part of host_if_work(), allowing movement of the
+  implementation of each message handler into the callsite of the function
+  that currently queues the 'host_if_msg'.
 - make spi and sdio components coexist in one build
 - turn compile-time platform configuration (BEAGLE_BOARD,
   PANDA_BOARD, PLAT_WMS8304, PLAT_RK, CUSTOMER_PLATFORM, ...)
diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 494345b..92d4561 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "host_interface.h"
 #include 
 #include 
@@ -59,20 +60,6 @@
 #define TCP_ACK_FILTER_LINK_SPEED_THRESH   54
 #define DEFAULT_LINK_SPEED 72
 
-struct message {
-   void *buf;
-   u32 len;
-   struct list_head list;
-};
-
-struct message_queue {
-   struct semaphore sem;
-   spinlock_t lock;
-   bool exiting;
-   u32 recv_count;
-   struct list_head msg_list;
-};
-
 struct host_if_wpa_attr {
u8 *key;
const u8 *mac_addr;
@@ -211,6 +198,7 @@ struct host_if_msg {
u16 id;
union message_body body;
struct wilc_vif *vif;
+   struct work_struct work;
 };
 
 struct join_bss_param {
@@ -245,8 +233,7 @@ struct join_bss_param {
 static struct host_if_drv *terminated_handle;
 bool wilc_optaining_ip;
 static u8 P2P_LISTEN_STATE;
-static struct task_struct *hif_thread_handler;
-static struct message_queue hif_msg_q;
+static struct workqueue_struct *hif_workqueue;
 static struct completion hif_thread_comp;
 static struct completion hif_driver_comp;
 static struct completion hif_wait_response;
@@ -280,55 +267,8 @@ static struct wilc_vif *join_req_vif;
 static void *host_int_ParseJoinBssParam(struct network_info *ptstrNetworkInfo);
 static int host_int_get_ipaddress(struct wilc_vif *vif, u8 *ip_addr, u8 idx);
 static s32 Handle_ScanDone(struct wilc_vif *vif, enum scan_event enuEvent);
-static int wilc_mq_create(struct message_queue *mq);
-static int wilc_mq_send(struct message_queue *mq,
-const void *send_buf, u32 send_buf_size);
-static int wilc_mq_recv(struct message_queue *mq,
-void *recv_buf, u32 recv_buf_size, u32 *recv_len);
-static int wilc_mq_destroy(struct message_queue *mq);
-
-/*!
- *  @authorsyounan
- *  @date  1 Sep 2010
- *  @note  copied from FLO glue implementatuion
- *  @version   1.0
- */
-static int wilc_mq_create(struct message_queue *mq)
-{
-   spin_lock_init(&mq->lock);
-   sema_init(&mq->sem, 0);
-   INIT_LIST_HEAD(&mq->msg_list);
-   mq->recv_count = 0;
-   mq->exiting = false;
-   return 0;
-}
-
-/*!
- *  @authorsyounan
- *  @date  1 Sep 2010
- *  @note  copied from FLO glue implementatuion
- *  @version   1.0
- */
-static int wilc_mq_destroy(struct message_queue *mq)
-{
-   struct message *msg;
-
-   mq->exiting = true;
-
-   /* Release any waiting receiver thread. */
-   while (mq->recv_count > 0) {
-   up(&mq->sem);
-   mq->recv_count--;
-   }
-
-   while (!list_empty(&mq->msg_list)) {
-   msg = list_first_entry(&mq->msg_list, struct message, list);
-   list_del(&msg->list);
-   kfree(msg->buf);
-   }
-
-   return 0;
-}
+static int wilc_enqueue_cmd(struct host_if_msg *msg);
+static void host_if_work(struct work_struct *work);
 
 /*!
  *  @authorsyounan
@@ -336,95 +276,19 @@ static int wilc_mq_destroy(struct message_queue *mq)
  *  @note  copied from FLO glue implementatuion
  *  @version   1.0
  */
-static int wilc_mq_s

[PATCH 0/2] *** Latency Histogram ***

2016-07-27 Thread Binoy Jayan
Hi,

I was looking at these RT kernel patches and was wondering why they have
not been upstreamed yet. I have made a few changes to these patches to
make them compliant with the upstream submission process, and did some
minimal testing on my msm board. Can someone from the rt kernel team shed
some light on why this is not upstream yet? Also, is there some way to
test this thoroughly on a board with a high resolution timer running a
mainline (and not rt) kernel?

Binoy

Carsten Emde (1):
  tracing: Add latency histograms

Yang Shi (1):
  trace: Add missing tracer macros

 Documentation/trace/histograms.txt  |  186 ++
 include/linux/hrtimer.h |3 +
 include/linux/sched.h   |6 +
 include/trace/events/hist.h |   74 +++
 include/trace/events/latency_hist.h |  136 +
 kernel/time/hrtimer.c   |   20 +
 kernel/trace/Kconfig|  104 
 kernel/trace/Makefile   |4 +
 kernel/trace/latency_hist.c | 1091 +++
 kernel/trace/trace_irqsoff.c|9 +
 10 files changed, 1633 insertions(+)
 create mode 100644 Documentation/trace/histograms.txt
 create mode 100644 include/trace/events/hist.h
 create mode 100644 include/trace/events/latency_hist.h
 create mode 100644 kernel/trace/latency_hist.c




[PATCH 1/2] tracing: Add latency histograms

2016-07-27 Thread Binoy Jayan
From: Carsten Emde 

This patch provides a recording mechanism to store data of potential
sources of system latencies. The recordings separately determine the
latency caused by a delayed timer expiration, by a delayed wakeup of the
related user space program and by the sum of both. The histograms can be
enabled and reset individually. The data are accessible via the debug
filesystem. The patch is adapted from the rt kernel and restructured to
avoid unwanted macros from source files. For details please consult
Documentation/trace/histograms.txt.

Signed-off-by: Carsten Emde 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Binoy Jayan 
---
 Documentation/trace/histograms.txt  |  186 ++
 include/linux/hrtimer.h |3 +
 include/linux/sched.h   |6 +
 include/trace/events/hist.h |   72 +++
 include/trace/events/latency_hist.h |  136 +
 kernel/time/hrtimer.c   |   20 +
 kernel/trace/Kconfig|  104 
 kernel/trace/Makefile   |4 +
 kernel/trace/latency_hist.c | 1091 +++
 kernel/trace/trace_irqsoff.c|9 +
 10 files changed, 1631 insertions(+)
 create mode 100644 Documentation/trace/histograms.txt
 create mode 100644 include/trace/events/hist.h
 create mode 100644 include/trace/events/latency_hist.h
 create mode 100644 kernel/trace/latency_hist.c

diff --git a/Documentation/trace/histograms.txt b/Documentation/trace/histograms.txt
new file mode 100644
index 000..60644ff
--- /dev/null
+++ b/Documentation/trace/histograms.txt
@@ -0,0 +1,186 @@
+   Using the Linux Kernel Latency Histograms
+
+
+This document gives a short explanation how to enable, configure and use
+latency histograms. Latency histograms are primarily relevant in the
+context of real-time enabled kernels (CONFIG_PREEMPT/CONFIG_PREEMPT_RT)
+and are used in the quality management of the Linux real-time
+capabilities.
+
+
+* Purpose of latency histograms
+
+A latency histogram continuously accumulates the frequencies of latency
+data. There are two types of histograms
+- potential sources of latencies
+- effective latencies
+
+
+* Potential sources of latencies
+
+Potential sources of latencies are code segments where interrupts,
+preemption or both are disabled (aka critical sections). To create
+histograms of potential sources of latency, the kernel stores the time
+stamp at the start of a critical section, determines the time elapsed
+when the end of the section is reached, and increments the frequency
+counter of that latency value - irrespective of whether any concurrently
+running process is affected by latency or not.
+- Configuration items (in the Kernel hacking/Tracers submenu)
+  CONFIG_INTERRUPT_OFF_LATENCY
+  CONFIG_PREEMPT_OFF_LATENCY
+
+
+* Effective latencies
+
+Effective latencies are actually occurring during wakeup of a process. To
+determine effective latencies, the kernel stores the time stamp when a
+process is scheduled to be woken up, and determines the duration of the
+wakeup time shortly before control is passed over to this process. Note
+that the apparent latency in user space may be somewhat longer, since the
+process may be interrupted after control is passed over to it but before
+the execution in user space takes place. Simply measuring the interval
+between enqueuing and wakeup may also not be appropriate in cases when a
+process is scheduled as a result of a timer expiration. The timer may have
+missed its deadline, e.g. due to disabled interrupts, but this latency
+would not be registered. Therefore, the offsets of missed timers are
+recorded in a separate histogram. If both wakeup latency and missed timer
+offsets are configured and enabled, a third histogram may be enabled that
+records the overall latency as a sum of the timer latency, if any, and the
+wakeup latency. This histogram is called "timerandwakeup".
+- Configuration items (in the Kernel hacking/Tracers submenu)
+  CONFIG_WAKEUP_LATENCY
+  CONFIG_MISSED_TIMER_OFFSETS
+
+
+* Usage
+
+The interface to the administration of the latency histograms is located
+in the debugfs file system. To mount it, either enter
+
+mount -t sysfs nodev /sys
+mount -t debugfs nodev /sys/kernel/debug
+
+from shell command line level, or add
+
+nodev	/sys			sysfs	defaults	0 0
+nodev	/sys/kernel/debug	debugfs	defaults	0 0
+
+to the file /etc/fstab. All latency histogram related files are then
+available in the directory /sys/kernel/debug/tracing/latency_hist. A
+particular histogram type is enabled by writing non-zero to the related
+variable in the /sys/kernel/debug/tracing/latency_hist/enable directory.
+Select "preemptirqsoff" for the histograms of potential sources of
+latencies and "wakeup" for histograms of effective latencies etc. The
+histogram data - one per CPU - are available in the files
+
+/sys/kernel/debug/tracing/latency_hist/preemptoff/C

[PATCH 2/2] trace: Add missing tracer macros

2016-07-27 Thread Binoy Jayan
From: Yang Shi 

When building rt kernel with IRQSOFF_TRACER enabled but INTERRUPT_OFF_HIST
or PREEMPT_OFF_HIST disabled, the below build failure will be triggered:

| kernel/trace/trace_irqsoff.c: In function 'time_hardirqs_on':
| kernel/trace/trace_irqsoff.c:453:2: error: implicit declaration of
|   function 'trace_preemptirqsoff_hist_rcuidle'
|   [-Werror=implicit-function-declaration]
|   trace_preemptirqsoff_hist_rcuidle(IRQS_ON, 0);
|   ^
| cc1: some warnings being treated as errors
| scripts/Makefile.build:258: recipe for target
|   'kernel/trace/trace_irqsoff.o' failed
| make[4]: *** [kernel/trace/trace_irqsoff.o] Error 1
| make[4]: *** Waiting for unfinished jobs
| scripts/Makefile.build:403: recipe for target 'kernel/trace' failed

The _rcuidle variants are only generated (by TRACE_EVENT) when
PREEMPT_OFF_HIST or INTERRUPT_OFF_HIST is enabled; otherwise just
trace_preemptirqsoff_hist is defined, as an empty preprocessor macro.
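
The pattern behind the fix can be sketched in plain C. The names below
('CONFIG_DEMO_HIST', 'trace_demo_hist', 'trace_demo_hist_rcuidle',
'demo_calls', 'critical_section_end') are hypothetical and only mimic the
shape of the header: when the feature is compiled out, every tracepoint name
a caller may use needs an empty macro, otherwise the caller fails to build
with an implicit declaration error.

```c
/* CONFIG_DEMO_HIST is a hypothetical switch; it is left undefined here. */
#ifndef CONFIG_DEMO_HIST
/* Feature compiled out: both call forms expand to nothing. */
#define trace_demo_hist(a, b)
#define trace_demo_hist_rcuidle(a, b)
#else
/* Feature compiled in: both call forms are real functions. */
static int demo_calls;

static void trace_demo_hist(int a, int b)
{
	(void)a;
	(void)b;
	demo_calls++;
}

static void trace_demo_hist_rcuidle(int a, int b)
{
	(void)a;
	(void)b;
	demo_calls++;
}
#endif

/* Caller that uses both forms regardless of the configuration. */
static int critical_section_end(int reason)
{
	trace_demo_hist(reason, 0);
	trace_demo_hist_rcuidle(reason, 0);
	return reason;
}
```

Removing the second '#define' from the first branch reproduces the kind of
build failure quoted above, since the caller then refers to a name that does
not exist in that configuration.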

Signed-off-by: Yang Shi 
Cc: linaro-ker...@lists.linaro.org
Cc: bige...@linutronix.de
Cc: rost...@goodmis.org
Cc: stable...@vger.kernel.org
Link: http://lkml.kernel.org/r/1445280008-8456-1-git-send-email-yang@linaro.org
Signed-off-by: Thomas Gleixner 
---
 include/trace/events/hist.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/trace/events/hist.h b/include/trace/events/hist.h
index 6122e42..37f6eb8 100644
--- a/include/trace/events/hist.h
+++ b/include/trace/events/hist.h
@@ -9,6 +9,7 @@
 
 #if !defined(CONFIG_PREEMPT_OFF_HIST) && !defined(CONFIG_INTERRUPT_OFF_HIST)
 #define trace_preemptirqsoff_hist(a, b)
+#define trace_preemptirqsoff_hist_rcuidle(a, b)
 #else
 TRACE_EVENT(preemptirqsoff_hist,
 
@@ -33,6 +34,7 @@ TRACE_EVENT(preemptirqsoff_hist,
 
 #ifndef CONFIG_MISSED_TIMER_OFFSETS_HIST
 #define trace_hrtimer_interrupt(a, b, c, d)
+#define trace_hrtimer_interrupt_rcuidle(a, b, c, d)
 #else
 TRACE_EVENT(hrtimer_interrupt,
 



[PATCH v6 0/4] *** Latency histograms - IRQSOFF,PREEMPTOFF, Delayed HRTIMERS ***

2016-09-07 Thread Binoy Jayan
Hi,

Thank you Daniel, Steven for reviewing the code and for the comments.
This set of patches [v6] captures latency events caused by interrupts and
preemption being disabled in the kernel. The patches are based on the hist
trigger feature developed by Tom Zanussi.

v5: https://lkml.org/lkml/2016/9/2/246
v4: https://lkml.org/lkml/2016/8/30/188
v3: https://lkml.org/lkml/2016/8/29/50
v2: https://lkml.org/lkml/2016/8/24/296

v4 -> v5:
  - Add hist trigger support for generic fields
  - hrtimer latency event moved to hrtimer event headers
  - Cleanup

-Binoy

Binoy Jayan (2):
  tracing: Add trace_irqsoff tracepoints
  tracing: Histogram for delayed hrtimer offsets

Daniel Wagner (2):
  tracing: Deference pointers without RCU checks
  tracing: Add hist trigger support for generic fields

 include/linux/hrtimer.h |  3 ++
 include/linux/rculist.h | 36 
 include/linux/tracepoint.h  |  4 +--
 include/trace/events/latency.h  | 56 +
 include/trace/events/timer.h| 25 +
 kernel/time/Kconfig |  7 +
 kernel/time/hrtimer.c   | 52 ++
 kernel/trace/trace_events.c | 13 +
 kernel/trace/trace_events_filter.c  |  4 +--
 kernel/trace/trace_events_hist.c| 36 
 kernel/trace/trace_events_trigger.c |  6 ++--
 kernel/trace/trace_irqsoff.c| 35 +++
 12 files changed, 258 insertions(+), 19 deletions(-)
 create mode 100644 include/trace/events/latency.h




[PATCH v6 1/4] tracing: Deference pointers without RCU checks

2016-09-07 Thread Binoy Jayan
From: Daniel Wagner 

The tracepoint can't be used in code sections where we are in the
middle of a state transition.

For example if we place a tracepoint inside start/stop_critical_section(),
lockdep complains with

[0.035589] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3560 check_flags.part.36+0x1bc/0x210()
[0.036000] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[0.036000] Kernel panic - not syncing: panic_on_warn set ...
[0.036000]
[0.036000] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc7+ #460
[0.036000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[0.036000]  81f2463a 88007c93bb98 81afb317 0001
[0.036000]  81f212b3 88007c93bc18 81af7bc2 88007c93bbb8
[0.036000]  0008 88007c93bc28 88007c93bbc8 0093bbd8
[0.036000] Call Trace:
[0.036000]  [] dump_stack+0x4f/0x7b
[0.036000]  [] panic+0xc0/0x1e9
[0.036000]  [] ? _raw_spin_unlock_irqrestore+0x38/0x80
[0.036000]  [] warn_slowpath_common+0xc0/0xc0
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] warn_slowpath_fmt+0x46/0x50
[0.036000]  [] check_flags.part.36+0x1bc/0x210
[0.036000]  [] lock_is_held+0x78/0x90
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] ? __do_softirq+0x3db/0x500
[0.036000]  [] trace_preempt_on+0x255/0x260
[0.036000]  [] preempt_count_sub+0xab/0xf0
[0.036000]  [] __local_bh_enable+0x36/0x70
[0.036000]  [] __do_softirq+0x3db/0x500
[0.036000]  [] run_ksoftirqd+0x1f/0x60
[0.036000]  [] smpboot_thread_fn+0x193/0x2a0
[0.036000]  [] ? SyS_setgroups+0x150/0x150
[0.036000]  [] kthread+0xf2/0x110
[0.036000]  [] ? wait_for_completion+0xc3/0x120
[0.036000]  [] ? preempt_count_sub+0xab/0xf0
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000]  [] ret_from_fork+0x58/0x90
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000] ---[ end Kernel panic - not syncing: panic_on_warn set ...

PeterZ was kind enough to explain to me what is happening:

"__local_bh_enable() tests if this is the last SOFTIRQ_OFFSET, if so it
tells lockdep softirqs are enabled with trace_softirqs_on() after that
we go an actually modify the preempt_count with preempt_count_sub().
Then in preempt_count_sub() you call into trace_preempt_on() if this
was the last preempt_count increment but you do that _before_ you
actually change the preempt_count with __preempt_count_sub() at this
point lockdep and preempt_count think the world differs and *boom*"

So the simplest way to avoid this is by disabling the consistency
checks.
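
The ordering problem described in the quote can be modelled in a few lines
of userspace C. This is a simplified model under stated assumptions, not the
kernel implementation: 'struct cpu_model', 'trace_hook()',
'MODEL_SOFTIRQ_OFFSET' and both 'bh_enable_*' functions are hypothetical
names. The second variant is only there for contrast; the patch itself does
not reorder anything but skips the checks.

```c
#include <stdbool.h>

/* Model value only; stands for one level of softirq disabling. */
#define MODEL_SOFTIRQ_OFFSET 0x100u

struct cpu_model {
	unsigned int preempt_count;	/* what the counter says */
	bool checker_softirqs_on;	/* what the checker has been told */
	int mismatches;			/* inconsistencies seen by the hook */
};

/* Hook that compares both views, as a consistency check would. */
static void trace_hook(struct cpu_model *m)
{
	bool count_says_on = (m->preempt_count == 0);

	if (count_says_on != m->checker_softirqs_on)
		m->mismatches++;
}

/* Order from the quote: checker updated, hook runs, counter changes last. */
static void bh_enable_hook_before_update(struct cpu_model *m)
{
	m->checker_softirqs_on = true;
	trace_hook(m);
	m->preempt_count -= MODEL_SOFTIRQ_OFFSET;
}

/* Contrast case: the hook only runs once both views agree again. */
static void bh_enable_hook_after_update(struct cpu_model *m)
{
	m->checker_softirqs_on = true;
	m->preempt_count -= MODEL_SOFTIRQ_OFFSET;
	trace_hook(m);
}
```

In the first variant the hook observes a counter that still says "disabled"
while the checker already believes "enabled", which corresponds to the
situation that triggers the warning above.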

We also need to take care of the iteration in trace_events_trigger.c
to avoid a splat in conjunction with the hist trigger.

Signed-off-by: Daniel Wagner 
Signed-off-by: Binoy Jayan 
---
 include/linux/rculist.h | 36 
 include/linux/tracepoint.h  |  4 ++--
 kernel/trace/trace_events_filter.c  |  4 ++--
 kernel/trace/trace_events_trigger.c |  6 +++---
 4 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 8beb98d..bee836b 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -279,6 +279,24 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
container_of(lockless_dereference(ptr), type, member)
 
 /**
+ * list_entry_rcu_notrace - get the struct for this entry (for tracing)
+ * @ptr:the &struct list_head pointer.
+ * @type:   the type of the struct this is embedded in.
+ * @member: the name of the list_head within the struct.
+ *
+ * This primitive may safely run concurrently with the _rcu list-mutation
+ * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
+ *
+ * This is the same as list_entry_rcu() except that it does
+ * not do any RCU debugging or tracing.
+ */
+#define list_entry_rcu_notrace(ptr, type, member) \
+({ \
+   typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
+   container_of((typeof(ptr))rcu_dereference_raw_notrace(__ptr), type, member); \
+})
+
+/**
  * Where are list_empty_rcu() and list_first_entry_rcu()?
  *
  * Implementing those functions following their counterparts list_empty() and
@@ -391,6 +409,24 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
 pos = list_entry_lockless(pos->member.next, typeof(*pos), member))
 
 /**
+ * list_for_each_entry_rcu_notrace -   iterate over rcu list of given type (for tracing)
+ * @pos:   the type * to use as a loop cursor.
+ * @head:  the head for your list.
+ * @member:the name of the list_head within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as list_add_rcu()
+ * as long 

[PATCH v6 3/4] tracing: Add trace_irqsoff tracepoints

2016-09-07 Thread Binoy Jayan
This work is based on work by Daniel Wagner. A few tracepoints are added
at the end of the critical section. With the hist trigger in place,
histograms may be generated with a per-cpu breakdown of the captured
events. It is based on the Linux kernel's event infrastructure.

The following filter(s) may be used

'hist:key=latency.log2:val=hitcount:sort=latency'
'hist:key=ltype,latency:val=hitcount:sort=latency if cpu==1'
'hist:key=ltype:val=latency:sort=ltype if ltype==0 && cpu==2'

Where ltype is
0: IRQSOFF latency
1: PREEMPTOFF Latency
2: Critical Timings

This captures only the latencies introduced by disabled irqs and
preemption. Additional per process data has to be captured to calculate
the effective latencies introduced for individual processes.

Initial work - latency.patch

[1] https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb

Signed-off-by: Binoy Jayan 
---
 include/trace/events/latency.h | 56 ++
 kernel/trace/trace_irqsoff.c   | 35 ++
 2 files changed, 91 insertions(+)
 create mode 100644 include/trace/events/latency.h

diff --git a/include/trace/events/latency.h b/include/trace/events/latency.h
new file mode 100644
index 000..96c8757
--- /dev/null
+++ b/include/trace/events/latency.h
@@ -0,0 +1,56 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM latency
+
+#if !defined(_TRACE_HIST_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HIST_H
+
+#include 
+
+#ifndef __TRACE_LATENCY_TYPE
+#define __TRACE_LATENCY_TYPE
+
+enum latency_type {
+   LT_IRQ,
+   LT_PREEMPT,
+   LT_CRITTIME,
+   LT_MAX
+};
+
+TRACE_DEFINE_ENUM(LT_IRQ);
+TRACE_DEFINE_ENUM(LT_PREEMPT);
+TRACE_DEFINE_ENUM(LT_CRITTIME);
+
+#define show_ltype(type)   \
+   __print_symbolic(type,  \
+   { LT_IRQ,   "IRQ" },\
+   { LT_PREEMPT,   "PREEMPT" },\
+   { LT_CRITTIME,  "CRIT_TIME" })
+#endif
+
+DECLARE_EVENT_CLASS(latency_template,
+   TP_PROTO(int ltype, cycles_t latency),
+
+   TP_ARGS(ltype, latency),
+
+   TP_STRUCT__entry(
+   __field(int,ltype)
+   __field(cycles_t,   latency)
+   ),
+
+   TP_fast_assign(
+   __entry->ltype  = ltype;
+   __entry->latency= latency;
+   ),
+
+   TP_printk("ltype=%s(%d), latency=%lu", show_ltype(__entry->ltype),
+ __entry->ltype, (unsigned long) __entry->latency)
+);
+
+DEFINE_EVENT(latency_template, latency_preempt,
+   TP_PROTO(int ltype, cycles_t latency),
+   TP_ARGS(ltype, latency));
+
+#endif /* _TRACE_HIST_H */
+
+/* This part must be outside protection */
+#include 
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 03cdff8..60ee660 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -13,13 +13,19 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "trace.h"
 
+#define CREATE_TRACE_POINTS
+#include 
+
 static struct trace_array  *irqsoff_trace __read_mostly;
 static int tracer_enabled __read_mostly;
 
 static DEFINE_PER_CPU(int, tracing_cpu);
+static DEFINE_PER_CPU(cycle_t, lat_ts[LT_MAX]);
 
 static DEFINE_RAW_SPINLOCK(max_trace_lock);
 
@@ -419,9 +425,23 @@ stop_critical_timing(unsigned long ip, unsigned long 
parent_ip)
atomic_dec(&data->disabled);
 }
 
+static inline void latency_preempt_timing_start(enum latency_type ltype)
+{
+   this_cpu_write(lat_ts[ltype], (cycle_t) trace_clock_local());
+}
+
+static inline void latency_preempt_timing_stop(enum latency_type type)
+{
+   trace_latency_preempt(type,
+   (cycle_t) trace_clock_local() - this_cpu_read(lat_ts[type]));
+}
+
 /* start and stop critical timings used to for stoppage (in idle) */
 void start_critical_timings(void)
 {
+   if (unlikely(trace_latency_preempt_enabled()))
+   latency_preempt_timing_start(LT_CRITTIME);
+
if (preempt_trace() || irq_trace())
start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
 }
@@ -431,6 +451,9 @@ void stop_critical_timings(void)
 {
if (preempt_trace() || irq_trace())
stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
+
+   if (unlikely(trace_latency_preempt_enabled()))
+   latency_preempt_timing_stop(LT_CRITTIME);
 }
 EXPORT_SYMBOL_GPL(stop_critical_timings);
 
@@ -438,6 +461,9 @@ EXPORT_SYMBOL_GPL(stop_critical_timings);
 #ifdef CONFIG_PROVE_LOCKING
 void time_hardirqs_on(unsigned long a0, unsigned long a1)
 {
+   if (unlikely(trace_latency_preempt_enabled()))
+   latency_preempt_timing_stop(LT_IRQ);
+
if (!preempt_trace() && irq_trace

[PATCH v6 2/4] tracing: Add hist trigger support for generic fields

2016-09-07 Thread Binoy Jayan
From: Daniel Wagner 

Whenever a trace is printed the generic fields (CPU, COMM) are
reconstructed (see trace_print_context()). CPU is taken from the
trace_iterator and COMM is extracted from the savedcmd map (see
__trace_find_cmdline()).

We can't reconstruct this information for hist events. Therefore this
information needs to be stored when a new event is added to the hist
buffer.

There is already support for extracting the COMM for the common_pid
field. For this the tracing_map_ops infrastructure is used. Unfortunately, we
can't reuse it because it extends an existing hist_field. That means we
first need to add a hist_field before we are able to reuse
tracing_map_ops.

Furthermore, it is quite easy to extend the current code to support
those two fields by adding hist_field_cpu() and hist_field_comm().

Signed-off-by: Daniel Wagner 
---
 kernel/trace/trace_events.c  | 13 +++--
 kernel/trace/trace_events_hist.c | 36 ++--
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 03c0a48..ea8da30 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -150,9 +150,10 @@ int trace_define_field(struct trace_event_call *call, 
const char *type,
 }
 EXPORT_SYMBOL_GPL(trace_define_field);
 
-#define __generic_field(type, item, filter_type)   \
+#define __generic_field(type, item, filter_type, size) \
ret = __trace_define_field(&ftrace_generic_fields, #type,   \
-  #item, 0, 0, is_signed_type(type),   \
+  #item, 0, size,  \
+  is_signed_type(type),\
   filter_type);\
if (ret)\
return ret;
@@ -170,10 +171,10 @@ static int trace_define_generic_fields(void)
 {
int ret;
 
-   __generic_field(int, CPU, FILTER_CPU);
-   __generic_field(int, cpu, FILTER_CPU);
-   __generic_field(char *, COMM, FILTER_COMM);
-   __generic_field(char *, comm, FILTER_COMM);
+   __generic_field(int, CPU, FILTER_CPU, sizeof(int));
+   __generic_field(int, cpu, FILTER_CPU, sizeof(int));
+   __generic_field(char *, COMM, FILTER_COMM, TASK_COMM_LEN + 1);
+   __generic_field(char *, comm, FILTER_COMM, TASK_COMM_LEN + 1);
 
return ret;
 }
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index f3a960e..42a1e36 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -75,6 +75,16 @@ static u64 hist_field_log2(struct hist_field *hist_field, 
void *event)
return (u64) ilog2(roundup_pow_of_two(val));
 }
 
+static u64 hist_field_cpu(struct hist_field *hist_field, void *event)
+{
+   return (u64) smp_processor_id();
+}
+
+static u64 hist_field_comm(struct hist_field *hist_field, void *event)
+{
+   return (u64) current->comm;
+}
+
 #define DEFINE_HIST_FIELD_FN(type) \
 static u64 hist_field_##type(struct hist_field *hist_field, void *event)\
 {  \
@@ -119,6 +129,8 @@ enum hist_field_flags {
HIST_FIELD_FL_SYSCALL   = 128,
HIST_FIELD_FL_STACKTRACE= 256,
HIST_FIELD_FL_LOG2  = 512,
+   HIST_FIELD_FL_CPU   = 1024,
+   HIST_FIELD_FL_COMM  = 2048,
 };
 
 struct hist_trigger_attrs {
@@ -374,7 +386,13 @@ static struct hist_field *create_hist_field(struct 
ftrace_event_field *field,
if (WARN_ON_ONCE(!field))
goto out;
 
-   if (is_string_field(field)) {
+   if (field->filter_type == FILTER_CPU) {
+   flags |= HIST_FIELD_FL_CPU;
+   hist_field->fn = hist_field_cpu;
+   } else if (field->filter_type == FILTER_COMM) {
+   flags |= HIST_FIELD_FL_COMM;
+   hist_field->fn = hist_field_comm;
+   } else if (is_string_field(field)) {
flags |= HIST_FIELD_FL_STRING;
 
if (field->filter_type == FILTER_STATIC_STRING)
@@ -748,7 +766,8 @@ static int create_tracing_map_fields(struct 
hist_trigger_data *hist_data)
 
if (hist_field->flags & HIST_FIELD_FL_STACKTRACE)
cmp_fn = tracing_map_cmp_none;
-   else if (is_string_field(field))
+   else if (is_string_field(field) ||
+hist_field->flags & HIST_FIELD_FL_COMM)
cmp_fn = tracing_map_cmp_string;
else
cmp_fn = tracing_map_cmp_num(field->size,
@@ -856,11 +875,9 @@ static inline void add_to_key(char *compound_key, void 
*key,
   

[PATCH v6 4/4] tracing: Histogram for delayed hrtimer offsets

2016-09-07 Thread Binoy Jayan
Generate a histogram of the latencies of delayed timer offsets in
nanoseconds. It shows the latency captured due to a delayed timer expiry
event. This happens, for example, when a timer misses its deadline due to
disabled interrupts. A process scheduled as a result of the timer
expiration suffers this latency.

The following filter(s) may be used

'hist:key=common_pid.execname:val=toffset,hitcount'
'hist:key=cpu,tcomm:val=toffset:sort=tcomm'
'hist:key=comm,tcomm:sort=comm,tcomm'

Signed-off-by: Binoy Jayan 
---
 include/linux/hrtimer.h  |  3 +++
 include/trace/events/timer.h | 25 +
 kernel/time/Kconfig  |  7 ++
 kernel/time/hrtimer.c| 52 
 4 files changed, 87 insertions(+)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 5e00f80..9146842 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -104,6 +104,9 @@ struct hrtimer {
struct hrtimer_clock_base   *base;
u8  state;
u8  is_rel;
+#ifdef CONFIG_DELAYED_TIMER_OFFSETS_HIST
+   ktime_t praecox;
+#endif
 #ifdef CONFIG_TIMER_STATS
int start_pid;
void*start_site;
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 28c5da6..ee45aed 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -382,6 +382,31 @@ TRACE_EVENT(tick_stop,
 );
 #endif
 
+#ifdef CONFIG_DELAYED_TIMER_OFFSETS_HIST
+TRACE_EVENT(latency_hrtimer_interrupt,
+
+   TP_PROTO(long long toffset, struct task_struct *task),
+
+   TP_ARGS(toffset, task),
+
+   TP_STRUCT__entry(
+   __field(long long,  toffset)
+   __array(char,   tcomm,  TASK_COMM_LEN)
+   __field(int,tprio)
+   ),
+
+   TP_fast_assign(
+   __entry->toffset = toffset;
+   memcpy(__entry->tcomm, task != NULL ? task->comm : "<none>",
+   task != NULL ? TASK_COMM_LEN : 7);
+   __entry->tprio  = task != NULL ? task->prio : -1;
+   ),
+
+   TP_printk("toffset=%lld thread=%s[%d]",
+   __entry->toffset, __entry->tcomm, __entry->tprio)
+);
+#endif
+
 #endif /*  _TRACE_TIMER_H */
 
 /* This part must be outside protection */
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 4008d9f..8ff19dd 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -193,5 +193,12 @@ config HIGH_RES_TIMERS
  hardware is not capable then this option only increases
  the size of the kernel image.
 
+config DELAYED_TIMER_OFFSETS_HIST
+   depends on HIGH_RES_TIMERS
+   select GENERIC_TRACER
+   bool "Delayed Timer Offsets Histogram"
+   help
+ Generate a histogram of delayed timer offsets in nanoseconds.
+
 endmenu
 endif
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9ba7c82..432d49a 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -56,6 +56,8 @@
 
 #include "tick-internal.h"
 
+static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer);
+
 /*
  * The timer bases:
  *
@@ -960,6 +962,52 @@ static inline ktime_t hrtimer_update_lowres(struct hrtimer 
*timer, ktime_t tim,
return tim;
 }
 
+#ifdef CONFIG_DELAYED_TIMER_OFFSETS_HIST
+static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
+struct hrtimer_clock_base *new_base,
+ktime_t tim)
+{
+   if (unlikely(trace_latency_hrtimer_interrupt_enabled())) {
+   ktime_t now = new_base->get_time();
+
+   if (ktime_to_ns(tim) < ktime_to_ns(now))
+   timer->praecox = now;
+   else
+   timer->praecox = ktime_set(0, 0);
+   }
+}
+
+static inline void latency_hrtimer_timing_stop(struct hrtimer *timer,
+   ktime_t basenow)
+{
+   long latency;
+   struct task_struct *task;
+
+   if (likely(!trace_latency_hrtimer_interrupt_enabled()))
+   return;
+
+   latency = ktime_to_ns(ktime_sub(basenow,
+ ktime_to_ns(timer->praecox) ?
+ timer->praecox : hrtimer_get_expires(timer)));
+   task = timer->function == hrtimer_wakeup ?
+   container_of(timer, struct hrtimer_sleeper,
+timer)->task : NULL;
+   if (latency > 0)
+   trace_latency_hrtimer_interrupt((u64) latency, task);
+}
+#else
+static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
+struct hrtimer_clock_base *new_base,
+   

Re: [PATCH v6 4/4] tracing: Histogram for delayed hrtimer offsets

2016-09-08 Thread Binoy Jayan
[Adding Carsten in cc ]

Thank you Thomas for reviewing this and providing insights.

On 8 September 2016 at 12:40, Thomas Gleixner  wrote:
> On Wed, 7 Sep 2016, Binoy Jayan wrote:
>> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
>> index 5e00f80..9146842 100644
>> --- a/include/linux/hrtimer.h
>> +++ b/include/linux/hrtimer.h
>> @@ -104,6 +104,9 @@ struct hrtimer {
>>   struct hrtimer_clock_base   *base;
>>   u8  state;
>>   u8  is_rel;
>> +#ifdef CONFIG_DELAYED_TIMER_OFFSETS_HIST
>> + ktime_t praecox;
>> +#endif
>
> Throwing a new struct member at some random place optimizes the struct
> footprint, right?

My bad, I'll move the member two items up, just below the pointer 'base'.

> And of course documenting new struct members is optional, correct? I'm
> really looking forward for the explanation of that variable name.

It marks the start time when a process is scheduled to be woken up as a
result of the expiry of the hrtimer. I will mention it in the comments.

>> +
>> +   TP_fast_assign(
>> +   __entry->toffset = toffset;
>> +   memcpy(__entry->tcomm, task != NULL ? task->comm : "<none>",
>> +   task != NULL ? TASK_COMM_LEN : 7);
>> +   __entry->tprio  = task != NULL ? task->prio : -1;
>> +   ),
>
> What's the value of storing prio? None, if the task is using the DL
> scheduler.
>
>> +#ifdef CONFIG_DELAYED_TIMER_OFFSETS_HIST
>> +static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
>> +  struct hrtimer_clock_base *new_base,
>> +  ktime_t tim)
>> +{
>> + if (unlikely(trace_latency_hrtimer_interrupt_enabled())) {
>> + ktime_t now = new_base->get_time();
>> +
>> + if (ktime_to_ns(tim) < ktime_to_ns(now))
>> + timer->praecox = now;
>> + else
>> + timer->praecox = ktime_set(0, 0);
>
> What's the whole point of this? You're adding an extra get_time() call into
> that path. What for? Comments in the code are overrated, right?

Will add comments here.

>> + }
>> +}
>> +
>> +static inline void latency_hrtimer_timing_stop(struct hrtimer *timer,
>> + ktime_t basenow)
>> +{
>> + long latency;
>> + struct task_struct *task;
>> +
>> + if (likely(!trace_latency_hrtimer_interrupt_enabled()))
>> + return;
>> +
>> + latency = ktime_to_ns(ktime_sub(basenow,
>> +   ktime_to_ns(timer->praecox) ?
>> +   timer->praecox : hrtimer_get_expires(timer)));
>> + task = timer->function == hrtimer_wakeup ?
>> + container_of(timer, struct hrtimer_sleeper,
>> +  timer)->task : NULL;
>
> This is a complete horrible hack. You're tying the task evaluation into a
> single instance of hrtimer users. What's the justification for this and why
> do you need task at all?

If I understand it correctly, I was trying to mark the time when
an hrtimer is started or restarted and the time when it expires.
At expiry, the current time is compared against the programmed expiry
or, if the timer was already late when it was started, against the start
time. The task indicates the task woken up as a result of the timer expiry.
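
To make the computation above concrete, here is a minimal userspace
sketch of it. It is not the kernel code: struct timer_model and the
timing_start()/timing_stop() helpers are hypothetical stand-ins for the
hrtimer fields and for latency_hrtimer_timing_start() and
latency_hrtimer_timing_stop() in the patch, and times are plain
nanosecond integers instead of ktime_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the hrtimer fields the patch uses. */
struct timer_model {
	int64_t expires_ns;  /* programmed expiry time */
	int64_t praecox_ns;  /* start time if already late when armed, else 0 */
};

/*
 * Models the start hook: remember "now" only when the timer is armed
 * with an expiry that already lies in the past.
 */
static inline void timing_start(struct timer_model *t, int64_t expires_ns,
				int64_t now_ns)
{
	/* 0 is the "not late" sentinel, so a real timestamp must be > 0. */
	assert(now_ns > 0);

	t->expires_ns = expires_ns;
	t->praecox_ns = (expires_ns < now_ns) ? now_ns : 0;
}

/*
 * Models the stop hook: the latency is measured from praecox when it is
 * set, otherwise from the programmed expiry.
 */
static inline int64_t timing_stop(const struct timer_model *t,
				  int64_t basenow_ns)
{
	int64_t ref = t->praecox_ns ? t->praecox_ns : t->expires_ns;

	return basenow_ns - ref;
}
```

In this model, a timer armed at 500 for expiry at 1000 and handled at
1200 reports 200 ns. A timer armed at 1500 for an expiry of 1000 is
already late, so it is measured from 1500 instead.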

>
>> + if (latency > 0)
>> + trace_latency_hrtimer_interrupt((u64) latency, task);
>
> And how should latency become < 0 ever? hrtimer interrupt guarantees to
> never expire timers prematurely.

Not sure why, but I have seen some negative values here.

> Neither the changelog nor the code contain any information about how that
> thing is to be used and what the actual value of the stored information
> is.
>
> No way that this ad hoc statistics hackery which we carry in RT for a well
> known reason is going to go upstream without proper justification and a
> weel thought out and documented functionality.
>

As Carsten has mentioned in his patch, this latency alone is not useful enough
without the process latency, which occurs due to disabled interrupts/preemption
or because of a timer missing its deadline. Since the process latency histogram
needs tracepoints in scheduler code, which is discouraged, I haven't gotten
around to doing it yet. I've been trying to calculate latencies by making
use of kprobe events and tracing_maps, but I am finding it a little tricky.

Thanks,
Binoy


Re: [PATCH v6 3/4] tracing: Add trace_irqsoff tracepoints

2016-09-13 Thread Binoy Jayan
Hi Thomas,

Sorry for the late reply. I was trying out some ways to do it the way
you suggested. I tried to talk to Carsten regarding the hrtimer latency
patch but was unable to.

On 8 September 2016 at 13:36, Thomas Gleixner  wrote:
> On Wed, 7 Sep 2016, Binoy Jayan wrote:
>> This captures only the latencies introduced by disabled irqs and
>> preemption. Additional per process data has to be captured to calculate
>> the effective latencies introduced for individual processes.
>
> And what is the additional per process data and how is it captured and
> used?

This is the patch which would touch the scheduler code, which I did not
want to do. I was trying to achieve the same using kprobes but did not get
around to making it work with the histograms.

>> +static inline void latency_preempt_timing_start(enum latency_type ltype)
>> +{
>> + this_cpu_write(lat_ts[ltype], (cycle_t) trace_clock_local());
>
> What is this silly type cast for? Why can't you just use u64 ?

>> +static inline void latency_preempt_timing_stop(enum latency_type type)
>> +{
>> + trace_latency_preempt(type,
>> + (cycle_t) trace_clock_local() - this_cpu_read(lat_ts[type]));
>
> And then of course you use a completely different data type in the trace
> itself.

This has been changed.
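
For reference, a userspace sketch of a start/stop pair that uses one
64-bit type throughout, as suggested above. This is only an
illustration under stated assumptions, not the follow-up patch: the
per-cpu array is modeled as a plain 2D array, NR_CPUS_MODEL is a
made-up constant, and the caller passes the cpu and the current time
explicitly instead of using this_cpu_write() and trace_clock_local().

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this sketch */

enum latency_type { LT_IRQ, LT_PREEMPT, LT_CRITTIME, LT_MAX };

/* One start timestamp per CPU and per latency type, like lat_ts[]. */
static uint64_t lat_ts[NR_CPUS_MODEL][LT_MAX];

/* Record the start of a critical section of the given type. */
static inline void latency_timing_start(int cpu, enum latency_type type,
					uint64_t now)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL && type < LT_MAX);
	lat_ts[cpu][type] = now;
}

/* Return the time elapsed since the matching start on the same CPU. */
static inline uint64_t latency_timing_stop(int cpu, enum latency_type type,
					   uint64_t now)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL && type < LT_MAX);
	return now - lat_ts[cpu][type];
}
```

Because each latency type has its own slot, nested sections of
different types (for example irqs off inside preempt off) do not
overwrite each other's start time.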

>> +DECLARE_EVENT_CLASS(latency_template,
>> + TP_PROTO(int ltype, cycles_t latency),
>
> Are you sure, that you know what you are doing here? If yes, then please
> explain it in form of comments so mere mortals can understand it as well.

Added comments for the same.

>>  /* start and stop critical timings used to for stoppage (in idle) */
>>  void start_critical_timings(void)
>>  {
>> + if (unlikely(trace_latency_preempt_enabled()))
>> + latency_preempt_timing_start(LT_CRITTIME);
>
> I doubt, that this conditional is less expensive than a simple
> unconditional store to a per cpu variable.

This is changed as well.

>> @@ -431,6 +451,9 @@ void stop_critical_timings(void)
>>  {
>>   if (preempt_trace() || irq_trace())
>>   stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
>> +
>> + if (unlikely(trace_latency_preempt_enabled()))
>> + latency_preempt_timing_stop(LT_CRITTIME);
>
> And this is silly as well. You can put the whole evaluation into the trace
> event assignement so the tracer core will handle that conditional.

I hope this can be done for "time_hardirqs_off" and "trace_preempt_off"
as well.

> Aside of that it is silly to evaluate trace_clock_local() for the actual
> tracepoint simply because that time is already stored in the tracepoint
> itself. The flow here is:
>
> event = trace_event_buffer_lock_reserve();
> entry = ring_buffer_event_data(event);
> { ; }  <-- Here we assign the entries by the __field and
>__array macros.
>
>
> So you should talk to Steven about having a way to retrieve that time from
> entry itself in a generic way.
>

Steven mentioned that the timestamp in the ring buffer is stored in
offset form and that it may not be equivalent to the trace_clock_local()
time. I also tried using the timestamp from the per-cpu trace data, but it
did not seem to provide the correct value for the timestamp.

struct trace_array_cpu *data =
per_cpu_ptr(irqsoff_trace->trace_buffer.data, cpu);
data->preempt_timestamp
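
To illustrate what "offset form" means here, a simplified userspace
model: each event stores only the time since the previous event, so an
absolute time has to be rebuilt from a base timestamp plus the
accumulated deltas. struct event_model and event_abs_time() are
hypothetical names for this sketch and do not match the real ring
buffer layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified event record: only a delta is stored. */
struct event_model {
	uint32_t time_delta;	/* time since the previous event */
};

/*
 * Absolute time of event 'idx' is the base timestamp of the page plus
 * the sum of all deltas up to and including that event.
 */
static inline uint64_t event_abs_time(uint64_t page_base,
				      const struct event_model *ev,
				      size_t idx)
{
	uint64_t ts = page_base;
	size_t i;

	assert(ev != NULL);

	for (i = 0; i <= idx; i++)
		ts += ev[i].time_delta;

	return ts;
}
```

This is why a single entry cannot yield its own timestamp without
walking the earlier events, which makes a generic accessor less
straightforward than it first appears.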

Thanks,
Binoy


Re: [PATCH v3 2/3] tracing: Add trace_irqsoff tracepoints

2016-08-30 Thread Binoy Jayan
Hi Daniel/Steven,

On 30 August 2016 at 20:32, Daniel Wagner  wrote:
> On 08/30/2016 04:20 PM, Daniel Wagner wrote:
>> Just setting the size of the type is not enough. The hist_field_*
>> getter function want to know the offset too:
>
> With this hack here it should work. The COMM generic field is handled via
> the tracing_map_ops. We could do it also there but than we need to
> a condition in the trace_map_ops if it is COMM or CPU.
> This shortcut here avoids it:

> diff --git a/kernel/trace/trace_events_hist.c 
> b/kernel/trace/trace_events_hist.c
> index f3a960e..77073b7 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -556,6 +567,11 @@ static int create_key_field(struct hist_trigger_data 
> *hist_data,
> key_size = MAX_FILTER_STR_VAL;
> else
> key_size = field->size;
> +
> +   // strcmp(field_name, "cpu") would also work to figure
> +   // out if this is a one of the generic fields.
> +   if (field->filter_type == FILTER_CPU)
> +   flags |= HIST_FIELD_FL_CPU;
> }
>
> hist_data->fields[key_idx] = create_hist_field(field, flags);

I applied Daniel's fix and it seems to work. Can we use the following instead,
on top of your patch, similar to how the other keys are compared against
their respective fields?

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 46203b7..963a121 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -537,6 +537,12 @@ static int create_key_field(struct
hist_trigger_data *hist_data,
  } else {
  char *field_name = strsep(&field_str, ".");

+/* Cannot keep these 2 lines inside the if() below
+ * as field_str would be NULL for the key 'cpu'
+ */
+if (strcmp(field_name, "cpu") == 0)
+flags |= HIST_FIELD_FL_CPU;
+
  if (field_str) {
  if (strcmp(field_str, "hex") == 0)
  flags |= HIST_FIELD_FL_HEX;
@@ -568,11 +574,6 @@ static int create_key_field(struct
hist_trigger_data *hist_data,
  else
  key_size = field->size;

- // strcmp(field_name, "cpu") would also work to figure
- // out if this is a one of the generic fields.
- if (field->filter_type == FILTER_CPU)
- flags |= HIST_FIELD_FL_CPU;
-
  }

  hist_data->fields[key_idx] = create_hist_field(field, flags);
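
The reason the check has to sit outside the if (field_str) block can be
shown with a small userspace sketch of the strsep() behaviour. struct
key_spec and parse_key() are hypothetical helpers for illustration
only; the point is that strsep() sets the remaining-string pointer to
NULL when the key has no '.' modifier, as is the case for a plain 'cpu'
key.

```c
#define _DEFAULT_SOURCE	/* for strsep() on glibc */
#include <assert.h>
#include <string.h>

struct key_spec {
	const char *name;	/* field name, e.g. "cpu" */
	const char *modifier;	/* text after '.', or NULL if absent */
};

/* Split "name.modifier" in place. The buffer is modified. */
static inline struct key_spec parse_key(char *buf)
{
	struct key_spec k;
	char *rest = buf;

	assert(buf != NULL);

	k.name = strsep(&rest, ".");
	k.modifier = rest;	/* NULL when no '.' was found */

	return k;
}
```

For "latency.log2" this yields name "latency" and modifier "log2". For
"cpu" the modifier is NULL, so any code placed inside a block guarded
by the modifier pointer would never run for that key.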

- Binoy


Re: [PATCH v4 3/3] tracing: Histogram for missed timer offsets

2016-08-31 Thread Binoy Jayan
On 30 August 2016 at 19:45, Steven Rostedt  wrote:
> On Tue, 30 Aug 2016 15:58:44 +0530
> Binoy Jayan  wrote:
>
>> +
>> + TP_STRUCT__entry(
>> + __field(long long,  toffset)
>> + __array(char,   ccomm,  TASK_COMM_LEN)
>
> Can curr be different than current? If not, lets not record it.
>
> -- Steve
>

Hi Steve,

If my understanding is right, the two are not the same. The predefined
field relates to the current process, which was interrupted by the hrtimer.
I guess this does not have a meaning in this context; mostly it is the idle
process which is interrupted by the hrtimer. But the ccomm field refers to
the task woken up by the process. The latencies are measured for this task,
so I think it is needed.

-Binoy


Re: [PATCH v4 2/3] tracing: Add trace_irqsoff tracepoints

2016-08-31 Thread Binoy Jayan
Hi Steven/Daniel,

On 30 August 2016 at 19:38, Steven Rostedt  wrote:
>> +
>> + TP_printk("ltype=%d, latency=%lu",
>> + __entry->ltype, (unsigned long) __entry->latency)
>
> The print of ltype should be text and not a number. Well, you could
> have both text and a number, but a number is useless for those looking
> at traces.

I am using '__print_symbolic' to display ltype as a string, but it
still shows up in the histogram as a number. Would you suggest changing
this as well?
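
As an illustration of the value-to-name mapping being discussed, here
is a small userspace sketch. It only models the idea of a symbol table.
ltype_name() and ltype_from_name() are hypothetical helpers, and the
"UNKNOWN" fallback is a choice made for this sketch rather than what
__print_symbolic does in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum latency_type { LT_IRQ, LT_PREEMPT, LT_CRITTIME, LT_MAX };

struct sym {
	int value;
	const char *name;
};

static const struct sym ltype_syms[] = {
	{ LT_IRQ,	"IRQ" },
	{ LT_PREEMPT,	"PREEMPT" },
	{ LT_CRITTIME,	"CRIT_TIME" },
};

/* Catch a table that was not extended along with the enum. */
static_assert(sizeof(ltype_syms) / sizeof(ltype_syms[0]) == LT_MAX,
	      "one name per latency type");

/* Map a numeric latency type to its name. */
static inline const char *ltype_name(int type)
{
	size_t i;

	for (i = 0; i < sizeof(ltype_syms) / sizeof(ltype_syms[0]); i++)
		if (ltype_syms[i].value == type)
			return ltype_syms[i].name;

	return "UNKNOWN";	/* sketch-only fallback */
}

/* Map a name back to its numeric value, or -1 if it is not known. */
static inline int ltype_from_name(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(ltype_syms) / sizeof(ltype_syms[0]); i++)
		if (strcmp(ltype_syms[i].name, name) == 0)
			return ltype_syms[i].value;

	return -1;
}
```

The histogram keys on the raw field value, so a mapping like this only
affects how an event is printed unless the histogram output code
applies it as well.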

Thanks,
Binoy


[PATCH v5 3/4] tracing: Add trace_irqsoff tracepoints

2016-09-02 Thread Binoy Jayan
This work is based on work by Daniel Wagner. A few tracepoints are added
at the end of the critical section. With the hist trigger in place,
histograms may be generated with a per-cpu breakdown of the events
captured. It is based on the Linux kernel's event infrastructure.

The following filter(s) may be used

'hist:key=latency.log2:val=hitcount:sort=latency'
'hist:key=ltype,latency:val=hitcount:sort=latency if cpu==1'
'hist:key=ltype:val=latency:sort=ltype if ltype==0 && cpu==2'

Where ltype is
0: IRQSOFF latency
1: PREEMPTOFF Latency
2: Critical Timings

This captures only the latencies introduced by disabled irqs and
preemption. Additional per process data has to be captured to calculate
the effective latencies introduced for individual processes.

Initial work - latency.patch

[1] 
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb

Signed-off-by: Binoy Jayan 
---
 include/trace/events/latency.h | 50 ++
 kernel/trace/trace_irqsoff.c   | 35 +
 2 files changed, 85 insertions(+)
 create mode 100644 include/trace/events/latency.h

diff --git a/include/trace/events/latency.h b/include/trace/events/latency.h
new file mode 100644
index 000..ca57f06
--- /dev/null
+++ b/include/trace/events/latency.h
@@ -0,0 +1,50 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM latency
+
+#if !defined(_TRACE_HIST_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HIST_H
+
+#include 
+
+#ifndef __TRACE_LATENCY_TYPE
+#define __TRACE_LATENCY_TYPE
+enum latency_type {
+   LT_IRQ,
+   LT_PREEMPT,
+   LT_CRITTIME,
+   LT_MAX
+};
+#define show_ltype(type)   \
+   __print_symbolic(type,  \
+   { LT_IRQ,   "IRQ" },\
+   { LT_PREEMPT,   "PREEMPT" },\
+   { LT_CRITTIME,  "CRIT_TIME" })
+#endif
+
+DECLARE_EVENT_CLASS(latency_template,
+   TP_PROTO(int ltype, cycles_t latency),
+
+   TP_ARGS(ltype, latency),
+
+   TP_STRUCT__entry(
+   __field(int,ltype)
+   __field(cycles_t,   latency)
+   ),
+
+   TP_fast_assign(
+   __entry->ltype  = ltype;
+   __entry->latency= latency;
+   ),
+
+   TP_printk("ltype=%s(%d), latency=%lu", show_ltype(__entry->ltype),
+ __entry->ltype, (unsigned long) __entry->latency)
+);
+
+DEFINE_EVENT(latency_template, latency_preempt,
+   TP_PROTO(int ltype, cycles_t latency),
+   TP_ARGS(ltype, latency));
+
+#endif /* _TRACE_HIST_H */
+
+/* This part must be outside protection */
+#include 
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 03cdff8..60ee660 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -13,13 +13,19 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "trace.h"
 
+#define CREATE_TRACE_POINTS
+#include 
+
 static struct trace_array  *irqsoff_trace __read_mostly;
 static int tracer_enabled __read_mostly;
 
 static DEFINE_PER_CPU(int, tracing_cpu);
+static DEFINE_PER_CPU(cycle_t, lat_ts[LT_MAX]);
 
 static DEFINE_RAW_SPINLOCK(max_trace_lock);
 
@@ -419,9 +425,23 @@ stop_critical_timing(unsigned long ip, unsigned long 
parent_ip)
atomic_dec(&data->disabled);
 }
 
+static inline void latency_preempt_timing_start(enum latency_type ltype)
+{
+   this_cpu_write(lat_ts[ltype], (cycle_t) trace_clock_local());
+}
+
+static inline void latency_preempt_timing_stop(enum latency_type type)
+{
+   trace_latency_preempt(type,
+   (cycle_t) trace_clock_local() - this_cpu_read(lat_ts[type]));
+}
+
 /* start and stop critical timings used to for stoppage (in idle) */
 void start_critical_timings(void)
 {
+   if (unlikely(trace_latency_preempt_enabled()))
+   latency_preempt_timing_start(LT_CRITTIME);
+
if (preempt_trace() || irq_trace())
start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
 }
@@ -431,6 +451,9 @@ void stop_critical_timings(void)
 {
if (preempt_trace() || irq_trace())
stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
+
+   if (unlikely(trace_latency_preempt_enabled()))
+   latency_preempt_timing_stop(LT_CRITTIME);
 }
 EXPORT_SYMBOL_GPL(stop_critical_timings);
 
@@ -438,6 +461,9 @@ EXPORT_SYMBOL_GPL(stop_critical_timings);
 #ifdef CONFIG_PROVE_LOCKING
 void time_hardirqs_on(unsigned long a0, unsigned long a1)
 {
+   if (unlikely(trace_latency_preempt_enabled()))
+   latency_preempt_timing_stop(LT_IRQ);
+
if (!preempt_trace() && irq_trace())
stop_critical_timing(a0, a1);
 }
@@ -446,6 +472,9 @@ void time_hardirqs_off(u

[PATCH v5 4/4] tracing: Histogram for delayed hrtimer offsets

2016-09-02 Thread Binoy Jayan
Generate a histogram of the latencies of delayed timer offsets in
nanoseconds. It shows the latency captured due to a delayed timer expiry
event. This happens, for example, when a timer misses its deadline due to
disabled interrupts. A process scheduled as a result of the timer
expiration suffers this latency.

The following filter(s) may be used

'hist:key=common_pid.execname:val=toffset,hitcount'
'hist:key=cpu,tcomm:val=toffset:sort=tcomm'
'hist:key=common_pid.execname,tcomm'

Signed-off-by: Binoy Jayan 
---
 include/linux/hrtimer.h|  3 +++
 include/trace/events/latency.h | 23 +
 kernel/time/hrtimer.c  | 46 ++
 3 files changed, 72 insertions(+)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 5e00f80..e09de14 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -104,6 +104,9 @@ struct hrtimer {
struct hrtimer_clock_base   *base;
u8  state;
u8  is_rel;
+#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
+   ktime_t praecox;
+#endif
 #ifdef CONFIG_TIMER_STATS
int start_pid;
void*start_site;
diff --git a/include/trace/events/latency.h b/include/trace/events/latency.h
index ca57f06..d616db5 100644
--- a/include/trace/events/latency.h
+++ b/include/trace/events/latency.h
@@ -44,6 +44,29 @@ DEFINE_EVENT(latency_template, latency_preempt,
TP_PROTO(int ltype, cycles_t latency),
TP_ARGS(ltype, latency));
 
+TRACE_EVENT(latency_hrtimer_interrupt,
+
+   TP_PROTO(long long toffset, struct task_struct *task),
+
+   TP_ARGS(toffset, task),
+
+   TP_STRUCT__entry(
+   __field(long long,  toffset)
+   __array(char,   tcomm,  TASK_COMM_LEN)
+   __field(int,tprio)
+   ),
+
+   TP_fast_assign(
+   __entry->toffset = toffset;
+   memcpy(__entry->tcomm, task != NULL ? task->comm : "<none>",
+   task != NULL ? TASK_COMM_LEN : 7);
+   __entry->tprio  = task != NULL ? task->prio : -1;
+   ),
+
+   TP_printk("toffset=%lld thread=%s[%d]",
+   __entry->toffset, __entry->tcomm, __entry->tprio)
+);
+
 #endif /* _TRACE_HIST_H */
 
 /* This part must be outside protection */
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9ba7c82..04d936b 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -53,9 +53,12 @@
 #include 
 
 #include 
+#include 
 
 #include "tick-internal.h"
 
+static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer);
+
 /*
  * The timer bases:
  *
@@ -960,6 +963,45 @@ static inline ktime_t hrtimer_update_lowres(struct hrtimer 
*timer, ktime_t tim,
return tim;
 }
 
+static inline void latency_hrtimer_timing_start(struct hrtimer *timer,
+struct hrtimer_clock_base *new_base,
+ktime_t tim)
+{
+#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
+   if (unlikely(trace_latency_hrtimer_interrupt_enabled())) {
+   ktime_t now = new_base->get_time();
+
+   if (ktime_to_ns(tim) < ktime_to_ns(now))
+   timer->praecox = now;
+   else
+   timer->praecox = ktime_set(0, 0);
+   }
+#endif
+}
+
+static inline void latency_hrtimer_timing_stop(struct hrtimer *timer,
+   ktime_t basenow)
+{
+#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
+   long latency;
+
+   struct task_struct *task;
+
+   if (likely(!trace_latency_hrtimer_interrupt_enabled()))
+   return;
+
+   latency = ktime_to_ns(ktime_sub(basenow,
+ ktime_to_ns(timer->praecox) ?
+ timer->praecox : hrtimer_get_expires(timer)));
+
+   task = timer->function == hrtimer_wakeup ?
+   container_of(timer, struct hrtimer_sleeper,
+timer)->task : NULL;
+   if (latency > 0)
+   trace_latency_hrtimer_interrupt((u64) latency, task);
+#endif
+}
+
 /**
  * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
  * @timer: the timer to be added
@@ -992,6 +1034,8 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t 
tim,
 
timer_stats_hrtimer_set_start_info(timer);
 
+   latency_hrtimer_timing_start(timer, new_base, tim);
+
leftmost = enqueue_hrtimer(timer, new_base);
if (!leftmost)
goto unlock;
@@ -1284,6 +1328,8 @@ static void __hrtimer_run_queues(struct hrti

[PATCH v5 1/4] tracing: Deference pointers without RCU checks

2016-09-02 Thread Binoy Jayan
From: Daniel Wagner 

A tracepoint can't be used in a code section where we are in the
middle of a state transition.

For example if we place a tracepoint inside start/stop_critical_section(),
lockdep complains with

[0.035589] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3560 check_flags.part.36+0x1bc/0x210()
[0.036000] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[0.036000] Kernel panic - not syncing: panic_on_warn set ...
[0.036000]
[0.036000] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc7+ #460
[0.036000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[0.036000]  81f2463a 88007c93bb98 81afb317 0001
[0.036000]  81f212b3 88007c93bc18 81af7bc2 88007c93bbb8
[0.036000]  0008 88007c93bc28 88007c93bbc8 0093bbd8
[0.036000] Call Trace:
[0.036000]  [] dump_stack+0x4f/0x7b
[0.036000]  [] panic+0xc0/0x1e9
[0.036000]  [] ? _raw_spin_unlock_irqrestore+0x38/0x80
[0.036000]  [] warn_slowpath_common+0xc0/0xc0
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] warn_slowpath_fmt+0x46/0x50
[0.036000]  [] check_flags.part.36+0x1bc/0x210
[0.036000]  [] lock_is_held+0x78/0x90
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] ? __do_softirq+0x3db/0x500
[0.036000]  [] trace_preempt_on+0x255/0x260
[0.036000]  [] preempt_count_sub+0xab/0xf0
[0.036000]  [] __local_bh_enable+0x36/0x70
[0.036000]  [] __do_softirq+0x3db/0x500
[0.036000]  [] run_ksoftirqd+0x1f/0x60
[0.036000]  [] smpboot_thread_fn+0x193/0x2a0
[0.036000]  [] ? SyS_setgroups+0x150/0x150
[0.036000]  [] kthread+0xf2/0x110
[0.036000]  [] ? wait_for_completion+0xc3/0x120
[0.036000]  [] ? preempt_count_sub+0xab/0xf0
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000]  [] ret_from_fork+0x58/0x90
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000] ---[ end Kernel panic - not syncing: panic_on_warn set ...

PeterZ was kind enough to explain to me what is happening:

"__local_bh_enable() tests if this is the last SOFTIRQ_OFFSET, if so it
tells lockdep softirqs are enabled with trace_softirqs_on() after that
we go an actually modify the preempt_count with preempt_count_sub().
Then in preempt_count_sub() you call into trace_preempt_on() if this
was the last preempt_count increment but you do that _before_ you
actually change the preempt_count with __preempt_count_sub() at this
point lockdep and preempt_count think the world differs and *boom*"

So the simplest way to avoid this is by disabling the consistency
checks.

We also need to take care of the list iteration in trace_events_trigger.c
to avoid a splat when used in conjunction with the hist trigger.
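
The ordering problem described in the quote can be modelled in user space.
The following is only a sketch under stated assumptions, not kernel code:
'struct cpu_state', 'views_agree()', 'trace_hook()' and
'model_local_bh_enable()' are hypothetical names, and the two fields stand in
for what preempt_count and lockdep each believe about softirqs.

```c
#include <assert.h>
#include <stdbool.h>

#define SOFTIRQ_OFFSET 0x100u   /* one softirq-disable level (model value) */
#define SOFTIRQ_MASK   0xff00u  /* bits of the count that hold softirq nesting */

/* Hypothetical model of the two views of the same state. */
struct cpu_state {
	unsigned int preempt_count;     /* what preempt_count says */
	bool lockdep_softirqs_enabled;  /* what lockdep believes */
};

/* Consistency check in the spirit of check_flags(): do both views agree? */
static bool views_agree(const struct cpu_state *s)
{
	bool on_by_count = (s->preempt_count & SOFTIRQ_MASK) == 0;

	return on_by_count == s->lockdep_softirqs_enabled;
}

/* Hypothetical tracepoint hook: returns false if a consistency check fired. */
static bool trace_hook(const struct cpu_state *s, bool do_checks)
{
	return do_checks ? views_agree(s) : true;
}

/*
 * Models the order of operations from the quote: lockdep is told first,
 * the tracepoint fires next, and the count is only updated afterwards.
 * Returns the result of the hook.
 */
static bool model_local_bh_enable(struct cpu_state *s, bool do_checks)
{
	bool ok;

	assert(s->preempt_count >= SOFTIRQ_OFFSET);

	if ((s->preempt_count & SOFTIRQ_MASK) == SOFTIRQ_OFFSET)
		s->lockdep_softirqs_enabled = true;  /* step 1: lockdep updated */

	ok = trace_hook(s, do_checks);               /* step 2: tracepoint runs */
	s->preempt_count -= SOFTIRQ_OFFSET;          /* step 3: count updated */
	return ok;
}
```

With checks enabled the hook observes the intermediate, inconsistent state;
with checks disabled (the "notrace" approach taken here) it does not, and the
state is consistent again once the function returns.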

Signed-off-by: Daniel Wagner 
Signed-off-by: Binoy Jayan 
---
 include/linux/rculist.h | 36 
 include/linux/tracepoint.h  |  4 ++--
 kernel/trace/trace_events_filter.c  |  4 ++--
 kernel/trace/trace_events_trigger.c |  6 +++---
 4 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 8beb98d..bee836b 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -279,6 +279,24 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
container_of(lockless_dereference(ptr), type, member)
 
 /**
+ * list_entry_rcu_notrace - get the struct for this entry (for tracing)
+ * @ptr:the &struct list_head pointer.
+ * @type:   the type of the struct this is embedded in.
+ * @member: the name of the list_head within the struct.
+ *
+ * This primitive may safely run concurrently with the _rcu list-mutation
+ * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
+ *
+ * This is the same as list_entry_rcu() except that it does
+ * not do any RCU debugging or tracing.
+ */
+#define list_entry_rcu_notrace(ptr, type, member) \
+({ \
+   typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
+   container_of((typeof(ptr))rcu_dereference_raw_notrace(__ptr), type, member); \
+})
+
+/**
  * Where are list_empty_rcu() and list_first_entry_rcu()?
  *
  * Implementing those functions following their counterparts list_empty() and
@@ -391,6 +409,24 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
 pos = list_entry_lockless(pos->member.next, typeof(*pos), member))
 
 /**
+ * list_for_each_entry_rcu_notrace -   iterate over rcu list of given type (for tracing)
+ * @pos:   the type * to use as a loop cursor.
+ * @head:  the head for your list.
+ * @member:the name of the list_head within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as list_add_rcu()
+ * as long 

[PATCH v5 0/4] *** Latency histograms - IRQSOFF,PREEMPTOFF ***

2016-09-02 Thread Binoy Jayan
Hi,

Thank you Daniel, Steven for reviewing the code and for the comments.
I have made the changes mentioned below and have added the hack to mark
the field cpu as a key.

This set of patches [v5] captures latency events caused by interrupts and
preemption being disabled in the kernel. The patches are based on the hist
trigger feature developed by Tom Zanussi.

v4: https://lkml.org/lkml/2016/8/30/188
v3: https://lkml.org/lkml/2016/8/29/50
v2: https://lkml.org/lkml/2016/8/24/296

Changes from v4
  - Added unlikely() for less probable paths
  - Dropped field 'ccomm' for hrtimer latency
  - Patch to mark the generic field cpu as a key field and make it part
of histogram output
  - Changed ambiguous function names

TODO:
1. kselftest test scripts
2. delayed hrtimer offset test scenario

Thanks,
Binoy

Binoy Jayan (3):
  tracing: Add cpu as a key field in histogram
  tracing: Add trace_irqsoff tracepoints
  tracing: Histogram for delayed hrtimer offsets

Daniel Wagner (1):
  tracing: Deference pointers without RCU checks

 include/linux/hrtimer.h |  3 ++
 include/linux/rculist.h | 36 ++
 include/linux/tracepoint.h  |  4 +-
 include/trace/events/latency.h  | 73 +
 kernel/time/hrtimer.c   | 46 +++
 kernel/trace/trace_events.c |  3 +-
 kernel/trace/trace_events_filter.c  |  4 +-
 kernel/trace/trace_events_hist.c| 15 
 kernel/trace/trace_events_trigger.c |  6 +--
 kernel/trace/trace_irqsoff.c| 35 ++
 10 files changed, 217 insertions(+), 8 deletions(-)
 create mode 100644 include/trace/events/latency.h

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v5 2/4] tracing: Add cpu as a key field in histogram

2016-09-02 Thread Binoy Jayan
The field 'cpu', although part of the set of generic fields, is not treated
as a key field when mentioned in the trigger command. This hack, suggested
by Daniel, marks it as one of the key fields and makes it appear in the
histogram output.

Signed-off-by: Binoy Jayan 
---
 kernel/trace/trace_events.c  |  3 ++-
 kernel/trace/trace_events_hist.c | 15 +++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 03c0a48..c395608 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -152,7 +152,8 @@ EXPORT_SYMBOL_GPL(trace_define_field);
 
 #define __generic_field(type, item, filter_type)   \
ret = __trace_define_field(&ftrace_generic_fields, #type,   \
-  #item, 0, 0, is_signed_type(type),   \
+  #item, 0, sizeof(type),  \
+  is_signed_type(type),\
   filter_type);\
if (ret)\
return ret;
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 0c05b8a..4e0a12e 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -75,6 +75,11 @@ static u64 hist_field_log2(struct hist_field *hist_field, void *event)
return (u64) ilog2(roundup_pow_of_two(val));
 }
 
+static u64 hist_field_cpu(struct hist_field *hist_field, void *event)
+{
+   return (u64) raw_smp_processor_id();
+}
+
 #define DEFINE_HIST_FIELD_FN(type) \
 static u64 hist_field_##type(struct hist_field *hist_field, void *event)\
 {  \
@@ -119,6 +124,7 @@ enum hist_field_flags {
HIST_FIELD_FL_SYSCALL   = 128,
HIST_FIELD_FL_STACKTRACE= 256,
HIST_FIELD_FL_LOG2  = 512,
+   HIST_FIELD_FL_CPU   = 1024,
 };
 
 struct hist_trigger_attrs {
@@ -371,6 +377,11 @@ static struct hist_field *create_hist_field(struct ftrace_event_field *field,
goto out;
}
 
+   if (flags & HIST_FIELD_FL_CPU) {
+   hist_field->fn = hist_field_cpu;
+   goto out;
+   }
+
if (WARN_ON_ONCE(!field))
goto out;
 
@@ -526,6 +537,9 @@ static int create_key_field(struct hist_trigger_data *hist_data,
} else {
char *field_name = strsep(&field_str, ".");
 
+   if (strcmp(field_name, "cpu") == 0)
+   flags |= HIST_FIELD_FL_CPU;
+
if (field_str) {
if (strcmp(field_str, "hex") == 0)
flags |= HIST_FIELD_FL_HEX;
@@ -556,6 +570,7 @@ static int create_key_field(struct hist_trigger_data *hist_data,
key_size = MAX_FILTER_STR_VAL;
else
key_size = field->size;
+
}
 
hist_data->fields[key_idx] = create_hist_field(field, flags);



Re: [PATCH v4 3/3] tracing: Histogram for missed timer offsets

2016-09-02 Thread Binoy Jayan
On 30 August 2016 at 16:20, Masami Hiramatsu
 wrote:
> Hi Binoy,
>>
>> +static inline void trace_latency_hrtimer_mark_ts(struct hrtimer *timer,
>> +struct hrtimer_clock_base *new_base,
>> +ktime_t tim)
>> +{
>> +#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
>> +   if (trace_latency_hrtimer_interrupt_enabled()) {
>
> You would better use unlikely() here.
>
>> +   ktime_t now = new_base->get_time();
>> +
>> +   if (ktime_to_ns(tim) < ktime_to_ns(now))
>
> Wouldn't we need to consider the case of wrap around?
>
>> +   timer->praecox = now;
>> +   else
>> +   timer->praecox = ktime_set(0, 0);
>> +   }
>> +#endif
>> +}

Hi Masami,

I always see these values as relative, not absolute, times. I found
'praecox' to be always zero during testing.
What do you think?

Binoy
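
For reference, a wraparound-tolerant "before" comparison is commonly written
as the sign of a difference rather than a direct '<'. The sketch below is a
user-space illustration only, assuming timestamps are treated as free-running
64-bit nanosecond counters; 'ns_before()' and 'mark_late_start()' are
hypothetical names and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if timestamp a lies before b, even if the counter wrapped between
 * the two samples. The subtraction is done on unsigned values (well defined
 * modulo 2^64); converting the result to int64_t assumes the usual
 * two's-complement behaviour of the compiler.
 */
static bool ns_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Same shape as the quoted helper: remember 'now' only when the requested
 * expiry is already in the past, otherwise store 0 as the "not late" marker.
 */
static uint64_t mark_late_start(uint64_t expires, uint64_t now)
{
	assert(now != 0);  /* 0 is reserved as the "not late" marker */

	return ns_before(expires, now) ? now : 0;
}
```

The comparison is only meaningful while the two timestamps are less than
2^63 ns apart, which is the usual limitation of this idiom.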


[PATCH v2 1/3] tracing: Deference pointers without RCU checks

2016-08-24 Thread Binoy Jayan
From: Daniel Wagner 

A tracepoint can't be used in a code section where we are in the
middle of a state transition.

For example if we place a tracepoint inside start/stop_critical_section(),
lockdep complains with

[0.035589] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3560 check_flags.part.36+0x1bc/0x210()
[0.036000] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[0.036000] Kernel panic - not syncing: panic_on_warn set ...
[0.036000]
[0.036000] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc7+ #460
[0.036000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[0.036000]  81f2463a 88007c93bb98 81afb317 0001
[0.036000]  81f212b3 88007c93bc18 81af7bc2 88007c93bbb8
[0.036000]  0008 88007c93bc28 88007c93bbc8 0093bbd8
[0.036000] Call Trace:
[0.036000]  [] dump_stack+0x4f/0x7b
[0.036000]  [] panic+0xc0/0x1e9
[0.036000]  [] ? _raw_spin_unlock_irqrestore+0x38/0x80
[0.036000]  [] warn_slowpath_common+0xc0/0xc0
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] warn_slowpath_fmt+0x46/0x50
[0.036000]  [] check_flags.part.36+0x1bc/0x210
[0.036000]  [] lock_is_held+0x78/0x90
[0.036000]  [] ? __local_bh_enable+0x36/0x70
[0.036000]  [] ? __do_softirq+0x3db/0x500
[0.036000]  [] trace_preempt_on+0x255/0x260
[0.036000]  [] preempt_count_sub+0xab/0xf0
[0.036000]  [] __local_bh_enable+0x36/0x70
[0.036000]  [] __do_softirq+0x3db/0x500
[0.036000]  [] run_ksoftirqd+0x1f/0x60
[0.036000]  [] smpboot_thread_fn+0x193/0x2a0
[0.036000]  [] ? SyS_setgroups+0x150/0x150
[0.036000]  [] kthread+0xf2/0x110
[0.036000]  [] ? wait_for_completion+0xc3/0x120
[0.036000]  [] ? preempt_count_sub+0xab/0xf0
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000]  [] ret_from_fork+0x58/0x90
[0.036000]  [] ? kthread_create_on_node+0x240/0x240
[0.036000] ---[ end Kernel panic - not syncing: panic_on_warn set ...

PeterZ was kind enough to explain to me what is happening:

"__local_bh_enable() tests if this is the last SOFTIRQ_OFFSET, if so it
tells lockdep softirqs are enabled with trace_softirqs_on() after that
we go an actually modify the preempt_count with preempt_count_sub().
Then in preempt_count_sub() you call into trace_preempt_on() if this
was the last preempt_count increment but you do that _before_ you
actually change the preempt_count with __preempt_count_sub() at this
point lockdep and preempt_count think the world differs and *boom*"

So the simplest way to avoid this is by disabling the consistency
checks.

We also need to take care of the list iteration in trace_events_trigger.c
to avoid a splat when used in conjunction with the hist trigger.

Signed-off-by: Daniel Wagner 
---
 include/linux/rculist.h | 36 
 include/linux/tracepoint.h  |  4 ++--
 kernel/trace/trace_events_trigger.c |  6 +++---
 3 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 8beb98d..bee836b 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -279,6 +279,24 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
container_of(lockless_dereference(ptr), type, member)
 
 /**
+ * list_entry_rcu_notrace - get the struct for this entry (for tracing)
+ * @ptr:the &struct list_head pointer.
+ * @type:   the type of the struct this is embedded in.
+ * @member: the name of the list_head within the struct.
+ *
+ * This primitive may safely run concurrently with the _rcu list-mutation
+ * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
+ *
+ * This is the same as list_entry_rcu() except that it does
+ * not do any RCU debugging or tracing.
+ */
+#define list_entry_rcu_notrace(ptr, type, member) \
+({ \
+   typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
+   container_of((typeof(ptr))rcu_dereference_raw_notrace(__ptr), type, member); \
+})
+
+/**
  * Where are list_empty_rcu() and list_first_entry_rcu()?
  *
  * Implementing those functions following their counterparts list_empty() and
@@ -391,6 +409,24 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
 pos = list_entry_lockless(pos->member.next, typeof(*pos), member))
 
 /**
+ * list_for_each_entry_rcu_notrace -   iterate over rcu list of given type (for tracing)
+ * @pos:   the type * to use as a loop cursor.
+ * @head:  the head for your list.
+ * @member:the name of the list_head within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as list_add_rcu()
+ * as long as the traversal is guarded by rcu_read_lock().
+ *
+ * This is the same as list_for_each_entry_rcu() exce

[PATCH v2 2/3] tracing: Add trace_irqsoff tracepoints

2016-08-24 Thread Binoy Jayan
This work is based on work by Daniel Wagner. A few tracepoints are added
at the end of the critical section. With the hist trigger in place,
histogram plots may be generated with a per-cpu breakdown of the captured
events. It is based on the Linux kernel's event infrastructure.

The following filter(s) may be used

'hist:key=latency.log2:val=hitcount:sort=latency'
'hist:key=cpu,latency:val=hitcount:sort=latency if cpu==1'
'hist:key=common_pid.execname'

This captures only the latencies introduced by disabled irqs and
preemption. Additional per process data has to be captured to calculate
the effective latencies introduced for individual processes.
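
The 'latency.log2' key in the first filter groups latencies into
power-of-two buckets. Below is a rough user-space sketch of that bucketing,
assuming it follows the 'ilog2(roundup_pow_of_two(val))' expression used by
hist_field_log2() in trace_events_hist.c; 'log2_bucket()', 'hist_add()' and
'NBUCKETS' are illustrative names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define NBUCKETS 64  /* one bucket per possible exponent (sketch's choice) */

/*
 * Smallest exponent e such that (1 << e) >= val, i.e. the user-space
 * equivalent of ilog2(roundup_pow_of_two(val)) for val >= 1.
 * Values above 2^63 are clamped to bucket 63 in this sketch.
 */
static unsigned int log2_bucket(uint64_t val)
{
	unsigned int e = 0;

	assert(val >= 1);

	while (e < 63 && (UINT64_C(1) << e) < val)
		e++;
	return e;
}

/* Account one latency sample in its power-of-two bucket. */
static void hist_add(uint64_t hist[NBUCKETS], uint64_t latency)
{
	hist[log2_bucket(latency)]++;
}
```

Latencies of 3 and 4 both land in bucket 2 (values up to 2^2), while 5 moves
to bucket 3, which keeps the histogram compact across several orders of
magnitude.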

Initial work - latency.patch

[1] https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb

Signed-off-by: Binoy Jayan 
---
 include/trace/events/latency.h | 43 ++
 kernel/trace/trace_irqsoff.c   | 42 -
 2 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/latency.h

diff --git a/include/trace/events/latency.h b/include/trace/events/latency.h
new file mode 100644
index 000..77896c7
--- /dev/null
+++ b/include/trace/events/latency.h
@@ -0,0 +1,43 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM latency
+
+#if !defined(_TRACE_HIST_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HIST_H
+
+#include 
+
+DECLARE_EVENT_CLASS(latency_template,
+   TP_PROTO(int cpu, cycles_t latency),
+
+   TP_ARGS(cpu, latency),
+
+   TP_STRUCT__entry(
+   __field(int,cpu)
+   __field(cycles_t,   latency)
+   ),
+
+   TP_fast_assign(
+   __entry->cpu= cpu;
+   __entry->latency= latency;
+   ),
+
+   TP_printk("cpu=%d, latency=%lu", __entry->cpu,
+   (unsigned long) __entry->latency)
+);
+
+DEFINE_EVENT(latency_template, latency_irqs,
+   TP_PROTO(int cpu, cycles_t latency),
+   TP_ARGS(cpu, latency));
+
+DEFINE_EVENT(latency_template, latency_preempt,
+   TP_PROTO(int cpu, cycles_t latency),
+   TP_ARGS(cpu, latency));
+
+DEFINE_EVENT(latency_template, latency_critical_timings,
+   TP_PROTO(int cpu, cycles_t latency),
+   TP_ARGS(cpu, latency));
+
+#endif /* _TRACE_HIST_H */
+
+/* This part must be outside protection */
+#include 
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 03cdff8..3fcf446 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -13,14 +13,22 @@
 #include 
 #include 
 #include 
+#include 
+
+#include 
 
 #include "trace.h"
 
+#define CREATE_TRACE_POINTS
+#include 
+
 static struct trace_array  *irqsoff_trace __read_mostly;
 static int tracer_enabled __read_mostly;
 
 static DEFINE_PER_CPU(int, tracing_cpu);
-
+static DEFINE_PER_CPU(cycle_t __maybe_unused, ts_irqs);
+static DEFINE_PER_CPU(cycle_t __maybe_unused, ts_preempt);
+static DEFINE_PER_CPU(cycle_t __maybe_unused, ts_critical_timings);
 static DEFINE_RAW_SPINLOCK(max_trace_lock);
 
 enum {
@@ -419,9 +427,17 @@ stop_critical_timing(unsigned long ip, unsigned long parent_ip)
atomic_dec(&data->disabled);
 }
 
+static inline cycle_t get_delta(cycle_t __percpu *ts)
+{
+   return (cycle_t) trace_clock_local() - this_cpu_read(*ts);
+}
 /* start and stop critical timings used to for stoppage (in idle) */
 void start_critical_timings(void)
 {
+   if (trace_latency_critical_timings_enabled())
+   this_cpu_write(ts_critical_timings,
+   (cycle_t) trace_clock_local());
+
if (preempt_trace() || irq_trace())
start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
 }
@@ -431,6 +447,13 @@ void stop_critical_timings(void)
 {
if (preempt_trace() || irq_trace())
stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
+
+   if (trace_latency_critical_timings_enabled()) {
+   trace_latency_critical_timings(
+   raw_smp_processor_id(),
+   get_delta(&ts_critical_timings));
+   }
+
 }
 EXPORT_SYMBOL_GPL(stop_critical_timings);
 
@@ -438,6 +461,10 @@ EXPORT_SYMBOL_GPL(stop_critical_timings);
 #ifdef CONFIG_PROVE_LOCKING
 void time_hardirqs_on(unsigned long a0, unsigned long a1)
 {
+   if (trace_latency_irqs_enabled()) {
+   trace_latency_irqs(raw_smp_processor_id(),
+   get_delta(&ts_irqs));
+   }
if (!preempt_trace() && irq_trace())
stop_critical_timing(a0, a1);
 }
@@ -446,6 +473,10 @@ void time_hardirqs_off(unsigned long a0, unsigned long a1)
 {
if (!preempt_trace() && irq_trace())
start_critical_timing(a0, a1);
+
+   if

[PATCH v2 0/3] *** Latency histograms - IRQSOFF,PREEMPTOFF ***

2016-08-24 Thread Binoy Jayan
Hi,

Thank you Steven and Daniel for reviewing v1 and providing suggestions.
This set of patches [v2] captures latency events caused by interrupts and
preemption being disabled in the kernel. The patches are based on the hist
trigger feature developed by Tom Zanussi.

Git-commit: 7ef224d1d0e3a1ade02d02c01ce1dcffb736d2c3

As mentioned by Daniel, there is also a good write up in the following 
blog by Brendan Gregg:
http://www.brendangregg.com/blog/2016-06-08/linux-hist-triggers.html

The perf interface for this has not been developed yet.
Related efforts: https://patchwork.kernel.org/patch/8439401

hwlat_detector tracer:
https://lkml.org/lkml/2016/8/4/348
https://lkml.org/lkml/2016/8/4/346

The patch series also contains histogram triggers for missed
hrtimer offsets.

Dependencies:
CONFIG_IRQSOFF_TRACER
CONFIG_PREEMPT_TRACER
CONFIG_PROVE_LOCKING
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG

Usage of triggers to generate histograms:

mount -t debugfs nodev /sys/kernel/debug
echo 'hist:key=latency.log2:val=hitcount:sort=latency' > /sys/kernel/debug/tracing/events/latency/latency_irqs/trigger
echo 'hist:key=common_pid.execname' > /sys/kernel/debug/tracing/events/latency/latency_hrtimer_interrupt/trigger

CPU specific breakdown of events:

echo 'hist:key=cpu,latency:val=hitcount:sort=latency' > /sys/kernel/debug/tracing/events/latency/latency_preempt/trigger
echo 'hist:key=cpu,latency:val=hitcount:sort=latency if cpu==1' > /sys/kernel/debug/tracing/events/latency/latency_preempt/trigger

Histogram output:
cat /sys/kernel/debug/tracing/events/latency/latency_irqs/hist
cat /sys/kernel/debug/tracing/events/latency/latency_preempt/hist
cat /sys/kernel/debug/tracing/events/latency/latency_critical_timings/hist
cat /sys/kernel/debug/tracing/events/latency/latency_hrtimer_interrupt/hist

Disable a trigger:
echo '!hist:key=latency.log2' > /sys/kernel/debug/tracing/events/latency/latency_irqs/trigger

Changes from v1 as per comments from Steven/Daniel
  - Use a single tracepoint for irq/preempt/critical timings by introducing
    a trace type field to differentiate the trace type within the same
    tracepoint. A suspicious RCU usage error appeared when the trigger
    command named the trace type as a key along with cpu; I couldn't
    figure out why. This arrangement may also be substandard
    performance-wise.
  - Use a more accurate fast local clock instead of a global ftrace clock.

TODO:
1. perf interface. Not sure if this is needed
2. Latency histograms - process wakeup latency
  - Suggestion from Daniel to not introduce tracepoints in scheduler's hotpaths
  - Alternative: attach kprobes to functions which fall in critical paths
    and find the difference of timestamps using event trigger commands.

For example:

echo "p:myprobe1 start_critical_timings" > /sys/kernel/debug/tracing/kprobe_events
echo "p:myprobe2 stop_critical_timings" >> /sys/kernel/debug/tracing/kprobe_events
cat /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe1/enable
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe2/enable
cat /sys/kernel/debug/tracing/kprobe_events

And somehow save the timestamps for 'myprobe1' and 'myprobe2' in
'event_hist_trigger()'. This seems not feasible now as the histogram
data is saved as a 'sum' only for the conditions met in the key definition.
This makes it impossible to save timestamps for individual events.

kernel/trace/trace_events_hist.c +840: hist_trigger_elt_update()

Mhiramat and Steve suggested an alternative way to keep this timestamp:
create a new ftrace map and, upon event start, store the timestamp in the
named map under a context "key". Then, at the event end, trigger the
histogram, pick the timestamp from the map using the same context "key" and
calculate the difference. Basically, what this needs is a "map" which can be
accessed from both events, i.e. a "global variable".
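
A minimal user-space sketch of the idea, assuming a small fixed-size table is
enough for illustration. None of these names exist in ftrace: 'struct ts_map',
'ts_map_start()', 'ts_map_end()' and 'MAP_SLOTS' are hypothetical, and a real
implementation would also have to deal with locking and per-cpu access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SLOTS 8  /* hypothetical fixed capacity */

struct ts_entry {
	bool used;
	uint64_t key;       /* context "key", e.g. a pid or cpu number */
	uint64_t start_ns;  /* timestamp stored by the start event */
};

struct ts_map {
	struct ts_entry slot[MAP_SLOTS];
};

/* Start event: remember the timestamp for 'key'. Returns false if full. */
static bool ts_map_start(struct ts_map *m, uint64_t key, uint64_t now_ns)
{
	struct ts_entry *free_slot = NULL;

	assert(m != NULL);

	for (int i = 0; i < MAP_SLOTS; i++) {
		struct ts_entry *e = &m->slot[i];

		if (e->used && e->key == key) {
			e->start_ns = now_ns;  /* restart an open interval */
			return true;
		}
		if (!e->used && free_slot == NULL)
			free_slot = e;
	}
	if (free_slot == NULL)
		return false;

	free_slot->used = true;
	free_slot->key = key;
	free_slot->start_ns = now_ns;
	return true;
}

/*
 * End event: look up the stored timestamp for 'key', report the difference
 * and release the slot. Returns false if no start event was recorded.
 */
static bool ts_map_end(struct ts_map *m, uint64_t key, uint64_t now_ns,
		       uint64_t *delta_ns)
{
	assert(m != NULL && delta_ns != NULL);

	for (int i = 0; i < MAP_SLOTS; i++) {
		struct ts_entry *e = &m->slot[i];

		if (e->used && e->key == key) {
			*delta_ns = now_ns - e->start_ns;
			e->used = false;
			return true;
		}
	}
	return false;
}
```

The delta returned by the end event is what would then be fed into the
histogram as the value for that key.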

Binoy

Binoy Jayan (2):
  tracing: Add trace_irqsoff tracepoints
  tracing: Histogram for missed timer offsets

Daniel Wagner (1):
  tracing: Deference pointers without RCU checks

 include/linux/hrtimer.h |  3 ++
 include/linux/rculist.h | 36 ++
 include/linux/tracepoint.h  |  4 +-
 include/trace/events/latency.h  | 74 +
 kernel/time/hrtimer.c   | 39 +++
 kernel/trace/trace_events_trigger.c |  6 +--
 kernel/trace/trace_irqsoff.c| 42 -
 7 files changed, 198 insertions(+), 6 deletions(-)
 create mode 100644 include/trace/events/latency.h




[PATCH v2 3/3] tracing: Histogram for missed timer offsets

2016-08-24 Thread Binoy Jayan
Generate a histogram of the latencies of missed timer offsets, in
microseconds. Along with the irq and preemption latencies, this will be
used as a basis to calculate the effective process wakeup latencies.
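
As a reading aid, the offset reported by trace_missed_hrtimer() in this patch
can be sketched in user space as below. This is an illustration only:
'timer_offset_ns()' is a hypothetical name, plain 64-bit nanosecond values
stand in for ktime_t, and the sign convention is the one used in this version
of the patch (reference time minus 'basenow').

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reference time is 'praecox' when it was recorded (non-zero, meaning the
 * timer was armed with an expiry already in the past), otherwise the
 * programmed expiry. The offset is the reference minus the time at which
 * the timer was actually handled, so a late timer gives a negative value.
 */
static int64_t timer_offset_ns(int64_t praecox_ns, int64_t expires_ns,
			       int64_t basenow_ns)
{
	int64_t reference_ns = praecox_ns ? praecox_ns : expires_ns;

	assert(basenow_ns >= 0);  /* sketch assumes non-negative clock values */

	return reference_ns - basenow_ns;
}
```

For a timer that expired at 1000 ns and was handled at 1250 ns the offset is
-250 ns; if the timer had been armed late at 1100 ns, the offset is measured
from that point instead and becomes -150 ns.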

The following filter(s) may be used

'hist:key=common_pid.execname'
'hist:key=common_pid.execname,cpu:val=toffset,hitcount'

Signed-off-by: Binoy Jayan 
---
 include/linux/hrtimer.h|  3 +++
 include/trace/events/latency.h | 31 +++
 kernel/time/hrtimer.c  | 39 +++
 3 files changed, 73 insertions(+)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 5e00f80..e09de14 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -104,6 +104,9 @@ struct hrtimer {
struct hrtimer_clock_base   *base;
u8  state;
u8  is_rel;
+#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
+   ktime_t praecox;
+#endif
 #ifdef CONFIG_TIMER_STATS
int start_pid;
void*start_site;
diff --git a/include/trace/events/latency.h b/include/trace/events/latency.h
index 77896c7..24cf009 100644
--- a/include/trace/events/latency.h
+++ b/include/trace/events/latency.h
@@ -37,6 +37,37 @@ DEFINE_EVENT(latency_template, latency_critical_timings,
TP_PROTO(int cpu, cycles_t latency),
TP_ARGS(cpu, latency));
 
+TRACE_EVENT(latency_hrtimer_interrupt,
+
+   TP_PROTO(int cpu, long long toffset, struct task_struct *curr,
+   struct task_struct *task),
+
+   TP_ARGS(cpu, toffset, curr, task),
+
+   TP_STRUCT__entry(
+   __field(int,cpu)
+   __field(long long,  toffset)
+   __array(char,   ccomm,  TASK_COMM_LEN)
+   __field(int,cprio)
+   __array(char,   tcomm,  TASK_COMM_LEN)
+   __field(int,tprio)
+   ),
+
+   TP_fast_assign(
+   __entry->cpu = cpu;
+   __entry->toffset = toffset;
+   memcpy(__entry->ccomm, curr->comm, TASK_COMM_LEN);
+   __entry->cprio  = curr->prio;
+   memcpy(__entry->tcomm, task != NULL ? task->comm : "",
+   task != NULL ? TASK_COMM_LEN : 7);
+   __entry->tprio  = task != NULL ? task->prio : -1;
+   ),
+
+   TP_printk("cpu=%d toffset=%lld curr=%s[%d] thread=%s[%d]",
+   __entry->cpu, __entry->toffset, __entry->ccomm,
+   __entry->cprio, __entry->tcomm, __entry->tprio)
+);
+
 #endif /* _TRACE_HIST_H */
 
 /* This part must be outside protection */
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 9ba7c82..1a96e34 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -53,9 +53,12 @@
 #include 
 
 #include 
+#include 
 
 #include "tick-internal.h"
 
+static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer);
+
 /*
  * The timer bases:
  *
@@ -960,6 +963,38 @@ static inline ktime_t hrtimer_update_lowres(struct hrtimer *timer, ktime_t tim,
return tim;
 }
 
+static inline void trace_latency_mark_ts(struct hrtimer *timer,
+struct hrtimer_clock_base *new_base,
+ktime_t tim)
+{
+#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
+   if (trace_latency_hrtimer_interrupt_enabled()) {
+   ktime_t now = new_base->get_time();
+
+   if (ktime_to_ns(tim) < ktime_to_ns(now))
+   timer->praecox = now;
+   else
+   timer->praecox = ktime_set(0, 0);
+   }
+#endif
+}
+
+static inline void trace_missed_hrtimer(struct hrtimer *timer, ktime_t basenow)
+{
+#if defined(CONFIG_PREEMPT_TRACER) || defined(CONFIG_IRQSOFF_TRACER)
+   if (trace_latency_hrtimer_interrupt_enabled())
+   trace_latency_hrtimer_interrupt(raw_smp_processor_id(),
+   ktime_to_ns(ktime_sub(ktime_to_ns(timer->praecox) ?
+   timer->praecox : hrtimer_get_expires(timer),
+   basenow)),
+   current,
+   timer->function == hrtimer_wakeup ?
+   container_of(timer, struct hrtimer_sleeper,
+   timer)->task : NULL);
+#endif
+
+}
+
 /**
  * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
  * @timer: the timer to be added
@@ -992,6 +1027,8 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 
timer_stats_hrtimer_set_start_info(timer);
 
+   trace_latency_mark_ts(timer, new_base, tim);
+
lef

Re: [PATCH v2 0/3] *** Latency histograms - IRQSOFF,PREEMPTOFF ***

2016-08-24 Thread Binoy Jayan
On 25 August 2016 at 10:56, Daniel Wagner  wrote:
> Hi Binoy,
>
> On 08/24/2016 01:17 PM, Binoy Jayan wrote:
>>
>> Histogram output:
>> cat /sys/kernel/debug/tracing/events/latency/latency_irqs/hist
>> cat /sys/kernel/debug/tracing/events/latency/latency_preempt/hist
>> cat /sys/kernel/debug/tracing/events/latency/latency_critical_timings/hist
>> cat
>> /sys/kernel/debug/tracing/events/latency/latency_hrtimer_interrupt/hist
>
>
> [...]
>
>> Changes from v1 as per comments from Steven/Daniel
>>   - Use single tracepoint for irq/preempt/critical timings by introducing
>> a trace type field to differentiate trace type in the same tracepoint.
>
>
> Did you send out the right patches? This version still looks like the
> previous one in this regard. And wouldn't be the 'Histogram output' have
> only one file? Maybe I just understood something wrong here.
>
> cheers,
> daniel

Hi Daniel,

This patch incorporates the changes made in response to Steven's comments.
Regarding the use of one tracepoint, I have mentioned that in the cover
letter. I have sent you (only) another patch with that change. When I do it
like that, I get an RCU error the first time the "type" key is used. It is
weird that it happens only the first time something is echoed to the trigger
file. I haven't been able to figure out why yet.

Binoy


Re: [PATCH v2 1/3] tracing: Deference pointers without RCU checks

2016-08-25 Thread Binoy Jayan
On 26 August 2016 at 07:19, Masami Hiramatsu  wrote:
> On Wed, 24 Aug 2016 16:47:28 +0530
>> "__local_bh_enable() tests if this is the last SOFTIRQ_OFFSET, if so it
>> tells lockdep softirqs are enabled with trace_softirqs_on() after that
>> we go an actually modify the preempt_count with preempt_count_sub().
>> Then in preempt_count_sub() you call into trace_preempt_on() if this
>> was the last preempt_count increment but you do that _before_ you
>> actually change the preempt_count with __preempt_count_sub() at this
>> point lockdep and preempt_count think the world differs and *boom*"
>>
>> So the simplest way to avoid this is by disabling the consistency
>> checks.
>>
>> We also need to take care of the iterating in trace_events_trigger.c
>> to avoid a splatter in conjunction with the hist trigger.
>
> Special care for lockdep inside tracepoint handler is reasonable.
>
> Reviewed-by: Masami Hiramatsu 
>
> Steven, since this seems a bugfix, could you pick this from the series?
>
> Thank you,
>
> --
> Masami Hiramatsu 


Hi Daniel/Masami,

I ran into a similar RCU error while using the same tracepoint for all
three latency types and using a filter like the one below to trigger only
on events of a specific type.

echo 'hist:key=ltype,cpu:val=latency:sort=ltype,cpu if ltype==0' > \
   /sys/kernel/debug/tracing/events/latency/latency_preempt/trigger

The error occurs only when I use the predicate 'if ltype==0' as filter.

It occurs in 'filter_match_preds' during a call to 'rcu_dereference_sched'.
kernel/trace/trace_events_filter.c +611 : filter_match_preds()

Surprisingly, this happens only the first time the echo command is used on
the trigger file after each boot.

Do you think it is similar to the bug you have fixed? Maybe I'll try using
'rcu_dereference_raw_notrace' instead of 'rcu_dereference_sched'.

 Binoy


[ 1029.324257] ===
[ 1029.324785] [ INFO: suspicious RCU usage. ]
[ 1029.328698] 4.7.0+ #49 Not tainted
[ 1029.332858] ---
[ 1029.336334] /local/mnt/workspace/src/korg/linux/kernel/trace/trace_events_filter.c:611 suspicious rcu_dereference_check() usage!
[ 1029.340423]
[ 1029.340423] other info that might help us debug this:
[ 1029.340423]
[ 1029.352226]
[ 1029.352226] RCU used illegally from idle CPU!
[ 1029.352226] rcu_scheduler_active = 1, debug_locks = 0
[ 1029.359953] RCU used illegally from extended quiescent state!
[ 1029.371057] no locks held by swapper/0/0.
[ 1029.376696]
[ 1029.376696] stack backtrace:
[ 1029.380693] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0+ #49
[ 1029.385033] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 1029.39] Call trace:
[ 1029.397798] [] dump_backtrace+0x0/0x1e0
[ 1029.399967] [] show_stack+0x24/0x2c
[ 1029.405523] [] dump_stack+0xb0/0xf0
[ 1029.410557] [] lockdep_rcu_suspicious+0xe8/0x120
[ 1029.415595] [] filter_match_preds+0x108/0x118
[ 1029.421669] [] event_triggers_call+0x5c/0xc0
[ 1029.427485] [] trace_event_buffer_commit+0x11c/0x244
[ 1029.433390] [] trace_event_raw_event_latency_template+0x58/0xa4
[ 1029.439902] [] time_hardirqs_on+0x264/0x290
[ 1029.447450] [] trace_hardirqs_on_caller+0x20/0x180
[ 1029.453179] [] trace_hardirqs_on+0x10/0x18
[ 1029.459604] [] cpuidle_enter_state+0xc8/0x2e0
[ 1029.465246] [] cpuidle_enter+0x34/0x40
[ 1029.470888] [] call_cpuidle+0x3c/0x5c
[ 1029.476442] [] cpu_startup_entry+0x1c0/0x360
[ 1029.481392] [] rest_init+0x150/0x160
[ 1029.487293] [] start_kernel+0x3a4/0x3b8
[ 1029.492415] [] __primary_switched+0x30/0x74

Binoy


[PATCH v6 2/2] crypto: Multikey template for essiv

2017-06-21 Thread Binoy Jayan
Just for reference and to get the performance numbers.
Not for merging.

Depends on the following patches by Gilad:
 MAINTAINERS: add Gilad BY as maintainer for ccree
 staging: ccree: add devicetree bindings
 staging: ccree: add TODO list
 staging: add ccree crypto driver

A multi-key template implementation which calls the underlying
iv generator cum crypto algorithm 'essiv-aes-du512-dx'. This
template sits on top of the underlying IV generator and accepts
a key length that is a multiple of the underlying key length.
This has not been tested on Juno with the CryptoCell accelerator
for which it was written.

The underlying IV generator 'essiv-aes-du512-dx' generates an IV for
every 512-byte block.

Signed-off-by: Binoy Jayan 
---
 drivers/md/dm-crypt.c            |    5 +-
 drivers/staging/ccree/Makefile   |    2 +-
 drivers/staging/ccree/essiv.c    |  777 
 drivers/staging/ccree/essiv_sw.c | 1040 ++
 4 files changed, 1821 insertions(+), 3 deletions(-)
 create mode 100644 drivers/staging/ccree/essiv.c
 create mode 100644 drivers/staging/ccree/essiv_sw.c

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index bef54f5..32f75dd 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1555,7 +1555,8 @@ static int __init geniv_register_algs(void)
if (err)
goto out_undo_plain;
 
-   err = crypto_register_template(&crypto_essiv_tmpl);
+   err = 0;
+   // err = crypto_register_template(&crypto_essiv_tmpl);
if (err)
goto out_undo_plain64;
 
@@ -1594,7 +1595,7 @@ static void __exit geniv_deregister_algs(void)
 {
crypto_unregister_template(&crypto_plain_tmpl);
crypto_unregister_template(&crypto_plain64_tmpl);
-   crypto_unregister_template(&crypto_essiv_tmpl);
+   // crypto_unregister_template(&crypto_essiv_tmpl);
crypto_unregister_template(&crypto_benbi_tmpl);
crypto_unregister_template(&crypto_null_tmpl);
crypto_unregister_template(&crypto_lmk_tmpl);
diff --git a/drivers/staging/ccree/Makefile b/drivers/staging/ccree/Makefile
index 44f3e3e..524e930 100644
--- a/drivers/staging/ccree/Makefile
+++ b/drivers/staging/ccree/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_CRYPTO_DEV_CCREE) := ccree.o
-ccree-y := ssi_driver.o ssi_sysfs.o ssi_buffer_mgr.o ssi_request_mgr.o 
ssi_cipher.o ssi_hash.o ssi_aead.o ssi_ivgen.o ssi_sram_mgr.o ssi_pm.o 
ssi_pm_ext.o
+ccree-y := ssi_driver.o ssi_sysfs.o ssi_buffer_mgr.o ssi_request_mgr.o 
ssi_cipher.o ssi_hash.o ssi_aead.o ssi_ivgen.o ssi_sram_mgr.o ssi_pm.o 
ssi_pm_ext.o essiv.o
 ccree-$(CCREE_FIPS_SUPPORT) += ssi_fips.o ssi_fips_ll.o ssi_fips_ext.o 
ssi_fips_local.o
diff --git a/drivers/staging/ccree/essiv.c b/drivers/staging/ccree/essiv.c
new file mode 100644
index 000..719b8bf
--- /dev/null
+++ b/drivers/staging/ccree/essiv.c
@@ -0,0 +1,777 @@
+/*
+ * Copyright (C) 2003 Jana Saout 
+ * Copyright (C) 2004 Clemens Fruhwirth 
+ * Copyright (C) 2006-2015 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2013 Milan Broz 
+ *
+ * This file is released under the GPL.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DM_MSG_PREFIX  "crypt"
+#define MAX_SG_LIST    (BIO_MAX_PAGES * 8)
+#define MIN_IOS        64
+#define LMK_SEED_SIZE  64 /* hash + 0 */
+#define TCW_WHITENING_SIZE 16
+
+struct geniv_ctx;
+struct geniv_req_ctx;
+
+/* Sub request for each of the skcipher_request's for a segment */
+struct geniv_subreq {
+   struct scatterlist src;
+   struct scatterlist dst;
+   struct geniv_req_ctx *rctx;
+   struct skcipher_request req CRYPTO_MINALIGN_ATTR;
+};
+
+struct geniv_req_ctx {
+   struct geniv_subreq *subreq;
+   int is_write;
+   sector_t iv_sector;
+   unsigned int nents;
+   struct completion restart;
+   atomic_t req_pending;
+   struct skcipher_request *req;
+};
+
+struct crypt_iv_operations {
+   int (*ctr)(struct geniv_ctx *ctx);
+   void (*dtr)(struct geniv_ctx *ctx);
+   int (*init)(struct geniv_ctx *ctx);
+   int (*wipe)(struct geniv_ctx *ctx);
+   int (*generator)(struct geniv_ctx *ctx,
+struct geniv_req_ctx *rctx,
+struct geniv_subreq *subreq, u8 *iv);
+   int (*post)(struct geniv_ctx *ctx,
+   struct geniv_req_ctx *rctx,
+   struct geniv_subreq *subreq, u8 *iv);
+};
+
+struct geniv_ctx {
+   unsigned int tfms_count;
+   struct crypto_skcipher *child;
+   struct crypto_skcipher **tfms;
+   char *ivmode;
+ 

[PATCH v6 1/2] crypto: Add IV generation algorithms

2017-06-21 Thread Binoy Jayan
Just for reference. Not for merging.

Currently, the iv generation algorithms are implemented in dm-crypt.c.
The goal is to move these algorithms from the dm layer to the kernel
crypto layer by implementing them as template ciphers so they can be
implemented in hardware for performance. As part of this patchset, the
iv-generation code is moved from the dm layer to the crypto layer and
the dm layer is adapted to send a whole 'bio' (as defined in the block
layer) at a time. Each bio contains an in-memory representation of
physically contiguous disk blocks. The dm layer sets up a chained
scatterlist of these blocks split into physically contiguous segments in
memory so that DMA can be performed. Also, the key management code is
moved from the dm layer to the crypto layer since the key selection for
encrypting neighboring sectors depends on the keycount.

Synchronous crypto requests to encrypt/decrypt a sector are processed
sequentially. Asynchronous requests, if processed in parallel, are freed
in the async callback. The storage space for the initialization vectors
is allocated in the iv generator implementations.

Interface to the crypto layer - include/crypto/geniv.h

Signed-off-by: Binoy Jayan 
---
 drivers/md/dm-crypt.c  | 1939 ++--
 include/crypto/geniv.h |   46 ++
 2 files changed, 1432 insertions(+), 553 deletions(-)
 create mode 100644 include/crypto/geniv.h

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 389a363..bef54f5 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -32,170 +32,120 @@
 #include 
 #include 
 #include 
-
 #include 
-
-#define DM_MSG_PREFIX "crypt"
-
-/*
- * context holding the current state of a multi-part conversion
- */
-struct convert_context {
-   struct completion restart;
-   struct bio *bio_in;
-   struct bio *bio_out;
-   struct bvec_iter iter_in;
-   struct bvec_iter iter_out;
-   sector_t cc_sector;
-   atomic_t cc_pending;
-   struct skcipher_request *req;
+#include 
+#include 
+#include 
+#include 
+
+#define DM_MSG_PREFIX  "crypt"
+#define MAX_SG_LIST    (BIO_MAX_PAGES * 8)
+#define MIN_IOS        64
+#define LMK_SEED_SIZE  64 /* hash + 0 */
+#define TCW_WHITENING_SIZE 16
+
+struct geniv_ctx;
+struct geniv_req_ctx;
+
+/* Sub request for each of the skcipher_request's for a segment */
+struct geniv_subreq {
+   struct scatterlist src;
+   struct scatterlist dst;
+   struct geniv_req_ctx *rctx;
+   struct skcipher_request req CRYPTO_MINALIGN_ATTR;
 };
 
-/*
- * per bio private data
- */
-struct dm_crypt_io {
-   struct crypt_config *cc;
-   struct bio *base_bio;
-   struct work_struct work;
-
-   struct convert_context ctx;
-
-   atomic_t io_pending;
-   int error;
-   sector_t sector;
-
-   struct rb_node rb_node;
-} CRYPTO_MINALIGN_ATTR;
-
-struct dm_crypt_request {
-   struct convert_context *ctx;
-   struct scatterlist sg_in;
-   struct scatterlist sg_out;
+struct geniv_req_ctx {
+   struct geniv_subreq *subreq;
+   int is_write;
sector_t iv_sector;
+   unsigned int nents;
+   struct completion restart;
+   atomic_t req_pending;
+   struct skcipher_request *req;
 };
 
-struct crypt_config;
-
 struct crypt_iv_operations {
-   int (*ctr)(struct crypt_config *cc, struct dm_target *ti,
-  const char *opts);
-   void (*dtr)(struct crypt_config *cc);
-   int (*init)(struct crypt_config *cc);
-   int (*wipe)(struct crypt_config *cc);
-   int (*generator)(struct crypt_config *cc, u8 *iv,
-struct dm_crypt_request *dmreq);
-   int (*post)(struct crypt_config *cc, u8 *iv,
-   struct dm_crypt_request *dmreq);
+   int (*ctr)(struct geniv_ctx *ctx);
+   void (*dtr)(struct geniv_ctx *ctx);
+   int (*init)(struct geniv_ctx *ctx);
+   int (*wipe)(struct geniv_ctx *ctx);
+   int (*generator)(struct geniv_ctx *ctx,
+struct geniv_req_ctx *rctx,
+struct geniv_subreq *subreq, u8 *iv);
+   int (*post)(struct geniv_ctx *ctx,
+   struct geniv_req_ctx *rctx,
+   struct geniv_subreq *subreq, u8 *iv);
 };
 
-struct iv_essiv_private {
+struct geniv_essiv_private {
struct crypto_ahash *hash_tfm;
u8 *salt;
 };
 
-struct iv_benbi_private {
+struct geniv_benbi_private {
int shift;
 };
 
-#define LMK_SEED_SIZE 64 /* hash + 0 */
-struct iv_lmk_private {
+struct geniv_lmk_private {
struct crypto_shash *hash_tfm;
u8 *seed;
 };
 
-#define TCW_WHITENING_SIZE 16
-struct iv_tcw_private {
+struct geniv_tcw_private {
struct crypto_shash *crc32_tfm;
u8 *iv_seed;
u8 *whitening;
 };
 
-/*
- * Crypt: maps a linear range of a block device
- * and encrypts / decrypts at the same time.
- 

[PATCH v6 0/2] IV Generation algorithms for dm-crypt

2017-06-21 Thread Binoy Jayan
===
dm-crypt optimization for larger block sizes
===

Currently, the iv generation algorithms are implemented in dm-crypt.c. The goal
is to move these algorithms from the dm layer to the kernel crypto layer by
implementing them as template ciphers so they can be used in conjunction with
algorithms like aes, and with multiple modes like cbc, ecb etc. As part of this
patchset, the iv-generation code is moved from the dm layer to the crypto layer
and the dm layer is adapted to send a whole 'bio' (as defined in the block
layer) at a time. Each bio contains the in-memory representation of physically
contiguous disk blocks. Since the bio itself may not be contiguous in main
memory, the dm layer sets up a chained scatterlist of these blocks split into
physically contiguous segments in memory so that DMA can be performed.

One challenge in doing so is that the IVs are generated based on a 512-byte
sector number. This in fact limits the block sizes to 512 bytes. But this should
not be a problem if hardware with iv generation support is used. The geniv
itself splits the segments into sectors so it could choose the IV based on
sector number. But it could be modelled in hardware effectively by not
splitting up the segments in the bio.
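
The per-sector splitting described above can be pictured with a small
userspace sketch. This is not the geniv code from the patch; the names
toy_plain64_iv() and toy_for_each_sector() are made up for illustration, the
16-byte IV size is an assumption, and the IV layout (64-bit little-endian
sector number, zero padded) follows the 'plain64' mode as I understand it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTOR_SIZE 512u   /* IVs are derived per 512-byte sector */
#define TOY_IV_SIZE     16u    /* assumed IV size, e.g. for AES-CBC */

/* plain64-style IV: 64-bit little-endian sector number, zero padded. */
static void toy_plain64_iv(uint64_t sector, uint8_t iv[TOY_IV_SIZE])
{
	unsigned int i;

	memset(iv, 0, TOY_IV_SIZE);
	for (i = 0; i < 8; i++)
		iv[i] = (uint8_t)(sector >> (8 * i));
}

/* Hypothetical callback, invoked once per sector of the segment. */
typedef void (*toy_sector_fn)(uint64_t sector, const uint8_t *iv,
			      size_t offset, void *priv);

/*
 * Walk a segment of 'len' bytes that starts at 'first_sector' in
 * 512-byte steps, generating one IV per step. Returns the number of
 * sectors visited, or 0 if 'len' is zero or not a multiple of 512.
 */
static size_t toy_for_each_sector(uint64_t first_sector, size_t len,
				  toy_sector_fn fn, void *priv)
{
	uint8_t iv[TOY_IV_SIZE];
	size_t n = 0, off;

	if (len % TOY_SECTOR_SIZE)
		return 0;

	for (off = 0; off < len; off += TOY_SECTOR_SIZE, n++) {
		toy_plain64_iv(first_sector + n, iv);
		if (fn)
			fn(first_sector + n, iv, off, priv);
	}
	return n;
}
```

A hardware implementation with IV generation support would presumably do
the equivalent of this loop internally, which is why the segment would not
need to be split up in software.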

Another challenge faced is that dm-crypt has an option to use multiple keys.
The key selection is done based on the sector number. If the whole bio is
encrypted / decrypted with the same key, the encrypted volumes will not be
compatible with the original dm-crypt [without the changes]. So, the key
selection code is moved to crypto layer so the neighboring sectors are
encrypted with a different key.

The dm layer allocates space for iv. The hardware drivers can choose to make
use of this space to generate their IVs sequentially or allocate it on their
own. This can be moved to crypto layer too. Postponing this decision until
the requirement to integrate Milan's changes is clear.

Interface to the crypto layer - include/crypto/geniv.h

More information on test procedure can be found in v1.
Results of performance tests with software crypto in v5.

The patch 'crypto: Multikey template for essiv' depends on
the following patches by Gilad:
 MAINTAINERS: add Gilad BY as maintainer for ccree
 staging: ccree: add devicetree bindings
 staging: ccree: add TODO list
 staging: add ccree crypto driver

Revisions:
--

v1: https://patchwork.kernel.org/patch/9439175
v2: https://patchwork.kernel.org/patch/9471923
v3: https://lkml.org/lkml/2017/1/18/170
v4: https://patchwork.kernel.org/patch/9559665
v5: https://patchwork.kernel.org/patch/9669237

v5 --> v6:
--

1. Moved allocation of initialization vectors to the iv-generator
2. A few cosmetic changes as a consequence of the above
3. Converted a few logical expressions to boolean expressions for faster
   calculation
4. Included multikey template for splitting keys.
   This needs testing with real hardware (Juno with ccree)
   and also modification. It is only for testing and not
   for inclusion upstream.

v4 --> v5
--

1. Fix for the multiple instance issue in /proc/crypto
2. Few cosmetic changes including struct alignment
3. Simplified 'struct geniv_req_info'

v3 --> v4
--
Fix for the bug reported by Gilad Ben-Yossef.
The element '__ctx' in 'struct skcipher_request req' overflowed into the
element 'struct scatterlist src' which immediately follows 'req' in
'struct geniv_subreq' and corrupted src.
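
The overflow described above can be reproduced in miniature. The sketch
below is a userspace model only: 'struct toy_req' stands in for
'struct skcipher_request' (whose private context is stored directly after
the request), and all toy_* names and member choices are assumptions made
for illustration, not the real kernel definitions.

```c
#include <stddef.h>

struct toy_sg {
	void *page;
	unsigned int offset;
	unsigned int length;
};

/* Stand-in for skcipher_request: its context lives right after it. */
struct toy_req {
	void *tfm;
};

static void *toy_req_ctx(struct toy_req *req)
{
	return req + 1;	/* first byte past the request itself */
}

/* v3-style layout: the request comes first, other members follow it. */
struct toy_subreq_v3 {
	struct toy_req req;
	struct toy_sg src;
	struct toy_sg dst;
	void *rctx;
};

/* v4-style layout: the request is the last member. */
struct toy_subreq_v4 {
	struct toy_sg src;
	struct toy_sg dst;
	void *rctx;
	struct toy_req req;
};

/* Offset of the request context from the start of the sub-request. */
static size_t toy_v3_ctx_offset(struct toy_subreq_v3 *s)
{
	return (size_t)((char *)toy_req_ctx(&s->req) - (char *)s);
}

static size_t toy_v4_ctx_offset(struct toy_subreq_v4 *s)
{
	return (size_t)((char *)toy_req_ctx(&s->req) - (char *)s);
}
```

With the request first, its context starts exactly where 'src' starts, so
writing the context clobbers the scatterlist. With the request last, the
context starts at the end of the structure, in whatever tail space the
allocation reserves for it.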

v2 --> v3
--

1. Moved iv algorithms in dm-crypt.c for control
2. Key management code moved from dm layer to crypto layer
   so that cipher instance selection can be made depending on key_index
3. The revision v2 had scatterlist nodes created for every sector in the bio.
   It is modified to create only one scatterlist node to reduce memory
   footprint. Synchronous requests are processed sequentially. Asynchronous
   requests are processed in parallel and are freed in the async callback.
4. Changed allocation for sub-requests using mempool

v1 --> v2
--

1. dm-crypt changes to process larger block sizes (one segment in a bio)
2. Incorporated changes w.r.t. comments from Herbert.


Binoy Jayan (2):
  crypto: Add IV generation algorithms
  crypto: Multikey template for essiv

 drivers/md/dm-crypt.c            | 1940 +++---
 drivers/staging/ccree/Makefile   |    2 +-
 drivers/staging/ccree/essiv.c    |  777 +++
 drivers/staging/ccree/essiv_sw.c | 1040 
 include/crypto/geniv.h           |   46 +
 5 files changed, 3251 insertions(+), 554 deletions(-)
 create mode 100644 drivers/staging/ccree/essiv.c
 create mode 100644 drivers/staging/ccree/essiv_sw.c
 create mode 100644 include/crypto/geniv.h

-- 
Binoy Jayan



Re: [RFC PATCH v2] crypto: Add IV generation algorithms

2017-01-04 Thread Binoy Jayan
Hi Herbert,

On 2 January 2017 at 12:23, Herbert Xu  wrote:
> On Mon, Jan 02, 2017 at 12:16:45PM +0530, Binoy Jayan wrote:
>
> Right.  The actual number of underlying tfms that do the work
> won't change compared to the status quo.  We're just structuring
> it such that if the overall scheme is supported by the hardware
> then we can feed more than one sector at a time to it.

I was thinking of continuing to have the iv generation algorithms as template
ciphers instead of regular 'skcipher' as it is easier to inherit the parameters
from the underlying cipher (e.g. aes) like cra_blocksize, cra_alignmask,
ivsize, chunksize etc.

Usually, the underlying cipher for a template cipher is instantiated
in the following function:

skcipher_instance:skcipher_alg:init()

Since the number of such cipher instances depends on the key count, which is
not known at the time of creation of the cipher (it is passed as an argument
to the setkey API), the creation of those has to be delayed until the setkey
operation of the template cipher. But as Mark pointed out, the users of this
cipher may get confused if the creation of the underlying cipher fails while
trying to do a 'setkey' on the template cipher. I was wondering if I can create
a single instance of the cipher and assign it to tfms[0] and allocate the
remaining instances when the setkey operation is called later with the encoded
key_count so that errors during cipher creation are uncovered earlier.
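
To make the idea concrete, here is a rough userspace model of the two-stage
allocation. It is only a sketch under assumptions: 'struct toy_tfm',
toy_init(), toy_setkey() and TOY_MAX_KEYS are invented names, and malloc()
stands in for allocating a real cipher instance.

```c
#include <errno.h>
#include <stdlib.h>

#define TOY_MAX_KEYS 64	/* arbitrary limit for this sketch */

struct toy_tfm {
	unsigned int id;
};

struct toy_geniv_ctx {
	struct toy_tfm *tfms[TOY_MAX_KEYS];
	unsigned int tfms_count;
};

static struct toy_tfm *toy_alloc_tfm(unsigned int id)
{
	struct toy_tfm *tfm = malloc(sizeof(*tfm));

	if (tfm)
		tfm->id = id;
	return tfm;
}

/* init: create tfms[0] right away so a failing cipher shows up early. */
static int toy_init(struct toy_geniv_ctx *ctx)
{
	unsigned int i;

	for (i = 0; i < TOY_MAX_KEYS; i++)
		ctx->tfms[i] = NULL;
	ctx->tfms_count = 0;

	ctx->tfms[0] = toy_alloc_tfm(0);
	if (!ctx->tfms[0])
		return -ENOMEM;
	ctx->tfms_count = 1;
	return 0;
}

/* setkey: the key count is known only now, so add the missing instances. */
static int toy_setkey(struct toy_geniv_ctx *ctx, unsigned int key_count)
{
	unsigned int i;

	if (key_count == 0 || key_count > TOY_MAX_KEYS)
		return -EINVAL;

	for (i = ctx->tfms_count; i < key_count; i++) {
		ctx->tfms[i] = toy_alloc_tfm(i);
		if (!ctx->tfms[i])
			return -ENOMEM;	/* earlier instances stay valid */
		ctx->tfms_count = i + 1;
	}
	return 0;
}

/* exit: release every instance that was created. */
static void toy_exit(struct toy_geniv_ctx *ctx)
{
	unsigned int i;

	for (i = 0; i < ctx->tfms_count; i++)
		free(ctx->tfms[i]);
	ctx->tfms_count = 0;
}
```

A failure in setkey is still possible with this scheme, but a cipher that
cannot be instantiated at all would already be caught in init.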

Thanks,
Binoy


[RFC PATCH v3] IV Generation algorithms for dm-crypt

2017-01-18 Thread Binoy Jayan
===
GENIV Template cipher
===

Currently, the iv generation algorithms are implemented in dm-crypt.c. The goal
is to move these algorithms from the dm layer to the kernel crypto layer by
implementing them as template ciphers so they can be used in conjunction with
algorithms like aes, and with multiple modes like cbc, ecb etc. As part of this
patchset, the iv-generation code is moved from the dm layer to the crypto layer
and the dm layer is adapted to send a whole 'bio' (as defined in the block
layer) at a time. Each bio contains the in-memory representation of physically
contiguous disk blocks. Since the bio itself may not be contiguous in main
memory, the dm layer sets up a chained scatterlist of these blocks split into
physically contiguous segments in memory so that DMA can be performed.

One challenge in doing so is that the IVs are generated based on a 512-byte
sector number. This in fact limits the block sizes to 512 bytes. But this should
not be a problem if hardware with iv generation support is used. The geniv
itself splits the segments into sectors so it could choose the IV based on
sector number. But it could be modelled in hardware effectively by not
splitting up the segments in the bio.

Another challenge faced is that dm-crypt has an option to use multiple keys.
The key selection is done based on the sector number. If the whole bio is
encrypted / decrypted with the same key, the encrypted volumes will not be
compatible with the original dm-crypt [without the changes]. So, the key
selection code is moved to crypto layer so the neighboring sectors are
encrypted with a different key.
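
As a minimal sketch of the key selection described above: the helpers below
are hypothetical (toy_* names are not from the patch) and assume, as I
believe dm-crypt requires, that the key count is a power of two, so the
index can be taken by masking the sector number.

```c
#include <stdint.h>

/*
 * Pick the key (cipher instance) for a sector. With a power-of-two
 * key_count, masking is the same as 'sector % key_count'.
 */
static unsigned int toy_key_index(uint64_t sector, unsigned int key_count)
{
	return (unsigned int)(sector & (key_count - 1));
}

/* Count the sectors in [first, first + nr) that use key 'index'. */
static unsigned int toy_sectors_using_key(uint64_t first, unsigned int nr,
					  unsigned int key_count,
					  unsigned int index)
{
	unsigned int i, hits = 0;

	for (i = 0; i < nr; i++)
		if (toy_key_index(first + i, key_count) == index)
			hits++;
	return hits;
}
```

With four keys, sectors 0..7 map to keys 0, 1, 2, 3, 0, 1, 2, 3, so a bio
spanning several sectors cannot be handled with a single key without
breaking compatibility with existing volumes.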

The dm layer allocates space for iv. The hardware drivers can choose to make
use of this space to generate their IVs sequentially or allocate it on their
own. This can be moved to crypto layer too. Postponing this decision until
the requirement to integrate Milan's changes is clear.

Interface to the crypto layer - include/crypto/geniv.h

Revisions:

v1: https://patchwork.kernel.org/patch/9439175
v2: https://patchwork.kernel.org/patch/9471923

v2 --> v3
--

1. Moved iv algorithms in dm-crypt.c for control
2. Key management code moved from dm layer to crypto layer
   so that cipher instance selection can be made depending on key_index
3. The revision v2 had scatterlist nodes created for every sector in the bio.
   It is modified to create only one scatterlist node to reduce memory
   footprint. Synchronous requests are processed sequentially. Asynchronous
   requests are processed in parallel and are freed in the async callback.
4. Changed allocation for sub-requests using mempool

v1 --> v2
--

1. dm-crypt changes to process larger block sizes (one segment in a bio)
2. Incorporated changes w.r.t. comments from Herbert.

Binoy Jayan (1):
  crypto: Add IV generation algorithms

 drivers/md/dm-crypt.c  | 1891 ++--
 include/crypto/geniv.h |   47 ++
 2 files changed, 1399 insertions(+), 539 deletions(-)
 create mode 100644 include/crypto/geniv.h

-- 
Binoy Jayan



[RFC PATCH v3] crypto: Add IV generation algorithms

2017-01-18 Thread Binoy Jayan
Currently, the iv generation algorithms are implemented in dm-crypt.c.
The goal is to move these algorithms from the dm layer to the kernel
crypto layer by implementing them as template ciphers so they can be
implemented in hardware for performance. As part of this patchset, the
iv-generation code is moved from the dm layer to the crypto layer and
the dm layer is adapted to send a whole 'bio' (as defined in the block
layer) at a time. Each bio contains an in-memory representation of
physically contiguous disk blocks. The dm layer sets up a chained
scatterlist of these blocks split into physically contiguous segments in
memory so that DMA can be performed. Also, the key management code is
moved from the dm layer to the crypto layer since the key selection for
encrypting neighboring sectors depends on the keycount.

Synchronous crypto requests to encrypt/decrypt a sector are processed
sequentially. Asynchronous requests, if processed in parallel, are freed
in the async callback. The dm layer allocates space for iv. The hardware
implementations can choose to make use of this space to generate their IVs
sequentially or allocate it on their own.
Interface to the crypto layer - include/crypto/geniv.h

Signed-off-by: Binoy Jayan 
---
 drivers/md/dm-crypt.c  | 1891 ++--
 include/crypto/geniv.h |   47 ++
 2 files changed, 1399 insertions(+), 539 deletions(-)
 create mode 100644 include/crypto/geniv.h

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 7c6c572..7275b0f 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -32,170 +32,113 @@
 #include 
 #include 
 #include 
-
 #include 
-
-#define DM_MSG_PREFIX "crypt"
-
-/*
- * context holding the current state of a multi-part conversion
- */
-struct convert_context {
-   struct completion restart;
-   struct bio *bio_in;
-   struct bio *bio_out;
-   struct bvec_iter iter_in;
-   struct bvec_iter iter_out;
-   sector_t cc_sector;
-   atomic_t cc_pending;
-   struct skcipher_request *req;
+#include 
+#include 
+#include 
+#include 
+
+#define DM_MSG_PREFIX  "crypt"
+#define MAX_SG_LIST    (BIO_MAX_PAGES * 8)
+#define MIN_IOS        64
+#define LMK_SEED_SIZE  64 /* hash + 0 */
+#define TCW_WHITENING_SIZE 16
+
+struct geniv_ctx;
+struct geniv_req_ctx;
+
+/* Sub request for each of the skcipher_request's for a segment */
+struct geniv_subreq {
+   struct skcipher_request req CRYPTO_MINALIGN_ATTR;
+   struct scatterlist src;
+   struct scatterlist dst;
+   int n;
+   struct geniv_req_ctx *rctx;
 };
 
-/*
- * per bio private data
- */
-struct dm_crypt_io {
-   struct crypt_config *cc;
-   struct bio *base_bio;
-   struct work_struct work;
-
-   struct convert_context ctx;
-
-   atomic_t io_pending;
-   int error;
-   sector_t sector;
-
-   struct rb_node rb_node;
-} CRYPTO_MINALIGN_ATTR;
-
-struct dm_crypt_request {
-   struct convert_context *ctx;
-   struct scatterlist sg_in;
-   struct scatterlist sg_out;
+struct geniv_req_ctx {
+   struct geniv_subreq *subreq;
+   bool is_write;
sector_t iv_sector;
+   unsigned int nents;
+   u8 *iv;
+   struct completion restart;
+   atomic_t req_pending;
+   struct skcipher_request *req;
 };
 
-struct crypt_config;
-
 struct crypt_iv_operations {
-   int (*ctr)(struct crypt_config *cc, struct dm_target *ti,
-  const char *opts);
-   void (*dtr)(struct crypt_config *cc);
-   int (*init)(struct crypt_config *cc);
-   int (*wipe)(struct crypt_config *cc);
-   int (*generator)(struct crypt_config *cc, u8 *iv,
-struct dm_crypt_request *dmreq);
-   int (*post)(struct crypt_config *cc, u8 *iv,
-   struct dm_crypt_request *dmreq);
+   int (*ctr)(struct geniv_ctx *ctx);
+   void (*dtr)(struct geniv_ctx *ctx);
+   int (*init)(struct geniv_ctx *ctx);
+   int (*wipe)(struct geniv_ctx *ctx);
+   int (*generator)(struct geniv_ctx *ctx,
+struct geniv_req_ctx *rctx,
+struct geniv_subreq *subreq);
+   int (*post)(struct geniv_ctx *ctx,
+   struct geniv_req_ctx *rctx,
+   struct geniv_subreq *subreq);
 };
 
-struct iv_essiv_private {
+struct geniv_essiv_private {
struct crypto_ahash *hash_tfm;
u8 *salt;
 };
 
-struct iv_benbi_private {
+struct geniv_benbi_private {
int shift;
 };
 
-#define LMK_SEED_SIZE 64 /* hash + 0 */
-struct iv_lmk_private {
+struct geniv_lmk_private {
struct crypto_shash *hash_tfm;
u8 *seed;
 };
 
-#define TCW_WHITENING_SIZE 16
-struct iv_tcw_private {
+struct geniv_tcw_private {
struct crypto_shash *crc32_tfm;
u8 *iv_seed;
u8 *whitening;
 };
 
-/*
- * Crypt: maps a linear range of a block device
- * and encrypts / de

Re: [RFC PATCH v3] crypto: Add IV generation algorithms

2017-01-18 Thread Binoy Jayan
Hi Gilad,

On 18 January 2017 at 20:51, Gilad Ben-Yossef  wrote:
> I have some review comments and a bug report -

Thank you very much for testing this on ARM and for the comments.

> I'm pretty sure this needs to be
>
>  n2 = bio_segments(ctx->bio_out);

Yes you are right, that was a typo :)

>> +
>> +   rinfo.is_write = bio_data_dir(ctx->bio_in) == WRITE;
>
> Please consider wrapping the above boolean expression in parenthesis.

Well, I can do that to enhance the clarity.

>> +   rinfo.iv_sector = ctx->cc_sector;
>> +   rinfo.nents = nents;
>> +   rinfo.iv = iv;
>> +
>> +   skcipher_request_set_crypt(req, dmreq->sg_in, dmreq->sg_out,
>
> Also, where do the scatterlist src2 and dst2 that you use
> sg_set_page() get sg_init_table() called on?
> I couldn't figure it out...

Thank you for pointing this out. I forgot to add sg_init_table(src2, 1)
and sg_init_table(dst2, 1), although sg_set_page is used in geniv_iter_block.
This is probably the reason for the panic on the ARM platform. However, it
ran fine under qemu-x86. Maybe I should set up an ARM platform too
for testing.

Regards,
Binoy


Re: [RFC PATCH v3] crypto: Add IV generation algorithms

2017-01-19 Thread Binoy Jayan
Hi Gilad,

On 19 January 2017 at 15:17, Gilad Ben-Yossef  wrote:
> I tried adding sg_init_table() where I thought it was appropriate and
> it didn't resolve the issue.
>
> For what it's worth, my guess is that the difference between our
> setups is not related to Arm but to other options or the storage I'm
> using.

I was able to reproduce this again on my qemu setup with a 1GB virtual
disk. That is the same thing I do with the x86 setup as well.

> Are you using cryptd?

You mean config CRYPTO_CRYPTD?

-Binoy


[PATCH v3 0/3] *** staging: wilc1000: Replace semaphores ***

2016-06-22 Thread Binoy Jayan
This patchset [v3] is part of the second patch series for 'wilc1000'.
The original patch series consisted of 7 patches, of which only the first 5
are good. Patches 6 and 7 are being reworked in this series
in a different way.

This patch series removes the semaphore 'sem' in 'wilc1000' and also
restructures the implementation of kthread / message_queue logic with
a create_singlethread_workqueue() / queue_work() setup.

These are part of a bigger effort to eliminate all semaphores
from the linux kernel.

They build correctly (individually and as a whole).

NB: The changes are untested

Discussion carried forward from previous patchset [v2]

Rework on the review comments by Arnd w.r.t. v1

struct message_queue can be removed since
 - after the workqueue conversion, mq->sem is no longer needed
 - recv_count is not needed, it just counts the number of entries in the list
 - the 'struct wilc' pointer can be retrieved from the host_if_msg (vif->wilc)
 - the message list is not needed because we always look only at the
   first entry, except in wilc_mq_destroy(), but it would be better
   to just call destroy_workqueue(), which also drains the remaining work.
 - the exiting flag is also handled by destroy_workqueue()   
 - with everything else gone, the spinlock is also not needed any more.

Do 'kfree' only at the end of 'host_if_work' 
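
The shape this leads to can be modelled in userspace as below. This is only
an illustrative sketch, not the driver code: 'struct toy_work' and
toy_container_of() stand in for the kernel's 'struct work_struct' and
container_of(), the message members are invented, and the handler is run
inline where the driver would hand the work item to queue_work().

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's work_struct. */
struct toy_work {
	void (*func)(struct toy_work *work);
};

/* Same idea as the kernel's container_of(), without type checking. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_host_if_msg {
	int id;			/* which command this message carries */
	int *handled;		/* where the handler records its result */
	struct toy_work work;	/* embedded work item */
};

/* Work handler: recover the message, handle it, free only at the end. */
static void toy_host_if_work(struct toy_work *work)
{
	struct toy_host_if_msg *msg =
		toy_container_of(work, struct toy_host_if_msg, work);

	/* A real handler would switch on msg->id and dispatch here. */
	*msg->handled = msg->id;

	free(msg);		/* single release point for the message */
}

/* Hypothetical enqueue helper: allocate the message and submit its work. */
static int toy_enqueue_cmd(int id, int *handled)
{
	struct toy_host_if_msg *msg = malloc(sizeof(*msg));

	if (!msg)
		return -1;

	msg->id = id;
	msg->handled = handled;
	msg->work.func = toy_host_if_work;

	/* Run inline here; a workqueue would call the handler later. */
	msg->work.func(&msg->work);
	return 0;
}
```

Because each message carries its own work item, no separate list, counter,
semaphore or spinlock is needed to track pending messages.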

wilc_initialized is always '1' so the conditional 'wilc_mq_send'
in 'hostIFthread' can be removed.

A connect command (HOST_IF_MSG_CONNECT) does not complete while scan is
ongoing. So, the special handling of this command needs to be preserved.

Use create_singlethread_workqueue() instead of alloc_workqueue(), so that
we stay closer to the current behavior by having the thread run only
on one CPU at a time and not having a 'dedicated' thread for each.

Split the patch to separate interface changes to 'wilc_mq_send':
no easy way was found to split out the change of the interface
'wilc_mq_send' to 'wilc_enqueue_cmd', as the parameters
'mq', 'send_buf' and 'send_buf_size' are themselves part of the message
queue implementation.

New changes in v3

Rework on the review comments by Arnd w.r.t. v2
 - Removed forward declaration for wilc_enqueue_cmd
 - Change the interface 'wilc_mq_send' in a different patch
 - Avoid change in indentation in host_if_work and move it to
   a different patch

Cannot remove forward declaration for 'host_if_work'
   since there is a mutual dependency.

Binoy

Binoy Jayan (3):
  staging: wilc1000: message_queue: Move code to host interface
  staging: wilc1000: Replace kthread with workqueue for host interface
  staging: wilc1000: Change interface wilc_mq_send to wilc_enqueue_cmd

 drivers/staging/wilc1000/Makefile |   1 -
 drivers/staging/wilc1000/TODO |   5 +
 drivers/staging/wilc1000/host_interface.c | 395 +++---
 drivers/staging/wilc1000/wilc_msgqueue.c  | 144 ---
 drivers/staging/wilc1000/wilc_msgqueue.h  |  28 ---
 5 files changed, 204 insertions(+), 369 deletions(-)
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.c
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.h

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v3 1/3] staging: wilc1000: message_queue: Move code to host interface

2016-06-22 Thread Binoy Jayan
Move the contents of wilc_msgqueue.c and wilc_msgqueue.h into
host_interface.c, remove 'wilc_msgqueue.c' and 'wilc_msgqueue.h'.
This is done so as to restructure the implementation of the kthread
'hostIFthread' using a work queue.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/Makefile |   1 -
 drivers/staging/wilc1000/host_interface.c | 163 +-
 drivers/staging/wilc1000/wilc_msgqueue.c  | 144 --
 drivers/staging/wilc1000/wilc_msgqueue.h  |  28 -
 4 files changed, 162 insertions(+), 174 deletions(-)
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.c
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.h

diff --git a/drivers/staging/wilc1000/Makefile 
b/drivers/staging/wilc1000/Makefile
index acc3f3e..d226283 100644
--- a/drivers/staging/wilc1000/Makefile
+++ b/drivers/staging/wilc1000/Makefile
@@ -6,7 +6,6 @@ ccflags-y += -DFIRMWARE_1002=\"atmel/wilc1002_firmware.bin\" \
 ccflags-y += -I$(src)/ -DWILC_ASIC_A0 -DWILC_DEBUGFS
 
 wilc1000-objs := wilc_wfi_cfgoperations.o linux_wlan.o linux_mon.o \
-   wilc_msgqueue.o \
coreconfigurator.o host_interface.o \
wilc_wlan_cfg.o wilc_debugfs.o \
wilc_wlan.o
diff --git a/drivers/staging/wilc1000/host_interface.c 
b/drivers/staging/wilc1000/host_interface.c
index 9535842..2d250c6 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -3,11 +3,13 @@
 #include 
 #include 
 #include 
+#include 
 #include "host_interface.h"
+#include 
+#include 
 #include "coreconfigurator.h"
 #include "wilc_wlan.h"
 #include "wilc_wlan_if.h"
-#include "wilc_msgqueue.h"
 #include 
 #include "wilc_wfi_netdevice.h"
 
@@ -57,6 +59,20 @@
 #define TCP_ACK_FILTER_LINK_SPEED_THRESH   54
 #define DEFAULT_LINK_SPEED 72
 
+struct message {
+   void *buf;
+   u32 len;
+   struct list_head list;
+};
+
+struct message_queue {
+   struct semaphore sem;
+   spinlock_t lock;
+   bool exiting;
+   u32 recv_count;
+   struct list_head msg_list;
+};
+
 struct host_if_wpa_attr {
u8 *key;
const u8 *mac_addr;
@@ -264,6 +280,151 @@ static struct wilc_vif *join_req_vif;
 static void *host_int_ParseJoinBssParam(struct network_info *ptstrNetworkInfo);
 static int host_int_get_ipaddress(struct wilc_vif *vif, u8 *ip_addr, u8 idx);
 static s32 Handle_ScanDone(struct wilc_vif *vif, enum scan_event enuEvent);
+static int wilc_mq_create(struct message_queue *mq);
+static int wilc_mq_send(struct message_queue *mq,
+const void *send_buf, u32 send_buf_size);
+static int wilc_mq_recv(struct message_queue *mq,
+void *recv_buf, u32 recv_buf_size, u32 *recv_len);
+static int wilc_mq_destroy(struct message_queue *mq);
+
+/*!
+ *  @author    syounan
+ *  @date      1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_create(struct message_queue *mq)
+{
+   spin_lock_init(&mq->lock);
+   sema_init(&mq->sem, 0);
+   INIT_LIST_HEAD(&mq->msg_list);
+   mq->recv_count = 0;
+   mq->exiting = false;
+   return 0;
+}
+
+/*!
+ *  @author    syounan
+ *  @date      1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_destroy(struct message_queue *mq)
+{
+   struct message *msg;
+
+   mq->exiting = true;
+
+   /* Release any waiting receiver thread. */
+   while (mq->recv_count > 0) {
+   up(&mq->sem);
+   mq->recv_count--;
+   }
+
+   while (!list_empty(&mq->msg_list)) {
+   msg = list_first_entry(&mq->msg_list, struct message, list);
+   list_del(&msg->list);
+   kfree(msg->buf);
+   }
+
+   return 0;
+}
+
+/*!
+ *  @author    syounan
+ *  @date      1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_send(struct message_queue *mq,
+   const void *send_buf, u32 send_buf_size)
+{
+   unsigned long flags;
+   struct message *new_msg = NULL;
+
+   if (!mq || (send_buf_size == 0) || !send_buf)
+   return -EINVAL;
+
+   if (mq->exiting)
+   return -EFAULT;
+
+   /* construct a new message */
+   new_msg = kmalloc(sizeof(*new_msg), GFP_ATOMIC);
+   if (!new_msg)
+   return -ENOMEM;
+
+   new_msg->len = send_buf_size;
+   INIT_LIST_HEAD(&new_msg->list);
+   new_msg->buf = kmemdup(send_buf, send_buf_size, GFP_ATOMIC);
+   if (!new_msg->buf) {
+   

[PATCH v3 2/3] staging: wilc1000: Replace kthread with workqueue for host interface

2016-06-22 Thread Binoy Jayan
Deconstruct the kthread / message_queue logic, replacing it with a
create_singlethread_workqueue() / queue_work() setup, by adding a
'struct work_struct' to 'struct host_if_msg'. The current kthread
hostIFthread() is converted to a work queue helper with the name
'host_if_work'.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/TODO |   5 +
 drivers/staging/wilc1000/host_interface.c | 282 --
 2 files changed, 75 insertions(+), 212 deletions(-)

diff --git a/drivers/staging/wilc1000/TODO b/drivers/staging/wilc1000/TODO
index 95199d8..ec93b2e 100644
--- a/drivers/staging/wilc1000/TODO
+++ b/drivers/staging/wilc1000/TODO
@@ -4,6 +4,11 @@ TODO:
 - remove custom debug and tracing functions
 - rework comments and function headers(also coding style)
 - replace all semaphores with mutexes or completions
+- Move handling for each individual members of 'union message_body' out
+  into a separate 'struct work_struct' and completely remove the multiplexer
+  that is currently part of host_if_work(), allowing movement of the
+  implementation of each message handler into the callsite of the function
+  that currently queues the 'host_if_msg'.
 - make spi and sdio components coexist in one build
 - turn compile-time platform configuration (BEAGLE_BOARD,
   PANDA_BOARD, PLAT_WMS8304, PLAT_RK, CUSTOMER_PLATFORM, ...)
diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 2d250c6..242c3d7 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "host_interface.h"
 #include 
 #include 
@@ -211,6 +212,7 @@ struct host_if_msg {
u16 id;
union message_body body;
struct wilc_vif *vif;
+   struct work_struct work;
 };
 
 struct join_bss_param {
@@ -245,7 +247,7 @@ struct join_bss_param {
 static struct host_if_drv *terminated_handle;
 bool wilc_optaining_ip;
 static u8 P2P_LISTEN_STATE;
-static struct task_struct *hif_thread_handler;
+static struct workqueue_struct *hif_workqueue;
 static struct message_queue hif_msg_q;
 static struct completion hif_thread_comp;
 static struct completion hif_driver_comp;
@@ -280,55 +282,7 @@ static struct wilc_vif *join_req_vif;
 static void *host_int_ParseJoinBssParam(struct network_info *ptstrNetworkInfo);
 static int host_int_get_ipaddress(struct wilc_vif *vif, u8 *ip_addr, u8 idx);
 static s32 Handle_ScanDone(struct wilc_vif *vif, enum scan_event enuEvent);
-static int wilc_mq_create(struct message_queue *mq);
-static int wilc_mq_send(struct message_queue *mq,
-const void *send_buf, u32 send_buf_size);
-static int wilc_mq_recv(struct message_queue *mq,
-void *recv_buf, u32 recv_buf_size, u32 *recv_len);
-static int wilc_mq_destroy(struct message_queue *mq);
-
-/*!
- *  @author  syounan
- *  @date  1 Sep 2010
- *  @note  copied from FLO glue implementatuion
- *  @version   1.0
- */
-static int wilc_mq_create(struct message_queue *mq)
-{
-   spin_lock_init(&mq->lock);
-   sema_init(&mq->sem, 0);
-   INIT_LIST_HEAD(&mq->msg_list);
-   mq->recv_count = 0;
-   mq->exiting = false;
-   return 0;
-}
-
-/*!
- *  @author  syounan
- *  @date  1 Sep 2010
- *  @note  copied from FLO glue implementatuion
- *  @version   1.0
- */
-static int wilc_mq_destroy(struct message_queue *mq)
-{
-   struct message *msg;
-
-   mq->exiting = true;
-
-   /* Release any waiting receiver thread. */
-   while (mq->recv_count > 0) {
-   up(&mq->sem);
-   mq->recv_count--;
-   }
-
-   while (!list_empty(&mq->msg_list)) {
-   msg = list_first_entry(&mq->msg_list, struct message, list);
-   list_del(&msg->list);
-   kfree(msg->buf);
-   }
-
-   return 0;
-}
+static void host_if_work(struct work_struct *work);
 
 /*!
  *  @author  syounan
@@ -339,92 +293,17 @@ static int wilc_mq_destroy(struct message_queue *mq)
 static int wilc_mq_send(struct message_queue *mq,
const void *send_buf, u32 send_buf_size)
 {
-   unsigned long flags;
-   struct message *new_msg = NULL;
+   struct host_if_msg *new_msg;
 
-   if (!mq || (send_buf_size == 0) || !send_buf)
-   return -EINVAL;
-
-   if (mq->exiting)
-   return -EFAULT;
-
-   /* construct a new message */
-   new_msg = kmalloc(sizeof(*new_msg), GFP_ATOMIC);
+   new_msg = kmemdup(send_buf, sizeof(*new_msg), GFP_ATOMIC);
if (!new_msg)
return -ENOMEM;
 
-   new_msg->len = send_buf_size;
-   INIT_LIST_HEAD(&new_msg->list);
-  

[PATCH v3 3/3] staging: wilc1000: Change interface wilc_mq_send to wilc_enqueue_cmd

2016-06-22 Thread Binoy Jayan
Replace the interface 'wilc_mq_send' with 'wilc_enqueue_cmd'
and remove the now unused structures 'message' and 'message_queue'.
Restructure switch statement in the work queue helper function
host_if_work and remove unwanted indentation.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/host_interface.c | 332 ++
 1 file changed, 158 insertions(+), 174 deletions(-)

diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 242c3d7..8ba48c2 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -60,20 +60,6 @@
 #define TCP_ACK_FILTER_LINK_SPEED_THRESH   54
 #define DEFAULT_LINK_SPEED 72
 
-struct message {
-   void *buf;
-   u32 len;
-   struct list_head list;
-};
-
-struct message_queue {
-   struct semaphore sem;
-   spinlock_t lock;
-   bool exiting;
-   u32 recv_count;
-   struct list_head msg_list;
-};
-
 struct host_if_wpa_attr {
u8 *key;
const u8 *mac_addr;
@@ -290,12 +276,11 @@ static void host_if_work(struct work_struct *work);
  *  @note  copied from FLO glue implementatuion
  *  @version   1.0
  */
-static int wilc_mq_send(struct message_queue *mq,
-   const void *send_buf, u32 send_buf_size)
+static int wilc_enqueue_cmd(struct host_if_msg *msg)
 {
struct host_if_msg *new_msg;
 
-   new_msg = kmemdup(send_buf, sizeof(*new_msg), GFP_ATOMIC);
+   new_msg = kmemdup(msg, sizeof(*new_msg), GFP_ATOMIC);
if (!new_msg)
return -ENOMEM;
 
@@ -2404,7 +2389,7 @@ static void ListenTimerCB(unsigned long arg)
msg.vif = vif;
msg.body.remain_on_ch.id = vif->hif_drv->remain_on_ch.id;
 
-   result = wilc_mq_send(&hif_msg_q, &msg, sizeof(struct host_if_msg));
+   result = wilc_enqueue_cmd(&msg);
if (result)
netdev_err(vif->ndev, "wilc_mq_send fail\n");
 }
@@ -2514,160 +2499,159 @@ static void host_if_work(struct work_struct *work)
 
if (msg->id == HOST_IF_MSG_CONNECT &&
msg->vif->hif_drv->usr_scan_req.scan_result) {
-   wilc_mq_send(&hif_msg_q, msg, sizeof(struct host_if_msg));
+   wilc_enqueue_cmd(msg);
usleep_range(2 * 1000, 2 * 1000);
-   } else {
-
-   switch (msg->id) {
-   case HOST_IF_MSG_SCAN:
-   handle_scan(msg->vif, &msg->body.scan_info);
-   break;
-
-   case HOST_IF_MSG_CONNECT:
-   Handle_Connect(msg->vif, &msg->body.con_info);
-   break;
+   goto free_msg;
+   }
+   switch (msg->id) {
+   case HOST_IF_MSG_SCAN:
+   handle_scan(msg->vif, &msg->body.scan_info);
+   break;
 
-   case HOST_IF_MSG_RCVD_NTWRK_INFO:
-   Handle_RcvdNtwrkInfo(msg->vif, &msg->body.net_info);
-   break;
+   case HOST_IF_MSG_CONNECT:
+   Handle_Connect(msg->vif, &msg->body.con_info);
+   break;
 
-   case HOST_IF_MSG_RCVD_GNRL_ASYNC_INFO:
-   Handle_RcvdGnrlAsyncInfo(msg->vif,
-&msg->body.async_info);
-   break;
+   case HOST_IF_MSG_RCVD_NTWRK_INFO:
+   Handle_RcvdNtwrkInfo(msg->vif, &msg->body.net_info);
+   break;
 
-   case HOST_IF_MSG_KEY:
-   Handle_Key(msg->vif, &msg->body.key_info);
-   break;
+   case HOST_IF_MSG_RCVD_GNRL_ASYNC_INFO:
+   Handle_RcvdGnrlAsyncInfo(msg->vif,
+&msg->body.async_info);
+   break;
 
-   case HOST_IF_MSG_CFG_PARAMS:
-   handle_cfg_param(msg->vif, &msg->body.cfg_info);
-   break;
+   case HOST_IF_MSG_KEY:
+   Handle_Key(msg->vif, &msg->body.key_info);
+   break;
 
-   case HOST_IF_MSG_SET_CHANNEL:
-   handle_set_channel(msg->vif, &msg->body.channel_info);
-   break;
+   case HOST_IF_MSG_CFG_PARAMS:
+   handle_cfg_param(msg->vif, &msg->body.cfg_info);
+   break;
 
-   case HOST_IF_MSG_DISCONNECT:
-   Handle_Disconnect(msg->vif);
-   break;
+   case HOST_IF_MSG_SET_CHANNEL:
+   handle_set_channel(msg->vif, &msg->body.channel_info);
+   break;
 
-   case HOST_IF_MSG_RCVD_SCAN_COMPLETE:
-   del_timer(&msg->vif

[PATCH v4 1/3] staging: wilc1000: message_queue: Move code to host interface

2016-06-22 Thread Binoy Jayan
Move the contents of wilc_msgqueue.c and wilc_msgqueue.h into
host_interface.c, remove 'wilc_msgqueue.c' and 'wilc_msgqueue.h'.
This is done so as to restructure the implementation of the kthread
'hostIFthread' using a work queue.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/Makefile |   1 -
 drivers/staging/wilc1000/host_interface.c | 163 +-
 drivers/staging/wilc1000/wilc_msgqueue.c  | 144 --
 drivers/staging/wilc1000/wilc_msgqueue.h  |  28 -
 4 files changed, 162 insertions(+), 174 deletions(-)
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.c
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.h

diff --git a/drivers/staging/wilc1000/Makefile b/drivers/staging/wilc1000/Makefile
index acc3f3e..d226283 100644
--- a/drivers/staging/wilc1000/Makefile
+++ b/drivers/staging/wilc1000/Makefile
@@ -6,7 +6,6 @@ ccflags-y += -DFIRMWARE_1002=\"atmel/wilc1002_firmware.bin\" \
 ccflags-y += -I$(src)/ -DWILC_ASIC_A0 -DWILC_DEBUGFS
 
 wilc1000-objs := wilc_wfi_cfgoperations.o linux_wlan.o linux_mon.o \
-   wilc_msgqueue.o \
coreconfigurator.o host_interface.o \
wilc_wlan_cfg.o wilc_debugfs.o \
wilc_wlan.o
diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 9535842..2d250c6 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -3,11 +3,13 @@
 #include 
 #include 
 #include 
+#include 
 #include "host_interface.h"
+#include 
+#include 
 #include "coreconfigurator.h"
 #include "wilc_wlan.h"
 #include "wilc_wlan_if.h"
-#include "wilc_msgqueue.h"
 #include 
 #include "wilc_wfi_netdevice.h"
 
@@ -57,6 +59,20 @@
 #define TCP_ACK_FILTER_LINK_SPEED_THRESH   54
 #define DEFAULT_LINK_SPEED 72
 
+struct message {
+   void *buf;
+   u32 len;
+   struct list_head list;
+};
+
+struct message_queue {
+   struct semaphore sem;
+   spinlock_t lock;
+   bool exiting;
+   u32 recv_count;
+   struct list_head msg_list;
+};
+
 struct host_if_wpa_attr {
u8 *key;
const u8 *mac_addr;
@@ -264,6 +280,151 @@ static struct wilc_vif *join_req_vif;
 static void *host_int_ParseJoinBssParam(struct network_info *ptstrNetworkInfo);
 static int host_int_get_ipaddress(struct wilc_vif *vif, u8 *ip_addr, u8 idx);
 static s32 Handle_ScanDone(struct wilc_vif *vif, enum scan_event enuEvent);
+static int wilc_mq_create(struct message_queue *mq);
+static int wilc_mq_send(struct message_queue *mq,
+const void *send_buf, u32 send_buf_size);
+static int wilc_mq_recv(struct message_queue *mq,
+void *recv_buf, u32 recv_buf_size, u32 *recv_len);
+static int wilc_mq_destroy(struct message_queue *mq);
+
+/*!
+ *  @author  syounan
+ *  @date  1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_create(struct message_queue *mq)
+{
+   spin_lock_init(&mq->lock);
+   sema_init(&mq->sem, 0);
+   INIT_LIST_HEAD(&mq->msg_list);
+   mq->recv_count = 0;
+   mq->exiting = false;
+   return 0;
+}
+
+/*!
+ *  @author  syounan
+ *  @date  1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_destroy(struct message_queue *mq)
+{
+   struct message *msg;
+
+   mq->exiting = true;
+
+   /* Release any waiting receiver thread. */
+   while (mq->recv_count > 0) {
+   up(&mq->sem);
+   mq->recv_count--;
+   }
+
+   while (!list_empty(&mq->msg_list)) {
+   msg = list_first_entry(&mq->msg_list, struct message, list);
+   list_del(&msg->list);
+   kfree(msg->buf);
+   }
+
+   return 0;
+}
+
+/*!
+ *  @author  syounan
+ *  @date  1 Sep 2010
+ *  @note  copied from FLO glue implementatuion
+ *  @version   1.0
+ */
+static int wilc_mq_send(struct message_queue *mq,
+   const void *send_buf, u32 send_buf_size)
+{
+   unsigned long flags;
+   struct message *new_msg = NULL;
+
+   if (!mq || (send_buf_size == 0) || !send_buf)
+   return -EINVAL;
+
+   if (mq->exiting)
+   return -EFAULT;
+
+   /* construct a new message */
+   new_msg = kmalloc(sizeof(*new_msg), GFP_ATOMIC);
+   if (!new_msg)
+   return -ENOMEM;
+
+   new_msg->len = send_buf_size;
+   INIT_LIST_HEAD(&new_msg->list);
+   new_msg->buf = kmemdup(send_buf, send_buf_size, GFP_ATOMIC);
+   if (!new_msg->buf) {
+   

[PATCH v4 0/3] *** staging: wilc1000: Replace semaphores ***

2016-06-22 Thread Binoy Jayan
Hi,

Thank you Arnd for patiently reviewing this patch series multiple times and
apologies to everyone for spamming your inboxes with a patch (v3) that does
not even build. It was due to an uncommitted change in my git repo before
generating the patch. It is corrected in v4.

This patchset [v4] is part of the second patch series for 'wilc1000'.
The original patch series consisted of 7 patches, of which only the first 5
are good. Patches 6 and 7 are being reworked in this series
in a different way.

This patch series removes the semaphore 'sem' in 'wilc1000' and also
replaces the kthread / message_queue logic with a
create_singlethread_workqueue() / queue_work() setup.

These are part of a bigger effort to eliminate all semaphores
from the Linux kernel.

They build correctly (individually and as a whole).

NB: The changes are untested

Discussion carried forward from previous patchset [v2]

Rework based on Arnd's review comments on v1

struct message_queue can be removed since
 - after the workqueue conversion, mq->sem is no longer needed
 - recv_count is not needed, it just counts the number of entries in the list
 - the 'struct wilc' pointer can be retrieved from the host_if_msg (vif->wilc)
 - the message list is not needed because we always look only at the
   first entry, except in wilc_mq_destroy(), but it would be better
   to just call destroy_workqueue(), which also drains the remaining work.
 - the exiting flag is also handled by destroy_workqueue()   
 - with everything else gone, the spinlock is also not needed any more.
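
As a rough illustration of the points above, here is a userspace sketch
(not the driver code) of how embedding a work item in the message lets a
single ordered queue replace the list, the counter and the semaphore. All
names prefixed with model_ are made up for this example, and malloc()/free()
stand in for kmemdup()/kfree(); in the kernel, the workqueue core supplies
the queueing that model_queue_work() and model_drain() fake here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the kernel's work item (hypothetical). */
struct model_work {
	void (*func)(struct model_work *work);
	struct model_work *next;
};

/* Same idea as the kernel macro: recover the enclosing structure. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed-down message; the real host_if_msg also has a body and a vif. */
struct model_msg {
	unsigned short id;
	struct model_work work;
};

/* One FIFO drained by one worker models an ordered workqueue. */
struct model_wq {
	struct model_work *head;
	struct model_work *tail;
};

static int handled_ids[8];
static int handled_count;

/* Work handler: find the message from its embedded work item. */
static void model_host_if_work(struct model_work *work)
{
	struct model_msg *msg = container_of(work, struct model_msg, work);

	if (handled_count < 8)
		handled_ids[handled_count++] = msg->id;
	free(msg);	/* freed once, at the end of the handler */
}

static void model_queue_work(struct model_wq *wq, struct model_work *work)
{
	work->next = NULL;
	if (wq->tail)
		wq->tail->next = work;
	else
		wq->head = work;
	wq->tail = work;
}

/* Copy the caller's message, then queue the copy. */
static int model_enqueue_cmd(struct model_wq *wq, const struct model_msg *msg)
{
	struct model_msg *new_msg = malloc(sizeof(*new_msg));

	if (!new_msg)
		return -1;
	memcpy(new_msg, msg, sizeof(*new_msg));
	new_msg->work.func = model_host_if_work;
	model_queue_work(wq, &new_msg->work);
	return 0;
}

/* Run every pending item in submission order; returns the number run. */
static int model_drain(struct model_wq *wq)
{
	int n = 0;

	while (wq->head) {
		struct model_work *work = wq->head;

		wq->head = work->next;
		if (!wq->head)
			wq->tail = NULL;
		work->func(work);
		n++;
	}
	return n;
}
```

Because the caller's message is copied before queueing, it can live on the
caller's stack, and the handler owns the copy until it frees it.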

Do 'kfree' only at the end of 'host_if_work' 

wilc_initialized is always '1' so the conditional 'wilc_mq_send'
in 'hostIFthread' can be removed.

A connect command (HOST_IF_MSG_CONNECT) does not complete while scan is 
ongoing. 
So, the special handling of this command needs to be preserved.
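
The special case can be modelled in plain C as below. This is only a sketch
under assumptions: the model_ names, the fixed-size ring and the max_steps
bound are invented for illustration, and the driver additionally sleeps
briefly after re-queueing so that a deferred connect does not spin.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message ids for the model. */
enum model_msg_id {
	MODEL_MSG_SCAN_DONE = 1,
	MODEL_MSG_CONNECT = 2,
	MODEL_MSG_KEY = 3
};

#define MODEL_QUEUE_LEN 16

struct model_queue {
	int ids[MODEL_QUEUE_LEN];
	size_t head;		/* next slot to read */
	size_t count;		/* number of pending entries */
	bool scan_active;	/* true while a scan is in progress */
};

static int model_enqueue(struct model_queue *q, int id)
{
	if (q->count == MODEL_QUEUE_LEN)
		return -1;
	q->ids[(q->head + q->count) % MODEL_QUEUE_LEN] = id;
	q->count++;
	return 0;
}

/*
 * Process pending messages in order, for at most max_steps dequeues.
 * Ids that were actually handled are written to out[].
 * Returns the number of handled messages.
 */
static size_t model_run(struct model_queue *q, int *out, size_t out_len,
			size_t max_steps)
{
	size_t handled = 0;
	size_t steps = 0;

	while (q->count > 0 && steps < max_steps) {
		int id = q->ids[q->head];

		q->head = (q->head + 1) % MODEL_QUEUE_LEN;
		q->count--;
		steps++;

		if (id == MODEL_MSG_CONNECT && q->scan_active) {
			/* Defer: put the connect request back at the tail. */
			(void)model_enqueue(q, id);
			continue;
		}
		if (id == MODEL_MSG_SCAN_DONE)
			q->scan_active = false;
		if (handled < out_len)
			out[handled] = id;
		handled++;
	}
	return handled;
}
```

With a scan in progress, a connect queued ahead of other messages ends up
being handled only after the scan-done message has been processed.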

Use create_singlethread_workqueue() instead of alloc_workqueue(), so that
we stay closer to the current behavior by having the thread run only
on one CPU at a time and not having a 'dedicated' thread for each.

Split the patch to separate the interface changes to 'wilc_mq_send'.
No easy way was found to split out the change of the interface from
'wilc_mq_send' to 'wilc_enqueue_cmd', as the parameters
'mq', 'send_buf' and 'send_buf_size' are themselves part of the message
queue implementation.

New changes in v3

Rework based on Arnd's review comments on v2
 - Remove forward declaration for wilc_enqueue_cmd
 - Change the interface 'wilc_mq_send' in a different patch
 - Avoid change in indentation in host_if_work and move it to
   a different patch

The forward declaration of the local function 'host_if_work' cannot be
removed, since there is a mutual dependency.

New changes in v4

Remove the unused identifier 'hif_msg_q', which caused the build error.

Binoy

Binoy Jayan (3):
  staging: wilc1000: message_queue: Move code to host interface
  staging: wilc1000: Replace kthread with workqueue for host interface
  staging: wilc1000: Change interface wilc_mq_send to wilc_enqueue_cmd

 drivers/staging/wilc1000/Makefile |   1 -
 drivers/staging/wilc1000/TODO |   5 +
 drivers/staging/wilc1000/host_interface.c | 396 +++---
 drivers/staging/wilc1000/wilc_msgqueue.c  | 144 ---
 drivers/staging/wilc1000/wilc_msgqueue.h  |  28 ---
 5 files changed, 204 insertions(+), 370 deletions(-)
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.c
 delete mode 100644 drivers/staging/wilc1000/wilc_msgqueue.h

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v4 3/3] staging: wilc1000: Change interface wilc_mq_send to wilc_enqueue_cmd

2016-06-22 Thread Binoy Jayan
Replace the interface 'wilc_mq_send' with 'wilc_enqueue_cmd'
and remove the now unused structures 'message' and 'message_queue'.
Restructure switch statement in the work queue helper function
host_if_work and remove unwanted indentation.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/host_interface.c | 333 ++
 1 file changed, 158 insertions(+), 175 deletions(-)

diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 242c3d7..9c70318 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -60,20 +60,6 @@
 #define TCP_ACK_FILTER_LINK_SPEED_THRESH   54
 #define DEFAULT_LINK_SPEED 72
 
-struct message {
-   void *buf;
-   u32 len;
-   struct list_head list;
-};
-
-struct message_queue {
-   struct semaphore sem;
-   spinlock_t lock;
-   bool exiting;
-   u32 recv_count;
-   struct list_head msg_list;
-};
-
 struct host_if_wpa_attr {
u8 *key;
const u8 *mac_addr;
@@ -248,7 +234,6 @@ static struct host_if_drv *terminated_handle;
 bool wilc_optaining_ip;
 static u8 P2P_LISTEN_STATE;
 static struct workqueue_struct *hif_workqueue;
-static struct message_queue hif_msg_q;
 static struct completion hif_thread_comp;
 static struct completion hif_driver_comp;
 static struct completion hif_wait_response;
@@ -290,12 +275,11 @@ static void host_if_work(struct work_struct *work);
  *  @note  copied from FLO glue implementatuion
  *  @version   1.0
  */
-static int wilc_mq_send(struct message_queue *mq,
-   const void *send_buf, u32 send_buf_size)
+static int wilc_enqueue_cmd(struct host_if_msg *msg)
 {
struct host_if_msg *new_msg;
 
-   new_msg = kmemdup(send_buf, sizeof(*new_msg), GFP_ATOMIC);
+   new_msg = kmemdup(msg, sizeof(*new_msg), GFP_ATOMIC);
if (!new_msg)
return -ENOMEM;
 
@@ -2404,7 +2388,7 @@ static void ListenTimerCB(unsigned long arg)
msg.vif = vif;
msg.body.remain_on_ch.id = vif->hif_drv->remain_on_ch.id;
 
-   result = wilc_mq_send(&hif_msg_q, &msg, sizeof(struct host_if_msg));
+   result = wilc_enqueue_cmd(&msg);
if (result)
netdev_err(vif->ndev, "wilc_mq_send fail\n");
 }
@@ -2514,160 +2498,159 @@ static void host_if_work(struct work_struct *work)
 
if (msg->id == HOST_IF_MSG_CONNECT &&
msg->vif->hif_drv->usr_scan_req.scan_result) {
-   wilc_mq_send(&hif_msg_q, msg, sizeof(struct host_if_msg));
+   wilc_enqueue_cmd(msg);
usleep_range(2 * 1000, 2 * 1000);
-   } else {
-
-   switch (msg->id) {
-   case HOST_IF_MSG_SCAN:
-   handle_scan(msg->vif, &msg->body.scan_info);
-   break;
-
-   case HOST_IF_MSG_CONNECT:
-   Handle_Connect(msg->vif, &msg->body.con_info);
-   break;
+   goto free_msg;
+   }
+   switch (msg->id) {
+   case HOST_IF_MSG_SCAN:
+   handle_scan(msg->vif, &msg->body.scan_info);
+   break;
 
-   case HOST_IF_MSG_RCVD_NTWRK_INFO:
-   Handle_RcvdNtwrkInfo(msg->vif, &msg->body.net_info);
-   break;
+   case HOST_IF_MSG_CONNECT:
+   Handle_Connect(msg->vif, &msg->body.con_info);
+   break;
 
-   case HOST_IF_MSG_RCVD_GNRL_ASYNC_INFO:
-   Handle_RcvdGnrlAsyncInfo(msg->vif,
-&msg->body.async_info);
-   break;
+   case HOST_IF_MSG_RCVD_NTWRK_INFO:
+   Handle_RcvdNtwrkInfo(msg->vif, &msg->body.net_info);
+   break;
 
-   case HOST_IF_MSG_KEY:
-   Handle_Key(msg->vif, &msg->body.key_info);
-   break;
+   case HOST_IF_MSG_RCVD_GNRL_ASYNC_INFO:
+   Handle_RcvdGnrlAsyncInfo(msg->vif,
+&msg->body.async_info);
+   break;
 
-   case HOST_IF_MSG_CFG_PARAMS:
-   handle_cfg_param(msg->vif, &msg->body.cfg_info);
-   break;
+   case HOST_IF_MSG_KEY:
+   Handle_Key(msg->vif, &msg->body.key_info);
+   break;
 
-   case HOST_IF_MSG_SET_CHANNEL:
-   handle_set_channel(msg->vif, &msg->body.channel_info);
-   break;
+   case HOST_IF_MSG_CFG_PARAMS:
+   handle_cfg_param(msg->vif, &msg->body.cfg_info);
+   break;
 
-   case HOST_IF

[PATCH v4 2/3] staging: wilc1000: Replace kthread with workqueue for host interface

2016-06-22 Thread Binoy Jayan
Deconstruct the kthread / message_queue logic, replacing it with a
create_singlethread_workqueue() / queue_work() setup, by adding a
'struct work_struct' to 'struct host_if_msg'. The current kthread
hostIFthread() is converted to a work queue helper with the name
'host_if_work'.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/TODO |   5 +
 drivers/staging/wilc1000/host_interface.c | 282 --
 2 files changed, 75 insertions(+), 212 deletions(-)

diff --git a/drivers/staging/wilc1000/TODO b/drivers/staging/wilc1000/TODO
index 95199d8..ec93b2e 100644
--- a/drivers/staging/wilc1000/TODO
+++ b/drivers/staging/wilc1000/TODO
@@ -4,6 +4,11 @@ TODO:
 - remove custom debug and tracing functions
 - rework comments and function headers(also coding style)
 - replace all semaphores with mutexes or completions
+- Move handling for each individual members of 'union message_body' out
+  into a separate 'struct work_struct' and completely remove the multiplexer
+  that is currently part of host_if_work(), allowing movement of the
+  implementation of each message handler into the callsite of the function
+  that currently queues the 'host_if_msg'.
 - make spi and sdio components coexist in one build
 - turn compile-time platform configuration (BEAGLE_BOARD,
   PANDA_BOARD, PLAT_WMS8304, PLAT_RK, CUSTOMER_PLATFORM, ...)
diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c
index 2d250c6..242c3d7 100644
--- a/drivers/staging/wilc1000/host_interface.c
+++ b/drivers/staging/wilc1000/host_interface.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "host_interface.h"
 #include 
 #include 
@@ -211,6 +212,7 @@ struct host_if_msg {
u16 id;
union message_body body;
struct wilc_vif *vif;
+   struct work_struct work;
 };
 
 struct join_bss_param {
@@ -245,7 +247,7 @@ struct join_bss_param {
 static struct host_if_drv *terminated_handle;
 bool wilc_optaining_ip;
 static u8 P2P_LISTEN_STATE;
-static struct task_struct *hif_thread_handler;
+static struct workqueue_struct *hif_workqueue;
 static struct message_queue hif_msg_q;
 static struct completion hif_thread_comp;
 static struct completion hif_driver_comp;
@@ -280,55 +282,7 @@ static struct wilc_vif *join_req_vif;
 static void *host_int_ParseJoinBssParam(struct network_info *ptstrNetworkInfo);
 static int host_int_get_ipaddress(struct wilc_vif *vif, u8 *ip_addr, u8 idx);
 static s32 Handle_ScanDone(struct wilc_vif *vif, enum scan_event enuEvent);
-static int wilc_mq_create(struct message_queue *mq);
-static int wilc_mq_send(struct message_queue *mq,
-const void *send_buf, u32 send_buf_size);
-static int wilc_mq_recv(struct message_queue *mq,
-void *recv_buf, u32 recv_buf_size, u32 *recv_len);
-static int wilc_mq_destroy(struct message_queue *mq);
-
-/*!
- *  @author  syounan
- *  @date  1 Sep 2010
- *  @note  copied from FLO glue implementatuion
- *  @version   1.0
- */
-static int wilc_mq_create(struct message_queue *mq)
-{
-   spin_lock_init(&mq->lock);
-   sema_init(&mq->sem, 0);
-   INIT_LIST_HEAD(&mq->msg_list);
-   mq->recv_count = 0;
-   mq->exiting = false;
-   return 0;
-}
-
-/*!
- *  @author  syounan
- *  @date  1 Sep 2010
- *  @note  copied from FLO glue implementatuion
- *  @version   1.0
- */
-static int wilc_mq_destroy(struct message_queue *mq)
-{
-   struct message *msg;
-
-   mq->exiting = true;
-
-   /* Release any waiting receiver thread. */
-   while (mq->recv_count > 0) {
-   up(&mq->sem);
-   mq->recv_count--;
-   }
-
-   while (!list_empty(&mq->msg_list)) {
-   msg = list_first_entry(&mq->msg_list, struct message, list);
-   list_del(&msg->list);
-   kfree(msg->buf);
-   }
-
-   return 0;
-}
+static void host_if_work(struct work_struct *work);
 
 /*!
  *  @author  syounan
@@ -339,92 +293,17 @@ static int wilc_mq_destroy(struct message_queue *mq)
 static int wilc_mq_send(struct message_queue *mq,
const void *send_buf, u32 send_buf_size)
 {
-   unsigned long flags;
-   struct message *new_msg = NULL;
+   struct host_if_msg *new_msg;
 
-   if (!mq || (send_buf_size == 0) || !send_buf)
-   return -EINVAL;
-
-   if (mq->exiting)
-   return -EFAULT;
-
-   /* construct a new message */
-   new_msg = kmalloc(sizeof(*new_msg), GFP_ATOMIC);
+   new_msg = kmemdup(send_buf, sizeof(*new_msg), GFP_ATOMIC);
if (!new_msg)
return -ENOMEM;
 
-   new_msg->len = send_buf_size;
-   INIT_LIST_HEAD(&new_msg->list);
-  

[PATCH v2 0/5] *** staging: wilc1000: Replace semaphores with mutexes or completions ***

2016-06-14 Thread Binoy Jayan
This is a set of patches [v2] which removes semaphores from:

drivers/staging/wilc1000

These are part of a bigger effort to eliminate all semaphores
from the Linux kernel.

They build correctly (individually and as a whole).

NB: The changes are untested

Changes w.r.t. review comments on v1

1. Whitespace removed in patch 3
2. Removed semaphore 'close_exit_sync'
3. Patch 6 to be reworked and sent in a separate patch series

Binoy Jayan (5):
  staging: wilc1000: Replace semaphore txq_event with completion
  staging: wilc1000: Replace semaphore txq_add_to_head_cs with mutex
  staging: wilc1000: Replace semaphore cfg_event with completion
  staging: wilc1000: Replace semaphore sync_event with completion
  staging: wilc1000: Remove semaphore close_exit_sync

 drivers/staging/wilc1000/linux_wlan.c | 31 ++--
 drivers/staging/wilc1000/wilc_wfi_netdevice.h | 10 
 drivers/staging/wilc1000/wilc_wlan.c  | 34 +--
 3 files changed, 35 insertions(+), 40 deletions(-)




[PATCH v2 1/5] staging: wilc1000: Replace semaphore txq_event with completion

2016-06-14 Thread Binoy Jayan
The semaphore 'txq_event' is used as a completion, so convert it
to a struct completion type.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/wilc1000/linux_wlan.c | 8 
 drivers/staging/wilc1000/wilc_wfi_netdevice.h | 3 ++-
 drivers/staging/wilc1000/wilc_wlan.c  | 8 +---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c b/drivers/staging/wilc1000/linux_wlan.c
index 4f93c11..90f906d 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -316,7 +316,7 @@ static int linux_wlan_txq_task(void *vp)
 
complete(&wl->txq_thread_started);
while (1) {
-   down(&wl->txq_event);
+   wait_for_completion(&wl->txq_event);
 
if (wl->close) {
complete(&wl->txq_thread_started);
@@ -650,7 +650,7 @@ void wilc1000_wlan_deinit(struct net_device *dev)
mutex_unlock(&wl->hif_cs);
}
if (&wl->txq_event)
-   up(&wl->txq_event);
+   wait_for_completion(&wl->txq_event);
 
wlan_deinitialize_threads(dev);
deinit_irq(dev);
@@ -681,7 +681,7 @@ static int wlan_init_locks(struct net_device *dev)
spin_lock_init(&wl->txq_spinlock);
sema_init(&wl->txq_add_to_head_cs, 1);
 
-   sema_init(&wl->txq_event, 0);
+   init_completion(&wl->txq_event);
 
sema_init(&wl->cfg_event, 0);
sema_init(&wl->sync_event, 0);
@@ -738,7 +738,7 @@ static void wlan_deinitialize_threads(struct net_device *dev)
wl->close = 1;
 
if (&wl->txq_event)
-   up(&wl->txq_event);
+   complete(&wl->txq_event);
 
if (wl->txq_thread) {
kthread_stop(wl->txq_thread);
diff --git a/drivers/staging/wilc1000/wilc_wfi_netdevice.h b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
index 3a561df6..12d7c7b 100644
--- a/drivers/staging/wilc1000/wilc_wfi_netdevice.h
+++ b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
@@ -42,6 +42,7 @@
 #include "host_interface.h"
 #include "wilc_wlan.h"
 #include 
+#include 
 
 #define FLOW_CONTROL_LOWER_THRESHOLD   128
 #define FLOW_CONTROL_UPPER_THRESHOLD   256
@@ -178,7 +179,7 @@ struct wilc {
 
struct semaphore cfg_event;
struct semaphore sync_event;
-   struct semaphore txq_event;
+   struct completion txq_event;
struct completion txq_thread_started;
 
struct task_struct *txq_thread;
diff --git a/drivers/staging/wilc1000/wilc_wlan.c b/drivers/staging/wilc1000/wilc_wlan.c
index 11e16d5..1a57135 100644
--- a/drivers/staging/wilc1000/wilc_wlan.c
+++ b/drivers/staging/wilc1000/wilc_wlan.c
@@ -1,3 +1,4 @@
+#include 
 #include "wilc_wlan_if.h"
 #include "wilc_wlan.h"
 #include "wilc_wfi_netdevice.h"
@@ -89,7 +90,7 @@ static void wilc_wlan_txq_add_to_tail(struct net_device *dev,
 
spin_unlock_irqrestore(&wilc->txq_spinlock, flags);
 
-   up(&wilc->txq_event);
+   complete(&wilc->txq_event);
 }
 
 static int wilc_wlan_txq_add_to_head(struct wilc_vif *vif,
@@ -119,7 +120,7 @@ static int wilc_wlan_txq_add_to_head(struct wilc_vif *vif,
 
spin_unlock_irqrestore(&wilc->txq_spinlock, flags);
up(&wilc->txq_add_to_head_cs);
-   up(&wilc->txq_event);
+   complete(&wilc->txq_event);
 
return 0;
 }
@@ -287,7 +288,8 @@ static int wilc_wlan_txq_filter_dup_tcp_ack(struct net_device *dev)
spin_unlock_irqrestore(&wilc->txq_spinlock, wilc->txq_spinlock_flags);
 
while (dropped > 0) {
-   wilc_lock_timeout(wilc, &wilc->txq_event, 1);
+   wait_for_completion_timeout(&wilc->txq_event,
+   msecs_to_jiffies(1));
dropped--;
}
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
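
Note that the wilc1000_wlan_deinit() hunk in the patch above maps up() to
wait_for_completion(), whereas the other hunks map up() to complete(). The
two are not interchangeable, which a small single-threaded model can show.
This is a sketch only: the model_ names are invented, the struct keeps just
a counter, and model_try_wait() returns false at the point where a real
wait_for_completion() would put the caller to sleep.

```c
#include <stdbool.h>

/* Minimal model: a completion is essentially a counter plus a wait queue. */
struct model_completion {
	unsigned int done;
};

static void model_init_completion(struct model_completion *x)
{
	x->done = 0;
}

/* Signalling side: counterpart of up() on the old semaphore. */
static void model_complete(struct model_completion *x)
{
	x->done++;
}

/*
 * Waiting side: counterpart of down() on the old semaphore.
 * Returns true if the wait would return immediately, false if a real
 * waiter would have to block because nothing has been signalled yet.
 */
static bool model_try_wait(struct model_completion *x)
{
	if (!x->done)
		return false;
	x->done--;
	return true;
}
```

In this model, a teardown path that waits instead of signalling never wakes
the transmit thread, and may itself block if no signal is pending.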



[PATCH v3 0/5] *** staging: wilc1000: Replace semaphores with mutexes or completions ***

2016-06-14 Thread Binoy Jayan
This is a set of patches [v3] which removes semaphores from:

drivers/staging/wilc1000

These are part of a bigger effort to eliminate all semaphores
from the Linux kernel.

They build correctly (individually and as a whole).

NB: The changes are untested

Changes w.r.t. review comments on v1

1. Whitespace removed in patch 3
2. Removed semaphore 'close_exit_sync'
3. Patch 6 to be reworked and sent in a separate patch series

Binoy Jayan (5):
  staging: wilc1000: Replace semaphore txq_event with completion
  staging: wilc1000: Replace semaphore txq_add_to_head_cs with mutex
  staging: wilc1000: Replace semaphore cfg_event with completion
  staging: wilc1000: Replace semaphore sync_event with completion
  staging: wilc1000: Remove semaphore close_exit_sync

 drivers/staging/wilc1000/linux_wlan.c | 31 ++--
 drivers/staging/wilc1000/wilc_wfi_netdevice.h | 10 
 drivers/staging/wilc1000/wilc_wlan.c  | 34 +--
 3 files changed, 35 insertions(+), 40 deletions(-)




[PATCH v3 2/5] staging: wilc1000: Replace semaphore txq_add_to_head_cs with mutex

2016-06-14 Thread Binoy Jayan
The semaphore 'txq_add_to_head_cs' is a simple mutex, so it should be
written as one. Semaphores are going away in the future. Also, remove
the timeout scenario, as the error handling code does not propagate the
timeout properly.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/linux_wlan.c |  4 ++--
 drivers/staging/wilc1000/wilc_wfi_netdevice.h |  3 ++-
 drivers/staging/wilc1000/wilc_wlan.c  | 11 ---
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c b/drivers/staging/wilc1000/linux_wlan.c
index 90f906d..a933551 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 #include 
 
@@ -679,7 +679,7 @@ static int wlan_init_locks(struct net_device *dev)
mutex_init(&wl->rxq_cs);
 
spin_lock_init(&wl->txq_spinlock);
-   sema_init(&wl->txq_add_to_head_cs, 1);
+   mutex_init(&wl->txq_add_to_head_cs);
 
init_completion(&wl->txq_event);
 
diff --git a/drivers/staging/wilc1000/wilc_wfi_netdevice.h 
b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
index 12d7c7b..239cd43 100644
--- a/drivers/staging/wilc1000/wilc_wfi_netdevice.h
+++ b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
@@ -43,6 +43,7 @@
 #include "wilc_wlan.h"
 #include 
 #include 
+#include 
 
 #define FLOW_CONTROL_LOWER_THRESHOLD   128
 #define FLOW_CONTROL_UPPER_THRESHOLD   256
@@ -171,7 +172,7 @@ struct wilc {
struct wilc_vif *vif[NUM_CONCURRENT_IFC];
u8 open_ifcs;
 
-   struct semaphore txq_add_to_head_cs;
+   struct mutex txq_add_to_head_cs;
spinlock_t txq_spinlock;
 
struct mutex rxq_cs;
diff --git a/drivers/staging/wilc1000/wilc_wlan.c 
b/drivers/staging/wilc1000/wilc_wlan.c
index 1a57135..9afbe8d 100644
--- a/drivers/staging/wilc1000/wilc_wlan.c
+++ b/drivers/staging/wilc1000/wilc_wlan.c
@@ -99,9 +99,7 @@ static int wilc_wlan_txq_add_to_head(struct wilc_vif *vif,
unsigned long flags;
struct wilc *wilc = vif->wilc;
 
-   if (wilc_lock_timeout(wilc, &wilc->txq_add_to_head_cs,
-   CFG_PKTS_TIMEOUT))
-   return -1;
+   mutex_lock(&wilc->txq_add_to_head_cs);
 
spin_lock_irqsave(&wilc->txq_spinlock, flags);
 
@@ -119,7 +117,7 @@ static int wilc_wlan_txq_add_to_head(struct wilc_vif *vif,
wilc->txq_entries += 1;
 
spin_unlock_irqrestore(&wilc->txq_spinlock, flags);
-   up(&wilc->txq_add_to_head_cs);
+   mutex_unlock(&wilc->txq_add_to_head_cs);
complete(&wilc->txq_event);
 
return 0;
@@ -573,8 +571,7 @@ int wilc_wlan_handle_txq(struct net_device *dev, u32 
*txq_count)
if (wilc->quit)
break;
 
-   wilc_lock_timeout(wilc, &wilc->txq_add_to_head_cs,
-   CFG_PKTS_TIMEOUT);
+   mutex_lock(&wilc->txq_add_to_head_cs);
wilc_wlan_txq_filter_dup_tcp_ack(dev);
tqe = wilc_wlan_txq_get_first(wilc);
i = 0;
@@ -755,7 +752,7 @@ _end_:
if (ret != 1)
break;
} while (0);
-   up(&wilc->txq_add_to_head_cs);
+   mutex_unlock(&wilc->txq_add_to_head_cs);
 
wilc->txq_exit = 1;
*txq_count = wilc->txq_entries;

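A minimal userspace sketch (not part of the patch) of the pattern this
conversion relies on: a semaphore initialised to 1 and released by the same
context that took it behaves as a plain mutex. The pthread-based 'demo_txq'
type and its helpers are hypothetical stand-ins for the driver's queue, chosen
only so the example compiles outside the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued transmit entry. */
struct demo_txq_entry {
	struct demo_txq_entry *next;
	int id;
};

/* Hypothetical stand-in for the part of struct wilc touched by the patch. */
struct demo_txq {
	pthread_mutex_t add_to_head_cs;	/* models wilc->txq_add_to_head_cs */
	struct demo_txq_entry *head;
	int entries;
};

static void demo_txq_init(struct demo_txq *q)
{
	/* Semaphore version: sema_init(&sem, 1). Mutex version: mutex_init(). */
	pthread_mutex_init(&q->add_to_head_cs, NULL);
	q->head = NULL;
	q->entries = 0;
}

/* Always returns 0: an untimed lock has no timeout path to report. */
static int demo_txq_add_to_head(struct demo_txq *q, struct demo_txq_entry *e)
{
	assert(e != NULL);

	pthread_mutex_lock(&q->add_to_head_cs);		/* was a timed down() */
	e->next = q->head;
	q->head = e;
	q->entries += 1;
	pthread_mutex_unlock(&q->add_to_head_cs);	/* was up() */

	return 0;
}
```

Adding two entries with demo_txq_add_to_head() leaves the second one at the
head and 'entries' at 2. Because the lock call cannot time out, the early
'return -1' branch of the semaphore version has nothing left to guard.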


[PATCH v3 3/5] staging: wilc1000: Replace semaphore cfg_event with completion

2016-06-14 Thread Binoy Jayan
The semaphore 'cfg_event' is used as a completion, so convert
it to a struct completion type.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/linux_wlan.c |  2 +-
 drivers/staging/wilc1000/wilc_wfi_netdevice.h |  2 +-
 drivers/staging/wilc1000/wilc_wlan.c  | 15 ---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c 
b/drivers/staging/wilc1000/linux_wlan.c
index a933551..81a469a 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -683,7 +683,7 @@ static int wlan_init_locks(struct net_device *dev)
 
init_completion(&wl->txq_event);
 
-   sema_init(&wl->cfg_event, 0);
+   init_completion(&wl->cfg_event);
sema_init(&wl->sync_event, 0);
init_completion(&wl->txq_thread_started);
 
diff --git a/drivers/staging/wilc1000/wilc_wfi_netdevice.h 
b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
index 239cd43..5fbc07c 100644
--- a/drivers/staging/wilc1000/wilc_wfi_netdevice.h
+++ b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
@@ -178,7 +178,7 @@ struct wilc {
struct mutex rxq_cs;
struct mutex hif_cs;
 
-   struct semaphore cfg_event;
+   struct completion cfg_event;
struct semaphore sync_event;
struct completion txq_event;
struct completion txq_thread_started;
diff --git a/drivers/staging/wilc1000/wilc_wlan.c 
b/drivers/staging/wilc1000/wilc_wlan.c
index 9afbe8d..19a5809 100644
--- a/drivers/staging/wilc1000/wilc_wlan.c
+++ b/drivers/staging/wilc1000/wilc_wlan.c
@@ -310,7 +310,7 @@ static int wilc_wlan_txq_add_cfg_pkt(struct wilc_vif *vif, 
u8 *buffer,
netdev_dbg(vif->ndev, "Adding config packet ...\n");
if (wilc->quit) {
netdev_dbg(vif->ndev, "Return due to clear function\n");
-   up(&wilc->cfg_event);
+   complete(&wilc->cfg_event);
return 0;
}
 
@@ -769,7 +769,7 @@ static void wilc_wlan_handle_rxq(struct wilc *wilc)
 
do {
if (wilc->quit) {
-   up(&wilc->cfg_event);
+   complete(&wilc->cfg_event);
break;
}
rqe = wilc_wlan_rxq_remove(wilc);
@@ -820,7 +820,7 @@ static void wilc_wlan_handle_rxq(struct wilc *wilc)
wilc_wlan_cfg_indicate_rx(wilc, &buffer[pkt_offset + offset], pkt_len, &rsp);
if (rsp.type == WILC_CFG_RSP) {
if (wilc->cfg_seq_no == rsp.seq_no)
-   up(&wilc->cfg_event);
+   complete(&wilc->cfg_event);
} else if (rsp.type == WILC_CFG_RSP_STATUS) {
wilc_mac_indicate(wilc, WILC_MAC_INDICATE_STATUS);
 
@@ -1228,11 +1228,12 @@ int wilc_wlan_cfg_set(struct wilc_vif *vif, int start, 
u16 wid, u8 *buffer,
if (wilc_wlan_cfg_commit(vif, WILC_CFG_SET, drv_handler))
ret_size = 0;
 
-   if (wilc_lock_timeout(wilc, &wilc->cfg_event,
-   CFG_PKTS_TIMEOUT)) {
+   if (!wait_for_completion_timeout(&wilc->cfg_event,
+   msecs_to_jiffies(CFG_PKTS_TIMEOUT))) {
netdev_dbg(vif->ndev, "Set Timed Out\n");
ret_size = 0;
}
+
wilc->cfg_frame_in_use = 0;
wilc->cfg_frame_offset = 0;
wilc->cfg_seq_no += 1;
@@ -1265,8 +1266,8 @@ int wilc_wlan_cfg_get(struct wilc_vif *vif, int start, 
u16 wid, int commit,
if (wilc_wlan_cfg_commit(vif, WILC_CFG_QUERY, drv_handler))
ret_size = 0;
 
-   if (wilc_lock_timeout(wilc, &wilc->cfg_event,
-   CFG_PKTS_TIMEOUT)) {
+   if (!wait_for_completion_timeout(&wilc->cfg_event,
+   msecs_to_jiffies(CFG_PKTS_TIMEOUT))) {
netdev_dbg(vif->ndev, "Get Timed Out\n");
ret_size = 0;
}



[PATCH v3 4/5] staging: wilc1000: Replace semaphore sync_event with completion

2016-06-14 Thread Binoy Jayan
The semaphore 'sync_event' is used as a completion, so convert
it to a struct completion type. Also, return -ETIME if
wait_for_completion_timeout() returns 0, i.e. if the wait timed out.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/linux_wlan.c | 10 +-
 drivers/staging/wilc1000/wilc_wfi_netdevice.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c 
b/drivers/staging/wilc1000/linux_wlan.c
index 81a469a..39fe350 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -241,7 +241,7 @@ void wilc_mac_indicate(struct wilc *wilc, int flag)
  (unsigned char *)&status, 4);
if (wilc->mac_status == WILC_MAC_STATUS_INIT) {
wilc->mac_status = status;
-   up(&wilc->sync_event);
+   complete(&wilc->sync_event);
} else {
wilc->mac_status = status;
}
@@ -386,9 +386,9 @@ static int linux_wlan_start_firmware(struct net_device *dev)
if (ret < 0)
return ret;
 
-   ret = wilc_lock_timeout(wilc, &wilc->sync_event, 5000);
-   if (ret)
-   return ret;
+   if (!wait_for_completion_timeout(&wilc->sync_event,
+   msecs_to_jiffies(5000)))
+   return -ETIME;
 
return 0;
 }
@@ -684,7 +684,7 @@ static int wlan_init_locks(struct net_device *dev)
init_completion(&wl->txq_event);
 
init_completion(&wl->cfg_event);
-   sema_init(&wl->sync_event, 0);
+   init_completion(&wl->sync_event);
init_completion(&wl->txq_thread_started);
 
return 0;
diff --git a/drivers/staging/wilc1000/wilc_wfi_netdevice.h 
b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
index 5fbc07c..5cc6a82 100644
--- a/drivers/staging/wilc1000/wilc_wfi_netdevice.h
+++ b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
@@ -179,7 +179,7 @@ struct wilc {
struct mutex hif_cs;
 
struct completion cfg_event;
-   struct semaphore sync_event;
+   struct completion sync_event;
struct completion txq_event;
struct completion txq_thread_started;
 

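The cfg_event and sync_event conversions invert the sense of the timeout
check: the old wilc_lock_timeout() result is treated as nonzero on failure,
whereas wait_for_completion_timeout() returns 0 when the timeout expires and
a positive value otherwise. The following userspace sketch (not part of the
patch) models that return convention with pthreads. All 'demo_' names are
hypothetical, and the sketch returns a fixed 1 on success where the kernel
returns the remaining jiffies.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Userspace model of a completion: a counter guarded by a mutex/condvar. */
struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	unsigned int done;
};

static void demo_init_completion(struct demo_completion *x)
{
	pthread_mutex_init(&x->lock, NULL);
	pthread_cond_init(&x->wait, NULL);
	x->done = 0;
}

static void demo_complete(struct demo_completion *x)
{
	pthread_mutex_lock(&x->lock);
	x->done++;
	pthread_cond_signal(&x->wait);
	pthread_mutex_unlock(&x->lock);
}

/* Returns 0 on timeout and 1 if the event was consumed in time. */
static unsigned long demo_wait_for_completion_timeout(struct demo_completion *x,
						      unsigned int timeout_ms)
{
	struct timespec deadline;
	unsigned long ret = 0;
	int rc = 0;

	/* Build an absolute deadline, as pthread_cond_timedwait() expects. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&x->lock);
	while (x->done == 0 && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&x->wait, &x->lock, &deadline);
	if (x->done > 0) {
		x->done--;
		ret = 1;
	}
	pthread_mutex_unlock(&x->lock);

	return ret;
}

/* Mirrors the shape of the new check in linux_wlan_start_firmware(). */
static int demo_start_firmware(struct demo_completion *sync_event,
			       unsigned int timeout_ms)
{
	if (!demo_wait_for_completion_timeout(sync_event, timeout_ms))
		return -ETIME;

	return 0;
}
```

Calling demo_start_firmware() on a freshly initialised completion returns
-ETIME after the timeout; calling it after demo_complete() returns 0. This
is why the converted call sites gain a leading '!'.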


[PATCH v3 1/5] staging: wilc1000: Replace semaphore txq_event with completion

2016-06-14 Thread Binoy Jayan
The semaphore 'txq_event' is used as a completion, so convert it
to a struct completion type.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/linux_wlan.c | 8 
 drivers/staging/wilc1000/wilc_wfi_netdevice.h | 3 ++-
 drivers/staging/wilc1000/wilc_wlan.c  | 8 +---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c 
b/drivers/staging/wilc1000/linux_wlan.c
index 4f93c11..90f906d 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -316,7 +316,7 @@ static int linux_wlan_txq_task(void *vp)
 
complete(&wl->txq_thread_started);
while (1) {
-   down(&wl->txq_event);
+   wait_for_completion(&wl->txq_event);
 
if (wl->close) {
complete(&wl->txq_thread_started);
@@ -650,7 +650,7 @@ void wilc1000_wlan_deinit(struct net_device *dev)
mutex_unlock(&wl->hif_cs);
}
if (&wl->txq_event)
-   up(&wl->txq_event);
+   wait_for_completion(&wl->txq_event);
 
wlan_deinitialize_threads(dev);
deinit_irq(dev);
@@ -681,7 +681,7 @@ static int wlan_init_locks(struct net_device *dev)
spin_lock_init(&wl->txq_spinlock);
sema_init(&wl->txq_add_to_head_cs, 1);
 
-   sema_init(&wl->txq_event, 0);
+   init_completion(&wl->txq_event);
 
sema_init(&wl->cfg_event, 0);
sema_init(&wl->sync_event, 0);
@@ -738,7 +738,7 @@ static void wlan_deinitialize_threads(struct net_device 
*dev)
wl->close = 1;
 
if (&wl->txq_event)
-   up(&wl->txq_event);
+   complete(&wl->txq_event);
 
if (wl->txq_thread) {
kthread_stop(wl->txq_thread);
diff --git a/drivers/staging/wilc1000/wilc_wfi_netdevice.h 
b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
index 3a561df6..12d7c7b 100644
--- a/drivers/staging/wilc1000/wilc_wfi_netdevice.h
+++ b/drivers/staging/wilc1000/wilc_wfi_netdevice.h
@@ -42,6 +42,7 @@
 #include "host_interface.h"
 #include "wilc_wlan.h"
 #include 
+#include 
 
 #define FLOW_CONTROL_LOWER_THRESHOLD   128
 #define FLOW_CONTROL_UPPER_THRESHOLD   256
@@ -178,7 +179,7 @@ struct wilc {
 
struct semaphore cfg_event;
struct semaphore sync_event;
-   struct semaphore txq_event;
+   struct completion txq_event;
struct completion txq_thread_started;
 
struct task_struct *txq_thread;
diff --git a/drivers/staging/wilc1000/wilc_wlan.c 
b/drivers/staging/wilc1000/wilc_wlan.c
index 11e16d5..1a57135 100644
--- a/drivers/staging/wilc1000/wilc_wlan.c
+++ b/drivers/staging/wilc1000/wilc_wlan.c
@@ -1,3 +1,4 @@
+#include 
 #include "wilc_wlan_if.h"
 #include "wilc_wlan.h"
 #include "wilc_wfi_netdevice.h"
@@ -89,7 +90,7 @@ static void wilc_wlan_txq_add_to_tail(struct net_device *dev,
 
spin_unlock_irqrestore(&wilc->txq_spinlock, flags);
 
-   up(&wilc->txq_event);
+   complete(&wilc->txq_event);
 }
 
 static int wilc_wlan_txq_add_to_head(struct wilc_vif *vif,
@@ -119,7 +120,7 @@ static int wilc_wlan_txq_add_to_head(struct wilc_vif *vif,
 
spin_unlock_irqrestore(&wilc->txq_spinlock, flags);
up(&wilc->txq_add_to_head_cs);
-   up(&wilc->txq_event);
+   complete(&wilc->txq_event);
 
return 0;
 }
@@ -287,7 +288,8 @@ static int wilc_wlan_txq_filter_dup_tcp_ack(struct 
net_device *dev)
spin_unlock_irqrestore(&wilc->txq_spinlock, wilc->txq_spinlock_flags);
 
while (dropped > 0) {
-   wilc_lock_timeout(wilc, &wilc->txq_event, 1);
+   wait_for_completion_timeout(&wilc->txq_event,
+   msecs_to_jiffies(1));
dropped--;
}
 

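For readers following the mapping used in this patch: up() corresponds to
complete() and down() to wait_for_completion(). The wilc1000_wlan_deinit()
hunk above replaces up() with wait_for_completion(); the follow-up discussion
at the top of this thread concludes that complete() was intended. The
single-threaded toy model below (not part of the patch) sketches why the
shutdown path has to signal the event rather than wait on it. All 'demo_'
names are hypothetical, and the non-blocking demo_try_wait() only stands in
for a call that would sleep in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy completion: only the 'done' counter, no sleeping. */
struct demo_completion {
	unsigned int done;
};

/* Counterpart of complete(), which replaces up(). */
static void demo_complete(struct demo_completion *x)
{
	x->done++;
}

/* Non-blocking counterpart of wait_for_completion(), which replaces down(). */
static bool demo_try_wait(struct demo_completion *x)
{
	if (x->done == 0)
		return false;	/* the real call would sleep here */
	x->done--;
	return true;
}

struct demo_wl {
	struct demo_completion txq_event;
	bool close;
};

enum demo_state { DEMO_BLOCKED, DEMO_RUNNING, DEMO_EXITED };

/* One iteration of the transmit thread's loop. */
static enum demo_state demo_txq_task_step(struct demo_wl *wl)
{
	assert(wl != NULL);

	if (!demo_try_wait(&wl->txq_event))
		return DEMO_BLOCKED;

	return wl->close ? DEMO_EXITED : DEMO_RUNNING;
}

/* Shutdown must signal the event so the thread can observe 'close'. */
static void demo_deinit(struct demo_wl *wl)
{
	wl->close = true;
	demo_complete(&wl->txq_event);
}
```

With no pending event, demo_txq_task_step() reports DEMO_BLOCKED; after
demo_deinit() it reports DEMO_EXITED. If demo_deinit() waited on the event
instead of completing it, nothing would ever wake the blocked thread.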


[PATCH v3 5/5] staging: wilc1000: Remove semaphore close_exit_sync

2016-06-14 Thread Binoy Jayan
The semaphore 'close_exit_sync' does not serve any purpose other
than delaying the deregistration of the device which it is trying
to protect from shared access. 'up' is called only when a subdevice
is closed and not when it is opened. So, the semaphore count only
goes up when the device is used.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/wilc1000/linux_wlan.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/staging/wilc1000/linux_wlan.c 
b/drivers/staging/wilc1000/linux_wlan.c
index 39fe350..f87a30f 100644
--- a/drivers/staging/wilc1000/linux_wlan.c
+++ b/drivers/staging/wilc1000/linux_wlan.c
@@ -31,8 +31,6 @@ static struct notifier_block g_dev_notifier = {
.notifier_call = dev_state_ev_handler
 };
 
-static struct semaphore close_exit_sync;
-
 static int wlan_deinit_locks(struct net_device *dev);
 static void wlan_deinitialize_threads(struct net_device *dev);
 
@@ -1088,7 +1086,6 @@ int wilc_mac_close(struct net_device *ndev)
WILC_WFI_deinit_mon_interface();
}
 
-   up(&close_exit_sync);
vif->mac_opened = 0;
 
return 0;
@@ -1232,8 +1229,6 @@ void wilc_netdev_cleanup(struct wilc *wilc)
}
 
if (wilc && (wilc->vif[0]->ndev || wilc->vif[1]->ndev)) {
-   wilc_lock_timeout(wilc, &close_exit_sync, 5 * 1000);
-
for (i = 0; i < NUM_CONCURRENT_IFC; i++)
if (wilc->vif[i]->ndev)
if (vif[i]->mac_opened)
@@ -1258,8 +1253,6 @@ int wilc_netdev_init(struct wilc **wilc, struct device 
*dev, int io_type,
struct net_device *ndev;
struct wilc *wl;
 
-   sema_init(&close_exit_sync, 0);
-
wl = kzalloc(sizeof(*wl), GFP_KERNEL);
if (!wl)
return -ENOMEM;

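The argument in the commit message can be sketched with a toy counting
semaphore (not part of the patch): every close increments the count, only
cleanup ever decrements it, and cleanup proceeds whether or not the timed
wait succeeds. The 'demo_' names are hypothetical, and the timed wait is
modelled as an immediate -ETIME instead of an actual sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy counting semaphore: only the counter, no sleeping. */
struct demo_sem {
	unsigned int count;
};

static void demo_up(struct demo_sem *s)
{
	s->count++;
}

/* Models a timed down(): -ETIME stands for "the full timeout elapsed". */
static int demo_down_timeout(struct demo_sem *s)
{
	assert(s != NULL);

	if (s->count == 0)
		return -ETIME;
	s->count--;
	return 0;
}

/* Models wilc_mac_close(): the only place that calls up(). */
static void demo_mac_close(struct demo_sem *close_exit_sync)
{
	demo_up(close_exit_sync);
}

/*
 * Models wilc_netdev_cleanup(): the result of the wait is ignored, so the
 * semaphore can only delay cleanup. Returns 1 if the delay was incurred.
 */
static int demo_netdev_cleanup(struct demo_sem *close_exit_sync)
{
	int delayed = demo_down_timeout(close_exit_sync) != 0;

	/* ... unregistering the devices would continue here either way ... */
	return delayed;
}
```

Cleanup without a prior close only incurs the delay; three closes followed by
one cleanup leave a count of 2 behind. In neither case does the semaphore
prevent concurrent access.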


[PATCH v2] staging: gdm724x: Replace semaphore netlink with mutex

2016-06-14 Thread Binoy Jayan
Replace semaphore netlink_mutex with mutex. Semaphores are
going away in the future.

Signed-off-by: Binoy Jayan 
Reviewed-by: Arnd Bergmann 
---
 drivers/staging/gdm724x/netlink_k.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/staging/gdm724x/netlink_k.c 
b/drivers/staging/gdm724x/netlink_k.c
index a0232e8..abe2425 100644
--- a/drivers/staging/gdm724x/netlink_k.c
+++ b/drivers/staging/gdm724x/netlink_k.c
@@ -14,6 +14,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -21,13 +22,7 @@
 
 #include "netlink_k.h"
 
-#if defined(DEFINE_MUTEX)
 static DEFINE_MUTEX(netlink_mutex);
-#else
-static struct semaphore netlink_mutex;
-#define mutex_lock(x)  down(x)
-#define mutex_unlock(x)up(x)
-#endif
 
 #define ND_MAX_GROUP   30
 #define ND_IFINDEX_LEN sizeof(int)
@@ -96,10 +91,6 @@ struct sock *netlink_init(int unit,
.input  = netlink_rcv,
};
 
-#if !defined(DEFINE_MUTEX)
-   init_MUTEX(&netlink_mutex);
-#endif
-
sock = netlink_kernel_create(&init_net, unit, &cfg);
 
if (sock)



[PATCH 1/5] rtl8192e: rtllib_device: Replace semaphore wx_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'wx_sem' in the rtllib_device is a simple mutex,
so it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/rtl8192e/rtl8192e/rtl_core.c |  4 +--
 drivers/staging/rtl8192e/rtllib.h|  5 ++--
 drivers/staging/rtl8192e/rtllib_softmac.c| 40 ++--
 drivers/staging/rtl8192e/rtllib_softmac_wx.c | 34 +++
 drivers/staging/rtl8192e/rtllib_wx.c | 10 +++
 5 files changed, 46 insertions(+), 47 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
index 9b7cc7d..6f7d356 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
@@ -1277,14 +1277,14 @@ RESET_START:
rtllib_stop_scan_syncro(ieee);
 
if (ieee->state == RTLLIB_LINKED) {
-   SEM_DOWN_IEEE_WX(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
netdev_info(dev, "ieee->state is RTLLIB_LINKED\n");
rtllib_stop_send_beacons(priv->rtllib);
del_timer_sync(&ieee->associate_timer);
cancel_delayed_work(&ieee->associate_retry_wq);
rtllib_stop_scan(ieee);
netif_carrier_off(dev);
-   SEM_UP_IEEE_WX(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
} else {
netdev_info(dev, "ieee->state is NOT LINKED\n");
rtllib_softmac_stop_protocol(priv->rtllib, 0, true);
diff --git a/drivers/staging/rtl8192e/rtllib.h 
b/drivers/staging/rtl8192e/rtllib.h
index 776e179..513dd61 100644
--- a/drivers/staging/rtl8192e/rtllib.h
+++ b/drivers/staging/rtl8192e/rtllib.h
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1651,7 +1652,7 @@ struct rtllib_device {
short proto_started;
short proto_stoppping;
 
-   struct semaphore wx_sem;
+   struct mutex wx_mutex;
struct semaphore scan_sem;
struct semaphore ips_sem;
 
@@ -2212,7 +2213,5 @@ void rtllib_indicate_packets(struct rtllib_device *ieee,
 void HTUseDefaultSetting(struct rtllib_device *ieee);
 #define RT_ASOC_RETRY_LIMIT5
 u8 MgntQuery_TxRateExcludeCCKRates(struct rtllib_device *ieee);
-#define SEM_DOWN_IEEE_WX(psem) down(psem)
-#define SEM_UP_IEEE_WX(psem) up(psem)
 
 #endif /* RTLLIB_H */
diff --git a/drivers/staging/rtl8192e/rtllib_softmac.c 
b/drivers/staging/rtl8192e/rtllib_softmac.c
index cfab715..30abb7f 100644
--- a/drivers/staging/rtl8192e/rtllib_softmac.c
+++ b/drivers/staging/rtl8192e/rtllib_softmac.c
@@ -753,7 +753,7 @@ static void rtllib_start_scan(struct rtllib_device *ieee)
}
 }
 
-/* called with wx_sem held */
+/* called with wx_mutex held */
 void rtllib_start_scan_syncro(struct rtllib_device *ieee, u8 is_mesh)
 {
if (IS_DOT11D_ENABLE(ieee)) {
@@ -1590,7 +1590,7 @@ static void rtllib_associate_procedure_wq(void *data)
rtllib_stop_scan_syncro(ieee);
if (ieee->rtllib_ips_leave != NULL)
ieee->rtllib_ips_leave(ieee->dev);
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
 
if (ieee->data_hard_stop)
ieee->data_hard_stop(ieee->dev);
@@ -1605,14 +1605,14 @@ static void rtllib_associate_procedure_wq(void *data)
 __func__);
if (ieee->rtllib_ips_leave_wq != NULL)
ieee->rtllib_ips_leave_wq(ieee->dev);
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
return;
}
ieee->associate_seq = 1;
 
rtllib_associate_step1(ieee, ieee->current_network.bssid);
 
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 inline void rtllib_softmac_new_net(struct rtllib_device *ieee,
@@ -2582,16 +2582,16 @@ static void rtllib_start_ibss_wq(void *data)
 struct rtllib_device, start_ibss_wq);
/* iwconfig mode ad-hoc will schedule this and return
 * on the other hand this will block further iwconfig SET
-* operations because of the wx_sem hold.
+* operations because of the wx_mutex hold.
 * Anyway some most set operations set a flag to speed-up
 * (abort) this wq (when syncro scanning) before sleeping
-* on the semaphore
+* on the mutex
 */
if (!ieee->proto_started) {
netdev_info(ieee->dev, "==oh driver down return\n");
return;
}
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
 
 

[PATCH 2/5] rtl8192e: r8192_priv: Replace semaphore wx_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'wx_sem' in the r8192_priv is a simple mutex,
so it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---

This patch depends on the following patch:
rtl8192e: rtllib_device: Replace semaphore wx_sem with mutex

 drivers/staging/rtl8192e/rtl8192e/rtl_core.c |  28 
 drivers/staging/rtl8192e/rtl8192e/rtl_core.h |   2 +-
 drivers/staging/rtl8192e/rtl8192e/rtl_wx.c   | 104 +--
 3 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
index 6f7d356..46a5c49 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
@@ -993,7 +993,7 @@ static void _rtl92e_init_priv_lock(struct r8192_priv *priv)
spin_lock_init(&priv->irq_th_lock);
spin_lock_init(&priv->rf_ps_lock);
spin_lock_init(&priv->ps_lock);
-   sema_init(&priv->wx_sem, 1);
+   mutex_init(&priv->wx_mutex);
sema_init(&priv->rf_sem, 1);
mutex_init(&priv->mutex);
 }
@@ -1247,7 +1247,7 @@ static void _rtl92e_if_silent_reset(struct net_device 
*dev)
 
 RESET_START:
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
if (priv->rtllib->state == RTLLIB_LINKED)
rtl92e_leisure_ps_leave(dev);
@@ -1255,7 +1255,7 @@ RESET_START:
if (priv->up) {
netdev_info(dev, "%s():the driver is not up.\n",
__func__);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return;
}
priv->up = 0;
@@ -1292,7 +1292,7 @@ RESET_START:
 
rtl92e_dm_backup_state(dev);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
RT_TRACE(COMP_RESET,
 "%s():<==down process is finished\n",
 __func__);
@@ -2179,9 +2179,9 @@ static int _rtl92e_open(struct net_device *dev)
struct r8192_priv *priv = rtllib_priv(dev);
int ret;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
ret = _rtl92e_try_up(dev);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return ret;
 
 }
@@ -2206,11 +2206,11 @@ static int _rtl92e_close(struct net_device *dev)
rtllib_stop_scan(priv->rtllib);
}
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
ret = _rtl92e_down(dev, true);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return ret;
 
@@ -2242,11 +2242,11 @@ static void _rtl92e_restart(void *data)
  reset_wq);
struct net_device *dev = priv->rtllib->dev;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
rtl92e_commit(dev);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 }
 
 static void _rtl92e_set_multicast(struct net_device *dev)
@@ -2265,12 +2265,12 @@ static int _rtl92e_set_mac_adr(struct net_device *dev, 
void *mac)
struct r8192_priv *priv = rtllib_priv(dev);
struct sockaddr *addr = mac;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
ether_addr_copy(dev->dev_addr, addr->sa_data);
 
schedule_work(&priv->reset_wq);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return 0;
 }
@@ -2287,7 +2287,7 @@ static int _rtl92e_ioctl(struct net_device *dev, struct 
ifreq *rq, int cmd)
struct iw_point *p = &wrq->u.data;
struct ieee_param *ipw = NULL;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
switch (cmd) {
case RTL_IOCTL_WPA_SUPPLICANT:
@@ -2393,7 +2393,7 @@ static int _rtl92e_ioctl(struct net_device *dev, struct 
ifreq *rq, int cmd)
}
 
 out:
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return ret;
 }
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.h 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
index f627fdc..369ebf1 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
@@ -375,7 +375,7 @@ struct r8192_priv {
struct tasklet_struct   irq_tx_tasklet;
struct tasklet_struct   irq_prepare_beacon_tasklet;
 
-   struct semaphorewx_sem;
+   struct mutexwx_mutex;
struct semaphorerf_sem;
stru

[PATCH 0/5] *** rtl8192e: Replace semaphore with mutex ***

2016-06-01 Thread Binoy Jayan
Resending the same patchset, this time adding the following lists:
 de...@driverdev.osuosl.org
 linux-kernel@vger.kernel.org

Hi,

This is a set of patches towards removing semaphores.
They build correctly (individually and as a whole).
Is there any way to get this tested? I do not have the following hardware:

"RealTek RTL8192E Wireless LAN NIC driver"

Thanks,
Binoy

Binoy Jayan (5):
  rtl8192e: rtllib_device: Replace semaphore wx_sem with mutex
  rtl8192e: r8192_priv: Replace semaphore wx_sem with mutex
  rtl8192e: Replace semaphore rf_sem with mutex
  rtl8192e: Replace semaphore scan_sem with mutex
  rtl8192e: Replace semaphore ips_sem with mutex

 drivers/staging/rtl8192e/rtl8192e/r8192E_phy.c |   4 +-
 drivers/staging/rtl8192e/rtl8192e/rtl_cam.c|   4 +-
 drivers/staging/rtl8192e/rtl8192e/rtl_core.c   |  34 +++
 drivers/staging/rtl8192e/rtl8192e/rtl_core.h   |   4 +-
 drivers/staging/rtl8192e/rtl8192e/rtl_ps.c |   8 +-
 drivers/staging/rtl8192e/rtl8192e/rtl_wx.c | 120 -
 drivers/staging/rtl8192e/rtllib.h  |  10 +--
 drivers/staging/rtl8192e/rtllib_softmac.c  |  58 ++--
 drivers/staging/rtl8192e/rtllib_softmac_wx.c   |  34 +++
 drivers/staging/rtl8192e/rtllib_wx.c   |  10 +--
 10 files changed, 142 insertions(+), 144 deletions(-)




[PATCH 3/5] rtl8192e: Replace semaphore rf_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'rf_sem' in the rtl8192e is a simple mutex, so it should
be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---

This patch depends on the following patch:
rtl8192e: r8192_priv: Replace semaphore wx_sem with mutex

 drivers/staging/rtl8192e/rtl8192e/r8192E_phy.c | 4 ++--
 drivers/staging/rtl8192e/rtl8192e/rtl_core.c   | 2 +-
 drivers/staging/rtl8192e/rtl8192e/rtl_core.h   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtl8192e/r8192E_phy.c 
b/drivers/staging/rtl8192e/rtl8192e/r8192E_phy.c
index 5e3bbe5..14fbcaa 100644
--- a/drivers/staging/rtl8192e/rtl8192e/r8192E_phy.c
+++ b/drivers/staging/rtl8192e/rtl8192e/r8192E_phy.c
@@ -256,7 +256,7 @@ u32 rtl92e_get_rf_reg(struct net_device *dev, enum 
rf90_radio_path eRFPath,
return 0;
if (priv->rtllib->eRFPowerState != eRfOn && !priv->being_init_adapter)
return  0;
-   down(&priv->rf_sem);
+   mutex_lock(&priv->rf_mutex);
if (priv->Rf_Mode == RF_OP_By_FW) {
Original_Value = _rtl92e_phy_rf_fw_read(dev, eRFPath, RegAddr);
udelay(200);
@@ -265,7 +265,7 @@ u32 rtl92e_get_rf_reg(struct net_device *dev, enum 
rf90_radio_path eRFPath,
}
BitShift =  _rtl92e_calculate_bit_shift(BitMask);
Readback_Value = (Original_Value & BitMask) >> BitShift;
-   up(&priv->rf_sem);
+   mutex_unlock(&priv->rf_mutex);
return Readback_Value;
 }
 
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
index 46a5c49..3d1948a 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
@@ -994,7 +994,7 @@ static void _rtl92e_init_priv_lock(struct r8192_priv *priv)
spin_lock_init(&priv->rf_ps_lock);
spin_lock_init(&priv->ps_lock);
mutex_init(&priv->wx_mutex);
-   sema_init(&priv->rf_sem, 1);
+   mutex_init(&priv->rf_mutex);
mutex_init(&priv->mutex);
 }
 
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.h 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
index 369ebf1..6ead460 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
@@ -376,7 +376,7 @@ struct r8192_priv {
struct tasklet_struct   irq_prepare_beacon_tasklet;
 
struct mutexwx_mutex;
-   struct semaphorerf_sem;
+   struct mutexrf_mutex;
struct mutexmutex;
 
struct rt_stats stats;



[PATCH 5/5] rtl8192e: Replace semaphore ips_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'ips_sem' in the rtl8192e is a simple mutex, so it should
be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---

This patch depends on the following patch:
rtl8192e: Replace semaphore scan_sem with mutex

 drivers/staging/rtl8192e/rtl8192e/rtl_cam.c |  4 ++--
 drivers/staging/rtl8192e/rtl8192e/rtl_ps.c  |  8 
 drivers/staging/rtl8192e/rtl8192e/rtl_wx.c  | 16 
 drivers/staging/rtl8192e/rtllib.h   |  3 +--
 drivers/staging/rtl8192e/rtllib_softmac.c   |  2 +-
 5 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_cam.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_cam.c
index 803c8b0..30f65af 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_cam.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_cam.c
@@ -107,9 +107,9 @@ void rtl92e_set_key(struct net_device *dev, u8 EntryNo, u8 
KeyIndex,
__func__);
return;
}
-   down(&priv->rtllib->ips_sem);
+   mutex_lock(&priv->rtllib->ips_mutex);
rtl92e_ips_leave(dev);
-   up(&priv->rtllib->ips_sem);
+   mutex_unlock(&priv->rtllib->ips_mutex);
}
}
priv->rtllib->is_set_key = true;
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_ps.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_ps.c
index 98e4d88..aa4b015 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_ps.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_ps.c
@@ -179,9 +179,9 @@ void rtl92e_ips_leave_wq(void *data)
struct net_device *dev = ieee->dev;
struct r8192_priv *priv = (struct r8192_priv *)rtllib_priv(dev);
 
-   down(&priv->rtllib->ips_sem);
+   mutex_lock(&priv->rtllib->ips_mutex);
rtl92e_ips_leave(dev);
-   up(&priv->rtllib->ips_sem);
+   mutex_unlock(&priv->rtllib->ips_mutex);
 }
 
 void rtl92e_rtllib_ips_leave_wq(struct net_device *dev)
@@ -209,9 +209,9 @@ void rtl92e_rtllib_ips_leave(struct net_device *dev)
 {
struct r8192_priv *priv = (struct r8192_priv *)rtllib_priv(dev);
 
-   down(&priv->rtllib->ips_sem);
+   mutex_lock(&priv->rtllib->ips_mutex);
rtl92e_ips_leave(dev);
-   up(&priv->rtllib->ips_sem);
+   mutex_unlock(&priv->rtllib->ips_mutex);
 }
 
 static bool _rtl92e_ps_set_mode(struct net_device *dev, u8 rtPsMode)
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_wx.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_wx.c
index 78fe833..7413a10 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_wx.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_wx.c
@@ -281,9 +281,9 @@ static int _rtl92e_wx_set_mode(struct net_device *dev,
netdev_info(dev,
"=>%s(): rtl92e_ips_leave\n",
__func__);
-   down(&priv->rtllib->ips_sem);
+   mutex_lock(&priv->rtllib->ips_mutex);
rtl92e_ips_leave(dev);
-   up(&priv->rtllib->ips_sem);
+   mutex_unlock(&priv->rtllib->ips_mutex);
}
}
}
@@ -442,9 +442,9 @@ static int _rtl92e_wx_set_scan(struct net_device *dev,
RT_TRACE(COMP_PS,
 "=>%s(): rtl92e_ips_leave\n",
 __func__);
-   down(&priv->rtllib->ips_sem);
+   mutex_lock(&priv->rtllib->ips_mutex);
rtl92e_ips_leave(dev);
-   up(&priv->rtllib->ips_sem);
+   mutex_unlock(&priv->rtllib->ips_mutex);
}
}
rtllib_stop_scan(priv->rtllib);
@@ -698,9 +698,9 @@ static int _rtl92e_wx_set_enc(struct net_device *dev,
return -ENETDOWN;
 
priv->rtllib->wx_set_enc = 1;
-   down(&priv->rtllib->ips_sem);
+   mutex_lock(&priv->rtllib->ips_mutex);
rtl92e_ips_leave(dev);
-   up(&priv->rtllib->ips_sem);
+   mutex_unlock(&priv->rtllib->ips_mutex);
mutex_lock(&priv->wx_mutex);
 
RT_TRACE(COMP_SEC, "Setting SW wep key");
@@ -905,9 +905,9 @@ static int _rtl92e_wx_set_encode_ext(struct net_device *dev,
mutex_lock(&priv->wx_mutex);
 
priv->rtllib->wx_set_enc = 1;
-   down(&priv->rtllib->ips_sem);
+   mu

[PATCH 4/5] rtl8192e: Replace semaphore scan_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'scan_sem' in the rtl8192e is a simple mutex, so it should
be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---

This patch depends on the following patch:
rtl8192e: Replace semaphore rf_sem with mutex

 drivers/staging/rtl8192e/rtllib.h |  2 +-
 drivers/staging/rtl8192e/rtllib_softmac.c | 16 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtllib.h 
b/drivers/staging/rtl8192e/rtllib.h
index 513dd61..5bdc378 100644
--- a/drivers/staging/rtl8192e/rtllib.h
+++ b/drivers/staging/rtl8192e/rtllib.h
@@ -1653,7 +1653,7 @@ struct rtllib_device {
short proto_stoppping;
 
struct mutex wx_mutex;
-   struct semaphore scan_sem;
+   struct mutex scan_mutex;
struct semaphore ips_sem;
 
spinlock_t mgmt_tx_lock;
diff --git a/drivers/staging/rtl8192e/rtllib_softmac.c 
b/drivers/staging/rtl8192e/rtllib_softmac.c
index 30abb7f..7f4033c 100644
--- a/drivers/staging/rtl8192e/rtllib_softmac.c
+++ b/drivers/staging/rtl8192e/rtllib_softmac.c
@@ -513,7 +513,7 @@ static void rtllib_softmac_scan_syncro(struct rtllib_device 
*ieee, u8 is_mesh)
 
ieee->be_scan_inprogress = true;
 
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
 
while (1) {
do {
@@ -566,7 +566,7 @@ out:
if (IS_DOT11D_ENABLE(ieee))
DOT11D_ScanComplete(ieee);
}
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 
ieee->be_scan_inprogress = false;
 
@@ -587,7 +587,7 @@ static void rtllib_softmac_scan_wq(void *data)
if (rtllib_act_scanning(ieee, true))
return;
 
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
 
if (ieee->eRFPowerState == eRfOff) {
netdev_info(ieee->dev,
@@ -618,7 +618,7 @@ static void rtllib_softmac_scan_wq(void *data)
schedule_delayed_work(&ieee->softmac_scan_wq,
  msecs_to_jiffies(RTLLIB_SOFTMAC_SCAN_TIME));
 
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
return;
 
 out:
@@ -630,7 +630,7 @@ out1:
ieee->actscanning = false;
ieee->scan_watch_dog = 0;
ieee->scanning_continue = 0;
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 
 
@@ -683,7 +683,7 @@ EXPORT_SYMBOL(rtllib_start_send_beacons);
 
 static void rtllib_softmac_stop_scan(struct rtllib_device *ieee)
 {
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
ieee->scan_watch_dog = 0;
if (ieee->scanning_continue == 1) {
ieee->scanning_continue = 0;
@@ -692,7 +692,7 @@ static void rtllib_softmac_stop_scan(struct rtllib_device *ieee)
cancel_delayed_work_sync(&ieee->softmac_scan_wq);
}
 
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 
 void rtllib_stop_scan(struct rtllib_device *ieee)
@@ -3035,7 +3035,7 @@ void rtllib_softmac_init(struct rtllib_device *ieee)
  ieee);
 
mutex_init(&ieee->wx_mutex);
-   sema_init(&ieee->scan_sem, 1);
+   mutex_init(&ieee->scan_mutex);
sema_init(&ieee->ips_sem, 1);
 
spin_lock_init(&ieee->mgmt_tx_lock);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
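
For readers unfamiliar with the pattern behind this patch, here is a minimal userspace sketch, not kernel code. It assumes POSIX threads, and `struct scan_state`, `scan_worker()` and `run_scan_workers()` are hypothetical names invented for illustration. A semaphore initialised to 1 and used only for mutual exclusion behaves like a mutex, which is why `sema_init(..., 1)`, `down()` and `up()` map one-to-one onto `mutex_init()`, `mutex_lock()` and `mutex_unlock()` in the diff above.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4
#define ITERATIONS  50000

/* Hypothetical stand-in for the driver state guarded by scan_mutex. */
struct scan_state {
	pthread_mutex_t scan_mutex; /* plays the role of the old scan_sem */
	long scan_count;            /* shared data protected by scan_mutex */
};

static void *scan_worker(void *arg)
{
	struct scan_state *st = arg;
	int i;

	for (i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&st->scan_mutex);   /* was: down() */
		st->scan_count++;
		pthread_mutex_unlock(&st->scan_mutex); /* was: up() */
	}
	return NULL;
}

/* Returns the final count, or -1 if setup failed. */
static long run_scan_workers(void)
{
	struct scan_state st;
	pthread_t tid[NUM_WORKERS];
	int started = 0, i;
	long result;

	st.scan_count = 0;
	if (pthread_mutex_init(&st.scan_mutex, NULL) != 0) /* was: sema_init(..., 1) */
		return -1;

	for (i = 0; i < NUM_WORKERS; i++) {
		if (pthread_create(&tid[i], NULL, scan_worker, &st) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	result = (started == NUM_WORKERS) ? st.scan_count : -1;
	pthread_mutex_destroy(&st.scan_mutex);
	return result;
}
```

Every lock call is paired with an unlock in the same thread, which is the ownership rule a mutex enforces and a counting semaphore does not.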



[PATCH v2 4/4] rtl8712: pwrctrl_priv: Replace semaphore lock with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'lock' in 'pwrctrl_priv' is used as a simple mutex, so it
should be written as one. Semaphores are going away in the future.
_enter_pwrlock was using down_interruptible(), so the lock could be broken
by sending a signal. This could be a bug, because nothing checks the return
code here. Hence, use mutex_lock() instead of the interruptible version,
and remove the now-unused _enter_pwrlock and _down_sema.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
  rtl8712: intf_priv: Replace semaphore lock with completion

 drivers/staging/rtl8712/osdep_service.h   |  7 ---
 drivers/staging/rtl8712/rtl8712_cmd.c | 10 +-
 drivers/staging/rtl8712/rtl871x_pwrctrl.c | 22 +++---
 drivers/staging/rtl8712/rtl871x_pwrctrl.h |  7 +--
 4 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/staging/rtl8712/osdep_service.h b/drivers/staging/rtl8712/osdep_service.h
index 076d508..d89ae09 100644
--- a/drivers/staging/rtl8712/osdep_service.h
+++ b/drivers/staging/rtl8712/osdep_service.h
@@ -60,13 +60,6 @@ struct   __queue {
 #define LIST_CONTAINOR(ptr, type, member) \
((type *)((char *)(ptr)-(SIZE_T)(&((type *)0)->member)))
 
-static inline u32 _down_sema(struct semaphore *sema)
-{
-   if (down_interruptible(sema))
-   return _FAIL;
-   return _SUCCESS;
-}
-
 static inline u32 end_of_queue_search(struct list_head *head,
struct list_head *plist)
 {
diff --git a/drivers/staging/rtl8712/rtl8712_cmd.c b/drivers/staging/rtl8712/rtl8712_cmd.c
index 1badc6c..9934eab 100644
--- a/drivers/staging/rtl8712/rtl8712_cmd.c
+++ b/drivers/staging/rtl8712/rtl8712_cmd.c
@@ -264,9 +264,9 @@ static struct cmd_obj *cmd_hdl_filter(struct _adapter *padapter,
 */
if (padapter->pwrctrlpriv.pwr_mode > PS_MODE_ACTIVE) {
padapter->pwrctrlpriv.pwr_mode = PS_MODE_ACTIVE;
-   _enter_pwrlock(&(padapter->pwrctrlpriv.lock));
+   mutex_lock(&padapter->pwrctrlpriv.mutex_lock);
r8712_set_rpwm(padapter, PS_STATE_S4);
-   up(&(padapter->pwrctrlpriv.lock));
+   mutex_unlock(&padapter->pwrctrlpriv.mutex_lock);
}
pcmd_r = pcmd;
break;
@@ -395,10 +395,10 @@ _next:
}
if (pcmd->cmdcode == GEN_CMD_CODE(_SetPwrMode)) {
if (padapter->pwrctrlpriv.bSleep) {
-   _enter_pwrlock(&(padapter->
-  pwrctrlpriv.lock));
+   mutex_lock(&padapter->
+  pwrctrlpriv.mutex_lock);
r8712_set_rpwm(padapter, PS_STATE_S2);
-   up(&padapter->pwrctrlpriv.lock);
+   mutex_unlock(&padapter->pwrctrlpriv.mutex_lock);
}
}
r8712_free_cmd_obj(pcmd);
diff --git a/drivers/staging/rtl8712/rtl871x_pwrctrl.c b/drivers/staging/rtl8712/rtl871x_pwrctrl.c
index 98a5e74..8d7ead6 100644
--- a/drivers/staging/rtl8712/rtl871x_pwrctrl.c
+++ b/drivers/staging/rtl8712/rtl871x_pwrctrl.c
@@ -103,14 +103,14 @@ void r8712_cpwm_int_hdl(struct _adapter *padapter,
if (pwrpriv->cpwm_tog == ((preportpwrstate->state) & 0x80))
return;
del_timer(&padapter->pwrctrlpriv.rpwm_check_timer);
-   _enter_pwrlock(&pwrpriv->lock);
+   mutex_lock(&pwrpriv->mutex_lock);
pwrpriv->cpwm = (preportpwrstate->state) & 0xf;
if (pwrpriv->cpwm >= PS_STATE_S2) {
if (pwrpriv->alives & CMD_ALIVE)
complete(&(pcmdpriv->cmd_queue_comp));
}
pwrpriv->cpwm_tog = (preportpwrstate->state) & 0x80;
-   up(&pwrpriv->lock);
+   mutex_unlock(&pwrpriv->mutex_lock);
 }
 
 static inline void register_task_alive(struct pwrctrl_priv *pwrctrl, uint tag)
@@ -141,10 +141,10 @@ static void SetPSModeWorkItemCallback(struct work_struct *work)
struct _adapter *padapter = container_of(pwrpriv,
struct _adapter, pwrctrlpriv);
if (!pwrpriv->bSleep) {
-   _enter_pwrlock(&pwrpriv->lock);
+   mutex_lock(&pwrpriv->mutex_lock);
if (pwrpriv->pwr_mode == PS_MODE_ACTIVE)
r8712_set_rpwm(padapter, PS_STATE_S4);
-   up(&pwrpriv->lock);
+   mutex_unlock(&pwrpriv->mutex_lock);
}
 }
 
@@ -155,11 +155,11 @@ static void 

[PATCH v2 3/4] rtl8712: intf_priv: Replace semaphore lock with completion

2016-06-01 Thread Binoy Jayan
The semaphore 'io_retevt' in 'intf_priv' is used as a completion,
so convert it to a struct completion type.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8712: Replace semaphore terminate_cmdthread_sema with completion


 drivers/staging/rtl8712/osdep_intf.h| 2 +-
 drivers/staging/rtl8712/usb_ops_linux.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/rtl8712/osdep_intf.h b/drivers/staging/rtl8712/osdep_intf.h
index aa0ec74..5d37e1f 100644
--- a/drivers/staging/rtl8712/osdep_intf.h
+++ b/drivers/staging/rtl8712/osdep_intf.h
@@ -36,7 +36,7 @@ struct intf_priv {
/* when in USB, IO is through interrupt in/out endpoints */
struct usb_device *udev;
struct urb *piorw_urb;
-   struct semaphore io_retevt;
+   struct completion io_retevt_comp;
 };
 
 int r871x_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
diff --git a/drivers/staging/rtl8712/usb_ops_linux.c b/drivers/staging/rtl8712/usb_ops_linux.c
index 454cdf6..fc3f263 100644
--- a/drivers/staging/rtl8712/usb_ops_linux.c
+++ b/drivers/staging/rtl8712/usb_ops_linux.c
@@ -50,7 +50,7 @@ uint r8712_usb_init_intf_priv(struct intf_priv *pintfpriv)
pintfpriv->piorw_urb = usb_alloc_urb(0, GFP_ATOMIC);
if (!pintfpriv->piorw_urb)
return _FAIL;
-   sema_init(&(pintfpriv->io_retevt), 0);
+   init_completion(&pintfpriv->io_retevt_comp);
return _SUCCESS;
 }
 
@@ -163,7 +163,7 @@ static void usb_write_mem_complete(struct urb *purb)
else
padapter->bSurpriseRemoved = true;
}
-   up(&pintfpriv->io_retevt);
+   complete(&pintfpriv->io_retevt_comp);
 }
 
 void r8712_usb_write_mem(struct intf_hdl *pintfhdl, u32 addr, u32 cnt, u8 *wmem)
@@ -187,7 +187,7 @@ void r8712_usb_write_mem(struct intf_hdl *pintfhdl, u32 addr, u32 cnt, u8 *wmem)
  wmem, cnt, usb_write_mem_complete,
  pio_queue);
usb_submit_urb(piorw_urb, GFP_ATOMIC);
-   _down_sema(&pintfpriv->io_retevt);
+   wait_for_completion_interruptible(&pintfpriv->io_retevt_comp);
 }
 
 static void r8712_usb_read_port_complete(struct urb *purb)



[PATCH v2 1/4] rtl8712: Replace semaphore cmd_queue_sema with completion

2016-06-01 Thread Binoy Jayan
The semaphore 'cmd_queue_sema' is used as completion,
so convert it to a struct completion type.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/rtl8712/os_intfs.c| 2 +-
 drivers/staging/rtl8712/rtl8712_cmd.c | 2 +-
 drivers/staging/rtl8712/rtl871x_cmd.c | 6 +++---
 drivers/staging/rtl8712/rtl871x_cmd.h | 2 +-
 drivers/staging/rtl8712/rtl871x_pwrctrl.c | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/rtl8712/os_intfs.c b/drivers/staging/rtl8712/os_intfs.c
index ab19112..c07bcd0 100644
--- a/drivers/staging/rtl8712/os_intfs.c
+++ b/drivers/staging/rtl8712/os_intfs.c
@@ -243,7 +243,7 @@ static u32 start_drv_threads(struct _adapter *padapter)
 void r8712_stop_drv_threads(struct _adapter *padapter)
 {
/*Below is to terminate r8712_cmd_thread & event_thread...*/
-   up(&padapter->cmdpriv.cmd_queue_sema);
+   complete(&padapter->cmdpriv.cmd_queue_comp);
if (padapter->cmdThread)
_down_sema(&padapter->cmdpriv.terminate_cmdthread_sema);
padapter->cmdpriv.cmd_seq = 1;
diff --git a/drivers/staging/rtl8712/rtl8712_cmd.c b/drivers/staging/rtl8712/rtl8712_cmd.c
index 50f4002..172f51f 100644
--- a/drivers/staging/rtl8712/rtl8712_cmd.c
+++ b/drivers/staging/rtl8712/rtl8712_cmd.c
@@ -322,7 +322,7 @@ int r8712_cmd_thread(void *context)
 
allow_signal(SIGTERM);
while (1) {
-   if ((_down_sema(&(pcmdpriv->cmd_queue_sema))) == _FAIL)
+   if (wait_for_completion_interruptible(&pcmdpriv->cmd_queue_comp))
break;
if (padapter->bDriverStopped || padapter->bSurpriseRemoved)
break;
diff --git a/drivers/staging/rtl8712/rtl871x_cmd.c b/drivers/staging/rtl8712/rtl871x_cmd.c
index 86136cc..69e650b 100644
--- a/drivers/staging/rtl8712/rtl871x_cmd.c
+++ b/drivers/staging/rtl8712/rtl871x_cmd.c
@@ -57,7 +57,7 @@ No irqsave is necessary.
 
 static sint _init_cmd_priv(struct cmd_priv *pcmdpriv)
 {
-   sema_init(&(pcmdpriv->cmd_queue_sema), 0);
+   init_completion(&pcmdpriv->cmd_queue_comp);
sema_init(&(pcmdpriv->terminate_cmdthread_sema), 0);
 
_init_queue(&(pcmdpriv->cmd_queue));
@@ -172,7 +172,7 @@ u32 r8712_enqueue_cmd(struct cmd_priv *pcmdpriv, struct cmd_obj *obj)
if (pcmdpriv->padapter->eeprompriv.bautoload_fail_flag)
return _FAIL;
res = _enqueue_cmd(&pcmdpriv->cmd_queue, obj);
-   up(&pcmdpriv->cmd_queue_sema);
+   complete(&pcmdpriv->cmd_queue_comp);
return res;
 }
 
@@ -189,7 +189,7 @@ u32 r8712_enqueue_cmd_ex(struct cmd_priv *pcmdpriv, struct cmd_obj *obj)
spin_lock_irqsave(&queue->lock, irqL);
list_add_tail(&obj->list, &queue->queue);
spin_unlock_irqrestore(&queue->lock, irqL);
-   up(&pcmdpriv->cmd_queue_sema);
+   complete(&pcmdpriv->cmd_queue_comp);
return _SUCCESS;
 }
 
diff --git a/drivers/staging/rtl8712/rtl871x_cmd.h b/drivers/staging/rtl8712/rtl871x_cmd.h
index e4a2a50..1907bc9 100644
--- a/drivers/staging/rtl8712/rtl871x_cmd.h
+++ b/drivers/staging/rtl8712/rtl871x_cmd.h
@@ -50,7 +50,7 @@ struct cmd_obj {
 };
 
 struct cmd_priv {
-   struct semaphore cmd_queue_sema;
+   struct completion cmd_queue_comp;
struct semaphore terminate_cmdthread_sema;
struct  __queue cmd_queue;
u8 cmd_seq;
diff --git a/drivers/staging/rtl8712/rtl871x_pwrctrl.c b/drivers/staging/rtl8712/rtl871x_pwrctrl.c
index bf10d6d..98a5e74 100644
--- a/drivers/staging/rtl8712/rtl871x_pwrctrl.c
+++ b/drivers/staging/rtl8712/rtl871x_pwrctrl.c
@@ -107,7 +107,7 @@ void r8712_cpwm_int_hdl(struct _adapter *padapter,
pwrpriv->cpwm = (preportpwrstate->state) & 0xf;
if (pwrpriv->cpwm >= PS_STATE_S2) {
if (pwrpriv->alives & CMD_ALIVE)
-   up(&(pcmdpriv->cmd_queue_sema));
+   complete(&(pcmdpriv->cmd_queue_comp));
}
pwrpriv->cpwm_tog = (preportpwrstate->state) & 0x80;
up(&pwrpriv->lock);



[PATCH v2 0/4] *** rtl8712: Replace semaphores with mutex / completions ***

2016-06-01 Thread Binoy Jayan
Hi,

This is a set of patches [v2] that removes semaphores from:

drivers/staging/rtl8712

They build correctly (individually and as a whole).
NB: I have not tested this as I do not have the following hardware:

"RealTek RTL8712U (RTL8192SU) Wireless LAN NIC driver"

Changes since PATCH v1, addressing review comments:

 - Removed wrapper functions _wait_completion, _down_sema and _enter_pwrlock
 - Updated changelog to explain use of mutex_lock instead of
   mutex_lock_interruptible in [PATCH v2 4/4]

Binoy

Binoy Jayan (4):
  rtl8712: Replace semaphore cmd_queue_sema with completion
  rtl8712: Replace semaphore terminate_cmdthread_sema with completion
  rtl8712: intf_priv: Replace semaphore lock with completion
  rtl8712: pwrctrl_priv: Replace semaphore lock with mutex

 drivers/staging/rtl8712/os_intfs.c|  4 ++--
 drivers/staging/rtl8712/osdep_intf.h  |  2 +-
 drivers/staging/rtl8712/osdep_service.h   |  7 ---
 drivers/staging/rtl8712/rtl8712_cmd.c | 14 +++---
 drivers/staging/rtl8712/rtl871x_cmd.c |  8 
 drivers/staging/rtl8712/rtl871x_cmd.h |  4 ++--
 drivers/staging/rtl8712/rtl871x_pwrctrl.c | 24 
 drivers/staging/rtl8712/rtl871x_pwrctrl.h |  7 +--
 drivers/staging/rtl8712/usb_ops_linux.c   |  6 +++---
 9 files changed, 32 insertions(+), 44 deletions(-)
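
The changelog item about choosing `mutex_lock` over `mutex_lock_interruptible` can be illustrated with a small userspace sketch. This is a simulation under stated assumptions, not kernel code: `struct pwr_lock_sketch`, its `signal_pending` flag and all function names are hypothetical, and the "signal" is just a boolean that makes the interruptible acquisition fail with `-EINTR`. It shows how ignoring that return value lets the protected work run without the lock, whereas the uninterruptible variant always holds it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical lock whose acquisition can fail, like down_interruptible(). */
struct pwr_lock_sketch {
	pthread_mutex_t mutex;
	bool held;           /* true only while the mutex is owned */
	bool signal_pending; /* simulates a signal arriving during the wait */
};

/* Returns 0 with the lock held, or -EINTR without taking it. */
static int lock_interruptible_sketch(struct pwr_lock_sketch *l)
{
	if (l->signal_pending)
		return -EINTR;
	pthread_mutex_lock(&l->mutex);
	l->held = true;
	return 0;
}

/* Always returns with the lock held, like mutex_lock(). */
static void lock_sketch(struct pwr_lock_sketch *l)
{
	pthread_mutex_lock(&l->mutex);
	l->held = true;
}

static void unlock_sketch(struct pwr_lock_sketch *l)
{
	l->held = false;
	pthread_mutex_unlock(&l->mutex);
}

/*
 * Runs a "critical section" and reports whether it was actually protected.
 * With interruptible == true the return value of the lock call is ignored,
 * which is the pattern the changelog describes as a possible bug.
 */
static bool critical_section_protected(bool interruptible, bool signal_pending)
{
	struct pwr_lock_sketch l = {
		.mutex = PTHREAD_MUTEX_INITIALIZER,
		.held = false,
		.signal_pending = signal_pending,
	};
	bool was_protected;

	if (interruptible)
		(void)lock_interruptible_sketch(&l); /* result ignored */
	else
		lock_sketch(&l);

	was_protected = l.held; /* the protected work would happen here */

	/* Only unlock what is held; unlocking an unowned pthread mutex is undefined. */
	if (l.held)
		unlock_sketch(&l);
	return was_protected;
}
```

In the driver the unchecked path would additionally call `up()` on a semaphore it never took, which this sketch deliberately avoids reproducing.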




[PATCH v2 2/4] rtl8712: Replace semaphore terminate_cmdthread_sema with completion

2016-06-01 Thread Binoy Jayan
The semaphore 'terminate_cmdthread_sema' is used as completion,
so convert it to a struct completion type.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
 rtl8712: Replace semaphore cmd_queue_sema with completion

 drivers/staging/rtl8712/os_intfs.c| 2 +-
 drivers/staging/rtl8712/rtl8712_cmd.c | 2 +-
 drivers/staging/rtl8712/rtl871x_cmd.c | 2 +-
 drivers/staging/rtl8712/rtl871x_cmd.h | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/rtl8712/os_intfs.c b/drivers/staging/rtl8712/os_intfs.c
index c07bcd0..2396bac 100644
--- a/drivers/staging/rtl8712/os_intfs.c
+++ b/drivers/staging/rtl8712/os_intfs.c
@@ -245,7 +245,7 @@ void r8712_stop_drv_threads(struct _adapter *padapter)
/*Below is to terminate r8712_cmd_thread & event_thread...*/
complete(&padapter->cmdpriv.cmd_queue_comp);
if (padapter->cmdThread)
-   _down_sema(&padapter->cmdpriv.terminate_cmdthread_sema);
+   wait_for_completion_interruptible(&padapter->cmdpriv.terminate_cmdthread_comp);
padapter->cmdpriv.cmd_seq = 1;
 }
 
diff --git a/drivers/staging/rtl8712/rtl8712_cmd.c b/drivers/staging/rtl8712/rtl8712_cmd.c
index 172f51f..1badc6c 100644
--- a/drivers/staging/rtl8712/rtl8712_cmd.c
+++ b/drivers/staging/rtl8712/rtl8712_cmd.c
@@ -420,7 +420,7 @@ _next:
break;
r8712_free_cmd_obj(pcmd);
} while (1);
-   up(&pcmdpriv->terminate_cmdthread_sema);
+   complete(&pcmdpriv->terminate_cmdthread_comp);
thread_exit();
 }
 
diff --git a/drivers/staging/rtl8712/rtl871x_cmd.c b/drivers/staging/rtl8712/rtl871x_cmd.c
index 69e650b..74fd928 100644
--- a/drivers/staging/rtl8712/rtl871x_cmd.c
+++ b/drivers/staging/rtl8712/rtl871x_cmd.c
@@ -58,7 +58,7 @@ No irqsave is necessary.
 static sint _init_cmd_priv(struct cmd_priv *pcmdpriv)
 {
init_completion(&pcmdpriv->cmd_queue_comp);
-   sema_init(&(pcmdpriv->terminate_cmdthread_sema), 0);
+   init_completion(&pcmdpriv->terminate_cmdthread_comp);
 
_init_queue(&(pcmdpriv->cmd_queue));
 
diff --git a/drivers/staging/rtl8712/rtl871x_cmd.h b/drivers/staging/rtl8712/rtl871x_cmd.h
index 1907bc9..ebd2e1d 100644
--- a/drivers/staging/rtl8712/rtl871x_cmd.h
+++ b/drivers/staging/rtl8712/rtl871x_cmd.h
@@ -51,7 +51,7 @@ struct cmd_obj {
 
 struct cmd_priv {
struct completion cmd_queue_comp;
-   struct semaphore terminate_cmdthread_sema;
+   struct completion terminate_cmdthread_comp;
struct  __queue cmd_queue;
u8 cmd_seq;
u8 *cmd_buf;/*shall be non-paged, and 4 bytes aligned*/
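
The two completion conversions in this series follow one handshake: a producer signals the command thread, and the thread signals back when it has exited. The following is a rough userspace sketch of that handshake, assuming POSIX threads and C11 atomics. `struct completion_sketch` and the `*_sketch` functions are hypothetical analogues of the kernel's `struct completion`, `init_completion()`, `complete()` and `wait_for_completion()`; the `cmd_done_comp` acknowledgement is an addition made only so the demo is deterministic and has no counterpart in the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace analogue of the kernel's struct completion. */
struct completion_sketch {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done; /* events signalled but not yet consumed */
};

static void init_completion_sketch(struct completion_sketch *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

/* Like complete(): record one event and wake one waiter. */
static void complete_sketch(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Like wait_for_completion(): sleep until an event is pending, then consume it. */
static void wait_for_completion_sketch(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->cond, &c->lock);
	c->done--;
	pthread_mutex_unlock(&c->lock);
}

struct cmd_ctx {
	struct completion_sketch cmd_queue_comp;           /* producer -> thread */
	struct completion_sketch cmd_done_comp;            /* thread -> producer (demo only) */
	struct completion_sketch terminate_cmdthread_comp; /* thread -> stopper */
	atomic_bool stopped;
	int handled;
};

static void *cmd_thread(void *arg)
{
	struct cmd_ctx *ctx = arg;

	for (;;) {
		wait_for_completion_sketch(&ctx->cmd_queue_comp);
		if (atomic_load(&ctx->stopped))
			break;
		ctx->handled++; /* stands in for processing one command */
		complete_sketch(&ctx->cmd_done_comp);
	}
	complete_sketch(&ctx->terminate_cmdthread_comp);
	return NULL;
}

/* Queues ncmds commands, stops the thread, returns how many were handled (-1 on error). */
static int run_cmd_thread_demo(int ncmds)
{
	struct cmd_ctx ctx;
	pthread_t tid;
	int i;

	init_completion_sketch(&ctx.cmd_queue_comp);
	init_completion_sketch(&ctx.cmd_done_comp);
	init_completion_sketch(&ctx.terminate_cmdthread_comp);
	atomic_init(&ctx.stopped, false);
	ctx.handled = 0;

	if (pthread_create(&tid, NULL, cmd_thread, &ctx) != 0)
		return -1;

	for (i = 0; i < ncmds; i++) {
		complete_sketch(&ctx.cmd_queue_comp); /* signal one queued command */
		wait_for_completion_sketch(&ctx.cmd_done_comp);
	}

	atomic_store(&ctx.stopped, true);
	complete_sketch(&ctx.cmd_queue_comp); /* wake the thread so it can exit */
	wait_for_completion_sketch(&ctx.terminate_cmdthread_comp);
	pthread_join(tid, NULL);
	return ctx.handled;
}
```

The waiter and the signaller are always different threads here, which is what distinguishes a completion from a mutex and explains why swapping `complete()` for `wait_for_completion()` on the signalling side would leave both threads blocked.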



[PATCH 1/4] rtl8192u: r8192_priv: Replace semaphore wx_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'wx_sem' in r8192_priv is a simple mutex, so
it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/rtl8192u/r8192U.h  |  2 +-
 drivers/staging/rtl8192u/r8192U_core.c | 28 ++--
 drivers/staging/rtl8192u/r8192U_wx.c   | 80 +-
 3 files changed, 55 insertions(+), 55 deletions(-)

diff --git a/drivers/staging/rtl8192u/r8192U.h b/drivers/staging/rtl8192u/r8192U.h
index ee1c722..2780838 100644
--- a/drivers/staging/rtl8192u/r8192U.h
+++ b/drivers/staging/rtl8192u/r8192U.h
@@ -879,7 +879,7 @@ typedef struct r8192_priv {
/* If 1, allow bad crc frame, reception in monitor mode */
short crcmon;
 
-   struct semaphore wx_sem;
+   struct mutex wx_mutex;
struct semaphore rf_sem;/* Used to lock rf write operation */
 
u8 rf_type; /* 0: 1T2R, 1: 2T4R */
diff --git a/drivers/staging/rtl8192u/r8192U_core.c b/drivers/staging/rtl8192u/r8192U_core.c
index 849a95e..3d1b52f 100644
--- a/drivers/staging/rtl8192u/r8192U_core.c
+++ b/drivers/staging/rtl8192u/r8192U_core.c
@@ -2373,7 +2373,7 @@ static void rtl8192_init_priv_lock(struct r8192_priv *priv)
 {
spin_lock_init(&priv->tx_lock);
spin_lock_init(&priv->irq_lock);
-   sema_init(&priv->wx_sem, 1);
+   mutex_init(&priv->wx_mutex);
sema_init(&priv->rf_sem, 1);
mutex_init(&priv->mutex);
 }
@@ -3324,12 +3324,12 @@ RESET_START:
 
/* Set the variable for reset. */
priv->ResetProgress = RESET_TYPE_SILENT;
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
if (priv->up == 0) {
RT_TRACE(COMP_ERR,
 "%s():the driver is not up! return\n",
 __func__);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return;
}
priv->up = 0;
@@ -3356,7 +3356,7 @@ RESET_START:
netdev_dbg(dev, "ieee->state is NOT LINKED\n");
ieee80211_softmac_stop_protocol(priv->ieee80211);
}
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
RT_TRACE(COMP_RESET,
 "%s():<==down process is finished\n",
 __func__);
@@ -3556,9 +3556,9 @@ static int rtl8192_open(struct net_device *dev)
struct r8192_priv *priv = ieee80211_priv(dev);
int ret;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
ret = rtl8192_up(dev);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return ret;
 
 }
@@ -3580,11 +3580,11 @@ static int rtl8192_close(struct net_device *dev)
struct r8192_priv *priv = ieee80211_priv(dev);
int ret;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
ret = rtl8192_down(dev);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return ret;
 
@@ -3658,11 +3658,11 @@ static void rtl8192_restart(struct work_struct *work)
   reset_wq);
struct net_device *dev = priv->ieee80211->dev;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
rtl8192_commit(dev);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 }
 
 static void r8192_set_multicast(struct net_device *dev)
@@ -3685,12 +3685,12 @@ static int r8192_set_mac_adr(struct net_device *dev, void *mac)
struct r8192_priv *priv = ieee80211_priv(dev);
struct sockaddr *addr = mac;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
ether_addr_copy(dev->dev_addr, addr->sa_data);
 
schedule_work(&priv->reset_wq);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return 0;
 }
@@ -3707,7 +3707,7 @@ static int rtl8192_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
struct iw_point *p = &wrq->u.data;
struct ieee_param *ipw = NULL;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
 
if (p->length < sizeof(struct ieee_param) || !p->pointer) {
@@ -3800,7 +3800,7 @@ static int rtl8192_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
kfree(ipw);
ipw = NULL;
 out:
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return ret;
 }
 
diff --git a/drivers/staging/rtl8192u/r8192U_wx.c b/drivers/staging/rtl8192u/r8192U_wx.c

[PATCH 4/4] rtl8192u: Replace semaphore rf_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'rf_sem' in rtl8192u is a simple mutex, so it should
be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8192u: Replace semaphore scan_sem with mutex

 drivers/staging/rtl8192u/r8192U.h  | 2 +-
 drivers/staging/rtl8192u/r8192U_core.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/rtl8192u/r8192U.h b/drivers/staging/rtl8192u/r8192U.h
index 2780838..7b921d4 100644
--- a/drivers/staging/rtl8192u/r8192U.h
+++ b/drivers/staging/rtl8192u/r8192U.h
@@ -880,7 +880,7 @@ typedef struct r8192_priv {
short crcmon;
 
struct mutex wx_mutex;
-   struct semaphore rf_sem;/* Used to lock rf write operation */
+   struct mutex rf_mutex;  /* Used to lock rf write operation */
 
u8 rf_type; /* 0: 1T2R, 1: 2T4R */
RT_RF_TYPE_819xU rf_chip;
diff --git a/drivers/staging/rtl8192u/r8192U_core.c b/drivers/staging/rtl8192u/r8192U_core.c
index c6d3119..46d613a 100644
--- a/drivers/staging/rtl8192u/r8192U_core.c
+++ b/drivers/staging/rtl8192u/r8192U_core.c
@@ -2374,7 +2374,7 @@ static void rtl8192_init_priv_lock(struct r8192_priv *priv)
spin_lock_init(&priv->tx_lock);
spin_lock_init(&priv->irq_lock);
mutex_init(&priv->wx_mutex);
-   sema_init(&priv->rf_sem, 1);
+   mutex_init(&priv->rf_mutex);
mutex_init(&priv->mutex);
 }
 



[PATCH 2/4] rtl8192u: ieee80211_device: Replace semaphore wx_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'wx_sem' in ieee80211_device is a simple mutex,
so it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8192u: r8192_priv: Replace semaphore wx_sem with mutex

 drivers/staging/rtl8192u/ieee80211/ieee80211.h |  2 +-
 .../staging/rtl8192u/ieee80211/ieee80211_softmac.c | 36 +++---
 .../rtl8192u/ieee80211/ieee80211_softmac_wx.c  | 34 ++--
 drivers/staging/rtl8192u/ieee80211/ieee80211_wx.c  |  6 ++--
 drivers/staging/rtl8192u/r8192U_core.c |  4 +--
 5 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211.h b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
index 68931e5..ef9ae22 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211.h
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
@@ -1799,7 +1799,7 @@ struct ieee80211_device {
short scanning;
short proto_started;
 
-   struct semaphore wx_sem;
+   struct mutex wx_mutex;
struct semaphore scan_sem;
 
spinlock_t mgmt_tx_lock;
diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
index ae1274c..c983e49 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
@@ -621,7 +621,7 @@ static void ieee80211_start_scan(struct ieee80211_device *ieee)
 
 }
 
-/* called with wx_sem held */
+/* called with wx_mutex held */
 void ieee80211_start_scan_syncro(struct ieee80211_device *ieee)
 {
if (IS_DOT11D_ENABLE(ieee) )
@@ -1389,7 +1389,7 @@ static void ieee80211_associate_procedure_wq(struct work_struct *work)
 {
struct ieee80211_device *ieee = container_of(work, struct ieee80211_device, associate_procedure_wq);
ieee->sync_scan_hurryup = 1;
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
 
if (ieee->data_hard_stop)
ieee->data_hard_stop(ieee->dev);
@@ -1402,7 +1402,7 @@ static void ieee80211_associate_procedure_wq(struct work_struct *work)
ieee->associate_seq = 1;
ieee80211_associate_step1(ieee);
 
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 inline void ieee80211_softmac_new_net(struct ieee80211_device *ieee, struct ieee80211_network *net)
@@ -2331,7 +2331,7 @@ static void ieee80211_start_ibss_wq(struct work_struct *work)
struct ieee80211_device *ieee = container_of(dwork, struct ieee80211_device, start_ibss_wq);
/* iwconfig mode ad-hoc will schedule this and return
 * on the other hand this will block further iwconfig SET
-* operations because of the wx_sem hold.
+* operations because of the wx_mutex hold.
 * Anyway some most set operations set a flag to speed-up
 * (abort) this wq (when syncro scanning) before sleeping
 * on the semaphore
@@ -2340,7 +2340,7 @@ static void ieee80211_start_ibss_wq(struct work_struct *work)
printk("==oh driver down return\n");
return;
}
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
 
if (ieee->current_network.ssid_len == 0) {
strcpy(ieee->current_network.ssid, IEEE80211_DEFAULT_TX_ESSID);
@@ -2431,7 +2431,7 @@ static void ieee80211_start_ibss_wq(struct work_struct *work)
ieee->data_hard_resume(ieee->dev);
netif_carrier_on(ieee->dev);
 
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 inline void ieee80211_start_ibss(struct ieee80211_device *ieee)
@@ -2439,7 +2439,7 @@ inline void ieee80211_start_ibss(struct ieee80211_device *ieee)
schedule_delayed_work(&ieee->start_ibss_wq, 150);
 }
 
-/* this is called only in user context, with wx_sem held */
+/* this is called only in user context, with wx_mutex held */
 void ieee80211_start_bss(struct ieee80211_device *ieee)
 {
unsigned long flags;
@@ -2505,7 +2505,7 @@ static void ieee80211_associate_retry_wq(struct work_struct *work)
struct ieee80211_device *ieee = container_of(dwork, struct ieee80211_device, associate_retry_wq);
unsigned long flags;
 
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
if(!ieee->proto_started)
goto exit;
 
@@ -2537,7 +2537,7 @@ static void ieee80211_associate_retry_wq(struct work_struct *work)
spin_unlock_irqrestore(&ieee->lock, flags);
 
 exit:
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 struct sk_buff *ieee80211_get_beacon_(struct ieee80211_device *ieee)
@@ -2583,9 +2583,9 @@ EXPORT_SYMBOL(ieee80211_get_beacon);
 void ieee80211_softmac_stop_protocol(struct ieee80211_device *ieee)
 {
   

[PATCH 0/4] *** rtl8192u: Replace semaphores with mutexes ***

2016-06-01 Thread Binoy Jayan
Hi,

This is a set of patches that removes semaphores from:

drivers/staging/rtl8192u

They build correctly (individually and as a whole).
NB: I have not tested this as I do not have the following hardware:

"RealTek RTL8192U Wireless LAN NIC driver"

Thanks,
Binoy

Binoy Jayan (4):
  rtl8192u: r8192_priv: Replace semaphore wx_sem with mutex
  rtl8192u: ieee80211_device: Replace semaphore wx_sem with mutex
  rtl8192u: Replace semaphore scan_sem with mutex
  rtl8192u: Replace semaphore rf_sem with mutex

 drivers/staging/rtl8192u/ieee80211/ieee80211.h |  4 +-
 .../staging/rtl8192u/ieee80211/ieee80211_softmac.c | 54 +++
 .../rtl8192u/ieee80211/ieee80211_softmac_wx.c  | 34 -
 drivers/staging/rtl8192u/ieee80211/ieee80211_wx.c  |  6 +-
 drivers/staging/rtl8192u/r8192U.h  |  4 +-
 drivers/staging/rtl8192u/r8192U_core.c | 34 -
 drivers/staging/rtl8192u/r8192U_wx.c   | 80 +++---
 7 files changed, 108 insertions(+), 108 deletions(-)




[PATCH 3/4] rtl8192u: Replace semaphore scan_sem with mutex

2016-06-01 Thread Binoy Jayan
The semaphore 'scan_sem' in rtl8192u is a simple mutex, so it should
be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8192u: ieee80211_device: Replace semaphore wx_sem with mutex

 drivers/staging/rtl8192u/ieee80211/ieee80211.h |  2 +-
 drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c | 18 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211.h b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
index ef9ae22..09e9499 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211.h
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
@@ -1800,7 +1800,7 @@ struct ieee80211_device {
short proto_started;
 
struct mutex wx_mutex;
-   struct semaphore scan_sem;
+   struct mutex scan_mutex;
 
spinlock_t mgmt_tx_lock;
spinlock_t beacon_lock;
diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
index c983e49..e8e83f5 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
@@ -427,7 +427,7 @@ void ieee80211_softmac_scan_syncro(struct ieee80211_device *ieee)
short ch = 0;
u8 channel_map[MAX_CHANNEL_NUMBER+1];
memcpy(channel_map, GET_DOT11D_INFO(ieee)->channel_map, MAX_CHANNEL_NUMBER+1);
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
 
while(1)
{
@@ -475,13 +475,13 @@ void ieee80211_softmac_scan_syncro(struct ieee80211_device *ieee)
 out:
if(ieee->state < IEEE80211_LINKED){
ieee->actscanning = false;
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
}
else{
ieee->sync_scan_hurryup = 0;
if(IS_DOT11D_ENABLE(ieee))
DOT11D_ScanComplete(ieee);
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 }
 EXPORT_SYMBOL(ieee80211_softmac_scan_syncro);
@@ -495,7 +495,7 @@ static void ieee80211_softmac_scan_wq(struct work_struct *work)
memcpy(channel_map, GET_DOT11D_INFO(ieee)->channel_map, MAX_CHANNEL_NUMBER+1);
if(!ieee->ieee_up)
return;
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
do{
ieee->current_network.channel =
(ieee->current_network.channel + 1) % MAX_CHANNEL_NUMBER;
@@ -517,7 +517,7 @@ static void ieee80211_softmac_scan_wq(struct work_struct *work)
 
schedule_delayed_work(&ieee->softmac_scan_wq, IEEE80211_SOFTMAC_SCAN_TIME);
 
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
return;
 out:
if(IS_DOT11D_ENABLE(ieee))
@@ -525,7 +525,7 @@ out:
ieee->actscanning = false;
watchdog = 0;
ieee->scanning = 0;
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 
 
@@ -579,7 +579,7 @@ static void ieee80211_softmac_stop_scan(struct ieee80211_device *ieee)
 
//ieee->sync_scan_hurryup = 1;
 
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
 // spin_lock_irqsave(&ieee->lock, flags);
 
if (ieee->scanning == 1) {
@@ -589,7 +589,7 @@ static void ieee80211_softmac_stop_scan(struct ieee80211_device *ieee)
}
 
 // spin_unlock_irqrestore(&ieee->lock, flags);
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 
 void ieee80211_stop_scan(struct ieee80211_device *ieee)
@@ -2729,7 +2729,7 @@ void ieee80211_softmac_init(struct ieee80211_device *ieee)
 
 
mutex_init(&ieee->wx_mutex);
-   sema_init(&ieee->scan_sem, 1);
+   mutex_init(&ieee->scan_mutex);
 
spin_lock_init(&ieee->mgmt_tx_lock);
spin_lock_init(&ieee->beacon_lock);



[PATCH v2 1/4] rtl8192u: r8192_priv: Replace semaphore wx_sem with mutex

2016-06-02 Thread Binoy Jayan
The semaphore 'wx_sem' in r8192_priv is a simple mutex, so
it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/staging/rtl8192u/r8192U.h  |  2 +-
 drivers/staging/rtl8192u/r8192U_core.c | 28 ++--
 drivers/staging/rtl8192u/r8192U_wx.c   | 80 +-
 3 files changed, 55 insertions(+), 55 deletions(-)

diff --git a/drivers/staging/rtl8192u/r8192U.h b/drivers/staging/rtl8192u/r8192U.h
index ee1c722..2780838 100644
--- a/drivers/staging/rtl8192u/r8192U.h
+++ b/drivers/staging/rtl8192u/r8192U.h
@@ -879,7 +879,7 @@ typedef struct r8192_priv {
/* If 1, allow bad crc frame, reception in monitor mode */
short crcmon;
 
-   struct semaphore wx_sem;
+   struct mutex wx_mutex;
struct semaphore rf_sem;/* Used to lock rf write operation */
 
u8 rf_type; /* 0: 1T2R, 1: 2T4R */
diff --git a/drivers/staging/rtl8192u/r8192U_core.c b/drivers/staging/rtl8192u/r8192U_core.c
index 849a95e..3d1b52f 100644
--- a/drivers/staging/rtl8192u/r8192U_core.c
+++ b/drivers/staging/rtl8192u/r8192U_core.c
@@ -2373,7 +2373,7 @@ static void rtl8192_init_priv_lock(struct r8192_priv *priv)
 {
spin_lock_init(&priv->tx_lock);
spin_lock_init(&priv->irq_lock);
-   sema_init(&priv->wx_sem, 1);
+   mutex_init(&priv->wx_mutex);
sema_init(&priv->rf_sem, 1);
mutex_init(&priv->mutex);
 }
@@ -3324,12 +3324,12 @@ RESET_START:
 
/* Set the variable for reset. */
priv->ResetProgress = RESET_TYPE_SILENT;
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
if (priv->up == 0) {
RT_TRACE(COMP_ERR,
 "%s():the driver is not up! return\n",
 __func__);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return;
}
priv->up = 0;
@@ -3356,7 +3356,7 @@ RESET_START:
netdev_dbg(dev, "ieee->state is NOT LINKED\n");
ieee80211_softmac_stop_protocol(priv->ieee80211);
}
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
RT_TRACE(COMP_RESET,
 "%s():<==down process is finished\n",
 __func__);
@@ -3556,9 +3556,9 @@ static int rtl8192_open(struct net_device *dev)
struct r8192_priv *priv = ieee80211_priv(dev);
int ret;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
ret = rtl8192_up(dev);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return ret;
 
 }
@@ -3580,11 +3580,11 @@ static int rtl8192_close(struct net_device *dev)
struct r8192_priv *priv = ieee80211_priv(dev);
int ret;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
ret = rtl8192_down(dev);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return ret;
 
@@ -3658,11 +3658,11 @@ static void rtl8192_restart(struct work_struct *work)
   reset_wq);
struct net_device *dev = priv->ieee80211->dev;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
rtl8192_commit(dev);
 
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 }
 
 static void r8192_set_multicast(struct net_device *dev)
@@ -3685,12 +3685,12 @@ static int r8192_set_mac_adr(struct net_device *dev, 
void *mac)
struct r8192_priv *priv = ieee80211_priv(dev);
struct sockaddr *addr = mac;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
ether_addr_copy(dev->dev_addr, addr->sa_data);
 
schedule_work(&priv->reset_wq);
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
 
return 0;
 }
@@ -3707,7 +3707,7 @@ static int rtl8192_ioctl(struct net_device *dev, struct 
ifreq *rq, int cmd)
struct iw_point *p = &wrq->u.data;
struct ieee_param *ipw = NULL;
 
-   down(&priv->wx_sem);
+   mutex_lock(&priv->wx_mutex);
 
 
if (p->length < sizeof(struct ieee_param) || !p->pointer) {
@@ -3800,7 +3800,7 @@ static int rtl8192_ioctl(struct net_device *dev, struct 
ifreq *rq, int cmd)
kfree(ipw);
ipw = NULL;
 out:
-   up(&priv->wx_sem);
+   mutex_unlock(&priv->wx_mutex);
return ret;
 }
 
diff --git a/drivers/staging/rtl8192u/r8192U_wx.c 
b/drivers/staging/rtl8192u/r8192U_wx.c

[PATCH v2 0/4] *** rtl8192u: Replace semaphores with mutexes ***

2016-06-02 Thread Binoy Jayan
Hi,

This is a set of patches [v2] that removes semaphores from:

drivers/staging/rtl8192u

They are part of a larger effort to eliminate all semaphores
from the Linux kernel.

They build correctly (individually and as a whole).
NB: I have not tested this as I do not have the following hardware:

"RealTek RTL8192U Wireless LAN NIC driver"

Changes since PATCH v1, following review comments:
  Replaced PATCH 4/4
   rtl8192u: Replace semaphore rf_sem with mutex
  with
   rtl8192u: Remove unused semaphore rf_sem
  because 'rf_sem' has no users; it is now removed instead
  of being replaced with a mutex.

Thanks,
Binoy

Binoy Jayan (4):
  rtl8192u: r8192_priv: Replace semaphore wx_sem with mutex
  rtl8192u: ieee80211_device: Replace semaphore wx_sem with mutex
  rtl8192u: Replace semaphore scan_sem with mutex
  rtl8192u: Remove unused semaphore rf_sem

 drivers/staging/rtl8192u/ieee80211/ieee80211.h |  4 +-
 .../staging/rtl8192u/ieee80211/ieee80211_softmac.c | 54 +++
 .../rtl8192u/ieee80211/ieee80211_softmac_wx.c  | 34 -
 drivers/staging/rtl8192u/ieee80211/ieee80211_wx.c  |  6 +-
 drivers/staging/rtl8192u/r8192U.h  |  3 +-
 drivers/staging/rtl8192u/r8192U_core.c | 33 +
 drivers/staging/rtl8192u/r8192U_wx.c   | 80 +++---
 7 files changed, 106 insertions(+), 108 deletions(-)




[PATCH v2 4/4] rtl8192u: Remove unused semaphore rf_sem

2016-06-02 Thread Binoy Jayan
The semaphore 'rf_sem' in rtl8192u has no users, so remove it.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8192u: Replace semaphore scan_sem with mutex

 drivers/staging/rtl8192u/r8192U.h  | 1 -
 drivers/staging/rtl8192u/r8192U_core.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/staging/rtl8192u/r8192U.h 
b/drivers/staging/rtl8192u/r8192U.h
index 2780838..f717cb3 100644
--- a/drivers/staging/rtl8192u/r8192U.h
+++ b/drivers/staging/rtl8192u/r8192U.h
@@ -880,7 +880,6 @@ typedef struct r8192_priv {
short crcmon;
 
struct mutex wx_mutex;
-   struct semaphore rf_sem;/* Used to lock rf write operation */
 
u8 rf_type; /* 0: 1T2R, 1: 2T4R */
RT_RF_TYPE_819xU rf_chip;
diff --git a/drivers/staging/rtl8192u/r8192U_core.c 
b/drivers/staging/rtl8192u/r8192U_core.c
index c6d3119..ccb4259 100644
--- a/drivers/staging/rtl8192u/r8192U_core.c
+++ b/drivers/staging/rtl8192u/r8192U_core.c
@@ -2374,7 +2374,6 @@ static void rtl8192_init_priv_lock(struct r8192_priv 
*priv)
spin_lock_init(&priv->tx_lock);
spin_lock_init(&priv->irq_lock);
mutex_init(&priv->wx_mutex);
-   sema_init(&priv->rf_sem, 1);
mutex_init(&priv->mutex);
 }
 



[PATCH v2 3/4] rtl8192u: Replace semaphore scan_sem with mutex

2016-06-02 Thread Binoy Jayan
The semaphore 'scan_sem' in rtl8192u is a simple mutex, so it should
be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8192u: ieee80211_device: Replace semaphore wx_sem with mutex

 drivers/staging/rtl8192u/ieee80211/ieee80211.h |  2 +-
 drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c | 18 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211.h 
b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
index ef9ae22..09e9499 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211.h
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
@@ -1800,7 +1800,7 @@ struct ieee80211_device {
short proto_started;
 
struct mutex wx_mutex;
-   struct semaphore scan_sem;
+   struct mutex scan_mutex;
 
spinlock_t mgmt_tx_lock;
spinlock_t beacon_lock;
diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c 
b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
index c983e49..e8e83f5 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
@@ -427,7 +427,7 @@ void ieee80211_softmac_scan_syncro(struct ieee80211_device 
*ieee)
short ch = 0;
u8 channel_map[MAX_CHANNEL_NUMBER+1];
memcpy(channel_map, GET_DOT11D_INFO(ieee)->channel_map, 
MAX_CHANNEL_NUMBER+1);
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
 
while(1)
{
@@ -475,13 +475,13 @@ void ieee80211_softmac_scan_syncro(struct 
ieee80211_device *ieee)
 out:
if(ieee->state < IEEE80211_LINKED){
ieee->actscanning = false;
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
}
else{
ieee->sync_scan_hurryup = 0;
if(IS_DOT11D_ENABLE(ieee))
DOT11D_ScanComplete(ieee);
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 }
 EXPORT_SYMBOL(ieee80211_softmac_scan_syncro);
@@ -495,7 +495,7 @@ static void ieee80211_softmac_scan_wq(struct work_struct 
*work)
memcpy(channel_map, GET_DOT11D_INFO(ieee)->channel_map, 
MAX_CHANNEL_NUMBER+1);
if(!ieee->ieee_up)
return;
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
do{
ieee->current_network.channel =
(ieee->current_network.channel + 1) % 
MAX_CHANNEL_NUMBER;
@@ -517,7 +517,7 @@ static void ieee80211_softmac_scan_wq(struct work_struct 
*work)
 
schedule_delayed_work(&ieee->softmac_scan_wq, 
IEEE80211_SOFTMAC_SCAN_TIME);
 
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
return;
 out:
if(IS_DOT11D_ENABLE(ieee))
@@ -525,7 +525,7 @@ out:
ieee->actscanning = false;
watchdog = 0;
ieee->scanning = 0;
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 
 
@@ -579,7 +579,7 @@ static void ieee80211_softmac_stop_scan(struct 
ieee80211_device *ieee)
 
//ieee->sync_scan_hurryup = 1;
 
-   down(&ieee->scan_sem);
+   mutex_lock(&ieee->scan_mutex);
 // spin_lock_irqsave(&ieee->lock, flags);
 
if (ieee->scanning == 1) {
@@ -589,7 +589,7 @@ static void ieee80211_softmac_stop_scan(struct 
ieee80211_device *ieee)
}
 
 // spin_unlock_irqrestore(&ieee->lock, flags);
-   up(&ieee->scan_sem);
+   mutex_unlock(&ieee->scan_mutex);
 }
 
 void ieee80211_stop_scan(struct ieee80211_device *ieee)
@@ -2729,7 +2729,7 @@ void ieee80211_softmac_init(struct ieee80211_device *ieee)
 
 
mutex_init(&ieee->wx_mutex);
-   sema_init(&ieee->scan_sem, 1);
+   mutex_init(&ieee->scan_mutex);
 
spin_lock_init(&ieee->mgmt_tx_lock);
spin_lock_init(&ieee->beacon_lock);



[PATCH v2 2/4] rtl8192u: ieee80211_device: Replace semaphore wx_sem with mutex

2016-06-02 Thread Binoy Jayan
The semaphore 'wx_sem' in ieee80211_device is a simple mutex,
so it should be written as one. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
This patch depends on the following patch:
rtl8192u: r8192_priv: Replace semaphore wx_sem with mutex

 drivers/staging/rtl8192u/ieee80211/ieee80211.h |  2 +-
 .../staging/rtl8192u/ieee80211/ieee80211_softmac.c | 36 +++---
 .../rtl8192u/ieee80211/ieee80211_softmac_wx.c  | 34 ++--
 drivers/staging/rtl8192u/ieee80211/ieee80211_wx.c  |  6 ++--
 drivers/staging/rtl8192u/r8192U_core.c |  4 +--
 5 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211.h 
b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
index 68931e5..ef9ae22 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211.h
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211.h
@@ -1799,7 +1799,7 @@ struct ieee80211_device {
short scanning;
short proto_started;
 
-   struct semaphore wx_sem;
+   struct mutex wx_mutex;
struct semaphore scan_sem;
 
spinlock_t mgmt_tx_lock;
diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c 
b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
index ae1274c..c983e49 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
@@ -621,7 +621,7 @@ static void ieee80211_start_scan(struct ieee80211_device 
*ieee)
 
 }
 
-/* called with wx_sem held */
+/* called with wx_mutex held */
 void ieee80211_start_scan_syncro(struct ieee80211_device *ieee)
 {
if (IS_DOT11D_ENABLE(ieee) )
@@ -1389,7 +1389,7 @@ static void ieee80211_associate_procedure_wq(struct 
work_struct *work)
 {
struct ieee80211_device *ieee = container_of(work, struct 
ieee80211_device, associate_procedure_wq);
ieee->sync_scan_hurryup = 1;
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
 
if (ieee->data_hard_stop)
ieee->data_hard_stop(ieee->dev);
@@ -1402,7 +1402,7 @@ static void ieee80211_associate_procedure_wq(struct 
work_struct *work)
ieee->associate_seq = 1;
ieee80211_associate_step1(ieee);
 
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 inline void ieee80211_softmac_new_net(struct ieee80211_device *ieee, struct 
ieee80211_network *net)
@@ -2331,7 +2331,7 @@ static void ieee80211_start_ibss_wq(struct work_struct 
*work)
struct ieee80211_device *ieee = container_of(dwork, struct 
ieee80211_device, start_ibss_wq);
/* iwconfig mode ad-hoc will schedule this and return
 * on the other hand this will block further iwconfig SET
-* operations because of the wx_sem hold.
+* operations because of the wx_mutex hold.
 * Anyway some most set operations set a flag to speed-up
 * (abort) this wq (when syncro scanning) before sleeping
 * on the semaphore
@@ -2340,7 +2340,7 @@ static void ieee80211_start_ibss_wq(struct work_struct 
*work)
printk("==oh driver down return\n");
return;
}
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
 
if (ieee->current_network.ssid_len == 0) {
strcpy(ieee->current_network.ssid, IEEE80211_DEFAULT_TX_ESSID);
@@ -2431,7 +2431,7 @@ static void ieee80211_start_ibss_wq(struct work_struct 
*work)
ieee->data_hard_resume(ieee->dev);
netif_carrier_on(ieee->dev);
 
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 inline void ieee80211_start_ibss(struct ieee80211_device *ieee)
@@ -2439,7 +2439,7 @@ inline void ieee80211_start_ibss(struct ieee80211_device 
*ieee)
schedule_delayed_work(&ieee->start_ibss_wq, 150);
 }
 
-/* this is called only in user context, with wx_sem held */
+/* this is called only in user context, with wx_mutex held */
 void ieee80211_start_bss(struct ieee80211_device *ieee)
 {
unsigned long flags;
@@ -2505,7 +2505,7 @@ static void ieee80211_associate_retry_wq(struct 
work_struct *work)
struct ieee80211_device *ieee = container_of(dwork, struct 
ieee80211_device, associate_retry_wq);
unsigned long flags;
 
-   down(&ieee->wx_sem);
+   mutex_lock(&ieee->wx_mutex);
if(!ieee->proto_started)
goto exit;
 
@@ -2537,7 +2537,7 @@ static void ieee80211_associate_retry_wq(struct 
work_struct *work)
spin_unlock_irqrestore(&ieee->lock, flags);
 
 exit:
-   up(&ieee->wx_sem);
+   mutex_unlock(&ieee->wx_mutex);
 }
 
 struct sk_buff *ieee80211_get_beacon_(struct ieee80211_device *ieee)
@@ -2583,9 +2583,9 @@ EXPORT_SYMBOL(ieee80211_get_beacon);
 void ieee80211_softmac_stop_protocol(struct ieee80211_device *ieee)
 {
   

Re: [PATCH v2 8/8] IB/mlx5: Add helper mlx5_ib_post_send_wait

2016-10-26 Thread Binoy Jayan
On 27 October 2016 at 11:35, Leon Romanovsky  wrote:
> On Tue, Oct 25, 2016 at 06:46:58PM +0530, Binoy Jayan wrote:
>> On 25 October 2016 at 17:56, Leon Romanovsky  wrote:
>> > On Tue, Oct 25, 2016 at 05:31:59PM +0530, Binoy Jayan wrote:
>>
>> > In case of success (err == 0), you will jump to unmap_dma instead of
>> > following the normal flow.
>> >
>> > NAK,
>> > Leon Romanovsky 
>>
>> Hi Leon,
>>
>> Even in the original code, the regular flow seems to reach 'unmap_dma' after
>> returning from 'wait_for_completion()'.
>
> In the original flow, the code executed the assignments to mr->mmkey. In your
> code, they are skipped.
>

Yes, you are right; I just noticed it. My bad. I've changed it now.

Thanks,
Binoy


[PATCH v4 06/10] IB/hns: Replace counting semaphore event_sem with wait_event

2016-10-27 Thread Binoy Jayan
Counting semaphores are going away in the future, so replace the semaphore
hns_roce_cmdq::event_sem with a conditional wait_event.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c| 46 -
 drivers/infiniband/hw/hns/hns_roce_device.h |  2 +-
 2 files changed, 33 insertions(+), 15 deletions(-)
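
Note for readers of the archive: the idea behind this patch is that the free list itself says whether a command context is available, so the counting semaphore is redundant. A waiter sleeps until the list is non-empty, and whoever returns a context wakes it. The sketch below is a userspace analogy assuming POSIX threads, with a condition variable standing in for the kernel wait queue. All `demo_*` names and `DEMO_MAX_CMDS` are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_CMDS 2

struct demo_ctx {
	int next;			/* index of the next free context, -1 at the end */
};

struct demo_cmdq {
	pthread_mutex_t lock;		/* plays the role of context_lock */
	pthread_cond_t wq;		/* plays the role of the wait queue */
	int free_head;			/* -1 means no free context */
	struct demo_ctx context[DEMO_MAX_CMDS];
};

void demo_cmdq_init(struct demo_cmdq *cmd)
{
	int i;

	pthread_mutex_init(&cmd->lock, NULL);
	pthread_cond_init(&cmd->wq, NULL);
	for (i = 0; i < DEMO_MAX_CMDS; i++)
		cmd->context[i].next = i + 1;
	cmd->context[DEMO_MAX_CMDS - 1].next = -1;
	cmd->free_head = 0;
}

/* Non-blocking attempt: returns NULL when the free list is empty. */
struct demo_ctx *demo_try_get_context(struct demo_cmdq *cmd)
{
	struct demo_ctx *ctx = NULL;

	pthread_mutex_lock(&cmd->lock);
	if (cmd->free_head >= 0) {
		ctx = &cmd->context[cmd->free_head];
		cmd->free_head = ctx->next;
	}
	pthread_mutex_unlock(&cmd->lock);
	return ctx;
}

/* Blocking variant: sleeps until a context has been returned. */
struct demo_ctx *demo_get_free_context(struct demo_cmdq *cmd)
{
	struct demo_ctx *ctx;

	pthread_mutex_lock(&cmd->lock);
	while (cmd->free_head < 0)
		pthread_cond_wait(&cmd->wq, &cmd->lock);
	ctx = &cmd->context[cmd->free_head];
	cmd->free_head = ctx->next;
	pthread_mutex_unlock(&cmd->lock);
	return ctx;
}

/* Return a context to the free list and wake one waiter. */
void demo_put_context(struct demo_cmdq *cmd, struct demo_ctx *ctx)
{
	pthread_mutex_lock(&cmd->lock);
	ctx->next = cmd->free_head;
	cmd->free_head = (int)(ctx - cmd->context);
	pthread_mutex_unlock(&cmd->lock);
	pthread_cond_signal(&cmd->wq);
}
```

Unlike the kernel version, which re-evaluates hns_roce_try_get_context() as the wait_event condition, the sketch tests the list head under the lock inside the wait loop; the effect is the same for this illustration.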

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c 
b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 51a0675..12ef3d8 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -189,6 +189,34 @@ void hns_roce_cmd_event(struct hns_roce_dev *hr_dev, u16 
token, u8 status,
complete(&context->done);
 }
 
+static inline struct hns_roce_cmd_context *
+hns_roce_try_get_context(struct hns_roce_cmdq *cmd)
+{
+   struct hns_roce_cmd_context *context = NULL;
+
+   spin_lock(&cmd->context_lock);
+
+   if (cmd->free_head < 0)
+   goto out;
+
+   context = &cmd->context[cmd->free_head];
+   context->token += cmd->token_mask + 1;
+   cmd->free_head = context->next;
+out:
+   spin_unlock(&cmd->context_lock);
+   return context;
+}
+
+/* wait for and acquire a free context */
+static inline struct hns_roce_cmd_context *
+hns_roce_get_free_context(struct hns_roce_cmdq *cmd)
+{
+   struct hns_roce_cmd_context *context;
+
+   wait_event(cmd->wq, (context = hns_roce_try_get_context(cmd)));
+   return context;
+}
+
 /* this should be called with "use_events" */
 static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
@@ -200,13 +228,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev 
*hr_dev, u64 in_param,
struct hns_roce_cmd_context *context;
int ret = 0;
 
-   spin_lock(&cmd->context_lock);
-   WARN_ON(cmd->free_head < 0);
-   context = &cmd->context[cmd->free_head];
-   context->token += cmd->token_mask + 1;
-   cmd->free_head = context->next;
-   spin_unlock(&cmd->context_lock);
-
+   context = hns_roce_get_free_context(cmd);
init_completion(&context->done);
 
ret = hns_roce_cmd_mbox_post_hw(hr_dev, in_param, out_param,
@@ -238,6 +260,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev 
*hr_dev, u64 in_param,
context->next = cmd->free_head;
cmd->free_head = context - cmd->context;
spin_unlock(&cmd->context_lock);
+   wake_up(&cmd->wq);
 
return ret;
 }
@@ -248,10 +271,8 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev 
*hr_dev, u64 in_param,
 {
int ret = 0;
 
-   down(&hr_dev->cmd.event_sem);
ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
   in_modifier, op_modifier, op, timeout);
-   up(&hr_dev->cmd.event_sem);
 
return ret;
 }
@@ -313,7 +334,7 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->context[hr_cmd->max_cmds - 1].next = -1;
hr_cmd->free_head = 0;
 
-   sema_init(&hr_cmd->event_sem, hr_cmd->max_cmds);
+   init_waitqueue_head(&hr_cmd->wq);
spin_lock_init(&hr_cmd->context_lock);
 
hr_cmd->token_mask = CMD_TOKEN_MASK;
@@ -325,12 +346,9 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
 void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
 {
struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
-   int i;
 
hr_cmd->use_events = 0;
-
-   for (i = 0; i < hr_cmd->max_cmds; ++i)
-   down(&hr_cmd->event_sem);
+   hr_cmd->free_head = -1;
 
kfree(hr_cmd->context);
 }
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 2afe075..ac95f52 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -364,7 +364,7 @@ struct hns_roce_cmdq {
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
*/
-   struct semaphoreevent_sem;
+   wait_queue_head_t   wq;
int max_cmds;
spinlock_t  context_lock;
int free_head;



[PATCH v4 03/10] IB/hns: Replace semaphore poll_sem with mutex

2016-10-27 Thread Binoy Jayan
The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future, so replace it with a mutex. Also,
remove the down()/up() calls on poll_sem from hns_roce_cmd_use_events and
hns_roce_cmd_use_polling respectively.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c| 11 ---
 drivers/infiniband/hw/hns/hns_roce_device.h |  3 ++-
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c 
b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 2a0b6c0..51a0675 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -119,7 +119,7 @@ static int hns_roce_cmd_mbox_post_hw(struct hns_roce_dev 
*hr_dev, u64 in_param,
return ret;
 }
 
-/* this should be called with "poll_sem" */
+/* this should be called with "poll_mutex" */
 static int __hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
u8 op_modifier, u16 op,
@@ -167,10 +167,10 @@ static int hns_roce_cmd_mbox_poll(struct hns_roce_dev 
*hr_dev, u64 in_param,
 {
int ret;
 
-   down(&hr_dev->cmd.poll_sem);
+   mutex_lock(&hr_dev->cmd.poll_mutex);
ret = __hns_roce_cmd_mbox_poll(hr_dev, in_param, out_param, in_modifier,
   op_modifier, op, timeout);
-   up(&hr_dev->cmd.poll_sem);
+   mutex_unlock(&hr_dev->cmd.poll_mutex);
 
return ret;
 }
@@ -275,7 +275,7 @@ int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
struct device *dev = &hr_dev->pdev->dev;
 
mutex_init(&hr_dev->cmd.hcr_mutex);
-   sema_init(&hr_dev->cmd.poll_sem, 1);
+   mutex_init(&hr_dev->cmd.poll_mutex);
hr_dev->cmd.use_events = 0;
hr_dev->cmd.toggle = 1;
hr_dev->cmd.max_cmds = CMD_MAX_NUM;
@@ -319,8 +319,6 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->token_mask = CMD_TOKEN_MASK;
hr_cmd->use_events = 1;
 
-   down(&hr_cmd->poll_sem);
-
return 0;
 }
 
@@ -335,7 +333,6 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
down(&hr_cmd->event_sem);
 
kfree(hr_cmd->context);
-   up(&hr_cmd->poll_sem);
 }
 
 struct hns_roce_cmd_mailbox
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3417315..2afe075 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -34,6 +34,7 @@
 #define _HNS_ROCE_DEVICE_H
 
 #include 
+#include 
 
 #define DRV_NAME "hns_roce"
 
@@ -358,7 +359,7 @@ struct hns_roce_cmdq {
struct dma_pool *pool;
u8 __iomem  *hcr;
struct mutexhcr_mutex;
-   struct semaphorepoll_sem;
+   struct mutexpoll_mutex;
/*
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region



[PATCH v4 01/10] IB/core: iwpm_nlmsg_request: Replace semaphore with completion

2016-10-27 Thread Binoy Jayan
The semaphore 'sem' in iwpm_nlmsg_request is used as a completion, so
convert it to a struct completion. Semaphores are going
away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/core/iwpm_msg.c  | 8 
 drivers/infiniband/core/iwpm_util.c | 7 +++
 drivers/infiniband/core/iwpm_util.h | 3 ++-
 3 files changed, 9 insertions(+), 9 deletions(-)
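
Note for readers of the archive: the subtle part of this conversion is the return convention. down_timeout() returns 0 on success and a negative error on timeout, whereas wait_for_completion_timeout() returns 0 on timeout and a positive value otherwise. That is why the check in iwpm_wait_complete_req flips from `if (ret)` to `if (!ret)`. The sketch below is a simplified userspace analogy assuming POSIX threads; the `demo_*` names are hypothetical, and the wait returns 1 rather than the remaining time.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int done;
};

void demo_init_completion(struct demo_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Returns 0 on timeout and 1 if the completion was signalled. */
int demo_wait_for_completion_timeout(struct demo_completion *c, long timeout_ms)
{
	struct timespec deadline;
	int completed = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done) {
		if (pthread_cond_timedwait(&c->cond, &c->lock, &deadline) == ETIMEDOUT)
			break;
	}
	if (c->done) {
		c->done--;		/* consume one completion */
		completed = 1;
	}
	pthread_mutex_unlock(&c->lock);
	return completed;
}

/* Mirrors the shape of the converted caller: zero from the wait means timeout. */
int demo_wait_complete_req(struct demo_completion *c, long timeout_ms)
{
	if (!demo_wait_for_completion_timeout(c, timeout_ms))
		return -EINVAL;
	return 0;
}
```

Starting the completion in the "not done" state also replaces the sema_init(..., 1) followed by down() pair that the old code used to get an initially locked semaphore.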

diff --git a/drivers/infiniband/core/iwpm_msg.c 
b/drivers/infiniband/core/iwpm_msg.c
index 1c41b95..761358f 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -394,7 +394,7 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct 
netlink_callback *cb)
/* always for found nlmsg_request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
-   up(&nlmsg_request->sem);
+   complete(&nlmsg_request->comp);
return 0;
 }
 EXPORT_SYMBOL(iwpm_register_pid_cb);
@@ -463,7 +463,7 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct 
netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
-   up(&nlmsg_request->sem);
+   complete(&nlmsg_request->comp);
return 0;
 }
 EXPORT_SYMBOL(iwpm_add_mapping_cb);
@@ -555,7 +555,7 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
-   up(&nlmsg_request->sem);
+   complete(&nlmsg_request->comp);
return 0;
 }
 EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
@@ -749,7 +749,7 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct 
netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
-   up(&nlmsg_request->sem);
+   complete(&nlmsg_request->comp);
return 0;
 }
 EXPORT_SYMBOL(iwpm_mapping_error_cb);
diff --git a/drivers/infiniband/core/iwpm_util.c 
b/drivers/infiniband/core/iwpm_util.c
index ade71e7..08ddd2e 100644
--- a/drivers/infiniband/core/iwpm_util.c
+++ b/drivers/infiniband/core/iwpm_util.c
@@ -323,8 +323,7 @@ struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 
nlmsg_seq,
nlmsg_request->nl_client = nl_client;
nlmsg_request->request_done = 0;
nlmsg_request->err_code = 0;
-   sema_init(&nlmsg_request->sem, 1);
-   down(&nlmsg_request->sem);
+   init_completion(&nlmsg_request->comp);
return nlmsg_request;
 }
 
@@ -368,8 +367,8 @@ int iwpm_wait_complete_req(struct iwpm_nlmsg_request 
*nlmsg_request)
 {
int ret;
 
-   ret = down_timeout(&nlmsg_request->sem, IWPM_NL_TIMEOUT);
-   if (ret) {
+   ret = wait_for_completion_timeout(&nlmsg_request->comp, 
IWPM_NL_TIMEOUT);
+   if (!ret) {
ret = -EINVAL;
pr_info("%s: Timeout %d sec for netlink request (seq = %u)\n",
__func__, (IWPM_NL_TIMEOUT/HZ), 
nlmsg_request->nlmsg_seq);
diff --git a/drivers/infiniband/core/iwpm_util.h 
b/drivers/infiniband/core/iwpm_util.h
index af1fc14..ea6c299 100644
--- a/drivers/infiniband/core/iwpm_util.h
+++ b/drivers/infiniband/core/iwpm_util.h
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -69,7 +70,7 @@ struct iwpm_nlmsg_request {
u8  nl_client;
u8  request_done;
u16 err_code;
-   struct semaphoresem;
+   struct completion   comp;
struct kref kref;
 };
 



[PATCH v4 07/10] IB/mthca: Replace counting semaphore event_sem with wait_event

2016-10-27 Thread Binoy Jayan
Counting semaphores are going away in the future, so replace the semaphore
mthca_cmd::event_sem with a conditional wait_event.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/mthca/mthca_cmd.c | 47 ++---
 drivers/infiniband/hw/mthca/mthca_dev.h |  3 ++-
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c 
b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 49c6e19..d6a048a 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -405,6 +405,34 @@ void mthca_cmd_event(struct mthca_dev *dev,
complete(&context->done);
 }
 
+static inline struct mthca_cmd_context *
+mthca_try_get_context(struct mthca_cmd *cmd)
+{
+   struct mthca_cmd_context *context = NULL;
+
+   spin_lock(&cmd->context_lock);
+
+   if (cmd->free_head < 0)
+   goto out;
+
+   context = &cmd->context[cmd->free_head];
+   context->token += cmd->token_mask + 1;
+   cmd->free_head = context->next;
+out:
+   spin_unlock(&cmd->context_lock);
+   return context;
+}
+
+/* wait for and acquire a free context */
+static inline struct mthca_cmd_context *
+mthca_get_free_context(struct mthca_cmd *cmd)
+{
+   struct mthca_cmd_context *context;
+
+   wait_event(cmd->wq, (context = mthca_try_get_context(cmd)));
+   return context;
+}
+
 static int mthca_cmd_wait(struct mthca_dev *dev,
  u64 in_param,
  u64 *out_param,
@@ -417,15 +445,7 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
int err = 0;
struct mthca_cmd_context *context;
 
-   down(&dev->cmd.event_sem);
-
-   spin_lock(&dev->cmd.context_lock);
-   BUG_ON(dev->cmd.free_head < 0);
-   context = &dev->cmd.context[dev->cmd.free_head];
-   context->token += dev->cmd.token_mask + 1;
-   dev->cmd.free_head = context->next;
-   spin_unlock(&dev->cmd.context_lock);
-
+   context = mthca_get_free_context(&dev->cmd);
init_completion(&context->done);
 
err = mthca_cmd_post(dev, in_param,
@@ -458,8 +478,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
context->next = dev->cmd.free_head;
dev->cmd.free_head = context - dev->cmd.context;
spin_unlock(&dev->cmd.context_lock);
+   wake_up(&dev->cmd.wq);
 
-   up(&dev->cmd.event_sem);
return err;
 }
 
@@ -571,7 +591,7 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.context[dev->cmd.max_cmds - 1].next = -1;
dev->cmd.free_head = 0;
 
-   sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds);
+   init_waitqueue_head(&dev->cmd.wq);
spin_lock_init(&dev->cmd.context_lock);
 
for (dev->cmd.token_mask = 1;
@@ -590,12 +610,9 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
  */
 void mthca_cmd_use_polling(struct mthca_dev *dev)
 {
-   int i;
-
dev->cmd.flags &= ~MTHCA_CMD_USE_EVENTS;
 
-   for (i = 0; i < dev->cmd.max_cmds; ++i)
-   down(&dev->cmd.event_sem);
+   dev->cmd.free_head = -1;
 
kfree(dev->cmd.context);
 }
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h 
b/drivers/infiniband/hw/mthca/mthca_dev.h
index 87ab964..2fc86db 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -46,6 +46,7 @@
 #include 
 #include 
 
+#include 
 #include "mthca_provider.h"
 #include "mthca_doorbell.h"
 
@@ -121,7 +122,7 @@ struct mthca_cmd {
struct pci_pool  *pool;
struct mutex  hcr_mutex;
struct mutex  poll_mutex;
-   struct semaphore  event_sem;
+   wait_queue_head_t wq;
int   max_cmds;
spinlock_tcontext_lock;
int   free_head;



[PATCH v4 02/10] IB/core: Replace semaphore sm_sem with an atomic wait

2016-10-27 Thread Binoy Jayan
The semaphore 'sm_sem' is used to claim exclusive ownership of the device,
so model it as an atomic flag bit with an associated wait_event.
Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/core/user_mad.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)
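
Note for readers of the archive: the claim logic in this patch is an atomic test-and-set. The opener that finds the flag clear wins ownership; a non-blocking opener that finds it set gets -EAGAIN. The sketch below shows only the non-blocking path as a userspace analogy using C11 atomics; `demo_port` and the `demo_*` functions are hypothetical names.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for ib_umad_port; only the claim flag. */
struct demo_port {
	atomic_flag claimed;
};

void demo_port_init(struct demo_port *port)
{
	atomic_flag_clear(&port->claimed);
}

/* O_NONBLOCK path: either claim the port now or fail immediately. */
int demo_sm_open_nonblock(struct demo_port *port)
{
	if (atomic_flag_test_and_set(&port->claimed))
		return -EAGAIN;		/* someone else owns the port */
	return 0;
}

void demo_sm_close(struct demo_port *port)
{
	atomic_flag_clear(&port->claimed);
	/* the kernel code follows clear_bit() with wake_up() for blocked openers */
}
```

In the patch, the blocking path puts the same test-and-set inside wait_event_interruptible(), so a sleeper retries the claim each time the owner clears the bit and calls wake_up().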

diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index 415a318..6101c0a 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -67,6 +67,8 @@ enum {
IB_UMAD_MINOR_BASE = 0
 };
 
+#define UMAD_F_CLAIM   0x01
+
 /*
  * Our lifetime rules for these structs are the following:
  * device special file is opened, we take a reference on the
@@ -87,7 +89,8 @@ struct ib_umad_port {
 
struct cdev   sm_cdev;
struct device *sm_dev;
-   struct semaphore   sm_sem;
+   wait_queue_head_t wq;
+   unsigned long flags;
 
struct mutex   file_mutex;
struct list_head   file_list;
@@ -1030,12 +1033,14 @@ static int ib_umad_sm_open(struct inode *inode, struct 
file *filp)
port = container_of(inode->i_cdev, struct ib_umad_port, sm_cdev);
 
if (filp->f_flags & O_NONBLOCK) {
-   if (down_trylock(&port->sm_sem)) {
+   if (test_and_set_bit(UMAD_F_CLAIM, &port->flags)) {
ret = -EAGAIN;
goto fail;
}
} else {
-   if (down_interruptible(&port->sm_sem)) {
+   if (wait_event_interruptible(port->wq,
+!test_and_set_bit(UMAD_F_CLAIM,
+&port->flags))) {
ret = -ERESTARTSYS;
goto fail;
}
@@ -1060,7 +1065,8 @@ static int ib_umad_sm_open(struct inode *inode, struct 
file *filp)
ib_modify_port(port->ib_dev, port->port_num, 0, &props);
 
 err_up_sem:
-   up(&port->sm_sem);
+   clear_bit(UMAD_F_CLAIM, &port->flags);
+   wake_up(&port->wq);
 
 fail:
return ret;
@@ -1079,7 +1085,8 @@ static int ib_umad_sm_close(struct inode *inode, struct 
file *filp)
ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props);
mutex_unlock(&port->file_mutex);
 
-   up(&port->sm_sem);
+   clear_bit(UMAD_F_CLAIM, &port->flags);
+   wake_up(&port->wq);
 
kobject_put(&port->umad_dev->kobj);
 
@@ -1177,7 +1184,8 @@ static int ib_umad_init_port(struct ib_device *device, 
int port_num,
 
port->ib_dev   = device;
port->port_num = port_num;
-   sema_init(&port->sm_sem, 1);
+   init_waitqueue_head(&port->wq);
+   __clear_bit(UMAD_F_CLAIM, &port->flags);
mutex_init(&port->file_mutex);
INIT_LIST_HEAD(&port->file_list);
 



[PATCH v4 00/10] infiniband: Remove semaphores

2016-10-27 Thread Binoy Jayan
Hi,

This is a set of patches [v4] that removes semaphores from infiniband.
They are part of a larger effort to eliminate all semaphores from the
Linux kernel.

v3 -> v4:

IB/mlx5: Added patch - Replace semaphore umr_common:sem with wait_event
IB/mlx5: Fixed a bug pointed out by Leon Romanovsky

Thanks,
Binoy

Binoy Jayan (10):
  IB/core: iwpm_nlmsg_request: Replace semaphore with completion
  IB/core: Replace semaphore sm_sem with an atomic wait
  IB/hns: Replace semaphore poll_sem with mutex
  IB/mthca: Replace semaphore poll_sem with mutex
  IB/isert: Replace semaphore sem with completion
  IB/hns: Replace counting semaphore event_sem with wait_event
  IB/mthca: Replace counting semaphore event_sem with wait_event
  IB/mlx5: Add helper mlx5_ib_post_send_wait
  IB/mlx5: Replace semaphore umr_common:sem with wait_event
  IB/mlx5: Simplify completion into a wait_event

 drivers/infiniband/core/iwpm_msg.c  |   8 +-
 drivers/infiniband/core/iwpm_util.c |   7 +-
 drivers/infiniband/core/iwpm_util.h |   3 +-
 drivers/infiniband/core/user_mad.c  |  20 +++--
 drivers/infiniband/hw/hns/hns_roce_cmd.c|  57 -
 drivers/infiniband/hw/hns/hns_roce_device.h |   5 +-
 drivers/infiniband/hw/mlx5/main.c   |   6 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h|   9 +-
 drivers/infiniband/hw/mlx5/mr.c | 124 +---
 drivers/infiniband/hw/mthca/mthca_cmd.c |  57 -
 drivers/infiniband/hw/mthca/mthca_cmd.h |   1 +
 drivers/infiniband/hw/mthca/mthca_dev.h |   5 +-
 drivers/infiniband/ulp/isert/ib_isert.c |   6 +-
 drivers/infiniband/ulp/isert/ib_isert.h |   3 +-
 include/rdma/ib_verbs.h |   1 +
 15 files changed, 153 insertions(+), 159 deletions(-)




[PATCH v4 04/10] IB/mthca: Replace semaphore poll_sem with mutex

2016-10-27 Thread Binoy Jayan
The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future, so replace it with a mutex. Also,
remove the down() and up() calls on poll_sem from mthca_cmd_use_events and
mthca_cmd_use_polling respectively.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/mthca/mthca_cmd.c | 10 +++---
 drivers/infiniband/hw/mthca/mthca_cmd.h |  1 +
 drivers/infiniband/hw/mthca/mthca_dev.h |  2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c 
b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..49c6e19 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -347,7 +347,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
unsigned long end;
u8 status;
 
-   down(&dev->cmd.poll_sem);
+   mutex_lock(&dev->cmd.poll_mutex);
 
err = mthca_cmd_post(dev, in_param,
 out_param ? *out_param : 0,
@@ -382,7 +382,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
}
 
 out:
-   up(&dev->cmd.poll_sem);
+   mutex_unlock(&dev->cmd.poll_mutex);
return err;
 }
 
@@ -520,7 +520,7 @@ static int mthca_cmd_imm(struct mthca_dev *dev,
 int mthca_cmd_init(struct mthca_dev *dev)
 {
mutex_init(&dev->cmd.hcr_mutex);
-   sema_init(&dev->cmd.poll_sem, 1);
+   mutex_init(&dev->cmd.poll_mutex);
dev->cmd.flags = 0;
 
dev->hcr = ioremap(pci_resource_start(dev->pdev, 0) + MTHCA_HCR_BASE,
@@ -582,8 +582,6 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
 
dev->cmd.flags |= MTHCA_CMD_USE_EVENTS;
 
-   down(&dev->cmd.poll_sem);
-
return 0;
 }
 
@@ -600,8 +598,6 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)
down(&dev->cmd.event_sem);
 
kfree(dev->cmd.context);
-
-   up(&dev->cmd.poll_sem);
 }
 
 struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev *dev,
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h 
b/drivers/infiniband/hw/mthca/mthca_cmd.h
index d2e5b19..a7f197e 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.h
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.h
@@ -35,6 +35,7 @@
 #ifndef MTHCA_CMD_H
 #define MTHCA_CMD_H
 
+#include 
 #include 
 
 #define MTHCA_MAILBOX_SIZE 4096
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h 
b/drivers/infiniband/hw/mthca/mthca_dev.h
index 4393a02..87ab964 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -120,7 +120,7 @@ enum {
 struct mthca_cmd {
struct pci_pool  *pool;
struct mutex  hcr_mutex;
-   struct semaphore  poll_sem;
+   struct mutex  poll_mutex;
struct semaphore  event_sem;
int   max_cmds;
spinlock_t context_lock;



[PATCH v4 10/10] IB/mlx5: Simplify completion into a wait_event

2016-10-27 Thread Binoy Jayan
Convert the completion 'mlx5_ib_umr_context:done' to a wait_event as it
just waits for the return value to be filled.
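
The idea above can be sketched in userspace as a waiter that sleeps until
a status field leaves its "none" value. This is only an illustration under
assumptions: pthread primitives stand in for the kernel wait queue, and
every toy_* name and TOY_* constant is hypothetical, not part of mlx5.

```c
#include <pthread.h>

/* Hypothetical status values; TOY_STATUS_NONE plays the role of the
 * "not filled in yet" marker that the patch adds to the status enum. */
enum toy_status { TOY_STATUS_NONE = -1, TOY_SUCCESS = 0, TOY_ERROR = 1 };

/* Userspace stand-in for a context holding a status and a wait queue. */
struct toy_context {
    pthread_mutex_t lock;
    pthread_cond_t  wq;
    enum toy_status status;
};

static void toy_init_context(struct toy_context *ctx)
{
    pthread_mutex_init(&ctx->lock, NULL);
    pthread_cond_init(&ctx->wq, NULL);
    ctx->status = TOY_STATUS_NONE;
}

/* Completion-handler side: publish the status, then wake the waiters. */
static void toy_done(struct toy_context *ctx, enum toy_status st)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->status = st;
    pthread_cond_broadcast(&ctx->wq);
    pthread_mutex_unlock(&ctx->lock);
}

/* Waiter side: sleep until the status has been filled in, then return it.
 * The loop re-checks the condition after every wakeup. */
static enum toy_status toy_wait(struct toy_context *ctx)
{
    enum toy_status st;

    pthread_mutex_lock(&ctx->lock);
    while (ctx->status == TOY_STATUS_NONE)
        pthread_cond_wait(&ctx->wq, &ctx->lock);
    st = ctx->status;
    pthread_mutex_unlock(&ctx->lock);
    return st;
}
```

The status itself is the wait condition here, so no separate "done" flag
is needed, which is the simplification this patch is after.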

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +-
 drivers/infiniband/hw/mlx5/mr.c  | 9 +
 include/rdma/ib_verbs.h  | 1 +
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h 
b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index de31b5f..cf496b5 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -524,7 +524,7 @@ struct mlx5_ib_mw {
 struct mlx5_ib_umr_context {
struct ib_cqe   cqe;
enum ib_wc_status   status;
-   struct completion   done;
+   wait_queue_head_t   wq;
 };
 
 struct umr_common {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index dfaf6f6..49ff2af 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -846,14 +846,14 @@ static void mlx5_ib_umr_done(struct ib_cq *cq, struct 
ib_wc *wc)
container_of(wc->wr_cqe, struct mlx5_ib_umr_context, cqe);
 
context->status = wc->status;
-   complete(&context->done);
+   wake_up(&context->wq);
 }
 
 static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context 
*context)
 {
context->cqe.done = mlx5_ib_umr_done;
-   context->status = -1;
-   init_completion(&context->done);
+   context->status = IB_WC_STATUS_NONE;
+   init_waitqueue_head(&context->wq);
 }
 
 static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
@@ -873,7 +873,8 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev 
*dev,
if (err) {
mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
} else {
-   wait_for_completion(&umr_context.done);
+   wait_event(umr_context.wq,
+  umr_context.status != IB_WC_STATUS_NONE);
if (umr_context.status != IB_WC_SUCCESS) {
mlx5_ib_warn(dev, "reg umr failed (%u)\n",
 umr_context.status);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..8b15b6f 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -823,6 +823,7 @@ struct ib_ah_attr {
 };
 
 enum ib_wc_status {
+   IB_WC_STATUS_NONE = -1,
IB_WC_SUCCESS,
IB_WC_LOC_LEN_ERR,
IB_WC_LOC_QP_OP_ERR,



[PATCH v4 08/10] IB/mlx5: Add helper mlx5_ib_post_send_wait

2016-10-27 Thread Binoy Jayan
Clean up the following common code (to post a list of work requests to the
send queue of the specified QP) at various places and add a helper function
'mlx5_ib_post_send_wait' to implement the same.

 - Initialize 'mlx5_ib_umr_context' on stack
 - Assign 'mlx5_umr_wr:wr:wr_cqe' to umr_context.cqe
 - Acquire the semaphore
 - Call ib_post_send with a single ib_send_wr
 - wait_for_completion()
 - Check for umr_context.status
 - Release the semaphore

As semaphores are going away in the future, moving all of these into the
shared helper leaves only a single function using the semaphore, which
can then be rewritten to use something else.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/mlx5/mr.c | 115 +++-
 1 file changed, 32 insertions(+), 83 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d4ad672..1593856 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -856,16 +856,40 @@ static inline void mlx5_ib_init_umr_context(struct 
mlx5_ib_umr_context *context)
init_completion(&context->done);
 }
 
+static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
+struct mlx5_umr_wr *umrwr)
+{
+   struct umr_common *umrc = &dev->umrc;
+   struct ib_send_wr *bad;
+   int err;
+   struct mlx5_ib_umr_context umr_context;
+
+   mlx5_ib_init_umr_context(&umr_context);
+   umrwr->wr.wr_cqe = &umr_context.cqe;
+
+   down(&umrc->sem);
+   err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
+   if (err) {
+   mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
+   } else {
+   wait_for_completion(&umr_context.done);
+   if (umr_context.status != IB_WC_SUCCESS) {
+   mlx5_ib_warn(dev, "reg umr failed (%u)\n",
+umr_context.status);
+   err = -EFAULT;
+   }
+   }
+   up(&umrc->sem);
+   return err;
+}
+
 static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
  u64 virt_addr, u64 len, int npages,
  int page_shift, int order, int access_flags)
 {
struct mlx5_ib_dev *dev = to_mdev(pd->device);
struct device *ddev = dev->ib_dev.dma_device;
-   struct umr_common *umrc = &dev->umrc;
-   struct mlx5_ib_umr_context umr_context;
struct mlx5_umr_wr umrwr = {};
-   struct ib_send_wr *bad;
struct mlx5_ib_mr *mr;
struct ib_sge sg;
int size;
@@ -894,24 +918,12 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, 
struct ib_umem *umem,
if (err)
goto free_mr;
 
-   mlx5_ib_init_umr_context(&umr_context);
-
-   umrwr.wr.wr_cqe = &umr_context.cqe;
prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
 page_shift, virt_addr, len, access_flags);
 
-   down(&umrc->sem);
-   err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
-   if (err) {
-   mlx5_ib_warn(dev, "post send failed, err %d\n", err);
+   err = mlx5_ib_post_send_wait(dev, &umrwr);
+   if (err && err != -EFAULT)
goto unmap_dma;
-   } else {
-   wait_for_completion(&umr_context.done);
-   if (umr_context.status != IB_WC_SUCCESS) {
-   mlx5_ib_warn(dev, "reg umr failed\n");
-   err = -EFAULT;
-   }
-   }
 
mr->mmkey.iova = virt_addr;
mr->mmkey.size = len;
@@ -920,7 +932,6 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct 
ib_umem *umem,
mr->live = 1;
 
 unmap_dma:
-   up(&umrc->sem);
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
 
kfree(mr_pas);
@@ -940,13 +951,10 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 
start_page_index, int npages,
 {
struct mlx5_ib_dev *dev = mr->dev;
struct device *ddev = dev->ib_dev.dma_device;
-   struct umr_common *umrc = &dev->umrc;
-   struct mlx5_ib_umr_context umr_context;
struct ib_umem *umem = mr->umem;
int size;
__be64 *pas;
dma_addr_t dma;
-   struct ib_send_wr *bad;
struct mlx5_umr_wr wr;
struct ib_sge sg;
int err = 0;
@@ -1011,10 +1019,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 
start_page_index, int npages,
 
dma_sync_single_for_device(ddev, dma, size, DMA_TO_DEVICE);
 
-   mlx5_ib_init_umr_context(&umr_context);
-
memset(&wr, 0, sizeof(wr));
-   wr.wr.wr_cqe = &umr_context.cqe;
 
sg.addr = dma;
 

[PATCH v4 05/10] IB/isert: Replace semaphore sem with completion

2016-10-27 Thread Binoy Jayan
The semaphore 'sem' in isert_device is used as completion, so convert
it to struct completion. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/ulp/isert/ib_isert.c | 6 +++---
 drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
b/drivers/infiniband/ulp/isert/ib_isert.c
index 6dd43f6..de80f56 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -619,7 +619,7 @@
mutex_unlock(&isert_np->mutex);
 
isert_info("np %p: Allow accept_np to continue\n", isert_np);
-   up(&isert_np->sem);
+   complete(&isert_np->comp);
 }
 
 static void
@@ -2311,7 +2311,7 @@ struct rdma_cm_id *
isert_err("Unable to allocate struct isert_np\n");
return -ENOMEM;
}
-   sema_init(&isert_np->sem, 0);
+   init_completion(&isert_np->comp);
mutex_init(&isert_np->mutex);
INIT_LIST_HEAD(&isert_np->accepted);
INIT_LIST_HEAD(&isert_np->pending);
@@ -2427,7 +2427,7 @@ struct rdma_cm_id *
int ret;
 
 accept_wait:
-   ret = down_interruptible(&isert_np->sem);
+   ret = wait_for_completion_interruptible(&isert_np->comp);
if (ret)
return -ENODEV;
 
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h 
b/drivers/infiniband/ulp/isert/ib_isert.h
index c02ada5..a1277c0 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -190,7 +191,7 @@ struct isert_device {
 
 struct isert_np {
struct iscsi_np *np;
-   struct semaphore    sem;
+   struct completion   comp;
struct rdma_cm_id   *cm_id;
struct mutex        mutex;
struct list_head    accepted;



[PATCH v4 09/10] IB/mlx5: Replace semaphore umr_common:sem with wait_event

2016-10-27 Thread Binoy Jayan
Remove semaphore umr_common:sem used to limit concurrent access to umr qp
and introduce an atomic value 'users' to keep track of the same. Use a
wait_event to block when the limit is reached.
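
The counting scheme above can be sketched in userspace with C11 atomics.
This is a rough illustration under assumptions: toy_add_unless() is a
hand-written analogue of the kernel's atomic_add_unless(), TOY_MAX_USERS
stands in for MAX_UMR_WR with a small demo value, and the sleeping done by
wait_event() is left out, so a failed attempt simply returns false.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_MAX_USERS 2   /* stands in for MAX_UMR_WR; demo value only */

/* Userspace analogue of atomic_add_unless(v, a, u): add 'a' to *v unless
 * *v == u. Returns true if the addition was performed. */
static bool toy_add_unless(atomic_int *v, int a, int u)
{
    int old = atomic_load(v);

    while (old != u) {
        /* On failure, 'old' is refreshed with the current value of *v. */
        if (atomic_compare_exchange_weak(v, &old, old + a))
            return true;
    }
    return false;
}

/* Try to take one of the TOY_MAX_USERS slots. A kernel caller would use
 * this as the wait_event() condition and sleep until it succeeds. */
static bool toy_get_slot(atomic_int *users)
{
    return toy_add_unless(users, 1, TOY_MAX_USERS);
}

/* Give the slot back. The kernel code follows this with wake_up() so that
 * a sleeping caller can retry. */
static void toy_put_slot(atomic_int *users)
{
    atomic_fetch_sub(users, 1);
}
```

Note that the sketch initializes the counter to zero explicitly before
first use (atomic_init() in userspace); the scheme relies on the count
starting at zero.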

Signed-off-by: Binoy Jayan 
---
 drivers/infiniband/hw/mlx5/main.c| 6 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 7 ++-
 drivers/infiniband/hw/mlx5/mr.c  | 6 --
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index 2217477..eb72bff 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2437,10 +2437,6 @@ static void destroy_umrc_res(struct mlx5_ib_dev *dev)
ib_dealloc_pd(dev->umrc.pd);
 }
 
-enum {
-   MAX_UMR_WR = 128,
-};
-
 static int create_umr_res(struct mlx5_ib_dev *dev)
 {
struct ib_qp_init_attr *init_attr = NULL;
@@ -2520,7 +2516,7 @@ static int create_umr_res(struct mlx5_ib_dev *dev)
dev->umrc.cq = cq;
dev->umrc.pd = pd;
 
-   sema_init(&dev->umrc.sem, MAX_UMR_WR);
+   init_waitqueue_head(&dev->umrc.wq);
ret = mlx5_mr_cache_init(dev);
if (ret) {
mlx5_ib_warn(dev, "mr cache init failed %d\n", ret);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h 
b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index dcdcd19..de31b5f 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -533,7 +533,12 @@ struct umr_common {
struct ib_qp    *qp;
/* control access to UMR QP
 */
-   struct semaphore    sem;
+   wait_queue_head_t   wq;
+   atomic_t            users;
+};
+
+enum {
+   MAX_UMR_WR = 128,
 };
 
 enum {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 1593856..dfaf6f6 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -867,7 +867,8 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev 
*dev,
mlx5_ib_init_umr_context(&umr_context);
umrwr->wr.wr_cqe = &umr_context.cqe;
 
-   down(&umrc->sem);
+   /* limit number of concurrent ib_post_send() on qp */
+   wait_event(umrc->wq, atomic_add_unless(&umrc->users, 1, MAX_UMR_WR));
err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
if (err) {
mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
@@ -879,7 +880,8 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev 
*dev,
err = -EFAULT;
}
}
-   up(&umrc->sem);
+   atomic_dec(&umrc->users);
+   wake_up(&umrc->wq);
return err;
 }
 



Re: [PATCH v4 05/10] IB/isert: Replace semaphore sem with completion

2016-11-17 Thread Binoy Jayan
Hi Sagi,

On 31 October 2016 at 02:42, Sagi Grimberg  wrote:
>> The semaphore 'sem' in isert_device is used as completion, so convert
>> it to struct completion. Semaphores are going away in the future.
>
>
> Umm, this is 100% *not* true. np->sem is designed as a counting semaphore
> to sync the iscsi login thread with the connect requests coming from the
> initiators. So this is actually a reliable bug insertion :(
>
> NAK from me...

Sorry for the late reply; I was held up in other activities.

I converted this to a wait_event() implementation but as I was doing it,
I was wondering how it would have been different if it was a completion
and not a semaphore.

File: drivers/infiniband/ulp/isert/ib_isert.c

If isert_connected_handler() is called multiple times, each call adding an
entry to the list, and we use a completion, then 'done' (part of struct
completion) is incremented by 1 each time 'complete' is called from
isert_connected_handler. After 'n' calls, 'done' will be equal to 'n'. If we
now call wait_for_completion from isert_accept_np, it just decrements
'done' by one and continues without blocking, consuming one node at a time
from the list 'isert_np->pending'.

Alternatively, if 'done' is zero the next time wait_for_completion is
called, the API adds a node at the end of the wait queue 'wait' in 'struct
completion' and blocks until 'done' is nonzero. (Ref: do_wait_for_common)
It exits the wait when a call to 'complete' turns 'done' back to 1.
If multiple waits are called before complete, all the tasks calling the
wait are queued up and they all see 'done' set to zero. When complete
is called now, 'done' turns 1 again and the first task in the queue is woken
up, as the queue is serialized as FIFO. The first wait then returns, and
'done' is decremented by 1 just before the return.
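
The counting behavior described above can be sketched in userspace. This
is a simplified model under assumptions, not the kernel implementation:
a pthread mutex and condition variable stand in for the completion's
internal wait queue, all toy_* names are hypothetical, and
pthread_cond_signal() does not promise the FIFO wakeup order discussed
above.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for struct completion: a counter plus
 * something to sleep on. */
struct toy_completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned int    done;   /* complete() calls not yet consumed */
};

static void toy_init_completion(struct toy_completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = 0;
}

/* Like complete(): bump the count and wake one waiter. */
static void toy_complete(struct toy_completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Like wait_for_completion(): block while 'done' is zero, then consume
 * exactly one count before returning. */
static void toy_wait_for_completion(struct toy_completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->cond, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}

/* Non-blocking helper, used only to observe the count in this sketch:
 * consumes one count and returns 1 if any is pending, else returns 0. */
static int toy_try_wait(struct toy_completion *c)
{
    int ok;

    pthread_mutex_lock(&c->lock);
    ok = c->done > 0;
    if (ok)
        c->done--;
    pthread_mutex_unlock(&c->lock);
    return ok;
}
```

With this model, three toy_complete() calls followed by three
toy_wait_for_completion() calls never block, and a fourth wait would sleep
until the next toy_complete(), which is the same accounting a counting
semaphore initialized to zero would give.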

Am I missing something here?

Thanks,
Binoy


Re: [PATCH v4 05/10] IB/isert: Replace semaphore sem with completion

2016-11-18 Thread Binoy Jayan
Hi Arnd,

On 18 November 2016 at 14:28, Arnd Bergmann  wrote:
> On Friday, November 18, 2016 12:27:32 PM CET Binoy Jayan wrote:
>> Hi Sagi,

> I think you are right. This behavior is actually documented in
> Documentation/scheduler/completion.txt:

Thanks for having a look.

> However, this is fairly unusual behavior and I wasn't immediately aware
> of it either when I read Sagi's reply. While your patch looks correct,
> it's probably a good idea to point out the counting behavior of this
> completion as explicitly as possible, in the changelog text of the patch
> as well as in a code comment and perhaps in the naming of the completion.

Will mention this and resend the patch series.

Thanks,
Binoy


Re: [PATCH 1/2] scsi: smartpqi: Replace semaphore sync_request_sem with mutex

2016-10-24 Thread Binoy Jayan
Hi Arnd

On 20 October 2016 at 14:36, Arnd Bergmann  wrote:
> On Thursday, October 20, 2016 2:24:01 PM CEST Binoy Jayan wrote:
>> Semaphores are going away in the future, so replace the semaphore
>> sync_request_sem with a mutex lock. timeout_msecs is not used
>> for the lock sync_request_sem, so remove the timed locking too.
>>
>> Signed-off-by: Binoy Jayan 
>
> The patch looks correct to me, but I think if you remove the support
> for handling timeouts, you should update the prototype of
> pqi_submit_raid_request_synchronous to no longer pass the timeout
> argument in the first place.

But we still need "timeout_msecs" in a call to
pqi_submit_raid_request_synchronous_with_io_request()

drivers/scsi/smartpqi/smartpqi_init.c +3484

-Binoy

