Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open

2019-01-18 Thread Kohli, Gaurav
On 1/18/2019 2:57 PM, Li RongQing wrote: There still is a race window after the commit b027e2298bd588 ("tty: fix data race between tty_init_dev and flush of buf"), and we encountered this crash issue if receive_buf call comes before tty initialization completes in n_tty_open and tty->driver_da

Re: kernel BUG at kernel/sched/core.c:3490!

2019-01-12 Thread Kohli, Gaurav
deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK); Regards Gaurav On 1/11/2019 9:47 PM, Qian Cai wrote: On Fri, 2019-01-11 at 16:07 +0530, Kohli, Gaurav wrote: On 1/7/2019 11:26 PM, Oleg Nesterov wrote: pr_crit("XXX: %ld %d\n", current->state, current->o

Re: [PATCH][v2] tty: fix race between flush_to_ldisc and tty_open

2019-01-11 Thread Kohli, Gaurav
Hi, it doesn't seem to be a good idea to take the lock in one function and release it in another. If someone has to call tty_init_dev in the future, how will they keep track of unlocking the ldisc lock as well? Regards Gaurav On 12/24/2018 8:13 AM, Li RongQing wrote: There still is a race window after
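The locking objection in the reply above can be illustrated with a small userspace sketch. This is not the patch under review and not kernel code: every `toy_*` name is a hypothetical stand-in, and the "lock" is a plain counter so that ownership can be checked without threads. It only contrasts an init routine that returns with the lock held against one that pairs lock and unlock in the same function.

```c
/* Userspace sketch only. All toy_* names are hypothetical; the real
 * functions under discussion are tty_init_dev() and the ldisc lock
 * helpers, whose code is not reproduced here. */
#include <assert.h>

struct toy_tty {
    int ldisc_lock_held;   /* 1 while the toy lock is held */
    int initialized;       /* set once init has run */
};

static void toy_ldisc_lock(struct toy_tty *t)
{
    assert(t->ldisc_lock_held == 0);   /* catch double lock */
    t->ldisc_lock_held = 1;
}

static void toy_ldisc_unlock(struct toy_tty *t)
{
    assert(t->ldisc_lock_held == 1);   /* catch unlock without lock */
    t->ldisc_lock_held = 0;
}

/* Split pattern: the function returns with the lock still held, so
 * every present and future caller must remember to unlock. */
static void toy_init_dev_returns_locked(struct toy_tty *t)
{
    toy_ldisc_lock(t);
    t->initialized = 1;
    /* no unlock here: the obligation moves to the caller */
}

/* Paired pattern: lock and unlock live in the same function, so a new
 * caller cannot forget the release. */
static void toy_init_dev_paired(struct toy_tty *t)
{
    toy_ldisc_lock(t);
    t->initialized = 1;
    toy_ldisc_unlock(t);
}
```

With the split variant, a caller that omits `toy_ldisc_unlock()` leaves `ldisc_lock_held` at 1, which is the tracking burden the reply is worried about.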

Re: kernel BUG at kernel/sched/core.c:3490!

2019-01-11 Thread Kohli, Gaurav
On 1/7/2019 11:26 PM, Oleg Nesterov wrote: pr_crit("XXX: %ld %d\n", current->state, current->on_rq); Can we also add flags, this may help to know the path of problem: pr_crit("XXX: %ld %d 0x%x\n", current->state, current->on_rq, current->flags); -- Qualcomm India Private Limited, on b

Re: [PATCH] percpu_counter: Remove debug_object_free call twice

2018-08-26 Thread Kohli, Gaurav
Hi, Sorry for the very late reminder; I just wanted to know whether my understanding of the code below is wrong. Regards Gaurav On 4/17/2018 11:59 AM, Kohli, Gaurav wrote: On 4/17/2018 3:18 AM, Tejun Heo wrote: On Fri, Apr 13, 2018 at 03:05:03PM +0530, Gaurav Kohli wrote: During percpu_counter destroy

Re: [PATCH v2] timers: Clear must_forward_clk inside base lock

2018-08-02 Thread Kohli, Gaurav
On 8/2/2018 12:04 PM, Thomas Gleixner wrote: On Thu, 2 Aug 2018, Gaurav Kohli wrote: Timer wheel base->must_forward_clock is indicating that the base clock might be stale due to a long idle sleep. The forwarding of base clock takes place in softirq of timer of the base clock takes place in

Re: [PATCH] timers: Clear must_forward_clk inside base lock

2018-07-30 Thread Kohli, Gaurav
Hi John, Thomas, Can you please review the patch below and share your comments: Regards Gaurav On 7/26/2018 2:12 PM, Gaurav Kohli wrote: While migrating timer to new base, there is a need to update base clk by calling forward_timer_base to avoid stale clock, but at the same time if run_timer is

Re: [PATCH 4.16 161/279] kthread, sched/wait: Fix kthread_parkme() completion issue

2018-06-21 Thread Kohli, Gaurav
Hi Greg, Yes, more patches related to this are coming: https://lkml.org/lkml/2018/6/7/317 So I thought it would be good if they all go together. Regards Gaurav On 6/21/2018 12:21 AM, Greg Kroah-Hartman wrote: On Wed, Jun 20, 2018 at 12:10:31PM +0530, Kohli, Gaurav wrote: Hi Greg, As more patches

Re: [PATCH 4.16 161/279] kthread, sched/wait: Fix kthread_parkme() completion issue

2018-06-19 Thread Kohli, Gaurav
Hi Greg, As more patches related to this are coming in 4.17, it is better if they all go together on different branches; please suggest. Regards Gaurav On 6/18/2018 1:42 PM, Greg Kroah-Hartman wrote: 4.16-stable review patch. If anyone has any objections, please let me know. --

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-06-07 Thread Kohli, Gaurav
Hi, In the latest patch mentioned, k should be there instead of p: -WARN_ON_ONCE(!wait_task_inactive(p, TASK_PARKED)) +WARN_ON_ONCE(!wait_task_inactive(k, TASK_PARKED)) Regards Gaurav On 6/7/2018 12:29 AM, Peter Zijlstra wrote: On Wed, Jun 06, 2018 at 03:51:16PM +0200, Oleg Nesterov wrote:

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-06-05 Thread Kohli, Gaurav
Hi, Just for info , the patch that I have shared earlier with pi_lock approach has been tested since last one month and no issue has been observed, https://lkml.org/lkml/2018/4/25/189 Can we take this if it looks good? Regards Gaurav On 6/5/2018 10:05 PM, Oleg Nesterov wrote: On 06/05, Pe

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-06-05 Thread Kohli, Gaurav
ssue. Regards Gaurav On 5/7/2018 4:53 PM, Kohli, Gaurav wrote: Corrected the formatting, Sorry for spam. HI Peter, We have tested with new patch and still seeing same issue, in this dumps we don't have debug traces, but seems there still exist race from code review , Can you p

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-07 Thread Kohli, Gaurav
Corrected the formatting, Sorry for spam. HI Peter, We have tested with new patch and still seeing same issue, in this dumps we don't have debug traces, but seems there still exist race from code review , Can you please check it once: Controller Thread   CPUHP

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-07 Thread Kohli, Gaurav
On 5/2/2018 3:43 PM, Kohli, Gaurav wrote: On 5/2/2018 1:50 PM, Peter Zijlstra wrote: On Wed, May 02, 2018 at 10:45:52AM +0530, Kohli, Gaurav wrote: On 5/1/2018 6:49 PM, Peter Zijlstra wrote:    - complete(&kthread->parked), which we can do inside schedule(); this solves the

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-02 Thread Kohli, Gaurav
On 5/2/2018 1:50 PM, Peter Zijlstra wrote: On Wed, May 02, 2018 at 10:45:52AM +0530, Kohli, Gaurav wrote: On 5/1/2018 6:49 PM, Peter Zijlstra wrote: - complete(&kthread->parked), which we can do inside schedule(); this solves the problem because then kthread_park() will not

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-01 Thread Kohli, Gaurav
On 5/1/2018 6:49 PM, Peter Zijlstra wrote: On 5/1/2018 5:01 PM, Peter Zijlstra wrote: Let me ponder what the best solution is, it's a bit of a mess. So there's: - TASK_PARKED, which we can make a special state; this solves the problem because then wait_task_inactive() is guaranteed

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-01 Thread Kohli, Gaurav
On 5/1/2018 5:01 PM, Peter Zijlstra wrote: On Tue, May 01, 2018 at 04:10:53PM +0530, Kohli, Gaurav wrote: Yes with loop, it will reset TASK_PARKED but that is not happening in the dumps we have seen. But was that with or without the fixed wait-loop? I don't care about stuff you might

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-01 Thread Kohli, Gaurav
On 5/1/2018 3:48 PM, Peter Zijlstra wrote: On Tue, May 01, 2018 at 01:20:26PM +0530, Kohli, Gaurav wrote: But in our older case, where we have seen the failure, below is the wakeup path and ftraces. The wakeup occurred and completed before the schedule call. So the final state of CPUHP is running not

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-05-01 Thread Kohli, Gaurav
sorry for spam, Adding list On 4/30/2018 4:47 PM, Peter Zijlstra wrote: On Thu, Apr 26, 2018 at 09:23:25PM +0530, Kohli, Gaurav wrote: On 4/26/2018 2:27 PM, Peter Zijlstra wrote: On Thu, Apr 26, 2018 at 10:41:31AM +0200, Peter Zijlstra wrote: diff --git a/kernel/kthread.c b/kernel/kthread.c

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-04-26 Thread Kohli, Gaurav
On 4/26/2018 2:27 PM, Peter Zijlstra wrote: On Thu, Apr 26, 2018 at 10:41:31AM +0200, Peter Zijlstra wrote: diff --git a/kernel/kthread.c b/kernel/kthread.c index cd50e99202b0..4b6503c6a029 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -177,12 +177,13 @@ void *kthread_probe_data(struc

Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-04-25 Thread Kohli, Gaurav
On 4/26/2018 1:39 AM, Peter Zijlstra wrote: On Wed, Apr 25, 2018 at 02:03:19PM +0530, Gaurav Kohli wrote: diff --git a/kernel/smpboot.c b/kernel/smpboot.c index 5043e74..c5c5184 100644 --- a/kernel/smpboot.c +++ b/kernel/smpboot.c @@ -122,7 +122,45 @@ static int smpboot_thread_fn(void *data)

Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

2018-04-24 Thread Kohli, Gaurav
On 4/24/2018 11:56 PM, Peter Zijlstra wrote: On Tue, Apr 24, 2018 at 02:58:25PM +0530, Gaurav Kohli wrote: The control cpu thread which initiates hotplug calls kthread_park() for hotplug thread and sets KTHREAD_SHOULD_PARK. After this control thread wakes up the hotplug thread. There is a chanc

Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

2018-04-24 Thread Kohli, Gaurav
Hi , We can also fix below race by smpboot code as well: @@ -109,7 +109,6 @@ static int smpboot_thread_fn(void *data)     struct smp_hotplug_thread *ht = td->ht;     while (1) { -   set_current_state(TASK_INTERRUPTIBLE);     preempt_disable();    

Re: [PATCH] percpu_counter: Remove debug_object_free call twice

2018-04-16 Thread Kohli, Gaurav
On 4/17/2018 3:18 AM, Tejun Heo wrote: On Fri, Apr 13, 2018 at 03:05:03PM +0530, Gaurav Kohli wrote: During percpu_counter destroy, debug_object_free is called twice, which may create a race. So remove one instance of the call from debug_percpu_counter_deactivate. I don't quite follow. Can you p

Re: Query:Regarding percpu_counter debug object destroy

2018-04-13 Thread Kohli, Gaurav
list. Regards Gaurav On 4/13/2018 1:12 PM, Nikolay Borisov wrote: On 13.04.2018 10:32, Kohli, Gaurav wrote: Hi , I have checked below code and it seems we are calling debug_object_free twice, ideally we should deactivate and later we have to destroy. 1st call -> percpu_counter_dest

Query:Regarding percpu_counter debug object destroy

2018-04-13 Thread Kohli, Gaurav
Hi , I have checked below code and it seems we are calling debug_object_free twice, ideally we should deactivate and later we have to destroy. 1st call -> percpu_counter_destroy->debug_percpu_counter_deactivate -> debug_object_free 2nd call -> debug_object_free static bool percpu_counter_fi

Re: [PATCH] mm: oom: Fix race condition between oom_badness and do_exit of task

2018-03-09 Thread Kohli, Gaurav
On 3/9/2018 4:18 PM, Tetsuo Handa wrote: Kohli, Gaurav wrote: t->alloc_lock is still held when leaving find_lock_task_mm(), which means that t->mm != NULL. But nothing prevents t from setting t->mm = NULL at exit_mm() from do_exit() and calling exit_creds() from __put_task_struct

Re: [PATCH] mm: oom: Fix race condition between oom_badness and do_exit of task

2018-03-07 Thread Kohli, Gaurav
On 3/8/2018 2:26 AM, David Rientjes wrote: On Wed, 7 Mar 2018, Gaurav Kohli wrote: diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 6fd9773..5f4cc4b 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -114,9 +114,11 @@ struct task_struct *find_lock_task_mm(struct task_struct *p) for_each_

Re: Query:Regarding object poison overwritten in binder_transaction

2018-03-05 Thread Kohli, Gaurav
Thanks Greg, I will check the common tree, and it seems to me a new bug , will file a bug if won't be able to resolve from android-common tree. Regards Gaurav On 3/4/2018 12:57 AM, Greg Kroah-Hartman wrote: On Sat, Mar 03, 2018 at 08:22:35PM +0530, Kohli, Gaurav wrote: HI , Is

Query Regarding init block up due to tty_wait_until_sent

2018-03-05 Thread Kohli, Gaurav
Hi, We have seen few instances, where init is getting blocked due to wait in below call: -002|schedule() -003|schedule_timeout() -> timeout for 30 seconds -004|tty_wait_until_sent() -005|tty_port_close_start.part.3() -006|tty_port_close() -007|uart_close() -008|tty_name(inline) -008|tty_releas

Query:Regarding object poison overwritten in binder_transaction

2018-03-03 Thread Kohli, Gaurav
HI , Is there any known issue of slab poisoning in binder_transaction variable on kernel 4.9,  it seems owner variable of spinlock is getting corrupted(which is last 8th byte of binder_transaction struct).    368.423462:   <2> [] print_trailer+0x13c/0x214    368.428998:   <2> [] check_bytes_a

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-17 Thread Kohli, Gaurav
On 1/6/2018 1:20 PM, Kohli, Gaurav wrote: On 1/6/2018 2:35 AM, Alan Cox wrote: On Sat, 6 Jan 2018 01:54:36 +0530 "Kohli, Gaurav" wrote: Hi Alan, Sorry correcting the typo here: +retval =  tty_ldisc_lock(tty, 5 * HZ); +if (retval) +     goto err_release_lock; tty->por

Re: Query: Crash is coming during /prod/PID/stat and do_exit of same task

2018-01-16 Thread Kohli, Gaurav
On 1/16/2018 12:50 PM, Alexey Dobriyan wrote: On Tue, Jan 16, 2018 at 11:06:47AM +0530, Kohli, Gaurav wrote: On 1/10/2018 10:50 AM, Alexey Dobriyan wrote: We are seeing crash in do_task_stat while accessing stack pointer, It seems same task has already completed do_exit call. So it seems a

Re: Query: Crash is coming during /prod/PID/stat and do_exit of same task

2018-01-15 Thread Kohli, Gaurav
On 1/10/2018 10:50 AM, Alexey Dobriyan wrote: We are seeing crash in do_task_stat while accessing stack pointer, It seems same task has already completed do_exit call. So it seems a race between them: Please, post exact kernel version and struct task_struct::usage if you still have that kernel

Re: Query: Crash is coming during /prod/PID/stat and do_exit of same task

2018-01-15 Thread Kohli, Gaurav
On 1/15/2018 4:32 PM, John Ogness wrote: Hello Gaurav. On 2018-01-09, Kohli, Gaurav wrote: We are seeing crash in do_task_stat while accessing stack pointer, It seems same task has already completed do_exit call. So it seems a race between them: Below is the crash trace: 49750.534377

Re: Query: Crash is coming during /prod/PID/stat and do_exit of same task

2018-01-15 Thread Kohli, Gaurav
eip = KSTK_EIP(task);     esp = KSTK_ESP(task);     } Regards Gaurav On 1/9/2018 7:03 PM, Kohli, Gaurav wrote: HI , We are seeing crash in do_task_stat while accessing stack pointer, It seems same task has already completed do_exit call. So it seems a race between them:

Query: Crash is coming during /prod/PID/stat and do_exit of same task

2018-01-09 Thread Kohli, Gaurav
Hi, We are seeing crash in do_task_stat while accessing stack pointer, It seems same task has already completed do_exit call. So it seems a race between them: Below is the crash trace: [49750.534377] Kernel BUG at ff8e7a4c53a8 [verbose debug info unavailable] [49750.534394] task: ffe

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-05 Thread Kohli, Gaurav
On 1/6/2018 2:35 AM, Alan Cox wrote: On Sat, 6 Jan 2018 01:54:36 +0530 "Kohli, Gaurav" wrote: Hi Alan, Sorry correcting the typo here: +retval =  tty_ldisc_lock(tty, 5 * HZ); +if (retval) +     goto err_release_lock; tty->port->itty = tty; /* * Structures all instal

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-05 Thread Kohli, Gaurav
HI Alan, Sorry correcting the typo here: On 1/6/2018 1:44 AM, Kohli, Gaurav wrote: Hi Alan, On 1/5/2018 7:45 PM, Alan Cox wrote: But in above case , there we can hit another race, if we have a sequence like this tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will i

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-05 Thread Kohli, Gaurav
elease_lock; +tty_unlock(tty); +release_tty(tty, idx); +tty_ldisc_unlock(tty); +return ERR_PTR(retval); On 1/6/2018 1:44 AM, Kohli, Gaurav wrote: Hi Alan, On 1/5/2018 7:45 PM, Alan Cox wrote: But in above case , there we can hit another race, if we have a sequence like this tty_in

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-05 Thread Kohli, Gaurav
Hi Alan, On 1/5/2018 7:45 PM, Alan Cox wrote: But in above case , there we can hit another race, if we have a sequence like this tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will initialize ldisc , but at this moment disc_data is still NULL And if flush_to_ldisc comes in between, i

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-05 Thread Kohli, Gaurav
On 1/5/2018 7:06 PM, Alan Cox wrote: On Fri, 5 Jan 2018 13:15:45 +0530 "Kohli, Gaurav" wrote: Hi Alan, Can you make that code available otherwise it's impossible to see what the problem might be.   https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/tty/s

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-04 Thread Kohli, Gaurav
Hi Alan, Can you make that code available otherwise it's impossible to see what the problem might be.   https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/tty/serial?h=msm-4.9  As discussed , there not seems a problem as we are getting print request even when port seems to c

Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-04 Thread Kohli, Gaurav
Which tty driver ? serial/msm_serial.c ? We are using our internal driver, msm_geni_serial.c Ok no what I need to see is a trace of what each CPU is doing at the point you detect the problem. That way we can see what the path that races is. Below is stack trace running by init in our cas

Query: Regarding crash in n_tty_receive_buf_common during boot

2017-12-26 Thread Kohli, Gaurav
Hi, We have seen a lot of crashes on 4.9 during boot in n_tty_receive_buf_common, when tty->disc_data becomes NULL. Below is the call stack for the same: [   29.710969] PC is at n_tty_receive_buf_common+0x68/0xa3c [   29.716425] LR is at n_tty_receive_buf_common+0x58/0xa3c [   29.721882] pc : [] lr : []

Query: Notifier support for callback profiling

2017-05-19 Thread Kohli, Gaurav
Hi, Can we profile driver-specific notifier callbacks from the Linux notifier core? Sometimes we need to debug which callback is taking more time or where it is stuck. If kernel/notifier.c provided a generic mechanism to profile callbacks, that would save profiling individual callbacks. -- Qualcomm India P
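As a rough illustration of what the query above is asking for, here is a userspace sketch of a callback chain that times each callback as it runs. It is an assumption-laden toy, not an existing kernel facility: none of the `toy_*` names exist in kernel/notifier.c, and a kernel version would presumably use kernel timekeeping rather than `clock_gettime()`.

```c
/* Userspace sketch only: a toy notifier chain that records how long each
 * callback takes. All toy_* names are hypothetical illustrations. */
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define TOY_MAX_CALLBACKS 8

typedef int (*toy_notifier_fn)(unsigned long event, void *data);

struct toy_notifier {
    toy_notifier_fn fn;
    const char *name;
    uint64_t last_ns;      /* duration of the most recent call */
    uint64_t max_ns;       /* longest duration seen so far */
    unsigned long calls;   /* number of invocations */
};

struct toy_chain {
    struct toy_notifier cb[TOY_MAX_CALLBACKS];
    size_t count;
};

static uint64_t toy_now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Returns 0 on success, -1 if the chain is full. */
static int toy_chain_register(struct toy_chain *c, toy_notifier_fn fn,
                              const char *name)
{
    assert(c != NULL && fn != NULL);
    if (c->count >= TOY_MAX_CALLBACKS)
        return -1;
    c->cb[c->count] = (struct toy_notifier){ .fn = fn, .name = name };
    c->count++;
    return 0;
}

/* Calls every callback in registration order, timing each one.
 * Returns the slowest callback of this pass, or NULL for an empty chain. */
static const struct toy_notifier *toy_chain_call(struct toy_chain *c,
                                                 unsigned long event,
                                                 void *data)
{
    const struct toy_notifier *slowest = NULL;

    assert(c != NULL);
    for (size_t i = 0; i < c->count; i++) {
        struct toy_notifier *n = &c->cb[i];
        uint64_t start = toy_now_ns();

        n->fn(event, data);
        n->last_ns = toy_now_ns() - start;
        if (n->last_ns > n->max_ns)
            n->max_ns = n->last_ns;
        n->calls++;
        if (!slowest || n->last_ns > slowest->last_ns)
            slowest = n;
    }
    return slowest;
}

/* Two sample callbacks: one returns at once, one sleeps for 2 ms. */
static int toy_fast_cb(unsigned long event, void *data)
{
    (void)event; (void)data;
    return 0;
}

static int toy_slow_cb(unsigned long event, void *data)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 2000000 };
    (void)event; (void)data;
    nanosleep(&delay, NULL);
    return 0;
}
```

Registering `toy_fast_cb` and `toy_slow_cb` and then calling `toy_chain_call()` returns the entry for the sleeping callback, with its duration in `last_ns`. A callback that never returns would not be caught by this after-the-fact timing; spotting a stuck callback would need the start time and callback identity recorded before the call.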