RE: [PATCH] ipw2200: le*_add_cpu conversion
On Tuesday, February 12, 2008 3:06 PM, [EMAIL PROTECTED] wrote: > From: Marcin Slusarz <[EMAIL PROTECTED]> > > replace all: > little_endian_variable = > cpu_to_leX(leX_to_cpu(little_endian_variable) + > expression_in_cpu_byteorder); > with: > leX_add_cpu(&little_endian_variable, > expression_in_cpu_byteorder); > generated with semantic patch > > Signed-off-by: Marcin Slusarz <[EMAIL PROTECTED]> > Cc: Zhu Yi <[EMAIL PROTECTED]> > Cc: John W. Linville <[EMAIL PROTECTED]> > Cc: [EMAIL PROTECTED] > --- > drivers/net/wireless/ipw2200.c |4 +--- > 1 files changed, 1 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/wireless/ipw2200.c > b/drivers/net/wireless/ipw2200.c > index 3e6ad7b..5d9854e 100644 > --- a/drivers/net/wireless/ipw2200.c > +++ b/drivers/net/wireless/ipw2200.c > @@ -10326,9 +10326,7 @@ static int ipw_tx_skb(struct ipw_priv > *priv, struct ieee80211_txb *txb, >remaining_bytes, >PCI_DMA_TODEVICE)); > > - tfd->u.data.num_chunks = > - > cpu_to_le32(le32_to_cpu(tfd->u.data.num_chunks) + > - 1); > + le32_add_cpu(&tfd->u.data.num_chunks, 1); > } > } > > -- > 1.5.3.7 Thanks! Signed-off-by: Reinette Chatre <[EMAIL PROTECTED]> Reinette -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
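For readers unfamiliar with the helper, the conversion in the patch above can be sketched in userspace C. This is only an illustration under stated assumptions: le32_sketch_t, cpu_to_le32_sketch(), le32_to_cpu_sketch() and le32_add_cpu_sketch() are hypothetical stand-ins written for this note, not the kernel's __le32, cpu_to_le32(), le32_to_cpu() or le32_add_cpu(). The sketch shows that the helper performs the same convert, add, convert-back sequence as the open-coded form it replaces.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's __le32 type:
 * four bytes stored least-significant byte first on any host CPU. */
typedef struct { uint8_t b[4]; } le32_sketch_t;

/* Stand-in for cpu_to_le32(): CPU-order value -> little-endian bytes. */
static inline le32_sketch_t cpu_to_le32_sketch(uint32_t v)
{
    le32_sketch_t r;
    r.b[0] = (uint8_t)(v & 0xff);
    r.b[1] = (uint8_t)((v >> 8) & 0xff);
    r.b[2] = (uint8_t)((v >> 16) & 0xff);
    r.b[3] = (uint8_t)((v >> 24) & 0xff);
    return r;
}

/* Stand-in for le32_to_cpu(): little-endian bytes -> CPU-order value. */
static inline uint32_t le32_to_cpu_sketch(le32_sketch_t v)
{
    return (uint32_t)v.b[0] |
           ((uint32_t)v.b[1] << 8) |
           ((uint32_t)v.b[2] << 16) |
           ((uint32_t)v.b[3] << 24);
}

/* Stand-in for le32_add_cpu(): add a CPU-order value to a little-endian
 * variable in place. This is the open-coded pattern
 *   var = cpu_to_le32(le32_to_cpu(var) + val);
 * wrapped in one call. */
static inline void le32_add_cpu_sketch(le32_sketch_t *var, uint32_t val)
{
    *var = cpu_to_le32_sketch(le32_to_cpu_sketch(*var) + val);
}
```

With these stand-ins, incrementing a counter held in little-endian form is a single call, le32_add_cpu_sketch(&counter, 1), which mirrors the shape of the one-line replacement in the diff.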
RE: ipw3945: not only it periodically dies, it also BUG()s
On Wednesday, February 06, 2008 1:00 PM, Pavel Machek wrote: > Hmmm... bugzilla says: > >* Exact steps to reproduce >* Reproducability of bug (e.g. intermittent or 100% reproducable) >* Did this problem not exist in previous version of the driver? >* kernel version >* AP brand/model >* dmesg output at debug level 0x43fff > ~~~ > it would be nice to specify how to do this. It is > insmod parameter, right? > >* Type of security (if any) you are using (e.g. WEP64, > WEP128, WPA, WPA2, 802.1x, etc) >* Version of firmware >* Version of the ieee80211 module >* Proximity to the AP > >* Before reporting any firmware errors, please be sure to read Ben > Cahill's mailing list post on how to most effectively report such >bugs. > ~~~ > unfortunately the link here does not work. The instructions on how to report a bug have been updated to address the above issues. Thank you very much for helping us to make it better. Reinette
RE: iwl3945 not working properly.
On Monday, February 18, 2008 7:47 AM, John W. Linville wrote: > On Mon, Feb 18, 2008 at 05:54:25AM +0100, Wael Nasreddine wrote: >> Hello, >> >> I have a Toshiba Satellite A135-S4427 with an Intel 3945ABG card, >> the driver is not working properly. >> >> When I turn on my PC it works fine, but If I ever bring the interface >> down, I no longer can associate it with any AP without rebooting, >> even the one I was using, I tried rmmod/modprobe iwl3945, didn't do >> anything, >> >> iwconfig shows that the wlan0 has the radio turned off, and >> >> $ cat /sys/bus/pci/drivers/iwl3945/:04:00.0/rf_kill 1 >> >> Even If I echo 0 > /sys/bus/pci/drivers/iwl3945/:04:00.0/rf_kill >> whenever I try to associate the interface with an AP it turns back to >> 1, I tried both iwconfig and NetworkManager, same problem. >> >> There's a button on my laptop for Radio SoftKill (fn+F8) but it's not >> working, the soft kill is being enabled/disabled without my >> interference. >> >> I tried it on kernel-2.6.24 and kernel-2.6.25-rc2 same result... >> >> Any help is appreciated... >> >> P.S: Please Cc to me, I am not subscribed to the mailing list. > > This sounds similar to the bug here: > > https://bugzilla.redhat.com/show_bug.cgi?id=432264 > > The OP in that bug reports that the problem continues even after > reverting to older kernels that worked previously. > > Hopefully part of the Intel crew will have some clue as to what is > happening here? Wael, could you please help us debug this issue? Unfortunately none of the bugs reported about this issue (http://bughost.org/bugzilla/show_bug.cgi?id=1454 and https://bugzilla.redhat.com/show_bug.cgi?id=432264) have any debugging output. Please reopen bug http://bughost.org/bugzilla/show_bug.cgi?id=1454 and add debugging output (captured by loading the module with debug=0x43fff) to your report to help us find out what happens on your system when the driver loads as well as when you change the rfkill settings. 
Thank you very much Reinette
RE: iwl3945 not working properly.
On Tuesday, February 19, 2008 1:20 PM, Wael Nasreddine wrote: > Since the problem I am having is slightly different than the bugs > above, I'm not sure I should post the debug there but feel free to > post it if you think it is the same... In this case, please create a new bug report for the problem you are encountering. Thanks! Reinette
RE: Recent driver in linux kernel 2.6.25-rc2
On , Lukas Hejtmanek wrote: > as of pre 2.6.25 kernels, kismet monitoring tool does not work with > the message: # kismet > Launching kismet_server: //usr/bin/kismet_server > Suid priv-dropping disabled. This may not be secure. > No specific sources given to be enabled, all will be enabled. > Non-RFMon VAPs will be destroyed on multi-vap interfaces (ie, > madwifi-ng) Enabling channel hopping. > Enabling channel splitting. > Source 0 (iwl4965): Enabling monitor mode for iwl4965 source > interface wlan0 channel 6... > FATAL: Failed to set channel 6 5:Input/output error Please load the driver with debugging enabled (debug=0x43fff) and send us the log captured during this kismet startup. Thanks Reinette
RE: ipw3945: not only it periodically dies, it also BUG()s
On Tuesday, February 05, 2008 1:45 PM, Pavel Machek wrote: > > ...I've reported this before, with full debugging. Not sure if > anything happened. Could you please point me to where you have reported it before? > Now, I got BUG() in iwl3945-base.c: 3824 Which driver and kernel are you using? Reinette
RE: ipw3945: not only it periodically dies, it also BUG()s
On Wednesday, February 06, 2008 6:32 AM, Pavel Machek wrote: > On Tue 2008-02-05 18:20:58, Chatre, Reinette wrote: >> On Tuesday, February 05, 2008 1:45 PM, Pavel Machek wrote: >> >>> >>> ...I've reported this before, with full debugging. Not sure if >>> anything happened. >> >> Could you please point me to where you have reported it before? > > From [EMAIL PROTECTED] Wed Oct 31 01:52:02 2007 > From: Pavel Machek <[EMAIL PROTECTED]> > To: [EMAIL PROTECTED], >kernel list , >[EMAIL PROTECTED], [EMAIL PROTECTED] > Subject: iwl3945 in 2.6.24-rc1 dies under load > X-Warning: Reading this can be dangerous to your mental health. > > ...and thread that resulted. Could you please create a new bug in our bug tracking system (www.bughost.org) to enable us to track this problem? Please include the relevant information from the thread as well as the information you discovered recently. Thank you very much Reinette
RE: ipw3945: not only it periodically dies, it also BUG()s
On Wednesday, February 06, 2008 1:00 PM, Pavel Machek wrote: >* dmesg output at debug level 0x43fff > ~~~ > it would be nice to specify how to do this. It is > insmod parameter, right? Correct. You can do this as follows: $ insmod iwl3945 debug=0x43fff or $ modprobe iwl3945 debug=0x43fff >* Before reporting any firmware errors, please be sure to read Ben > Cahill's mailing list post on how to most effectively report such >bugs. > ~~~ > unfortunately the link here does not work. We'll try to dig out the instructions from another location and update that link. Thanks for letting us know. > > BTW, why not use kernel.org bugzilla? Having to create another > account is nasty... Users can report iwlwifi bugs in many locations: their OSV's bug tracker (of which there could end up being many) as well as the kernel.org bugzilla. We focus on bugs in the bughost.org system. Reinette
RE: [ipw3945-devel] [PATCH 2/5] iwlwifi: iwl3945 synchronize interruptand tasklet for down iwlwifi
On , Joonwoo Park wrote: > --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c > +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c > @@ -6262,6 +6262,10 @@ static void __iwl_down(struct iwl_priv *priv) > /* tell the device to stop sending interrupts */ > iwl_disable_interrupts(priv); > > + /* synchronize irq and tasklet */ > + synchronize_irq(priv->pci_dev->irq); > + tasklet_kill(&priv->irq_tasklet); > + Could synchronize_irq() be moved into iwl_disable_interrupts() ? I am also wondering if we cannot call tasklet_kill() before iwl_disable_interrupts() ... thus preventing it from being scheduled when we are going down. Reinette
RE: [ipw3945-devel] [PATCH 1/5] iwlwifi: iwl3945 flush interrupt mask
Joonwoo Park <[EMAIL PROTECTED]> wrote: > interrupt mask > > After enabling/disabling interrupts flushing is required > I have been looking at this patch and I would like to get some more feedback from the experts in the group. First off, the register used for the read in order to flush has to be safe. We have to make sure that the register has no side effects. To do this we could use CSR_INT_MASK, as in "#define iwl_flush32(iwl) iwl_read32(iwl, CSR_INT_MASK)" This also enables us to flush at the end of a sequence of writes instead of after every write. Digging deeper it is not 100% clear to me when we should do flushing to handle write posting. I understand that it should be done in time sensitive code, but that could on a high level mean any write operation in the driver. How should it be decided which writes to the device need to be flushed? Patch is kept below fyi. Reinette > Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- > drivers/net/wireless/iwlwifi/iwl-io.h |2 ++ > drivers/net/wireless/iwlwifi/iwl3945-base.c |6 ++ > 2 files changed, 8 insertions(+), 0 deletions(-) > > diff --git a/drivers/net/wireless/iwlwifi/iwl-io.h > b/drivers/net/wireless/iwlwifi/iwl-io.h > index 8a8b96f..f2bc30e 100644 > --- a/drivers/net/wireless/iwlwifi/iwl-io.h > +++ b/drivers/net/wireless/iwlwifi/iwl-io.h > @@ -87,6 +87,8 @@ static inline u32 __iwl_read32(char *f, u32 > l, struct iwl_priv *iwl, u32 ofs) > #define iwl_read32(p, o) _iwl_read32(p, o) > #endif > > +#define iwl_flush32(iwl, ofs) iwl_read32(iwl, ofs) + > static inline int _iwl_poll_bit(struct iwl_priv *priv, u32 addr, > u32 bits, u32 mask, int timeout) > { > diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c > b/drivers/net/wireless/iwlwifi/iwl3945-base.c > index 8ed898d..85f1112 100644 > --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c > +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c > @@ -4410,6 +4410,7 @@ static void iwl_enable_interrupts(struct > iwl_priv *priv) IWL_DEBUG_ISR("Enabling interrupts\n"); 
> set_bit(STATUS_INT_ENABLED, &priv->status); > iwl_write32(priv, CSR_INT_MASK, CSR_INI_SET_MASK); > + iwl_flush32(priv, CSR_INT_MASK); > } > > static inline void iwl_disable_interrupts(struct iwl_priv *priv) > @@ -4418,11 +4419,15 @@ static inline void > iwl_disable_interrupts(struct iwl_priv *priv) > > /* disable interrupts from uCode/NIC to host */ > iwl_write32(priv, CSR_INT_MASK, 0x); > + iwl_flush32(priv, CSR_INT_MASK); > > /* acknowledge/clear/reset any interrupts still pending >* from uCode or flow handler (Rx/Tx DMA) */ > iwl_write32(priv, CSR_INT, 0x); > + iwl_flush32(priv, CSR_INT); > iwl_write32(priv, CSR_FH_INT_STATUS, 0x); > + iwl_flush32(priv, CSR_FH_INT_STATUS); > + > IWL_DEBUG_ISR("Disabled interrupts\n"); > } > > @@ -4840,6 +4845,7 @@ static irqreturn_t iwl_isr(int irq, void *data) >* If we *don't* have something, we'll re-enable before > leaving here. */ > inta_mask = iwl_read32(priv, CSR_INT_MASK); /* just > for debug */ > iwl_write32(priv, CSR_INT_MASK, 0x); > + iwl_flush32(priv, CSR_INT_MASK); > > /* Discover which interrupts are active/pending */ > inta = iwl_read32(priv, CSR_INT); > -- > 1.5.3.rc5
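The idea raised in the reply above, flushing once at the end of a sequence of writes by reading a register without side effects, can be illustrated with a small userspace model. Everything here is an assumption made for the illustration: struct mock_dev, the mock_* functions, the MOCK_* register indices and the written values are hypothetical and are not the iwlwifi code or its register map. The model treats writes as posted (queued, not yet seen by the device) and treats any read as forcing the queued writes to complete first, which is the ordering property a read-back flush relies on.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_NUM_REGS  4
#define MOCK_QUEUE_LEN 8

/* Hypothetical register indices, loosely named after the ones in the patch. */
enum { MOCK_INT_MASK = 0, MOCK_INT = 1, MOCK_FH_INT_STATUS = 2 };

struct mock_dev {
    uint32_t regs[MOCK_NUM_REGS];           /* what the "device" has seen */
    struct { uint32_t ofs, val; } posted[MOCK_QUEUE_LEN]; /* writes in flight */
    size_t n_posted;
};

/* Deliver all queued writes to the device, in order. */
static inline void mock_drain(struct mock_dev *d)
{
    for (size_t i = 0; i < d->n_posted; i++)
        d->regs[d->posted[i].ofs] = d->posted[i].val;
    d->n_posted = 0;
}

/* Posted write: returns immediately, the device may not have seen it yet. */
static inline void mock_write32(struct mock_dev *d, uint32_t ofs, uint32_t val)
{
    if (d->n_posted == MOCK_QUEUE_LEN)
        mock_drain(d);                      /* queue full: model a stall */
    d->posted[d->n_posted].ofs = ofs;
    d->posted[d->n_posted].val = val;
    d->n_posted++;
}

/* Read: cannot complete until earlier posted writes have been delivered. */
static inline uint32_t mock_read32(struct mock_dev *d, uint32_t ofs)
{
    mock_drain(d);
    return d->regs[ofs];
}

/* Flush by reading one register that is assumed to have no side effects. */
static inline void mock_flush32(struct mock_dev *d)
{
    (void)mock_read32(d, MOCK_INT_MASK);
}

/* Several writes followed by a single flush, instead of one per write.
 * The written values are placeholders chosen for this model only. */
static inline void mock_disable_interrupts(struct mock_dev *d)
{
    mock_write32(d, MOCK_INT_MASK, 0u);
    mock_write32(d, MOCK_INT, 0xffffffffu);
    mock_write32(d, MOCK_FH_INT_STATUS, 0xffffffffu);
    mock_flush32(d);
}
```

In this model a single read at the end is enough to guarantee that all three writes have reached the device. Whether that is sufficient for the real hardware, and which writes need flushing at all, is the open question asked in the message.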
RE: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17
Hi Paul, On 2016-03-23, Paul E. McKenney wrote: > Please boot with the following parameters: > > rcu_tree.rcu_kick_kthreads ftrace > trace_event=sched_waking,sched_wakeup,sched_wake_idle_without_ipi With these parameters I expected more details to show up in the kernel logs but cannot find any. Even so, today I left the machine running again and when this happened I think I was able to capture the trace data for the event. Please find attached the trace information for the kernel message below. Since the complete trace file is very big I trimmed it to show the time around this event - hopefully this will contain the information you need. I would also like to provide some additional information. The system on which I see these events had a time that was _very_ wrong. I noticed that this issue occurs when systemd-timesyncd was one of the tasks calling the functions of interest to your tracing and am wondering if a badly out-of-sync clock that is in the process of being corrected could be the cause of this issue? As an experiment I ensured the system time was accurate before leaving the system idle overnight and I did not see the issue the next morning. [ 957.396537] INFO: rcu_preempt detected stalls on CPUs/tasks: [ 957.399933] 1-...: (0 ticks this GP) idle=4d6/0/0 softirq=6311/6311 fqs=0 [ 957.403661] (detected by 0, t=60002 jiffies, g=3583, c=3582, q=47) [ 957.407227] Task dump for CPU 1: [ 957.409964] swapper/1 R running task0 0 1 0x0020 [ 957.413770] 039daa9a7eb9 8801785cfed0 818af34c 8801 [ 957.417696] 00060003 8801785d 880072f9ea00 822dcf80 [ 957.421631] 8801785cc000 8801785cc000 8801785cfee0 818af597 [ 957.425562] Call Trace: [ 957.428124] [] ? cpuidle_enter_state+0xfc/0x310 [ 957.431713] [] ? cpuidle_enter+0x17/0x20 [ 957.435122] [] ? call_cpuidle+0x2a/0x40 [ 957.438467] [] ? cpu_startup_entry+0x28d/0x360 [ 957.441949] [] ? start_secondary+0x114/0x140 [ 957.445378] rcu_preempt kthread starved for 60002 jiffies! 
g3583 c3582 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 [ 957.449834] rcu_preempt S 8801785b7d68 0 7 2 0x [ 957.453579] 8801785b7d68 88017dc8cc80 88016fe6bb80 8801785abb80 [ 957.457428] 8801785b8000 8801785b7da0 88017dc8cc80 88017dc8cc80 [ 957.461249] 0003 8801785b7d80 81ab03df 000100373021 [ 957.465055] Call Trace: [ 957.467493] [] schedule+0x3f/0xa0 [ 957.470613] [] schedule_timeout+0x127/0x270 [ 957.473976] [] ? detach_if_pending+0x120/0x120 [ 957.477387] [] rcu_gp_kthread+0x6d3/0xa40 [ 957.480659] [] ? wake_atomic_t_function+0x70/0x70 [ 957.484123] [] ? force_qs_rnp+0x1b0/0x1b0 [ 957.487392] [] kthread+0xe6/0x100 [ 957.490470] [] ? kthread_worker_fn+0x190/0x190 [ 957.493859] [] ret_from_fork+0x3f/0x70 [ 957.497044] [] ? kthread_worker_fn+0x190/0x190 Reinette trace.trim.gz Description: trace.trim.gz
RE: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17
Hi Paul, On 2016-03-22, Paul E. McKenney wrote: > On Tue, Mar 22, 2016 at 04:35:32PM +0000, Chatre, Reinette wrote: >> On 2016-03-21, Paul E. McKenney wrote: >>> On Mon, Mar 21, 2016 at 09:22:30AM -0700, Jacob Pan wrote: >>>> On Fri, 18 Mar 2016 16:56:41 -0700 >>>> "Paul E. McKenney" wrote: >>>>> On Fri, Mar 18, 2016 at 02:00:11PM -0700, Josh Triplett wrote: >>>>>> On Thu, Feb 25, 2016 at 04:56:38PM -0800, Paul E. McKenney wrote: >>> >>> [ . . . ] >>> >>>>>> We're seeing a similar stall (~60 seconds) on an x86 development >>>>>> system here. Any luck tracking down the cause of this? If not, any >>>>>> suggestions for traces that might be helpful? >>>>> >>>>> The dmesg containing the stall, the kernel version, and the .config >>>>> would be helpful! Working on a torture test specific to this bug... > > And thank you for the .config. Your kernel version looks to be 4.5.0. > >>>> +Reinette, she has the system that can reproduce the issue. I >>>> believe she is having some other problems with it at the moment. But >>>> the .config should be available. Version is v4.5. >>> >>> A couple of additional questions: >>> >>> 1. Is the test running on bare metal or virtualized? If the >>> latter, what is the host? >> >> Bare metal. > > OK, you are ahead of me. Mine is virtualized. > >>> 2. Does the workload involve CPU hotplug? >> >> No. > > Again, you are ahead of me. Mine makes extremely heavy use of CPU hotplug. > >>> 3. Are you seeing things like this in dmesg? >>> >>> "rcu_preempt kthread starved for 21033 jiffies" >>> "rcu_sched kthread starved for 32103 jiffies" >>> "rcu_bh kthread starved for 84031 jiffies" >>> >>> If not, you are probably facing some other bug, and should >>> proceed debugging as described in Documentation/RCU/stallwarn.txt. >> >> Below is a sample of what I see as captured with v4.5. The kernel >> configuration is attached. 
>> >> [ 135.456197] INFO: rcu_preempt detected stalls on CPUs/tasks: [ >> 135.457729] 3-...: (0 ticks this GP) idle=722/0/0 softirq=5532/5532 >> fqs=0 [ 135.459604] (detected by 2, t=60004 jiffies, g=2105, c=2104, >> q=165) [ 135.461318] Task dump for CPU 3: [ 135.461321] swapper/3 >> R running task0 0 1 0x0020 [ 135.461325] >> 0078560040e5 88017846fed0 818af2cc 8801 [ >> 135.461330] 00060003 88017847 880072f32200 >> 822dcec0 [ 135.461334] 88017846c000 88017846c000 >> 88017846fee0 818af517 [ 135.461338] Call Trace: [ >> 135.461345] [] ? cpuidle_enter_state+0xfc/0x310 [ >> 135.461349] [] ? cpuidle_enter+0x17/0x20 [ >> 135.461353] [] ? call_cpuidle+0x2a/0x40 [ >> 135.461355] [] ? cpu_startup_entry+0x28d/0x360 [ >> 135.461360] [] ? start_secondary+0x114/0x140 [ >> 135.461365] rcu_preempt kthread starved for 60004 jiffies! g2105 c2104 > f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 > > And yes, it looks like you are seeing the same bug that I am tracing. > > The kthread is blocked on a schedule_timeout_interruptible(). Given > default configuration, this would have a three-jiffy timeout. > > You set CONFIG_RCU_CPU_STALL_TIMEOUT=60, which matches the 60004 jiffies > above. Is that value due to a distro setting or something? Mainline > uses CONFIG_RCU_CPU_STALL_TIMEOUT=21. Indeed ... this value originated from a Fedora configuration. >> [ 135.463965] rcu_preempt S 88017844fd68 0 7 2 >> 0x [ 135.463969] 88017844fd68 88017dd8cc80 >> 880177ff 880178443b80 [ 135.463973] 88017845 >> 88017844fda0 88017dd8cc80 88017dd8cc80 [ 135.463977] >> 0003 88017844fd80 81ab031f 000100031504 [ >> 135.463981] Call Trace: [ 135.463986] [] >> schedule+0x3f/0xa0 [ 135.463989] [] >> schedule_timeout+0x127/0x270 [ 135.463993] [] ? >> detach_if_pending+0x120/0x120 [ 135.463997] [] >> rcu_gp_kthread+0x6bd/0xa30 [ 135.464000] [] ? >> wake_atomic_t_function+0x70/0x70 [ 135.464003] [] ? >> force_qs_rnp+0x1b0/0x1b0 [ 135.464006] [] >> kthread+0xe6/0x100 [ 135.464009] [] ? 
>> kthread_worker_fn+0x190/0x190 [ 135.464012] [] >> ret_from_fork+0x3f/0x70 [ 135.464015] [] ? >> kthread_worker_fn+0x190/0x190 > > How lo
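The relationship noted in the message above, that CONFIG_RCU_CPU_STALL_TIMEOUT=60 matches the reported 60004 jiffies, can be checked with a little arithmetic. This is a sketch under one loud assumption: the tick rate is taken to be 1000 jiffies per second, which the thread never states but which is the value that makes the two numbers agree. The function names below are hypothetical helpers written for this note, not kernel APIs.

```c
#include <stdint.h>

/* ASSUMPTION: tick rate of 1000 jiffies per second. The thread does not
 * state the kernel's HZ; this value is inferred from the numbers quoted. */
#define ASSUMED_HZ 1000u

/* Hypothetical helper: a stall timeout in seconds expressed in jiffies. */
static inline uint64_t stall_timeout_jiffies_sketch(unsigned timeout_s,
                                                    unsigned hz)
{
    return (uint64_t)timeout_s * hz;
}

/* Hypothetical helper: a jiffies count expressed in milliseconds. */
static inline uint64_t jiffies_to_msecs_sketch(uint64_t jiffies, unsigned hz)
{
    return jiffies * 1000u / hz;
}
```

Under that assumption, the 60-second setting corresponds to 60000 jiffies, so a report at t=60004 jiffies fired about 4 ms after the timeout expired, while the mainline default of 21 seconds would correspond to 21000 jiffies.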
RE: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17
Hi Paul, On 2016-03-22, Paul E. McKenney wrote: > On Tue, Mar 22, 2016 at 09:04:47PM +0000, Chatre, Reinette wrote: >> On 2016-03-22, Paul E. McKenney wrote: >>> You set CONFIG_RCU_CPU_STALL_TIMEOUT=60, which matches the 60004 >>> jiffies above. Is that value due to a distro setting or something? >>> Mainline uses CONFIG_RCU_CPU_STALL_TIMEOUT=21. >> >> Indeed ... this value originated from a Fedora configuration. > > OK. Setting it shorter might (or might not) make it reproduce more > quickly. This can be set at boot time via rcupdate.rcu_cpu_stall_timeout. > Or at compile time via CONFIG_RCU_CPU_STALL_TIMEOUT. I kept the original configuration and seem to be able to reproduce with that. >>> If dumping manually shortly after the stall is at all non-trivial >>> (for example, if your reproduction time is many minutes or hours), >>> I can supply some patches that automate this. Or you can pick >>> them up from -rcu: >> >> ... could you please point me to the patches you refer to? Or would you like > me to try with the entire kernel from rcu/dev? > > 2dc92e2a86b9 (rcu: Awaken grace-period kthread if too long since FQS) > c3fd2095d015 (rcu: Dump ftrace buffer when kicking grace-period kthread) > > There might be other dependencies, but these are the two that you need. I did not look closely at the patches when I applied them and because of that missed that they need a kernel parameter to be activated. After leaving the system idle overnight with these patches the stalls occurred but without the parameter I did not capture the data you need. I will try again tonight. Below are the traces from last night just in case they have value to you. 
[10154.635318] INFO: rcu_preempt detected stalls on CPUs/tasks: [10154.639218] 1-...: (0 ticks this GP) idle=c4e/0/0 softirq=99936/99936 fqs=0 [10154.643497] (detected by 0, t=60005 jiffies, g=24190, c=24189, q=79) [10154.647596] Task dump for CPU 1: [10154.650818] swapper/1 R running task0 0 1 0x0020 [10154.655052] 2656bf74de5e 8801785cfed0 818af34c 8801 [10154.659349] 00060003 8801785d 880072f0bc00 822dcf80 [10154.663636] 8801785cc000 8801785cc000 8801785cfee0 818af597 [10154.667916] Call Trace: [10154.670845] [] ? cpuidle_enter_state+0xfc/0x310 [10154.674802] [] ? cpuidle_enter+0x17/0x20 [10154.678564] [] ? call_cpuidle+0x2a/0x40 [10154.682295] [] ? cpu_startup_entry+0x28d/0x360 [10154.686187] [] ? start_secondary+0x114/0x140 [10154.690040] rcu_preempt kthread starved for 60005 jiffies! g24190 c24189 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 [10154.694944] rcu_preempt S 8801785b7d68 0 7 2 0x [10154.699062] 8801785b7d68 88017dc8cc80 8801785c3b80 8801785abb80 [10154.703275] 8801785b8000 8801785b7da0 88017dc8cc80 88017dc8cc80 [10154.707481] 0003 8801785b7d80 81ab03df 0001027e21aa [10154.711692] Call Trace: [10154.714548] [] schedule+0x3f/0xa0 [10154.718075] [] schedule_timeout+0x127/0x270 [10154.721832] [] ? detach_if_pending+0x120/0x120 [10154.725659] [] rcu_gp_kthread+0x6d3/0xa40 [10154.729379] [] ? wake_atomic_t_function+0x70/0x70 [10154.733235] [] ? force_qs_rnp+0x1b0/0x1b0 [10154.736854] [] kthread+0xe6/0x100 [10154.740267] [] ? kthread_worker_fn+0x190/0x190 [10154.743980] [] ret_from_fork+0x3f/0x70 [10154.747511] [] ? 
kthread_worker_fn+0x190/0x190 [11348.912706] INFO: rcu_preempt detected stalls on CPUs/tasks: [11348.916346] 2-...: (0 ticks this GP) idle=586/0/0 softirq=133504/133504 fqs=0 [11348.920407] (detected by 3, t=60002 jiffies, g=26799, c=26798, q=72) [11348.924244] Task dump for CPU 2: [11348.927205] swapper/2 R running task0 0 1 0x0020 [11348.931178] 2adc83427a76 8801785d3ed0 818af34c 8801 [11348.935217] 00060003 8801785d4000 880177d01e00 822dcf80 [11348.939237] 8801785d 8801785d 8801785d3ee0 818af597 [11348.943252] Call Trace: [11348.945921] [] ? cpuidle_enter_state+0xfc/0x310 [11348.949615] [] ? cpuidle_enter+0x17/0x20 [11348.953115] [] ? call_cpuidle+0x2a/0x40 [11348.956584] [] ? cpu_startup_entry+0x28d/0x360 [11348.960215] [] ? start_secondary+0x114/0x140 [11348.963808] rcu_preempt kthread starved for 60002 jiffies! g26799 c26798 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 [11348.968452] rcu_preempt S 8801785b7d68 0 7 2 0x [11348.972309] 8801785b7d68 88017dd0cc80 8801785c5940 8801785abb80 [11348.976266] 8801785b8000 8801785b7da0 88017dd0cc80 88017dd0cc80 [11348.980207] 0003 8801785b7d80 81ab03df 000102c9d45e [11348.984142] Call Trace: [11348.986714] [] schedule+0x3f/0xa0 [11348.98997
RE: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17
Hi Paul, On 2016-03-23, Paul E. McKenney wrote: > Please boot with the following parameters: > > rcu_tree.rcu_kick_kthreads ftrace > trace_event=sched_waking,sched_wakeup,sched_wake_idle_without_ipi > > Or was this run with tracing? If so, less than three hours isn't too bad. This was with tracing enabled, only missing the crucial rcu_tree.rcu_kick_kthreads Reinette