Re: [PATCH 0/7] Serialise oopses, BUGs, WARNs, dump_stack, soft lockups and hard lockups

2015-02-24 Thread Arjan van de Ven
>> Some architectures already have their own recursive
>> locking for oopses and we have another version for
>> serialising dump_stack.
>>
>> Create a common version and use it everywhere (oopses,
>> BUGs, WARNs, dump_stack, soft lockups and hard lockups).
>
> Dunno. I've had cases where the simultaneity of the oopses
> (i.e. their garbled nature) gave me the clue about the type
> of race to expect.
>


one of the question is if you want to serialize, or if you just want to label.
If you take a cookie (could just be a monotonic increasing number) at
the start of the oops
and then prefix/postfix the stack printing with that number, you don't
serialize (risk of locking up),
but you can pretty trivially see which line  came from where..
if you do the monotonic increasing number approach, you even get an
ordering out of it.
it does mean changing the dump_stack() and co function fingerprint to
take an extra argument,
but that is not TOO insane.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Boot failure with next-20120208

2012-02-13 Thread Arjan van de Ven
On 2/12/2012 7:04 PM, Michael Neuling wrote:
>> Just a quick note to say I got a boot OOPs with next-20120208 and 9 on a
>> Power7 blade (my other PowerPC boot tests are ok.  I'll investigate this
>> further on Monday.
>>
>> The line referenced below is:
>>
>> BUG_ON(!kobj || !kobj->sd || !attr);
>>
>> in sysfs_create_file().
>>
>> calling  .topology_init+0x0/0x1ac @ 1
>> initcall 7_.async_cpu_up+0x0/0x40 returned 0 after 9765 usecs
>> async_continuing @ 20 after 9765 usec
>> [ cut here ]
>> kernel BUG at fs/sysfs/file.c:573!
>> Oops: Exception in kernel mode, sig: 5 [#1]
>> SMP NR_CPUS=32 NUMA pSeries
>> Modules linked in:
>> NIP: c024a35c LR: c04ee050 CTR: c083ca24
>> REGS: c003fd9e7560 TRAP: 0700   Not tainted  (3.3.0-rc2-autokern1)
>> MSR: 80029032   CR: 88002082  XER: 000f
>> CFAR: c024a370
>> TASK = c003fd9e8000[20] 'kworker/u:6' THREAD: c003fd9e4000 CPU: 0
>> GPR00: 0001 c003fd9e77e0 c0d19bb8  
>> GPR04: c0bf37a8 0008 82096400  
>> GPR08:  c0f80028 c0d52bd8  
>> GPR12: 48002088 cf33b000 01affa78 009aa000 
>> GPR16: 00e1f3c8 02d517f0 01aff984 0060 
>> GPR20:    c0c45128 
>> GPR24:  0008  c0c44200 
>> GPR28: c0f80028 0008 c0c85038 0002 
>> NIP [c024a35c] .sysfs_create_file+0x1c/0x40
>> LR [c04ee050] .device_create_file+0x20/0x40
>> Call Trace:
>> [c003fd9e77e0] [c003fd9e78a0] 0xc003fd9e78a0 (unreliable)
>> [c003fd9e7850] [c083c9a4] .register_cpu_online+0x1d0/0x250
>> [c003fd9e7900] [c083ca8c] .sysfs_cpu_notify+0x68/0x28c
>> [c003fd9e79b0] [c083769c] .notifier_call_chain+0x9c/0x100
>> [c003fd9e7a50] [c00a5878] .__cpu_notify+0x38/0x80
>> [c003fd9e7ad0] [c083e124] ._cpu_up+0x10c/0x178
>> [c003fd9e7b90] [c083e2c8] .cpu_up+0x138/0x164
>> [c003fd9e7c20] [c0ba46d0] .async_cpu_up+0x28/0x40
>> [c003fd9e7ca0] [c00d81ec] .async_run_entry_fn+0xbc/0x1f0
>> [c003fd9e7d50] [c00c7cbc] .process_one_work+0x19c/0x590
>> [c003fd9e7e10] [c00c8618] .worker_thread+0x188/0x4b0
>> [c003fd9e7ed0] [c00ce57c] .kthread+0xbc/0xd0
>> [c003fd9e7f90] [c0021448] .kernel_thread+0x54/0x70
>> Instruction dump:
>> 7fa3eb78 ebe1fff8 eba1ffe8 7c0803a6 4e800020 2c23 41820024 e8630030 
>> 7c800074 7800d182 2fa3 419e0014 <0b00> 38a2 4bfffebc e8630030 
>> ---[ end trace 31fd0ba7d8756001 ]---
>> initcall .topology_init+0x0/0x1ac returned 0 after 0 usecs
>> calling  .pcibios_init+0x0/0xe8 @ 1
>> PCI: Probing PCI hardware
>> PCI: Probing PCI hardware done
>> initcall .pcibios_init+0x0/0xe8 returned 0 after 0 usecs
>> calling  .add_system_ram_resources+0x0/0x140 @ 1
>> initcall .add_system_ram_resources+0x0/0x140 returned 0 after 0 usecs
>> calling  
>> .__machine_initcall_powermac_pmac_i2c_create_platform_devices+0x0/0xc8 @ 1
>> initcall 
>> .__machine_initcall_powermac_pmac_i2c_create_platform_devices+0x0/0xc8 
>> returned 0 after 0 usecs
>> calling  .opal_init+0x0/0x1cc @ 1
>> opal: Node not found
>> initcall .opal_init+0x0/0x1cc returned -19 after 0 usecs
>> calling  .__machine_initcall_pseries_ioei_init+0x0/0xa0 @ 1
> 
> Reverting "smp: start up non-boot CPUs asynchronously" (8de7a96405 from
> next-20120208) fixes this problem for me.  
> 
if that fixes it, it means PPC has a race somewhere in the cpu hotplug
code, since all the patch does is hotplug the cpus one by one (which the
normal kernel also does, just not in parallel with other work)


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Boot failure with next-20120208

2012-02-13 Thread Arjan van de Ven
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 2/13/2012 12:05 PM, Andrew Morton wrote:
> On Mon, 13 Feb 2012 06:18:34 -0800 Arjan van de Ven
>  wrote:
> 
>> On 2/12/2012 7:04 PM, Michael Neuling wrote:
>>>> Just a quick note to say I got a boot OOPs with next-20120208
>>>> and 9 on a Power7 blade (my other PowerPC boot tests are ok.
>>>> I'll investigate this further on Monday.
> 
> Thanks for testing linux-next.  Very useful.
> 
>>>> The line referenced below is:
>>>> 
>>>> BUG_ON(!kobj || !kobj->sd || !attr);
>>>> 
>>>> in sysfs_create_file().
> 
> Yes, this is exactly why we should never use BUG_ON(a || b).  We
> don't know which of those three expressions triggered.
> 
>>>> calling  .topology_init+0x0/0x1ac @ 1 initcall
>>>> 7_.async_cpu_up+0x0/0x40 returned 0 after 9765 usecs 
>>>> async_continuing @ 20 after 9765 usec [ cut here
>>>> ] kernel BUG at fs/sysfs/file.c:573! Oops:
>>>> Exception in kernel mode, sig: 5 [#1] SMP NR_CPUS=32 NUMA
>>>> pSeries Modules linked in: NIP: c024a35c LR:
>>>> c04ee050 CTR: c083ca24 REGS: c003fd9e7560
>>>> TRAP: 0700   Not tainted  (3.3.0-rc2-autokern1) MSR:
>>>> 80029032   CR: 88002082  XER:
>>>> 000f CFAR: c024a370 TASK = c003fd9e8000[20]
>>>> 'kworker/u:6' THREAD: c003fd9e4000 CPU: 0 GPR00:
>>>> 0001 c003fd9e77e0 c0d19bb8
>>>>  GPR04: c0bf37a8 0008
>>>> 82096400  GPR08: 
>>>> c0f80028 c0d52bd8  GPR12:
>>>> 48002088 cf33b000 01affa78
>>>> 009aa000 GPR16: 00e1f3c8 02d517f0
>>>> 01aff984 0060 GPR20: 
>>>>   c0c45128 GPR24:
>>>>  0008 
>>>> c0c44200 GPR28: c0f80028 0008
>>>> c0c85038 0002 NIP [c024a35c]
>>>> .sysfs_create_file+0x1c/0x40 LR [c04ee050]
>>>> .device_create_file+0x20/0x40 Call Trace: [c003fd9e77e0]
>>>> [c003fd9e78a0] 0xc003fd9e78a0 (unreliable) 
>>>> [c003fd9e7850] [c083c9a4]
>>>> .register_cpu_online+0x1d0/0x250 [c003fd9e7900]
>>>> [c083ca8c] .sysfs_cpu_notify+0x68/0x28c 
>>>> [c003fd9e79b0] [c083769c]
>>>> .notifier_call_chain+0x9c/0x100 [c003fd9e7a50]
>>>> [c00a5878] .__cpu_notify+0x38/0x80 [c003fd9e7ad0]
>>>> [c083e124] ._cpu_up+0x10c/0x178 [c003fd9e7b90]
>>>> [c083e2c8] .cpu_up+0x138/0x164 [c003fd9e7c20]
>>>> [c0ba46d0] .async_cpu_up+0x28/0x40 [c003fd9e7ca0]
>>>> [c00d81ec] .async_run_entry_fn+0xbc/0x1f0 
>>>> [c003fd9e7d50] [c00c7cbc]
>>>> .process_one_work+0x19c/0x590 [c003fd9e7e10]
>>>> [c00c8618] .worker_thread+0x188/0x4b0 
>>>> [c003fd9e7ed0] [c00ce57c] .kthread+0xbc/0xd0 
>>>> [c003fd9e7f90] [c0021448]
>>>> .kernel_thread+0x54/0x70 Instruction dump: 7fa3eb78 ebe1fff8
>>>> eba1ffe8 7c0803a6 4e800020 2c23 41820024 e8630030 
>>>> 7c800074 7800d182 2fa3 419e0014 <0b00> 38a2
>>>> 4bfffebc e8630030 ---[ end trace 31fd0ba7d8756001 ]--- 
>>>> initcall .topology_init+0x0/0x1ac returned 0 after 0 usecs 
>>>> calling  .pcibios_init+0x0/0xe8 @ 1 PCI: Probing PCI
>>>> hardware PCI: Probing PCI hardware done initcall
>>>> .pcibios_init+0x0/0xe8 returned 0 after 0 usecs calling
>>>> .add_system_ram_resources+0x0/0x140 @ 1 initcall
>>>> .add_system_ram_resources+0x0/0x140 returned 0 after 0 usecs 
>>>> calling
>>>> .__machine_initcall_powermac_pmac_i2c_create_platform_devices+0x0/0xc8
>>>> @ 1 initcall
>>>> .__machine_initcall_powermac_pmac_i2c_create_platform_devices+0x0/0xc8
>>>> returned 0 after 0 usecs calling  .opal_init+0x0/0x1cc @ 1 
>>>> opal: Node not found initcall .opal_init+0x0/0x1cc returned
>>>> -19 after 0 usecs calling
>>>> .__machine_initcall_pseries_ioei_init+0x0/0xa0 @ 1
>>> 
>>> Reverting "smp: start up non-boot CPUs asynchronously"
>>> (8de7a96405 fro

Re: smp: Start up non-boot CPUs asynchronously

2012-02-14 Thread Arjan van de Ven
one coments; will comment more when I get to work

On Tue, Feb 14, 2012 at 1:48 AM, Srivatsa S. Bhat

7. And whichever code between smp_init() and async_synchronize_full() didn't
>
> care about CPU hotplug till today but depended on all cpus being online
> must
> suddenly start worrying about CPU Hotplug. They must register a cpu
> notifier
> and handle callbacks etc etc.. Or if they are not worth that complexity,
> they
> should atleast be redesigned or moved around - like the print statements
> that
> tell how many cpus came up, for example.
>
>
frankly, such code HAS to worry about cpus going online and offline even
today; the firmware, at least on X86, can start taking cores offline/online
once ACPI is initialized
(as controlled by a data center manager from outside the box, usually done
based on thermal or power conditions on a datacenter level).
Now, no doubt that we have bugs in this space, since this only happened
very rarely before.

Question is what to do from a longer term strategy:
Either we declare the number of online CPUs invariant during a certain
phase of the boot (and make ACPI and co honor this as well somehow)
or
We decide to go about fixing these (maybe with the help of lockdep?)

In addition to this, the reality is that the whole "bring cpus up" sequence
needs to be changed; the current one is very messy and requires the hotplug
lock for the whole bring up of each individual cpu... which is a very
unfortunate design; a much better design would be to only take the lock for
the actual registration of the newly brought up CPU to the kernel, while
running the physical bringup without the global lock.
If/when that change gets made, we can do the physical bring up in parallel
(with each other, but also with the rest of the kernel boot), and do the
registration en-mass at some convenient time in the boot, potentially late.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: smp: Start up non-boot CPUs asynchronously

2012-02-14 Thread Arjan van de Ven
Its more than acpi ... machine checks can do it too etc
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Peter Zijlstra  wrote:

On Tue, 2012-02-14 at 06:31 -0800, Arjan van de Ven wrote:
> 
> frankly, such code HAS to worry about cpus going online and offline
> even today; the firmware, at least on X86, can start taking cores
> offline/online once ACPI is initialized (as controlled by a data
> center manager from outside the box, usually done based on thermal or
> power conditions on a datacenter level). Now, no doubt that we have
> bugs in this space, since this only happened very rarely before. 

Which frankly is an utter piece of crap, that ACPI spec is total garbage
and completely useless. You might have noticed that the ACPI code
supporting that failure carries a big nacked-by from me.

That's not to say we shouldn't try to fix hotplug, but bringing that
ACPI nonsense to the table makes me care less, not more.

I mean, really, that spec is broken, the support is worse.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: smp: Start up non-boot CPUs asynchronously

2012-02-14 Thread Arjan van de Ven
On 2/14/2012 11:57 AM, Srivatsa S. Bhat wrote:

>> In addition to this, the reality is that the whole "bring cpus up"
>> sequence needs to be changed; the current one is very messy and requires
>> the hotplug lock for the whole bring up of each individual cpu... which
>> is a very unfortunate design; a much better design would be to only take
>> the lock for the actual registration of the newly brought up CPU to the
>> kernel, while running the physical bringup without the global lock.
>> If/when that change gets made, we can do the physical bring up in
>> parallel (with each other, but also with the rest of the kernel boot),
>> and do the registration en-mass at some convenient time in the boot,
>> potentially late.
>>
> 
> 
> Sounds like a good idea, but how will we take care of CPU_UP_PREPARE and
> CPU_STARTING callbacks then? Because, CPU_UP_PREPARE callbacks are run
> before bringing up the cpu and CPU_STARTING is called from the cpu that is
> coming up. Also, CPU_UP_PREPARE callbacks can be failed, which can lead
> to that particular cpu boot getting aborted. With the "late commissioning
> of CPUs" idea you proposed above, retaining such semantics could become
> very challenging.

some of these callbacks may need to be redesigned as well; or at least,
we may need to decouple the "physical" state of the CPU that's getting
brought up from the "logical" OS visible one.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Boot failure with next-20120208

2012-03-23 Thread Arjan van de Ven
On 3/23/2012 12:22 PM, Andrew Morton wrote:
> On Mon, 13 Feb 2012 12:16:41 -0800
> Arjan van de Ven  wrote:
> 
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>>> The bug looks pretty generic, nothing very PPC-specific there.  It 
>>> might affect other architectures - we won't know until we find out
>>> wht caused it.
>>
>> well one half of the race looks pretty generic...
>> . doesn't mean the other half of the race is though
>>
>>
>>>
>>> Ho hum, I suppose I should pull the patch out of linux-next, to
>>> avoid disrupting other testing.  This means it's going to be hard
>>> to get the bug fixed.
>>
>> it means losing this one big PPC machine indeed until they hit
>> that same race some other way with regular real cpu hotplug ;-(
> 
> So we're kinda stuck with this.  As I can't merge it, I guess I'll make
> smp-start-up-non-boot-cpus-asynchronously.patch disappear.


well yeah, PPC is throwing things in the spanner

we're now working on an x86-only patch with basically the same
improvement, but done in a way that does not touch the other architectures

so by all means drop the patch


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

2009-09-24 Thread Arjan van de Ven
On Thu, 24 Sep 2009 13:33:07 +0200
Peter Zijlstra  wrote:

> On Thu, 2009-09-24 at 18:38 +1000, Benjamin Herrenschmidt wrote:
> > On Thu, 2009-09-24 at 09:51 +0200, Peter Zijlstra wrote:
> > > > I don't quite follow your logic here. This is useful for more
> > > > than just hypervisors. For example, take the HV out of the
> > > > picture for a moment and imagine that the HW has the ability to
> > > > offline CPU in various power levels, with varying latencies to
> > > > bring them back.
> > > 
> > > cpu-hotplug is an utter slow path, anybody saying latency and
> > > hotplug in the same sentence doesn't seem to grasp either or both
> > > concepts.
> > 
> > Let's forget about latency then. Let's imagine I want to set a CPU
> > offline to save power, vs. setting it offline -and- opening the back
> > door of the machine to actually physically replace it :-)
> 
> If the hardware is capable of physical hotplug, then surely powering
> the socket down saves most power and is the preferred mode?

btw just to take away a perception that generally powering down sockets
help; it does not help for all cpus. Some cpus are so efficient in idle
that the incremental gain one would get by "offlining" a core is just
not worth it
(in fact, in x86, it's the same thing)

I obviously can't speak for p-series cpus, just wanted to point out
that there is no universal truth about "offlining saves power".

-- 
Arjan van de VenIntel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [v6 PATCH 0/7]: cpuidle/x86/POWER: Cleanup idle power management code in x86, cleanup drivers/cpuidle/cpuidle.c and introduce cpuidle to POWER.

2009-09-24 Thread Arjan van de Ven
On Thu, 24 Sep 2009 10:42:41 +0530
Arun R Bharadwaj  wrote:

> * Arun R Bharadwaj  [2009-09-22 16:55:27]:
> 
> Hi Len, (or other acpi folks),
> 
> I had a question regarding ACPI-cpuidle interaction in the current
> implementation.
> 
> Currently, every cpu (i.e. acpi_processor) registers to cpuidle as
> a cpuidle_device. So every cpu has to go through the process of
> setting up the idle states and then registering as a cpuidle device.
> 
> What exactly is the reason behind this?
> 

technically a BIOS can opt to give you C states via ACPI on some cpus,
but not on others.

in practice when this happens it tends to be a bug.. but it's
technically a valid configuration



-- 
Arjan van de VenIntel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

2009-09-25 Thread Arjan van de Ven
On Fri, 25 Sep 2009 12:55:49 +0530
Vaidyanathan Srinivasan  wrote:

> > I obviously can't speak for p-series cpus, just wanted to point out
> > that there is no universal truth about "offlining saves power".
> 
> Hi Arjan,
> 
> As you have said, on some cpus the extra effort of offlining does not
> save us any extra power, and the state will be same as idle.  The
> assertion that offlining saves power is still valid, it could be same
> as idle or better depending on the architecture and implementation.
> 
> On x86 we still need the code (Venki posted) to take cpus to C6 on
> offline to save power or else offlining consumes more power than idle
> due to C1/hlt state.  This framework can help here as well if we have
> any apprehension on making lowest sleep state as default on x86 and
> want the administrator to decide.

even with Venki's patch, all our measurements indicate that taking
cores away is damage on x86. Don't let that stop what you do for
powerpc, but for x86 it's not a win. Linux is good at keeping cores in
C6 long enough that the downside of offlining is bigger...



-- 
Arjan van de VenIntel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [v6 PATCH 0/7]: cpuidle/x86/POWER: Cleanup idle power management code in x86, cleanup drivers/cpuidle/cpuidle.c and introduce cpuidle to POWER.

2009-09-25 Thread Arjan van de Ven
On Fri, 25 Sep 2009 10:54:24 +0200
Peter Zijlstra  wrote:

> On Fri, 2009-09-25 at 12:36 +0530, Vaidyanathan Srinivasan wrote:
> > * Arjan van de Ven  [2009-09-24 14:22:28]:
> > 
> > > On Thu, 24 Sep 2009 10:42:41 +0530
> > > Arun R Bharadwaj  wrote:
> > > 
> > > > * Arun R Bharadwaj  [2009-09-22
> > > > 16:55:27]:
> > > > 
> > > > Hi Len, (or other acpi folks),
> > > > 
> > > > I had a question regarding ACPI-cpuidle interaction in the
> > > > current implementation.
> > > > 
> > > > Currently, every cpu (i.e. acpi_processor) registers to cpuidle
> > > > as a cpuidle_device. So every cpu has to go through the process
> > > > of setting up the idle states and then registering as a cpuidle
> > > > device.
> > > > 
> > > > What exactly is the reason behind this?
> > > > 
> > > 
> > > technically a BIOS can opt to give you C states via ACPI on some
> > > cpus, but not on others.
> > > 
> > > in practice when this happens it tends to be a bug.. but it's
> > > technically a valid configuration
> > 
> > So we will need to keep the per-cpu registration as of now because
> > we may have such buggy BIOS in the field and we don't want the
> > cpuidle framework to malfunction there.
> 
> If the BIOS doesn't mention a certain C state on a cpu, and you try to
> set it anyway, does that go boom?
> 
> This whole per-cpu registration thing is horridly ugly, can't you
> have a per-cpu C state exception mask and leave it at that -- if its
> really needed?

the real solution is to make the acpi code always know about C1, even
if the bios doesn't That's one for Len :)

(C1 is just "hlt", what we do in the other idle loop ;-) 


-- 
Arjan van de VenIntel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [stable] [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

2009-02-28 Thread Arjan van de Ven
On Sat, 28 Feb 2009 09:46:01 -0800
Greg KH  wrote:

> On Sat, Feb 28, 2009 at 09:23:36AM -0800, Linus Torvalds wrote:
> > On Fri, 27 Feb 2009, Roland McGrath wrote:
> > > But here is the patch you asked for.
> > 
> > Yes, this looks much more straightforward. 
> > 
> > And I guess the seccomp interaction means that this is potentially
> > a 2.6.29 thing. Not that I know whether anybody actually _uses_
> > seccomp. It does seem to be enabled in at least Fedora kernels, but
> > it might not be used anywhere.
> 
> It's enabled in SuSE kernels.
> 

but are there users of it?
I thought Andrea stopped the cpushare thing that was the only user of
this


-- 
Arjan van de VenIntel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] Fix Unlikely(x) == y

2008-02-16 Thread Arjan van de Ven
On Sat, 16 Feb 2008 17:08:01 +0100
Roel Kluin <[EMAIL PROTECTED]> wrote:

> The patch below was not yet tested. If it's correct as it is, please
> comment. ---
> Fix Unlikely(x) == y
> 

you found a great set of bugs..
but to be honest... I suspect it's just best to remove unlikely altogether for 
these cases;
unlikely() is almost a go-faster-stripes thing, and if you don't know how to 
use it you shouldn't
be using it... so just removing it for all wrong cases is actually the best 
thing to do imo.

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] Fix Unlikely(x) == y

2008-02-16 Thread Arjan van de Ven
On Sat, 16 Feb 2008 18:58:49 +0100
> > If you think unlikely() means something else, we should fix what it
> > maps to towards gcc ;) (to.. be empty ;)
> 
> eventhough the gcc docs say it's just a hint to help the compiler
> optimize the branch it takes by default, I too have noticed that it
> more often does bad than good. Code gets completely reordered and
> even sometimes partially duplicated (especially when the branch is a
> return).
> 
> Last but not least, gcc 4 tends to emit stupid checks, to the point
> that I have replaced unlikely(x) with (x) in my code when gcc >= 4 is
> detected. What I observe is that the following code :
> 
> if (unlikely(p == NULL)) ...

this is pure bad since GCC already assumes that NULL checks are unlikely,
and with the unlikely() code needing to normalize stuff... it will generate 
worse code for sure yes.

> 
> often gets coded like this :
> 
> reg1 = (p == NULL)
> if (reg1 != 0) ...
> 
> ... which clobbers reg1 for nothing and performs a double test.
> 
> But yes, I assumed that the author considered its use to be
> legitimate (I've not looked at the code). Maybe you're right and it
> should be removed, but in this case we would need a large audit of
> the abuses of unlikely()...

no argument.. how about we start with all the cases where the author just got 
it 
entirely wrong ... like the ones from this patch ;)


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] Fix Unlikely(x) == y

2008-02-16 Thread Arjan van de Ven
On Sat, 16 Feb 2008 10:31:26 -0800
Geoff Levand <[EMAIL PROTECTED]> wrote:

> On 02/16/2008 09:42 AM, Arjan van de Ven wrote:
> > On Sat, 16 Feb 2008 18:33:16 +0100
> > Willy Tarreau <[EMAIL PROTECTED]> wrote:
> > 
> >> On Sat, Feb 16, 2008 at 09:25:52AM -0800, Arjan van de Ven wrote:
> >> > On Sat, 16 Feb 2008 17:08:01 +0100
> >> > Roel Kluin <[EMAIL PROTECTED]> wrote:
> >> > 
> >> > > The patch below was not yet tested. If it's correct as it is,
> >> > > please comment. ---
> >> > > Fix Unlikely(x) == y
> >> > > 
> >> > 
> >> > you found a great set of bugs..
> >> > but to be honest... I suspect it's just best to remove unlikely
> >> > altogether for these cases; unlikely() is almost a
> >> > go-faster-stripes thing, and if you don't know how to use it you
> >> > shouldn't be using it... so just removing it for all wrong cases
> >> > is actually the best thing to do imo.
> >> 
> >> Well, eventhough the author may not know how to use it, "unlikely"
> >> at least indicates the intention of the author, or his knowledge
> >> of what should happen here. I'd suggest leaving it where it is
> >> because the authot of this code is in best position to know that
> >> this branch is unlikely to happen, eventhough he does not
> >> correctly use the macro.
> >>
> > 
> > you have more faith in the authors knowledge of how his code
> > actually behaves than I think is warranted  :) Or faith in that he
> > knows what "unlikely" means. I should write docs about this; but
> > unlikely() means: 1) It happens less than 0.01% of the cases.
> > 2) The compiler couldn't have figured this out by itself
> >(NULL pointer checks are compiler done already, same for some
> > other conditions) 3) It's a hot codepath where shaving 0.5 cycles
> > (less even on x86) matters (and the author is ok with taking a 500
> > cycles hit if he's wrong)
> > 
> > If you think unlikely() means something else, we should fix what it
> > maps to towards gcc ;) (to.. be empty ;)
> 
> Well, I didn't consider what today's compiler does, but used it as a
> general indicator, because I think that code will be around a long
> time.  If you show me some test results that prove it causes harm I
> might consider removing it. 

for mordern (last 10 years) x86 cpus... the cpu branchpredictor is better
than the coder in general. Same for most other architectures.

unlikely() creates bigger code as well.

Now... we're talking about your super duper hotpath function here right?
One where you care about 0.5 cycle speed improvement? (less on modern
systems ;)

Because if not, the bigger code and general "compiler second guessing" is just
more harmful than good, shown already here by the fact that the code was just 
incorrect
as a virtue of having the unlikely() in.

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] Fix Unlikely(x) == y

2008-02-16 Thread Arjan van de Ven
On Sat, 16 Feb 2008 18:33:16 +0100
Willy Tarreau <[EMAIL PROTECTED]> wrote:

> On Sat, Feb 16, 2008 at 09:25:52AM -0800, Arjan van de Ven wrote:
> > On Sat, 16 Feb 2008 17:08:01 +0100
> > Roel Kluin <[EMAIL PROTECTED]> wrote:
> > 
> > > The patch below was not yet tested. If it's correct as it is,
> > > please comment. ---
> > > Fix Unlikely(x) == y
> > > 
> > 
> > you found a great set of bugs..
> > but to be honest... I suspect it's just best to remove unlikely
> > altogether for these cases; unlikely() is almost a
> > go-faster-stripes thing, and if you don't know how to use it you
> > shouldn't be using it... so just removing it for all wrong cases is
> > actually the best thing to do imo.
> 
> Well, eventhough the author may not know how to use it, "unlikely" at
> least indicates the intention of the author, or his knowledge of what
> should happen here. I'd suggest leaving it where it is because the
> authot of this code is in best position to know that this branch is
> unlikely to happen, eventhough he does not correctly use the macro.
>

you have more faith in the authors knowledge of how his code actually behaves 
than I think is warranted  :)
Or faith in that he knows what "unlikely" means.
I should write docs about this; but unlikely() means:
1) It happens less than 0.01% of the cases.
2) The compiler couldn't have figured this out by itself
   (NULL pointer checks are compiler done already, same for some other 
conditions)
3) It's a hot codepath where shaving 0.5 cycles (less even on x86) matters
   (and the author is ok with taking a 500 cycles hit if he's wrong)

If you think unlikely() means something else, we should fix what it maps to 
towards gcc ;)
(to.. be empty ;)

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] Fix Unlikely(x) == y

2008-02-18 Thread Arjan van de Ven
On Mon, 18 Feb 2008 13:11:06 -0500
[EMAIL PROTECTED] wrote:

> On Mon, 18 Feb 2008 14:27:10 GMT, David Howells said:
> 
> > __builtin_expect() is useful on FRV where you _have_ to give each
> > branch and conditional branch instruction a measure of probability
> > whether the branch will be taken.
> 
> What does gcc do the 99.998% of the time we don't have
> likely/unlikely coded?

see Andi's email.
It gets the exact same hints that 95%+ of the kernels unlikely/likely get you,
because the heuristics in it are usually the same as the kernel programmers 
heuristics.


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Patch 0/2] powerpc: avoid userspace poking to legacy ioports

2008-02-18 Thread Arjan van de Ven
On Mon, 18 Feb 2008 21:58:42 +0100
Jean Delvare <[EMAIL PROTECTED]> wrote:

> On Tue, 19 Feb 2008 07:42:03 +1100, Benjamin Herrenschmidt wrote:
> > 
> > > Maybe Christian's patch can be improved to not do the check on
> > > these? As long as /dev/port exists, it seems reasonable that the
> > > kernel should behave, no matter what I/O ports are accessed from
> > > user-space.
> > 
> > nonsense.
> > 
> >  /dev/mem exists for example, but you are still not supposed to go
> > bang all over the place in it.
> 
> You should at least be able to read from it without crashing the
> machine. Of course writing is a different story.

keep dreaming. This is not how /dev/mem works today, not on x86 and very likely 
not on ppc either.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] Fix Unlikely(x) == y

2008-02-18 Thread Arjan van de Ven
On Tue, 19 Feb 2008 13:33:53 +1100
Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> Actually one thing I don't like about gcc is that I think it still
> emits cmovs for likely/unlikely branches, which is silly (the gcc
> developers seem to be in love with that instruction). If that goes
> away, then branch hints may be even better.

only for -Os and only if the result is smaller afaik.
(cmov tends to be a performance loss most of the time so for -O2 and such it
isn't used as far as I know.. it does make for nice small code however ;-)

 


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: the printk problem

2008-07-05 Thread Arjan van de Ven

Linus Torvalds wrote:


On Fri, 4 Jul 2008, Linus Torvalds wrote:
Still all happily untested, of course. And still with no actual users 
converted.


Ok, it's tested, and here's an example usage conversion.

The diffstat pretty much says it all. It _does_ change the format of the 
stack trace entry a bit, but I don't think it's for the worse (unless it 
breaks things like the oops tracker - Arjan?)


It changes the symbol-in-module format from

:ext3:add_dirent_to_buf+0x6c/0x26c

to

add_dirent_to_buf+0x6c/0x26c [ext3]

but quite frankly, the latter was the standard format anyway (it's what 
"sprint_symbol()" gives you), and traps_64.c was the odd man out.




This won't break anything for me actually; I already deal with either case.

$ cat oopsparse.pl | wc -l
1252

the kernel is so inconsistent with oops formats (over time/across architectures)
that once you deal with what there is today... you pretty much deal with 
everything.

I also like the improvement; I wished something like this existed several times 
already
in the last few months so for sure

Acked-by: Arjan van de Ven <[EMAIL PROTECTED]>
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: dma_alloc_coherent() on PPC32: physical addresses above 2G possible?

2008-07-20 Thread Arjan van de Ven
On Sun, 20 Jul 2008 20:36:23 +0200
Stefan Richter <[EMAIL PROTECTED]> wrote:

> PS:  I don't want to set the DMA mask of this device to
> DMA_31BIT_MASK because that would be detrimental to other functions
> of the device. It's a TI TSB43AB22A FireWire controller.

Hi,

just want to mention that you can set the coherent mask separately from
the generic mask... is that sufficient for your load?
(you can even set it just for this allocation..)

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: dma_alloc_coherent() on PPC32: physical addresses above 2G possible?

2008-07-20 Thread Arjan van de Ven
On Sun, 20 Jul 2008 21:25:51 +0200
Stefan Richter <[EMAIL PROTECTED]> wrote:

> 
> Later on:
> 
>   if (dev->needs_dma_mask_workaround)
>   pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK);
>   allocate_something_special;
>   if (dev->needs_dma_mask_workaround)
>   pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK);
> 
> Or is there a variant of dma_alloc_coherent() which directly accepts a
> mask argument?

something like this.
But realistically, how many consistent/coherent allocations do you have?
some ring buffers and other one time stuff surely... but not after that?

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev



Re: dma_alloc_coherent() on PPC32: physical addresses above 2G possible?

2008-07-20 Thread Arjan van de Ven
On Sun, 20 Jul 2008 21:25:51 +0200
Stefan Richter <[EMAIL PROTECTED]> wrote:

> Arjan van de Ven wrote:
> > On Sun, 20 Jul 2008 20:36:23 +0200
> > Stefan Richter <[EMAIL PROTECTED]> wrote:
> > 
> >> PS:  I don't want to set the DMA mask of this device to
> >> DMA_31BIT_MASK because that would be detrimental to other functions
> >> of the device. It's a TI TSB43AB22A FireWire controller.
> > 
> > Hi,
> > 
> > just want to mention that you can set the coherent mask separately
> > from the generic mask... is that sufficient for your load?
> > (you can even set it just for this allocation..)
> 
> Hmm.  Would that be done this way?
> During probe:
> 
>   if (chip_is_tsb43ab22a) {
>   if (dma_supported(dev, DMA_31BIT_MASK))
>   chip->needs_dma_mask_workaround = 1;
>   else
>   chip->needs_some_other_workaround = 1;
>   }

btw it might be nicer to make this

chip->something_special_mask = DMA_31BIT_MASK;

then you can just use the mask from this struct rather than another
check


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [BUG] linux-next: Tree for August 26 - Badness at kernel/notifier.c:25

2008-08-26 Thread Arjan van de Ven

Kamalesh Babulal wrote:

Hi Stephen,

Badness warning is seen, while booting up the next-20080825/26 kernels on 
the powerpc boxes




this is fixed in the patch I sent to Ingo earlier today
(attached again for reference)
>From eafa461d187448998b1f66c9134e66b125db9531 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <[EMAIL PROTECTED]>
Date: Tue, 26 Aug 2008 09:01:06 -0700
Subject: [PATCH] debug: add notifier chain debugging

during some development we suspected a case where we left something
in a notifier chain that was from a module that was unloaded already...
and that sort of thing is rather hard to track down.

This patch adds a very simple sanity check (which isn't all that
expensive) to make sure the notifier we're about to call is
actually from either the kernel itself of from a still-loaded
module, avoiding a hard-to-chase-down crash.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>
Acked-by: Tony Luck <[EMAIL PROTECTED]>
---
 include/linux/kernel.h |3 +++
 kernel/extable.c   |   16 
 kernel/notifier.c  |6 ++
 lib/vsprintf.c |2 +-
 4 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 2651f80..4e1366b 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -187,6 +187,9 @@ extern unsigned long long memparse(char *ptr, char **retptr);
 extern int core_kernel_text(unsigned long addr);
 extern int __kernel_text_address(unsigned long addr);
 extern int kernel_text_address(unsigned long addr);
+extern int func_ptr_is_kernel_text(void *ptr);
+extern void *dereference_function_descriptor(void *ptr);
+
 struct pid;
 extern struct pid *session_of_pgrp(struct pid *pgrp);
 
diff --git a/kernel/extable.c b/kernel/extable.c
index a26cb2e..adf0cc9 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -66,3 +66,19 @@ int kernel_text_address(unsigned long addr)
 		return 1;
 	return module_text_address(addr) != NULL;
 }
+
+/*
+ * On some architectures (PPC64, IA64) function pointers
+ * are actually only tokens to some data that then holds the
+ * real function address. As a result, to find if a function
+ * pointer is part of the kernel text, we need to do some
+ * special dereferencing first.
+ */
+int func_ptr_is_kernel_text(void *ptr)
+{
+	unsigned long addr;
+	addr = (unsigned long) dereference_function_descriptor(ptr);
+	if (core_kernel_text(addr))
+		return 1;
+	return module_text_address(addr) != NULL;
+}
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 823be11..522277c 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -82,6 +82,12 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
 
 	while (nb && nr_to_call) {
 		next_nb = rcu_dereference(nb->next);
+		if (!func_ptr_is_kernel_text(nb->notifier_call)) {
+			WARN(1, "Invalid notifier called!");
+			nb = next_nb;
+			continue;
+		}
+
 		ret = nb->notifier_call(nb, val, v);
 
 		if (nr_calls)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index d8d1d11..f5e5ffb 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -513,7 +513,7 @@ static char *string(char *buf, char *end, char *s, int field_width, int precisio
 	return buf;
 }
 
-static inline void *dereference_function_descriptor(void *ptr)
+void *dereference_function_descriptor(void *ptr)
 {
 #if defined(CONFIG_IA64) || defined(CONFIG_PPC64)
 	void *p;
-- 
1.5.5.1

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [BUG] linux-next: Tree for August 26 - Badness at kernel/notifier.c:25

2008-08-27 Thread Arjan van de Ven

Kamalesh Babulal wrote:


Thanks for reference of the patch, After replacing the patch with the latest 
one above on the powerpc, the warning still remains


Badness at kernel/notifier.c:86


sadly you have something going on that doesn't list the modules loaded etc...

is this during boot or way later?
(because if it's the later, you might be hitting a legitimate bug ;-)


NIP: c0081470 LR: c0081494 CTR: c005a2d0
REGS: c021ce0bfaf0 TRAP: 0700   Not tainted  
(2.6.27-rc4-next-20080826-autotest)
MSR: 80029032   CR: 24008042  XER: 0005
TASK = c015de08[1] 'swapper' THREAD: c021ce0bc000 CPU: 0
GPR00: c0081494 c021ce0bfd70 c081e940 c0749c38 
GPR04: 0003 0001  c021ce0bfe90 
GPR08:   c04fd9f0 c04fd9f0 
GPR12: 2442 c089c300 02307ef0 c06332a0 
GPR16: c0631f28 c0633388 018bf8b0 0270 
GPR20: c070b07c c0707ef0 c0708160 c0631c58 
GPR24: 0003 0001 c021ce0bfe90  
GPR28:  c0749c20 c07bf338 c0749c38 
NIP [c0081470] .notifier_call_chain+0x70/0x140

LR [c0081494] .notifier_call_chain+0x94/0x140
Call Trace:
[c021ce0bfd70] [c0081494] .notifier_call_chain+0x94/0x140 
(unreliable)
[c021ce0bfe20] [c04fe3fc] .cpu_up+0x10c/0x200
[c021ce0bfee0] [c06cdcc0] .kernel_init+0x1b0/0x440
[c021ce0bff90] [c00299cc] .kernel_thread+0x4c/0x68
Instruction dump:
e863 2fa3 419e00f0 2fa6 419e00e8 2e27 7c7f1b78 3b60 
4828 6000 6000 6000 <0fe0> 2fbd 2f3c 7fbfeb78 





___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: powerpc allmodconfig

2008-10-16 Thread Arjan van de Ven
On Thu, 16 Oct 2008 12:49:23 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:
> #endif
> #define __WARN() warn_on_slowpath(__FILE__, __LINE__)
> #define __WARN_printf(arg...) warn_slowpath(__FILE__, __LINE__, arg)
> #else
> #define __WARN_printf(arg...) __WARN()

the easiest way I suppose would be to do

#define __WARN_printf(arg..) do { printk(arg); __WARN(); } while (0)

any obvious problems with this ?


-- 
Arjan van de VenIntel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] bootgraph: fix for use with dot symbols

2009-01-27 Thread Arjan van de Ven

Michael Neuling wrote:

powerpc has dot symbols, so the dmesg output looks like:
 
<4>[0.327310] calling  .migration_init+0x0/0x9c @ 1

<4>[0.327595] initcall .migration_init+0x0/0x9c returned 1 after 0 usecs

The below fixes bootgraph.pl so it handles this correctly.  



question for the ppc folks why does the "print symbol" magic format string
thing print a dot symbol and not the real function name?
Should that be fixed instead?
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev