Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP

2016-10-09 Thread Christoph Hellwig
On Fri, Oct 07, 2016 at 08:47:51AM +1100, Dave Chinner wrote:
> Except that it's DAX, and in 4.7-rc1 that used shared locking at the
> XFS level and never took exclusive locks.
> 
> *However*, the DAX IO path locking in XFS  has changed in 4.9-rc1 to
> match the buffered IO single writer POSIX semantics - the test is a
> bad test based on the fact it exercised a path that is under heavy
> development and so can't be used as a regression test across
> multiple kernels.

That being said - I wonder if we should allow the shared lock on DAX
files IFF the user is specifying O_DIRECT in the open mode..
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

2016-10-09 Thread Piotr Kwapulinski
The MPOL_LOCAL mode has been implemented by
Peter Zijlstra 
(commit: 479e2802d09f1e18a97262c4c6f8f17ae5884bd8).
Add the documentation for this mode.

Signed-off-by: Piotr Kwapulinski 
---
This version adds more details about MPOL_LOCAL mode:
1. difference between MPOL_LOCAL and MPOL_DEFAULT
2. what if local node is overallocated or not allowed by the cpuset
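
For illustration only (not part of the patch), here is a minimal sketch
of how a caller might request the mode documented here. It issues the raw
system call so that it builds without the libnuma headers. The helper
names sketch_set_policy() and use_local_allocation() are hypothetical,
and the values 1 and 4 are assumed to match MPOL_PREFERRED and MPOL_LOCAL
in the kernel's <linux/mempolicy.h>. The fallback to MPOL_PREFERRED with
an empty node set follows the mbind.2 text in the diff, which lists both
as ways to select "local allocation".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Assumed to match the kernel's <linux/mempolicy.h>. The names are
 * prefixed so they cannot clash with libnuma's <numaif.h>.
 */
enum {
	SKETCH_MPOL_PREFERRED = 1,
	SKETCH_MPOL_LOCAL = 4,
};

/*
 * Hypothetical helper: set the calling thread's policy to 'mode' with
 * an empty node set (NULL nodemask, maxnode 0).
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_set_policy(int mode)
{
	if (syscall(SYS_set_mempolicy, mode, NULL, 0UL) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: ask for local allocation. If the kernel rejects
 * MPOL_LOCAL with EINVAL, retry with MPOL_PREFERRED and an empty node
 * set, which the man page describes as equivalent.
 */
static int use_local_allocation(void)
{
	int ret = sketch_set_policy(SKETCH_MPOL_LOCAL);

	if (ret == -EINVAL)
		ret = sketch_set_policy(SKETCH_MPOL_PREFERRED);
	return ret;
}
```

A caller would check the return value: other failures (for example a
kernel built without NUMA support, or a sandbox that filters the system
call) are passed back unchanged as negative errno values.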
---
 man2/mbind.2 | 28 
 man2/set_mempolicy.2 | 19 ++-
 2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/man2/mbind.2 b/man2/mbind.2
index 3ea24f6..1dbda1e 100644
--- a/man2/mbind.2
+++ b/man2/mbind.2
@@ -130,8 +130,9 @@ argument must specify one of
 .BR MPOL_DEFAULT ,
 .BR MPOL_BIND ,
 .BR MPOL_INTERLEAVE ,
+.BR MPOL_PREFERRED ,
 or
-.BR MPOL_PREFERRED .
+.BR MPOL_LOCAL .
 All policy modes except
 .B MPOL_DEFAULT
 require the caller to specify via the
@@ -258,9 +259,26 @@ and
 .I maxnode
 arguments specify the empty set, then the memory is allocated on
 the node of the CPU that triggered the allocation.
-This is the only way to specify "local allocation" for a
-range of memory via
-.BR mbind ().
+
+.B MPOL_LOCAL
+specifies "local allocation": memory is allocated on the node of
+the CPU that triggered the allocation (the "local node").
+The
+.I nodemask
+and
+.I maxnode
+arguments must specify the empty set. If the "local node" is low
+on free memory, the kernel will try to allocate memory from other
+nodes. The kernel will allocate memory from the "local node"
+whenever memory on that node is released. If the
+"local node" is not allowed by the process's current cpuset context,
+the kernel will try to allocate memory from other nodes. The kernel
+will allocate memory from the "local node" whenever it becomes
+allowed by the process's current cpuset context. In contrast,
+.B MPOL_DEFAULT
+reverts to the policy of the process, which may have been set with
+.BR set_mempolicy (2),
+and that policy may not be "local allocation".
 
 If
 .B MPOL_MF_STRICT
@@ -440,6 +458,8 @@ To select explicit "local allocation" for a memory range,
 specify a
 .I mode
 of
+.B MPOL_LOCAL
+or
 .B MPOL_PREFERRED
 with an empty set of nodes.
 This method will work for
diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
index 1f02037..3592734 100644
--- a/man2/set_mempolicy.2
+++ b/man2/set_mempolicy.2
@@ -79,8 +79,9 @@ argument must specify one of
 .BR MPOL_DEFAULT ,
 .BR MPOL_BIND ,
 .BR MPOL_INTERLEAVE ,
+.BR MPOL_PREFERRED ,
 or
-.BR MPOL_PREFERRED .
+.BR MPOL_LOCAL .
 All modes except
 .B MPOL_DEFAULT
 require the caller to specify via the
@@ -211,6 +212,22 @@ arguments specify the empty set, then the policy
 specifies "local allocation"
 (like the system default policy discussed above).
 
+.B MPOL_LOCAL
+specifies "local allocation": memory is allocated on the node of
+the CPU that triggered the allocation (the "local node").
+The
+.I nodemask
+and
+.I maxnode
+arguments must specify the empty set. If the "local node" is low
+on free memory, the kernel will try to allocate memory from other
+nodes. The kernel will allocate memory from the "local node"
+whenever memory on that node is released. If the
+"local node" is not allowed by the process's current cpuset context,
+the kernel will try to allocate memory from other nodes. The kernel
+will allocate memory from the "local node" whenever it becomes
+allowed by the process's current cpuset context.
+
 The thread memory policy is preserved across an
 .BR execve (2),
 and is inherited by child threads created using
-- 
2.10.0



Re: [PATCHv2] hwmon: Add tc654 driver

2016-10-09 Thread Chris Packham
Hi Guenter,

Thanks for the review. v3 is on its way; some responses below.

On 10/08/2016 07:29 AM, Guenter Roeck wrote:
> On Fri, Oct 07, 2016 at 02:38:44PM +1300, Chris Packham wrote:
>> Add support for the tc654 and tc655 fan controllers from Microchip.
>>
>> http://ww1.microchip.com/downloads/en/DeviceDoc/20001734C.pdf
>>
>> Signed-off-by: Chris Packham 
>> ---
>>
>> Changes in v2:
>> - Add Documentation/hwmon/tc654
>> - Incorporate most of the review comments from Guenter. Additional error
>>   handling is added. Unused/unnecessary code is removed. I decided not
>>   to go down the regmap path yet. I may circle back to it when I look at
>>   using regmap in the adm9240 driver.
>>
>>  .../devicetree/bindings/i2c/trivial-devices.txt|   2 +
>>  Documentation/hwmon/tc654  |  26 ++
>>  drivers/hwmon/Kconfig  |  11 +
>>  drivers/hwmon/Makefile |   1 +
>>  drivers/hwmon/tc654.c  | 513 
>> +
>>  5 files changed, 553 insertions(+)
>>  create mode 100644 Documentation/hwmon/tc654
>>  create mode 100644 drivers/hwmon/tc654.c
>>
>> diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt 
>> b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
>> index 1416c6a0d2cd..833fb9f133d3 100644
>> --- a/Documentation/devicetree/bindings/i2c/trivial-devices.txt
>> +++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
>> @@ -122,6 +122,8 @@ microchip,mcp4662-502Microchip 8-bit Dual I2C 
>> Digital Potentiometer with NV Mem
>>  microchip,mcp4662-103   Microchip 8-bit Dual I2C Digital Potentiometer 
>> with NV Memory (10k)
>>  microchip,mcp4662-503   Microchip 8-bit Dual I2C Digital Potentiometer 
>> with NV Memory (50k)
>>  microchip,mcp4662-104   Microchip 8-bit Dual I2C Digital Potentiometer 
>> with NV Memory (100k)
>> +microchip,tc654 PWM Fan Speed Controller With Fan Fault 
>> Detection
>> +microchip,tc655 PWM Fan Speed Controller With Fan Fault 
>> Detection
>>  national,lm63   Temperature sensor with integrated fan control
>>  national,lm75   I2C TEMP SENSOR
>>  national,lm80   Serial Interface ACPI-Compatible Microprocessor 
>> System Hardware Monitor
>> diff --git a/Documentation/hwmon/tc654 b/Documentation/hwmon/tc654
>> new file mode 100644
>> index ..93796c5c7e79
>> --- /dev/null
>> +++ b/Documentation/hwmon/tc654
>> @@ -0,0 +1,26 @@
>> +Kernel driver tc654
>> +===
>> +
>> +Supported chips:
>> +  * Microchip TC654 and TC655
>> +Prefix: 'tc654'
>> +Datasheet: http://ww1.microchip.com/downloads/en/DeviceDoc/20001734C.pdf
>> +
>> +Authors:
>> +Chris Packham 
>> +Masahiko Iwamoto 
>> +
>> +Description
>> +---
>> +This driver implements support for the Microchip TC654 and TC655.
>> +
>> +The TC654 used the 2-wire interface compatible with the SMBUS 2.0
>
> uses
>

Done.

>> +specification. The TC654 has two (2) inputs for measuring fan RPM and
>> +one (1) PWM output which can be used for fan control.
>> +
>> +Configuration Notes
>> +---
>> +Ordinarily the pwm1_mode ABI is used for controlling the pwm output
>> +mode.  However, for this chip the output is always pwm, and the
>> +pwm1_mode determines if the pwm output is controlled via the pwm1 value
>> +or via the Vin analog input.
>
> Please describe the supported values here.
>
>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>> index 45cef3d2c75c..8681bc65cde5 100644
>> --- a/drivers/hwmon/Kconfig
>> +++ b/drivers/hwmon/Kconfig
>> @@ -907,6 +907,17 @@ config SENSORS_MCP3021
>>This driver can also be built as a module.  If so, the module
>>will be called mcp3021.
>>
>> +config SENSORS_TC654
>> +tristate "Microchip TC654/TC655 and compatibles"
>> +depends on I2C
>> +help
>> +  If you say yes here you get support for TC654 and TC655.
>> +  The TC654 and TC655 are PWM mode fan speed controllers with
>> +  FanSense technology for use with brushless DC fans.
>> +
>> +  This driver can also be built as a module.  If so, the module
>> +  will be called tc654.
>> +
>>  config SENSORS_MENF21BMC_HWMON
>>  tristate "MEN 14F021P00 BMC Hardware Monitoring"
>>  depends on MFD_MENF21BMC
>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>> index aecf4ba17460..c651f0f1d047 100644
>> --- a/drivers/hwmon/Makefile
>> +++ b/drivers/hwmon/Makefile
>> @@ -122,6 +122,7 @@ obj-$(CONFIG_SENSORS_MAX6697)+= max6697.o
>>  obj-$(CONFIG_SENSORS_MAX31790)  += max31790.o
>>  obj-$(CONFIG_SENSORS_MC13783_ADC)+= mc13783-adc.o
>>  obj-$(CONFIG_SENSORS_MCP3021)   += mcp3021.o
>> +obj-$(CONFIG_SENSORS_TC654) += tc654.o
>>  obj-$(CONFIG_SENSORS_MENF21BMC_HWMON) += menf21bmc_hwmon.o
>>  obj-$(CONFIG_SENSORS_NCT6683)   += nct6683.o
>>  obj-$(CONFIG_SENSORS_NCT6775)   += nct6775.o
>> diff --git a/driv

[PATCHv3] hwmon: Add tc654 driver

2016-10-09 Thread Chris Packham
Add support for the tc654 and tc655 fan controllers from Microchip.

http://ww1.microchip.com/downloads/en/DeviceDoc/20001734C.pdf

Signed-off-by: Chris Packham 
---
Changes in v3:
- typofix in documentation
- add missing value to tc654_pwm_map, re-generate based on datasheet.
- remove unnecessary hwmon_dev member from struct tc654_data
- bug fixes in set_fan_min() and show_pwm_mode()
- miscellaneous style fixes

Changes in v2:
- Add Documentation/hwmon/tc654
- Incorporate most of the review comments from Guenter. Additional error
  handling is added. Unused/unnecessary code is removed. I decided not
  to go down the regmap path yet. I may circle back to it when I look at
  using regmap in the adm9240 driver.

 .../devicetree/bindings/i2c/trivial-devices.txt|   2 +
 Documentation/hwmon/tc654  |  31 ++
 drivers/hwmon/Kconfig  |  11 +
 drivers/hwmon/Makefile |   1 +
 drivers/hwmon/tc654.c  | 509 +
 5 files changed, 554 insertions(+)
 create mode 100644 Documentation/hwmon/tc654
 create mode 100644 drivers/hwmon/tc654.c

diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt 
b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
index 1416c6a0d2cd..833fb9f133d3 100644
--- a/Documentation/devicetree/bindings/i2c/trivial-devices.txt
+++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
@@ -122,6 +122,8 @@ microchip,mcp4662-502   Microchip 8-bit Dual I2C 
Digital Potentiometer with NV Mem
 microchip,mcp4662-103  Microchip 8-bit Dual I2C Digital Potentiometer with NV 
Memory (10k)
 microchip,mcp4662-503  Microchip 8-bit Dual I2C Digital Potentiometer with NV 
Memory (50k)
 microchip,mcp4662-104  Microchip 8-bit Dual I2C Digital Potentiometer with NV 
Memory (100k)
+microchip,tc654PWM Fan Speed Controller With Fan Fault 
Detection
+microchip,tc655PWM Fan Speed Controller With Fan Fault 
Detection
 national,lm63  Temperature sensor with integrated fan control
 national,lm75  I2C TEMP SENSOR
 national,lm80  Serial Interface ACPI-Compatible Microprocessor System 
Hardware Monitor
diff --git a/Documentation/hwmon/tc654 b/Documentation/hwmon/tc654
new file mode 100644
index ..91a2843f5f98
--- /dev/null
+++ b/Documentation/hwmon/tc654
@@ -0,0 +1,31 @@
+Kernel driver tc654
+===
+
+Supported chips:
+  * Microchip TC654 and TC655
+Prefix: 'tc654'
+Datasheet: http://ww1.microchip.com/downloads/en/DeviceDoc/20001734C.pdf
+
+Authors:
+Chris Packham 
+Masahiko Iwamoto 
+
+Description
+---
+This driver implements support for the Microchip TC654 and TC655.
+
+The TC654 uses the 2-wire interface compatible with the SMBUS 2.0
+specification. The TC654 has two (2) inputs for measuring fan RPM and
+one (1) PWM output which can be used for fan control.
+
+Configuration Notes
+---
+Ordinarily the pwm1_mode ABI is used for controlling the pwm output
+mode.  However, for this chip the output is always pwm, and the
+pwm1_mode determines if the pwm output is controlled via the pwm1 value
+or via the Vin analog input.
+
+
+Setting pwm1_mode to 1 will cause the pwm output to be driven based on
+the pwm1 value. Setting pwm1_mode to 0 will cause the pwm output to be
+driven based on the Vin input.
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 45cef3d2c75c..8681bc65cde5 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -907,6 +907,17 @@ config SENSORS_MCP3021
  This driver can also be built as a module.  If so, the module
  will be called mcp3021.
 
+config SENSORS_TC654
+   tristate "Microchip TC654/TC655 and compatibles"
+   depends on I2C
+   help
+ If you say yes here you get support for TC654 and TC655.
+ The TC654 and TC655 are PWM mode fan speed controllers with
+ FanSense technology for use with brushless DC fans.
+
+ This driver can also be built as a module.  If so, the module
+ will be called tc654.
+
 config SENSORS_MENF21BMC_HWMON
tristate "MEN 14F021P00 BMC Hardware Monitoring"
depends on MFD_MENF21BMC
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index aecf4ba17460..c651f0f1d047 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -122,6 +122,7 @@ obj-$(CONFIG_SENSORS_MAX6697)   += max6697.o
 obj-$(CONFIG_SENSORS_MAX31790) += max31790.o
 obj-$(CONFIG_SENSORS_MC13783_ADC)+= mc13783-adc.o
 obj-$(CONFIG_SENSORS_MCP3021)  += mcp3021.o
+obj-$(CONFIG_SENSORS_TC654)+= tc654.o
 obj-$(CONFIG_SENSORS_MENF21BMC_HWMON) += menf21bmc_hwmon.o
 obj-$(CONFIG_SENSORS_NCT6683)  += nct6683.o
 obj-$(CONFIG_SENSORS_NCT6775)  += nct6775.o
diff --git a/drivers/hwmon/tc654.c b/drivers/hwmon/tc654.c
new file mode 100644
index ..456e0bb9f94f
--- /dev/null
+++ b/drivers/hwmon/tc654.c
@@ -0,0 

[PATCH] locking/osq: Provide proper lock/unlock and relaxed flavors

2016-10-09 Thread Davidlohr Bueso

Because osq has only been used for mutex/rwsem spinning logic,
we have gotten away with being rather flexible in any of the
traditional lock/unlock ACQUIRE/RELEASE minimal guarantees.
However, if osq were to be used as a _real_ lock, it would
be in trouble. To this end, this patch provides two
alternatives: osq_lock/unlock() calls that have the required
semantics, and _relaxed() calls that give no ordering guarantees
at all.

- node->locked is now completely without ordering for _relaxed()
(currently it's under smp_load_acquire, which does not match, and
the race is harmless to begin with as we just iterate again). For
the ACQUIRE flavor, it is always formed with ctrl dep + smp_rmb().

- In order to avoid more code duplication via macros, the common
osq_wait_next() call is completely unordered, but the caller
can provide the necessary barriers if required, i.e. the case for
osq_unlock(): similar to the node->locked case, this also relies
on ctrl dep + smp_wmb() to form RELEASE.

- If osq_lock() fails we never guarantee any ordering (obviously
the same goes for _relaxed).

Both mutexes and rwsems have been updated to continue using the
relaxed versions, but this will obviously change for the latter.

Signed-off-by: Davidlohr Bueso 
---
XXX: This obviously needs a lot of testing.
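
To make the two flavors concrete, here is a user-space sketch in C11
atomics. It is an analogy under stated assumptions, not the code in this
patch: the real osq is an MCS-style queue, whereas the hypothetical
struct sketch_lock below is a plain test-and-set flag, and
memory_order_acquire/memory_order_release merely stand in for the
kernel's ACQUIRE/RELEASE. All sketch_* names are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock word: 0 = free, 1 = held. */
struct sketch_lock {
	atomic_int locked;
};

/*
 * Ordered flavor: a successful trylock is an ACQUIRE, so accesses in
 * the critical section cannot be reordered before it. A failed
 * attempt implies no ordering at all.
 */
static bool sketch_trylock(struct sketch_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked,
			&expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * Ordered flavor: unlock is a RELEASE, so accesses in the critical
 * section cannot be reordered after it.
 */
static void sketch_unlock(struct sketch_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/*
 * Relaxed flavor: the operations are still atomic, but they order
 * nothing around them. This is only safe when, as with the mutex and
 * rwsem spinning described above, something else provides the
 * ordering for the data being protected.
 */
static bool sketch_trylock_relaxed(struct sketch_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked,
			&expected, 1,
			memory_order_relaxed, memory_order_relaxed);
}

static void sketch_unlock_relaxed(struct sketch_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_relaxed);
}
```

Both flavors provide the same mutual exclusion on the flag itself; they
differ only in what they promise about surrounding memory accesses,
which is the distinction the patch draws between osq_lock() and
osq_lock_relaxed().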

include/asm-generic/barrier.h |   9 ++
include/linux/osq_lock.h  |  10 ++
kernel/locking/mutex.c|   6 +-
kernel/locking/osq_lock.c | 279 +++---
kernel/locking/rwsem-xadd.c   |   4 +-
5 files changed, 177 insertions(+), 131 deletions(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index fe297b599b0a..0036b08151c3 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -221,6 +221,15 @@ do {   
\
#endif

/**
+ * smp_release__after_ctrl_dep() - Provide RELEASE ordering after a control 
dependency
+ *
+ * A control dependency provides a LOAD->STORE order, the additional WMB
+ * provides STORE->STORE order, together they provide {LOAD,STORE}->STORE 
order,
+ * aka. (store)-RELEASE.
+ */
+#define smp_release__after_ctrl_dep()  smp_wmb()
+
+/**
 * smp_cond_load_acquire() - (Spin) wait for cond with ACQUIRE ordering
 * @ptr: pointer to the variable to wait on
 * @cond: boolean expression to wait for
diff --git a/include/linux/osq_lock.h b/include/linux/osq_lock.h
index 703ea5c30a33..a63ffa95aa70 100644
--- a/include/linux/osq_lock.h
+++ b/include/linux/osq_lock.h
@@ -29,6 +29,16 @@ static inline void osq_lock_init(struct 
optimistic_spin_queue *lock)
atomic_set(&lock->tail, OSQ_UNLOCKED_VAL);
}

+/*
+ * Versions of osq_lock/unlock that do not imply or guarantee (load)-ACQUIRE
+ * (store)-RELEASE barrier semantics.
+ *
+ * Note that a failed call to either osq_lock() or osq_lock_relaxed() does
+ * not imply barriers... we are next to block.
+ */
+extern bool osq_lock_relaxed(struct optimistic_spin_queue *lock);
+extern void osq_unlock_relaxed(struct optimistic_spin_queue *lock);
+
extern bool osq_lock(struct optimistic_spin_queue *lock);
extern void osq_unlock(struct optimistic_spin_queue *lock);

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90db3909..b1bf1e057565 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -316,7 +316,7 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 * acquire the mutex all at once, the spinners need to take a
 * MCS (queued) lock first before spinning on the owner field.
 */
-   if (!osq_lock(&lock->osq))
+   if (!osq_lock_relaxed(&lock->osq))
goto done;

while (true) {
@@ -358,7 +358,7 @@ static bool mutex_optimistic_spin(struct mutex *lock,
}

mutex_set_owner(lock);
-   osq_unlock(&lock->osq);
+   osq_unlock_relaxed(&lock->osq);
return true;
}

@@ -380,7 +380,7 @@ static bool mutex_optimistic_spin(struct mutex *lock,
cpu_relax_lowlatency();
}

-   osq_unlock(&lock->osq);
+   osq_unlock_relaxed(&lock->osq);
done:
/*
 * If we fell out of the spin path because of need_resched(),
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a37857ab55..d3d1042a509c 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -28,6 +28,17 @@ static inline struct optimistic_spin_node *decode_cpu(int 
encoded_cpu_val)
return per_cpu_ptr(&osq_node, cpu_nr);
}

+static inline void set_node_locked_release(struct optimistic_spin_node *node)
+{
+   smp_store_release(&node->locked, 1);
+}
+
+static inline void set_node_locked_relaxed(struct optimistic_spin_node *node)
+{
+   WRITE_ONCE(node->locked, 1);
+
+}
+
/*
 * Get a stable @node->next pointer, either for unlock() or unqueue() purposes.
 * Can return NULL in case w

Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP

2016-10-09 Thread Dave Chinner
On Sun, Oct 09, 2016 at 08:17:48AM -0700, Christoph Hellwig wrote:
> On Fri, Oct 07, 2016 at 08:47:51AM +1100, Dave Chinner wrote:
> > Except that it's DAX, and in 4.7-rc1 that used shared locking at the
> > XFS level and never took exclusive locks.
> > 
> > *However*, the DAX IO path locking in XFS  has changed in 4.9-rc1 to
> > match the buffered IO single writer POSIX semantics - the test is a
> > bad test based on the fact it exercised a path that is under heavy
> > development and so can't be used as a regression test across
> > multiple kernels.
> 
> That being said - I wonder if we should allow the shared lock on DAX
> files IFF the user is specifying O_DIRECT in the open mode..

It should do - if it doesn't then we screwed up the IO path
selection logic in XFS and we'll need to fix it.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com