NFC-FDP: Completion of error handling around fdp_nci_i2c_read_device_properties()
Hello, I have noticed that the function fdp_nci_i2c_read_device_properties() does not contain a null pointer check after the call to devm_kmalloc(). https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nfc/fdp/i2c.c?h=v4.10#n234 What do you think about adding a corresponding check and adjusting the function's return type there? Regards, Markus
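For illustration, a minimal sketch of the kind of check being asked for here; "buf", "dev" and "len" are placeholder names and the error path is an assumption, not the driver's actual code:

	buf = devm_kmalloc(dev, len, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;	/* only possible once the function returns an errno */

Propagating -ENOMEM like this is what the suggested change of the return type would make possible.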
Re: [PATCH nf-next] ipset: remove unused function __ip_set_get_netlink
On Fri, Apr 14, 2017 at 04:15:41PM +0200, Jozsef Kadlecsik wrote: > Hi Pablo, > > On Fri, 14 Apr 2017, Pablo Neira Ayuso wrote: > > > On Mon, Apr 10, 2017 at 03:52:37PM -0400, Aaron Conole wrote: > > > There are no in-tree callers. > > > > @Jozsef, let me know if I should just take this to save you a pull > > request. > > Just take it, thank you. > > Acked-by: Jozsef Kadlecsik Applied, thanks Jozsef.
Re: [PATCH 06/38] Annotate hardware config module parameters in drivers/clocksource/
On Sat, 15 Apr 2017, David Howells wrote: > Thomas Gleixner wrote: > > > > Btw, is it possible to use IRQ grants to prevent a device that has limited > > > IRQ options from being drivable? > > > > What do you mean with 'IRQ grants' ? > > request_irq(). I still can't parse the sentence above. If request_irq() fails the device initialization fails. If you request the wrong irq then request_irq() might succeed but the device won't work. Thanks, tglx
[PATCH] genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs
When requesting a shared irq with IRQF_TRIGGER_NONE then the irqaction flags get filled with the trigger type from the irq_data: if (!(new->flags & IRQF_TRIGGER_MASK)) new->flags |= irqd_get_trigger_type(&desc->irq_data); On the first setup_irq() the trigger type in irq_data is NONE when the above code executes, then the irq is started up for the first time and then the actual trigger type gets established, but that's too late to fix up new->flags. When then a second user of the irq requests the irq with IRQF_TRIGGER_NONE its irqaction's triggertype gets set to the actual trigger type and the following check fails: if (!((old->flags ^ new->flags) & IRQF_TRIGGER_MASK)) Resulting in the request_irq failing with -EBUSY even though both users requested the irq with IRQF_SHARED | IRQF_TRIGGER_NONE This commit fixes this by comparing the new irqaction's trigger type to the trigger type stored in the irq_data which correctly reflects the actual trigger type being used for the irq. Suggested-by: Thomas Gleixner Signed-off-by: Hans de Goede --- kernel/irq/manage.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index a4afe5c..d63e91f 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1212,8 +1212,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) * set the trigger type must match. Also all must * agree on ONESHOT. */ + unsigned int oldtype = irqd_get_trigger_type(&desc->irq_data); + if (!((old->flags & new->flags) & IRQF_SHARED) || - ((old->flags ^ new->flags) & IRQF_TRIGGER_MASK) || + (oldtype != (new->flags & IRQF_TRIGGER_MASK)) || ((old->flags ^ new->flags) & IRQF_ONESHOT)) goto mismatch; -- 2.9.3
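For reference, the -EBUSY scenario described above needs two users sharing the same line; a sketch of the request pattern that used to fail (the irq number, handler names and cookies are placeholders):

	/* first driver: irq_data still carries trigger type NONE at this point,
	 * the real trigger type is only established when the irq is started up */
	ret = request_irq(irq, a_handler, IRQF_SHARED | IRQF_TRIGGER_NONE,
			  "dev-a", dev_a);

	/* second driver: its IRQF_TRIGGER_NONE got replaced by the now
	 * established trigger type, the old->flags ^ new->flags comparison saw
	 * a mismatch, and request_irq() returned -EBUSY before this fix */
	ret = request_irq(irq, b_handler, IRQF_SHARED | IRQF_TRIGGER_NONE,
			  "dev-b", dev_b);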
Re: [PATCH 06/12] audit: Use timespec64 to represent audit timestamps
On Sat, Apr 8, 2017 at 5:58 PM, Deepa Dinamani wrote: >> I have no problem merging this patch into audit/next for v4.12, would >> you prefer me to do that so at least this patch is merged? > > This would be fine. > But, I think whoever takes the last 2 deletion patches should also take them. > I'm not sure how that part works out. > >> It would probably make life a small bit easier for us in the audit >> world too as it would reduce the potential merge conflict. However, >> that's a relatively small thing to worry about. As Andrew has picked the remaining patches up into -mm, this will work out fine: any patches picked up by the respective maintainers for v4.12 should arrive as git pull requests before the -mm patches get applied at a later stage of the merge window. Arnd
Re: [PATCH 1/2] regulator: s2mps11: Use kcalloc() in s2mps11_pmic_probe()
On Fri, Apr 14, 2017 at 11:01:25PM +0200, SF Markus Elfring wrote: > From: Markus Elfring > Date: Fri, 14 Apr 2017 22:00:35 +0200 > > A multiplication for the size determination of a memory allocation > indicated that an array data structure should be processed. > Thus use the corresponding function "kcalloc". > > This issue was detected by using the Coccinelle software. Unfortunately you write mostly cryptic commit messages. This does not answer for the main question - why this change is needed. Code looks okay, but you should explain in simple words why this is needed. Best regards, Krzysztof > > Signed-off-by: Markus Elfring > --- > drivers/regulator/s2mps11.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c > index 7726b874e539..b4e588cce03d 100644 > --- a/drivers/regulator/s2mps11.c > +++ b/drivers/regulator/s2mps11.c > @@ -1162,7 +1162,7 @@ static int s2mps11_pmic_probe(struct platform_device > *pdev) > } > } > > - rdata = kzalloc(sizeof(*rdata) * rdev_num, GFP_KERNEL); > + rdata = kcalloc(rdev_num, sizeof(*rdata), GFP_KERNEL); > if (!rdata) > return -ENOMEM; > > -- > 2.12.2 >
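For context, kcalloc() performs the same zeroed array allocation as the open-coded kzalloc(n * size), but it additionally fails the allocation if the multiplication would overflow, which is the usual motivation for this conversion:

	/* before: an overflowing multiplication silently under-allocates */
	rdata = kzalloc(sizeof(*rdata) * rdev_num, GFP_KERNEL);

	/* after: kcalloc() returns NULL when rdev_num * sizeof(*rdata) overflows */
	rdata = kcalloc(rdev_num, sizeof(*rdata), GFP_KERNEL);
	if (!rdata)
		return -ENOMEM;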
Re: [Patch v2 1/2] lustre: Parentheses added for macro argument to avoid precedence issues
On Sat, Apr 15, 2017 at 11:25:00AM +, Rishiraj Manwatkar wrote: > From: RishirajAM > > Parantheses are added for Macro argument, to avoid precedence issues. > > Signed-off-by: Rishiraj Manwatkar > --- > drivers/staging/lustre/lustre/obdclass/cl_io.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) What changed from v1? Always put that below the --- line. And your From: line doesn't match your signed-off-by name, that's not ok. > > diff --git a/drivers/staging/lustre/lustre/obdclass/cl_io.c > b/drivers/staging/lustre/lustre/obdclass/cl_io.c > index ee7d677..0997254 100755 > --- a/drivers/staging/lustre/lustre/obdclass/cl_io.c > +++ b/drivers/staging/lustre/lustre/obdclass/cl_io.c > @@ -52,9 +52,9 @@ > */ > > #define cl_io_for_each(slice, io) \ > - list_for_each_entry((slice), &io->ci_layers, cis_linkage) > + list_for_each_entry((slice), &(io)->ci_layers, cis_linkage) What 'precidence' issue is this fixing? How could that ever be incorrect? Really, this macro just needs to go away and be used "as is" anyway... thanks, greg k-h
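For readers wondering about the general hazard being questioned here: an unparenthesized macro argument only misbehaves when a caller passes a compound expression. A standalone sketch, where FIELD_ADDR, cond, a and b are made-up names unrelated to the lustre code:

	#define FIELD_ADDR_BAD(p)	(&p->field)
	#define FIELD_ADDR_OK(p)	(&(p)->field)

	/* FIELD_ADDR_BAD(cond ? a : b) expands to (&cond ? a : b->field):
	 * '&' and '->' bind to the wrong operands, so the result no longer
	 * means "address of the field of whichever pointer was selected".
	 * FIELD_ADDR_OK(cond ? a : b) expands to (&(cond ? a : b)->field),
	 * which does. */

Since cl_io_for_each() is only ever invoked with a plain variable, the change is defensive rather than a bug fix, which appears to be what the question above is getting at.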
Re: [Patch v2 2/2] lustre: fix coding style issue
On Sat, Apr 15, 2017 at 11:25:11AM +, Rishiraj Manwatkar wrote: > Comparison should have the CONSTANT on the right side of the test Your subject needs to be better :( thanks, greg k-h
[PATCH 3/3] mm: __first_valid_page skip over offline pages
From: Michal Hocko __first_valid_page skips over invalid pfns in the range but it might still stumble over offline pages. At least start_isolate_page_range will mark those set_migratetype_isolate. This doesn't represent any immediate AFAICS because alloc_contig_range will fail to isolate those pages but it relies on not fully initialized page which will become a problem later when we stop associating offline pages to zones. So this is more a preparatory patch than a fix. Signed-off-by: Michal Hocko --- mm/page_isolation.c | 26 ++ 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/mm/page_isolation.c b/mm/page_isolation.c index 5092e4ef00c8..2b958f33a1eb 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -138,12 +138,18 @@ static inline struct page * __first_valid_page(unsigned long pfn, unsigned long nr_pages) { int i; - for (i = 0; i < nr_pages; i++) - if (pfn_valid_within(pfn + i)) - break; - if (unlikely(i == nr_pages)) - return NULL; - return pfn_to_page(pfn + i); + + for (i = 0; i < nr_pages; i++) { + struct page *page; + + if (!pfn_valid_within(pfn + i)) + continue; + page = pfn_to_page(pfn + i); + if (PageReserved(page)) + continue; + return page; + } + return NULL; } /* @@ -184,8 +190,12 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, undo: for (pfn = start_pfn; pfn < undo_pfn; -pfn += pageblock_nr_pages) - unset_migratetype_isolate(pfn_to_page(pfn), migratetype); +pfn += pageblock_nr_pages) { + struct page *page = pfn_to_page(pfn); + if (PageReserved(page)) + continue; + unset_migratetype_isolate(page, migratetype); + } return -EBUSY; } -- 2.11.0
[PATCH 2/3] mm, compaction: skip over holes in __reset_isolation_suitable
From: Michal Hocko __reset_isolation_suitable walks the whole zone pfn range and it tries to jump over holes by checking the zone for each page. It might still stumble over offline pages, though. Skip those by checking PageReserved. Signed-off-by: Michal Hocko --- mm/compaction.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/compaction.c b/mm/compaction.c index de64dedefe0e..df4156d8b037 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -239,6 +239,8 @@ static void __reset_isolation_suitable(struct zone *zone) continue; page = pfn_to_page(pfn); + if (PageReserved(page)) + continue; if (zone != page_zone(page)) continue; -- 2.11.0
[no subject]
Hi, here are 3 more preparatory patches which I meant to send on Thursday but forgot... After more thinking about pfn walkers I have realized that the current code doesn't check for offline holes in zones. From a quick review that doesn't seem to be a problem currently. Pfn walkers can race with memory offlining, and with the original hotplug implementation those offline pages can change the zone, but I wasn't able to find any serious problem other than small confusion. With the new hotplug code those pages will not have any valid zone, though, so those code paths should check PageReserved to rule out offline holes. I hope I have addressed all of them in these 3 patches. I would appreciate it if Vlastimil and Joonsoo double checked after me.
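In sketch form, the pattern the three patches in this series apply to pfn walkers (relying on the assumption, stated in the patches, that offline or never-onlined pages are PageReserved):

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		struct page *page;

		if (!pfn_valid_within(pfn))
			continue;
		page = pfn_to_page(pfn);
		/* offline hole: the page has no valid zone association yet */
		if (PageReserved(page))
			continue;
		/* ... safe to inspect the page here ... */
	}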
[PATCH 1/3] mm: consider zone which is not fully populated to have holes
From: Michal Hocko __pageblock_pfn_to_page has two users currently, set_zone_contiguous which checks whether the given zone contains holes and pageblock_pfn_to_page which then carefully returns a first valid page from the given pfn range for the given zone. This doesn't handle zones which are not fully populated though. Memory pageblocks can be offlined or might not have been onlined yet. In such a case the zone should be considered to have holes otherwise pfn walkers can touch and play with offline pages. Current callers of pageblock_pfn_to_page in compaction seem to work properly right now because they only isolate PageBuddy (isolate_freepages_block) or PageLRU resp. __PageMovable (isolate_migratepages_block) which will be always false for these pages. It would be safer to skip these pages altogether, though. In order to do that let's check PageReserved in __pageblock_pfn_to_page because offline pages are reserved. Signed-off-by: Michal Hocko --- mm/page_alloc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 0cacba69ab04..dcbbcfdda60e 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1351,6 +1351,8 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn, return NULL; start_page = pfn_to_page(start_pfn); + if (PageReserved(start_page)) + return NULL; if (page_zone(start_page) != zone) return NULL; -- 2.11.0
[PULL] irqchip updates for 4.12
Hi Thomas, Here's the first batch of irqchip updates for 4.12. On the menu, we have this time the unification of the Faraday irqchips in a single code base, ACPI support for mgigen, a new Mediatek wake-up controller, plus some updates here and there. Please pull. Thanks, M. The following changes since commit c02ed2e75ef4c74e41e421acb4ef1494671585e8: Linux 4.11-rc4 (2017-03-26 14:15:16 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git tags/irqchip-4.12 for you to fetch changes up to 9d4b5bdc5b34e3e89e84d7cf62a8e513b25a8905: irqchip/irq-imx-gpcv2: Clear OF_POPULATED flag (2017-04-12 09:20:15 +0100) irqchip updates for v4.12 - Unify gemini and moxa irqchips under the faraday banner - Extend mtk-sysirq to deal with multiple MMIO regions - ACPI/IORT support for GICv3 ITS platform MSI - ACPI support for mbigen - Add mtk-cirq wakeup interrupt controller driver - Atmel aic5 suspend support - Allow GPCv2 to be probed both as an irqchip and a device Alexandre Belloni (1): irqchip/atmel-aic5: Handle suspend to RAM Andrey Smirnov (1): irqchip/irq-imx-gpcv2: Clear OF_POPULATED flag Hanjun Guo (6): irqchip/gic-v3-its: Keep the include header files in alphabetic order irqchip/gicv3-its: platform-msi: Refactor its_pmsi_prepare() irqchip/gicv3-its: platform-msi: Refactor its_pmsi_init() to prepare for ACPI irqchip/gicv3-its: platform-msi: Scan MADT to create platform msi domain platform-msi: Make platform_msi_create_device_domain() ACPI aware irqchip/mbigen: Add ACPI support Kefeng Wang (2): irqchip/mbigen: Drop module owner irqchip/mbigen: Introduce mbigen_of_create_domain() Linus Walleij (4): dt-bindings: gemini: augment Gemini bindings to reflect Faraday origin irqchip/gemini: Refactor Gemini driver to reflect Faraday origin irqchip/faraday: Fix the trigger types irqchip/faraday: Replace moxa with ftintc010 Marc Zyngier (1): irqchip/gic-v3-its: Add IORT hook for platform MSI support Mars Cheng (3): dt-bindings: mtk-sysirq: Add multiple bases support for Mediatek sysirq irqchip/mtk-sysirq: Extend intpol base to arbitrary number irqchip/mtk-sysirq: Remove unnecessary barrier when configuring trigger Youlin Pei (2): dt-bindings: mtk-cirq: Add binding document irqchip: Add Mediatek mtk-cirq driver ...errupt-controller.txt => faraday,ftintc010.txt} | 11 +- .../interrupt-controller/mediatek,cirq.txt | 35 +++ .../interrupt-controller/mediatek,sysirq.txt | 11 +- arch/arm/mach-moxart/Kconfig | 2 +- drivers/base/platform-msi.c| 3 +- drivers/irqchip/Kconfig| 6 + drivers/irqchip/Makefile | 5 +- drivers/irqchip/irq-atmel-aic5.c | 29 +- drivers/irqchip/irq-ftintc010.c| 194 + drivers/irqchip/irq-gemini.c | 185 - drivers/irqchip/irq-gic-v3-its-platform-msi.c | 113 ++-- drivers/irqchip/irq-gic-v3-its.c | 2 +- drivers/irqchip/irq-imx-gpcv2.c| 5 + drivers/irqchip/irq-mbigen.c | 115 ++-- drivers/irqchip/irq-moxart.c | 116 drivers/irqchip/irq-mtk-cirq.c | 306 + drivers/irqchip/irq-mtk-sysirq.c | 116 ++-- 17 files changed, 874 insertions(+), 380 deletions(-) rename Documentation/devicetree/bindings/interrupt-controller/{cortina,gemini-interrupt-controller.txt => faraday,ftintc010.txt} (63%) create mode 100644 Documentation/devicetree/bindings/interrupt-controller/mediatek,cirq.txt create mode 100644 drivers/irqchip/irq-ftintc010.c delete mode 100644 drivers/irqchip/irq-gemini.c delete mode 100644 drivers/irqchip/irq-moxart.c create mode 100644 drivers/irqchip/irq-mtk-cirq.c
[PATCH 0/3] staging: rtl8188eu: fix sparse signedness mismatch warnings
Suppress all sparse signedness mismatch warnings generated by- make C=2 M=drivers/staging/rtl8188eu/ CF="-Wtypesign" Aishwarya Pant (3): staging: rtl8188eu: fix sparse signedness warnings in rtw_get_ie staging: rtl8188eu: fix sparse signedness warnings in rtw_set_ie staging: rtl8188eu: fix sparse signedness warnings in rtw_generate_ie drivers/staging/rtl8188eu/core/rtw_ap.c| 2 +- drivers/staging/rtl8188eu/core/rtw_ieee80211.c | 15 --- drivers/staging/rtl8188eu/core/rtw_mlme_ext.c | 14 +++--- drivers/staging/rtl8188eu/core/rtw_xmit.c | 3 +-- drivers/staging/rtl8188eu/include/ieee80211.h | 6 +++--- 5 files changed, 20 insertions(+), 20 deletions(-) -- 2.7.4
[PATCH 1/3] staging: rtl8188eu: fix sparse signedness warnings in rtw_get_ie
Changed the type of len from (int *) to (unsigned int *) in the function rtw_get_ie(..) and wherever this function is called to suppress signedness mismatch warnings of the type- drivers/staging/rtl8188eu//core/rtw_ap.c:78:60: warning: incorrect type in argument 3 (different signedness) drivers/staging/rtl8188eu//core/rtw_ap.c:78:60:expected int *len drivers/staging/rtl8188eu//core/rtw_ap.c:78:60:got unsigned int * Signed-off-by: Aishwarya Pant --- drivers/staging/rtl8188eu/core/rtw_ap.c| 2 +- drivers/staging/rtl8188eu/core/rtw_ieee80211.c | 10 +- drivers/staging/rtl8188eu/core/rtw_mlme_ext.c | 14 +++--- drivers/staging/rtl8188eu/core/rtw_xmit.c | 3 +-- drivers/staging/rtl8188eu/include/ieee80211.h | 4 ++-- 5 files changed, 16 insertions(+), 17 deletions(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_ap.c b/drivers/staging/rtl8188eu/core/rtw_ap.c index 3fa6af2..91156a2 100644 --- a/drivers/staging/rtl8188eu/core/rtw_ap.c +++ b/drivers/staging/rtl8188eu/core/rtw_ap.c @@ -719,7 +719,7 @@ static void start_bss_network(struct adapter *padapter, u8 *pbuf) u8 val8, cur_channel, cur_bwmode, cur_ch_offset; u16 bcn_interval; u32 acparm; - int ie_len; + uintie_len; struct registry_priv *pregpriv = &padapter->registrypriv; struct mlme_priv *pmlmepriv = &padapter->mlmepriv; struct security_priv *psecuritypriv = &padapter->securitypriv; diff --git a/drivers/staging/rtl8188eu/core/rtw_ieee80211.c b/drivers/staging/rtl8188eu/core/rtw_ieee80211.c index d1cd340..f55b38f 100644 --- a/drivers/staging/rtl8188eu/core/rtw_ieee80211.c +++ b/drivers/staging/rtl8188eu/core/rtw_ieee80211.c @@ -158,7 +158,7 @@ u8 *rtw_set_ie /* index: the information element id index, limit is the limit for search -*/ -u8 *rtw_get_ie(u8 *pbuf, int index, int *len, int limit) +u8 *rtw_get_ie(u8 *pbuf, int index, uint *len, int limit) { int tmp, i; u8 *p; @@ -293,7 +293,7 @@ int rtw_generate_ie(struct registry_priv *pregistrypriv) unsigned char *rtw_get_wpa_ie(unsigned char *pie, int *wpa_ie_len, int limit) { - int len; + uint len; u16 val16; __le16 le_tmp; unsigned char wpa_oui_type[] = {0x00, 0x50, 0xf2, 0x01}; @@ -331,7 +331,7 @@ unsigned char *rtw_get_wpa_ie(unsigned char *pie, int *wpa_ie_len, int limit) return NULL; } -unsigned char *rtw_get_wpa2_ie(unsigned char *pie, int *rsn_ie_len, int limit) +unsigned char *rtw_get_wpa2_ie(unsigned char *pie, uint *rsn_ie_len, int limit) { return rtw_get_ie(pie, _WPA2_IE_ID_, rsn_ie_len, limit); @@ -1000,7 +1000,7 @@ int ieee80211_get_hdrlen(u16 fc) static int rtw_get_cipher_info(struct wlan_network *pnetwork) { - int wpa_ielen; + uint wpa_ielen; unsigned char *pbuf; int group_cipher = 0, pairwise_cipher = 0, is8021x = 0; int ret = _FAIL; @@ -1045,7 +1045,7 @@ void rtw_get_bcn_info(struct wlan_network *pnetwork) __le16 le_tmp; u16 wpa_len = 0, rsn_len = 0; struct HT_info_element *pht_info = NULL; - int len; + uint len; unsigned char *p; memcpy(&le_tmp, rtw_get_capability_from_ie(pnetwork->network.IEs), 2); diff --git a/drivers/staging/rtl8188eu/core/rtw_mlme_ext.c b/drivers/staging/rtl8188eu/core/rtw_mlme_ext.c index 30dd4ed..88a3a2b 100644 --- a/drivers/staging/rtl8188eu/core/rtw_mlme_ext.c +++ b/drivers/staging/rtl8188eu/core/rtw_mlme_ext.c @@ -286,7 +286,7 @@ static s32 dump_mgntframe_and_wait_ack(struct adapter *padapter, static int update_hidden_ssid(u8 *ies, u32 ies_len, u8 hidden_ssid_mode) { u8 *ssid_ie; - int ssid_len_ori; + uint ssid_len_ori; int len_diff = 0; ssid_ie = rtw_get_ie(ies, WLAN_EID_SSID, &ssid_len_ori, ies_len); @@ -1786,7 +1786,7 @@ static void 
issue_action_BSSCoexistPacket(struct adapter *padapter) plist = phead->next; while (phead != plist) { - int len; + uint len; u8 *p; struct wlan_bssid_ex *pbss_network; @@ -2556,7 +2556,7 @@ static unsigned int OnProbeReq(struct adapter *padapter, !check_fwstate(pmlmepriv, WIFI_ADHOC_MASTER_STATE|WIFI_AP_STATE)) return _SUCCESS; - p = rtw_get_ie(pframe + WLAN_HDR_A3_LEN + _PROBEREQ_IE_OFFSET_, _SSID_IE_, (int *)&ielen, + p = rtw_get_ie(pframe + WLAN_HDR_A3_LEN + _PROBEREQ_IE_OFFSET_, _SSID_IE_, &ielen, len - WLAN_HDR_A3_LEN - _PROBEREQ_IE_OFFSET_); /* check (wildcard) SSID */ @@ -2793,7 +2793,7 @@ static unsigned int OnAuth(struct adapter *padapter, /* checking for challenging txt... */
[PATCH 2/3] staging: rtl8188eu: fix sparse signedness warnings in rtw_set_ie
Changed the type of sz from (int) to (unsigned int) to suppress signedness mismatch warnings of the type- drivers/staging/rtl8188eu//core/rtw_ieee80211.c:258:97: warning: incorrect type in argument 5 (different signedness) drivers/staging/rtl8188eu//core/rtw_ieee80211.c:258:97:expected unsigned int [usertype] *frlen drivers/staging/rtl8188eu//core/rtw_ieee80211.c:258:97:got int * Signed-off-by: Aishwarya Pant --- drivers/staging/rtl8188eu/core/rtw_ieee80211.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_ieee80211.c b/drivers/staging/rtl8188eu/core/rtw_ieee80211.c index f55b38f..79dda83 100644 --- a/drivers/staging/rtl8188eu/core/rtw_ieee80211.c +++ b/drivers/staging/rtl8188eu/core/rtw_ieee80211.c @@ -226,7 +226,8 @@ uintrtw_get_rateset_len(u8 *rateset) int rtw_generate_ie(struct registry_priv *pregistrypriv) { u8 wireless_mode; - int sz = 0, rateLen; + int rateLen; + uintsz = 0; struct wlan_bssid_ex *pdev_network = &pregistrypriv->dev_network; u8 *ie = pdev_network->IEs; -- 2.7.4
[PATCH 3/3] staging: rtl8188eu: fix sparse signedness warnings in rtw_generate_ie
Changed the type of wpa_ie_len from (int *) to (unsigned int *) in the function rtw_get_wpa_ie(..) to suppress signedness mismatch warnings in rtw_generate_ie of the type- drivers/staging/rtl8188eu//core/rtw_ieee80211.c:1009:60: warning: incorrect type in argument 2 (different signedness) drivers/staging/rtl8188eu//core/rtw_ieee80211.c:1009:60:expected int *wpa_ie_len drivers/staging/rtl8188eu//core/rtw_ieee80211.c:1009:60:got unsigned int * Signed-off-by: Aishwarya Pant --- drivers/staging/rtl8188eu/core/rtw_ieee80211.c | 2 +- drivers/staging/rtl8188eu/include/ieee80211.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_ieee80211.c b/drivers/staging/rtl8188eu/core/rtw_ieee80211.c index 79dda83..d1dafe0 100644 --- a/drivers/staging/rtl8188eu/core/rtw_ieee80211.c +++ b/drivers/staging/rtl8188eu/core/rtw_ieee80211.c @@ -292,7 +292,7 @@ int rtw_generate_ie(struct registry_priv *pregistrypriv) return sz; } -unsigned char *rtw_get_wpa_ie(unsigned char *pie, int *wpa_ie_len, int limit) +unsigned char *rtw_get_wpa_ie(unsigned char *pie, uint *wpa_ie_len, int limit) { uint len; u16 val16; diff --git a/drivers/staging/rtl8188eu/include/ieee80211.h b/drivers/staging/rtl8188eu/include/ieee80211.h index b3f331a..22ab0c4 100644 --- a/drivers/staging/rtl8188eu/include/ieee80211.h +++ b/drivers/staging/rtl8188eu/include/ieee80211.h @@ -861,7 +861,7 @@ u8 *rtw_get_ie(u8 *pbuf, int index, uint *len, int limit); void rtw_set_supported_rate(u8 *SupportedRates, uint mode); -unsigned char *rtw_get_wpa_ie(unsigned char *pie, int *wpa_ie_len, int limit); +unsigned char *rtw_get_wpa_ie(unsigned char *pie, uint *wpa_ie_len, int limit); unsigned char *rtw_get_wpa2_ie(unsigned char *pie, uint *rsn_ie_len, int limit); int rtw_get_wpa_cipher_suite(u8 *s); int rtw_get_wpa2_cipher_suite(u8 *s); -- 2.7.4
Re: [PATCH] genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs
On Sat, Apr 15 2017 at 11:08:31 am BST, Hans de Goede wrote: > When requesting a shared irq with IRQF_TRIGGER_NONE then the irqaction > flags get filled with the trigger type from the irq_data: > > if (!(new->flags & IRQF_TRIGGER_MASK)) > new->flags |= irqd_get_trigger_type(&desc->irq_data); > > On the first setup_irq() the trigger type in irq_data is NONE when the > above code executes, then the irq is started up for the first time and > then the actual trigger type gets established, but that's too late to fix > up new->flags. > > When then a second user of the irq requests the irq with IRQF_TRIGGER_NONE > its irqaction's triggertype gets set to the actual trigger type and the > following check fails: > > if (!((old->flags ^ new->flags) & IRQF_TRIGGER_MASK)) > > Resulting in the request_irq failing with -EBUSY even though both > users requested the irq with IRQF_SHARED | IRQF_TRIGGER_NONE > > This commit fixes this by comparing the new irqaction's trigger type > to the trigger type stored in the irq_data which correctly reflects > the actual trigger type being used for the irq. > > Suggested-by: Thomas Gleixner > Signed-off-by: Hans de Goede > --- > kernel/irq/manage.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c > index a4afe5c..d63e91f 100644 > --- a/kernel/irq/manage.c > +++ b/kernel/irq/manage.c > @@ -1212,8 +1212,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, > struct irqaction *new) >* set the trigger type must match. Also all must >* agree on ONESHOT. >*/ > + unsigned int oldtype = irqd_get_trigger_type(&desc->irq_data); > + > if (!((old->flags & new->flags) & IRQF_SHARED) || > - ((old->flags ^ new->flags) & IRQF_TRIGGER_MASK) || > + (oldtype != (new->flags & IRQF_TRIGGER_MASK)) || > ((old->flags ^ new->flags) & IRQF_ONESHOT)) > goto mismatch; Looks sensible to me. Acked-by: Marc Zyngier M. -- Jazz is not dead, it just smell funny.
Re: [PATCH] irqchip/irq-imx-gpcv2: fix spinlock initialization
On Thu, Apr 13 2017 at 11:27:31 pm BST, Tyler Baker wrote: > Call raw_spin_lock_init() before the spinlocks are used to prevent a > lockdep splat. > > Fixes the following trace: > > INFO: trying to register non-static key. > the code is fine but needs lockdep annotation. > turning off the locking correctness validator. > Hardware name: Freescale i.MX7 Dual (Device Tree) > Backtrace: > [] (dump_backtrace) from [] (show_stack+0x18/0x1c) > r7: r6:60d3 r5: r4:c0e273dc > [] (show_stack) from [] (dump_stack+0xb4/0xe8) > [] (dump_stack) from [] (register_lock_class+0x208/0x5ec) > r9:ef00d010 r8:ef00d010 r7:c1606448 r6: r5: r4:e000 > [] (register_lock_class) from [] > (__lock_acquire+0x7c/0x18d0) > r10:c0e0af40 r9:ef00d010 r8:c0e274cc r7:0001 r6:60d3 r5:c1606448 > r4:e000 > [] (__lock_acquire) from [] (lock_acquire+0x70/0x90) > r10: r9:ef007e38 r8:0001 r7:0001 r6:60d3 r5: > r4:e000 > [] (lock_acquire) from [] (_raw_spin_lock+0x30/0x40) > r8:60d3 r7:ef007e10 r6:0001 r5:ef007e10 r4:ef00d000 > [] (_raw_spin_lock) from [] > (imx_gpcv2_irq_unmask+0x1c/0x5c) > r4:ef00d000 > [] (imx_gpcv2_irq_unmask) from [] (irq_enable+0x38/0x4c) > r5: r4:ef007e00 > [] (irq_enable) from [] (irq_startup+0x84/0x88) > r5: r4:ef007e00 > [] (irq_startup) from [] (__setup_irq+0x538/0x5f4) > r7:ef007e60 r6:0015 r5:ef007e00 r4:ef007d00 > [] (__setup_irq) from [] (setup_irq+0x60/0xd0) > r10:c0d5fa48 r9:efffcbc0 r8:ef007d00 r7:0015 r6:ef007e10 r5: > r4:ef007e00 > [] (setup_irq) from [] (_mxc_timer_init+0x1f8/0x248) > r9:efffcbc0 r8:0003 r7:016e3600 r6:c0c69bbc r5:ef007c40 r4:ef007c00 > [] (_mxc_timer_init) from [] (mxc_timer_init_dt+0xb0/0xf8) > r7: r6:c1669e48 r5:ef7ebf7c r4:ef007c00 > [] (mxc_timer_init_dt) from [] > (imx6dl_timer_init_dt+0x14/0x18) > r9:efffcbc0 r8:c0e7b000 r7:c0c695c0 r6:c0d6fe18 r5:0001 r4:ef7ebf7c > [] (imx6dl_timer_init_dt) from [] > (clocksource_probe+0x54/0xb0) > [] (clocksource_probe) from [] (time_init+0x30/0x38) > r7:c0e07900 r6:c0e7b000 r5: r4: > [] (time_init) from [] (start_kernel+0x220/0x3a0) > [] (start_kernel) from [<8000807c>] (0x8000807c) > r10: r9:410fc075 r8:8000406a r7:c0e0c958 r6:c0d5fa44 r5:c0e07918 > r4:c0e7b294 > > Verified the fix on a imx7d-cl-som with CONFIG_IMX_GPCV2 set. > > Signed-off-by: Tyler Baker > Reported-by: Tyler Baker > Reviewed-by: Fabio Estevam > --- > Issue reported in this thread: https://lkml.org/lkml/2017/4/13/646 > > drivers/irqchip/irq-imx-gpcv2.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/irqchip/irq-imx-gpcv2.c b/drivers/irqchip/irq-imx-gpcv2.c > index e13236f..9463f35 100644 > --- a/drivers/irqchip/irq-imx-gpcv2.c > +++ b/drivers/irqchip/irq-imx-gpcv2.c > @@ -230,6 +230,8 @@ static int __init imx_gpcv2_irqchip_init(struct > device_node *node, > return -ENOMEM; > } > > + raw_spin_lock_init(&cd->rlock); > + > cd->gpc_base = of_iomap(node, 0); > if (!cd->gpc_base) { > pr_err("fsl-gpcv2: unable to map gpc registers\n"); Acked-by: Marc Zyngier Thomas, any chance you could take this as a fix through the tip tree? Thanks, M. -- Jazz is not dead, it just smell funny.
[PATCH] scsi: storvsc: Allow only one remove lun work item to be issued per lun
When running multipath on a VM if all available paths go down the driver can schedule large amounts of storvsc_remove_lun work items to the same lun. In response to the failing paths typically storvsc responds by taking host->scan_mutex and issuing a TUR per lun. If there has been heavy IO to the failed device all the failed IOs are returned from the host. A remove lun work item is issued per failed IO. If the outstanding TURs have not been completed in a timely manner the scan_mutex is never released or released too late. Consequently the many remove lun work items are not completed as scsi_remove_device also tries to take host->scan_mutex. This results in dragging the VM down and sometimes completely. This patch only allows one remove lun to be issued to a particular lun while it is an instantiated member of the scsi stack. Signed-off-by: Cathy Avery --- drivers/scsi/storvsc_drv.c | 33 +++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 016639d..9dbb5bf 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -478,6 +478,10 @@ struct storvsc_device { u64 port_name; }; +struct storvsc_dev_hostdata { + atomic_t req_remove_lun; +}; + struct hv_host_device { struct hv_device *dev; unsigned int port; @@ -918,6 +922,8 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb, u8 asc, u8 ascq) { struct storvsc_scan_work *wrk; + struct storvsc_dev_hostdata *hostdata; + struct scsi_device *sdev; void (*process_err_fn)(struct work_struct *work); bool do_work = false; @@ -953,8 +959,17 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb, } break; case SRB_STATUS_INVALID_LUN: - do_work = true; - process_err_fn = storvsc_remove_lun; + sdev = scsi_device_lookup(host, 0, vm_srb->target_id, + vm_srb->lun); + if (sdev) { + hostdata = sdev->hostdata; + if (hostdata && + !atomic_cmpxchg(&hostdata->req_remove_lun, 0, 1)) { + do_work = true; + process_err_fn = storvsc_remove_lun; + } + scsi_device_put(sdev); + } break; case SRB_STATUS_ABORTED: if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID && @@ -1426,9 +1441,22 @@ static int storvsc_device_configure(struct scsi_device *sdevice) sdevice->no_write_same = 0; } + sdevice->hostdata = kzalloc(sizeof(struct storvsc_dev_hostdata), + GFP_ATOMIC); + if (!sdevice->hostdata) + return -ENOMEM; + return 0; } +static void storvsc_device_destroy(struct scsi_device *sdevice) +{ + if (sdevice->hostdata) { + kfree(sdevice->hostdata); + sdevice->hostdata = NULL; + } +} + static int storvsc_get_chs(struct scsi_device *sdev, struct block_device * bdev, sector_t capacity, int *info) { @@ -1669,6 +1697,7 @@ static struct scsi_host_template scsi_driver = { .eh_timed_out = storvsc_eh_timed_out, .slave_alloc = storvsc_device_alloc, .slave_configure = storvsc_device_configure, + .slave_destroy =storvsc_device_destroy, .cmd_per_lun = 255, .this_id = -1, .use_clustering = ENABLE_CLUSTERING, -- 2.5.0
Re: [PATCH] irqchip/irq-imx-gpcv2: fix spinlock initialization
On Sat, 15 Apr 2017, Marc Zyngier wrote: > > Acked-by: Marc Zyngier > > Thomas, any chance you could take this as a fix through the tip tree? It's in Linus' tree already :)
[tip:irq/core] genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs
Commit-ID: 382bd4de61827dbaaf5fb4fb7b1f4be4a86505e7 Gitweb: http://git.kernel.org/tip/382bd4de61827dbaaf5fb4fb7b1f4be4a86505e7 Author: Hans de Goede AuthorDate: Sat, 15 Apr 2017 12:08:31 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 15:42:43 +0200 genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs When requesting a shared irq with IRQF_TRIGGER_NONE then the irqaction flags get filled with the trigger type from the irq_data: if (!(new->flags & IRQF_TRIGGER_MASK)) new->flags |= irqd_get_trigger_type(&desc->irq_data); On the first setup_irq() the trigger type in irq_data is NONE when the above code executes, then the irq is started up for the first time and then the actual trigger type gets established, but that's too late to fix up new->flags. When then a second user of the irq requests the irq with IRQF_TRIGGER_NONE its irqaction's triggertype gets set to the actual trigger type and the following check fails: if (!((old->flags ^ new->flags) & IRQF_TRIGGER_MASK)) Resulting in the request_irq failing with -EBUSY even though both users requested the irq with IRQF_SHARED | IRQF_TRIGGER_NONE Fix this by comparing the new irqaction's trigger type to the trigger type stored in the irq_data which correctly reflects the actual trigger type being used for the irq. Suggested-by: Thomas Gleixner Signed-off-by: Hans de Goede Acked-by: Marc Zyngier Link: http://lkml.kernel.org/r/20170415100831.17073-1-hdego...@redhat.com Signed-off-by: Thomas Gleixner --- kernel/irq/manage.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 155e3c3..ae1c90f 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1212,8 +1212,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) * set the trigger type must match. Also all must * agree on ONESHOT. */ + unsigned int oldtype = irqd_get_trigger_type(&desc->irq_data); + if (!((old->flags & new->flags) & IRQF_SHARED) || - ((old->flags ^ new->flags) & IRQF_TRIGGER_MASK) || + (oldtype != (new->flags & IRQF_TRIGGER_MASK)) || ((old->flags ^ new->flags) & IRQF_ONESHOT)) goto mismatch;
Re: [PATCH] scsi: storvsc: Allow only one remove lun work item to be issued per lun
Just add a singlethreaded workqueue for storvsc_handle_error and you'll get serialization for all error handling for free.
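A sketch of what is being suggested, assuming a driver-global ordered workqueue created at module init; the name and placement are illustrative, not a tested patch:

	static struct workqueue_struct *storvsc_error_wq;

	/* module init: an ordered workqueue executes at most one work item
	 * at a time, so all remove-lun/error work is serialized for free */
	storvsc_error_wq = alloc_ordered_workqueue("storvsc_error_wq", 0);
	if (!storvsc_error_wq)
		return -ENOMEM;

	/* in storvsc_handle_error(), queue to it instead of schedule_work() */
	queue_work(storvsc_error_wq, &wrk->work);

	/* module exit */
	destroy_workqueue(storvsc_error_wq);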
Re: [PATCH] irqchip/irq-imx-gpcv2: fix spinlock initialization
On Sat, Apr 15 2017 at 2:54:54 pm BST, Thomas Gleixner wrote: > On Sat, 15 Apr 2017, Marc Zyngier wrote: >> >> Acked-by: Marc Zyngier >> >> Thomas, any chance you could take this as a fix through the tip tree? > > It's in Linus tree already :) Ah, that's how you can tell I'm on holiday... ;-) M. -- Jazz is not dead, it just smell funny.
[tip:locking/core] futex: Clarify mark_wake_futex memory barrier usage
Commit-ID: 38fcd06e9b7f6855db1f3ebac5e18b8fdb467ffd Gitweb: http://git.kernel.org/tip/38fcd06e9b7f6855db1f3ebac5e18b8fdb467ffd Author: Darren Hart (VMware) AuthorDate: Fri, 14 Apr 2017 15:31:38 -0700 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 16:03:46 +0200 futex: Clarify mark_wake_futex memory barrier usage Clarify the scenario described in mark_wake_futex requiring the smp_store_release(). Update the comment to explicitly refer to the plist_del now under __unqueue_futex() (previously plist_del was in the same function as the comment). Signed-off-by: Darren Hart (VMware) Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20170414223138.GA4222@fury Signed-off-by: Thomas Gleixner --- kernel/futex.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index ede2f1e..357348a 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1380,10 +1380,11 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q) wake_q_add(wake_q, p); __unqueue_futex(q); /* -* The waiting task can free the futex_q as soon as -* q->lock_ptr = NULL is written, without taking any locks. A -* memory barrier is required here to prevent the following -* store to lock_ptr from getting ahead of the plist_del. +* The waiting task can free the futex_q as soon as q->lock_ptr = NULL +* is written, without taking any locks. This is possible in the event +* of a spurious wakeup, for example. A memory barrier is required here +* to prevent the following store to lock_ptr from getting ahead of the +* plist_del in __unqueue_futex(). */ smp_store_release(&q->lock_ptr, NULL); }
[tip:locking/core] MAINTAINERS: Add FUTEX SUBSYSTEM
Commit-ID: 59cd42c29618c45cd3c56da43402b14f611888dd Gitweb: http://git.kernel.org/tip/59cd42c29618c45cd3c56da43402b14f611888dd Author: Darren Hart (VMware) AuthorDate: Fri, 14 Apr 2017 15:46:08 -0700 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 16:03:46 +0200 MAINTAINERS: Add FUTEX SUBSYSTEM Add a MAINTAINERS block for the FUTEX SUBSYSTEM which includes the core kernel code, include headers, testing code, and Documentation. Excludes arch files, and higher level test code. I added tglx and mingo as M as they have made the tip commits and peterz and myself as R. Signed-off-by: Darren Hart (VMware) Cc: Peter Zijlstra Cc: Shuah Khan Cc: Arnaldo Carvalho de Melo Link: http://lkml.kernel.org/r/20170414224608.GA5180@fury Signed-off-by: Thomas Gleixner --- MAINTAINERS | 17 + 1 file changed, 17 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index fdd5350..0a8cbcc 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5406,6 +5406,23 @@ F: fs/fuse/ F: include/uapi/linux/fuse.h F: Documentation/filesystems/fuse.txt +FUTEX SUBSYSTEM +M: Thomas Gleixner +M: Ingo Molnar +R: Peter Zijlstra +R: Darren Hart +L: linux-kernel@vger.kernel.org +T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core +S: Maintained +F: kernel/futex.c +F: kernel/futex_compat.c +F: include/asm-generic/futex.h +F: include/linux/futex.h +F: include/uapi/linux/futex.h +F: tools/testing/selftests/futex/ +F: tools/perf/bench/futex* +F: Documentation/*futex* + FUTURE DOMAIN TMC-16x0 SCSI DRIVER (16-bit) M: Rik Faith L: linux-s...@vger.kernel.org
[tip:sched/core] sparc/sysfs: Replace racy task affinity logic
Commit-ID: ea875ec94eafb858990f3fe9528501f983105653 Gitweb: http://git.kernel.org/tip/ea875ec94eafb858990f3fe9528501f983105653 Author: Thomas Gleixner AuthorDate: Thu, 13 Apr 2017 10:17:07 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:54 +0200 sparc/sysfs: Replace racy task affinity logic The mmustat_enable sysfs file accessor functions must run code on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by using work_on_cpu() which guarantees to run the code on the requested CPU. Protection against CPU hotplug is not required as the open sysfs file already prevents the removal from the CPU offline callback. Using the hotplug protected version would actually be wrong because it would deadlock against a CPU hotplug operation of the CPU associated to the sysfs file in progress. Signed-off-by: Thomas Gleixner Acked-by: David S. Miller Cc: fenghua...@intel.com Cc: tony.l...@intel.com Cc: herb...@gondor.apana.org.au Cc: r...@rjwysocki.net Cc: pet...@infradead.org Cc: b...@kernel.crashing.org Cc: bige...@linutronix.de Cc: jiangshan...@gmail.com Cc: sparcli...@vger.kernel.org Cc: viresh.ku...@linaro.org Cc: m...@ellerman.id.au Cc: t...@kernel.org Cc: l...@kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131001270.2408@nanos Signed-off-by: Thomas Gleixner --- arch/sparc/kernel/sysfs.c | 39 +++ 1 file changed, 11 insertions(+), 28 deletions(-) diff --git a/arch/sparc/kernel/sysfs.c b/arch/sparc/kernel/sysfs.c index d63fc61..5fd352b 100644 --- a/arch/sparc/kernel/sysfs.c +++ b/arch/sparc/kernel/sysfs.c @@ -98,27 +98,7 @@ static struct attribute_group mmu_stat_group = { .name = "mmu_stats", }; -/* XXX convert to rusty's on_one_cpu */ -static unsigned long run_on_cpu(unsigned long cpu, - unsigned long (*func)(unsigned long), - unsigned long arg) -{ - cpumask_t old_affinity; - unsigned long ret; - - cpumask_copy(&old_affinity, ¤t->cpus_allowed); - /* should return -EINVAL to userspace */ - if (set_cpus_allowed_ptr(current, cpumask_of(cpu))) - return 0; - - ret = func(arg); - - set_cpus_allowed_ptr(current, &old_affinity); - - return ret; -} - -static unsigned long read_mmustat_enable(unsigned long junk) +static long read_mmustat_enable(void *data __maybe_unused) { unsigned long ra = 0; @@ -127,11 +107,11 @@ static unsigned long read_mmustat_enable(unsigned long junk) return ra != 0; } -static unsigned long write_mmustat_enable(unsigned long val) +static long write_mmustat_enable(void *data) { - unsigned long ra, orig_ra; + unsigned long ra, orig_ra, *val = data; - if (val) + if (*val) ra = __pa(&per_cpu(mmu_stats, smp_processor_id())); else ra = 0UL; @@ -142,7 +122,8 @@ static unsigned long write_mmustat_enable(unsigned long val) static ssize_t show_mmustat_enable(struct device *s, struct device_attribute *attr, char *buf) { - unsigned long val = run_on_cpu(s->id, read_mmustat_enable, 0); + long val = work_on_cpu(s->id, read_mmustat_enable, NULL); + return sprintf(buf, "%lx\n", val); } @@ -150,13 +131,15 @@ static ssize_t store_mmustat_enable(struct device *s, struct device_attribute *attr, const char *buf, size_t count) { - unsigned long val, err; - int ret = sscanf(buf, "%lu", &val); + unsigned long val; + long err; + int ret; + ret = sscanf(buf, "%lu", &val); if (ret != 1) return -EINVAL; - 
err = run_on_cpu(s->id, write_mmustat_enable, val); + err = work_on_cpu(s->id, write_mmustat_enable, &val); if (err) return -EIO;
[tip:sched/core] workqueue: Provide work_on_cpu_safe()
Commit-ID: 0e8d6a9336b487a1dd6f1991ff376e669d4c87c6 Gitweb: http://git.kernel.org/tip/0e8d6a9336b487a1dd6f1991ff376e669d4c87c6 Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:28 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:53 +0200 workqueue: Provide work_on_cpu_safe() work_on_cpu() is not protected against CPU hotplug. For code which requires to be either executed on an online CPU or to fail if the CPU is not available the callsite would have to protect against CPU hotplug. Provide a function which does get/put_online_cpus() around the call to work_on_cpu() and fails the call with -ENODEV if the target CPU is not online. Preparatory patch to convert several racy task affinity manipulations. Signed-off-by: Thomas Gleixner Acked-by: Tejun Heo Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: Viresh Kumar Cc: Michael Ellerman Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.262610...@linutronix.de Signed-off-by: Thomas Gleixner --- include/linux/workqueue.h | 5 + kernel/workqueue.c| 23 +++ 2 files changed, 28 insertions(+) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index bde063c..c102ef6 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -608,8 +608,13 @@ static inline long work_on_cpu(int cpu, long (*fn)(void *), void *arg) { return fn(arg); } +static inline long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg) +{ + return fn(arg); +} #else long work_on_cpu(int cpu, long (*fn)(void *), void *arg); +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg); #endif /* CONFIG_SMP */ #ifdef CONFIG_FREEZER diff --git a/kernel/workqueue.c b/kernel/workqueue.c index c0168b7..5bf1be0 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -4735,6 +4735,29 @@ long work_on_cpu(int cpu, long (*fn)(void *), void *arg) return wfc.ret; } EXPORT_SYMBOL_GPL(work_on_cpu); + +/** + * work_on_cpu_safe - run a function in thread context on a particular cpu + * @cpu: the cpu to run on + * @fn: the function to run + * @arg: the function argument + * + * Disables CPU hotplug and calls work_on_cpu(). The caller must not hold + * any locks which would prevent @fn from completing. + * + * Return: The value @fn returns. + */ +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg) +{ + long ret = -ENODEV; + + get_online_cpus(); + if (cpu_online(cpu)) + ret = work_on_cpu(cpu, fn, arg); + put_online_cpus(); + return ret; +} +EXPORT_SYMBOL_GPL(work_on_cpu_safe); #endif /* CONFIG_SMP */ #ifdef CONFIG_FREEZER
[tip:sched/core] ia64/topology: Remove cpus_allowed manipulation
Commit-ID: 048c9b954e20396e0c45ee778466994d1be2e612 Gitweb: http://git.kernel.org/tip/048c9b954e20396e0c45ee778466994d1be2e612 Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:27 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:52 +0200 ia64/topology: Remove cpus_allowed manipulation The CPU hotplug callback fiddles with the cpus_allowed pointer to pin the calling thread on the plugged CPU. That's already guaranteed by the hotplug core code. Remove it. Signed-off-by: Thomas Gleixner Cc: Fenghua Yu Cc: Tony Luck Cc: linux-i...@vger.kernel.org Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: Viresh Kumar Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.174518...@linutronix.de Signed-off-by: Thomas Gleixner --- arch/ia64/kernel/topology.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c index 1a68f01..d76529c 100644 --- a/arch/ia64/kernel/topology.c +++ b/arch/ia64/kernel/topology.c @@ -355,18 +355,12 @@ static int cache_add_dev(unsigned int cpu) unsigned long i, j; struct cache_info *this_object; int retval = 0; - cpumask_t oldmask; if (all_cpu_cache_info[cpu].kobj.parent) return 0; - oldmask = current->cpus_allowed; - retval = set_cpus_allowed_ptr(current, cpumask_of(cpu)); - if (unlikely(retval)) - return retval; retval = cpu_cache_sysfs_init(cpu); - set_cpus_allowed_ptr(current, &oldmask); if (unlikely(retval < 0)) return retval;
[tip:sched/core] ia64/salinfo: Replace racy task affinity logic
Commit-ID: 67cb85fdcee7fbc61c09c00360d1a4ae37641db4 Gitweb: http://git.kernel.org/tip/67cb85fdcee7fbc61c09c00360d1a4ae37641db4 Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:29 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:53 +0200 ia64/salinfo: Replace racy task affinity logic Some of the file operations in /proc/sal require to run code on the requested cpu. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by using work_on_cpu_safe() which guarantees to run the code on the requested CPU or to fail in case the CPU is offline. Signed-off-by: Thomas Gleixner Cc: Fenghua Yu Cc: Tony Luck Cc: linux-i...@vger.kernel.org Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: Viresh Kumar Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.341863...@linutronix.de Signed-off-by: Thomas Gleixner --- arch/ia64/kernel/salinfo.c | 31 --- 1 file changed, 12 insertions(+), 19 deletions(-) diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c index d194d5c..63dc9cd 100644 --- a/arch/ia64/kernel/salinfo.c +++ b/arch/ia64/kernel/salinfo.c @@ -179,14 +179,14 @@ struct salinfo_platform_oemdata_parms { const u8 *efi_guid; u8 **oemdata; u64 *oemdata_size; - int ret; }; -static void +static long salinfo_platform_oemdata_cpu(void *context) { struct salinfo_platform_oemdata_parms *parms = context; - parms->ret = salinfo_platform_oemdata(parms->efi_guid, parms->oemdata, parms->oemdata_size); + + return salinfo_platform_oemdata(parms->efi_guid, parms->oemdata, parms->oemdata_size); } static void @@ -380,16 +380,7 @@ salinfo_log_release(struct inode *inode, struct file *file) return 0; } -static void -call_on_cpu(int cpu, void (*fn)(void *), void *arg) -{ - cpumask_t save_cpus_allowed = current->cpus_allowed; - set_cpus_allowed_ptr(current, cpumask_of(cpu)); - (*fn)(arg); - set_cpus_allowed_ptr(current, &save_cpus_allowed); -} - -static void +static long salinfo_log_read_cpu(void *context) { struct salinfo_data *data = context; @@ -399,6 +390,7 @@ salinfo_log_read_cpu(void *context) /* Clear corrected errors as they are read from SAL */ if (rh->severity == sal_log_severity_corrected) ia64_sal_clear_state_info(data->type); + return 0; } static void @@ -430,7 +422,7 @@ retry: spin_unlock_irqrestore(&data_saved_lock, flags); if (!data->saved_num) - call_on_cpu(cpu, salinfo_log_read_cpu, data); + work_on_cpu_safe(cpu, salinfo_log_read_cpu, data); if (!data->log_size) { data->state = STATE_NO_DATA; cpumask_clear_cpu(cpu, &data->cpu_event); @@ -459,11 +451,13 @@ salinfo_log_read(struct file *file, char __user *buffer, size_t count, loff_t *p return simple_read_from_buffer(buffer, count, ppos, buf, bufsize); } -static void +static long salinfo_log_clear_cpu(void *context) { struct salinfo_data *data = context; + ia64_sal_clear_state_info(data->type); + return 0; } static int @@ -486,7 +480,7 @@ salinfo_log_clear(struct salinfo_data *data, int cpu) rh = (sal_log_record_header_t *)(data->log_buffer); /* Corrected errors have already been cleared from SAL */ if (rh->severity != sal_log_severity_corrected) - call_on_cpu(cpu, salinfo_log_clear_cpu, data); + 
work_on_cpu_safe(cpu, salinfo_log_clear_cpu, data); /* clearing a record may make a new record visible */ salinfo_log_new_read(cpu, data); if (data->state == STATE_LOG_RECORD) { @@ -531,9 +525,8 @@ salinfo_log_write(struct file *file, const char __user *buffer, size_t count, lo .oemdata = &data->oemdata, .oemdata_size = &data->oemdata_size }; - call_on_cpu(cpu, salinfo_platform_oemdata_cpu, &parms); - if (parms.ret) - count = parms.ret; + count = work_on_cpu_safe(cpu, salinfo_platform_oemdata_cpu, +&parms); } else data->oemdata_size = 0; } else
[tip:sched/core] ia64/sn/hwperf: Replace racy task affinity logic
Commit-ID: 9feb42ac88b516e378b9782e82b651ca5bed95c4 Gitweb: http://git.kernel.org/tip/9feb42ac88b516e378b9782e82b651ca5bed95c4 Author: Thomas Gleixner AuthorDate: Thu, 6 Apr 2017 14:56:18 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:53 +0200 ia64/sn/hwperf: Replace racy task affinity logic sn_hwperf_op_cpu() which is invoked from an ioctl requires to run code on the requested cpu. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by using work_on_cpu_safe() which guarantees to run the code on the requested CPU or to fail in case the CPU is offline. Signed-off-by: Thomas Gleixner Cc: Fenghua Yu Cc: Tony Luck Cc: linux-i...@vger.kernel.org Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: Viresh Kumar Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704122251450.2548@nanos Signed-off-by: Thomas Gleixner --- arch/ia64/sn/kernel/sn2/sn_hwperf.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/arch/ia64/sn/kernel/sn2/sn_hwperf.c b/arch/ia64/sn/kernel/sn2/sn_hwperf.c index 52704f1..55febd6 100644 --- a/arch/ia64/sn/kernel/sn2/sn_hwperf.c +++ b/arch/ia64/sn/kernel/sn2/sn_hwperf.c @@ -598,12 +598,17 @@ static void sn_hwperf_call_sal(void *info) op_info->ret = r; } +static long sn_hwperf_call_sal_work(void *info) +{ + sn_hwperf_call_sal(info); + return 0; +} + static int sn_hwperf_op_cpu(struct sn_hwperf_op_info *op_info) { u32 cpu; u32 use_ipi; int r = 0; - cpumask_t save_allowed; cpu = (op_info->a->arg & SN_HWPERF_ARG_CPU_MASK) >> 32; use_ipi = op_info->a->arg & SN_HWPERF_ARG_USE_IPI_MASK; @@ -629,13 +634,9 @@ static int sn_hwperf_op_cpu(struct sn_hwperf_op_info *op_info) /* use an interprocessor interrupt to call SAL */ smp_call_function_single(cpu, sn_hwperf_call_sal, op_info, 1); - } - else { - /* migrate the task before calling SAL */ - save_allowed = current->cpus_allowed; - set_cpus_allowed_ptr(current, cpumask_of(cpu)); - sn_hwperf_call_sal(op_info); - set_cpus_allowed_ptr(current, &save_allowed); + } else { + /* Call on the target CPU */ + work_on_cpu_safe(cpu, sn_hwperf_call_sal_work, op_info); } } r = op_info->ret;
[tip:sched/core] powerpc/smp: Replace open coded task affinity logic
Commit-ID: 6d11b87d55eb75007a3721c2de5938f5bbf607fb Gitweb: http://git.kernel.org/tip/6d11b87d55eb75007a3721c2de5938f5bbf607fb Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:31 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:54 +0200 powerpc/smp: Replace open coded task affinity logic Init task invokes smp_ops->setup_cpu() from smp_cpus_done(). Init task can run on any online CPU at this point, but the setup_cpu() callback requires to be invoked on the boot CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. That's actually not a problem in this context as neither CPU hotplug nor affinity settings can happen, but the access to task_struct::cpus_allowed is about to restricted. Replace it with a call to work_on_cpu_safe() which achieves the same result. Signed-off-by: Thomas Gleixner Acked-by: Michael Ellerman Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: Viresh Kumar Cc: Tejun Heo Cc: Paul Mackerras Cc: linuxppc-...@lists.ozlabs.org Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.518053...@linutronix.de Signed-off-by: Thomas Gleixner --- arch/powerpc/kernel/smp.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index 46f89e6..d68ed1f 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -787,24 +787,21 @@ static struct sched_domain_topology_level powerpc_topology[] = { { NULL, }, }; -void __init smp_cpus_done(unsigned int max_cpus) +static __init long smp_setup_cpu_workfn(void *data __always_unused) { - cpumask_var_t old_mask; + smp_ops->setup_cpu(boot_cpuid); + return 0; +} - /* We want the setup_cpu() here to be called from CPU 0, but our -* init thread may have been "borrowed" by another CPU in the meantime -* se we pin us down to CPU 0 for a short while +void __init smp_cpus_done(unsigned int max_cpus) +{ + /* +* We want the setup_cpu() here to be called on the boot CPU, but +* init might run on any CPU, so make sure it's invoked on the boot +* CPU. */ - alloc_cpumask_var(&old_mask, GFP_NOWAIT); - cpumask_copy(old_mask, ¤t->cpus_allowed); - set_cpus_allowed_ptr(current, cpumask_of(boot_cpuid)); - if (smp_ops && smp_ops->setup_cpu) - smp_ops->setup_cpu(boot_cpuid); - - set_cpus_allowed_ptr(current, old_mask); - - free_cpumask_var(old_mask); + work_on_cpu_safe(boot_cpuid, smp_setup_cpu_workfn, NULL); if (smp_ops && smp_ops->bringup_done) smp_ops->bringup_done(); @@ -812,7 +809,6 @@ void __init smp_cpus_done(unsigned int max_cpus) dump_numa_cpu_topology(); set_sched_topology(powerpc_topology); - } #ifdef CONFIG_HOTPLUG_CPU
[tip:sched/core] cpufreq/ia64: Replace racy task affinity logic
Commit-ID: 38f05ed04beb276f780fcd2b5c0b78c76d0b3c0c Gitweb: http://git.kernel.org/tip/38f05ed04beb276f780fcd2b5c0b78c76d0b3c0c Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:55:03 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:55 +0200 cpufreq/ia64: Replace racy task affinity logic The get() and target() callbacks must run on the affected cpu. This is achieved by temporarily setting the affinity of the calling thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by work_on_cpu(). All call pathes which invoke the callbacks are already protected against CPU hotplug. Signed-off-by: Thomas Gleixner Acked-by: Viresh Kumar Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: linux...@vger.kernel.org Cc: Lai Jiangshan Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704122231100.2548@nanos Signed-off-by: Thomas Gleixner --- drivers/cpufreq/ia64-acpi-cpufreq.c | 92 - 1 file changed, 39 insertions(+), 53 deletions(-) diff --git a/drivers/cpufreq/ia64-acpi-cpufreq.c b/drivers/cpufreq/ia64-acpi-cpufreq.c index e28a31a..a757c0a 100644 --- a/drivers/cpufreq/ia64-acpi-cpufreq.c +++ b/drivers/cpufreq/ia64-acpi-cpufreq.c @@ -34,6 +34,11 @@ struct cpufreq_acpi_io { unsigned intresume; }; +struct cpufreq_acpi_req { + unsigned intcpu; + unsigned intstate; +}; + static struct cpufreq_acpi_io *acpi_io_data[NR_CPUS]; static struct cpufreq_driver acpi_cpufreq_driver; @@ -83,8 +88,7 @@ processor_get_pstate ( static unsigned extract_clock ( struct cpufreq_acpi_io *data, - unsigned value, - unsigned int cpu) + unsigned value) { unsigned long i; @@ -98,60 +102,43 @@ extract_clock ( } -static unsigned int +static long processor_get_freq ( - struct cpufreq_acpi_io *data, - unsigned intcpu) + void *arg) { - int ret = 0; - u32 value = 0; - cpumask_t saved_mask; - unsigned long clock_freq; + struct cpufreq_acpi_req *req = arg; + unsigned intcpu = req->cpu; + struct cpufreq_acpi_io *data = acpi_io_data[cpu]; + u32 value; + int ret; pr_debug("processor_get_freq\n"); - - saved_mask = current->cpus_allowed; - set_cpus_allowed_ptr(current, cpumask_of(cpu)); if (smp_processor_id() != cpu) - goto migrate_end; + return -EAGAIN; /* processor_get_pstate gets the instantaneous frequency */ ret = processor_get_pstate(&value); - if (ret) { - set_cpus_allowed_ptr(current, &saved_mask); pr_warn("get performance failed with error %d\n", ret); - ret = 0; - goto migrate_end; + return ret; } - clock_freq = extract_clock(data, value, cpu); - ret = (clock_freq*1000); - -migrate_end: - set_cpus_allowed_ptr(current, &saved_mask); - return ret; + return 1000 * extract_clock(data, value); } -static int +static long processor_set_freq ( - struct cpufreq_acpi_io *data, - struct cpufreq_policy *policy, - int state) + void *arg) { - int ret = 0; - u32 value = 0; - cpumask_t saved_mask; - int retval; + struct cpufreq_acpi_req *req = arg; + unsigned intcpu = req->cpu; + struct cpufreq_acpi_io *data = acpi_io_data[cpu]; + int ret, state = req->state; + u32 value; pr_debug("processor_set_freq\n"); - - saved_mask = current->cpus_allowed; - set_cpus_allowed_ptr(current, cpumask_of(policy->cpu)); - if (smp_processor_id() != policy->cpu) { - retval = -EAGAIN; - goto migrate_end; - } + if 
(smp_processor_id() != cpu) + return -EAGAIN; if (state == data->acpi_data.state) { if (unlikely(data->resume)) { @@ -159,8 +146,7 @@ processor_set_freq ( data->resume = 0; } else { pr_debug("Already at target state (P%d)\n", state); - retval = 0; - goto migrate_end; + return 0; } } @@ -171,7 +157,6 @@ processor_set_freq ( * First we write the t
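For reference, the work_on_cpu() pattern used in this and the following conversions boils down to the sketch below; the struct and function names are illustrative and not taken from the driver. The request is packed into a small structure, the worker function runs on the target CPU's bound workqueue, and the callers are assumed to already hold off CPU hotplug, as the changelog notes.

#include <linux/errno.h>
#include <linux/smp.h>
#include <linux/workqueue.h>

struct freq_request {                           /* illustrative argument container */
        unsigned int cpu;
        unsigned int state;
};

static long freq_workfn(void *arg)              /* executed on the target CPU */
{
        struct freq_request *req = arg;

        if (smp_processor_id() != req->cpu)     /* defensive check, as in the driver */
                return -EAGAIN;

        /* ... access the per-CPU frequency registers here ... */
        return 0;
}

static long query_frequency(unsigned int cpu)
{
        struct freq_request req = { .cpu = cpu };

        /* Callers must prevent CPU hotplug, e.g. with get_online_cpus(). */
        return work_on_cpu(cpu, freq_workfn, &req);
}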
[tip:sched/core] cpufreq/sh: Replace racy task affinity logic
Commit-ID: 205dcc1ecbc566cbc20acf246e68de3b080b3ecf Gitweb: http://git.kernel.org/tip/205dcc1ecbc566cbc20acf246e68de3b080b3ecf Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:36 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:55 +0200 cpufreq/sh: Replace racy task affinity logic The target() callback must run on the affected cpu. This is achieved by temporarily setting the affinity of the calling thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread resulting in code executing on the wrong CPU. Replace it by work_on_cpu(). All call pathes which invoke the callbacks are already protected against CPU hotplug. Signed-off-by: Thomas Gleixner Acked-by: Viresh Kumar Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: linux...@vger.kernel.org Cc: Lai Jiangshan Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.958216...@linutronix.de Signed-off-by: Thomas Gleixner --- drivers/cpufreq/sh-cpufreq.c | 45 ++-- 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c index 86628e2..719c3d9 100644 --- a/drivers/cpufreq/sh-cpufreq.c +++ b/drivers/cpufreq/sh-cpufreq.c @@ -30,54 +30,63 @@ static DEFINE_PER_CPU(struct clk, sh_cpuclk); +struct cpufreq_target { + struct cpufreq_policy *policy; + unsigned intfreq; +}; + static unsigned int sh_cpufreq_get(unsigned int cpu) { return (clk_get_rate(&per_cpu(sh_cpuclk, cpu)) + 500) / 1000; } -/* - * Here we notify other drivers of the proposed change and the final change. - */ -static int sh_cpufreq_target(struct cpufreq_policy *policy, -unsigned int target_freq, -unsigned int relation) +static long __sh_cpufreq_target(void *arg) { - unsigned int cpu = policy->cpu; + struct cpufreq_target *target = arg; + struct cpufreq_policy *policy = target->policy; + int cpu = policy->cpu; struct clk *cpuclk = &per_cpu(sh_cpuclk, cpu); - cpumask_t cpus_allowed; struct cpufreq_freqs freqs; struct device *dev; long freq; - cpus_allowed = current->cpus_allowed; - set_cpus_allowed_ptr(current, cpumask_of(cpu)); - - BUG_ON(smp_processor_id() != cpu); + if (smp_processor_id() != cpu) + return -ENODEV; dev = get_cpu_device(cpu); /* Convert target_freq from kHz to Hz */ - freq = clk_round_rate(cpuclk, target_freq * 1000); + freq = clk_round_rate(cpuclk, target->freq * 1000); if (freq < (policy->min * 1000) || freq > (policy->max * 1000)) return -EINVAL; - dev_dbg(dev, "requested frequency %u Hz\n", target_freq * 1000); + dev_dbg(dev, "requested frequency %u Hz\n", target->freq * 1000); freqs.old = sh_cpufreq_get(cpu); freqs.new = (freq + 500) / 1000; freqs.flags = 0; - cpufreq_freq_transition_begin(policy, &freqs); - set_cpus_allowed_ptr(current, &cpus_allowed); + cpufreq_freq_transition_begin(target->policy, &freqs); clk_set_rate(cpuclk, freq); - cpufreq_freq_transition_end(policy, &freqs, 0); + cpufreq_freq_transition_end(target->policy, &freqs, 0); dev_dbg(dev, "set frequency %lu Hz\n", freq); - return 0; } +/* + * Here we notify other drivers of the proposed change and the final change. 
+ */ +static int sh_cpufreq_target(struct cpufreq_policy *policy, +unsigned int target_freq, +unsigned int relation) +{ + struct cpufreq_target data = { .policy = policy, .freq = target_freq }; + + return work_on_cpu(policy->cpu, __sh_cpufreq_target, &data); +} + static int sh_cpufreq_verify(struct cpufreq_policy *policy) { struct clk *cpuclk = &per_cpu(sh_cpuclk, policy->cpu);
[tip:sched/core] ACPI/processor: Fix error handling in __acpi_processor_start()
Commit-ID: a5cbdf693a60d5b86d4d21dfedd90f17754eb273 Gitweb: http://git.kernel.org/tip/a5cbdf693a60d5b86d4d21dfedd90f17754eb273 Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:33 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:54 +0200 ACPI/processor: Fix error handling in __acpi_processor_start() When acpi_install_notify_handler() fails the cooling device stays registered and the sysfs files created via acpi_pss_perf_init() are leaked and the function returns success. Undo acpi_pss_perf_init() and return a proper error code. Signed-off-by: Thomas Gleixner Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: linux-a...@vger.kernel.org Cc: Viresh Kumar Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.695499...@linutronix.de Signed-off-by: Thomas Gleixner --- drivers/acpi/processor_driver.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c index 9d5f0c7..eab8cda 100644 --- a/drivers/acpi/processor_driver.c +++ b/drivers/acpi/processor_driver.c @@ -251,6 +251,9 @@ static int __acpi_processor_start(struct acpi_device *device) if (ACPI_SUCCESS(status)) return 0; + result = -ENODEV; + acpi_pss_perf_exit(pr, device); + err_power_exit: acpi_processor_power_exit(pr); return result;
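The shape of this fix is the usual probe-time unwind ladder: every setup step that can fail jumps to a label that tears down exactly the steps already completed. A generic sketch, with setup_a/b/c and their teardown counterparts as placeholders rather than ACPI functions:

static int example_start(struct device *dev)
{
        int ret;

        ret = setup_a(dev);             /* e.g. power management init */
        if (ret)
                return ret;

        ret = setup_b(dev);             /* e.g. sysfs/perf init */
        if (ret)
                goto err_undo_a;

        ret = setup_c(dev);             /* e.g. notify handler install */
        if (ret)
                goto err_undo_b;        /* the rollback step this patch adds */

        return 0;

err_undo_b:
        teardown_b(dev);
err_undo_a:
        teardown_a(dev);
        return ret;
}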
[tip:sched/core] cpufreq/sparc-us3: Replace racy task affinity logic
Commit-ID: 9fe24c4e92d3963d92d7d383e28ed098bd5689d8 Gitweb: http://git.kernel.org/tip/9fe24c4e92d3963d92d7d383e28ed098bd5689d8 Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:37 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:55 +0200 cpufreq/sparc-us3: Replace racy task affinity logic The access to the safari config register in the CPU frequency functions must be executed on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by a straight forward smp function call. Signed-off-by: Thomas Gleixner Acked-by: Viresh Kumar Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: linux...@vger.kernel.org Cc: Lai Jiangshan Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201043.047558...@linutronix.de Signed-off-by: Thomas Gleixner --- drivers/cpufreq/sparc-us3-cpufreq.c | 46 + 1 file changed, 16 insertions(+), 30 deletions(-) diff --git a/drivers/cpufreq/sparc-us3-cpufreq.c b/drivers/cpufreq/sparc-us3-cpufreq.c index a8d86a4..30645b0 100644 --- a/drivers/cpufreq/sparc-us3-cpufreq.c +++ b/drivers/cpufreq/sparc-us3-cpufreq.c @@ -35,22 +35,28 @@ static struct us3_freq_percpu_info *us3_freq_table; #define SAFARI_CFG_DIV_32 0x8000UL #define SAFARI_CFG_DIV_MASK0xC000UL -static unsigned long read_safari_cfg(void) +static void read_safari_cfg(void *arg) { - unsigned long ret; + unsigned long ret, *val = arg; __asm__ __volatile__("ldxa [%%g0] %1, %0" : "=&r" (ret) : "i" (ASI_SAFARI_CONFIG)); - return ret; + *val = ret; } -static void write_safari_cfg(unsigned long val) +static void update_safari_cfg(void *arg) { + unsigned long reg, *new_bits = arg; + + read_safari_cfg(®); + reg &= ~SAFARI_CFG_DIV_MASK; + reg |= *new_bits; + __asm__ __volatile__("stxa %0, [%%g0] %1\n\t" "membar#Sync" : /* no outputs */ -: "r" (val), "i" (ASI_SAFARI_CONFIG) +: "r" (reg), "i" (ASI_SAFARI_CONFIG) : "memory"); } @@ -78,29 +84,17 @@ static unsigned long get_current_freq(unsigned int cpu, unsigned long safari_cfg static unsigned int us3_freq_get(unsigned int cpu) { - cpumask_t cpus_allowed; unsigned long reg; - unsigned int ret; - - cpumask_copy(&cpus_allowed, ¤t->cpus_allowed); - set_cpus_allowed_ptr(current, cpumask_of(cpu)); - - reg = read_safari_cfg(); - ret = get_current_freq(cpu, reg); - - set_cpus_allowed_ptr(current, &cpus_allowed); - return ret; + if (smp_call_function_single(cpu, read_safari_cfg, ®, 1)) + return 0; + return get_current_freq(cpu, reg); } static int us3_freq_target(struct cpufreq_policy *policy, unsigned int index) { unsigned int cpu = policy->cpu; - unsigned long new_bits, new_freq, reg; - cpumask_t cpus_allowed; - - cpumask_copy(&cpus_allowed, ¤t->cpus_allowed); - set_cpus_allowed_ptr(current, cpumask_of(cpu)); + unsigned long new_bits, new_freq; new_freq = sparc64_get_clock_tick(cpu) / 1000; switch (index) { @@ -121,15 +115,7 @@ static int us3_freq_target(struct cpufreq_policy *policy, unsigned int index) BUG(); } - reg = read_safari_cfg(); - - reg &= ~SAFARI_CFG_DIV_MASK; - reg |= new_bits; - write_safari_cfg(reg); - - set_cpus_allowed_ptr(current, &cpus_allowed); - - return 0; + return smp_call_function_single(cpu, update_safari_cfg, 
&new_bits, 1); } static int __init us3_freq_cpu_init(struct cpufreq_policy *policy)
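As a reference for the smp_call_function_single() pattern used here: the callback receives a single void pointer, runs on the requested CPU, and a non-zero wait argument makes the caller block until it has completed. A sketch with invented names:

#include <linux/smp.h>

static void read_cfg_on_cpu(void *arg)
{
        unsigned long *val = arg;

        *val = 0;       /* read the per-CPU configuration register here */
}

static unsigned long read_cfg_for_cpu(unsigned int cpu)
{
        unsigned long val;

        /* wait == 1: do not return before the callback has run on @cpu. */
        if (smp_call_function_single(cpu, read_cfg_on_cpu, &val, 1))
                return 0;       /* non-zero return: the CPU was not online */
        return val;
}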
[tip:sched/core] ACPI/processor: Replace racy task affinity logic
Commit-ID: 8153f9ac43897f9f4786b30badc134fcc1a4fb11 Gitweb: http://git.kernel.org/tip/8153f9ac43897f9f4786b30badc134fcc1a4fb11 Author: Thomas Gleixner AuthorDate: Wed, 12 Apr 2017 22:07:34 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:54 +0200 ACPI/processor: Replace racy task affinity logic acpi_processor_get_throttling() requires to invoke the getter function on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. acpi_processor_get_throttling() is invoked in two ways: 1) The CPU online callback, which is already running on the target CPU and obviously protected against hotplug and not affected by affinity settings. 2) The ACPI driver probe function, which is not protected against hotplug during modprobe. Switch it over to work_on_cpu() and protect the probe function against CPU hotplug. Signed-off-by: Thomas Gleixner Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: linux-a...@vger.kernel.org Cc: Viresh Kumar Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/20170412201042.785920...@linutronix.de Signed-off-by: Thomas Gleixner --- drivers/acpi/processor_driver.c | 7 - drivers/acpi/processor_throttling.c | 62 + 2 files changed, 42 insertions(+), 27 deletions(-) diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c index eab8cda..8697a82 100644 --- a/drivers/acpi/processor_driver.c +++ b/drivers/acpi/processor_driver.c @@ -262,11 +262,16 @@ err_power_exit: static int acpi_processor_start(struct device *dev) { struct acpi_device *device = ACPI_COMPANION(dev); + int ret; if (!device) return -ENODEV; - return __acpi_processor_start(device); + /* Protect against concurrent CPU hotplug operations */ + get_online_cpus(); + ret = __acpi_processor_start(device); + put_online_cpus(); + return ret; } static int acpi_processor_stop(struct device *dev) diff --git a/drivers/acpi/processor_throttling.c b/drivers/acpi/processor_throttling.c index a12f96c..3de34633 100644 --- a/drivers/acpi/processor_throttling.c +++ b/drivers/acpi/processor_throttling.c @@ -62,8 +62,8 @@ struct acpi_processor_throttling_arg { #define THROTTLING_POSTCHANGE (2) static int acpi_processor_get_throttling(struct acpi_processor *pr); -int acpi_processor_set_throttling(struct acpi_processor *pr, - int state, bool force); +static int __acpi_processor_set_throttling(struct acpi_processor *pr, + int state, bool force, bool direct); static int acpi_processor_update_tsd_coord(void) { @@ -891,7 +891,8 @@ static int acpi_processor_get_throttling_ptc(struct acpi_processor *pr) ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Invalid throttling state, reset\n")); state = 0; - ret = acpi_processor_set_throttling(pr, state, true); + ret = __acpi_processor_set_throttling(pr, state, true, + true); if (ret) return ret; } @@ -901,36 +902,31 @@ static int acpi_processor_get_throttling_ptc(struct acpi_processor *pr) return 0; } -static int acpi_processor_get_throttling(struct acpi_processor *pr) +static long __acpi_processor_get_throttling(void *data) { - cpumask_var_t saved_mask; - int ret; + struct acpi_processor *pr = data; + + return 
pr->throttling.acpi_processor_get_throttling(pr); +} +static int acpi_processor_get_throttling(struct acpi_processor *pr) +{ if (!pr) return -EINVAL; if (!pr->flags.throttling) return -ENODEV; - if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL)) - return -ENOMEM; - /* -* Migrate task to the cpu pointed by pr. +* This is either called from the CPU hotplug callback of +* processor_driver or via the ACPI probe function. In the latter +* case the CPU is not guaranteed to be online. Both call sites are +* protected against CPU hotplug. */ - cpumask_copy(saved_mask, ¤t->cpus_allowed); - /* FIXME: use work_on_cpu() */ - if (set_cpus_allowed_ptr(current, cpumask_of(pr->id))) { - /* Can't migrate to the target pr->id CPU. Exit */ - free_
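The hotplug protection added to the probe path is simply a get_online_cpus()/put_online_cpus() bracket around the code that may queue work on a specific CPU; a minimal sketch, where __example_start() stands in for the real start helper:

#include <linux/cpu.h>

static int example_driver_start(struct device *dev)
{
        int ret;

        /* Keep CPUs from going offline while work may run on one of them. */
        get_online_cpus();
        ret = __example_start(dev);
        put_online_cpus();

        return ret;
}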
[tip:sched/core] cpufreq/sparc-us2e: Replace racy task affinity logic
Commit-ID: 12699ac53a2e5fbd1fd7c164b11685d55c8aa28b Gitweb: http://git.kernel.org/tip/12699ac53a2e5fbd1fd7c164b11685d55c8aa28b Author: Thomas Gleixner AuthorDate: Thu, 13 Apr 2017 10:22:43 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:56 +0200 cpufreq/sparc-us2e: Replace racy task affinity logic The access to the HBIRD_ESTAR_MODE register in the cpu frequency control functions must happen on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by a straight forward smp function call. Signed-off-by: Thomas Gleixner Acked-by: Viresh Kumar Cc: Fenghua Yu Cc: Tony Luck Cc: Herbert Xu Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: linux...@vger.kernel.org Cc: Lai Jiangshan Cc: Michael Ellerman Cc: Tejun Heo Cc: "David S. Miller" Cc: Len Brown Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131020280.2408@nanos Signed-off-by: Thomas Gleixner --- drivers/cpufreq/sparc-us2e-cpufreq.c | 45 +--- 1 file changed, 21 insertions(+), 24 deletions(-) diff --git a/drivers/cpufreq/sparc-us2e-cpufreq.c b/drivers/cpufreq/sparc-us2e-cpufreq.c index 35ddb6d..90f33ef 100644 --- a/drivers/cpufreq/sparc-us2e-cpufreq.c +++ b/drivers/cpufreq/sparc-us2e-cpufreq.c @@ -118,10 +118,6 @@ static void us2e_transition(unsigned long estar, unsigned long new_bits, unsigned long clock_tick, unsigned long old_divisor, unsigned long divisor) { - unsigned long flags; - - local_irq_save(flags); - estar &= ~ESTAR_MODE_DIV_MASK; /* This is based upon the state transition diagram in the IIe manual. 
*/ @@ -152,8 +148,6 @@ static void us2e_transition(unsigned long estar, unsigned long new_bits, } else { BUG(); } - - local_irq_restore(flags); } static unsigned long index_to_estar_mode(unsigned int index) @@ -229,48 +223,51 @@ static unsigned long estar_to_divisor(unsigned long estar) return ret; } +static void __us2e_freq_get(void *arg) +{ + unsigned long *estar = arg; + + *estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR); +} + static unsigned int us2e_freq_get(unsigned int cpu) { - cpumask_t cpus_allowed; unsigned long clock_tick, estar; - cpumask_copy(&cpus_allowed, ¤t->cpus_allowed); - set_cpus_allowed_ptr(current, cpumask_of(cpu)); - clock_tick = sparc64_get_clock_tick(cpu) / 1000; - estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR); - - set_cpus_allowed_ptr(current, &cpus_allowed); + if (smp_call_function_single(cpu, __us2e_freq_get, &estar, 1)) + return 0; return clock_tick / estar_to_divisor(estar); } -static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index) +static void __us2e_freq_target(void *arg) { - unsigned int cpu = policy->cpu; + unsigned int cpu = smp_processor_id(); + unsigned int *index = arg; unsigned long new_bits, new_freq; unsigned long clock_tick, divisor, old_divisor, estar; - cpumask_t cpus_allowed; - - cpumask_copy(&cpus_allowed, ¤t->cpus_allowed); - set_cpus_allowed_ptr(current, cpumask_of(cpu)); new_freq = clock_tick = sparc64_get_clock_tick(cpu) / 1000; - new_bits = index_to_estar_mode(index); - divisor = index_to_divisor(index); + new_bits = index_to_estar_mode(*index); + divisor = index_to_divisor(*index); new_freq /= divisor; estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR); old_divisor = estar_to_divisor(estar); - if (old_divisor != divisor) + if (old_divisor != divisor) { us2e_transition(estar, new_bits, clock_tick * 1000, old_divisor, divisor); + } +} - set_cpus_allowed_ptr(current, &cpus_allowed); +static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index) +{ + unsigned int cpu = policy->cpu; - return 0; + return smp_call_function_single(cpu, __us2e_freq_target, &index, 1); } static int __init us2e_freq_cpu_init(struct cpufreq_policy *policy)
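A side note on the dropped local_irq_save()/local_irq_restore() pair around the E-Star transition: as far as I understand, a smp_call_function_single() callback already runs with local interrupts disabled, both in the remote IPI case and in the local shortcut, so the explicit disabling became redundant. An illustrative assertion of that assumption inside such a callback:

#include <linux/bug.h>
#include <linux/irqflags.h>

static void freq_transition_on_cpu(void *arg)
{
        /* Assumption being relied on: interrupts are already off here. */
        WARN_ON_ONCE(!irqs_disabled());

        /* ... program the HBIRD_ESTAR_MODE register here ... */
}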
[tip:sched/core] crypto: N2 - Replace racy task affinity logic
Commit-ID: 73810a069120aa831debb4d967310ab900f628ad Gitweb: http://git.kernel.org/tip/73810a069120aa831debb4d967310ab900f628ad Author: Thomas Gleixner AuthorDate: Thu, 13 Apr 2017 10:20:23 +0200 Committer: Thomas Gleixner CommitDate: Sat, 15 Apr 2017 12:20:56 +0200 crypto: N2 - Replace racy task affinity logic spu_queue_register() needs to invoke setup functions on a particular CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by using work_on_cpu_safe() which guarantees to run the code on the requested CPU or to fail in case the CPU is offline. Signed-off-by: Thomas Gleixner Acked-by: Herbert Xu Acked-by: "David S. Miller" Cc: Fenghua Yu Cc: Tony Luck Cc: "Rafael J. Wysocki" Cc: Peter Zijlstra Cc: Benjamin Herrenschmidt Cc: Sebastian Siewior Cc: Lai Jiangshan Cc: Viresh Kumar Cc: linux-cry...@vger.kernel.org Cc: Michael Ellerman Cc: Tejun Heo Cc: Len Brown Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131019420.2408@nanos Signed-off-by: Thomas Gleixner --- drivers/crypto/n2_core.c | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/drivers/crypto/n2_core.c b/drivers/crypto/n2_core.c index c5aac25..4ecb77a 100644 --- a/drivers/crypto/n2_core.c +++ b/drivers/crypto/n2_core.c @@ -65,6 +65,11 @@ struct spu_queue { struct list_headlist; }; +struct spu_qreg { + struct spu_queue*queue; + unsigned long type; +}; + static struct spu_queue **cpu_to_cwq; static struct spu_queue **cpu_to_mau; @@ -1631,31 +1636,27 @@ static void queue_cache_destroy(void) kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]); } -static int spu_queue_register(struct spu_queue *p, unsigned long q_type) +static long spu_queue_register_workfn(void *arg) { - cpumask_var_t old_allowed; + struct spu_qreg *qr = arg; + struct spu_queue *p = qr->queue; + unsigned long q_type = qr->type; unsigned long hv_ret; - if (cpumask_empty(&p->sharing)) - return -EINVAL; - - if (!alloc_cpumask_var(&old_allowed, GFP_KERNEL)) - return -ENOMEM; - - cpumask_copy(old_allowed, ¤t->cpus_allowed); - - set_cpus_allowed_ptr(current, &p->sharing); - hv_ret = sun4v_ncs_qconf(q_type, __pa(p->q), CWQ_NUM_ENTRIES, &p->qhandle); if (!hv_ret) sun4v_ncs_sethead_marker(p->qhandle, 0); - set_cpus_allowed_ptr(current, old_allowed); + return hv_ret ? -EINVAL : 0; +} - free_cpumask_var(old_allowed); +static int spu_queue_register(struct spu_queue *p, unsigned long q_type) +{ + int cpu = cpumask_any_and(&p->sharing, cpu_online_mask); + struct spu_qreg qr = { .queue = p, .type = q_type }; - return (hv_ret ? -EINVAL : 0); + return work_on_cpu_safe(cpu, spu_queue_register_workfn, &qr); } static int spu_queue_setup(struct spu_queue *p)
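A condensed sketch of the new registration flow with invented names: pick any online CPU from the queue's sharing mask and let work_on_cpu_safe() either run the setup there or fail, instead of silently executing on the wrong CPU.

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/workqueue.h>

static long queue_setup_workfn(void *arg)
{
        /* ... hypervisor queue configuration for the current CPU ... */
        return 0;
}

static long setup_on_sharing_cpu(const struct cpumask *sharing, void *arg)
{
        int cpu = cpumask_any_and(sharing, cpu_online_mask);

        if (cpu >= nr_cpu_ids)          /* no online CPU in the sharing set */
                return -EINVAL;

        /* Fails, rather than running elsewhere, if @cpu goes offline. */
        return work_on_cpu_safe(cpu, queue_setup_workfn, arg);
}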
Re: [Xen-devel] [PATCH v2] xen, kdump: handle pv domain in paddr_vmcoreinfo_note()
On Sat, 15 Apr 2017 00:26:05 +0200 Daniel Kiper wrote: > On Fri, Apr 14, 2017 at 06:53:36PM +0200, Petr Tesarik wrote: >[...] > > shifted towards libkdumpfile (https://github.com/ptesarik/libkdumpfile), > > and this library can open PV guest dump files without any issues. > > Great! AIUI, it reminds my idea to make such think. However, I have not > have time to make it happen. Is it based on makedumpfile or written from > scratch? Do you plan support for Linux kernel dumps and/or Xen ones? Some ideas are borrowed from existing tools (makedumpfile, crash). All code is written from scratch, however. The kdumpfile library itself is designed for use with any platform and operating system. Xen is treated as just another type of operating system. Based on which OS type is initialized, the library is able to provide a hypervisor view or a Dom0 view. There is another project led by Jeff Mahoney, which extends standard gdb with semantic commands. This project supports only x86_64 Linux right now. See https://github.com/jeffmahoney/crash-python. Petr T
[PATCH] regulator: rn5t618: Fix Fix out of bounds array access
The commit "regulator: rn5t618: Add RN5T567 PMIC support" added RN5T618_DCDC4 to the enum, then RN5T618_REG_NUM is also changed. So for rn5t618, there is out of bounds array access when checking regulators[i].name in the for loop. The number of regulators is different for rn5t567 and rn5t618, so we had better remove RN5T618_REG_NUM and get the correct num_regulators during probe instead. Fixes: ed6d362d8dbc ("regulator: rn5t618: Add RN5T567 PMIC support") Signed-off-by: Axel Lin --- drivers/regulator/rn5t618-regulator.c | 8 include/linux/mfd/rn5t618.h | 1 - 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/regulator/rn5t618-regulator.c b/drivers/regulator/rn5t618-regulator.c index 8d2819e..0c09143 100644 --- a/drivers/regulator/rn5t618-regulator.c +++ b/drivers/regulator/rn5t618-regulator.c @@ -85,14 +85,17 @@ static int rn5t618_regulator_probe(struct platform_device *pdev) struct regulator_config config = { }; struct regulator_dev *rdev; struct regulator_desc *regulators; + int num_regulators; int i; switch (rn5t618->variant) { case RN5T567: regulators = rn5t567_regulators; + num_regulators = ARRAY_SIZE(rn5t567_regulators); break; case RN5T618: regulators = rn5t618_regulators; + num_regulators = ARRAY_SIZE(rn5t618_regulators); break; default: return -EINVAL; @@ -101,10 +104,7 @@ static int rn5t618_regulator_probe(struct platform_device *pdev) config.dev = pdev->dev.parent; config.regmap = rn5t618->regmap; - for (i = 0; i < RN5T618_REG_NUM; i++) { - if (!regulators[i].name) - continue; - + for (i = 0; i < num_regulators; i++) { rdev = devm_regulator_register(&pdev->dev, ®ulators[i], &config); diff --git a/include/linux/mfd/rn5t618.h b/include/linux/mfd/rn5t618.h index e5a6cde..d7b3155 100644 --- a/include/linux/mfd/rn5t618.h +++ b/include/linux/mfd/rn5t618.h @@ -233,7 +233,6 @@ enum { RN5T618_LDO5, RN5T618_LDORTC1, RN5T618_LDORTC2, - RN5T618_REG_NUM, }; enum { -- 2.9.3
[PATCH RESEND] regulator: rn5t618: Fix out of bounds array access
The commit "regulator: rn5t618: Add RN5T567 PMIC support" added RN5T618_DCDC4 to the enum, then RN5T618_REG_NUM is also changed. So for rn5t618, there is out of bounds array access when checking regulators[i].name in the for loop. The number of regulators is different for rn5t567 and rn5t618, so we had better remove RN5T618_REG_NUM and get the correct num_regulators during probe instead. Fixes: ed6d362d8dbc ("regulator: rn5t618: Add RN5T567 PMIC support") Signed-off-by: Axel Lin --- RESEND: Correct subject line (remove double Fix) drivers/regulator/rn5t618-regulator.c | 8 include/linux/mfd/rn5t618.h | 1 - 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/regulator/rn5t618-regulator.c b/drivers/regulator/rn5t618-regulator.c index 8d2819e..0c09143 100644 --- a/drivers/regulator/rn5t618-regulator.c +++ b/drivers/regulator/rn5t618-regulator.c @@ -85,14 +85,17 @@ static int rn5t618_regulator_probe(struct platform_device *pdev) struct regulator_config config = { }; struct regulator_dev *rdev; struct regulator_desc *regulators; + int num_regulators; int i; switch (rn5t618->variant) { case RN5T567: regulators = rn5t567_regulators; + num_regulators = ARRAY_SIZE(rn5t567_regulators); break; case RN5T618: regulators = rn5t618_regulators; + num_regulators = ARRAY_SIZE(rn5t618_regulators); break; default: return -EINVAL; @@ -101,10 +104,7 @@ static int rn5t618_regulator_probe(struct platform_device *pdev) config.dev = pdev->dev.parent; config.regmap = rn5t618->regmap; - for (i = 0; i < RN5T618_REG_NUM; i++) { - if (!regulators[i].name) - continue; - + for (i = 0; i < num_regulators; i++) { rdev = devm_regulator_register(&pdev->dev, ®ulators[i], &config); diff --git a/include/linux/mfd/rn5t618.h b/include/linux/mfd/rn5t618.h index e5a6cde..d7b3155 100644 --- a/include/linux/mfd/rn5t618.h +++ b/include/linux/mfd/rn5t618.h @@ -233,7 +233,6 @@ enum { RN5T618_LDO5, RN5T618_LDORTC1, RN5T618_LDORTC2, - RN5T618_REG_NUM, }; enum { -- 2.9.3
[PATCH] Revert "mm, page_alloc: only use per-cpu allocator for irq-safe requests"
This reverts commit 374ad05ab64d696303cec5cc8ec3a65d457b7b1c. While the patch worked great for userspace allocations, the fact that softirq loses the per-cpu allocator caused problems. It needs to be redone taking into account that a separate list is needed for hard/soft IRQs or alternatively find a cheap way of detecting reentry due to an interrupt. Both are possible but sufficiently tricky that it shouldn't be rushed. Jesper had one method for allowing softirqs but reported that the cost was high enough that it performed similarly to a plain revert. His figures for netperf TCP_STREAM were as follows Baseline v4.10.0 : 60316 Mbit/s Current 4.11.0-rc6: 47491 Mbit/s This patch: 60662 Mbit/s As this is a regression, I wish to revert to noirq allocator for now and go back to the drawing board. Signed-off-by: Mel Gorman Reported-by: Tariq Toukan --- mm/page_alloc.c | 43 --- 1 file changed, 20 insertions(+), 23 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 6cbde310abed..3bba4f46214c 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1090,10 +1090,10 @@ static void free_pcppages_bulk(struct zone *zone, int count, { int migratetype = 0; int batch_free = 0; - unsigned long nr_scanned, flags; + unsigned long nr_scanned; bool isolated_pageblocks; - spin_lock_irqsave(&zone->lock, flags); + spin_lock(&zone->lock); isolated_pageblocks = has_isolate_pageblock(zone); nr_scanned = node_page_state(zone->zone_pgdat, NR_PAGES_SCANNED); if (nr_scanned) @@ -1142,7 +1142,7 @@ static void free_pcppages_bulk(struct zone *zone, int count, trace_mm_page_pcpu_drain(page, 0, mt); } while (--count && --batch_free && !list_empty(list)); } - spin_unlock_irqrestore(&zone->lock, flags); + spin_unlock(&zone->lock); } static void free_one_page(struct zone *zone, @@ -1150,9 +1150,8 @@ static void free_one_page(struct zone *zone, unsigned int order, int migratetype) { - unsigned long nr_scanned, flags; - spin_lock_irqsave(&zone->lock, flags); - __count_vm_events(PGFREE, 1 << order); + unsigned long nr_scanned; + spin_lock(&zone->lock); nr_scanned = node_page_state(zone->zone_pgdat, NR_PAGES_SCANNED); if (nr_scanned) __mod_node_page_state(zone->zone_pgdat, NR_PAGES_SCANNED, -nr_scanned); @@ -1162,7 +1161,7 @@ static void free_one_page(struct zone *zone, migratetype = get_pfnblock_migratetype(page, pfn); } __free_one_page(page, pfn, zone, order, migratetype); - spin_unlock_irqrestore(&zone->lock, flags); + spin_unlock(&zone->lock); } static void __meminit __init_single_page(struct page *page, unsigned long pfn, @@ -1240,6 +1239,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end) static void __free_pages_ok(struct page *page, unsigned int order) { + unsigned long flags; int migratetype; unsigned long pfn = page_to_pfn(page); @@ -1247,7 +1247,10 @@ static void __free_pages_ok(struct page *page, unsigned int order) return; migratetype = get_pfnblock_migratetype(page, pfn); + local_irq_save(flags); + __count_vm_events(PGFREE, 1 << order); free_one_page(page_zone(page), page, pfn, order, migratetype); + local_irq_restore(flags); } static void __init __free_pages_boot_core(struct page *page, unsigned int order) @@ -2219,9 +,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, int migratetype, bool cold) { int i, alloced = 0; - unsigned long flags; - spin_lock_irqsave(&zone->lock, flags); + spin_lock(&zone->lock); for (i = 0; i < count; ++i) { struct page *page = __rmqueue(zone, order, migratetype); if (unlikely(page == NULL)) @@ -2257,7 +2259,7 @@ static int 
rmqueue_bulk(struct zone *zone, unsigned int order, * pages added to the pcp list. */ __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order)); - spin_unlock_irqrestore(&zone->lock, flags); + spin_unlock(&zone->lock); return alloced; } @@ -2478,20 +2480,17 @@ void free_hot_cold_page(struct page *page, bool cold) { struct zone *zone = page_zone(page); struct per_cpu_pages *pcp; + unsigned long flags; unsigned long pfn = page_to_pfn(page); int migratetype; - if (in_interrupt()) { - __free_pages_ok(page, 0); - return; - } - if (!free_pcp_prepare(page)) return; migratetype = get_pfnblock_migratetype(page, pfn); set_pcppage_migratetype(page, migratetype); - preempt_disable(); + local_irq_save(flags); +
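For context on the locking convention the revert restores: local interrupts are disabled at the outer entry points, and the inner zone-lock takers then use the plain spin_lock() variants because they may only be called with interrupts already off. A generic sketch of that convention, with invented names rather than page-allocator code:

#include <linux/bug.h>
#include <linux/irqflags.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(inner_lock);

/* Must only be called with local interrupts disabled. */
static void inner_free(void)
{
        WARN_ON_ONCE(!irqs_disabled());

        spin_lock(&inner_lock);
        /* ... put pages back on the shared free lists ... */
        spin_unlock(&inner_lock);
}

/* Outer entry point, callable from any context. */
static void outer_free(void)
{
        unsigned long flags;

        local_irq_save(flags);  /* makes the plain spin_lock() above IRQ-safe */
        inner_free();
        local_irq_restore(flags);
}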
Re: [PATCH] mm, page_alloc: re-enable softirq use of per-cpu page allocator
On Fri, Apr 14, 2017 at 12:10:27PM +0200, Jesper Dangaard Brouer wrote: > On Mon, 10 Apr 2017 14:26:16 -0700 > Andrew Morton wrote: > > > On Mon, 10 Apr 2017 16:08:21 +0100 Mel Gorman > > wrote: > > > > > IRQ context were excluded from using the Per-Cpu-Pages (PCP) lists caching > > > of order-0 pages in commit 374ad05ab64d ("mm, page_alloc: only use per-cpu > > > allocator for irq-safe requests"). > > > > > > This unfortunately also included excluded SoftIRQ. This hurt the > > > performance > > > for the use-case of refilling DMA RX rings in softirq context. > > > > Out of curiosity: by how much did it "hurt"? > > > > > > > > Tariq found: > > > > : I disabled the page-cache (recycle) mechanism to stress the page > > : allocator, and see a drastic degradation in BW, from 47.5 G in v4.10 to > > : 31.4 G in v4.11-rc1 (34% drop). > > I've tried to reproduce this in my home testlab, using ConnectX-4 dual > 100Gbit/s. Hardware limits cause that I cannot reach 100Gbit/s, once a > memory copy is performed. (Word of warning: you need PCIe Gen3 width > 16 (which I do have) to handle 100Gbit/s, and the memory bandwidth of > the system also need something like 2x 12500MBytes/s (which is where my > system failed)). > > The mlx5 driver have a driver local page recycler, which I can see fail > between 29%-38% of the time, with 8 parallel netperf TCP_STREAMs. I > speculate adding more streams will make in fail more. To factor out > the driver recycler, I simply disable it (like I believe Tariq also did). > > With disabled-mlx5-recycler, 8 parallel netperf TCP_STREAMs: > > Baseline v4.10.0 : 60316 Mbit/s > Current 4.11.0-rc6: 47491 Mbit/s > This patch: 60662 Mbit/s > > While this patch does "fix" the performance regression, it does not > bring any noticeable improvement (as my micro-bench also indicated), > thus I feel our previous optimization is almost nullified. (p.s. It > does feel wrong to argue against my own patch ;-)). > > The reason for the current 4.11.0-rc6 regression is lock congestion on > the (per NUMA) page allocator lock, perf report show we spend 34.92% in > queued_spin_lock_slowpath (compared to top#2 copy cost of 13.81% in > copy_user_enhanced_fast_string). > The lock contention is likely due to the per-cpu allocator being bypassed. > > > then with this patch he found > > > > : It looks very good! I get line-rate (94Gbits/sec) with 8 streams, in > > : comparison to less than 55Gbits/sec before. > > > > Can I take this to mean that the page allocator's per-cpu-pages feature > > ended up doubling the performance of this driver? Better than the > > driver's private page recycling? I'd like to believe that, but am > > having trouble doing so ;) > > I would not conclude that. I'm also very suspicious about such big > performance "jumps". Tariq should also benchmark with v4.10 and a > disabled mlx5-recycler, as I believe the results should be the same as > after this patch. > > That said, it is possible to see a regression this large, when all the > CPUs are congesting on the page allocator lock. AFAIK Tariq also > mentioned seeing 60% spend on the lock, which would confirm this theory. > On that basis, I've posted a revert of the original patch which should either go into 4.11 or 4.11-stable. Andrew, the revert should also remove the "re-enable softirq use of per-cpu page" patch from mmotm. Thanks. -- Mel Gorman SUSE Labs
Re: [PATCH v2 11/33] dm: add dax_device and dax_operations support
[ adding some missing cc's ] Cover letter patch here: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009648.html On Fri, Apr 14, 2017 at 7:34 PM, Dan Williams wrote: > Allocate a dax_device to represent the capacity of a device-mapper > instance. Provide a ->direct_access() method via the new dax_operations > indirection that mirrors the functionality of the current direct_access > support via block_device_operations. Once fs/dax.c has been converted > to use dax_operations the old dm_blk_direct_access() will be removed. > > A new helper dm_dax_get_live_target() is introduced to separate some of > the dm-specifics from the direct_access implementation. > > This enabling is only for the top-level dm representation to upper > layers. Converting target direct_access implementations is deferred to a > separate patch. > > Cc: Toshi Kani > Cc: Mike Snitzer > Signed-off-by: Dan Williams > --- > drivers/md/Kconfig|1 > drivers/md/dm-core.h |1 > drivers/md/dm.c | 84 > ++--- > include/linux/device-mapper.h |1 > 4 files changed, 73 insertions(+), 14 deletions(-) > > diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig > index b7767da50c26..1de8372d9459 100644 > --- a/drivers/md/Kconfig > +++ b/drivers/md/Kconfig > @@ -200,6 +200,7 @@ config BLK_DEV_DM_BUILTIN > config BLK_DEV_DM > tristate "Device mapper support" > select BLK_DEV_DM_BUILTIN > + select DAX > ---help--- > Device-mapper is a low level volume manager. It works by allowing > people to specify mappings for ranges of logical sectors. Various > diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h > index 136fda3ff9e5..538630190f66 100644 > --- a/drivers/md/dm-core.h > +++ b/drivers/md/dm-core.h > @@ -58,6 +58,7 @@ struct mapped_device { > struct target_type *immutable_target_type; > > struct gendisk *disk; > + struct dax_device *dax_dev; > char name[16]; > > void *interface_ptr; > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index dfb75979e455..bd56dfe43a99 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -908,31 +909,68 @@ int dm_set_target_max_io_len(struct dm_target *ti, > sector_t len) > } > EXPORT_SYMBOL_GPL(dm_set_target_max_io_len); > > -static long dm_blk_direct_access(struct block_device *bdev, sector_t sector, > -void **kaddr, pfn_t *pfn, long size) > +static struct dm_target *dm_dax_get_live_target(struct mapped_device *md, > + sector_t sector, int *srcu_idx) > { > - struct mapped_device *md = bdev->bd_disk->private_data; > struct dm_table *map; > struct dm_target *ti; > - int srcu_idx; > - long len, ret = -EIO; > > - map = dm_get_live_table(md, &srcu_idx); > + map = dm_get_live_table(md, srcu_idx); > if (!map) > - goto out; > + return NULL; > > ti = dm_table_find_target(map, sector); > if (!dm_target_is_valid(ti)) > - goto out; > + return NULL; > > - len = max_io_len(sector, ti) << SECTOR_SHIFT; > - size = min(len, size); > + return ti; > +} > > - if (ti->type->direct_access) > - ret = ti->type->direct_access(ti, sector, kaddr, pfn, size); > -out: > +static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, > + long nr_pages, void **kaddr, pfn_t *pfn) > +{ > + struct mapped_device *md = dax_get_private(dax_dev); > + sector_t sector = pgoff * PAGE_SECTORS; > + struct dm_target *ti; > + long len, ret = -EIO; > + int srcu_idx; > + > + ti = dm_dax_get_live_target(md, sector, &srcu_idx); > + > + if (!ti) > + goto out; > + if (!ti->type->direct_access) > + goto out; > + len = max_io_len(sector, 
ti) / PAGE_SECTORS; > + if (len < 1) > + goto out; > + nr_pages = min(len, nr_pages); > + if (ti->type->direct_access) { > + ret = ti->type->direct_access(ti, sector, kaddr, pfn, > + nr_pages * PAGE_SIZE); > + /* > +* FIXME: convert ti->type->direct_access to return > +* nr_pages directly. > +*/ > + if (ret >= 0) > + ret /= PAGE_SIZE; > + } > + out: > dm_put_live_table(md, srcu_idx); > - return min(ret, size); > + > + return ret; > +} > + > +static long dm_blk_direct_access(struct block_device *bdev, sector_t sector, > + void **kaddr, pfn_t *pfn, long size) > +{ > + struct mapped_device *md = bdev->bd_disk->private_data; > + struct dax_device *dax_dev = md->dax_dev; > + long nr_pages = size /
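The unit conversions in the new ->direct_access() path are easy to trip over, so here is the arithmetic spelled out with invented helper names (the patch itself uses PAGE_SECTORS): DAX works in pages and page offsets, the block layer in 512-byte sectors, and the legacy target callback still deals in bytes.

#include <linux/mm.h>           /* PAGE_SIZE */
#include <linux/types.h>        /* sector_t, pgoff_t */

#define SECTORS_PER_PAGE        (PAGE_SIZE >> 9)        /* 8 with 4 KiB pages */

static inline sector_t dax_pgoff_to_sector(pgoff_t pgoff)
{
        return (sector_t)pgoff * SECTORS_PER_PAGE;
}

static inline long dax_bytes_to_pages(long nr_bytes)
{
        return nr_bytes / PAGE_SIZE;    /* what the FIXME conversion does for now */
}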
[backport v4.9] tpm_tis: use default timeout value if chip reports it as zero
From: "Maciej S. Szmigiero" Since commit 1107d065fdf1 ("tpm_tis: Introduce intermediate layer for TPM access") Atmel 3203 TPM on ThinkPad X61S (TPM firmware version 13.9) no longer works. The initialization proceeds fine until we get and start using chip-reported timeouts - and the chip reports C and D timeouts of zero. It turns out that until commit 8e54caf407b98e ("tpm: Provide a generic means to override the chip returned timeouts") we had actually let default timeout values remain in this case, so let's bring back this behavior to make chips like Atmel 3203 work again. Use a common code that was introduced by that commit so a warning is printed in this case and /sys/class/tpm/tpm*/timeouts correctly says the timeouts aren't chip-original. Fixes: 1107d065fdf1 ("tpm_tis: Introduce intermediate layer for TPM access") Cc: sta...@vger.kernel.org Signed-off-by: Maciej S. Szmigiero Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen --- Backport v4.9. Can you test it? drivers/char/tpm/tpm-interface.c | 59 ++-- drivers/char/tpm/tpm_tis.c | 2 +- drivers/char/tpm/tpm_tis_core.c | 6 ++-- drivers/char/tpm/tpm_tis_core.h | 2 +- 4 files changed, 38 insertions(+), 31 deletions(-) diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c index 3a9149cf0110..4c914fe25802 100644 --- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c @@ -489,9 +489,9 @@ static int tpm_startup(struct tpm_chip *chip, __be16 startup_type) int tpm_get_timeouts(struct tpm_chip *chip) { struct tpm_cmd_t tpm_cmd; - unsigned long new_timeout[4]; - unsigned long old_timeout[4]; struct duration_t *duration_cap; + cap_t cap; + unsigned long timeout_old[4], timeout_chip[4], timeout_eff[4]; ssize_t rc; if (chip->flags & TPM_CHIP_FLAG_TPM2) { @@ -537,16 +537,15 @@ int tpm_get_timeouts(struct tpm_chip *chip) goto duration; } - if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 || - be32_to_cpu(tpm_cmd.header.out.length) - != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32)) - return -EINVAL; - - old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a); - old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b); - old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c); - old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d); - memcpy(new_timeout, old_timeout, sizeof(new_timeout)); + timeout_old[0] = jiffies_to_usecs(chip->timeout_a); + timeout_old[1] = jiffies_to_usecs(chip->timeout_b); + timeout_old[2] = jiffies_to_usecs(chip->timeout_c); + timeout_old[3] = jiffies_to_usecs(chip->timeout_d); + timeout_chip[0] = be32_to_cpu(cap.timeout.a); + timeout_chip[1] = be32_to_cpu(cap.timeout.b); + timeout_chip[2] = be32_to_cpu(cap.timeout.c); + timeout_chip[3] = be32_to_cpu(cap.timeout.d); + memcpy(timeout_eff, timeout_chip, sizeof(timeout_eff)); /* * Provide ability for vendor overrides of timeout values in case @@ -554,16 +553,24 @@ int tpm_get_timeouts(struct tpm_chip *chip) */ if (chip->ops->update_timeouts != NULL) chip->timeout_adjusted = - chip->ops->update_timeouts(chip, new_timeout); + chip->ops->update_timeouts(chip, timeout_eff); if (!chip->timeout_adjusted) { - /* Don't overwrite default if value is 0 */ - if (new_timeout[0] != 0 && new_timeout[0] < 1000) { - int i; + /* Restore default if chip reported 0 */ + int i; + + for (i = 0; i < ARRAY_SIZE(timeout_eff); i++) { + if (timeout_eff[i]) + continue; + + timeout_eff[i] = timeout_old[i]; + chip->timeout_adjusted = true; + } + if (timeout_eff[0] != 0 && 
timeout_eff[0] < 1000) { /* timeouts in msec rather usec */ - for (i = 0; i != ARRAY_SIZE(new_timeout); i++) - new_timeout[i] *= 1000; + for (i = 0; i != ARRAY_SIZE(timeout_eff); i++) + timeout_eff[i] *= 1000; chip->timeout_adjusted = true; } } @@ -572,16 +579,16 @@ int tpm_get_timeouts(struct tpm_chip *chip) if (chip->timeout_adjusted) { dev_info(&chip->dev, HW_ERR "Adjusting reported timeouts: A %lu->%luus B %lu->%luus C %lu->%luus D %lu->%luus\n", -old_timeout[0], new_timeout[0], -old_timeout[1], new_timeout[1], -old_timeout[2], new_timeout[2], -
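The core of the change is a small fixup pass over the four chip-reported timeouts: keep the driver default whenever the chip reports zero, then scale everything if the values look like milliseconds rather than microseconds. A stand-alone sketch of that logic (not the driver function itself):

#include <linux/types.h>

static void fixup_timeouts(unsigned long eff[4], const unsigned long defaults[4],
                           bool *adjusted)
{
        int i;

        for (i = 0; i < 4; i++) {
                if (eff[i])
                        continue;
                eff[i] = defaults[i];   /* chip reported 0: fall back to default */
                *adjusted = true;
        }

        if (eff[0] && eff[0] < 1000) {  /* values look like msec rather than usec */
                for (i = 0; i < 4; i++)
                        eff[i] *= 1000;
                *adjusted = true;
        }
}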
Re: [RFC PATCH v5 5/6] mtd: spi-nor: parse Serial Flash Discoverable Parameters (SFDP) tables
On 03/23/2017 12:33 AM, Cyrille Pitchen wrote: > This patch adds support to the JESD216B standard and parses the SFDP > tables to dynamically initialize the 'struct spi_nor_flash_parameter'. > > Signed-off-by: Cyrille Pitchen Hi, mostly nits below. > --- > drivers/mtd/spi-nor/spi-nor.c | 558 > ++ > include/linux/mtd/spi-nor.h | 6 + > 2 files changed, 564 insertions(+) > > diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c > index 2e54792d506d..ce8722055a9c 100644 > --- a/drivers/mtd/spi-nor/spi-nor.c > +++ b/drivers/mtd/spi-nor/spi-nor.c > @@ -17,6 +17,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -86,6 +87,7 @@ struct flash_info { >* to support memory size above 128Mib. >*/ > #define NO_CHIP_ERASEBIT(12) /* Chip does not support chip > erase */ > +#define SPI_NOR_SKIP_SFDPBIT(13) /* Skip parsing of SFDP tables */ > }; > > #define JEDEC_MFR(info) ((info)->id[0]) > @@ -1593,6 +1595,99 @@ static int spansion_quad_enable(struct spi_nor *nor) > return 0; > } > > +static int spansion_new_quad_enable(struct spi_nor *nor) > +{ > + u8 sr_cr[2]; > + int ret; > + > + /* Check current Quad Enable bit value. */ > + ret = read_cr(nor); > + if (ret < 0) { > + dev_err(nor->dev, > + "error while reading configuration register\n"); > + return -EINVAL; > + } > + sr_cr[1] = ret; > + if (sr_cr[1] & CR_QUAD_EN_SPAN) > + return 0; > + > + dev_info(nor->dev, "setting Spansion Quad Enable (non-volatile) bit\n"); > + > + /* Keep the current value of the Status Register. */ > + ret = read_sr(nor); > + if (ret < 0) { > + dev_err(nor->dev, > + "error while reading status register\n"); > + return -EINVAL; > + } > + sr_cr[0] = ret; > + sr_cr[1] |= CR_QUAD_EN_SPAN; > + > + write_enable(nor); > + > + ret = nor->write_reg(nor, SPINOR_OP_WRSR, sr_cr, 2); > + if (ret < 0) { > + dev_err(nor->dev, > + "error while writing configuration register\n"); > + return -EINVAL; > + } > + > + ret = spi_nor_wait_till_ready(nor); > + if (ret < 0) { > + dev_err(nor->dev, "error while waiting for WRSR completion\n"); > + return ret; > + } > + > + /* read back and check it */ > + ret = read_cr(nor); > + if (!(ret > 0 && (ret & CR_QUAD_EN_SPAN))) { Nit, you might want to align this with sr2_bit7_quad_enable() below, that is if (ret || !(ret & CR_QUAD_ENABLE_SPAN)) ... > + dev_err(nor->dev, "Spansion Quad bit not set\n"); > + return -EINVAL; > + } > + > + return 0; > +} > + > +static int sr2_bit7_quad_enable(struct spi_nor *nor) > +{ > + u8 sr2; > + int ret; > + > + /* Check current Quad Enable bit value. */ > + ret = nor->read_reg(nor, SPINOR_OP_RDSR2, &sr2, 1); > + if (ret) > + return ret; > + if (sr2 & SR2_QUAD_EN_BIT7) > + return 0; > + > + /* Update the Quad Enable bit. */ > + sr2 |= SR2_QUAD_EN_BIT7; > + > + write_enable(nor); > + > + ret = nor->write_reg(nor, SPINOR_OP_WRSR2, &sr2, 1); > + if (ret < 0) { > + dev_err(nor->dev, > + "error while writing status register 2\n"); > + return -EINVAL; > + } > + > + ret = spi_nor_wait_till_ready(nor); > + if (ret < 0) { > + dev_err(nor->dev, "error while waiting for WRSR2 completion\n"); > + return ret; > + } > + > + /* Read back and check it. 
*/ > + ret = nor->read_reg(nor, SPINOR_OP_RDSR2, &sr2, 1); > + if (ret || !(sr2 & SR2_QUAD_EN_BIT7)) { > + dev_err(nor->dev, "SR2 Quad bit not set\n"); > + return -EINVAL; > + } > + > + return 0; > +} > + > static int spi_nor_check(struct spi_nor *nor) > { > if (!nor->dev || !nor->read || !nor->write || > @@ -1759,6 +1854,465 @@ spi_nor_init_uniform_erase_map(struct > spi_nor_erase_map *map, > map->uniform_region.size = flash_size; > } > > + > +/* > + * SFDP parsing. > + */ > + > +static int spi_nor_read_sfdp(struct spi_nor *nor, u32 addr, > + size_t len, void *buf) > +{ > + u8 addr_width, read_opcode, read_dummy; > + int ret; > + > + read_opcode = nor->read_opcode; > + addr_width = nor->addr_width; > + read_dummy = nor->read_dummy; > + > + nor->read_opcode = SPINOR_OP_RDSFDP; > + nor->addr_width = 3; > + nor->read_dummy = 8; > + > + ret = nor->read(nor, addr, len, (u8 *)buf); > + > + nor->read_opcode = read_opcode; > + nor->addr_width = addr_width; > + nor->read_dummy = read_dummy; > + > + return (ret < 0) ? ret : 0; > +} > + > +struct sfdp_parameter
Re: [RFC PATCH v5 4/6] mtd: spi-nor: add support to non-uniform SPI NOR flash memories
On 03/23/2017 12:33 AM, Cyrille Pitchen wrote: Hr, sigh, took me almost month to review this one, sorry :( > This patch is a first step in introducing the support of SPI memories > with non-uniform erase sizes like Spansion s25fs512s. > > It introduces the memory erase map which splits the memory array into one > or many erase regions. Each erase region supports up to 4 erase commands, > as defined by the JEDEC JESD216B (SFDP) specification. > In turn, an erase command is defined by an op code and a sector size. > > To be backward compatible, the erase map of uniform SPI NOR flash memories > is initialized so it contains only one erase region and this erase region > supports only one erase command. Hence a single size is used to erase any > sector/block of the memory. > > Besides, since the algorithm used to erase sectors on non-uniform SPI NOR > flash memories is quite expensive, when possible, the erase map is tuned > to come back to the uniform case. > > This is a transitional patch: non-uniform erase maps will be used later > when initialized based on the SFDP data. > > Signed-off-by: Cyrille Pitchen [...] Before I dive into the code, I have two questions: 1) On ie. 128 MiB part, how many struct spi_nor_erase_region {} instances would be allocated in total (consider you support 4k, 64k and 32M erase opcodes) ? Three ? 2) Would it make sense to represent the erase regions as a tree instead? For example [ region with 32MiB die erase opcode , start=0 , count=4 ] | V [ region with 64k erase opcode ][ region with 64k erase opcode ] [ start=0, count=1 ][ start=0, count=511 ] | V [ region with 4k erase opcode ] [ start=0, count=16 ] I think it'd make the lookup for the best-fitting opcode combination faster if the user decides to erase some arbitrarily-aligned block of the flash. What do you think ? Note this tree-based approach does not handle the cases where erase regions would overlap, although I doubt that could be a problem . > diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h > index d270788f5ab6..c12cafe99bee 100644 > --- a/include/linux/mtd/spi-nor.h > +++ b/include/linux/mtd/spi-nor.h > @@ -216,6 +216,55 @@ enum spi_nor_option_flags { > }; > > /** > + * struct spi_nor_erase_command - Structure to describe a SPI NOR erase > command > + * @size:the size of the sector/block erased by the command. > + * @size_shift: the size shift: if @size is a power of 2 then > the shift > + * is stored in @size_shift, otherwise @size_shift is zero. > + * @size_mask: the size mask based on @size_shift. > + * @opcode: the SPI command op code to erase the sector/block. > + */ > +struct spi_nor_erase_command { > + u32 size; > + u32 size_shift; > + u32 size_mask; > + u8 opcode; > +}; > + > +/** > + * struct spi_nor_erase_region - Structure to describe a SPI NOR erase region > + * @offset: the offset in the data array of erase region start. > + * LSB bits are used as a bitmask encoding the erase > + * commands supported inside this erase region. > + * @size:the size of the region in bytes. > + */ > +struct spi_nor_erase_region { > + u64 offset; > + u64 size; > +}; > + > +#define SNOR_CMD_ERASE_MAX 4 > +#define SNOR_CMD_ERASE_MASK GENMASK_ULL(SNOR_CMD_ERASE_MAX - 1, 0) > +#define SNOR_CMD_ERASE_OFFSET(_cmd_mask, _offset)\ > + u64)(_offset)) & ~SNOR_CMD_ERASE_MASK) |\ > + (((u64)(_cmd_mask)) & SNOR_CMD_ERASE_MASK)) > + > +/** > + * struct spi_nor_erase_map - Structure to describe the SPI NOR erase map > + * @commands:an array of erase commands shared by all the > regions. 
> + * @uniform_region: a pre-allocated erase region for SPI NOR with a uniform > + * sector size (legacy implementation). > + * @regions: point to an array describing the boundaries of the erase > + * regions. > + * @num_regions: the number of elements in the @regions array. > + */ > +struct spi_nor_erase_map { > + struct spi_nor_erase_commandcommands[SNOR_CMD_ERASE_MAX]; > + struct spi_nor_erase_region uniform_region; > + struct spi_nor_erase_region *regions; > + u32 num_regions; > +}; > + > +/** > * struct flash_info - Forward declaration of a structure used > internally by > * spi_nor_scan() and spi_nor_init(). > */ > @@ -238,6 +287,7 @@ struct flash_info; > * @write_proto: the SPI protocol for write operations > * @reg_protothe SPI protocol for read_reg/write_reg/erase > operations > * @cmd_buf: used by the write_reg > + * @erase_map: the erase map of the SPI NOR > * @prepare: [OPTIONAL] do some p
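Whether the regions end up as a flat array or a tree, the lookup being discussed is essentially "largest erase command that is aligned to and fits inside the remaining range". A greedy sketch of that selection, assuming power-of-two command sizes and simplified types rather than the proposed spi-nor structures:

#include <linux/types.h>

struct erase_cmd {
        u32 size;       /* power of two */
        u8 opcode;
};

static const struct erase_cmd *pick_erase_cmd(const struct erase_cmd *cmds,
                                              int ncmds, u64 addr, u64 len)
{
        const struct erase_cmd *best = NULL;
        int i;

        for (i = 0; i < ncmds; i++) {
                const struct erase_cmd *c = &cmds[i];

                if ((addr & (c->size - 1)) || len < c->size)
                        continue;       /* misaligned or does not fit */
                if (!best || c->size > best->size)
                        best = c;
        }

        return best;    /* NULL if even the smallest command does not fit */
}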
Re: [RFC PATCH v5 6/6] mtd: spi-nor: parse SFDP 4-byte Address Instruction Table
On 03/23/2017 12:33 AM, Cyrille Pitchen wrote: > This patch adds supports for SFDP (JESD216B) 4-byte Address Instruction > Table. This table is optional but when available, we parse it to get the > 4-byte address op codes supported by the memory. > Using these op codes is stateless as opposed to entering the 4-byte > address mode or setting the Base Address Register (BAR). > > Signed-off-by: Cyrille Pitchen > --- > drivers/mtd/spi-nor/spi-nor.c | 166 > +- > 1 file changed, 165 insertions(+), 1 deletion(-) > > diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c > index ce8722055a9c..ea044efc4e6d 100644 > --- a/drivers/mtd/spi-nor/spi-nor.c > +++ b/drivers/mtd/spi-nor/spi-nor.c > @@ -1899,6 +1899,7 @@ struct sfdp_parameter_header { > > > #define SFDP_BFPT_ID 0xff00u /* Basic Flash Parameter Table */ > +#define SFDP_4BAIT_ID0xff84u /* 4-byte Address Instruction > Table */ > > #define SFDP_SIGNATURE 0x50444653u > #define SFDP_JESD216_MAJOR 1 > @@ -2241,6 +2242,149 @@ static int spi_nor_parse_bfpt(struct spi_nor *nor, > return 0; > } > > +struct sfdp_4bait { > + /* The hardware capability. */ > + u32 hwcaps; > + > + /* > + * The bit in DWORD1 of the 4BAIT tells us whether > + * the associated 4-byte address op code is supported. > + */ > + u32 supported_bit; > +}; > + > +static int spi_nor_parse_4bait(struct spi_nor *nor, > +const struct sfdp_parameter_header *param_header, > +struct spi_nor_flash_parameter *params) > +{ > + static const struct sfdp_4bait reads[] = { > + { SNOR_HWCAPS_READ, BIT(0) }, > + { SNOR_HWCAPS_READ_FAST,BIT(1) }, > + { SNOR_HWCAPS_READ_1_1_2, BIT(2) }, > + { SNOR_HWCAPS_READ_1_2_2, BIT(3) }, > + { SNOR_HWCAPS_READ_1_1_4, BIT(4) }, > + { SNOR_HWCAPS_READ_1_4_4, BIT(5) }, > + { SNOR_HWCAPS_READ_1_1_1_DTR, BIT(13) }, > + { SNOR_HWCAPS_READ_1_2_2_DTR, BIT(14) }, > + { SNOR_HWCAPS_READ_1_4_4_DTR, BIT(15) }, > + }; > + static const struct sfdp_4bait programs[] = { > + { SNOR_HWCAPS_PP, BIT(6) }, > + { SNOR_HWCAPS_PP_1_1_4, BIT(7) }, > + { SNOR_HWCAPS_PP_1_4_4, BIT(8) }, > + }; > + static const struct sfdp_4bait erases[SNOR_CMD_ERASE_MAX] = { > + { 0u /* not used */,BIT(9) }, > + { 0u /* not used */,BIT(10) }, > + { 0u /* not used */,BIT(11) }, > + { 0u /* not used */,BIT(12) }, > + }; > + u32 dwords[2], addr, discard_hwcaps, read_hwcaps, pp_hwcaps, erase_mask; > + struct spi_nor_erase_map *map = &nor->erase_map; > + int i, err; > + > + if (param_header->major != SFDP_JESD216_MAJOR || > + param_header->length < ARRAY_SIZE(dwords)) > + return -EINVAL; > + > + /* Read the 4-byte Address Instruction Table. */ > + addr = SFDP_PARAM_HEADER_PTP(param_header); > + err = spi_nor_read_sfdp(nor, addr, sizeof(dwords), dwords); > + if (err) > + return err; > + > + /* Fix endianness of the 4BAIT DWORDs. */ > + for (i = 0; i < ARRAY_SIZE(dwords); i++) > + dwords[i] = le32_to_cpu(dwords[i]); > + > + /* > + * Compute the subset of (Fast) Read commands for which the 4-byte > + * version is supported. > + */ > + discard_hwcaps = 0; > + read_hwcaps = 0; > + for (i = 0; i < ARRAY_SIZE(reads); i++) { > + const struct sfdp_4bait *read = &reads[i]; > + > + discard_hwcaps |= read->hwcaps; > + if ((params->hwcaps.mask & read->hwcaps) && > + (dwords[0] & read->supported_bit)) > + read_hwcaps |= read->hwcaps; > + } Looks like there is a bit of repeated stuff here, maybe this can be pulled out ? > + /* > + * Compute the subset of Page Program commands for which the 4-byte > + * version is supported. 
> + */ > + pp_hwcaps = 0; > + for (i = 0; i < ARRAY_SIZE(programs); i++) { > + const struct sfdp_4bait *program = &programs[i]; > + > + discard_hwcaps |= program->hwcaps; > + if ((params->hwcaps.mask & program->hwcaps) && > + (dwords[0] & program->supported_bit)) > + pp_hwcaps |= program->hwcaps; > + } > + > + /* > + * Compute the subet of Sector Erase commands for which the 4-byte > + * version is supported. > + */ > + erase_mask = 0; > + for (i = 0; i < SNOR_CMD_ERASE_MAX; i++) { > + const struct sfdp_4bait *erase = &erases[i]; > + > + if ((map->commands[i].size > 0) && > + (dwords[0] & erase->s
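The "repeated stuff" noted above is the pair of loops that differ only in which table they walk; a possible shared helper (name chosen for illustration, reusing the struct sfdp_4bait from the patch) that both the read and the page-program paths could call:

#include <linux/types.h>

static u32 sfdp_4bait_supported(const struct sfdp_4bait *ops, int nops,
                                u32 dword1, u32 requested, u32 *discard)
{
        u32 supported = 0;
        int i;

        for (i = 0; i < nops; i++) {
                *discard |= ops[i].hwcaps;
                if ((requested & ops[i].hwcaps) &&
                    (dword1 & ops[i].supported_bit))
                        supported |= ops[i].hwcaps;
        }

        return supported;
}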
[GIT PULL] parisc architecture fix for v4.11-rc7
Hi Linus, please pull one important fix for the parisc architecture for kernel 4.11-rc7 from: git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git parisc-4.11-4 Mikulas Patocka fixed a few bugs in our new pa_memcpy() assembler function, e.g. one bug that made the kernel unbootable when the source and destination addresses are the same. Thanks, Helge Mikulas Patocka (1): parisc: fix bugs in pa_memcpy arch/parisc/lib/lusercopy.S | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-)
Re: [backport v4.9] tpm_tis: use default timeout value if chip reports it as zero
Hi Jarkko, On 15.04.2017 17:26, Jarkko Sakkinen wrote: From: "Maciej S. Szmigiero" Since commit 1107d065fdf1 ("tpm_tis: Introduce intermediate layer for TPM access") Atmel 3203 TPM on ThinkPad X61S (TPM firmware version 13.9) no longer works. The initialization proceeds fine until we get and start using chip-reported timeouts - and the chip reports C and D timeouts of zero. It turns out that until commit 8e54caf407b98e ("tpm: Provide a generic means to override the chip returned timeouts") we had actually let default timeout values remain in this case, so let's bring back this behavior to make chips like Atmel 3203 work again. Use a common code that was introduced by that commit so a warning is printed in this case and /sys/class/tpm/tpm*/timeouts correctly says the timeouts aren't chip-original. Fixes: 1107d065fdf1 ("tpm_tis: Introduce intermediate layer for TPM access") Cc: sta...@vger.kernel.org Signed-off-by: Maciej S. Szmigiero Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen --- Backport v4.9. Can you test it? drivers/char/tpm/tpm-interface.c | 59 ++-- drivers/char/tpm/tpm_tis.c | 2 +- drivers/char/tpm/tpm_tis_core.c | 6 ++-- drivers/char/tpm/tpm_tis_core.h | 2 +- 4 files changed, 38 insertions(+), 31 deletions(-) diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c index 3a9149cf0110..4c914fe25802 100644 --- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c (..) @@ -537,16 +537,15 @@ int tpm_get_timeouts(struct tpm_chip *chip) goto duration; } - if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 || - be32_to_cpu(tpm_cmd.header.out.length) - != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32)) - return -EINVAL; - Is this part right? These tests weren't removed by this commit as present in the mainline kernel. Maciej
[GIT PULL] SCSI fixes for 4.11-rc6
This is larger than it would be: we missed the -rc5 fixes pull request because of a problem linux-next found with the lowest patch which necessitated a rebase to fix. This is seven small fixes which are all for user visible issues that fortunately only occur in rare circumstances. The most serious is the sr one in which QEMU can cause us to read beyond the end of a buffer (I don't think it's exploitable, but just in case). The next is the sd capacity fix which means all non 512 byte sector drives greater than 2TB fail to be correctly sized. The rest are either in new drivers (qedf) or on error legs. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Chad Dupuis (1): scsi: qedf: Fix crash due to unsolicited FIP VLAN response. Fam Zheng (1): scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable Guilherme G. Piccoli (1): scsi: aacraid: fix PCI error recovery path Martin K. Petersen (2): scsi: sd: Fix capacity calculation with 32-bit sector_t scsi: sr: Sanity check returned mode data Mauricio Faria de Oliveira (1): scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION Sawan Chandak (1): scsi: qla2xxx: Add fix to read correct register value for ISP82xx. With diffstat: drivers/scsi/aacraid/aacraid.h | 11 --- drivers/scsi/aacraid/commsup.c | 3 ++- drivers/scsi/ipr.c | 7 ++- drivers/scsi/qedf/qedf_fip.c | 3 ++- drivers/scsi/qedf/qedf_main.c | 1 + drivers/scsi/qla2xxx/qla_os.c | 7 ++- drivers/scsi/sd.c | 23 --- drivers/scsi/sr.c | 6 -- 8 files changed, 49 insertions(+), 12 deletions(-) And full diff below. James --- diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h index d036a806f31c..d281492009fb 100644 --- a/drivers/scsi/aacraid/aacraid.h +++ b/drivers/scsi/aacraid/aacraid.h @@ -1690,9 +1690,6 @@ struct aac_dev #define aac_adapter_sync_cmd(dev, command, p1, p2, p3, p4, p5, p6, status, r1, r2, r3, r4) \ (dev)->a_ops.adapter_sync_cmd(dev, command, p1, p2, p3, p4, p5, p6, status, r1, r2, r3, r4) -#define aac_adapter_check_health(dev) \ - (dev)->a_ops.adapter_check_health(dev) - #define aac_adapter_restart(dev, bled, reset_type) \ ((dev)->a_ops.adapter_restart(dev, bled, reset_type)) @@ -2615,6 +2612,14 @@ static inline unsigned int cap_to_cyls(sector_t capacity, unsigned divisor) return capacity; } +static inline int aac_adapter_check_health(struct aac_dev *dev) +{ + if (unlikely(pci_channel_offline(dev->pdev))) + return -1; + + return (dev)->a_ops.adapter_check_health(dev); +} + /* SCp.phase values */ #define AAC_OWNER_MIDLEVEL 0x101 #define AAC_OWNER_LOWLEVEL 0x102 diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index c8172f16cf33..1f4918355fdb 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -1873,7 +1873,8 @@ int aac_check_health(struct aac_dev * aac) spin_unlock_irqrestore(&aac->fib_lock, flagv); if (BlinkLED < 0) { - printk(KERN_ERR "%s: Host adapter dead %d\n", aac->name, BlinkLED); + printk(KERN_ERR "%s: Host adapter is dead (or got a PCI error) %d\n", + aac->name, BlinkLED); goto out; } diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c index b29afafc2885..5d5e272fd815 100644 --- a/drivers/scsi/ipr.c +++ b/drivers/scsi/ipr.c @@ -6293,7 +6293,12 @@ static void ipr_erp_start(struct ipr_ioa_cfg *ioa_cfg, break; case IPR_IOASC_MED_DO_NOT_REALLOC: /* prevent retries */ case IPR_IOASA_IR_DUAL_IOA_DISABLED: - scsi_cmd->result |= (DID_PASSTHROUGH << 16); + /* +* exception: do not set DID_PASSTHROUGH on CHECK CONDITION +* 
so SCSI mid-layer and upper layers handle it accordingly. +*/ + if (scsi_cmd->result != SAM_STAT_CHECK_CONDITION) + scsi_cmd->result |= (DID_PASSTHROUGH << 16); break; case IPR_IOASC_BUS_WAS_RESET: case IPR_IOASC_BUS_WAS_RESET_BY_OTHER: diff --git a/drivers/scsi/qedf/qedf_fip.c b/drivers/scsi/qedf/qedf_fip.c index ed58b9104f58..e10b91cc3c62 100644 --- a/drivers/scsi/qedf/qedf_fip.c +++ b/drivers/scsi/qedf/qedf_fip.c @@ -99,7 +99,8 @@ static void qedf_fcoe_process_vlan_resp(struct qedf_ctx *qedf, qedf_set_vlan_id(qedf, vid); /* Inform waiter that it's ok to call fcoe_ctlr_link up() */ - complete(&qedf->fipvlan_compl); + if (!completion_done(&qedf->fipvlan_compl)) + complete(&qedf->fipvlan_compl); } } diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c index 8e2a160490e6..cceddd995a4b 100644 --- a/dr
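For context on the sd capacity fix mentioned in the changelog above: the 2TB boundary is where a 32-bit count of 512-byte sectors overflows, and a drive with larger logical blocks crosses it once its block count is scaled to 512-byte sectors. A standalone arithmetic illustration, not the sd.c code:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* roughly a 3 TB drive reporting 4096-byte logical blocks */
	uint64_t lba_count = 732566646ULL;
	uint64_t factor = 4096 / 512;	/* logical blocks -> 512-byte sectors */

	uint64_t sectors_64 = lba_count * factor;
	uint32_t sectors_32 = (uint32_t)(lba_count * factor); /* truncated */

	printf("64-bit sector count: %llu\n", (unsigned long long)sectors_64);
	printf("32-bit sector count: %u (wrapped past 2TB)\n", sectors_32);
	return 0;
}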
Re: [git pull] vfs fixes
On Fri, Apr 14, 2017 at 11:41 PM, Vegard Nossum wrote: > > I'm seeing the same memfd_create/name_to_handle_at/path_lookupat > use-after-free that Dmitry was seeing here: Ok, see if that is gone in current git with commit c0eb027e5aef ("vfs: don't do RCU lookup of empty pathnames") Linus
Re: [PATCH RESEND] regulator: rn5t618: Fix out of bounds array access
On 2017-04-15 07:52, Axel Lin wrote: > The commit "regulator: rn5t618: Add RN5T567 PMIC support" added > RN5T618_DCDC4 to the enum, then RN5T618_REG_NUM is also changed. > So for rn5t618, there is out of bounds array access when checking > regulators[i].name in the for loop. I use designated initializers ([RN5T618_##rid] = {..), which guarantee that the non initialized elements are zero. The highest element LDORTC2 is defined, hence the length of the array should be RN5T618_REG_NUM. See also https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html -- Stefan > > The number of regulators is different for rn5t567 and rn5t618, so we had > better remove RN5T618_REG_NUM and get the correct num_regulators during > probe instead. > > Fixes: ed6d362d8dbc ("regulator: rn5t618: Add RN5T567 PMIC support") > Signed-off-by: Axel Lin > --- > RESEND: Correct subject line (remove double Fix) > > drivers/regulator/rn5t618-regulator.c | 8 > include/linux/mfd/rn5t618.h | 1 - > 2 files changed, 4 insertions(+), 5 deletions(-) > > diff --git a/drivers/regulator/rn5t618-regulator.c > b/drivers/regulator/rn5t618-regulator.c > index 8d2819e..0c09143 100644 > --- a/drivers/regulator/rn5t618-regulator.c > +++ b/drivers/regulator/rn5t618-regulator.c > @@ -85,14 +85,17 @@ static int rn5t618_regulator_probe(struct > platform_device *pdev) > struct regulator_config config = { }; > struct regulator_dev *rdev; > struct regulator_desc *regulators; > + int num_regulators; > int i; > > switch (rn5t618->variant) { > case RN5T567: > regulators = rn5t567_regulators; > + num_regulators = ARRAY_SIZE(rn5t567_regulators); > break; > case RN5T618: > regulators = rn5t618_regulators; > + num_regulators = ARRAY_SIZE(rn5t618_regulators); > break; > default: > return -EINVAL; > @@ -101,10 +104,7 @@ static int rn5t618_regulator_probe(struct > platform_device *pdev) > config.dev = pdev->dev.parent; > config.regmap = rn5t618->regmap; > > - for (i = 0; i < RN5T618_REG_NUM; i++) { > - if (!regulators[i].name) > - continue; > - > + for (i = 0; i < num_regulators; i++) { > rdev = devm_regulator_register(&pdev->dev, > ®ulators[i], > &config); > diff --git a/include/linux/mfd/rn5t618.h b/include/linux/mfd/rn5t618.h > index e5a6cde..d7b3155 100644 > --- a/include/linux/mfd/rn5t618.h > +++ b/include/linux/mfd/rn5t618.h > @@ -233,7 +233,6 @@ enum { > RN5T618_LDO5, > RN5T618_LDORTC1, > RN5T618_LDORTC2, > - RN5T618_REG_NUM, > }; > > enum {
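Stefan's point about designated initializers can be seen in a standalone example (illustrative only, not the rn5t618 driver code): every element that is not explicitly named is zero-initialized, so iterating up to the highest defined index and skipping entries with a NULL .name is well defined.

#include <stdio.h>

struct reg_desc {
	const char *name;
};

enum { REG_DCDC1, REG_DCDC2, REG_DCDC3, REG_DCDC4, REG_LDORTC2, REG_NUM };

/* Only two entries are named; the compiler zero-fills the rest. */
static const struct reg_desc regs[REG_NUM] = {
	[REG_DCDC1]   = { .name = "DCDC1" },
	[REG_LDORTC2] = { .name = "LDORTC2" },
};

int main(void)
{
	int i;

	for (i = 0; i < REG_NUM; i++)
		printf("%d: %s\n", i, regs[i].name ? regs[i].name : "(unset)");
	return 0;
}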
[PATCH 2/2] mfd: omap-usb-tll: Configure ULPIAUTOIDLE
The idle mode needs to be only disabled for UTMIAUTOIDLE while ULPIAUTOIDLE can be enabled. This matches the TLL_CHANNEL_CONF_i register configuration for ehci-tll in the Motorola Linux kernel tree for Wrigley 3G LTE modem on droid 4 and the modem still stays responsive. Cc: Felipe Balbi Cc: Keshava Munegowda Cc: Marcel Partap Cc: Michael Scott Cc: Roger Quadros Cc: Sebastian Reichel Signed-off-by: Tony Lindgren --- drivers/mfd/omap-usb-tll.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/mfd/omap-usb-tll.c b/drivers/mfd/omap-usb-tll.c --- a/drivers/mfd/omap-usb-tll.c +++ b/drivers/mfd/omap-usb-tll.c @@ -373,12 +373,13 @@ int omap_tll_init(struct usbhs_omap_platform_data *pdata) } else if (pdata->port_mode[i] == OMAP_EHCI_PORT_MODE_TLL) { /* -* Disable AutoIdle, BitStuffing -* and use SDR Mode +* Disable UTMI AutoIdle, BitStuffing +* and use SDR Mode. Enable ULPI AutoIdle. */ reg &= ~(OMAP_TLL_CHANNEL_CONF_UTMIAUTOIDLE | OMAP_TLL_CHANNEL_CONF_ULPIDDRMODE); reg |= OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF; + reg |= OMAP_TLL_CHANNEL_CONF_ULPI_ULPIAUTOIDLE; } else if (pdata->port_mode[i] == OMAP_EHCI_PORT_MODE_HSIC) { /* -- 2.12.2
[PATCH 1/2] mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode
Commit 16fa3dc75c22 ("mfd: omap-usb-tll: HOST TLL platform driver") added support for USB TLL, but uses the OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF bit the wrong way. The comments in the code are correct, but the inverted use of OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF causes bit stuffing to be left enabled instead of disabled, unlike what the comments say. Without this change the Wrigley 3G LTE modem on droid 4 EHCI bus can only be pinged a few times before it stops responding. Fixes: 16fa3dc75c22 ("mfd: omap-usb-tll: HOST TLL platform driver") Cc: Felipe Balbi Cc: Keshava Munegowda Cc: Marcel Partap Cc: Michael Scott Cc: Roger Quadros Cc: Sebastian Reichel Signed-off-by: Tony Lindgren --- drivers/mfd/omap-usb-tll.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mfd/omap-usb-tll.c b/drivers/mfd/omap-usb-tll.c --- a/drivers/mfd/omap-usb-tll.c +++ b/drivers/mfd/omap-usb-tll.c @@ -377,8 +377,8 @@ int omap_tll_init(struct usbhs_omap_platform_data *pdata) * and use SDR Mode */ reg &= ~(OMAP_TLL_CHANNEL_CONF_UTMIAUTOIDLE - | OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF | OMAP_TLL_CHANNEL_CONF_ULPIDDRMODE); + reg |= OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF; } else if (pdata->port_mode[i] == OMAP_EHCI_PORT_MODE_HSIC) { /* -- 2.12.2
[PATCHv2 0/2] mfd: omap-usb-tll: Fixes for USB TLL mode
Hi, Here's v2 of this that moves the enabling of the ULPIAUTOIDLE bit into a separate patch as suggested by Roger Quadros. Both patches can wait for v4.12. Regards, Tony Tony Lindgren (2): mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode mfd: omap-usb-tll: Configure ULPIAUTOIDLE drivers/mfd/omap-usb-tll.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) -- 2.12.2
Re: [git pull] vfs fixes
On Sat, Apr 15, 2017 at 09:51:40AM -0700, Linus Torvalds wrote: > On Fri, Apr 14, 2017 at 11:41 PM, Vegard Nossum > wrote: > > > > I'm seeing the same memfd_create/name_to_handle_at/path_lookupat > > use-after-free that Dmitry was seeing here: > > Ok, see if that is gone in current git with commit c0eb027e5aef ("vfs: > don't do RCU lookup of empty pathnames") FWIW, I'm finishing testing of fixes for crap found during the discussion of that stuff last week (making sure that mntns_install() can't be abused into setting ->fs->root/->fs->pwd to dentry of NFS referral and its ilk and doing that in a sane way).
[GIT PULL] libnvdimm fixes for 4.11-rc7
Hi Linus, please pull from: git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes ...to receive: A small crop of lockdep, sleeping while atomic, and other fixes / band-aids in advance of the full-blown reworks targeting the next merge window. The largest change here is "libnvdimm: fix blk free space accounting" which deletes a pile of buggy code that better testing would have caught before merging. The next change that is borderline too big for a late rc is switching the device-dax locking from rcu to srcu, I couldn't think of a smaller way to make that fix. The __copy_user_nocache fix will have a full replacement in 4.12 to move those pmem special case considerations into the pmem driver. The "libnvdimm: band aid btt vs clear poison locking" commit admits that our error clearing support for btt went in broken, so we just disable it in 4.11 and -stable. A replacement / full fix is in the pipeline for 4.12 Some of these would have been caught earlier had CONFIG_DEBUG_ATOMIC_SLEEP been enabled on my development station. I wonder if we should have: config DEBUG_ATOMIC_SLEEP default PROVE_LOCKING ...since I mistakenly thought I got both with CONFIG_PROVE_LOCKING=y. These have received a build success notification from the 0day robot, and some have appeared in a -next release with no reported issues. --- The following changes since commit c02ed2e75ef4c74e41e421acb4ef1494671585e8: Linux 4.11-rc4 (2017-03-26 14:15:16 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes for you to fetch changes up to 11e63f6d920d6f2dfd3cd421e939a4aec9a58dcd: x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions (2017-04-12 13:45:18 -0700) Dan Williams (6): acpi, nfit, libnvdimm: fix interleave set cookie calculation (64-bit comparison) libnvdimm: fix blk free space accounting libnvdimm: fix reconfig_mutex, mmap_sem, and jbd2_handle lockdep splat libnvdimm: band aid btt vs clear poison locking device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions arch/x86/include/asm/pmem.h | 42 ++--- drivers/acpi/nfit/core.c| 6 +++- drivers/dax/Kconfig | 1 + drivers/dax/dax.c | 13 drivers/nvdimm/bus.c| 6 drivers/nvdimm/claim.c | 10 +- drivers/nvdimm/dimm_devs.c | 77 +++-- 7 files changed, 70 insertions(+), 85 deletions(-) commit b03b99a329a14b7302f37c3ea6da3848db41c8c5 Author: Dan Williams Date: Mon Mar 27 21:53:38 2017 -0700 acpi, nfit, libnvdimm: fix interleave set cookie calculation (64-bit comparison) While reviewing the -stable patch for commit 86ef58a4e35e "nfit, libnvdimm: fix interleave set cookie calculation" Ben noted: "This is returning an int, thus it's effectively doing a 32-bit comparison and not the 64-bit comparison you say is needed." Update the compare operation to be immune to this integer demotion problem. Cc: Cc: Nicholas Moulin Fixes: 86ef58a4e35e ("nfit, libnvdimm: fix interleave set cookie calculation") Reported-by: Ben Hutchings Signed-off-by: Dan Williams commit fe514739d8538783749d3ce72f78e5a999ea5668 Author: Dan Williams Date: Tue Apr 4 15:08:36 2017 -0700 libnvdimm: fix blk free space accounting Commit a1f3e4d6a0c3 "libnvdimm, region: update nd_region_available_dpa() for multi-pmem support" reworked blk dpa (DIMM Physical Address) accounting to comprehend multiple pmem namespace allocations aliasing with a given blk-dpa range. The following call trace is a result of failing to account for allocated blk capacity. 
WARNING: CPU: 1 PID: 2433 at tools/testing/nvdimm/../../../drivers/nvdimm/names 4 size_store+0x6f3/0x930 [libnvdimm] nd_region region5: allocation underrun: 0x0 of 0x100 bytes [..] Call Trace: dump_stack+0x86/0xc3 __warn+0xcb/0xf0 warn_slowpath_fmt+0x5f/0x80 size_store+0x6f3/0x930 [libnvdimm] dev_attr_store+0x18/0x30 If a given blk-dpa allocation does not alias with any pmem ranges then the full allocation should be accounted as busy space, not the size of the current pmem contribution to the region. The thinkos that led to this confusion was not realizing that the struct resource management is already guaranteeing no collisions between pmem allocations and blk allocations on the same dimm. Also, we do not try to support blk allocations in aliased pmem holes. This patch also fixes a case where the available blk goes negative. Cc: Fixes: a1f3e4d6a0c3 ("libnvdimm, region: update nd_region_available_dpa() for multi-pmem support"). Reported-by: Dariusz Dokupil Reported-b
Re: [patch 0/6] hwmon/coretemp: Hotplug fixes, cleanups and state machine conversion
2017-04-14 20:35 GMT+03:00 Thomas Gleixner : > On Wed, 12 Apr 2017, Thomas Gleixner wrote: >> >> Can you please try the following: >> >> # for STATE in freezer devices platform processors core; do \ >> echo $STATE; \ >> echo $STATE >/sys/power/pm_test; \ >> echo mem >/sys/power/state >> >> That should give us at least a hint in which area to dig. > > Any news on that? Sorry, was traveling. Testing with 4.10.8-200.fc25.x86_64: freezer, devices and platform are OK, it breaks at "processors". The screen stays off, and the machine no longer answers to ping. (Without coretemp loaded, the machine survives all the states. There are some graphics glitches and radeon error messages) -Tommi
Re: regulator: s2mps11: Use kcalloc() in s2mps11_pmic_probe()
>> A multiplication for the size determination of a memory allocation >> indicated that an array data structure should be processed. >> Thus use the corresponding function "kcalloc". >> >> This issue was detected by using the Coccinelle software. > > Unfortunately you write mostly cryptic commit messages. Thanks for your feedback. > This does not answer for the main question - why this change is needed. My update suggestion addresses an aspect of the coding style. > Code looks okay, There can be different opinions about related implementation details. > but you should explain in simple words why this is needed. Do you find the following wording from the “checkpatch.pl” script easier to understand? WARNING: Prefer kcalloc over kzalloc with multiply Regards, Markus
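For readers less familiar with the allocator helpers, the checkpatch warning boils down to the pattern below (a generic sketch; the names are placeholders, not taken from the s2mps11 driver):

#include <linux/slab.h>

struct foo {
	int val;
};

static struct foo *alloc_foo_array(size_t count)
{
	/*
	 * kzalloc(count * sizeof(struct foo), GFP_KERNEL) returns the same
	 * zeroed memory, but the open-coded multiplication can overflow
	 * before the allocator ever sees it.
	 *
	 * kcalloc() takes the element count and element size separately,
	 * checks the product for overflow and makes the "array of count
	 * elements" intent explicit.
	 */
	return kcalloc(count, sizeof(struct foo), GFP_KERNEL);
}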
[patch 05/20] x86/mtrr: Remove get_online_cpus() from mtrr_save_state()
From: Sebastian Andrzej Siewior mtrr_save_state() is invoked from native_cpu_up() which is in the context of a CPU hotplug operation and therefore calling get_online_cpus() is pointless. While this works in the current get_online_cpus() implementation it prevents converting the hotplug locking to percpu rwsems. Remove it. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: x...@kernel.org --- arch/x86/kernel/cpu/mtrr/main.c |2 -- 1 file changed, 2 deletions(-) --- a/arch/x86/kernel/cpu/mtrr/main.c +++ b/arch/x86/kernel/cpu/mtrr/main.c @@ -807,10 +807,8 @@ void mtrr_save_state(void) if (!mtrr_enabled()) return; - get_online_cpus(); first_cpu = cpumask_first(cpu_online_mask); smp_call_function_single(first_cpu, mtrr_save_fixed_ranges, NULL, 1); - put_online_cpus(); } void set_mtrr_aps_delayed_init(void)
[patch 09/20] hwtracing/coresight-etm4x: Use cpuhp_setup_state_nocalls_locked()
From: Sebastian Andrzej Siewior etm4_probe() holds get_online_cpus() while invoking cpuhp_setup_state_nocalls(). cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_nocalls_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Mathieu Poirier Cc: linux-arm-ker...@lists.infradead.org --- drivers/hwtracing/coresight/coresight-etm4x.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) --- a/drivers/hwtracing/coresight/coresight-etm4x.c +++ b/drivers/hwtracing/coresight/coresight-etm4x.c @@ -990,12 +990,12 @@ static int etm4_probe(struct amba_device dev_err(dev, "ETM arch init failed\n"); if (!etm4_count++) { - cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING, - "arm/coresight4:starting", - etm4_starting_cpu, etm4_dying_cpu); - ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, - "arm/coresight4:online", - etm4_online_cpu, NULL); + cpuhp_setup_state_nocalls_locked(CPUHP_AP_ARM_CORESIGHT_STARTING, +"arm/coresight4:starting", +etm4_starting_cpu, etm4_dying_cpu); + ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN, + "arm/coresight4:online", + etm4_online_cpu, NULL); if (ret < 0) goto err_arch_supported; hp_online = ret;
[patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked()
From: Sebastian Andrzej Siewior cpufreq holds get_online_cpus() while invoking cpuhp_setup_state_nocalls() to make subsys_interface_register() and the registration of hotplug calls atomic versus cpu hotplug. cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_nocalls_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: "Rafael J. Wysocki" Cc: Viresh Kumar Cc: linux...@vger.kernel.org --- drivers/cpufreq/cpufreq.c |9 + 1 file changed, 5 insertions(+), 4 deletions(-) --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -2473,9 +2473,10 @@ int cpufreq_register_driver(struct cpufr goto err_if_unreg; } - ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "cpufreq:online", - cpuhp_cpufreq_online, - cpuhp_cpufreq_offline); + ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN, + "cpufreq:online", + cpuhp_cpufreq_online, + cpuhp_cpufreq_offline); if (ret < 0) goto err_if_unreg; hp_online = ret; @@ -2519,7 +2520,7 @@ int cpufreq_unregister_driver(struct cpu get_online_cpus(); subsys_interface_unregister(&cpufreq_interface); remove_boost_sysfs_file(); - cpuhp_remove_state_nocalls(hp_online); + cpuhp_remove_state_nocalls_locked(hp_online); write_lock_irqsave(&cpufreq_driver_lock, flags);
[patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()
From: Sebastian Andrzej Siewior If intel_snb_check_microcode() is invoked via microcode_init -> perf_check_microcode -> intel_snb_check_microcode then get_online_cpus() is invoked nested. This works with the current implementation of get_online_cpus() but prevents converting it to a percpu rwsem. intel_snb_check_microcode() is also invoked from intel_sandybridge_quirk() unprotected. Drop get_online_cpus() from intel_snb_check_microcode() and add it to intel_sandybridge_quirk() so both call sites are protected. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Peter Zijlstra Cc: Borislav Petkov Cc: x...@kernel.org --- arch/x86/events/intel/core.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3389,12 +3389,10 @@ static void intel_snb_check_microcode(vo int pebs_broken = 0; int cpu; - get_online_cpus(); for_each_online_cpu(cpu) { if ((pebs_broken = intel_snb_pebs_broken(cpu))) break; } - put_online_cpus(); if (pebs_broken == x86_pmu.pebs_broken) return; @@ -3467,7 +3465,9 @@ static bool check_msr(unsigned long msr, static __init void intel_sandybridge_quirk(void) { x86_pmu.check_microcode = intel_snb_check_microcode; + get_online_cpus(); intel_snb_check_microcode(); + put_online_cpus(); } static const struct { int id; char *name; } intel_arch_events_map[] __initconst = {
[patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem
get_online_cpus() is used in hot paths in mainline and even more so in RT. That can show up badly under certain conditions because every locker contends on a global mutex. RT has its own homebrewed mitigation which is a (badly done) open-coded implementation of percpu_rwsems with recursion support. The proper replacement for that is a percpu_rwsem, but that requires removing recursion support. The conversion unearthed real locking issues which were previously not visible because the get_online_cpus() lockdep annotation was implemented with recursion support which prevents lockdep from tracking full dependency chains. These potential deadlocks are not related to recursive calls, they trigger on the first invocation because lockdep now has the full dependency chains available. The following patch series addresses this by - Cleaning up places which call get_online_cpus() nested - Replacing a few instances with cpu_hotplug_disable() to prevent circular locking dependencies. The series depends on git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core plus Linus' tree merged in to avoid conflicts. It's available in git from git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.hotplug Thanks, tglx --- arch/arm/kernel/hw_breakpoint.c |5 arch/powerpc/kvm/book3s_hv.c |8 - arch/powerpc/platforms/powernv/subcore.c |2 arch/s390/kernel/time.c |2 arch/x86/events/core.c|1 arch/x86/events/intel/core.c |4 arch/x86/events/intel/cqm.c | 12 +- arch/x86/kernel/cpu/mtrr/main.c |2 drivers/acpi/processor_driver.c |4 drivers/cpufreq/cpufreq.c |9 - drivers/hwtracing/coresight/coresight-etm3x.c | 12 +- drivers/hwtracing/coresight/coresight-etm4x.c | 12 +- drivers/pci/pci-driver.c | 46 include/linux/cpuhotplug.h| 29 + include/linux/padata.h|3 include/linux/pci.h |1 include/linux/stop_machine.h | 26 kernel/cpu.c | 149 +++--- kernel/padata.c | 38 +++--- kernel/stop_machine.c |4 20 files changed, 177 insertions(+), 192 deletions(-)
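The mechanical part of the series is the same split everywhere: a _locked() variant that assumes the caller already holds the hotplug lock, plus the existing entry point that takes the lock itself. A simplified sketch of that pattern (invented names, not the real cpuhp or stop_machine prototypes):

#include <linux/mutex.h>

static DEFINE_MUTEX(hp_lock);	/* stand-in for the hotplug lock */

/* Caller must already hold hp_lock. */
static int frob_setup_locked(int state)
{
	lockdep_assert_held(&hp_lock);
	/* ... install the callbacks for @state ... */
	return 0;
}

/* Takes the lock itself; must not be called with hp_lock held. */
static int frob_setup(int state)
{
	int ret;

	mutex_lock(&hp_lock);
	ret = frob_setup_locked(state);
	mutex_unlock(&hp_lock);
	return ret;
}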
[patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus()
Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem unearthed a circular lock dependency which was hidden from lockdep due to the lockdep annotation of get_online_cpus() which prevents lockdep from creating full dependency chains. CPU0 CPU1 lock((&wfc.work)); lock(cpu_hotplug_lock.rw_sem); lock((&wfc.work)); lock(cpu_hotplug_lock.rw_sem); This dependency is established via acpi_processor_start() which calls into the work queue code. And the work queue code establishes the reverse dependency. This is not a problem of get_online_cpus() recursion, it's a possible deadlock undetected by lockdep so far. The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to protect the probing from acpi_processor_start(). There is a side effect to this: cpu_hotplug_disable() makes a concurrent cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but that probing usually happens during the boot process where no interaction is possible. Any later invocations are infrequent enough and concurrent hotplug attempts are so unlikely that the danger of user space visible regressions is very close to zero. Anyway, that's preferable to a real deadlock. Signed-off-by: Thomas Gleixner Cc: "Rafael J. Wysocki" Cc: Len Brown Cc: linux-a...@vger.kernel.org --- drivers/acpi/processor_driver.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/acpi/processor_driver.c +++ b/drivers/acpi/processor_driver.c @@ -268,9 +268,9 @@ static int acpi_processor_start(struct d return -ENODEV; /* Protect against concurrent CPU hotplug operations */ - get_online_cpus(); + cpu_hotplug_disable(); ret = __acpi_processor_start(device); - put_online_cpus(); + cpu_hotplug_enable(); return ret; }
[patch 10/20] perf/x86/intel/cqm: Use cpuhp_setup_state_locked()
From: Sebastian Andrzej Siewior intel_cqm_init() holds get_online_cpus() while registering the hotplug callbacks. cpuhp_setup_state() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Peter Zijlstra Cc: x...@kernel.org Cc: Fenghua Yu --- arch/x86/events/intel/cqm.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) --- a/arch/x86/events/intel/cqm.c +++ b/arch/x86/events/intel/cqm.c @@ -1746,12 +1746,12 @@ static int __init intel_cqm_init(void) * Setup the hot cpu notifier once we are sure cqm * is enabled to avoid notifier leak. */ - cpuhp_setup_state(CPUHP_AP_PERF_X86_CQM_STARTING, - "perf/x86/cqm:starting", - intel_cqm_cpu_starting, NULL); - cpuhp_setup_state(CPUHP_AP_PERF_X86_CQM_ONLINE, "perf/x86/cqm:online", - NULL, intel_cqm_cpu_exit); - + cpuhp_setup_state_locked(CPUHP_AP_PERF_X86_CQM_STARTING, +"perf/x86/cqm:starting", +intel_cqm_cpu_starting, NULL); + cpuhp_setup_state_locked(CPUHP_AP_PERF_X86_CQM_ONLINE, +"perf/x86/cqm:online", +NULL, intel_cqm_cpu_exit); out: put_online_cpus();
[patch 18/20] PCI: Replace the racy recursion prevention
pci_call_probe() can be called recursively when a physical function is probed and the probing creates virtual functions, which are populated via pci_bus_add_device() which in turn can end up calling pci_call_probe() again. The code has an interesting way to prevent recursing into the workqueue code. That's accomplished by a check whether the current task runs already on the numa node which is associated with the device. While that works to prevent the recursion into the workqueue code, it's racy versus normal execution as there is no guarantee that the node does not vanish after the check. Make the detection reliable by: - Mark a probed device as 'is_probed' in pci_call_probe() - Check in pci_call_probe for a virtual function. If it's a virtual function and the associated physical function device is marked 'is_probed' then this is a recursive call, so the call can be invoked in the calling context. Signed-off-by: Thomas Gleixner Cc: Bjorn Helgaas Cc: linux-...@vger.kernel.org --- drivers/pci/pci-driver.c | 35 ++- include/linux/pci.h |1 + 2 files changed, 15 insertions(+), 21 deletions(-) --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -341,33 +341,26 @@ static int pci_call_probe(struct pci_dri * on the right node. */ node = dev_to_node(&dev->dev); + dev->is_probed = 1; + + cpu_hotplug_disable(); /* -* On NUMA systems, we are likely to call a PF probe function using -* work_on_cpu(). If that probe calls pci_enable_sriov() (which -* adds the VF devices via pci_bus_add_device()), we may re-enter -* this function to call the VF probe function. Calling -* work_on_cpu() again will cause a lockdep warning. Since VFs are -* always on the same node as the PF, we can work around this by -* avoiding work_on_cpu() when we're already on the correct node. -* -* Preemption is enabled, so it's theoretically unsafe to use -* numa_node_id(), but even if we run the probe function on the -* wrong node, it should be functionally correct. +* Prevent nesting work_on_cpu() for the case where a Virtual Function +* device is probed from work_on_cpu() of the Physical device. */ - if (node >= 0 && node != numa_node_id()) { - int cpu; - - cpu_hotplug_disable(); + if (dev->is_virtfn && pci_physfn_is_probed(dev)) + cpu = nr_cpu_ids; + else cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask); - if (cpu < nr_cpu_ids) - error = work_on_cpu(cpu, local_pci_probe, &ddi); - else - error = local_pci_probe(&ddi); - cpu_hotplug_enable(); - } else + + if (cpu < nr_cpu_ids) + error = work_on_cpu(cpu, local_pci_probe, &ddi); + else error = local_pci_probe(&ddi); + dev->is_probed = 0; + cpu_hotplug_enable(); return error; } --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -365,6 +365,7 @@ struct pci_dev { unsigned intirq_managed:1; unsigned inthas_secondary_link:1; unsigned intnon_compliant_bars:1; /* broken BARs; ignore them */ + unsigned intis_probed:1;/* device probing in progress */ pci_dev_flags_t dev_flags; atomic_tenable_cnt; /* pci_enable_device has been called */
[patch 03/20] padata: Make padata_alloc() static
No users outside of padata.c Signed-off-by: Thomas Gleixner Cc: Steffen Klassert Cc: linux-cry...@vger.kernel.org --- include/linux/padata.h |3 --- kernel/padata.c| 34 +- 2 files changed, 17 insertions(+), 20 deletions(-) --- a/include/linux/padata.h +++ b/include/linux/padata.h @@ -166,9 +166,6 @@ struct padata_instance { extern struct padata_instance *padata_alloc_possible( struct workqueue_struct *wq); -extern struct padata_instance *padata_alloc(struct workqueue_struct *wq, - const struct cpumask *pcpumask, - const struct cpumask *cbcpumask); extern void padata_free(struct padata_instance *pinst); extern int padata_do_parallel(struct padata_instance *pinst, struct padata_priv *padata, int cb_cpu); --- a/kernel/padata.c +++ b/kernel/padata.c @@ -913,7 +913,7 @@ static ssize_t padata_sysfs_show(struct } static ssize_t padata_sysfs_store(struct kobject *kobj, struct attribute *attr, - const char *buf, size_t count) +sconst char *buf, size_t count) { struct padata_instance *pinst; struct padata_sysfs_entry *pentry; @@ -939,19 +939,6 @@ static struct kobj_type padata_attr_type }; /** - * padata_alloc_possible - Allocate and initialize padata instance. - * Use the cpu_possible_mask for serial and - * parallel workers. - * - * @wq: workqueue to use for the allocated padata instance - */ -struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq) -{ - return padata_alloc(wq, cpu_possible_mask, cpu_possible_mask); -} -EXPORT_SYMBOL(padata_alloc_possible); - -/** * padata_alloc - allocate and initialize a padata instance and specify *cpumasks for serial and parallel workers. * @@ -959,9 +946,9 @@ EXPORT_SYMBOL(padata_alloc_possible); * @pcpumask: cpumask that will be used for padata parallelization * @cbcpumask: cpumask that will be used for padata serialization */ -struct padata_instance *padata_alloc(struct workqueue_struct *wq, -const struct cpumask *pcpumask, -const struct cpumask *cbcpumask) +static struct padata_instance *padata_alloc(struct workqueue_struct *wq, + const struct cpumask *pcpumask, + const struct cpumask *cbcpumask) { struct padata_instance *pinst; struct parallel_data *pd = NULL; @@ -1016,6 +1003,19 @@ struct padata_instance *padata_alloc(str } /** + * padata_alloc_possible - Allocate and initialize padata instance. + * Use the cpu_possible_mask for serial and + * parallel workers. + * + * @wq: workqueue to use for the allocated padata instance + */ +struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq) +{ + return padata_alloc(wq, cpu_possible_mask, cpu_possible_mask); +} +EXPORT_SYMBOL(padata_alloc_possible); + +/** * padata_free - free a padata instance * * @padata_inst: padata instance to free
[patch 07/20] KVM/PPC/Book3S HV: Use cpuhp_setup_state_nocalls_locked()
From: Sebastian Andrzej Siewior kvmppc_alloc_host_rm_ops() holds get_online_cpus() while invoking cpuhp_setup_state_nocalls(). cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_nocalls_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Michael Ellerman Cc: k...@vger.kernel.org Cc: kvm-...@vger.kernel.org Cc: linuxppc-...@lists.ozlabs.org --- arch/powerpc/kvm/book3s_hv.c |8 1 file changed, 4 insertions(+), 4 deletions(-) --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3336,10 +3336,10 @@ void kvmppc_alloc_host_rm_ops(void) return; } - cpuhp_setup_state_nocalls(CPUHP_KVM_PPC_BOOK3S_PREPARE, - "ppc/kvm_book3s:prepare", - kvmppc_set_host_core, - kvmppc_clear_host_core); + cpuhp_setup_state_nocalls_locked(CPUHP_KVM_PPC_BOOK3S_PREPARE, +"ppc/kvm_book3s:prepare", +kvmppc_set_host_core, +kvmppc_clear_host_core); put_online_cpus(); }
[patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem unearthed a circular lock dependency which was hidden from lockdep due to the lockdep annotation of get_online_cpus() which prevents lockdep from creating full dependency chains. There are several variants of this. An example is: Chain exists of: cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex CPU0 CPU1 lock(&item->mutex); lock(drm_global_mutex); lock(&item->mutex); lock(cpu_hotplug_lock.rw_sem); because there are dependencies through workqueues. The call chain is: get_online_cpus apply_workqueue_attrs __alloc_workqueue_key ttm_mem_global_init ast_ttm_mem_global_init drm_global_item_ref ast_mm_init ast_driver_load drm_dev_register drm_get_pci_dev ast_pci_probe local_pci_probe work_for_cpu_fn process_one_work worker_thread This is not a problem of get_online_cpus() recursion, it's a possible deadlock undetected by lockdep so far. The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to protect the PCI probing. There is a side effect to this: cpu_hotplug_disable() makes a concurrent cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI probing usually happens during the boot process where no interaction is possible. Any later invocations are infrequent enough and concurrent hotplug attempts are so unlikely that the danger of user space visible regressions is very close to zero. Anyway, that's preferable to a real deadlock. Signed-off-by: Thomas Gleixner Cc: Bjorn Helgaas Cc: linux-...@vger.kernel.org --- drivers/pci/pci-driver.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi) return 0; } +static bool pci_physfn_is_probed(struct pci_dev *dev) +{ +#ifdef CONFIG_ATS + return dev->physfn->is_probed; +#else + return false; +#endif +} + static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev, const struct pci_device_id *id) { - int error, node; + int error, node, cpu; struct drv_dev_and_id ddi = { drv, dev, id }; /* @@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri if (node >= 0 && node != numa_node_id()) { int cpu; - get_online_cpus(); + cpu_hotplug_disable(); cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask); if (cpu < nr_cpu_ids) error = work_on_cpu(cpu, local_pci_probe, &ddi); else error = local_pci_probe(&ddi); - put_online_cpus(); + cpu_hotplug_enable(); } else error = local_pci_probe(&ddi);
[patch 12/20] s390/kernel: Use stop_machine_locked()
stp_work_fn() holds get_online_cpus() while invoking stop_machine(). stop_machine() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use stop_machine_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Martin Schwidefsky Cc: Heiko Carstens Cc: David Hildenbrand Cc: linux-s...@vger.kernel.org --- arch/s390/kernel/time.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/s390/kernel/time.c +++ b/arch/s390/kernel/time.c @@ -636,7 +636,7 @@ static void stp_work_fn(struct work_stru memset(&stp_sync, 0, sizeof(stp_sync)); get_online_cpus(); atomic_set(&stp_sync.cpus, num_online_cpus() - 1); - stop_machine(stp_sync_clock, &stp_sync, cpu_online_mask); + stop_machine_locked(stp_sync_clock, &stp_sync, cpu_online_mask); put_online_cpus(); if (!check_sync_clock())
[patch 11/20] ARM/hw_breakpoint: Use cpuhp_setup_state_locked()
From: Sebastian Andrzej Siewior arch_hw_breakpoint_init() holds get_online_cpus() while registering the hotplug callbacks. cpuhp_setup_state() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Will Deacon Cc: Mark Rutland Cc: Russell King Cc: linux-arm-ker...@lists.infradead.org --- arch/arm/kernel/hw_breakpoint.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -1098,8 +1098,9 @@ static int __init arch_hw_breakpoint_ini * assume that a halting debugger will leave the world in a nice state * for us. */ - ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm/hw_breakpoint:online", - dbg_reset_online, NULL); + ret = cpuhp_setup_state_locked(CPUHP_AP_ONLINE_DYN, + "arm/hw_breakpoint:online", + dbg_reset_online, NULL); unregister_undef_hook(&debug_reg_hook); if (WARN_ON(ret < 0) || !cpumask_empty(&debug_err_mask)) { core_num_brps = 0;
[patch 13/20] powerpc/powernv: Use stop_machine_locked()
set_subcores_per_core() holds get_online_cpus() while invoking stop_machine(). stop_machine() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use stop_machine_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Benjamin Herrenschmidt Cc: Michael Ellerman Cc: linuxppc-...@lists.ozlabs.org --- arch/powerpc/platforms/powernv/subcore.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/powerpc/platforms/powernv/subcore.c +++ b/arch/powerpc/platforms/powernv/subcore.c @@ -356,7 +356,7 @@ static int set_subcores_per_core(int new /* Ensure state is consistent before we call the other cpus */ mb(); - stop_machine(cpu_update_split_mode, &new_mode, cpu_online_mask); + stop_machine_locked(cpu_update_split_mode, &new_mode, cpu_online_mask); put_online_cpus();
[patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem
There are no more (known) nested calls to get_online_cpus() so it's possible to remove the nested call magic and convert the mutex to a percpu-rwsem, which speeds up get/put_online_cpus() significantly for the uncontended case. The contended case (write locked for hotplug operations) is slow anyway, so the slightly more expensive down_write of the percpu rwsem does not matter. Signed-off-by: Thomas Gleixner --- kernel/cpu.c | 102 --- 1 file changed, 8 insertions(+), 94 deletions(-) --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #define CREATE_TRACE_POINTS @@ -196,121 +197,36 @@ void cpu_maps_update_done(void) mutex_unlock(&cpu_add_remove_lock); } -/* If set, cpu_up and cpu_down will return -EBUSY and do nothing. +/* + * If set, cpu_up and cpu_down will return -EBUSY and do nothing. * Should always be manipulated under cpu_add_remove_lock */ static int cpu_hotplug_disabled; #ifdef CONFIG_HOTPLUG_CPU -static struct { - struct task_struct *active_writer; - /* wait queue to wake up the active_writer */ - wait_queue_head_t wq; - /* verifies that no writer will get active while readers are active */ - struct mutex lock; - /* -* Also blocks the new readers during -* an ongoing cpu hotplug operation. -*/ - atomic_t refcount; - -#ifdef CONFIG_DEBUG_LOCK_ALLOC - struct lockdep_map dep_map; -#endif -} cpu_hotplug = { - .active_writer = NULL, - .wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq), - .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock), -#ifdef CONFIG_DEBUG_LOCK_ALLOC - .dep_map = STATIC_LOCKDEP_MAP_INIT("cpu_hotplug.dep_map", &cpu_hotplug.dep_map), -#endif -}; - -/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/end() */ -#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map) -#define cpuhp_lock_acquire_tryread() \ - lock_map_acquire_tryread(&cpu_hotplug.dep_map) -#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map) -#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map) - +DEFINE_STATIC_PERCPU_RWSEM(cpu_hotplug_lock); void get_online_cpus(void) { - might_sleep(); - if (cpu_hotplug.active_writer == current) - return; - cpuhp_lock_acquire_read(); - mutex_lock(&cpu_hotplug.lock); - atomic_inc(&cpu_hotplug.refcount); - mutex_unlock(&cpu_hotplug.lock); + percpu_down_read(&cpu_hotplug_lock); } EXPORT_SYMBOL_GPL(get_online_cpus); void put_online_cpus(void) { - int refcount; - - if (cpu_hotplug.active_writer == current) - return; - - refcount = atomic_dec_return(&cpu_hotplug.refcount); - if (WARN_ON(refcount < 0)) /* try to fix things up */ - atomic_inc(&cpu_hotplug.refcount); - - if (refcount <= 0 && waitqueue_active(&cpu_hotplug.wq)) - wake_up(&cpu_hotplug.wq); - - cpuhp_lock_release(); - + percpu_up_read(&cpu_hotplug_lock); } EXPORT_SYMBOL_GPL(put_online_cpus); -/* - * This ensures that the hotplug operation can begin only when the - * refcount goes to zero. - * - * Note that during a cpu-hotplug operation, the new readers, if any, - * will be blocked by the cpu_hotplug.lock - * - * Since cpu_hotplug_begin() is always called after invoking - * cpu_maps_update_begin(), we can be sure that only one writer is active. - * - * Note that theoretically, there is a possibility of a livelock: - * - Refcount goes to zero, last reader wakes up the sleeping - * writer. - * - Last reader unlocks the cpu_hotplug.lock. - * - A new reader arrives at this moment, bumps up the refcount. - * - The writer acquires the cpu_hotplug.lock finds the refcount - * non zero and goes to sleep again. 
- * - * However, this is very difficult to achieve in practice since - * get_online_cpus() not an api which is called all that often. - * - */ void cpu_hotplug_begin(void) { - DEFINE_WAIT(wait); - - cpu_hotplug.active_writer = current; - cpuhp_lock_acquire(); - - for (;;) { - mutex_lock(&cpu_hotplug.lock); - prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE); - if (likely(!atomic_read(&cpu_hotplug.refcount))) - break; - mutex_unlock(&cpu_hotplug.lock); - schedule(); - } - finish_wait(&cpu_hotplug.wq, &wait); + percpu_down_write(&cpu_hotplug_lock); } void cpu_hotplug_done(void) { - cpu_hotplug.active_writer = NULL; - mutex_unlock(&cpu_hotplug.lock); - cpuhp_lock_release(); + percpu_up_write(&cpu_hotplug_lock); } /* @@ -344,8 +260,6 @@ void cpu_hotplug_enable(void) EXPORT_SYMBOL_GPL(cpu_hotplug_enable); #endif /* CONFIG_HOTPLUG_CPU */ -/* Notifier wr
[patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode
The only caller is the microcode update, which cannot be modular. Drop the export. Signed-off-by: Thomas Gleixner Cc: Peter Zijlstra Cc: Borislav Petkov Cc: x...@kernel.org --- arch/x86/events/core.c |1 - 1 file changed, 1 deletion(-) --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2224,7 +2224,6 @@ void perf_check_microcode(void) if (x86_pmu.check_microcode) x86_pmu.check_microcode(); } -EXPORT_SYMBOL_GPL(perf_check_microcode); static struct pmu pmu = { .pmu_enable = x86_pmu_enable,
[patch 08/20] hwtracing/coresight-etm3x: Use the locked version of cpuhp_setup_state_nocalls()
From: Sebastian Andrzej Siewior etm_probe() holds get_online_cpus() while invoking cpuhp_setup_state_nocalls(). cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_nocalls_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Mathieu Poirier Cc: linux-arm-ker...@lists.infradead.org --- drivers/hwtracing/coresight/coresight-etm3x.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c b/drivers/hwtracing/coresight/coresight-etm3x.c index a51b6b64ecdf..0887265f361d 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x.c +++ b/drivers/hwtracing/coresight/coresight-etm3x.c @@ -803,12 +803,12 @@ static int etm_probe(struct amba_device *adev, const struct amba_id *id) dev_err(dev, "ETM arch init failed\n"); if (!etm_count++) { - cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING, - "arm/coresight:starting", - etm_starting_cpu, etm_dying_cpu); - ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, - "arm/coresight:online", - etm_online_cpu, NULL); + cpuhp_setup_state_nocalls_locked(CPUHP_AP_ARM_CORESIGHT_STARTING, +"arm/coresight:starting", +etm_starting_cpu, etm_dying_cpu); + ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN, + "arm/coresight:online", + etm_online_cpu, NULL); if (ret < 0) goto err_arch_supported; hp_online = ret; -- 2.11.0
[patch 04/20] padata: Avoid nested calls to get_online_cpus() in pcrypt_init_padata()
From: Sebastian Andrzej Siewior pcrypt_init_padata() get_online_cpus() padata_alloc_possible() padata_alloc() get_online_cpus() The nested call to get_online_cpus() works with the current implementation, but prevents the conversion to a percpu rwsem. The other caller of padata_alloc_possible() is pcrypt_init_padata() which calls from a get_online_cpus() protected region as well. Remove the get_online_cpus() call in padata_alloc() and document the calling convention. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Cc: Steffen Klassert Cc: linux-cry...@vger.kernel.org --- kernel/padata.c |8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) --- a/kernel/padata.c +++ b/kernel/padata.c @@ -913,7 +913,7 @@ static ssize_t padata_sysfs_show(struct } static ssize_t padata_sysfs_store(struct kobject *kobj, struct attribute *attr, -sconst char *buf, size_t count) + const char *buf, size_t count) { struct padata_instance *pinst; struct padata_sysfs_entry *pentry; @@ -945,6 +945,8 @@ static struct kobj_type padata_attr_type * @wq: workqueue to use for the allocated padata instance * @pcpumask: cpumask that will be used for padata parallelization * @cbcpumask: cpumask that will be used for padata serialization + * + * Must be called from a get_online_cpus() protected region */ static struct padata_instance *padata_alloc(struct workqueue_struct *wq, const struct cpumask *pcpumask, @@ -957,7 +959,6 @@ static struct padata_instance *padata_al if (!pinst) goto err; - get_online_cpus(); if (!alloc_cpumask_var(&pinst->cpumask.pcpu, GFP_KERNEL)) goto err_free_inst; if (!alloc_cpumask_var(&pinst->cpumask.cbcpu, GFP_KERNEL)) { @@ -997,7 +998,6 @@ static struct padata_instance *padata_al free_cpumask_var(pinst->cpumask.cbcpu); err_free_inst: kfree(pinst); - put_online_cpus(); err: return NULL; } @@ -1008,6 +1008,8 @@ static struct padata_instance *padata_al * parallel workers. * * @wq: workqueue to use for the allocated padata instance + * + * Must be called from a get_online_cpus() protected region */ struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq) {
[patch 14/20] kernel/hotplug: Use stop_machine_locked() in takedown_cpu()
From: Sebastian Andrzej Siewior takedown_cpu() is a cpu hotplug function invoking stop_machine(). The cpu hotplug machinery holds the hotplug lock for write. stop_machine() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use stop_machine_locked() to avoid the nested call. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner --- kernel/cpu.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -701,7 +701,7 @@ static int takedown_cpu(unsigned int cpu /* * So now all preempt/rcu users must observe !cpu_active(). */ - err = stop_machine(take_cpu_down, NULL, cpumask_of(cpu)); + err = stop_machine_locked(take_cpu_down, NULL, cpumask_of(cpu)); if (err) { /* CPU refused to die */ irq_unlock_sparse();
[patch 01/20] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_locked()
From: Sebastian Andrzej Siewior Some call sites of cpuhp_setup/remove_state[_nocalls]() are within a get_online_cpus() protected region. cpuhp_setup/remove_state[_nocalls]() call get_online_cpus() as well, which is possible in the current implementation but prevetns converting the hotplug locking to a percpu rwsem. Provide locked versions of the interfaces to avoid nested calls to get_online_cpus(). Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 29 + kernel/cpu.c | 45 + 2 files changed, 62 insertions(+), 12 deletions(-) --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -151,6 +151,11 @@ int __cpuhp_setup_state(enum cpuhp_state int (*startup)(unsigned int cpu), int (*teardown)(unsigned int cpu), bool multi_instance); +int __cpuhp_setup_state_locked(enum cpuhp_state state, const char *name, + bool invoke, + int (*startup)(unsigned int cpu), + int (*teardown)(unsigned int cpu), + bool multi_instance); /** * cpuhp_setup_state - Setup hotplug state callbacks with calling the callbacks * @state: The state for which the calls are installed @@ -169,6 +174,15 @@ static inline int cpuhp_setup_state(enum return __cpuhp_setup_state(state, name, true, startup, teardown, false); } +static inline int cpuhp_setup_state_locked(enum cpuhp_state state, + const char *name, + int (*startup)(unsigned int cpu), + int (*teardown)(unsigned int cpu)) +{ + return __cpuhp_setup_state_locked(state, name, true, startup, teardown, + false); +} + /** * cpuhp_setup_state_nocalls - Setup hotplug state callbacks without calling the *callbacks @@ -189,6 +203,15 @@ static inline int cpuhp_setup_state_noca false); } +static inline int cpuhp_setup_state_nocalls_locked(enum cpuhp_state state, + const char *name, + int (*startup)(unsigned int cpu), + int (*teardown)(unsigned int cpu)) +{ + return __cpuhp_setup_state_locked(state, name, false, startup, teardown, + false); +} + /** * cpuhp_setup_state_multi - Add callbacks for multi state * @state: The state for which the calls are installed @@ -248,6 +271,7 @@ static inline int cpuhp_state_add_instan } void __cpuhp_remove_state(enum cpuhp_state state, bool invoke); +void __cpuhp_remove_state_locked(enum cpuhp_state state, bool invoke); /** * cpuhp_remove_state - Remove hotplug state callbacks and invoke the teardown @@ -271,6 +295,11 @@ static inline void cpuhp_remove_state_no __cpuhp_remove_state(state, false); } +static inline void cpuhp_remove_state_nocalls_locked(enum cpuhp_state state) +{ + __cpuhp_remove_state_locked(state, false); +} + /** * cpuhp_remove_multi_state - Remove hotplug multi state callback * @state: The state for which the calls are removed --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1457,7 +1457,7 @@ int __cpuhp_state_add_instance(enum cpuh EXPORT_SYMBOL_GPL(__cpuhp_state_add_instance); /** - * __cpuhp_setup_state - Setup the callbacks for an hotplug machine state + * __cpuhp_setup_state_locked - Setup the callbacks for an hotplug machine state * @state: The state to setup * @invoke:If true, the startup function is invoked for cpus where * cpu state >= @state @@ -1466,17 +1466,18 @@ EXPORT_SYMBOL_GPL(__cpuhp_state_add_inst * @multi_instance:State is set up for multiple instances which get * added afterwards. * + * The caller needs to hold get_online_cpus() while calling this function. 
* Returns: * On success: * Positive state number if @state is CPUHP_AP_ONLINE_DYN * 0 for all other states * On failure: proper (negative) error code */ -int __cpuhp_setup_state(enum cpuhp_state state, - const char *name, bool invoke, - int (*startup)(unsigned int cpu), - int (*teardown)(unsigned int cpu), - bool multi_instance) +int __cpuhp_setup_state_locked(enum cpuhp_state state, + const char *name, bool invoke, + int (*startup)(unsigned int cpu), + int (*teardown)(unsigned int cpu), + bool multi_instance) { int cpu, ret = 0; bool dynstate
[patch 02/20] stop_machine: Provide stop_machine_locked()
From: Sebastian Andrzej Siewior Some call sites of stop_machine() are within a get_online_cpus() protected region. stop_machine() calls get_online_cpus() as well, which is possible in the current implementation but prevents converting the hotplug locking to a percpu rwsem. Provide stop_machine_locked() to avoid nested calls to get_online_cpus(). Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner --- include/linux/stop_machine.h | 26 +++--- kernel/stop_machine.c|4 ++-- 2 files changed, 25 insertions(+), 5 deletions(-) --- a/include/linux/stop_machine.h +++ b/include/linux/stop_machine.h @@ -116,15 +116,29 @@ static inline int try_stop_cpus(const st * @fn() runs. * * This can be thought of as a very heavy write lock, equivalent to - * grabbing every spinlock in the kernel. */ + * grabbing every spinlock in the kernel. + * + * Protects against CPU hotplug. + */ int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus); +/** + * stop_machine_locked: freeze the machine on all CPUs and run this function + * @fn: the function to run + * @data: the data ptr for the @fn() + * @cpus: the cpus to run the @fn() on (NULL = any online cpu) + * + * Same as above. Must be called from with in a get_online_cpus() protected + * region. Avoids nested calls to get_online_cpus(). + */ +int stop_machine_locked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus); + int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus); #else /* CONFIG_SMP || CONFIG_HOTPLUG_CPU */ -static inline int stop_machine(cpu_stop_fn_t fn, void *data, -const struct cpumask *cpus) +static inline int stop_machine_locked(cpu_stop_fn_t fn, void *data, + const struct cpumask *cpus) { unsigned long flags; int ret; @@ -134,6 +148,12 @@ static inline int stop_machine(cpu_stop_ return ret; } +static inline int stop_machine(cpu_stop_fn_t fn, void *data, + const struct cpumask *cpus) +{ + return stop_machine_locked(fn, data, cpus); +} + static inline int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus) { --- a/kernel/stop_machine.c +++ b/kernel/stop_machine.c @@ -552,7 +552,7 @@ static int __init cpu_stop_init(void) } early_initcall(cpu_stop_init); -static int __stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus) +int stop_machine_locked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus) { struct multi_stop_data msdata = { .fn = fn, @@ -591,7 +591,7 @@ int stop_machine(cpu_stop_fn_t fn, void /* No CPUs can come up or down during this. */ get_online_cpus(); - ret = __stop_machine(fn, data, cpus); + ret = stop_machine_locked(fn, data, cpus); put_online_cpus(); return ret; }
Re: [PATCH] misc: lkdtm: Add volatile to intentional NULL pointer reference
On Fri, Apr 14, 2017 at 2:15 PM, Matthias Kaehlcke wrote: > From: Michael Davidson > > Add a volatile qualifier where a NULL pointer is deliberately > dereferenced to trigger a panic. > > Without the volatile qualifier clang will issue the following warning: > "indirection of non-volatile null pointer will be deleted, > not trap [-Wnull-dereference]" and replace the pointer reference > with a __builtin_trap() (which generates a ud2 instruction on x86_64). > > Signed-off-by: Michael Davidson > Signed-off-by: Matthias Kaehlcke Thanks! Acked-by: Kees Cook Greg, please add this to drivers/misc when you get a chance. :) -Kees > --- > drivers/misc/lkdtm_bugs.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/misc/lkdtm_bugs.c b/drivers/misc/lkdtm_bugs.c > index e3f4cd8876b5..d734d75afade 100644 > --- a/drivers/misc/lkdtm_bugs.c > +++ b/drivers/misc/lkdtm_bugs.c > @@ -67,7 +67,7 @@ void lkdtm_WARNING(void) > > void lkdtm_EXCEPTION(void) > { > - *((int *) 0) = 0; > + *((volatile int *) 0) = 0; > } > > void lkdtm_LOOP(void) > -- > 2.12.2.762.g0e3151a226-goog > -- Kees Cook Pixel Security
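A minimal standalone example of the behaviour described in the commit message (a user-space analogue, not the lkdtm code):

/* Without volatile, clang treats the NULL store as undefined behaviour,
 * warns with -Wnull-dereference and may replace the store with
 * __builtin_trap() (ud2 on x86_64). */
void crash_plain(void)
{
	*(int *)0 = 0;
}

/* The volatile qualifier forces the compiler to emit the actual store,
 * so the access really faults and triggers the intended panic path. */
void crash_volatile(void)
{
	*(volatile int *)0 = 0;
}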
Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
Thanks, Benjamin, for the summary of some of the issues. On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote: > So I assume the p2p code provides a way to address that too via special > dma_ops ? Or wrappers ? Not at this time. We will probably need a way to ensure the iommus do not attempt to remap these addresses. Though if an iommu does remap them, I'd expect everything would still work; you just wouldn't get the performance or traffic flow you are looking for. We've been testing with the software iommu, which doesn't have this problem. > The problem is that the latter while seemingly easier, is also slower > and not supported by all platforms and architectures (for example, > POWER currently won't allow it, or rather only allows a store-only > subset of it under special circumstances). Yes, I think situations where we have to cross host bridges will remain unsupported by this work for a long time. There are too many cases where it just doesn't work or it performs too poorly to be useful. > I don't fully understand how p2pmem "solves" that by creating struct > pages. The offset problem is one issue. But there's the iommu issue as > well, the driver cannot just use the normal dma_map ops. We are not using a proper iommu, and we are dealing with systems that have zero offset. This case is also easily supported. I expect fixing the iommus to not map these addresses would also be reasonably achievable. Logan
[PATCH] ASoC: topology: use j for internal loop counter
From: Colin Ian King Currently variable i is being used for 2 nested for loops. Fix this by using integer loop counter j for the inner for loop. Fixes: 1a7dd6e2f1929 ("ASoC: topology: Allow a widget to have multiple enum controls") Signed-off-by: Colin Ian King --- sound/soc/soc-topology.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c index 058bc99c6c34..002772e3ba2c 100644 --- a/sound/soc/soc-topology.c +++ b/sound/soc/soc-topology.c @@ -495,12 +495,13 @@ static void remove_widget(struct snd_soc_component *comp, struct snd_kcontrol *kcontrol = w->kcontrols[i]; struct soc_enum *se = (struct soc_enum *)kcontrol->private_value; + int j; snd_ctl_remove(card, kcontrol); kfree(se->dobj.control.dvalues); - for (i = 0; i < se->items; i++) - kfree(se->dobj.control.dtexts[i]); + for (j = 0; j < se->items; j++) + kfree(se->dobj.control.dtexts[j]); kfree(se); } -- 2.11.0
Re: [PATCH linux 2/2] net sched actions: fix refcount decrement on error
On Fri, Apr 14, 2017 at 2:08 AM, Wolfgang Bumiller wrote: > Before I do that - trying to wrap my head around the interdependencies > here better to be thorough - I noticed that tcf_hash_release() can > return ACT_P_DELETED. The ACT_P_CREATED case means tcf_hash_create() > was used, in the other case the tc_action's ref & bind count is bumped > by tcf_hash_check() and then also decremented by tcf_hash_release() if > it existed, iow. kept at 1, but not always: It does always happen in > act_police.c but in other files such as act_bpf.c or act_connmark.c if > eg. bind is set they return without decrementing, so both ref&bind count > are bumped when they return - the refcount logic isn't easy to follow > for a newcomer. Now there are two uses of __tcf_hash_release() in > act_api.c which check for a return value of ACT_P_DELETED, in which case > they call module_put(). That's the nasty part... IIRC, Jamal has fixed two bugs on action refcnt'ing. We really need to clean up the code. > So I'm not sure exactly how the module and tc_action counts are related > (and I usually like to understand my own patches ;-) ). Each action holds a refcnt to its module, each filter holds a refcnt to its bound or referenced (unbound) action. > Maybe I'm missing something obvious but I'm currently a bit confused as > to whether the tcf_hash_release() call there is okay, or should have its > return value checked or should depend on ->init()'s ACT_P_CREATED value > as well? > I think it's the same? If we have ACT_P_CREATED here, tcf_hash_release() will return ACT_P_DELETED for sure because the newly created action has refcnt==1? Thanks.
Re: [PATCH linux 2/2] net sched actions: fix refcount decrement on error
> On April 15, 2017 at 8:20 PM Cong Wang wrote: > > > On Fri, Apr 14, 2017 at 2:08 AM, Wolfgang Bumiller > wrote: > > Before I do that - trying to wrap my head around the interdependencies > > here better to be thorough - I noticed that tcf_hash_release() can > > return ACT_P_DELETED. The ACT_P_CREATED case means tcf_hash_create() > > was used, in the other case the tc_action's ref & bind count is bumped > > by tcf_hash_check() and then also decremented by tcf_hash_release() if > > it existed, iow. kept at 1, but not always: It does always happen in > > act_police.c but in other files such as act_bpf.c or act_connmark.c if > > eg. bind is set they return without decrementing, so both ref&bind count > > are bumped when they return - the refcount logic isn't easy to follow > > for a newcomer. Now there are two uses of __tcf_hash_release() in > > act_api.c which check for a return value of ACT_P_DELETED, in which case > > they call module_put(). > > > That's the nasty part... IIRC, Jamal has fixed two bugs on action refcnt'ing. > We really need to clean up the code. > > > So I'm not sure exactly how the module and tc_action counts are related > > (and I usually like to understand my own patches ;-) ). > > > Each action holds a refcnt to its module, each filter holds a refcnt to > its bound or referenced (unbound) action. > > > > Maybe I'm missing something obvious but I'm currently a bit confused as > > to whether the tcf_hash_release() call there is okay, or should have its > > return value checked or should depend on ->init()'s ACT_P_CREATED value > > as well? > > > > I think it's the same? If we have ACT_P_CREATED here, tcf_hash_release() > will return ACT_P_DELETED for sure because the newly created action has > refcnt==1? Makes sense on the one hand, but for ACT_P_DELETED both ref and bind count need to reach 0, so I'm still concerned that the different behaviors I mentioned above might be problematic if we use ACT_P_CREATED only. (It also means my patches still leak a count - which is probably still better than the previous underflow, but ultimately doesn't satisfy me.) Should I still resend it this way for the record with the Acked-bys? (Since given the fact that with unprivileged containers it's possible to trigger this access and potentially crash the kernel I strongly feel that some version of this should end up in the 4.11 release.)
[GIT PULL 00/19] LightNVM patches for 4.12.
Hi Jens, With this merge window, we would like to push pblk upstream. It is a new host-side translation layer that implements support for exposing Open-Channel SSDs as block devices. We have described pblk in the LightNVM paper "LightNVM: The Linux Open-Channel SSD Subsystem" that was accepted at FAST 2017. The paper defines open-channel SSDs, the subsystem, and pblk, and includes an evaluation as well. Over the past couple of kernel versions we have shipped the support patches for pblk, and we are now comfortable pushing the core of pblk upstream. The core contains the logic to control data placement and I/O scheduling on open-channel SSDs, including translation table management, GC, recovery, rate-limiting, and similar components. It assumes that the SSD is media-agnostic, and runs on both 1.2 and 2.0 of the Open-Channel SSD specification without modifications. I want to point out two neat features of pblk. First, pblk can be instantiated multiple times on the same SSD, enabling I/O isolation between tenants and making it possible to fulfill strict QoS requirements. We showed results from this at the NVMW '17 workshop this year in the "Multi-Tenant I/O Isolation with Open-Channel SSDs" talk. Second, now that a full host-side translation layer is implemented, one can begin to optimize its data placement and I/O scheduling algorithms to match user workloads. We have shown a couple of the benefits in the LightNVM paper, and we know of a couple of companies and universities that have begun working on new algorithms. In detail, this pull request contains: - The new host-side FTL pblk from Javier, and other contributors. - Support in the "create" ioctl for forcing a target to be re-initialized using the "factory" flag, from Javier. - Fixes for various errors in the LightNVM core, from Javier and me. - An optimization from Neil Brown to skip error checking on mempool allocations that can sleep. - A buffer overflow fix in nvme_nvm_identify from Scott Bauer. - A fix for bad block discovery error handling from Christophe Jaillet. - Fixes from Dan Carpenter to pblk after it went into linux-next.
Please pull from the for-jens branch or apply the patches posted with this mail: https://github.com/OpenChannelSSD/linux.git for-jens Thanks, Matias Christophe JAILLET (1): lightnvm: Fix error handling Dan Carpenter (3): lightnvm: pblk-gc: fix an error pointer dereference in init lightnvm: fix some WARN() messages lightnvm: fix some error code in pblk-init.c Javier González (12): lightnvm: submit erases using the I/O path lightnvm: rename scrambler controller hint lightnvm: free reverse device map lightnvm: double-clear of dev->lun_map on target init error lightnvm: fix cleanup order of disk on init error lightnvm: bad type conversion for nvme control bits lightnvm: allow to init targets on factory mode lightnvm: make nvm_free static lightnvm: clean unused variable lightnvm: fix type checks on rrpc lightnvm: convert sprintf into strlcpy lightnvm: physical block device (pblk) target Matias Bjørling (1): lightnvm: enable nvme size compile asserts NeilBrown (1): lightnvm: don't check for failure from mempool_alloc() Scott Bauer (1): nvme/lightnvm: Prevent small buffer overflow in nvme_nvm_identify Documentation/lightnvm/pblk.txt | 21 + drivers/lightnvm/Kconfig |9 + drivers/lightnvm/Makefile|5 + drivers/lightnvm/core.c | 124 +-- drivers/lightnvm/pblk-cache.c| 114 +++ drivers/lightnvm/pblk-core.c | 1655 ++ drivers/lightnvm/pblk-gc.c | 555 + drivers/lightnvm/pblk-init.c | 957 ++ drivers/lightnvm/pblk-map.c | 136 drivers/lightnvm/pblk-rb.c | 852 drivers/lightnvm/pblk-read.c | 529 drivers/lightnvm/pblk-recovery.c | 998 +++ drivers/lightnvm/pblk-rl.c | 182 + drivers/lightnvm/pblk-sysfs.c| 507 drivers/lightnvm/pblk-write.c| 411 ++ drivers/lightnvm/pblk.h | 1121 ++ drivers/lightnvm/rrpc.c | 25 +- drivers/nvme/host/lightnvm.c | 42 +- include/linux/lightnvm.h | 13 +- include/uapi/linux/lightnvm.h|4 + 20 files changed, 8165 insertions(+), 95 deletions(-) create mode 100644 Documentation/lightnvm/pblk.txt create mode 100644 drivers/lightnvm/pblk-cache.c create mode 100644 drivers/lightnvm/pblk-core.c create mode 100644 drivers/lightnvm/pblk-gc.c create mode 100644 drivers/lightnvm/pblk-init.c create mode 100644 drivers/lightnvm/pblk-map.c create mode 100644 drivers/lightnvm/pblk-rb.c create mode 100644 drivers/lightnvm/pblk-read.c create mode 100644 drivers/lightnvm/pblk-recovery.c create mode 100644 drivers/lightnvm/pblk-rl.c create mode 100644 drivers/lightnvm/pblk-sysfs.c create mode 100644 drivers/lightnvm/pblk-wri
[GIT PULL 05/19] lightnvm: free reverse device map
From: Javier González Free the reverse mapping table correctly on target tear down Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index 95105c4..a14c52c 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -411,6 +411,18 @@ static int nvm_register_map(struct nvm_dev *dev) return -ENOMEM; } +static void nvm_unregister_map(struct nvm_dev *dev) +{ + struct nvm_dev_map *rmap = dev->rmap; + int i; + + for (i = 0; i < dev->geo.nr_chnls; i++) + kfree(rmap->chnls[i].lun_offs); + + kfree(rmap->chnls); + kfree(rmap); +} + static void nvm_map_to_dev(struct nvm_tgt_dev *tgt_dev, struct ppa_addr *p) { struct nvm_dev_map *dev_map = tgt_dev->map; @@ -992,7 +1004,7 @@ void nvm_free(struct nvm_dev *dev) if (dev->dma_pool) dev->ops->destroy_dma_pool(dev->dma_pool); - kfree(dev->rmap); + nvm_unregister_map(dev); kfree(dev->lptbl); kfree(dev->lun_map); kfree(dev); -- 2.9.3
[GIT PULL 01/19] lightnvm: Fix error handling
From: Christophe JAILLET According to error handling in this function, it is likely that going to 'out' was expected here. Signed-off-by: Christophe JAILLET Signed-off-by: Matias Bjørling --- drivers/lightnvm/rrpc.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c index e00b1d7..e68efbc 100644 --- a/drivers/lightnvm/rrpc.c +++ b/drivers/lightnvm/rrpc.c @@ -1275,8 +1275,10 @@ static int rrpc_bb_discovery(struct nvm_tgt_dev *dev, struct rrpc_lun *rlun) } nr_blks = nvm_bb_tbl_fold(dev->parent, blks, nr_blks); - if (nr_blks < 0) - return nr_blks; + if (nr_blks < 0) { + ret = nr_blks; + goto out; + } for (i = 0; i < nr_blks; i++) { if (blks[i] == NVM_BLK_T_FREE) -- 2.9.3
[GIT PULL 04/19] lightnvm: rename scrambler controller hint
From: Javier González According to the OCSSD 1.2 specification, the 0x200 hint enables the media scrambler for the read/write opcode, providing that the controller has been correctly configured by the firmware. Rename the macro to represent this meaning. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- include/linux/lightnvm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h index e11163f..eff7d1f 100644 --- a/include/linux/lightnvm.h +++ b/include/linux/lightnvm.h @@ -123,7 +123,7 @@ enum { /* NAND Access Modes */ NVM_IO_SUSPEND = 0x80, NVM_IO_SLC_MODE = 0x100, - NVM_IO_SCRAMBLE_DISABLE = 0x200, + NVM_IO_SCRAMBLE_ENABLE = 0x200, /* Block Types */ NVM_BLK_T_FREE = 0x0, -- 2.9.3
[GIT PULL 13/19] lightnvm: clean unused variable
From: Javier González Clean unused variable on lightnvm core. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index eb9ab1a..258007a 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -501,7 +501,6 @@ void nvm_part_to_tgt(struct nvm_dev *dev, sector_t *entries, int *lun_roffs; struct ppa_addr gaddr; u64 pba = le64_to_cpu(entries[i]); - int off; u64 diff; if (!pba) @@ -511,8 +510,6 @@ void nvm_part_to_tgt(struct nvm_dev *dev, sector_t *entries, ch_rmap = &dev_rmap->chnls[gaddr.g.ch]; lun_roffs = ch_rmap->lun_offs; - off = gaddr.g.ch * geo->luns_per_chnl + gaddr.g.lun; - diff = ((ch_rmap->ch_off * geo->luns_per_chnl) + (lun_roffs[gaddr.g.lun])) * geo->sec_per_lun; -- 2.9.3
[GIT PULL 11/19] lightnvm: allow to init targets on factory mode
From: Javier González Target initialization has two responsibilities: creating the target partition and instantiating the target. This patch enables to create a factory partition (e.g., do not trigger recovery on the given target). This is useful for target development and for being able to restore the device state at any moment in time without requiring a full-device erase. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 14 +++--- drivers/lightnvm/rrpc.c | 3 ++- include/linux/lightnvm.h | 3 ++- include/uapi/linux/lightnvm.h | 4 4 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index 5f84d2a..a63b563 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -280,7 +280,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create) tdisk->fops = &nvm_fops; tdisk->queue = tqueue; - targetdata = tt->init(tgt_dev, tdisk); + targetdata = tt->init(tgt_dev, tdisk, create->flags); if (IS_ERR(targetdata)) goto err_init; @@ -1244,8 +1244,16 @@ static long nvm_ioctl_dev_create(struct file *file, void __user *arg) create.tgtname[DISK_NAME_LEN - 1] = '\0'; if (create.flags != 0) { - pr_err("nvm: no flags supported\n"); - return -EINVAL; + __u32 flags = create.flags; + + /* Check for valid flags */ + if (flags & NVM_TARGET_FACTORY) + flags &= ~NVM_TARGET_FACTORY; + + if (flags) { + pr_err("nvm: flag not supported\n"); + return -EINVAL; + } } return __nvm_configure_create(&create); diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c index a8acf9e..5dba544 100644 --- a/drivers/lightnvm/rrpc.c +++ b/drivers/lightnvm/rrpc.c @@ -1506,7 +1506,8 @@ static int rrpc_luns_configure(struct rrpc *rrpc) static struct nvm_tgt_type tt_rrpc; -static void *rrpc_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk) +static void *rrpc_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk, + int flags) { struct request_queue *bqueue = dev->q; struct request_queue *tqueue = tdisk->queue; diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h index eff7d1f..7dfa56e 100644 --- a/include/linux/lightnvm.h +++ b/include/linux/lightnvm.h @@ -436,7 +436,8 @@ static inline int ppa_cmp_blk(struct ppa_addr ppa1, struct ppa_addr ppa2) typedef blk_qc_t (nvm_tgt_make_rq_fn)(struct request_queue *, struct bio *); typedef sector_t (nvm_tgt_capacity_fn)(void *); -typedef void *(nvm_tgt_init_fn)(struct nvm_tgt_dev *, struct gendisk *); +typedef void *(nvm_tgt_init_fn)(struct nvm_tgt_dev *, struct gendisk *, + int flags); typedef void (nvm_tgt_exit_fn)(void *); typedef int (nvm_tgt_sysfs_init_fn)(struct gendisk *); typedef void (nvm_tgt_sysfs_exit_fn)(struct gendisk *); diff --git a/include/uapi/linux/lightnvm.h b/include/uapi/linux/lightnvm.h index fd19f36..c8aec4b 100644 --- a/include/uapi/linux/lightnvm.h +++ b/include/uapi/linux/lightnvm.h @@ -85,6 +85,10 @@ struct nvm_ioctl_create_conf { }; }; +enum { + NVM_TARGET_FACTORY = 1 << 0,/* Init target in factory mode */ +}; + struct nvm_ioctl_create { char dev[DISK_NAME_LEN];/* open-channel SSD device */ char tgttype[NVM_TTYPE_NAME_MAX]; /* target type name */ -- 2.9.3
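As a rough illustration of the userspace side, here is a hedged sketch of creating a target with the new flag. The struct fields, ioctl name and control-node path are recalled from include/uapi/linux/lightnvm.h and should be double-checked against the header; the device and target names are placeholders.

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/lightnvm.h>

/* Hedged sketch, not part of the patch: create a pblk target on LUNs 0-3
 * of a hypothetical "nvme0n1" in factory mode (no recovery on init). */
int example_create_factory_target(void)
{
	struct nvm_ioctl_create c = { 0 };
	int fd, ret;

	fd = open("/dev/lightnvm/control", O_RDWR);
	if (fd < 0)
		return -1;

	strncpy(c.dev, "nvme0n1", sizeof(c.dev) - 1);
	strncpy(c.tgttype, "pblk", sizeof(c.tgttype) - 1);
	strncpy(c.tgtname, "mydev0", sizeof(c.tgtname) - 1);
	c.conf.type = NVM_CONFIG_TYPE_SIMPLE;	/* assumed field names */
	c.conf.s.lun_begin = 0;
	c.conf.s.lun_end = 3;
	c.flags = NVM_TARGET_FACTORY;		/* the flag added by this patch */

	ret = ioctl(fd, NVM_DEV_CREATE, &c);
	close(fd);
	return ret;
}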
[GIT PULL 12/19] lightnvm: make nvm_free static
From: Javier González Prefix the nvm_free static function with a missing static keyword. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index a63b563..eb9ab1a 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -999,7 +999,7 @@ static int nvm_core_init(struct nvm_dev *dev) return ret; } -void nvm_free(struct nvm_dev *dev) +static void nvm_free(struct nvm_dev *dev) { if (!dev) return; -- 2.9.3
[GIT PULL 18/19] lightnvm: fix some WARN() messages
From: Dan Carpenter WARN_ON() takes a condition, not an error message. I slightly tweaked some conditions so hopefully it's more clear. Signed-off-by: Dan Carpenter Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-read.c | 12 ++-- drivers/lightnvm/pblk-recovery.c | 2 +- drivers/lightnvm/pblk-write.c| 2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c index eff0982..bce7ed5 100644 --- a/drivers/lightnvm/pblk-read.c +++ b/drivers/lightnvm/pblk-read.c @@ -49,8 +49,8 @@ static void pblk_read_ppalist_rq(struct pblk *pblk, struct nvm_rq *rqd, int i, j = 0; /* logic error: lba out-of-bounds. Ignore read request */ - if (!(blba + nr_secs < pblk->rl.nr_secs)) { - WARN_ON("pblk: read lbas out of bounds\n"); + if (blba + nr_secs >= pblk->rl.nr_secs) { + WARN(1, "pblk: read lbas out of bounds\n"); return; } @@ -254,8 +254,8 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, sector_t lba = pblk_get_lba(bio); /* logic error: lba out-of-bounds. Ignore read request */ - if (!(lba < pblk->rl.nr_secs)) { - WARN_ON("pblk: read lba out of bounds\n"); + if (lba >= pblk->rl.nr_secs) { + WARN(1, "pblk: read lba out of bounds\n"); return; } @@ -411,8 +411,8 @@ static int read_rq_gc(struct pblk *pblk, struct nvm_rq *rqd, int valid_secs = 0; /* logic error: lba out-of-bounds */ - if (!(lba < pblk->rl.nr_secs)) { - WARN_ON("pblk: read lba out of bounds\n"); + if (lba >= pblk->rl.nr_secs) { + WARN(1, "pblk: read lba out of bounds\n"); goto out; } diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c index 0d50f41..f8f8508 100644 --- a/drivers/lightnvm/pblk-recovery.c +++ b/drivers/lightnvm/pblk-recovery.c @@ -167,7 +167,7 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line) if (le64_to_cpu(lba_list[i]) == ADDR_EMPTY) { spin_lock(&line->lock); if (test_and_set_bit(i, line->invalid_bitmap)) - WARN_ON_ONCE("pblk: rec. double invalidate:\n"); + WARN_ONCE(1, "pblk: rec. double invalidate:\n"); else line->vsc--; spin_unlock(&line->lock); diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c index ee57db9..74f7413 100644 --- a/drivers/lightnvm/pblk-write.c +++ b/drivers/lightnvm/pblk-write.c @@ -141,7 +141,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd) /* Logic error */ if (bit > c_ctx->nr_valid) { - WARN_ON_ONCE("pblk: corrupted write request\n"); + WARN_ONCE(1, "pblk: corrupted write request\n"); goto out; } -- 2.9.3
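A hedged side-by-side of the two macros, using a made-up helper, to show why the old calls reported nothing useful:

#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical helper, only to contrast the two macros. */
static void example_check_lba(sector_t lba, sector_t nr_secs)
{
	/* WARN() takes a condition followed by a printk-style message. */
	WARN(lba >= nr_secs, "example: read lba out of bounds\n");

	/* WARN_ON() takes only a condition. Passing a string literal, as the
	 * old code did, merely evaluates an always-true pointer, so the
	 * intended message is never printed. */
	WARN_ON(lba >= nr_secs);
}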
[GIT PULL 17/19] lightnvm: pblk-gc: fix an error pointer dereference in init
From: Dan Carpenter These labels are reversed so we could end up dereferencing an error pointer or leaking. Fixes: 7f347ba6bb3a ("lightnvm: physical block device (pblk) target") Signed-off-by: Dan Carpenter Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-gc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c index 9b147cf..f173fd4 100644 --- a/drivers/lightnvm/pblk-gc.c +++ b/drivers/lightnvm/pblk-gc.c @@ -527,10 +527,10 @@ int pblk_gc_init(struct pblk *pblk) return 0; -fail_free_main_kthread: - kthread_stop(gc->gc_ts); fail_free_writer_kthread: kthread_stop(gc->gc_writer_ts); +fail_free_main_kthread: + kthread_stop(gc->gc_ts); return ret; } -- 2.9.3
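The bug is purely in the ordering of the unwind labels; a hedged sketch of the idiom (all names hypothetical) shows why the order matters:

#include <linux/slab.h>

/* Two-step init: each cleanup label undoes only what was set up before the
 * jump to it, so the labels must appear in reverse order of acquisition.
 * Swapping them, as in the bug fixed above, releases a resource that was
 * never acquired on that error path. */
static int example_init(void)
{
	void *a, *b;
	int ret;

	a = kzalloc(64, GFP_KERNEL);
	if (!a)
		return -ENOMEM;

	b = kzalloc(64, GFP_KERNEL);
	if (!b) {
		ret = -ENOMEM;
		goto fail_free_a;	/* only "a" exists at this point */
	}

	ret = example_register(a, b);	/* hypothetical final step */
	if (ret)
		goto fail_free_b;

	return 0;

fail_free_b:
	kfree(b);
fail_free_a:
	kfree(a);
	return ret;
}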
[GIT PULL 19/19] lightnvm: fix some error code in pblk-init.c
From: Dan Carpenter There were a bunch of places in pblk_lines_init() where we didn't set an error code. And in pblk_writer_init() we accidentally return 1 instead of a correct error code, which would result in a Oops later. Fixes: 11a5d6fdf919 ("lightnvm: physical block device (pblk) target") Signed-off-by: Dan Carpenter Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-init.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c index 94653b1..3996e4b 100644 --- a/drivers/lightnvm/pblk-init.c +++ b/drivers/lightnvm/pblk-init.c @@ -543,7 +543,7 @@ static int pblk_lines_init(struct pblk *pblk) long nr_bad_blks, nr_meta_blks, nr_free_blks; int bb_distance; int i; - int ret = 0; + int ret; lm->sec_per_line = geo->sec_per_blk * geo->nr_luns; lm->blk_per_line = geo->nr_luns; @@ -638,12 +638,16 @@ static int pblk_lines_init(struct pblk *pblk) } l_mg->bb_template = kzalloc(lm->sec_bitmap_len, GFP_KERNEL); - if (!l_mg->bb_template) + if (!l_mg->bb_template) { + ret = -ENOMEM; goto fail_free_meta; + } l_mg->bb_aux = kzalloc(lm->sec_bitmap_len, GFP_KERNEL); - if (!l_mg->bb_aux) + if (!l_mg->bb_aux) { + ret = -ENOMEM; goto fail_free_bb_template; + } bb_distance = (geo->nr_luns) * geo->sec_per_pl; for (i = 0; i < lm->sec_per_line; i += bb_distance) @@ -667,8 +671,10 @@ static int pblk_lines_init(struct pblk *pblk) pblk->lines = kcalloc(l_mg->nr_lines, sizeof(struct pblk_line), GFP_KERNEL); - if (!pblk->lines) + if (!pblk->lines) { + ret = -ENOMEM; goto fail_free_bb_aux; + } nr_free_blks = 0; for (i = 0; i < l_mg->nr_lines; i++) { @@ -682,8 +688,10 @@ static int pblk_lines_init(struct pblk *pblk) spin_lock_init(&line->lock); nr_bad_blks = pblk_bb_line(pblk, line); - if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line) + if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line) { + ret = -EINVAL; goto fail_free_lines; + } line->blk_in_line = lm->blk_per_line - nr_bad_blks; if (line->blk_in_line < lm->min_blk_line) { @@ -733,7 +741,7 @@ static int pblk_writer_init(struct pblk *pblk) pblk->writer_ts = kthread_create(pblk_write_ts, pblk, "pblk-writer-t"); if (IS_ERR(pblk->writer_ts)) { pr_err("pblk: could not allocate writer kthread\n"); - return 1; + return PTR_ERR(pblk->writer_ts); } return 0; -- 2.9.3
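A hedged sketch of the convention the fix restores, with hypothetical names: init helpers report failure as a negative errno, and a kthread_create() failure is propagated with PTR_ERR() rather than a bare positive value that callers cannot interpret.

#include <linux/err.h>
#include <linux/kthread.h>

static int example_writer_init(struct example_dev *dev)
{
	dev->writer_ts = kthread_create(example_write_ts, dev, "example-writer-t");
	if (IS_ERR(dev->writer_ts)) {
		pr_err("example: could not allocate writer kthread\n");
		return PTR_ERR(dev->writer_ts);	/* e.g. -ENOMEM, never a bare 1 */
	}

	return 0;
}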
[GIT PULL 08/19] lightnvm: double-clear of dev->lun_map on target init error
From: Javier González The dev->lun_map bits are cleared twice if a target init error occurs: first in the target clean routine, and then again in the nvm_tgt_create error path. Make sure that they are only cleared once by extending nvm_remove_tgt_dev() with a clear argument, such that clearing of the bits can be skipped on the error path and is only done when removing a successfully initialized target. Signed-off-by: Javier González Fix style. Signed-off-by: Matias Bjørling Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index a14c52c..5eea3d5 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -89,7 +89,7 @@ static void nvm_release_luns_err(struct nvm_dev *dev, int lun_begin, WARN_ON(!test_and_clear_bit(i, dev->lun_map)); } -static void nvm_remove_tgt_dev(struct nvm_tgt_dev *tgt_dev) +static void nvm_remove_tgt_dev(struct nvm_tgt_dev *tgt_dev, int clear) { struct nvm_dev *dev = tgt_dev->parent; struct nvm_dev_map *dev_map = tgt_dev->map; @@ -100,11 +100,14 @@ static void nvm_remove_tgt_dev(struct nvm_tgt_dev *tgt_dev) int *lun_offs = ch_map->lun_offs; int ch = i + ch_map->ch_off; - for (j = 0; j < ch_map->nr_luns; j++) { - int lun = j + lun_offs[j]; - int lunid = (ch * dev->geo.luns_per_chnl) + lun; + if (clear) { + for (j = 0; j < ch_map->nr_luns; j++) { + int lun = j + lun_offs[j]; + int lunid = (ch * dev->geo.luns_per_chnl) + lun; - WARN_ON(!test_and_clear_bit(lunid, dev->lun_map)); + WARN_ON(!test_and_clear_bit(lunid, + dev->lun_map)); + } } kfree(ch_map->lun_offs); @@ -309,7 +312,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create) err_queue: blk_cleanup_queue(tqueue); err_dev: - nvm_remove_tgt_dev(tgt_dev); + nvm_remove_tgt_dev(tgt_dev, 0); err_t: kfree(t); err_reserve: @@ -332,7 +335,7 @@ static void __nvm_remove_target(struct nvm_target *t) if (tt->exit) tt->exit(tdisk->private_data); - nvm_remove_tgt_dev(t->dev); + nvm_remove_tgt_dev(t->dev, 1); put_disk(tdisk); list_del(&t->list); -- 2.9.3
[GIT PULL 14/19] lightnvm: fix type checks on rrpc
From: Javier González sector_t is always unsigned, therefore avoid < 0 checks on it. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/rrpc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c index 5dba544..cf0e28a 100644 --- a/drivers/lightnvm/rrpc.c +++ b/drivers/lightnvm/rrpc.c @@ -817,7 +817,7 @@ static int rrpc_read_ppalist_rq(struct rrpc *rrpc, struct bio *bio, for (i = 0; i < npages; i++) { /* We assume that mapping occurs at 4KB granularity */ - BUG_ON(!(laddr + i >= 0 && laddr + i < rrpc->nr_sects)); + BUG_ON(!(laddr + i < rrpc->nr_sects)); gp = &rrpc->trans_map[laddr + i]; if (gp->rblk) { @@ -846,7 +846,7 @@ static int rrpc_read_rq(struct rrpc *rrpc, struct bio *bio, struct nvm_rq *rqd, if (!is_gc && rrpc_lock_rq(rrpc, bio, rqd)) return NVM_IO_REQUEUE; - BUG_ON(!(laddr >= 0 && laddr < rrpc->nr_sects)); + BUG_ON(!(laddr < rrpc->nr_sects)); gp = &rrpc->trans_map[laddr]; if (gp->rblk) { -- 2.9.3
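A small illustration of why the removed checks were dead code; the helper is hypothetical:

#include <linux/types.h>

/* sector_t is an unsigned type (unsigned long or u64 depending on the
 * configuration), so "laddr >= 0" is always true; gcc even flags such
 * tests with -Wtype-limits. Only the upper bound carries information. */
static bool example_laddr_valid(sector_t laddr, sector_t nr_sects)
{
	return laddr < nr_sects;
}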
[GIT PULL 15/19] lightnvm: convert sprintf into strlcpy
From: Javier González Convert sprintf calls to strlcpy in order to make possible buffer overflow more obvious. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index 258007a..2c26af3 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -273,7 +273,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create) goto err_disk; blk_queue_make_request(tqueue, tt->make_rq); - sprintf(tdisk->disk_name, "%s", create->tgtname); + strlcpy(tdisk->disk_name, create->tgtname, sizeof(tdisk->disk_name)); tdisk->flags = GENHD_FL_EXT_DEVT; tdisk->major = 0; tdisk->first_minor = 0; @@ -1198,13 +1198,13 @@ static long nvm_ioctl_get_devices(struct file *file, void __user *arg) list_for_each_entry(dev, &nvm_devices, devices) { struct nvm_ioctl_device_info *info = &devices->info[i]; - sprintf(info->devname, "%s", dev->name); + strlcpy(info->devname, dev->name, sizeof(info->devname)); /* kept for compatibility */ info->bmversion[0] = 1; info->bmversion[1] = 0; info->bmversion[2] = 0; - sprintf(info->bmname, "%s", "gennvm"); + strlcpy(info->bmname, "gennvm", sizeof(info->bmname)); i++; if (i > 31) { -- 2.9.3
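A hedged sketch of the difference, with a hypothetical helper: sprintf() trusts the destination to be large enough, while strlcpy() makes the bound explicit at the call site.

#include <linux/string.h>

static void example_set_name(char *dst, size_t dst_size, const char *src)
{
	/* sprintf(dst, "%s", src) would copy strlen(src) + 1 bytes no matter
	 * how small dst is, silently overrunning it for an over-long src.
	 * strlcpy() copies at most dst_size - 1 characters and always
	 * NUL-terminates, truncating instead of overflowing. */
	strlcpy(dst, src, dst_size);
}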