Re: [PATCH v8 1/3] of/pci: Unify pci_process_bridge_OF_ranges from Microblaze and PowerPC

2013-05-07 Thread Andrew Murray
On Sun, May 05, 2013 at 03:41:49AM +0100, Benjamin Herrenschmidt wrote:
> On Mon, 2013-04-22 at 11:41 +0100, Andrew Murray wrote:
> > The pci_process_bridge_OF_ranges function, used to parse the "ranges"
> > property of a PCI host device, is found in both Microblaze and PowerPC
> > architectures. These implementations are nearly identical. This patch
> > moves this common code to a common place.
> 
> What's happening with this ? I'd like to avoid that patch for now
> as I'm doing some changes to pci_process_bridge_OF_ranges
> which are fairly urgent (I might even stick them in the current
> merge window) to deal with memory windows having separate offsets.

There were no objections to this latest revision until now and it is
currently sitting with Jason Cooper (mvebu-next/pcie). [1]

> 
> There's also a few hacks in there that are really ppc specific...
> 
> I think the right long term approach is to change the way powerpc
> (and microblaze ?) initializes PCI host bridges. Move it away from
> setup_arch() (which is a PITA anyway since it's way too early) to
> an early init call of some sort, and encapsulate the new struct
> pci_host_bridge.
> 
> We can then directly configure the host bridge windows rather
> than having this "intermediary" set of resources in our pci_controller
> and in fact move most of the fields from pci_controller to
> pci_host_bridge to the point where the former can remain as a
> simple platform specific wrapper if needed.

This is a view that was also shared by Bjorn [2] when I attempted to
submit a patchset which moves struct pci_controller to asm-generic.

> 
> So for new stuff (hint: DT based ARM PCI) or stuff that has to deal with
> a lot less archaic platforms (hint: Microblaze), I'd recommend going
> straight for that approach rather than perpetuating the PowerPC code
> which I'll try to deal with in the next few monthes.

The motativation for my patchsets were to give a way for ARM PCI host
bridge drivers to parse the DT ranges property, but this snow-balled into
unifying pci_process_bridge_OF_ranges.

My v8 patchset provides a of_pci_range_parser which is used directly by a
few ARM PCI DT host bridge drivers, this has been generally accepted and
tested. I don't see why this can't remain and so I'd really like to keep this
around. 

Grant, Benjamin would you be happy for me to resubmit this series which provides
the of_pci_range_parser which will be used by the separate implementations of
pci_process_bridge_OF_ranges in PowerPC/Microblaze?

Benjamin are you able to still use of_pci_range_parser in your
'Support per-aperture memory offset' patch?

Thanks,

Andrew Murray

[1] https://lkml.org/lkml/2013/4/22/505
[2] https://patchwork.kernel.org/patch/2487671

> 
> Cheers,
> Ben.
>  
> 
> 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v9 1/3] of/pci: Provide support for parsing PCI DT ranges property

2013-05-07 Thread Andrew Murray
This patch factors out common implementation patterns to reduce overall kernel
code and provide a means for host bridge drivers to directly obtain struct
resources from the DT's ranges property without relying on architecture specific
DT handling. This will make it easier to write archiecture independent host 
bridge
drivers and mitigate against further duplication of DT parsing code.

This patch can be used in the following way:

struct of_pci_range_parser parser;
struct of_pci_range range;

if (of_pci_range_parser_init(&parser, np))
; //no ranges property

for_each_of_pci_range(&parser, &range) {

/*
directly access properties of the address range, e.g.:
range.pci_space, range.pci_addr, range.cpu_addr,
range.size, range.flags

alternatively obtain a struct resource, e.g.:
struct resource res;
of_pci_range_to_resource(&range, np, &res);
*/
}

Additionally the implementation takes care of adjacent ranges and merges them
into a single range (as was the case with powerpc and microblaze).

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Signed-off-by: Thomas Petazzoni 
Reviewed-by: Rob Herring 
Tested-by: Thomas Petazzoni 
Tested-by: Linus Walleij 
Tested-by: Jingoo Han 
Acked-by: Grant Likely 
---
 drivers/of/address.c   |   67 
 include/linux/of_address.h |   48 +++
 2 files changed, 115 insertions(+), 0 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 04da786..fdd0636 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -227,6 +227,73 @@ int of_pci_address_to_resource(struct device_node *dev, 
int bar,
return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
 }
 EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+int of_pci_range_parser_init(struct of_pci_range_parser *parser,
+   struct device_node *node)
+{
+   const int na = 3, ns = 2;
+   int rlen;
+
+   parser->node = node;
+   parser->pna = of_n_addr_cells(node);
+   parser->np = parser->pna + na + ns;
+
+   parser->range = of_get_property(node, "ranges", &rlen);
+   if (parser->range == NULL)
+   return -ENOENT;
+
+   parser->end = parser->range + rlen / sizeof(__be32);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser_init);
+
+struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser 
*parser,
+   struct of_pci_range *range)
+{
+   const int na = 3, ns = 2;
+
+   if (!range)
+   return NULL;
+
+   if (!parser->range || parser->range + parser->np > parser->end)
+   return NULL;
+
+   range->pci_space = parser->range[0];
+   range->flags = of_bus_pci_get_flags(parser->range);
+   range->pci_addr = of_read_number(parser->range + 1, ns);
+   range->cpu_addr = of_translate_address(parser->node,
+   parser->range + na);
+   range->size = of_read_number(parser->range + parser->pna + na, ns);
+
+   parser->range += parser->np;
+
+   /* Now consume following elements while they are contiguous */
+   while (parser->range + parser->np <= parser->end) {
+   u32 flags, pci_space;
+   u64 pci_addr, cpu_addr, size;
+
+   pci_space = be32_to_cpup(parser->range);
+   flags = of_bus_pci_get_flags(parser->range);
+   pci_addr = of_read_number(parser->range + 1, ns);
+   cpu_addr = of_translate_address(parser->node,
+   parser->range + na);
+   size = of_read_number(parser->range + parser->pna + na, ns);
+
+   if (flags != range->flags)
+   break;
+   if (pci_addr != range->pci_addr + range->size ||
+   cpu_addr != range->cpu_addr + range->size)
+   break;
+
+   range->size += size;
+   parser->range += parser->np;
+   }
+
+   return range;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser_one);
+
 #endif /* CONFIG_PCI */
 
 /*
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 0506eb5..4c2e6f2 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -4,6 +4,36 @@
 #include 
 #include 
 
+struct of_pci_range_parser {
+   struct device_node *node;
+   const __be32 *range;
+   const __be32 *end;
+   int np;
+   int pna;
+};
+
+struct of_pci_range {
+   u32 pci_space;
+   u64 pci_addr;
+   u64 c

[PATCH v9 0/3] of/pci: Provide common support for PCI DT parsing

2013-05-07 Thread Andrew Murray
This patchset factors out duplicated code associated with parsing PCI
DT "ranges" properties across the architectures and introduces a
"ranges" parser. This parser "of_pci_range_parser" can be used directly
by ARM host bridge drivers enabling them to obtain ranges from device
trees.

I've included the Reviewed-by, Tested-by and Acked-by's received from
v5/v6/v7/v8 in this patchset, earlier versions of this patchset (v3) have been
tested-by:

Thierry Reding 
Jingoo Han 

I've tested that this patchset builds and runs on ARM and that it builds on
PowerPC, x86_64, MIPS and Microblaze.

Compared to the v8 sent by Andrew Murray, the following changes have been made
(please note that the MIPS patch is unchanged from v8):

 * Remove the unification of pci_process_bridge_OF_ranges between PowerPC and
   Microblaze. Feedback from Bjorn and Benjamin (along with a NAK) suggested
   that this goes against their future direction (using more of struct
   pci_host_bridge and less of arch specific struct pci_controller).

Compared to the v7 sent by Andrew Murray, the following changes have been made
(please note that the first patch is unchanged from v7):

 * Rename of_pci_range_parser to of_pci_range_parser_init and
   of_pci_process_ranges to of_pci_range_parser_one as suggested by Grant
   Likely.

 * Reverted back to using a switch statement instead of if/else in
   pci_process_bridge_OF_ranges. Grant Likely highlighted this change from
   the original code which was unnecessary.

 * Squashed in a patch provided by Gabor Juhos which fixes build errors on
   MIPS found in the last patchset.

Compared to the v6 sent by Andrew Murray, the following changes have
been made in response to build errors/warnings:

 * Inclusion of linux/of_address.h in of_pci.c as suggested by Michal
   Simek to prevent compilation failures on Microblaze (and others) and his
   ack.

 * Use of externs, static inlines and a typo in linux/of_address.h in response
   to linker errors (multiple defination) on x86_64 as spotted by a kbuild test
   robot on (jcooper/linux.git mvebu/drivers)

 * Add EXPORT_SYMBOL_GPL to of_pci_range_parser function to be consistent
   with of_pci_process_ranges function

Compared to the v5 sent by Andrew Murray, the following changes have
been made:

 * Use of CONFIG_64BIT instead of CONFIG_[a32bitarch] as suggested by
   Rob Herring in drivers/of/of_pci.c

 * Added forward declaration of struct pci_controller in linux/of_pci.h
   to prevent compiler warning as suggested by Thomas Petazzoni

 * Improved error checking (!range check), removal of unnecessary be32_to_cpup
   call, improved formatting of struct of_pci_range_parser layout and
   replacement of macro with a static inline. All suggested by Rob Herring.

Compared to the v4 (incorrectly labelled v3) sent by Andrew Murray,
the following changes have been made:

 * Split the patch as suggested by Rob Herring

Compared to the v3 sent by Andrew Murray, the following changes have
been made:

 * Unify and move duplicate pci_process_bridge_OF_ranges functions to
   drivers/of/of_pci.c as suggested by Rob Herring

 * Fix potential build errors with Microblaze/MIPS

Compared to "[PATCH v5 01/17] of/pci: Provide support for parsing PCI DT
ranges property", the following changes have been made:

 * Correct use of IORESOURCE_* as suggested by Russell King

 * Improved interface and naming as suggested by Thierry Reding

Compared to the v2 sent by Andrew Murray, Thomas Petazzoni did:

 * Add a memset() on the struct of_pci_range_iter when starting the
   for loop in for_each_pci_range(). Otherwise, with an uninitialized
   of_pci_range_iter, of_pci_process_ranges() may crash.

 * Add parenthesis around 'res', 'np' and 'iter' in the
   for_each_of_pci_range macro definitions. Otherwise, passing
   something like &foobar as 'res' didn't work.

 * Rebased on top of 3.9-rc2, which required fixing a few conflicts in
   the Microblaze code.

v2:
  This follows on from suggestions made by Grant Likely
  (marc.info/?l=linux-kernel&m=136079602806328)

Andrew Murray (3):
  of/pci: Provide support for parsing PCI DT ranges property
  of/pci: mips: convert to common of_pci_range_parser
  of/pci: microblaze: convert to common of_pci_range_parser

 arch/microblaze/pci/pci-common.c |  106 ++
 arch/mips/pci/pci.c  |   50 ++---
 drivers/of/address.c |   67 
 include/linux/of_address.h   |   48 +
 4 files changed, 171 insertions(+), 100 deletions(-)

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v9 2/3] of/pci: mips: convert to common of_pci_range_parser

2013-05-07 Thread Andrew Murray
This patch converts the pci_load_of_ranges function to use the new common
of_pci_range_parser.

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Signed-off-by: Gabor Juhos 
Reviewed-by: Rob Herring 
Reviewed-by: Grant Likely 
Tested-by: Linus Walleij 
---
 arch/mips/pci/pci.c |   50 ++
 1 files changed, 18 insertions(+), 32 deletions(-)

diff --git a/arch/mips/pci/pci.c b/arch/mips/pci/pci.c
index 0872f12..0d291e9 100644
--- a/arch/mips/pci/pci.c
+++ b/arch/mips/pci/pci.c
@@ -122,51 +122,37 @@ static void pcibios_scanbus(struct pci_controller *hose)
 #ifdef CONFIG_OF
 void pci_load_of_ranges(struct pci_controller *hose, struct device_node *node)
 {
-   const __be32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(node);
-   int np = pna + 5;
+   struct of_pci_range range;
+   struct of_pci_range_parser parser;
 
pr_info("PCI host bridge %s ranges:\n", node->full_name);
-   ranges = of_get_property(node, "ranges", &rlen);
-   if (ranges == NULL)
-   return;
hose->of_node = node;
 
-   while ((rlen -= np * 4) >= 0) {
-   u32 pci_space;
+   if (of_pci_range_parser_init(&parser, node))
+   return;
+
+   for_each_of_pci_range(&parser, &range) {
struct resource *res = NULL;
-   u64 addr, size;
-
-   pci_space = be32_to_cpup(&ranges[0]);
-   addr = of_translate_address(node, ranges + 3);
-   size = of_read_number(ranges + pna + 3, 2);
-   ranges += np;
-   switch ((pci_space >> 24) & 0x3) {
-   case 1: /* PCI IO space */
+
+   switch (range.flags & IORESOURCE_TYPE_BITS) {
+   case IORESOURCE_IO:
pr_info("  IO 0x%016llx..0x%016llx\n",
-   addr, addr + size - 1);
+   range.cpu_addr,
+   range.cpu_addr + range.size - 1);
hose->io_map_base =
-   (unsigned long)ioremap(addr, size);
+   (unsigned long)ioremap(range.cpu_addr,
+  range.size);
res = hose->io_resource;
-   res->flags = IORESOURCE_IO;
break;
-   case 2: /* PCI Memory space */
-   case 3: /* PCI 64 bits Memory space */
+   case IORESOURCE_MEM:
pr_info(" MEM 0x%016llx..0x%016llx\n",
-   addr, addr + size - 1);
+   range.cpu_addr,
+   range.cpu_addr + range.size - 1);
res = hose->mem_resource;
-   res->flags = IORESOURCE_MEM;
break;
}
-   if (res != NULL) {
-   res->start = addr;
-   res->name = node->full_name;
-   res->end = res->start + size - 1;
-   res->parent = NULL;
-   res->sibling = NULL;
-   res->child = NULL;
-   }
+   if (res != NULL)
+   of_pci_range_to_resource(&range, node, res);
}
 }
 #endif
-- 
1.7.0.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v9 3/3] of/pci: microblaze: convert to common of_pci_range_parser

2013-05-07 Thread Andrew Murray
This patch converts the pci_load_of_ranges function to use the new common
of_pci_range_parser.

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
---
 arch/microblaze/pci/pci-common.c |  106 ++
 1 files changed, 38 insertions(+), 68 deletions(-)

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 9ea521e..ba9e4a1 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -658,67 +658,42 @@ void pci_resource_to_user(const struct pci_dev *dev, int 
bar,
 void pci_process_bridge_OF_ranges(struct pci_controller *hose,
  struct device_node *dev, int primary)
 {
-   const u32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(dev);
-   int np = pna + 5;
int memno = 0, isa_hole = -1;
-   u32 pci_space;
-   unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
unsigned long long isa_mb = 0;
struct resource *res;
+   struct of_pci_range range;
+   struct of_pci_range_parser parser;
 
pr_info("PCI host bridge %s %s ranges:\n",
   dev->full_name, primary ? "(primary)" : "");
 
-   /* Get ranges property */
-   ranges = of_get_property(dev, "ranges", &rlen);
-   if (ranges == NULL)
+   /* Check for ranges property */
+   if (of_pci_range_parser_init(&parser, dev))
return;
 
-   /* Parse it */
pr_debug("Parsing ranges property...\n");
-   while ((rlen -= np * 4) >= 0) {
+   for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
-   pci_space = ranges[0];
-   pci_addr = of_read_number(ranges + 1, 2);
-   cpu_addr = of_translate_address(dev, ranges + 3);
-   size = of_read_number(ranges + pna + 3, 2);
-
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
-   pci_space, pci_addr);
+   range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
-   cpu_addr, size);
-
-   ranges += np;
+   range.cpu_addr, range.size);
 
/* If we failed translation or got a zero-sized region
 * (some FW try to feed us with non sensical zero sized regions
 * such as power3 which look like some kind of attempt
 * at exposing the VGA memory hole)
 */
-   if (cpu_addr == OF_BAD_ADDR || size == 0)
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
 
-   /* Now consume following elements while they are contiguous */
-   for (; rlen >= np * sizeof(u32);
-ranges += np, rlen -= np * 4) {
-   if (ranges[0] != pci_space)
-   break;
-   pci_next = of_read_number(ranges + 1, 2);
-   cpu_next = of_translate_address(dev, ranges + 3);
-   if (pci_next != pci_addr + size ||
-   cpu_next != cpu_addr + size)
-   break;
-   size += of_read_number(ranges + pna + 3, 2);
-   }
-
/* Act based on address space type */
res = NULL;
-   switch ((pci_space >> 24) & 0x3) {
-   case 1: /* PCI IO space */
+   switch (range.flags & IORESOURCE_TYPE_BITS) {
+   case IORESOURCE_IO:
pr_info("  IO 0x%016llx..0x%016llx -> 0x%016llx\n",
-  cpu_addr, cpu_addr + size - 1, pci_addr);
+   range.cpu_addr, range.cpu_addr + range.size - 1,
+   range.pci_addr);
 
/* We support only one IO range */
if (hose->pci_io_size) {
@@ -726,11 +701,12 @@ void pci_process_bridge_OF_ranges(struct pci_controller 
*hose,
continue;
}
/* On 32 bits, limit I/O space to 16MB */
-   if (size > 0x0100)
-   size = 0x0100;
+   if (range.size > 0x0100)
+   range.size = 0x0100;
 
/* 32 bits needs to map IOs here */
-   hose->io_base_virt = ioremap(cpu_addr, size);
+   hose->io_base_virt = ioremap(range.cpu_addr,
+   range.size);
 
/* Expect trouble if pci_addr is not 0 */

Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Andrew Murray
On Wed, Dec 12, 2012 at 10:49 AM, Grant Likely wrote:
> On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek 
> mailto:mon...@monstr.eu>> wrote:
> > On 12/10/2012 10:41 PM, Grant Likely wrote:
> >> drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
> >> which might actually be a good idea in the short term so that it gets
> >> appropriate supervision while being generalized before being moved into
> >> the pci directory.
> >
> > Ben: Are you willing to move that ppc code to this location?
> > It is probably not good idea that I should do it when I even don't have
> > hardware available for testing (Asking someone else).
> 
> You're a clever guy, you are more than capable of crafting the patch,
> even if you can't test on hardware. :-)
> 
> I refactored most of the OF support code without having access to most
> of the affected hardware. Once I got the changes out there for review
> I also asked for spot testing before getting it into linux-next for
> even more testing.

I've been working on a relatively architecture agnostic PCI host bridge driver
and also wanted to avoid duplicating more generic DT parsing code for PCI
bindings.

I've ended up with a patch which provides an iterator for returning resources
based on the the typical 'ranges' binding. This has ended up living in
drivers/of/address.c. I originally started out in drivers/of/pci.c and
drivers/pci/pci-of.c but found there were good (and static) implementations in
drivers/of/address.c which can be reused (e.g. of_bus_pci_get_flags,
bus->count_cells).

I'm not just ready to post it - but can do before early next week if you can
wait.

Andrew Murray

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH] pci: Provide support for parsing PCI DT ranges property

2012-12-12 Thread Andrew Murray
DT bindings for PCI host bridges often use the ranges property to describe
memory and IO ranges - this binding tends to be the same across architectures
yet several parsing implementations exist, e.g. arch/mips/pci/pci.c,
arch/powerpc/kernel/pci-common.c, arch/sparc/kernel/pci.c and
arch/microblaze/pci/pci-common.c (clone of PPC). Some of these duplicate
functionality provided by drivers/of/address.c.

This patch provides a common iterator-based parser for the ranges property, it
is hoped this will reduce DT representation differences between architectures
and that architectures will migrate in part to this new parser.

It is also hoped (and the motativation for the patch) that this patch will
reduce duplication of code when writing host bridge drivers that are supported
by multiple architectures.

This patch provides struct resources from a device tree node, e.g.:

u32 *last = NULL;
struct resource res;
while ((last = of_pci_process_ranges(np, res, last))) {
//do something with res
}

Platforms with quirks can then do what they like with the resource or migrate
common quirk handling to the parser. In an ideal world drivers can just request
the obtained resources and pass them on (e.g. pci_add_resource_offset).

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
---
 drivers/of/address.c   |   53 +++-
 include/linux/of_address.h |7 +
 2 files changed, 59 insertions(+), 1 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 7e262a6..03bfe61 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -219,6 +219,57 @@ int of_pci_address_to_resource(struct device_node *dev, 
int bar,
return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
 }
 EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+const __be32 *of_pci_process_ranges(struct device_node *node,
+   struct resource *res, const __be32 *from)
+{
+   const __be32 *start, *end;
+   int na, ns, np, pna;
+   int rlen;
+   struct of_bus *bus;
+   WARN_ON(!res);
+
+   bus = of_match_bus(node);
+   bus->count_cells(node, &na, &ns);
+   if (!OF_CHECK_COUNTS(na, ns)) {
+   pr_err("Bad cell count for %s\n", node->full_name);
+   return NULL;
+   }
+
+   pna = of_n_addr_cells(node);
+   np = pna + na + ns;
+
+   start = of_get_property(node, "ranges", &rlen);
+   if (start == NULL)
+   return NULL;
+
+   end = start + rlen;
+
+   if (!from)
+   from = start;
+
+   while (from + np <= end) {
+   u64 cpu_addr, size;
+
+   cpu_addr = of_translate_address(node, from + na);
+   size = of_read_number(from + na + pna, ns);
+   res->flags = bus->get_flags(from);
+   from += np;
+
+   if (cpu_addr == OF_BAD_ADDR || size == 0)
+   continue;
+
+   res->name = node->full_name;
+   res->start = cpu_addr;
+   res->end = res->start + size - 1;
+   res->parent = res->child = res->sibling = NULL;
+   return from;
+   }
+
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(of_pci_process_ranges);
+
 #endif /* CONFIG_PCI */
 
 /*
@@ -421,7 +472,7 @@ u64 __of_translate_address(struct device_node *dev, const 
__be32 *in_addr,
goto bail;
bus = of_match_bus(parent);
 
-   /* Cound address cells & copy address locally */
+   /* Count address cells & copy address locally */
bus->count_cells(dev, &na, &ns);
if (!OF_CHECK_COUNTS(na, ns)) {
printk(KERN_ERR "prom_parse: Bad cell count for %s\n",
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 01b925a..4582b20 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -26,6 +26,8 @@ static inline unsigned long pci_address_to_pio(phys_addr_t 
addr) { return -1; }
 #define pci_address_to_pio pci_address_to_pio
 #endif
 
+const __be32 *of_pci_process_ranges(struct device_node *node,
+   struct resource *res, const __be32 *from);
 #else /* CONFIG_OF_ADDRESS */
 static inline int of_address_to_resource(struct device_node *dev, int index,
 struct resource *r)
@@ -48,6 +50,11 @@ static inline const u32 *of_get_address(struct device_node 
*dev, int index,
 {
return NULL;
 }
+const __be32 *of_pci_process_ranges(struct device_node *node,
+   struct resource *res, const __be32 *from)
+{
+   return NULL;
+}
 #endif /* CONFIG_OF_ADDRESS */
 
 
-- 
1.7.0.4


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: pci and pcie device-tree binding - range No cells

2012-12-12 Thread Andrew Murray
On Wed, Dec 12, 2012 at 01:34:24PM +, Thierry Reding wrote:
> On Wed, Dec 12, 2012 at 12:19:12PM +0000, Andrew Murray wrote:
> > I've been working on a relatively architecture agnostic PCI host bridge 
> > driver
> > and also wanted to avoid duplicating more generic DT parsing code for PCI
> > bindings.
> > 
> > I've ended up with a patch which provides an iterator for returning 
> > resources
> > based on the the typical 'ranges' binding. This has ended up living in
> > drivers/of/address.c. I originally started out in drivers/of/pci.c and
> > drivers/pci/pci-of.c but found there were good (and static) implementations 
> > in
> > drivers/of/address.c which can be reused (e.g. of_bus_pci_get_flags,
> > bus->count_cells).
> > 
> > I'm not just ready to post it - but can do before early next week if you can
> > wait.
> 
> I already posted a similar patch[0] as part of a larger series to bring
> DT support to Tegra PCIe back in July. I suppose what you have must be
> something pretty close to that. Most of the stuff that had me occupied
> since then should be done soon and I was planning on resurrecting the
> series one of these days.

Thanks for the reference. I've submitted my patch, it's along the lines of your
existing patch.

I'm happy to take the best bits from both, drop mine, etc.

Andrew Murray

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] pci: Provide support for parsing PCI DT ranges property

2012-12-13 Thread Andrew Murray
On Thu, Dec 13, 2012 at 09:13:33AM +, Thierry Reding wrote:
> Hi Andrew,
> 
> I don't like iterator interfaces too much, but I can live with that.
> Other than that the patch looks good to me and I'll try to work it into
> my Tegra PCIe patch series.
> 
> Just two minor comments below.
> 
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> [...]
> > @@ -421,7 +472,7 @@ u64 __of_translate_address(struct device_node *dev, 
> > const __be32 *in_addr,
> > goto bail;
> > bus = of_match_bus(parent);
> >  
> > -   /* Cound address cells & copy address locally */
> > +   /* Count address cells & copy address locally */
> > bus->count_cells(dev, &na, &ns);
> > if (!OF_CHECK_COUNTS(na, ns)) {
> > printk(KERN_ERR "prom_parse: Bad cell count for %s\n",
> 
> This is really minor, but it should still go into a separate patch.
> 
> > diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> > index 01b925a..4582b20 100644
> > --- a/include/linux/of_address.h
> > +++ b/include/linux/of_address.h
> > @@ -26,6 +26,8 @@ static inline unsigned long 
> > pci_address_to_pio(phys_addr_t addr) { return -1; }
> >  #define pci_address_to_pio pci_address_to_pio
> >  #endif
> >  
> > +const __be32 *of_pci_process_ranges(struct device_node *node,
> > +   struct resource *res, const __be32 *from);
> >  #else /* CONFIG_OF_ADDRESS */
> >  static inline int of_address_to_resource(struct device_node *dev, int 
> > index,
> >  struct resource *r)
> > @@ -48,6 +50,11 @@ static inline const u32 *of_get_address(struct 
> > device_node *dev, int index,
> >  {
> > return NULL;
> >  }
> > +const __be32 *of_pci_process_ranges(struct device_node *node,
> 
> There should be a blank line to separate the above two lines.
> 

Thanks for the feedback.

I will send another patch for the typo and leave this patch with you for
working into your existing series.

Andrew Murray

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] pci: Provide support for parsing PCI DT ranges property

2012-12-13 Thread Andrew Murray
On Thu, Dec 13, 2012 at 10:03:18AM +, Thierry Reding wrote:
> On Thu, Dec 13, 2012 at 09:45:43AM +0000, Andrew Murray wrote:
> > On Thu, Dec 13, 2012 at 09:13:33AM +, Thierry Reding wrote:
> > > Hi Andrew,
> > > 
> > > I don't like iterator interfaces too much, but I can live with that.
> > > Other than that the patch looks good to me and I'll try to work it into
> > > my Tegra PCIe patch series.
> > > 
> > > Just two minor comments below.
> > > 
> > > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > [...]
> > > > @@ -421,7 +472,7 @@ u64 __of_translate_address(struct device_node *dev, 
> > > > const __be32 *in_addr,
> > > > goto bail;
> > > > bus = of_match_bus(parent);
> > > >  
> > > > -   /* Cound address cells & copy address locally */
> > > > +   /* Count address cells & copy address locally */
> > > > bus->count_cells(dev, &na, &ns);
> > > > if (!OF_CHECK_COUNTS(na, ns)) {
> > > > printk(KERN_ERR "prom_parse: Bad cell count for %s\n",
> > > 
> > > This is really minor, but it should still go into a separate patch.
> > > 
> > > > diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> > > > index 01b925a..4582b20 100644
> > > > --- a/include/linux/of_address.h
> > > > +++ b/include/linux/of_address.h
> > > > @@ -26,6 +26,8 @@ static inline unsigned long 
> > > > pci_address_to_pio(phys_addr_t addr) { return -1; }
> > > >  #define pci_address_to_pio pci_address_to_pio
> > > >  #endif
> > > >  
> > > > +const __be32 *of_pci_process_ranges(struct device_node *node,
> > > > +   struct resource *res, const __be32 
> > > > *from);
> > > >  #else /* CONFIG_OF_ADDRESS */
> > > >  static inline int of_address_to_resource(struct device_node *dev, int 
> > > > index,
> > > >  struct resource *r)
> > > > @@ -48,6 +50,11 @@ static inline const u32 *of_get_address(struct 
> > > > device_node *dev, int index,
> > > >  {
> > > > return NULL;
> > > >  }
> > > > +const __be32 *of_pci_process_ranges(struct device_node *node,
> > > 
> > > There should be a blank line to separate the above two lines.
> > > 
> > 
> > Thanks for the feedback.
> > 
> > I will send another patch for the typo and leave this patch with you for
> > working into your existing series.
> 
> I suppose you have your own series that uses this patch?

Not yet, it may be some time before I submit my PCI host bridge driver. Though
I am making changes else where (e.g. this patch) which I'm hoping to submit as
early as possible. I can rebase my work for these upstream dependencies.

I can re-spin this patch with your suggested changes if you prefer?

Andrew Murray


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] pci: Provide support for parsing PCI DT ranges property

2013-01-14 Thread Andrew Murray
On Thu, Dec 20, 2012 at 08:25:00AM +, Thierry Reding wrote:
> On Wed, Dec 12, 2012 at 04:37:50PM +0000, Andrew Murray wrote:
> [...]
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> [...]
> > +   start = of_get_property(node, "ranges", &rlen);
> > +   if (start == NULL)
> > +   return NULL;
> > +
> > +   end = start + rlen;
> 
> I'm currently rewriting large parts of the Tegra PCIe controller driver
> and I'm trying to use this new API. This seems to work fine, except that
> I think this line needs to be:
> 
>   end = start + rlen / sizeof(__be32);
> 
> Otherwise we'll try to process 4 times as many ranges as there are.
> 
> Thierry

Good catch. Thanks for taking this on.

Andrew Murray


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] pci: Provide support for parsing PCI DT ranges property

2013-01-14 Thread Andrew Murray
On Sat, Dec 15, 2012 at 01:06:41AM +, Grant Likely wrote:
> On Wed, Dec 12, 2012 at 4:37 PM, Andrew Murray  wrote:
> > DT bindings for PCI host bridges often use the ranges property to describe
> > memory and IO ranges - this binding tends to be the same across 
> > architectures
> > yet several parsing implementations exist, e.g. arch/mips/pci/pci.c,
> > arch/powerpc/kernel/pci-common.c, arch/sparc/kernel/pci.c and
> > arch/microblaze/pci/pci-common.c (clone of PPC). Some of these duplicate
> > functionality provided by drivers/of/address.c.
> 
> Hi Andrew,
> 
> Thanks for looking into this. This definitely needs to be done.
> 
> However, I cannot merge this patch as-is because it actually makes
> things worse by adding yet another implementation of the parsing code.
> Plus it doesn't actually have any users.  :-)

I understand. Though I see Thierry has included this in his patch set - I
guess that means there is potentially one user now :).

> 
> Instead, move the existing code that you need out of
> arch/powerpc/kernel/pci-common.c into a shared place and add in the
> features you need. Bonus points if you fixup microblaze or others at
> the same time.

In most part the patch I submitted was the common code from powerpc but
without quirks and tie-ins to powerpc structures. I'd like to convert
powerpc to using my patch and others but won't get time to do this at
present :(

> 
> g.
> 
Andrew Murray

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v7 0/3] of/pci: Provide common support for PCI DT parsing

2013-04-16 Thread Andrew Murray
This patchset factors out duplicated code associated with parsing PCI
DT "ranges" properties across the architectures and introduces a
"ranges" parser. This parser "of_pci_range_parser" can be used directly
by ARM host bridge drivers enabling them to obtain ranges from device
trees.

I've included the Reviewed-by and Tested-by's received from v5/v6 in this
patchset, earlier versions of this patchset (v3) have been tested-by:

Thierry Reding 
Jingoo Han 

I've tested that this patchset builds and runs on ARM and that it builds on
PowerPC and x86_64.

Compared to the v6 sent by Andrew Murray, the following changes have
been made in response to build errors/warnings:

 * Inclusion of linux/of_address.h in of_pci.c as suggested by Michal
   Simek to prevent compilation failures on Microblaze (and others) and his
   ack.

 * Use of externs, static inlines and a typo in linux/of_address.h in response
   to linker errors (multiple defination) on x86_64 as spotted by a kbuild test
   robot on (jcooper/linux.git mvebu/drivers)

 * Add EXPORT_SYMBOL_GPL to of_pci_range_parser function to be consistent
   with of_pci_process_ranges function

Compared to the v5 sent by Andrew Murray, the following changes have
been made:

 * Use of CONFIG_64BIT instead of CONFIG_[a32bitarch] as suggested by
   Rob Herring in drivers/of/of_pci.c

 * Added forward declaration of struct pci_controller in linux/of_pci.h
   to prevent compiler warning as suggested by Thomas Petazzoni

 * Improved error checking (!range check), removal of unnecessary be32_to_cpup
   call, improved formatting of struct of_pci_range_parser layout and
   replacement of macro with a static inline. All suggested by Rob Herring.

Compared to the v4 (incorrectly labelled v3) sent by Andrew Murray,
the following changes have been made:

 * Split the patch as suggested by Rob Herring

Compared to the v3 sent by Andrew Murray, the following changes have
been made:

 * Unify and move duplicate pci_process_bridge_OF_ranges functions to
   drivers/of/of_pci.c as suggested by Rob Herring

 * Fix potential build errors with Microblaze/MIPS

Compared to "[PATCH v5 01/17] of/pci: Provide support for parsing PCI DT
ranges property", the following changes have been made:

 * Correct use of IORESOURCE_* as suggested by Russell King

 * Improved interface and naming as suggested by Thierry Reding

Compared to the v2 sent by Andrew Murray, Thomas Petazzoni did:

 * Add a memset() on the struct of_pci_range_iter when starting the
   for loop in for_each_pci_range(). Otherwise, with an uninitialized
   of_pci_range_iter, of_pci_process_ranges() may crash.

 * Add parenthesis around 'res', 'np' and 'iter' in the
   for_each_of_pci_range macro definitions. Otherwise, passing
   something like &foobar as 'res' didn't work.

 * Rebased on top of 3.9-rc2, which required fixing a few conflicts in
   the Microblaze code.

v2:
  This follows on from suggestions made by Grant Likely
  (marc.info/?l=linux-kernel&m=136079602806328)

Andrew Murray (3):
  of/pci: Unify pci_process_bridge_OF_ranges from Microblaze and
PowerPC
  of/pci: Provide support for parsing PCI DT ranges property
  of/pci: mips: convert to common of_pci_range_parser

 arch/microblaze/include/asm/pci-bridge.h |5 +-
 arch/microblaze/pci/pci-common.c |  192 --
 arch/mips/pci/pci.c  |   50 +++--
 arch/powerpc/include/asm/pci-bridge.h|5 +-
 arch/powerpc/kernel/pci-common.c |  192 --
 drivers/of/address.c |   67 +++
 drivers/of/of_pci.c  |  169 ++
 include/linux/of_address.h   |   46 +++
 include/linux/of_pci.h   |4 +
 9 files changed, 304 insertions(+), 426 deletions(-)
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v7 1/3] of/pci: Unify pci_process_bridge_OF_ranges from Microblaze and PowerPC

2013-04-16 Thread Andrew Murray
The pci_process_bridge_OF_ranges function, used to parse the "ranges"
property of a PCI host device, is found in both Microblaze and PowerPC
architectures. These implementations are nearly identical. This patch
moves this common code to a common place.

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Reviewed-by: Rob Herring 
Tested-by: Thomas Petazzoni 
Tested-by: Linus Walleij 
Acked-by: Michal Simek 
---
 arch/microblaze/include/asm/pci-bridge.h |5 +-
 arch/microblaze/pci/pci-common.c |  192 
 arch/powerpc/include/asm/pci-bridge.h|5 +-
 arch/powerpc/kernel/pci-common.c |  192 
 drivers/of/of_pci.c  |  200 ++
 include/linux/of_pci.h   |4 +
 6 files changed, 206 insertions(+), 392 deletions(-)

diff --git a/arch/microblaze/include/asm/pci-bridge.h 
b/arch/microblaze/include/asm/pci-bridge.h
index cb5d397..5783cd6 100644
--- a/arch/microblaze/include/asm/pci-bridge.h
+++ b/arch/microblaze/include/asm/pci-bridge.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct device_node;
 
@@ -132,10 +133,6 @@ extern void setup_indirect_pci(struct pci_controller *hose,
 extern struct pci_controller *pci_find_hose_for_OF_device(
struct device_node *node);
 
-/* Fill up host controller resources from the OF node */
-extern void pci_process_bridge_OF_ranges(struct pci_controller *hose,
-   struct device_node *dev, int primary);
-
 /* Allocate & free a PCI host bridge structure */
 extern struct pci_controller *pcibios_alloc_controller(struct device_node 
*dev);
 extern void pcibios_free_controller(struct pci_controller *phb);
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 9ea521e..2735ad9 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -622,198 +622,6 @@ void pci_resource_to_user(const struct pci_dev *dev, int 
bar,
*end = rsrc->end - offset;
 }
 
-/**
- * pci_process_bridge_OF_ranges - Parse PCI bridge resources from device tree
- * @hose: newly allocated pci_controller to be setup
- * @dev: device node of the host bridge
- * @primary: set if primary bus (32 bits only, soon to be deprecated)
- *
- * This function will parse the "ranges" property of a PCI host bridge device
- * node and setup the resource mapping of a pci controller based on its
- * content.
- *
- * Life would be boring if it wasn't for a few issues that we have to deal
- * with here:
- *
- *   - We can only cope with one IO space range and up to 3 Memory space
- * ranges. However, some machines (thanks Apple !) tend to split their
- * space into lots of small contiguous ranges. So we have to coalesce.
- *
- *   - We can only cope with all memory ranges having the same offset
- * between CPU addresses and PCI addresses. Unfortunately, some bridges
- * are setup for a large 1:1 mapping along with a small "window" which
- * maps PCI address 0 to some arbitrary high address of the CPU space in
- * order to give access to the ISA memory hole.
- * The way out of here that I've chosen for now is to always set the
- * offset based on the first resource found, then override it if we
- * have a different offset and the previous was set by an ISA hole.
- *
- *   - Some busses have IO space not starting at 0, which causes trouble with
- * the way we do our IO resource renumbering. The code somewhat deals with
- * it for 64 bits but I would expect problems on 32 bits.
- *
- *   - Some 32 bits platforms such as 4xx can have physical space larger than
- * 32 bits so we need to use 64 bits values for the parsing
- */
-void pci_process_bridge_OF_ranges(struct pci_controller *hose,
- struct device_node *dev, int primary)
-{
-   const u32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(dev);
-   int np = pna + 5;
-   int memno = 0, isa_hole = -1;
-   u32 pci_space;
-   unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
-   unsigned long long isa_mb = 0;
-   struct resource *res;
-
-   pr_info("PCI host bridge %s %s ranges:\n",
-  dev->full_name, primary ? "(primary)" : "");
-
-   /* Get ranges property */
-   ranges = of_get_property(dev, "ranges", &rlen);
-   if (ranges == NULL)
-   return;
-
-   /* Parse it */
-   pr_debug("Parsing ranges property...\n");
-   while ((rlen -= np * 4) >= 0) {
-   /* Read next ranges element */
-   pci_space = ranges[0];
-   pci_addr = of_read_number(ranges + 1, 2);
-   cpu_addr = of_translate_address(dev, ranges + 3);
-   size = of_read_number(ranges + pna + 3, 2);
-
- 

[PATCH v7 2/3] of/pci: Provide support for parsing PCI DT ranges property

2013-04-16 Thread Andrew Murray
This patch factors out common implementation patterns to reduce overall kernel
code and provide a means for host bridge drivers to directly obtain struct
resources from the DT's ranges property without relying on architecture specific
DT handling. This will make it easier to write archiecture independent host 
bridge
drivers and mitigate against further duplication of DT parsing code.

This patch can be used in the following way:

struct of_pci_range_parser parser;
struct of_pci_range range;

if (of_pci_range_parser(&parser, np))
; //no ranges property

for_each_of_pci_range(&parser, &range) {

/*
directly access properties of the address range, e.g.:
range.pci_space, range.pci_addr, range.cpu_addr,
range.size, range.flags

alternatively obtain a struct resource, e.g.:
struct resource res;
of_pci_range_to_resource(&range, np, &res);
*/
}

Additionally the implementation takes care of adjacent ranges and merges them
into a single range (as was the case with powerpc and microblaze).

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Signed-off-by: Thomas Petazzoni 
Reviewed-by: Rob Herring 
Tested-by: Thomas Petazzoni 
Tested-by: Linus Walleij 
---
 drivers/of/address.c   |   67 ++
 drivers/of/of_pci.c|  113 
 include/linux/of_address.h |   46 ++
 3 files changed, 154 insertions(+), 72 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 04da786..6eec70c 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -227,6 +227,73 @@ int of_pci_address_to_resource(struct device_node *dev, 
int bar,
return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
 }
 EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+int of_pci_range_parser(struct of_pci_range_parser *parser,
+   struct device_node *node)
+{
+   const int na = 3, ns = 2;
+   int rlen;
+
+   parser->node = node;
+   parser->pna = of_n_addr_cells(node);
+   parser->np = parser->pna + na + ns;
+
+   parser->range = of_get_property(node, "ranges", &rlen);
+   if (parser->range == NULL)
+   return -ENOENT;
+
+   parser->end = parser->range + rlen / sizeof(__be32);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser);
+
+struct of_pci_range *of_pci_process_ranges(struct of_pci_range_parser *parser,
+   struct of_pci_range *range)
+{
+   const int na = 3, ns = 2;
+
+   if (!range)
+   return NULL;
+
+   if (!parser->range || parser->range + parser->np > parser->end)
+   return NULL;
+
+   range->pci_space = parser->range[0];
+   range->flags = of_bus_pci_get_flags(parser->range);
+   range->pci_addr = of_read_number(parser->range + 1, ns);
+   range->cpu_addr = of_translate_address(parser->node,
+   parser->range + na);
+   range->size = of_read_number(parser->range + parser->pna + na, ns);
+
+   parser->range += parser->np;
+
+   /* Now consume following elements while they are contiguous */
+   while (parser->range + parser->np <= parser->end) {
+   u32 flags, pci_space;
+   u64 pci_addr, cpu_addr, size;
+
+   pci_space = be32_to_cpup(parser->range);
+   flags = of_bus_pci_get_flags(parser->range);
+   pci_addr = of_read_number(parser->range + 1, ns);
+   cpu_addr = of_translate_address(parser->node,
+   parser->range + na);
+   size = of_read_number(parser->range + parser->pna + na, ns);
+
+   if (flags != range->flags)
+   break;
+   if (pci_addr != range->pci_addr + range->size ||
+   cpu_addr != range->cpu_addr + range->size)
+   break;
+
+   range->size += size;
+   parser->range += parser->np;
+   }
+
+   return range;
+}
+EXPORT_SYMBOL_GPL(of_pci_process_ranges);
+
 #endif /* CONFIG_PCI */
 
 /*
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 1626172..e5ab604 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #if defined(CONFIG_PPC32) || defined(CONFIG_PPC64) || 
defined(CONFIG_MICROBLAZE)
@@ -82,67 +83,43 @@ EXPORT_SYMBOL_GPL(of_pci_find_child_device);
 void pci_process_bridge_OF_ranges(struct pci_controller *hose,
  struct device_node *dev,

[PATCH v7 3/3] of/pci: mips: convert to common of_pci_range_parser

2013-04-16 Thread Andrew Murray
This patch converts the pci_load_of_ranges function to use the new common
of_pci_range_parser.

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Reviewed-by: Rob Herring 
---
 arch/mips/pci/pci.c |   50 --
 1 files changed, 16 insertions(+), 34 deletions(-)

diff --git a/arch/mips/pci/pci.c b/arch/mips/pci/pci.c
index 0872f12..bee49a4 100644
--- a/arch/mips/pci/pci.c
+++ b/arch/mips/pci/pci.c
@@ -122,51 +122,33 @@ static void pcibios_scanbus(struct pci_controller *hose)
 #ifdef CONFIG_OF
 void pci_load_of_ranges(struct pci_controller *hose, struct device_node *node)
 {
-   const __be32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(node);
-   int np = pna + 5;
+   struct of_pci_range_range range;
+   struct of_pci_range_parser parser;
+   u32 res_type;
 
pr_info("PCI host bridge %s ranges:\n", node->full_name);
-   ranges = of_get_property(node, "ranges", &rlen);
-   if (ranges == NULL)
-   return;
hose->of_node = node;
 
-   while ((rlen -= np * 4) >= 0) {
-   u32 pci_space;
+   if (of_pci_range_parser(&parser, node))
+   return;
+
+   for_each_of_pci_range(&parser, &range) {
struct resource *res = NULL;
-   u64 addr, size;
-
-   pci_space = be32_to_cpup(&ranges[0]);
-   addr = of_translate_address(node, ranges + 3);
-   size = of_read_number(ranges + pna + 3, 2);
-   ranges += np;
-   switch ((pci_space >> 24) & 0x3) {
-   case 1: /* PCI IO space */
+
+   res_type = range.flags & IORESOURCE_TYPE_BITS;
+   if (res_type == IORESOURCE_IO) {
pr_info("  IO 0x%016llx..0x%016llx\n",
-   addr, addr + size - 1);
+   range.addr, range.addr + range.size - 1);
hose->io_map_base =
-   (unsigned long)ioremap(addr, size);
+   (unsigned long)ioremap(range.addr, range.size);
res = hose->io_resource;
-   res->flags = IORESOURCE_IO;
-   break;
-   case 2: /* PCI Memory space */
-   case 3: /* PCI 64 bits Memory space */
+   } else if (res_type == IORESOURCE_MEM) {
pr_info(" MEM 0x%016llx..0x%016llx\n",
-   addr, addr + size - 1);
+   range.addr, range.addr + range.size - 1);
res = hose->mem_resource;
-   res->flags = IORESOURCE_MEM;
-   break;
-   }
-   if (res != NULL) {
-   res->start = addr;
-   res->name = node->full_name;
-   res->end = res->start + size - 1;
-   res->parent = NULL;
-   res->sibling = NULL;
-   res->child = NULL;
}
+   if (res != NULL)
+   of_pci_range_to_resource(&range, node, res);
}
 }
 #endif
-- 
1.7.0.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v7 1/3] of/pci: Unify pci_process_bridge_OF_ranges from Microblaze and PowerPC

2013-04-16 Thread Andrew Murray
On Tue, Apr 16, 2013 at 11:18:26AM +0100, Andrew Murray wrote:
> The pci_process_bridge_OF_ranges function, used to parse the "ranges"
> property of a PCI host device, is found in both Microblaze and PowerPC
> architectures. These implementations are nearly identical. This patch
> moves this common code to a common place.
> 
> Signed-off-by: Andrew Murray 
> Signed-off-by: Liviu Dudau 
> Reviewed-by: Rob Herring 
> Tested-by: Thomas Petazzoni 
> Tested-by: Linus Walleij 
> Acked-by: Michal Simek 
> ---
>  arch/microblaze/include/asm/pci-bridge.h |5 +-
>  arch/microblaze/pci/pci-common.c |  192 
>  arch/powerpc/include/asm/pci-bridge.h|5 +-
>  arch/powerpc/kernel/pci-common.c |  192 

Is there anyone on linuxppc-dev/linux-mips that can help test this patchset?

I've tested that it builds on powerpc with a variety of configs (some which
include fsl_pci.c implementation). Though I don't have hardware to verify that
it works.

I haven't tested this builds or runs on MIPS.

You shouldn't see any difference in behaviour or new warnings and PCI devices
should continue to operate as before.

Andrew Murray
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v7 3/3] of/pci: mips: convert to common of_pci_range_parser

2013-04-18 Thread Andrew Murray
On Wed, Apr 17, 2013 at 04:42:48PM +0100, Linus Walleij wrote:
> On Tue, Apr 16, 2013 at 12:18 PM, Andrew Murray  wrote:
> 
> > This patch converts the pci_load_of_ranges function to use the new common
> > of_pci_range_parser.
> >
> > Signed-off-by: Andrew Murray 
> > Signed-off-by: Liviu Dudau 
> > Reviewed-by: Rob Herring 
> 
> Tested-by: Linus Walleij 

Jason - you may not have seen this, but here (Linus Walleij) is another 
Tested-by
to add to this patch in your tree (if you can).

Andrew Murray
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v7 2/3] of/pci: Provide support for parsing PCI DT ranges property

2013-04-18 Thread Andrew Murray
On Thu, Apr 18, 2013 at 02:44:01PM +0100, Grant Likely wrote:
> On Tue, 16 Apr 2013 11:18:27 +0100, Andrew Murray  
> wrote:
> > This patch factors out common implementation patterns to reduce overall 
> > kernel
> > code and provide a means for host bridge drivers to directly obtain struct
> > resources from the DT's ranges property without relying on architecture 
> > specific
> > DT handling. This will make it easier to write archiecture independent host 
> > bridge
> > drivers and mitigate against further duplication of DT parsing code.
> > 
> > This patch can be used in the following way:
> > 
> > struct of_pci_range_parser parser;
> > struct of_pci_range range;
> > 
> > if (of_pci_range_parser(&parser, np))
> > ; //no ranges property
> > 
> > for_each_of_pci_range(&parser, &range) {
> > 
> > /*
> > directly access properties of the address range, e.g.:
> > range.pci_space, range.pci_addr, range.cpu_addr,
> > range.size, range.flags
> > 
> > alternatively obtain a struct resource, e.g.:
> > struct resource res;
> > of_pci_range_to_resource(&range, np, &res);
> > */
> >     }
> > 
> > Additionally the implementation takes care of adjacent ranges and merges 
> > them
> > into a single range (as was the case with powerpc and microblaze).
> > 
> > Signed-off-by: Andrew Murray 
> > Signed-off-by: Liviu Dudau 
> > Signed-off-by: Thomas Petazzoni 
> > Reviewed-by: Rob Herring 
> > Tested-by: Thomas Petazzoni 
> > Tested-by: Linus Walleij 
> 
> Acked-by: Grant Likely 
> 
> But comments below...
> 
> > ---
> >  drivers/of/address.c   |   67 ++
> >  drivers/of/of_pci.c|  113 
> > 
> >  include/linux/of_address.h |   46 ++
> >  3 files changed, 154 insertions(+), 72 deletions(-)
> > 
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 04da786..6eec70c 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -227,6 +227,73 @@ int of_pci_address_to_resource(struct device_node 
> > *dev, int bar,
> > return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
> >  }
> >  EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
> > +
> > +int of_pci_range_parser(struct of_pci_range_parser *parser,
> > +   struct device_node *node)
> > +{
> > +   const int na = 3, ns = 2;
> > +   int rlen;
> > +
> > +   parser->node = node;
> > +   parser->pna = of_n_addr_cells(node);
> > +   parser->np = parser->pna + na + ns;
> > +
> > +   parser->range = of_get_property(node, "ranges", &rlen);
> > +   if (parser->range == NULL)
> > +   return -ENOENT;
> > +
> > +   parser->end = parser->range + rlen / sizeof(__be32);
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(of_pci_range_parser);
> 
> "of_pci_range_parser_init" would be a clearer name
> 
> > +struct of_pci_range *of_pci_process_ranges(struct of_pci_range_parser 
> > *parser,
> > +   struct of_pci_range *range)
> 
> Similarly, "of_pci_range_parser_one" would be more consistent.
> 
> > +{
> > +   const int na = 3, ns = 2;
> > +
> > +   if (!range)
> > +   return NULL;
> > +
> > +   if (!parser->range || parser->range + parser->np > parser->end)
> > +   return NULL;
> > +
> > +   range->pci_space = parser->range[0];
> > +   range->flags = of_bus_pci_get_flags(parser->range);
> > +   range->pci_addr = of_read_number(parser->range + 1, ns);
> > +   range->cpu_addr = of_translate_address(parser->node,
> > +   parser->range + na);
> > +   range->size = of_read_number(parser->range + parser->pna + na, ns);
> > +
> > +   parser->range += parser->np;
> > +
> > +   /* Now consume following elements while they are contiguous */
> > +   while (parser->range + parser->np <= parser->end) {
> > +   u32 flags, pci_space;
> > +   u64 pci_addr, cpu_addr, size;
> > +
> > +   pci_space = be32_to_cpup(parser->range);
> > +   flags = of_bus_pci_ge

Re: [PATCH v7 3/3] of/pci: mips: convert to common of_pci_range_parser

2013-04-18 Thread Andrew Murray
On Thu, Apr 18, 2013 at 02:45:35PM +0100, Grant Likely wrote:
> On Tue, 16 Apr 2013 11:18:28 +0100, Andrew Murray  
> wrote:
> > This patch converts the pci_load_of_ranges function to use the new common
> > of_pci_range_parser.
> > 
> > Signed-off-by: Andrew Murray 
> > Signed-off-by: Liviu Dudau 
> > Reviewed-by: Rob Herring 
> 
> Looks okay to me, and thank you very much for doing the legwork to merge
> the common code.

No problem, though I think there is more than can be done in this area -
there is a lot of similar PCI code floating about.

> 
> Reviewed-by: Grant Likely 

Thanks for the review.

Andrew Murray
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v7 2/3] of/pci: Provide support for parsing PCI DT ranges property

2013-04-18 Thread Andrew Murray
On Thu, Apr 18, 2013 at 04:29:54PM +0100, Grant Likely wrote:
> On Thu, Apr 18, 2013 at 3:24 PM, Andrew Murray  wrote:
> > On Thu, Apr 18, 2013 at 02:44:01PM +0100, Grant Likely wrote:
> >> On Tue, 16 Apr 2013 11:18:27 +0100, Andrew Murray  
> >> wrote:
> >> > /* Act based on address space type */
> >> > res = NULL;
> >> > -   switch ((pci_space >> 24) & 0x3) {
> >> > -   case 1: /* PCI IO space */
> >> > +   res_type = range.flags & IORESOURCE_TYPE_BITS;
> >> > +   if (res_type == IORESOURCE_IO) {
> >>
> >> Why the change from switch() to an if/else if sequence?
> >
> > Russell King suggested this change - see
> > https://patchwork.kernel.org/patch/2323941/
> 
> Umm, that isn't what that link shows. That link shows the patch
> already changing to an if/else if construct, and rmk is commenting on
> that.

Arh yes sorry about that. I can't find or remember any good reason why I
changed it from a switch to an if/else :\

Andrew Murray
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v8 3/3] of/pci: mips: convert to common of_pci_range_parser

2013-04-22 Thread Andrew Murray
This patch converts the pci_load_of_ranges function to use the new common
of_pci_range_parser.

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Signed-off-by: Gabor Juhos 
Reviewed-by: Rob Herring 
Reviewed-by: Grant Likely 
Tested-by: Linus Walleij 
---
 arch/mips/pci/pci.c |   51 +++
 1 files changed, 19 insertions(+), 32 deletions(-)

diff --git a/arch/mips/pci/pci.c b/arch/mips/pci/pci.c
index 0872f12..4b09ca8 100644
--- a/arch/mips/pci/pci.c
+++ b/arch/mips/pci/pci.c
@@ -122,51 +122,38 @@ static void pcibios_scanbus(struct pci_controller *hose)
 #ifdef CONFIG_OF
 void pci_load_of_ranges(struct pci_controller *hose, struct device_node *node)
 {
-   const __be32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(node);
-   int np = pna + 5;
+   struct of_pci_range range;
+   struct of_pci_range_parser parser;
+   u32 res_type;
 
pr_info("PCI host bridge %s ranges:\n", node->full_name);
-   ranges = of_get_property(node, "ranges", &rlen);
-   if (ranges == NULL)
-   return;
hose->of_node = node;
 
-   while ((rlen -= np * 4) >= 0) {
-   u32 pci_space;
+   if (of_pci_range_parser_init(&parser, node))
+   return;
+
+   for_each_of_pci_range(&parser, &range) {
struct resource *res = NULL;
-   u64 addr, size;
-
-   pci_space = be32_to_cpup(&ranges[0]);
-   addr = of_translate_address(node, ranges + 3);
-   size = of_read_number(ranges + pna + 3, 2);
-   ranges += np;
-   switch ((pci_space >> 24) & 0x3) {
-   case 1: /* PCI IO space */
+
+   switch (range.flags & IORESOURCE_TYPE_BITS) {
+   case IORESOURCE_IO:
pr_info("  IO 0x%016llx..0x%016llx\n",
-   addr, addr + size - 1);
+   range.cpu_addr,
+   range.cpu_addr + range.size - 1);
hose->io_map_base =
-   (unsigned long)ioremap(addr, size);
+   (unsigned long)ioremap(range.cpu_addr,
+  range.size);
res = hose->io_resource;
-   res->flags = IORESOURCE_IO;
break;
-   case 2: /* PCI Memory space */
-   case 3: /* PCI 64 bits Memory space */
+   case IORESOURCE_MEM:
pr_info(" MEM 0x%016llx..0x%016llx\n",
-   addr, addr + size - 1);
+   range.cpu_addr,
+   range.cpu_addr + range.size - 1);
res = hose->mem_resource;
-   res->flags = IORESOURCE_MEM;
break;
}
-   if (res != NULL) {
-   res->start = addr;
-   res->name = node->full_name;
-   res->end = res->start + size - 1;
-   res->parent = NULL;
-   res->sibling = NULL;
-   res->child = NULL;
-   }
+   if (res != NULL)
+   of_pci_range_to_resource(&range, node, res);
}
 }
 #endif
-- 
1.7.0.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v8 0/3] of/pci: Provide common support for PCI DT parsing

2013-04-22 Thread Andrew Murray
This patchset factors out duplicated code associated with parsing PCI
DT "ranges" properties across the architectures and introduces a
"ranges" parser. This parser "of_pci_range_parser" can be used directly
by ARM host bridge drivers enabling them to obtain ranges from device
trees.

I've included the Reviewed-by, Tested-by and Acked-by's received from v5/v6/v7
in this patchset, earlier versions of this patchset (v3) have been tested-by:

Thierry Reding 
Jingoo Han 

I've tested that this patchset builds and runs on ARM and that it builds on
PowerPC, x86_64 and MIPS.

Compared to the v7 sent by Andrew Murray, the following changes have been made
(please note that the first patch is unchanged from v7):

 * Rename of_pci_range_parser to of_pci_range_parser_init and
   of_pci_process_ranges to of_pci_range_parser_one as suggested by Grant
   Likely.

 * Reverted back to using a switch statement instead of if/else in
   pci_process_bridge_OF_ranges. Grant Likely highlighted this change from
   the original code which was unnecessary.

 * Squashed in a patch provided by Gabor Juhos which fixes build errors on
   MIPS found in the last patchset.

Compared to the v6 sent by Andrew Murray, the following changes have
been made in response to build errors/warnings:

 * Inclusion of linux/of_address.h in of_pci.c as suggested by Michal
   Simek to prevent compilation failures on Microblaze (and others) and his
   ack.

 * Use of externs, static inlines and a typo in linux/of_address.h in response
   to linker errors (multiple defination) on x86_64 as spotted by a kbuild test
   robot on (jcooper/linux.git mvebu/drivers)

 * Add EXPORT_SYMBOL_GPL to of_pci_range_parser function to be consistent
   with of_pci_process_ranges function

Compared to the v5 sent by Andrew Murray, the following changes have
been made:

 * Use of CONFIG_64BIT instead of CONFIG_[a32bitarch] as suggested by
   Rob Herring in drivers/of/of_pci.c

 * Added forward declaration of struct pci_controller in linux/of_pci.h
   to prevent compiler warning as suggested by Thomas Petazzoni

 * Improved error checking (!range check), removal of unnecessary be32_to_cpup
   call, improved formatting of struct of_pci_range_parser layout and
   replacement of macro with a static inline. All suggested by Rob Herring.

Compared to the v4 (incorrectly labelled v3) sent by Andrew Murray,
the following changes have been made:

 * Split the patch as suggested by Rob Herring

Compared to the v3 sent by Andrew Murray, the following changes have
been made:

 * Unify and move duplicate pci_process_bridge_OF_ranges functions to
   drivers/of/of_pci.c as suggested by Rob Herring

 * Fix potential build errors with Microblaze/MIPS

Compared to "[PATCH v5 01/17] of/pci: Provide support for parsing PCI DT
ranges property", the following changes have been made:

 * Correct use of IORESOURCE_* as suggested by Russell King

 * Improved interface and naming as suggested by Thierry Reding

Compared to the v2 sent by Andrew Murray, Thomas Petazzoni did:

 * Add a memset() on the struct of_pci_range_iter when starting the
   for loop in for_each_pci_range(). Otherwise, with an uninitialized
   of_pci_range_iter, of_pci_process_ranges() may crash.

 * Add parenthesis around 'res', 'np' and 'iter' in the
   for_each_of_pci_range macro definitions. Otherwise, passing
   something like &foobar as 'res' didn't work.

 * Rebased on top of 3.9-rc2, which required fixing a few conflicts in
   the Microblaze code.

v2:
  This follows on from suggestions made by Grant Likely
  (marc.info/?l=linux-kernel&m=136079602806328)

Andrew Murray (3):
  of/pci: Unify pci_process_bridge_OF_ranges from Microblaze and
PowerPC
  of/pci: Provide support for parsing PCI DT ranges property
  of/pci: mips: convert to common of_pci_range_parser

 arch/microblaze/include/asm/pci-bridge.h |5 +-
 arch/microblaze/pci/pci-common.c |  192 --
 arch/mips/pci/pci.c  |   51 +++-
 arch/powerpc/include/asm/pci-bridge.h|5 +-
 arch/powerpc/kernel/pci-common.c |  192 --
 drivers/of/address.c |   67 +++
 drivers/of/of_pci.c  |  173 +++
 include/linux/of_address.h   |   48 
 include/linux/of_pci.h   |4 +
 9 files changed, 313 insertions(+), 424 deletions(-)

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH v8 2/3] of/pci: Provide support for parsing PCI DT ranges property

2013-04-22 Thread Andrew Murray
This patch factors out common implementation patterns to reduce overall kernel
code and provide a means for host bridge drivers to directly obtain struct
resources from the DT's ranges property without relying on architecture specific
DT handling. This will make it easier to write archiecture independent host 
bridge
drivers and mitigate against further duplication of DT parsing code.

This patch can be used in the following way:

struct of_pci_range_parser parser;
struct of_pci_range range;

if (of_pci_range_parser_init(&parser, np))
; //no ranges property

for_each_of_pci_range(&parser, &range) {

/*
directly access properties of the address range, e.g.:
range.pci_space, range.pci_addr, range.cpu_addr,
range.size, range.flags

alternatively obtain a struct resource, e.g.:
struct resource res;
of_pci_range_to_resource(&range, np, &res);
*/
}

Additionally the implementation takes care of adjacent ranges and merges them
into a single range (as was the case with powerpc and microblaze).

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Signed-off-by: Thomas Petazzoni 
Reviewed-by: Rob Herring 
Tested-by: Thomas Petazzoni 
Tested-by: Linus Walleij 
Acked-by: Grant Likely 
---
 drivers/of/address.c   |   67 ++
 drivers/of/of_pci.c|  113 +---
 include/linux/of_address.h |   48 +++
 3 files changed, 158 insertions(+), 70 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 04da786..fdd0636 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -227,6 +227,73 @@ int of_pci_address_to_resource(struct device_node *dev, 
int bar,
return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
 }
 EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+int of_pci_range_parser_init(struct of_pci_range_parser *parser,
+   struct device_node *node)
+{
+   const int na = 3, ns = 2;
+   int rlen;
+
+   parser->node = node;
+   parser->pna = of_n_addr_cells(node);
+   parser->np = parser->pna + na + ns;
+
+   parser->range = of_get_property(node, "ranges", &rlen);
+   if (parser->range == NULL)
+   return -ENOENT;
+
+   parser->end = parser->range + rlen / sizeof(__be32);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser_init);
+
+struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser 
*parser,
+   struct of_pci_range *range)
+{
+   const int na = 3, ns = 2;
+
+   if (!range)
+   return NULL;
+
+   if (!parser->range || parser->range + parser->np > parser->end)
+   return NULL;
+
+   range->pci_space = parser->range[0];
+   range->flags = of_bus_pci_get_flags(parser->range);
+   range->pci_addr = of_read_number(parser->range + 1, ns);
+   range->cpu_addr = of_translate_address(parser->node,
+   parser->range + na);
+   range->size = of_read_number(parser->range + parser->pna + na, ns);
+
+   parser->range += parser->np;
+
+   /* Now consume following elements while they are contiguous */
+   while (parser->range + parser->np <= parser->end) {
+   u32 flags, pci_space;
+   u64 pci_addr, cpu_addr, size;
+
+   pci_space = be32_to_cpup(parser->range);
+   flags = of_bus_pci_get_flags(parser->range);
+   pci_addr = of_read_number(parser->range + 1, ns);
+   cpu_addr = of_translate_address(parser->node,
+   parser->range + na);
+   size = of_read_number(parser->range + parser->pna + na, ns);
+
+   if (flags != range->flags)
+   break;
+   if (pci_addr != range->pci_addr + range->size ||
+   cpu_addr != range->cpu_addr + range->size)
+   break;
+
+   range->size += size;
+   parser->range += parser->np;
+   }
+
+   return range;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser_one);
+
 #endif /* CONFIG_PCI */
 
 /*
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 1626172..3c49ab2 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #if defined(CONFIG_PPC32) || defined(CONFIG_PPC64) || 
defined(CONFIG_MICROBLAZE)
@@ -82,67 +83,42 @@ EXPORT_SYMBOL_GPL(of_pci_find_child_device);
 void pci_process_bridge_OF_ranges(struct pci_controller *hose,
   

[PATCH v8 1/3] of/pci: Unify pci_process_bridge_OF_ranges from Microblaze and PowerPC

2013-04-22 Thread Andrew Murray
The pci_process_bridge_OF_ranges function, used to parse the "ranges"
property of a PCI host device, is found in both Microblaze and PowerPC
architectures. These implementations are nearly identical. This patch
moves this common code to a common place.

Signed-off-by: Andrew Murray 
Signed-off-by: Liviu Dudau 
Reviewed-by: Rob Herring 
Tested-by: Thomas Petazzoni 
Tested-by: Linus Walleij 
Acked-by: Michal Simek 
Acked-by: Grant Likely 
---
 arch/microblaze/include/asm/pci-bridge.h |5 +-
 arch/microblaze/pci/pci-common.c |  192 
 arch/powerpc/include/asm/pci-bridge.h|5 +-
 arch/powerpc/kernel/pci-common.c |  192 
 drivers/of/of_pci.c  |  200 ++
 include/linux/of_pci.h   |4 +
 6 files changed, 206 insertions(+), 392 deletions(-)

diff --git a/arch/microblaze/include/asm/pci-bridge.h 
b/arch/microblaze/include/asm/pci-bridge.h
index cb5d397..5783cd6 100644
--- a/arch/microblaze/include/asm/pci-bridge.h
+++ b/arch/microblaze/include/asm/pci-bridge.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct device_node;
 
@@ -132,10 +133,6 @@ extern void setup_indirect_pci(struct pci_controller *hose,
 extern struct pci_controller *pci_find_hose_for_OF_device(
struct device_node *node);
 
-/* Fill up host controller resources from the OF node */
-extern void pci_process_bridge_OF_ranges(struct pci_controller *hose,
-   struct device_node *dev, int primary);
-
 /* Allocate & free a PCI host bridge structure */
 extern struct pci_controller *pcibios_alloc_controller(struct device_node 
*dev);
 extern void pcibios_free_controller(struct pci_controller *phb);
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 9ea521e..2735ad9 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -622,198 +622,6 @@ void pci_resource_to_user(const struct pci_dev *dev, int 
bar,
*end = rsrc->end - offset;
 }
 
-/**
- * pci_process_bridge_OF_ranges - Parse PCI bridge resources from device tree
- * @hose: newly allocated pci_controller to be setup
- * @dev: device node of the host bridge
- * @primary: set if primary bus (32 bits only, soon to be deprecated)
- *
- * This function will parse the "ranges" property of a PCI host bridge device
- * node and setup the resource mapping of a pci controller based on its
- * content.
- *
- * Life would be boring if it wasn't for a few issues that we have to deal
- * with here:
- *
- *   - We can only cope with one IO space range and up to 3 Memory space
- * ranges. However, some machines (thanks Apple !) tend to split their
- * space into lots of small contiguous ranges. So we have to coalesce.
- *
- *   - We can only cope with all memory ranges having the same offset
- * between CPU addresses and PCI addresses. Unfortunately, some bridges
- * are setup for a large 1:1 mapping along with a small "window" which
- * maps PCI address 0 to some arbitrary high address of the CPU space in
- * order to give access to the ISA memory hole.
- * The way out of here that I've chosen for now is to always set the
- * offset based on the first resource found, then override it if we
- * have a different offset and the previous was set by an ISA hole.
- *
- *   - Some busses have IO space not starting at 0, which causes trouble with
- * the way we do our IO resource renumbering. The code somewhat deals with
- * it for 64 bits but I would expect problems on 32 bits.
- *
- *   - Some 32 bits platforms such as 4xx can have physical space larger than
- * 32 bits so we need to use 64 bits values for the parsing
- */
-void pci_process_bridge_OF_ranges(struct pci_controller *hose,
- struct device_node *dev, int primary)
-{
-   const u32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(dev);
-   int np = pna + 5;
-   int memno = 0, isa_hole = -1;
-   u32 pci_space;
-   unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
-   unsigned long long isa_mb = 0;
-   struct resource *res;
-
-   pr_info("PCI host bridge %s %s ranges:\n",
-  dev->full_name, primary ? "(primary)" : "");
-
-   /* Get ranges property */
-   ranges = of_get_property(dev, "ranges", &rlen);
-   if (ranges == NULL)
-   return;
-
-   /* Parse it */
-   pr_debug("Parsing ranges property...\n");
-   while ((rlen -= np * 4) >= 0) {
-   /* Read next ranges element */
-   pci_space = ranges[0];
-   pci_addr = of_read_number(ranges + 1, 2);
-   cpu_addr = of_translate_address(dev, ranges + 3);
-   s

Re: [PATCH v7 3/3] of/pci: mips: convert to common of_pci_range_parser

2013-04-22 Thread Andrew Murray
On Sun, Apr 21, 2013 at 08:27:02AM +0100, Gabor Juhos wrote:
> Hi Jason,
> 
> >> Sorry I had no time earlier, but I have tested this now on MIPS. The patch
> >> causes build errors unfortunately. Given the fact that this has been merged
> >> already, I will send a fixup patch.
> > 
> > Olof has dropped this branch from arm-soc, plase post the build error
> > and fix here so that it can be included in this series.
> 
> I have posted the patch to Olof two days ago. It has been CC'd to you as well
> but In case that it does not exists in your mailbox the patch can be found 
> here:
> 
> https://patchwork.linux-mips.org/patch/5196/
> 
> However I can re-post the patch as a reply to this thread if you prefer that.

As this branch was dropped I have updated my patchset to include Grant's recent
feedback - I've also included Gabor's fixes to this patchset (and his sign-off).

If you include this new patchset in your branch the drivers that depend on it
will need to be updated to reflect the new naming of functions as suggested by
Grant.

Thanks,

Andrew Murray
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v7 2/3] of/pci: Provide support for parsing PCI DT ranges property

2013-04-22 Thread Andrew Murray
On Thu, Apr 18, 2013 at 02:44:01PM +0100, Grant Likely wrote:
> On Tue, 16 Apr 2013 11:18:27 +0100, Andrew Murray  
> wrote:

> 
> Acked-by: Grant Likely 
> 
> But comments below...
> 

I've updated the patchset (now v8) to reflect your feedback, after a closer
look...

> > -
> > -   pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
> > -   pci_space, pci_addr);
> > -   pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
> > -   cpu_addr, size);
> > -
> > -   ranges += np;
> > +   pr_debug("pci_space: 0x%08x pci_addr: 0x%016llx ",
> > +   range.pci_space, range.pci_addr);
> > +   pr_debug("cpu_addr: 0x%016llx size: 0x%016llx\n",
> > +   range.cpu_addr, range.size);
> 
> Nit: the patch changed whitespace on the pr_debug() statements, so even
> though the first line of each is identical, they look different in the
> patch.
> 

Actually the first line isn't identical, the original file was inconsistent
with its use of spaces between ':' and '0x%0' - my patch ensured that there
was always a space. I guess this could have been done as a separate patch.

> >  
> > /* If we failed translation or got a zero-sized region
> >  * (some FW try to feed us with non sensical zero sized regions
> >  * such as power3 which look like some kind of attempt
> >  * at exposing the VGA memory hole)
> >  */
> > -   if (cpu_addr == OF_BAD_ADDR || size == 0)
> > +   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
> > continue;
> 
> Can this also be rolled into the parsing iterator?
> 

I decided not to do this. Mainly because ARM drivers use the parser directly
(instead of pci_process_bridge_OF_ranges function) and it seemed perfectly
valid for the parser to return a range of size 0 if that is what was present in
the DT.

> >  
> > -   /* Now consume following elements while they are contiguous */
> > -   for (; rlen >= np * sizeof(u32);
> > -ranges += np, rlen -= np * 4) {
> > -   if (ranges[0] != pci_space)
> > -   break;
> > -   pci_next = of_read_number(ranges + 1, 2);
> > -   cpu_next = of_translate_address(dev, ranges + 3);
> > -   if (pci_next != pci_addr + size ||
> > -   cpu_next != cpu_addr + size)
> > -   break;
> > -   size += of_read_number(ranges + pna + 3, 2);
> > -   }
> > -
> > /* Act based on address space type */
> > res = NULL;
> > -   switch ((pci_space >> 24) & 0x3) {
> > -   case 1: /* PCI IO space */
> > +   res_type = range.flags & IORESOURCE_TYPE_BITS;
> > +   if (res_type == IORESOURCE_IO) {
> 
> Why the change from switch() to an if/else if sequence?

I think this was an artifact of the patches evolution, I've reverted back to
the switch.

> 
> But those are mostly nitpicks. If this is deferred to v3.10 then I would
> suggest fixing them up and posting for another round of review.

Andrew Murray
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[RFC PATCH 0/3] Unify definations of struct pci_controller

2013-04-25 Thread Andrew Murray
PowerPC and Microblaze have nearly identical definations of struct
pci_controller - this patch unifies them in asm-generic to reduce
code duplication and to allow new architectures to reuse.

This patchset follows and depends on "of/pci: Provide common
support for PCI DT parsing" which provided common 'ranges' parsing
code which uses an architecture defined struct pci_controller. This
patch is currently in Jason Coopers mvebu-next/pcie branch.

It is hoped this will pave the way for providing common
implementations of commonly duplicated functions found across the
architectures such as pcibios_alloc|free_controller and
pcibios_setup_phb_resources type functions.

Andrew Murray (3):
  powerpc: Move struct pci_controller to asm-generic
  microblaze: Use asm-generic version of pci_controller
  pci: Use common definations of INDIRECT_TYPE_*

 arch/microblaze/include/asm/pci-bridge.h |   70 +---
 arch/powerpc/include/asm/pci-bridge.h|   82 ---
 arch/powerpc/sysdev/fsl_pci.c|   16 +++---
 arch/powerpc/sysdev/indirect_pci.c   |   20 +++---
 arch/powerpc/sysdev/ppc4xx_pci.c |4 +-
 arch/powerpc/sysdev/xilinx_pci.c |2 +-
 include/asm-generic/pci-bridge.h |   90 ++
 7 files changed, 112 insertions(+), 172 deletions(-)

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[RFC PATCH 1/3] powerpc: Move struct pci_controller to asm-generic

2013-04-25 Thread Andrew Murray
This patch moves struct pci_controller into asm-generic to allow
for use by other architectures thus reducing code duplication in
the kernel.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/include/asm/pci-bridge.h |   87 +---
 include/asm-generic/pci-bridge.h  |   68 +
 2 files changed, 82 insertions(+), 73 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index 205bfba..163bd40 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -8,7 +8,6 @@
  * 2 of the License, or (at your option) any later version.
  */
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -16,85 +15,27 @@
 struct device_node;
 
 /*
- * Structure of a PCI controller (host bridge)
+ * Used for variants of PCI indirect handling and possible quirks:
+ *  SET_CFG_TYPE - used on 4xx or any PHB that does explicit type0/1
+ *  EXT_REG - provides access to PCI-e extended registers
+ *  SURPRESS_PRIMARY_BUS - we suppress the setting of PCI_PRIMARY_BUS
+ *   on Freescale PCI-e controllers since they used the PCI_PRIMARY_BUS
+ *   to determine which bus number to match on when generating type0
+ *   config cycles
+ *  NO_PCIE_LINK - the Freescale PCI-e controllers have issues with
+ *   hanging if we don't have link and try to do config cycles to
+ *   anything but the PHB.  Only allow talking to the PHB if this is
+ *   set.
+ *  BIG_ENDIAN - cfg_addr is a big endian register
+ *  BROKEN_MRM - the 440EPx/GRx chips have an errata that causes hangs on
+ *   the PLB4.  Effectively disable MRM commands by setting this.
  */
-struct pci_controller {
-   struct pci_bus *bus;
-   char is_dynamic;
-#ifdef CONFIG_PPC64
-   int node;
-#endif
-   struct device_node *dn;
-   struct list_head list_node;
-   struct device *parent;
-
-   int first_busno;
-   int last_busno;
-   int self_busno;
-   struct resource busn;
-
-   void __iomem *io_base_virt;
-#ifdef CONFIG_PPC64
-   void *io_base_alloc;
-#endif
-   resource_size_t io_base_phys;
-   resource_size_t pci_io_size;
-
-   /* Some machines (PReP) have a non 1:1 mapping of
-* the PCI memory space in the CPU bus space
-*/
-   resource_size_t pci_mem_offset;
-
-   /* Some machines have a special region to forward the ISA
-* "memory" cycles such as VGA memory regions. Left to 0
-* if unsupported
-*/
-   resource_size_t isa_mem_phys;
-   resource_size_t isa_mem_size;
-
-   struct pci_ops *ops;
-   unsigned int __iomem *cfg_addr;
-   void __iomem *cfg_data;
-
-   /*
-* Used for variants of PCI indirect handling and possible quirks:
-*  SET_CFG_TYPE - used on 4xx or any PHB that does explicit type0/1
-*  EXT_REG - provides access to PCI-e extended registers
-*  SURPRESS_PRIMARY_BUS - we suppress the setting of PCI_PRIMARY_BUS
-*   on Freescale PCI-e controllers since they used the PCI_PRIMARY_BUS
-*   to determine which bus number to match on when generating type0
-*   config cycles
-*  NO_PCIE_LINK - the Freescale PCI-e controllers have issues with
-*   hanging if we don't have link and try to do config cycles to
-*   anything but the PHB.  Only allow talking to the PHB if this is
-*   set.
-*  BIG_ENDIAN - cfg_addr is a big endian register
-*  BROKEN_MRM - the 440EPx/GRx chips have an errata that causes hangs 
on
-*   the PLB4.  Effectively disable MRM commands by setting this.
-*/
 #define PPC_INDIRECT_TYPE_SET_CFG_TYPE 0x0001
 #define PPC_INDIRECT_TYPE_EXT_REG  0x0002
 #define PPC_INDIRECT_TYPE_SURPRESS_PRIMARY_BUS 0x0004
 #define PPC_INDIRECT_TYPE_NO_PCIE_LINK 0x0008
 #define PPC_INDIRECT_TYPE_BIG_ENDIAN   0x0010
 #define PPC_INDIRECT_TYPE_BROKEN_MRM   0x0020
-   u32 indirect_type;
-   /* Currently, we limit ourselves to 1 IO range and 3 mem
-* ranges since the common pci_bus structure can't handle more
-*/
-   struct resource io_resource;
-   struct resource mem_resources[3];
-   int global_number;  /* PCI domain number */
-
-   resource_size_t dma_window_base_cur;
-   resource_size_t dma_window_size;
-
-#ifdef CONFIG_PPC64
-   unsigned long buid;
-
-   void *private_data;
-#endif /* CONFIG_PPC64 */
-};
 
 /* These are used for config access before all the PCI probing
has been done. */
diff --git a/include/asm-generic/pci-bridge.h b/include/asm-generic/pci-bridge.h
index 20db2e5..e58830e 100644
--- a/include/asm-generic/pci-bridge.h
+++ b/include/asm-generic/pci-bridge.h
@@ -9,6 +9,9 @@
 
 #ifdef __KERNEL__
 
+#include 
+#include 
+
 enum {
/* Force re-assigning all resources (ignore firmware
 * setup completely

[RFC PATCH 2/3] microblaze: Use asm-generic version of pci_controller

2013-04-25 Thread Andrew Murray
This patch removes struct pci_controller from Microblaze and instead
uses struct pci_controller from asm-generic.

Signed-off-by: Andrew Murray 
---
 arch/microblaze/include/asm/pci-bridge.h |   75 ++
 include/asm-generic/pci-bridge.h |2 +-
 2 files changed, 16 insertions(+), 61 deletions(-)

diff --git a/arch/microblaze/include/asm/pci-bridge.h 
b/arch/microblaze/include/asm/pci-bridge.h
index 5783cd6..0ee75dc 100644
--- a/arch/microblaze/include/asm/pci-bridge.h
+++ b/arch/microblaze/include/asm/pci-bridge.h
@@ -8,9 +8,9 @@
  * 2 of the License, or (at your option) any later version.
  */
 #include 
-#include 
 #include 
 #include 
+#include 
 
 struct device_node;
 
@@ -25,72 +25,27 @@ static inline int pcibios_vaddr_is_ioport(void __iomem 
*address)
 #endif
 
 /*
- * Structure of a PCI controller (host bridge)
+ * Used for variants of PCI indirect handling and possible quirks:
+ *  SET_CFG_TYPE - used on 4xx or any PHB that does explicit type0/1
+ *  EXT_REG - provides access to PCI-e extended registers
+ *  SURPRESS_PRIMARY_BUS - we suppress the setting of PCI_PRIMARY_BUS
+ *   on Freescale PCI-e controllers since they used the PCI_PRIMARY_BUS
+ *   to determine which bus number to match on when generating type0
+ *   config cycles
+ *  NO_PCIE_LINK - the Freescale PCI-e controllers have issues with
+ *   hanging if we don't have link and try to do config cycles to
+ *   anything but the PHB.  Only allow talking to the PHB if this is
+ *   set.
+ *  BIG_ENDIAN - cfg_addr is a big endian register
+ *  BROKEN_MRM - the 440EPx/GRx chips have an errata that causes hangs
+ *   on the PLB4.  Effectively disable MRM commands by setting this.
  */
-struct pci_controller {
-   struct pci_bus *bus;
-   char is_dynamic;
-   struct device_node *dn;
-   struct list_head list_node;
-   struct device *parent;
-
-   int first_busno;
-   int last_busno;
-
-   int self_busno;
-
-   void __iomem *io_base_virt;
-   resource_size_t io_base_phys;
-
-   resource_size_t pci_io_size;
-
-   /* Some machines (PReP) have a non 1:1 mapping of
-* the PCI memory space in the CPU bus space
-*/
-   resource_size_t pci_mem_offset;
-
-   /* Some machines have a special region to forward the ISA
-* "memory" cycles such as VGA memory regions. Left to 0
-* if unsupported
-*/
-   resource_size_t isa_mem_phys;
-   resource_size_t isa_mem_size;
-
-   struct pci_ops *ops;
-   unsigned int __iomem *cfg_addr;
-   void __iomem *cfg_data;
-
-   /*
-* Used for variants of PCI indirect handling and possible quirks:
-*  SET_CFG_TYPE - used on 4xx or any PHB that does explicit type0/1
-*  EXT_REG - provides access to PCI-e extended registers
-*  SURPRESS_PRIMARY_BUS - we suppress the setting of PCI_PRIMARY_BUS
-*   on Freescale PCI-e controllers since they used the PCI_PRIMARY_BUS
-*   to determine which bus number to match on when generating type0
-*   config cycles
-*  NO_PCIE_LINK - the Freescale PCI-e controllers have issues with
-*   hanging if we don't have link and try to do config cycles to
-*   anything but the PHB.  Only allow talking to the PHB if this is
-*   set.
-*  BIG_ENDIAN - cfg_addr is a big endian register
-*  BROKEN_MRM - the 440EPx/GRx chips have an errata that causes hangs
-*   on the PLB4.  Effectively disable MRM commands by setting this.
-*/
 #define INDIRECT_TYPE_SET_CFG_TYPE 0x0001
 #define INDIRECT_TYPE_EXT_REG  0x0002
 #define INDIRECT_TYPE_SURPRESS_PRIMARY_BUS 0x0004
 #define INDIRECT_TYPE_NO_PCIE_LINK 0x0008
 #define INDIRECT_TYPE_BIG_ENDIAN   0x0010
 #define INDIRECT_TYPE_BROKEN_MRM   0x0020
-   u32 indirect_type;
-
-   /* Currently, we limit ourselves to 1 IO range and 3 mem
-* ranges since the common pci_bus structure can't handle more
-*/
-   struct resource io_resource;
-   struct resource mem_resources[3];
-   int global_number;  /* PCI domain number */
-};
 
 #ifdef CONFIG_PCI
 static inline struct pci_controller *pci_bus_to_host(const struct pci_bus *bus)
diff --git a/include/asm-generic/pci-bridge.h b/include/asm-generic/pci-bridge.h
index e58830e..1a7f96d 100644
--- a/include/asm-generic/pci-bridge.h
+++ b/include/asm-generic/pci-bridge.h
@@ -46,7 +46,7 @@ struct device_node;
 /*
  * Structure of a PCI controller (host bridge)
  */
-#if defined(CONFIG_PPC32) || defined(CONFIG_PPC64)
+#if defined(CONFIG_PPC32) || defined(CONFIG_PPC64) || 
defined(CONFIG_MICROBLAZE)
 struct pci_controller {
struct pci_bus *bus;
char is_dynamic;
-- 
1.7.0.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[RFC PATCH 3/3] pci: Use common definations of INDIRECT_TYPE_*

2013-04-25 Thread Andrew Murray
This patch unifies similar definations of INDIRECT_TYPE_* between
PowerPC and Microblaze.

Signed-off-by: Andrew Murray 
---
 arch/microblaze/include/asm/pci-bridge.h |   23 ---
 arch/powerpc/include/asm/pci-bridge.h|   23 ---
 arch/powerpc/sysdev/fsl_pci.c|   16 
 arch/powerpc/sysdev/indirect_pci.c   |   20 ++--
 arch/powerpc/sysdev/ppc4xx_pci.c |4 ++--
 arch/powerpc/sysdev/xilinx_pci.c |2 +-
 include/asm-generic/pci-bridge.h |   22 ++
 7 files changed, 43 insertions(+), 67 deletions(-)

diff --git a/arch/microblaze/include/asm/pci-bridge.h 
b/arch/microblaze/include/asm/pci-bridge.h
index 0ee75dc..acf8252 100644
--- a/arch/microblaze/include/asm/pci-bridge.h
+++ b/arch/microblaze/include/asm/pci-bridge.h
@@ -24,29 +24,6 @@ static inline int pcibios_vaddr_is_ioport(void __iomem 
*address)
 }
 #endif
 
-/*
- * Used for variants of PCI indirect handling and possible quirks:
- *  SET_CFG_TYPE - used on 4xx or any PHB that does explicit type0/1
- *  EXT_REG - provides access to PCI-e extended registers
- *  SURPRESS_PRIMARY_BUS - we suppress the setting of PCI_PRIMARY_BUS
- *   on Freescale PCI-e controllers since they used the PCI_PRIMARY_BUS
- *   to determine which bus number to match on when generating type0
- *   config cycles
- *  NO_PCIE_LINK - the Freescale PCI-e controllers have issues with
- *   hanging if we don't have link and try to do config cycles to
- *   anything but the PHB.  Only allow talking to the PHB if this is
- *   set.
- *  BIG_ENDIAN - cfg_addr is a big endian register
- *  BROKEN_MRM - the 440EPx/GRx chips have an errata that causes hangs
- *   on the PLB4.  Effectively disable MRM commands by setting this.
- */
-#define INDIRECT_TYPE_SET_CFG_TYPE 0x0001
-#define INDIRECT_TYPE_EXT_REG  0x0002
-#define INDIRECT_TYPE_SURPRESS_PRIMARY_BUS 0x0004
-#define INDIRECT_TYPE_NO_PCIE_LINK 0x0008
-#define INDIRECT_TYPE_BIG_ENDIAN   0x0010
-#define INDIRECT_TYPE_BROKEN_MRM   0x0020
-
 #ifdef CONFIG_PCI
 static inline struct pci_controller *pci_bus_to_host(const struct pci_bus *bus)
 {
diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index 163bd40..b2bbf05 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -14,29 +14,6 @@
 
 struct device_node;
 
-/*
- * Used for variants of PCI indirect handling and possible quirks:
- *  SET_CFG_TYPE - used on 4xx or any PHB that does explicit type0/1
- *  EXT_REG - provides access to PCI-e extended registers
- *  SURPRESS_PRIMARY_BUS - we suppress the setting of PCI_PRIMARY_BUS
- *   on Freescale PCI-e controllers since they used the PCI_PRIMARY_BUS
- *   to determine which bus number to match on when generating type0
- *   config cycles
- *  NO_PCIE_LINK - the Freescale PCI-e controllers have issues with
- *   hanging if we don't have link and try to do config cycles to
- *   anything but the PHB.  Only allow talking to the PHB if this is
- *   set.
- *  BIG_ENDIAN - cfg_addr is a big endian register
- *  BROKEN_MRM - the 440EPx/GRx chips have an errata that causes hangs on
- *   the PLB4.  Effectively disable MRM commands by setting this.
- */
-#define PPC_INDIRECT_TYPE_SET_CFG_TYPE 0x0001
-#define PPC_INDIRECT_TYPE_EXT_REG  0x0002
-#define PPC_INDIRECT_TYPE_SURPRESS_PRIMARY_BUS 0x0004
-#define PPC_INDIRECT_TYPE_NO_PCIE_LINK 0x0008
-#define PPC_INDIRECT_TYPE_BIG_ENDIAN   0x0010
-#define PPC_INDIRECT_TYPE_BROKEN_MRM   0x0020
-
 /* These are used for config access before all the PCI probing
has been done. */
 extern int early_read_config_byte(struct pci_controller *hose, int bus,
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 682084d..d73f94a 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -399,7 +399,7 @@ void fsl_pcibios_fixup_bus(struct pci_bus *bus)
 
if (fsl_pcie_bus_fixup)
is_pcie = early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP);
-   no_link = !!(hose->indirect_type & PPC_INDIRECT_TYPE_NO_PCIE_LINK);
+   no_link = !!(hose->indirect_type & INDIRECT_TYPE_NO_PCIE_LINK);
 
if (bus->parent == hose->bus && (is_pcie || no_link)) {
for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; ++i) {
@@ -462,7 +462,7 @@ int __init fsl_add_bridge(struct platform_device *pdev, int 
is_primary)
hose->last_busno = bus_range ? bus_range[1] : 0xff;
 
setup_indirect_pci(hose, rsrc.start, rsrc.start + 0x4,
-   PPC_INDIRECT_TYPE_BIG_ENDIAN);
+   INDIRECT_TYPE_BIG_ENDIAN);
 
if (early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP)) {
/* For PCIE read HEADER_TYPE to identify contro

[PATCH] powerpc/pci: Use of_pci_range_parser helper in pci_process_bridge_OF_ranges

2014-02-24 Thread Andrew Murray
This patch updates the implementation of pci_process_bridge_OF_ranges to use
the of_pci_range_parser helpers.

Signed-off-by: Andrew Murray 
---
I've verified that this builds, however I have no hardware to test this.
---
 arch/powerpc/kernel/pci-common.c | 88 +---
 1 file changed, 29 insertions(+), 59 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index d9476c1..a05fe18 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -666,60 +666,36 @@ void pci_resource_to_user(const struct pci_dev *dev, int 
bar,
 void pci_process_bridge_OF_ranges(struct pci_controller *hose,
  struct device_node *dev, int primary)
 {
-   const __be32 *ranges;
-   int rlen;
-   int pna = of_n_addr_cells(dev);
-   int np = pna + 5;
int memno = 0;
-   u32 pci_space;
-   unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
struct resource *res;
+   struct of_pci_range range;
+   struct of_pci_range_parser parser;
 
printk(KERN_INFO "PCI host bridge %s %s ranges:\n",
   dev->full_name, primary ? "(primary)" : "");
 
-   /* Get ranges property */
-   ranges = of_get_property(dev, "ranges", &rlen);
-   if (ranges == NULL)
+   /* Check for ranges property */
+   if (of_pci_range_parser_init(&parser, dev))
return;
 
/* Parse it */
-   while ((rlen -= np * 4) >= 0) {
-   /* Read next ranges element */
-   pci_space = of_read_number(ranges, 1);
-   pci_addr = of_read_number(ranges + 1, 2);
-   cpu_addr = of_translate_address(dev, ranges + 3);
-   size = of_read_number(ranges + pna + 3, 2);
-   ranges += np;
-
+   for_each_of_pci_range(&parser, &range) {
/* If we failed translation or got a zero-sized region
 * (some FW try to feed us with non sensical zero sized regions
 * such as power3 which look like some kind of attempt at 
exposing
 * the VGA memory hole)
 */
-   if (cpu_addr == OF_BAD_ADDR || size == 0)
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
 
-   /* Now consume following elements while they are contiguous */
-   for (; rlen >= np * sizeof(u32);
-ranges += np, rlen -= np * 4) {
-   if (of_read_number(ranges, 1) != pci_space)
-   break;
-   pci_next = of_read_number(ranges + 1, 2);
-   cpu_next = of_translate_address(dev, ranges + 3);
-   if (pci_next != pci_addr + size ||
-   cpu_next != cpu_addr + size)
-   break;
-   size += of_read_number(ranges + pna + 3, 2);
-   }
-
/* Act based on address space type */
res = NULL;
-   switch ((pci_space >> 24) & 0x3) {
-   case 1: /* PCI IO space */
+   switch (range.flags & IORESOURCE_TYPE_BITS) {
+   case IORESOURCE_IO:
printk(KERN_INFO
   "  IO 0x%016llx..0x%016llx -> 0x%016llx\n",
-  cpu_addr, cpu_addr + size - 1, pci_addr);
+  range.cpu_addr, range.cpu_addr + range.size - 1,
+  range.pci_addr);
 
/* We support only one IO range */
if (hose->pci_io_size) {
@@ -729,11 +705,12 @@ void pci_process_bridge_OF_ranges(struct pci_controller 
*hose,
}
 #ifdef CONFIG_PPC32
/* On 32 bits, limit I/O space to 16MB */
-   if (size > 0x0100)
-   size = 0x0100;
+   if (range.size > 0x0100)
+   range.size = 0x0100;
 
/* 32 bits needs to map IOs here */
-   hose->io_base_virt = ioremap(cpu_addr, size);
+   hose->io_base_virt = ioremap(range.cpu_addr,
+   range.size);
 
/* Expect trouble if pci_addr is not 0 */
if (primary)
@@ -743,20 +720,20 @@ void pci_process_bridge_OF_ranges(struct pci_controller 
*hose,
/* pci_io_size and io_base_phys always represent IO
 * space starting at 0 so we factor in pci_addr
 */
-   hose->pci_io_size = pci_addr + size;
- 

Re: [PATCH] powerpc/pci: Use of_pci_range_parser helper in pci_process_bridge_OF_ranges

2014-02-25 Thread Andrew Murray
On 25 February 2014 13:25, Benjamin Herrenschmidt
 wrote:
> On Tue, 2014-02-25 at 06:32 +0000, Andrew Murray wrote:
>> This patch updates the implementation of pci_process_bridge_OF_ranges to use
>> the of_pci_range_parser helpers.
>>
>> Signed-off-by: Andrew Murray 
>> ---
>> I've verified that this builds, however I have no hardware to test this.
>> ---
>
> Thanks. A cursory review looks good but I need to spend a bit more time
> making sure our various special cases are handled properly.
>
> It's tracked on patchwork so unless you have an update to the patch,
> it won't be lost, but it might take a little while before I get to
> actually merge it.

Thanks for the response - Yes it's easy to screw this stuff up.

Please note that some of the special cases are handled by the parser
helper e.g. consumption of contiguous ranges, assignment to 'struct
resource', etc.

It should also be pointed out that this is now once again very similar
to the Microblaze implementation.

Thanks,

Andrew Murray

>
> Cheers,
> Ben.
>
>>  arch/powerpc/kernel/pci-common.c | 88 
>> +---
>>  1 file changed, 29 insertions(+), 59 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/pci-common.c 
>> b/arch/powerpc/kernel/pci-common.c
>> index d9476c1..a05fe18 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -666,60 +666,36 @@ void pci_resource_to_user(const struct pci_dev *dev, 
>> int bar,
>>  void pci_process_bridge_OF_ranges(struct pci_controller *hose,
>> struct device_node *dev, int primary)
>>  {
>> - const __be32 *ranges;
>> - int rlen;
>> - int pna = of_n_addr_cells(dev);
>> - int np = pna + 5;
>>   int memno = 0;
>> - u32 pci_space;
>> - unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
>>   struct resource *res;
>> + struct of_pci_range range;
>> + struct of_pci_range_parser parser;
>>
>>   printk(KERN_INFO "PCI host bridge %s %s ranges:\n",
>>  dev->full_name, primary ? "(primary)" : "");
>>
>> - /* Get ranges property */
>> - ranges = of_get_property(dev, "ranges", &rlen);
>> - if (ranges == NULL)
>> + /* Check for ranges property */
>> + if (of_pci_range_parser_init(&parser, dev))
>>   return;
>>
>>   /* Parse it */
>> - while ((rlen -= np * 4) >= 0) {
>> - /* Read next ranges element */
>> - pci_space = of_read_number(ranges, 1);
>> - pci_addr = of_read_number(ranges + 1, 2);
>> - cpu_addr = of_translate_address(dev, ranges + 3);
>> - size = of_read_number(ranges + pna + 3, 2);
>> - ranges += np;
>> -
>> + for_each_of_pci_range(&parser, &range) {
>>   /* If we failed translation or got a zero-sized region
>>* (some FW try to feed us with non sensical zero sized regions
>>* such as power3 which look like some kind of attempt at 
>> exposing
>>* the VGA memory hole)
>>*/
>> - if (cpu_addr == OF_BAD_ADDR || size == 0)
>> + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
>>   continue;
>>
>> - /* Now consume following elements while they are contiguous */
>> - for (; rlen >= np * sizeof(u32);
>> -  ranges += np, rlen -= np * 4) {
>> - if (of_read_number(ranges, 1) != pci_space)
>> - break;
>> - pci_next = of_read_number(ranges + 1, 2);
>> - cpu_next = of_translate_address(dev, ranges + 3);
>> - if (pci_next != pci_addr + size ||
>> - cpu_next != cpu_addr + size)
>> - break;
>> - size += of_read_number(ranges + pna + 3, 2);
>> - }
>> -
>>   /* Act based on address space type */
>>   res = NULL;
>> - switch ((pci_space >> 24) & 0x3) {
>> - case 1: /* PCI IO space */
>> + switch (range.flags & IORESOURCE_TYPE_BITS) {
>> + case IORESOURCE_IO:
>>   printk(KERN_INFO
>>  "  IO 0x%016llx..0x%016llx -> 0x%016llx\n"

[PATCH v1 0/7] PCI: dt: Remove magic numbers for legacy PCI IRQ interrupts

2019-11-04 Thread Andrew Murray
PCI devices can trigger interrupts via 4 physical/virtual lines known
as INTA, INTB, INTC or INTD. Due to interrupt swizzling it is often
required to describe the interrupt mapping in the device tree. Let's
avoid the existing magic numbers and replace them with a #define to
improve clarity.

Based on v5.4-rc5, compile tested

Signed-off-by: Andrew Murray 


Andrew Murray (7):
  PCI: dt: Add legacy PCI IRQ defines
  arm64: dts: Use IRQ flags for legacy PCI IRQ interrupts
  arm: dts: Use IRQ flags for legacy PCI IRQ interrupts
  xtensa: dts: Use IRQ flags for legacy PCI IRQ interrupts
  powerpc: dts: fsl: Use IRQ flags for legacy PCI IRQ interrupts
  powerpc: dts: Use IRQ flags for legacy PCI IRQ interrupts
  dt-bindings: PCI: Use IRQ flags for legacy PCI IRQ interrupts

 .../devicetree/bindings/pci/83xx-512x-pci.txt |  18 +--
 .../devicetree/bindings/pci/aardvark-pci.txt  |  10 +-
 .../devicetree/bindings/pci/altera-pcie.txt   |  10 +-
 .../bindings/pci/axis,artpec6-pcie.txt|  10 +-
 .../bindings/pci/cdns,cdns-pcie-host.txt  |  10 +-
 .../bindings/pci/faraday,ftpci100.txt |  68 
 .../bindings/pci/fsl,imx6q-pcie.txt   |  10 +-
 .../bindings/pci/hisilicon-pcie.txt   |  20 +--
 .../bindings/pci/host-generic-pci.txt |  10 +-
 .../devicetree/bindings/pci/kirin-pcie.txt|  10 +-
 .../bindings/pci/layerscape-pci.txt   |  10 +-
 .../devicetree/bindings/pci/mediatek-pcie.txt |  40 ++---
 .../devicetree/bindings/pci/mobiveil-pcie.txt |   8 +-
 .../devicetree/bindings/pci/pci-rcar-gen2.txt |   8 +-
 .../bindings/pci/pci-thunder-pem.txt  |  10 +-
 .../devicetree/bindings/pci/pcie-al.txt   |   4 +-
 .../devicetree/bindings/pci/qcom,pcie.txt |  20 +--
 .../bindings/pci/ralink,rt3883-pci.txt|  18 +--
 .../bindings/pci/rockchip-pcie-host.txt   |  10 +-
 .../devicetree/bindings/pci/ti-pci.txt|  10 +-
 .../devicetree/bindings/pci/uniphier-pcie.txt |  10 +-
 .../bindings/pci/v3-v360epc-pci.txt   |  34 ++--
 .../devicetree/bindings/pci/versatile.txt |  40 ++---
 .../devicetree/bindings/pci/xgene-pci-msi.txt |  10 +-
 .../devicetree/bindings/pci/xgene-pci.txt |  10 +-
 .../bindings/pci/xilinx-nwl-pcie.txt  |  10 +-
 .../devicetree/bindings/pci/xilinx-pcie.txt   |  20 +--
 arch/arm/boot/dts/alpine.dtsi |   6 +-
 arch/arm/boot/dts/artpec6.dtsi|  10 +-
 arch/arm/boot/dts/gemini-dlink-dir-685.dts|  34 ++--
 arch/arm/boot/dts/gemini-sl93512r.dts |  34 ++--
 arch/arm/boot/dts/gemini-sq201.dts|  34 ++--
 arch/arm/boot/dts/gemini-wbd111.dts   |  34 ++--
 arch/arm/boot/dts/gemini-wbd222.dts   |  34 ++--
 arch/arm/boot/dts/imx6qdl.dtsi|  10 +-
 arch/arm/boot/dts/imx6sx.dtsi |  10 +-
 arch/arm/boot/dts/integratorap.dts|  36 +++--
 arch/arm/boot/dts/keystone-k2e.dtsi   |  11 +-
 arch/arm/boot/dts/keystone.dtsi   |  10 +-
 arch/arm/boot/dts/qcom-apq8064.dtsi   |  10 +-
 arch/arm/boot/dts/qcom-ipq4019.dtsi   |  10 +-
 arch/arm/boot/dts/versatile-pb.dts|  36 +++--
 arch/arm64/boot/dts/al/alpine-v2.dtsi |   6 +-
 .../boot/dts/amd/amd-overdrive-rev-b0.dts |   2 +-
 .../boot/dts/amd/amd-overdrive-rev-b1.dts |   2 +-
 arch/arm64/boot/dts/amd/amd-overdrive.dts |   2 +-
 arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi  |  12 +-
 arch/arm64/boot/dts/amd/husky.dts |   2 +-
 arch/arm64/boot/dts/arm/fvp-base-revc.dts |  10 +-
 arch/arm64/boot/dts/arm/juno-base.dtsi|  12 +-
 arch/arm64/boot/dts/cavium/thunder2-99xx.dtsi |  10 +-
 .../arm64/boot/dts/freescale/fsl-ls1012a.dtsi |  10 +-
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi |  10 +-
 arch/arm64/boot/dts/hisilicon/hip06.dtsi  |  10 +-
 arch/arm64/boot/dts/qcom/msm8998.dtsi |  10 +-
 arch/arm64/boot/dts/qcom/qcs404.dtsi  |  10 +-
 arch/arm64/boot/dts/rockchip/rk3399.dtsi  |  10 +-
 .../boot/dts/socionext/uniphier-ld20.dtsi |  11 +-
 .../boot/dts/socionext/uniphier-pxs3.dtsi |  11 +-
 arch/arm64/boot/dts/xilinx/zynqmp.dtsi|  12 +-
 arch/powerpc/boot/dts/bluestone.dts   |  12 +-
 arch/powerpc/boot/dts/charon.dts  |  12 +-
 arch/powerpc/boot/dts/digsy_mtc.dts   |  12 +-
 arch/powerpc/boot/dts/fsl/b4420qds.dts|   4 +-
 arch/powerpc/boot/dts/fsl/b4420si-post.dtsi   |   2 +-
 arch/powerpc/boot/dts/fsl/b4860qds.dts|   4 +-
 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi   |   2 +-
 arch/powerpc/boot/dts/fsl/b4qds.dtsi  |   2 +-
 arch/powerpc/boot/dts/fsl/b4si-post.dtsi  |  12 +-
 arch/powerpc/boot/dts/fsl/bsc9132qds.dts  |   2 +-
 arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi |  12 +-
 arch/powerpc/boot/dts/fsl/c293pcie.dts|   2 +-
 arch/powerpc/boot/dts/fsl/c293si-post.dtsi|  12 +-
 arch/powerpc/boot/dts/fsl/gef_sbc310.dts  |  12 +-
 arch/powerpc/boot/dt

[PATCH v1 5/7] powerpc: dts: fsl: Use IRQ flags for legacy PCI IRQ interrupts

2019-11-04 Thread Andrew Murray
Replace magic numbers used to describe legacy PCI IRQ interrupts
with #define.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/boot/dts/fsl/b4420qds.dts|   4 +-
 arch/powerpc/boot/dts/fsl/b4420si-post.dtsi   |   2 +-
 arch/powerpc/boot/dts/fsl/b4860qds.dts|   4 +-
 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi   |   2 +-
 arch/powerpc/boot/dts/fsl/b4qds.dtsi  |   2 +-
 arch/powerpc/boot/dts/fsl/b4si-post.dtsi  |  12 +-
 arch/powerpc/boot/dts/fsl/bsc9132qds.dts  |   2 +-
 arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi |  12 +-
 arch/powerpc/boot/dts/fsl/c293pcie.dts|   2 +-
 arch/powerpc/boot/dts/fsl/c293si-post.dtsi|  12 +-
 arch/powerpc/boot/dts/fsl/gef_sbc310.dts  |  12 +-
 arch/powerpc/boot/dts/fsl/mpc8536ds.dts   |  12 +-
 arch/powerpc/boot/dts/fsl/mpc8536ds_36b.dts   |  12 +-
 arch/powerpc/boot/dts/fsl/mpc8540ads.dts  | 100 ++--
 arch/powerpc/boot/dts/fsl/mpc8544ds.dts   |  22 +--
 arch/powerpc/boot/dts/fsl/mpc8544ds.dtsi  |  22 +--
 arch/powerpc/boot/dts/fsl/mpc8548cds_32b.dts  |  14 +-
 arch/powerpc/boot/dts/fsl/mpc8548cds_36b.dts  |  14 +-
 arch/powerpc/boot/dts/fsl/mpc8548si-post.dtsi |  12 +-
 arch/powerpc/boot/dts/fsl/mpc8560ads.dts  | 100 ++--
 arch/powerpc/boot/dts/fsl/mpc8568mds.dts  |  22 +--
 arch/powerpc/boot/dts/fsl/mpc8568si-post.dtsi |  12 +-
 arch/powerpc/boot/dts/fsl/mpc8569mds.dts  |   2 +-
 arch/powerpc/boot/dts/fsl/mpc8569si-post.dtsi |  12 +-
 arch/powerpc/boot/dts/fsl/mpc8641_hpcn.dts| 150 +-
 .../powerpc/boot/dts/fsl/mpc8641_hpcn_36b.dts | 150 +-
 arch/powerpc/boot/dts/fsl/p2020ds.dts |   2 +-
 arch/powerpc/boot/dts/fsl/p2020ds.dtsi|  46 +++---
 arch/powerpc/boot/dts/fsl/ppa8548.dts |   2 +-
 arch/powerpc/boot/dts/fsl/sbc8641d.dts|   4 +-
 30 files changed, 408 insertions(+), 368 deletions(-)

diff --git a/arch/powerpc/boot/dts/fsl/b4420qds.dts 
b/arch/powerpc/boot/dts/fsl/b4420qds.dts
index cd9203ceedc0..11b6a5147538 100644
--- a/arch/powerpc/boot/dts/fsl/b4420qds.dts
+++ b/arch/powerpc/boot/dts/fsl/b4420qds.dts
@@ -33,7 +33,7 @@
  */
 
 /include/ "b4420si-pre.dtsi"
-/include/ "b4qds.dtsi"
+#include "b4qds.dtsi"
 
 / {
model = "fsl,B4420QDS";
@@ -47,4 +47,4 @@
 
 };
 
-/include/ "b4420si-post.dtsi"
+#include "b4420si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi
index f996cced45e0..981585dc9026 100644
--- a/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi
@@ -32,7 +32,7 @@
  * this software, even if advised of the possibility of such damage.
  */
 
-/include/ "b4si-post.dtsi"
+#include "b4si-post.dtsi"
 
 /* controller at 0x20 */
 &pci0 {
diff --git a/arch/powerpc/boot/dts/fsl/b4860qds.dts 
b/arch/powerpc/boot/dts/fsl/b4860qds.dts
index a8bc419959ca..9cad8d3f3165 100644
--- a/arch/powerpc/boot/dts/fsl/b4860qds.dts
+++ b/arch/powerpc/boot/dts/fsl/b4860qds.dts
@@ -33,7 +33,7 @@
  */
 
 /include/ "b4860si-pre.dtsi"
-/include/ "b4qds.dtsi"
+#include "b4qds.dtsi"
 
 / {
model = "fsl,B4860QDS";
@@ -114,4 +114,4 @@
};
 };
 
-/include/ "b4860si-post.dtsi"
+#include "b4860si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
index 868719821106..8d99df9a0259 100644
--- a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
@@ -32,7 +32,7 @@
  * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-/include/ "b4si-post.dtsi"
+#include "b4si-post.dtsi"
 
 /* controller at 0x20 */
 &pci0 {
diff --git a/arch/powerpc/boot/dts/fsl/b4qds.dtsi 
b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
index 05be919f3545..0fd5a51942a3 100644
--- a/arch/powerpc/boot/dts/fsl/b4qds.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
@@ -277,4 +277,4 @@
};
 };
 
-/include/ "b4si-post.dtsi"
+#include "b4si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/fsl/b4si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/b4si-post.dtsi
index 4f044b41a776..b8c0cfe342ff 100644
--- a/arch/powerpc/boot/dts/fsl/b4si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4si-post.dtsi
@@ -32,6 +32,8 @@
  * this software, even if advised of the possibility of such damage.
  */
 
+#include 
+
 &bman_fbpr {
compatible = "fsl,bman-fbpr";
alloc-ranges = <0 0 0x1 0>;
@@ -70,13 +72,13 @@
device_type = "pci";
reg = <0 0 0 0 0>;
interrupts = <20 2 0 0>;
-   interrupt-map-mask = <0xf800 0 0 7>;
+   interrupt-map-mask = <0xf800 0 0 IRQ_INT_ALL>;
interrupt-map = <
/* IDSEL 0x0 */

[PATCH v1 6/7] powerpc: dts: Use IRQ flags for legacy PCI IRQ interrupts

2019-11-04 Thread Andrew Murray
Replace magic numbers used to describe legacy PCI IRQ interrupts
with #define.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/boot/dts/bluestone.dts   | 12 +++--
 arch/powerpc/boot/dts/charon.dts  | 12 +++--
 arch/powerpc/boot/dts/digsy_mtc.dts   | 12 +++--
 arch/powerpc/boot/dts/haleakala.dts   | 12 +++--
 arch/powerpc/boot/dts/holly.dts   | 42 
 arch/powerpc/boot/dts/hotfoot.dts | 12 +++--
 arch/powerpc/boot/dts/kuroboxHD.dts   | 28 ++-
 arch/powerpc/boot/dts/kuroboxHG.dts   | 28 ++-
 arch/powerpc/boot/dts/lite5200.dts| 12 +++--
 arch/powerpc/boot/dts/lite5200b.dts   | 22 +
 arch/powerpc/boot/dts/media5200.dts   | 26 +-
 arch/powerpc/boot/dts/mpc5121ads.dts  | 20 
 arch/powerpc/boot/dts/mpc8308rdb.dts  | 12 +++--
 arch/powerpc/boot/dts/mpc8313erdb.dts | 20 
 arch/powerpc/boot/dts/mpc832x_mds.dts | 60 ---
 arch/powerpc/boot/dts/mpc832x_rdb.dts | 22 +
 arch/powerpc/boot/dts/mpc8349emitxgp.dts  |  8 +--
 arch/powerpc/boot/dts/mpc836x_mds.dts | 60 ---
 arch/powerpc/boot/dts/mpc836x_rdk.dts | 16 +++---
 arch/powerpc/boot/dts/mucmc52.dts | 12 +++--
 arch/powerpc/boot/dts/mvme5100.dts| 48 +-
 arch/powerpc/boot/dts/pcm030.dts  | 22 +
 arch/powerpc/boot/dts/pcm032.dts  | 22 +
 arch/powerpc/boot/dts/pq2fads.dts | 28 ++-
 arch/powerpc/boot/dts/socrates.dts|  8 +--
 arch/powerpc/boot/dts/storcenter.dts  | 28 ++-
 arch/powerpc/boot/dts/stx_gp3_8560.dts| 36 +++---
 arch/powerpc/boot/dts/taishan.dts | 20 
 arch/powerpc/boot/dts/tqm5200.dts | 12 +++--
 arch/powerpc/boot/dts/tqm8540.dts | 16 +++---
 arch/powerpc/boot/dts/tqm8541.dts | 16 +++---
 arch/powerpc/boot/dts/tqm8555.dts | 16 +++---
 arch/powerpc/boot/dts/tqm8560.dts | 16 +++---
 arch/powerpc/boot/dts/virtex440-ml510.dts | 43 
 arch/powerpc/boot/dts/xcalibur1501.dts| 13 +++--
 arch/powerpc/boot/dts/xpedite5200.dts |  8 +--
 36 files changed, 437 insertions(+), 363 deletions(-)

diff --git a/arch/powerpc/boot/dts/bluestone.dts 
b/arch/powerpc/boot/dts/bluestone.dts
index cc965a1816b6..851b6f764ec3 100644
--- a/arch/powerpc/boot/dts/bluestone.dts
+++ b/arch/powerpc/boot/dts/bluestone.dts
@@ -8,6 +8,8 @@
 
 /dts-v1/;
 
+#include 
+
 / {
#address-cells = <2>;
#size-cells = <1>;
@@ -359,12 +361,12 @@
 * below are basically de-swizzled numbers.
 * The real slot is on idsel 0, so the swizzling is 1:1
 */
-   interrupt-map-mask = <0x0 0x0 0x0 0x7>;
+   interrupt-map-mask = <0x0 0x0 0x0 IRQ_INT_ALL>;
interrupt-map = <
-   0x0 0x0 0x0 0x1 &UIC3 0xc 0x4 /* swizzled int A 
*/
-   0x0 0x0 0x0 0x2 &UIC3 0xd 0x4 /* swizzled int B 
*/
-   0x0 0x0 0x0 0x3 &UIC3 0xe 0x4 /* swizzled int C 
*/
-   0x0 0x0 0x0 0x4 &UIC3 0xf 0x4 /* swizzled int D 
*/>;
+   0x0 0x0 0x0 IRQ_INTA &UIC3 0xc 0x4 /* swizzled 
int A */
+   0x0 0x0 0x0 IRQ_INTB &UIC3 0xd 0x4 /* swizzled 
int B */
+   0x0 0x0 0x0 IRQ_INTC &UIC3 0xe 0x4 /* swizzled 
int C */
+   0x0 0x0 0x0 IRQ_INTD &UIC3 0xf 0x4 /* swizzled 
int D */>;
};
 
MSI: ppc4xx-msi@C1000 {
diff --git a/arch/powerpc/boot/dts/charon.dts b/arch/powerpc/boot/dts/charon.dts
index 408b486b13df..0e2fe2511c46 100644
--- a/arch/powerpc/boot/dts/charon.dts
+++ b/arch/powerpc/boot/dts/charon.dts
@@ -11,6 +11,8 @@
 
 /dts-v1/;
 
+#include 
+
 / {
model = "anon,charon";
compatible = "anon,charon";
@@ -217,11 +219,11 @@
device_type = "pci";
compatible = "fsl,mpc5200-pci";
reg = <0xfd00 0x100>;
-   interrupt-map-mask = <0xf800 0 0 7>;
-   interrupt-map = <0xc000 0 0 1 &mpc5200_pic 0 0 3
-0xc000 0 0 2 &mpc5200_pic 0 0 3
-0xc000 0 0 3 &mpc5200_pic 0 0 3
-0xc000 0 0 4 &mpc5200_pic 0 0 3>;
+   interrupt-map-mask = <0xf800 0 0 IRQ_INT_ALL>;
+   interrupt-map = <0xc000 0 0 IRQ_INTA &mpc5200_pic 0 0 3
+0xc000 0 0 IRQ_INTB &mpc5200_pic 0 0 3
+0xc000 0 0 IRQ_INTC &mpc5200_pic 0 0 3
+0xc000 0 0 IRQ_INTD &mpc5200_pic 0 0 3&

Re: [PATCH v6 1/3] dt-bindings: pci: layerscape-pci: add compatible strings "fsl,ls1028a-pcie"

2019-11-05 Thread Andrew Murray
On Mon, Sep 02, 2019 at 11:43:17AM +0800, Xiaowei Bao wrote:
> Add the PCIe compatible string for LS1028A
> 
> Signed-off-by: Xiaowei Bao 
> Signed-off-by: Hou Zhiqiang 
> Reviewed-by: Rob Herring 
> ---

Reviewed-by: Andrew Murray 

> v2:
>  - No change.
> v3:
>  - No change.
> v4:
>  - No change.
> v5:
>  - No change.
> v6:
>  - No change.
> 
>  Documentation/devicetree/bindings/pci/layerscape-pci.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/pci/layerscape-pci.txt 
> b/Documentation/devicetree/bindings/pci/layerscape-pci.txt
> index e20ceaa..99a386e 100644
> --- a/Documentation/devicetree/bindings/pci/layerscape-pci.txt
> +++ b/Documentation/devicetree/bindings/pci/layerscape-pci.txt
> @@ -21,6 +21,7 @@ Required properties:
>  "fsl,ls1046a-pcie"
>  "fsl,ls1043a-pcie"
>  "fsl,ls1012a-pcie"
> +"fsl,ls1028a-pcie"
>EP mode:
>   "fsl,ls1046a-pcie-ep", "fsl,ls-pcie-ep"
>  - reg: base addresses and lengths of the PCIe controller register blocks.
> -- 
> 2.9.5
> 


Re: [PATCH v6 3/3] PCI: layerscape: Add LS1028a support

2019-11-05 Thread Andrew Murray
On Mon, Sep 02, 2019 at 11:43:19AM +0800, Xiaowei Bao wrote:
> Add support for the LS1028a PCIe controller.
> 
> Signed-off-by: Xiaowei Bao 
> Signed-off-by: Hou Zhiqiang 
> ---
> v2:
>  - No change.
> v3:
>  - Reuse the ls2088 driver data structurt.
> v4:
>  - No change.
> v5:
>  - No change.
> v6:
>  - No change.
> 
>  drivers/pci/controller/dwc/pci-layerscape.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pci-layerscape.c 
> b/drivers/pci/controller/dwc/pci-layerscape.c
> index 3a5fa26..f24f79a 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape.c
> @@ -263,6 +263,7 @@ static const struct ls_pcie_drvdata ls2088_drvdata = {
>  static const struct of_device_id ls_pcie_of_match[] = {
>   { .compatible = "fsl,ls1012a-pcie", .data = &ls1046_drvdata },
>   { .compatible = "fsl,ls1021a-pcie", .data = &ls1021_drvdata },
> + { .compatible = "fsl,ls1028a-pcie", .data = &ls2088_drvdata },
>   { .compatible = "fsl,ls1043a-pcie", .data = &ls1043_drvdata },
>   { .compatible = "fsl,ls1046a-pcie", .data = &ls1046_drvdata },
>   { .compatible = "fsl,ls2080a-pcie", .data = &ls2080_drvdata },

Reviewed-by: Andrew Murray 

> -- 
> 2.9.5
> 


[PATCH 00/10] perf/core: Generalise event exclusion checking

2018-11-16 Thread Andrew Murray
Many PMU drivers do not have the capability to exclude counting events
that occur in specific contexts such as idle, kernel, guest, etc. These
drivers indicate this by returning an error in their event_init upon
testing the events attribute flags.

However this approach requires that each time a new event modifier is
added to perf, all the perf drivers need to be modified to indicate that
they don't support the attribute. This results in additional boiler-plate
code common to many drivers that needs to be maintained. An example of
this is the addition of exclude_host and exclude_guest in 2011 yet many
PMU drivers do not support this or indicate an error on events that make
use of it.

This patch generalises the test for exclusion and updates PMU drivers to
use it. This is a functional change as some PMU drivers will now correctly
report that they don't support certain events whereas they previously did.

An alternative approach might have been to provide a static perf_event_attr
with exclusion events set such that each PMU driver could compare against
(see ref [1]) - however this is less readable. A longer term approach may
instead be for PMU's to advertise their capabilities on registration.

All drivers touched by this patchset have been compile tested.

[1] https://lore.kernel.org/patchwork/patch/325116/
~       

Andrew Murray (10):
  perf/core: Add macro to test for event exclusion flags
  arm: perf/core: generalise event exclusion checking with perf macro
  arm: perf: add additional validation to set_event_filter
  powerpc: perf/core: generalise event exclusion checking with perf
macro
  powerpc/pmu/fsl: add additional validation to event_init
  alpha: perf/core: generalise event exclusion checking with perf macro
  x86: perf/core: generalise event exclusion checking with perf macro
  perf/core: Remove unused perf_flags
  drivers/perf: perf/core: generalise event exclusion checking with perf
macro
  perf/doc: update design.txt for exclude_{host|guest} flags

 arch/alpha/kernel/perf_event.c   |  4 +---
 arch/arm/kernel/perf_event_v7.c  |  2 ++
 arch/arm/mach-imx/mmdc.c |  8 +---
 arch/arm/mm/cache-l2x0-pmu.c |  7 +--
 arch/powerpc/perf/core-fsl-emb.c |  2 ++
 arch/powerpc/perf/hv-24x7.c  |  7 +--
 arch/powerpc/perf/hv-gpci.c  |  7 +--
 arch/powerpc/perf/imc-pmu.c  | 14 ++
 arch/x86/events/amd/ibs.c| 11 +--
 arch/x86/events/amd/iommu.c  |  3 +--
 arch/x86/events/amd/power.c  |  8 +---
 arch/x86/events/amd/uncore.c |  3 +--
 arch/x86/events/intel/cstate.c   |  7 +--
 arch/x86/events/intel/rapl.c |  7 +--
 arch/x86/events/intel/uncore.c   |  3 +--
 arch/x86/events/intel/uncore_snb.c   |  7 +--
 arch/x86/events/msr.c|  7 +--
 drivers/perf/arm-cci.c   |  7 +--
 drivers/perf/arm-ccn.c   |  5 +
 drivers/perf/arm_dsu_pmu.c   |  7 +--
 drivers/perf/arm_pmu.c   |  9 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c |  7 +--
 drivers/perf/qcom_l2_pmu.c   |  3 +--
 drivers/perf/qcom_l3_pmu.c   |  3 +--
 drivers/perf/xgene_pmu.c |  3 +--
 include/linux/perf_event.h   |  9 +
 include/uapi/linux/perf_event.h  |  2 --
 tools/include/uapi/linux/perf_event.h|  2 --
 tools/perf/design.txt|  4 
 29 files changed, 41 insertions(+), 127 deletions(-)

-- 
2.7.4



[PATCH 02/10] arm: perf/core: generalise event exclusion checking with perf macro

2018-11-16 Thread Andrew Murray
Replace checking of perf event exclusion flags with perf macro.

Signed-off-by: Andrew Murray 
---
 arch/arm/mach-imx/mmdc.c | 8 +---
 arch/arm/mm/cache-l2x0-pmu.c | 7 +--
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/arch/arm/mach-imx/mmdc.c b/arch/arm/mach-imx/mmdc.c
index 04b3bf7..d9d468f 100644
--- a/arch/arm/mach-imx/mmdc.c
+++ b/arch/arm/mach-imx/mmdc.c
@@ -293,13 +293,7 @@ static int mmdc_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle||
-   event->attr.exclude_host||
-   event->attr.exclude_guest   ||
-   event->attr.sample_period)
+   if (event_has_exclude_flags(event) || event->attr.sample_period)
return -EINVAL;
 
if (cfg < 0 || cfg >= MMDC_NUM_COUNTERS)
diff --git a/arch/arm/mm/cache-l2x0-pmu.c b/arch/arm/mm/cache-l2x0-pmu.c
index afe5b4c..968fdf8 100644
--- a/arch/arm/mm/cache-l2x0-pmu.c
+++ b/arch/arm/mm/cache-l2x0-pmu.c
@@ -314,12 +314,7 @@ static int l2x0_pmu_event_init(struct perf_event *event)
event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
if (event->cpu < 0)
-- 
2.7.4



[PATCH 01/10] perf/core: Add macro to test for event exclusion flags

2018-11-16 Thread Andrew Murray
Add a macro that tests if any of the perf event exclusion flags
are set on a given event.

Signed-off-by: Andrew Murray 
---
 include/linux/perf_event.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f..89ee7fa 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1004,6 +1004,15 @@ perf_event__output_id_sample(struct perf_event *event,
 extern void
 perf_log_lost_samples(struct perf_event *event, u64 lost);
 
+static inline bool event_has_exclude_flags(struct perf_event *event)
+{
+   struct perf_event_attr *attr = &event->attr;
+
+   return attr->exclude_idle || attr->exclude_user ||
+  attr->exclude_kernel || attr->exclude_hv ||
+  attr->exclude_guest || attr->exclude_host;
+}
+
 static inline bool is_sampling_event(struct perf_event *event)
 {
return event->attr.sample_period != 0;
-- 
2.7.4



[PATCH 03/10] arm: perf: add additional validation to set_event_filter

2018-11-16 Thread Andrew Murray
The armv7pmu driver doesn't support host/guest mode exclusion so
let's report this when set_event_filter is called with these
exclusion flags set.

Signed-off-by: Andrew Murray 
---
 arch/arm/kernel/perf_event_v7.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index a4fb0f8..c4c9fbb 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -1074,6 +1074,8 @@ static int armv7pmu_set_event_filter(struct hw_perf_event 
*event,
 
if (attr->exclude_idle)
return -EPERM;
+   if (attr->exclude_host || attr->exclude_guest)
+   return -EPERM;
if (attr->exclude_user)
config_base |= ARMV7_EXCLUDE_USER;
if (attr->exclude_kernel)
-- 
2.7.4



[PATCH 04/10] powerpc: perf/core: generalise event exclusion checking with perf macro

2018-11-16 Thread Andrew Murray
Replace checking of perf event exclusion flags with perf macro.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/perf/hv-24x7.c |  7 +--
 arch/powerpc/perf/hv-gpci.c |  7 +--
 arch/powerpc/perf/imc-pmu.c | 14 ++
 3 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 72238ee..60db22d 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1307,12 +1307,7 @@ static int h_24x7_event_init(struct perf_event *event)
}
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
/* no branch sampling */
diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 43fabb3..2d2b5c0 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -233,12 +233,7 @@ static int h_gpci_event_init(struct perf_event *event)
}
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
/* no branch sampling */
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 1fafc32b..1ae1d3f 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -474,12 +474,7 @@ static int nest_imc_event_init(struct perf_event *event)
return -EINVAL;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
if (event->cpu < 0)
@@ -749,12 +744,7 @@ static int core_imc_event_init(struct perf_event *event)
return -EINVAL;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
if (event->cpu < 0)
-- 
2.7.4



[PATCH 05/10] powerpc/pmu/fsl: add additional validation to event_init

2018-11-16 Thread Andrew Murray
The fsl PMU driver doesn't support host/guest mode exclusion so
let's report this when event_init is called with these exclusion
flags set.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/perf/core-fsl-emb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/perf/core-fsl-emb.c b/arch/powerpc/perf/core-fsl-emb.c
index ba48584..b1cc9d1 100644
--- a/arch/powerpc/perf/core-fsl-emb.c
+++ b/arch/powerpc/perf/core-fsl-emb.c
@@ -559,6 +559,8 @@ static int fsl_emb_pmu_event_init(struct perf_event *event)
event->hw.config_base |= PMLCA_FCS;
if (event->attr.exclude_idle)
return -ENOTSUPP;
+   if (event->attr.exclude_host || event->attr.exclude_guest)
+   return -ENOTSUPP;
 
event->hw.last_period = event->hw.sample_period;
local64_set(&event->hw.period_left, event->hw.last_period);
-- 
2.7.4



[PATCH 06/10] alpha: perf/core: generalise event exclusion checking with perf macro

2018-11-16 Thread Andrew Murray
Replace checking of perf event exclusion flags with perf macro.

This is a functional change as __hw_perf_event_init will now indicate
that it doesn't support exclude_host and exclude_guest event flags.

Signed-off-by: Andrew Murray 
---
 arch/alpha/kernel/perf_event.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/alpha/kernel/perf_event.c b/arch/alpha/kernel/perf_event.c
index 5613aa37..36f98ee 100644
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -631,10 +631,8 @@ static int __hw_perf_event_init(struct perf_event *event)
}
 
/* The EV67 does not support mode exclusion */
-   if (attr->exclude_kernel || attr->exclude_user
-   || attr->exclude_hv || attr->exclude_idle) {
+   if (event_has_exclude_flags(event))
return -EPERM;
-   }
 
/*
 * We place the event type in event_base here and leave calculation
-- 
2.7.4



[PATCH 07/10] x86: perf/core: generalise event exclusion checking with perf macro

2018-11-16 Thread Andrew Murray
Replace checking of perf event exclusion flags with perf macro.

This is a functional change as exclude_host and exclude_guest are
added in these files:

arch/x86/events/intel/uncore.c

and exclude_idle and exclude_hv are added in these files:

arch/x86/events/amd/iommu.c
arch/x86/events/amd/uncore.c

Signed-off-by: Andrew Murray 
---
 arch/x86/events/amd/ibs.c  | 11 +--
 arch/x86/events/amd/iommu.c|  3 +--
 arch/x86/events/amd/power.c|  8 +---
 arch/x86/events/amd/uncore.c   |  3 +--
 arch/x86/events/intel/cstate.c |  7 +--
 arch/x86/events/intel/rapl.c   |  7 +--
 arch/x86/events/intel/uncore.c |  3 +--
 arch/x86/events/intel/uncore_snb.c |  7 +--
 arch/x86/events/msr.c  |  7 +--
 9 files changed, 9 insertions(+), 47 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index d50bb4d..a51981c 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -253,15 +253,6 @@ static int perf_ibs_precise_event(struct perf_event 
*event, u64 *config)
return -EOPNOTSUPP;
 }
 
-static const struct perf_event_attr ibs_notsupp = {
-   .exclude_user   = 1,
-   .exclude_kernel = 1,
-   .exclude_hv = 1,
-   .exclude_idle   = 1,
-   .exclude_host   = 1,
-   .exclude_guest  = 1,
-};
-
 static int perf_ibs_init(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
@@ -282,7 +273,7 @@ static int perf_ibs_init(struct perf_event *event)
if (event->pmu != &perf_ibs->pmu)
return -ENOENT;
 
-   if (perf_flags(&event->attr) & perf_flags(&ibs_notsupp))
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
if (config & ~perf_ibs->config_mask)
diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 3210fee..fa7541b 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -224,8 +224,7 @@ static int perf_iommu_event_init(struct perf_event *event)
return -EINVAL;
 
/* IOMMU counters do not have usr/os/guest/host bits */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
if (event->cpu < 0)
diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
index 2aefacf..4129fbe 100644
--- a/arch/x86/events/amd/power.c
+++ b/arch/x86/events/amd/power.c
@@ -136,13 +136,7 @@ static int pmu_event_init(struct perf_event *event)
return -ENOENT;
 
/* Unsupported modes and filters. */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   /* no sampling */
+   if (event_has_exclude_flags(event) ||
event->attr.sample_period)
return -EINVAL;
 
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 8671de1..c2015c7 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -202,8 +202,7 @@ static int amd_uncore_event_init(struct perf_event *event)
return -EINVAL;
 
/* NB and Last level cache counters do not have usr/os/guest/host bits 
*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
/* and we do not enable counter overflow interrupts */
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 9f8084f..9366833 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -280,12 +280,7 @@ static int cstate_pmu_event_init(struct perf_event *event)
return -ENOENT;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
+   if (event_has_exclude_flags(event) ||
event->attr.sample_period) /* no sampling */
return -EINVAL;
 
diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 32f3e94..428d40c 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -397,12 +397,7 @@ static int rapl_pmu_event_init(struct perf_event *event)
return -EINVAL;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.ex

[PATCH 08/10] perf/core: Remove unused perf_flags

2018-11-16 Thread Andrew Murray
Now that perf_flags is not used we remove it.

Signed-off-by: Andrew Murray 
---
 include/uapi/linux/perf_event.h   | 2 --
 tools/include/uapi/linux/perf_event.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index f35eb72..ba89bd3 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -445,8 +445,6 @@ struct perf_event_query_bpf {
__u32   ids[0];
 };
 
-#define perf_flags(attr)   (*(&(attr)->read_format + 1))
-
 /*
  * Ioctls that can be done on a perf event fd:
  */
diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index f35eb72..ba89bd3 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -445,8 +445,6 @@ struct perf_event_query_bpf {
__u32   ids[0];
 };
 
-#define perf_flags(attr)   (*(&(attr)->read_format + 1))
-
 /*
  * Ioctls that can be done on a perf event fd:
  */
-- 
2.7.4



[PATCH 09/10] drivers/perf: perf/core: generalise event exclusion checking with perf macro

2018-11-16 Thread Andrew Murray
Replace checking of perf event exclusion flags with perf macro.

This is a functional change as exclude_host and exclude_guest are added
in the following files:

 - drivers/perf/qcom_l2_pmu.c
 - drivers/perf/qcom_l3_pmu.c
 - drivers/perf/arm_pmu.c

And exclude_idle and exclude_hv are added in these files:

 - drivers/perf/xgene_pmu.c

Signed-off-by: Andrew Murray 
---
 drivers/perf/arm-cci.c   | 7 +--
 drivers/perf/arm-ccn.c   | 5 +
 drivers/perf/arm_dsu_pmu.c   | 7 +--
 drivers/perf/arm_pmu.c   | 9 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 7 +--
 drivers/perf/qcom_l2_pmu.c   | 3 +--
 drivers/perf/qcom_l3_pmu.c   | 3 +--
 drivers/perf/xgene_pmu.c | 3 +--
 8 files changed, 8 insertions(+), 36 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 1bfeb16..d749f19 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1328,12 +1328,7 @@ static int cci_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
 
/* We have no filtering of any kind */
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle||
-   event->attr.exclude_host||
-   event->attr.exclude_guest)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
/*
diff --git a/drivers/perf/arm-ccn.c b/drivers/perf/arm-ccn.c
index 7dd850e..9a22a95 100644
--- a/drivers/perf/arm-ccn.c
+++ b/drivers/perf/arm-ccn.c
@@ -741,10 +741,7 @@ static int arm_ccn_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   if (has_branch_stack(event) || event->attr.exclude_user ||
-   event->attr.exclude_kernel || event->attr.exclude_hv ||
-   event->attr.exclude_idle || event->attr.exclude_host ||
-   event->attr.exclude_guest) {
+   if (has_branch_stack(event) || event_has_exclude_flags(event)) {
dev_dbg(ccn->dev, "Can't exclude execution levels!\n");
return -EINVAL;
}
diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index 660cb8a..300ff3d 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -563,12 +563,7 @@ static int dsu_pmu_event_init(struct perf_event *event)
}
 
if (has_branch_stack(event) ||
-   event->attr.exclude_user ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle ||
-   event->attr.exclude_host ||
-   event->attr.exclude_guest) {
+   event_has_exclude_flags(event)) {
dev_dbg(dsu_pmu->pmu.dev, "Can't support filtering\n");
return -EINVAL;
}
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 7f01f6f..a03634f 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -357,13 +357,6 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 }
 
 static int
-event_requires_mode_exclusion(struct perf_event_attr *attr)
-{
-   return attr->exclude_idle || attr->exclude_user ||
-  attr->exclude_kernel || attr->exclude_hv;
-}
-
-static int
 __hw_perf_event_init(struct perf_event *event)
 {
struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
@@ -395,7 +388,7 @@ __hw_perf_event_init(struct perf_event *event)
 */
if ((!armpmu->set_event_filter ||
 armpmu->set_event_filter(hwc, &event->attr)) &&
-event_requires_mode_exclusion(&event->attr)) {
+event_has_exclude_flags(event)) {
pr_debug("ARM performance counters do not support "
 "mode exclusion\n");
return -EOPNOTSUPP;
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c 
b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index 9efd241..d3edff9 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -143,12 +143,7 @@ int hisi_uncore_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
 
/* counters do not have these bits */
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_host||
-   event->attr.exclude_guest   ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle)
+   if (event_has_exclude_flags(event))
return -EINVAL;
 
/*
diff --git a/drivers/perf/qcom_l2_pmu.c b/drivers/perf/qcom_l2_pmu.c
index 842135c..d7d85a2 100644
--- a/drivers/perf/qcom_l2_pmu.c
+++ b/drivers/perf/qcom_l2_pmu.c
@@ -

[PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags

2018-11-16 Thread Andrew Murray
Update design.txt to reflect the presence of the exclude_host
and exclude_guest perf flags.

Signed-off-by: Andrew Murray 
---
 tools/perf/design.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index a28dca2..7de7d83 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' bits 
provide a
 way to request that counting of events be restricted to times when the
 CPU is in user, kernel and/or hypervisor mode.
 
+Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
+to request counting of events restricted to guest and host contexts when
+using virtualisation.
+
 The 'mmap' and 'munmap' bits allow recording of PROT_EXEC mmap/munmap
 operations, these can be used to relate userspace IP addresses to actual
 code, even after the mapping (or even the whole process) is gone,
-- 
2.7.4



Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags

2018-11-20 Thread Andrew Murray
On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote:
> Andrew Murray  writes:
> 
> > Update design.txt to reflect the presence of the exclude_host
> > and exclude_guest perf flags.
> >
> > Signed-off-by: Andrew Murray 
> > ---
> >  tools/perf/design.txt | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt
> > index a28dca2..7de7d83 100644
> > --- a/tools/perf/design.txt
> > +++ b/tools/perf/design.txt
> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' 
> > bits provide a
> >  way to request that counting of events be restricted to times when the
> >  CPU is in user, kernel and/or hypervisor mode.
> >  
> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
> > +to request counting of events restricted to guest and host contexts when
> > +using virtualisation.
> 
> How does exclude_host differ from exclude_hv ?

I believe exclude_host / exclude_guest are intented to distinguish
between host and guest in the hosted hypervisor context (KVM).
Whereas exclude_hv allows to distinguish between guest and
hypervisor in the bare-metal type hypervisors.

In the case of arm64 - if VHE extensions are present then the host
kernel will run at a higher privilege to the guest kernel, in which
case there is no distinction between hypervisor and host so we ignore
exclude_hv. But where VHE extensions are not present then the host
kernel runs at the same privilege level as the guest and we use a
higher privilege level to switch between them - in this case we can
use exclude_hv to discount that hypervisor role of switching between
guests.

Thanks,

Andrew Murray

> 
> cheers


Re: [PATCH 00/10] perf/core: Generalise event exclusion checking

2018-11-22 Thread Andrew Murray
On Mon, Nov 19, 2018 at 02:08:00PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 16, 2018 at 10:24:03AM +0000, Andrew Murray wrote:
> > Many PMU drivers do not have the capability to exclude counting events
> > that occur in specific contexts such as idle, kernel, guest, etc. These
> > drivers indicate this by returning an error in their event_init upon
> > testing the events attribute flags.
> > 
> > However this approach requires that each time a new event modifier is
> > added to perf, all the perf drivers need to be modified to indicate that
> > they don't support the attribute. This results in additional boiler-plate
> > code common to many drivers that needs to be maintained. An example of
> > this is the addition of exclude_host and exclude_guest in 2011 yet many
> > PMU drivers do not support this or indicate an error on events that make
> > use of it.
> > 
> > This patch generalises the test for exclusion and updates PMU drivers to
> > use it. This is a functional change as some PMU drivers will now correctly
> > report that they don't support certain events whereas they previously did.
> 
> Right, I like that idea, and yes, there's a lot of fail around there :/
> 
> > A longer term approach may instead be for PMU's to advertise their
> > capabilities on registration.
> 
> This I think is the better approach. We already have the
> PERF_PMU_CAP_flags that can be used to advertise various PMU
> capabilities.

OK I'll respin my series to take this approach.

> 
> Something along these lines I suppose; then every PMU that actually
> checks the flags, needs to set the flag, otherwise it'll fail.
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 53c500f0ca79..de15723ea52a 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -244,6 +244,7 @@ struct perf_event;
>  #define PERF_PMU_CAP_EXCLUSIVE   0x10
>  #define PERF_PMU_CAP_ITRACE  0x20
>  #define PERF_PMU_CAP_HETEROGENEOUS_CPUS  0x40
> +#define PERF_PMU_CAP_EXCLUDE 0x80
>  
>  /**
>   * struct pmu - generic performance monitoring unit
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 84530ab358c3..d76b724177b9 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9772,6 +9772,14 @@ static int perf_try_init_event(struct pmu *pmu, struct 
> perf_event *event)
>   if (ctx)
>   perf_event_ctx_unlock(event->group_leader, ctx);
>  
> + if (!ret) {
> + if ((pmu->capabilities & PERF_PMU_CAP_EXCLUDE) ||
> + event_has_exclude_flags(event)) {
> + event->destroy(event);
> + ret = -EINVAL;
> + }
> + }
> +

I don't quite follow this logic. Should that not have been:

if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUDE) &&
     event_has_exclude_flags(event)) {

Meaning that if an event has any exclude flags but the pmu doesn't
have the capability to handle them then error.

If you're happy with my proposed logic, then would it also make
sense to move this before the call to the pmu->event_init ?

Thanks,

Andrew Murray

>   if (ret)
>   module_put(pmu->module);
>  
> 


Re: [PATCH 00/10] perf/core: Generalise event exclusion checking

2018-11-22 Thread Andrew Murray
On Thu, Nov 22, 2018 at 01:26:37PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 22, 2018 at 12:21:43PM +0000, Andrew Murray wrote:
> > On Mon, Nov 19, 2018 at 02:08:00PM +0100, Peter Zijlstra wrote:
> 
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 84530ab358c3..d76b724177b9 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -9772,6 +9772,14 @@ static int perf_try_init_event(struct pmu *pmu, 
> > > struct perf_event *event)
> > >   if (ctx)
> > >   perf_event_ctx_unlock(event->group_leader, ctx);
> > >  
> > > + if (!ret) {
> > > + if ((pmu->capabilities & PERF_PMU_CAP_EXCLUDE) ||
> > > + event_has_exclude_flags(event)) {
> > > + event->destroy(event);
> > > + ret = -EINVAL;
> > > + }
> > > + }
> > > +
> > 
> > I don't quite follow this logic. Should that not have been:
> > 
> > if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUDE) &&
> >  event_has_exclude_flags(event)) {
> > 
> > Meaning that if an event has any exclude flags but the pmu doesn't
> > have the capability to handle them then error.
> 
> Uhm, yes. Brainfart on my side that.
> 
> > If you're happy with my proposed logic, then would it also make
> > sense to move this before the call to the pmu->event_init ?
> 
> I'm not sure that can work; I think we need ->event_init() first such
> that it can -ENOENT. Only after ->event_init() returns success can we be
> certain of @pmu.

Ah yes I see now. Until event_init doesn't return -ENOENT we can't be sure
that this will be the PMU we use (as per the other calls to
perf_try_init_event in perf_init_event).

Thanks,

Andrew Murray


Re: [PATCH 09/10] drivers/perf: perf/core: generalise event exclusion checking with perf macro

2018-11-23 Thread Andrew Murray
On Mon, Nov 19, 2018 at 04:03:52PM +, Mark Rutland wrote:
> On Fri, Nov 16, 2018 at 10:24:12AM +0000, Andrew Murray wrote:
> > Replace checking of perf event exclusion flags with perf macro.
> > 
> > This is a functional change as exclude_host and exclude_guest are added
> > in the following files:
> > 
> >  - drivers/perf/qcom_l2_pmu.c
> >  - drivers/perf/qcom_l3_pmu.c
> >  - drivers/perf/arm_pmu.c
> > 
> > And exclude_idle and exclude_hv are added in these files:
> > 
> >  - drivers/perf/xgene_pmu.c
> 
> FWIW, that all sounds fine to me.
> 
> Assuming you fix up the 'macro' nit:
> 
> Acked-by: Mark Rutland 
> 
> ... unless we go for Peter's core cap for this.

Yeah I'm going to re-spin with the core cap approach - but thanks anyway.

Thanks,

Andrew Murray

> 
> Thanks,
> Mark.
> 
> > Signed-off-by: Andrew Murray 
> > ---
> >  drivers/perf/arm-cci.c   | 7 +--
> >  drivers/perf/arm-ccn.c   | 5 +
> >  drivers/perf/arm_dsu_pmu.c   | 7 +--
> >  drivers/perf/arm_pmu.c   | 9 +
> >  drivers/perf/hisilicon/hisi_uncore_pmu.c | 7 +--
> >  drivers/perf/qcom_l2_pmu.c   | 3 +--
> >  drivers/perf/qcom_l3_pmu.c   | 3 +--
> >  drivers/perf/xgene_pmu.c | 3 +--
> >  8 files changed, 8 insertions(+), 36 deletions(-)
> > 
> > diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> > index 1bfeb16..d749f19 100644
> > --- a/drivers/perf/arm-cci.c
> > +++ b/drivers/perf/arm-cci.c
> > @@ -1328,12 +1328,7 @@ static int cci_pmu_event_init(struct perf_event 
> > *event)
> > return -EOPNOTSUPP;
> >  
> > /* We have no filtering of any kind */
> > -   if (event->attr.exclude_user||
> > -   event->attr.exclude_kernel  ||
> > -   event->attr.exclude_hv  ||
> > -   event->attr.exclude_idle||
> > -   event->attr.exclude_host||
> > -   event->attr.exclude_guest)
> > +   if (event_has_exclude_flags(event))
> > return -EINVAL;
> >  
> > /*
> > diff --git a/drivers/perf/arm-ccn.c b/drivers/perf/arm-ccn.c
> > index 7dd850e..9a22a95 100644
> > --- a/drivers/perf/arm-ccn.c
> > +++ b/drivers/perf/arm-ccn.c
> > @@ -741,10 +741,7 @@ static int arm_ccn_pmu_event_init(struct perf_event 
> > *event)
> > return -EOPNOTSUPP;
> > }
> >  
> > -   if (has_branch_stack(event) || event->attr.exclude_user ||
> > -   event->attr.exclude_kernel || event->attr.exclude_hv ||
> > -   event->attr.exclude_idle || event->attr.exclude_host ||
> > -   event->attr.exclude_guest) {
> > +   if (has_branch_stack(event) || event_has_exclude_flags(event)) {
> > dev_dbg(ccn->dev, "Can't exclude execution levels!\n");
> > return -EINVAL;
> > }
> > diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
> > index 660cb8a..300ff3d 100644
> > --- a/drivers/perf/arm_dsu_pmu.c
> > +++ b/drivers/perf/arm_dsu_pmu.c
> > @@ -563,12 +563,7 @@ static int dsu_pmu_event_init(struct perf_event *event)
> > }
> >  
> > if (has_branch_stack(event) ||
> > -   event->attr.exclude_user ||
> > -   event->attr.exclude_kernel ||
> > -   event->attr.exclude_hv ||
> > -   event->attr.exclude_idle ||
> > -   event->attr.exclude_host ||
> > -   event->attr.exclude_guest) {
> > +   event_has_exclude_flags(event)) {
> > dev_dbg(dsu_pmu->pmu.dev, "Can't support filtering\n");
> > return -EINVAL;
> > }
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 7f01f6f..a03634f 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -357,13 +357,6 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void 
> > *dev)
> >  }
> >  
> >  static int
> > -event_requires_mode_exclusion(struct perf_event_attr *attr)
> > -{
> > -   return attr->exclude_idle || attr->exclude_user ||
> > -  attr->exclude_kernel || attr->exclude_hv;
> > -}
> > -
> > -static int
> >  __hw_perf_event_init(struct perf_event *event)
> >  {
> > struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> > @@ -395,7 +388,7 @@ __hw_perf_event_init(struct perf_event *event)
> >  */
>

Re: [PATCH 01/10] perf/core: Add macro to test for event exclusion flags

2018-11-23 Thread Andrew Murray
On Tue, Nov 20, 2018 at 10:28:34PM +1100, Michael Ellerman wrote:
> Andrew Murray  writes:
> 
> > Add a macro that tests if any of the perf event exclusion flags
> > are set on a given event.
> >
> > Signed-off-by: Andrew Murray 
> > ---
> >  include/linux/perf_event.h | 9 +
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 53c500f..89ee7fa 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -1004,6 +1004,15 @@ perf_event__output_id_sample(struct perf_event 
> > *event,
> >  extern void
> >  perf_log_lost_samples(struct perf_event *event, u64 lost);
> >  
> > +static inline bool event_has_exclude_flags(struct perf_event *event)
> > +{
> > +   struct perf_event_attr *attr = &event->attr;
> > +
> > +   return attr->exclude_idle || attr->exclude_user ||
> > +  attr->exclude_kernel || attr->exclude_hv ||
> > +  attr->exclude_guest || attr->exclude_host;
> > +}
> 
> Sorry to be a total PITA, but using "flags" plural suggests that it only
> returns true if there is more than one exclude flag set.
> 
> A better name would be event_has_exclude_flag() or maybe
> event_has_any_exclude_flag().
> 
> If you're doing a respin anyway it'd be nice to fix the name, but
> obviously it's not high priority.

No problem - I'll go with event_has_any_exclude_flag.

Thanks,

Andrew Murray

> 
> cheers


[PATCH v2 00/20] perf/core: Generalise event exclusion checking

2018-11-26 Thread Andrew Murray
Many PMU drivers do not have the capability to exclude counting events
that occur in specific contexts such as idle, kernel, guest, etc. These
drivers indicate this by returning an error in their event_init upon
testing the events attribute flags.

However this approach requires that each time a new event modifier is
added to perf, all the perf drivers need to be modified to indicate that
they don't support the attribute. This results in additional boiler-plate
code common to many drivers that needs to be maintained. Furthermore the
drivers are not consistent with regards to the error value they return
when reporting unsupported attributes.

This patchset allow PMU drivers to advertise their ability to exclude
based on context via a new capability: PERF_PMU_CAP_EXCLUDE. This allows
the perf core to reject requests for exclusion events where there is no
support in the PMU.

This is a functional change, in particular:

 - Some drivers will now additionally (but correctly) report unsupported
   exclusion flags. It's typical for existing userspace tools such as
   perf to handle such errors by retrying the system call without the
   unsupported flags.

 - Drivers that do not support any exclusion that previously reported
   -EPERM or -EOPNOTSUPP will now report -EINVAL - this is consistent
   with the majority and results in userspace perf retrying without
   exclusion.

All drivers touched by this patchset have been compile tested.

Changes from v1:

 - Changed approach from explicitly rejecting events in unsupporting PMU
   drivers to explicitly advertising a capability in PMU drivers that
   do support exclusion events

 - Added additional information to tools/perf/design.txt

 - Rename event_has_exclude_flags to event_has_any_exclude_flag and
   update commit log to reflect it's a function

Andrew Murray (20):
  perf/doc: update design.txt for exclude_{host|guest} flags
  perf/core: add function to test for event exclusion flags
  perf/core: add PERF_PMU_CAP_EXCLUDE for exclusion capable PMUs
  perf/hw_breakpoint: perf/core: advertise PMU exclusion capability
  alpha: perf/core: remove unnecessary checks for exclusion
  arc: perf/core: advertise PMU exclusion capability
  arm: perf: conditionally advertise PMU exclusion capability
  arm: perf/core: remove unnecessary checks for exclusion
  drivers/perf: perf/core: remove unnecessary checks for exclusion
  drivers/perf: perf/core: remove unnecessary checks for exclusion
  drivers/perf: perf/core: advertise PMU exclusion capability
  mips: perf/core: advertise PMU exclusion capability
  powerpc: perf/core: advertise PMU exclusion capability
  powerpc: perf/core: remove unnecessary checks for exclusion
  s390: perf/events: advertise PMU exclusion capability
  sparc: perf/core: advertise PMU exclusion capability
  x86: perf/core: remove unnecessary checks for exclusion
  x86: perf/core remove unnecessary checks for exclusion
  x86: perf/core: advertise PMU exclusion capability
  perf/core: remove unused perf_flags

 arch/alpha/kernel/perf_event.c   |  6 --
 arch/arc/kernel/perf_event.c |  1 +
 arch/arm/mach-imx/mmdc.c |  8 +---
 arch/arm/mm/cache-l2x0-pmu.c |  8 
 arch/mips/kernel/perf_event_mipsxx.c |  1 +
 arch/powerpc/perf/core-book3s.c  |  1 +
 arch/powerpc/perf/core-fsl-emb.c |  1 +
 arch/powerpc/perf/hv-24x7.c  |  9 -
 arch/powerpc/perf/hv-gpci.c  |  9 -
 arch/powerpc/perf/imc-pmu.c  | 18 --
 arch/s390/kernel/perf_cpum_cf.c  |  1 +
 arch/s390/kernel/perf_cpum_sf.c  |  2 ++
 arch/sparc/kernel/perf_event.c   |  1 +
 arch/x86/events/amd/ibs.c| 12 
 arch/x86/events/amd/iommu.c  |  5 -
 arch/x86/events/amd/power.c  |  9 +
 arch/x86/events/amd/uncore.c |  5 -
 arch/x86/events/core.c   |  2 ++
 arch/x86/events/intel/bts.c  |  2 +-
 arch/x86/events/intel/cstate.c   |  8 +---
 arch/x86/events/intel/pt.c   |  4 +++-
 arch/x86/events/intel/rapl.c |  8 +---
 arch/x86/events/intel/uncore.c   |  8 
 arch/x86/events/intel/uncore_snb.c   |  8 +---
 arch/x86/events/msr.c|  8 +---
 drivers/perf/arm-cci.c   |  9 -
 drivers/perf/arm-ccn.c   |  5 +
 drivers/perf/arm_dsu_pmu.c   |  8 +---
 drivers/perf/arm_pmu.c   | 15 +--
 drivers/perf/arm_spe_pmu.c   |  3 ++-
 drivers/perf/hisilicon/hisi_uncore_pmu.c |  9 -
 drivers/perf/qcom_l2_pmu.c   |  8 
 drivers/perf/qcom_l3_pmu.c   |  7 ---
 drivers/perf/xgene_pmu.c |  5 -
 include/linux/perf_event.h   | 10 ++
 include/uapi/linux/perf_event.h  |  2 --
 kerne

[PATCH v2 01/20] perf/doc: update design.txt for exclude_{host|guest} flags

2018-11-26 Thread Andrew Murray
Update design.txt to reflect the presence of the exclude_host
and exclude_guest perf flags.

Signed-off-by: Andrew Murray 
---
 tools/perf/design.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index a28dca2..5b2b23b 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' bits 
provide a
 way to request that counting of events be restricted to times when the
 CPU is in user, kernel and/or hypervisor mode.
 
+Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
+to request counting of events restricted to guest and host contexts when
+using KVM virtualisation.
+
 The 'mmap' and 'munmap' bits allow recording of PROT_EXEC mmap/munmap
 operations, these can be used to relate userspace IP addresses to actual
 code, even after the mapping (or even the whole process) is gone,
-- 
2.7.4



[PATCH v2 02/20] perf/core: add function to test for event exclusion flags

2018-11-26 Thread Andrew Murray
Add a function that tests if any of the perf event exclusion flags
are set on a given event.

Signed-off-by: Andrew Murray 
---
 include/linux/perf_event.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f..b2e806f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1004,6 +1004,15 @@ perf_event__output_id_sample(struct perf_event *event,
 extern void
 perf_log_lost_samples(struct perf_event *event, u64 lost);
 
+static inline bool event_has_any_exclude_flag(struct perf_event *event)
+{
+   struct perf_event_attr *attr = &event->attr;
+
+   return attr->exclude_idle || attr->exclude_user ||
+  attr->exclude_kernel || attr->exclude_hv ||
+  attr->exclude_guest || attr->exclude_host;
+}
+
 static inline bool is_sampling_event(struct perf_event *event)
 {
return event->attr.sample_period != 0;
-- 
2.7.4



[PATCH v2 03/20] perf/core: add PERF_PMU_CAP_EXCLUDE for exclusion capable PMUs

2018-11-26 Thread Andrew Murray
Many PMU drivers do not have the capability to exclude counting events
that occur in specific contexts such as idle, kernel, guest, etc. These
drivers indicate this by returning an error in their event_init upon
testing the events attribute flags. This approach is error prone and
often inconsistent.

Let's instead allow PMU drivers to advertise their ability to exclude
based on context via a new capability: PERF_PMU_CAP_EXCLUDE. This
allows the perf core to reject requests for exclusion events where
there is no support in the PMU.

Signed-off-by: Andrew Murray 
---
 include/linux/perf_event.h | 1 +
 kernel/events/core.c   | 9 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b2e806f..69b3d65 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -244,6 +244,7 @@ struct perf_event;
 #define PERF_PMU_CAP_EXCLUSIVE 0x10
 #define PERF_PMU_CAP_ITRACE0x20
 #define PERF_PMU_CAP_HETEROGENEOUS_CPUS0x40
+#define PERF_PMU_CAP_EXCLUDE   0x80
 
 /**
  * struct pmu - generic performance monitoring unit
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5a97f34..9afb33c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9743,6 +9743,15 @@ static int perf_try_init_event(struct pmu *pmu, struct 
perf_event *event)
if (ctx)
perf_event_ctx_unlock(event->group_leader, ctx);
 
+   if (!ret) {
+   if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUDE) &&
+   event_has_any_exclude_flag(event)) {
+   if (event->destroy)
+   event->destroy(event);
+   ret = -EINVAL;
+   }
+   }
+
if (ret)
module_put(pmu->module);
 
-- 
2.7.4



[PATCH v2 04/20] perf/hw_breakpoint: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The breakpoint PMU has the capability to exclude kernel contexts,
let's advertise that we support the PERF_PMU_CAP_EXCLUDE
capability so that perf doesn't prevent us from handling events
with any exclusion flags set.

Signed-off-by: Andrew Murray 
---
 kernel/events/hw_breakpoint.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index d6b5618..fefe3d5 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -669,6 +669,8 @@ static struct pmu perf_breakpoint = {
.start  = hw_breakpoint_start,
.stop   = hw_breakpoint_stop,
.read   = hw_breakpoint_pmu_read,
+
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
 };
 
 int __init init_hw_breakpoint(void)
-- 
2.7.4



[PATCH v2 05/20] alpha: perf/core: remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
As the Alpha PMU doesn't support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures that
perf will prevent us from handling events where any exclusion
flags are set. Let's remove the now unnecessary check for
exclusion flags.

This change means that __hw_perf_event_init will now also
indicate that it doesn't support exclude_host and exclude_guest and
will now implicitly return -EINVAL instead of -EPERM. This is likely
more desirable as -EPERM will result in a kernel.perf_event_paranoid
related warning from the perf userspace utility.

Signed-off-by: Andrew Murray 
---
 arch/alpha/kernel/perf_event.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/alpha/kernel/perf_event.c b/arch/alpha/kernel/perf_event.c
index 5613aa37..5c17077 100644
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -630,12 +630,6 @@ static int __hw_perf_event_init(struct perf_event *event)
return ev;
}
 
-   /* The EV67 does not support mode exclusion */
-   if (attr->exclude_kernel || attr->exclude_user
-   || attr->exclude_hv || attr->exclude_idle) {
-   return -EPERM;
-   }
-
/*
 * We place the event type in event_base here and leave calculation
 * of the codes to programme the PMU for alpha_pmu_enable() because
-- 
2.7.4



[PATCH v2 06/20] arc: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The ARC PMU has the capability to exclude events based on
context. Let's advertise that we support the
PERF_PMU_CAP_EXCLUDE capability to ensure that perf doesn't
prevent us from handling events where any exclusion flags are
set.

Signed-off-by: Andrew Murray 
---
 arch/arc/kernel/perf_event.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 8aec462..cf8ea6d 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -514,6 +514,7 @@ static int arc_pmu_device_probe(struct platform_device 
*pdev)
.start  = arc_pmu_start,
.stop   = arc_pmu_stop,
.read   = arc_pmu_read,
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
};
 
if (has_interrupts) {
-- 
2.7.4



[PATCH v2 07/20] arm: perf: conditionally advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The ARM PMU driver can be used to represent a variety of ARM based
PMUs. Some of these PMUs provide support for context exclusion, where
this is the case we advertise this via the PERF_PMU_CAP_EXCLUDE
capability to ensure that perf doesn't prevent us from handling events
where any exclusion flags are set.

Signed-off-by: Andrew Murray 
---
 drivers/perf/arm_pmu.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 7f01f6f..4409035 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -357,13 +357,6 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 }
 
 static int
-event_requires_mode_exclusion(struct perf_event_attr *attr)
-{
-   return attr->exclude_idle || attr->exclude_user ||
-  attr->exclude_kernel || attr->exclude_hv;
-}
-
-static int
 __hw_perf_event_init(struct perf_event *event)
 {
struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
@@ -393,9 +386,8 @@ __hw_perf_event_init(struct perf_event *event)
/*
 * Check whether we need to exclude the counter from certain modes.
 */
-   if ((!armpmu->set_event_filter ||
-armpmu->set_event_filter(hwc, &event->attr)) &&
-event_requires_mode_exclusion(&event->attr)) {
+   if (armpmu->set_event_filter &&
+   armpmu->set_event_filter(hwc, &event->attr)) {
pr_debug("ARM performance counters do not support "
 "mode exclusion\n");
return -EOPNOTSUPP;
@@ -861,6 +853,9 @@ int armpmu_register(struct arm_pmu *pmu)
if (ret)
return ret;
 
+   if (pmu->set_event_filter)
+   pmu->pmu.capabilities |= PERF_PMU_CAP_EXCLUDE;
+
ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
if (ret)
goto out_destroy;
-- 
2.7.4



[PATCH v2 08/20] arm: perf/core: remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
For drivers that do not support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures that
perf will prevent us from handling events where any exclusion
flags are set. Let's remove the now unnecessary check for
exclusion flags.

Signed-off-by: Andrew Murray 
---
 arch/arm/mach-imx/mmdc.c | 8 +---
 arch/arm/mm/cache-l2x0-pmu.c | 8 
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/arch/arm/mach-imx/mmdc.c b/arch/arm/mach-imx/mmdc.c
index 04b3bf7..b937a15 100644
--- a/arch/arm/mach-imx/mmdc.c
+++ b/arch/arm/mach-imx/mmdc.c
@@ -293,13 +293,7 @@ static int mmdc_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle||
-   event->attr.exclude_host||
-   event->attr.exclude_guest   ||
-   event->attr.sample_period)
+   if (event->attr.sample_period)
return -EINVAL;
 
if (cfg < 0 || cfg >= MMDC_NUM_COUNTERS)
diff --git a/arch/arm/mm/cache-l2x0-pmu.c b/arch/arm/mm/cache-l2x0-pmu.c
index afe5b4c..ba92f9e 100644
--- a/arch/arm/mm/cache-l2x0-pmu.c
+++ b/arch/arm/mm/cache-l2x0-pmu.c
@@ -314,14 +314,6 @@ static int l2x0_pmu_event_init(struct perf_event *event)
event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
-- 
2.7.4



[PATCH v2 09/20] drivers/perf: perf/core: remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
For drivers that do not support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures
that perf will prevent us from handling events where any exclusion
flags are set. Let's remove the now unnecessary check for
exclusion flags.

Signed-off-by: Andrew Murray 
---
 drivers/perf/arm-cci.c   | 9 -
 drivers/perf/arm-ccn.c   | 5 +
 drivers/perf/arm_dsu_pmu.c   | 8 +---
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 9 -
 4 files changed, 2 insertions(+), 29 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 1bfeb16..501b497 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1327,15 +1327,6 @@ static int cci_pmu_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EOPNOTSUPP;
 
-   /* We have no filtering of any kind */
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle||
-   event->attr.exclude_host||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
/*
 * Following the example set by other "uncore" PMUs, we accept any CPU
 * and rewrite its affinity dynamically rather than having perf core
diff --git a/drivers/perf/arm-ccn.c b/drivers/perf/arm-ccn.c
index 7dd850e..decf881 100644
--- a/drivers/perf/arm-ccn.c
+++ b/drivers/perf/arm-ccn.c
@@ -741,10 +741,7 @@ static int arm_ccn_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   if (has_branch_stack(event) || event->attr.exclude_user ||
-   event->attr.exclude_kernel || event->attr.exclude_hv ||
-   event->attr.exclude_idle || event->attr.exclude_host ||
-   event->attr.exclude_guest) {
+   if (has_branch_stack(event)) {
dev_dbg(ccn->dev, "Can't exclude execution levels!\n");
return -EINVAL;
}
diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index 660cb8a..036414c 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -562,13 +562,7 @@ static int dsu_pmu_event_init(struct perf_event *event)
return -EINVAL;
}
 
-   if (has_branch_stack(event) ||
-   event->attr.exclude_user ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle ||
-   event->attr.exclude_host ||
-   event->attr.exclude_guest) {
+   if (has_branch_stack(event)) {
dev_dbg(dsu_pmu->pmu.dev, "Can't support filtering\n");
return -EINVAL;
}
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c 
b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index 9efd241..f028cbc 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -142,15 +142,6 @@ int hisi_uncore_pmu_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EOPNOTSUPP;
 
-   /* counters do not have these bits */
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_host||
-   event->attr.exclude_guest   ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle)
-   return -EINVAL;
-
/*
 *  The uncore counters not specific to any CPU, so cannot
 *  support per-task
-- 
2.7.4



[PATCH v2 10/20] drivers/perf: perf/core: remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
For drivers that do not support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures that
perf will prevent us from handling events where any exclusion
flags are set. Let's remove the now unnecessary check for
exclusion flags.

This change means that qcom_{l2|l3}_pmu will now also indicate that
they do not support exclude_{host|guest} and that xgene_pmu does
not also support exclude_idle and exclude_hv.

Note that for qcom_l2_pmu we now implictly return -EINVAL instead
of -EOPNOTSUPP. This change will result in the perf userspace
utility retrying the perf_event_open system call with fallback
event attributes that do not fail.

Signed-off-by: Andrew Murray 
---
 drivers/perf/qcom_l2_pmu.c | 8 
 drivers/perf/qcom_l3_pmu.c | 7 ---
 drivers/perf/xgene_pmu.c   | 5 -
 3 files changed, 20 deletions(-)

diff --git a/drivers/perf/qcom_l2_pmu.c b/drivers/perf/qcom_l2_pmu.c
index 842135c..518e18c 100644
--- a/drivers/perf/qcom_l2_pmu.c
+++ b/drivers/perf/qcom_l2_pmu.c
@@ -509,14 +509,6 @@ static int l2_cache_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   /* We cannot filter accurately so we just don't allow it. */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_hv || event->attr.exclude_idle) {
-   dev_dbg_ratelimited(&l2cache_pmu->pdev->dev,
-   "Can't exclude execution levels\n");
-   return -EOPNOTSUPP;
-   }
-
if (((L2_EVT_GROUP(event->attr.config) > L2_EVT_GROUP_MAX) ||
 ((event->attr.config & ~L2_EVT_MASK) != 0)) &&
(event->attr.config != L2CYCLE_CTR_RAW_CODE)) {
diff --git a/drivers/perf/qcom_l3_pmu.c b/drivers/perf/qcom_l3_pmu.c
index 2dc63d6..e28bd2f 100644
--- a/drivers/perf/qcom_l3_pmu.c
+++ b/drivers/perf/qcom_l3_pmu.c
@@ -495,13 +495,6 @@ static int qcom_l3_cache__event_init(struct perf_event 
*event)
return -ENOENT;
 
/*
-* There are no per-counter mode filters in the PMU.
-*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_hv || event->attr.exclude_idle)
-   return -EINVAL;
-
-   /*
 * Sampling not supported since these events are not core-attributable.
 */
if (hwc->sample_period)
diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 0e31f13..bdc55de 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -914,11 +914,6 @@ static int xgene_perf_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   /* SOC counters do not have usr/os/guest/host bits */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
/*
-- 
2.7.4



[PATCH v2 11/20] drivers/perf: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The arm_pse PMU has the capability to exclude events based on
context. Let's advertise that we support the PERF_PMU_CAP_EXCLUDE
capability to ensure that perf doesn't prevent us from handling
events where any exclusion flags are set.

Signed-off-by: Andrew Murray 
---
 drivers/perf/arm_spe_pmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 54ec278..a09c38a 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -902,7 +902,8 @@ static int arm_spe_pmu_perf_init(struct arm_spe_pmu 
*spe_pmu)
 
spe_pmu->pmu = (struct pmu) {
.module = THIS_MODULE,
-   .capabilities   = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE,
+   .capabilities   = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE |
+ PERF_PMU_CAP_EXCLUDE,
.attr_groups= arm_spe_pmu_attr_groups,
/*
 * We hitch a ride on the software context here, so that
-- 
2.7.4



[PATCH v2 12/20] mips: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The MIPS PMU has the capability to exclude events based on
context. Let's advertise that we support the PERF_PMU_CAP_EXCLUDE
capability to ensure that perf doesn't prevent us from handling
events where any exclusion flags are set.

Signed-off-by: Andrew Murray 
---
 arch/mips/kernel/perf_event_mipsxx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/kernel/perf_event_mipsxx.c 
b/arch/mips/kernel/perf_event_mipsxx.c
index 4138635..d7813d0 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -670,6 +670,7 @@ static struct pmu pmu = {
.start  = mipspmu_start,
.stop   = mipspmu_stop,
.read   = mipspmu_read,
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
 };
 
 static unsigned int mipspmu_perf_event_encode(const struct mips_perf_event 
*pev)
-- 
2.7.4



[PATCH v2 13/20] powerpc: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
For PowerPC PMUs that have the capability to exclude events
based on context. Let's advertise that we support the
PERF_PMU_CAP_EXCLUDE capability to ensure that perf doesn't
prevent us from handling events where any exclusion flags are
set.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/perf/core-book3s.c  | 1 +
 arch/powerpc/perf/core-fsl-emb.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 81f8a0c..2f44b09 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2007,6 +2007,7 @@ static struct pmu power_pmu = {
.commit_txn = power_pmu_commit_txn,
.event_idx  = power_pmu_event_idx,
.sched_task = power_pmu_sched_task,
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
 };
 
 /*
diff --git a/arch/powerpc/perf/core-fsl-emb.c b/arch/powerpc/perf/core-fsl-emb.c
index ba48584..cea5bcb 100644
--- a/arch/powerpc/perf/core-fsl-emb.c
+++ b/arch/powerpc/perf/core-fsl-emb.c
@@ -596,6 +596,7 @@ static struct pmu fsl_emb_pmu = {
.start  = fsl_emb_pmu_start,
.stop   = fsl_emb_pmu_stop,
.read   = fsl_emb_pmu_read,
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
 };
 
 /*
-- 
2.7.4



[PATCH v2 14/20] powerpc: perf/core: remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
For PowerPC PMUs that do not support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures that
perf will prevent us from handling events where any exclusion
flags are set. Let's remove the now unnecessary check for exclusion
flags.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/perf/hv-24x7.c |  9 -
 arch/powerpc/perf/hv-gpci.c |  9 -
 arch/powerpc/perf/imc-pmu.c | 18 --
 3 files changed, 36 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 72238ee..d13d8a9 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1306,15 +1306,6 @@ static int h_24x7_event_init(struct perf_event *event)
return -EINVAL;
}
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
/* no branch sampling */
if (has_branch_stack(event))
return -EOPNOTSUPP;
diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 43fabb3..e0ce0e0 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -232,15 +232,6 @@ static int h_gpci_event_init(struct perf_event *event)
return -EINVAL;
}
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
/* no branch sampling */
if (has_branch_stack(event))
return -EOPNOTSUPP;
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 1fafc32b..49c0b1c 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -473,15 +473,6 @@ static int nest_imc_event_init(struct perf_event *event)
if (event->hw.sample_period)
return -EINVAL;
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
@@ -748,15 +739,6 @@ static int core_imc_event_init(struct perf_event *event)
if (event->hw.sample_period)
return -EINVAL;
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
-- 
2.7.4



[PATCH v2 15/20] s390: perf/events: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The s390 cpum_cf and cpum_sf PMUs have the capability to exclude
events based on context. Let's advertise that we support the
PERF_PMU_CAP_EXCLUDE capability to ensure that perf doesn't
prevent us from handling events where any exclusion flags are set.

Signed-off-by: Andrew Murray 
---
 arch/s390/kernel/perf_cpum_cf.c | 1 +
 arch/s390/kernel/perf_cpum_sf.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index cc085e2..7b583ed 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -667,6 +667,7 @@ static struct pmu cpumf_pmu = {
.start_txn= cpumf_pmu_start_txn,
.commit_txn   = cpumf_pmu_commit_txn,
.cancel_txn   = cpumf_pmu_cancel_txn,
+   .capabilities = PERF_PMU_CAP_EXCLUDE,
 };
 
 static int cpumf_pmf_setup(unsigned int cpu, int flags)
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index 5c53e97..25a64aa 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1885,6 +1885,8 @@ static struct pmu cpumf_sampling = {
 
.setup_aux= aux_buffer_setup,
.free_aux = aux_buffer_free,
+
+   .capabilities = PERF_PMU_CAP_EXCLUDE,
 };
 
 static void cpumf_measurement_alert(struct ext_code ext_code,
-- 
2.7.4



[PATCH v2 17/20] x86: perf/core: remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
For drivers that do not support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures
that perf will prevent us from handling events where any exclusion
flags are set. Let's remove the now unnecessary check for
exclusion flags.

Signed-off-by: Andrew Murray 
---
 arch/x86/events/amd/ibs.c  | 12 
 arch/x86/events/amd/power.c|  9 +
 arch/x86/events/intel/cstate.c |  8 +---
 arch/x86/events/intel/rapl.c   |  8 +---
 arch/x86/events/intel/uncore_snb.c |  8 +---
 arch/x86/events/msr.c  |  8 +---
 6 files changed, 5 insertions(+), 48 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index d50bb4d..9e43ef6 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -253,15 +253,6 @@ static int perf_ibs_precise_event(struct perf_event 
*event, u64 *config)
return -EOPNOTSUPP;
 }
 
-static const struct perf_event_attr ibs_notsupp = {
-   .exclude_user   = 1,
-   .exclude_kernel = 1,
-   .exclude_hv = 1,
-   .exclude_idle   = 1,
-   .exclude_host   = 1,
-   .exclude_guest  = 1,
-};
-
 static int perf_ibs_init(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
@@ -282,9 +273,6 @@ static int perf_ibs_init(struct perf_event *event)
if (event->pmu != &perf_ibs->pmu)
return -ENOENT;
 
-   if (perf_flags(&event->attr) & perf_flags(&ibs_notsupp))
-   return -EINVAL;
-
if (config & ~perf_ibs->config_mask)
return -EINVAL;
 
diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
index 2aefacf..ef80c60 100644
--- a/arch/x86/events/amd/power.c
+++ b/arch/x86/events/amd/power.c
@@ -136,14 +136,7 @@ static int pmu_event_init(struct perf_event *event)
return -ENOENT;
 
/* Unsupported modes and filters. */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   /* no sampling */
-   event->attr.sample_period)
+   if (event->attr.sample_period)
return -EINVAL;
 
if (cfg != AMD_POWER_EVENTSEL_PKG)
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 9f8084f..af919c4 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -280,13 +280,7 @@ static int cstate_pmu_event_init(struct perf_event *event)
return -ENOENT;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   event->attr.sample_period) /* no sampling */
+   if (event->attr.sample_period) /* no sampling */
return -EINVAL;
 
if (event->cpu < 0)
diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 32f3e94..9cb94e6 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -397,13 +397,7 @@ static int rapl_pmu_event_init(struct perf_event *event)
return -EINVAL;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   event->attr.sample_period) /* no sampling */
+   if (event->attr.sample_period) /* no sampling */
return -EINVAL;
 
/* must be done before validate_group */
diff --git a/arch/x86/events/intel/uncore_snb.c 
b/arch/x86/events/intel/uncore_snb.c
index 8527c3e..26441eb 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -374,13 +374,7 @@ static int snb_uncore_imc_event_init(struct perf_event 
*event)
return -EINVAL;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   event->attr.sample_period) /* no sampling */
+   if (event->attr.sample_period) /* no sampling */
return -EINVAL;
 
/*
diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index b4771a6..c4fa5d4 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -160,13 +160,7 @@ s

[PATCH v2 16/20] sparc: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
The SPARC PMU has the capability to exclude events based on context
 - let's advertise that we support the PERF_PMU_CAP_EXCLUDE
capability to ensure that perf doesn't prevent us from handling
events where any exclusion flags are set.

Signed-off-by: Andrew Murray 
---
 arch/sparc/kernel/perf_event.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index d3149ba..38fac17 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1571,6 +1571,7 @@ static struct pmu pmu = {
.start_txn  = sparc_pmu_start_txn,
.cancel_txn = sparc_pmu_cancel_txn,
.commit_txn = sparc_pmu_commit_txn,
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
 };
 
 void perf_event_print_debug(void)
-- 
2.7.4



[PATCH v2 18/20] x86: perf/core remove unnecessary checks for exclusion

2018-11-26 Thread Andrew Murray
For x86 PMUs that do not support context exclusion we do not
advertise the PERF_PMU_CAP_EXCLUDE capability. This ensures
that perf will prevent us from handling events where any
exclusion flags are set. Let's remove the now unnecessary
check for exclusion flags.

This change means that amd/iommu and amd/uncore will now
also indicate that they do not support exclude_{hv|idle} and
intel/uncore that it does not support exclude_{guest|host}.

Signed-off-by: Andrew Murray 
---
 arch/x86/events/amd/iommu.c| 5 -
 arch/x86/events/amd/uncore.c   | 5 -
 arch/x86/events/intel/uncore.c | 8 
 3 files changed, 18 deletions(-)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 3210fee..eb35fe4 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -223,11 +223,6 @@ static int perf_iommu_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   /* IOMMU counters do not have usr/os/guest/host bits */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 8671de1..d6ade30 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -201,11 +201,6 @@ static int amd_uncore_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   /* NB and Last level cache counters do not have usr/os/guest/host bits 
*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
-   return -EINVAL;
-
/* and we do not enable counter overflow interrupts */
hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB;
hwc->idx = -1;
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 27a4614..f1d78d9 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -695,14 +695,6 @@ static int uncore_pmu_event_init(struct perf_event *event)
if (pmu->func_id < 0)
return -ENOENT;
 
-   /*
-* Uncore PMU does measure at all privilege level all the time.
-* So it doesn't make sense to specify any exclude bits.
-*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_hv || event->attr.exclude_idle)
-   return -EINVAL;
-
/* Sampling not supported yet */
if (hwc->sample_period)
return -EINVAL;
-- 
2.7.4



[PATCH v2 19/20] x86: perf/core: advertise PMU exclusion capability

2018-11-26 Thread Andrew Murray
For PMUs that have the capability to exclude events based
on context. Let's advertise that we support the
PERF_PMU_CAP_EXCLUDE capability to ensure that perf doesn't
prevent us from handling events where any exclusion flags
are set.

Signed-off-by: Andrew Murray 
---
 arch/x86/events/core.c  | 2 ++
 arch/x86/events/intel/bts.c | 2 +-
 arch/x86/events/intel/pt.c  | 4 +++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index dfb2f7c..3f51916 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2297,6 +2297,8 @@ static struct pmu pmu = {
.event_idx  = x86_pmu_event_idx,
.sched_task = x86_pmu_sched_task,
.task_ctx_size  = sizeof(struct x86_perf_task_context),
+
+   .capabilities   = PERF_PMU_CAP_EXCLUDE,
 };
 
 void arch_perf_update_userpage(struct perf_event *event,
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 24ffa1e..4976695 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -601,7 +601,7 @@ static __init int bts_init(void)
}
 
bts_pmu.capabilities= PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_ITRACE |
- PERF_PMU_CAP_EXCLUSIVE;
+ PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_EXCLUDE;
bts_pmu.task_ctx_nr = perf_sw_context;
bts_pmu.event_init  = bts_event_init;
bts_pmu.add = bts_event_add;
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 8d016ce..2d811f8 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1516,7 +1516,9 @@ static __init int pt_init(void)
pt_pmu.pmu.capabilities =
PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_AUX_SW_DOUBLEBUF;
 
-   pt_pmu.pmu.capabilities |= PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE;
+   pt_pmu.pmu.capabilities  = PERF_PMU_CAP_EXCLUSIVE |
+  PERF_PMU_CAP_ITRACE |
+  PERF_PMU_CAP_EXCLUDE;
pt_pmu.pmu.attr_groups   = pt_attr_groups;
pt_pmu.pmu.task_ctx_nr   = perf_sw_context;
pt_pmu.pmu.event_init= pt_event_init;
-- 
2.7.4



[PATCH v2 20/20] perf/core: remove unused perf_flags

2018-11-26 Thread Andrew Murray
Now that perf_flags is not used we remove it.

Signed-off-by: Andrew Murray 
---
 include/uapi/linux/perf_event.h   | 2 --
 tools/include/uapi/linux/perf_event.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index f35eb72..ba89bd3 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -445,8 +445,6 @@ struct perf_event_query_bpf {
__u32   ids[0];
 };
 
-#define perf_flags(attr)   (*(&(attr)->read_format + 1))
-
 /*
  * Ioctls that can be done on a perf event fd:
  */
diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index f35eb72..ba89bd3 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -445,8 +445,6 @@ struct perf_event_query_bpf {
__u32   ids[0];
 };
 
-#define perf_flags(attr)   (*(&(attr)->read_format + 1))
-
 /*
  * Ioctls that can be done on a perf event fd:
  */
-- 
2.7.4



Re: [PATCH v2 03/20] perf/core: add PERF_PMU_CAP_EXCLUDE for exclusion capable PMUs

2018-11-26 Thread Andrew Murray
On Mon, Nov 26, 2018 at 02:10:24PM +, Robin Murphy wrote:
> Hi Andrew,
> 
> On 26/11/2018 11:12, Andrew Murray wrote:
> > Many PMU drivers do not have the capability to exclude counting events
> > that occur in specific contexts such as idle, kernel, guest, etc. These
> > drivers indicate this by returning an error in their event_init upon
> > testing the events attribute flags. This approach is error prone and
> > often inconsistent.
> > 
> > Let's instead allow PMU drivers to advertise their ability to exclude
> > based on context via a new capability: PERF_PMU_CAP_EXCLUDE. This
> > allows the perf core to reject requests for exclusion events where
> > there is no support in the PMU.
> > 
> > Signed-off-by: Andrew Murray 
> > ---
> >   include/linux/perf_event.h | 1 +
> >   kernel/events/core.c   | 9 +
> >   2 files changed, 10 insertions(+)
> > 
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index b2e806f..69b3d65 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -244,6 +244,7 @@ struct perf_event;
> >   #define PERF_PMU_CAP_EXCLUSIVE0x10
> >   #define PERF_PMU_CAP_ITRACE   0x20
> >   #define PERF_PMU_CAP_HETEROGENEOUS_CPUS   0x40
> > +#define PERF_PMU_CAP_EXCLUDE   0x80
> >   /**
> >* struct pmu - generic performance monitoring unit
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 5a97f34..9afb33c 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -9743,6 +9743,15 @@ static int perf_try_init_event(struct pmu *pmu, 
> > struct perf_event *event)
> > if (ctx)
> > perf_event_ctx_unlock(event->group_leader, ctx);
> > +   if (!ret) {
> > +   if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUDE) &&
> > +   event_has_any_exclude_flag(event)) {
> 
> Technically this is a bisection-breaker, since no driver has this capability
> yet - ideally, this patch should come after all the ones introducing it to
> the relevant drivers (with the removal of the now-redundant code from the
> other drivers at the end).

Indeed. Thought it is possible to first introduce the capability, update the
relevant drivers to advertise it, then add the change to core.c and finally
remove the unnecessary error checks as a result of using the new capability.
This approach could be bisection-proof.

> 
> Alternatively, since we already have several other negative capabilities,
> unless there's a strong feeling against adding any more then it might work
> out simpler to flip it to PERF_PMU_CAP_NO_EXCLUDE, such that we only need to
> introduce the core check then directly replace the open-coded event checks
> with the capability in the appropriate drivers, and need not touch the
> exclusion-supporting ones at all.

This would certaintly be less risky and invasive (e.g. compare the number of
files touched between this v2 and the previous v1).

I'm happy with either approach.

Thanks,

Andrew Murray

> 
> Robin.
> 
> > +   if (event->destroy)
> > +   event->destroy(event);
> > +   ret = -EINVAL;
> > +   }
> > +   }
> > +
> > if (ret)
> > module_put(pmu->module);
> > 


[PATCH v3 00/12] perf/core: Generalise event exclusion checking

2018-12-06 Thread Andrew Murray
Many PMU drivers do not have the capability to exclude counting events
that occur in specific contexts such as idle, kernel, guest, etc. These
drivers indicate this by returning an error in their event_init upon
testing the events attribute flags.

However this approach requires that each time a new event modifier is
added to perf, all the perf drivers need to be modified to indicate that
they don't support the attribute. This results in additional boiler-plate
code common to many drivers that needs to be maintained. Furthermore the
drivers are not consistent with regards to the error value they return
when reporting unsupported attributes.

This patchset allow PMU drivers to advertise their inability to exclude
based on context via a new capability: PERF_PMU_CAP_NO_EXCLUDE. This
allows the perf core to reject requests for exclusion events where there
is no support in the PMU.

This is a functional change, in particular:

 - Some drivers will now additionally (but correctly) report unsupported
   exclusion flags. It's typical for existing userspace tools such as
   perf to handle such errors by retrying the system call without the
   unsupported flags.

 - Drivers that do not support any exclusion that previously reported
   -EPERM or -EOPNOTSUPP will now report -EINVAL - this is consistent
   with the majority and results in userspace perf retrying without
   exclusion.

All drivers touched by this patchset have been compile tested.

Changes from v2:

 - Invert logic from CAP_EXCLUDE to CAP_NO_EXCLUDE

Changes from v1:

 - Changed approach from explicitly rejecting events in unsupporting PMU
   drivers to explicitly advertising a capability in PMU drivers that
   do support exclusion events

 - Added additional information to tools/perf/design.txt

 - Rename event_has_exclude_flags to event_has_any_exclude_flag and
   update commit log to reflect it's a function

Andrew Murray (12):
  perf/doc: update design.txt for exclude_{host|guest} flags
  perf/core: add function to test for event exclusion flags
  perf/core: add PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs
  alpha: perf/core: use PERF_PMU_CAP_NO_EXCLUDE
  arm: perf: conditionally use PERF_PMU_CAP_NO_EXCLUDE
  arm: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs
  drivers/perf: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude
incapable PMUs
  drivers/perf: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude
incapable PMUs
  powerpc: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable
PMUs
  x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs
  x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs
  perf/core: remove unused perf_flags

 arch/alpha/kernel/perf_event.c|  7 +--
 arch/arm/mach-imx/mmdc.c  |  9 ++---
 arch/arm/mm/cache-l2x0-pmu.c  |  9 +
 arch/powerpc/perf/hv-24x7.c   | 10 +-
 arch/powerpc/perf/hv-gpci.c   | 10 +-
 arch/powerpc/perf/imc-pmu.c   | 19 +--
 arch/x86/events/amd/ibs.c | 13 +
 arch/x86/events/amd/iommu.c   |  6 +-
 arch/x86/events/amd/power.c   | 10 ++
 arch/x86/events/amd/uncore.c  |  7 ++-
 arch/x86/events/intel/cstate.c| 12 +++-
 arch/x86/events/intel/rapl.c  |  9 ++---
 arch/x86/events/intel/uncore.c|  9 +
 arch/x86/events/intel/uncore_snb.c|  9 ++---
 arch/x86/events/msr.c | 10 ++
 drivers/perf/arm-cci.c| 10 +-
 drivers/perf/arm-ccn.c|  6 ++
 drivers/perf/arm_dsu_pmu.c|  9 ++---
 drivers/perf/arm_pmu.c| 15 +--
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c |  1 +
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c  |  1 +
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c  |  1 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c  |  9 -
 drivers/perf/qcom_l2_pmu.c|  9 +
 drivers/perf/qcom_l3_pmu.c|  8 +---
 drivers/perf/xgene_pmu.c  |  6 +-
 include/linux/perf_event.h| 10 ++
 include/uapi/linux/perf_event.h   |  2 --
 kernel/events/core.c  |  9 +
 tools/include/uapi/linux/perf_event.h |  2 --
 tools/perf/design.txt |  4 
 31 files changed, 62 insertions(+), 189 deletions(-)

-- 
2.7.4



[PATCH v3 01/12] perf/doc: update design.txt for exclude_{host|guest} flags

2018-12-06 Thread Andrew Murray
Update design.txt to reflect the presence of the exclude_host
and exclude_guest perf flags.

Signed-off-by: Andrew Murray 
---
 tools/perf/design.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index a28dca2..0453ba2 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' bits 
provide a
 way to request that counting of events be restricted to times when the
 CPU is in user, kernel and/or hypervisor mode.
 
+Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
+to request counting of events restricted to guest and host contexts when
+using Linux as the hypervisor.
+
 The 'mmap' and 'munmap' bits allow recording of PROT_EXEC mmap/munmap
 operations, these can be used to relate userspace IP addresses to actual
 code, even after the mapping (or even the whole process) is gone,
-- 
2.7.4



[PATCH v3 02/12] perf/core: add function to test for event exclusion flags

2018-12-06 Thread Andrew Murray
Add a function that tests if any of the perf event exclusion flags
are set on a given event.

Signed-off-by: Andrew Murray 
---
 include/linux/perf_event.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f..b2e806f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1004,6 +1004,15 @@ perf_event__output_id_sample(struct perf_event *event,
 extern void
 perf_log_lost_samples(struct perf_event *event, u64 lost);
 
+static inline bool event_has_any_exclude_flag(struct perf_event *event)
+{
+   struct perf_event_attr *attr = &event->attr;
+
+   return attr->exclude_idle || attr->exclude_user ||
+  attr->exclude_kernel || attr->exclude_hv ||
+  attr->exclude_guest || attr->exclude_host;
+}
+
 static inline bool is_sampling_event(struct perf_event *event)
 {
return event->attr.sample_period != 0;
-- 
2.7.4



[PATCH v3 03/12] perf/core: add PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs

2018-12-06 Thread Andrew Murray
Many PMU drivers do not have the capability to exclude counting events
that occur in specific contexts such as idle, kernel, guest, etc. These
drivers indicate this by returning an error in their event_init upon
testing the events attribute flags. This approach is error prone and
often inconsistent.

Let's instead allow PMU drivers to advertise their inability to exclude
based on context via a new capability: PERF_PMU_CAP_NO_EXCLUDE. This
allows the perf core to reject requests for exclusion events where
there is no support in the PMU.

Signed-off-by: Andrew Murray 
---
 include/linux/perf_event.h | 1 +
 kernel/events/core.c   | 9 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b2e806f..fe92b89 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -244,6 +244,7 @@ struct perf_event;
 #define PERF_PMU_CAP_EXCLUSIVE 0x10
 #define PERF_PMU_CAP_ITRACE0x20
 #define PERF_PMU_CAP_HETEROGENEOUS_CPUS0x40
+#define PERF_PMU_CAP_NO_EXCLUDE0x80
 
 /**
  * struct pmu - generic performance monitoring unit
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5a97f34..5113cfc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9743,6 +9743,15 @@ static int perf_try_init_event(struct pmu *pmu, struct 
perf_event *event)
if (ctx)
perf_event_ctx_unlock(event->group_leader, ctx);
 
+   if (!ret) {
+   if (pmu->capabilities & PERF_PMU_CAP_NO_EXCLUDE &&
+   event_has_any_exclude_flag(event)) {
+   if (event->destroy)
+   event->destroy(event);
+   ret = -EINVAL;
+   }
+   }
+
if (ret)
module_put(pmu->module);
 
-- 
2.7.4



[PATCH v3 04/12] alpha: perf/core: use PERF_PMU_CAP_NO_EXCLUDE

2018-12-06 Thread Andrew Murray
As the Alpha PMU doesn't support context exclusion let's advertise
the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

This change means that __hw_perf_event_init will now also
indicate that it doesn't support exclude_host and exclude_guest and
will now implicitly return -EINVAL instead of -EPERM. This is likely
more desirable as -EPERM will result in a kernel.perf_event_paranoid
related warning from the perf userspace utility.

Signed-off-by: Andrew Murray 
---
 arch/alpha/kernel/perf_event.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/alpha/kernel/perf_event.c b/arch/alpha/kernel/perf_event.c
index 5613aa37..4341ccf 100644
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -630,12 +630,6 @@ static int __hw_perf_event_init(struct perf_event *event)
return ev;
}
 
-   /* The EV67 does not support mode exclusion */
-   if (attr->exclude_kernel || attr->exclude_user
-   || attr->exclude_hv || attr->exclude_idle) {
-   return -EPERM;
-   }
-
/*
 * We place the event type in event_base here and leave calculation
 * of the codes to programme the PMU for alpha_pmu_enable() because
@@ -771,6 +765,7 @@ static struct pmu pmu = {
.start  = alpha_pmu_start,
.stop   = alpha_pmu_stop,
.read   = alpha_pmu_read,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 
-- 
2.7.4



[PATCH v3 05/12] arm: perf: conditionally use PERF_PMU_CAP_NO_EXCLUDE

2018-12-06 Thread Andrew Murray
The ARM PMU driver can be used to represent a variety of ARM based
PMUs. Some of these PMUs do not provide support for context
exclusion, where this is the case we advertise the
PERF_PMU_CAP_NO_EXCLUDE capability to ensure that perf prevents us
from handling events where any exclusion flags are set.

Signed-off-by: Andrew Murray 
---
 drivers/perf/arm_pmu.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 7f01f6f..ea69067 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -357,13 +357,6 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 }
 
 static int
-event_requires_mode_exclusion(struct perf_event_attr *attr)
-{
-   return attr->exclude_idle || attr->exclude_user ||
-  attr->exclude_kernel || attr->exclude_hv;
-}
-
-static int
 __hw_perf_event_init(struct perf_event *event)
 {
struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
@@ -393,9 +386,8 @@ __hw_perf_event_init(struct perf_event *event)
/*
 * Check whether we need to exclude the counter from certain modes.
 */
-   if ((!armpmu->set_event_filter ||
-armpmu->set_event_filter(hwc, &event->attr)) &&
-event_requires_mode_exclusion(&event->attr)) {
+   if (armpmu->set_event_filter &&
+   armpmu->set_event_filter(hwc, &event->attr)) {
pr_debug("ARM performance counters do not support "
 "mode exclusion\n");
return -EOPNOTSUPP;
@@ -861,6 +853,9 @@ int armpmu_register(struct arm_pmu *pmu)
if (ret)
return ret;
 
+   if (!pmu->set_event_filter)
+   pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
+
ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
if (ret)
goto out_destroy;
-- 
2.7.4



[PATCH v3 06/12] arm: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2018-12-06 Thread Andrew Murray
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

Signed-off-by: Andrew Murray 
---
 arch/arm/mach-imx/mmdc.c | 9 ++---
 arch/arm/mm/cache-l2x0-pmu.c | 9 +
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/arm/mach-imx/mmdc.c b/arch/arm/mach-imx/mmdc.c
index 04b3bf7..3453838 100644
--- a/arch/arm/mach-imx/mmdc.c
+++ b/arch/arm/mach-imx/mmdc.c
@@ -293,13 +293,7 @@ static int mmdc_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle||
-   event->attr.exclude_host||
-   event->attr.exclude_guest   ||
-   event->attr.sample_period)
+   if (event->attr.sample_period)
return -EINVAL;
 
if (cfg < 0 || cfg >= MMDC_NUM_COUNTERS)
@@ -455,6 +449,7 @@ static int mmdc_pmu_init(struct mmdc_pmu *pmu_mmdc,
.start  = mmdc_pmu_event_start,
.stop   = mmdc_pmu_event_stop,
.read   = mmdc_pmu_event_update,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
},
.mmdc_base = mmdc_base,
.dev = dev,
diff --git a/arch/arm/mm/cache-l2x0-pmu.c b/arch/arm/mm/cache-l2x0-pmu.c
index afe5b4c..99bcd07 100644
--- a/arch/arm/mm/cache-l2x0-pmu.c
+++ b/arch/arm/mm/cache-l2x0-pmu.c
@@ -314,14 +314,6 @@ static int l2x0_pmu_event_init(struct perf_event *event)
event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
@@ -544,6 +536,7 @@ static __init int l2x0_pmu_init(void)
.del = l2x0_pmu_event_del,
.event_init = l2x0_pmu_event_init,
.attr_groups = l2x0_pmu_attr_groups,
+   .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
};
 
l2x0_pmu_reset();
-- 
2.7.4



[PATCH v3 07/12] drivers/perf: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2018-12-06 Thread Andrew Murray
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

Signed-off-by: Andrew Murray 
---
 drivers/perf/arm-cci.c| 10 +-
 drivers/perf/arm-ccn.c|  6 ++
 drivers/perf/arm_dsu_pmu.c|  9 ++---
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c |  1 +
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c  |  1 +
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c  |  1 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c  |  9 -
 7 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 1bfeb16..bfd03e0 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1327,15 +1327,6 @@ static int cci_pmu_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EOPNOTSUPP;
 
-   /* We have no filtering of any kind */
-   if (event->attr.exclude_user||
-   event->attr.exclude_kernel  ||
-   event->attr.exclude_hv  ||
-   event->attr.exclude_idle||
-   event->attr.exclude_host||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
/*
 * Following the example set by other "uncore" PMUs, we accept any CPU
 * and rewrite its affinity dynamically rather than having perf core
@@ -1433,6 +1424,7 @@ static int cci_pmu_init(struct cci_pmu *cci_pmu, struct 
platform_device *pdev)
.stop   = cci_pmu_stop,
.read   = pmu_read,
.attr_groups= pmu_attr_groups,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
cci_pmu->plat_device = pdev;
diff --git a/drivers/perf/arm-ccn.c b/drivers/perf/arm-ccn.c
index 7dd850e..2ae7602 100644
--- a/drivers/perf/arm-ccn.c
+++ b/drivers/perf/arm-ccn.c
@@ -741,10 +741,7 @@ static int arm_ccn_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   if (has_branch_stack(event) || event->attr.exclude_user ||
-   event->attr.exclude_kernel || event->attr.exclude_hv ||
-   event->attr.exclude_idle || event->attr.exclude_host ||
-   event->attr.exclude_guest) {
+   if (has_branch_stack(event)) {
dev_dbg(ccn->dev, "Can't exclude execution levels!\n");
return -EINVAL;
}
@@ -1290,6 +1287,7 @@ static int arm_ccn_pmu_init(struct arm_ccn *ccn)
.read = arm_ccn_pmu_event_read,
.pmu_enable = arm_ccn_pmu_enable,
.pmu_disable = arm_ccn_pmu_disable,
+   .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
};
 
/* No overflow interrupt? Have to use a timer instead. */
diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index 660cb8a..5851de5 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -562,13 +562,7 @@ static int dsu_pmu_event_init(struct perf_event *event)
return -EINVAL;
}
 
-   if (has_branch_stack(event) ||
-   event->attr.exclude_user ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle ||
-   event->attr.exclude_host ||
-   event->attr.exclude_guest) {
+   if (has_branch_stack(event)) {
dev_dbg(dsu_pmu->pmu.dev, "Can't support filtering\n");
return -EINVAL;
}
@@ -735,6 +729,7 @@ static int dsu_pmu_device_probe(struct platform_device 
*pdev)
.read   = dsu_pmu_read,
 
.attr_groups= dsu_pmu_attr_groups,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
rc = perf_pmu_register(&dsu_pmu->pmu, name, -1);
diff --git a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c 
b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
index 1b10ea0..296fef8 100644
--- a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
@@ -396,6 +396,7 @@ static int hisi_ddrc_pmu_probe(struct platform_device *pdev)
.stop   = hisi_uncore_pmu_stop,
.read   = hisi_uncore_pmu_read,
.attr_groups= hisi_ddrc_pmu_attr_groups,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
ret = perf_pmu_register(&ddrc_pmu->pmu, name, -1);
diff --git a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c 
b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
index 443906e..2553a84 100644
--- a/drivers/perf/hisil

[PATCH v3 08/12] drivers/perf: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2018-12-06 Thread Andrew Murray
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

This change means that qcom_{l2|l3}_pmu will now also indicate that
they do not support exclude_{host|guest} and that xgene_pmu does
not also support exclude_idle and exclude_hv.

Note that for qcom_l2_pmu we now implictly return -EINVAL instead
of -EOPNOTSUPP. This change will result in the perf userspace
utility retrying the perf_event_open system call with fallback
event attributes that do not fail.

Signed-off-by: Andrew Murray 
---
 drivers/perf/qcom_l2_pmu.c | 9 +
 drivers/perf/qcom_l3_pmu.c | 8 +---
 drivers/perf/xgene_pmu.c   | 6 +-
 3 files changed, 3 insertions(+), 20 deletions(-)

diff --git a/drivers/perf/qcom_l2_pmu.c b/drivers/perf/qcom_l2_pmu.c
index 842135c..091b4d7 100644
--- a/drivers/perf/qcom_l2_pmu.c
+++ b/drivers/perf/qcom_l2_pmu.c
@@ -509,14 +509,6 @@ static int l2_cache_event_init(struct perf_event *event)
return -EOPNOTSUPP;
}
 
-   /* We cannot filter accurately so we just don't allow it. */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_hv || event->attr.exclude_idle) {
-   dev_dbg_ratelimited(&l2cache_pmu->pdev->dev,
-   "Can't exclude execution levels\n");
-   return -EOPNOTSUPP;
-   }
-
if (((L2_EVT_GROUP(event->attr.config) > L2_EVT_GROUP_MAX) ||
 ((event->attr.config & ~L2_EVT_MASK) != 0)) &&
(event->attr.config != L2CYCLE_CTR_RAW_CODE)) {
@@ -982,6 +974,7 @@ static int l2_cache_pmu_probe(struct platform_device *pdev)
.stop   = l2_cache_event_stop,
.read   = l2_cache_event_read,
.attr_groups= l2_cache_pmu_attr_grps,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
l2cache_pmu->num_counters = get_num_counters();
diff --git a/drivers/perf/qcom_l3_pmu.c b/drivers/perf/qcom_l3_pmu.c
index 2dc63d6..5d70646 100644
--- a/drivers/perf/qcom_l3_pmu.c
+++ b/drivers/perf/qcom_l3_pmu.c
@@ -495,13 +495,6 @@ static int qcom_l3_cache__event_init(struct perf_event 
*event)
return -ENOENT;
 
/*
-* There are no per-counter mode filters in the PMU.
-*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_hv || event->attr.exclude_idle)
-   return -EINVAL;
-
-   /*
 * Sampling not supported since these events are not core-attributable.
 */
if (hwc->sample_period)
@@ -777,6 +770,7 @@ static int qcom_l3_cache_pmu_probe(struct platform_device 
*pdev)
.read   = qcom_l3_cache__event_read,
 
.attr_groups= qcom_l3_cache_pmu_attr_grps,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
memrc = platform_get_resource(pdev, IORESOURCE_MEM, 0);
diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 0e31f13..dad6169 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -914,11 +914,6 @@ static int xgene_perf_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   /* SOC counters do not have usr/os/guest/host bits */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
/*
@@ -1133,6 +1128,7 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, 
char *name)
.start  = xgene_perf_start,
.stop   = xgene_perf_stop,
.read   = xgene_perf_read,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
/* Hardware counter init */
-- 
2.7.4



[PATCH v3 09/12] powerpc: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2018-12-06 Thread Andrew Murray
For PowerPC PMUs that do not support context exclusion let's
advertise the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that
perf will prevent us from handling events where any exclusion flags
are set. Let's also remove the now unnecessary check for exclusion
flags.

Signed-off-by: Andrew Murray 
---
 arch/powerpc/perf/hv-24x7.c | 10 +-
 arch/powerpc/perf/hv-gpci.c | 10 +-
 arch/powerpc/perf/imc-pmu.c | 19 +--
 3 files changed, 3 insertions(+), 36 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 72238ee..d2b8e60 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1306,15 +1306,6 @@ static int h_24x7_event_init(struct perf_event *event)
return -EINVAL;
}
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
/* no branch sampling */
if (has_branch_stack(event))
return -EOPNOTSUPP;
@@ -1577,6 +1568,7 @@ static struct pmu h_24x7_pmu = {
.start_txn   = h_24x7_event_start_txn,
.commit_txn  = h_24x7_event_commit_txn,
.cancel_txn  = h_24x7_event_cancel_txn,
+   .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 static int hv_24x7_init(void)
diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 43fabb3..735e77b 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -232,15 +232,6 @@ static int h_gpci_event_init(struct perf_event *event)
return -EINVAL;
}
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
/* no branch sampling */
if (has_branch_stack(event))
return -EOPNOTSUPP;
@@ -285,6 +276,7 @@ static struct pmu h_gpci_pmu = {
.start   = h_gpci_event_start,
.stop= h_gpci_event_stop,
.read= h_gpci_event_update,
+   .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 static int hv_gpci_init(void)
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 1fafc32b..1dbb0ee 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -473,15 +473,6 @@ static int nest_imc_event_init(struct perf_event *event)
if (event->hw.sample_period)
return -EINVAL;
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
@@ -748,15 +739,6 @@ static int core_imc_event_init(struct perf_event *event)
if (event->hw.sample_period)
return -EINVAL;
 
-   /* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
@@ -1069,6 +1051,7 @@ static int update_pmu_ops(struct imc_pmu *pmu)
pmu->pmu.stop = imc_event_stop;
pmu->pmu.read = imc_event_update;
pmu->pmu.attr_groups = pmu->attr_groups;
+   pmu->pmu.capabilities = PERF_PMU_CAP_NO_EXCLUDE;
pmu->attr_groups[IMC_FORMAT_ATTR] = &imc_format_group;
 
switch (pmu->domain) {
-- 
2.7.4



[PATCH v3 10/12] x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2018-12-06 Thread Andrew Murray
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

Signed-off-by: Andrew Murray 
---
 arch/x86/events/amd/ibs.c  | 13 +
 arch/x86/events/amd/power.c| 10 ++
 arch/x86/events/intel/cstate.c | 12 +++-
 arch/x86/events/intel/rapl.c   |  9 ++---
 arch/x86/events/intel/uncore_snb.c |  9 ++---
 arch/x86/events/msr.c  | 10 ++
 6 files changed, 12 insertions(+), 51 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index d50bb4d..62f317c 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -253,15 +253,6 @@ static int perf_ibs_precise_event(struct perf_event 
*event, u64 *config)
return -EOPNOTSUPP;
 }
 
-static const struct perf_event_attr ibs_notsupp = {
-   .exclude_user   = 1,
-   .exclude_kernel = 1,
-   .exclude_hv = 1,
-   .exclude_idle   = 1,
-   .exclude_host   = 1,
-   .exclude_guest  = 1,
-};
-
 static int perf_ibs_init(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
@@ -282,9 +273,6 @@ static int perf_ibs_init(struct perf_event *event)
if (event->pmu != &perf_ibs->pmu)
return -ENOENT;
 
-   if (perf_flags(&event->attr) & perf_flags(&ibs_notsupp))
-   return -EINVAL;
-
if (config & ~perf_ibs->config_mask)
return -EINVAL;
 
@@ -537,6 +525,7 @@ static struct perf_ibs perf_ibs_fetch = {
.start  = perf_ibs_start,
.stop   = perf_ibs_stop,
.read   = perf_ibs_read,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
},
.msr= MSR_AMD64_IBSFETCHCTL,
.config_mask= IBS_FETCH_CONFIG_MASK,
diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
index 2aefacf..c5ff084 100644
--- a/arch/x86/events/amd/power.c
+++ b/arch/x86/events/amd/power.c
@@ -136,14 +136,7 @@ static int pmu_event_init(struct perf_event *event)
return -ENOENT;
 
/* Unsupported modes and filters. */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   /* no sampling */
-   event->attr.sample_period)
+   if (event->attr.sample_period)
return -EINVAL;
 
if (cfg != AMD_POWER_EVENTSEL_PKG)
@@ -226,6 +219,7 @@ static struct pmu pmu_class = {
.start  = pmu_event_start,
.stop   = pmu_event_stop,
.read   = pmu_event_read,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 static int power_cpu_exit(unsigned int cpu)
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 9f8084f..15a1981 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -280,13 +280,7 @@ static int cstate_pmu_event_init(struct perf_event *event)
return -ENOENT;
 
/* unsupported modes and filters */
-   if (event->attr.exclude_user   ||
-   event->attr.exclude_kernel ||
-   event->attr.exclude_hv ||
-   event->attr.exclude_idle   ||
-   event->attr.exclude_host   ||
-   event->attr.exclude_guest  ||
-   event->attr.sample_period) /* no sampling */
+   if (event->attr.sample_period) /* no sampling */
return -EINVAL;
 
if (event->cpu < 0)
@@ -437,7 +431,7 @@ static struct pmu cstate_core_pmu = {
.start  = cstate_pmu_event_start,
.stop   = cstate_pmu_event_stop,
.read   = cstate_pmu_event_update,
-   .capabilities   = PERF_PMU_CAP_NO_INTERRUPT,
+   .capabilities   = PERF_PMU_CAP_NO_INTERRUPT | PERF_PMU_CAP_NO_EXCLUDE,
.module = THIS_MODULE,
 };
 
@@ -451,7 +445,7 @@ static struct pmu cstate_pkg_pmu = {
.start  = cstate_pmu_event_start,
.stop   = cstate_pmu_event_stop,
.read   = cstate_pmu_event_update,
-   .capabilities   = PERF_PMU_CAP_NO_INTERRUPT,
+   .capabilities   = PERF_PMU_CAP_NO_INTERRUPT | PERF_PMU_CAP_NO_EXCLUDE,
.module = THIS_MODULE,
 };
 
diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 32f3e94..18a5628 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -397,13 +397,7 @@ static int rapl_pmu_event_init(struct perf_event *event)
return -EINVAL;
 
/* unsupporte

[PATCH v3 11/12] x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2018-12-06 Thread Andrew Murray
For x86 PMUs that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

This change means that amd/iommu and amd/uncore will now also
indicate that they do not support exclude_{hv|idle} and intel/uncore
that it does not support exclude_{guest|host}.

Signed-off-by: Andrew Murray 
---
 arch/x86/events/amd/iommu.c| 6 +-
 arch/x86/events/amd/uncore.c   | 7 ++-
 arch/x86/events/intel/uncore.c | 9 +
 3 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 3210fee..7635c23 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -223,11 +223,6 @@ static int perf_iommu_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   /* IOMMU counters do not have usr/os/guest/host bits */
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
-   return -EINVAL;
-
if (event->cpu < 0)
return -EINVAL;
 
@@ -414,6 +409,7 @@ static const struct pmu iommu_pmu __initconst = {
.read   = perf_iommu_read,
.task_ctx_nr= perf_invalid_context,
.attr_groups= amd_iommu_attr_groups,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 static __init int init_one_iommu(unsigned int idx)
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 8671de1..988cb9c 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -201,11 +201,6 @@ static int amd_uncore_event_init(struct perf_event *event)
if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
 
-   /* NB and Last level cache counters do not have usr/os/guest/host bits 
*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_host || event->attr.exclude_guest)
-   return -EINVAL;
-
/* and we do not enable counter overflow interrupts */
hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB;
hwc->idx = -1;
@@ -307,6 +302,7 @@ static struct pmu amd_nb_pmu = {
.start  = amd_uncore_start,
.stop   = amd_uncore_stop,
.read   = amd_uncore_read,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 static struct pmu amd_llc_pmu = {
@@ -317,6 +313,7 @@ static struct pmu amd_llc_pmu = {
.start  = amd_uncore_start,
.stop   = amd_uncore_stop,
.read   = amd_uncore_read,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
 };
 
 static struct amd_uncore *amd_uncore_alloc(unsigned int cpu)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 27a4614..d516161 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -695,14 +695,6 @@ static int uncore_pmu_event_init(struct perf_event *event)
if (pmu->func_id < 0)
return -ENOENT;
 
-   /*
-* Uncore PMU does measure at all privilege level all the time.
-* So it doesn't make sense to specify any exclude bits.
-*/
-   if (event->attr.exclude_user || event->attr.exclude_kernel ||
-   event->attr.exclude_hv || event->attr.exclude_idle)
-   return -EINVAL;
-
/* Sampling not supported yet */
if (hwc->sample_period)
return -EINVAL;
@@ -800,6 +792,7 @@ static int uncore_pmu_register(struct intel_uncore_pmu *pmu)
.stop   = uncore_pmu_event_stop,
.read   = uncore_pmu_event_read,
.module = THIS_MODULE,
+   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
} else {
pmu->pmu = *pmu->type->pmu;
-- 
2.7.4



[PATCH v3 12/12] perf/core: remove unused perf_flags

2018-12-06 Thread Andrew Murray
Now that perf_flags is not used we remove it.

Signed-off-by: Andrew Murray 
---
 include/uapi/linux/perf_event.h   | 2 --
 tools/include/uapi/linux/perf_event.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index f35eb72..ba89bd3 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -445,8 +445,6 @@ struct perf_event_query_bpf {
__u32   ids[0];
 };
 
-#define perf_flags(attr)   (*(&(attr)->read_format + 1))
-
 /*
  * Ioctls that can be done on a perf event fd:
  */
diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index f35eb72..ba89bd3 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -445,8 +445,6 @@ struct perf_event_query_bpf {
__u32   ids[0];
 };
 
-#define perf_flags(attr)   (*(&(attr)->read_format + 1))
-
 /*
  * Ioctls that can be done on a perf event fd:
  */
-- 
2.7.4



Re: [PATCH v3 00/12] perf/core: Generalise event exclusion checking

2018-12-10 Thread Andrew Murray
On Fri, Dec 07, 2018 at 05:25:17PM +, Will Deacon wrote:
> On Thu, Dec 06, 2018 at 04:47:17PM +0000, Andrew Murray wrote:
> > Many PMU drivers do not have the capability to exclude counting events
> > that occur in specific contexts such as idle, kernel, guest, etc. These
> > drivers indicate this by returning an error in their event_init upon
> > testing the events attribute flags.
> > 
> > However this approach requires that each time a new event modifier is
> > added to perf, all the perf drivers need to be modified to indicate that
> > they don't support the attribute. This results in additional boiler-plate
> > code common to many drivers that needs to be maintained. Furthermore the
> > drivers are not consistent with regards to the error value they return
> > when reporting unsupported attributes.
> > 
> > This patchset allow PMU drivers to advertise their inability to exclude
> > based on context via a new capability: PERF_PMU_CAP_NO_EXCLUDE. This
> > allows the perf core to reject requests for exclusion events where there
> > is no support in the PMU.
> > 
> > This is a functional change, in particular:
> > 
> >  - Some drivers will now additionally (but correctly) report unsupported
> >exclusion flags. It's typical for existing userspace tools such as
> >perf to handle such errors by retrying the system call without the
> >unsupported flags.
> > 
> >  - Drivers that do not support any exclusion that previously reported
> >-EPERM or -EOPNOTSUPP will now report -EINVAL - this is consistent
> >with the majority and results in userspace perf retrying without
> >exclusion.
> > 
> > All drivers touched by this patchset have been compile tested.
> 
> For the bits under arch/arm/ and drivers/perf:
> 
> Acked-by: Will Deacon 
> 
> Note that I've queued the TX2 uncore PMU for 4.21 [1], which could also
> benefit from your new flag.

Ah thanks for pointing this out, I'll send a patch in due course.

Thanks,

Andrwe Murray

> 
> Will
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/commit/?h=for-next/perf&id=69c32972d59388c041268e8206e8eb1acff29b9a


Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags

2018-12-11 Thread Andrew Murray
On Tue, Dec 11, 2018 at 10:06:53PM +1100, Michael Ellerman wrote:
> [ Reviving old thread. ]
> 
> Andrew Murray  writes:
> > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote:
> >> Andrew Murray  writes:
> >> 
> >> > Update design.txt to reflect the presence of the exclude_host
> >> > and exclude_guest perf flags.
> >> >
> >> > Signed-off-by: Andrew Murray 
> >> > ---
> >> >  tools/perf/design.txt | 4 
> >> >  1 file changed, 4 insertions(+)
> >> >
> >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt
> >> > index a28dca2..7de7d83 100644
> >> > --- a/tools/perf/design.txt
> >> > +++ b/tools/perf/design.txt
> >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 
> >> > 'exclude_hv' bits provide a
> >> >  way to request that counting of events be restricted to times when the
> >> >  CPU is in user, kernel and/or hypervisor mode.
> >> >  
> >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
> >> > +to request counting of events restricted to guest and host contexts when
> >> > +using virtualisation.
> >> 
> >> How does exclude_host differ from exclude_hv ?
> >
> > I believe exclude_host / exclude_guest are intented to distinguish
> > between host and guest in the hosted hypervisor context (KVM).
> 
> OK yeah, from the perf-list man page:
> 
>u - user-space counting
>k - kernel counting
>h - hypervisor counting
>I - non idle counting
>G - guest counting (in KVM guests)
>H - host counting (not in KVM guests)
> 
> > Whereas exclude_hv allows to distinguish between guest and
> > hypervisor in the bare-metal type hypervisors.
> 
> Except that's exactly not how we use them on powerpc :)
> 
> We use exclude_hv to exclude "the hypervisor", regardless of whether
> it's KVM or PowerVM (which is a bare-metal hypervisor).
> 
> We don't use exclude_host / exclude_guest at all, which I guess is a
> bug, except I didn't know they existed until this thread.
> 
> eg, in a KVM guest:
> 
>   $ perf record -e cycles:G /bin/bash -c "for i in {0..10}; do :;done"
>   $ perf report -D | grep -Fc "dso: [hypervisor]"
>   16
> 
> 
> > In the case of arm64 - if VHE extensions are present then the host
> > kernel will run at a higher privilege to the guest kernel, in which
> > case there is no distinction between hypervisor and host so we ignore
> > exclude_hv. But where VHE extensions are not present then the host
> > kernel runs at the same privilege level as the guest and we use a
> > higher privilege level to switch between them - in this case we can
> > use exclude_hv to discount that hypervisor role of switching between
> > guests.
> 
> I couldn't find any arm64 perf code using exclude_host/guest at all?

Correct - but this is in flight as I am currently adding support for this
see [1].

> 
> And I don't see any x86 code using exclude_hv.

I can't find any either.

> 
> But maybe that's OK, I just worry this is confusing for users.

There is some extra context regarding this where exclude_guest/exclude_host
was added, see [2] and where exclude_hv was added, see [3]

Generally it seems that exclude_guest/exclude_host relies upon switching
counters off/on on guest/host switch code (which works well in the nested
virt case). Whereas exclude_hv tends to rely solely on hardware capability
based on privilege level (which works well in the bare metal case where
the guest doesn't run at same privilege as the host).

I think from the user perspective exclude_hv allows you to see your overhead
if you are a guest (i.e. work done by bare metal hypervisor associated with
you as the guest). Whereas exclude_guest/exclude_host doesn't allow you to
see events above you (i.e. the kernel hypervisor) if you are the guest...

At least that's how I read this, I've copied in others that may provide
more authoritative feedback.

[1] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-December/033698.html
[2] https://www.spinics.net/lists/kvm/msg53996.html
[3] https://lore.kernel.org/patchwork/patch/143918/

Thanks,

Andrew Murray

> 
> cheers


Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags

2018-12-12 Thread Andrew Murray
On Wed, Dec 12, 2018 at 09:07:42AM +0100, Christoffer Dall wrote:
> On Tue, Dec 11, 2018 at 01:59:03PM +0000, Andrew Murray wrote:
> > On Tue, Dec 11, 2018 at 10:06:53PM +1100, Michael Ellerman wrote:
> > > [ Reviving old thread. ]
> > > 
> > > Andrew Murray  writes:
> > > > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote:
> > > >> Andrew Murray  writes:
> > > >> 
> > > >> > Update design.txt to reflect the presence of the exclude_host
> > > >> > and exclude_guest perf flags.
> > > >> >
> > > >> > Signed-off-by: Andrew Murray 
> > > >> > ---
> > > >> >  tools/perf/design.txt | 4 
> > > >> >  1 file changed, 4 insertions(+)
> > > >> >
> > > >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt
> > > >> > index a28dca2..7de7d83 100644
> > > >> > --- a/tools/perf/design.txt
> > > >> > +++ b/tools/perf/design.txt
> > > >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 
> > > >> > 'exclude_hv' bits provide a
> > > >> >  way to request that counting of events be restricted to times when 
> > > >> > the
> > > >> >  CPU is in user, kernel and/or hypervisor mode.
> > > >> >  
> > > >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a 
> > > >> > way
> > > >> > +to request counting of events restricted to guest and host contexts 
> > > >> > when
> > > >> > +using virtualisation.
> > > >> 
> > > >> How does exclude_host differ from exclude_hv ?
> > > >
> > > > I believe exclude_host / exclude_guest are intented to distinguish
> > > > between host and guest in the hosted hypervisor context (KVM).
> > > 
> > > OK yeah, from the perf-list man page:
> > > 
> > >u - user-space counting
> > >k - kernel counting
> > >h - hypervisor counting
> > >I - non idle counting
> > >G - guest counting (in KVM guests)
> > >H - host counting (not in KVM guests)
> > > 
> > > > Whereas exclude_hv allows to distinguish between guest and
> > > > hypervisor in the bare-metal type hypervisors.
> > > 
> > > Except that's exactly not how we use them on powerpc :)
> > > 
> > > We use exclude_hv to exclude "the hypervisor", regardless of whether
> > > it's KVM or PowerVM (which is a bare-metal hypervisor).
> > > 
> > > We don't use exclude_host / exclude_guest at all, which I guess is a
> > > bug, except I didn't know they existed until this thread.
> > > 
> > > eg, in a KVM guest:
> > > 
> > >   $ perf record -e cycles:G /bin/bash -c "for i in {0..10}; do :;done"
> > >   $ perf report -D | grep -Fc "dso: [hypervisor]"
> > >   16
> > > 
> > > 
> > > > In the case of arm64 - if VHE extensions are present then the host
> > > > kernel will run at a higher privilege to the guest kernel, in which
> > > > case there is no distinction between hypervisor and host so we ignore
> > > > exclude_hv. But where VHE extensions are not present then the host
> > > > kernel runs at the same privilege level as the guest and we use a
> > > > higher privilege level to switch between them - in this case we can
> > > > use exclude_hv to discount that hypervisor role of switching between
> > > > guests.
> > > 
> > > I couldn't find any arm64 perf code using exclude_host/guest at all?
> > 
> > Correct - but this is in flight as I am currently adding support for this
> > see [1].
> > 
> > > 
> > > And I don't see any x86 code using exclude_hv.
> > 
> > I can't find any either.
> > 
> > > 
> > > But maybe that's OK, I just worry this is confusing for users.
> > 
> > There is some extra context regarding this where exclude_guest/exclude_host
> > was added, see [2] and where exclude_hv was added, see [3]
> > 
> > Generally it seems that exclude_guest/exclude_host relies upon switching
> > counters off/on on guest/host switch code (which works well in the nested
> > virt case). Wherea

Re: [PATCH 01/10] PCI: designware-ep: Add multiple PFs support for DWC

2019-08-15 Thread Andrew Murray
-designware.c 
> b/drivers/pci/controller/dwc/pcie-designware.c
> index 7d25102..c99cee4 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -158,6 +158,43 @@ static void dw_pcie_writel_ob_unroll(struct dw_pcie 
> *pci, u32 index, u32 reg,
>   dw_pcie_writel_atu(pci, offset + reg, val);
>  }
>  
> +static void dw_pcie_prog_ep_outbound_atu_unroll(struct dw_pcie *pci, u8 
> func_no,
> + int index, int type,
> + u64 cpu_addr, u64 pci_addr,
> + u32 size)
> +{
> + u32 retries, val;
> +
> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_LOWER_BASE,
> +  lower_32_bits(cpu_addr));
> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_UPPER_BASE,
> +  upper_32_bits(cpu_addr));
> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_LIMIT,
> +  lower_32_bits(cpu_addr + size - 1));
> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_LOWER_TARGET,
> +  lower_32_bits(pci_addr));
> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_UPPER_TARGET,
> +  upper_32_bits(pci_addr));
> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL1,
> +  type | PCIE_ATU_FUNC_NUM(func_no));

With the exception of this line, the rest of this function is identical to
dw_pcie_prog_outbound_atu_unroll.

> + dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL2,
> +  PCIE_ATU_ENABLE);
> +
> + /*
> +  * Make sure ATU enable takes effect before any subsequent config
> +  * and I/O accesses.
> +  */
> + for (retries = 0; retries < LINK_WAIT_MAX_IATU_RETRIES; retries++) {
> + val = dw_pcie_readl_ob_unroll(pci, index,
> +   PCIE_ATU_UNR_REGION_CTRL2);
> + if (val & PCIE_ATU_ENABLE)
> + return;
> +
> + mdelay(LINK_WAIT_IATU);
> + }
> + dev_err(pci->dev, "Outbound iATU is not being enabled\n");
> +}
> +
>  static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, int index,
>int type, u64 cpu_addr,
>u64 pci_addr, u32 size)
> @@ -194,6 +231,51 @@ static void dw_pcie_prog_outbound_atu_unroll(struct 
> dw_pcie *pci, int index,
>   dev_err(pci->dev, "Outbound iATU is not being enabled\n");
>  }
>  
> +void dw_pcie_prog_ep_outbound_atu(struct dw_pcie *pci, u8 func_no, int index,
> +   int type, u64 cpu_addr, u64 pci_addr,
> +   u32 size)
> +{
> + u32 retries, val;
> +
> + if (pci->ops->cpu_addr_fixup)
> + cpu_addr = pci->ops->cpu_addr_fixup(pci, cpu_addr);
> +
> + if (pci->iatu_unroll_enabled) {
> + dw_pcie_prog_ep_outbound_atu_unroll(pci, func_no, index, type,
> + cpu_addr, pci_addr, size);
> + return;
> + }
> +
> + dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT,
> +PCIE_ATU_REGION_OUTBOUND | index);
> + dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_BASE,
> +lower_32_bits(cpu_addr));
> + dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_BASE,
> +upper_32_bits(cpu_addr));
> + dw_pcie_writel_dbi(pci, PCIE_ATU_LIMIT,
> +lower_32_bits(cpu_addr + size - 1));
> + dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_TARGET,
> +lower_32_bits(pci_addr));
> + dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_TARGET,
> +upper_32_bits(pci_addr));
> + dw_pcie_writel_dbi(pci, PCIE_ATU_CR1, type |
> +PCIE_ATU_FUNC_NUM(func_no));

The same here, this is identical to dw_pcie_prog_outbound_atu with the
exception of this line.

Is there a way you can avoid all of this duplicated code?

Thanks,

Andrew Murray 

> + dw_pcie_writel_dbi(pci, PCIE_ATU_CR2, PCIE_ATU_ENABLE);
> +
> + /*
> +  * Make sure ATU enable takes effect before any subsequent config
> +  * and I/O accesses.
> +  */
> + for (retries = 0; retries < LINK_WAIT_MAX_IATU_RETRIES; retries++) {
> + val = dw_pcie_readl_dbi(pci, PCIE_ATU_CR2);
> + if (val & PCIE_ATU_ENABLE)
> + return;
> +
> + mdelay(LINK_WAIT_IATU);
> + }
> + d

Re: [PATCH 02/10] PCI: designware-ep: Add the doorbell mode of MSI-X in EP mode

2019-08-15 Thread Andrew Murray
On Thu, Aug 15, 2019 at 04:37:08PM +0800, Xiaowei Bao wrote:
> Add the doorbell mode of MSI-X in EP mode.
> 
> Signed-off-by: Xiaowei Bao 
> ---
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 14 ++
>  drivers/pci/controller/dwc/pcie-designware.h| 14 ++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c 
> b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 75e2955..e3a7cdf 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -454,6 +454,20 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 
> func_no,
>   return 0;
>  }
>  
> +int dw_pcie_ep_raise_msix_irq_doorbell(struct dw_pcie_ep *ep, u8 func_no,
> +u16 interrupt_num)
> +{
> + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> + u32 msg_data;
> +
> + msg_data = (func_no << PCIE_MSIX_DOORBELL_PF_SHIFT) |
> +(interrupt_num - 1);
> +
> + dw_pcie_writel_dbi(pci, PCIE_MSIX_DOORBELL, msg_data);
> +
> + return 0;
> +}
> +
>  int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
> u16 interrupt_num)

Have I understood correctly that the hardware provides an alternative mechanism
that allows for raising MSI-X interrupts without the bother of reading the
capabilities registers?

If so is there any good reason to keep dw_pcie_ep_raise_msix_irq? (And thus use
it in dw_plat_pcie_ep_raise_irq also)?


>  {
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h 
> b/drivers/pci/controller/dwc/pcie-designware.h
> index 2b291e8..cd903e9 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -88,6 +88,11 @@
>  #define PCIE_MISC_CONTROL_1_OFF  0x8BC
>  #define PCIE_DBI_RO_WR_ENBIT(0)
>  
> +#define PCIE_MSIX_DOORBELL   0x948
> +#define PCIE_MSIX_DOORBELL_PF_SHIFT  24
> +#define PCIE_MSIX_DOORBELL_VF_SHIFT  16
> +#define PCIE_MSIX_DOORBELL_VF_ACTIVE BIT(15)

The _VF defines are not used, I'd suggest removing them.

Thanks,

Andrew Murray

> +
>  /*
>   * iATU Unroll-specific register definitions
>   * From 4.80 core version the address translation will be made by unroll
> @@ -399,6 +404,8 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 
> func_no,
>u8 interrupt_num);
>  int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>u16 interrupt_num);
> +int dw_pcie_ep_raise_msix_irq_doorbell(struct dw_pcie_ep *ep, u8 func_no,
> +u16 interrupt_num);
>  void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar);
>  #else
>  static inline void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
> @@ -431,6 +438,13 @@ static inline int dw_pcie_ep_raise_msix_irq(struct 
> dw_pcie_ep *ep, u8 func_no,
>   return 0;
>  }
>  
> +static inline int dw_pcie_ep_raise_msix_irq_doorbell(struct dw_pcie_ep *ep,
> +  u8 func_no,
> +  u16 interrupt_num)
> +{
> + return 0;
> +}
> +
>  static inline void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno 
> bar)
>  {
>  }
> -- 
> 2.9.5
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


Re: [PATCH 05/10] PCI: layerscape: Modify the way of getting capability with different PEX

2019-08-15 Thread Andrew Murray
On Thu, Aug 15, 2019 at 04:37:11PM +0800, Xiaowei Bao wrote:
> The different PCIe controller in one board may be have different
> capability of MSI or MSIX, so change the way of getting the MSI
> capability, make it more flexible.
> 
> Signed-off-by: Xiaowei Bao 
> ---
>  drivers/pci/controller/dwc/pci-layerscape-ep.c | 28 
> +++---
>  1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c 
> b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> index be61d96..9404ca0 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> @@ -22,6 +22,7 @@
>  
>  struct ls_pcie_ep {
>   struct dw_pcie  *pci;
> + struct pci_epc_features *ls_epc;
>  };
>  
>  #define to_ls_pcie_ep(x) dev_get_drvdata((x)->dev)
> @@ -40,25 +41,26 @@ static const struct of_device_id ls_pcie_ep_of_match[] = {
>   { },
>  };
>  
> -static const struct pci_epc_features ls_pcie_epc_features = {
> - .linkup_notifier = false,
> - .msi_capable = true,
> - .msix_capable = false,
> -};
> -
>  static const struct pci_epc_features*
>  ls_pcie_ep_get_features(struct dw_pcie_ep *ep)
>  {
> - return &ls_pcie_epc_features;
> + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> + struct ls_pcie_ep *pcie = to_ls_pcie_ep(pci);
> +
> + return pcie->ls_epc;
>  }
>  
>  static void ls_pcie_ep_init(struct dw_pcie_ep *ep)
>  {
>   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> + struct ls_pcie_ep *pcie = to_ls_pcie_ep(pci);
>   enum pci_barno bar;
>  
>   for (bar = BAR_0; bar <= BAR_5; bar++)
>   dw_pcie_ep_reset_bar(pci, bar);
> +
> + pcie->ls_epc->msi_capable = ep->msi_cap ? true : false;
> + pcie->ls_epc->msix_capable = ep->msix_cap ? true : false;
>  }
>  
>  static int ls_pcie_ep_raise_irq(struct dw_pcie_ep *ep, u8 func_no,
> @@ -118,6 +120,7 @@ static int __init ls_pcie_ep_probe(struct platform_device 
> *pdev)
>   struct device *dev = &pdev->dev;
>   struct dw_pcie *pci;
>   struct ls_pcie_ep *pcie;
> + struct pci_epc_features *ls_epc;
>   struct resource *dbi_base;
>   int ret;
>  
> @@ -129,6 +132,10 @@ static int __init ls_pcie_ep_probe(struct 
> platform_device *pdev)
>   if (!pci)
>   return -ENOMEM;
>  
> + ls_epc = devm_kzalloc(dev, sizeof(*ls_epc), GFP_KERNEL);
> + if (!ls_epc)
> + return -ENOMEM;
> +
>   dbi_base = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
>   pci->dbi_base = devm_pci_remap_cfg_resource(dev, dbi_base);
>   if (IS_ERR(pci->dbi_base))
> @@ -139,6 +146,13 @@ static int __init ls_pcie_ep_probe(struct 
> platform_device *pdev)
>   pci->ops = &ls_pcie_ep_ops;
>   pcie->pci = pci;
>  
> + ls_epc->linkup_notifier = false,
> + ls_epc->msi_capable = true,
> + ls_epc->msix_capable = true,

As [msi,msix]_capable is shortly set from ls_pcie_ep_init - is there any
reason to set them here (to potentially incorrect values)?

Thanks,

Andrew Murray

> + ls_epc->bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> +
> + pcie->ls_epc = ls_epc;
> +
>   platform_set_drvdata(pdev, pcie);
>  
>   ret = ls_add_pcie_ep(pcie, pdev);
> -- 
> 2.9.5
> 


Re: [PATCH 01/10] PCI: designware-ep: Add multiple PFs support for DWC

2019-08-16 Thread Andrew Murray
On Fri, Aug 16, 2019 at 02:55:41AM +, Xiaowei Bao wrote:
> 
> 
> > -Original Message-
> > From: Andrew Murray 
> > Sent: 2019年8月15日 19:32
> > To: Xiaowei Bao 
> > Cc: jingooh...@gmail.com; gustavo.pimen...@synopsys.com;
> > bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com;
> > shawn...@kernel.org; Leo Li ; kis...@ti.com;
> > lorenzo.pieral...@arm.com; a...@arndb.de; gre...@linuxfoundation.org;
> > M.h. Lian ; Mingkai Hu ;
> > Roy Zang ; linux-...@vger.kernel.org;
> > devicet...@vger.kernel.org; linux-ker...@vger.kernel.org;
> > linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org
> > Subject: Re: [PATCH 01/10] PCI: designware-ep: Add multiple PFs support for
> > DWC
> > 
> > On Thu, Aug 15, 2019 at 04:37:07PM +0800, Xiaowei Bao wrote:
> > > Add multiple PFs support for DWC, different PF have different config
> > > space, we use pf-offset property which get from the DTS to access the
> > > different pF config space.
> > 
> > Thanks for the patch. I haven't seen a cover letter for this series, is 
> > there one
> > missing?
> Maybe I miss, I will add you to review next time, thanks a lot for your 
> comments.
> > 
> > 
> > >
> > > Signed-off-by: Xiaowei Bao 
> > > ---
> > >  drivers/pci/controller/dwc/pcie-designware-ep.c |  97
> > +-
> > >  drivers/pci/controller/dwc/pcie-designware.c| 105
> > ++--
> > >  drivers/pci/controller/dwc/pcie-designware.h|  10 ++-
> > >  include/linux/pci-epc.h |   1 +
> > >  4 files changed, 164 insertions(+), 49 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > index 2bf5a35..75e2955 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > @@ -19,12 +19,14 @@ void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
> > >   pci_epc_linkup(epc);
> > >  }
> > >
> > > -static void __dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno
> > bar,
> > > -int flags)
> > > +static void __dw_pcie_ep_reset_bar(struct dw_pcie *pci, u8 func_no,
> > > +enum pci_barno bar, int flags)
> > >  {
> > >   u32 reg;
> > > + struct pci_epc *epc = pci->ep.epc;
> > > + u32 pf_base = func_no * epc->pf_offset;
> > >
> > > - reg = PCI_BASE_ADDRESS_0 + (4 * bar);
> > > + reg = pf_base + PCI_BASE_ADDRESS_0 + (4 * bar);
> > 
> > I think I'd rather see this arithmetic (and the one for determining pf_base)
> > inside a macro or inline header function. This would make this code more
> > readable and reduce the chances of an error by avoiding duplication of code.
> > 
> > For example look at cdns_pcie_ep_fn_writeb and
> > ROCKCHIP_PCIE_EP_FUNC_BASE for examples of other EP drivers that do
> > this.
> Agree, this looks fine, thanks a lot for your comments, I will use this way 
> to access
> the registers in next version patch.
> > 
> > 
> > >   dw_pcie_dbi_ro_wr_en(pci);
> > >   dw_pcie_writel_dbi2(pci, reg, 0x0);
> > >   dw_pcie_writel_dbi(pci, reg, 0x0);
> > > @@ -37,7 +39,12 @@ static void __dw_pcie_ep_reset_bar(struct dw_pcie
> > > *pci, enum pci_barno bar,
> > >
> > >  void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)  {
> > > - __dw_pcie_ep_reset_bar(pci, bar, 0);
> > > + u8 func_no, funcs;
> > > +
> > > + funcs = pci->ep.epc->max_functions;
> > > +
> > > + for (func_no = 0; func_no < funcs; func_no++)
> > > + __dw_pcie_ep_reset_bar(pci, func_no, bar, 0);
> > >  }
> > >
> > >  static u8 __dw_pcie_ep_find_next_cap(struct dw_pcie *pci, u8 cap_ptr,
> > > @@ -78,28 +85,29 @@ static int dw_pcie_ep_write_header(struct pci_epc
> > > *epc, u8 func_no,  {
> > >   struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> > >   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > + u32 pf_base = func_no * epc->pf_offset;
> > >
> > >   dw_pcie_dbi_ro_wr_en(pci);
> > > - dw_pcie_writew_dbi(pci, PCI_VENDOR_ID, hdr->vendorid);
> > > - dw_pcie_writew_dbi(pci, PCI_DEVICE_ID, hdr->deviceid);
> > > - dw_pcie_writeb_dbi(pci, PCI_REVISION_ID, hdr->revid);
> > > - dw_pcie_writeb_dbi(

Re: [PATCH 02/10] PCI: designware-ep: Add the doorbell mode of MSI-X in EP mode

2019-08-16 Thread Andrew Murray
On Fri, Aug 16, 2019 at 02:58:31AM +, Xiaowei Bao wrote:
> 
> 
> > -Original Message-
> > From: Andrew Murray 
> > Sent: 2019年8月15日 19:54
> > To: Xiaowei Bao 
> > Cc: jingooh...@gmail.com; gustavo.pimen...@synopsys.com;
> > bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com;
> > shawn...@kernel.org; Leo Li ; kis...@ti.com;
> > lorenzo.pieral...@arm.com; a...@arndb.de; gre...@linuxfoundation.org;
> > M.h. Lian ; Mingkai Hu ;
> > Roy Zang ; linux-...@vger.kernel.org;
> > devicet...@vger.kernel.org; linux-ker...@vger.kernel.org;
> > linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org
> > Subject: Re: [PATCH 02/10] PCI: designware-ep: Add the doorbell mode of
> > MSI-X in EP mode
> > 
> > On Thu, Aug 15, 2019 at 04:37:08PM +0800, Xiaowei Bao wrote:
> > > Add the doorbell mode of MSI-X in EP mode.
> > >
> > > Signed-off-by: Xiaowei Bao 
> > > ---
> > >  drivers/pci/controller/dwc/pcie-designware-ep.c | 14 ++
> > >  drivers/pci/controller/dwc/pcie-designware.h| 14 ++
> > >  2 files changed, 28 insertions(+)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > index 75e2955..e3a7cdf 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > @@ -454,6 +454,20 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep
> > *ep, u8 func_no,
> > >   return 0;
> > >  }
> > >
> > > +int dw_pcie_ep_raise_msix_irq_doorbell(struct dw_pcie_ep *ep, u8
> > func_no,
> > > +u16 interrupt_num)
> > > +{
> > > + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > + u32 msg_data;
> > > +
> > > + msg_data = (func_no << PCIE_MSIX_DOORBELL_PF_SHIFT) |
> > > +(interrupt_num - 1);
> > > +
> > > + dw_pcie_writel_dbi(pci, PCIE_MSIX_DOORBELL, msg_data);
> > > +
> > > + return 0;
> > > +}
> > > +
> > >  int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
> > > u16 interrupt_num)
> > 
> > Have I understood correctly that the hardware provides an alternative
> > mechanism that allows for raising MSI-X interrupts without the bother of
> > reading the capabilities registers?
> Yes, the hardware provide two way to MSI-X, please check the page 492 of 
> DWC_pcie_dm_registers_4.30 Menu.
> MSIX_DOORBELL_OFF on page 492 0x948 Description: MSI-X Doorbell Register>

Thanks for the reference.

> > 
> > If so is there any good reason to keep dw_pcie_ep_raise_msix_irq? (And thus
> > use it in dw_plat_pcie_ep_raise_irq also)?
> I am not sure, but I think the dw_pcie_ep_raise_msix_irq function is not 
> correct, 
> because I think we can't get the MSIX table from the address ep->phys_base + 
> tbl_addr, 
> but I also don't know where I can get the correct MSIX table.

Well it looks like this function is used by snps,dw-pcie-ep and snps,dw-pcie,
perhaps the doorbell mode isn't available on that hardware.

> > 
> > 
> > >  {
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware.h
> > > b/drivers/pci/controller/dwc/pcie-designware.h
> > > index 2b291e8..cd903e9 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > > @@ -88,6 +88,11 @@
> > >  #define PCIE_MISC_CONTROL_1_OFF  0x8BC
> > >  #define PCIE_DBI_RO_WR_ENBIT(0)
> > >
> > > +#define PCIE_MSIX_DOORBELL   0x948
> > > +#define PCIE_MSIX_DOORBELL_PF_SHIFT  24
> > > +#define PCIE_MSIX_DOORBELL_VF_SHIFT  16
> > > +#define PCIE_MSIX_DOORBELL_VF_ACTIVE BIT(15)
> > 
> > The _VF defines are not used, I'd suggest removing them.
> In fact, I will add the SRIOV support in this file, the SRIOV feature have 
> verified 
> In my board, but I need wait the EP framework SRIOV patch merge, 
> so I defined these two macros.

I'd suggest adding the VF macros along with the SRIOV feature.

Thanks,

Andrew Murray

> > 
> > Thanks,
> > 
> > Andrew Murray
> > 
> > > +
> > >  /*
> > >   * iATU Unroll-specific register definitions
> > >   * From 4.80 core version the address translation will be made by
> > > unroll @@ -399,6 +404,8 @@ int dw

Re: [PATCH 05/10] PCI: layerscape: Modify the way of getting capability with different PEX

2019-08-16 Thread Andrew Murray
On Fri, Aug 16, 2019 at 03:00:00AM +, Xiaowei Bao wrote:
> 
> 
> > -Original Message-
> > From: Andrew Murray 
> > Sent: 2019年8月15日 20:51
> > To: Xiaowei Bao 
> > Cc: jingooh...@gmail.com; gustavo.pimen...@synopsys.com;
> > bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com;
> > shawn...@kernel.org; Leo Li ; kis...@ti.com;
> > lorenzo.pieral...@arm.com; a...@arndb.de; gre...@linuxfoundation.org;
> > M.h. Lian ; Mingkai Hu ;
> > Roy Zang ; linux-...@vger.kernel.org;
> > devicet...@vger.kernel.org; linux-ker...@vger.kernel.org;
> > linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org
> > Subject: Re: [PATCH 05/10] PCI: layerscape: Modify the way of getting
> > capability with different PEX
> > 
> > On Thu, Aug 15, 2019 at 04:37:11PM +0800, Xiaowei Bao wrote:
> > > The different PCIe controller in one board may be have different
> > > capability of MSI or MSIX, so change the way of getting the MSI
> > > capability, make it more flexible.
> > >
> > > Signed-off-by: Xiaowei Bao 
> > > ---
> > >  drivers/pci/controller/dwc/pci-layerscape-ep.c | 28
> > > +++---
> > >  1 file changed, 21 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > index be61d96..9404ca0 100644
> > > --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > @@ -22,6 +22,7 @@
> > >
> > >  struct ls_pcie_ep {
> > >   struct dw_pcie  *pci;
> > > + struct pci_epc_features *ls_epc;
> > >  };
> > >
> > >  #define to_ls_pcie_ep(x) dev_get_drvdata((x)->dev)
> > > @@ -40,25 +41,26 @@ static const struct of_device_id
> > ls_pcie_ep_of_match[] = {
> > >   { },
> > >  };
> > >
> > > -static const struct pci_epc_features ls_pcie_epc_features = {
> > > - .linkup_notifier = false,
> > > - .msi_capable = true,
> > > - .msix_capable = false,
> > > -};
> > > -
> > >  static const struct pci_epc_features*  ls_pcie_ep_get_features(struct
> > > dw_pcie_ep *ep)  {
> > > - return &ls_pcie_epc_features;
> > > + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > + struct ls_pcie_ep *pcie = to_ls_pcie_ep(pci);
> > > +
> > > + return pcie->ls_epc;
> > >  }
> > >
> > >  static void ls_pcie_ep_init(struct dw_pcie_ep *ep)  {
> > >   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > + struct ls_pcie_ep *pcie = to_ls_pcie_ep(pci);
> > >   enum pci_barno bar;
> > >
> > >   for (bar = BAR_0; bar <= BAR_5; bar++)
> > >   dw_pcie_ep_reset_bar(pci, bar);
> > > +
> > > + pcie->ls_epc->msi_capable = ep->msi_cap ? true : false;
> > > + pcie->ls_epc->msix_capable = ep->msix_cap ? true : false;
> > >  }
> > >
> > >  static int ls_pcie_ep_raise_irq(struct dw_pcie_ep *ep, u8 func_no, @@
> > > -118,6 +120,7 @@ static int __init ls_pcie_ep_probe(struct platform_device
> > *pdev)
> > >   struct device *dev = &pdev->dev;
> > >   struct dw_pcie *pci;
> > >   struct ls_pcie_ep *pcie;
> > > + struct pci_epc_features *ls_epc;
> > >   struct resource *dbi_base;
> > >   int ret;
> > >
> > > @@ -129,6 +132,10 @@ static int __init ls_pcie_ep_probe(struct
> > platform_device *pdev)
> > >   if (!pci)
> > >   return -ENOMEM;
> > >
> > > + ls_epc = devm_kzalloc(dev, sizeof(*ls_epc), GFP_KERNEL);
> > > + if (!ls_epc)
> > > + return -ENOMEM;
> > > +
> > >   dbi_base = platform_get_resource_byname(pdev, IORESOURCE_MEM,
> > "regs");
> > >   pci->dbi_base = devm_pci_remap_cfg_resource(dev, dbi_base);
> > >   if (IS_ERR(pci->dbi_base))
> > > @@ -139,6 +146,13 @@ static int __init ls_pcie_ep_probe(struct
> > platform_device *pdev)
> > >   pci->ops = &ls_pcie_ep_ops;
> > >   pcie->pci = pci;
> > >
> > > + ls_epc->linkup_notifier = false,
> > > + ls_epc->msi_capable = true,
> > > + ls_epc->msix_capable = true,
> > 
> > As [msi,msix]_capable is shortly set from ls_pcie_ep_init - is there any 
> > reason
> > to set them here (to potentially incorrect values)?
> This is a INIT value, maybe false is better for msi_capable and msix_capable, 
> of course, we don't need to set it.

ls_epc is kzalloc'd and so all zeros, so you get false for free. I think you
can remove these two lines (or all three if you don't care that linkup_notifier
isn't explicitly set).

Thanks,

Andrew Murray

> > 
> > Thanks,
> > 
> > Andrew Murray
> > 
> > > + ls_epc->bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> > > +
> > > + pcie->ls_epc = ls_epc;
> > > +
> > >   platform_set_drvdata(pdev, pcie);
> > >
> > >   ret = ls_add_pcie_ep(pcie, pdev);
> > > --
> > > 2.9.5
> > >


Re: [PATCH 01/10] PCI: designware-ep: Add multiple PFs support for DWC

2019-08-16 Thread Andrew Murray
On Fri, Aug 16, 2019 at 11:00:01AM +, Xiaowei Bao wrote:
> 
> 
> > -Original Message-
> > From: Andrew Murray 
> > Sent: 2019年8月16日 17:45
> > To: Xiaowei Bao 
> > Cc: jingooh...@gmail.com; gustavo.pimen...@synopsys.com;
> > mark.rutl...@arm.com; shawn...@kernel.org; Leo Li
> > ; kis...@ti.com; lorenzo.pieral...@arm.com;
> > a...@arndb.de; gre...@linuxfoundation.org; M.h. Lian
> > ; Roy Zang ;
> > linux-...@vger.kernel.org; devicet...@vger.kernel.org;
> > linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> > linuxppc-dev@lists.ozlabs.org; Z.q. Hou 
> > Subject: Re: [PATCH 01/10] PCI: designware-ep: Add multiple PFs support for
> > DWC
> > 
> > On Fri, Aug 16, 2019 at 02:55:41AM +, Xiaowei Bao wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Andrew Murray 
> > > > Sent: 2019年8月15日 19:32
> > > > To: Xiaowei Bao 
> > > > Cc: jingooh...@gmail.com; gustavo.pimen...@synopsys.com;
> > > > bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com;
> > > > shawn...@kernel.org; Leo Li ; kis...@ti.com;
> > > > lorenzo.pieral...@arm.com; a...@arndb.de;
> > > > gre...@linuxfoundation.org; M.h. Lian ;
> > > > Mingkai Hu ; Roy Zang ;
> > > > linux-...@vger.kernel.org; devicet...@vger.kernel.org;
> > > > linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> > > > linuxppc-dev@lists.ozlabs.org
> > > > Subject: Re: [PATCH 01/10] PCI: designware-ep: Add multiple PFs
> > > > support for DWC
> > > >
> > > > On Thu, Aug 15, 2019 at 04:37:07PM +0800, Xiaowei Bao wrote:
> > > > > Add multiple PFs support for DWC, different PF have different
> > > > > config space, we use pf-offset property which get from the DTS to
> > > > > access the different pF config space.
> > > >
> > > > Thanks for the patch. I haven't seen a cover letter for this series,
> > > > is there one missing?
> > > Maybe I miss, I will add you to review next time, thanks a lot for your
> > comments.
> > > >
> > > >
> > > > >
> > > > > Signed-off-by: Xiaowei Bao 
> > > > > ---
> > > > >  drivers/pci/controller/dwc/pcie-designware-ep.c |  97
> > > > +-
> > > > >  drivers/pci/controller/dwc/pcie-designware.c| 105
> > > > ++--
> > > > >  drivers/pci/controller/dwc/pcie-designware.h|  10 ++-
> > > > >  include/linux/pci-epc.h |   1 +
> > > > >  4 files changed, 164 insertions(+), 49 deletions(-)
> > > > >
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > index 2bf5a35..75e2955 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > @@ -19,12 +19,14 @@ void dw_pcie_ep_linkup(struct dw_pcie_ep
> > *ep)
> > > > >   pci_epc_linkup(epc);
> > > > >  }
> > > > >
> > > > > -static void __dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum
> > > > > pci_barno
> > > > bar,
> > > > > -int flags)
> > > > > +static void __dw_pcie_ep_reset_bar(struct dw_pcie *pci, u8 func_no,
> > > > > +enum pci_barno bar, int flags)
> > > > >  {
> > > > >   u32 reg;
> > > > > + struct pci_epc *epc = pci->ep.epc;
> > > > > + u32 pf_base = func_no * epc->pf_offset;
> > > > >
> > > > > - reg = PCI_BASE_ADDRESS_0 + (4 * bar);
> > > > > + reg = pf_base + PCI_BASE_ADDRESS_0 + (4 * bar);
> > > >
> > > > I think I'd rather see this arithmetic (and the one for determining
> > > > pf_base) inside a macro or inline header function. This would make
> > > > this code more readable and reduce the chances of an error by avoiding
> > duplication of code.
> > > >
> > > > For example look at cdns_pcie_ep_fn_writeb and
> > > > ROCKCHIP_PCIE_EP_FUNC_BASE for examples of other EP drivers that do
> > > > this.
> > > Agree, this looks fine, thanks a lot for your

[PATCH v4 01/13] perf/doc: update design.txt for exclude_{host|guest} flags

2019-01-07 Thread Andrew Murray
Update design.txt to reflect the presence of the exclude_host
and exclude_guest perf flags.

Signed-off-by: Andrew Murray 
---
 tools/perf/design.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index a28dca2..0453ba2 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' bits 
provide a
 way to request that counting of events be restricted to times when the
 CPU is in user, kernel and/or hypervisor mode.
 
+Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
+to request counting of events restricted to guest and host contexts when
+using Linux as the hypervisor.
+
 The 'mmap' and 'munmap' bits allow recording of PROT_EXEC mmap/munmap
 operations, these can be used to relate userspace IP addresses to actual
 code, even after the mapping (or even the whole process) is gone,
-- 
2.7.4



  1   2   >