Re: Boot failure in Power7 pSeries
On Thu, 2014-06-19 at 13:49 +0800, Mike Qiu wrote: > Hi Michael, > > The config file has attached ... Thanks. Nothing is obviously broken in there. Can you try building with POWERNV=n, I think that will fix it. cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/2] powerpc/pci: Remove duplicate logic
Since the logic to reset PCI secondary bus by PCI config register PCI_BRIDGE_CTL_BUS_RESET is included in pci_reset_secondary_bus(), we needn't implement another one of ourself. The patch removes the duplicate implementation and calls pci_reset_secondary_bus(). Signed-off-by: Gavin Shan --- arch/powerpc/kernel/pci-common.c | 11 +-- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index b49c72f..b2814e2 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -123,21 +123,12 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus, void pcibios_reset_secondary_bus(struct pci_dev *dev) { - u16 ctrl; - if (ppc_md.pcibios_reset_secondary_bus) { ppc_md.pcibios_reset_secondary_bus(dev); return; } - pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &ctrl); - ctrl |= PCI_BRIDGE_CTL_BUS_RESET; - pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); - msleep(2); - - ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; - pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); - ssleep(1); + pci_reset_secondary_bus(dev); } static resource_size_t pcibios_io_size(const struct pci_controller *hose) -- 1.8.3.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/2] PCI: Make resetting secondary bus logic common
Commit d92a208d086 ("powerpc/pci: Mask linkDown on resetting PCI bus") implemented same logic (resetting PCI secondary bus by bridge's config register PCI_BRIDGE_CTL_BUS_RESET) in PCI core and arch-dependent code. In order to avoid the duplicate implementation introduced by the commit, the patch puts the logic into pci_reset_secondary_bus(). That commit also missed declaring newly introduced weak function pcibios_reset_secondary_bus() in linux/include/pci.h. The patch fixes it. Signed-off-by: Gavin Shan --- drivers/pci/pci.c | 7 ++- include/linux/pci.h | 2 ++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 63a54a3..758f1d8 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3193,7 +3193,7 @@ static int pci_pm_reset(struct pci_dev *dev, int probe) return 0; } -void __weak pcibios_reset_secondary_bus(struct pci_dev *dev) +void pci_reset_secondary_bus(struct pci_dev *dev) { u16 ctrl; @@ -3219,6 +3219,11 @@ void __weak pcibios_reset_secondary_bus(struct pci_dev *dev) ssleep(1); } +void __weak pcibios_reset_secondary_bus(struct pci_dev *dev) +{ + pci_reset_secondary_bus(dev); +} + /** * pci_reset_bridge_secondary_bus - Reset the secondary bus on a PCI bridge. * @dev: Bridge device diff --git a/include/linux/pci.h b/include/linux/pci.h index 466bcd1..340529d 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -978,6 +978,8 @@ int pci_try_reset_slot(struct pci_slot *slot); int pci_probe_reset_bus(struct pci_bus *bus); int pci_reset_bus(struct pci_bus *bus); int pci_try_reset_bus(struct pci_bus *bus); +void pci_reset_secondary_bus(struct pci_dev *dev); +void pcibios_reset_secondary_bus(struct pci_dev *dev); void pci_reset_bridge_secondary_bus(struct pci_dev *dev); void pci_update_resource(struct pci_dev *dev, int resno); int __must_check pci_assign_resource(struct pci_dev *dev, int i); -- 1.8.3.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
(2014/06/19 15:40), Suzuki K. Poulose wrote: > On 06/19/2014 10:22 AM, Masami Hiramatsu wrote: >> (2014/06/19 10:30), Michael Ellerman wrote: >>> On Wed, 2014-06-18 at 17:46 +0900, Masami Hiramatsu wrote: (2014/06/18 16:56), Michael Ellerman wrote: > On Fri, 2014-06-06 at 15:38 +0900, Masami Hiramatsu wrote: >> Ping? >> >> I guess this should go to 3.16 branch, shouldn't it? > >>> diff --git a/arch/powerpc/include/asm/types.h >>> b/arch/powerpc/include/asm/types.h >>> index bfb6ded..8b89d65 100644 >>> --- a/arch/powerpc/include/asm/types.h >>> +++ b/arch/powerpc/include/asm/types.h >>> @@ -25,6 +25,17 @@ typedef struct { >>> unsigned long env; >>> } func_descr_t; >>> >>> +#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF == 1) >>> +/* >>> + * On PPC64 ABIv1 the function pointer actually points to the >>> + * function's descriptor. The first entry in the descriptor is the >>> + * address of the function text. >>> + */ >>> +#define function_entry(fn) (((func_descr_t *)(fn))->entry) >>> +#else >>> +#define function_entry(fn) ((unsigned long)(fn)) >>> +#endif > > We already have ppc_function_entry(), can't you use that? I'd like to ask you whether the address which ppc_function_entry() returns on PPC ABIv2 is really same address in kallsyms or not. As you can see, kprobes uses function_entry() to get the actual entry address where kallsyms knows. I have not much information about that, but it seems that the "global entry point" is the address which kallsyms knows, isn't it? >>> >>> OK. I'm not sure off the top of my head which address kallsyms knows about, >>> but >>> yes it's likely that it is the global entry point. >>> >>> I recently sent a patch to add ppc_global_function_entry(), because we need >>> it >>> in the ftrace code. Once that is merged you could use that. >> >> Yeah, I could use that. But since this is used in arch-independent code >> (e.g. IA64 >> needs similar macro), I think we'd better define function_entry() in >> asm/types.h for >> general use (for kallsyms), and rename ppc_function_entry to >> local_function_entry() >> in asm/code-patching.h. >> >> >>> How do you hit the original problem, you don't actually specify in your >>> commit >>> message? Something with kprobes obviously, but what exactly? I'll try and >>> reproduce it here. >> >> Ah, those messages should be shown in dmesg when booting if it doesn't work, >> because the messages are printed by initialization process of kprobe >> blacklist. >> So, reproducing it is just enabling CONFIG_KPROBES and boot it. > Well, we don't get those messages on Power, since the kallsyms has the > entries for ".function_name". The correct way to verify is, either : Hmm, that seems another issue on powerpc. Is that expected(and designed) behavior? And if so, how I can verify when initializing blacklist? (should I better use kallsyms_lookup() and kallsyms_lookup_name() for verification?) Thank you, > > 1) Dump the black_list via xmon ( see : > https://lkml.org/lkml/2014/5/29/893 ) and verify the entries. > > or > > 2) Issue a kprobe on a black listed entry and hit a success,(which we > will, since we don't check the actual function address). > > Thanks > Suzuki > > >> >> Thank you, >> -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/2] flexcan: add err_irq handler for flexcan
when flexcan is not physically linked, command 'cantest' will trigger an err_irq, add err_irq handler for it. Signed-off-by: Zhao Qiang --- drivers/net/can/flexcan.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index aaed97b..e3c6cfd 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -206,6 +206,7 @@ struct flexcan_priv { void __iomem *base; u32 reg_esr; u32 reg_ctrl_default; + unsigned interr_irq; struct clk *clk_ipg; struct clk *clk_per; @@ -654,6 +655,23 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id) return IRQ_HANDLED; } +static irqreturn_t flexcan_err_irq(int irq, void *dev_id) +{ + struct net_device *dev = dev_id; + struct flexcan_priv *priv = netdev_priv(dev); + struct flexcan_regs __iomem *regs = priv->base; + u32 reg_ctrl, reg_esr; + + reg_esr = flexcan_read(®s->esr); + reg_ctrl = flexcan_read(®s->ctrl); + if (reg_esr & FLEXCAN_ESR_TX_WRN) { + flexcan_write(reg_esr & ~FLEXCAN_ESR_TX_WRN, ®s->esr); + flexcan_write(reg_ctrl & ~FLEXCAN_CTRL_ERR_MSK, ®s->ctrl); + netdev_err(dev, "No physical link!\n"); + } + return IRQ_HANDLED; +} + static void flexcan_set_bittiming(struct net_device *dev) { const struct flexcan_priv *priv = netdev_priv(dev); @@ -860,6 +878,12 @@ static int flexcan_open(struct net_device *dev) if (err) goto out_close; + if (priv->err_irq) + err = request_irq(priv->err_irq, flexcan_err_irq, IRQF_SHARED, + dev->name, dev); + if (err) + goto out_close; + /* start chip and queuing */ err = flexcan_chip_start(dev); if (err) @@ -1007,7 +1031,7 @@ static int flexcan_probe(struct platform_device *pdev) struct resource *mem; struct clk *clk_ipg = NULL, *clk_per = NULL; void __iomem *base; - int err, irq; + int err, irq, err_irq; u32 clock_freq = 0; if (pdev->dev.of_node) @@ -1034,6 +1058,10 @@ static int flexcan_probe(struct platform_device *pdev) if (irq <= 0) return -ENODEV; + err_irq = platform_get_irq(pdev, 1); + if (err_irq <= 0) + err_irq = 0; + base = devm_ioremap_resource(&pdev->dev, mem); if (IS_ERR(base)) return PTR_ERR(base); @@ -1057,6 +1085,7 @@ static int flexcan_probe(struct platform_device *pdev) dev->flags |= IFF_ECHO; priv = netdev_priv(dev); + priv->err_irq = err_irq; priv->can.clock.freq = clock_freq; priv->can.bittiming_const = &flexcan_bittiming_const; priv->can.do_set_mode = flexcan_set_mode; -- 1.8.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/2] flexcan: add err interrupt for p1010rdb
add err interrupt for p1010rdb into dts. Signed-off-by: Zhao Qiang --- arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi index 4313ff6..bcd95ba 100644 --- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi @@ -141,13 +141,15 @@ can0: can@1c000 { compatible = "fsl,p1010-flexcan"; reg = <0x1c000 0x1000>; - interrupts = <48 0x2 0 0>; + interrupts = <48 0x2 0 0 + 16 0x2 0 0>; }; can1: can@1d000 { compatible = "fsl,p1010-flexcan"; reg = <0x1d000 0x1000>; - interrupts = <61 0x2 0 0>; + interrupts = <61 0x2 0 0 + 16 0x2 0 0>; }; L2: l2-cache-controller@2 { -- 1.8.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 -next 4/9] DMA, CMA: support arbitrary bitmap granularity
On Wed, Jun 18, 2014 at 01:48:15PM -0700, Andrew Morton wrote: > On Mon, 16 Jun 2014 14:40:46 +0900 Joonsoo Kim wrote: > > > PPC KVM's CMA area management requires arbitrary bitmap granularity, > > since they want to reserve very large memory and manage this region > > with bitmap that one bit for several pages to reduce management overheads. > > So support arbitrary bitmap granularity for following generalization. > > > > ... > > > > --- a/drivers/base/dma-contiguous.c > > +++ b/drivers/base/dma-contiguous.c > > @@ -38,6 +38,7 @@ struct cma { > > unsigned long base_pfn; > > unsigned long count; > > unsigned long *bitmap; > > + unsigned int order_per_bit; /* Order of pages represented by one bit */ > > struct mutexlock; > > }; > > > > @@ -157,9 +158,37 @@ void __init dma_contiguous_reserve(phys_addr_t limit) > > > > static DEFINE_MUTEX(cma_mutex); > > > > +static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int > > align_order) > > +{ > > + return (1 << (align_order >> cma->order_per_bit)) - 1; > > +} > > Might want a "1UL << ..." here. Okay! > > > +static unsigned long cma_bitmap_maxno(struct cma *cma) > > +{ > > + return cma->count >> cma->order_per_bit; > > +} > > + > > +static unsigned long cma_bitmap_pages_to_bits(struct cma *cma, > > + unsigned long pages) > > +{ > > + return ALIGN(pages, 1 << cma->order_per_bit) >> cma->order_per_bit; > > +} > > Ditto. I'm not really sure what the compiler will do in these cases, > but would prefer not to rely on it anyway! Okay! Thanks for fix! ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH V3 14/17] ppc/pci: create/release dev-tree node for VFs
On Thu, Jun 19, 2014 at 3:46 AM, Wei Yang wrote: > On Wed, Jun 18, 2014 at 07:26:27PM +0100, Grant Likely wrote: >>On Tue, Jun 10, 2014 at 2:56 AM, Wei Yang wrote: >>> Currently, powernv platform is not aware of VFs. This means no dev-node >>> represents a VF. Also, VF PCI device is created when PF driver want to >>> enable >>> it. This leads to the pdn->pdev and pdn->pe_number an invalid value. >>> >>> This patch create/release dev-node for VF and fixs this when a VF's pci_dev >>> is created. >>> >>> Signed-off-by: Wei Yang >> >>I don't think this is the right way to handle this. Unless it is a >>fixup to a buggy devicetree provided by firmware, I don't want to see >>any code modifying the devicetree to describe stuff that is able to be >>directly enumerated. Really the pci code should handle the lack of a >>device_node gracefully. If it cannot then it should be fixed. > > Grant, > > Glad to see your comment. > > I will fix this in the firmware. That's not really what I meant. The kernel should be able to deal with virtual functions even if firmware doesn't know how, and the kernel should not require modifying the device tree to support them. I'm saying fix the kernel so that a device node is not necessary for virtual functions. g. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/2] flexcan: add err interrupt for p1010rdb
On 06/19/2014 09:46 AM, Zhao Qiang wrote: > add err interrupt for p1010rdb into dts. Please update the flexcan binding documentation, too. Marc > > Signed-off-by: Zhao Qiang > --- > arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi > b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi > index 4313ff6..bcd95ba 100644 > --- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi > +++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi > @@ -141,13 +141,15 @@ > can0: can@1c000 { > compatible = "fsl,p1010-flexcan"; > reg = <0x1c000 0x1000>; > - interrupts = <48 0x2 0 0>; > + interrupts = <48 0x2 0 0 > + 16 0x2 0 0>; > }; > > can1: can@1d000 { > compatible = "fsl,p1010-flexcan"; > reg = <0x1d000 0x1000>; > - interrupts = <61 0x2 0 0>; > + interrupts = <61 0x2 0 0 > + 16 0x2 0 0>; > }; > > L2: l2-cache-controller@2 { > -- Pengutronix e.K. | Marc Kleine-Budde | Industrial Linux Solutions| Phone: +49-231-2826-924 | Vertretung West/Dortmund | Fax: +49-5121-206917- | Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de | signature.asc Description: OpenPGP digital signature ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/2] flexcan: add err_irq handler for flexcan
On 06/19/2014 09:46 AM, Zhao Qiang wrote: > when flexcan is not physically linked, command 'cantest' will > trigger an err_irq, add err_irq handler for it. > > Signed-off-by: Zhao Qiang > --- > drivers/net/can/flexcan.c | 31 ++- > 1 file changed, 30 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c > index aaed97b..e3c6cfd 100644 > --- a/drivers/net/can/flexcan.c > +++ b/drivers/net/can/flexcan.c > @@ -206,6 +206,7 @@ struct flexcan_priv { > void __iomem *base; > u32 reg_esr; > u32 reg_ctrl_default; > + unsigned interr_irq; Please use a space instead of a tab before err_irq. > > struct clk *clk_ipg; > struct clk *clk_per; > @@ -654,6 +655,23 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id) > return IRQ_HANDLED; > } > > +static irqreturn_t flexcan_err_irq(int irq, void *dev_id) > +{ > + struct net_device *dev = dev_id; > + struct flexcan_priv *priv = netdev_priv(dev); > + struct flexcan_regs __iomem *regs = priv->base; > + u32 reg_ctrl, reg_esr; > + > + reg_esr = flexcan_read(®s->esr); > + reg_ctrl = flexcan_read(®s->ctrl); > + if (reg_esr & FLEXCAN_ESR_TX_WRN) { > + flexcan_write(reg_esr & ~FLEXCAN_ESR_TX_WRN, ®s->esr); > + flexcan_write(reg_ctrl & ~FLEXCAN_CTRL_ERR_MSK, ®s->ctrl); > + netdev_err(dev, "No physical link!\n"); A warning IRQ does not imply that you have no physical link. I suggest to make use of the existing flexcan_poll_state() function. > + } > + return IRQ_HANDLED; > +} > + > static void flexcan_set_bittiming(struct net_device *dev) > { > const struct flexcan_priv *priv = netdev_priv(dev); > @@ -860,6 +878,12 @@ static int flexcan_open(struct net_device *dev) > if (err) > goto out_close; > > + if (priv->err_irq) > + err = request_irq(priv->err_irq, flexcan_err_irq, IRQF_SHARED, > + dev->name, dev); > + if (err) > + goto out_close; > + > /* start chip and queuing */ > err = flexcan_chip_start(dev); > if (err) > @@ -1007,7 +1031,7 @@ static int flexcan_probe(struct platform_device *pdev) > struct resource *mem; > struct clk *clk_ipg = NULL, *clk_per = NULL; > void __iomem *base; > - int err, irq; > + int err, irq, err_irq; > u32 clock_freq = 0; > > if (pdev->dev.of_node) > @@ -1034,6 +1058,10 @@ static int flexcan_probe(struct platform_device *pdev) > if (irq <= 0) > return -ENODEV; > > + err_irq = platform_get_irq(pdev, 1); > + if (err_irq <= 0) > + err_irq = 0; > + > base = devm_ioremap_resource(&pdev->dev, mem); > if (IS_ERR(base)) > return PTR_ERR(base); > @@ -1057,6 +1085,7 @@ static int flexcan_probe(struct platform_device *pdev) > dev->flags |= IFF_ECHO; > > priv = netdev_priv(dev); > + priv->err_irq = err_irq; > priv->can.clock.freq = clock_freq; > priv->can.bittiming_const = &flexcan_bittiming_const; > priv->can.do_set_mode = flexcan_set_mode; > Marc -- Pengutronix e.K. | Marc Kleine-Budde | Industrial Linux Solutions| Phone: +49-231-2826-924 | Vertretung West/Dortmund | Fax: +49-5121-206917- | Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de | signature.asc Description: OpenPGP digital signature ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: OF_DYNAMIC node lifecycle
Hi Grant, CCing Thomas Gleixner & Steven Rostedt, since they might have a few ideas... On Jun 18, 2014, at 11:07 PM, Grant Likely wrote: > Hi Nathan and Tyrel, > > I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and > I'm hoping you can help me. Right now, pseries seems to be the only > user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on > the entire kernel because it requires all DT code to manage reference > counting with iterating over nodes. Most users simply get it wrong. > Pantelis did some investigation and found that the reference counts on > a running kernel are all over the place. I have my doubts that any > code really gets it right. > > The problem is that users need to know when it is appropriate to call > of_node_get()/of_node_put(). All list traversals that exit early need > an extra call to of_node_put(), and code that is searching for a node > in the tree and holding a reference to it needs to call of_node_get(). > In hindsight it appears that drivers just can't get the lifecycle right. So we need to simplify things. > I've got a few pseries questions: > - What are the changes being requested by pseries firmware? Is it only > CPUs and memory nodes, or does it manipulate things all over the tree? > - How frequent are the changes? How many changes would be likely over > the runtime of the system? > - Are you able to verify that removed nodes are actually able to be > freed correctly? Do you have any testcases for node removal? > > I'm thinking very seriously about changing the locking semantics of DT > code entirely so that most users never have to worry about > of_node_get/put at all. If the DT code is switched to use rcu > primitives for tree iteration (which also means making DT code use > list_head, something I'm already investigating), then instead of > trying to figure out of_node_get/put rules, callers could use > rcu_read_lock()/rcu_read_unlock() to protect the region that is > searching over nodes, and only call of_node_get() if the node pointer > is needed outside the rcu read-side lock. > > I'd really like to be rid of the node reference counting entirely, but > I can't figure out a way of doing that safely, so I'd settle for > making it a lot easier to get correct. > Since we're going about changing things, how about that devtree_lock? We're using a raw_spinlock and we're always taking the lock with interrupts disabled. If we're going to make DT changes frequently during normal runtime and not only during boot time, those are bad for any kind of real-time performance. So the question is, do we really have code that access the live tree during atomic sections? Is that something we want? Enforcing this will make our lives easier, and we'll get the change to replace that spinlock with a mutex. > Thoughts? > > g. Regards -- Pantelis ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH V3 14/17] ppc/pci: create/release dev-tree node for VFs
On Thu, Jun 19, 2014 at 09:30:47AM +0100, Grant Likely wrote: >On Thu, Jun 19, 2014 at 3:46 AM, Wei Yang wrote: >> On Wed, Jun 18, 2014 at 07:26:27PM +0100, Grant Likely wrote: >>>On Tue, Jun 10, 2014 at 2:56 AM, Wei Yang wrote: Currently, powernv platform is not aware of VFs. This means no dev-node represents a VF. Also, VF PCI device is created when PF driver want to enable it. This leads to the pdn->pdev and pdn->pe_number an invalid value. This patch create/release dev-node for VF and fixs this when a VF's pci_dev is created. Signed-off-by: Wei Yang >>> >>>I don't think this is the right way to handle this. Unless it is a >>>fixup to a buggy devicetree provided by firmware, I don't want to see >>>any code modifying the devicetree to describe stuff that is able to be >>>directly enumerated. Really the pci code should handle the lack of a >>>device_node gracefully. If it cannot then it should be fixed. >> >> Grant, >> >> Glad to see your comment. >> >> I will fix this in the firmware. > >That's not really what I meant. The kernel should be able to deal with >virtual functions even if firmware doesn't know how, and the kernel >should not require modifying the device tree to support them. > >I'm saying fix the kernel so that a device node is not necessary for >virtual functions. oh, sorry for my poor understanding. Let me do some investigation to see whether it is fine to get rid of device node for vfs. > >g. -- Richard Yang Help you, Help me ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
On 06/19/2014 12:56 PM, Masami Hiramatsu wrote: > (2014/06/19 15:40), Suzuki K. Poulose wrote: >> On 06/19/2014 10:22 AM, Masami Hiramatsu wrote: >>> (2014/06/19 10:30), Michael Ellerman wrote: On Wed, 2014-06-18 at 17:46 +0900, Masami Hiramatsu wrote: > (2014/06/18 16:56), Michael Ellerman wrote: >> On Fri, 2014-06-06 at 15:38 +0900, Masami Hiramatsu wrote: >>> Ping? >>> >>> I guess this should go to 3.16 branch, shouldn't it? >> diff --git a/arch/powerpc/include/asm/types.h b/arch/powerpc/include/asm/types.h index bfb6ded..8b89d65 100644 --- a/arch/powerpc/include/asm/types.h +++ b/arch/powerpc/include/asm/types.h @@ -25,6 +25,17 @@ typedef struct { unsigned long env; } func_descr_t; +#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF == 1) +/* + * On PPC64 ABIv1 the function pointer actually points to the + * function's descriptor. The first entry in the descriptor is the + * address of the function text. + */ +#define function_entry(fn)(((func_descr_t *)(fn))->entry) +#else +#define function_entry(fn)((unsigned long)(fn)) +#endif >> >> We already have ppc_function_entry(), can't you use that? > > I'd like to ask you whether the address which ppc_function_entry() > returns on > PPC ABIv2 is really same address in kallsyms or not. > As you can see, kprobes uses function_entry() to get the actual entry > address > where kallsyms knows. I have not much information about that, but it > seems that > the "global entry point" is the address which kallsyms knows, isn't it? OK. I'm not sure off the top of my head which address kallsyms knows about, but yes it's likely that it is the global entry point. I recently sent a patch to add ppc_global_function_entry(), because we need it in the ftrace code. Once that is merged you could use that. >>> >>> Yeah, I could use that. But since this is used in arch-independent code >>> (e.g. IA64 >>> needs similar macro), I think we'd better define function_entry() in >>> asm/types.h for >>> general use (for kallsyms), and rename ppc_function_entry to >>> local_function_entry() >>> in asm/code-patching.h. >>> >>> How do you hit the original problem, you don't actually specify in your commit message? Something with kprobes obviously, but what exactly? I'll try and reproduce it here. >>> >>> Ah, those messages should be shown in dmesg when booting if it doesn't work, >>> because the messages are printed by initialization process of kprobe >>> blacklist. >>> So, reproducing it is just enabling CONFIG_KPROBES and boot it. >> Well, we don't get those messages on Power, since the kallsyms has the >> entries for ".function_name". The correct way to verify is, either : > > Hmm, that seems another issue on powerpc. Is that expected(and designed) > behavior? AFAIK, yes, it is. To be more precise : we have 'foo' and '.foo' for a function foo(), where 'foo' points to the function_entry and '.foo' points to the actual function. So, a kallsyms_lookup_size_offset() on both 'foo' and '.foo' will return a hit. So, if we make sure we use the value of '.foo' (by using the appropriate macros) we should be fine. And if so, how I can verify when initializing blacklist? > (should I better use kallsyms_lookup() and kallsyms_lookup_name() for > verification?) One way to verify would be to make sure the symbol starts with '.' from the result of the current kallsyms_lookup_size_offset() for PPC. Thanks Suzuki > > Thank you, > >> >> 1) Dump the black_list via xmon ( see : >> https://lkml.org/lkml/2014/5/29/893 ) and verify the entries. >> >> or >> >> 2) Issue a kprobe on a black listed entry and hit a success,(which we >> will, since we don't check the actual function address). >> >> Thanks >> Suzuki >> >> >>> >>> Thank you, >>> ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
(2014/06/19 18:45), Suzuki K. Poulose wrote: > On 06/19/2014 12:56 PM, Masami Hiramatsu wrote: >> (2014/06/19 15:40), Suzuki K. Poulose wrote: >>> On 06/19/2014 10:22 AM, Masami Hiramatsu wrote: (2014/06/19 10:30), Michael Ellerman wrote: > On Wed, 2014-06-18 at 17:46 +0900, Masami Hiramatsu wrote: >> (2014/06/18 16:56), Michael Ellerman wrote: >>> On Fri, 2014-06-06 at 15:38 +0900, Masami Hiramatsu wrote: Ping? I guess this should go to 3.16 branch, shouldn't it? >>> > diff --git a/arch/powerpc/include/asm/types.h > b/arch/powerpc/include/asm/types.h > index bfb6ded..8b89d65 100644 > --- a/arch/powerpc/include/asm/types.h > +++ b/arch/powerpc/include/asm/types.h > @@ -25,6 +25,17 @@ typedef struct { > unsigned long env; > } func_descr_t; > > +#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF == 1) > +/* > + * On PPC64 ABIv1 the function pointer actually points to the > + * function's descriptor. The first entry in the descriptor is the > + * address of the function text. > + */ > +#define function_entry(fn) (((func_descr_t *)(fn))->entry) > +#else > +#define function_entry(fn) ((unsigned long)(fn)) > +#endif >>> >>> We already have ppc_function_entry(), can't you use that? >> >> I'd like to ask you whether the address which ppc_function_entry() >> returns on >> PPC ABIv2 is really same address in kallsyms or not. >> As you can see, kprobes uses function_entry() to get the actual entry >> address >> where kallsyms knows. I have not much information about that, but it >> seems that >> the "global entry point" is the address which kallsyms knows, isn't it? > > OK. I'm not sure off the top of my head which address kallsyms knows > about, but > yes it's likely that it is the global entry point. > > I recently sent a patch to add ppc_global_function_entry(), because we > need it > in the ftrace code. Once that is merged you could use that. Yeah, I could use that. But since this is used in arch-independent code (e.g. IA64 needs similar macro), I think we'd better define function_entry() in asm/types.h for general use (for kallsyms), and rename ppc_function_entry to local_function_entry() in asm/code-patching.h. > How do you hit the original problem, you don't actually specify in your > commit > message? Something with kprobes obviously, but what exactly? I'll try and > reproduce it here. Ah, those messages should be shown in dmesg when booting if it doesn't work, because the messages are printed by initialization process of kprobe blacklist. So, reproducing it is just enabling CONFIG_KPROBES and boot it. >>> Well, we don't get those messages on Power, since the kallsyms has the >>> entries for ".function_name". The correct way to verify is, either : >> >> Hmm, that seems another issue on powerpc. Is that expected(and designed) >> behavior? > AFAIK, yes, it is. > To be more precise : > > we have 'foo' and '.foo' for a function foo(), where 'foo' points to the > function_entry and '.foo' points to the actual function. Ah, I see. So if we run func_ptr p = foo; return p == kallsyms_lookup_name(".foo"); it returns true. > So, a kallsyms_lookup_size_offset() on both 'foo' and '.foo' will return > a hit. So, if we make sure we use the value of '.foo' (by using the > appropriate macros) we should be fine. > > And if so, how I can verify when initializing blacklist? >> (should I better use kallsyms_lookup() and kallsyms_lookup_name() for >> verification?) > One way to verify would be to make sure the symbol starts with '.' from > the result of the current kallsyms_lookup_size_offset() for PPC. OK, I'll do that as another enhancement, since the bug reported here will be fixed with our patch. Anyway, this patch itself should go into 3.16 tree to fix actual bug. Thanks, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Re: Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
(2014/06/19 20:01), Masami Hiramatsu wrote: > Ah, those messages should be shown in dmesg when booting if it doesn't > work, > because the messages are printed by initialization process of kprobe > blacklist. > So, reproducing it is just enabling CONFIG_KPROBES and boot it. Well, we don't get those messages on Power, since the kallsyms has the entries for ".function_name". The correct way to verify is, either : >>> >>> Hmm, that seems another issue on powerpc. Is that expected(and designed) >>> behavior? >> AFAIK, yes, it is. >> To be more precise : >> >> we have 'foo' and '.foo' for a function foo(), where 'foo' points to the >> function_entry and '.foo' points to the actual function. > > Ah, I see. So if we run > > func_ptr p = foo; > return p == kallsyms_lookup_name(".foo"); > > it returns true. One more thing I should know, is the address of ".function_name" within the kernel text? In other words, does kernel_text_address() return true for that address? If not, it's easy to verify the address. Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 23/38] mmc: sdhci: convert sdhci_set_uhs_signaling() into a library function
On Mon, Jun 16, 2014 at 02:17:30PM +0200, Ulf Hansson wrote: > Anyway, we did get some folks to test the patches and was thus fairly > confident that we could merge them. Chris asked me to try to collect > them in a PR for him, so I did. Sorry if I managed to screw some > things up, there were several conflicts and actual regressions, which > I tried to take care of. > > The mmc people were also very helping in sending patches to fixup > related regressions, immediately after we merged your patchset. Thus > together I think we managed to pull it off. I tend to look through slightly less rose-tinted glasses. The fact is... there's loads of ARM platforms which now fail in Olof's build/boot testing, and they all seem to have a very similar pattern: hummingboard: [1.149688] sdhci: Secure Digital Host Controller Interface driver [1.155901] sdhci: Copyright(c) Pierre Ossman ... [1.253630] Waiting for root device /dev/mmcblk0p2... [ 60.325469] imx-sdma 20ec000.sdma: firmware not found ~$off # PYBOOT: Exception: timeout jetson: [2.261355] Waiting for root device /dev/mmcblk0p1... wandboard: [1.186870] sdhci: Secure Digital Host Controller Interface driver [1.193075] sdhci: Copyright(c) Pierre Ossman ... [1.291064] Waiting for root device /dev/mmcblk0p2... Whether these are caused by the patch set or not is anyone's guess, because we (a) don't know what's causing these failures, and (b) my patch series was never tested on anything but iMX6. I'm pretty certain that the hummingboard failure is not related to my series as that's one of the platforms I did test my series on. There's more failures which look like possibly something in core MMC is rather screwed, as OMAP5 (which doesn't use SDHCI) is also failing at a similar point. What these failures /do/ mean is that when I'm pushing my ARM for-next branch out, Olof's builder picks it up and runs a build across it, and the report returns a whole load of failures. A whole load of failures means that those platforms haven't tested my changes, which means the quality of testing is much lower than it should be. With 26 passing and 15 failing, that's over 1/3 of platforms failing, which means 1/3 aren't getting tested. This level of failure has been going on for quite a while now, and (afaik) it remains uninvestigated and undiagnosed. (This is one of the complaints I have about Olof's build/boot test system, much of the information about the build and boot is hidden away and unpublished, which makes it almost impossible for third parties to diagnose any problem there. I've given up looking at most of Olof's build/boot mails because of this - it's just not interesting to see the same abbreviated boot failure logs which give no useful information time and time again.) We need to get on top of these failures and get them sorted. -- FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly improving, and getting towards what was expected from it. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2 1/3] dmaengine: mpc512x: add device tree binding document
2014-06-18 18:56 GMT+04:00 Alexander Popov : > 2014-06-18 17:37 GMT+04:00 Mark Rutland : >> On Wed, Jun 18, 2014 at 11:48:10AM +0100, Alexander Popov wrote: >>> Introduce a device tree binding document for the MPC512x DMA controller >>> +Optional properties: >>> +- #dma-cells: the length of the DMA specifier, must be <1>. >>> + Each channel of this DMA controller has a peripheral request line, >>> + the assignment is fixed in hardware. This one cell >>> + in dmas property of a client device represents the channel number. >> >> Surely this is required to be able to refer to DMA channels on the >> device? > > Excuse me, I didn't understand your question. > Do you inquire about the reason of making #dma-cells an optional property? > It's optional because device tree based lookup support is made > optional (part 3/3). Mark, did I answer your question? Should I fix anything in this patch series? Hope for the reply. Alexander ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Node 0 not necessary for powerpc?
On Tue, Jun 10, 2014 at 04:31:57PM -0700, Nishanth Aravamudan wrote: > > I think what this really wants to do is NODE_DATA(cpu_to_mem(cpu)) and I > > thought ppc had the cpu-to-local-memory-node mappings correct? > > Except cpu_to_mem relies on the mapping being defined, but early in > boot, specifically, it isn't yet (at least not necessarily). Can't ppc NODE_DATA simply return dummy generic node_data during early boot? Populating it with just enough to make early boot work shouldn't be too hard, right? Thanks. -- tejun ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: OF_DYNAMIC node lifecycle
On 06/18/2014 03:07 PM, Grant Likely wrote: > Hi Nathan and Tyrel, > > I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and > I'm hoping you can help me. Right now, pseries seems to be the only > user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on > the entire kernel because it requires all DT code to manage reference > counting with iterating over nodes. Most users simply get it wrong. > Pantelis did some investigation and found that the reference counts on > a running kernel are all over the place. I have my doubts that any > code really gets it right. > > The problem is that users need to know when it is appropriate to call > of_node_get()/of_node_put(). All list traversals that exit early need > an extra call to of_node_put(), and code that is searching for a node > in the tree and holding a reference to it needs to call of_node_get(). > > I've got a few pseries questions: > - What are the changes being requested by pseries firmware? Is it only > CPUs and memory nodes, or does it manipulate things all over the tree? The short answer, everything. For pseries the two big actions that can change the device tree are adding/removing resources and partition migration. The most frequent updates to the device tree happen during resource (cpu, memory, and pci/phb) add and remove. During this process we add and remove the node and its properties from the device tree. - For memory on newer systems this just involves updating the ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory property. Older firmware levels add and remove the memroy@XXX nodes and their properties. - For cpus the cpus/PowerPC,POWER nodes and its properties are added or removed - For pci/phb the pci@X nodes and properties are added/removed. The less frequent operation of live partition migration (and suspend/resume) can update just about anything in the device tree. When this occurs and the systems starts after being migrated (or waking up after a suspend) we make a call to firmware to get updates to the device tree for the new hardware we are running on. > - How frequent are the changes? How many changes would be likely over > the runtime of the system? This can happen frequently. > - Are you able to verify that removed nodes are actually able to be > freed correctly? Do you have any testcases for node removal? I have always tested this by doing resource add/remove, usually cpu and memory since it is the easiest. > > I'm thinking very seriously about changing the locking semantics of DT > code entirely so that most users never have to worry about > of_node_get/put at all. If the DT code is switched to use rcu > primitives for tree iteration (which also means making DT code use > list_head, something I'm already investigating), then instead of > trying to figure out of_node_get/put rules, callers could use > rcu_read_lock()/rcu_read_unlock() to protect the region that is > searching over nodes, and only call of_node_get() if the node pointer > is needed outside the rcu read-side lock. > This sounds good. I like just taking the rcu lock around accessing the DT. Do we have many places where DT node pointers are held that require keeping the of_node_get/put calls? If this did exist perhaps we could update those places to look up the DT node every time instead of holding on to the pointer. We could just get rid of the reference counting altogether then. > I'd really like to be rid of the node reference counting entirely, but > I can't figure out a way of doing that safely, so I'd settle for > making it a lot easier to get correct. > heh! I have often thought about adding reference counting to device tree properties. I don't really want to but there are some properties that can get updated frequently (namely the one mentioned above for memory) that can also get pretty big, especially on systems with a lot of memory. We never free the memory for old versions of a device tree property. This is a pretty minor issue though and probably best suited for a separate discussion after resolving this. Other than pseries, who else does dynamic device tree updating? Are we the only ones? -Nathan ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 23/38] mmc: sdhci: convert sdhci_set_uhs_signaling() into a library function
On 06/19/2014 06:28 AM, Russell King - ARM Linux wrote: > On Mon, Jun 16, 2014 at 02:17:30PM +0200, Ulf Hansson wrote: >> Anyway, we did get some folks to test the patches and was thus fairly >> confident that we could merge them. Chris asked me to try to collect >> them in a PR for him, so I did. Sorry if I managed to screw some >> things up, there were several conflicts and actual regressions, which >> I tried to take care of. >> >> The mmc people were also very helping in sending patches to fixup >> related regressions, immediately after we merged your patchset. Thus >> together I think we managed to pull it off. > > I tend to look through slightly less rose-tinted glasses. > > The fact is... there's loads of ARM platforms which now fail in Olof's > build/boot testing, and they all seem to have a very similar pattern: > > hummingboard: > [1.149688] sdhci: Secure Digital Host Controller Interface driver > [1.155901] sdhci: Copyright(c) Pierre Ossman > ... > [1.253630] Waiting for root device /dev/mmcblk0p2... > [ 60.325469] imx-sdma 20ec000.sdma: firmware not found > ~$off > # PYBOOT: Exception: timeout > > jetson: > [2.261355] Waiting for root device /dev/mmcblk0p1... > > wandboard: > [1.186870] sdhci: Secure Digital Host Controller Interface driver > [1.193075] sdhci: Copyright(c) Pierre Ossman > ... > [1.291064] Waiting for root device /dev/mmcblk0p2... Any SDHCI failures in Linus' tree (but not linux-next) that occur only in multi_v7_defconfig are likely solved by: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/264012.html [PATCH] ARM: multi_v7_defconfig: re-enable SDHCI drivers > Whether these are caused by the patch set or not is anyone's guess, > because we (a) don't know what's causing these failures, and (b) > my patch series was never tested on anything but iMX6. I thought that I'd tested at least some of it on Tegra. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 23/38] mmc: sdhci: convert sdhci_set_uhs_signaling() into a library function
On Thu, Jun 19, 2014 at 5:28 AM, Russell King - ARM Linux wrote: > On Mon, Jun 16, 2014 at 02:17:30PM +0200, Ulf Hansson wrote: >> Anyway, we did get some folks to test the patches and was thus fairly >> confident that we could merge them. Chris asked me to try to collect >> them in a PR for him, so I did. Sorry if I managed to screw some >> things up, there were several conflicts and actual regressions, which >> I tried to take care of. >> >> The mmc people were also very helping in sending patches to fixup >> related regressions, immediately after we merged your patchset. Thus >> together I think we managed to pull it off. > > I tend to look through slightly less rose-tinted glasses. > > The fact is... there's loads of ARM platforms which now fail in Olof's > build/boot testing, and they all seem to have a very similar pattern: > > hummingboard: > [1.149688] sdhci: Secure Digital Host Controller Interface driver > [1.155901] sdhci: Copyright(c) Pierre Ossman > ... > [1.253630] Waiting for root device /dev/mmcblk0p2... > [ 60.325469] imx-sdma 20ec000.sdma: firmware not found > ~$off > # PYBOOT: Exception: timeout > > jetson: > [2.261355] Waiting for root device /dev/mmcblk0p1... > > wandboard: > [1.186870] sdhci: Secure Digital Host Controller Interface driver > [1.193075] sdhci: Copyright(c) Pierre Ossman > ... > [1.291064] Waiting for root device /dev/mmcblk0p2... > > Whether these are caused by the patch set or not is anyone's guess, > because we (a) don't know what's causing these failures, and (b) > my patch series was never tested on anything but iMX6. > > I'm pretty certain that the hummingboard failure is not related to > my series as that's one of the platforms I did test my series on. > > There's more failures which look like possibly something in core MMC is > rather screwed, as OMAP5 (which doesn't use SDHCI) is also failing at > a similar point. > > What these failures /do/ mean is that when I'm pushing my ARM for-next > branch out, Olof's builder picks it up and runs a build across it, and > the report returns a whole load of failures. A whole load of failures > means that those platforms haven't tested my changes, which means the > quality of testing is much lower than it should be. > > With 26 passing and 15 failing, that's over 1/3 of platforms failing, > which means 1/3 aren't getting tested. > > This level of failure has been going on for quite a while now, and (afaik) > it remains uninvestigated and undiagnosed. (This is one of the complaints > I have about Olof's build/boot test system, much of the information about > the build and boot is hidden away and unpublished, which makes it almost > impossible for third parties to diagnose any problem there. I've given up > looking at most of Olof's build/boot mails because of this - it's just > not interesting to see the same abbreviated boot failure logs which give > no useful information time and time again.) Most of this is because I want to avoid sending huuuge emails out with the failures. I'll add a push of the full log to arm-soc.lixom.net and include a link to it in the emails, similar to how I do the build logs. I'll let you know when I've made that change. -Olof ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Node 0 not necessary for powerpc?
On 21.05.2014 [14:58:12 -0400], Tejun Heo wrote: > Hello, > > On Wed, May 21, 2014 at 09:16:27AM -0500, Christoph Lameter wrote: > > On Mon, 19 May 2014, Nishanth Aravamudan wrote: > > > I'm seeing a panic at boot with this change on an LPAR which actually > > > has no Node 0. Here's what I think is happening: > > > > > > start_kernel > > > ... > > > -> setup_per_cpu_areas > > > -> pcpu_embed_first_chunk > > > -> pcpu_fc_alloc > > > -> ___alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu), ... > > > -> smp_prepare_boot_cpu > > > -> set_numa_node(boot_cpuid) > > > > > > So we panic on the NODE_DATA call. It seems that ia64, at least, uses > > > pcpu_alloc_first_chunk rather than embed. x86 has some code to handle > > > early calls of cpu_to_node (early_cpu_to_node) and sets the mapping for > > > all CPUs in setup_per_cpu_areas(). > > > > Maybe we can switch ia64 too embed? Tejun: Why are there these > > dependencies? > > > > > Thoughts? Does that mean we need something similar to x86 for powerpc? > > I'm missing context to properly understand what's going on but the > specific allocator in use shouldn't matter. e.g. x86 can use both > embed and page allocators. If the problem is that the arch is > accessing percpu memory before percpu allocator is initialized and the > problem was masked before somehow, the right thing to do would be > removing those premature percpu accesses. If early percpu variables > are really necessary, doing similar early_percpu thing as in x86 would > be necessary. The early access is in the arch's pcpu_alloc_bootmem. On x86, rather than using NODE_DATA(cpu_to_node), it uses (in pcpu_alloc_bootmem), early_cpu_to_node(cpu) with their custom logic. The issue is that cpu_to_node, if USE_PERCPU_NUMA_NODE_ID is defined (which it is for NUMA powerpc, x86, ia64), is that cpu_to_node uses the percpu area, which data isn't initialized yet. So I guess powerpc needs the same treatment as x86. Thanks, Nish ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Node 0 not necessary for powerpc?
On 19.06.2014 [10:59:50 -0400], Tejun Heo wrote: > On Tue, Jun 10, 2014 at 04:31:57PM -0700, Nishanth Aravamudan wrote: > > > I think what this really wants to do is NODE_DATA(cpu_to_mem(cpu)) and I > > > thought ppc had the cpu-to-local-memory-node mappings correct? > > > > Except cpu_to_mem relies on the mapping being defined, but early in > > boot, specifically, it isn't yet (at least not necessarily). > > Can't ppc NODE_DATA simply return dummy generic node_data during early > boot? Populating it with just enough to make early boot work > shouldn't be too hard, right? So the problem is this, whether we use cpu_to_mem() or cpu_to_node() here, neither is setup yet because of the ordering between percpu setup and the actual writing of the percpu data (that is actually storing what node/local memory is relative to a given CPU). The NODE_DATA is all correct, but since we are calling cpu_to_{mem,node} before it really holds valid data, it falsely says 0, which is not necessarily even an online node. So, I think we need to do the same thing as x86 and have an early mapping setup and configured before the percpu areas are. Thanks, Nish ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] IBM Akebono: Remove obsolete config select
Hi Ben, It looks like we may have missed this trivial fix? Can you please apply it to your tree? Regards, Alistair On Fri, 13 Jun 2014 13:56:32 Paul Bolle wrote: > On Fri, 2014-05-02 at 18:06 +1000, Alistair Popple wrote: > > The original implementation of MMC support for Akebono introduced a > > new configuration symbol (MMC_SDHCI_OF_476GTR). This symbol has been > > dropped in favour of using the generic platform driver however the > > select for this symbol was mistakenly left in the platform > > configuration. > > > > This patch removes the obsolete symbol selection. > > > > Signed-off-by: Alistair Popple > > This patch hasn't yet entered linux-next nor Linus' tree. Is it queued > somewhere? If not, would a > Acked-by: Paul Bolle > > help to get this trivial patch queued for either of those trees? > > > Paul Bolle > > > --- > > > > arch/powerpc/platforms/44x/Kconfig | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/arch/powerpc/platforms/44x/Kconfig > > b/arch/powerpc/platforms/44x/Kconfig index 8beec7d..908bf11 100644 > > --- a/arch/powerpc/platforms/44x/Kconfig > > +++ b/arch/powerpc/platforms/44x/Kconfig > > @@ -220,7 +220,6 @@ config AKEBONO > > > > select USB_EHCI_HCD_PLATFORM > > select MMC_SDHCI > > select MMC_SDHCI_PLTFM > > > > - select MMC_SDHCI_OF_476GTR > > > > select ATA > > select SATA_AHCI_PLATFORM > > help ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Re: Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
On Thu, 2014-06-19 at 20:20 +0900, Masami Hiramatsu wrote: > (2014/06/19 20:01), Masami Hiramatsu wrote: > > > Ah, those messages should be shown in dmesg when booting if it doesn't > > work, > > because the messages are printed by initialization process of kprobe > > blacklist. > > So, reproducing it is just enabling CONFIG_KPROBES and boot it. > Well, we don't get those messages on Power, since the kallsyms has the > entries for ".function_name". The correct way to verify is, either : > >>> > >>> Hmm, that seems another issue on powerpc. Is that expected(and designed) > >>> behavior? > >> AFAIK, yes, it is. > >> To be more precise : > >> > >> we have 'foo' and '.foo' for a function foo(), where 'foo' points to the > >> function_entry and '.foo' points to the actual function. > > > > Ah, I see. So if we run > > > > func_ptr p = foo; > > return p == kallsyms_lookup_name(".foo"); > > > > it returns true. > > One more thing I should know, is the address of ".function_name" within the > kernel text? In other words, does kernel_text_address() return true for that > address? If not, it's easy to verify the address. Yes. That is the text address, kernel_text_address() should definitely return true. On 64-bit, ABIv1, "foo" points to the function descriptor, in the ".opd" section. ".foo" points to the actual text of the function, in ".text". On 64-bit, ABIv2, "foo" points to the text in ".text". There are no dot symbols. cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v2 1/2] flexcan: add err_irq handler for flexcan
when flexcan is not physically linked, command 'cantest' will trigger an err_irq, add err_irq handler for it. Signed-off-by: Zhao Qiang --- Changes for v2: - use a space instead of tab - use flexcan_poll_state instead of print drivers/net/can/flexcan.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index f425ec2..7432ba4 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -208,6 +208,7 @@ struct flexcan_priv { void __iomem *base; u32 reg_esr; u32 reg_ctrl_default; + unsigned int err_irq; struct clk *clk_ipg; struct clk *clk_per; @@ -744,6 +745,23 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id) return IRQ_HANDLED; } +static irqreturn_t flexcan_err_irq(int irq, void *dev_id) +{ + struct net_device *dev = dev_id; + struct flexcan_priv *priv = netdev_priv(dev); + struct flexcan_regs __iomem *regs = priv->base; + u32 reg_ctrl, reg_esr; + + reg_esr = flexcan_read(®s->esr); + reg_ctrl = flexcan_read(®s->ctrl); + if (reg_esr & FLEXCAN_ESR_TX_WRN) { + flexcan_write(reg_esr & ~FLEXCAN_ESR_TX_WRN, ®s->esr); + flexcan_write(reg_ctrl & ~FLEXCAN_CTRL_ERR_MSK, ®s->ctrl); + flexcan_poll_state(dev, reg_esr); + } + return IRQ_HANDLED; +} + static void flexcan_set_bittiming(struct net_device *dev) { const struct flexcan_priv *priv = netdev_priv(dev); @@ -944,6 +962,12 @@ static int flexcan_open(struct net_device *dev) if (err) goto out_close; + if (priv->err_irq) + err = request_irq(priv->err_irq, flexcan_err_irq, IRQF_SHARED, + dev->name, dev); + if (err) + goto out_close; + /* start chip and queuing */ err = flexcan_chip_start(dev); if (err) @@ -1099,7 +1123,7 @@ static int flexcan_probe(struct platform_device *pdev) struct resource *mem; struct clk *clk_ipg = NULL, *clk_per = NULL; void __iomem *base; - int err, irq; + int err, irq, err_irq; u32 clock_freq = 0; if (pdev->dev.of_node) @@ -1126,6 +1150,10 @@ static int flexcan_probe(struct platform_device *pdev) if (irq <= 0) return -ENODEV; + err_irq = platform_get_irq(pdev, 1); + if (err_irq <= 0) + err_irq = 0; + base = devm_ioremap_resource(&pdev->dev, mem); if (IS_ERR(base)) return PTR_ERR(base); @@ -1149,6 +1177,7 @@ static int flexcan_probe(struct platform_device *pdev) dev->flags |= IFF_ECHO; priv = netdev_priv(dev); + priv->err_irq = err_irq; priv->can.clock.freq = clock_freq; priv->can.bittiming_const = &flexcan_bittiming_const; priv->can.do_set_mode = flexcan_set_mode; -- 1.8.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v2 2/2] flexcan: add err interrupt for p1010rdb
add err interrupt for p1010rdb into dts. Signed-off-by: Zhao Qiang --- Changes for v2: - add binding documentation update Documentation/devicetree/bindings/net/can/fsl-flexcan.txt | 7 +-- arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 6 -- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt index 56d6cc3..81929e5 100644 --- a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt +++ b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt @@ -10,7 +10,9 @@ Required properties: - fsl,p1010-flexcan - reg : Offset and length of the register set for this device -- interrupts : Interrupt tuple for this device +- interrupts : Interrupt tuple for this device. + The first interrupt is for FlexCAN(Message Buffer and Wake Up) + The second is for error(Shared with IFC, PEX1 and some other device) Optional properties: @@ -23,7 +25,8 @@ Example: can@1c000 { compatible = "fsl,p1010-flexcan"; reg = <0x1c000 0x1000>; - interrupts = <48 0x2>; + interrupts = <48 0x2 0 0 + 16 0x2 0 0>; interrupt-parent = <&mpic>; clock-frequency = <2>; // filled in by bootloader }; diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi index af12ead..47125a6 100644 --- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi @@ -136,13 +136,15 @@ can0: can@1c000 { compatible = "fsl,p1010-flexcan"; reg = <0x1c000 0x1000>; - interrupts = <48 0x2 0 0>; + interrupts = <48 0x2 0 0 + 16 0x2 0 0>; }; can1: can@1d000 { compatible = "fsl,p1010-flexcan"; reg = <0x1d000 0x1000>; - interrupts = <61 0x2 0 0>; + interrupts = <61 0x2 0 0 + 16 0x2 0 0>; }; L2: l2-cache-controller@2 { -- 1.8.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
(2014/06/20 9:37), Michael Ellerman wrote: > On Thu, 2014-06-19 at 20:20 +0900, Masami Hiramatsu wrote: >> (2014/06/19 20:01), Masami Hiramatsu wrote: >> >>> Ah, those messages should be shown in dmesg when booting if it doesn't >>> work, >>> because the messages are printed by initialization process of kprobe >>> blacklist. >>> So, reproducing it is just enabling CONFIG_KPROBES and boot it. >> Well, we don't get those messages on Power, since the kallsyms has the >> entries for ".function_name". The correct way to verify is, either : > > Hmm, that seems another issue on powerpc. Is that expected(and designed) > behavior? AFAIK, yes, it is. To be more precise : we have 'foo' and '.foo' for a function foo(), where 'foo' points to the function_entry and '.foo' points to the actual function. >>> >>> Ah, I see. So if we run >>> >>> func_ptr p = foo; >>> return p == kallsyms_lookup_name(".foo"); >>> >>> it returns true. >> >> One more thing I should know, is the address of ".function_name" within the >> kernel text? In other words, does kernel_text_address() return true for that >> address? If not, it's easy to verify the address. > > Yes. That is the text address, kernel_text_address() should definitely return > true. > > On 64-bit, ABIv1, "foo" points to the function descriptor, in the ".opd" > section. > > ".foo" points to the actual text of the function, in ".text". Hmm, I misunderstood that. Anyway, we can verify it by kernel_text_address(). > > On 64-bit, ABIv2, "foo" points to the text in ".text". There are no dot > symbols. OK, in that case, kernel_text_address() check is still available. :) Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v4] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64
On ia64 and ppc64, the function pointer does not point the entry address of the function, but the address of function discriptor (which contains the entry address and misc data.) Since the kprobes passes the function pointer stored by NOKPROBE_SYMBOL() to kallsyms_lookup_size_offset() for initalizing its blacklist, it fails and reports many errors as below. Failed to find blacklist 000101316830 Failed to find blacklist 0001013000f0a000 Failed to find blacklist 000101315f70a000 Failed to find blacklist 000101324c80a000 Failed to find blacklist 0001013063f0a000 Failed to find blacklist 000101327800a000 Failed to find blacklist 0001013277f0a000 Failed to find blacklist 000101315a70a000 Failed to find blacklist 0001013277e0a000 Failed to find blacklist 000101305a20a000 Failed to find blacklist 0001013277d0a000 Failed to find blacklist 00010130bdc0a000 Failed to find blacklist 00010130dc20a000 Failed to find blacklist 000101309a00a000 Failed to find blacklist 0001013277c0a000 Failed to find blacklist 0001013277b0a000 Failed to find blacklist 0001013277a0a000 Failed to find blacklist 000101327790a000 Failed to find blacklist 000101303140a000 Failed to find blacklist 0001013a3280a000 To fix this bug, this introduces function_entry() macro to retrieve the entry address from the given function pointer, and uses for kallsyms_lookup_size_offset() while initializing blacklist. Changes in v4: - Add kernel_text_address() check for verifying the address. - Moved on the latest linus tree. Changes in v3: - Fix a bug to get blacklist address based on function entry instead of function descriptor. (Suzuki's work, Thanks!) Changes in V2: - Use function_entry() macro when lookin up symbols instead of storing it. - Update for the latest -next. Signed-off-by: Masami Hiramatsu Reported-by: Tony Luck Tested-by: Tony Luck Cc: Michael Ellerman Cc: Suzuki K. Poulose Cc: Tony Luck Cc: Fenghua Yu Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Ananth N Mavinakayanahalli Cc: Kevin Hao Cc: linux-i...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org --- arch/ia64/include/asm/types.h|2 ++ arch/powerpc/include/asm/types.h | 11 +++ include/linux/types.h|4 kernel/kprobes.c | 15 ++- 4 files changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/ia64/include/asm/types.h b/arch/ia64/include/asm/types.h index 4c351b1..95279dd 100644 --- a/arch/ia64/include/asm/types.h +++ b/arch/ia64/include/asm/types.h @@ -27,5 +27,7 @@ struct fnptr { unsigned long gp; }; +#define function_entry(fn) (((struct fnptr *)(fn))->ip) + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_IA64_TYPES_H */ diff --git a/arch/powerpc/include/asm/types.h b/arch/powerpc/include/asm/types.h index bfb6ded..8b89d65 100644 --- a/arch/powerpc/include/asm/types.h +++ b/arch/powerpc/include/asm/types.h @@ -25,6 +25,17 @@ typedef struct { unsigned long env; } func_descr_t; +#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF == 1) +/* + * On PPC64 ABIv1 the function pointer actually points to the + * function's descriptor. The first entry in the descriptor is the + * address of the function text. + */ +#define function_entry(fn) (((func_descr_t *)(fn))->entry) +#else +#define function_entry(fn) ((unsigned long)(fn)) +#endif + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_TYPES_H */ diff --git a/include/linux/types.h b/include/linux/types.h index a0bb704..3b95369 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -213,5 +213,9 @@ struct callback_head { }; #define rcu_head callback_head +#ifndef function_entry +#define function_entry(fn) ((unsigned long)(fn)) +#endif + #endif /* __ASSEMBLY__ */ #endif /* _LINUX_TYPES_H */ diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 3214289..7412535 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -32,6 +32,7 @@ * added function-return probes. */ #include +#include #include #include #include @@ -2037,19 +2038,23 @@ static int __init populate_kprobe_blacklist(unsigned long *start, { unsigned long *iter; struct kprobe_blacklist_entry *ent; - unsigned long offset = 0, size = 0; + unsigned long entry, offset = 0, size = 0; for (iter = start; iter < end; iter++) { - if (!kallsyms_lookup_size_offset(*iter, &size, &offset)) { - pr_err("Failed to find blacklist %p\n", (void *)*iter); + entry = function_entry(*iter); + + if (!kernel_text_address(entry) || + !kallsyms_lookup_size_offset(entry, &size, &offset)) { + pr_err("Failed to find blacklist at %p\n", + (void *)entry); continue; } ent = kmalloc(sizeof(*ent), GFP_KERNEL); i
Re: [PATCH] vfio: Fix endianness handling for emulated BARs
On Thu, 2014-06-19 at 13:48 +1000, Alexey Kardashevskiy wrote: > On 06/19/2014 11:50 AM, Alexey Kardashevskiy wrote: > > On 06/19/2014 10:50 AM, Alexey Kardashevskiy wrote: > >> On 06/19/2014 04:35 AM, Alex Williamson wrote: > >>> On Wed, 2014-06-18 at 21:36 +1000, Alexey Kardashevskiy wrote: > VFIO exposes BARs to user space as a byte stream so userspace can > read it using pread()/pwrite(). Since this is a byte stream, VFIO should > not do byte swapping and simply return values as it gets them from > PCI device. > > Instead, the existing code assumes that byte stream in read/write is > little-endian and it fixes endianness for values which it passes to > ioreadXX/iowriteXX helpers. This works for little-endian as PCI is > little endian and le32_to_cpu/... are stubs. > >>> > >>> vfio read32: > >>> > >>> val = cpu_to_le32(ioread32(io + off)); > >>> > >>> Where the typical x86 case, ioread32 is: > >>> > >>> #define ioread32(addr) readl(addr) > >>> > >>> and readl is: > >>> > >>> __le32_to_cpu(__raw_readl(addr)); > >>> > >>> So we do canceling byte swaps, which are both nops on x86, and end up > >>> returning device endian, which we assume is little endian. > >>> > >>> vfio write32 is similar: > >>> > >>> iowrite32(le32_to_cpu(val), io + off); > >>> > >>> The implicit cpu_to_le32 of iowrite32() and our explicit swap cancel > >>> out, so input data is device endian, which is assumed little. > >>> > This also works for big endian but rather by an accident: it reads 4 > bytes > from the stream (@val is big endian), converts to CPU format (which > should > be big endian) as it was little endian (@val becomes actually little > endian) and calls iowrite32() which does not do swapping on big endian > system. > >>> > >>> Really? > >>> > >>> In arch/powerpc/kernel/iomap.c iowrite32() is just a wrapper around > >>> writel(), which seems to use the generic implementation, which does > >>> include a cpu_to_le32. > >> > >> > >> Ouch, wrong comment. iowrite32() does swapping. My bad. > >> > >> > >>> > >>> I also see other big endian archs like parisc doing cpu_to_le32 on > >>> iowrite32, so I don't think this statement is true. I imagine it's > >>> probably working for you because the swap cancel. > >>> > This removes byte swapping and makes use ioread32be/iowrite32be > (and 16bit versions) on big-endian systems. The "be" helpers take > native endian values and do swapping at the moment of writing to a PCI > register using one of "store byte-reversed" instructions. > >>> > >>> So now you want iowrite32() on little endian and iowrite32be() on big > >>> endian, the former does a cpu_to_le32 (which is a nop on little endian) > >>> and the latter does a cpu_to_be32 (which is a nop on big endian)... > >>> should we just be using __raw_writel() on both? > >> > >> > >> We can do that too. The beauty of iowrite32be on ppc64 is that it does not > >> swap and write separately, it is implemented via the "Store Word > >> Byte-Reverse Indexed X-form" single instruction. > >> > >> And some archs (do not know which ones) may add memory barriers in their > >> implementations of ioread/iowrite. __raw_writel is too raw :) > >> > >>> There doesn't actually > >>> seem to be any change in behavior here, it just eliminates back-to-back > >>> byte swaps, which are a nop on x86, but not power, right? > >> > >> Exactly. No dependency for QEMU. > > > > How about that: > > === > > > > VFIO exposes BARs to user space as a byte stream so userspace can > > read it using pread()/pwrite(). Since this is a byte stream, VFIO should > > not do byte swapping and simply return values as it gets them from > > PCI device. > > > > Instead, the existing code assumes that byte stream in read/write is > > little-endian and it fixes endianness for values which it passes to > > ioreadXX/iowriteXX helpers in native format. The IO helpers do swapping > > again. Since both byte swaps are nops on little-endian host, this works. > > > > This also works for big endian but rather by an accident: it reads 4 bytes > > from the stream (@val is big endian), converts to CPU format (which should > > be big endian) as it was little endian (and @val becomes actually little > > endian) and calls iowrite32() which does swapping on big endian > > system again. So byte swap gets cancelled, __raw_writel() receives > > a native value and then > > *(volatile unsigned int __force *)PCI_FIX_ADDR(addr) = v; > > just does the right thing. > > I am wrong here, sorry. This is what happens when you watch soccer between > 2am and 4am :) > > > > > > This removes byte swaps and makes use of ioread32be/iowrite32be > > (and 16bit versions) which do explicit byte swapping at the moment > > of write to a PCI register. PPC64 uses a special "Store Word > > Byte-Reverse Indexed X-form" instruction which does swap and store. > > No swapping is done here if we use ioread32be as it calls
Re: [RFC PATCH V3 14/17] ppc/pci: create/release dev-tree node for VFs
On Thu, Jun 19, 2014 at 09:30:47AM +0100, Grant Likely wrote: >On Thu, Jun 19, 2014 at 3:46 AM, Wei Yang wrote: >> On Wed, Jun 18, 2014 at 07:26:27PM +0100, Grant Likely wrote: >>>On Tue, Jun 10, 2014 at 2:56 AM, Wei Yang wrote: Currently, powernv platform is not aware of VFs. This means no dev-node represents a VF. Also, VF PCI device is created when PF driver want to enable it. This leads to the pdn->pdev and pdn->pe_number an invalid value. This patch create/release dev-node for VF and fixs this when a VF's pci_dev is created. Signed-off-by: Wei Yang >>> >>>I don't think this is the right way to handle this. Unless it is a >>>fixup to a buggy devicetree provided by firmware, I don't want to see >>>any code modifying the devicetree to describe stuff that is able to be >>>directly enumerated. Really the pci code should handle the lack of a >>>device_node gracefully. If it cannot then it should be fixed. >> >> Grant, >> >> Glad to see your comment. >> >> I will fix this in the firmware. > >That's not really what I meant. The kernel should be able to deal with >virtual functions even if firmware doesn't know how, and the kernel >should not require modifying the device tree to support them. > >I'm saying fix the kernel so that a device node is not necessary for >virtual functions. > >g. Grant, After doing some investigation, I found there are two places might highly rely on these information. And not only VFs, but also PFs. 1. pnv_pci_read_config()/pnv_pci_cfg_read() When doing config space read, this needs the information of the phb. In commit 61305a96, the phb is retrived from the bus, and in commit 9bf41be6 it turns to use the device node for EEH hotplug case. Also VF may face similar case for EEH hotplug.(This is under dev) To get rid of the device node/pci_dn, we need a special handling for VFs. Hmm... it looks not nice. 2. pnv_pci_ioda_dma_dev_setup()/pnv_pci_ioda_dma_set_mask() In pci_dn, there is a field: pe_number. This is used to retrive the correct PE this pci device associated with. If we don't have a pci_dn for a VF, we need to store this information to another place. Like in the PF's pci_dn? Hmm... looks not nice neither. Generally, we could find a workaround make the VFs work without device node/pci_dn, but it would do some harm to the infrastructure, make it not consistant and not easy to read/maintain. Currently I don't find a neat way to just get rid of device node/pci_dn for VFs only. May require a careful restructure to do so. BTW, my understanding may not be correct. If you have better idea, please let me know :-) Thanks a lot. -- Richard Yang Help you, Help me ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev