Re: [Xen-devel] [PATCH 1/5] xen-blkback: don't store dev_bus_addr
>>> On 19.03.13 at 15:32, Konrad Rzeszutek Wilk wrote: > On Mon, Mar 18, 2013 at 05:49:32PM +0100, Roger Pau Monne wrote: >> dev_bus_addr returned in the grant ref map operation is the mfn of the >> passed page, there's no need to store it in the persistent grant >> entry, since we can always get it provided that we have the page. >> >> This reduces the memory overhead of persistent grants in blkback. > > I took this patch, but I redid it a bit: > > commit 1d4cb410befdb8b373c6fad604b39e0200e0bee0 > Author: Roger Pau Monne > Date: Mon Mar 18 17:49:32 2013 +0100 > > xen-blkback: don't store dev_bus_addr > > dev_bus_addr returned in the grant ref map operation is the mfn of the > passed page, there's no need to store it in the persistent grant > entry, since we can always get it provided that we have the page. > > This reduces the memory overhead of persistent grants in blkback. > > While at it, rename the 'seg[i].buf' to be 'seg[i].offset' as > it makes much more sense - as we use that value in bio_add_page > which as the fourth argument expects the offset. > > We hadn't used the physical address as part of this at all. > > Signed-off-by: Roger Pau Monné > Cc: Konrad Rzeszutek Wilk > Cc: xen-de...@lists.xen.org > [v1: s/buf/offset/] > Signed-off-by: Konrad Rzeszutek Wilk > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index 2cf8381..061c202 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -442,7 +442,7 @@ int xen_blkif_schedule(void *arg) > } > > struct seg_buf { > - unsigned long buf; > + unsigned long offset; If you touch this anyway, why don't you reduce the type to "unsigned int", halving the overall structure size? Even more, the field seems pointless to me altogether, since ... > unsigned int nsec; > }; > /* > @@ -621,30 +621,21 @@ static int xen_blkbk_map(struct blkif_request *req, >* If this is a new persistent grant >* save the handler >*/ > - persistent_gnts[i]->handle = map[j].handle; > - persistent_gnts[i]->dev_bus_addr = > - map[j++].dev_bus_addr; > + persistent_gnts[i]->handle = map[j++].handle; > } > pending_handle(pending_req, i) = > persistent_gnts[i]->handle; > > if (ret) > continue; > - > - seg[i].buf = persistent_gnts[i]->dev_bus_addr | > - (req->u.rw.seg[i].first_sect << 9); > } else { > - pending_handle(pending_req, i) = map[j].handle; > + pending_handle(pending_req, i) = map[j++].handle; > bitmap_set(pending_req->unmap_seg, i, 1); > > - if (ret) { > - j++; > + if (ret) > continue; > - } > - > - seg[i].buf = map[j++].dev_bus_addr | > - (req->u.rw.seg[i].first_sect << 9); > } > + seg[i].offset = (req->u.rw.seg[i].first_sect << 9); ... this uses "i" as index on both sides, so ... > } > return ret; > } > @@ -971,7 +962,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > (bio_add_page(bio, >pages[i], >seg[i].nsec << 9, > - seg[i].buf & ~PAGE_MASK) == 0)) { > + seg[i].offset & ~PAGE_MASK) == 0)) { ... this one could as well use the original field. And the masking with ~PAGE_MASK is not pointless in any case. Jan > > bio = bio_alloc(GFP_KERNEL, nseg-i); > if (unlikely(bio == NULL)) > diff --git a/drivers/block/xen-blkback/common.h > b/drivers/block/xen-blkback/common.h > index da78346..60103e2 100644 > --- a/drivers/block/xen-blkback/common.h > +++ b/drivers/block/xen-blkback/common.h > @@ -187,7 +187,6 @@ struct persistent_gnt { > struct page *page; > grant_ref_t gnt; > grant_handle_t handle; > - uint64_t dev_bus_addr; > struct rb_node node; > }; > > > ___ > Xen-devel mailing list > xen-de...@lists.xen.org > http://lists.xen.org/xen-devel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH 1/5] xen-blkback: don't store dev_bus_addr
>>> On 19.03.13 at 16:10, Konrad Rzeszutek Wilk wrote: > On Tue, Mar 19, 2013 at 02:55:56PM +, Jan Beulich wrote: >> >>> On 19.03.13 at 15:32, Konrad Rzeszutek Wilk >> >>> wrote: >> > @@ -971,7 +962,7 @@ static int dispatch_rw_block_io(struct xen_blkif >> > *blkif, >> > (bio_add_page(bio, >> > pages[i], >> > seg[i].nsec << 9, >> > - seg[i].buf & ~PAGE_MASK) == 0)) { >> > + seg[i].offset & ~PAGE_MASK) == 0)) { >> >> ... this one could as well use the original field. >> >> And the masking with ~PAGE_MASK is not pointless in any case. > > Good point. In which might as well make the 'struct seg_buf' be an > simple array of unsigned int. Looks like you understood that the "not" in my earlier response was meant to be "now". Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Intel][RNG] PCI IDs commented out for 82801I (ICH9 Mobile and non-mobile)?
>>> On 20.03.13 at 21:52, "H. Peter Anvin" wrote: > On 03/20/2013 01:42 PM, Shawn Starr wrote: >> Hello folks, >> >> I was looking at why I can't load the Intel RNG driver (or why it doesn't > load automatically) and >> it just so happens I have both the mobile and non-mobile ICH9 chipset. > Looking at the driver I noticed: >> >> /* BAM, CAM, DBM, FBM, GxM >>{ PCI_DEVICE(0x8086, 0x2448) }, */ >> >> /* BA, CA, DB, Ex, 6300, Fx, 631x/632x, Gx >>{ PCI_DEVICE(0x8086, 0x244e) }, */ >> >> IDs from both machines: >> >> 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge > [8086:2448] (rev 93) >> 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] > (rev 92) >> >> I saw a thread from 2006 explaining the 50% chance there's no RNG (but these > days all modern chipsets should have an RNG) and I know >> this chipset I have does have an RNG so I'd like to use the HW random > generator vs software. >> > > Do you know that for sure? I haven't seen _any_ half way recent system having this old style FWH. >> Do we need to revisit this? Even if I try to force it to load it still fails: >> >> # modprobe intel_rng no_fwh_detect=-1 or =1 >> modprobe: ERROR: could not insert 'intel_rng': No such device >> > > The Intel RNG of that era lived in the Firmware Hub (a.k.a. BIOS flash) > rather than in the chipset proper... so even with the right chipset you > might or might not actually have the the RNG. The code really detects > the presence of a chipset which could support a FWH flash part. > > Jan, in c24c95a085c6b52c11c2f5afecc38b0ca143cdae you comment out a > number of PCI IDs. Was that intentional? Yes, as explained in the description (albeit, reading it now, 6.5 years later, again, I admit it could have been written in better ways): Before that change, the selection of which particular device from a chipset was chose was more or less random. But with the new code now actually accessing the device, it has to be the LPC one. And the list of devices stops at ICH7 - that was, afaict now, the newest one in existence back then. Therefore, if there really are newer chipsets with FWH, the code would need auditing for correctness when intending to add further PCI IDs. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] xen-pciback: notify hypervisor about devices intended to be assigned to guests
For MSI-X capable devices the hypervisor wants to write protect the MSI-X table and PBA, yet it can't assume that resources have been assigned to their final values at device enumeration time. Thus have pciback do that notification, as having the device controlled by it is a prerequisite to assigning the device to guests anyway. Signed-off-by: Jan Beulich --- arch/x86/include/asm/xen/hypercall.h |4 +- drivers/xen/fallback.c |3 + drivers/xen/xen-pciback/pci_stub.c | 59 ++- include/xen/interface/physdev.h |6 +++ 4 files changed, 54 insertions(+), 18 deletions(-) --- 3.8-rc6/arch/x86/include/asm/xen/hypercall.h +++ 3.8-rc6-xen-pciback-MSI-X-prepare/arch/x86/include/asm/xen/hypercall.h @@ -382,14 +382,14 @@ HYPERVISOR_console_io(int cmd, int count return _hypercall3(int, console_io, cmd, count, str); } -extern int __must_check HYPERVISOR_physdev_op_compat(int, void *); +extern int __must_check xen_physdev_op_compat(int, void *); static inline int HYPERVISOR_physdev_op(int cmd, void *arg) { int rc = _hypercall2(int, physdev_op, cmd, arg); if (unlikely(rc == -ENOSYS)) - rc = HYPERVISOR_physdev_op_compat(cmd, arg); + rc = xen_physdev_op_compat(cmd, arg); return rc; } --- 3.8-rc6/drivers/xen/fallback.c +++ 3.8-rc6-xen-pciback-MSI-X-prepare/drivers/xen/fallback.c @@ -44,7 +44,7 @@ int xen_event_channel_op_compat(int cmd, } EXPORT_SYMBOL_GPL(xen_event_channel_op_compat); -int HYPERVISOR_physdev_op_compat(int cmd, void *arg) +int xen_physdev_op_compat(int cmd, void *arg) { struct physdev_op op; int rc; @@ -78,3 +78,4 @@ int HYPERVISOR_physdev_op_compat(int cmd return rc; } +EXPORT_SYMBOL_GPL(xen_physdev_op_compat); --- 3.8-rc6/drivers/xen/xen-pciback/pci_stub.c +++ 3.8-rc6-xen-pciback-MSI-X-prepare/drivers/xen/xen-pciback/pci_stub.c @@ -17,6 +17,7 @@ #include #include #include +#include #include "pciback.h" #include "conf_space.h" #include "conf_space_quirks.h" @@ -85,37 +86,52 @@ static struct pcistub_device *pcistub_de static void pcistub_device_release(struct kref *kref) { struct pcistub_device *psdev; + struct pci_dev *dev; struct xen_pcibk_dev_data *dev_data; psdev = container_of(kref, struct pcistub_device, kref); - dev_data = pci_get_drvdata(psdev->dev); + dev = psdev->dev; + dev_data = pci_get_drvdata(dev); - dev_dbg(&psdev->dev->dev, "pcistub_device_release\n"); + dev_dbg(&dev->dev, "pcistub_device_release\n"); - xen_unregister_device_domain_owner(psdev->dev); + xen_unregister_device_domain_owner(dev); /* Call the reset function which does not take lock as this * is called from "unbind" which takes a device_lock mutex. */ - __pci_reset_function_locked(psdev->dev); - if (pci_load_and_free_saved_state(psdev->dev, - &dev_data->pci_saved_state)) { - dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n"); - } else - pci_restore_state(psdev->dev); + __pci_reset_function_locked(dev); + if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state)) + dev_dbg(&dev->dev, "Could not reload PCI state\n"); + else + pci_restore_state(dev); + + if (pci_find_capability(dev, PCI_CAP_ID_MSIX)) { + struct physdev_pci_device ppdev = { + .seg = pci_domain_nr(dev->bus), + .bus = dev->bus->number, + .devfn = dev->devfn + }; + int err = HYPERVISOR_physdev_op(PHYSDEVOP_release_msix, + &ppdev); + + if (err) + dev_warn(&dev->dev, "MSI-X release failed (%d)\n", +err); + } /* Disable the device */ - xen_pcibk_reset_device(psdev->dev); + xen_pcibk_reset_device(dev); kfree(dev_data); - pci_set_drvdata(psdev->dev, NULL); + pci_set_drvdata(dev, NULL); /* Clean-up the device */ - xen_pcibk_config_free_dyn_fields(psdev->dev); - xen_pcibk_config_free_dev(psdev->dev); + xen_pcibk_config_free_dyn_fields(dev); + xen_pcibk_config_free_dev(dev); - psdev->dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED; - pci_dev_put(psdev->dev); + dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED; + pci_dev_put(dev); kfree(psdev); } @@ -355,6 +371,19 @@ static int pcistub_init_device(struct pc if (err) goto config_release; + if (pci_find_capability(dev, PCI_
Re: [PATCH] xen-pciback: notify hypervisor about devices intended to be assigned to guests
>>> On 06.02.13 at 18:12, Konrad Rzeszutek Wilk wrote: >> +if (pci_find_capability(dev, PCI_CAP_ID_MSIX)) { >> +struct physdev_pci_device ppdev = { >> +.seg = pci_domain_nr(dev->bus), >> +.bus = dev->bus->number, >> +.devfn = dev->devfn >> +}; >> +int err = HYPERVISOR_physdev_op(PHYSDEVOP_release_msix, >> +&ppdev); >> + >> +if (err) >> +dev_warn(&dev->dev, "MSI-X release failed (%d)\n", >> + err); >> +} > > Perhaps it should be more off: > > if (err) { > if (err == -ENOSYS) > dev_info(&dev->dev,"MSI-X release > hypercall not supported."); > else > dev_warn(&dev->dev, "MSI-X release failed > (%d)\n", >err); > Why would you want to special case this? The more that _really_ old hypervisors returned -EINVAL instead of -ENOSYS here? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 03.04.13 at 14:56, William Dauchy wrote: > On Thu, Feb 28, 2013 at 3:34 AM, Chen Gang wrote: >> additional information: >> before commit 01c681d4c70d64cb72142a2823f27c4146a02e63, the value printed >> here was bogus, as it was the guest provided value from req->u.rw.handle >> rather than the actual device. >> >> Signed-off-by: Chen Gang >> Acked-by: Jan Beulich > > Sorry I'm a bit late but since since 01c681d is in stable, I guess > a72d900 (xen/xen-blkback: preq.dev is used without initialized) > could be added in stable as well. > Am I wrong? otherwise I'm ok to do the request. Iirc we requested the earlier commit to be removed from stable trees, and I think Greg also did so. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 03.04.13 at 15:25, William Dauchy wrote: > On Wed, Apr 3, 2013 at 3:01 PM, Jan Beulich wrote: >> Iirc we requested the earlier commit to be removed from stable >> trees, and I think Greg also did so. > > I'm sorry but I'm unable to find a revert of 01c681d in stable tree. ChangeLog-3.8.3 for example has commit 32f4d10ed8fd7ef4cebbf02c5326e8bb6aeca9b1 Author: Greg Kroah-Hartman Date: Tue Mar 12 15:08:26 2013 -0700 Revert "xen/blkback: Don't trust the handle from the frontend." This reverts commit 01c681d4c70d64cb72142a2823f27c4146a02e63 upstream (ef56ca64ea733c3b88f0bb74b04da128b1dc35d8 in this tree), as it wasn't supposed to have been applied to the stable tree. Cc: Jan Beulich Cc: Ian Campbell Cc: Konrad Rzeszutek Wilk Signed-off-by: Greg Kroah-Hartman Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 03.04.13 at 15:56, William Dauchy wrote: > On Wed, Apr 3, 2013 at 3:42 PM, Jan Beulich wrote: >> ChangeLog-3.8.3 for example has > > oh sorry, you are right. I wasn't looking is the 3.8.x branch. > The thing is, the revert seems present only in 3.8.x branch. For > example in 3.4.x the last patch is still 01c681d > Should we consider this normal or is it a mistake? I think it is a mistake, but ultimately it's Konrad's call. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86: fix rebuild with EFI_STUB enabled
eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence their .cmd files don't get included by the build machinery, leading to the files always getting rebuilt. Rather than adding the two files individually, take the opportunity and add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment at the top of the file to be shrunk quite a bit. At the same time, remove a pointless flags override line - the variable assigned to was misspelled anyway, and the options added are meaningless for assembly sources. Signed-off-by: Jan Beulich Cc: Matthew Garrett Cc: Matt Fleming --- arch/x86/boot/compressed/Makefile |5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) --- 3.9-rc5/arch/x86/boot/compressed/Makefile +++ 3.9-rc5-x86-EFI-stub-rebuild/arch/x86/boot/compressed/Makefile @@ -4,7 +4,7 @@ # create a compressed vmlinux image from the original vmlinux # -targets := vmlinux.lds vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma vmlinux.bin.xz vmlinux.bin.lzo head_$(BITS).o misc.o string.o cmdline.o early_serial_console.o piggy.o +targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma vmlinux.bin.xz vmlinux.bin.lzo KBUILD_CFLAGS := -m$(BITS) -D__KERNEL__ $(LINUX_INCLUDE) -O2 KBUILD_CFLAGS += -fno-strict-aliasing -fPIC @@ -29,7 +29,6 @@ VMLINUX_OBJS = $(obj)/vmlinux.lds $(obj) $(obj)/piggy.o $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone -$(obj)/efi_stub_$(BITS).o: KBUILD_CLFAGS += -fshort-wchar -mno-red-zone ifeq ($(CONFIG_EFI_STUB), y) VMLINUX_OBJS += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o @@ -43,7 +42,7 @@ OBJCOPYFLAGS_vmlinux.bin := -R .comment $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) -targets += vmlinux.bin.all vmlinux.relocs +targets += $(patsubst $(obj)/%,%,$(VMLINUX_OBJS)) vmlinux.bin.all vmlinux.relocs CMD_RELOCS = arch/x86/tools/relocs quiet_cmd_relocs = RELOCS $@ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] xen: drop tracking of IRQ vector
For quite a few Xen versions, this wasn't the IRQ vector anymore anyway, and it is not being used by the kernel for anything. Hence drop the field from struct irq_info, and respective function parameters. Signed-off-by: Jan Beulich Cc: Stefano Stabellini --- arch/x86/pci/xen.c |6 +++--- drivers/xen/events.c | 13 - include/xen/events.h |3 +-- 3 files changed, 8 insertions(+), 14 deletions(-) --- 3.9-rc5/arch/x86/pci/xen.c +++ 3.9-rc5-xen-irq-no-vector/arch/x86/pci/xen.c @@ -177,7 +177,7 @@ static int xen_setup_msi_irqs(struct pci goto error; i = 0; list_for_each_entry(msidesc, &dev->msi_list, list) { - irq = xen_bind_pirq_msi_to_irq(dev, msidesc, v[i], 0, + irq = xen_bind_pirq_msi_to_irq(dev, msidesc, v[i], (type == PCI_CAP_ID_MSIX) ? "pcifront-msi-x" : "pcifront-msi", @@ -244,7 +244,7 @@ static int xen_hvm_setup_msi_irqs(struct dev_dbg(&dev->dev, "xen: msi already bound to pirq=%d\n", pirq); } - irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq, 0, + irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq, (type == PCI_CAP_ID_MSIX) ? "msi-x" : "msi", DOMID_SELF); @@ -326,7 +326,7 @@ static int xen_initdom_setup_msi_irqs(st } ret = xen_bind_pirq_msi_to_irq(dev, msidesc, - map_irq.pirq, map_irq.index, + map_irq.pirq, (type == PCI_CAP_ID_MSIX) ? "msi-x" : "msi", domid); --- 3.9-rc5/drivers/xen/events.c +++ 3.9-rc5-xen-irq-no-vector/drivers/xen/events.c @@ -85,8 +85,7 @@ enum xen_irq_type { * event channel - irq->event channel mapping * cpu - cpu this event channel is bound to * index - type-specific information: - *PIRQ - vector, with MSB being "needs EIO", or physical IRQ of the HVM - * guest, or GSI (real passthrough IRQ) of the device. + *PIRQ - physical IRQ, GSI, flags, and owner domain *VIRQ - virq number *IPI - IPI vector *EVTCHN - @@ -105,7 +104,6 @@ struct irq_info { struct { unsigned short pirq; unsigned short gsi; - unsigned char vector; unsigned char flags; uint16_t domid; } pirq; @@ -211,7 +209,6 @@ static void xen_irq_info_pirq_init(unsig unsigned short evtchn, unsigned short pirq, unsigned short gsi, - unsigned short vector, uint16_t domid, unsigned char flags) { @@ -221,7 +218,6 @@ static void xen_irq_info_pirq_init(unsig info->u.pirq.pirq = pirq; info->u.pirq.gsi = gsi; - info->u.pirq.vector = vector; info->u.pirq.domid = domid; info->u.pirq.flags = flags; } @@ -714,7 +710,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gs goto out; } - xen_irq_info_pirq_init(irq, 0, pirq, gsi, irq_op.vector, DOMID_SELF, + xen_irq_info_pirq_init(irq, 0, pirq, gsi, DOMID_SELF, shareable ? PIRQ_SHAREABLE : 0); pirq_query_unmask(irq); @@ -762,8 +758,7 @@ int xen_allocate_pirq_msi(struct pci_dev } int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc, -int pirq, int vector, const char *name, -domid_t domid) +int pirq, const char *name, domid_t domid) { int irq, ret; @@ -776,7 +771,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_ irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, name); - xen_irq_info_pirq_init(irq, 0, pirq, 0, vector, domid, 0); + xen_irq_info_pirq_init(irq, 0, pirq, 0, domid, 0); ret = irq_set_msi_desc(irq, msidesc); if (ret < 0) goto error_irq; --- 3.9-rc5/include/xen/events.h +++ 3.9-rc5-xen-irq-no-vector/include/xen/events.h @@ -90,8 +90,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gs int xen_allocate_pirq_msi(struct pci_dev *dev, struct msi_desc *msidesc); /* Bind an PSI pirq to an irq. */ int xen_bind_pirq_m
Re: [PATCH] x86: fix rebuild with EFI_STUB enabled
>>> On 03.04.13 at 16:48, "H. Peter Anvin" wrote: > This looks awesome for 3.10, but getting a minimal fix for 3.9/stable would > be good, too. Do you really view this as relevant for stable? Considering that this had been this way for a while with apparently no-one having noticed, I wouldn't think so. Nor would I see a strong need for this to go into 3.9. Jan > Jan Beulich wrote: > >>eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence >>their .cmd files don't get included by the build machinery, leading to >>the files always getting rebuilt. >> >>Rather than adding the two files individually, take the opportunity and >>add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment >>at the top of the file to be shrunk quite a bit. >> >>At the same time, remove a pointless flags override line - the variable >>assigned to was misspelled anyway, and the options added are >>meaningless for assembly sources. >> >>Signed-off-by: Jan Beulich >>Cc: Matthew Garrett >>Cc: Matt Fleming >>--- >> arch/x86/boot/compressed/Makefile |5 ++--- >> 1 file changed, 2 insertions(+), 3 deletions(-) >> >>--- 3.9-rc5/arch/x86/boot/compressed/Makefile >>+++ 3.9-rc5-x86-EFI-stub-rebuild/arch/x86/boot/compressed/Makefile >>@@ -4,7 +4,7 @@ >> # create a compressed vmlinux image from the original vmlinux >> # >> >>-targets := vmlinux.lds vmlinux vmlinux.bin vmlinux.bin.gz >>vmlinux.bin.bz2 vmlinux.bin.lzma vmlinux.bin.xz vmlinux.bin.lzo >>head_$(BITS).o misc.o string.o cmdline.o early_serial_console.o piggy.o >>+targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 >>vmlinux.bin.lzma vmlinux.bin.xz vmlinux.bin.lzo >> >> KBUILD_CFLAGS := -m$(BITS) -D__KERNEL__ $(LINUX_INCLUDE) -O2 >> KBUILD_CFLAGS += -fno-strict-aliasing -fPIC >>@@ -29,7 +29,6 @@ VMLINUX_OBJS = $(obj)/vmlinux.lds $(obj) >> $(obj)/piggy.o >> >> $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone >>-$(obj)/efi_stub_$(BITS).o: KBUILD_CLFAGS += -fshort-wchar >>-mno-red-zone >> >> ifeq ($(CONFIG_EFI_STUB), y) >> VMLINUX_OBJS += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o >>@@ -43,7 +42,7 @@ OBJCOPYFLAGS_vmlinux.bin := -R .comment >> $(obj)/vmlinux.bin: vmlinux FORCE >> $(call if_changed,objcopy) >> >>-targets += vmlinux.bin.all vmlinux.relocs >>+targets += $(patsubst $(obj)/%,%,$(VMLINUX_OBJS)) vmlinux.bin.all >>vmlinux.relocs >> >> CMD_RELOCS = arch/x86/tools/relocs >> quiet_cmd_relocs = RELOCS $@ > > -- > Sent from my mobile phone. Please excuse brevity and lack of formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] x86: Hyper-V: register clocksource only if its advertised
>>> On 28.01.13 at 18:44, Stefano Stabellini >>> wrote: > I think that Olaf made his point very clear: a feature A should only be > enabled if the corresponding flag A is set. > In fact it seems to me that this patch is correct on its own merits, > regardless of Xen does or does not. > > The Xen tools might or might not know whether a guest is going to be > Linux, Windows, FreeBSD or whatever else people use nowadays. Setting > viridian=1 is the safe choice, given that it shouldn't create any > issues: after all guests are supposed to check for feature flags before > using them. > > If Xen is going to implement "Partition Reference Counter", it is also > going to set the corresponding flag, so the guest OS (Windows, Linux, > my pet OS) can check whether the feature is available and decide whether > it wants to use it. While I agree in general, the specific case of the callback vector seems a little more difficult: As KY says, there's no feature flag for this (or perhaps more precisely for it being deliverable across all CPUs), and hence there's both the problem of detection and the problem of disambiguation (as otherwise both the Hyper-V code and the Xen code in Linux could be trying to use the same vector). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86-64: replace left over sti/cli in ia32 audit exit code
For some reason they didn't get replaced so far by their paravirt equivalents, resulting in code to be run with interrupts disabled that doesn't expect so (causing, in the observed case, a BUG_ON() to trigger) when syscall auditing is enabled. David (Cc-ed) came up with an identical fix, so likely this can be taken to count as an ack from him. Reported-by: Peter Moody Signed-off-by: Jan Beulich Cc: sta...@vger.kernel.org Cc: Konrad Rzeszutek Wilk Cc: David Vrabel Tested-by: Peter Moody --- arch/x86/ia32/ia32entry.S |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- 3.8-rc5/arch/x86/ia32/ia32entry.S +++ 3.8-rc5-x86_64-paravirt-ia32-audit-exit/arch/x86/ia32/ia32entry.S @@ -207,7 +207,7 @@ sysexit_from_sys_call: testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT),TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) jnz ia32_ret_from_sys_call TRACE_IRQS_ON - sti + ENABLE_INTERRUPTS(CLBR_NONE) movl %eax,%esi /* second arg, syscall return value */ cmpl $-MAX_ERRNO,%eax /* is it an error ? */ jbe 1f @@ -217,7 +217,7 @@ sysexit_from_sys_call: call __audit_syscall_exit movq RAX-ARGOFFSET(%rsp),%rax /* reload syscall return value */ movl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT),%edi - cli + DISABLE_INTERRUPTS(CLBR_NONE) TRACE_IRQS_OFF testl %edi,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) jz \exit -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] X86: Add a check to catch Xen emulation of Hyper-V
>>> On 30.01.13 at 01:51, "K. Y. Srinivasan" wrote: > Xen emulates Hyper-V to host enlightened Windows. Looks like this > emulation may be turned on by default even for Linux guests. Check and > fail Hyper-V detection if we are on Xen. > > Signed-off-by: K. Y. Srinivasan > --- > arch/x86/kernel/cpu/mshyperv.c |7 +++ > 1 files changed, 7 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c > index 646d192..4dab317 100644 > --- a/arch/x86/kernel/cpu/mshyperv.c > +++ b/arch/x86/kernel/cpu/mshyperv.c > @@ -30,6 +30,13 @@ static bool __init ms_hyperv_platform(void) > if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) > return false; > > + /* > + * Xen emulates Hyper-V to support enlightened Windows. > + * Check to see first if we are on a Xen Hypervisor. > + */ > + if (xen_cpuid_base()) > + return false; > + > cpuid(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS, > &eax, &hyp_signature[0], &hyp_signature[1], &hyp_signature[2]); I'm not convinced that's the right approach - any hypervisor could do similar emulation, and hence you either want to make sure you run on Hyper-V (by excluding all others), or you tolerate using the emulation (which may require syncing up with the other guest implementations so that shared resources don't get used by two parties). I also wonder whether using the Hyper-V emulation (where useful, there might not be anything right now, but this may change going forward) when no Xen support is configured wouldn't be better than not using anything... Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 30.01.13 at 01:51, "K. Y. Srinivasan" wrote: > --- a/arch/x86/kernel/cpu/mshyperv.c > +++ b/arch/x86/kernel/cpu/mshyperv.c > @@ -14,10 +14,15 @@ > #include > #include > #include > +#include > +#include > #include > #include > #include > #include > +#include > +#include > +#include > > struct ms_hyperv_info ms_hyperv; > EXPORT_SYMBOL_GPL(ms_hyperv); > @@ -77,6 +82,12 @@ static void __init ms_hyperv_init_platform(void) > > if (ms_hyperv.features & HV_X64_MSR_TIME_REF_COUNT_AVAILABLE) > clocksource_register_hz(&hyperv_cs, NSEC_PER_SEC/100); > +#if IS_ENABLED(CONFIG_HYPERV) > + /* > + * Setup the IDT for hypervisor callback. > + */ > + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); > +#endif > } > > const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = { > @@ -85,3 +96,30 @@ const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv > = { > .init_platform = ms_hyperv_init_platform, > }; > EXPORT_SYMBOL(x86_hyper_ms_hyperv); > + > +static int vmbus_irq = -1; > +static irq_handler_t vmbus_isr; > + > +void hv_register_vmbus_handler(int irq, irq_handler_t handler) > +{ > + vmbus_irq = irq; > + vmbus_isr = handler; > +} > +EXPORT_SYMBOL_GPL(hv_register_vmbus_handler); > + > +void hyperv_vector_handler(struct pt_regs *regs) > +{ > + struct pt_regs *old_regs = set_irq_regs(regs); > + struct irq_desc *desc; > + > + irq_enter(); > + exit_idle(); > + > + desc = irq_to_desc(vmbus_irq); > + > + if (desc) > + generic_handle_irq_desc(vmbus_irq, desc); > + > + irq_exit(); > + set_irq_regs(old_regs); > +} This function appears to be dead code when !CONFIG_HYPERV, because ... > --- a/arch/x86/kernel/entry_32.S > +++ b/arch/x86/kernel/entry_32.S > @@ -1046,11 +1046,18 @@ ENTRY(xen_failsafe_callback) > _ASM_EXTABLE(4b,9b) > ENDPROC(xen_failsafe_callback) > > -BUILD_INTERRUPT3(xen_hvm_callback_vector, XEN_HVM_EVTCHN_CALLBACK, > +BUILD_INTERRUPT3(xen_hvm_callback_vector, HYPERVISOR_CALLBACK_VECTOR, > xen_evtchn_do_upcall) > > #endif /* CONFIG_XEN */ > > +#if IS_ENABLED(CONFIG_HYPERV) > + > +BUILD_INTERRUPT3(hyperv_callback_vector, HYPERVISOR_CALLBACK_VECTOR, > + hyperv_vector_handler) > + > +#endif /* CONFIG_HYPERV */ > + > #ifdef CONFIG_FUNCTION_TRACER > #ifdef CONFIG_DYNAMIC_FTRACE > ... the consumers here and below are conditional. I also wonder why arch/x86/kernel/cpu/mshyperv.c is being built at all when !CONFIG_HYPERV (which would eliminate the need for the conditional the patch adds to ms_hyperv_init_platform()). Jan > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index d5113c3..c1d01e6 100644 > --- a/arch/x86/kernel/entry_64.S > +++ b/arch/x86/kernel/entry_64.S > @@ -1446,11 +1446,16 @@ ENTRY(xen_failsafe_callback) > CFI_ENDPROC > END(xen_failsafe_callback) > > -apicinterrupt XEN_HVM_EVTCHN_CALLBACK \ > +apicinterrupt HYPERVISOR_CALLBACK_VECTOR \ > xen_hvm_callback_vector xen_evtchn_do_upcall > > #endif /* CONFIG_XEN */ > > +#if IS_ENABLED(CONFIG_HYPERV) > +apicinterrupt HYPERVISOR_CALLBACK_VECTOR \ > + hyperv_callback_vector hyperv_vector_handler > +#endif /* CONFIG_HYPERV */ > + > /* > * Some functions should be protected against kprobes > */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/3] X86: Add a check to catch Xen emulation of Hyper-V
>>> On 30.01.13 at 19:12, KY Srinivasan wrote: > Presumably, Hyper-V emulation is only to run enlightened Windows. The issue > with > Xen is not that it emulates Hyper-V, but this emulation is turned on while > running Linux. > That is the reason I chose to check for Xen. Would you prefer a DMI check > for the Hyper-V > platform. I consider DMI checks to be too fragile here - in particular with the eventual passing through of host DMI attributes to guests, this sets you up for mistakes. Instead, I would envision you scanning the whole CPUID range "reserved" for virtualization (starting at 0x4000) and see whether you get back anything other than the Hyper-V identification (much like the way xen_cpuid_base() scans for the Xen range). Of course that's under the premise that you're right in assuming Hyper-V would never emulate any other hypervisor's interface. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/3] X86: Add a check to catch Xen emulation of Hyper-V
>>> On 31.01.13 at 16:53, KY Srinivasan wrote: > Are there any published standards in terms of how the CPUID space should be > populated in the range from 0x4000 to 0x4001. Specifically, unless I recall having seen this range being marked as reserved for hypervisor use somewhere, but I don't remember where that was. > the standard mandates that all ranges unused by a given hypervisor would > return a known value, how can this code be used to detect the presence of an > unknown hypervisor. Hyper-V is going to return the Hyper-V string at > 0x4000. So, I was planning to scan starting at 0x4100. Clearly, I can > check for a specific hypervisor that I know causes a problem for Hyper-V (as > I > have currently done by checking for Xen). How can I check for the presence of > yet to be created Hypervisors that may emulate Hyper-V by scanning the CPUID > space. I am almost tempted to say that Xen is the special case and the patch > I have submitted addresses that. If a new (or existing hypervisor) plans to > do what Xen is doing, perhaps we can dissuade them from doing that or we can > fix that within the general framework we have here. I'd simply preset ECX, EDX, and EBX to zero, and check whether they change on any EAX input divisible by 256, skipping the one range you know Hyper-V sits in. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 24.01.13 at 02:17, "K. Y. Srinivasan" wrote: > @@ -69,6 +74,11 @@ static void __init ms_hyperv_init_platform(void) > ms_hyperv.features, ms_hyperv.hints); > > clocksource_register_hz(&hyperv_cs, NSEC_PER_SEC/100); > + > + /* > + * Setup the IDT for hypervisor callback. > + */ > + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); Isn't doing this unconditionally here as problematic as the call to clocksource_register_hz() turned out to be when Xen's Hyper-V shim reacts to the CPUID inquiry above? > @@ -77,3 +87,32 @@ const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv > = { > .init_platform = ms_hyperv_init_platform, > }; > EXPORT_SYMBOL(x86_hyper_ms_hyperv); > + > +static int vmbus_irq = -1; > +static irq_handler_t vmbus_isr; > + > +void hv_register_vmbus_handler(int irq, irq_handler_t handler) > +{ > + vmbus_irq = irq; > + vmbus_isr = handler; > +} > +EXPORT_SYMBOL_GPL(hv_register_vmbus_handler); > + > +void hyperv_vector_handler(struct pt_regs *regs) > +{ > + struct pt_regs *old_regs = set_irq_regs(regs); > + struct irq_desc *desc; > + > + irq_enter(); > +#ifdef CONFIG_X86 > + exit_idle(); > +#endif This being in a file underneath arch/x86 - why the conditional? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][GIT PULL][3.8] x86: Don't clobber top of pt_regs in nested NMI
>>> On 23.01.13 at 21:00, Steven Rostedt wrote: > On Mon, 2012-11-05 at 08:58 +, Jan Beulich wrote: >> >>> On 02.11.12 at 20:19, Steven Rostedt wrote: >> > @@ -1842,8 +1851,12 @@ nmi_swapgs: >> >SWAPGS_UNSAFE_STACK >> > nmi_restore: >> >RESTORE_ALL 8 >> > + >> > + /* Pop the extra iret frame */ >> > + addq $(5*8), %rsp >> >> This could (for code efficiency) and should (for CFI annotation >> correctness) be folded into the RESTORE_ALL above (by >> converting "8" to "6*8"). > > This change never made it in. Would you like to send a patch, or would > you want me to do it? Let me do so - I have a patch pending for the CFI part of this already, and simply forgot that folding the two operations would be the simpler solution to the issue. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86-64: fix unwind annotations in recent NMI changes
While in one case a plain annotation is necessary, in the other case the stack adjustment can simply be folded into the immediately preceding RESTORE_ALL, thus getting the correct annotation for free. Signed-off-by: Jan Beulich Cc: Steven Rostedt --- arch/x86/kernel/entry_64.S |7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) --- 3.8-rc4/arch/x86/kernel/entry_64.S +++ 3.8-rc4-x86_64-NMI-unwind/arch/x86/kernel/entry_64.S @@ -1781,6 +1781,7 @@ first_nmi: * Leave room for the "copied" frame */ subq $(5*8), %rsp + CFI_ADJUST_CFA_OFFSET 5*8 /* Copy the stack frame to the Saved frame */ .rept 5 @@ -1863,10 +1864,8 @@ end_repeat_nmi: nmi_swapgs: SWAPGS_UNSAFE_STACK nmi_restore: - RESTORE_ALL 8 - - /* Pop the extra iret frame */ - addq $(5*8), %rsp + /* Pop the extra iret frame at once */ + RESTORE_ALL 6*8 /* Clear the NMI executing stack variable */ movq $0, 5*8(%rsp) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 24.01.13 at 17:07, KY Srinivasan wrote: > >> -Original Message- >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: Thursday, January 24, 2013 3:48 AM >> To: KY Srinivasan >> Cc: o...@aepfle.de; b...@alien8.de; a...@canonical.com; x...@kernel.org; >> t...@linutronix.de; de...@linuxdriverproject.org; > gre...@linuxfoundation.org; >> jasow...@redhat.com; linux-kernel@vger.kernel.org; h...@zytor.com >> Subject: Re: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as >> special hypervisor interrupts >> >> >>> On 24.01.13 at 02:17, "K. Y. Srinivasan" wrote: >> > @@ -69,6 +74,11 @@ static void __init ms_hyperv_init_platform(void) >> > ms_hyperv.features, ms_hyperv.hints); >> > >> >clocksource_register_hz(&hyperv_cs, NSEC_PER_SEC/100); >> > + >> > + /* >> > + * Setup the IDT for hypervisor callback. >> > + */ >> > + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, >> hyperv_callback_vector); >> >> Isn't doing this unconditionally here as problematic as the call to >> clocksource_register_hz() turned out to be when Xen's Hyper-V >> shim reacts to the CPUID inquiry above? > > I was not sure what to make this conditional on at run-time. To the extent > that Xen emulation of Hyper-V is complete, this will be a problem. Does Xen > return all the "feature bits" of Hyper-V. I could discriminate on a feature > that > Xen does not plan to emulate. I've no idea what plans there might be, but that's the code there is currently: int cpuid_viridian_leaves(unsigned int leaf, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { struct domain *d = current->domain; if ( !is_viridian_domain(d) ) return 0; leaf -= 0x4000; if ( leaf > 6 ) return 0; *eax = *ebx = *ecx = *edx = 0; switch ( leaf ) { case 0: *eax = 0x4006; /* Maximum leaf */ *ebx = 0x7263694d; /* Magic numbers */ *ecx = 0x666F736F; *edx = 0x76482074; break; case 1: *eax = 0x31237648; /* Version number */ break; case 2: /* Hypervisor information, but only if the guest has set its own version number. */ if ( d->arch.hvm_domain.viridian.guest_os_id.raw == 0 ) break; *eax = 1; /* Build number */ *ebx = (xen_major_version() << 16) | xen_minor_version(); *ecx = 0; /* SP */ *edx = 0; /* Service branch and number */ break; case 3: /* Which hypervisor MSRs are available to the guest */ *eax = (CPUID3A_MSR_APIC_ACCESS | CPUID3A_MSR_HYPERCALL | CPUID3A_MSR_VP_INDEX); break; case 4: /* Recommended hypercall usage. */ if ( (d->arch.hvm_domain.viridian.guest_os_id.raw == 0) || (d->arch.hvm_domain.viridian.guest_os_id.fields.os < 4) ) break; *eax = (CPUID4A_MSR_BASED_APIC | CPUID4A_RELAX_TIMER_INT); *ebx = 2047; /* long spin count */ break; } return 1; } Question is - considering you stated that this is supported starting in Win8, doesn't Hyper-V itself announce that capability in some explicit way? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 24.01.13 at 20:04, "H. Peter Anvin" wrote: > Can't you discover Xen emulation by looking for Xen first? Or does Xen go > "stealth"? That's already being done - the ordering currently is Xen, VMWare, Hyper-V, KVM. I.e. there's only a problem when !CONFIG_XEN_PVHVM. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 24.01.13 at 20:10, "H. Peter Anvin" wrote: > I also have to point out the obvious problem with Xen masquerading as HyperV > without implementing the same feature set... I can't resist to state that this is what feature bits have been introduced for in CPUID leaves. As long as Xen's shim doesn't advertise a feature bit it doesn't implement, all should be fine. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 24.01.13 at 20:03, KY Srinivasan wrote: > I was under the impression that only Dom0 Xen would emulate Hyper-V. How that? It's the hypervisor that does the emulation, and obviously not for Dom0, but for (HVM) DomU-s. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RESEND 1/1] X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
>>> On 24.01.13 at 19:59, Olaf Hering wrote: > On Thu, Jan 24, KY Srinivasan wrote: > > >> > Question is - considering you stated that this is supported >> > starting in Win8, doesn't Hyper-V itself announce that >> > capability in some explicit way? >> >> Thanks Jan. Unfortunately I don't think tis interrupt delivery model >> is specified via a "feature" bit (I will check with the Windows guys). >> Perhaps, we can have a Hyper-V specific compilation switch to address >> the Xen emulation issue. > > Would that really help if both pvops XEN_PVHVM and HYPERV are enabled in > the config? I assume at this point the access to the DMI data is not yet > possible. As just said in another reply - with XEN_PVHVM, there's no problem, since Xen gets checked for first. There's only a problem when !XEN_PVHVM, because in that case Hyper-V will be probed for. And the problem can only get bigger when on top of that the out-of-tree PV drivers are intended to be used. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [tip:x86/debug] x86/EFI: Properly init-annotate BGRT code
>>> On 24.01.13 at 23:28, Josh Triplett wrote: > On Thu, Jan 24, 2013 at 12:34:21PM -0800, tip-bot for Jan Beulich wrote: >> Commit-ID: 13f0e4d2b9e2209f13d5a4122478eb79e6136870 >> Gitweb: > http://git.kernel.org/tip/13f0e4d2b9e2209f13d5a4122478eb79e6136870 >> Author: Jan Beulich >> AuthorDate: Fri, 23 Nov 2012 16:30:07 + >> Committer: Ingo Molnar >> CommitDate: Thu, 24 Jan 2013 17:12:18 +0100 >> >> x86/EFI: Properly init-annotate BGRT code >> >> These items are only ever referenced from initialization code. > > Not true, and this patch will break the BGRT code. bgrt_init, which > does indeed have an __init annotation, stores bgrt_image and > bgrt_image_size into the .private and .size fields of a sysfs > bin_attribute, which does *not* have an __initdata annotation, and which > will get read whenever the user reads the corresponding sysfs attribute. Copying init-only data into a sysfs structure is no problem at all - that structure obviously is non-__initdata and hence can be read at any time. It was a different thing if .private and/or .size stored _pointers_ to one of the two variables in question. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3, v2] x86: xor-block handling adjustments
>>> On 24.01.13 at 18:32, "H. Peter Anvin" wrote: > On 01/24/2013 11:15 AM, Ingo Molnar wrote: >> >> * Jan Beulich wrote: >> >>> v2 of this series is merely updated on top of the changes between >>> 3.6 and 3.7-rc (which includes the dropping of what previously was >>> the second patch in a four patch series). >>> >>> 1: unify SSE-base xor-block routines >>> 2: add alternative SSE implementation only prefetching once per 64-byte line >>> 3: make virtualization friendly >>> >>> Signed-off-by: Jan Beulich >> >> Looks useful. Wondering what the status is: hpa, was your >> concern resolved, do you think we can apply these? >> > > I don't see anything wrong, except that I can't *find* patch 3/3 either > in my inbox nor on LKML... I can certainly resend the whole set, but our email system even got a delivery confirmation from yours on patch 3 (other than e.g. from Ingo's, and all of this identical to 0...2). Or see https://patchwork.kernel.org/patch/1688991/, according to which you even responded to it (which is the concern Ingo was referring to). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [tip:x86/asm] x86/xor: Make virtualization friendly
>>> On 25.01.13 at 23:11, "H. Peter Anvin" wrote: > On 01/25/2013 02:43 AM, tip-bot for Jan Beulich wrote: >> Commit-ID: 05fbf4d6fc6a3c0c3e63b77979c9311596716d10 >> Gitweb: > http://git.kernel.org/tip/05fbf4d6fc6a3c0c3e63b77979c9311596716d10 >> Author: Jan Beulich >> AuthorDate: Fri, 2 Nov 2012 14:21:23 + >> Committer: Ingo Molnar >> CommitDate: Fri, 25 Jan 2013 09:23:51 +0100 >> >> x86/xor: Make virtualization friendly >> >> In virtualized environments, the CR0.TS management needed here >> can be a lot slower than anticipated by the original authors of >> this code, which particularly means that in such cases forcing >> the use of SSE- (or MMX-) based implementations is not desirable >> - actual measurements should always be done in that case. >> >> For consistency, pull into the shared (32- and 64-bit) header >> not only the inclusion of the generic code, but also that of the >> AVX variants. >> > > This patch is wrong and should be dropped. I verified it with the KVM > people that they do NOT want this change. It is a Xen-specific problem. I don't follow: The patch doesn't penalize anyone, it merely widens the set of methods tried on virtualized platforms. I.e. if other hypervisors have no problem here, then the best performing one should still turn out to be the SSE or AVX one. Or if it doesn't, it ought to be to their advantage (I would even question why this extra probing isn't done on native too, e.g. to cope with eventual bad vector implementations, say on low-power/low-cost CPUs). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH]smp: Fix send func call IPI to empty cpu mask
>>> On 27.01.13 at 16:50, Ingo Molnar wrote: > * Linus Torvalds wrote: > >> On Fri, Jan 25, 2013 at 11:53 PM, Wang YanQing wrote: >> > I get below warning every day with 3.7, >> > one or two times per day. >> > >> > [ 2235.186027] WARNING: at > /mnt/sda7/kernel/linux/arch/x86/kernel/apic/ipi.c:109 > default_send_IPI_mask_logical+0x2f/0xb8() >> > [ 2235.186030] Hardware name: Aspire 4741 >> > [ 2235.186032] empty IPI mask >> > [ 2235.186079] [] native_send_call_func_ipi+0x4f/0x57 >> > [ 2235.186087] [] smp_call_function_many+0x191/0x1a9 >> > [ 2235.186097] [] native_flush_tlb_others+0x21/0x24 >> > [ 2235.186101] [] flush_tlb_page+0x63/0x89 >> > [ 2235.186105] [] ptep_set_access_flags+0x20/0x26 >> > [ 2235.186111] [] do_wp_page+0x234/0x502 >> > [ 2235.186121] [] handle_pte_fault+0x50d/0x54c >> > [ 2235.186148] [] handle_mm_fault+0xd0/0xe2 >> > [ 2235.186153] [] __do_page_fault+0x411/0x42d >> > [ 2235.186166] [] do_page_fault+0x8/0xa >> > [ 2235.186170] [] error_code+0x5a/0x60 >> > >> > This patch fix it. >> > >> > This patch also fix some system hang problem: >> > If the data->cpumask been cleared after pass >> > >> > if (WARN_ONCE(!mask, "empty IPI mask")) >> > return; >> > then the problem 83d349f3 fix will happen again. >> >> Hmm. We have very consciously tried to avoid the extra copy, although >> I'm not entirely sure why (it might possibly hurt on the MAXSMP >> configuration). >> >> See for example commit 723aae25d5cd ("smp_call_function_many: handle >> concurrent clearing of mask") which fixed another version of this >> problem. >> >> But I do agree that it looks like the copy is required, simply because >> - as you say - once we've done the "list_add_rcu()" to add it to the >> queue, we can have (another) IPI to the target CPU that can now see it >> and clear the mask. >> >> So by the time we get to actually send the IPI, the mask might have >> been cleared by another IPI. So I do agree that your patch seems >> correct, but I really really want to run it by other people. >> >> Guys? Original patch on lkml. The other possible fix might be >> to take the &call_function.lock earlier in >> generic_smp_call_function_interrupt(), so that we can never >> clear the bit while somebody is adding entries to the list... >> But I think it very much tries to avoid that on purpose right >> now, with only the last CPU responding to that IPI taking the >> lock. >> >> So copying the IPI mask seems to be the reasonable approach. >> Comments? > > Agreed, looks correct to me as well - I've queued the fix up in > tip:x86/urgent. But the patch is obviously incomplete for the CPUMASK_OFFSTACK case, as the newly added cpumask_ipi member never gets its bit array allocated. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH 3/3] xen/blkback: Check for insane amounts of request on the ring.
>>> On 25.01.13 at 19:43, Konrad Rzeszutek Wilk wrote: > @@ -764,6 +768,9 @@ __do_block_io_op(struct xen_blkif *blkif) > rp = blk_rings->common.sring->req_prod; > rmb(); /* Ensure we see queued requests up to 'rp'. */ > > + if (RING_REQUEST_PROD_OVERFLOW(&blk_rings->common, rp, rc)) > + return -EACCES; Actually I wonder whether we need the new macro at all: It seems to me that using RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, rp) here would achieve the same effect. Jan > + > while (rc != rp) { > > if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, rc)) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH 1/3] xen/blkback: Don't trust the handle from the frontend.
>>> On 25.01.13 at 18:32, Konrad Rzeszutek Wilk wrote: > The 'handle' is the device that the request is from. For the life-time > of the ring we copy it from a request to a response so that the frontend > is not surprised by it. But we do not need it - when we start processing > I/Os we have our own 'struct phys_req' which has only most essential > information about the request. In fact the 'vbd_translate' ends up > over-writing the preq.dev with a value from the backend. > > This assignment of preq.dev with the 'handle' value is superfluous > so lets not do it. > > Cc: sta...@vger.kernel.org Is this really stable material? It fixes nothing but the (wrong) impression that the driver might be using frontend provided data without checking. Jan > Acked-by: Jan Beulich > Acked-by: Ian Campbell > Signed-off-by: Konrad Rzeszutek Wilk > --- > drivers/block/xen-blkback/blkback.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index 742871d..2de3da9 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -890,7 +890,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > goto fail_response; > } > > - preq.dev = req->u.rw.handle; > preq.sector_number = req->u.rw.sector_number; > preq.nr_sects = 0; > > -- > 1.8.0.2 > > > ___ > Xen-devel mailing list > xen-de...@lists.xen.org > http://lists.xen.org/xen-devel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH 3/3] xen/blkback: Check for insane amounts of request on the ring.
>>> On 28.01.13 at 16:42, Konrad Rzeszutek Wilk wrote: > On Mon, Jan 28, 2013 at 11:07:46AM +, Jan Beulich wrote: >> >>> On 25.01.13 at 19:43, Konrad Rzeszutek Wilk >> >>> wrote: >> > @@ -764,6 +768,9 @@ __do_block_io_op(struct xen_blkif *blkif) >> >rp = blk_rings->common.sring->req_prod; >> >rmb(); /* Ensure we see queued requests up to 'rp'. */ >> > >> > + if (RING_REQUEST_PROD_OVERFLOW(&blk_rings->common, rp, rc)) >> > + return -EACCES; >> >> Actually I wonder whether we need the new macro at all: It seems >> to me that using RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, rp) >> here would achieve the same effect. > > But it would not. The RING_REQUEST_CONS_OVERFLOW only check that the > non-shared ring entries (rsp_prod and rsp_prod_pvt) are less than > the size of the ring (32). In other words - they check whether we want > to process more requests as we still have a back-log of responses to > deal with. So did you not notice that here 'rp' (i.e. req_prod) is being passed, not 'rc'? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 27.02.13 at 05:52, Chen Gang wrote: > if call xen_vbd_translate failed, the preq.dev will be not initialized. > so use blkif->vbd.pdevice instead (still better to print relative info). > > Signed-off-by: Chen Gang Acked-by: Jan Beulich > --- > drivers/block/xen-blkback/blkback.c |3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index de1f319..6d1cc3d 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -904,7 +904,8 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > pr_debug(DRV_PFX "access denied: %s of [%llu,%llu] on > dev=%04x\n", >operation == READ ? "read" : "write", >preq.sector_number, > - preq.sector_number + preq.nr_sects, preq.dev); > + preq.sector_number + preq.nr_sects, > + blkif->vbd.pdevice); > goto fail_response; > } > > -- > 1.7.7.6 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 27.02.13 at 05:52, Chen Gang wrote: > if call xen_vbd_translate failed, the preq.dev will be not initialized. > so use blkif->vbd.pdevice instead (still better to print relative info). You also could have mentioned that even before commit 01c681d4c70d64cb72142a2823f27c4146a02e63 the value printed here was bogus, as it was the guest provided value from req->u.rw.handle rather than the actual device. Jan > Signed-off-by: Chen Gang > --- > drivers/block/xen-blkback/blkback.c |3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index de1f319..6d1cc3d 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -904,7 +904,8 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > pr_debug(DRV_PFX "access denied: %s of [%llu,%llu] on > dev=%04x\n", >operation == READ ? "read" : "write", >preq.sector_number, > - preq.sector_number + preq.nr_sects, preq.dev); > + preq.sector_number + preq.nr_sects, > + blkif->vbd.pdevice); > goto fail_response; > } > > -- > 1.7.7.6 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 27.02.13 at 10:50, Roger Pau Monné wrote: > On 27/02/13 05:52, Chen Gang wrote: >> >> if call xen_vbd_translate failed, the preq.dev will be not initialized. >> so use blkif->vbd.pdevice instead (still better to print relative info). > > preq.dev is initialized a a couple of lines prior to calling > xen_vbd_translate: > > preq.dev = req->u.rw.handle; Not anymore after 01c681d4c70d64cb72142a2823f27c4146a02e63 (in Konrad's tree). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] drivers/block/xen-blkback: preq.dev is used without initialized
>>> On 27.02.13 at 11:38, Chen Gang wrote: > 于 2013年02月27日 17:57, Jan Beulich 写道: >> You also could have mentioned that even before commit >> 01c681d4c70d64cb72142a2823f27c4146a02e63 the value printed >> here was bogus, as it was the guest provided value from >> req->u.rw.handle rather than the actual device. > > pardon ? > > I guess what you said is : > my patch seems ok, but the comments need improving. > need add "additional info" in comments: > "before commit 01c681d4c70d64cb72142a2823f27c4146a02e63 >the value printed here was bogus, as it was the guest >provided value from req->u.rw.handle rather than the >actual device". > > is it correct ? Yes (and you might have missed the ACK that I had already sent before this reply - that mail got bounced for your address). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 00/12] xen-block: indirect descriptors
>>> On 28.02.13 at 11:28, Roger Pau Monne wrote: > This series contains the initial implementation of indirect > descriptors for Linux blkback/blkfront. > > Patches 1, 2, 3, 4 and 5 are bug fixes and minor optimizations. > > Patch 6 contains a LRU implementation for blkback that will be needed > when using indirect descriptors (since we are no longer able to map > all possible grants blkfront might use). Considering this, ... > Patch 7 is an addition to the print stats function in blkback in order > to print information regarding persistent grant usage. > > Patches 8, 9, 10 and 11 are preparatory work for indirect descriptors > implementation, mainly make blkback use dynamic memory and remove the > shared blkbk structure, so each blkback instance has it's own list of > free requests, pages, handles and so on. > > Finally patch 12 contains the indirect descriptors implementation. > > I've also pushed this series to the following git repository: > > git://xenbits.xen.org/people/royger/linux.git xen-block-indirect > > Performance benefit of this series can be seen in the following graph: > > http://xenbits.xen.org/people/royger/plot_indirect.png ... would you happen to also have a comparison with using indirect descriptors but not persistent grants? IOW I'm wondering about the hit rate on the persistently mapped grants, especially when blkfront really saturates the added bandwidth. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 01/12] xen-blkback: don't store dev_bus_addr
>>> On 28.02.13 at 11:28, Roger Pau Monne wrote: > dev_bus_addr returned in the grant ref map operation is the mfn of the > passed page, there's no need to store it in the persistent grant > entry, since we can always get it provided that we have the page. Interesting that you come up with this, as I have a similar patch pending (not posted yet), aiming at reducing the stack usage in dispatch_rw_block_io(): seg[].buf is really unnecessary with the dev_bus_addr storing removed, as the only reader of that field can equally well use req->u.rw.seg[i].first_sect. And then the biolist[] array really can be folded into a union with the remaining seg[] one, as their usage scopes are easily separable. > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -621,9 +621,7 @@ static int xen_blkbk_map(struct blkif_request *req, >* If this is a new persistent grant >* save the handler >*/ > - persistent_gnts[i]->handle = map[j].handle; > - persistent_gnts[i]->dev_bus_addr = > - map[j++].dev_bus_addr; > + persistent_gnts[i]->handle = map[j++].handle; > } > pending_handle(pending_req, i) = > persistent_gnts[i]->handle; > @@ -631,7 +629,8 @@ static int xen_blkbk_map(struct blkif_request *req, > if (ret) > continue; > > - seg[i].buf = persistent_gnts[i]->dev_bus_addr | > + seg[i].buf = pfn_to_mfn(page_to_pfn( > + persistent_gnts[i]->page)) << PAGE_SHIFT | So why do you do this? The only reader masks the field with ~PAGE_MASK anyway. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 09/12] xen-blkback: move pending handles list from blkbk to pending_req
>>> On 28.02.13 at 11:28, Roger Pau Monne wrote: > Moving grant ref handles from blkbk to pending_req will allow us to > get rid of the shared blkbk structure. At the expense of (slightly?) higher memory requirements? > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -136,6 +136,7 @@ struct pending_req { > struct list_headfree_list; > struct persistent_gnt > *persistent_gnts[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > + grant_handle_t grant_handles[BLKIF_MAX_SEGMENTS_PER_REQUEST]; Adding yet another array here makes it even more desirable to switch from multiple arrays to a singly array of a structure, thus improving locality of the memory accesses involved in processing an individual segment. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 10/12] xen-blkback: make the queue of free requests per backend
>>> On 28.02.13 at 11:28, Roger Pau Monne wrote: > @@ -122,6 +125,19 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid) > spin_lock_init(&blkif->free_pages_lock); > INIT_LIST_HEAD(&blkif->free_pages); > blkif->free_pages_num = 0; > + blkif->pending_reqs = kzalloc(sizeof(blkif->pending_reqs[0]) * > + xen_blkif_reqs, GFP_KERNEL); kcalloc() is preferred in cases like this, I believe. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 12/12] xen-block: implement indirect descriptors
>>> On 28.02.13 at 11:28, Roger Pau Monne wrote: > @@ -109,6 +111,16 @@ typedef uint64_t blkif_sector_t; > */ > #define BLKIF_MAX_SEGMENTS_PER_REQUEST 11 > > +#define BLKIF_MAX_INDIRECT_GREFS_PER_REQUEST 8 > + > +struct blkif_request_segment_aligned { > + grant_ref_t gref;/* reference to I/O buffer frame*/ > + /* @first_sect: first sector in frame to transfer (inclusive). */ > + /* @last_sect: last sector in frame to transfer (inclusive). */ > + uint8_t first_sect, last_sect; > + uint16_t_pad; /* padding to make it 8 bytes, so it's cache-aligned > */ > +} __attribute__((__packed__)); What's the __packed__ for here? > + > struct blkif_request_rw { > uint8_tnr_segments; /* number of segments */ > blkif_vdev_t handle; /* only for read/write requests */ > @@ -138,11 +150,24 @@ struct blkif_request_discard { > uint8_t_pad3; > } __attribute__((__packed__)); > > +struct blkif_request_indirect { > + uint8_tindirect_op; > + uint16_t nr_segments; > +#ifdef CONFIG_X86_64 > + uint32_t _pad1;/* offsetof(blkif_...,u.indirect.id) == 8 > */ > +#endif Either you want the structure be packed tightly (and you don't care about misaligned fields), in which case you shouldn't need a padding field. That's even more so as there's no padding between indirect_op and nr_segments, so everything is misaligned anyway, and the comment above is wrong too (offsetof() really ought to yield 7 in that case). Or you want the structure fields aligned, in which case you again ought to drop the use of the __packed__ attribute and introduce _all_ necessary padding fields. > + uint64_t id; > + blkif_vdev_t handle; > + blkif_sector_t sector_number; > + grant_ref_tindirect_grefs[BLKIF_MAX_INDIRECT_GREFS_PER_REQUEST]; > +} __attribute__((__packed__)); And then it would be quite nice for new features to no longer require translation between a 32- and a 64-bit layout at all. Plus, rather than introducing uninitialized padding fields, I'd suggest using fields that are required to be zero initialized, to allow giving them a meaning later. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 00/12] xen-block: indirect descriptors
>>> On 28.02.13 at 12:25, Roger Pau Monné wrote: > This is the expanded graph that also contains indirect descriptors > without persistent grants: > > http://xenbits.xen.org/people/royger/plot_indirect_nopers.png Thanks. Interesting - this suggests an unexpectedly high hit rate. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC 12/12] xen-block: implement indirect descriptors
>>> On 28.02.13 at 13:00, Roger Pau Monné wrote: > On 28/02/13 12:19, Jan Beulich wrote: >>>>> On 28.02.13 at 11:28, Roger Pau Monne wrote: >>> @@ -109,6 +111,16 @@ typedef uint64_t blkif_sector_t; >>> */ >>> #define BLKIF_MAX_SEGMENTS_PER_REQUEST 11 >>> >>> +#define BLKIF_MAX_INDIRECT_GREFS_PER_REQUEST 8 >>> + >>> +struct blkif_request_segment_aligned { >>> + grant_ref_t gref;/* reference to I/O buffer frame*/ >>> + /* @first_sect: first sector in frame to transfer (inclusive). */ >>> + /* @last_sect: last sector in frame to transfer (inclusive). */ >>> + uint8_t first_sect, last_sect; >>> + uint16_t_pad; /* padding to make it 8 bytes, so it's cache-aligned >>> */ >>> +} __attribute__((__packed__)); >> >> What's the __packed__ for here? > > Yes, that's not needed. > >> >>> + >>> struct blkif_request_rw { >>> uint8_tnr_segments; /* number of segments */ >>> blkif_vdev_t handle; /* only for read/write requests */ >>> @@ -138,11 +150,24 @@ struct blkif_request_discard { >>> uint8_t_pad3; >>> } __attribute__((__packed__)); >>> >>> +struct blkif_request_indirect { >>> + uint8_tindirect_op; >>> + uint16_t nr_segments; >>> +#ifdef CONFIG_X86_64 >>> + uint32_t _pad1;/* offsetof(blkif_...,u.indirect.id) == 8 >>> */ >>> +#endif >> >> Either you want the structure be packed tightly (and you don't care >> about misaligned fields), in which case you shouldn't need a padding >> field. That's even more so as there's no padding between indirect_op >> and nr_segments, so everything is misaligned anyway, and the >> comment above is wrong too (offsetof() really ought to yield 7 in >> that case). > > This padding is because we want to have the "id" field at the same > position as blkif_request_rw, so we need to add the padding for it to > match 32 & 64 bit blkif_request_rw structures, this prevents adding some > "if (req.op == BLKIF_OP_INDIRECT)..." if we only need to get the id of > the request. Oh, right, that's desirable of course. > The comment is indeed wrong, I've copied it from blkif_request_discard > and forgot to change the offset But the offset stated there then is right after all - I forgot that there is a 1-byte field outside the union (the way this is being done in the upstream Linux header is really ugly imo, but I guess Jeremy and/or Konrad liked it that way). That's also why the packed attribute is needed here. But you will probably want to switch sector_number and handle, so that sector_number becomes aligned, and add another 16-bit padding field between handle and indirect_grefs[]. I also wonder whether "indirect_op" wouldn't better be named "actual_op" or just "op". Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH] xen: netback: remove redundant xenvif_put
>>> On 19.02.13 at 06:53, David Miller wrote: > From: Andrew Jones > Date: Mon, 18 Feb 2013 21:29:20 +0100 > >> netbk_fatal_tx_err() calls xenvif_carrier_off(), which does >> a xenvif_put(). As callers of netbk_fatal_tx_err should only >> have one reference to the vif at this time, then the xenvif_put >> in netbk_fatal_tx_err is one too many. >> >> Signed-off-by: Andrew Jones > > Applied. But this is wrong from all we can tell, we discussed this before (Wei pointed to the discussion in an earlier reply). The core of it is that the put here parallels the one in netbk_tx_err(), and the one in xenvif_carrier_off() matches the get from xenvif_connect() (which normally would be done on the path coming through xenvif_disconnect()). And anyway - shouldn't changes to netback require an ack from Ian? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH XEN] xen: event channel arrays are xen_ulong_t and not unsigned long
>>> On 21.02.13 at 18:16, Ian Campbell wrote: > On Tue, 2013-02-19 at 14:49 +, Ian Campbell wrote: >> On ARM we want these to be the same size on 32- and 64-bit. >> >> This is an ABI change on ARM. X86 does not change. >> >> Signed-off-by: Ian Campbell >> Cc: Jan Beulich >> Cc: Keir (Xen.org) > > Are you guys (un)happy with this change from the Xen & x86 side? I don't see any problem with it. Jan >> Cc: Tim Deegan >> Cc: Stefano Stabellini >> Cc: Konrad Rzeszutek Wilk >> Cc: linux-arm-ker...@lists.infradead.org >> Cc: xen-de...@lists.xen.org >> --- >> tools/include/xen-foreign/mkheader.py |6 ++ >> xen/include/public/xen.h |8 >> 2 files changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/tools/include/xen-foreign/mkheader.py > b/tools/include/xen-foreign/mkheader.py >> index d189b07..57681fa 100644 >> --- a/tools/include/xen-foreign/mkheader.py >> +++ b/tools/include/xen-foreign/mkheader.py >> @@ -21,13 +21,18 @@ inttypes["arm"] = { >> "unsigned long" : "uint32_t", >> "long" : "uint32_t", >> "xen_pfn_t" : "uint64_t", >> +"xen_ulong_t" : "uint64_t", >> }; >> +header["arm"] = """ >> +#define __arm___ARM32 1 >> +"""; >> >> # x86_32 >> inttypes["x86_32"] = { >> "unsigned long" : "uint32_t", >> "long" : "uint32_t", >> "xen_pfn_t" : "uint32_t", >> +"xen_ulong_t" : "uint32_t", >> }; >> header["x86_32"] = """ >> #define __i386___X86_32 1 >> @@ -42,6 +47,7 @@ inttypes["x86_64"] = { >> "unsigned long" : "__align8__ uint64_t", >> "long" : "__align8__ uint64_t", >> "xen_pfn_t" : "__align8__ uint64_t", >> +"xen_ulong_t" : "__align8__ uint64_t", >> }; >> header["x86_64"] = """ >> #if defined(__GNUC__) && !defined(__STRICT_ANSI__) >> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h >> index 5593066..99c8212 100644 >> --- a/xen/include/public/xen.h >> +++ b/xen/include/public/xen.h >> @@ -556,7 +556,7 @@ DEFINE_XEN_GUEST_HANDLE(multicall_entry_t); >> * Event channel endpoints per domain: >> * 1024 if a long is 32 bits; 4096 if a long is 64 bits. >> */ >> -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * sizeof(unsigned long) * > 64) >> +#define NR_EVENT_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64) >> >> struct vcpu_time_info { >> /* >> @@ -613,7 +613,7 @@ struct vcpu_info { >> */ >> uint8_t evtchn_upcall_pending; >> uint8_t evtchn_upcall_mask; >> -unsigned long evtchn_pending_sel; >> +xen_ulong_t evtchn_pending_sel; >> struct arch_vcpu_info arch; >> struct vcpu_time_info time; >> }; /* 64 bytes (x86) */ >> @@ -663,8 +663,8 @@ struct shared_info { >> * per-vcpu selector word to be set. Each bit in the selector covers a >> * 'C long' in the PENDING bitfield array. >> */ >> -unsigned long evtchn_pending[sizeof(unsigned long) * 8]; >> -unsigned long evtchn_mask[sizeof(unsigned long) * 8]; >> +xen_ulong_t evtchn_pending[sizeof(xen_ulong_t) * 8]; >> +xen_ulong_t evtchn_mask[sizeof(xen_ulong_t) * 8]; >> >> /* >> * Wallclock time: updated only by control software. Guests should base -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH XEN] xen: event channel arrays are xen_ulong_t and not unsigned long
>>> On 22.02.13 at 09:28, Ian Campbell wrote: > On Fri, 2013-02-22 at 08:12 +, Jan Beulich wrote: >> >>> On 21.02.13 at 18:16, Ian Campbell wrote: >> > On Tue, 2013-02-19 at 14:49 +, Ian Campbell wrote: >> >> On ARM we want these to be the same size on 32- and 64-bit. >> >> >> >> This is an ABI change on ARM. X86 does not change. >> >> >> >> Signed-off-by: Ian Campbell >> >> Cc: Jan Beulich >> >> Cc: Keir (Xen.org) >> > >> > Are you guys (un)happy with this change from the Xen & x86 side? >> >> I don't see any problem with it. > > I'll take this as an Acked-by if that's ok with you. Well, I specifically didn't say "ack": I don't really mind the change, but I'm also not eager see it go in. Not seeing a problem with it doesn't really mean there's none lurking - type changes in public interfaces are always an at least slightly risky business. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH XEN] xen: event channel arrays are xen_ulong_t and not unsigned long
>>> On 22.02.13 at 09:55, Ian Campbell wrote: > On Fri, 2013-02-22 at 08:48 +, Jan Beulich wrote: >> >>> On 22.02.13 at 09:28, Ian Campbell wrote: >> > On Fri, 2013-02-22 at 08:12 +, Jan Beulich wrote: >> >> >>> On 21.02.13 at 18:16, Ian Campbell wrote: >> >> > On Tue, 2013-02-19 at 14:49 +, Ian Campbell wrote: >> >> >> On ARM we want these to be the same size on 32- and 64-bit. >> >> >> >> >> >> This is an ABI change on ARM. X86 does not change. >> >> >> >> >> >> Signed-off-by: Ian Campbell >> >> >> Cc: Jan Beulich >> >> >> Cc: Keir (Xen.org) >> >> > >> >> > Are you guys (un)happy with this change from the Xen & x86 side? >> >> >> >> I don't see any problem with it. >> > >> > I'll take this as an Acked-by if that's ok with you. >> >> Well, I specifically didn't say "ack": I don't really mind the change, >> but I'm also not eager see it go in. Not seeing a problem with it >> doesn't really mean there's none lurking - type changes in public >> interfaces are always an at least slightly risky business. > > For x86 there is no change here since: > xen/include/public/arch-x86/xen.h:typedef unsigned long xen_ulong_t; > > Also the tools/include/xen-foreign checks show no change in the size of > things on x86. Which is why I said I see no problem with it. > On ARM the change is deliberate, if there is any fallout we can fix it. I understand that. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Put git commit 51ac8893a7a51b196501164e645583bf78138699 on the stable tree please
>>> On 22.02.13 at 17:19, Greg KH wrote: > On Fri, Feb 22, 2013 at 10:46:45AM -0500, Konrad Rzeszutek Wilk wrote: >> Hey Greg, >> >> Please put git commit 51ac8893a7a51b196501164e645583bf78138699 >> commit 51ac8893a7a51b196501164e645583bf78138699 >> Author: Jan Beulich >> Date: Wed Feb 6 10:30:38 2013 -0500 >> >> xen-pciback: rate limit error messages from xen_pcibk_enable_msi{,x}() >> >> on the stable tree please. It is an security errata fix and should >> have had the CC: sta...@vger.kernel.org but did not. > > Thanks, what stable kernel releases should it be queued up for? Anyone from 3.1 onwards that is still being maintained. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] xen-pciback: notify hypervisor about devices intended to be assigned to guests
For MSI-X capable devices the hypervisor wants to write protect the MSI-X table and PBA, yet it can't assume that resources have been assigned to their final values at device enumeration time. Thus have pciback do that notification, as having the device controlled by it is a prerequisite to assigning the device to guests anyway. This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI: add mechanism to fully protect MSI-X table from PV guest accesses") on the master branch of git://xenbits.xen.org/xen.git. Signed-off-by: Jan Beulich --- arch/x86/include/asm/xen/hypercall.h |4 +- drivers/xen/fallback.c |3 + drivers/xen/xen-pciback/pci_stub.c | 59 ++- include/xen/interface/physdev.h |6 +++ 4 files changed, 54 insertions(+), 18 deletions(-) --- 3.9-rc2/arch/x86/include/asm/xen/hypercall.h +++ 3.9-rc2-xen-pciback-MSI-X-prepare/arch/x86/include/asm/xen/hypercall.h @@ -382,14 +382,14 @@ HYPERVISOR_console_io(int cmd, int count return _hypercall3(int, console_io, cmd, count, str); } -extern int __must_check HYPERVISOR_physdev_op_compat(int, void *); +extern int __must_check xen_physdev_op_compat(int, void *); static inline int HYPERVISOR_physdev_op(int cmd, void *arg) { int rc = _hypercall2(int, physdev_op, cmd, arg); if (unlikely(rc == -ENOSYS)) - rc = HYPERVISOR_physdev_op_compat(cmd, arg); + rc = xen_physdev_op_compat(cmd, arg); return rc; } --- 3.9-rc2/drivers/xen/fallback.c +++ 3.9-rc2-xen-pciback-MSI-X-prepare/drivers/xen/fallback.c @@ -44,7 +44,7 @@ int xen_event_channel_op_compat(int cmd, } EXPORT_SYMBOL_GPL(xen_event_channel_op_compat); -int HYPERVISOR_physdev_op_compat(int cmd, void *arg) +int xen_physdev_op_compat(int cmd, void *arg) { struct physdev_op op; int rc; @@ -78,3 +78,4 @@ int HYPERVISOR_physdev_op_compat(int cmd return rc; } +EXPORT_SYMBOL_GPL(xen_physdev_op_compat); --- 3.9-rc2/drivers/xen/xen-pciback/pci_stub.c +++ 3.9-rc2-xen-pciback-MSI-X-prepare/drivers/xen/xen-pciback/pci_stub.c @@ -17,6 +17,7 @@ #include #include #include +#include #include "pciback.h" #include "conf_space.h" #include "conf_space_quirks.h" @@ -85,37 +86,52 @@ static struct pcistub_device *pcistub_de static void pcistub_device_release(struct kref *kref) { struct pcistub_device *psdev; + struct pci_dev *dev; struct xen_pcibk_dev_data *dev_data; psdev = container_of(kref, struct pcistub_device, kref); - dev_data = pci_get_drvdata(psdev->dev); + dev = psdev->dev; + dev_data = pci_get_drvdata(dev); - dev_dbg(&psdev->dev->dev, "pcistub_device_release\n"); + dev_dbg(&dev->dev, "pcistub_device_release\n"); - xen_unregister_device_domain_owner(psdev->dev); + xen_unregister_device_domain_owner(dev); /* Call the reset function which does not take lock as this * is called from "unbind" which takes a device_lock mutex. */ - __pci_reset_function_locked(psdev->dev); - if (pci_load_and_free_saved_state(psdev->dev, - &dev_data->pci_saved_state)) { - dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n"); - } else - pci_restore_state(psdev->dev); + __pci_reset_function_locked(dev); + if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state)) + dev_dbg(&dev->dev, "Could not reload PCI state\n"); + else + pci_restore_state(dev); + + if (pci_find_capability(dev, PCI_CAP_ID_MSIX)) { + struct physdev_pci_device ppdev = { + .seg = pci_domain_nr(dev->bus), + .bus = dev->bus->number, + .devfn = dev->devfn + }; + int err = HYPERVISOR_physdev_op(PHYSDEVOP_release_msix, + &ppdev); + + if (err) + dev_warn(&dev->dev, "MSI-X release failed (%d)\n", +err); + } /* Disable the device */ - xen_pcibk_reset_device(psdev->dev); + xen_pcibk_reset_device(dev); kfree(dev_data); - pci_set_drvdata(psdev->dev, NULL); + pci_set_drvdata(dev, NULL); /* Clean-up the device */ - xen_pcibk_config_free_dyn_fields(psdev->dev); - xen_pcibk_config_free_dev(psdev->dev); + xen_pcibk_config_free_dyn_fields(dev); + xen_pcibk_config_free_dev(dev); - psdev->dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED; - pci_dev_put(psdev->dev); + dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNE
Re: [Xen-devel] [PATCH] xen-pciback: notify hypervisor about devices intended to be assigned to guests
>>> On 12.03.13 at 16:06, "Jan Beulich" wrote: > For MSI-X capable devices the hypervisor wants to write protect the > MSI-X table and PBA, yet it can't assume that resources have been > assigned to their final values at device enumeration time. Thus have > pciback do that notification, as having the device controlled by it is > a prerequisite to assigning the device to guests anyway. > > This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI: > add mechanism to fully protect MSI-X table from PV guest accesses") on > the master branch of git://xenbits.xen.org/xen.git. > > Signed-off-by: Jan Beulich Just noticed that once again I forgot to Cc stable@ - could you please add this as you commit it to your tree? I'm sorry about this, Jan > --- > arch/x86/include/asm/xen/hypercall.h |4 +- > drivers/xen/fallback.c |3 + > drivers/xen/xen-pciback/pci_stub.c | 59 > ++- > include/xen/interface/physdev.h |6 +++ > 4 files changed, 54 insertions(+), 18 deletions(-) > > --- 3.9-rc2/arch/x86/include/asm/xen/hypercall.h > +++ 3.9-rc2-xen-pciback-MSI-X-prepare/arch/x86/include/asm/xen/hypercall.h > @@ -382,14 +382,14 @@ HYPERVISOR_console_io(int cmd, int count > return _hypercall3(int, console_io, cmd, count, str); > } > > -extern int __must_check HYPERVISOR_physdev_op_compat(int, void *); > +extern int __must_check xen_physdev_op_compat(int, void *); > > static inline int > HYPERVISOR_physdev_op(int cmd, void *arg) > { > int rc = _hypercall2(int, physdev_op, cmd, arg); > if (unlikely(rc == -ENOSYS)) > - rc = HYPERVISOR_physdev_op_compat(cmd, arg); > + rc = xen_physdev_op_compat(cmd, arg); > return rc; > } > > --- 3.9-rc2/drivers/xen/fallback.c > +++ 3.9-rc2-xen-pciback-MSI-X-prepare/drivers/xen/fallback.c > @@ -44,7 +44,7 @@ int xen_event_channel_op_compat(int cmd, > } > EXPORT_SYMBOL_GPL(xen_event_channel_op_compat); > > -int HYPERVISOR_physdev_op_compat(int cmd, void *arg) > +int xen_physdev_op_compat(int cmd, void *arg) > { > struct physdev_op op; > int rc; > @@ -78,3 +78,4 @@ int HYPERVISOR_physdev_op_compat(int cmd > > return rc; > } > +EXPORT_SYMBOL_GPL(xen_physdev_op_compat); > --- 3.9-rc2/drivers/xen/xen-pciback/pci_stub.c > +++ 3.9-rc2-xen-pciback-MSI-X-prepare/drivers/xen/xen-pciback/pci_stub.c > @@ -17,6 +17,7 @@ > #include > #include > #include > +#include > #include "pciback.h" > #include "conf_space.h" > #include "conf_space_quirks.h" > @@ -85,37 +86,52 @@ static struct pcistub_device *pcistub_de > static void pcistub_device_release(struct kref *kref) > { > struct pcistub_device *psdev; > + struct pci_dev *dev; > struct xen_pcibk_dev_data *dev_data; > > psdev = container_of(kref, struct pcistub_device, kref); > - dev_data = pci_get_drvdata(psdev->dev); > + dev = psdev->dev; > + dev_data = pci_get_drvdata(dev); > > - dev_dbg(&psdev->dev->dev, "pcistub_device_release\n"); > + dev_dbg(&dev->dev, "pcistub_device_release\n"); > > - xen_unregister_device_domain_owner(psdev->dev); > + xen_unregister_device_domain_owner(dev); > > /* Call the reset function which does not take lock as this >* is called from "unbind" which takes a device_lock mutex. >*/ > - __pci_reset_function_locked(psdev->dev); > - if (pci_load_and_free_saved_state(psdev->dev, > - &dev_data->pci_saved_state)) { > - dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n"); > - } else > - pci_restore_state(psdev->dev); > + __pci_reset_function_locked(dev); > + if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state)) > + dev_dbg(&dev->dev, "Could not reload PCI state\n"); > + else > + pci_restore_state(dev); > + > + if (pci_find_capability(dev, PCI_CAP_ID_MSIX)) { > + struct physdev_pci_device ppdev = { > + .seg = pci_domain_nr(dev->bus), > + .bus = dev->bus->number, > + .devfn = dev->devfn > + }; > + int err = HYPERVISOR_physdev_op(PHYSDEVOP_release_msix, > + &ppdev); > + > + if (err) > + dev_warn(&dev->dev, "MSI-X release failed (%d)\n", > +
Re: [PATCH] xen: fix logical error in tlb flushing
>>> On 05.09.12 at 07:34, Alex Shi wrote: > On 08/25/2012 03:45 AM, Jan Beulich wrote: >>>>> On 24.08.12 at 20:17, Konrad Rzeszutek Wilk >>>>> wrote: >>> How can I reproduce this >> >> I don't know, I spotted this while looking at the code. > > Again, since the old buggy code doesn't cause trouble in PV guest, guess > the hypercall for MMUEXT_INVLPG_MULTI was translated or treated as > MMUEXT_TLB_FLUSH_MULTI. If so, believe correct this will bring a big > performance benefit. It's not clear to me what was buggy with the code prior to your change. And no, there's no magic widening of the scope of these MMU operations - if you ask the hypervisor for a single page invalidation, that's what it's going to do. But of course, there are cases where extra (full) invalidations need to be done without a guest asking for them. But that's nothing a guest can validly make itself dependent upon. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] allow gpiolib to be a module
>>> On 11.09.12 at 08:17, Ryan Mallon wrote: > On 10/09/12 22:16, Jan Beulich wrote: >> +#ifdef MODULE >> +int __init gpiolib_init(void) > > Should be static. Oh, yes, of course. >> +{ >> +return gpiolib_sysfs_init() ?: gpiolib_debugfs_init(); > > I thought this was going to call gpiolib_sysfs_init() twice until I > looked at gcc's documentation. Maybe the less obtuse, and far more common: > > int err; > > err = gpiolib_sysfs_init(); > if (err) > return err; > > return gpiolib_debugfs_init(); That construct is being used in many other places throughout the kernel (including in gpiolib itself), so I don't see why it can't be used here - the more that it is precisely available to have a way to avoid the double evaluation. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] allow gpiolib to be a module
>>> On 11.09.12 at 19:38, Linus Walleij wrote: > On Mon, Sep 10, 2012 at 2:16 PM, Jan Beulich wrote: >> +#ifndef MODULE >> subsys_initcall(gpiolib_debugfs_init); >> +#endif >> >> #endif /* DEBUG_FS */ >> + >> +#ifdef MODULE >> +int __init gpiolib_init(void) >> +{ >> + return gpiolib_sysfs_init() ?: gpiolib_debugfs_init(); > > I can't parse this, sorry the gpio subsystem maintainer is too bad coder. > What about something more readable? > > int ret; > > ret = gpiolib_sysfs_init(); > if (ret) >return ret; > return gpiolib_debugfs_init(); > > I know it doesn't look as cool but it's easier for me to understand. Okay, since you're the second one to complain despite there being other uses of the construct in the same source file, I'll replace it, ... > There is a bug too: I don't think this compiles if you compile as > a module but disable debugfs. Try it out. ... the more to get this one fixed. Will re-submit. >> +} >> +module_init(gpiolib_init); >> + >> +MODULE_DESCRIPTION("GPIO library"); >> +MODULE_LICENSE("GPL"); > > These two things don't really need to be inside an #ifdef > right? They don't need to be, but since we need a conditional now anyway, it seemed reasonable to include these in there too. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] allow gpiolib to be a module
>>> On 12.09.12 at 19:41, Randy Dunlap wrote: > On 09/12/2012 12:43 AM, Jan Beulich wrote: >>>>> On 11.09.12 at 19:38, Linus Walleij wrote: >>> On Mon, Sep 10, 2012 at 2:16 PM, Jan Beulich wrote: >>>> + return gpiolib_sysfs_init() ?: gpiolib_debugfs_init(); >>> >>> I can't parse this, sorry the gpio subsystem maintainer is too bad coder. >>> What about something more readable? >>> >>> int ret; >>> >>> ret = gpiolib_sysfs_init(); >>> if (ret) >>>return ret; >>> return gpiolib_debugfs_init(); >>> >>> I know it doesn't look as cool but it's easier for me to understand. >> >> Okay, since you're the second one to complain despite there >> being other uses of the construct in the same source file, I'll >> replace it, ... > > > Does C just use the value generated from the left side of a ?: > expression for the middle (empty) expression or does it call the > function again (which would usually be bad)? It's the purpose of omitting the middle expression to avoid the second evaluation of that expression (and obviously the compiler can't generally do this on its own when seeing the same expression used twice). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: manual merge of the xen-two tree with Linus' tree
>>> On 02.10.12 at 12:44, Stefano Stabellini >>> wrote: >> --- a/drivers/xen/Makefile >> +++ b/drivers/xen/Makefile >> @@@ -4,8 -8,11 +8,12 @@@ obj-y += xenbus >> nostackp := $(call cc-option, -fno-stack-protector) >> CFLAGS_features.o := $(nostackp) >> >> + obj-$(CONFIG_XEN_DOM0) += $(dom0-y) >> + dom0-$(CONFIG_PCI) += pci.o >> ++dom0-$(CONFIG_X86) += dbgp.o >> + dom0-$(CONFIG_ACPI) += acpi.o >> + dom0-$(CONFIG_X86) += pcpu.o >> obj-$(CONFIG_BLOCK)+= biomerge.o >> - obj-$(CONFIG_HOTPLUG_CPU) += cpu_hotplug.o >> obj-$(CONFIG_XEN_XENCOMM) += xencomm.o >> obj-$(CONFIG_XEN_BALLOON) += xen-balloon.o >> obj-$(CONFIG_XEN_SELFBALLOONING) += xen-selfballoon.o > > > Considering that dbgp doesn't seem to be very useful without PCI at the > moment, could we just turn it into: > > dom0-$(CONFIG_PCI) += dbgp.o > > ? Better not - the code is specifically not PCI-only. And I can't see how it would be harmful to be compiled on e.g. ARM (so the merge perhaps really should use XEN_DOM0 alone, without X86. If anything (and that may indeed be a minor oversight of the original patch) one might want it to depend on USB_SUPPORT, as without that no in-tree debug port capable driver would be able to load (and hence interfere with Xen's use of the debug port). However, as long as it builds fine with USB_SUPPORT undefined (which I believe it does), having it in the shape it is allows for out-of-tree drivers as well (as long as they make use of the designated interface). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: manual merge of the xen-two tree with Linus' tree
>>> On 02.10.12 at 13:45, Stefano Stabellini >>> wrote: >> > Considering that dbgp doesn't seem to be very useful without PCI at the >> > moment, could we just turn it into: >> > >> > dom0-$(CONFIG_PCI) += dbgp.o >> > >> > ? >> >> Better not - the code is specifically not PCI-only. And I can't see >> how it would be harmful to be compiled on e.g. ARM (so the >> merge perhaps really should use XEN_DOM0 alone, without >> X86. If anything (and that may indeed be a minor oversight of >> the original patch) one might want it to depend on USB_SUPPORT, >> as without that no in-tree debug port capable driver would be >> able to load (and hence interfere with Xen's use of the debug >> port). However, as long as it builds fine with USB_SUPPORT >> undefined (which I believe it does), having it in the shape it >> is allows for out-of-tree drivers as well (as long as they make >> use of the designated interface). > > OK for PCI. > Regarding USB_SUPPORT, considering that gbgp.c calls hcd_to_bus, I think > it would make sense to make it depend on it. As said - I'd prefer to do that only if indeed needed to get things to build without that option. Since include/usb/* doesn't reference CONFIG_USB_SUPPORT, I'm in favor of supporting at least those eventual out-of-tree drivers that do use the pre-existing data structures (and others shouldn't be calling pre-existing APIs anyway). But I wouldn't NAK a patch doing what you suggest either. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: manual merge of the xen-two tree with Linus' tree
>>> On 02.10.12 at 14:55, Konrad Rzeszutek Wilk wrote: > On Tue, Oct 02, 2012 at 12:54:20PM +0100, Jan Beulich wrote: >> >>> On 02.10.12 at 13:45, Stefano Stabellini >> >>> > wrote: >> >> > Considering that dbgp doesn't seem to be very useful without PCI at the >> >> > moment, could we just turn it into: >> >> > >> >> > dom0-$(CONFIG_PCI) += dbgp.o >> >> > >> >> > ? >> >> >> >> Better not - the code is specifically not PCI-only. And I can't see >> >> how it would be harmful to be compiled on e.g. ARM (so the >> >> merge perhaps really should use XEN_DOM0 alone, without >> >> X86. If anything (and that may indeed be a minor oversight of >> >> the original patch) one might want it to depend on USB_SUPPORT, >> >> as without that no in-tree debug port capable driver would be >> >> able to load (and hence interfere with Xen's use of the debug >> >> port). However, as long as it builds fine with USB_SUPPORT >> >> undefined (which I believe it does), having it in the shape it >> >> is allows for out-of-tree drivers as well (as long as they make >> >> use of the designated interface). >> > >> > OK for PCI. >> > Regarding USB_SUPPORT, considering that gbgp.c calls hcd_to_bus, I think >> > it would make sense to make it depend on it. >> >> As said - I'd prefer to do that only if indeed needed to get things >> to build without that option. Since include/usb/* doesn't reference >> CONFIG_USB_SUPPORT, I'm in favor of supporting at least those >> eventual out-of-tree drivers that do use the pre-existing data >> structures (and others shouldn't be calling pre-existing APIs >> anyway). But I wouldn't NAK a patch doing what you suggest >> either. > > Could it depend on EARLY_PRINTK_DBGP ? No, that one would definitely be wrong (because we need the EHCI driver to propagate reset information no matter whether the kernel uses an early console). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: x86_64: wrong DirectMap kB
>>> On 01.10.12 at 10:37, Hugh Dickins wrote: > I noticed yesterday that the DirectMap counts at the bottom of x86_64's > /proc/meminfo are wrong on v3.5 and v3.6. For example, I happen to have > booted this laptop with mem=700M to run a test, but /proc/meminfo shows > > DirectMap4k:4096 kB > DirectMap2M:18446744073709547520 kB I cannot see such odd effect with "mem="; with I can (for any value up to around 1G). > Or if I boot with the full amount of physical memory, the DirectMap > numbers do not add up to the full amount of physical memory, as they > used to do on v3.4 and before. That one I can see how could have happened, in that said patch went a little too far: Dropping the "pages" increments from phys_p[um]d_init() is necessary only for the hotplug case (as otherwise duplicating accounting already done earlier), while at boot time we would want to do the accounting. > Whilst I've not yet tried reverting it, I strongly suspect your > 20167d3421a0 "x86-64: Fix accounting in kernel_physical_mapping_init()". > > Either it was a complete misunderstanding, totally bogus, and should > simply be reverted; or perhaps you really noticed something wrong in > your code inspection, but didn't get the fix quite right? The latter, apparently. The patch below should fix both aspects. Jan --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -386,7 +386,8 @@ phys_pte_init(pte_t *pte_page, unsigned * these mappings are more intelligent. */ if (pte_val(*pte)) { - pages++; + if (!after_bootmem) + pages++; continue; } @@ -451,6 +452,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned * attributes. */ if (page_size_mask & (1 << PG_LEVEL_2M)) { + if (!after_bootmem) + pages++; last_map_addr = next; continue; } @@ -526,6 +529,8 @@ phys_pud_init(pud_t *pud_page, unsigned * attributes. */ if (page_size_mask & (1 << PG_LEVEL_1G)) { + if (!after_bootmem) + pages++; last_map_addr = next; continue; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] xen/Makefile: resolve merge conflict with 9fa5780beea1
>>> On 02.10.12 at 17:13, Stefano Stabellini >>> wrote: > This patch is actually a merge conflict resolution between Konrad's Xen > tree and the following commit: > > commit 9fa5780beea1274d498a224822397100022da7d4 > Author: Jan Beulich > Date: Tue Sep 18 12:23:02 2012 +0100 > > USB EHCI/Xen: propagate controller reset information to hypervisor > > > Compile dbgp.o only if CONFIG_USB is defined. Now this is wrong: USB is a tristate config option, yet dbgp.o must be built in. This is why I suggested to use USB_SUPPORT (if anything at all). Jan > After all xen_dbgp_op relies on hcd_to_bus. > > This patch should be applied on top of > > http://marc.info/?l=linux-kernel&m=134919011231980&w=2 > > > Reported-by: Stephen Rothwell > Signed-off-by: Stefano Stabellini > > diff --cc drivers/xen/Makefile > index a4a3cab,275abfc..33711ad > --- a/drivers/xen/Makefile > +++ b/drivers/xen/Makefile > @@@ -4,8 -8,11 +8,12 @@@ obj-y+= xenbus > nostackp := $(call cc-option, -fno-stack-protector) > CFLAGS_features.o := $(nostackp) > > + dom0-$(CONFIG_PCI) += pci.o > + dom0-$(CONFIG_ACPI) += acpi.o > + dom0-$(CONFIG_X86) += pcpu.o > ++dom0-$(CONFIG_USB) += dbgp.o > + obj-$(CONFIG_XEN_DOM0) += $(dom0-y) > obj-$(CONFIG_BLOCK) += biomerge.o > - obj-$(CONFIG_HOTPLUG_CPU) += cpu_hotplug.o > obj-$(CONFIG_XEN_XENCOMM) += xencomm.o > obj-$(CONFIG_XEN_BALLOON) += xen-balloon.o > obj-$(CONFIG_XEN_SELFBALLOONING)+= xen-selfballoon.o -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] x86, mm: Include the entire kernel memory map in trampoline_pgd
>>> Matt Fleming 10/03/12 2:59 PM >>> >+static int insert_identity_mapping(resource_size_t paddr, unsigned long vaddr, >+unsigned long size) >+{ >+unsigned long end = vaddr + size; >+unsigned long next; >+pgd_t *vpgd, *ppgd; >+ >+#ifdef CONFIG_X86_32 >+ppgd = initial_page_table + pgd_index(paddr); >+ >+if (paddr >= PAGE_OFFSET || paddr + size > PAGE_OFFSET) >+return 1; >+#else >+ppgd = __va(real_mode_header->trampoline_pgd) + pgd_index(paddr); Missing equivalent code (to the 32-bit one above) here - after all, you're trying to potentially insert a 52-bit physical address into 48-bit virtual space. >+#endif >+ >+vpgd = pgd_offset_k(vaddr); >+do { >+next = pgd_addr_end(vaddr, end); >+ >+if (!pgd_present(*ppgd)) { >+pud_t *ppud = (pud_t *)get_zeroed_page(GFP_KERNEL); >+if (!ppud) >+return 1; >+ >+set_pgd(ppgd, __pgd(_KERNPG_TABLE | __pa(ppud))); >+} >+ >+if (ident_pud_range(paddr, vaddr, ppgd, vpgd, next)) >+return 1; >+} while (ppgd++, vpgd++, vaddr = next, vaddr != end); >+ >+return 0; >+} >+ >/* >* Remap an arbitrary physical address space into the kernel virtual >* address space. Needed when the kernel wants to access high addresses >@@ -163,6 +258,10 @@ static void __iomem *__ioremap_caller(resource_size_t >phys_addr, >ret_addr = (void __iomem *) (vaddr + offset); >mmiotrace_ioremap(unaligned_phys_addr, unaligned_size, ret_addr); > >+if (insert_identity_mapping(phys_addr, vaddr, size)) >+printk(KERN_WARNING "ioremap: unable to map 0x%llx in identity >pagetable\n", >+(unsigned long long)phys_addr); Isn't that going to trigger quite frequently on 32-bit kernels? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] x86, mm: Include the entire kernel memory map in trampoline_pgd
>>> On 03.10.12 at 16:03, Matt Fleming wrote: > On Wed, 2012-10-03 at 14:31 +0100, Jan Beulich wrote: >> >>> Matt Fleming 10/03/12 2:59 PM >>> >> >@@ -163,6 +258,10 @@ static void __iomem *__ioremap_caller(resource_size_t >> >phys_addr, >> >ret_addr = (void __iomem *) (vaddr + offset); >> >mmiotrace_ioremap(unaligned_phys_addr, unaligned_size, ret_addr); >> > >> >+if (insert_identity_mapping(phys_addr, vaddr, size)) >> >+printk(KERN_WARNING "ioremap: unable to map 0x%llx in identity >> >pagetable\n", >> >+(unsigned long long)phys_addr); >> >> Isn't that going to trigger quite frequently on 32-bit kernels? > > Hmmm... yeah, probably, though it didn't during my testing. If it is That's suspicious, isn't it? In general, on any machine with more than 3Gb of memory below the 4Gb boundary this ought to trigger for _all_ mappings of MMIO space, and that's already only considering the default of VMSPLIT_3G. > likely to trigger a lot then we might be best only inserting the > identity mmio mapping for 64-bit, and addressing the 32-bit case if we > ever actually need the identity pagetable. I think that would be the best choice for the moment. Btw., once this set of yours is in - will I need to resubmit the time handling patch that actually triggered this work, or will you just reinstate it without further action on my part? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH] xen/xen_initial_domain: check that xen_start_info is initialized
>>> On 03.10.12 at 19:08, Stefano Stabellini wrote: > Since commit commit 4c071ee5268f7234c3d084b6093bebccc28cdcba ("arm: > initial Xen support") PV on HVM guests can be xen_initial_domain. > However PV on HVM guests might have an unitialized xen_start_info, so > check before accessing its fields. > > Signed-off-by: Stefano Stabellini > Acked-by: Ian Campbell > Reported-by: Konrad Rzeszutek Wilk > > diff --git a/include/xen/xen.h b/include/xen/xen.h > index 9a39ca5..e7101bb 100644 > --- a/include/xen/xen.h > +++ b/include/xen/xen.h > @@ -28,7 +28,7 @@ extern enum xen_domain_type xen_domain_type; > #include > > #define xen_initial_domain() (xen_domain() && \ > - xen_start_info->flags & SIF_INITDOMAIN) > + xen_start_info && xen_start_info->flags & > SIF_INITDOMAIN) > #else /* !CONFIG_XEN_DOM0 */ > #define xen_initial_domain() (0) > #endif /* CONFIG_XEN_DOM0 */ Didn't your other patch statically initialize it? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] x86, mm: Include the entire kernel memory map in trampoline_pgd
>>> On 04.10.12 at 11:18, Matt Fleming wrote: > On Thu, 2012-10-04 at 07:32 +0100, Jan Beulich wrote: >> Btw., once this set of yours is in - will I need to resubmit the >> time handling patch that actually triggered this work, or will >> you just reinstate it without further action on my part? > > It's up to you. If you don't want to make any changes to your original > patch then I'll just re-apply it on top of this series, updating the > commit log to note why it got reverted and why it's now OK to re-apply. Afaict no changes should be necessary (so long as the patch still applies). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH 2/6] xen-blkfront: handle backend CLOSED without CLOSING
>>> On 25.09.12 at 19:53, David Vrabel wrote: > @@ -1167,7 +1168,8 @@ blkfront_closing(struct blkfront_info *info) > > mutex_lock(&bdev->bd_mutex); > > - if (bdev->bd_openers) { > + /* If the backend is already CLOSED, close now. */ > + if (bdev->bd_openers && backend_state != XenbusStateClosed) { > xenbus_dev_error(xbdev, -EBUSY, >"Device in use; refusing to close"); > xenbus_switch_state(xbdev, XenbusStateClosing); This looks wrong to me on a second glance: As long as there are users of the device, I don't think we want to go into Closed ourselves, irrespective of the backend state. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86-64: fix page table accounting
Commit 20167d3421a089a1bf1bd680b150dc69c9506810 ("x86-64: Fix accounting in kernel_physical_mapping_init()") went a little too far by entirely removing the counting of pre-populated page tables: This should be done at boot time (to cover the page tables set up in early boot code), but shouldn't be done during memory hot add. Hence, re-add the removed increments of "pages", but make them and the one in phys_pte_init() conditional upon !after_bootmem. Reported-by: Hugh Dickins Signed-off-by: Jan Beulich --- Not sure if this ought to be copied to stable@. --- arch/x86/mm/init_64.c |7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) --- 3.6/arch/x86/mm/init_64.c +++ 3.6-x86_64-page-table-count/arch/x86/mm/init_64.c @@ -386,7 +386,8 @@ phys_pte_init(pte_t *pte_page, unsigned * these mappings are more intelligent. */ if (pte_val(*pte)) { - pages++; + if (!after_bootmem) + pages++; continue; } @@ -451,6 +452,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned * attributes. */ if (page_size_mask & (1 << PG_LEVEL_2M)) { + if (!after_bootmem) + pages++; last_map_addr = next; continue; } @@ -526,6 +529,8 @@ phys_pud_init(pud_t *pud_page, unsigned * attributes. */ if (page_size_mask & (1 << PG_LEVEL_1G)) { + if (!after_bootmem) + pages++; last_map_addr = next; continue; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] x86, mm: Include the entire kernel memory map in trampoline_pgd
>>> On 04.10.12 at 23:08, "H. Peter Anvin" wrote: > On 10/03/2012 06:31 AM, Jan Beulich wrote: >>>>> Matt Fleming 10/03/12 2:59 PM >>> >>> +static int insert_identity_mapping(resource_size_t paddr, unsigned long > vaddr, >>> +unsigned long size) >>> +{ >>> +unsigned long end = vaddr + size; >>> +unsigned long next; >>> +pgd_t *vpgd, *ppgd; >>> + >>> +#ifdef CONFIG_X86_32 >>> +ppgd = initial_page_table + pgd_index(paddr); >>> + >>> +if (paddr >= PAGE_OFFSET || paddr + size > PAGE_OFFSET) >>> +return 1; >>> +#else >>> +ppgd = __va(real_mode_header->trampoline_pgd) + pgd_index(paddr); >> >> Missing equivalent code (to the 32-bit one above) here - after all, you're > trying >> to potentially insert a 52-bit physical address into 48-bit virtual space. >> > > We should have the check, but at least for Linux support we require > P <= V-2. Not really imo - P <= V - 1 should be sufficient here, as all that is necessary is that the result represents a 1:1 mapping. Specifically, there's no constraint to the virtual space limitation of the direct mapping of RAM. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] x86, mm: Include the entire kernel memory map in trampoline_pgd
>>> On 05.10.12 at 08:48, Matt Fleming wrote: > On Fri, 2012-10-05 at 07:39 +0100, Jan Beulich wrote: >> >>> On 04.10.12 at 23:08, "H. Peter Anvin" wrote: >> > On 10/03/2012 06:31 AM, Jan Beulich wrote: >> >>>>> Matt Fleming 10/03/12 2:59 PM >>> >> >>> +static int insert_identity_mapping(resource_size_t paddr, unsigned long >> > vaddr, >> >>> +unsigned long size) >> >>> +{ >> >>> +unsigned long end = vaddr + size; >> >>> +unsigned long next; >> >>> +pgd_t *vpgd, *ppgd; >> >>> + >> >>> +#ifdef CONFIG_X86_32 >> >>> +ppgd = initial_page_table + pgd_index(paddr); >> >>> + >> >>> +if (paddr >= PAGE_OFFSET || paddr + size > PAGE_OFFSET) >> >>> +return 1; >> >>> +#else >> >>> +ppgd = __va(real_mode_header->trampoline_pgd) + pgd_index(paddr); >> >> >> >> Missing equivalent code (to the 32-bit one above) here - after all, >> >> you're >> > trying >> >> to potentially insert a 52-bit physical address into 48-bit virtual space. >> >> >> > >> > We should have the check, but at least for Linux support we require >> > P <= V-2. >> >> Not really imo - P <= V - 1 should be sufficient here, as all that is >> necessary is that the result represents a 1:1 mapping. Specifically, >> there's no constraint to the virtual space limitation of the direct >> mapping of RAM. > > Just to be clear, I was going to add this check, > > /* Don't map over the guard hole. */ > if (paddr >= 0x7fff || paddr + size > 0x7fff) > return 1; 0x8000 would be the right numbers in both cases. > Since I'm guessing mapping over the guard hole would be bad. Really, you just can't map anything there (you'd most likely end up mapping something at 0x8000, and that would indeed be bad). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/6] xen-blkfront: handle backend CLOSED without CLOSING
>>> On 05.10.12 at 13:42, David Vrabel wrote: > On 02/10/12 21:02, Konrad Rzeszutek Wilk wrote: >> On Mon, Oct 01, 2012 at 06:19:19PM +0100, David Vrabel wrote: >>> On 25/09/12 18:53, David Vrabel wrote: On 21/09/12 17:04, David Vrabel wrote: > From: David Vrabel > > Backend drivers shouldn't transistion to CLOSED unless the frontend is > CLOSED. If a backend does transition to CLOSED too soon then the > frontend may not see the CLOSING state and will not properly shutdown. > > So, treat an unexpected backend CLOSED state the same as CLOSING. Didn't handle the frontend block device being mounted. Updated patch here. >>> >>> Konrad, can you ack this updated patch if you're happy with it. >> >> Acked-by: Konrad Rzeszutek Wilk >> >> Or should I just carry it in my for-jens-3.7 bug-fixes queue and ask >> Jen to pull it once rc0 is out? > > This seems easiest, if Jan is happy with the patch. I see the point of your explanation yesterday, but am still not convinced that the early cleanup in spite of active users of the disk can't cause any problems (like dangling pointers or NULL dereferences). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 0/3] x86/efi: Identity mapping pagetable
>>> On 05.10.12 at 15:14, Matt Fleming wrote: > From: Matt Fleming > > This series upgrades real_mode_header->trampoline_pgd to a proper > kernel pagetable instead of just mapping the kernel text and module > space. It also inserts the physical mappings for anything we > ioremap(), so I/O regions are always accessible via their physical > addresses whenever this pagetable is loaded, making it a true identity > mapping. > > These changes make it suitable for loading when calling virtual EFI > runtime functions on x86-64. The main benefit of this change is to fix > the ASUS firmware bug documented here, > >https://lkml.org/lkml/2012/8/7/108 > > but having our own EFI pagetable makes sense anyway. The memory map > code is easily the most complicatd part of the EFI infrastructure, and > it's likely that there will be other funky stuff we have to do with > our memory map as more and more machines ship with various > implementations of EFI firmware. > > Note that we only switch to the identity pagetable for x86-64. I'm > unaware of any bugs like the above on 32-bit EFI platforms, but we can > easily adopt the 64-bit scheme if one is discovered. > > v2: Based on feedback from Jan Beulich delete the 32-bit changes in > [PATCH 1/3], they were likely to create more problems than they would > solve and the identity I/O mappings are unused on 32-bit anyway. > > Matt Fleming (2): > x86, mm: Include the entire kernel memory map in trampoline_pgd > x86, efi: 1:1 pagetable mapping for virtual EFI calls Acked-by: Jan Beulich (I know too little about the tboot code to meaningfully ack that one too.) > Xiaoyan Zhang (1): > x86/kernel: remove tboot 1:1 page table creation code > > arch/x86/include/asm/efi.h | 28 --- > arch/x86/kernel/tboot.c| 78 ++ > arch/x86/mm/init_64.c | 9 +++- > arch/x86/mm/ioremap.c | 105 > + > arch/x86/platform/efi/efi_64.c | 15 ++ > arch/x86/realmode/init.c | 17 ++- > 6 files changed, 169 insertions(+), 83 deletions(-) > > -- > 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH V5] PVH patches for v3.8.
>>> On 23.10.12 at 20:12, Konrad Rzeszutek Wilk wrote: Do you really want to target 3.8 for these, without any hypervisor side review having happened? In particular, > Changelog [since v4] > - Mukesh addressed the reviewer's concerns. > - Took Mukesh's patches and redid the changelogs. > - Added Ack-ed as appropiate. > - Fixed PVHVM 32-bit bootup issues. > - Did some various cleanups, split some patches up. > > The big change is that I've made the xen_remove_from_physmap structure > be of the same size and offset on 32 and 64 bit builds. Currently PVH > only runs with 64-bit guests, but in the future it could run with 32-bit. > This change will eliminate having to add a compat layer to deal with > 32-bit hypercalls. > > Besides that the patches boot on an normal hypervisor - in > dom0 (32 or 64bit), PV domU (32 or 64) and PVHVM domU (32 or 64). > I don't have the PVH hypervisor patches so I could not address the > change to xen_remove_from_physmap but I hope Mukesh can do that. > > The patches are also located at: > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/pvh.v5 > > > arch/x86/include/asm/xen/interface.h | 11 ++- > arch/x86/include/asm/xen/page.h |3 + > arch/x86/xen/Kconfig | 10 ++ > arch/x86/xen/enlighten.c | 78 > arch/x86/xen/irq.c |5 +- > arch/x86/xen/mmu.c | 162 > -- > arch/x86/xen/mmu.h |2 + > arch/x86/xen/p2m.c |2 +- > arch/x86/xen/setup.c | 58 + > arch/x86/xen/smp.c | 75 ++-- > arch/x86/xen/xen-head.S | 11 ++- > drivers/xen/balloon.c| 15 ++-- > drivers/xen/cpu_hotplug.c|4 +- > drivers/xen/events.c |9 ++- > drivers/xen/gntdev.c |3 +- > drivers/xen/grant-table.c| 27 +- > drivers/xen/privcmd.c| 72 ++- > drivers/xen/xenbus/xenbus_client.c |3 +- > include/xen/interface/memory.h | 29 ++- > include/xen/interface/physdev.h | 10 ++ ... any changes to the hypervisor interface (didn't look in detail what is being changed in these two headers) should first be in at least -unstable before being consumed in any official release imo. Jan > include/xen/xen-ops.h|5 +- > 21 files changed, 504 insertions(+), 90 deletions(-) > > Konrad Rzeszutek Wilk (4): > xen/pvh: Fix PVHVM 32-bit bootup problems. > xen/hypercall: Make xen_remove_from_physmap the same on 64/32 builds. > xen/smp: Move the common CPU init code a bit to prep for PVH patch. > xen/e820: Coalesce the PVH release/populate logic in the generic case. > > Mukesh Rathor (6): > xen/pvh: Support ParaVirtualized Hardware extensions. > xen/pvh: Extend vcpu_guest_context, p2m, event, and XenBus. > xen/pvh: Implement MMU changes for PVH. > xen/pvh: bootup and setup (E820) related changes. > xen/pvh: balloon and grant changes. > xen/pvh: /dev/xen/privcmd changes. > > > ___ > Xen-devel mailing list > xen-de...@lists.xen.org > http://lists.xen.org/xen-devel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH RFC] Persistent grant maps for xen blk drivers
>>> On 23.10.12 at 20:50, Konrad Rzeszutek Wilk wrote: > On Tue, Oct 23, 2012 at 08:09:27PM +0200, Roger Pau Monné wrote: >> On 23/10/12 19:20, Konrad Rzeszutek Wilk wrote: >> diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c >> index c6decb9..2b982b2 100644 >> --- a/drivers/block/xen-blkback/blkback.c >> +++ b/drivers/block/xen-blkback/blkback.c >> @@ -78,6 +78,7 @@ struct pending_req { >> unsigned short operation; >> int status; >> struct list_headfree_list; >> + unsigned intunmap_seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; >> >> Should I change this to a bool? Since we are only setting it to 0 or 1. > > I would just keep it as 'int'. Eventually we can replace this with a > bit-map, but that can be done later. I think this should be a bitmap from the beginning - why would you want to waste 44 bytes per request for something that fits in a single unsigned long (and the picture would get worse with the number-of-segments extension)? Also - am I taking this work being done here as a silent agreement to not invest into blkif2 to streamline the various extensions floating around? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH V5] PVH patches for v3.8.
>>> On 24.10.12 at 11:34, Ian Campbell wrote: > On Wed, 2012-10-24 at 08:13 +0100, Jan Beulich wrote: > >> > include/xen/interface/memory.h | 29 ++- >> > include/xen/interface/physdev.h | 10 ++ >> >> ... any changes to the hypervisor interface (didn't look in detail >> what is being changed in these two headers) should first be in >> at least -unstable before being consumed in any official release >> imo. > > I'd also like to see at least the interface definitions in the h/v tree > if not the implementation right away. > > The flip side is that we have agreed that the interfaces are not > considered set in stone / stable until we've had a chance to review the > implementation, so perhaps it is better not to commit them to > xen-unstable.hg right away. > > I don't know what the right answer is. Perhaps we should at a minimum > reserve the subop numbers even if we don't yet define what they mean in > the Xen tree. But even then - what use is it to have 3.8 possibly only work on some intermediate (perhaps even just privately built) hypervisors? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] linux-next: Tree for Oct 24 (xen)
>>> On 24.10.12 at 23:33, Randy Dunlap wrote: > On 10/23/2012 09:19 PM, Stephen Rothwell wrote: > >> Hi all, >> >> Changes since 201201023: >> > > on x86_64: > > drivers/built-in.o: In function `dbgp_reset_prep': > (.text+0xb96b5): undefined reference to `xen_dbgp_reset_prep' > drivers/built-in.o: In function `dbgp_external_startup': > (.text+0xb9d95): undefined reference to `xen_dbgp_external_startup' > > > Full randconfig file is attached. So this is because with !USB_SUPPORT but EARLY_PRINTK_DBGP dbgp_reset_prep() and dbgp_external_startup() get pointlessly defined and exported. This got broken by the merge recommendation for the ARM side changes (originally compilation of drivers/xen/dbgp.c depended on just CONFIG_XEN_DOM0). >From my pov, fixing the USB side would be the clean solution (i.e. putting those function definitions inside a CONFIG_USB_SUPPORT conditional). The alternative of a smaller change would be to extend the conditional around the respective xen_dbgp_...() declarations in include/linux/usb/ehci_def.h to become #if defined(CONFIG_XEN_DOM0) && defined(CONFIG_USB_SUPPORT) Please advise towards your preference. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH] xen PVonHVM: move shared_info to reserved memory area
>>> On 25.10.12 at 11:04, Olaf Hering wrote: > @@ -1495,38 +1494,53 @@ asmlinkage void __init xen_start_kernel(void) > #endif > } > > -void __ref xen_hvm_init_shared_info(void) Was the __ref here in fact unnecessary (i.e. did you check that its removal doesn't cause any section mismatch warnings)? > +#ifdef CONFIG_XEN_PVHVM > +#define HVM_SHARED_INFO_ADDR 0xFE70UL > +static struct shared_info *xen_hvm_shared_info; > + > +static void xen_hvm_connect_shared_info(unsigned long pfn) > { > - int cpu; > struct xen_add_to_physmap xatp; > - static struct shared_info *shared_info_page = 0; > > - if (!shared_info_page) > - shared_info_page = (struct shared_info *) > - extend_brk(PAGE_SIZE, PAGE_SIZE); > xatp.domid = DOMID_SELF; > xatp.idx = 0; > xatp.space = XENMAPSPACE_shared_info; > - xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT; > + xatp.gpfn = pfn; > if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) > BUG(); > > - HYPERVISOR_shared_info = (struct shared_info *)shared_info_page; > +} > +static void xen_hvm_set_shared_info(struct shared_info *sip) With its sole caller being __init, this should also be __init. Jan > +{ > + int cpu; > + > + HYPERVISOR_shared_info = sip; > > /* xen_vcpu is a pointer to the vcpu_info struct in the shared_info >* page, we use it in the event channel upcall and in some pvclock >* related functions. We don't need the vcpu_info placement >* optimizations because we don't use any pv_mmu or pv_irq op on > - * HVM. > - * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is > - * online but xen_hvm_init_shared_info is run at resume time too and > - * in that case multiple vcpus might be online. */ > - for_each_online_cpu(cpu) { > + * HVM. */ > + for_each_online_cpu(cpu) > per_cpu(xen_vcpu, cpu) = > &HYPERVISOR_shared_info->vcpu_info[cpu]; > - } > } > > -#ifdef CONFIG_XEN_PVHVM > +/* Reconnect the shared_info pfn to a (new) mfn */ > +void xen_hvm_resume_shared_info(void) > +{ > + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); > +} > + > +/* Obtain an address to the pfn in reserved area */ > +void __init xen_hvm_init_shared_info(void) > +{ > + set_fixmap(FIX_PARAVIRT_BOOTMAP, HVM_SHARED_INFO_ADDR); > + xen_hvm_shared_info = > + (struct shared_info *)fix_to_virt(FIX_PARAVIRT_BOOTMAP); > + xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT); > + xen_hvm_set_shared_info(xen_hvm_shared_info); > +} > + > static void __init init_hvm_pv_info(void) > { > int major, minor; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] linux-next: Tree for Oct 24 (xen)
>>> On 25.10.12 at 15:46, Konrad Rzeszutek Wilk wrote: > On Thu, Oct 25, 2012 at 11:48:30AM +0100, Stefano Stabellini wrote: >> On Thu, 25 Oct 2012, Jan Beulich wrote: >> > >>> On 24.10.12 at 23:33, Randy Dunlap wrote: >> > > On 10/23/2012 09:19 PM, Stephen Rothwell wrote: >> > > >> > >> Hi all, >> > >> >> > >> Changes since 201201023: >> > >> >> > > >> > > on x86_64: >> > > >> > > drivers/built-in.o: In function `dbgp_reset_prep': >> > > (.text+0xb96b5): undefined reference to `xen_dbgp_reset_prep' >> > > drivers/built-in.o: In function `dbgp_external_startup': >> > > (.text+0xb9d95): undefined reference to `xen_dbgp_external_startup' >> > > >> > > >> > > Full randconfig file is attached. >> > >> > So this is because with !USB_SUPPORT but EARLY_PRINTK_DBGP >> > dbgp_reset_prep() and dbgp_external_startup() get pointlessly >> > defined and exported. This got broken by the merge >> > recommendation for the ARM side changes (originally compilation >> > of drivers/xen/dbgp.c depended on just CONFIG_XEN_DOM0). >> > >> > >From my pov, fixing the USB side would be the clean solution (i.e. >> > putting those function definitions inside a CONFIG_USB_SUPPORT >> > conditional). >> > >> > The alternative of a smaller change would be to extend the >> > conditional around the respective xen_dbgp_...() declarations >> > in include/linux/usb/ehci_def.h to become >> > >> > #if defined(CONFIG_XEN_DOM0) && defined(CONFIG_USB_SUPPORT) >> > >> > Please advise towards your preference. >> >> I think that your first suggestion is the right one. > > Can you guys spin up a patch pls and make sure it does not break > compilation. Thx. I'd really like to hear Greg's opinion on which route to take before pointlessly trying the other one. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [Xen-devel] [PATCH 1/2] Xen acpi pad implement
>>> "Liu, Jinsong" 10/26/12 8:18 AM >>> >> +static struct acpi_driver xen_acpi_pad_driver = { >> +.name = "processor_aggregator", >> +.class = ACPI_PROCESSOR_AGGREGATOR_CLASS, >> +.ids = pad_device_ids, >> +.ops = { >> +.add = xen_acpi_pad_add, > >.remove? > >[Jinsong] .remove method not used by any logic now (any possible point use >it?), so we remove it from our former patch. Unless there is technical difficulty implementing it, I wouldn't defer adding that code until the point where something doesn't work anymore. >Overall I'd recommend taking a look at the cleaned up driver in >our kernels. > >[Jinsong] What's your point here? There's quite a bit of cleanup/simplification potential here, and rather than pointing the pieces out individually I would think comparing with what we have in production use might be worthwhile. But that's up to you of course. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH 1/2] Xen acpi pad implement
>>> On 25.10.12 at 14:19, "Liu, Jinsong" wrote: > --- /dev/null > +++ b/drivers/xen/xen_acpi_pad.c > @@ -0,0 +1,173 @@ > +/* > + * xen_acpi_pad.c - Xen pad interface > + * > + * Copyright (c) 2012, Intel Corporation. > + *Author: Liu, Jinsong > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#if defined(CONFIG_ACPI_PROCESSOR_AGGREGATOR) || \ > + defined(CONFIG_ACPI_PROCESSOR_AGGREGATOR_MODULE) > + > +#define ACPI_PROCESSOR_AGGREGATOR_CLASS "acpi_pad" > +#define ACPI_PROCESSOR_AGGREGATOR_DEVICE_NAME "Processor Aggregator" > +#define ACPI_PROCESSOR_AGGREGATOR_NOTIFY 0x80 > + > +static int xen_acpi_pad_idle_cpus(int *num_cpus) > +{ > + int ret; > + Stray blank line. > + struct xen_platform_op op = { > + .cmd = XENPF_core_parking, > + .interface_version = XENPF_INTERFACE_VERSION, This is redundant with HYPERVISOR_dom0_op(). > + }; > + > + /* set cpu nums expected to be idled */ > + op.u.core_parking.type = XEN_CORE_PARKING_SET; > + op.u.core_parking.idle_nums = (uint32_t)*num_cpus; It is quite a bit more efficient in terms of generated code to initialize all fields using assignments (the use of an initializer setting only a subset of fields will cause all other fields to get zero initialized). In any case I think it is bad style to mix both approaches. > + ret = HYPERVISOR_dom0_op(&op); > + if (ret) > + return ret; > + > + /* > + * get cpu nums actually be idled > + * cannot get it by using hypercall once (shared with _SET) > + * because of the characteristic of Xen continue_hypercall_on_cpu > + */ > + op.u.core_parking.type = XEN_CORE_PARKING_GET; > + ret = HYPERVISOR_dom0_op(&op); > + if (ret) > + return ret; > + > + *num_cpus = op.u.core_parking.idle_nums; "num_cpus" doesn't need to be a pointer, and you don't need to call _GET here either - callers that care for the count in effect after the call can invoke the corresponding _GET on their own. > + return 0; > +} > + > +/* > + * Query firmware how many CPUs should be idle > + * return -1 on failure > + */ > +static int xen_acpi_pad_pur(acpi_handle handle) > +{ > + struct acpi_buffer buffer = {ACPI_ALLOCATE_BUFFER, NULL}; > + union acpi_object *package; > + int num = -1; > + > + if (ACPI_FAILURE(acpi_evaluate_object(handle, "_PUR", NULL, &buffer))) > + return num; > + > + if (!buffer.length || !buffer.pointer) > + return num; > + > + package = buffer.pointer; > + > + if (package->type == ACPI_TYPE_PACKAGE && > + package->package.count == 2 && > + package->package.elements[0].integer.value == 1) /* rev 1 */ > + > + num = package->package.elements[1].integer.value; > + > + kfree(buffer.pointer); > + return num; > +} > + > +/* Notify firmware how many CPUs are idle */ > +static void xen_acpi_pad_ost(acpi_handle handle, int stat, > + uint32_t idle_cpus) > +{ > + union acpi_object params[3] = { > + {.type = ACPI_TYPE_INTEGER,}, > + {.type = ACPI_TYPE_INTEGER,}, > + {.type = ACPI_TYPE_BUFFER,}, > + }; > + struct acpi_object_list arg_list = {3, params}; > + > + params[0].integer.value = ACPI_PROCESSOR_AGGREGATOR_NOTIFY; > + params[1].integer.value = stat; > + params[2].buffer.length = 4; > + params[2].buffer.pointer = (void *)&idle_cpus; > + acpi_evaluate_object(handle, "_OST", &arg_list, NULL); > +} > + > +static void xen_acpi_pad_handle_notify(acpi_handle handle) > +{ > + int ret, num_cpus; > + > + num_cpus = xen_acpi_pad_pur(handle); > + if (num_cpus < 0) > + return; > + > + ret = xen_acpi_pad_idle_cpus(&num_cpus); > + if (ret) > + return; > + > + xen_acpi_pad_ost(handle, 0, num_cpus); > +} > + > +static void xen_acpi_pad_notify(acpi_handle handle, u32 event, > + void *data) > +{ > + switch (event) { > + case ACPI_PROCESSOR_AGGREGATOR_NOTIFY: > + xen_acpi_pad_handle_notify(handle); > + break; > + default: > + pr_warn("Unsupported event [0x%x]\n", event); > + break; > + } > +} > + > +static int xen_acpi_pad_add(struct acpi_device *device) > +{ > + acpi_status status; > + > + strcpy(acpi_device_name(device), ACPI_PROCESSOR_AGGREGATOR_DEVICE_NAME); > + strcpy(acpi_device_class(device), ACPI_PROCESSOR_AGGREGATOR_CLASS); >
Re: linux-next: Tree for Oct 31 (ehci, dbgp)
>>> Alan Stern 11/01/12 4:28 PM >>> >On Wed, 31 Oct 2012, Randy Dunlap wrote: >> on x86_64: >> >> drivers/built-in.o: In function `ehci_reset': >> host.c:(.text+0x542a7e): undefined reference to `dbgp_reset_prep' >> host.c:(.text+0x542b75): undefined reference to `dbgp_external_startup' >> drivers/built-in.o: In function `ehci_bus_resume': >> host.c:(.text+0x544705): undefined reference to `dbgp_reset_prep' >> host.c:(.text+0x544731): undefined reference to `dbgp_external_startup' > >We all forgot about the chipidea driver. It includes code from I didn't even know of this. >ehci-hcd in a rather unorthodox manner (see >drivers/usb/chipidea/host.c). > >Evidently we need to change your new test in >drivers/usb/early/ehci-dbgp.c to: > >#if IS_ENABLED(CONFIG_USB_HCD_EHCI) || defined(CONFIG_USB_CHIPIDEA_HOST) > >Upcoming changes to ehci-hcd will make this unnecessary in 3.8, but for >now we need it. Which tells me that the CONFIG_USB_SUPPORT version would have been the better one (and I would favor that over the ugly variant you suggest above). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [PATCH] x86: Don't clobber top of pt_regs in nested NMI
>>> Steven Rostedt 11/01/12 2:04 AM >>> >On Mon, 2012-10-01 at 17:29 -0700, Salman Qazi wrote: >> @@ -1826,12 +1832,15 @@ repeat_nmi: >> * is benign for the non-repeat case, where 1 was pushed just above >> * to this very stack slot). >> */ >> -movq $1, 5*8(%rsp) >> +movq $1, 10*8(%rsp) >> >> /* Make another copy, this one may be modified by nested NMIs */ >> +addq $(10*8), %rsp > >This breaks the CFI magic. > >> .rept 5 >> -pushq_cfi 4*8(%rsp) >> +pushq_cfi -6*8(%rsp) >> .endr >> +subq $(5*8), %rsp > >So does this. > >This needs to be annotated correctly before I can push it out. But the >good news is, I stressed tested this change, and it all works out. > >Jan, can you help out here? There doesn't appear to be anything special about these adjustments, so I don't see what help would be required here - it ought to be the normal use of CFI_ADJUST_CFA_OFFSET that needs adding. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] x86, xen: fix build dependency when USB_SUPPORT is not enabled
>>> Konrad Rzeszutek Wilk 11/01/12 1:49 PM >>> >On Wed, Oct 31, 2012 at 10:42:30PM -0700, David Rientjes wrote: >> CONFIG_XEN_DOM0 must depend on CONFIG_USB_SUPPORT, otherwise there is no >> definition of xen_dbgp_reset_prep() and xen_dbgp_external_startup() >> resulting in the following link error: >> >> drivers/built-in.o: In function `dbgp_reset_prep': >> (.text+0x1e03c5): undefined reference to `xen_dbgp_reset_prep' >> drivers/built-in.o: In function `dbgp_external_startup': >> (.text+0x1e0d55): undefined reference to `xen_dbgp_external_startup' > >There is another patch that needs to be Acked and picked up by >Greg KH that fixes this. > >Let me poke Jan Beulich to repost it with the appropiate Acks. It's been picked up already, but another dependency problem was found with it (due to not having used CONFIG_USB_SUPPORT as dependency, as I had first submitted). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: Tree for Oct 31 (ehci, dbgp)
>>> Alan Stern 11/01/12 9:39 PM >>> >On Thu, 1 Nov 2012, Jan Beulich wrote: >> >>> Alan Stern 11/01/12 4:28 PM >>> >> >Evidently we need to change your new test in >> >drivers/usb/early/ehci-dbgp.c to: >> > >> >#if IS_ENABLED(CONFIG_USB_HCD_EHCI) || defined(CONFIG_USB_CHIPIDEA_HOST) >> > >> >Upcoming changes to ehci-hcd will make this unnecessary in 3.8, but for >> >now we need it. >> >> Which tells me that the CONFIG_USB_SUPPORT version would have been >> the better one (and I would favor that over the ugly variant you suggest >> above). > >I also suggested IS_ENABLED(CONFIG_USB), which is no uglier than what >you submitted and would also fix this build error. How about using it >instead? Yes, that's better. Question then is - updated original patch or incremental one? Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] x86, xen: fix build dependency when USB_SUPPORT is not enabled
>>> On 01.11.12 at 23:05, Konrad Rzeszutek Wilk wrote: > On Thu, Nov 01, 2012 at 08:04:58PM +, Jan Beulich wrote: >> >>> Konrad Rzeszutek Wilk 11/01/12 1:49 PM >>> >> >On Wed, Oct 31, 2012 at 10:42:30PM -0700, David Rientjes wrote: >> >> CONFIG_XEN_DOM0 must depend on CONFIG_USB_SUPPORT, otherwise there is no >> >> definition of xen_dbgp_reset_prep() and xen_dbgp_external_startup() >> >> resulting in the following link error: >> >> >> >> drivers/built-in.o: In function `dbgp_reset_prep': >> >> (.text+0x1e03c5): undefined reference to `xen_dbgp_reset_prep' >> >> drivers/built-in.o: In function `dbgp_external_startup': >> >> (.text+0x1e0d55): undefined reference to `xen_dbgp_external_startup' >> > >> >There is another patch that needs to be Acked and picked up by >> >Greg KH that fixes this. >> > >> >Let me poke Jan Beulich to repost it with the appropiate Acks. >> >> It's been picked up already, but another dependency problem was found with >> it (due to not having used CONFIG_USB_SUPPORT as dependency, as I had >> first submitted). > > Oh. I missed that part - so do you think that this patch should also > be used? Or do you think there is another way to fix this? > > I am in transit right now so I can't prep a patch (and the laptop I've > is extremely slow to even do a test compile). Alan and I settled on relaxing the condition in ehci-dbgp.c to IS_ENABLED(USB). But it's not clear to me if I should send an incremental patch or a replacement one (neither he nor Greg answered my respective inquiry). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86-64: fix ordering of CFI directives and recent ASM_CLAC additions
While these got added in the right place everywhere else, entry_64.S is the odd one where they ended up before the initial CFI directive(s). In order to cover the full code ranges, the CFI directive must be first, though. Signed-off-by: Jan Beulich --- arch/x86/kernel/entry_64.S | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) --- 3.7-rc3/arch/x86/kernel/entry_64.S +++ 3.7-rc3-x86_64-unwind-vs-clac/arch/x86/kernel/entry_64.S @@ -995,8 +995,8 @@ END(interrupt) */ .p2align CONFIG_X86_L1_CACHE_SHIFT common_interrupt: - ASM_CLAC XCPT_FRAME + ASM_CLAC addq $-0x80,(%rsp) /* Adjust vector to [-256,-1] range */ interrupt do_IRQ /* 0(%rsp): old_rsp-ARGOFFSET */ @@ -1135,8 +1135,8 @@ END(common_interrupt) */ .macro apicinterrupt num sym do_sym ENTRY(\sym) - ASM_CLAC INTR_FRAME + ASM_CLAC pushq_cfi $~(\num) .Lcommon_\sym: interrupt \do_sym @@ -1190,8 +1190,8 @@ apicinterrupt IRQ_WORK_VECTOR \ */ .macro zeroentry sym do_sym ENTRY(\sym) - ASM_CLAC INTR_FRAME + ASM_CLAC PARAVIRT_ADJUST_EXCEPTION_FRAME pushq_cfi $-1 /* ORIG_RAX: no syscall to restart */ subq $ORIG_RAX-R15, %rsp @@ -1208,8 +1208,8 @@ END(\sym) .macro paranoidzeroentry sym do_sym ENTRY(\sym) - ASM_CLAC INTR_FRAME + ASM_CLAC PARAVIRT_ADJUST_EXCEPTION_FRAME pushq_cfi $-1 /* ORIG_RAX: no syscall to restart */ subq $ORIG_RAX-R15, %rsp @@ -1227,8 +1227,8 @@ END(\sym) #define INIT_TSS_IST(x) PER_CPU_VAR(init_tss) + (TSS_ist + ((x) - 1) * 8) .macro paranoidzeroentry_ist sym do_sym ist ENTRY(\sym) - ASM_CLAC INTR_FRAME + ASM_CLAC PARAVIRT_ADJUST_EXCEPTION_FRAME pushq_cfi $-1 /* ORIG_RAX: no syscall to restart */ subq $ORIG_RAX-R15, %rsp @@ -1247,8 +1247,8 @@ END(\sym) .macro errorentry sym do_sym ENTRY(\sym) - ASM_CLAC XCPT_FRAME + ASM_CLAC PARAVIRT_ADJUST_EXCEPTION_FRAME subq $ORIG_RAX-R15, %rsp CFI_ADJUST_CFA_OFFSET ORIG_RAX-R15 @@ -1266,8 +1266,8 @@ END(\sym) /* error code is on the stack already */ .macro paranoiderrorentry sym do_sym ENTRY(\sym) - ASM_CLAC XCPT_FRAME + ASM_CLAC PARAVIRT_ADJUST_EXCEPTION_FRAME subq $ORIG_RAX-R15, %rsp CFI_ADJUST_CFA_OFFSET ORIG_RAX-R15 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86/perf_events: build fix
At least some older gcc versions dislike mixing constant and non-const data in the same section ("... causes a section type confict"). Signed-off-by: Jan Beulich --- arch/x86/kernel/cpu/perf_event_p6.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- 3.7-rc3/arch/x86/kernel/cpu/perf_event_p6.c +++ 3.7-rc3-x86-perf-p6-section/arch/x86/kernel/cpu/perf_event_p6.c @@ -19,7 +19,7 @@ static const u64 p6_perfmon_event_map[] }; -static __initconst u64 p6_hw_cache_event_ids +static const u64 __initconst p6_hw_cache_event_ids [PERF_COUNT_HW_CACHE_MAX] [PERF_COUNT_HW_CACHE_OP_MAX] [PERF_COUNT_HW_CACHE_RESULT_MAX] = -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86/HPET: fix inverted return value check in arch_setup_hpet_msi()
setup_hpet_msi_remapped() returns a negative error indicator on error - check for this rather than for a boolean false indication, and pass on that error code rather than a meaningless "-1". Signed-off-by: Jan Beulich Cc: David Woodhouse --- arch/x86/kernel/apic/io_apic.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- 3.7-rc3/arch/x86/kernel/apic/io_apic.c +++ 3.7-rc3-x86-hpet-masking/arch/x86/kernel/apic/io_apic.c @@ -3317,8 +3317,9 @@ int arch_setup_hpet_msi(unsigned int irq int ret; if (irq_remapping_enabled) { - if (!setup_hpet_msi_remapped(irq, id)) - return -1; + ret = setup_hpet_msi_remapped(irq, id); + if (ret) + return ret; } ret = msi_compose_msg(NULL, irq, &msg, id); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86: fix masking of MSI interrupts
HPET_TN_FSB is not a proper mask bit; it merely toggles between MSI and legacy interrupt delivery. The proper mask bit is HPET_TN_ENABLE, so use both bits when (un)masking the interrupt. Signed-off-by: Jan Beulich --- arch/x86/kernel/hpet.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) --- 3.7-rc3/arch/x86/kernel/hpet.c +++ 3.7-rc3-x86-hpet-masking/arch/x86/kernel/hpet.c @@ -434,7 +434,7 @@ void hpet_msi_unmask(struct irq_data *da /* unmask it */ cfg = hpet_readl(HPET_Tn_CFG(hdev->num)); - cfg |= HPET_TN_FSB; + cfg |= HPET_TN_ENABLE | HPET_TN_FSB; hpet_writel(cfg, HPET_Tn_CFG(hdev->num)); } @@ -445,7 +445,7 @@ void hpet_msi_mask(struct irq_data *data /* mask it */ cfg = hpet_readl(HPET_Tn_CFG(hdev->num)); - cfg &= ~HPET_TN_FSB; + cfg &= ~(HPET_TN_ENABLE | HPET_TN_FSB); hpet_writel(cfg, HPET_Tn_CFG(hdev->num)); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: Tree for Oct 31 (ehci, dbgp)
>>> On 02.11.12 at 15:01, Alan Stern wrote: > On Thu, 1 Nov 2012, Jan Beulich wrote: > >> >>> Alan Stern 11/01/12 9:39 PM >>> >> >On Thu, 1 Nov 2012, Jan Beulich wrote: >> >> >>> Alan Stern 11/01/12 4:28 PM >>> >> >> >Evidently we need to change your new test in >> >> >drivers/usb/early/ehci-dbgp.c to: >> >> > >> >> >#if IS_ENABLED(CONFIG_USB_HCD_EHCI) || defined(CONFIG_USB_CHIPIDEA_HOST) >> >> > >> >> >Upcoming changes to ehci-hcd will make this unnecessary in 3.8, but for >> >> >now we need it. >> >> >> >> Which tells me that the CONFIG_USB_SUPPORT version would have been >> >> the better one (and I would favor that over the ugly variant you suggest >> >> above). >> > >> >I also suggested IS_ENABLED(CONFIG_USB), which is no uglier than what >> >you submitted and would also fix this build error. How about using it >> >instead? >> >> Yes, that's better. Question then is - updated original patch or incremental > one? > > Greg will probably want an incremental patch, because the original has > already been merged. I actually sent both (the incremental as attachment - I hope that's going to be acceptable to him) in a submission earlier today. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [PATCH] x86: Don't clobber top of pt_regs in nested NMI
>>> On 02.11.12 at 14:53, Steven Rostedt wrote: > On Fri, 2012-11-02 at 09:51 -0400, Steven Rostedt wrote: >> On Thu, 2012-11-01 at 19:53 +, Jan Beulich wrote: >> >> > There doesn't appear to be anything special about these adjustments, so I >> > don't see what help would be required here - it ought to be the normal use >> > of CFI_ADJUST_CFA_OFFSET that needs adding. >> >> This change look fine to you? >> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S >> index 52edf92..7ba5342 100644 >> --- a/arch/x86/kernel/entry_64.S >> +++ b/arch/x86/kernel/entry_64.S >> @@ -1796,10 +1796,12 @@ repeat_nmi: >> >> /* Make another copy, this one may be modified by nested NMIs */ >> addq $(10*8), %rsp >> +CFI_ADJUST_CFA_OFFSET -10*8 >> .rept 5 >> pushq_cfi -6*8(%rsp) >> .endr >> subq $(5*8), %rsp >> +CFI_ADJUST_CFA_OFFSET 5*8 >> >> CFI_DEF_CFA_OFFSET SS+8-RIP >> end_repeat_nmi: >> > > Is that second one even needed? Or will the CFI_DEF_CFA_OFFSET SS+8-RIP > fix it? Yes it will (as long as no intervening instructions get added; that's to say that I'd recommend removing the blank line to make clear that instruction and annotation belong together). Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/3, v2] x86: xor-block handling adjustments
v2 of this series is merely updated on top of the changes between 3.6 and 3.7-rc (which includes the dropping of what previously was the second patch in a four patch series). 1: unify SSE-base xor-block routines 2: add alternative SSE implementation only prefetching once per 64-byte line 3: make virtualization friendly Signed-off-by: Jan Beulich -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/3, v2] x86: unify SSE-base xor-block routines
Besides folding duplicate code, this has the advantage of fixing x86-64's failure to use proper (para-virtualizable) accessors for dealing with CR0.TS. Signed-off-by: Jan Beulich --- arch/x86/include/asm/xor.h| 319 +- arch/x86/include/asm/xor_32.h | 286 - arch/x86/include/asm/xor_64.h | 295 -- 3 files changed, 319 insertions(+), 581 deletions(-) --- 3.7-rc3-x86-xor.orig/arch/x86/include/asm/xor.h +++ 3.7-rc3-x86-xor/arch/x86/include/asm/xor.h @@ -1,10 +1,327 @@ #ifdef CONFIG_KMEMCHECK /* kmemcheck doesn't handle MMX/SSE/SSE2 instructions */ # include +#elif !defined(_ASM_X86_XOR_H) +#define _ASM_X86_XOR_H + +/* + * Optimized RAID-5 checksumming functions for SSE. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * You should have received a copy of the GNU General Public License + * (for example /usr/src/linux/COPYING); if not, write to the Free + * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +/* + * Cache avoiding checksumming functions utilizing KNI instructions + * Copyright (C) 1999 Zach Brown (with obvious credit due Ingo) + */ + +/* + * Based on + * High-speed RAID5 checksumming functions utilizing SSE instructions. + * Copyright (C) 1998 Ingo Molnar. + */ + +/* + * x86-64 changes / gcc fixes from Andi Kleen. + * Copyright 2002 Andi Kleen, SuSE Labs. + * + * This hasn't been optimized for the hammer yet, but there are likely + * no advantages to be gotten from x86-64 here anyways. + */ + +#include + +#ifdef CONFIG_X86_32 +/* reduce register pressure */ +# define XOR_CONSTANT_CONSTRAINT "i" #else +# define XOR_CONSTANT_CONSTRAINT "re" +#endif + +#define OFFS(x)"16*("#x")" +#define PF_OFFS(x) "256+16*("#x")" +#define PF0(x) " prefetchnta "PF_OFFS(x)"(%[p1]) ;\n" +#define LD(x, y) " movaps "OFFS(x)"(%[p1]), %%xmm"#y" ;\n" +#define ST(x, y) " movaps %%xmm"#y", "OFFS(x)"(%[p1]) ;\n" +#define PF1(x) " prefetchnta "PF_OFFS(x)"(%[p2]) ;\n" +#define PF2(x) " prefetchnta "PF_OFFS(x)"(%[p3]) ;\n" +#define PF3(x) " prefetchnta "PF_OFFS(x)"(%[p4]) ;\n" +#define PF4(x) " prefetchnta "PF_OFFS(x)"(%[p5]) ;\n" +#define XO1(x, y) " xorps "OFFS(x)"(%[p2]), %%xmm"#y" ;\n" +#define XO2(x, y) " xorps "OFFS(x)"(%[p3]), %%xmm"#y" ;\n" +#define XO3(x, y) " xorps "OFFS(x)"(%[p4]), %%xmm"#y" ;\n" +#define XO4(x, y) " xorps "OFFS(x)"(%[p5]), %%xmm"#y" ;\n" + +static void +xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) +{ + unsigned long lines = bytes >> 8; + + kernel_fpu_begin(); + + asm volatile( +#undef BLOCK +#define BLOCK(i) \ + LD(i, 0)\ + LD(i + 1, 1)\ + PF1(i) \ + PF1(i + 2) \ + LD(i + 2, 2)\ + LD(i + 3, 3)\ + PF0(i + 4) \ + PF0(i + 6) \ + XO1(i, 0) \ + XO1(i + 1, 1) \ + XO1(i + 2, 2) \ + XO1(i + 3, 3) \ + ST(i, 0)\ + ST(i + 1, 1)\ + ST(i + 2, 2)\ + ST(i + 3, 3)\ + + + PF0(0) + PF0(2) + + " .align 32 ;\n" + " 1:;\n" + + BLOCK(0) + BLOCK(4) + BLOCK(8) + BLOCK(12) + + " add %[inc], %[p1] ;\n" + " add %[inc], %[p2] ;\n" + " dec %[cnt] ;\n" + " jnz 1b ;\n" + : [cnt] &q
[PATCH 2/3, v2] x86: add alternative SSE implementation only prefetching once per 64-byte line
On CPUs with 64-byte last level cache lines, this yields roughly 10% better performance, independent of CPU vendor or specific model (as far as I was able to test). Signed-off-by: Jan Beulich --- arch/x86/include/asm/xor.h| 172 ++ arch/x86/include/asm/xor_32.h | 23 ++--- arch/x86/include/asm/xor_64.h | 10 -- 3 files changed, 187 insertions(+), 18 deletions(-) --- 3.7-rc3-x86-xor.orig/arch/x86/include/asm/xor.h +++ 3.7-rc3-x86-xor/arch/x86/include/asm/xor.h @@ -58,6 +58,14 @@ #define XO2(x, y) " xorps "OFFS(x)"(%[p3]), %%xmm"#y" ;\n" #define XO3(x, y) " xorps "OFFS(x)"(%[p4]), %%xmm"#y" ;\n" #define XO4(x, y) " xorps "OFFS(x)"(%[p5]), %%xmm"#y" ;\n" +#define NOP(x) + +#define BLK64(pf, op, i) \ + pf(i) \ + op(i, 0)\ + op(i + 1, 1)\ + op(i + 2, 2)\ + op(i + 3, 3) static void xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) @@ -111,6 +119,40 @@ xor_sse_2(unsigned long bytes, unsigned } static void +xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2) +{ + unsigned long lines = bytes >> 8; + + kernel_fpu_begin(); + + asm volatile( +#undef BLOCK +#define BLOCK(i) \ + BLK64(PF0, LD, i) \ + BLK64(PF1, XO1, i) \ + BLK64(NOP, ST, i) \ + + " .align 32 ;\n" + " 1:;\n" + + BLOCK(0) + BLOCK(4) + BLOCK(8) + BLOCK(12) + + " add %[inc], %[p1] ;\n" + " add %[inc], %[p2] ;\n" + " dec %[cnt] ;\n" + " jnz 1b ;\n" + : [cnt] "+r" (lines), + [p1] "+r" (p1), [p2] "+r" (p2) + : [inc] XOR_CONSTANT_CONSTRAINT (256UL) + : "memory"); + + kernel_fpu_end(); +} + +static void xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, unsigned long *p3) { @@ -170,6 +212,43 @@ xor_sse_3(unsigned long bytes, unsigned } static void +xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, + unsigned long *p3) +{ + unsigned long lines = bytes >> 8; + + kernel_fpu_begin(); + + asm volatile( +#undef BLOCK +#define BLOCK(i) \ + BLK64(PF0, LD, i) \ + BLK64(PF1, XO1, i) \ + BLK64(PF2, XO2, i) \ + BLK64(NOP, ST, i) \ + + " .align 32 ;\n" + " 1:;\n" + + BLOCK(0) + BLOCK(4) + BLOCK(8) + BLOCK(12) + + " add %[inc], %[p1] ;\n" + " add %[inc], %[p2] ;\n" + " add %[inc], %[p3] ;\n" + " dec %[cnt] ;\n" + " jnz 1b ;\n" + : [cnt] "+r" (lines), + [p1] "+r" (p1), [p2] "+r" (p2), [p3] "+r" (p3) + : [inc] XOR_CONSTANT_CONSTRAINT (256UL) + : "memory"); + + kernel_fpu_end(); +} + +static void xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, unsigned long *p3, unsigned long *p4) { @@ -236,6 +315,45 @@ xor_sse_4(unsigned long bytes, unsigned } static void +xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, + unsigned long *p3, unsigned long *p4) +{ + unsigned long lines = bytes >> 8; + + kernel_fpu_begin(); + + asm volatile( +#undef BLOCK +#define BLOCK(i) \ + BLK64(PF0, LD, i) \ + BLK64(PF1, XO1, i) \ + BLK64(PF2, XO2, i) \ + BLK64(PF3, XO3, i) \ + BLK64(NOP, ST, i) \ + + " .align 32 ;\n" + " 1:;\n" + + BLOCK(0) + BLOCK(4) + BLOCK(8) + BLOCK(12) + + " add %[inc], %[p1] ;\n" + " add %[inc], %[p2] ;\n" + " add %[inc], %[p3] ;\n" + " add %[inc], %[p4] ;\n" + " dec %[cnt] ;\n" + " jnz 1b
[PATCH 3/3, v2] x86/xor: make virtualization friendly
In virtualized environments, the CR0.TS management needed here can be a lot slower than anticipated by the original authors of this code, which particularly means that in such cases forcing the use of SSE- (or MMX-) based implementations is not desirable - actual measurements should always be done in that case. For consistency, pull into the shared (32- and 64-bit) header not only the inclusion of the generic code, but also that of the AVX variants. Signed-off-by: Jan Beulich Cc: Konrad Rzeszutek Wilk --- arch/x86/include/asm/xor.h|8 +++- arch/x86/include/asm/xor_32.h | 22 ++ arch/x86/include/asm/xor_64.h | 10 ++ 3 files changed, 23 insertions(+), 17 deletions(-) --- 3.7-rc3-x86-xor.orig/arch/x86/include/asm/xor.h +++ 3.7-rc3-x86-xor/arch/x86/include/asm/xor.h @@ -487,6 +487,12 @@ static struct xor_block_template xor_blo #undef XOR_CONSTANT_CONSTRAINT +/* Also try the AVX routines */ +#include + +/* Also try the generic routines. */ +#include + #ifdef CONFIG_X86_32 # include #else @@ -494,6 +500,6 @@ static struct xor_block_template xor_blo #endif #define XOR_SELECT_TEMPLATE(FASTEST) \ - AVX_SELECT(FASTEST) + (cpu_has_hypervisor ? (FASTEST) : AVX_SELECT(FASTEST)) #endif /* _ASM_X86_XOR_H */ --- 3.7-rc3-x86-xor.orig/arch/x86/include/asm/xor_32.h +++ 3.7-rc3-x86-xor/arch/x86/include/asm/xor_32.h @@ -537,12 +537,6 @@ static struct xor_block_template xor_blo .do_5 = xor_sse_5, }; -/* Also try the AVX routines */ -#include - -/* Also try the generic routines. */ -#include - /* We force the use of the SSE xor block because it can write around L2. We may also be able to load into the L1 only depending on how the cpu deals with a load to a line that is being prefetched. */ @@ -553,15 +547,19 @@ do { \ if (cpu_has_xmm) { \ xor_speed(&xor_block_pIII_sse); \ xor_speed(&xor_block_sse_pf64); \ - } else if (cpu_has_mmx) { \ + if (!cpu_has_hypervisor)\ + break; \ + } \ + if (cpu_has_mmx) { \ xor_speed(&xor_block_pII_mmx); \ xor_speed(&xor_block_p5_mmx); \ - } else {\ - xor_speed(&xor_block_8regs);\ - xor_speed(&xor_block_8regs_p); \ - xor_speed(&xor_block_32regs); \ - xor_speed(&xor_block_32regs_p); \ + if (!cpu_has_hypervisor)\ + break; \ } \ + xor_speed(&xor_block_8regs);\ + xor_speed(&xor_block_8regs_p); \ + xor_speed(&xor_block_32regs); \ + xor_speed(&xor_block_32regs_p); \ } while (0) #endif /* _ASM_X86_XOR_32_H */ --- 3.7-rc3-x86-xor.orig/arch/x86/include/asm/xor_64.h +++ 3.7-rc3-x86-xor/arch/x86/include/asm/xor_64.h @@ -9,10 +9,6 @@ static struct xor_block_template xor_blo .do_5 = xor_sse_5, }; - -/* Also try the AVX routines */ -#include - /* We force the use of the SSE xor block because it can write around L2. We may also be able to load into the L1 only depending on how the cpu deals with a load to a line that is being prefetched. */ @@ -22,6 +18,12 @@ do { \ AVX_XOR_SPEED; \ xor_speed(&xor_block_sse_pf64); \ xor_speed(&xor_block_sse); \ + if (cpu_has_hypervisor) { \ + xor_speed(&xor_block_8regs);\ + xor_speed(&xor_block_8regs_p); \ + xor_speed(&xor_block_32regs); \ + xor_speed(&xor_block_32regs_p); \ + } \ } while (0) #endif /* _ASM_X86_XOR_64_H */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/2] xen-pciback: parsing improvements
1: simplify and tighten parsing of device IDs 2: reject out of range inputs Signed-off-by: Jan Beulich -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] xen-pciback: simplify and tighten parsing of device IDs
Now that at least one of the conformance problems of the kernel's sscanf() was addressed (commit da99075c1d368315e1508b6143226c0d27b621e0), we can improve the parsing done in xen-pciback both in terms of code readability and correctness (in particular properly rejecting input strings not well formed). Signed-off-by: Jan Beulich --- drivers/xen/xen-pciback/pci_stub.c | 83 + 1 file changed, 39 insertions(+), 44 deletions(-) --- 3.7-rc3-xen-pciback.orig/drivers/xen/xen-pciback/pci_stub.c +++ 3.7-rc3-xen-pciback/drivers/xen/xen-pciback/pci_stub.c @@ -897,42 +897,35 @@ static struct pci_driver xen_pcibk_pci_d static inline int str_to_slot(const char *buf, int *domain, int *bus, int *slot, int *func) { - int err; - char wc = '*'; + int parsed = 0; - err = sscanf(buf, " %x:%x:%x.%x", domain, bus, slot, func); - switch (err) { + switch (sscanf(buf, " %x:%x:%x.%x %n", domain, bus, slot, func, + &parsed)) { case 3: *func = -1; - err = sscanf(buf, " %x:%x:%x.%c", domain, bus, slot, &wc); + sscanf(buf, " %x:%x:%x.* %n", domain, bus, slot, &parsed); break; case 2: *slot = *func = -1; - err = sscanf(buf, " %x:%x:*.%c", domain, bus, &wc); - if (err >= 2) - ++err; + sscanf(buf, " %x:%x:*.* %n", domain, bus, &parsed); break; } - if (err == 4 && wc == '*') + if (parsed && !buf[parsed]) return 0; - else if (err < 0) - return -EINVAL; /* try again without domain */ *domain = 0; - wc = '*'; - err = sscanf(buf, " %x:%x.%x", bus, slot, func); - switch (err) { + switch (sscanf(buf, " %x:%x.%x %n", bus, slot, func, &parsed)) { case 2: *func = -1; - err = sscanf(buf, " %x:%x.%c", bus, slot, &wc); + sscanf(buf, " %x:%x.* %n", bus, slot, &parsed); break; case 1: *slot = *func = -1; - err = sscanf(buf, " %x:*.%c", bus, &wc) + 1; + sscanf(buf, " %x:*.* %n", bus, &parsed); break; } - if (err == 3 && wc == '*') + if (parsed && !buf[parsed]) return 0; return -EINVAL; @@ -941,13 +934,20 @@ static inline int str_to_slot(const char static inline int str_to_quirk(const char *buf, int *domain, int *bus, int *slot, int *func, int *reg, int *size, int *mask) { - int err; + int parsed = 0; + + sscanf(buf, " %4x:%2x:%2x.%d-%8x:%1x:%8x %n", domain, bus, slot, func, + reg, size, mask, &parsed); + if (parsed && !buf[parsed]) + return 0; - err = - sscanf(buf, " %04x:%02x:%02x.%d-%08x:%1x:%08x", domain, bus, slot, - func, reg, size, mask); - if (err == 7) + /* try again without domain */ + *domain = 0; + sscanf(buf, " %2x:%2x.%d-%8x:%1x:%8x %n", bus, slot, func, reg, size, + mask, &parsed); + if (parsed && !buf[parsed]) return 0; + return -EINVAL; } @@ -1339,8 +1339,6 @@ static int __init pcistub_init(void) if (pci_devs_to_hide && *pci_devs_to_hide) { do { - char wc = '*'; - parsed = 0; err = sscanf(pci_devs_to_hide + pos, @@ -1349,51 +1347,48 @@ static int __init pcistub_init(void) switch (err) { case 3: func = -1; - err = sscanf(pci_devs_to_hide + pos, -" (%x:%x:%x.%c) %n", -&domain, &bus, &slot, &wc, -&parsed); + sscanf(pci_devs_to_hide + pos, + " (%x:%x:%x.*) %n", + &domain, &bus, &slot, &parsed); break; case 2: slot = func = -1; - err = sscanf(pci_devs_to_hide + pos, -" (%x:%x:*.%c) %n", -&domain, &bus, &wc, &parsed) + 1; +
[PATCH 2/2] xen-pciback: reject out of range inputs
This add checks for out of range numbers (including in cases where the folding of slot and function into a single value could yield false matches). It also removes the bogus field width restrictions in str_to_quirk() - nowhere else in the driver this is being done, and hence this function could reject input the equivalent of which would be happily accepted in other places (in particular, "0x" prefixes causing the effective width of the actual number to be either zero or less than what would be required to cover the full range of valid values). Note that for the moment this second part is cosmetic only, as the kernel's sscanf() currently ignores the field widths, but a patch to overcome this is on its way. Signed-off-by: Jan Beulich --- drivers/xen/xen-pciback/pci_stub.c | 39 + 1 file changed, 27 insertions(+), 12 deletions(-) --- 3.7-rc3-xen-pciback.orig/drivers/xen/xen-pciback/pci_stub.c +++ 3.7-rc3-xen-pciback/drivers/xen/xen-pciback/pci_stub.c @@ -142,7 +142,8 @@ static struct pcistub_device *pcistub_de if (psdev->dev != NULL && domain == pci_domain_nr(psdev->dev->bus) && bus == psdev->dev->bus->number - && PCI_DEVFN(slot, func) == psdev->dev->devfn) { + && slot == PCI_SLOT(psdev->dev->devfn) + && func == PCI_FUNC(psdev->dev->devfn)) { pcistub_device_get(psdev); goto out; } @@ -191,7 +192,8 @@ struct pci_dev *pcistub_get_pci_dev_by_s if (psdev->dev != NULL && domain == pci_domain_nr(psdev->dev->bus) && bus == psdev->dev->bus->number - && PCI_DEVFN(slot, func) == psdev->dev->devfn) { + && slot == PCI_SLOT(psdev->dev->devfn) + && func == PCI_FUNC(psdev->dev->devfn)) { found_dev = pcistub_device_get_pci_dev(pdev, psdev); break; } @@ -936,14 +938,14 @@ static inline int str_to_quirk(const cha { int parsed = 0; - sscanf(buf, " %4x:%2x:%2x.%d-%8x:%1x:%8x %n", domain, bus, slot, func, + sscanf(buf, " %x:%x:%x.%x-%x:%x:%x %n", domain, bus, slot, func, reg, size, mask, &parsed); if (parsed && !buf[parsed]) return 0; /* try again without domain */ *domain = 0; - sscanf(buf, " %2x:%2x.%d-%8x:%1x:%8x %n", bus, slot, func, reg, size, + sscanf(buf, " %x:%x.%x-%x:%x:%x %n", bus, slot, func, reg, size, mask, &parsed); if (parsed && !buf[parsed]) return 0; @@ -955,7 +957,7 @@ static int pcistub_device_id_add(int dom { struct pcistub_device_id *pci_dev_id; unsigned long flags; - int rc = 0; + int rc = 0, devfn = PCI_DEVFN(slot, func); if (slot < 0) { for (slot = 0; !rc && slot < 32; ++slot) @@ -969,13 +971,24 @@ static int pcistub_device_id_add(int dom return rc; } + if (( +#if !defined(MODULE) /* pci_domains_supported is not being exported */ \ +|| !defined(CONFIG_PCI_DOMAINS) +!pci_domains_supported ? domain : +#endif +domain < 0 || domain > 0x) + || bus < 0 || bus > 0xff + || PCI_SLOT(devfn) != slot + || PCI_FUNC(devfn) != func) + return -EINVAL; + pci_dev_id = kmalloc(sizeof(*pci_dev_id), GFP_KERNEL); if (!pci_dev_id) return -ENOMEM; pci_dev_id->domain = domain; pci_dev_id->bus = bus; - pci_dev_id->devfn = PCI_DEVFN(slot, func); + pci_dev_id->devfn = devfn; pr_debug(DRV_NAME ": wants to seize %04x:%02x:%02x.%d\n", domain, bus, slot, func); @@ -1016,14 +1029,18 @@ static int pcistub_device_id_remove(int return err; } -static int pcistub_reg_add(int domain, int bus, int slot, int func, int reg, - int size, int mask) +static int pcistub_reg_add(int domain, int bus, int slot, int func, + unsigned int reg, unsigned int size, + unsigned int mask) { int err = 0; struct pcistub_device *psdev; struct pci_dev *dev; struct config_field *field; + if (reg > 0xfff || (size < 4 && (mask >> (size * 8 + return -EINVAL; + psdev = pcistub_device_find(domain, bus, slot, func); if (!psdev) { err = -ENODEV; @@ -1254,13 +1271,11 @@ static ssize_t permissive_add(struct dev int err;
[PATCH] sscanf: don't ignore field widths for numeric conversions
This is another step towards better standard conformance. Rather than adding a local buffer to store the specified portion of the string (with the need to enforce an arbitrary maximum supported width to limit the buffer size), do a maximum width conversion and then drop as much of it as is necessary to meet the caller's request. Also fail on negative field widths. Signed-off-by: Jan Beulich --- lib/vsprintf.c | 96 +++-- 1 file changed, 53 insertions(+), 43 deletions(-) --- 3.7-rc3/lib/vsprintf.c +++ 3.7-rc3-sscanf-field-width/lib/vsprintf.c @@ -23,12 +23,12 @@ #include #include #include +#include #include #include #include #include /* for PAGE_SIZE */ -#include #include /* for dereference_function_descriptor() */ #include "kstrtox.h" @@ -2013,7 +2013,11 @@ int vsscanf(const char *buf, const char char digit; int num = 0; u8 qualifier; - u8 base; + unsigned int base; + union { + long long s; + unsigned long long u; + } val; s16 field_width; bool is_sign; @@ -2053,8 +2057,11 @@ int vsscanf(const char *buf, const char /* get field width */ field_width = -1; - if (isdigit(*fmt)) + if (isdigit(*fmt)) { field_width = skip_atoi(&fmt); + if (field_width <= 0) + break; + } /* get conversion qualifier */ qualifier = -1; @@ -2154,58 +2161,61 @@ int vsscanf(const char *buf, const char || (base == 0 && !isdigit(digit))) break; + if (is_sign) + val.s = qualifier != 'L' ? + simple_strtol(str, &next, base) : + simple_strtoll(str, &next, base); + else + val.u = qualifier != 'L' ? + simple_strtoul(str, &next, base) : + simple_strtoull(str, &next, base); + + if (field_width > 0 && next - str > field_width) { + if (base == 0) + _parse_integer_fixup_radix(str, &base); + while (next - str > field_width) { + if (is_sign) + val.s = div_s64(val.s, base); + else + val.u = div_u64(val.u, base); + --next; + } + } + switch (qualifier) { case 'H': /* that's 'hh' in format */ - if (is_sign) { - signed char *s = (signed char *)va_arg(args, signed char *); - *s = (signed char)simple_strtol(str, &next, base); - } else { - unsigned char *s = (unsigned char *)va_arg(args, unsigned char *); - *s = (unsigned char)simple_strtoul(str, &next, base); - } + if (is_sign) + *va_arg(args, signed char *) = val.s; + else + *va_arg(args, unsigned char *) = val.u; break; case 'h': - if (is_sign) { - short *s = (short *)va_arg(args, short *); - *s = (short)simple_strtol(str, &next, base); - } else { - unsigned short *s = (unsigned short *)va_arg(args, unsigned short *); - *s = (unsigned short)simple_strtoul(str, &next, base); - } + if (is_sign) + *va_arg(args, short *) = val.s; + else + *va_arg(args, unsigned short *) = val.u; break; case 'l': - if (is_sign) { - long *l = (long *)va_arg(args, long *); - *l = simple_strtol(str, &next, base); - } else { - unsigned long *l = (unsigned long *)va_arg(args, unsigned long *); - *l = simple_strtoul(str, &next, base); - } + if (is_sign) + *va_arg(args, long *) = val.s; + else + *va_arg(args, un
[PATCH] utilize _Static_assert() for BUILD_BUG_ON() when the compiler supports it
This makes the resulting diagnostics quite a bit more useful. Signed-off-by: Jan Beulich --- include/linux/bug.h | 16 1 file changed, 16 insertions(+) --- 3.7-rc3/include/linux/bug.h +++ 3.7-rc3-static-assert/include/linux/bug.h @@ -27,8 +27,15 @@ struct pt_regs; result (of value 0 and type size_t), so the expression can be used e.g. in a structure initializer (or where-ever else comma expressions aren't permitted). */ +#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) +#define BUILD_BUG_ON_ZERO(e) \ + sizeof(struct { _Static_assert(!(e), "!(" #e ")"); }) +#define BUILD_BUG_ON_NULL(e) \ + ((void *)sizeof(struct { _Static_assert(!(e), "!(" #e ")"); })) +#else #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); })) #define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); })) +#endif /* * BUILD_BUG_ON_INVALID() permits the compiler to check the validity of the @@ -54,6 +61,15 @@ struct pt_regs; */ #ifndef __OPTIMIZE__ #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)])) +#elif __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3) +#define __build_bug_on_failed(n) __build_bug_on_##n##_failed +#define _BUILD_BUG_ON(n, condition)\ + do {\ + extern void __compiletime_error(#condition) \ + __build_bug_on_failed(n)(void); \ + if (condition) __build_bug_on_failed(n)(); \ + } while(0) +#define BUILD_BUG_ON(condition...) _BUILD_BUG_ON(__COUNTER__, ##condition) #else extern int __build_bug_on_failed; #define BUILD_BUG_ON(condition)\ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] xen/hypercall: fix hypercall fallback code for very old hypervisors
>>> On 02.11.12 at 17:44, Konrad Rzeszutek Wilk wrote: > --- a/arch/x86/include/asm/xen/hypercall.h > +++ b/arch/x86/include/asm/xen/hypercall.h > @@ -359,18 +359,14 @@ HYPERVISOR_update_va_mapping(unsigned long va, pte_t > new_val, > return _hypercall4(int, update_va_mapping, va, > new_val.pte, new_val.pte >> 32, flags); > } > +extern int __must_check xen_HYPERVISOR_event_channel_op_compat(int, void *); Why don't you drop the HYPERVISOR_ now that you added the xen_? >... > +EXPORT_SYMBOL_GPL(xen_HYPERVISOR_event_channel_op_compat); While this export is necessary, ... >... > +EXPORT_SYMBOL_GPL(xen_HYPERVISOR_physdev_op_compat); ... I would recommend not adding this one without need. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Lib:The patch include fix for "-" during string to long conversion
>>> On 04.11.12 at 14:26, Namit Gupta wrote: > API simple_strtol() and simple_strtoul() are giving incorrect result for > string "-". > These API's consider "-" as a '-' (negative integer) and return incorrect > string pointer. > The API returns pointer next to '-' character. However it should return > starting pointer of the string. Where is that behavior specified? If we take the C standard as reference, it is not being made explicit whether, for "base" other than zero, the sequence of letters and digits has to be non-empty. If we take this as implied, then your patch should not only handle "-" alone, but also cases where "-" is followed by other than a letter or digit, or an out of range one. Jan > Below I have included possible solution for that issue. > Please review. > > > From eea8b5fd7e5bb7b206a41f5eec934152a77a9431 Mon Sep 17 00:00:00 2001 > From: Namit Gupta > Date: Sun, 4 Nov 2012 23:36:23 +0530 > Subject: [PATCH 1/1] Lib:The patch include fix for "-" during string to > long conversion > > API simple_strtol and simple_strtoll give wrong result for string "-", > The API consider "-" as a -ve number which and give incorrect result > for "-" string. > > Signed-off-by: Namit Gupta > --- > lib/vsprintf.c | 20 > 1 files changed, 16 insertions(+), 4 deletions(-) > > diff --git a/lib/vsprintf.c b/lib/vsprintf.c > index 39c99fe..cde449d 100644 > --- a/lib/vsprintf.c > +++ b/lib/vsprintf.c > @@ -76,8 +76,14 @@ EXPORT_SYMBOL(simple_strtoul); > */ > long simple_strtol(const char *cp, char **endp, unsigned int base) > { > -if (*cp == '-') > -return -simple_strtoul(cp + 1, endp, base); > +if (*cp == '-') { > +if (*(cp+1)) > +return -simple_strtoul(cp + 1, endp, base); > +else if (*endp) { /* The string contains only "-"*/ > +*endp = (char *)cp; > +return 0; > +} > +} > > return simple_strtoul(cp, endp, base); > } > @@ -91,8 +97,14 @@ EXPORT_SYMBOL(simple_strtol); > */ > long long simple_strtoll(const char *cp, char **endp, unsigned int base) > { > -if (*cp == '-') > -return -simple_strtoull(cp + 1, endp, base); > +if (*cp == '-') { > +if (*(cp+1)) > +return -simple_strtoull(cp + 1, endp, base); > +else if (*endp) { /* The string contains only "-"*/ > +*endp = (char *)cp; > +return 0; > +} > +} > > return simple_strtoull(cp, endp, base); > } > -- > 1.7.1 > > > Regards, > Namit -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [Xen-devel] [PATCH 1/2] Xen acpi pad implement
>>> On 03.11.12 at 13:16, "Liu, Jinsong" wrote: > Konrad Rzeszutek Wilk wrote: From f514b97628945cfac00efb0d456f133d44754c9d Mon Sep 17 00:00:00 2001 >>> From: Liu, Jinsong >>> Date: Thu, 1 Nov 2012 21:02:36 +0800 >>> Subject: [PATCH 1/2] Xen acpi pad implement >>> >>> PAD is acpi Processor Aggregator Device which provides a control >>> point >>> that enables the platform to perform specific processor configuration >>> and control that applies to all processors in the platform. >>> >>> This patch is to implement Xen acpi pad logic. When running under Xen >>> virt platform, native pad driver would not work. Instead Xen pad >>> driver, >>> a self-contained and very thin logic level, would take over acpi pad >>> staff. When acpi pad notify OSPM, xen pad logic intercept and parse >>> _PUR object >>> and then hypercall to hyervisor for the rest work, say, core parking. >> >> Two comments: >> - Did you look at the SuSE tree? Jan mentioned that they did some >>fixes? Did you carry them over? > > I didn't look at Suse tree. > > Jan, would you please tell me where can I pull Suse tree? http://kernel.opensuse.org/git > have you > implemented xen-acpi-pad logic at Suse dom0 and which points you have fixed? Yes, we've been carrying this for quite some time. I don't recall the specific cleanups to the code I did, so I can only refer you to the trees above. Jan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/