Re: [PATCH] linux-user: Improve strace output for read() and getcwd()

2019-11-26 Thread Laurent Vivier
Le 25/11/2019 à 21:54, Aleksandar Markovic a écrit :
>>> 4. NEW PROPOSAL :
>>>
>>> early: read(3,0xff80038c,512)
>>> late: = 512  
>>> [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)]
>>> early: getcwd(0x18180,4096)
>>> late: = 9  [("/usr/bin",4096)]
>>>
>>> In other words, we would print for (selected) system calls content of
>>> the parameters both before and after actual ioctl
>>> translation/execution.
>> I don't like this.
>> IMHO it just adds unnecessary noise and differs from what a native strace 
>> shows.
>>
> QEMU strace doesn't have to be the same as native strace. Actually,
> they could be used even at the same time: QEMU provides info on target
> syscall, strace on host syscall. Do not forget that not all syscalls
> are implemeted as one-to-one-correspondence between target and host
> syscalls - combination of QEMU strace and native strace is precious at
> some moments.
> 
> In any case, I think that requiring QEMU strace to look exactly like
> native strace is a self-imposed limitation, that does not have enough
> justification. Think of it, wouldn't it be nice to surpass strace,
> like in these cases above?

I'd like to have a QEMU_STRACE formatting similar to the host strace
formatting because it makes debugging easier: we can run the same
program in guest and host and compare the strace output to see where it
differs and thus where the problem appears.

Thanks,
Laurent



Re: [PATCH v2 2/2] s390x/cpumodel: Introduce dynamic feature groups

2019-11-26 Thread David Hildenbrand



> Am 26.11.2019 um 08:54 schrieb Christian Borntraeger :
> 
> 
> 
>> On 25.11.19 18:20, David Hildenbrand wrote:
>> 
>> As soon as dynamic feature groups are used, the CPU model becomes
>> migration-unsafe. Upper layers can expand these models to migration-safe
>> and static variants, allowing them to be migrated.
> 
> I really dislike that. I am trying to get rid of the unsafe variants (e.g. now
> defaulting to host-model instead of host-passthrough). I do not want to give
> users new ways of hurting themselves.
> 

Please note that this is just on the bare command line. Libvirt and friends 
will expand the model and have proper migration in place. What exactly is your 
concern in that regard?

> Unless I misunderstood Eduardo, I think his versioning approach is actually 
> better
> in regard to migration, no?
> I z terms, you can still say -cpu z13  which is just an alias to z13v1 z13v2 
> etc.
> Assuming that the version is checked this will be safe.
> 

It‘s even worse AFAIKS. A „-cpu z13“ would dynamically map to whatever is best 
on the host.




[PATCH] throttle-groups: fix memory leak in throttle_group_set_limits

2019-11-26 Thread pannengyuan
From: PanNengyuan 

This avoid a memory leak when qom-set is called to set throttle_group
limits, here is an easy way to reproduce:

1. run qemu-iotests as follow and check the result with asan:
   ./check -qcow2 184

Following is the asan output backtrack:
Direct leak of 912 byte(s) in 3 object(s) allocated from:
#0 0x8d7ab3c3 in __interceptor_calloc(/lib64/libasan.so.4+0xd33c3)
#1 0x8d4c31cb in g_malloc0 (/lib64/libglib-2.0.so.0+0x571cb)
#2 0x190c857 in qobject_input_start_struct  
/mnt/sdc/qemu-master/qemu-4.2.0-rc0/qapi/qobject-input-visitor.c:295
#3 0x19070df in visit_start_struct 
/mnt/sdc/qemu-master/qemu-4.2.0-rc0/qapi/qapi-visit-core.c:49
#4 0x1948b87 in visit_type_ThrottleLimits  qapi/qapi-visit-block-core.c:3759
#5 0x17e4aa3 in throttle_group_set_limits  
/mnt/sdc/qemu-master/qemu-4.2.0-rc0/block/throttle-groups.c:900
#6 0x1650eff in object_property_set  
/mnt/sdc/qemu-master/qemu-4.2.0-rc0/qom/object.c:1272
#7 0x1658517 in object_property_set_qobject  
/mnt/sdc/qemu-master/qemu-4.2.0-rc0/qom/qom-qobject.c:26
#8 0x15880bb in qmp_qom_set 
/mnt/sdc/qemu-master/qemu-4.2.0-rc0/qom/qom-qmp-cmds.c:74
#9 0x157e3e3 in qmp_marshal_qom_set  qapi/qapi-commands-qom.c:154

Reported-by: Euler Robot 
Signed-off-by: PanNengyuan 
---
 block/throttle-groups.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 77014c7..88418e6 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -912,6 +912,7 @@ static void throttle_group_set_limits(Object *obj, Visitor 
*v,
 unlock:
 qemu_mutex_unlock(&tg->lock);
 ret:
+qapi_free_ThrottleLimits(argp);
 error_propagate(errp, local_err);
 return;
 }
-- 
2.7.2.windows.1





Re: [PATCH for-5.0 v4 0/5] iotests: Test failing mirror complete

2019-11-26 Thread Max Reitz
On 08.11.19 13:34, Max Reitz wrote:
> Hi,
> 
> v3 of this series was this:
> 
> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg00868.html
> 
> In the meantime, I’ve merged the first patch, so the subject of the
> series has changed.
> 
> In v4, I’ve tried to address Vladimir’s concern of how to map QAPI
> BlockPermission values to BLK_PERM_* flags with the new patches 1 and 2,
> and a minor change to patch 3:
> - Patch 1: Added, this new function will translate QAPI BlockPermission
>values to BLK_PERM_* flags
> - Patch 2: Added; we can make use of the new function in
>xdbg_graph_add_edge() to save some LoC
> - Patch 3:
>   - %s/4.2/5.0/
>   - Let blkdebug_parse_perm_list() initialize *dest
>   - Use bdrv_qapi_perm_to_blk_perm() to translate QAPI BlockPermission
> values instead of blindly assuming an x -> 2^x mapping

Thanks for the review, applied to my block-next branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block-next

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 12/20] nvme: bump supported specification version to 1.3

2019-11-26 Thread Klaus Birkelund
On Mon, Nov 25, 2019 at 12:13:15PM +, Beata Michalska wrote:
> On Mon, 18 Nov 2019 at 09:48, Klaus Birkelund  wrote:
> >
> > On Tue, Nov 12, 2019 at 03:05:06PM +, Beata Michalska wrote:
> > > Hi Klaus,
> > >
> > > On Tue, 15 Oct 2019 at 11:52, Klaus Jensen  wrote:
> > > >
> > > > +static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeCmd *c)
> > > > +{
> > > > +static const int len = 4096;
> > > > +
> > > > +struct ns_descr {
> > > > +uint8_t nidt;
> > > > +uint8_t nidl;
> > > > +uint8_t rsvd2[2];
> > > > +uint8_t nid[16];
> > > > +};
> > > > +
> > > > +uint32_t nsid = le32_to_cpu(c->nsid);
> > > > +uint64_t prp1 = le64_to_cpu(c->prp1);
> > > > +uint64_t prp2 = le64_to_cpu(c->prp2);
> > > > +
> > > > +struct ns_descr *list;
> > > > +uint16_t ret;
> > > > +
> > > > +trace_nvme_identify_ns_descr_list(nsid);
> > > > +
> > > > +if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> > > > +trace_nvme_err_invalid_ns(nsid, n->num_namespaces);
> > > > +return NVME_INVALID_NSID | NVME_DNR;
> > > > +}
> > > > +
> > > In theory this should abort the command for inactive NSIDs as well.
> > > But I guess this will come later on.
> > >
> >
> > At this point in the series, the device does not support multiple
> > namespaces anyway and num_namespaces is always 1. But this has also been
> > reported seperately in relation the patch adding multiple namespaces and
> > is fixed in v3.
> >
> > > > +list = g_malloc0(len);
> > > > +list->nidt = 0x3;
> > > > +list->nidl = 0x10;
> > > > +*(uint32_t *) &list->nid[12] = cpu_to_be32(nsid);
> > > > +
> > > Might be worth to add some comment here -> as per the NGUID/EUI64 format.
> > > Also those are not specified currently in the namespace identity data 
> > > structure.
> > >
> >
> > I'll add a comment for why the Namespace UUID is set to this value here.
> > The NGUID/EUI64 fields are not set in the namespace identity data
> > structure as they are not required. See the descriptions of NGUID and
> > EUI64. Here for NGUID:
> >
> > "The controller shall specify a globally unique namespace identifier
> > in this field, the EUI64 field, or a Namespace UUID in the Namespace
> > Identification Descriptor..."
> >
> > Here, I chose to provide it in the Namespace Identification Descriptor
> > (by setting `list->nidt = 0x3`).
> >
> > > > +ret = nvme_dma_read_prp(n, (uint8_t *) list, len, prp1, prp2);
> > > > +g_free(list);
> > > > +return ret;
> > > > +}
> > > > +
> > > >  static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
> > > >  {
> > > >  NvmeIdentify *c = (NvmeIdentify *)cmd;
> > > > @@ -934,7 +978,9 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd 
> > > > *cmd)
> > > >  case 0x01:
> > > >  return nvme_identify_ctrl(n, c);
> > > >  case 0x02:
> > > > -return nvme_identify_nslist(n, c);
> > > > +return nvme_identify_ns_list(n, c);
> > > > +case 0x03:
> > > > +return nvme_identify_ns_descr_list(n, cmd);
> > > >  default:
> > > >  trace_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
> > > >  return NVME_INVALID_FIELD | NVME_DNR;
> > > > @@ -1101,6 +1147,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, 
> > > > NvmeCmd *cmd, NvmeRequest *req)
> > > >  blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
> > > >  break;
> > > >  case NVME_NUMBER_OF_QUEUES:
> > > > +if (n->qs_created > 2) {
> > > > +return NVME_CMD_SEQ_ERROR | NVME_DNR;
> > > > +}
> > > > +
> > > I am not sure this is entirely correct as the spec says:
> > > "if any I/O Submission and/or Completion Queues (...)"
> > > so it might be enough to have a single queue created
> > > for this command to be valid.
> > > Also I think that the condition here is to make sure that the number
> > > of queues requested is being set once at init phase. Currently this will
> > > allow the setting to happen if there is no active queue -> so at any
> > > point of time (provided the condition mentioned). I might be wrong here
> > > but it seems that what we need is a single status saying any queue
> > > has been created prior to the Set Feature command at all
> > >
> >
> > Internally, the admin queue pair is counted in qs_created, which is the
> > reason for checking if is above 2. The admin queues are created when the
> > controller is enabled (mmio write to the EN register in CC).
> >
> > I'll add a comment about that - I see why it is unclear.
> >
> 
> Ok, so indeed I have missed the fact that the admin queues are being tracked 
> by
> 'qs_created'. Still, I might be wrong, but, it is enough to create I/O
> submission queue and delete it and the code will allow the command to proceed
> Whereas the spec says :
> "If a Set Features command is issued for this feature after creation of
> any I/O Submission and/or I/O Completion Queues, then the Set Features
> command sh

Re: [PATCH 2/4] ich9: fix getter type for sci_int property

2019-11-26 Thread Marc-André Lureau
On Mon, Nov 25, 2019 at 7:37 PM Felipe Franciosi  wrote:
>
> When QOM APIs were added to ich9 in 6f1426ab, the getter for sci_int was
> written using uint32_t. However, the object property is uint8_t. This
> fixes the getter for correctness.
>
> Signed-off-by: Felipe Franciosi 

Good catch! (I have a few like that in a series pending. This one I didn't spot)

Reviewed-by: Marc-André Lureau 


thanks


> ---
>  hw/isa/lpc_ich9.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index ce3342..240979885d 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -631,9 +631,9 @@ static void ich9_lpc_get_sci_int(Object *obj, Visitor *v, 
> const char *name,
>   void *opaque, Error **errp)
>  {
>  ICH9LPCState *lpc = ICH9_LPC_DEVICE(obj);
> -uint32_t value = lpc->sci_gsi;
> +uint8_t value = lpc->sci_gsi;
>
> -visit_type_uint32(v, name, &value, errp);
> +visit_type_uint8(v, name, &value, errp);
>  }
>
>  static void ich9_lpc_add_properties(ICH9LPCState *lpc)
> @@ -641,7 +641,7 @@ static void ich9_lpc_add_properties(ICH9LPCState *lpc)
>  static const uint8_t acpi_enable_cmd = ICH9_APM_ACPI_ENABLE;
>  static const uint8_t acpi_disable_cmd = ICH9_APM_ACPI_DISABLE;
>
> -object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, "uint32",
> +object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, "uint8",
>  ich9_lpc_get_sci_int,
>  NULL, NULL, NULL, NULL);
>  object_property_add_uint8_ptr(OBJECT(lpc), ACPI_PM_PROP_ACPI_ENABLE_CMD,
> --
> 2.20.1
>


--
Marc-André Lureau



Re: [PATCH 1/4] qom/object: enable setter for uint types

2019-11-26 Thread Marc-André Lureau
Hi

On Mon, Nov 25, 2019 at 7:40 PM Felipe Franciosi  wrote:
>
> Traditionally, the uint-specific property helpers only offer getters.
> When adding object (or class) uint types, one must therefore use the
> generic property helper if a setter is needed.
>
> This enhances the uint-specific property helper APIs by adding a
> 'readonly' field and modifying all users of that API to set this
> parameter to true. If 'readonly' is false, though, the helper will add
> an automatic setter for the value.
>
> Signed-off-by: Felipe Franciosi 
> ---
>  hw/acpi/ich9.c   |   4 +-
>  hw/acpi/pcihp.c  |   6 +-
>  hw/acpi/piix4.c  |  12 ++--
>  hw/isa/lpc_ich9.c|   4 +-
>  hw/ppc/spapr_drc.c   |   2 +-
>  include/qom/object.h |  28 +---
>  qom/object.c | 152 ---
>  ui/console.c |   3 +-
>  8 files changed, 163 insertions(+), 48 deletions(-)
>
> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> index 2034dd749e..94dc5147ce 100644
> --- a/hw/acpi/ich9.c
> +++ b/hw/acpi/ich9.c
> @@ -454,12 +454,12 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs 
> *pm, Error **errp)
>  pm->s4_val = 2;
>
>  object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
> -   &pm->pm_io_base, errp);
> +   &pm->pm_io_base, true, errp);
>  object_property_add(obj, ACPI_PM_PROP_GPE0_BLK, "uint32",
>  ich9_pm_get_gpe0_blk,
>  NULL, NULL, pm, NULL);
>  object_property_add_uint32_ptr(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
> -   &gpe0_len, errp);
> +   &gpe0_len, true, errp);
>  object_property_add_bool(obj, "memory-hotplug-support",
>   ich9_pm_get_memory_hotplug_support,
>   ich9_pm_set_memory_hotplug_support,
> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> index 8413348a33..70bc1408e6 100644
> --- a/hw/acpi/pcihp.c
> +++ b/hw/acpi/pcihp.c
> @@ -80,7 +80,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
>
>  *bus_bsel = (*bsel_alloc)++;
>  object_property_add_uint32_ptr(OBJECT(bus), ACPI_PCIHP_PROP_BSEL,
> -   bus_bsel, &error_abort);
> +   bus_bsel, true, &error_abort);
>  }
>
>  return bsel_alloc;
> @@ -373,9 +373,9 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *s, 
> PCIBus *root_bus,
>  memory_region_add_subregion(address_space_io, s->io_base, &s->io);
>
>  object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_BASE_PROP, 
> &s->io_base,
> -   &error_abort);
> +   true, &error_abort);
>  object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_LEN_PROP, &s->io_len,
> -   &error_abort);
> +   true, &error_abort);
>  }
>
>  const VMStateDescription vmstate_acpi_pcihp_pci_status = {
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 93aec2dd2c..032ba11e62 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -443,17 +443,17 @@ static void piix4_pm_add_propeties(PIIX4PMState *s)
>  static const uint16_t sci_int = 9;
>
>  object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_ENABLE_CMD,
> -  &acpi_enable_cmd, NULL);
> +  &acpi_enable_cmd, true, NULL);
>  object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_DISABLE_CMD,
> -  &acpi_disable_cmd, NULL);
> +  &acpi_disable_cmd, true, NULL);
>  object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK,
> -  &gpe0_blk, NULL);
> +  &gpe0_blk, true, NULL);
>  object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK_LEN,
> -  &gpe0_blk_len, NULL);
> +  &gpe0_blk_len, true, NULL);
>  object_property_add_uint16_ptr(OBJECT(s), ACPI_PM_PROP_SCI_INT,
> -  &sci_int, NULL);
> +  &sci_int, true, NULL);
>  object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_PM_IO_BASE,
> -  &s->io_base, NULL);
> +  &s->io_base, true, NULL);
>  }
>
>  static void piix4_pm_realize(PCIDevice *dev, Error **errp)
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index 17c292e306..ce3342 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -645,9 +645,9 @@ static void ich9_lpc_add_properties(ICH9LPCState *lpc)
>  ich9_lpc_get_sci_int,
>  NULL, NULL, NULL, NULL);
>  object_property_add_uint8_ptr(OBJECT(lpc), ACPI_PM_PROP_ACPI_ENABLE_CMD,
> -  

Re: [PATCH 3/4] ich9: Simplify ich9_lpc_initfn

2019-11-26 Thread Marc-André Lureau
On Mon, Nov 25, 2019 at 7:37 PM Felipe Franciosi  wrote:
>
> Currently, ich9_lpc_initfn simply serves as a caller to
> ich9_lpc_add_properties. This simplifies the code a bit by eliminating
> ich9_lpc_add_properties altogether and executing its logic in the parent
> object initialiser function.
>
> Signed-off-by: Felipe Franciosi 

yep, /me like simpler code,

Reviewed-by: Marc-André Lureau 


> ---
>  hw/isa/lpc_ich9.c | 19 +++
>  1 file changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index 240979885d..232c7916f3 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -636,27 +636,22 @@ static void ich9_lpc_get_sci_int(Object *obj, Visitor 
> *v, const char *name,
>  visit_type_uint8(v, name, &value, errp);
>  }
>
> -static void ich9_lpc_add_properties(ICH9LPCState *lpc)
> +static void ich9_lpc_initfn(Object *obj)
>  {
> +ICH9LPCState *lpc = ICH9_LPC_DEVICE(obj);
> +
>  static const uint8_t acpi_enable_cmd = ICH9_APM_ACPI_ENABLE;
>  static const uint8_t acpi_disable_cmd = ICH9_APM_ACPI_DISABLE;
>
> -object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, "uint8",
> +object_property_add(obj, ACPI_PM_PROP_SCI_INT, "uint8",
>  ich9_lpc_get_sci_int,
>  NULL, NULL, NULL, NULL);
> -object_property_add_uint8_ptr(OBJECT(lpc), ACPI_PM_PROP_ACPI_ENABLE_CMD,
> +object_property_add_uint8_ptr(obj, ACPI_PM_PROP_ACPI_ENABLE_CMD,
>&acpi_enable_cmd, true, NULL);
> -object_property_add_uint8_ptr(OBJECT(lpc), ACPI_PM_PROP_ACPI_DISABLE_CMD,
> +object_property_add_uint8_ptr(obj, ACPI_PM_PROP_ACPI_DISABLE_CMD,
>&acpi_disable_cmd, true, NULL);
>
> -ich9_pm_add_properties(OBJECT(lpc), &lpc->pm, NULL);
> -}
> -
> -static void ich9_lpc_initfn(Object *obj)
> -{
> -ICH9LPCState *lpc = ICH9_LPC_DEVICE(obj);
> -
> -ich9_lpc_add_properties(lpc);
> +ich9_pm_add_properties(obj, &lpc->pm, NULL);
>  }
>
>  static void ich9_lpc_realize(PCIDevice *d, Error **errp)
> --
> 2.20.1
>


--
Marc-André Lureau



Re: [PATCH 4/4] qom/object: Use common get/set uint helpers

2019-11-26 Thread Marc-André Lureau
Hi

On Mon, Nov 25, 2019 at 7:37 PM Felipe Franciosi  wrote:
>
> Several objects implemented their own uint property getters and setters,
> despite them being straightforward (without any checks/validations on
> the values themselves) and identical across objects. This makes use of
> an enhanced API for object_property_add_uintXX_ptr() which offers
> default setters.
>
> Some of these setters used to update the value even if the type visit
> failed (eg. because the value being set overflowed over the given type).
> The new setter introduces a check for these errors, not updating the
> value if an error occurred. The error is propagated.
>
> Signed-off-by: Felipe Franciosi 


Some comments below, otherwise:
Reviewed-by: Marc-André Lureau 

> ---
>  hw/acpi/ich9.c   |  93 +++
>  hw/isa/lpc_ich9.c|  14 +-
>  hw/misc/edu.c|  12 +
>  hw/pci-host/q35.c|  14 ++
>  hw/ppc/spapr.c   |  17 ++--
>  hw/vfio/pci-quirks.c |  18 ++--
>  memory.c |  15 +--
>  target/arm/cpu.c |  21 +
>  target/i386/sev.c| 102 +++
>  9 files changed, 29 insertions(+), 277 deletions(-)
>
> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> index 94dc5147ce..aa3c7a59ab 100644
> --- a/hw/acpi/ich9.c
> +++ b/hw/acpi/ich9.c
> @@ -357,81 +357,6 @@ static void ich9_pm_set_cpu_hotplug_legacy(Object *obj, 
> bool value,
>  s->pm.cpu_hotplug_legacy = value;
>  }
>
> -static void ich9_pm_get_disable_s3(Object *obj, Visitor *v, const char *name,
> -   void *opaque, Error **errp)
> -{
> -ICH9LPCPMRegs *pm = opaque;
> -uint8_t value = pm->disable_s3;
> -
> -visit_type_uint8(v, name, &value, errp);
> -}
> -
> -static void ich9_pm_set_disable_s3(Object *obj, Visitor *v, const char *name,
> -   void *opaque, Error **errp)
> -{
> -ICH9LPCPMRegs *pm = opaque;
> -Error *local_err = NULL;
> -uint8_t value;
> -
> -visit_type_uint8(v, name, &value, &local_err);
> -if (local_err) {
> -goto out;
> -}
> -pm->disable_s3 = value;
> -out:
> -error_propagate(errp, local_err);
> -}
> -
> -static void ich9_pm_get_disable_s4(Object *obj, Visitor *v, const char *name,
> -   void *opaque, Error **errp)
> -{
> -ICH9LPCPMRegs *pm = opaque;
> -uint8_t value = pm->disable_s4;
> -
> -visit_type_uint8(v, name, &value, errp);
> -}
> -
> -static void ich9_pm_set_disable_s4(Object *obj, Visitor *v, const char *name,
> -   void *opaque, Error **errp)
> -{
> -ICH9LPCPMRegs *pm = opaque;
> -Error *local_err = NULL;
> -uint8_t value;
> -
> -visit_type_uint8(v, name, &value, &local_err);
> -if (local_err) {
> -goto out;
> -}
> -pm->disable_s4 = value;
> -out:
> -error_propagate(errp, local_err);
> -}
> -
> -static void ich9_pm_get_s4_val(Object *obj, Visitor *v, const char *name,
> -   void *opaque, Error **errp)
> -{
> -ICH9LPCPMRegs *pm = opaque;
> -uint8_t value = pm->s4_val;
> -
> -visit_type_uint8(v, name, &value, errp);
> -}
> -
> -static void ich9_pm_set_s4_val(Object *obj, Visitor *v, const char *name,
> -   void *opaque, Error **errp)
> -{
> -ICH9LPCPMRegs *pm = opaque;
> -Error *local_err = NULL;
> -uint8_t value;
> -
> -visit_type_uint8(v, name, &value, &local_err);
> -if (local_err) {
> -goto out;
> -}
> -pm->s4_val = value;
> -out:
> -error_propagate(errp, local_err);
> -}
> -
>  static bool ich9_pm_get_enable_tco(Object *obj, Error **errp)
>  {
>  ICH9LPCState *s = ICH9_LPC_DEVICE(obj);
> @@ -468,18 +393,12 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs 
> *pm, Error **errp)
>   ich9_pm_get_cpu_hotplug_legacy,
>   ich9_pm_set_cpu_hotplug_legacy,
>   NULL);
> -object_property_add(obj, ACPI_PM_PROP_S3_DISABLED, "uint8",
> -ich9_pm_get_disable_s3,
> -ich9_pm_set_disable_s3,
> -NULL, pm, NULL);
> -object_property_add(obj, ACPI_PM_PROP_S4_DISABLED, "uint8",
> -ich9_pm_get_disable_s4,
> -ich9_pm_set_disable_s4,
> -NULL, pm, NULL);
> -object_property_add(obj, ACPI_PM_PROP_S4_VAL, "uint8",
> -ich9_pm_get_s4_val,
> -ich9_pm_set_s4_val,
> -NULL, pm, NULL);
> +object_property_add_uint8_ptr(obj, ACPI_PM_PROP_S3_DISABLED,
> +  &pm->disable_s3, false, errp);
> +object_property_add_uint8_ptr(obj, ACPI_PM_PROP_S4_DISABLED,
> +  &pm->disable_s4, false, errp);
> +object_property_add_uint8_ptr(obj, ACPI_PM_PROP_S4_

Re: [PATCH v2 15/20] nvme: add support for scatter gather lists

2019-11-26 Thread Klaus Birkelund
On Mon, Nov 25, 2019 at 02:10:37PM +, Beata Michalska wrote:
> On Mon, 25 Nov 2019 at 06:21, Klaus Birkelund  wrote:
> >
> > On Tue, Nov 12, 2019 at 03:25:18PM +, Beata Michalska wrote:
> > > Hi Klaus,
> > >
> > > On Tue, 15 Oct 2019 at 11:57, Klaus Jensen  wrote:
> > > >
> > > > +#define NVME_CMD_FLAGS_FUSE(flags) (flags & 0x3)
> > > > +#define NVME_CMD_FLAGS_PSDT(flags) ((flags >> 6) & 0x3)
> > > Minor: This one is slightly misleading - as per the naming and it's usage:
> > > the PSDT is a field name and as such does not imply using SGLs
> > > and it is being used to verify if given command is actually using
> > > SGLs.
> > >
> >
> > Ah, is this because I do
> >
> >   if (NVME_CMD_FLAGS_PSDT(cmd->flags)) {
> >
> > in the code? That is, just checks for it not being zero? The value of
> > the PRP or SGL for Data Transfer (PSDT) field *does* specify if the
> > command uses SGLs or not. 0x0: PRPs, 0x1 SGL for data, 0x10: SGLs for
> > both data and metadata. Would you prefer the condition was more
> > explicit?
> >
> Yeah, it is just not obvious( at least to me)  without referencing the spec
> that non-zero value implies SGL usage. Guess a comment would be helpful
> but that is not major.
> 
 
Nah. Thats a good point. I have changed it to use a switch on the value.
This technically also fixes a bug because the above would accept 0x3 as
a valid value and interpret it as SGL use.


Klaus



Re: [PATCH v17 02/14] util/cutils: Use qemu_strtold_finite to parse size

2019-11-26 Thread Tao Xu

On 11/25/2019 2:56 PM, Markus Armbruster wrote:

Tao Xu  writes:


Support full 64bit precision, modify related test cases.


That's not true in general: long double need not be any wider than
double.

It might be true on the host machines we support, but I don't know.  If
we decide to rely on it, we better make the build fail when the host
machine doesn't meet our expectations, preferably in configure.


[...]

-if ((val * mul >= 0xfc00) || val < 0) {
+/* Values > UINT64_MAX overflow uint64_t */
+if ((val * mul > UINT64_MAX) || val < 0) {
  retval = -ERANGE;
  goto out;
  }


Not portable.  If it was, we'd have made this changd long ago :)



OK. So the suitable solution is what you suggested in v14?

"A possible alternative is to parse the numeric part both as a double 
and as a 64 bit unsigned integer, then use whatever consumes more 
characters.  This enables providing full 64 bits unless you actually use

a fraction."

I will try this way.



Re: [PATCH 0/5] ARM virt: Add NVDIMM support

2019-11-26 Thread Igor Mammedov
On Mon, 25 Nov 2019 16:25:54 +
Shameerali Kolothum Thodi  wrote:

> Hi Igor,
> 
> > -Original Message-
> > From: Igor Mammedov [mailto:imamm...@redhat.com]
> > Sent: 25 November 2019 15:46
> > To: Shameerali Kolothum Thodi 
> > Cc: Auger Eric ; qemu-devel@nongnu.org;
> > qemu-...@nongnu.org; peter.mayd...@linaro.org;
> > shannon.zha...@gmail.com; xuwei (O) ;
> > ler...@redhat.com; Linuxarm 
> > Subject: Re: [PATCH 0/5] ARM virt: Add NVDIMM support
> > 
> > On Mon, 25 Nov 2019 13:20:02 +
> > Shameerali Kolothum Thodi  wrote:
> >   
> > > Hi Eric/Igor,
> > >  
> > > > -Original Message-
> > > > From: Shameerali Kolothum Thodi
> > > > Sent: 22 October 2019 15:05
> > > > To: 'Auger Eric' ; qemu-devel@nongnu.org;
> > > > qemu-...@nongnu.org; imamm...@redhat.com
> > > > Cc: peter.mayd...@linaro.org; shannon.zha...@gmail.com; xuwei (O)
> > > > ; ler...@redhat.com; Linuxarm
> > > > 
> > > > Subject: RE: [PATCH 0/5] ARM virt: Add NVDIMM support  
> > 
> > not related to problem discussed in this patch but you probably
> > need to update docs/specs/acpi_nvdimm.txt to account for your changes  
> 
> Ok.
> 
> > > >  
> > >
> > > [..]
> > >  
> > > > > one question: I noticed that when a NVDIMM slot is hotplugged one get
> > > > > the following trace on guest:
> > > > >
> > > > > nfit ACPI0012:00: found a zero length table '0' parsing nfit
> > > > > pmem0: detected capacity change from 0 to 1073741824
> > > > >
> > > > > Have you experienced the 0 length trace?  
> > > >
> > > > I double checked and yes that trace is there. And I did a quick check 
> > > > with
> > > > x86 and it is not there.
> > > >
> > > > The reason looks like, ARM64 kernel receives an additional 8 bytes size 
> > > >  
> > when  
> > > > the kernel evaluates the "_FIT" object.
> > > >
> > > > For the same test scenario, Qemu reports a FIT buffer size of 0xb8 and
> > > >
> > > > X86 Guest kernel,
> > > > [1.601077] acpi_nfit_init: data 0x8a273dc12b18 sz 0xb8
> > > >
> > > > ARM64 Guest,
> > > > [0.933133] acpi_nfit_init: data 0x3cbe6018 sz 0xc0
> > > >
> > > > I am not sure how that size gets changed for ARM which results in
> > > > the above mentioned 0 length trace. I need to debug this further.
> > > >
> > > > Please let me know if you have any pointers...  
> > >
> > > I spend some time debugging this further and it looks like the AML code
> > > behaves differently on x86 and ARM64.  
> > FIT table is built dynamically and you are the first to debug
> > such issue.
> > (apart from the author the NVDIMM code.  
> :)
> >  btw: why NVDIMM author is not on CC list???)  
> 
> Right. Missed that. CCd.
>  
> >   
> > > Booted guest with nvdimm mem, and used SSDT override with dbg prints
> > > added,
> > >
> > > -object memory-backend-ram,id=mem1,size=1G \
> > > -device nvdimm,id=dimm1,memdev=mem1 \
> > >
> > > On X86,
> > > ---
> > >
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: Read FIT: offset 0 FIT size 0xb8  
> > Dirty Yes.  
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: read_fit_out buf size 0xc0  
> > func_ret_status 0  
> > >
> > > [AML]"NVDIMM-NCAL: Rcvd RLEN 00C0"
> > > [AML]"NVDIMM-NCAL: Creating OBUF with 05E0 bits"
> > > [AML]"NVDIMM-NCAL: Created  BUF(Local7) size 00BC"
> > > [AML]"NVDIMM-RFIT Rcvd buf size 00BC"
> > > [AML]"NVDIMM-RFIT Created NVDR.RFIT.BUFF size 00B8"
> > > [AML]"NVDIMM-FIT: Rcvd buf size 00B8"
> > >
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: Read FIT: offset 0xb8 FIT size  
> > 0xb8 Dirty No.  
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: read_fit_out buf size 0x8  
> > func_ret_status 0  
> > >
> > > [AML]"NVDIMM-NCAL: Rcvd RLEN 0008"
> > > [AML]"NVDIMM-NCAL: Creating OBUF with 0020 bits"
> > > [AML]"NVDIMM-NCAL: Created  BUF(Local7) size 0004"
> > > [AML]"NVDIMM-RFIT Rcvd buf size 0004"
> > > [AML]"NVDIMM-FIT: Rcvd buf size "
> > > [AML]"NVDIMM-FIT: _FIT returned size 00B8"
> > >
> > > [ KERNEL] acpi_nfit_init: NVDIMM: data 0x9855bb9a7518 sz 0xb8  -->  
> > Guest receives correct size(0xb8) here  
> > >
> > > On ARM64,
> > > ---
> > >
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: Read FIT: offset 0 FIT size 0xb8  
> > Dirty Yes.  
> > > [Qemu]VDIMM:nvdimm_dsm_func_read_fit: read_fit_out buf size 0xc0  
> > func_ret_status 0  
> > >
> > > [AML]"NVDIMM-NCAL: Rcvd RLEN 00C0"
> > > [AML]"NVDIMM-NCAL: Creating OBUF with 05E0 bits"
> > > [AML]"NVDIMM-NCAL: Created  BUF(Local7) size 00BC"
> > > [AML]"NVDIMM-RFIT Rcvd buf size 00BC"
> > > [AML]"NVDIMM-RFIT Created NVDR.RFIT.BUFF size 00B8"
> > > [AML]"NVDIMM-FIT: Rcvd buf size 00B8"
> > >
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: Read FIT: offset 0xb8 FIT size  
> > 0xb8 Dirty No.  
> > > [Qemu]NVDIMM:nvdimm_dsm_func_read_fit: read_fit_out buf size 0x8  
> > func_ret_status 0  
> > >
> > > [AML]"NVDIMM-NCAL: Rcvd RLEN 0008"
> > > [AML]"NVDIMM-NCAL: Creating OBUF with 0

[PULL 0/5] i386 patches for QEMU 4.2-rc

2019-11-26 Thread Paolo Bonzini
The following changes since commit 65e05c82bdc6d348155e301c9d87dba7a08a5701:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
(2019-11-25 15:47:44 +)

are available in the Git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to e37aa8b0e410e83b4e150e38e83e385169ba09a7:

  hvf: more accurately match SDM when setting CR0 and PDPTE registers 
(2019-11-26 09:58:37 +0100)


* VMX feature fix (myself)
* HVF fixes (Cameron)


Cameron Esfahani (4):
  hvf: non-RAM, non-ROMD memory ranges are now correctly mapped in
  hvf: remove TSC synchronization code because it isn't fully complete
  hvf: correctly handle REX prefix in relation to legacy prefixes
  hvf: more accurately match SDM when setting CR0 and PDPTE registers

Paolo Bonzini (1):
  target/i386: add two missing VMX features for Skylake and CascadeLake 
Server

 target/i386/cpu.c|  6 +++--
 target/i386/hvf/hvf.c| 61 +
 target/i386/hvf/vmx.h| 18 +++--
 target/i386/hvf/x86_decode.c | 64 +---
 target/i386/hvf/x86_decode.h | 20 +++---
 target/i386/hvf/x86_emu.c|  3 ---
 target/i386/hvf/x86hvf.c |  4 ---
 7 files changed, 104 insertions(+), 72 deletions(-)
-- 
2.21.0




[PULL 5/5] hvf: more accurately match SDM when setting CR0 and PDPTE registers

2019-11-26 Thread Paolo Bonzini
From: Cameron Esfahani 

More accurately match SDM when setting CR0 and PDPTE registers.

Clear PDPTE registers when resetting vcpus.

Signed-off-by: Cameron Esfahani 
Message-Id: 
<464adb39c8699fb8331d8ad6016fc3e2eff53dbc.1574625592.git.di...@apple.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/hvf.c |  8 
 target/i386/hvf/vmx.h | 18 ++
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 90fd50acfc..784e67d77e 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -441,12 +441,20 @@ static MemoryListener hvf_memory_listener = {
 };
 
 void hvf_reset_vcpu(CPUState *cpu) {
+uint64_t pdpte[4] = {0, 0, 0, 0};
+int i;
 
 /* TODO: this shouldn't be needed; there is already a call to
  * cpu_synchronize_all_post_reset in vl.c
  */
 wvmcs(cpu->hvf_fd, VMCS_ENTRY_CTLS, 0);
 wvmcs(cpu->hvf_fd, VMCS_GUEST_IA32_EFER, 0);
+
+/* Initialize PDPTE */
+for (i = 0; i < 4; i++) {
+wvmcs(cpu->hvf_fd, VMCS_GUEST_PDPTE0 + i * 2, pdpte[i]);
+}
+
 macvm_set_cr0(cpu->hvf_fd, 0x6010);
 
 wvmcs(cpu->hvf_fd, VMCS_CR4_MASK, CR4_VMXE_MASK);
diff --git a/target/i386/hvf/vmx.h b/target/i386/hvf/vmx.h
index 5dc52ecad6..eb8894cd58 100644
--- a/target/i386/hvf/vmx.h
+++ b/target/i386/hvf/vmx.h
@@ -121,6 +121,7 @@ static inline void macvm_set_cr0(hv_vcpuid_t vcpu, uint64_t 
cr0)
 uint64_t pdpte[4] = {0, 0, 0, 0};
 uint64_t efer = rvmcs(vcpu, VMCS_GUEST_IA32_EFER);
 uint64_t old_cr0 = rvmcs(vcpu, VMCS_GUEST_CR0);
+uint64_t mask = CR0_PG | CR0_CD | CR0_NW | CR0_NE | CR0_ET;
 
 if ((cr0 & CR0_PG) && (rvmcs(vcpu, VMCS_GUEST_CR4) & CR4_PAE) &&
 !(efer & MSR_EFER_LME)) {
@@ -128,18 +129,15 @@ static inline void macvm_set_cr0(hv_vcpuid_t vcpu, 
uint64_t cr0)
  rvmcs(vcpu, VMCS_GUEST_CR3) & ~0x1f,
  MEMTXATTRS_UNSPECIFIED,
  (uint8_t *)pdpte, 32, 0);
+/* Only set PDPTE when appropriate. */
+for (i = 0; i < 4; i++) {
+wvmcs(vcpu, VMCS_GUEST_PDPTE0 + i * 2, pdpte[i]);
+}
 }
 
-for (i = 0; i < 4; i++) {
-wvmcs(vcpu, VMCS_GUEST_PDPTE0 + i * 2, pdpte[i]);
-}
-
-wvmcs(vcpu, VMCS_CR0_MASK, CR0_CD | CR0_NE | CR0_PG);
+wvmcs(vcpu, VMCS_CR0_MASK, mask);
 wvmcs(vcpu, VMCS_CR0_SHADOW, cr0);
 
-cr0 &= ~CR0_CD;
-wvmcs(vcpu, VMCS_GUEST_CR0, cr0 | CR0_NE | CR0_ET);
-
 if (efer & MSR_EFER_LME) {
 if (!(old_cr0 & CR0_PG) && (cr0 & CR0_PG)) {
 enter_long_mode(vcpu, cr0, efer);
@@ -149,6 +147,10 @@ static inline void macvm_set_cr0(hv_vcpuid_t vcpu, 
uint64_t cr0)
 }
 }
 
+/* Filter new CR0 after we are finished examining it above. */
+cr0 = (cr0 & ~(mask & ~CR0_PG));
+wvmcs(vcpu, VMCS_GUEST_CR0, cr0 | CR0_NE | CR0_ET);
+
 hv_vcpu_invalidate_tlb(vcpu);
 hv_vcpu_flush(vcpu);
 }
-- 
2.21.0




[PULL 2/5] hvf: non-RAM, non-ROMD memory ranges are now correctly mapped in

2019-11-26 Thread Paolo Bonzini
From: Cameron Esfahani 

If an area is non-RAM and non-ROMD, then remove mappings so accesses
will trap and can be emulated.  Change hvf_find_overlap_slot() to take
a size instead of an end address: it wouldn't return a slot because
callers would pass the same address for start and end.  Don't always
map area as read/write/execute, respect area flags.

Signed-off-by: Cameron Esfahani 
Message-Id: 
<1d8476c8f86959273fbdf23c86f8b4b611f5e2e1.1574625592.git.di...@apple.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/hvf.c | 50 ++-
 1 file changed, 35 insertions(+), 15 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 231732aaf7..0b50cfcbc6 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -107,14 +107,14 @@ static void assert_hvf_ok(hv_return_t ret)
 }
 
 /* Memory slots */
-hvf_slot *hvf_find_overlap_slot(uint64_t start, uint64_t end)
+hvf_slot *hvf_find_overlap_slot(uint64_t start, uint64_t size)
 {
 hvf_slot *slot;
 int x;
 for (x = 0; x < hvf_state->num_slots; ++x) {
 slot = &hvf_state->slots[x];
 if (slot->size && start < (slot->start + slot->size) &&
-end > slot->start) {
+(start + size) > slot->start) {
 return slot;
 }
 }
@@ -129,12 +129,10 @@ struct mac_slot {
 };
 
 struct mac_slot mac_slots[32];
-#define ALIGN(x, y)  (((x) + (y) - 1) & ~((y) - 1))
 
-static int do_hvf_set_memory(hvf_slot *slot)
+static int do_hvf_set_memory(hvf_slot *slot, hv_memory_flags_t flags)
 {
 struct mac_slot *macslot;
-hv_memory_flags_t flags;
 hv_return_t ret;
 
 macslot = &mac_slots[slot->slot_id];
@@ -151,8 +149,6 @@ static int do_hvf_set_memory(hvf_slot *slot)
 return 0;
 }
 
-flags = HV_MEMORY_READ | HV_MEMORY_WRITE | HV_MEMORY_EXEC;
-
 macslot->present = 1;
 macslot->gpa_start = slot->start;
 macslot->size = slot->size;
@@ -165,14 +161,24 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool 
add)
 {
 hvf_slot *mem;
 MemoryRegion *area = section->mr;
+bool writeable = !area->readonly && !area->rom_device;
+hv_memory_flags_t flags;
 
 if (!memory_region_is_ram(area)) {
-return;
+if (writeable) {
+return;
+} else if (!memory_region_is_romd(area)) {
+/*
+ * If the memory device is not in romd_mode, then we actually want
+ * to remove the hvf memory slot so all accesses will trap.
+ */
+ add = false;
+}
 }
 
 mem = hvf_find_overlap_slot(
 section->offset_within_address_space,
-section->offset_within_address_space + 
int128_get64(section->size));
+int128_get64(section->size));
 
 if (mem && add) {
 if (mem->size == int128_get64(section->size) &&
@@ -186,7 +192,7 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool 
add)
 /* Region needs to be reset. set the size to 0 and remap it. */
 if (mem) {
 mem->size = 0;
-if (do_hvf_set_memory(mem)) {
+if (do_hvf_set_memory(mem, 0)) {
 error_report("Failed to reset overlapping slot");
 abort();
 }
@@ -196,6 +202,13 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool 
add)
 return;
 }
 
+if (area->readonly ||
+(!memory_region_is_ram(area) && memory_region_is_romd(area))) {
+flags = HV_MEMORY_READ | HV_MEMORY_EXEC;
+} else {
+flags = HV_MEMORY_READ | HV_MEMORY_WRITE | HV_MEMORY_EXEC;
+}
+
 /* Now make a new slot. */
 int x;
 
@@ -216,7 +229,7 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool 
add)
 mem->start = section->offset_within_address_space;
 mem->region = area;
 
-if (do_hvf_set_memory(mem)) {
+if (do_hvf_set_memory(mem, flags)) {
 error_report("Error registering new memory slot");
 abort();
 }
@@ -345,7 +358,14 @@ static bool ept_emulation_fault(hvf_slot *slot, uint64_t 
gpa, uint64_t ept_qual)
 return false;
 }
 
-return !slot;
+if (!slot) {
+return true;
+}
+if (!memory_region_is_ram(slot->region) &&
+!(read && memory_region_is_romd(slot->region))) {
+return true;
+}
+return false;
 }
 
 static void hvf_set_dirty_tracking(MemoryRegionSection *section, bool on)
@@ -354,7 +374,7 @@ static void hvf_set_dirty_tracking(MemoryRegionSection 
*section, bool on)
 
 slot = hvf_find_overlap_slot(
 section->offset_within_address_space,
-section->offset_within_address_space + 
int128_get64(section->size));
+int128_get64(section->size));
 
 /* protect region against writes; begin tracking it */
 if (on) {
@@ -720,7 +740,7 @@ int hvf_vcpu_exec(CPUState *cpu)
 ret = EXCP_INTERRUPT;
 break;
 }
-/* Need to check if MMIO or unmmaped fault */
+/* Need to check if MMIO

[PULL 3/5] hvf: remove TSC synchronization code because it isn't fully complete

2019-11-26 Thread Paolo Bonzini
From: Cameron Esfahani 

The existing code in QEMU's HVF support to attempt to synchronize TSC
across multiple cores is not sufficient.  TSC value on other cores
can go backwards.  Until implementation is fixed, remove calls to
hv_vm_sync_tsc().  Pass through TSC to guest OS.

Signed-off-by: Cameron Esfahani 
Message-Id: 
<44c4afd2301b8bf99682b229b0796d84edd6d66f.1574625592.git.di...@apple.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/hvf.c | 3 +--
 target/i386/hvf/x86_emu.c | 3 ---
 target/i386/hvf/x86hvf.c  | 4 
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 0b50cfcbc6..90fd50acfc 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -518,7 +518,6 @@ void hvf_reset_vcpu(CPUState *cpu) {
 wreg(cpu->hvf_fd, HV_X86_R8 + i, 0x0);
 }
 
-hv_vm_sync_tsc(0);
 hv_vcpu_invalidate_tlb(cpu->hvf_fd);
 hv_vcpu_flush(cpu->hvf_fd);
 }
@@ -612,7 +611,7 @@ int hvf_init_vcpu(CPUState *cpu)
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_GSBASE, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_KERNELGSBASE, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_TSC_AUX, 1);
-/*hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_TSC, 1);*/
+hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_TSC, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_SYSENTER_CS, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_SYSENTER_EIP, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_SYSENTER_ESP, 1);
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 1b04bd7e94..3df767209d 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -772,9 +772,6 @@ void simulate_wrmsr(struct CPUState *cpu)
 
 switch (msr) {
 case MSR_IA32_TSC:
-/* if (!osx_is_sierra())
- wvmcs(cpu->hvf_fd, VMCS_TSC_OFFSET, data - rdtscp());
-hv_vm_sync_tsc(data);*/
 break;
 case MSR_IA32_APICBASE:
 cpu_set_apic_base(X86_CPU(cpu)->apic_state, data);
diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
index e0ea02d631..1485b95776 100644
--- a/target/i386/hvf/x86hvf.c
+++ b/target/i386/hvf/x86hvf.c
@@ -152,10 +152,6 @@ void hvf_put_msrs(CPUState *cpu_state)
 
 hv_vcpu_write_msr(cpu_state->hvf_fd, MSR_GSBASE, env->segs[R_GS].base);
 hv_vcpu_write_msr(cpu_state->hvf_fd, MSR_FSBASE, env->segs[R_FS].base);
-
-/* if (!osx_is_sierra())
- wvmcs(cpu_state->hvf_fd, VMCS_TSC_OFFSET, env->tsc - rdtscp());*/
-hv_vm_sync_tsc(env->tsc);
 }
 
 
-- 
2.21.0





[PULL 1/5] target/i386: add two missing VMX features for Skylake and CascadeLake Server

2019-11-26 Thread Paolo Bonzini
They are present in client (Core) Skylake but pasted wrong into the server
SKUs.

Reported-by: Dr. David Alan Gilbert 
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 730fb28b67..69f518a21a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3006,7 +3006,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
  VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
  VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
  VMX_SECONDARY_EXEC_RDRAND_EXITING | 
VMX_SECONDARY_EXEC_ENABLE_INVPCID |
- VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS,
+ VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS 
|
+ VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML,
 .xlevel = 0x8008,
 .model_id = "Intel Xeon Processor (Skylake)",
 .versions = (X86CPUVersionDefinition[]) {
@@ -3131,7 +3132,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
  VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
  VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
  VMX_SECONDARY_EXEC_RDRAND_EXITING | 
VMX_SECONDARY_EXEC_ENABLE_INVPCID |
- VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS,
+ VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS 
|
+ VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML,
 .xlevel = 0x8008,
 .model_id = "Intel Xeon Processor (Cascadelake)",
 .versions = (X86CPUVersionDefinition[]) {
-- 
2.21.0





Re: virtiofsd: Where should it live?

2019-11-26 Thread Marc-André Lureau
Hi David

On Mon, Nov 25, 2019 at 10:50 PM Dr. David Alan Gilbert
 wrote:
>
> Hi,
>   There's been quite a bit of discussion about where virtiofsd, our
> implemenation of a virtiofs daemon, should live.  I'd like to get
> this settled now, because I'd like to tidy it up for the next
> qemu cycle.
>
> For reference it's based on qemu's livhost-user+chunks of libfuse.
> It can't live in libfuse because we change enough of the library
> to break their ABI.  It's C, and we've got ~100 patches - which
> we can split into about 3 chunks.
>
> Some suggestions so far:
>   a) In contrib
>  This is my current working assumption; the main objection is it's
>  a bit big and pulls in a chunk of libfuse
>
>   b) In a submodule
>
>   c) Just separate
>
> Your suggestions/ideas please.  My preference is (a).
>


It's more about code sharing and lifecycle.

The project started in a separate repository, and the proposed patches
for qemu aren't a clean series. Reviewing it is harder than it should
be, as we have to review/accept the whole thing.

As you said, it doesn't share much with qemu, but libvhost-user (which
we could quite easily copy or make standalone/submodule).

Then it dumps code from libfuse that is questionnable (showing age)
and often redundant with facilities provided by either glib, qemu
utils etc.

Is vhost-user-fs (the qemu device) going to have a strong relation
with virtiofsd?
Are we going to support different version of qemu and virtiofsd
combination? I suppose we have to, as vhost-user protocol should allow
that, and it's nice to allow other to experiment and implement it in
different ways.
If not, then perhaps we should think about introducing some version
checking between qemu and external processes (with config_stamp,
similar to modules).

>From what I understand, I think c) would be fine. However, for
convenience/testing reasons, b) would be my preference.

-- 
Marc-André Lureau



[PULL 4/5] hvf: correctly handle REX prefix in relation to legacy prefixes

2019-11-26 Thread Paolo Bonzini
From: Cameron Esfahani 

In real x86 processors, the REX prefix must come after legacy prefixes.
REX before legacy is ignored.  Update the HVF emulation code to properly
handle this.  Fix some spelling errors in constants.  Fix some decoder
table initialization issues found by Coverity.

Signed-off-by: Cameron Esfahani 
Message-Id: 

Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/x86_decode.c | 64 
 target/i386/hvf/x86_decode.h | 20 +--
 2 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index 822fa1866e..77c346605f 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -122,7 +122,8 @@ static void decode_rax(CPUX86State *env, struct x86_decode 
*decode,
 {
 op->type = X86_VAR_REG;
 op->reg = R_EAX;
-op->ptr = get_reg_ref(env, op->reg, decode->rex.rex, 0,
+/* Since reg is always AX, REX prefix has no impact. */
+op->ptr = get_reg_ref(env, op->reg, false, 0,
   decode->operand_size);
 }
 
@@ -1687,40 +1688,37 @@ calc_addr:
 }
 }
 
-target_ulong get_reg_ref(CPUX86State *env, int reg, int rex, int is_extended,
- int size)
+target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
+ int is_extended, int size)
 {
 target_ulong ptr = 0;
-int which = 0;
 
 if (is_extended) {
 reg |= R_R8;
 }
 
-
 switch (size) {
 case 1:
-if (is_extended || reg < 4 || rex) {
-which = 1;
+if (is_extended || reg < 4 || rex_present) {
 ptr = (target_ulong)&RL(env, reg);
 } else {
-which = 2;
 ptr = (target_ulong)&RH(env, reg - 4);
 }
 break;
 default:
-which = 3;
 ptr = (target_ulong)&RRX(env, reg);
 break;
 }
 return ptr;
 }
 
-target_ulong get_reg_val(CPUX86State *env, int reg, int rex, int is_extended,
- int size)
+target_ulong get_reg_val(CPUX86State *env, int reg, int rex_present,
+ int is_extended, int size)
 {
 target_ulong val = 0;
-memcpy(&val, (void *)get_reg_ref(env, reg, rex, is_extended, size), size);
+memcpy(&val,
+   (void *)get_reg_ref(env, reg, rex_present, is_extended, size),
+   size);
 return val;
 }
 
@@ -1853,28 +1851,38 @@ void calc_modrm_operand(CPUX86State *env, struct 
x86_decode *decode,
 static void decode_prefix(CPUX86State *env, struct x86_decode *decode)
 {
 while (1) {
+/*
+ * REX prefix must come after legacy prefixes.
+ * REX before legacy is ignored.
+ * Clear rex to simulate this.
+ */
 uint8_t byte = decode_byte(env, decode);
 switch (byte) {
 case PREFIX_LOCK:
 decode->lock = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_REPN:
 case PREFIX_REP:
 decode->rep = byte;
+decode->rex.rex = 0;
 break;
-case PREFIX_CS_SEG_OVEERIDE:
-case PREFIX_SS_SEG_OVEERIDE:
-case PREFIX_DS_SEG_OVEERIDE:
-case PREFIX_ES_SEG_OVEERIDE:
-case PREFIX_FS_SEG_OVEERIDE:
-case PREFIX_GS_SEG_OVEERIDE:
+case PREFIX_CS_SEG_OVERRIDE:
+case PREFIX_SS_SEG_OVERRIDE:
+case PREFIX_DS_SEG_OVERRIDE:
+case PREFIX_ES_SEG_OVERRIDE:
+case PREFIX_FS_SEG_OVERRIDE:
+case PREFIX_GS_SEG_OVERRIDE:
 decode->segment_override = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_OP_SIZE_OVERRIDE:
 decode->op_size_override = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_ADDR_SIZE_OVERRIDE:
 decode->addr_size_override = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_REX ... (PREFIX_REX + 0xf):
 if (x86_is_long_mode(env_cpu(env))) {
@@ -2111,14 +2119,14 @@ void init_decoder()
 {
 int i;
 
-for (i = 0; i < ARRAY_SIZE(_decode_tbl2); i++) {
-memcpy(_decode_tbl1, &invl_inst, sizeof(invl_inst));
+for (i = 0; i < ARRAY_SIZE(_decode_tbl1); i++) {
+memcpy(&_decode_tbl1[i], &invl_inst, sizeof(invl_inst));
 }
 for (i = 0; i < ARRAY_SIZE(_decode_tbl2); i++) {
-memcpy(_decode_tbl2, &invl_inst, sizeof(invl_inst));
+memcpy(&_decode_tbl2[i], &invl_inst, sizeof(invl_inst));
 }
 for (i = 0; i < ARRAY_SIZE(_decode_tbl3); i++) {
-memcpy(_decode_tbl3, &invl_inst, sizeof(invl_inst_x87));
+memcpy(&_decode_tbl3[i], &invl_inst_x87, sizeof(invl_inst_x87));
 
 }
 for (i = 0; i < ARRAY_SIZE(_1op_inst); i++) {
@@ -2167,22 +2175,22 @@ target_ulong decode_linear_addr(CPUX86State *env, 
struct x86_decode *decode,
target_ulong addr, X86Seg seg)
 {
 switch (decode->segment_override) {

Re: [PULL 0/5] i386 patches for QEMU 4.2-rc

2019-11-26 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20191126085936.1689-1-pbonz...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  TESTcheck-unit: tests/test-thread-pool
wait_for_migration_fail: unexpected status status=wait-unplug allow_active=1
**
ERROR:/tmp/qemu-test/src/tests/migration-test.c:908:wait_for_migration_fail: 
assertion failed: (result)
ERROR - Bail out! 
ERROR:/tmp/qemu-test/src/tests/migration-test.c:908:wait_for_migration_fail: 
assertion failed: (result)
make: *** [check-qtest-aarch64] Error 1
make: *** Waiting for unfinished jobs
  TESTcheck-unit: tests/test-hbitmap
  TESTcheck-unit: tests/test-bdrv-drain
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=3d15e8368b3a42f39b429a8b4ae26f29', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-dduxk3b8/src/docker-src.2019-11-26-04.09.38.2015:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=3d15e8368b3a42f39b429a8b4ae26f29
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-dduxk3b8/src'
make: *** [docker-run-test-quick@centos7] Error 2

real9m26.610s
user0m8.328s


The full log is available at
http://patchew.org/logs/20191126085936.1689-1-pbonz...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: API definition for LUKS key management [V2]

2019-11-26 Thread Maxim Levitsky
On Mon, 2019-11-25 at 19:45 +0100, Max Reitz wrote:
> On 22.11.19 15:22, Maxim Levitsky wrote:
> > Hi!
> > 
> > This is the second version of the proposed QMP API for key management,
> > after discussion with Keven and Max.
> > 
> > Will this work?
> > 
> > Adding Peter Krempa to CC, to hear his opinion from the 
> > libvirt side.
> > 
> > Best regards,
> > Maxim Levitsky
> 
> Looks good to me overall.  I don’t think we need to overly push having
> the same interface for create and amend, because I don’t see much to be
> gained from it.

Thanks!!
> 
> [...]
> 
> > diff --git a/qapi/crypto.json b/qapi/crypto.json
> > index b2a4cff683..019db682cd 100644
> > --- a/qapi/crypto.json
> > +++ b/qapi/crypto.json
> > @@ -309,3 +309,56 @@
> >'base': 'QCryptoBlockInfoBase',
> >'discriminator': 'format',
> >'data': { 'luks': 'QCryptoBlockInfoLUKS' } }
> > +
> > +
> > +##
> > +# @LUKSKeyslotUpdate:
> > +#
> > +# @keyslot: If specified, will update only keyslot with this index
> > +#
> > +# @old-secret:  If specified, will only update keyslots that
> > +#   can be opened with password which is contained in
> > +#   QCryptoSecret with @old-secret ID
> > +#
> > +#   If neither @keyslot nor @old-secret is specified,
> > +#   first empty keyslot is selected for the update
> > +#
> > +# @new-secret:  The ID of a QCryptoSecret object providing a new 
> > decryption
> > +#   key to place in all matching keyslots. Empty string 
> > erases the
> > +#   keyslot.
> 
> Hm...  Can’t we make this some string-or-null alternate type so that
> null will erase the keyslot?  That would make more sense to me.

Agree. I'll see if can get an alternate to work here.

> 
> Max
> 
> > +# @iter-time:   number of milliseconds to spend in
> > +#   PBKDF passphrase processing
> > +##
> > +{ 'struct': 'LUKSKeyslotUpdate',
> > +  'data': {
> > + '*keyslot': 'int',
> > + '*old-secret': 'str',
> > + 'new-secret' : 'str',
> > + '*iter-time' : 'int' } }
> 
> 

Best regards,
Maxim Levitsky




Re: [PATCH 4/4] qom/object: Use common get/set uint helpers

2019-11-26 Thread Felipe Franciosi


> On Nov 26, 2019, at 8:39 AM, Marc-André Lureau  
> wrote:
> 
> Hi

Heya, thanks for the review.

> 
> On Mon, Nov 25, 2019 at 7:37 PM Felipe Franciosi  wrote:
>> 
>> Several objects implemented their own uint property getters and setters,
>> despite them being straightforward (without any checks/validations on
>> the values themselves) and identical across objects. This makes use of
>> an enhanced API for object_property_add_uintXX_ptr() which offers
>> default setters.
>> 
>> Some of these setters used to update the value even if the type visit
>> failed (eg. because the value being set overflowed over the given type).
>> The new setter introduces a check for these errors, not updating the
>> value if an error occurred. The error is propagated.
>> 
>> Signed-off-by: Felipe Franciosi 
> 
> 
> Some comments below, otherwise:
> Reviewed-by: Marc-André Lureau 
> 
>> ---
>> hw/acpi/ich9.c   |  93 +++
>> hw/isa/lpc_ich9.c|  14 +-
>> hw/misc/edu.c|  12 +
>> hw/pci-host/q35.c|  14 ++
>> hw/ppc/spapr.c   |  17 ++--
>> hw/vfio/pci-quirks.c |  18 ++--
>> memory.c |  15 +--
>> target/arm/cpu.c |  21 +
>> target/i386/sev.c| 102 +++
>> 9 files changed, 29 insertions(+), 277 deletions(-)
>> 
>> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
>> index 94dc5147ce..aa3c7a59ab 100644
>> --- a/hw/acpi/ich9.c
>> +++ b/hw/acpi/ich9.c
>> @@ -357,81 +357,6 @@ static void ich9_pm_set_cpu_hotplug_legacy(Object *obj, 
>> bool value,
>> s->pm.cpu_hotplug_legacy = value;
>> }
>> 
>> -static void ich9_pm_get_disable_s3(Object *obj, Visitor *v, const char 
>> *name,
>> -   void *opaque, Error **errp)
>> -{
>> -ICH9LPCPMRegs *pm = opaque;
>> -uint8_t value = pm->disable_s3;
>> -
>> -visit_type_uint8(v, name, &value, errp);
>> -}
>> -
>> -static void ich9_pm_set_disable_s3(Object *obj, Visitor *v, const char 
>> *name,
>> -   void *opaque, Error **errp)
>> -{
>> -ICH9LPCPMRegs *pm = opaque;
>> -Error *local_err = NULL;
>> -uint8_t value;
>> -
>> -visit_type_uint8(v, name, &value, &local_err);
>> -if (local_err) {
>> -goto out;
>> -}
>> -pm->disable_s3 = value;
>> -out:
>> -error_propagate(errp, local_err);
>> -}
>> -
>> -static void ich9_pm_get_disable_s4(Object *obj, Visitor *v, const char 
>> *name,
>> -   void *opaque, Error **errp)
>> -{
>> -ICH9LPCPMRegs *pm = opaque;
>> -uint8_t value = pm->disable_s4;
>> -
>> -visit_type_uint8(v, name, &value, errp);
>> -}
>> -
>> -static void ich9_pm_set_disable_s4(Object *obj, Visitor *v, const char 
>> *name,
>> -   void *opaque, Error **errp)
>> -{
>> -ICH9LPCPMRegs *pm = opaque;
>> -Error *local_err = NULL;
>> -uint8_t value;
>> -
>> -visit_type_uint8(v, name, &value, &local_err);
>> -if (local_err) {
>> -goto out;
>> -}
>> -pm->disable_s4 = value;
>> -out:
>> -error_propagate(errp, local_err);
>> -}
>> -
>> -static void ich9_pm_get_s4_val(Object *obj, Visitor *v, const char *name,
>> -   void *opaque, Error **errp)
>> -{
>> -ICH9LPCPMRegs *pm = opaque;
>> -uint8_t value = pm->s4_val;
>> -
>> -visit_type_uint8(v, name, &value, errp);
>> -}
>> -
>> -static void ich9_pm_set_s4_val(Object *obj, Visitor *v, const char *name,
>> -   void *opaque, Error **errp)
>> -{
>> -ICH9LPCPMRegs *pm = opaque;
>> -Error *local_err = NULL;
>> -uint8_t value;
>> -
>> -visit_type_uint8(v, name, &value, &local_err);
>> -if (local_err) {
>> -goto out;
>> -}
>> -pm->s4_val = value;
>> -out:
>> -error_propagate(errp, local_err);
>> -}
>> -
>> static bool ich9_pm_get_enable_tco(Object *obj, Error **errp)
>> {
>> ICH9LPCState *s = ICH9_LPC_DEVICE(obj);
>> @@ -468,18 +393,12 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs 
>> *pm, Error **errp)
>>  ich9_pm_get_cpu_hotplug_legacy,
>>  ich9_pm_set_cpu_hotplug_legacy,
>>  NULL);
>> -object_property_add(obj, ACPI_PM_PROP_S3_DISABLED, "uint8",
>> -ich9_pm_get_disable_s3,
>> -ich9_pm_set_disable_s3,
>> -NULL, pm, NULL);
>> -object_property_add(obj, ACPI_PM_PROP_S4_DISABLED, "uint8",
>> -ich9_pm_get_disable_s4,
>> -ich9_pm_set_disable_s4,
>> -NULL, pm, NULL);
>> -object_property_add(obj, ACPI_PM_PROP_S4_VAL, "uint8",
>> -ich9_pm_get_s4_val,
>> -ich9_pm_set_s4_val,
>> -NULL, pm, NULL);
>> +object_property_add_uint8_ptr(obj, ACPI_PM_PROP_S3_DISABLED,
>> + 

Re: [PATCH v17 02/14] util/cutils: Use qemu_strtold_finite to parse size

2019-11-26 Thread Markus Armbruster
Tao Xu  writes:

> On 11/25/2019 2:56 PM, Markus Armbruster wrote:
>> Tao Xu  writes:
>>
>>> Support full 64bit precision, modify related test cases.
>>
>> That's not true in general: long double need not be any wider than
>> double.
>>
>> It might be true on the host machines we support, but I don't know.  If
>> we decide to rely on it, we better make the build fail when the host
>> machine doesn't meet our expectations, preferably in configure.
>>
> [...]
>>> -if ((val * mul >= 0xfc00) || val < 0) {
>>> +/* Values > UINT64_MAX overflow uint64_t */
>>> +if ((val * mul > UINT64_MAX) || val < 0) {
>>>   retval = -ERANGE;
>>>   goto out;
>>>   }
>>
>> Not portable.  If it was, we'd have made this changd long ago :)
>>
>
> OK. So the suitable solution is what you suggested in v14?
>
> "A possible alternative is to parse the numeric part both as a double
> and as a 64 bit unsigned integer, then use whatever consumes more
> characters.  This enables providing full 64 bits unless you actually
> use
> a fraction."

Yes, that bypasses the portability issue.

> I will try this way.

Go ahead.




Re: [QUESTION] What is the best license option for new files introduced in QEMU?

2019-11-26 Thread Markus Armbruster
Aleksandar Markovic  writes:

> I read LICENSE file, but I read also recent contributions, and it is
> not clear to me what version of GPL is best/recommended for new file
> just being introduced to QEMU:

The situation is confusing.  It's a mess of our own making.

> * GPL 2.0
> * GPL 2.0 (or later at your option)
> * GPL 2.1
> * GPL 2.1 (or later at your option)
>
> or something else. (The rest od wording of license preamble is clear
> to me.) Please somebody explsin snd clarify.

Let's not add to the mess: GPLv2+ unless you have a compelling reason to
deviate, spelled out in your commit message.

In my opinion, accepting GPLv2-only contributions was a mistake.




Re: [PATCH 0/5] ARM virt: Add NVDIMM support

2019-11-26 Thread Andrew Jones
On Tue, Nov 26, 2019 at 09:56:55AM +0100, Igor Mammedov wrote:
> On Mon, 25 Nov 2019 16:25:54 +
> Shameerali Kolothum Thodi  wrote:
> > > -Name (MEMA, 0xBFBFD000)
> > > +Name (MEMA, 0x)
> > > 
> > > However value here is suspicious. If I recall right it should
> > > point to DMS buffer allocated by firmware somewhere in the guest RAM.  
> > 
> > But then it will be horribly wrong and will result in inconsistent behavior
> > as well, right?
> 
> I'd thinks so. I'm not sure what happens but RAM starts at 0x1
> which is above the address you have in MEMA so you shouldn't have
> sensible data there and access probably should raise some sort of error.
> 
> CCing Drew

RAM starts at 0x4000 (the 1G boundary). So 0x is OK, assuming
you want this just below the 3G boundary. With TCG, a data abort exception
will be injected for any access to an unmapped region. With KVM, unless
paging is enabled in the guest, then reads of unmapped regions will return
zero and writes will be ignored.

Thanks,
drew




Re: [PATCH 2/4] ich9: fix getter type for sci_int property

2019-11-26 Thread Felipe Franciosi
Hi

> On Nov 25, 2019, at 4:43 PM, Philippe Mathieu-Daudé  wrote:
> 
> On 11/25/19 4:36 PM, Felipe Franciosi wrote:
>> When QOM APIs were added to ich9 in 6f1426ab, the getter for sci_int was
>> written using uint32_t. However, the object property is uint8_t. This
>> fixes the getter for correctness.
>> Signed-off-by: Felipe Franciosi 
>> ---
>>  hw/isa/lpc_ich9.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>> index ce3342..240979885d 100644
>> --- a/hw/isa/lpc_ich9.c
>> +++ b/hw/isa/lpc_ich9.c
>> @@ -631,9 +631,9 @@ static void ich9_lpc_get_sci_int(Object *obj, Visitor 
>> *v, const char *name,
>>   void *opaque, Error **errp)
>>  {
>>  ICH9LPCState *lpc = ICH9_LPC_DEVICE(obj);
>> -uint32_t value = lpc->sci_gsi;
>> +uint8_t value = lpc->sci_gsi;
>>  -visit_type_uint32(v, name, &value, errp);
>> +visit_type_uint8(v, name, &value, errp);
> 
> Maybe directly as:
> 
>   visit_type_uint8(v, name, &lpc->sci_gsi, errp);
> 

Thanks for the suggestion. I'll improve it on a v2 which I'm sending
out anyway for other reasons.

For my own sake: why is the field called "sci_gsi", but
ACPI_PM_PROP_SCI_INT (and the getter) are called "sci_int"?

Thanks,
F.

> With/without stack variable:
> Reviewed-by: Philippe Mathieu-Daudé 
> 
>>  }
>>static void ich9_lpc_add_properties(ICH9LPCState *lpc)
>> @@ -641,7 +641,7 @@ static void ich9_lpc_add_properties(ICH9LPCState *lpc)
>>  static const uint8_t acpi_enable_cmd = ICH9_APM_ACPI_ENABLE;
>>  static const uint8_t acpi_disable_cmd = ICH9_APM_ACPI_DISABLE;
>>  -object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, "uint32",
>> +object_property_add(OBJECT(lpc), ACPI_PM_PROP_SCI_INT, "uint8",
>>  ich9_lpc_get_sci_int,
>>  NULL, NULL, NULL, NULL);
>>  object_property_add_uint8_ptr(OBJECT(lpc), ACPI_PM_PROP_ACPI_ENABLE_CMD,
> 



[RFC net-next 04/18] tuntap: check tun_msg_ctl type at necessary places

2019-11-26 Thread Prashant Bhole
tun_msg_ctl is used by vhost_net to communicate with tuntap. We will
introduce another type in soon. As a preparation this patch adds
conditions to check tun_msg_ctl type at necessary places.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tap.c | 7 +--
 drivers/net/tun.c | 6 +-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 3ae70c7e6860..4df7bf00af66 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1213,6 +1213,7 @@ static int tap_sendmsg(struct socket *sock, struct msghdr 
*m,
struct tap_queue *q = container_of(sock, struct tap_queue, sock);
struct tun_msg_ctl *ctl = m->msg_control;
struct xdp_buff *xdp;
+   void *ptr = NULL;
int i;
 
if (ctl && (ctl->type == TUN_MSG_PTR)) {
@@ -1223,8 +1224,10 @@ static int tap_sendmsg(struct socket *sock, struct 
msghdr *m,
return 0;
}
 
-   return tap_get_user(q, ctl ? ctl->ptr : NULL, &m->msg_iter,
-   m->msg_flags & MSG_DONTWAIT);
+   if (ctl && ctl->type == TUN_MSG_UBUF)
+   ptr = ctl->ptr;
+
+   return tap_get_user(q, ptr, &m->msg_iter, m->msg_flags & MSG_DONTWAIT);
 }
 
 static int tap_recvmsg(struct socket *sock, struct msghdr *m,
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 683d371e6e82..1e436d9ec4e1 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2529,6 +2529,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr 
*m, size_t total_len)
struct tun_struct *tun = tun_get(tfile);
struct tun_msg_ctl *ctl = m->msg_control;
struct xdp_buff *xdp;
+   void *ptr = NULL;
 
if (!tun)
return -EBADFD;
@@ -2560,7 +2561,10 @@ static int tun_sendmsg(struct socket *sock, struct 
msghdr *m, size_t total_len)
goto out;
}
 
-   ret = tun_get_user(tun, tfile, ctl ? ctl->ptr : NULL, &m->msg_iter,
+   if (ctl && ctl->type == TUN_MSG_UBUF)
+   ptr = ctl->ptr;
+
+   ret = tun_get_user(tun, tfile, ptr, &m->msg_iter,
   m->msg_flags & MSG_DONTWAIT,
   m->msg_flags & MSG_MORE);
 out:
-- 
2.20.1




[RFC 1/3] configure: add libbpf support

2019-11-26 Thread Prashant Bhole
This is a preparation to add libbpf support for Qemu. When it is
enabled Qemu can load eBPF programs and manipulated eBPF maps
libbpf APIs.

When configured with --enable-libbpf, availability of libbpf is
checked. If it exists then CONFIG_LIBBPF is defined and the qemu
binary is linked with libbpf.

Signed-off-by: Prashant Bhole 
---
 configure | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/configure b/configure
index 6099be1d84..a7e8a8450d 100755
--- a/configure
+++ b/configure
@@ -504,6 +504,7 @@ debug_mutex="no"
 libpmem=""
 default_devices="yes"
 plugins="no"
+libbpf="no"
 
 supported_cpu="no"
 supported_os="no"
@@ -1539,6 +1540,8 @@ for opt do
   ;;
   --disable-plugins) plugins="no"
   ;;
+  --enable-libbpf) libbpf="yes"
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -1825,6 +1828,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   debug-mutex mutex debugging support
   libpmem libpmem support
   xkbcommon   xkbcommon support
+  libbpf  eBPF program support
 
 NOTE: The object files are built at the place where configure is launched
 EOF
@@ -6084,6 +6088,19 @@ case "$slirp" in
 ;;
 esac
 
+##
+# Do we have libbpf
+if test "$libbpf" != "no" ; then
+  if $pkg_config libbpf; then
+libbpf="yes"
+libbpf_libs=$($pkg_config --libs libbpf)
+  else
+if test "$libbpf" == "yes" ; then
+  feature_not_found "libbpf" "Install libbpf devel"
+fi
+libbpf="no"
+  fi
+fi
 
 ##
 # End of CC checks
@@ -6599,6 +6616,7 @@ echo "libpmem support   $libpmem"
 echo "libudev   $libudev"
 echo "default devices   $default_devices"
 echo "plugin support$plugins"
+echo "XDP offload support $libbpf"
 
 if test "$supported_cpu" = "no"; then
 echo
@@ -7457,6 +7475,11 @@ if test "$plugins" = "yes" ; then
 fi
 fi
 
+if test "$libbpf" = "yes" ; then
+  echo "CONFIG_LIBBPF=y" >> $config_host_mak
+  echo "LIBBPF_LIBS=$libbpf_libs" >> $config_host_mak
+fi
+
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
 elif test "$ARCH" = "sparc64" ; then
-- 
2.20.1




[RFC net-next 11/18] tun: run xdp prog when tun is read from file interface

2019-11-26 Thread Prashant Bhole
It handles the case when qemu performs read on tun using file
operations.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tun.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 084ca95358fe..639921c10e32 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2318,8 +2318,10 @@ static ssize_t tun_do_read(struct tun_struct *tun, 
struct tun_file *tfile,
   struct iov_iter *to,
   int noblock, void *ptr)
 {
+   struct xdp_frame *frame;
ssize_t ret;
int err;
+   u32 act;
 
tun_debug(KERN_INFO, tun, "tun_do_read\n");
 
@@ -2333,6 +2335,15 @@ static ssize_t tun_do_read(struct tun_struct *tun, 
struct tun_file *tfile,
ptr = tun_ring_recv(tfile, noblock, &err);
if (!ptr)
return err;
+
+   if (tun_is_xdp_frame(ptr)) {
+   frame = tun_ptr_to_xdp(ptr);
+   act = tun_do_xdp_offload(tun, tfile, frame);
+   } else {
+   act = tun_do_xdp_offload_generic(tun, ptr);
+   }
+   if (act != XDP_PASS)
+   return err;
}
 
if (tun_is_xdp_frame(ptr)) {
-- 
2.20.1




[RFC 3/3] virtio-net: add support for offloading an ebpf map

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This change is a part of XDP offload feature. It handles offloading
of eBPF map from the guest.

A command handler for VIRTIO_NET_CTRL_EBPF now checks for subcommand
VIRTIO_NET_CTRL_EBPF_MAP and. The control buffer consists of struct
virtio_net_ctrl_ebpf_map followed by map key/value or key/key pair.
Map manipulation is done using libbpf APIs.

Signed-off-by: Jason Wang 
Co-developed-by: Prashant Bhole 
Signed-off-by: Prashant Bhole 
---
 hw/net/Makefile.objs|  2 +
 hw/net/virtio-net.c | 88 +
 include/standard-headers/linux/virtio_net.h | 23 ++
 3 files changed, 113 insertions(+)

diff --git a/hw/net/Makefile.objs b/hw/net/Makefile.objs
index 7907d2c199..5928497a01 100644
--- a/hw/net/Makefile.objs
+++ b/hw/net/Makefile.objs
@@ -52,3 +52,5 @@ common-obj-$(CONFIG_ROCKER) += rocker/rocker.o 
rocker/rocker_fp.o \
 obj-$(call lnot,$(CONFIG_ROCKER)) += rocker/qmp-norocker.o
 
 common-obj-$(CONFIG_CAN_BUS) += can/
+
+virtio-net.o-libs := $(LIBBPF_LIBS)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7cc1bd1654..3c49273796 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1011,6 +1011,92 @@ static int virtio_net_handle_offloads(VirtIONet *n, 
uint8_t cmd,
 }
 }
 
+static int virtio_net_handle_ebpf_map(VirtIONet *n, struct iovec *iov,
+  unsigned int iov_cnt)
+{
+#ifdef CONFIG_LIBBPF
+struct virtio_net_ctrl_ebpf_map *ctrl = NULL;
+struct bpf_create_map_attr map_attr = {};
+uint8_t *key, *val;
+uint32_t buf_len;
+int fd, err = 0;
+size_t s;
+
+s = iov_to_buf(iov, iov_cnt, 0, &buf_len, sizeof(buf_len));
+if (s != sizeof(buf_len)) {
+goto err;
+}
+
+ctrl = malloc(sizeof(*ctrl) + buf_len);
+if (!ctrl) {
+goto err;
+}
+
+s = iov_to_buf(iov, iov_cnt, 0, ctrl, sizeof(*ctrl) + buf_len);
+if (s != (sizeof(*ctrl) + buf_len)) {
+error_report("Invalid map control buffer");
+goto err;
+}
+
+key = ctrl->buf;
+val = ctrl->buf + ctrl->key_size;
+
+switch (ctrl->cmd) {
+case VIRTIO_NET_BPF_CMD_CREATE_MAP:
+map_attr.map_type = ctrl->map_type;
+map_attr.map_flags = ctrl->map_flags;
+map_attr.key_size = ctrl->key_size;
+map_attr.value_size = ctrl->value_size;
+map_attr.max_entries = ctrl->max_entries;
+fd = bpf_create_map_xattr(&map_attr);
+if (fd < 0) {
+goto err;
+}
+ctrl->map_fd = fd;
+break;
+case VIRTIO_NET_BPF_CMD_FREE_MAP:
+close(ctrl->map_fd);
+break;
+case VIRTIO_NET_BPF_CMD_LOOKUP_ELEM:
+err = bpf_map_lookup_elem(ctrl->map_fd, key, val);
+break;
+case VIRTIO_NET_BPF_CMD_GET_FIRST:
+err = bpf_map_get_next_key(ctrl->map_fd, NULL, val);
+break;
+case VIRTIO_NET_BPF_CMD_GET_NEXT:
+err = bpf_map_get_next_key(ctrl->map_fd, key, val);
+break;
+case VIRTIO_NET_BPF_CMD_UPDATE_ELEM:
+err = bpf_map_update_elem(ctrl->map_fd, key, val, ctrl->flags);
+break;
+case VIRTIO_NET_BPF_CMD_DELETE_ELEM:
+err = bpf_map_delete_elem(ctrl->map_fd, key);
+default:
+error_report("map operation not implemented %d", ctrl->cmd);
+goto err;
+}
+
+if (err) {
+goto err;
+}
+
+s = iov_from_buf(iov, iov_cnt, 0, ctrl, sizeof(*ctrl) + buf_len);
+if (s != sizeof(*ctrl) + buf_len) {
+error_report("failed to write map operation result");
+goto err;
+}
+
+free(ctrl);
+return VIRTIO_NET_OK;
+
+err:
+if (ctrl) {
+free(ctrl);
+}
+#endif
+return VIRTIO_NET_ERR;
+}
+
 static int virtio_net_handle_ebpf_prog(VirtIONet *n, struct iovec *iov,
unsigned int iov_cnt)
 {
@@ -1053,6 +1139,8 @@ static int virtio_net_handle_ebpf(VirtIONet *n, uint8_t 
cmd,
 {
 if (cmd == VIRTIO_NET_CTRL_EBPF_PROG) {
 return virtio_net_handle_ebpf_prog(n, iov, iov_cnt);
+} else if (cmd == VIRTIO_NET_CTRL_EBPF_MAP) {
+return virtio_net_handle_ebpf_map(n, iov, iov_cnt);
 }
 
 return VIRTIO_NET_ERR;
diff --git a/include/standard-headers/linux/virtio_net.h 
b/include/standard-headers/linux/virtio_net.h
index 83292c81bc..cca234e0e8 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -281,11 +281,34 @@ struct virtio_net_ctrl_ebpf_prog {
uint8_t insns[0];
 };
 
+struct virtio_net_ctrl_ebpf_map {
+   __virtio32 buf_len;
+   __virtio32 cmd;
+   __virtio32 map_type;
+   __virtio32 key_size;
+   __virtio32 value_size;
+   __virtio32 max_entries;
+   __virtio32 map_flags;
+   __virtio32 map_fd;
+   __virtio64 flags;
+   uint8_t buf[0];
+};
+
 #define VIRTIO_NET_CTRL_EBPF   6
  #define VIRTIO_NET_CTRL_EBPF_PROG 1
+ #define VIRTIO_NET_CTRL_EBPF_MAP 2
 
 /* Commands for VIRTIO_NET_CTR

Re: [PATCH] target/arm: Honor HCR_EL2.TID3 trapping requirements

2019-11-26 Thread Richard Henderson
On 11/23/19 11:56 AM, Marc Zyngier wrote:
> +static CPAccessResult access_aa64idreg(CPUARMState *env, const ARMCPRegInfo 
> *ri,
> +   bool isread)
> +{
> +if ((arm_current_el(env) < 2) && (arm_hcr_el2_eff(env) & HCR_TID3)) {
> +return CP_ACCESS_TRAP_EL2;
> +}
> +
> +return CP_ACCESS_OK;
> +}
> +

The only thing I would suggest is to call this access_aa64_tid3, because
tid{0,1,2} also need to be handled in a similar way, and we'll need little
helper functions for those too.


r~



[RFC 2/3] virtio-net: add support for offloading XDP program

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This feature involves offloading of XDP program and ebpf map from
the guest to the host. This patch takes care of offloadin of program.

A handler for VIRTIO_NET_CTRL_EBPF command is added in virtio-net.
The control buffer consist of struct virtio_net_ctrl_ebpf_prog and
followed by an ebpf program instructions. An array of bpf_insn is
prepared and passed to libbpf API bpf_load_program. The program fd is
retuned by the API is then attached to tap fd using TUNSETOFFLOADEDXDP
ioctl command.

Signed-off-by: Jason Wang 
Co-developed-by: Prashant Bhole 
Signed-off-by: Prashant Bhole 
---
 hw/net/virtio-net.c | 69 +
 include/net/tap.h   |  2 +
 include/standard-headers/linux/virtio_net.h | 27 
 net/Makefile.objs   |  1 +
 net/tap-bsd.c   |  5 ++
 net/tap-linux.c | 48 ++
 net/tap-linux.h |  1 +
 net/tap-solaris.c   |  5 ++
 net/tap-stub.c  |  5 ++
 net/tap.c   |  7 +++
 net/tap_int.h   |  1 +
 11 files changed, 171 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 97a5113f7e..7cc1bd1654 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -43,6 +43,11 @@
 #include "monitor/qdev.h"
 #include "hw/pci/pci.h"
 
+#ifdef CONFIG_LIBBPF
+#include 
+#include 
+#endif
+
 #define VIRTIO_NET_VM_VERSION11
 
 #define MAC_TABLE_ENTRIES64
@@ -628,6 +633,21 @@ static int peer_attach(VirtIONet *n, int index)
 return tap_enable(nc->peer);
 }
 
+static int peer_attach_ebpf(VirtIONet *n, int len, void *insns, uint8_t gpl)
+{
+NetClientState *nc = qemu_get_subqueue(n->nic, 0);
+
+if (!nc->peer) {
+return 0;
+}
+
+if (nc->peer->info->type != NET_CLIENT_DRIVER_TAP) {
+return 0;
+}
+
+return tap_attach_ebpf(nc->peer, len, insns, gpl);
+}
+
 static int peer_detach(VirtIONet *n, int index)
 {
 NetClientState *nc = qemu_get_subqueue(n->nic, index);
@@ -991,6 +1011,53 @@ static int virtio_net_handle_offloads(VirtIONet *n, 
uint8_t cmd,
 }
 }
 
+static int virtio_net_handle_ebpf_prog(VirtIONet *n, struct iovec *iov,
+   unsigned int iov_cnt)
+{
+#ifdef CONFIG_LIBBPF
+struct bpf_insn prog[4096];
+struct virtio_net_ctrl_ebpf_prog ctrl;
+size_t s;
+int err = VIRTIO_NET_ERR;
+
+s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
+if (s != sizeof(ctrl)) {
+error_report("Invalid ebpf prog control buffer");
+goto err;
+}
+
+if (ctrl.cmd == VIRTIO_NET_BPF_CMD_SET_OFFLOAD) {
+s = iov_to_buf(iov, iov_cnt, sizeof(ctrl), prog, sizeof(prog));
+if (s != ctrl.len) {
+error_report("Invalid ebpf prog control buffer");
+goto err;
+}
+
+err = peer_attach_ebpf(n, s, prog, ctrl.gpl_compatible);
+if (err) {
+error_report("Failed to attach XDP program");
+goto err;
+}
+} else if (ctrl.cmd == VIRTIO_NET_BPF_CMD_UNSET_OFFLOAD) {
+err = peer_attach_ebpf(n, 0, NULL, 0);
+}
+err:
+return err ? VIRTIO_NET_ERR : VIRTIO_NET_OK;
+#else
+return VIRTIO_NET_ERR;
+#endif
+}
+
+static int virtio_net_handle_ebpf(VirtIONet *n, uint8_t cmd,
+  struct iovec *iov, unsigned int iov_cnt)
+{
+if (cmd == VIRTIO_NET_CTRL_EBPF_PROG) {
+return virtio_net_handle_ebpf_prog(n, iov, iov_cnt);
+}
+
+return VIRTIO_NET_ERR;
+}
+
 static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
  struct iovec *iov, unsigned int iov_cnt)
 {
@@ -1208,6 +1275,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
 } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
 status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
+} else if (ctrl.class == VIRTIO_NET_CTRL_EBPF) {
+status = virtio_net_handle_ebpf(n, ctrl.cmd, iov, iov_cnt);
 }
 
 s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, 
sizeof(status));
diff --git a/include/net/tap.h b/include/net/tap.h
index 5d585515f9..19c507a1c2 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -33,6 +33,8 @@ int tap_disable(NetClientState *nc);
 
 int tap_get_fd(NetClientState *nc);
 
+int tap_attach_ebpf(NetClientState *nc, int len, void *insns, uint8_t gpl);
+
 struct vhost_net;
 struct vhost_net *tap_get_vhost_net(NetClientState *nc);
 
diff --git a/include/standard-headers/linux/virtio_net.h 
b/include/standard-headers/linux/virtio_net.h
index 260c3681d7..83292c81bc 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -261,

[RFC net-next 17/18] virtio_net: implment XDP map offload functionality

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This patch implements:
* Handling of BPF_OFFLOAD_MAP_ALLOC, BPF_OFFLOAD_MAP_FREE:
  Allocate driver specific map data structure. Set up struct
  virtio_net_ctrl_ebpf_map and send the control buffer to Qemu with
  class VIRTIO_NET_CTRL_EBPF, cmd VIRTIO_NET_CTRL_EBPF_MAP. The cmd
  in the control buffer is set to VIRTIO_NET_BPF_CMD_CREATE_MAP. The
  expected behavior from Qemu is that it should perform the action
  as per command and return the status (and map data). In case of
  create map command, Qemu should set the map_fd in the control buffer

* bpf_map_dev_ops operations:
  Common map operations are implemented with use of above mentioned
  struct virtio_net_ctrl_ebpf_map. This control buffer has space for
  storing: key + key or key + value.

* map_fd replacement in a copy of the program:
  Since map are created before the verification of program begins,
  we have map fds from the host side for each offloaded map when
  program verification begins. map fds in the copy of the program are
  replaced with map fds from host side. This copy of program is used
  for offloading.

Signed-off-by: Jason Wang 
Co-developed-by: Prashant Bhole 
Signed-off-by: Prashant Bhole 
---
 drivers/net/virtio_net.c| 241 
 include/uapi/linux/virtio_net.h |  23 +++
 2 files changed, 264 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index dddfbb2a2075..91a94b787c64 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -238,6 +238,7 @@ struct virtnet_info {
struct bpf_offload_dev *bpf_dev;
 
struct list_head bpf_bound_progs;
+   struct list_head map_list;
 };
 
 struct padded_vnet_hdr {
@@ -275,6 +276,13 @@ struct virtnet_bpf_bound_prog {
 
 #define VIRTNET_EA(extack, msg)NL_SET_ERR_MSG_MOD((extack), msg)
 
+struct virtnet_bpf_map {
+   struct bpf_offloaded_map *offmap;
+   struct virtnet_info *vi;
+   struct virtio_net_ctrl_ebpf_map *ctrl;
+   struct list_head list;
+};
+
 /* Converting between virtqueue no. and kernel tx/rx queue no.
  * 0:rx0 1:tx0 2:rx1 3:tx1 ... 2N:rxN 2N+1:txN 2N+2:cvq
  */
@@ -2528,6 +2536,19 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct netdev_bpf *bpf)
return err;
 }
 
+static struct virtnet_bpf_map *virtnet_get_bpf_map(struct virtnet_info *vi,
+  struct bpf_map *map)
+{
+   struct virtnet_bpf_map *virtnet_map;
+
+   list_for_each_entry(virtnet_map, &vi->map_list, list) {
+   if (&virtnet_map->offmap->map == map)
+   return virtnet_map;
+   }
+
+   return NULL;
+}
+
 static int virtnet_bpf_verify_insn(struct bpf_verifier_env *env, int insn_idx,
   int prev_insn)
 {
@@ -2623,11 +2644,194 @@ static int virtnet_xdp_set_offload(struct virtnet_info 
*vi,
return err;
 }
 
+static int virtnet_bpf_ctrl_map(struct bpf_offloaded_map *offmap,
+   int cmd, u8 *key, u8 *value, u64 flags,
+   u8 *out_key, u8 *out_value)
+{
+   struct virtio_net_ctrl_ebpf_map *ctrl;
+   struct virtnet_bpf_map *virtnet_map;
+   struct bpf_map *map = &offmap->map;
+   unsigned char *keyptr, *valptr;
+   struct virtnet_info *vi;
+   struct scatterlist sg;
+
+   virtnet_map = offmap->dev_priv;
+   vi = virtnet_map->vi;
+   ctrl = virtnet_map->ctrl;
+
+   keyptr = ctrl->buf;
+   valptr = ctrl->buf + ctrl->key_size;
+
+   if (key)
+   memcpy(keyptr, key, map->key_size);
+   if (value)
+   memcpy(valptr, value, map->value_size);
+
+   ctrl->cmd = cpu_to_virtio32(vi->vdev, cmd);
+   ctrl->flags = cpu_to_virtio64(vi->vdev, flags);
+
+   sg_init_one(&sg, ctrl, sizeof(*ctrl) + ctrl->buf_len);
+   if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_EBPF,
+ VIRTIO_NET_CTRL_EBPF_MAP,
+ &sg))
+   return -EFAULT;
+
+   if (out_key)
+   memcpy(out_key, valptr, map->key_size);
+   if (out_value)
+   memcpy(out_value, valptr, map->value_size);
+   return 0;
+}
+
+static int virtnet_bpf_map_update_entry(struct bpf_offloaded_map *offmap,
+   void *key, void *value, u64 flags)
+{
+   return virtnet_bpf_ctrl_map(offmap,
+   VIRTIO_NET_BPF_CMD_UPDATE_ELEM,
+   key, value, flags, NULL, NULL);
+}
+
+static int virtnet_bpf_map_delete_elem(struct bpf_offloaded_map *offmap,
+  void *key)
+{
+   return virtnet_bpf_ctrl_map(offmap,
+   VIRTIO_NET_BPF_CMD_DELETE_ELEM,
+   key, NULL, 0, NULL, NULL);
+}
+
+static int virtnet_bpf_map_lookup_entry(struct bpf_offloaded_map *offmap,
+  

[RFC net-next 10/18] tun: handle XDP_TX action of offloaded program

2019-11-26 Thread Prashant Bhole
When offloaded program returns XDP_TX, we need to inject the packet in
Rx path of tun. This patch injects such packets in Rx path using
tun_xdp_one.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tun.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 8d6cdd3e5139..084ca95358fe 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2249,7 +2249,13 @@ static u32 tun_do_xdp_offload(struct tun_struct *tun, 
struct tun_file *tfile,
case XDP_PASS:
break;
case XDP_TX:
-   /* fall through */
+   tpage.page = NULL;
+   tpage.count = 0;
+   tun_xdp_one(tun, tfile, &xdp, &flush, &tpage, false);
+   tun_put_page(&tpage);
+   if (flush)
+   xdp_do_flush_map();
+   break;
case XDP_REDIRECT:
/* fall through */
default:
-- 
2.20.1




[RFC net-next 16/18] bpf: export function __bpf_map_get

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

__bpf_map_get is necessary to get verify whether an fd corresponds
to a bpf map, without adding a refcount on that map. After exporting
it can be used by a kernel module.

Signed-off-by: Jason Wang 
Signed-off-by: Prashant Bhole 
---
 kernel/bpf/syscall.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e3461ec59570..e524ab1e7c64 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -737,6 +737,7 @@ struct bpf_map *__bpf_map_get(struct fd f)
 
return f.file->private_data;
 }
+EXPORT_SYMBOL(__bpf_map_get);
 
 void bpf_map_inc(struct bpf_map *map)
 {
-- 
2.20.1




[RFC net-next 18/18] virtio_net: restrict bpf helper calls from offloaded program

2019-11-26 Thread Prashant Bhole
Since we are offloading this program to the host, some of the helper
calls will not make sense. For example get_numa_node_id. Some helpers
can not be used because we don't handle them yet.

So let's allow a small set of helper calls for now.

Signed-off-by: Prashant Bhole 
---
 drivers/net/virtio_net.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 91a94b787c64..ab5be6b95bbd 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2549,6 +2549,25 @@ static struct virtnet_bpf_map 
*virtnet_get_bpf_map(struct virtnet_info *vi,
return NULL;
 }
 
+static int virtnet_bpf_check_helper_call(struct bpf_insn *insn)
+{
+   switch (insn->imm) {
+   case BPF_FUNC_map_lookup_elem:
+   case BPF_FUNC_map_update_elem:
+   case BPF_FUNC_map_delete_elem:
+   case BPF_FUNC_ktime_get_ns:
+   case BPF_FUNC_get_prandom_u32:
+   case BPF_FUNC_csum_update:
+   case BPF_FUNC_xdp_adjust_head:
+   case BPF_FUNC_xdp_adjust_meta:
+   case BPF_FUNC_xdp_adjust_tail:
+   case BPF_FUNC_strtol:
+   case BPF_FUNC_strtoul:
+   return 0;
+   }
+   return -EOPNOTSUPP;
+}
+
 static int virtnet_bpf_verify_insn(struct bpf_verifier_env *env, int insn_idx,
   int prev_insn)
 {
@@ -2830,6 +2849,7 @@ static int virtnet_bpf_verifier_setup(struct bpf_prog 
*prog)
struct virtnet_bpf_bound_prog *state;
struct virtnet_bpf_map *virtnet_map;
struct bpf_map *map;
+   u8 opcode, class;
struct fd mapfd;
int i, err = 0;
 
@@ -2846,6 +2866,16 @@ static int virtnet_bpf_verifier_setup(struct bpf_prog 
*prog)
for (i = 0; i < state->len; i++) {
struct bpf_insn *insn = &state->insnsi[i];
 
+   opcode = BPF_OP(insn->code);
+   class = BPF_CLASS(insn->code);
+
+   if ((class == BPF_JMP || class == BPF_JMP32) &&
+   opcode == BPF_CALL && insn->src_reg != BPF_PSEUDO_CALL) {
+   if (virtnet_bpf_check_helper_call(insn))
+   return -EOPNOTSUPP;
+   continue;
+   }
+
if (insn->code != (BPF_LD | BPF_IMM | BPF_DW))
continue;
 
-- 
2.20.1




[RFC net-next 03/18] net: core: export do_xdp_generic_core()

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

Let's export do_xdp_generic as a general purpose function. It will
just run XDP program on skb but will not handle XDP actions.

Signed-off-by: Jason Wang 
Signed-off-by: Prashant Bhole 
---
 include/linux/netdevice.h | 2 ++
 net/core/dev.c| 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9e6fb8524d91..2b6317ac9795 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3648,6 +3648,8 @@ static inline void dev_consume_skb_any(struct sk_buff 
*skb)
 
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
 int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb);
+u32 do_xdp_generic_core(struct sk_buff *skb, struct xdp_buff *xdp,
+   struct bpf_prog *xdp_prog);
 int netif_rx(struct sk_buff *skb);
 int netif_rx_ni(struct sk_buff *skb);
 int netif_receive_skb(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 5ae647b9914f..d97c3f35e047 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4461,9 +4461,8 @@ static struct netdev_rx_queue *netif_get_rxqueue(struct 
sk_buff *skb)
return rxqueue;
 }
 
-static u32 do_xdp_generic_core(struct sk_buff *skb,
-  struct xdp_buff *xdp,
-  struct bpf_prog *xdp_prog)
+u32 do_xdp_generic_core(struct sk_buff *skb, struct xdp_buff *xdp,
+   struct bpf_prog *xdp_prog)
 {
struct netdev_rx_queue *rxqueue;
void *orig_data, *orig_data_end;
@@ -4574,6 +4573,7 @@ static u32 do_xdp_generic_core(struct sk_buff *skb,
 
return act;
 }
+EXPORT_SYMBOL_GPL(do_xdp_generic_core);
 
 /* When doing generic XDP we have to bypass the qdisc layer and the
  * network taps in order to match in-driver-XDP behavior.
-- 
2.20.1




[RFC net-next 13/18] virtio_net: use XDP attachment helpers

2019-11-26 Thread Prashant Bhole
Next patches will introduce virtio_net XDP offloading. In that case
we need to manage offloaded and non-offload program. In order to make
it consistent this patch introduces use of XDP attachment helpers.

Signed-off-by: Prashant Bhole 
---
 drivers/net/virtio_net.c | 30 +++---
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c8bbb1b90c1c..cee5c2b15c62 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -229,6 +229,8 @@ struct virtnet_info {
struct failover *failover;
 
struct bpf_prog __rcu *xdp_prog;
+
+   struct xdp_attachment_info xdp;
 };
 
 struct padded_vnet_hdr {
@@ -2398,15 +2400,19 @@ static int virtnet_restore_guest_offloads(struct 
virtnet_info *vi)
return virtnet_set_guest_offloads(vi, offloads);
 }
 
-static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
-  struct netlink_ext_ack *extack)
+static int virtnet_xdp_set(struct net_device *dev, struct netdev_bpf *bpf)
 {
unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
+   struct netlink_ext_ack *extack = bpf->extack;
struct virtnet_info *vi = netdev_priv(dev);
+   struct bpf_prog *prog = bpf->prog;
struct bpf_prog *old_prog;
u16 xdp_qp = 0, curr_qp;
int i, err;
 
+   if (!xdp_attachment_flags_ok(&vi->xdp, bpf))
+   return -EBUSY;
+
if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)
&& (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
@@ -2478,8 +2484,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct 
bpf_prog *prog,
}
}
 
-   if (old_prog)
-   bpf_prog_put(old_prog);
+   xdp_attachment_setup(&vi->xdp, bpf);
 
return 0;
 
@@ -2501,26 +2506,13 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
return err;
 }
 
-static u32 virtnet_xdp_query(struct net_device *dev)
-{
-   struct virtnet_info *vi = netdev_priv(dev);
-   const struct bpf_prog *xdp_prog;
-
-   xdp_prog = rtnl_dereference(vi->xdp_prog);
-   if (xdp_prog)
-   return xdp_prog->aux->id;
-
-   return 0;
-}
-
 static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
switch (xdp->command) {
case XDP_SETUP_PROG:
-   return virtnet_xdp_set(dev, xdp->prog, xdp->extack);
+   return virtnet_xdp_set(dev, xdp);
case XDP_QUERY_PROG:
-   xdp->prog_id = virtnet_xdp_query(dev);
-   return 0;
+   return xdp_attachment_query(&vi->xdp, xdp);
default:
return -EINVAL;
}
-- 
2.20.1




[RFC net-next 06/18] tuntap: remove usage of ptr ring in vhost_net

2019-11-26 Thread Prashant Bhole
Remove usage of ptr ring of tuntap in vhost_net and remove the
functions exported from tuntap drivers to get ptr ring.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tap.c  | 13 -
 drivers/net/tun.c  | 13 -
 drivers/vhost/net.c| 31 ---
 include/linux/if_tap.h |  5 -
 include/linux/if_tun.h |  5 -
 5 files changed, 4 insertions(+), 63 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 8635cdfd7aa4..6426501b8d0e 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1298,19 +1298,6 @@ struct socket *tap_get_socket(struct file *file)
 }
 EXPORT_SYMBOL_GPL(tap_get_socket);
 
-struct ptr_ring *tap_get_ptr_ring(struct file *file)
-{
-   struct tap_queue *q;
-
-   if (file->f_op != &tap_fops)
-   return ERR_PTR(-EINVAL);
-   q = file->private_data;
-   if (!q)
-   return ERR_PTR(-EBADFD);
-   return &q->ring;
-}
-EXPORT_SYMBOL_GPL(tap_get_ptr_ring);
-
 int tap_queue_resize(struct tap_dev *tap)
 {
struct net_device *dev = tap->dev;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4f28f2387435..d078b4659897 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -3750,19 +3750,6 @@ struct socket *tun_get_socket(struct file *file)
 }
 EXPORT_SYMBOL_GPL(tun_get_socket);
 
-struct ptr_ring *tun_get_tx_ring(struct file *file)
-{
-   struct tun_file *tfile;
-
-   if (file->f_op != &tun_fops)
-   return ERR_PTR(-EINVAL);
-   tfile = file->private_data;
-   if (!tfile)
-   return ERR_PTR(-EBADFD);
-   return &tfile->tx_ring;
-}
-EXPORT_SYMBOL_GPL(tun_get_tx_ring);
-
 module_init(tun_init);
 module_exit(tun_cleanup);
 MODULE_DESCRIPTION(DRV_DESCRIPTION);
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0f91b374a558..2e069d1ef946 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -122,7 +122,6 @@ struct vhost_net_virtqueue {
/* Reference counting for outstanding ubufs.
 * Protected by vq mutex. Writers must also take device mutex. */
struct vhost_net_ubuf_ref *ubufs;
-   struct ptr_ring *rx_ring;
struct vhost_net_buf rxq;
/* Batched XDP buffs */
struct xdp_buff *xdp;
@@ -997,8 +996,9 @@ static int peek_head_len(struct vhost_net_virtqueue *rvq, 
struct sock *sk)
int len = 0;
unsigned long flags;
 
-   if (rvq->rx_ring)
-   return vhost_net_buf_peek(rvq);
+   len = vhost_net_buf_peek(rvq);
+   if (len)
+   return len;
 
spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
head = skb_peek(&sk->sk_receive_queue);
@@ -1187,7 +1187,7 @@ static void handle_rx(struct vhost_net *net)
goto out;
}
busyloop_intr = false;
-   if (nvq->rx_ring) {
+   if (!vhost_net_buf_is_empty(&nvq->rxq)) {
ctl.type = TUN_MSG_PKT;
ctl.ptr = vhost_net_buf_consume(&nvq->rxq);
msg.msg_control = &ctl;
@@ -1343,7 +1343,6 @@ static int vhost_net_open(struct inode *inode, struct 
file *f)
n->vqs[i].batched_xdp = 0;
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
-   n->vqs[i].rx_ring = NULL;
vhost_net_buf_init(&n->vqs[i].rxq);
}
vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX,
@@ -1372,7 +1371,6 @@ static struct socket *vhost_net_stop_vq(struct vhost_net 
*n,
vhost_net_disable_vq(n, vq);
vhost_net_buf_unproduce(nvq);
vq->private_data = NULL;
-   nvq->rx_ring = NULL;
mutex_unlock(&vq->mutex);
return sock;
 }
@@ -1468,25 +1466,6 @@ static struct socket *get_raw_socket(int fd)
return ERR_PTR(r);
 }
 
-static struct ptr_ring *get_tap_ptr_ring(int fd)
-{
-   struct ptr_ring *ring;
-   struct file *file = fget(fd);
-
-   if (!file)
-   return NULL;
-   ring = tun_get_tx_ring(file);
-   if (!IS_ERR(ring))
-   goto out;
-   ring = tap_get_ptr_ring(file);
-   if (!IS_ERR(ring))
-   goto out;
-   ring = NULL;
-out:
-   fput(file);
-   return ring;
-}
-
 static struct socket *get_tap_socket(int fd)
 {
struct file *file = fget(fd);
@@ -1570,8 +1549,6 @@ static long vhost_net_set_backend(struct vhost_net *n, 
unsigned index, int fd)
r = vhost_net_enable_vq(n, vq);
if (r)
goto err_used;
-   if (index == VHOST_NET_VQ_RX)
-   nvq->rx_ring = get_tap_ptr_ring(fd);
 
oldubufs = nvq->ubufs;
nvq->ubufs = ubufs;
diff --git a/include/linux/if_tap.h b/include/linux/if_tap.h
index 915a187cfabd..68fe366fb185 100644
--- a/include/linux/if_tap.h
+++ b/include/linux/if_tap.h
@@ -4,7 +4,6 @@
 
 #if IS_ENABLED(CONFIG_TAP)
 struct socket *tap_get_socket(struct fi

[RFC 0/3] Qemu: virtio-net XDP offload

2019-11-26 Thread Prashant Bhole
Note: This RFC has been sent to netdev as well as qemu-devel lists

This patchset implements XDP offload feature in qemu. The successful
operation of this feature depends on availability of XDP offload
feature in guest, qemu and host. When this feature isn't available in
qemu or host, the request from guest to offload an XDP program should
fail.

Patch 1/3 adds support for libbpf in configure script.
Patch 2/2 enables offloading of ebpf program.
Patch 3/3 enabled offloading of ebpf map.

Points for improvement (TODO):
- In future virtio can have feature bit for offloading capability

- TUNGETFEATURES should have a flag to notify about offloading
  capability

- Submit virtio spec patch to describe XDP offloading feature

- DoS: Offloaded map uses host's memory which is other than what has
  been allocated for the guest. Offloading many maps of large size can
  be one of the DoS strategy. Hence qemu should have parameter to
  limit how many maps guest can offload or how much memory offloaded
  maps use.

Note:
This set directly modifies virtio_net.h header instead of
importing it from existing kernel headers because relevant changes
aren't present in kernel repository yet. Hence changes to virtio_net.h
are for RFC purpose only.


Jason Wang (2):
  virtio-net: add support for offloading XDP program
  virtio-net: add support for offloading an ebpf map

Prashant Bhole (1):
  configure: add libbpf support

 configure   |  23 +++
 hw/net/Makefile.objs|   2 +
 hw/net/virtio-net.c | 157 
 include/net/tap.h   |   2 +
 include/standard-headers/linux/virtio_net.h |  50 +++
 net/Makefile.objs   |   1 +
 net/tap-bsd.c   |   5 +
 net/tap-linux.c |  48 ++
 net/tap-linux.h |   1 +
 net/tap-solaris.c   |   5 +
 net/tap-stub.c  |   5 +
 net/tap.c   |   7 +
 net/tap_int.h   |   1 +
 13 files changed, 307 insertions(+)

-- 
2.20.1




[RFC net-next 02/18] net: core: rename netif_receive_generic_xdp() to do_generic_xdp_core()

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

In skb generic path, we need a way to run XDP program on skb but
to have customized handling of XDP actions. netif_receive_generic_xdp
will be more helpful in such cases than do_xdp_generic.

This patch prepares netif_receive_generic_xdp() to be used as general
purpose function by renaming it.

Signed-off-by: Jason Wang 
Signed-off-by: Prashant Bhole 
---
 net/core/dev.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c7fc902ccbdc..5ae647b9914f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4461,9 +4461,9 @@ static struct netdev_rx_queue *netif_get_rxqueue(struct 
sk_buff *skb)
return rxqueue;
 }
 
-static u32 netif_receive_generic_xdp(struct sk_buff *skb,
-struct xdp_buff *xdp,
-struct bpf_prog *xdp_prog)
+static u32 do_xdp_generic_core(struct sk_buff *skb,
+  struct xdp_buff *xdp,
+  struct bpf_prog *xdp_prog)
 {
struct netdev_rx_queue *rxqueue;
void *orig_data, *orig_data_end;
@@ -4610,7 +4610,7 @@ int do_xdp_generic(struct bpf_prog *xdp_prog, struct 
sk_buff *skb)
u32 act;
int err;
 
-   act = netif_receive_generic_xdp(skb, &xdp, xdp_prog);
+   act = do_xdp_generic_core(skb, &xdp, xdp_prog);
if (act != XDP_PASS) {
switch (act) {
case XDP_REDIRECT:
-- 
2.20.1




[RFC net-next 15/18] virtio_net: implement XDP prog offload functionality

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This patch implements bpf_prog_offload_ops callbacks and adds handling
for XDP_SETUP_PROG_HW. Handling of XDP_SETUP_PROG_HW involves setting
up struct virtio_net_ctrl_ebpf_prog and appending program instructions
to it. This control buffer is sent to Qemu with class
VIRTIO_NET_CTRL_EBPF and command VIRTIO_NET_BPF_CMD_SET_OFFLOAD.
The expected behavior from Qemu is that it should try to load the
program in host os and report the status.

It also adds restriction to have either driver or offloaded program
at a time. This restriction can be removed later after proper testing.

Signed-off-by: Jason Wang 
Co-developed-by: Prashant Bhole 
Signed-off-by: Prashant Bhole 
---
 drivers/net/virtio_net.c| 114 +++-
 include/uapi/linux/virtio_net.h |  27 
 2 files changed, 139 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a1088d0114f2..dddfbb2a2075 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -169,6 +169,7 @@ struct control_buf {
u8 allmulti;
__virtio16 vid;
__virtio64 offloads;
+   struct virtio_net_ctrl_ebpf_prog prog_ctrl;
 };
 
 struct virtnet_info {
@@ -272,6 +273,8 @@ struct virtnet_bpf_bound_prog {
struct bpf_insn insnsi[0];
 };
 
+#define VIRTNET_EA(extack, msg)NL_SET_ERR_MSG_MOD((extack), msg)
+
 /* Converting between virtqueue no. and kernel tx/rx queue no.
  * 0:rx0 1:tx0 2:rx1 3:tx1 ... 2N:rxN 2N+1:txN 2N+2:cvq
  */
@@ -2427,6 +2430,11 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct netdev_bpf *bpf)
if (!xdp_attachment_flags_ok(&vi->xdp, bpf))
return -EBUSY;
 
+   if (rtnl_dereference(vi->offload_xdp_prog)) {
+   VIRTNET_EA(bpf->extack, "program already attached in offload 
mode");
+   return -EINVAL;
+   }
+
if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)
&& (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
@@ -2528,17 +2536,114 @@ static int virtnet_bpf_verify_insn(struct 
bpf_verifier_env *env, int insn_idx,
 
 static void virtnet_bpf_destroy_prog(struct bpf_prog *prog)
 {
+   struct virtnet_bpf_bound_prog *state;
+
+   state = prog->aux->offload->dev_priv;
+   list_del(&state->list);
+   kfree(state);
+}
+
+static int virtnet_xdp_offload_check(struct virtnet_info *vi,
+struct netdev_bpf *bpf)
+{
+   if (!bpf->prog)
+   return 0;
+
+   if (!bpf->prog->aux->offload) {
+   VIRTNET_EA(bpf->extack, "xdpoffload of non-bound program");
+   return -EINVAL;
+   }
+   if (bpf->prog->aux->offload->netdev != vi->dev) {
+   VIRTNET_EA(bpf->extack, "program bound to different dev");
+   return -EINVAL;
+   }
+
+   if (rtnl_dereference(vi->xdp_prog)) {
+   VIRTNET_EA(bpf->extack, "program already attached in driver 
mode");
+   return -EINVAL;
+   }
+
+   return 0;
 }
 
 static int virtnet_xdp_set_offload(struct virtnet_info *vi,
   struct netdev_bpf *bpf)
 {
-   return -EBUSY;
+   struct virtio_net_ctrl_ebpf_prog *ctrl;
+   struct virtnet_bpf_bound_prog *bound_prog = NULL;
+   struct virtio_device *vdev = vi->vdev;
+   struct bpf_prog *prog = bpf->prog;
+   void *ctrl_buf = NULL;
+   struct scatterlist sg;
+   int prog_len;
+   int err = 0;
+
+   if (!xdp_attachment_flags_ok(&vi->xdp_hw, bpf))
+   return -EBUSY;
+
+   if (prog) {
+   if (prog->type != BPF_PROG_TYPE_XDP)
+   return -EOPNOTSUPP;
+   bound_prog = prog->aux->offload->dev_priv;
+   prog_len = prog->len * sizeof(bound_prog->insnsi[0]);
+
+   ctrl_buf = kmalloc(GFP_KERNEL, sizeof(*ctrl) + prog_len);
+   if (!ctrl_buf)
+   return -ENOMEM;
+   ctrl = ctrl_buf;
+   ctrl->cmd = cpu_to_virtio32(vi->vdev,
+   VIRTIO_NET_BPF_CMD_SET_OFFLOAD);
+   ctrl->len = cpu_to_virtio32(vi->vdev, prog_len);
+   ctrl->gpl_compatible = cpu_to_virtio16(vi->vdev,
+  prog->gpl_compatible);
+   memcpy(ctrl->insns, bound_prog->insnsi,
+  prog->len * sizeof(bound_prog->insnsi[0]));
+   sg_init_one(&sg, ctrl_buf, sizeof(*ctrl) + prog_len);
+   } else {
+   ctrl = &vi->ctrl->prog_ctrl;
+   ctrl->cmd  = cpu_to_virtio32(vi->vdev,
+VIRTIO_NET_BPF_CMD_UNSET_OFFLOAD);
+   sg_init_one(&sg, ctrl, sizeof(*ctrl));
+   }
+
+   if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_EBPF,
+  

[RFC net-next 14/18] virtio_net: add XDP prog offload infrastructure

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This patch prepares virtio_net of XDP offloading. It adds data
structures, blank callback implementations for bpf_prog_offload_ops.
It also implements ndo_init, ndo_uninit operations for setting up
offload related data structures.

Signed-off-by: Jason Wang 
Co-developed-by: Prashant Bhole 
Signed-off-by: Prashant Bhole 
---
 drivers/net/virtio_net.c | 103 +++
 1 file changed, 103 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index cee5c2b15c62..a1088d0114f2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -229,8 +229,14 @@ struct virtnet_info {
struct failover *failover;
 
struct bpf_prog __rcu *xdp_prog;
+   struct bpf_prog __rcu *offload_xdp_prog;
 
struct xdp_attachment_info xdp;
+   struct xdp_attachment_info xdp_hw;
+
+   struct bpf_offload_dev *bpf_dev;
+
+   struct list_head bpf_bound_progs;
 };
 
 struct padded_vnet_hdr {
@@ -258,6 +264,14 @@ static struct xdp_frame *ptr_to_xdp(void *ptr)
return (struct xdp_frame *)((unsigned long)ptr & ~VIRTIO_XDP_FLAG);
 }
 
+struct virtnet_bpf_bound_prog {
+   struct virtnet_info *vi;
+   struct bpf_prog *prog;
+   struct list_head list;
+   u32 len;
+   struct bpf_insn insnsi[0];
+};
+
 /* Converting between virtqueue no. and kernel tx/rx queue no.
  * 0:rx0 1:tx0 2:rx1 3:tx1 ... 2N:rxN 2N+1:txN 2N+2:cvq
  */
@@ -2506,13 +2520,63 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct netdev_bpf *bpf)
return err;
 }
 
+static int virtnet_bpf_verify_insn(struct bpf_verifier_env *env, int insn_idx,
+  int prev_insn)
+{
+   return 0;
+}
+
+static void virtnet_bpf_destroy_prog(struct bpf_prog *prog)
+{
+}
+
+static int virtnet_xdp_set_offload(struct virtnet_info *vi,
+  struct netdev_bpf *bpf)
+{
+   return -EBUSY;
+}
+
+static int virtnet_bpf_verifier_setup(struct bpf_prog *prog)
+{
+   return -ENOMEM;
+}
+
+static int virtnet_bpf_verifier_prep(struct bpf_prog *prog)
+{
+   return 0;
+}
+
+static int virtnet_bpf_translate(struct bpf_prog *prog)
+{
+   return 0;
+}
+
+static int virtnet_bpf_finalize(struct bpf_verifier_env *env)
+{
+   return 0;
+}
+
+static const struct bpf_prog_offload_ops virtnet_bpf_dev_ops = {
+   .setup  = virtnet_bpf_verifier_setup,
+   .prepare= virtnet_bpf_verifier_prep,
+   .insn_hook  = virtnet_bpf_verify_insn,
+   .finalize   = virtnet_bpf_finalize,
+   .translate  = virtnet_bpf_translate,
+   .destroy= virtnet_bpf_destroy_prog,
+};
+
 static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
+   struct virtnet_info *vi = netdev_priv(dev);
switch (xdp->command) {
case XDP_SETUP_PROG:
return virtnet_xdp_set(dev, xdp);
case XDP_QUERY_PROG:
return xdp_attachment_query(&vi->xdp, xdp);
+   case XDP_SETUP_PROG_HW:
+   return virtnet_xdp_set_offload(vi, xdp);
+   case XDP_QUERY_PROG_HW:
+   return xdp_attachment_query(&vi->xdp_hw, xdp);
default:
return -EINVAL;
}
@@ -2559,7 +2623,46 @@ static int virtnet_set_features(struct net_device *dev,
return 0;
 }
 
+static int virtnet_bpf_init(struct virtnet_info *vi)
+{
+   int err;
+
+   vi->bpf_dev = bpf_offload_dev_create(&virtnet_bpf_dev_ops, NULL);
+   err = PTR_ERR_OR_ZERO(vi->bpf_dev);
+   if (err)
+   return err;
+
+   err = bpf_offload_dev_netdev_register(vi->bpf_dev, vi->dev);
+   if (err)
+   goto err_netdev_register;
+
+   INIT_LIST_HEAD(&vi->bpf_bound_progs);
+
+   return 0;
+
+err_netdev_register:
+   bpf_offload_dev_destroy(vi->bpf_dev);
+   return err;
+}
+
+static int virtnet_init(struct net_device *dev)
+{
+   struct virtnet_info *vi = netdev_priv(dev);
+
+   return virtnet_bpf_init(vi);
+}
+
+static void virtnet_uninit(struct net_device *dev)
+{
+   struct virtnet_info *vi = netdev_priv(dev);
+
+   bpf_offload_dev_netdev_unregister(vi->bpf_dev, vi->dev);
+   bpf_offload_dev_destroy(vi->bpf_dev);
+}
+
 static const struct net_device_ops virtnet_netdev = {
+   .ndo_init= virtnet_init,
+   .ndo_uninit  = virtnet_uninit,
.ndo_open= virtnet_open,
.ndo_stop= virtnet_close,
.ndo_start_xmit  = start_xmit,
-- 
2.20.1




[RFC net-next 12/18] virtio-net: store xdp_prog in device

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This is a preparation for adding XDP offload support in virtio_net
driver. By storing XDP program in virtionet_info will make it
consistent with the offloaded program which will introduce in next
patches.

Signed-off-by: Jason Wang 
Co-developed-by: Prashant Bhole 
Signed-off-by: Prashant Bhole 
---
 drivers/net/virtio_net.c | 62 
 1 file changed, 25 insertions(+), 37 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 4d7d5434cc5d..c8bbb1b90c1c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -137,8 +137,6 @@ struct receive_queue {
 
struct napi_struct napi;
 
-   struct bpf_prog __rcu *xdp_prog;
-
struct virtnet_rq_stats stats;
 
/* Chain pages by the private ptr. */
@@ -229,6 +227,8 @@ struct virtnet_info {
 
/* failover when STANDBY feature enabled */
struct failover *failover;
+
+   struct bpf_prog __rcu *xdp_prog;
 };
 
 struct padded_vnet_hdr {
@@ -486,7 +486,6 @@ static int virtnet_xdp_xmit(struct net_device *dev,
int n, struct xdp_frame **frames, u32 flags)
 {
struct virtnet_info *vi = netdev_priv(dev);
-   struct receive_queue *rq = vi->rq;
struct bpf_prog *xdp_prog;
struct send_queue *sq;
unsigned int len;
@@ -501,7 +500,7 @@ static int virtnet_xdp_xmit(struct net_device *dev,
/* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this
 * indicate XDP resources have been successfully allocated.
 */
-   xdp_prog = rcu_dereference(rq->xdp_prog);
+   xdp_prog = rcu_dereference(vi->xdp_prog);
if (!xdp_prog)
return -ENXIO;
 
@@ -649,7 +648,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
stats->bytes += len;
 
rcu_read_lock();
-   xdp_prog = rcu_dereference(rq->xdp_prog);
+   xdp_prog = rcu_dereference(vi->xdp_prog);
if (xdp_prog) {
struct virtio_net_hdr_mrg_rxbuf *hdr = buf + header_offset;
struct xdp_frame *xdpf;
@@ -798,7 +797,7 @@ static struct sk_buff *receive_mergeable(struct net_device 
*dev,
stats->bytes += len - vi->hdr_len;
 
rcu_read_lock();
-   xdp_prog = rcu_dereference(rq->xdp_prog);
+   xdp_prog = rcu_dereference(vi->xdp_prog);
if (xdp_prog) {
struct xdp_frame *xdpf;
struct page *xdp_page;
@@ -2060,7 +2059,7 @@ static int virtnet_set_channels(struct net_device *dev,
 * also when XDP is loaded all RX queues have XDP programs so we only
 * need to check a single RX queue.
 */
-   if (vi->rq[0].xdp_prog)
+   if (vi->xdp_prog)
return -EINVAL;
 
get_online_cpus();
@@ -2441,13 +2440,10 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
return -ENOMEM;
}
 
-   old_prog = rtnl_dereference(vi->rq[0].xdp_prog);
+   old_prog = rtnl_dereference(vi->xdp_prog);
if (!prog && !old_prog)
return 0;
 
-   if (prog)
-   bpf_prog_add(prog, vi->max_queue_pairs - 1);
-
/* Make sure NAPI is not using any XDP TX queues for RX. */
if (netif_running(dev)) {
for (i = 0; i < vi->max_queue_pairs; i++) {
@@ -2457,11 +2453,8 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
}
 
if (!prog) {
-   for (i = 0; i < vi->max_queue_pairs; i++) {
-   rcu_assign_pointer(vi->rq[i].xdp_prog, prog);
-   if (i == 0)
-   virtnet_restore_guest_offloads(vi);
-   }
+   rcu_assign_pointer(vi->xdp_prog, prog);
+   virtnet_restore_guest_offloads(vi);
synchronize_net();
}
 
@@ -2472,16 +2465,12 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
vi->xdp_queue_pairs = xdp_qp;
 
if (prog) {
-   for (i = 0; i < vi->max_queue_pairs; i++) {
-   rcu_assign_pointer(vi->rq[i].xdp_prog, prog);
-   if (i == 0 && !old_prog)
-   virtnet_clear_guest_offloads(vi);
-   }
+   rcu_assign_pointer(vi->xdp_prog, prog);
+   if (!old_prog)
+   virtnet_clear_guest_offloads(vi);
}
 
for (i = 0; i < vi->max_queue_pairs; i++) {
-   if (old_prog)
-   bpf_prog_put(old_prog);
if (netif_running(dev)) {
virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
virtnet_napi_tx_enable(vi, vi->sq[i].vq,
@@ -2489,13 +2478,15 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
}
}
 
+   if (old_prog)
+   bpf_prog_put(old_prog);
+
return 0;

Re: [PATCH v17 03/14] util/cutils: refactor do_strtosz() to support suffixes list

2019-11-26 Thread Markus Armbruster
Eduardo Habkost  writes:

> On Mon, Nov 25, 2019 at 08:20:23AM +0100, Markus Armbruster wrote:
>> Tao Xu  writes:
>> 
>> > Add do_strtomul() to convert string according to different suffixes.
>> >
>> > Reviewed-by: Eduardo Habkost 
>> > Signed-off-by: Tao Xu 
>> 
>> What's the actual change here?  "Refactor" suggests the interfaces stay
>> the same, only their implementation changes.  "Support suffixes list"
>> suggests some interface supports something new.
>
> 1) Parameters added to suffix_mul() (suffix table);
> 2) do_strtomul() is being extracted from do_strtosz().
>
> do_strtomul() interface and behavior stays the same.

Alright, it's two related changes squashed together (which tends to
complicate writing good commit messages).  2) is really a refactoring.
1) is not: it makes suffix_mul() more flexible.  Summarizing 1) and 2)
as "refactor do_strtosz() to support suffixes list" is confusing,
because it's about 1), while the interesting part is actually 2).

Moreover, the commit message should state why these two changes are
useful.  It tries, but "to support suffixes list" merely kicks the can
down the road, because the reader is left to wonder why supporting
suffix lists is useful.  It's actually for use by the next patch.  So
say that.

I'll review the actual patch now.




[PATCH] MAINTAINERS: Update Yuval Shaia's email address

2019-11-26 Thread Yuval Shaia
Use gmail account for maintainer tasks.

Signed-off-by: Yuval Shaia 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e5e3e52d6..4297b54fcb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2640,7 +2640,7 @@ F: tests/test-replication.c
 F: docs/block-replication.txt
 
 PVRDMA
-M: Yuval Shaia 
+M: Yuval Shaia 
 M: Marcel Apfelbaum 
 S: Maintained
 F: hw/rdma/*
-- 
2.20.1




[RFC net-next 08/18] tun: run offloaded XDP program in Tx path

2019-11-26 Thread Prashant Bhole
run offloaded XDP program as soon as packet is removed from the ptr
ring. Since this is XDP in Tx path, the traditional handling of
XDP actions XDP_TX/REDIRECT isn't valid. For this reason we call
do_xdp_generic_core instead of do_xdp_generic. do_xdp_generic_core
just runs the program and leaves the action handling to us.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tun.c | 149 +-
 1 file changed, 146 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index ecb49101b0b5..466ea69f00ee 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -131,6 +131,7 @@ struct tap_filter {
 /* MAX_TAP_QUEUES 256 is chosen to allow rx/tx queues to be equal
  * to max number of VCPUs in guest. */
 #define MAX_TAP_QUEUES 256
+#define MAX_TAP_BATCH 64
 #define MAX_TAP_FLOWS  4096
 
 #define TUN_FLOW_EXPIRE (3 * HZ)
@@ -2156,6 +2157,109 @@ static ssize_t tun_put_user(struct tun_struct *tun,
return total;
 }
 
+static struct sk_buff *tun_prepare_xdp_skb(struct sk_buff *skb)
+{
+   struct sk_buff *nskb;
+
+   if (skb_shared(skb) || skb_cloned(skb)) {
+   nskb = skb_copy(skb, GFP_ATOMIC);
+   consume_skb(skb);
+   return nskb;
+   }
+
+   return skb;
+}
+
+static u32 tun_do_xdp_offload_generic(struct tun_struct *tun,
+ struct sk_buff *skb)
+{
+   struct tun_prog *xdp_prog;
+   struct xdp_buff xdp;
+   u32 act = XDP_PASS;
+
+   xdp_prog = rcu_dereference(tun->offloaded_xdp_prog);
+   if (xdp_prog) {
+   skb = tun_prepare_xdp_skb(skb);
+   if (!skb) {
+   act = XDP_DROP;
+   kfree_skb(skb);
+   goto drop;
+   }
+
+   act = do_xdp_generic_core(skb, &xdp, xdp_prog->prog);
+   switch (act) {
+   case XDP_TX:
+   /*
+* Rx path generic XDP will be called in this path
+*/
+   netif_receive_skb(skb);
+   break;
+   case XDP_PASS:
+   break;
+   case XDP_REDIRECT:
+   /*
+* Since we are not handling this case yet, let's free
+* skb here. In case of XDP_DROP/XDP_ABORTED, the skb
+* was already freed in do_xdp_generic_core()
+*/
+   kfree_skb(skb);
+   /* fall through */
+   default:
+   bpf_warn_invalid_xdp_action(act);
+   /* fall through */
+   case XDP_ABORTED:
+   trace_xdp_exception(tun->dev, xdp_prog->prog, act);
+   /* fall through */
+   case XDP_DROP:
+   goto drop;
+   }
+   }
+
+   return act;
+drop:
+   this_cpu_inc(tun->pcpu_stats->tx_dropped);
+   return act;
+}
+
+static u32 tun_do_xdp_offload(struct tun_struct *tun, struct tun_file *tfile,
+ struct xdp_frame *frame)
+{
+   struct tun_prog *xdp_prog;
+   struct tun_page tpage;
+   struct xdp_buff xdp;
+   u32 act = XDP_PASS;
+   int flush = 0;
+
+   xdp_prog = rcu_dereference(tun->offloaded_xdp_prog);
+   if (xdp_prog) {
+   xdp.data_hard_start = frame->data - frame->headroom;
+   xdp.data = frame->data;
+   xdp.data_end = xdp.data + frame->len;
+   xdp.data_meta = xdp.data - frame->metasize;
+
+   act = bpf_prog_run_xdp(xdp_prog->prog, &xdp);
+   switch (act) {
+   case XDP_PASS:
+   break;
+   case XDP_TX:
+   /* fall through */
+   case XDP_REDIRECT:
+   /* fall through */
+   default:
+   bpf_warn_invalid_xdp_action(act);
+   /* fall through */
+   case XDP_ABORTED:
+   trace_xdp_exception(tun->dev, xdp_prog->prog, act);
+   /* fall through */
+   case XDP_DROP:
+   xdp_return_frame_rx_napi(frame);
+   break;
+   }
+   }
+
+   return act;
+}
+
 static void *tun_ring_recv(struct tun_file *tfile, int noblock, int *err)
 {
DECLARE_WAITQUEUE(wait, current);
@@ -2574,6 +2678,47 @@ static int tun_sendmsg(struct socket *sock, struct 
msghdr *m, size_t total_len)
return ret;
 }
 
+static int tun_consume_packets(struct tun_file *tfile, void **ptr_array, int n)
+{
+   struct tun_prog *xdp_prog;
+   struct xdp_frame *frame;
+   struct tun_struct *tun;
+   int i, num_ptrs;
+   int pkt_cnt = 0;
+   void *pkts[MAX_TAP_BATCH];
+   void *ptr;
+   u32 act;
+
+   

[RFC net-next 09/18] tun: add a way to inject Tx path packet into Rx path

2019-11-26 Thread Prashant Bhole
In order to support XDP_TX from offloaded XDP program, we need a way
to inject Tx path packet into Rx path. Let's modify the Rx path
function tun_xdp_one() for this purpose.

This patch adds a parameter to pass information whether packet has
virtio_net header. When header isn't present, it is considered as
XDP_TX'ed packet from offloaded program.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tun.c | 35 ---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 466ea69f00ee..8d6cdd3e5139 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2221,6 +2221,13 @@ static u32 tun_do_xdp_offload_generic(struct tun_struct 
*tun,
return act;
 }
 
+static int tun_xdp_one(struct tun_struct *tun,
+  struct tun_file *tfile,
+  struct xdp_buff *xdp, int *flush,
+  struct tun_page *tpage, int has_hdr);
+
+static void tun_put_page(struct tun_page *tpage);
+
 static u32 tun_do_xdp_offload(struct tun_struct *tun, struct tun_file *tfile,
  struct xdp_frame *frame)
 {
@@ -2527,23 +2534,36 @@ static void tun_put_page(struct tun_page *tpage)
 static int tun_xdp_one(struct tun_struct *tun,
   struct tun_file *tfile,
   struct xdp_buff *xdp, int *flush,
-  struct tun_page *tpage)
+  struct tun_page *tpage, int has_hdr)
 {
unsigned int datasize = xdp->data_end - xdp->data;
-   struct tun_xdp_hdr *hdr = xdp->data_hard_start;
-   struct virtio_net_hdr *gso = &hdr->gso;
+   struct tun_xdp_hdr *hdr;
+   struct virtio_net_hdr *gso;
struct tun_pcpu_stats *stats;
struct bpf_prog *xdp_prog;
struct sk_buff *skb = NULL;
+   unsigned int headroom;
u32 rxhash = 0, act;
-   int buflen = hdr->buflen;
+   int buflen;
int err = 0;
bool skb_xdp = false;
struct page *page;
 
+   if (has_hdr) {
+   hdr = xdp->data_hard_start;
+   gso = &hdr->gso;
+   buflen = hdr->buflen;
+   } else {
+   /* came here from tun tx path */
+   xdp->data_hard_start -= sizeof(struct xdp_frame);
+   headroom = xdp->data - xdp->data_hard_start;
+   buflen = datasize + headroom +
+SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+   }
+
xdp_prog = rcu_dereference(tun->xdp_prog);
if (xdp_prog) {
-   if (gso->gso_type) {
+   if (has_hdr && gso->gso_type) {
skb_xdp = true;
goto build;
}
@@ -2588,7 +2608,8 @@ static int tun_xdp_one(struct tun_struct *tun,
skb_reserve(skb, xdp->data - xdp->data_hard_start);
skb_put(skb, xdp->data_end - xdp->data);
 
-   if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
+   if (has_hdr &&
+   virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
this_cpu_inc(tun->pcpu_stats->rx_frame_errors);
kfree_skb(skb);
err = -EINVAL;
@@ -2652,7 +2673,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr 
*m, size_t total_len)
 
for (i = 0; i < n; i++) {
xdp = &((struct xdp_buff *)ctl->ptr)[i];
-   tun_xdp_one(tun, tfile, xdp, &flush, &tpage);
+   tun_xdp_one(tun, tfile, xdp, &flush, &tpage, true);
}
 
if (flush)
-- 
2.20.1




Re: [PATCH] throttle-groups: fix memory leak in throttle_group_set_limits

2019-11-26 Thread Alberto Garcia
On Tue 26 Nov 2019 09:17:02 AM CET, pannengy...@huawei.com wrote:
> --- a/block/throttle-groups.c
> +++ b/block/throttle-groups.c
> @@ -912,6 +912,7 @@ static void throttle_group_set_limits(Object *obj, 
> Visitor *v,
>  unlock:
>  qemu_mutex_unlock(&tg->lock);
>  ret:
> +qapi_free_ThrottleLimits(argp);
>  error_propagate(errp, local_err);
>  return;

Thanks, but I also think that 'arg' is not used so it can be removed?

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 77014c741b..37695b0cd7 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -893,8 +893,7 @@ static void throttle_group_set_limits(Object *obj, Visitor 
*v,
 {
 ThrottleGroup *tg = THROTTLE_GROUP(obj);
 ThrottleConfig cfg;
-ThrottleLimits arg = { 0 };
-ThrottleLimits *argp = &arg;
+ThrottleLimits *argp;
 Error *local_err = NULL;
 
 visit_type_ThrottleLimits(v, name, &argp, &local_err);
@@ -912,6 +911,7 @@ static void throttle_group_set_limits(Object *obj, Visitor 
*v,
 unlock:
 qemu_mutex_unlock(&tg->lock);
 ret:
+qapi_free_ThrottleLimits(argp);
 error_propagate(errp, local_err);
 return;
 }

Berto



[RFC net-next 05/18] vhost_net: user tap recvmsg api to access ptr ring

2019-11-26 Thread Prashant Bhole
Currently vhost_net directly accesses ptr ring of tap driver to
fetch Rx packet pointers. In order to avoid it this patch modifies
tap driver's recvmsg api to do additional task of fetching Rx packet
pointers.

A special struct tun_msg_ctl is already being passed via msg_control
for tun Rx XDP batching. This patch extends tun_msg_ctl usage to
send sub commands to recvmsg api. Now tun_recvmsg will handle commands
to consume and unconsume packet pointers from ptr ring.

This will be useful in implementation of virtio-net XDP offload
feature, where packets will be XDP processed before they are passed
to vhost_net.

Signed-off-by: Prashant Bhole 
---
 drivers/net/tap.c  | 22 ++-
 drivers/net/tun.c  | 24 -
 drivers/vhost/net.c| 48 +++---
 include/linux/if_tun.h | 18 
 4 files changed, 98 insertions(+), 14 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 4df7bf00af66..8635cdfd7aa4 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1234,8 +1234,28 @@ static int tap_recvmsg(struct socket *sock, struct 
msghdr *m,
   size_t total_len, int flags)
 {
struct tap_queue *q = container_of(sock, struct tap_queue, sock);
-   struct sk_buff *skb = m->msg_control;
+   struct tun_msg_ctl *ctl = m->msg_control;
+   struct sk_buff *skb = NULL;
int ret;
+
+   if (ctl) {
+   switch (ctl->type) {
+   case TUN_MSG_PKT:
+   skb = ctl->ptr;
+   break;
+   case TUN_MSG_CONSUME_PKTS:
+   return ptr_ring_consume_batched(&q->ring,
+   ctl->ptr,
+   ctl->num);
+   case TUN_MSG_UNCONSUME_PKTS:
+   ptr_ring_unconsume(&q->ring, ctl->ptr, ctl->num,
+  tun_ptr_free);
+   return 0;
+   default:
+   return -EINVAL;
+   }
+   }
+
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC)) {
kfree_skb(skb);
return -EINVAL;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 1e436d9ec4e1..4f28f2387435 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2577,7 +2577,8 @@ static int tun_recvmsg(struct socket *sock, struct msghdr 
*m, size_t total_len,
 {
struct tun_file *tfile = container_of(sock, struct tun_file, socket);
struct tun_struct *tun = tun_get(tfile);
-   void *ptr = m->msg_control;
+   struct tun_msg_ctl *ctl = m->msg_control;
+   void *ptr = NULL;
int ret;
 
if (!tun) {
@@ -2585,6 +2586,27 @@ static int tun_recvmsg(struct socket *sock, struct 
msghdr *m, size_t total_len,
goto out_free;
}
 
+   if (ctl) {
+   switch (ctl->type) {
+   case TUN_MSG_PKT:
+   ptr = ctl->ptr;
+   break;
+   case TUN_MSG_CONSUME_PKTS:
+   ret = ptr_ring_consume_batched(&tfile->tx_ring,
+  ctl->ptr,
+  ctl->num);
+   goto out;
+   case TUN_MSG_UNCONSUME_PKTS:
+   ptr_ring_unconsume(&tfile->tx_ring, ctl->ptr,
+  ctl->num, tun_ptr_free);
+   ret = 0;
+   goto out;
+   default:
+   ret = -EINVAL;
+   goto out_put_tun;
+   }
+   }
+
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC|MSG_ERRQUEUE)) {
ret = -EINVAL;
goto out_put_tun;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1a2dd53caade..0f91b374a558 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -175,24 +175,44 @@ static void *vhost_net_buf_consume(struct vhost_net_buf 
*rxq)
 
 static int vhost_net_buf_produce(struct vhost_net_virtqueue *nvq)
 {
+   struct vhost_virtqueue *vq = &nvq->vq;
+   struct socket *sock = vq->private_data;
struct vhost_net_buf *rxq = &nvq->rxq;
+   struct tun_msg_ctl ctl = {
+   .type = TUN_MSG_CONSUME_PKTS,
+   .ptr = (void *) rxq->queue,
+   .num = VHOST_NET_BATCH,
+   };
+   struct msghdr msg = {
+   .msg_control = &ctl,
+   };
 
rxq->head = 0;
-   rxq->tail = ptr_ring_consume_batched(nvq->rx_ring, rxq->queue,
- VHOST_NET_BATCH);
+   rxq->tail = sock->ops->recvmsg(sock, &msg, 0, 0);
+   if (WARN_ON_ONCE(rxq->tail < 0))
+   rxq->tail = 0;
+
return rxq->tail;
 }
 
 static void vhost_net_buf_unproduce(struct vhost_net_virtqueue *nvq)
 {
+   struct vhost_virtqu

[RFC net-next 07/18] tun: set offloaded xdp program

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

This patch introduces an ioctl way to set an offloaded XDP program
to tun driver. This ioctl will be used by qemu to offload XDP program
from virtio_net in the guest.

Signed-off-by: Jason Wang 
Signed-off-by: Prashant Bhole 
---
 drivers/net/tun.c   | 19 ++-
 include/uapi/linux/if_tun.h |  1 +
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d078b4659897..ecb49101b0b5 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -241,6 +241,7 @@ struct tun_struct {
struct bpf_prog __rcu *xdp_prog;
struct tun_prog __rcu *steering_prog;
struct tun_prog __rcu *filter_prog;
+   struct tun_prog __rcu *offloaded_xdp_prog;
struct ethtool_link_ksettings link_ksettings;
 };
 
@@ -2256,7 +2257,7 @@ static void tun_prog_free(struct rcu_head *rcu)
 {
struct tun_prog *prog = container_of(rcu, struct tun_prog, rcu);
 
-   bpf_prog_destroy(prog->prog);
+   bpf_prog_put(prog->prog);
kfree(prog);
 }
 
@@ -2301,6 +2302,7 @@ static void tun_free_netdev(struct net_device *dev)
security_tun_dev_free_security(tun->security);
__tun_set_ebpf(tun, &tun->steering_prog, NULL);
__tun_set_ebpf(tun, &tun->filter_prog, NULL);
+   __tun_set_ebpf(tun, &tun->offloaded_xdp_prog, NULL);
 }
 
 static void tun_setup(struct net_device *dev)
@@ -3036,7 +3038,7 @@ static int tun_set_queue(struct file *file, struct ifreq 
*ifr)
 }
 
 static int tun_set_ebpf(struct tun_struct *tun, struct tun_prog **prog_p,
-   void __user *data)
+   void __user *data, int type)
 {
struct bpf_prog *prog;
int fd;
@@ -3047,7 +3049,7 @@ static int tun_set_ebpf(struct tun_struct *tun, struct 
tun_prog **prog_p,
if (fd == -1) {
prog = NULL;
} else {
-   prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
+   prog = bpf_prog_get_type(fd, type);
if (IS_ERR(prog))
return PTR_ERR(prog);
}
@@ -3345,11 +3347,18 @@ static long __tun_chr_ioctl(struct file *file, unsigned 
int cmd,
break;
 
case TUNSETSTEERINGEBPF:
-   ret = tun_set_ebpf(tun, &tun->steering_prog, argp);
+   ret = tun_set_ebpf(tun, &tun->steering_prog, argp,
+  BPF_PROG_TYPE_SOCKET_FILTER);
break;
 
case TUNSETFILTEREBPF:
-   ret = tun_set_ebpf(tun, &tun->filter_prog, argp);
+   ret = tun_set_ebpf(tun, &tun->filter_prog, argp,
+  BPF_PROG_TYPE_SOCKET_FILTER);
+   break;
+
+   case TUNSETOFFLOADEDXDP:
+   ret = tun_set_ebpf(tun, &tun->offloaded_xdp_prog, argp,
+  BPF_PROG_TYPE_XDP);
break;
 
case TUNSETCARRIER:
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 454ae31b93c7..21dbd8db2401 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -61,6 +61,7 @@
 #define TUNSETFILTEREBPF _IOR('T', 225, int)
 #define TUNSETCARRIER _IOW('T', 226, int)
 #define TUNGETDEVNETNS _IO('T', 227)
+#define TUNSETOFFLOADEDXDP _IOW('T', 228, int)
 
 /* TUNSETIFF ifr flags */
 #define IFF_TUN0x0001
-- 
2.20.1




Re: [PATCH v17 03/14] util/cutils: refactor do_strtosz() to support suffixes list

2019-11-26 Thread Markus Armbruster
Tao Xu  writes:

> Add do_strtomul() to convert string according to different suffixes.
>
> Reviewed-by: Eduardo Habkost 
> Signed-off-by: Tao Xu 
> ---
>
> No changes in v17.
>
> Changes in v15:
> - Add a new patch to refactor do_strtosz() (Eduardo)
> ---
>  util/cutils.c | 72 ++-
>  1 file changed, 42 insertions(+), 30 deletions(-)
>
> diff --git a/util/cutils.c b/util/cutils.c
> index d94a468954..ffef92338a 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -181,41 +181,37 @@ int fcntl_setfl(int fd, int flag)
>  }
>  #endif
>  
> -static int64_t suffix_mul(char suffix, int64_t unit)
> +static int64_t suffix_mul(const char *suffixes[], int num_suffix,
> +  const char *endptr, int *offset, int64_t unit)

This function could really use a contract comment now.

>  {
> -switch (qemu_toupper(suffix)) {
> -case 'B':
> -return 1;
> -case 'K':
> -return unit;
> -case 'M':
> -return unit * unit;
> -case 'G':
> -return unit * unit * unit;
> -case 'T':
> -return unit * unit * unit * unit;
> -case 'P':
> -return unit * unit * unit * unit * unit;
> -case 'E':
> -return unit * unit * unit * unit * unit * unit;
> +int i, suffix_len;
> +int64_t mul = 1;
> +
> +for (i = 0; i < num_suffix; i++) {

Aha: @num_suffix is the number of elements in suffixes[].

I'd put a terminating NULL into suffixes[] and dispense with
@num_suffix:

   for (i = 0; suffix[i]; i++) {

> +suffix_len = strlen(suffixes[i]);

@suffix_len should be size_t, because that's what strlen() returns.

> +if (g_ascii_strncasecmp(suffixes[i], endptr, suffix_len) == 0) {

@endptr is a terrible name: it points to the *beginning* of the suffix.

> +*offset = suffix_len;

@offset is a terrible name: it provides no clue whatsoever on its
meaning.  It's actually for receiving the length of the suffix.

Please replace it by char **end, because that's how the related
functions work.

> +return mul;
> +}
> +mul *= unit;
>  }
> +
>  return -1;
>  }
>  
>  /*
> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
> - * other error.
> + * Convert string according to different suffixes. End pointer will be 
> returned
> + * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on other 
> error.
>   */
> -static int do_strtosz(const char *nptr, const char **end,
> -  const char default_suffix, int64_t unit,
> +static int do_strtomul(const char *nptr, const char **end,
> +   const char *suffixes[], int num_suffix,
> +   const char *default_suffix, int64_t unit,
>uint64_t *result)

The function comment is barely adequate before your patch: it doesn't
explain the arguments.  Your patch adds more arguments, thus more
guesswork for the reader.

Again, I'd put a terminating NULL into suffixes[] and dispense with
@num_suffix.

>  {
>  int retval;
>  const char *endptr;
> -unsigned char c;
>  int mul_required = 0;
> +int offset = 0;
>  long double val, mul, integral, fraction;
>  
>  retval = qemu_strtold_finite(nptr, &endptr, &val);
> @@ -226,12 +222,12 @@ static int do_strtosz(const char *nptr, const char 
> **end,
>  if (fraction != 0) {
>  mul_required = 1;
>  }
> -c = *endptr;
> -mul = suffix_mul(c, unit);
> +
> +mul = suffix_mul(suffixes, num_suffix, endptr, &offset, unit);

@endptr points behind the number, i.e. to the suffix, and @offset is the
length of the suffix.  Suggest to rename @endptr to @suffix, and replace
@offset by @endptr.

>  if (mul >= 0) {
> -endptr++;
> +endptr += offset;
>  } else {
> -mul = suffix_mul(default_suffix, unit);
> +mul = suffix_mul(suffixes, num_suffix, default_suffix, &offset, 
> unit);
>  assert(mul >= 0);

Please assert "no trailing crap in @default_suffix".

>  }
>  if (mul == 1 && mul_required) {
> @@ -256,19 +252,35 @@ out:
>  return retval;
>  }
>  
> +/*
> + * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
> + * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
> + * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
> + * other error.
> + */
> +static int do_strtosz(const char *nptr, const char **end,
> +  const char *default_suffix, int64_t unit,
> +  uint64_t *result)
> +{
> +static const char *suffixes[] = { "B", "K", "M", "G", "T", "P", "E" };
> +
> +return do_strtomul(nptr, end, suffixes, ARRAY_SIZE(suffixes),
> +   default_suffix, unit, result);
> +}
> +
>  int qemu_strtosz(const char *nptr, const char *

[RFC net-next 00/18] virtio_net XDP offload

2019-11-26 Thread Prashant Bhole
Note: This RFC has been sent to netdev as well as qemu-devel lists

This series introduces XDP offloading from virtio_net. It is based on
the following work by Jason Wang:
https://netdevconf.info/0x13/session.html?xdp-offload-with-virtio-net

Current XDP performance in virtio-net is far from what we can achieve
on host. Several major factors cause the difference:
- Cost of virtualization
- Cost of virtio (populating virtqueue and context switching)
- Cost of vhost, it needs more optimization
- Cost of data copy
Because of above reasons there is a need of offloading XDP program to
host. This set is an attempt to implement XDP offload from the guest.


* High level design:

virtio_net exposes itself as offload capable device and works as a
transport of commands to load the program on the host. When offload is
requested, it sends the program to Qemu. Qemu then loads the program
and attaches to corresponding tap device. Similarly virtio_net sends
control commands to create and control maps. tap device runs the XDP
prog in its Tx path. The fast datapath remains on host whereas slow
path in which user program reads/updates map values remains in the
guest.

When offloading to actual hardware the program needs to be translated
and JITed for the target hardware. In case of offloading from guest
we pass almost raw program to the host. The verifier on the host
verifies the offloaded program.


* Implementation in Kernel


virtio_net
==
Creates bpf offload device and registers as offload capable device.
It also implements bpf_map_dev_ops to handle the offloaded map. A new
command structure is defined to communicate with qemu.

Map offload:
- In offload sequence maps are always offloaded before the program. In
  map offloading stage, virtio_net sends control commands to qemu to
  create a map and return a map fd which is valid on host. This fd is
  stored in driver specific map structure. A list of such maps is
  maintained.

- Currently BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH are supported.
  Offloading a per cpu array from guest to host doesn't make sense.

Program offload:
- In general the verifier in the guest replaces map fds in the user
  submitted programs with map pointers then bpf_prog_offload_ops
  callbacks are called.

- This set introduces new program offload callback 'setup()' which
  verifier calls before replacing map fds with map pointers. This way
  virtio_net can create a copy of the program with guest map fds. It
  was needed because virtio_net wants to derive driver specific map
  data from guest map fd. Then guest map fd will be replaced with
  host map fd in the copy of the program, hence the copy of the
  program which will be submitted to the host will have valid host map
  fds.

- Alternatively if we can move the prep() call in the verifier before
  map fd replacement happens, there is not need to introduce 'setup()'
  callback.

- As per current implementation of 'setup()' callback in virtio_net,
  it verifies full program for allowed helper functions and performs
  above mentioned map fd replacement.

- A list of allowed helper function is maintained and it is currently
  experimental, it will be updated later as per need. Using this
  list we can filter out most non-XDP type programs to some extent.
  Also we prevent the guest from collecting host specific information
  by not allowing some helper calls.

- XDP_PROG_SETUP_HW is called after successful program verification.
  In this call a control buffer is prepared, program instructions are
  appended to the buffer and it is sent to qemu.

tun
===
This set makes changes in tun to run XDP prog in Tx path. It will be
the offloaded program from the guest. This program can be set using
tun ioctl interface. There were multiple places where this program can
be executed.
- tun_net_xmit
- tun_xdp_xmit
- tun_recvmsg
tun_recvmsg was chosen because it runs in process context. The other
two run in bh context. Running in process context helps in setting up
service chaining using XDP redirect.

XDP_REDIRECT action of offloaded program isn't handled. It is because
target interface's ndo_xdp_xmit is called when we redirect a packet.
In offload case the target interface will be some tap interface. Any
packet redirected towards it will sent back to the guest, which is not
what we expect. Such redirect will need special handling in the kernel

XDP_TX action of offloaded program is handled. Packet is injected into
the Rx path in this case. Care is taken such that the tap's native Rx
path XDP will be executed in such case.


* Implementation in Qemu

Qemu is modified to handle handle control commands from the guest.
When program offload command is received, it loads the program in the
host OS and attaches program fd to tap device. All the program and map
operations are performed using libbpf APIs.


* Performance numbers

Single flow tests were performed. The diagram below shows the setup.
xdp1 and xdp2 sample programs were modified to use BPF_MAP_TYP

Re: [PATCH for-4.2 0/2] Fix bitmap migration

2019-11-26 Thread Max Reitz
On 25.11.19 13:52, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> We've faced a bug in rhev-2.12.0-33.el7-based Qemu.
> In upstream, bug introduced in 4.0 by 74da6b943565c45
> "block/dirty-bitmaps: implement inconsistent bit" commit.
> At this commit we started to load inconsistent bitmap instead of
> silently ignoring them, and it now I see that it breaks migration.
> 
> The fix is very simple, so I think it's OK for 4.2.. Still, it's not a
> degradation, so we may postpone it to 5.0.
> 
> Vladimir Sementsov-Ogievskiy (2):
>   block/qcow2-bitmap: fix bitmap migration
>   iotests: add new test cases to bitmap migration
> 
>  block/qcow2-bitmap.c   | 21 -
>  tests/qemu-iotests/169 | 22 +++---
>  tests/qemu-iotests/169.out |  4 ++--
>  3 files changed, 37 insertions(+), 10 deletions(-)

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max



signature.asc
Description: OpenPGP digital signature


[RFC net-next 01/18] bpf: introduce bpf_prog_offload_verifier_setup()

2019-11-26 Thread Prashant Bhole
From: Jason Wang 

Background:
This change was initiated from virtio_net XDP offload work. As per
the implementation plan, a copy of original program with map fds from
guest replaced with map fds from host needs to be offloaded to the
host. To implement this fd replacement, insn_hook() must provide an
insn with map fd intact. bpf_map and driver specific map data can be
derived from map_fd.

Since verifier calls all the offload callbacks after replacing map
fds, it was difficult to implement virtio_net XDP offload feature.
If virtio_net gets only one callback with original bpf program, it
will get a chance to perform the fd replacement in its own copy of the
program.

Solution:
Let's introduce a setup() callback in bpf_prog_offload_ops. It will be
non mandetory. The verifier will call it just before replacing the map
fds.

Signed-off-by: Jason Wang 
Signed-off-by: Prashant Bhole 
---
 include/linux/bpf.h  |  1 +
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/offload.c | 14 ++
 kernel/bpf/verifier.c|  6 ++
 4 files changed, 22 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 35903f148be5..1cdba120357c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -361,6 +361,7 @@ struct bpf_prog_offload_ops {
struct bpf_insn *insn);
int (*remove_insns)(struct bpf_verifier_env *env, u32 off, u32 cnt);
/* program management callbacks */
+   int (*setup)(struct bpf_prog *prog);
int (*prepare)(struct bpf_prog *prog);
int (*translate)(struct bpf_prog *prog);
void (*destroy)(struct bpf_prog *prog);
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 26e40de9ef55..de7028e17c0d 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -418,6 +418,7 @@ static inline struct bpf_reg_state *cur_regs(struct 
bpf_verifier_env *env)
return cur_func(env)->regs;
 }
 
+int bpf_prog_offload_verifier_setup(struct bpf_prog *prog);
 int bpf_prog_offload_verifier_prep(struct bpf_prog *prog);
 int bpf_prog_offload_verify_insn(struct bpf_verifier_env *env,
 int insn_idx, int prev_insn_idx);
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 5b9da0954a27..04ca7a31d947 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -124,6 +124,20 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union 
bpf_attr *attr)
return err;
 }
 
+int bpf_prog_offload_verifier_setup(struct bpf_prog *prog)
+{
+   struct bpf_prog_offload *offload;
+   int ret = 0;
+
+   down_read(&bpf_devs_lock);
+   offload = prog->aux->offload;
+   if (offload && offload->offdev->ops->setup)
+   ret = offload->offdev->ops->setup(prog);
+   up_read(&bpf_devs_lock);
+
+   return ret;
+}
+
 int bpf_prog_offload_verifier_prep(struct bpf_prog *prog)
 {
struct bpf_prog_offload *offload;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a0482e1c4a77..94b43542439e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9737,6 +9737,12 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr 
*attr,
 
env->allow_ptr_leaks = is_priv;
 
+   if (bpf_prog_is_dev_bound(env->prog->aux)) {
+   ret = bpf_prog_offload_verifier_setup(env->prog);
+   if (ret)
+   goto skip_full_check;
+   }
+
if (is_priv)
env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
 
-- 
2.20.1




Re: [RFC 0/3] Qemu: virtio-net XDP offload

2019-11-26 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20191126100914.5150-1-prashantbhole.li...@gmail.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  aarch64-softmmu/hw/arm/musicpal.o
  CC  aarch64-softmmu/hw/arm/netduino2.o
  CC  aarch64-softmmu/hw/arm/nseries.o
/tmp/qemu-test/src/hw/net/virtio-net.c:636:12: error: 'peer_attach_ebpf' 
defined but not used [-Werror=unused-function]
 static int peer_attach_ebpf(VirtIONet *n, int len, void *insns, uint8_t gpl)
^~~~
cc1: all warnings being treated as errors
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/net/virtio-net.o] Error 1
make[1]: *** Waiting for unfinished jobs
  CC  aarch64-softmmu/hw/arm/omap_sx1.o
  CC  aarch64-softmmu/hw/arm/palm.o
make: *** [Makefile:491: x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs
  CC  aarch64-softmmu/hw/arm/gumstix.o
  CC  aarch64-softmmu/hw/arm/spitz.o
---
  CC  aarch64-softmmu/gdbstub-xml.o
  CC  aarch64-softmmu/trace/generated-helpers.o
  CC  aarch64-softmmu/target/arm/translate-sve.o
/tmp/qemu-test/src/hw/net/virtio-net.c:636:12: error: 'peer_attach_ebpf' 
defined but not used [-Werror=unused-function]
 static int peer_attach_ebpf(VirtIONet *n, int len, void *insns, uint8_t gpl)
^~~~
cc1: all warnings being treated as errors
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/net/virtio-net.o] Error 1
make: *** [Makefile:491: aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in 
sys.exit(main())
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=5b6acc2ac7494ad6b59706c9dd26c3cd', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-4e2hnnd5/src/docker-src.2019-11-26-05.33.37.332:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=5b6acc2ac7494ad6b59706c9dd26c3cd
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-4e2hnnd5/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real2m24.967s
user0m8.999s


The full log is available at
http://patchew.org/logs/20191126100914.5150-1-prashantbhole.li...@gmail.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: virtiofsd: Where should it live?

2019-11-26 Thread Daniel P . Berrangé
On Mon, Nov 25, 2019 at 06:50:21PM +, Dr. David Alan Gilbert wrote:
> Hi,
>   There's been quite a bit of discussion about where virtiofsd, our
> implemenation of a virtiofs daemon, should live.  I'd like to get
> this settled now, because I'd like to tidy it up for the next
> qemu cycle.
> 
> For reference it's based on qemu's livhost-user+chunks of libfuse.
> It can't live in libfuse because we change enough of the library
> to break their ABI.  It's C, and we've got ~100 patches - which
> we can split into about 3 chunks.
> 
> Some suggestions so far:
>   a) In contrib
>  This is my current working assumption; the main objection is it's
>  a bit big and pulls in a chunk of libfuse.

My main objection to 'contrib/' is actually the perceived notions
about what the contrib directory is for. When I see 'contrib/'
code in either QEMU, or other open source projects, my general
impression is that this is largely unsupported code which is just
there as it might be interesting to someone, and doesn't typically
get much ongoing dev attention.

Parts that are fully supported & actively developed by projects
usually live elsewhere like a src/ or lib/ or tools/ directory.

This has kind of been the case with QEMU historically, with
the vhost-user-blk, vhost-user-scsi not being real production
quality implementations. Rather they are just technology demos
to show what you might do.   vhost-user-gpu/input blurred this
boundary a bit as they're more supported tools, and so I'd
argue contrib/ probably wasn't the right place for them either
in hindsight.

virtiofsd is definitely different as it is intended to be a
fully production quality supported tool with active dev into
the future IIUC.

IOW, if we did decide we want it in QEMU, then instead of
'$GIT/contrib/virtiofsd', I'd prefer to see '$GIT/virtiofsd'.

>   b) In a submodule
> 
>   c) Just separate
> 
> Your suggestions/ideas please.  My preference is (a).

What I'm wondering is just how much sharing / overlap of code and concepts
and community operation there is going otbetween QEMU and virtiofsd. From
the tech POV, IIUC, the main blocker it would need to be in QEMU is because
it links to libvhost-user and we've not declared that to be a stable API
for 3rd party linking.

Personally I'm always biased towards self-contained apps being in their
own repositories, rather than bundling too much stuff into one repo. You
can see that in the way we've created independant git repos for any libvirt
module that didn't need to be part of the main libvirt.git.

To me the key benefit this gives is flexibility in approach. ie the app
doesn't need to blindly follow every precedent that QEMU has set. It
can instead take the most appropriate path for its needs. For example...

It could use meson for its build system already. This would be good as
builds will be done in a matter of seconds. For contributors it would
be a much less daunting project to join as it wouldn't be lost in the
firehose of other non-virtiofsd contributions on qemu-devel.

It doesn't have to follow QEMU's 3-times a year release model, with 6
week long freeze periods. It can be more agile releasing 6 times a year
with 1 week freezes if desired, I personally think tihs would be quite
desirable for a young project like virtiofsd which is evolving rapidly
as it would get new work available to users much more rapidly.

It doesn't have to follow QEMU's API stability & deprecation policies.
It could be more flexible in taking non-compatible changes, which again
may be valuable for a young rapidly evolving app.



Anyway to be clear, I'm not a contributor to virtiofsd, nor likely to
be one in the future, so just consider this a personal POV. From QEMU's
POV I don't think it'll matter whether virtiofsd in or out of QEMU git.
It is more about the impact & burden QEMU's dev process & standards might
impose on virtiofsd itself.

I'm fine with whatever option above is chosen, with the only caveat
being that if its in qemu.git, I don't think it belongs under contrib/
it should be a top level dir of its own.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v17 03/14] util/cutils: refactor do_strtosz() to support suffixes list

2019-11-26 Thread Daniel P . Berrangé
On Tue, Nov 26, 2019 at 11:04:41AM +0100, Markus Armbruster wrote:
> Eduardo Habkost  writes:
> 
> > On Mon, Nov 25, 2019 at 08:20:23AM +0100, Markus Armbruster wrote:
> >> Tao Xu  writes:
> >> 
> >> > Add do_strtomul() to convert string according to different suffixes.
> >> >
> >> > Reviewed-by: Eduardo Habkost 
> >> > Signed-off-by: Tao Xu 
> >> 
> >> What's the actual change here?  "Refactor" suggests the interfaces stay
> >> the same, only their implementation changes.  "Support suffixes list"
> >> suggests some interface supports something new.
> >
> > 1) Parameters added to suffix_mul() (suffix table);
> > 2) do_strtomul() is being extracted from do_strtosz().
> >
> > do_strtomul() interface and behavior stays the same.
> 
> Alright, it's two related changes squashed together (which tends to
> complicate writing good commit messages).  2) is really a refactoring.
> 1) is not: it makes suffix_mul() more flexible.  Summarizing 1) and 2)
> as "refactor do_strtosz() to support suffixes list" is confusing,
> because it's about 1), while the interesting part is actually 2).
> 
> Moreover, the commit message should state why these two changes are
> useful.  It tries, but "to support suffixes list" merely kicks the can
> down the road, because the reader is left to wonder why supporting
> suffix lists is useful.  It's actually for use by the next patch.  So
> say that.

Test case additions to test-cutils.c would go a long way to illustrating
what is added & that its working correctly.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC 0/3] Qemu: virtio-net XDP offload

2019-11-26 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20191126100914.5150-1-prashantbhole.li...@gmail.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  ui/input-keymap.o
  CC  ui/input-legacy.o
  CC  ui/kbd-state.o
/tmp/qemu-test/src/net/tap-linux.c:34:21: fatal error: bpf/bpf.h: No such file 
or directory
 #include 
 ^
compilation terminated.
---
  SIGNpc-bios/optionrom/linuxboot.bin
  SIGNpc-bios/optionrom/kvmvapic.bin
  BUILD   pc-bios/optionrom/linuxboot_dma.img
make: *** [net/tap-linux.o] Error 1
make: *** Waiting for unfinished jobs
  BUILD   pc-bios/optionrom/pvh.img
  BUILD   pc-bios/optionrom/linuxboot_dma.raw
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=be849bfed02d4ea7b19f7746fe037bd5', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-3d2z3wl3/src/docker-src.2019-11-26-05.31.05.21708:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=be849bfed02d4ea7b19f7746fe037bd5
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-3d2z3wl3/src'
make: *** [docker-run-test-quick@centos7] Error 2

real1m56.447s
user0m8.519s


The full log is available at
http://patchew.org/logs/20191126100914.5150-1-prashantbhole.li...@gmail.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH] fence: introduce a file-based self-fence mechanism

2019-11-26 Thread Marc-André Lureau
On Mon, Nov 25, 2019 at 8:14 PM Felipe Franciosi  wrote:
>
> This introduces a self-fence mechanism to Qemu, causing it to die if a
> heartbeat condition is not met. Currently, a file-based heartbeat is
> available and can be configured as follows:
>
> -object file-fence,id=ff0,file=/foo,qtimeout=20,ktimeout=25,signal=kill
>
> Qemu will watch 'file' for attribute changes. Touching the file works as
> a heartbeat. This parameter is mandatory.
>
> Fencing happens after 'qtimeout' or 'ktimeout' seconds elapse without a
> heartbeat. At least one of these must be specified. Both may be used.
>
> When using 'qtimeout', an internal Qemu timer is used. Fencing with this
> method gives Qemu a chance to write a log message indicating which file
> caused the event. If Qemu's main loop is hung for whatever reason, this
> method won't successfully kill Qemu.
>
> When using 'ktimeout', a kernel timer is used. In this case, 'signal'
> can be 'kill' (for SIGKILL, default) or 'quit' (for SIGQUIT). Using
> SIGQUIT may be preferred for obtaining core dumps. If Qemu is hung
> (eg. uninterruptable sleep), this method won't successfully kill Qemu.
>
> It is worth noting that even successfully killing Qemu may not be
> sufficient to completely fence a VM as certain operations like network
> packets or block commands may be pending in the kernel. If that is a
> concern, systems should consider using further fencing mechanisms like
> hardware watchdogs either in addition or in conjunction with this for
> additional protection.
>
> Signed-off-by: Felipe Franciosi 
> ---
> Based-on: <20191125153619.39893-2-fel...@nutanix.com>
>
>  Makefile.objs   |   1 +
>  fence/Makefile.objs |   1 +
>  fence/file_fence.c  | 381 

I think it could be under backends/
And a slight preference for - seperated words in filenames over qemu codebase.

>  qemu-options.hx |  27 +++-
>  4 files changed, 409 insertions(+), 1 deletion(-)
>  create mode 100644 fence/Makefile.objs
>  create mode 100644 fence/file_fence.c
>
> diff --git a/Makefile.objs b/Makefile.objs
> index 11ba1a36bd..998eed4796 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -75,6 +75,7 @@ common-obj-$(CONFIG_TPM) += tpm.o
>
>  common-obj-y += backends/
>  common-obj-y += chardev/
> +common-obj-y += fence/
>
>  common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
>  qemu-seccomp.o-cflags := $(SECCOMP_CFLAGS)
> diff --git a/fence/Makefile.objs b/fence/Makefile.objs
> new file mode 100644
> index 00..2ed2092568
> --- /dev/null
> +++ b/fence/Makefile.objs
> @@ -0,0 +1 @@
> +common-obj-y += file_fence.o
> diff --git a/fence/file_fence.c b/fence/file_fence.c
> new file mode 100644
> index 00..5b743e69d2
> --- /dev/null
> +++ b/fence/file_fence.c
> @@ -0,0 +1,381 @@
> +/*
> + * QEMU file-based self-fence mechanism
> + *
> + * Copyright (c) 2019 Nutanix Inc. All rights reserved.
> + *
> + * Authors:
> + *Felipe Franciosi 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qom/object_interfaces.h"
> +#include "qemu/filemonitor.h"
> +#include "qemu/timer.h"
> +
> +#include 
> +
> +#define TYPE_FILE_FENCE "file-fence"
> +
> +typedef struct FileFence {
> +Object parent_obj;
> +
> +gchar *dir;
> +gchar *file;
> +uint32_t qtimeout;
> +uint32_t ktimeout;
> +int signal;
> +
> +timer_t ktimer;
> +QEMUTimer *qtimer;
> +
> +QFileMonitor *fm;
> +uint64_t id;
> +} FileFence;
> +
> +#define FILE_FENCE(obj) \
> +OBJECT_CHECK(FileFence, (obj), TYPE_FILE_FENCE)
> +
> +static void
> +timer_update(FileFence *ff)
> +{
> +struct itimerspec its = {
> +.it_value.tv_sec = ff->ktimeout,
> +};
> +int err;
> +
> +if (ff->qtimeout) {
> +timer_mod(ff->qtimer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> +  ff->qtimeout * 1000);
> +}
> +
> +if (ff->ktimeout) {
> +err = timer_settime(ff->ktimer, 0, &its, NULL);
> +g_assert(err == 0);
> +}
> +}
> +
> +static void
> +file_fence_abort_cb(void *opaque)
> +{
> +FileFence *ff = opaque;
> +printf("Fencing after %u seconds on '%s'\n",
> +   ff->qtimeout, g_strconcat(ff->dir, "/", ff->file, NULL));

May be error_printf() instead.

> +abort

Re: [EXTERNAL]Re: [PATCH v2 2/5] MAINTAINERS: Adjust maintainership for Fulong 2E board

2019-11-26 Thread BALATON Zoltan

On Mon, 25 Nov 2019, Philippe Mathieu-Daudé wrote:

Hi Zoltan,

On 11/14/19 2:50 PM, Philippe Mathieu-Daudé wrote:

---

But let's wait to see what Huacai Chen thinks of it, before posting it.


Aleksandar, can you stay as co-maintainer?

The patch would be:

-- 8< --
    Fulong 2E
+M: Philippe Mathieu-Daudé 
  M: Aleksandar Markovic 
-R: Aleksandar Rikalo 


You are a prolific contributor of the Fuloong, are you OK to be listed as 
reviewer in this section?


Aleksandar M. has already asked me but no thanks. I've only contributed to 
Fulong 2E because it happens to use the same or similar components that 
I'm interested in for PPC machines emulation (VIA superio for Pegasos2 and 
ati-vga) and it was a good way to cross check these with something else 
but otherwise I don't use the MIPS boards.


Regards,
BALATON Zoltan

Re: [RFC 0/8] ATI R300 emulated graphics card

2019-11-26 Thread Andrew Randrianasulu
Hello, Aaron!

While I can't help with coding (or even answering questions!)
I recall mr. Zoltan worked on emulating earlier ATI cards (r128/rv100)
and pointed me at this project:

https://github.com/xenia-project/xenia/tree/master/src/xenia/gpu
this basically about xbox 360, but GPU inside it supposed to be R500 alike.



Re: virtiofsd: Where should it live?

2019-11-26 Thread Dr. David Alan Gilbert
* Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> Hi David
> 
> On Mon, Nov 25, 2019 at 10:50 PM Dr. David Alan Gilbert
>  wrote:
> >
> > Hi,
> >   There's been quite a bit of discussion about where virtiofsd, our
> > implemenation of a virtiofs daemon, should live.  I'd like to get
> > this settled now, because I'd like to tidy it up for the next
> > qemu cycle.
> >
> > For reference it's based on qemu's livhost-user+chunks of libfuse.
> > It can't live in libfuse because we change enough of the library
> > to break their ABI.  It's C, and we've got ~100 patches - which
> > we can split into about 3 chunks.
> >
> > Some suggestions so far:
> >   a) In contrib
> >  This is my current working assumption; the main objection is it's
> >  a bit big and pulls in a chunk of libfuse
> >
> >   b) In a submodule
> >
> >   c) Just separate
> >
> > Your suggestions/ideas please.  My preference is (a).
> >
> 
> 
> It's more about code sharing and lifecycle.
> 
> The project started in a separate repository, and the proposed patches
> for qemu aren't a clean series. Reviewing it is harder than it should
> be, as we have to review/accept the whole thing.
> 
> As you said, it doesn't share much with qemu, but libvhost-user (which
> we could quite easily copy or make standalone/submodule).
> 
> Then it dumps code from libfuse that is questionnable (showing age)
> and often redundant with facilities provided by either glib, qemu
> utils etc.

The libfuse code is pretty much upto date.

> Is vhost-user-fs (the qemu device) going to have a strong relation
> with virtiofsd?
> Are we going to support different version of qemu and virtiofsd
> combination? I suppose we have to, as vhost-user protocol should allow
> that, and it's nice to allow other to experiment and implement it in
> different ways.
> If not, then perhaps we should think about introducing some version
> checking between qemu and external processes (with config_stamp,
> similar to modules).

It should support mismatched versions.
We do have at least two extensions over the base we're working on
(DAX and notification for blocking locks); I'd expect
the sets of these to be posted close together but not be required
to go in at the same time.

> From what I understand, I think c) would be fine. However, for
> convenience/testing reasons, b) would be my preference.

Dave

> -- 
> Marc-André Lureau
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [RFC 0/8] ATI R300 emulated graphics card

2019-11-26 Thread BALATON Zoltan

Hello,

On Tue, 26 Nov 2019, aaron.zakh...@gmail.com wrote:

From: Aaron Dominick 

Hello,
I thought of working on an emulated R300 GPU for QEMU video acceleration on 
vintage operating systems (Windows 9x-XP)


Good idea, :-) I very welcome any contribution to this.


The following patch series contains the initial QEMU device and some register 
read/write operations.


At first look you have pathces twice in the series (so each patch appears 
twice, I haven't checked if they are identical or different) and also the 
first (and 5th) patch adds a lot of probably unused files which makes 
review difficult. Could you please fix these and resend with only adding 
the headers really needed and each patch only once so we know what are the 
actual changes that are relevant?


Also is it possible to keep everything in ati-vga only adding another 
device model there rather than fully duplicating it as r300* ? Although 
for development it's probably OK to keep it separate but unless there are 
enough differences having a single file at the end may be better. (Also 
changes are clearer that way but if you have a patch only copying ati-vga 
files first then separate patches that changes it can be reviewed that way 
too.)


In short I think this series needs some cleanup first for us to be able to 
revies it better.



Testing it on an OpenSUSE Linux guest and the kernel correctly detects the card 
and loads the radeon DRM driver.


So I think this gets a bit further than my ati-vga rv100 which does not 
work with DRM yet. I've thought about targeting RV100 first then moving on 
to later radeons as those probably have more features and differences from 
R128 which is the starting point for ati-vga but if you need R300 for some 
reason specifically targeting that is OK too.



It gets as far as the CRTC probing before crashing with an error that there is 
not enough bandwidth.


Getting DRM to load is one thing but likely you'll need to implement 
microengine/command processor (also referred to as PM4 or programming mode 
4 sometimes) to get it fully working as that's how DRM and Windows ATI 
drivers likely send commands to the card. I've looked at it but couldn't 
find documentation on how the microengine works. We get a microcode 
uploaded by no idea how to run that. If we can't figure out how the 
microengine works anther approach could be what xenia.jp XBox 360 emulator 
does and directly parse the packets not using the microcode. It could be 
possible to copy code from Xenia for that but we need to convert C++ to C. 
The difficult part is probably figuring out how to run it in a different 
thread so the device emulation does not block the machine and it works in 
parallel like real hardware does.



I know next to nothing about hardware emulation and would like to know if what 
I have got so far is on the right track.


Are you aware of my project page for ATI VGA emulation here:

https://osdn.net/projects/qmiga/wiki/SubprojectAti

where I have collected some knowledge that I could gather so far? I have 
some experience in implementing devices in QEMU but know next to nothing 
about GPUs so hopefully you know more about those which is more needed for 
ati-vga. The QEMU knowledge can be picked up by looking at existing 
devices and asking here or on IRC (if you prefer that, I don't use it but 
I know some do).


Regards,
BALATON Zoltan



[PULL for 4.2 0/3] a few vm-test fixes

2019-11-26 Thread Alex Bennée
The following changes since commit 65e05c82bdc6d348155e301c9d87dba7a08a5701:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
(2019-11-25 15:47:44 +)

are available in the Git repository at:

  https://github.com/stsquad/qemu.git tags/pull-rc3-testing-261119-1

for you to fetch changes up to b3b9a433b0691b73c96d9ea4de67fe9c8ea293e8:

  tests/vm/ubuntu: update i386 image to 18.04 (2019-11-26 11:28:23 +)


A few vm-test updates

  - use Ubuntu 18.04 for i386 image
  - python3 for centos and docker
  - install locales for ubuntu


Alex Bennée (3):
  tests/vm/centos: fix centos build target
  tests/vm/ubuntu: include language pack to silence locale warnings
  tests/vm/ubuntu: update i386 image to 18.04

 tests/vm/centos  | 2 +-
 tests/vm/ubuntu.i386 | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.20.1




[PULL 2/3] tests/vm/ubuntu: include language pack to silence locale warnings

2019-11-26 Thread Alex Bennée
The iotests in particular don't like the output being spammed with
warnings about locales.

Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 

diff --git a/tests/vm/ubuntu.i386 b/tests/vm/ubuntu.i386
index 38f740eabf7..18b1ea2b72c 100755
--- a/tests/vm/ubuntu.i386
+++ b/tests/vm/ubuntu.i386
@@ -84,7 +84,7 @@ class UbuntuX86VM(basevm.BaseVM):
 self.ssh_root_check("sed -ie s/^#\ deb-src/deb-src/g 
/etc/apt/sources.list")
 self.ssh_root_check("apt-get update")
 self.ssh_root_check("apt-get build-dep -y qemu")
-self.ssh_root_check("apt-get install -y libfdt-dev flex bison")
+self.ssh_root_check("apt-get install -y libfdt-dev flex bison 
language-pack-en")
 self.ssh_root("poweroff")
 self.wait()
 os.rename(img_tmp, img)
-- 
2.20.1




Re: [PATCH 1/4] qom/object: enable setter for uint types

2019-11-26 Thread Felipe Franciosi
Heya

> On Nov 26, 2019, at 8:40 AM, Marc-André Lureau  
> wrote:
> 
> Hi
> 
> On Mon, Nov 25, 2019 at 7:40 PM Felipe Franciosi  wrote:
>> 
>> Traditionally, the uint-specific property helpers only offer getters.
>> When adding object (or class) uint types, one must therefore use the
>> generic property helper if a setter is needed.
>> 
>> This enhances the uint-specific property helper APIs by adding a
>> 'readonly' field and modifying all users of that API to set this
>> parameter to true. If 'readonly' is false, though, the helper will add
>> an automatic setter for the value.
>> 
>> Signed-off-by: Felipe Franciosi 
>> ---
>> hw/acpi/ich9.c   |   4 +-
>> hw/acpi/pcihp.c  |   6 +-
>> hw/acpi/piix4.c  |  12 ++--
>> hw/isa/lpc_ich9.c|   4 +-
>> hw/ppc/spapr_drc.c   |   2 +-
>> include/qom/object.h |  28 +---
>> qom/object.c | 152 ---
>> ui/console.c |   3 +-
>> 8 files changed, 163 insertions(+), 48 deletions(-)
>> 
>> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
>> index 2034dd749e..94dc5147ce 100644
>> --- a/hw/acpi/ich9.c
>> +++ b/hw/acpi/ich9.c
>> @@ -454,12 +454,12 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs 
>> *pm, Error **errp)
>> pm->s4_val = 2;
>> 
>> object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
>> -   &pm->pm_io_base, errp);
>> +   &pm->pm_io_base, true, errp);
>> object_property_add(obj, ACPI_PM_PROP_GPE0_BLK, "uint32",
>> ich9_pm_get_gpe0_blk,
>> NULL, NULL, pm, NULL);
>> object_property_add_uint32_ptr(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
>> -   &gpe0_len, errp);
>> +   &gpe0_len, true, errp);
>> object_property_add_bool(obj, "memory-hotplug-support",
>>  ich9_pm_get_memory_hotplug_support,
>>  ich9_pm_set_memory_hotplug_support,
>> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
>> index 8413348a33..70bc1408e6 100644
>> --- a/hw/acpi/pcihp.c
>> +++ b/hw/acpi/pcihp.c
>> @@ -80,7 +80,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
>> 
>> *bus_bsel = (*bsel_alloc)++;
>> object_property_add_uint32_ptr(OBJECT(bus), ACPI_PCIHP_PROP_BSEL,
>> -   bus_bsel, &error_abort);
>> +   bus_bsel, true, &error_abort);
>> }
>> 
>> return bsel_alloc;
>> @@ -373,9 +373,9 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *s, 
>> PCIBus *root_bus,
>> memory_region_add_subregion(address_space_io, s->io_base, &s->io);
>> 
>> object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_BASE_PROP, 
>> &s->io_base,
>> -   &error_abort);
>> +   true, &error_abort);
>> object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_LEN_PROP, &s->io_len,
>> -   &error_abort);
>> +   true, &error_abort);
>> }
>> 
>> const VMStateDescription vmstate_acpi_pcihp_pci_status = {
>> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
>> index 93aec2dd2c..032ba11e62 100644
>> --- a/hw/acpi/piix4.c
>> +++ b/hw/acpi/piix4.c
>> @@ -443,17 +443,17 @@ static void piix4_pm_add_propeties(PIIX4PMState *s)
>> static const uint16_t sci_int = 9;
>> 
>> object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_ENABLE_CMD,
>> -  &acpi_enable_cmd, NULL);
>> +  &acpi_enable_cmd, true, NULL);
>> object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_DISABLE_CMD,
>> -  &acpi_disable_cmd, NULL);
>> +  &acpi_disable_cmd, true, NULL);
>> object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK,
>> -  &gpe0_blk, NULL);
>> +  &gpe0_blk, true, NULL);
>> object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK_LEN,
>> -  &gpe0_blk_len, NULL);
>> +  &gpe0_blk_len, true, NULL);
>> object_property_add_uint16_ptr(OBJECT(s), ACPI_PM_PROP_SCI_INT,
>> -  &sci_int, NULL);
>> +  &sci_int, true, NULL);
>> object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_PM_IO_BASE,
>> -  &s->io_base, NULL);
>> +  &s->io_base, true, NULL);
>> }
>> 
>> static void piix4_pm_realize(PCIDevice *dev, Error **errp)
>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>> index 17c292e306..ce3342 100644
>> --- a/hw/isa/lpc_ich9.c
>> +++ b/hw/isa/lpc_ich9.c
>> @@ -645,9 +645,9 @@ static void ich9_lpc_add_properties(ICH9LPCState *lpc)
>> ich9_lpc_get_sci_int,
>>   

[PULL 3/3] tests/vm/ubuntu: update i386 image to 18.04

2019-11-26 Thread Alex Bennée
The current image is broken while running qtests but the bug go away
when built with a newer Ubuntu i386 image. I was unable to replicate
the crash on Debian Buster for i386 either so I'm concluding it is a
distro problem. Let's paper over that crack by updating our 32 bir
test image.

Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 

diff --git a/tests/vm/ubuntu.i386 b/tests/vm/ubuntu.i386
index 18b1ea2b72c..f611bebdc9d 100755
--- a/tests/vm/ubuntu.i386
+++ b/tests/vm/ubuntu.i386
@@ -66,8 +66,8 @@ class UbuntuX86VM(basevm.BaseVM):
 
 def build_image(self, img):
 cimg = self._download_with_cache(
-
"https://cloud-images.ubuntu.com/releases/16.04/release-20190605/ubuntu-16.04-server-cloudimg-i386-disk1.img";,
-
sha256sum="e30091144c73483822b7c27193e9d47346dd1064229da577c3fedcf943f7cfcc")
+
"https://cloud-images.ubuntu.com/releases/bionic/release-20191114/ubuntu-18.04-server-cloudimg-i386.img";,
+
sha256sum="28969840626d1ea80bb249c08eef1a4533e8904aa51a327b40f37ac4b4ff04ef")
 img_tmp = img + ".tmp"
 subprocess.check_call(["cp", "-f", cimg, img_tmp])
 subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"])
-- 
2.20.1




[PULL 1/3] tests/vm/centos: fix centos build target

2019-11-26 Thread Alex Bennée
To be able to run the docker tests centos has here we have to install
python3 as well as the basic tools.

Signed-off-by: Alex Bennée 
Reviewed-by: Wainer dos Santos Moschetta 
Reviewed-by: Philippe Mathieu-Daudé 

diff --git a/tests/vm/centos b/tests/vm/centos
index 53976f1c4c9..b9e851f2d33 100755
--- a/tests/vm/centos
+++ b/tests/vm/centos
@@ -73,7 +73,7 @@ class CentosVM(basevm.BaseVM):
 self.wait_ssh()
 self.ssh_root_check("touch /etc/cloud/cloud-init.disabled")
 self.ssh_root_check("yum update -y")
-self.ssh_root_check("yum install -y docker make git")
+self.ssh_root_check("yum install -y docker make git python3")
 self.ssh_root_check("systemctl enable docker")
 self.ssh_root("poweroff")
 self.wait()
-- 
2.20.1




Re: virtiofsd: Where should it live?

2019-11-26 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Mon, Nov 25, 2019 at 06:50:21PM +, Dr. David Alan Gilbert wrote:
> > Hi,
> >   There's been quite a bit of discussion about where virtiofsd, our
> > implemenation of a virtiofs daemon, should live.  I'd like to get
> > this settled now, because I'd like to tidy it up for the next
> > qemu cycle.
> > 
> > For reference it's based on qemu's livhost-user+chunks of libfuse.
> > It can't live in libfuse because we change enough of the library
> > to break their ABI.  It's C, and we've got ~100 patches - which
> > we can split into about 3 chunks.
> > 
> > Some suggestions so far:
> >   a) In contrib
> >  This is my current working assumption; the main objection is it's
> >  a bit big and pulls in a chunk of libfuse.
> 
> My main objection to 'contrib/' is actually the perceived notions
> about what the contrib directory is for. When I see 'contrib/'
> code in either QEMU, or other open source projects, my general
> impression is that this is largely unsupported code which is just
> there as it might be interesting to someone, and doesn't typically
> get much ongoing dev attention.
> 
> Parts that are fully supported & actively developed by projects
> usually live elsewhere like a src/ or lib/ or tools/ directory.
> 
> This has kind of been the case with QEMU historically, with
> the vhost-user-blk, vhost-user-scsi not being real production
> quality implementations. Rather they are just technology demos
> to show what you might do.   vhost-user-gpu/input blurred this
> boundary a bit as they're more supported tools, and so I'd
> argue contrib/ probably wasn't the right place for them either
> in hindsight.
> 
> virtiofsd is definitely different as it is intended to be a
> fully production quality supported tool with active dev into
> the future IIUC.
> 
> IOW, if we did decide we want it in QEMU, then instead of
> '$GIT/contrib/virtiofsd', I'd prefer to see '$GIT/virtiofsd'.

I'm not sure it deserves a new top level for such a specific tool.

> >   b) In a submodule
> > 
> >   c) Just separate
> > 
> > Your suggestions/ideas please.  My preference is (a).
> 
> What I'm wondering is just how much sharing / overlap of code and concepts
> and community operation there is going otbetween QEMU and virtiofsd. From
> the tech POV, IIUC, the main blocker it would need to be in QEMU is because
> it links to libvhost-user and we've not declared that to be a stable API
> for 3rd party linking.
> 
> Personally I'm always biased towards self-contained apps being in their
> own repositories, rather than bundling too much stuff into one repo. You
> can see that in the way we've created independant git repos for any libvirt
> module that didn't need to be part of the main libvirt.git.
> 
> To me the key benefit this gives is flexibility in approach. ie the app
> doesn't need to blindly follow every precedent that QEMU has set. It
> can instead take the most appropriate path for its needs. For example...
> 
> It could use meson for its build system already. This would be good as
> builds will be done in a matter of seconds. For contributors it would
> be a much less daunting project to join as it wouldn't be lost in the
> firehose of other non-virtiofsd contributions on qemu-devel.
> 
> It doesn't have to follow QEMU's 3-times a year release model, with 6
> week long freeze periods. It can be more agile releasing 6 times a year
> with 1 week freezes if desired, I personally think tihs would be quite
> desirable for a young project like virtiofsd which is evolving rapidly
> as it would get new work available to users much more rapidly.

Form virtiofsd's point of view I'm not that worried about the release
cycle;  Given that features have to go through virtio standardisation,
the release ycle is unlikely to be a bottleneck.

> It doesn't have to follow QEMU's API stability & deprecation policies.
> It could be more flexible in taking non-compatible changes, which again
> may be valuable for a young rapidly evolving app.
> 
> 
> 
> Anyway to be clear, I'm not a contributor to virtiofsd, nor likely to
> be one in the future, so just consider this a personal POV. From QEMU's
> POV I don't think it'll matter whether virtiofsd in or out of QEMU git.
> It is more about the impact & burden QEMU's dev process & standards might
> impose on virtiofsd itself.

As a qemu contributor, your opinion is welcome!  No need to sit on the
fence.

> I'm fine with whatever option above is chosen, with the only caveat
> being that if its in qemu.git, I don't think it belongs under contrib/
> it should be a top level dir of its own.
> 

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 1/4] qom/object: enable setter for uint types

2019-11-26 Thread Marc-André Lureau
Hi

On Tue, Nov 26, 2019 at 4:03 PM Felipe Franciosi  wrote:
>
> Heya
>
> > On Nov 26, 2019, at 8:40 AM, Marc-André Lureau  
> > wrote:
> >
> > Hi
> >
> > On Mon, Nov 25, 2019 at 7:40 PM Felipe Franciosi  wrote:
> >>
> >> Traditionally, the uint-specific property helpers only offer getters.
> >> When adding object (or class) uint types, one must therefore use the
> >> generic property helper if a setter is needed.
> >>
> >> This enhances the uint-specific property helper APIs by adding a
> >> 'readonly' field and modifying all users of that API to set this
> >> parameter to true. If 'readonly' is false, though, the helper will add
> >> an automatic setter for the value.
> >>
> >> Signed-off-by: Felipe Franciosi 
> >> ---
> >> hw/acpi/ich9.c   |   4 +-
> >> hw/acpi/pcihp.c  |   6 +-
> >> hw/acpi/piix4.c  |  12 ++--
> >> hw/isa/lpc_ich9.c|   4 +-
> >> hw/ppc/spapr_drc.c   |   2 +-
> >> include/qom/object.h |  28 +---
> >> qom/object.c | 152 ---
> >> ui/console.c |   3 +-
> >> 8 files changed, 163 insertions(+), 48 deletions(-)
> >>
> >> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> >> index 2034dd749e..94dc5147ce 100644
> >> --- a/hw/acpi/ich9.c
> >> +++ b/hw/acpi/ich9.c
> >> @@ -454,12 +454,12 @@ void ich9_pm_add_properties(Object *obj, 
> >> ICH9LPCPMRegs *pm, Error **errp)
> >> pm->s4_val = 2;
> >>
> >> object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
> >> -   &pm->pm_io_base, errp);
> >> +   &pm->pm_io_base, true, errp);
> >> object_property_add(obj, ACPI_PM_PROP_GPE0_BLK, "uint32",
> >> ich9_pm_get_gpe0_blk,
> >> NULL, NULL, pm, NULL);
> >> object_property_add_uint32_ptr(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
> >> -   &gpe0_len, errp);
> >> +   &gpe0_len, true, errp);
> >> object_property_add_bool(obj, "memory-hotplug-support",
> >>  ich9_pm_get_memory_hotplug_support,
> >>  ich9_pm_set_memory_hotplug_support,
> >> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> >> index 8413348a33..70bc1408e6 100644
> >> --- a/hw/acpi/pcihp.c
> >> +++ b/hw/acpi/pcihp.c
> >> @@ -80,7 +80,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
> >>
> >> *bus_bsel = (*bsel_alloc)++;
> >> object_property_add_uint32_ptr(OBJECT(bus), ACPI_PCIHP_PROP_BSEL,
> >> -   bus_bsel, &error_abort);
> >> +   bus_bsel, true, &error_abort);
> >> }
> >>
> >> return bsel_alloc;
> >> @@ -373,9 +373,9 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *s, 
> >> PCIBus *root_bus,
> >> memory_region_add_subregion(address_space_io, s->io_base, &s->io);
> >>
> >> object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_BASE_PROP, 
> >> &s->io_base,
> >> -   &error_abort);
> >> +   true, &error_abort);
> >> object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_LEN_PROP, 
> >> &s->io_len,
> >> -   &error_abort);
> >> +   true, &error_abort);
> >> }
> >>
> >> const VMStateDescription vmstate_acpi_pcihp_pci_status = {
> >> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> >> index 93aec2dd2c..032ba11e62 100644
> >> --- a/hw/acpi/piix4.c
> >> +++ b/hw/acpi/piix4.c
> >> @@ -443,17 +443,17 @@ static void piix4_pm_add_propeties(PIIX4PMState *s)
> >> static const uint16_t sci_int = 9;
> >>
> >> object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_ENABLE_CMD,
> >> -  &acpi_enable_cmd, NULL);
> >> +  &acpi_enable_cmd, true, NULL);
> >> object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_DISABLE_CMD,
> >> -  &acpi_disable_cmd, NULL);
> >> +  &acpi_disable_cmd, true, NULL);
> >> object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK,
> >> -  &gpe0_blk, NULL);
> >> +  &gpe0_blk, true, NULL);
> >> object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK_LEN,
> >> -  &gpe0_blk_len, NULL);
> >> +  &gpe0_blk_len, true, NULL);
> >> object_property_add_uint16_ptr(OBJECT(s), ACPI_PM_PROP_SCI_INT,
> >> -  &sci_int, NULL);
> >> +  &sci_int, true, NULL);
> >> object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_PM_IO_BASE,
> >> -  &s->io_base, NULL);
> >> +  &s->io_base, true, NULL);
> >> }
> >>
> >> static void piix4_pm_realize(PCIDevice *dev, Error **errp)
> >> diff --git a/

Re: [RFC 0/8] ATI R300 emulated graphics card

2019-11-26 Thread Aaron Zakhrov
Yeah sorry about the double patches. That was because of a botched rebase.
It applied my commits twice.
I am working on cleaning up the header files and other fixes.

 > Are you aware of my project page for ATI VGA emulation here:
Yes I am. Your code got me through the initial setup.



On Tue, Nov 26, 2019 at 5:33 PM BALATON Zoltan  wrote:

> Hello,
>
> On Tue, 26 Nov 2019, aaron.zakh...@gmail.com wrote:
> > From: Aaron Dominick 
> >
> > Hello,
> > I thought of working on an emulated R300 GPU for QEMU video acceleration
> on vintage operating systems (Windows 9x-XP)
>
> Good idea, :-) I very welcome any contribution to this.
>
> > The following patch series contains the initial QEMU device and some
> register read/write operations.
>
> At first look you have pathces twice in the series (so each patch appears
> twice, I haven't checked if they are identical or different) and also the
> first (and 5th) patch adds a lot of probably unused files which makes
> review difficult. Could you please fix these and resend with only adding
> the headers really needed and each patch only once so we know what are the
> actual changes that are relevant?
>
> Also is it possible to keep everything in ati-vga only adding another
> device model there rather than fully duplicating it as r300* ? Although
> for development it's probably OK to keep it separate but unless there are
> enough differences having a single file at the end may be better. (Also
> changes are clearer that way but if you have a patch only copying ati-vga
> files first then separate patches that changes it can be reviewed that way
> too.)
>
> In short I think this series needs some cleanup first for us to be able to
> revies it better.
>
> > Testing it on an OpenSUSE Linux guest and the kernel correctly detects
> the card and loads the radeon DRM driver.
>
> So I think this gets a bit further than my ati-vga rv100 which does not
> work with DRM yet. I've thought about targeting RV100 first then moving on
> to later radeons as those probably have more features and differences from
> R128 which is the starting point for ati-vga but if you need R300 for some
> reason specifically targeting that is OK too.
>
> > It gets as far as the CRTC probing before crashing with an error that
> there is not enough bandwidth.
>
> Getting DRM to load is one thing but likely you'll need to implement
> microengine/command processor (also referred to as PM4 or programming mode
> 4 sometimes) to get it fully working as that's how DRM and Windows ATI
> drivers likely send commands to the card. I've looked at it but couldn't
> find documentation on how the microengine works. We get a microcode
> uploaded by no idea how to run that. If we can't figure out how the
> microengine works anther approach could be what xenia.jp XBox 360
> emulator
> does and directly parse the packets not using the microcode. It could be
> possible to copy code from Xenia for that but we need to convert C++ to C.
> The difficult part is probably figuring out how to run it in a different
> thread so the device emulation does not block the machine and it works in
> parallel like real hardware does.
>
> > I know next to nothing about hardware emulation and would like to know
> if what I have got so far is on the right track.
>
> Are you aware of my project page for ATI VGA emulation here:
>
> https://osdn.net/projects/qmiga/wiki/SubprojectAti
>
> where I have collected some knowledge that I could gather so far? I have
> some experience in implementing devices in QEMU but know next to nothing
> about GPUs so hopefully you know more about those which is more needed for
> ati-vga. The QEMU knowledge can be picked up by looking at existing
> devices and asking here or on IRC (if you prefer that, I don't use it but
> I know some do).
>
> Regards,
> BALATON Zoltan
>


Re: [PULL 0/4] Net patches

2019-11-26 Thread Peter Maydell
On Mon, 25 Nov 2019 at 15:40, Jason Wang  wrote:
>
> The following changes since commit 122e6d2a9c1bf8aa1d51409c15809a82621515b1:
>
>   Merge remote-tracking branch 'remotes/gkurz/tags/9p-fix-2019-11-23' into 
> staging (2019-11-25 13:39:45 +)
>
> are available in the git repository at:
>
>   https://github.com/jasowang/qemu.git tags/net-pull-request
>
> for you to fetch changes up to 4d0e59ace29277b2faa5b33c719be9baaa659093:
>
>   net/virtio: return error when device_opts arg is NULL (2019-11-25 23:30:29 
> +0800)
>
> 
>
> 
> Jens Freimann (4):
>   net/virtio: fix dev_unplug_pending
>   net/virtio: return early when failover primary alread added
>   net/virtio: fix re-plugging of primary device
>   net/virtio: return error when device_opts arg is NULL
>
>  hw/net/virtio-net.c | 58 
> +++--
>  migration/savevm.c  |  3 ++-
>  2 files changed, 40 insertions(+), 21 deletions(-)


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM



Re: [PATCH v17 03/14] util/cutils: refactor do_strtosz() to support suffixes list

2019-11-26 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Tue, Nov 26, 2019 at 11:04:41AM +0100, Markus Armbruster wrote:
>> Eduardo Habkost  writes:
>> 
>> > On Mon, Nov 25, 2019 at 08:20:23AM +0100, Markus Armbruster wrote:
>> >> Tao Xu  writes:
>> >> 
>> >> > Add do_strtomul() to convert string according to different suffixes.
>> >> >
>> >> > Reviewed-by: Eduardo Habkost 
>> >> > Signed-off-by: Tao Xu 
>> >> 
>> >> What's the actual change here?  "Refactor" suggests the interfaces stay
>> >> the same, only their implementation changes.  "Support suffixes list"
>> >> suggests some interface supports something new.
>> >
>> > 1) Parameters added to suffix_mul() (suffix table);
>> > 2) do_strtomul() is being extracted from do_strtosz().
>> >
>> > do_strtomul() interface and behavior stays the same.
>> 
>> Alright, it's two related changes squashed together (which tends to
>> complicate writing good commit messages).  2) is really a refactoring.
>> 1) is not: it makes suffix_mul() more flexible.  Summarizing 1) and 2)
>> as "refactor do_strtosz() to support suffixes list" is confusing,
>> because it's about 1), while the interesting part is actually 2).
>> 
>> Moreover, the commit message should state why these two changes are
>> useful.  It tries, but "to support suffixes list" merely kicks the can
>> down the road, because the reader is left to wonder why supporting
>> suffix lists is useful.  It's actually for use by the next patch.  So
>> say that.
>
> Test case additions to test-cutils.c would go a long way to illustrating
> what is added & that its working correctly.

Since this patch only prepares for new features, it doesn't need test
updates.  The next patch adds features, and it does update the tests.




[RFC 00/10] R300 QEMU device V2

2019-11-26 Thread aaron . zakhrov
From: Aaron Dominick 

I have removed the botched patches and have got the code working upto the GART 
initialization.
I am not sure how to implement the GART. I am guessing it should be an IOMMU 
device but I think that is a bit much for an emulated card. 
The earlier problem of display probing seems to be resolved by using an R300 
bios I got from TechPowerUP's GPU database:

https://www.techpowerup.com/vgabios/14509/14509
I am NOT sure if we can distribute it in the QEMU source tree. If it does cause 
problems I can send a patch to remove it.

Aaron Dominick (10):
  Add Radeon kernel headers. Will clean up later
  Fix MC STATUS resgister
  R300 fixes
  Got GPU init working. Stops at probing display
  Clean up Radeon Header files
  Add a new R300 VBIOS courtesy of TechPowerUP

-- 
2.24.0




[RFC 06/10] Fix MC STATUS resgister

2019-11-26 Thread aaron . zakhrov
From: Aaron Dominick 

---
 hw/display/r300.c | 15 ---
 hw/display/r300.h |  1 +
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/hw/display/r300.c b/hw/display/r300.c
index 94e90b7a95..653474c3aa 100644
--- a/hw/display/r300.c
+++ b/hw/display/r300.c
@@ -278,6 +278,10 @@ static uint64_t r300_mm_read(void *opaque, hwaddr addr, 
unsigned int size)
 uint64_t val = 0;
 
 switch (addr) {
+case RADEON_MC_STATUS:
+val = s->regs.mc_status;
+break;
+
 case RADEON_MM_INDEX:
 val = s->regs.mm_index;
 break;
@@ -358,9 +362,9 @@ static uint64_t r300_mm_read(void *opaque, hwaddr addr, 
unsigned int size)
 case RADEON_CONFIG_REG_APER_SIZE:
 val = memory_region_size(&s->mm);
 break;
-case RADEON_MC_STATUS:
-val = 5;
-break;
+// case RADEON_MC_STATUS:
+// val = 5;
+// break;
 case RADEON_RBBM_STATUS:
 val = 64; /* free CMDFIFO entries */
 break;
@@ -512,6 +516,10 @@ static void r300_mm_write(void *opaque, hwaddr addr,
 trace_ati_mm_write(size, addr, r300_reg_name(addr & ~3ULL), data);
 }
 switch (addr) {
+  case RADEON_MC_STATUS:
+s->regs.mc_status = R300_MC_IDLE;
+s->regs.mc_status = data;
+break;
   case RADEON_RBBM_STATUS:
 s->regs.rbbm_status = data|= RADEON_RBBM_FIFOCNT_MASK;
 break;
@@ -946,6 +954,7 @@ static void r300_vga_realize(PCIDevice *dev, Error **errp)
 static void r300_vga_reset(DeviceState *dev)
 {
 RADVGAState *s = RAD_VGA(dev);
+s->regs.mc_status = R300_MC_IDLE;
 
 timer_del(&s->vblank_timer);
 r300_vga_update_irq(s);
diff --git a/hw/display/r300.h b/hw/display/r300.h
index 60f572647f..a9e1db32be 100644
--- a/hw/display/r300.h
+++ b/hw/display/r300.h
@@ -81,6 +81,7 @@ typedef struct RADVGARegs{
   uint32_t default_pitch;
   uint32_t default_tile;
   uint32_t default_sc_bottom_right;
+  uint32_t mc_status;
 
 //Color Buffer RB3D
   uint32_t r300_rb3d_aaresolve_ctl;
-- 
2.24.0




Re: [PATCH] target/arm: Honor HCR_EL2.TID3 trapping requirements

2019-11-26 Thread Peter Maydell
On Mon, 25 Nov 2019 at 17:49, Marc Zyngier  wrote:
> I also had a look at TID0, but couldn't find any of the Jazelle
> stuff in QEMU...

We implement only "minimal Jazelle", ie the minimal set of
registers needed to be architecturally compliant for an
implementation without Jazelle.

thanks
-- PMM



[RFC 08/10] Got GPU init working. Stops at probing display

2019-11-26 Thread aaron . zakhrov
From: Aaron Dominick 

---
 hw/display/ati.c  |   9 +-
 hw/display/r300.c | 571 +-
 hw/display/r300.h |  77 ++-
 3 files changed, 544 insertions(+), 113 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index db3b254316..1d36233163 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -44,7 +44,7 @@ enum { VGA_MODE, EXT_MODE };
 
 static void ati_vga_switch_mode(ATIVGAState *s)
 {
-DPRINTF("%d -> %d\n",
+qemu_log("%d -> %d\n",
 s->mode, !!(s->regs.crtc_gen_cntl & CRTC2_EXT_DISP_EN));
 if (s->regs.crtc_gen_cntl & CRTC2_EXT_DISP_EN) {
 /* Extended mode enabled */
@@ -88,7 +88,7 @@ static void ati_vga_switch_mode(ATIVGAState *s)
 qemu_log_mask(LOG_UNIMP, "Unsupported bpp value\n");
 }
 assert(bpp != 0);
-DPRINTF("Switching to %dx%d %d %d @ %x\n", h, v, stride, bpp, 
offs);
+qemu_log("Switching to %dx%d %d %d @ %x\n", h, v, stride, bpp, 
offs);
 vbe_ioport_write_index(&s->vga, 0, VBE_DISPI_INDEX_ENABLE);
 vbe_ioport_write_data(&s->vga, 0, VBE_DISPI_DISABLED);
 s->vga.big_endian_fb = (s->regs.config_cntl & APER_0_ENDIAN ||
@@ -111,14 +111,14 @@ static void ati_vga_switch_mode(ATIVGAState *s)
 vbe_ioport_write_data(&s->vga, 0, stride);
 stride *= bypp;
 if (offs % stride) {
-DPRINTF("CRTC offset is not multiple of pitch\n");
+qemu_log("CRTC offset is not multiple of pitch\n");
 vbe_ioport_write_index(&s->vga, 0,
VBE_DISPI_INDEX_X_OFFSET);
 vbe_ioport_write_data(&s->vga, 0, offs % stride / bypp);
 }
 vbe_ioport_write_index(&s->vga, 0, VBE_DISPI_INDEX_Y_OFFSET);
 vbe_ioport_write_data(&s->vga, 0, offs / stride);
-DPRINTF("VBE offset (%d,%d), vbe_start_addr=%x\n",
+qemu_log("VBE offset (%d,%d), vbe_start_addr=%x\n",
 s->vga.vbe_regs[VBE_DISPI_INDEX_X_OFFSET],
 s->vga.vbe_regs[VBE_DISPI_INDEX_Y_OFFSET],
 s->vga.vbe_start_addr);
@@ -129,6 +129,7 @@ static void ati_vga_switch_mode(ATIVGAState *s)
 s->mode = VGA_MODE;
 vbe_ioport_write_index(&s->vga, 0, VBE_DISPI_INDEX_ENABLE);
 vbe_ioport_write_data(&s->vga, 0, VBE_DISPI_DISABLED);
+qemu_log("VGA MODE %d \n",s->mode);
 }
 }
 
diff --git a/hw/display/r300.c b/hw/display/r300.c
index 074dbf5b2d..5bd71142a8 100644
--- a/hw/display/r300.c
+++ b/hw/display/r300.c
@@ -19,6 +19,7 @@
 #include "qemu/osdep.h"
 #include "r300.h"
 #include "r300_reg.h"
+#include "r100d.h"
 #include "radeon_reg.h"
 #include "vga-access.h"
 #include "hw/qdev-properties.h"
@@ -38,18 +39,20 @@ static const struct {
 uint16_t dev_id;
 } r300_model_aliases[] = {
 { "radeon9500", PCI_DEVICE_ID_ATI_RADEON_9500_PRO },
+{ "radeon9700", PCI_DEVICE_ID_ATI_RADEON_9700 }
 };
 
 enum { VGA_MODE, EXT_MODE };
 
 static void r300_vga_switch_mode(RADVGAState *s)
 {
-DPRINTF("%d -> %d\n",
-s->mode, !!(s->regs.crtc_gen_cntl & RADEON_CRTC_EXT_DISP_EN));
-if (s->regs.crtc_gen_cntl & RADEON_CRTC_EXT_DISP_EN) {
+qemu_log(" R300 %d -> %d\n",
+s->mode, !(s->regs.crtc_gen_cntl & RADEON_CRTC_EN));
+qemu_log("R300 RADEON_CRTC_EXT_DISP_EN = %08x CRTC_GEN_CNTL = %08x 
",RADEON_CRTC_EN,s->regs.crtc_gen_cntl);
+if (s->regs.crtc_gen_cntl & ~RADEON_CRTC_EN) {
 /* Extended mode enabled */
 s->mode = EXT_MODE;
-if (s->regs.crtc_gen_cntl & RADEON_CRTC2_EN) {
+if (s->regs.crtc_gen_cntl & ~RADEON_CRTC_EN) {
 /* CRT controller enabled, use CRTC values */
 /* FIXME Should these be the same as VGA CRTC regs? */
 uint32_t offs = s->regs.crtc_offset & 0x07ff;
@@ -65,32 +68,32 @@ static void r300_vga_switch_mode(RADVGAState *s)
 }
 h = ((s->regs.crtc_h_total_disp >> 16) + 1) * 8;
 v = (s->regs.crtc_v_total_disp >> 16) + 1;
-// switch (s->regs.crtc_gen_cntl & RADEON_CRTC_CUR_MODE_MASK) {
-// // case RADEON_CRTC_PIX_WIDTH_4BPP:
-// // bpp = 4;
-// // break;
-// // case RADEON_CRTC_PIX_WIDTH_8BPP:
-// // bpp = 8;
-// // break;
-// // case RADEON_CRTC_PIX_WIDTH_15BPP:
-// // bpp = 15;
-// // break;
-// // case RADEON_CRTC_PIX_WIDTH_16BPP:
-// // bpp = 16;
-// // break;
-// // case RADEON_CRTC_PIX_WIDTH_24BPP:
-// // bpp = 24;
-// // break;
-// // case RADEON_CRTC_PIX_WIDTH_32BPP:
-// // bpp = 32;
-// // break;
-// default:
-// qemu_log_mask(L

[RFC 07/10] R300 fixes

2019-11-26 Thread aaron . zakhrov
From: Aaron Dominick 

---
 hw/display/r300.c | 9 +
 hw/display/r300.h | 6 ++
 2 files changed, 15 insertions(+)

diff --git a/hw/display/r300.c b/hw/display/r300.c
index 653474c3aa..074dbf5b2d 100644
--- a/hw/display/r300.c
+++ b/hw/display/r300.c
@@ -878,6 +878,15 @@ static void r300_mm_write(void *opaque, hwaddr addr,
 case RADEON_DEFAULT_SC_BOTTOM_RIGHT:
 s->regs.default_sc_bottom_right = data & 0x3fff3fff;
 break;
+case R300_GB_ENABLE:
+s->regs.r300_gb_enable = data;
+break;
+case R300_GB_TILE_CONFIG:
+s->regs.r300_gb_tile_config = data;
+break;
+case R300_GB_FIFO_SIZE:
+s->regs.r300_gb_fifo_size = data;
+break;
 default:
 break;
 }
diff --git a/hw/display/r300.h b/hw/display/r300.h
index a9e1db32be..19e3d97f8a 100644
--- a/hw/display/r300.h
+++ b/hw/display/r300.h
@@ -83,6 +83,12 @@ typedef struct RADVGARegs{
   uint32_t default_sc_bottom_right;
   uint32_t mc_status;
 
+  //R300 GB Registers
+  uint32_t r300_gb_enable;
+  uint32_t r300_gb_tile_config;
+  uint32_t r300_gb_fifo_size;
+
+
 //Color Buffer RB3D
   uint32_t r300_rb3d_aaresolve_ctl;
   uint32_t r300_rb3d_aaresolve_offset;
-- 
2.24.0




[PATCH v2] target/arm: Allow loading elf from aliased ROM regions

2019-11-26 Thread Jean-Hugues Deschênes
With this patch, we allow loading a ROM image at an aliased address,
when it is located in a memory region for which an alias exists.

Changes since v1:
- Removes unnecessary "else rom = NULL" clause after having verified mr.

Signed-off-by: Jean-Hugues Deschenes 
---
 target/arm/cpu.c | 31 ---
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7a4ac9339b..bff81b51d1 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -313,13 +313,30 @@ static void arm_cpu_reset(CPUState *s)
 initial_msp = ldl_p(rom);
 initial_pc = ldl_p(rom + 4);
 } else {
-/* Address zero not covered by a ROM blob, or the ROM blob
- * is in non-modifiable memory and this is a second reset after
- * it got copied into memory. In the latter case, rom_ptr
- * will return a NULL pointer and we should use ldl_phys instead.
- */
-initial_msp = ldl_phys(s->as, vecbase);
-initial_pc = ldl_phys(s->as, vecbase + 4);
+/* See if the ROM blob is aliased somewhere */
+hwaddr len = 0, xlat = 0;
+MemoryRegion *mr = address_space_translate(s->as, vecbase, &xlat,
+&len, false, MEMTXATTRS_UNSPECIFIED);
+
+if (mr) {
+rom = rom_ptr(mr->addr + xlat, 8);
+}
+
+if (rom) {
+initial_msp = ldl_p(rom);
+initial_pc = ldl_p(rom + 4);
+} else {
+
+/*
+ * Address zero not covered by a ROM blob, or the ROM blob
+ * is in non-modifiable memory and this is a second reset after
+ * it got copied into memory. In the latter case, rom_ptr
+ * will return a NULL pointer and we should use ldl_phys
+ * instead.
+ */
+initial_msp = ldl_phys(s->as, vecbase);
+initial_pc = ldl_phys(s->as, vecbase + 4);
+}
 }
 
 env->regs[13] = initial_msp & 0xFFFC;
-- 
2.17.1




Re: [PATCH v2 2/2] s390x/cpumodel: Introduce dynamic feature groups

2019-11-26 Thread Christian Borntraeger
re-adding ccs from the cover-letter

>>> On 25.11.19 18:20, David Hildenbrand wrote:
>>>
>>> As soon as dynamic feature groups are used, the CPU model becomes
>>> migration-unsafe. Upper layers can expand these models to migration-safe
>>> and static variants, allowing them to be migrated.
>>
>> I really dislike that. I am trying to get rid of the unsafe variants (e.g. 
>> now
>> defaulting to host-model instead of host-passthrough). I do not want to give
>> users new ways of hurting themselves.
>>
> 
> Please note that this is just on the bare command line. Libvirt and friends 
> will expand the model and have proper migration in place. What exactly is 
> your concern in that regard?

What is then the value? libvirt can also use host-model or  baselining if 
necessary.
And for the command line this will just add more opportunity to shot yourself 
in the
foot, no?

Let me put it this way, I might have misunderstood what you are trying to do 
here,
but if I do not get, then others (e.g. users) will also not get it.

Maybe its just the interface or the name. But I find this very non-intuitive

e.g. you wrote

Get the maximum possible feature set (e.g., including deprecated
features) for a CPU definition in the configuration ("everything that
could be enabled"):
-cpu z14,all-features=off,available-features=on

Get all valid features for a CPU definition:
-cpu z14,all-features=on

What is the point of this? It is either the same as the one before, or it wont
be able to start. 

> 
>> Unless I misunderstood Eduardo, I think his versioning approach is actually 
>> better
>> in regard to migration, no?
>> I z terms, you can still say -cpu z13  which is just an alias to z13v1 z13v2 
>> etc.
>> Assuming that the version is checked this will be safe.
>>
> 
> It‘s even worse AFAIKS. A „-cpu z13“ would dynamically map to whatever is 
> best on the host.
> 




Re: [Virtio-fs] [PATCH 4/4] virtiofsd: Implement blocking posix locks

2019-11-26 Thread Dr. David Alan Gilbert
* Vivek Goyal (vgo...@redhat.com) wrote:
> On Fri, Nov 22, 2019 at 05:47:32PM +, Dr. David Alan Gilbert wrote:
> 
> [..]
> > > +static int virtio_send_notify_msg(struct fuse_session *se, struct iovec 
> > > *iov,
> > > +   int count)
> > > +{
> > > +struct fv_QueueInfo *qi;
> > > +VuDev *dev = &se->virtio_dev->dev;
> > > +VuVirtq *q;
> > > +FVRequest *req;
> > > +VuVirtqElement *elem;
> > > +unsigned int in_num, bad_in_num = 0, bad_out_num = 0;
> > > +struct fuse_out_header *out = iov[0].iov_base;
> > > +size_t in_len, tosend_len = iov_size(iov, count);
> > > +struct iovec *in_sg;
> > > +int ret = 0;
> > > +
> > > +/* Notifications have unique == 0 */
> > > +assert (!out->unique);
> > > +
> > > +if (!se->notify_enabled)
> > > +return -EOPNOTSUPP;
> > > +
> > > +/* If notifications are enabled, queue index 1 is notification queue 
> > > */
> > > +qi = se->virtio_dev->qi[1];
> > > +q = vu_get_queue(dev, qi->qidx);
> > > +
> > > +pthread_rwlock_rdlock(&qi->virtio_dev->vu_dispatch_rwlock);
> > > +pthread_mutex_lock(&qi->vq_lock);
> > > +/* Pop an element from queue */
> > > +req = vu_queue_pop(dev, q, sizeof(FVRequest), &bad_in_num, 
> > > &bad_out_num);
> > 
> > You don't need bad_in_num/bad_out_num - just pass NULL for both; they're
> > only needed if you expect to read/write data that's not mappable (i.e.
> > in our direct write case).
> 
> Will do.
> 
> [..]
> > > @@ -1950,21 +1948,54 @@ static void lo_setlk(fuse_req_t req, fuse_ino_t 
> > > ino,
> > >  
> > >   if (!plock) {
> > >   saverr = ret;
> > > + pthread_mutex_unlock(&inode->plock_mutex);
> > >   goto out;
> > >   }
> > >  
> > > + /*
> > > +  * plock is now released when inode is going away. We already have
> > > +  * a reference on inode, so it is guaranteed that plock->fd is
> > > +  * still around even after dropping inode->plock_mutex lock
> > > +  */
> > > + ofd = plock->fd;
> > > + pthread_mutex_unlock(&inode->plock_mutex);
> > > +
> > > + /*
> > > +  * If this lock request can block, request caller to wait for
> > > +  * notification. Do not access req after this. Once lock is
> > > +  * available, send a notification instead.
> > > +  */
> > > + if (sleep && lock->l_type != F_UNLCK) {
> > > + /*
> > > +  * If notification queue is not enabled, can't support async
> > > +  * locks.
> > > +  */
> > > + if (!se->notify_enabled) {
> > > + saverr = EOPNOTSUPP;
> > > + goto out;
> > > + }
> > > + async_lock = true;
> > > + unique = req->unique;
> > > + fuse_reply_wait(req);
> > > + }
> > >   /* TODO: Is it alright to modify flock? */
> > >   lock->l_pid = 0;
> > > - ret = fcntl(plock->fd, F_OFD_SETLK, lock);
> > > + if (async_lock)
> > > + ret = fcntl(ofd, F_OFD_SETLKW, lock);
> > > + else
> > > + ret = fcntl(ofd, F_OFD_SETLK, lock);
> > 
> > What happens if the guest is rebooted after it's asked
> > for, but not been granted a lock?
> 
> I think a regular reboot can't be done till a request is pending, because
> virtio-fs can't be unmounted and unmount will wait for all pending
> requests to finish.
> 
> Destroying qemu will destroy deamon too.
> 
> Are there any other reboot paths I have missed.

Yes, there are a few other ways the guest can reboot:
  a) A echo b > /proc/sysrq-trigger
  b) Telling qemu to do a reset

probably a few more as well; but they all end up with the daemon
still running over the same connection.   See
'virtiofsd: Handle hard reboot' where I handle that case where
a FUSE_INIT turns up unexpectedly.

Dave


> Thanks
> Vivek
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PULL] RISC-V Patches for 4.2-rc3

2019-11-26 Thread Peter Maydell
On Mon, 25 Nov 2019 at 20:59, Palmer Dabbelt  wrote:
>
> The following changes since commit 65e05c82bdc6d348155e301c9d87dba7a08a5701:
>
>   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> (2019-11-25 15:47:44 +)
>
> are available in the Git repository at:
>
>   g...@github.com:palmer-dabbelt/qemu.git tags/riscv-for-master-4.2-rc3
>
> for you to fetch changes up to 6478dd745dca49d63250500cd1aeca1c41cd6f89:
>
>   hw/riscv: Add optional symbol callback ptr to riscv_load_kernel() 
> (2019-11-25 12:34:52 -0800)
>
> 
> RISC-V Patches for 4.2-rc3
>
> This tag contains two patches that I'd like to target for 4.2-rc3:
>
> * A fix to the DT entry for the SiFive test finisher.
> * A fix to the spike board's HTIF interface.
>
> This passes "make check" and boots OE for me.
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM



Re: [PATCH] target/arm: Honor HCR_EL2.TID3 trapping requirements

2019-11-26 Thread Peter Maydell
On Tue, 26 Nov 2019 at 10:12, Richard Henderson
 wrote:
>
> On 11/23/19 11:56 AM, Marc Zyngier wrote:
> > +static CPAccessResult access_aa64idreg(CPUARMState *env, const 
> > ARMCPRegInfo *ri,
> > +   bool isread)
> > +{
> > +if ((arm_current_el(env) < 2) && (arm_hcr_el2_eff(env) & HCR_TID3)) {
> > +return CP_ACCESS_TRAP_EL2;
> > +}
> > +
> > +return CP_ACCESS_OK;
> > +}
> > +
>
> The only thing I would suggest is to call this access_aa64_tid3, because
> tid{0,1,2} also need to be handled in a similar way, and we'll need little
> helper functions for those too.

Good idea, I will make that change also.

-- PMM



[PULL 1/2] block/qcow2-bitmap: fix bitmap migration

2019-11-26 Thread Max Reitz
From: Vladimir Sementsov-Ogievskiy 

Fix bitmap migration with dirty-bitmaps capability enabled and shared
storage. We should ignore IN_USE bitmaps in the image on target, when
migrating bitmaps through migration channel, see new comment below.

Fixes: 74da6b943565c451
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20191125125229.13531-2-vsement...@virtuozzo.com
Reviewed-by: Max Reitz 
Signed-off-by: Max Reitz 
---
 block/qcow2-bitmap.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 809bbc5d20..8abaf632fc 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -988,7 +988,26 @@ bool qcow2_load_dirty_bitmaps(BlockDriverState *bs, Error 
**errp)
 }
 
 QSIMPLEQ_FOREACH(bm, bm_list, entry) {
-BdrvDirtyBitmap *bitmap = load_bitmap(bs, bm, errp);
+BdrvDirtyBitmap *bitmap;
+
+if ((bm->flags & BME_FLAG_IN_USE) &&
+bdrv_find_dirty_bitmap(bs, bm->name))
+{
+/*
+ * We already have corresponding BdrvDirtyBitmap, and bitmap in the
+ * image is marked IN_USE. Firstly, this state is valid, no reason
+ * to consider existing BdrvDirtyBitmap to be bad. Secondly it's
+ * absolutely possible, when we do migration with shared storage
+ * with dirty-bitmaps capability enabled: if the bitmap was loaded
+ * from this storage before migration start, the storage will
+ * of-course contain IN_USE outdated version of the bitmap, and we
+ * should not load it on migration target, as we already have this
+ * bitmap, being migrated.
+ */
+continue;
+}
+
+bitmap = load_bitmap(bs, bm, errp);
 if (bitmap == NULL) {
 goto fail;
 }
-- 
2.23.0




[PULL 0/2] Block patches for 4.2.0-rc3

2019-11-26 Thread Max Reitz
The following changes since commit 4ecc984210ca1bf508a96a550ec8a93a5f833f6c:

  Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.2-rc3' 
into staging (2019-11-26 12:36:40 +)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2019-11-26

for you to fetch changes up to d8130f4c420acafbf15a59436fa47199b82dc00e:

  iotests: add new test cases to bitmap migration (2019-11-26 14:18:01 +0100)


Block patches for 4.2.0-rc3:
- Fix for shared storage migration with persistent dirty bitmaps


Vladimir Sementsov-Ogievskiy (2):
  block/qcow2-bitmap: fix bitmap migration
  iotests: add new test cases to bitmap migration

 block/qcow2-bitmap.c   | 21 -
 tests/qemu-iotests/169 | 22 +++---
 tests/qemu-iotests/169.out |  4 ++--
 3 files changed, 37 insertions(+), 10 deletions(-)

-- 
2.23.0




[PULL 2/2] iotests: add new test cases to bitmap migration

2019-11-26 Thread Max Reitz
From: Vladimir Sementsov-Ogievskiy 

Add optional pre-shutdown: shutdown/launch vm before migration. This
leads to storing persistent bitmap to the storage, which breaks
migration with dirty-bitmaps capability enabled and shared storage
until fixed by previous commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20191125125229.13531-3-vsement...@virtuozzo.com
Reviewed-by: Max Reitz 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/169 | 22 +++---
 tests/qemu-iotests/169.out |  4 ++--
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/169 b/tests/qemu-iotests/169
index 8c204caf20..9656a7f620 100755
--- a/tests/qemu-iotests/169
+++ b/tests/qemu-iotests/169
@@ -134,7 +134,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.check_bitmap(self.vm_a, sha256 if persistent else False)
 
 def do_test_migration(self, persistent, migrate_bitmaps, online,
-  shared_storage):
+  shared_storage, pre_shutdown):
 granularity = 512
 
 # regions = ((start, count), ...)
@@ -142,15 +142,13 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
(0xf, 0x1),
(0xa0201, 0x1000))
 
-should_migrate = migrate_bitmaps or persistent and shared_storage
+should_migrate = \
+(migrate_bitmaps and (persistent or not pre_shutdown)) or \
+(persistent and shared_storage)
 mig_caps = [{'capability': 'events', 'state': True}]
 if migrate_bitmaps:
 mig_caps.append({'capability': 'dirty-bitmaps', 'state': True})
 
-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=mig_caps)
-self.assert_qmp(result, 'return', {})
-
 self.vm_b.add_incoming(incoming_cmd if online else "defer")
 self.vm_b.add_drive(disk_a if shared_storage else disk_b)
 
@@ -166,6 +164,14 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
 sha256 = self.get_bitmap_hash(self.vm_a)
 
+if pre_shutdown:
+self.vm_a.shutdown()
+self.vm_a.launch()
+
+result = self.vm_a.qmp('migrate-set-capabilities',
+   capabilities=mig_caps)
+self.assert_qmp(result, 'return', {})
+
 result = self.vm_a.qmp('migrate', uri=mig_cmd)
 while True:
 event = self.vm_a.event_wait('MIGRATION')
@@ -210,11 +216,13 @@ def inject_test_case(klass, name, method, *args, 
**kwargs):
 mc = operator.methodcaller(method, *args, **kwargs)
 setattr(klass, 'test_' + method + name, lambda self: mc(self))
 
-for cmb in list(itertools.product((True, False), repeat=4)):
+for cmb in list(itertools.product((True, False), repeat=5)):
 name = ('_' if cmb[0] else '_not_') + 'persistent_'
 name += ('_' if cmb[1] else '_not_') + 'migbitmap_'
 name += '_online' if cmb[2] else '_offline'
 name += '_shared' if cmb[3] else '_nonshared'
+if (cmb[4]):
+name += '__pre_shutdown'
 
 inject_test_case(TestDirtyBitmapMigration, name, 'do_test_migration',
  *list(cmb))
diff --git a/tests/qemu-iotests/169.out b/tests/qemu-iotests/169.out
index 3a89159833..5c26d15c0d 100644
--- a/tests/qemu-iotests/169.out
+++ b/tests/qemu-iotests/169.out
@@ -1,5 +1,5 @@
-
+
 --
-Ran 20 tests
+Ran 36 tests
 
 OK
-- 
2.23.0




Re: [PATCH 1/4] qom/object: enable setter for uint types

2019-11-26 Thread Felipe Franciosi


> On Nov 26, 2019, at 12:18 PM, Marc-André Lureau  
> wrote:
> 
> Hi
> 
> On Tue, Nov 26, 2019 at 4:03 PM Felipe Franciosi  wrote:
>> 
>> Heya
>> 
>>> On Nov 26, 2019, at 8:40 AM, Marc-André Lureau  
>>> wrote:
>>> 
>>> Hi
>>> 
>>> On Mon, Nov 25, 2019 at 7:40 PM Felipe Franciosi  wrote:
 
 Traditionally, the uint-specific property helpers only offer getters.
 When adding object (or class) uint types, one must therefore use the
 generic property helper if a setter is needed.
 
 This enhances the uint-specific property helper APIs by adding a
 'readonly' field and modifying all users of that API to set this
 parameter to true. If 'readonly' is false, though, the helper will add
 an automatic setter for the value.
 
 Signed-off-by: Felipe Franciosi 
 ---
 hw/acpi/ich9.c   |   4 +-
 hw/acpi/pcihp.c  |   6 +-
 hw/acpi/piix4.c  |  12 ++--
 hw/isa/lpc_ich9.c|   4 +-
 hw/ppc/spapr_drc.c   |   2 +-
 include/qom/object.h |  28 +---
 qom/object.c | 152 ---
 ui/console.c |   3 +-
 8 files changed, 163 insertions(+), 48 deletions(-)
 
 diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
 index 2034dd749e..94dc5147ce 100644
 --- a/hw/acpi/ich9.c
 +++ b/hw/acpi/ich9.c
 @@ -454,12 +454,12 @@ void ich9_pm_add_properties(Object *obj, 
 ICH9LPCPMRegs *pm, Error **errp)
pm->s4_val = 2;
 
object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
 -   &pm->pm_io_base, errp);
 +   &pm->pm_io_base, true, errp);
object_property_add(obj, ACPI_PM_PROP_GPE0_BLK, "uint32",
ich9_pm_get_gpe0_blk,
NULL, NULL, pm, NULL);
object_property_add_uint32_ptr(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
 -   &gpe0_len, errp);
 +   &gpe0_len, true, errp);
object_property_add_bool(obj, "memory-hotplug-support",
 ich9_pm_get_memory_hotplug_support,
 ich9_pm_set_memory_hotplug_support,
 diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
 index 8413348a33..70bc1408e6 100644
 --- a/hw/acpi/pcihp.c
 +++ b/hw/acpi/pcihp.c
 @@ -80,7 +80,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
 
*bus_bsel = (*bsel_alloc)++;
object_property_add_uint32_ptr(OBJECT(bus), ACPI_PCIHP_PROP_BSEL,
 -   bus_bsel, &error_abort);
 +   bus_bsel, true, &error_abort);
}
 
return bsel_alloc;
 @@ -373,9 +373,9 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *s, 
 PCIBus *root_bus,
memory_region_add_subregion(address_space_io, s->io_base, &s->io);
 
object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_BASE_PROP, 
 &s->io_base,
 -   &error_abort);
 +   true, &error_abort);
object_property_add_uint16_ptr(owner, ACPI_PCIHP_IO_LEN_PROP, 
 &s->io_len,
 -   &error_abort);
 +   true, &error_abort);
 }
 
 const VMStateDescription vmstate_acpi_pcihp_pci_status = {
 diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
 index 93aec2dd2c..032ba11e62 100644
 --- a/hw/acpi/piix4.c
 +++ b/hw/acpi/piix4.c
 @@ -443,17 +443,17 @@ static void piix4_pm_add_propeties(PIIX4PMState *s)
static const uint16_t sci_int = 9;
 
object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_ENABLE_CMD,
 -  &acpi_enable_cmd, NULL);
 +  &acpi_enable_cmd, true, NULL);
object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_DISABLE_CMD,
 -  &acpi_disable_cmd, NULL);
 +  &acpi_disable_cmd, true, NULL);
object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK,
 -  &gpe0_blk, NULL);
 +  &gpe0_blk, true, NULL);
object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_GPE0_BLK_LEN,
 -  &gpe0_blk_len, NULL);
 +  &gpe0_blk_len, true, NULL);
object_property_add_uint16_ptr(OBJECT(s), ACPI_PM_PROP_SCI_INT,
 -  &sci_int, NULL);
 +  &sci_int, true, NULL);
object_property_add_uint32_ptr(OBJECT(s), ACPI_PM_PROP_PM_IO_BASE,
 -  &s->io_base, NULL);
 +  &s->io_base, true, NULL);
 }
 
 static voi

[PATCH] blockjob: Fix error message for negative speed

2019-11-26 Thread Kevin Wolf
The error message for a negative speed uses QERR_INVALID_PARAMETER,
which implies that the 'speed' option doesn't even exist:

{"error": {"class": "GenericError", "desc": "Invalid parameter 'speed'"}}

Make it use QERR_INVALID_PARAMETER_VALUE instead:

{"error": {"class": "GenericError", "desc": "Parameter 'speed' expects a 
non-negative value"}}

Signed-off-by: Kevin Wolf 
---
 blockjob.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index c6e20e2fcd..5d63b1e89d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -261,7 +261,8 @@ void block_job_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 return;
 }
 if (speed < 0) {
-error_setg(errp, QERR_INVALID_PARAMETER, "speed");
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "speed",
+   "a non-negative value");
 return;
 }
 
-- 
2.20.1




Re: [PATCH] blockjob: Fix error message for negative speed

2019-11-26 Thread Alberto Garcia
On Tue 26 Nov 2019 02:42:22 PM CET, Kevin Wolf wrote:
> The error message for a negative speed uses QERR_INVALID_PARAMETER,
> which implies that the 'speed' option doesn't even exist:
>
> {"error": {"class": "GenericError", "desc": "Invalid parameter 'speed'"}}
>
> Make it use QERR_INVALID_PARAMETER_VALUE instead:
>
> {"error": {"class": "GenericError", "desc": "Parameter 'speed' expects a 
> non-negative value"}}
>
> Signed-off-by: Kevin Wolf 

Reviewed-by: Alberto Garcia 

Berto



Re: [PATCH v17 01/14] util/cutils: Add Add qemu_strtold and qemu_strtold_finite

2019-11-26 Thread Markus Armbruster
Tao Xu  writes:

> Hi Markus,
>
> Do you have any comments on this patch and 02/14 05/14 06/14.
> Thank you!

These provide a new QAPI built-in type 'time'.  It's like 'uint64' with
an implied nanoseconds unit, and additional convenience syntax in the
opts visitor and the keyval qobject input visitor.  Patterned after
'size'.

The only use of 'time' so far is member @latency of NumaOptions member
@hmap-lb.  Uses of that:

* QMP command set-numa-node

  The convenience syntax does not apply, as QMP uses the regular qobject
  input visitor, not the keyval one.

* CLI option -numa

  We first parse the option argument with QemuOpts, then convert it to
  NumaOptions with the opts visitor.

  The new built-in type 'time' gets used in -numa hmat-lb,...,latency=T

Questions / observations:

* The keyval qobject input visitor's support for 'time' appears to be
  unused for now.

* What's the anticipated range of values for -numa
  hmat-lb,...,latency=T?  I'm asking because I wonder whether we really
  need convenience syntax there.

* Sure you want fractions?

  Supporting fractions for byte counts (e.g.  1.5G) has been a mixed
  blessing, to put it charitably.

  Use of fractions that aren't representable as double is not advisable.
  For instance, 1.1G is 1181116006 bytes rounded from
  1181116006.401.  Why would anybody want that?

  Use of "nice" fractions is unproblematic, but the additional
  convenience is rather minor.  Is being able to write 1536M as 1.5G
  worth the trouble?  Meh.

  With "metric" rather than "binary" suffixes, fractions provide even
  less convenience: 1.5ms vs. 1500us.

  The implementation is limited to 53 bits of precision, which has been
  a source of confusion.  Even that has arguably taken far more patches
  than it's worth.  We're now talking about more patches to lift the
  restriction.  Meh.

  What exactly are we trying to achieve by supporting fractions?

* What about all the other time-valued things in the QAPI schema?

  There are many more, and some of them are also visible in CLI or HMP.
  By providing convenience syntax for just -numa hmat-lb,...,latency=T,
  we create inconsistency.

  To avoid it, we'd have to hunt down all the others.  But some of them
  aren't in nanoseconds.  Your new built-in type 'time' is only
  applicable to the ones in nanoseconds.  Do we need more built-in
  types?

This series is at v17.  I really, really want to tell you it's ready for
merging.  But as you see, I can't.

Maybe the convenience syntax is a good idea, maybe it's a bad idea.  But
it's definitely not a must-have idea.

If you want to pursue the idea, I recommend to split this series in two:
one part without the convenience, and a second part adding it.
Hopefully, we can then merge the first part without too much fuss.  The
second part will have to deal with the questions above.

You can also shelve the idea, i.e. do just the first part now.  It's
what I'd do.




Re: [PATCH v2 2/2] s390x/cpumodel: Introduce dynamic feature groups

2019-11-26 Thread David Hildenbrand
On 26.11.19 13:59, Christian Borntraeger wrote:
> re-adding ccs from the cover-letter
> 
 On 25.11.19 18:20, David Hildenbrand wrote:

 As soon as dynamic feature groups are used, the CPU model becomes
 migration-unsafe. Upper layers can expand these models to migration-safe
 and static variants, allowing them to be migrated.
>>>
>>> I really dislike that. I am trying to get rid of the unsafe variants (e.g. 
>>> now
>>> defaulting to host-model instead of host-passthrough). I do not want to give
>>> users new ways of hurting themselves.
>>>
>>
>> Please note that this is just on the bare command line. Libvirt and friends 
>> will expand the model and have proper migration in place. What exactly is 
>> your concern in that regard?
> 
> What is then the value? libvirt can also use host-model or  baselining if 
> necessary.
> And for the command line this will just add more opportunity to shot yourself 
> in the
> foot, no?

I don't think so. It's in no way more dangerous than "-cpu host" or
"-cpu max". And it is in no way more dangerous than the discussed CPU
versions, where even a "-cpu z13" would no longer be migration-safe.

You could - in theory - baseline(z13, host), but it could suddenly
fallback to a, say, zEC12 - and that's not what you asked for. And you
should not simply mask of deprecated features when baselining. Sure, we
could eventually add config knobs for that , but ...

... I really do like the part where you can specify on the command line
to have specific CPU definition, but including all available/recommended
features (e.g., nested virtualization).

> 
> Let me put it this way, I might have misunderstood what you are trying to do 
> here,
> but if I do not get, then others (e.g. users) will also not get it.

I remember you participated in the relevant discussions. That's where we
also agreed that versioned CPU models on s390x don't make any sense. But
I can refine what's included in this patch description

"There is the demand from higher levels in the stack to "have the
best CPU model possible on a given accelerator, firmware and HW"" - the
best CPU model for a specific *CPU definition*.

Say the user has the option to select a model (zEC12, z13, z14), upper
layers always want to have a model that includes all backported security
features. While the host model can do that, CPU definitions can't. You
can't change default models within a QEMU release, or for older releases
(e.g., a z13).

> 
> Maybe its just the interface or the name. But I find this very non-intuitive

I'm open for suggestions.

> 
> e.g. you wrote
> 
> Get the maximum possible feature set (e.g., including deprecated
> features) for a CPU definition in the configuration ("everything that
> could be enabled"):
> -cpu z14,all-features=off,available-features=on
> 
> Get all valid features for a CPU definition:
> -cpu z14,all-features=on
> 
> What is the point of this? It is either the same as the one before, or it wont
> be able to start. 

valid != available, all != available. Yes, the model won't run unless
you are on pretty good HW :)

Maybe I should just have dropped the last example, as it seems to
confuse people - it's mostly only relevant for introspection via CPU
model expansion.

I am open for better names. e.g. all-features -> valid-features.

-- 
Thanks,

David / dhildenb




[PULL 2/4] hw/arm: versal: Add the CRP as unimplemented

2019-11-26 Thread Peter Maydell
From: "Edgar E. Iglesias" 

Add the CRP as unimplemented thus avoiding bus errors when
guests access these registers.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Alistair Francis 
Reviewed-by: Luc Michel 
Message-id: 20191115154734.26449-2-edgar.igles...@gmail.com
Signed-off-by: Peter Maydell 
---
 include/hw/arm/xlnx-versal.h | 3 +++
 hw/arm/xlnx-versal.c | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index 14405c1465d..d844c4ffe47 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -119,4 +119,7 @@ typedef struct Versal {
 #define MM_IOU_SCNTRS_SIZE  0x1
 #define MM_FPD_CRF  0xfd1aU
 #define MM_FPD_CRF_SIZE 0x14
+
+#define MM_PMC_CRP  0xf126U
+#define MM_PMC_CRP_SIZE 0x1
 #endif
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index 98163eb1aad..8b3d8d85b86 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -257,6 +257,8 @@ static void versal_unimp(Versal *s)
 MM_CRL, MM_CRL_SIZE);
 versal_unimp_area(s, "crf", &s->mr_ps,
 MM_FPD_CRF, MM_FPD_CRF_SIZE);
+versal_unimp_area(s, "crp", &s->mr_ps,
+MM_PMC_CRP, MM_PMC_CRP_SIZE);
 versal_unimp_area(s, "iou-scntr", &s->mr_ps,
 MM_IOU_SCNTR, MM_IOU_SCNTR_SIZE);
 versal_unimp_area(s, "iou-scntr-seucre", &s->mr_ps,
-- 
2.20.1




[PULL 0/4] target-arm queue

2019-11-26 Thread Peter Maydell
Arm patches for rc3 : just a handful of bug fixes.

thanks
-- PMM


The following changes since commit 4ecc984210ca1bf508a96a550ec8a93a5f833f6c:

  Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.2-rc3' 
into staging (2019-11-26 12:36:40 +)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git 
tags/pull-target-arm-20191126

for you to fetch changes up to 6a4ef4e5d1084ce41fafa7d470a644b0fd3d9317:

  target/arm: Honor HCR_EL2.TID3 trapping requirements (2019-11-26 13:55:37 
+)


target-arm queue:
 * handle FTYPE flag correctly in v7M exception return
   for v7M CPUs with an FPU (v8M CPUs were already correct)
 * versal: Add the CRP as unimplemented
 * Fix ISR_EL1 tracking when executing at EL2
 * Honor HCR_EL2.TID3 trapping requirements


Edgar E. Iglesias (1):
  hw/arm: versal: Add the CRP as unimplemented

Jean-Hugues Deschênes (1):
  target/arm: Fix handling of cortex-m FTYPE flag in EXCRET

Marc Zyngier (2):
  target/arm: Fix ISR_EL1 tracking when executing at EL2
  target/arm: Honor HCR_EL2.TID3 trapping requirements

 include/hw/arm/xlnx-versal.h |  3 ++
 hw/arm/xlnx-versal.c |  2 ++
 target/arm/helper.c  | 83 ++--
 target/arm/m_helper.c|  7 ++--
 4 files changed, 89 insertions(+), 6 deletions(-)



[PULL 1/4] target/arm: Fix handling of cortex-m FTYPE flag in EXCRET

2019-11-26 Thread Peter Maydell
From: Jean-Hugues Deschênes 

According to the PushStack() pseudocode in the armv7m RM,
bit 4 of the LR should be set to NOT(CONTROL.PFCA) when
an FPU is present. Current implementation is doing it for
armv8, but not for armv7. This patch makes the existing
logic applicable to both code paths.

Signed-off-by: Jean-Hugues Deschenes 
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 target/arm/m_helper.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
index 4a48b792520..76de317e6af 100644
--- a/target/arm/m_helper.c
+++ b/target/arm/m_helper.c
@@ -2233,19 +2233,18 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 if (env->v7m.secure) {
 lr |= R_V7M_EXCRET_S_MASK;
 }
-if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
-lr |= R_V7M_EXCRET_FTYPE_MASK;
-}
 } else {
 lr = R_V7M_EXCRET_RES1_MASK |
 R_V7M_EXCRET_S_MASK |
 R_V7M_EXCRET_DCRS_MASK |
-R_V7M_EXCRET_FTYPE_MASK |
 R_V7M_EXCRET_ES_MASK;
 if (env->v7m.control[M_REG_NS] & R_V7M_CONTROL_SPSEL_MASK) {
 lr |= R_V7M_EXCRET_SPSEL_MASK;
 }
 }
+if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+lr |= R_V7M_EXCRET_FTYPE_MASK;
+}
 if (!arm_v7m_is_handler_mode(env)) {
 lr |= R_V7M_EXCRET_MODE_MASK;
 }
-- 
2.20.1




[PULL 3/4] target/arm: Fix ISR_EL1 tracking when executing at EL2

2019-11-26 Thread Peter Maydell
From: Marc Zyngier 

The ARMv8 ARM states when executing at EL2, EL3 or Secure EL1,
ISR_EL1 shows the pending status of the physical IRQ, FIQ, or
SError interrupts.

Unfortunately, QEMU's implementation only considers the HCR_EL2
bits, and ignores the current exception level. This means a hypervisor
trying to look at its own interrupt state actually sees the guest
state, which is unexpected and breaks KVM as of Linux 5.3.

Instead, check for the running EL and return the physical bits
if not running in a virtualized context.

Fixes: 636540e9c40b
Cc: qemu-sta...@nongnu.org
Reported-by: Quentin Perret 
Signed-off-by: Marc Zyngier 
Message-id: 20191122135833.28953-1-...@kernel.org
Reviewed-by: Peter Maydell 
Reviewed-by: Edgar E. Iglesias 
Signed-off-by: Peter Maydell 
---
 target/arm/helper.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index a089fb5a690..027fffbff69 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1934,8 +1934,11 @@ static uint64_t isr_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 CPUState *cs = env_cpu(env);
 uint64_t hcr_el2 = arm_hcr_el2_eff(env);
 uint64_t ret = 0;
+bool allow_virt = (arm_current_el(env) == 1 &&
+   (!arm_is_secure_below_el3(env) ||
+(env->cp15.scr_el3 & SCR_EEL2)));
 
-if (hcr_el2 & HCR_IMO) {
+if (allow_virt && (hcr_el2 & HCR_IMO)) {
 if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
 ret |= CPSR_I;
 }
@@ -1945,7 +1948,7 @@ static uint64_t isr_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 }
 }
 
-if (hcr_el2 & HCR_FMO) {
+if (allow_virt && (hcr_el2 & HCR_FMO)) {
 if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
 ret |= CPSR_F;
 }
-- 
2.20.1




[PULL 4/4] target/arm: Honor HCR_EL2.TID3 trapping requirements

2019-11-26 Thread Peter Maydell
From: Marc Zyngier 

HCR_EL2.TID3 mandates that access from EL1 to a long list of id
registers traps to EL2, and QEMU has so far ignored this requirement.

This breaks (among other things) KVM guests that have PtrAuth enabled,
while the hypervisor doesn't want to expose the feature to its guest.
To achieve this, KVM traps the ID registers (ID_AA64ISAR1_EL1 in this
case), and masks out the unsupported feature.

QEMU not honoring the trap request means that the guest observes
that the feature is present in the HW, starts using it, and dies
a horrible death when KVM injects an UNDEF, because the feature
*really* isn't supported.

Do the right thing by trapping to EL2 if HCR_EL2.TID3 is set.

Note that this change does not include trapping of the MVFR
registers from AArch32 (they are accessed via the VMRS
instruction and need to be handled in a different way).

Reported-by: Will Deacon 
Signed-off-by: Marc Zyngier 
Tested-by: Will Deacon 
Message-id: 20191123115618.29230-1-...@kernel.org
[PMM: added missing accessfn line for ID_AA4PFR2_EL1_RESERVED;
 changed names of access functions to include _tid3]
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 target/arm/helper.c | 76 +
 1 file changed, 76 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 027fffbff69..0bf8f53d4b8 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5978,6 +5978,26 @@ static const ARMCPRegInfo predinv_reginfo[] = {
 REGINFO_SENTINEL
 };
 
+static CPAccessResult access_aa64_tid3(CPUARMState *env, const ARMCPRegInfo 
*ri,
+   bool isread)
+{
+if ((arm_current_el(env) < 2) && (arm_hcr_el2_eff(env) & HCR_TID3)) {
+return CP_ACCESS_TRAP_EL2;
+}
+
+return CP_ACCESS_OK;
+}
+
+static CPAccessResult access_aa32_tid3(CPUARMState *env, const ARMCPRegInfo 
*ri,
+   bool isread)
+{
+if (arm_feature(env, ARM_FEATURE_V8)) {
+return access_aa64_tid3(env, ri, isread);
+}
+
+return CP_ACCESS_OK;
+}
+
 void register_cp_regs_for_features(ARMCPU *cpu)
 {
 /* Register all the coprocessor registers based on feature bits */
@@ -6001,6 +6021,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 { .name = "ID_PFR0", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 0,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_pfr0 },
 /* ID_PFR1 is not a plain ARM_CP_CONST because we don't know
  * the value of the GIC field until after we define these regs.
@@ -6008,63 +6029,78 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 { .name = "ID_PFR1", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 1,
   .access = PL1_R, .type = ARM_CP_NO_RAW,
+  .accessfn = access_aa32_tid3,
   .readfn = id_pfr1_read,
   .writefn = arm_cp_write_ignore },
 { .name = "ID_DFR0", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 2,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_dfr0 },
 { .name = "ID_AFR0", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 3,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_afr0 },
 { .name = "ID_MMFR0", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 4,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_mmfr0 },
 { .name = "ID_MMFR1", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 5,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_mmfr1 },
 { .name = "ID_MMFR2", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 6,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_mmfr2 },
 { .name = "ID_MMFR3", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 7,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   .resetvalue = cpu->id_mmfr3 },
 { .name = "ID_ISAR0", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
   .access = PL1_R, .type = ARM_CP_CONST,
+  .accessfn = access_aa32_tid3,
   

Re: [RFC 00/10] R300 QEMU device V2

2019-11-26 Thread Daniel P . Berrangé
On Tue, Nov 26, 2019 at 06:14:27PM +0530, aaron.zakh...@gmail.com wrote:
> From: Aaron Dominick 
> 
> I have removed the botched patches and have got the code working upto the 
> GART initialization.
> I am not sure how to implement the GART. I am guessing it should be an IOMMU 
> device but I think that is a bit much for an emulated card. 
> The earlier problem of display probing seems to be resolved by using an R300 
> bios I got from TechPowerUP's GPU database:
> 
>   https://www.techpowerup.com/vgabios/14509/14509
> I am NOT sure if we can distribute it in the QEMU source tree. If it
> does cause problems I can send a patch to remove it.

That site seems to be a repository of BIOS uploaded by arbitrary users,
with no information on what license terms might apply to the uploads.

We have to therefore assume the worst and treat the BIOS images on that
site as proprietary and not re-distributable, despite the fact that the
site itself is acting as a 3rd party distributor.

IOW, we can't have this in QEMU git I'm afraid, unless someone can find
a trustworthy vendor source for the original image with accompanying
license information.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 1/4] block/io: fix bdrv_co_block_status_above

2019-11-26 Thread Kevin Wolf
Am 26.11.2019 um 08:26 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 25.11.2019 19:00, Kevin Wolf wrote:
> > Am 16.11.2019 um 17:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >> bdrv_co_block_status_above has several problems with handling short
> >> backing files:
> >>
> >> 1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but
> >> without BDRV_BLOCK_ALLOCATED flag, when actually short backing file
> >> which produces these after-EOF zeros is inside requested backing
> >> sequesnce.
> > 
> > s/sequesnce/sequence/
> > 
> >>
> >> 2. With want_zeros=false, it will just stop inside requested region, if
> >> we have unallocated region in top node when underlying backing is
> >> short.
> > 
> > I honestly don't understand this one. Can you rephrase/explain in more
> > detail what you mean by "stop inside [the] requested region"?
> 
> Hmm, yes, bad description. I mean, it may return pnum=0 prior to actual EOF,
> because of EOF of short backing file.

Ah, yes, that's true. Definitely mention pnum=0 in the comment, this
explanation is much clearer.

> >> Fix these things, making logic about short backing files clearer.
> >>
> >> Note that 154 output changed, because now bdrv_block_status_above don't
> >> merge unallocated zeros with zeros after EOF (which are actually
> >> "allocated" in POV of read from backing-chain top) and is_zero() just
> >> don't understand that the whole head or tail is zero. We may update
> >> is_zero to call bdrv_block_status_above several times, or add flag to
> >> bdrv_block_status_above that we are not interested in ALLOCATED flag,
> >> so ranges with different ALLOCATED status may be merged, but actually,
> >> it seems that we'd better don't care about this corner case.
> >>
> >> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> >> ---
> >>   block/io.c | 41 --
> >>   tests/qemu-iotests/154.out |  4 ++--
> >>   2 files changed, 32 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/block/io.c b/block/io.c
> >> index f75777f5ea..4d7fa99bd2 100644
> >> --- a/block/io.c
> >> +++ b/block/io.c
> >> @@ -2434,25 +2434,44 @@ static int coroutine_fn 
> >> bdrv_co_block_status_above(BlockDriverState *bs,
> >>   ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, 
> >> map,
> >>  file);
> >>   if (ret < 0) {
> >> -break;
> >> +return ret;
> >>   }
> >> -if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
> >> +if (*pnum == 0) {
> >> +if (first) {
> >> +return ret;
> >> +}
> >> +
> >>   /*
> >> - * Reading beyond the end of the file continues to read
> >> - * zeroes, but we can only widen the result to the
> >> - * unallocated length we learned from an earlier
> >> - * iteration.
> >> + * Reads from bs for selected region will return zeroes, 
> >> produced
> >> + * because current level is short. We should consider it as
> >> + * allocated.
> > 
> > "the selected region"
> > "the current level"
> > 
> >> + * TODO: Should we report p as file here?
> > 
> > I think that would make sense.
> > 
> >>*/
> >> +assert(ret & BDRV_BLOCK_EOF);
> > 
> > Can this assertion be moved above the if (first)?
> 
> it may correspond to requested bytes==0.. But we can check it separately
> before for loop and move this assertion.

Ah, right. Don't bother then, it's fine either way.

Kevin




[Bug 1843254] Re: arm emulation of HCR.TID3 traps are not implemented

2019-11-26 Thread Peter Maydell
TID3 trapping should be mostly fixed for 4.2 -- we will trap accesses to
all the coprocessor/sysreg ID registers that TID3 covers. Trapping of
aarch32 MVFR* (which are accessed via vmrs) will not make it into this
release, but should be in 5.0.


** Changed in: qemu
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843254

Title:
  arm emulation of HCR.TID3 traps are not implemented

Status in QEMU:
  In Progress

Bug description:
  On ARM (aarch64), HCR_EL2.TID3 [bit18] is supposed to trap ID group 3,
  which includes the ID_AA64{PFR,DFR,ISAR,MMFR,AFR}*_EL1 registers.
  However, setting that HCR bit has no effect and accesses to those ID
  registers are not trapped to EL2 with an EC syndrome value of 0x18.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843254/+subscriptions



  1   2   3   >