Mike,

I had the same issue a month ago when I rolled out SR-IOV in my cloud, and this is
what I did to solve it. Set the following property on the flavor:

hw:numa_nodes=2

It will spread the instance vCPUs across both NUMA nodes. Yes, there will be a small
performance penalty, but if you tune your application accordingly you should be fine.
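
For reference, this is roughly how I applied it from the CLI (just a sketch, assuming
the flavor is the "vm-2" one from your paste; adjust the name to match your environment):

openstack flavor set vm-2 --property hw:numa_nodes=2

# confirm the property was applied
openstack flavor show vm-2 -c properties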

Yes, this is a bug. I have already opened a ticket and I believe folks are working on
it, but it's not a simple fix. They may release a new feature in a coming OpenStack
release.

Sent from my iPhone

> On Nov 11, 2018, at 9:25 PM, Mike Joseph <m...@mode.net> wrote:
> 
> Hi folks,
> 
> It appears that the numa_policy attribute of a PCI alias is ignored for 
> flavors referencing that alias if the flavor also has hw:cpu_policy=dedicated 
> set.  The alias config is:
> 
> alias = { "name": "mlx", "device_type": "type-VF", "vendor_id": "15b3", 
> "product_id": "1004", "numa_policy": "preferred" }
> 
> And the flavor config is:
> 
> {
>   "OS-FLV-DISABLED:disabled": false,
>   "OS-FLV-EXT-DATA:ephemeral": 0,
>   "access_project_ids": null,
>   "disk": 10,
>   "id": "221e1bcd-2dde-48e6-bd09-820012198908",
>   "name": "vm-2",
>   "os-flavor-access:is_public": true,
>   "properties": "hw:cpu_policy='dedicated', pci_passthrough:alias='mlx:1'",
>   "ram": 8192,
>   "rxtx_factor": 1.0,
>   "swap": "",
>   "vcpus": 2
> }
> 
> In short, our compute nodes have an SR-IOV Mellanox NIC (ConnectX-3) with 16 
> VFs configured.  We wish to expose these VFs to VMs that schedule on the 
> host.  However, the NIC is in NUMA region 0 which means that only half of the 
> compute node's CPU cores would be usable if we required VM affinity to the 
> NIC's NUMA region.  But we don't need that, since we are okay with 
> cross-region access to the PCI device.
> 
> However, we do need CPU pinning to work, in order to have efficient cache 
> hits on our VM processes.  Therefore, we still want to pin our vCPUs to 
> pCPUs, even if the pins end up on a NUMA region opposite of the NIC.  The 
> spec for numa_policy seems to indicate that this is exactly the intent of the 
> option:
> 
> https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/share-pci-between-numa-nodes.html
> 
> But, with the above config, we still get PCI affinity scheduling errors:
> 
> 'Insufficient compute resources: Requested instance NUMA topology together 
> with requested PCI devices cannot fit the given host NUMA topology.'
> 
> This strikes me as a bug, but perhaps I am missing something here?
> 
> Thanks,
> MJ
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack