On Wed, 5 Jun 2013 18:42:13 +0800 Amos Kong <ak...@redhat.com> wrote:
> Currently macvtap based macvlan device is working in promiscuous > mode, we want to implement mac-programming over macvtap through > Libvirt for better performance. > > Design: > QEMU notifies Libvirt when rx-filter config is changed in guest, > then Libvirt query the rx-filter information by a monitor command, > and sync the change to macvtap device. Related rx-filter config > of the nic contains main mac, rx-mode items and vlan table. > > This patch adds a QMP event to notify management of rx-filter change, > and adds a monitor command for management to query rx-filter > information. > > For reducing length of output, we just return the entries of vlan > filter table that have active vlan. > > Event_throttle API can avoid the events to flood QMP client, but it > could cause an unexpected delay. So a flag for each nic is used to > avoid events flooding, if management doesn't query rx-filter after > it receives one event, new events won't be emitted to QMP monitor. > > There maybe exist an uncontrollable delay if we let Libvirt do the > real change, guests normally expect rx-filter updates immediately. > But it's another separate issue, we can investigate it when the > work in Libvirt side is done. I think I completely misunderstood your testing results. I had understood that: 1. changing the mac often & quick enough to be a problem was a corner case and 2. you actually can overflow mngt I hope I really got it wrong, otherwise you'll be using that flag as a replacement for the event throttle API, which would be a big mistake. Can you please add your test results & analysis to this commit message? Two small comments below. > Signed-off-by: Amos Kong <ak...@redhat.com> > --- > v2: add argument to filter mac-table info of single nic (Stefan) > update the document, add event notification > v3: rename to rx-filter, add main mac, avoid events flooding (MST) > fix error process (Stefan), fix qmp interface (Eric) > v4: process qerror in hmp, cleanup (Luiz) > set flag for each device, add device path in event, add > helper for g_strdup_printf (MST) > fix qmp document (Eric) > v5: add path in doc, define notify flag to unsigned (Eric) > add vlan table (Jason), drop monitor cmd > --- > QMP/qmp-events.txt | 20 +++++++++ > hw/net/virtio-net.c | 112 > ++++++++++++++++++++++++++++++++++++++++++++++ > include/monitor/monitor.h | 1 + > include/net/net.h | 3 ++ > monitor.c | 1 + > net/net.c | 47 +++++++++++++++++++ > qapi-schema.json | 89 ++++++++++++++++++++++++++++++++++++ > qmp-commands.hx | 66 +++++++++++++++++++++++++++ > 8 files changed, 339 insertions(+) > > diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt > index 92fe5fb..885230e 100644 > --- a/QMP/qmp-events.txt > +++ b/QMP/qmp-events.txt > @@ -172,6 +172,26 @@ Data: > }, > "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } > > +NIC_RX_FILTER_CHANGED > +----------------- > + > +Emitted when rx-filter configuration of nic is changed by the guest. > +Each nic has a flag to control event emit, the flag is set to false > +when it emits one event of the nic, the flag is set to true when > +management queries the rx-filter of the nic. This is used to avoid > +events flooding. Having this flag is an implementation detail. I think you should only say that the event is emitted once until the query command is executed. > + > +Data: > + > +- "name": net client name (json-string) > +- "path": device path (json-string) > + > +{ "event": "NIC_RX_FILTER_CHANGED", > + "data": { "name": "vnet0", > + "path": "/machine/peripheral/vnet0/virtio-backend" }, > + "timestamp": { "seconds": 1368697518, "microseconds": 326866 } } > +} > + > RESET > ----- > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > index 1ea9556..ae1eab6 100644 > --- a/hw/net/virtio-net.c > +++ b/hw/net/virtio-net.c > @@ -21,6 +21,8 @@ > #include "hw/virtio/virtio-net.h" > #include "net/vhost_net.h" > #include "hw/virtio/virtio-bus.h" > +#include "qapi/qmp/qjson.h" > +#include "monitor/monitor.h" > > #define VIRTIO_NET_VM_VERSION 11 > > @@ -192,6 +194,104 @@ static void virtio_net_set_link_status(NetClientState > *nc) > virtio_net_set_status(vdev, vdev->status); > } > > +static void rxfilter_notify(NetClientState *nc) > +{ > + QObject *event_data; > + VirtIONet *n = qemu_get_nic_opaque(nc); > + > + if (nc->rxfilter_notify_enabled) { > + event_data = qobject_from_jsonf("{ 'name': %s, 'path': %s }", > + n->netclient_name, > + object_get_canonical_path(OBJECT(n->qdev))); > + monitor_protocol_event(QEVENT_NIC_RX_FILTER_CHANGED, event_data); > + qobject_decref(event_data); > + /* disable event notification to avoid events flooding */ > + nc->rxfilter_notify_enabled = 0; > + } > +} > + > +static char *mac_strdup_printf(uint8_t *mac) > +{ mac can be const. Is there more code in QEMU that could use this function? If there's, then it's better to move this function to a more generic .c file. > + return g_strdup_printf("%.2x:%.2x:%.2x:%.2x:%.2x:%.2x", mac[0], > + mac[1], mac[2], mac[3], mac[4], mac[5]); > +} > + > +static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc) > +{ > + VirtIONet *n = qemu_get_nic_opaque(nc); > + RxFilterInfo *info; > + strList *str_list = NULL; > + strList *entry; > + VlanTableEntryList *vlan_list = NULL; > + VlanTableEntryList *vlan_entry; > + VlanTableEntry *table_entry; > + int i; > + > + info = g_malloc0(sizeof(*info)); > + info->name = g_strdup(nc->name); > + info->promiscuous = n->promisc; > + > + if (n->nouni) { > + info->unicast = RX_STATE_NONE; > + } else if (n->alluni) { > + info->unicast = RX_STATE_ALL; > + } else { > + info->unicast = RX_STATE_NORMAL; > + } > + > + if (n->nomulti) { > + info->multicast = RX_STATE_NONE; > + } else if (n->allmulti) { > + info->multicast = RX_STATE_ALL; > + } else { > + info->multicast = RX_STATE_NORMAL; > + } > + > + info->broadcast_allowed = n->nobcast; > + info->multicast_overflow = n->mac_table.multi_overflow; > + info->unicast_overflow = n->mac_table.uni_overflow; > + > + info->main_mac = mac_strdup_printf(n->mac); > + > + for (i = 0; i < n->mac_table.first_multi; i++) { > + entry = g_malloc0(sizeof(*entry)); > + entry->value = mac_strdup_printf(n->mac_table.macs + i * ETH_ALEN); > + entry->next = str_list; > + str_list = entry; > + } > + info->unicast_table = str_list; > + > + str_list = NULL; > + for (i = n->mac_table.first_multi; i < n->mac_table.in_use; i++) { > + entry = g_malloc0(sizeof(*entry)); > + entry->value = mac_strdup_printf(n->mac_table.macs + i * ETH_ALEN); > + entry->next = str_list; > + str_list = entry; > + } > + info->multicast_table = str_list; > + > + i = 0; > + while (i < MAX_VLAN >> 5) { > + /* ignore the entries that have no active vlan */ > + if (n->vlans[i] == 0) { > + i++; > + continue; > + } > + vlan_entry = g_malloc0(sizeof(*vlan_entry)); > + vlan_entry->value = g_malloc0(sizeof(*table_entry)); > + vlan_entry->value->index = i; > + vlan_entry->value->value = n->vlans[i++]; > + vlan_entry->next = vlan_list; > + vlan_list = vlan_entry; > + } > + info->vlan_table = vlan_list; > + > + /* enable event notification after query */ > + nc->rxfilter_notify_enabled = 1; > + > + return info; > +} > + > static void virtio_net_reset(VirtIODevice *vdev) > { > VirtIONet *n = VIRTIO_NET(vdev); > @@ -420,6 +520,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, > uint8_t cmd, > { > uint8_t on; > size_t s; > + NetClientState *nc = qemu_get_queue(n->nic); > > s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on)); > if (s != sizeof(on)) { > @@ -442,6 +543,8 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, > uint8_t cmd, > return VIRTIO_NET_ERR; > } > > + rxfilter_notify(nc); > + > return VIRTIO_NET_OK; > } > > @@ -487,6 +590,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t > cmd, > { > struct virtio_net_ctrl_mac mac_data; > size_t s; > + NetClientState *nc = qemu_get_queue(n->nic); > > if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { > if (iov_size(iov, iov_cnt) != sizeof(n->mac)) { > @@ -495,6 +599,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t > cmd, > s = iov_to_buf(iov, iov_cnt, 0, &n->mac, sizeof(n->mac)); > assert(s == sizeof(n->mac)); > qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac); > + rxfilter_notify(nc); > + > return VIRTIO_NET_OK; > } > > @@ -559,6 +665,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t > cmd, > n->mac_table.multi_overflow = 1; > } > > + rxfilter_notify(nc); > + > return VIRTIO_NET_OK; > } > > @@ -567,6 +675,7 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, > uint8_t cmd, > { > uint16_t vid; > size_t s; > + NetClientState *nc = qemu_get_queue(n->nic); > > s = iov_to_buf(iov, iov_cnt, 0, &vid, sizeof(vid)); > vid = lduw_p(&vid); > @@ -584,6 +693,8 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, > uint8_t cmd, > else > return VIRTIO_NET_ERR; > > + rxfilter_notify(nc); > + > return VIRTIO_NET_OK; > } > > @@ -1312,6 +1423,7 @@ static NetClientInfo net_virtio_info = { > .receive = virtio_net_receive, > .cleanup = virtio_net_cleanup, > .link_status_changed = virtio_net_set_link_status, > + .query_rx_filter = virtio_net_query_rxfilter, > }; > > static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx) > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h > index 1a6cfcf..1942cc4 100644 > --- a/include/monitor/monitor.h > +++ b/include/monitor/monitor.h > @@ -41,6 +41,7 @@ typedef enum MonitorEvent { > QEVENT_BLOCK_JOB_READY, > QEVENT_DEVICE_DELETED, > QEVENT_DEVICE_TRAY_MOVED, > + QEVENT_NIC_RX_FILTER_CHANGED, > QEVENT_SUSPEND, > QEVENT_SUSPEND_DISK, > QEVENT_WAKEUP, > diff --git a/include/net/net.h b/include/net/net.h > index 43d85a1..30e4b04 100644 > --- a/include/net/net.h > +++ b/include/net/net.h > @@ -49,6 +49,7 @@ typedef ssize_t (NetReceiveIOV)(NetClientState *, const > struct iovec *, int); > typedef void (NetCleanup) (NetClientState *); > typedef void (LinkStatusChanged)(NetClientState *); > typedef void (NetClientDestructor)(NetClientState *); > +typedef RxFilterInfo *(QueryRxFilter)(NetClientState *); > > typedef struct NetClientInfo { > NetClientOptionsKind type; > @@ -59,6 +60,7 @@ typedef struct NetClientInfo { > NetCanReceive *can_receive; > NetCleanup *cleanup; > LinkStatusChanged *link_status_changed; > + QueryRxFilter *query_rx_filter; > NetPoll *poll; > } NetClientInfo; > > @@ -74,6 +76,7 @@ struct NetClientState { > unsigned receive_disabled : 1; > NetClientDestructor *destructor; > unsigned int queue_index; > + unsigned rxfilter_notify_enabled:1; > }; > > typedef struct NICState { > diff --git a/monitor.c b/monitor.c > index 6ce2a4e..5e64fe8 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { > [QEVENT_BLOCK_JOB_READY] = "BLOCK_JOB_READY", > [QEVENT_DEVICE_DELETED] = "DEVICE_DELETED", > [QEVENT_DEVICE_TRAY_MOVED] = "DEVICE_TRAY_MOVED", > + [QEVENT_NIC_RX_FILTER_CHANGED] = "NIC_RX_FILTER_CHANGED", > [QEVENT_SUSPEND] = "SUSPEND", > [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK", > [QEVENT_WAKEUP] = "WAKEUP", > diff --git a/net/net.c b/net/net.c > index 43a74e4..33abffe 100644 > --- a/net/net.c > +++ b/net/net.c > @@ -961,6 +961,53 @@ void print_net_client(Monitor *mon, NetClientState *nc) > nc->info_str); > } > > +RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name, > + Error **errp) > +{ > + NetClientState *nc; > + RxFilterInfoList *filter_list = NULL, *last_entry = NULL; > + > + QTAILQ_FOREACH(nc, &net_clients, next) { > + RxFilterInfoList *entry; > + RxFilterInfo *info; > + > + /* only query rx-filter information of nic */ > + if (nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC) { > + continue; > + } > + if (has_name && strcmp(nc->name, name) != 0) { > + continue; > + } > + > + if (nc->info->query_rx_filter) { > + info = nc->info->query_rx_filter(nc); > + entry = g_malloc0(sizeof(*entry)); > + entry->value = info; > + > + if (!filter_list) { > + filter_list = entry; > + } else { > + last_entry->next = entry; > + } > + last_entry = entry; > + } else if (has_name) { > + error_setg(errp, "net client(%s) doesn't support" > + " rx-filter querying", name); > + break; > + } > + } > + > + if (filter_list == NULL && !error_is_set(errp)) { > + if (has_name) { > + error_setg(errp, "invalid net client name: %s", name); > + } else { > + error_setg(errp, "no net client supports rx-filter querying"); > + } > + } > + > + return filter_list; > +} > + > void do_info_network(Monitor *mon, const QDict *qdict) > { > NetClientState *nc, *peer; > diff --git a/qapi-schema.json b/qapi-schema.json > index ef1f657..fb6f12c 100644 > --- a/qapi-schema.json > +++ b/qapi-schema.json > @@ -3618,3 +3618,92 @@ > '*cpuid-input-ecx': 'int', > 'cpuid-register': 'X86CPURegister32', > 'features': 'int' } } > + > +## > +# @RxState: > +# > +# Packets receiving state > +# > +# @normal: filter assigned packets according to the mac-table > +# > +# @none: don't receive any assigned packet > +# > +# @all: receive all assigned packets > +# > +## > +{ 'enum': 'RxState', 'data': [ 'normal', 'none', 'all' ] } > + > +## > +# @VlanTableEntry > +# > +# Entry detail of vlan filter table > +# > +# @index: index of vlan filter table > +# > +# @value: 32 bit of indicator for active vlans > +# > +# Since 1.6 > +## > + > +{ 'type': 'VlanTableEntry', 'data': { 'index': 'int', 'value': 'uint32' } } > + > +## > +# @RxFilterInfo: > +# > +# Rx-filter information for a nic. > +# > +# @name: net client name > +# > +# @promiscuous: whether promiscuous mode is enabled > +# > +# @multicast: multicast receive state > +# > +# @unicast: unicast receive state > +# > +# @broadcast-allowed: whether to receive broadcast > +# > +# @multicast-overflow: multicast table is overflowed or not > +# > +# @unicast-overflow: unicast table is overflowed or not > +# > +# @main-mac: the main macaddr string > +# > +# @vlan-table: a list of @VlanTableEntry > +# > +# @unicast-table: a list of unicast macaddr string > +# > +# @multicast-table: a list of multicast macaddr string > +# > +# Since 1.6 > +## > + > +{ 'type': 'RxFilterInfo', > + 'data': { > + 'name': 'str', > + 'promiscuous': 'bool', > + 'multicast': 'RxState', > + 'unicast': 'RxState', > + 'broadcast-allowed': 'bool', > + 'multicast-overflow': 'bool', > + 'unicast-overflow': 'bool', > + 'main-mac': 'str', > + 'vlan-table': ['VlanTableEntry'], > + 'unicast-table': ['str'], > + 'multicast-table': ['str'] }} > + > +## > +# @query-rx-filter: > +# > +# Return rx-filter information for all nics (or for the given nic). > +# > +# @name: #optional net client name > +# > +# Returns: list of @RxFilterInfo for all nics (or for the given nic). > +# Returns an error if the given @name doesn't exist, or given > +# nic doesn't support rx-filter querying, or no net client > +# supports rx-filter querying > +# > +# Since: 1.6 > +## > +{ 'command': 'query-rx-filter', 'data': { '*name': 'str' }, > + 'returns': ['RxFilterInfo'] } > diff --git a/qmp-commands.hx b/qmp-commands.hx > index ffd130e..e76e9a2 100644 > --- a/qmp-commands.hx > +++ b/qmp-commands.hx > @@ -2932,3 +2932,69 @@ Example: > <- { "return": {} } > > EQMP > + { > + .name = "query-rx-filter", > + .args_type = "name:s?", > + .mhandler.cmd_new = qmp_marshal_input_query_rx_filter, > + }, > + > +SQMP > +query-rx-filter > +--------------- > + > +Show rx-filter information. > + > +Returns a json-array of rx-filter information for all nics (or for the > +given nic), returning an error if the given nic doesn't exist, or > +given nic doesn't support rx-filter querying, or no net client > +supports rx-filter querying. > + > +The query will clear the event notification flag of each nic, then qemu > +will start to emit event to QMP monitor. > + > +Each array entry contains the following: > + > +- "name": net client name (json-string) > +- "promiscuous": promiscuous mode is enabled (json-bool) > +- "multicast": multicast receive state (one of 'normal', 'none', 'all') > +- "unicast": unicast receive state (one of 'normal', 'none', 'all') > +- "broadcast-allowed": allow to receive broadcast (json-bool) > +- "multicast-overflow": multicast table is overflowed (json-bool) > +- "unicast-overflow": unicast table is overflowed (json-bool) > +- "main-mac": main macaddr string (json-string) > +- "vlan-table": a json-array of vlan-table entry > + - "index": index of vlan filter table (json-int) > + - "value": 32 bit of indicator for active vlans > +- "unicast-table": a json-array of unicast macaddr string > +- "multicast-table": a json-array of multicast macaddr string > + > +Example: > + > +-> { "execute": "query-rx-filter", "arguments": { "name": "vnet0" } } > +<- { "return": [ > + { > + "promiscuous": true, > + "name": "vnet0", > + "main-mac": "52:54:00:12:34:56", > + "unicast": "normal", > + "vlan-table": [ > + { "index": 0, > + "entry": 1 > + } > + ], > + "unicast-table": [ > + ], > + "multicast": "normal", > + "multicast-overflow": false, > + "unicast-overflow": false, > + "multicast-table": [ > + "01:00:5e:00:00:01", > + "33:33:00:00:00:01", > + "33:33:ff:12:34:56" > + ], > + "broadcast-allowed": false > + } > + ] > + } > + > +EQMP