Nithin,

Thanks for reviewing my patch. In the future I will try to break big code 
changes into smaller patches to ease the code review process.

I will add more comments to the added data structures and functions. Also a 
diagram to explain the workflow.

Please see my answers inline.

-Sorin

-----Original Message-----
From: Nithin Raju [mailto:nit...@vmware.com] 
Sent: Saturday, 11 April, 2015 12:14
To: dev@openvswitch.org; Sorin Vinturis
Subject: Re: [ovs-dev] [PATCH v3 1/3] datapath-windows: Support for custom 
VXLAN tunnel port

hi Sorin,
Thanks for making this change. Apologies of the delay in the review. In 
defense, it was not a small patch :)

The change looks good in general. I had the following comments though:

The WFP (tunnel filter) state machine is quite confusing. Would it be possible 
to write a diagram/sentences to explain which function is called at which 
point? A tree would also be great. If not now, it could be done later too.

An example of notes is as follows:
OVS Driver Init:
<these are the functions>

OVS Switch attach handler:
<these are the functions>

OVS VXLAN port add:
<these are the functions>

Also, what would be useful is to add a comment for each of the ‘g*’ variables 
to say at what stage they are initialized.

Pls. add a short description of each function, at least the important ones.

Specific comments are as follows.

> On Mar 12, 2015, at 5:42 PM, Sorin Vinturis 
> <svintu...@cloudbasesolutions.com> wrote:
> 
> Update 1: Modified the patch in order to report WFP operation status
> to userspace.
> 
> The kernel datapath supports only port 4789 for VXLAN tunnel creation.
> Added support in order to allow for the VXLAN tunnel port to be
> configurable to any port number set by the userspace.
> 
> The patch also checks to see if an existing WFP filter, for the
> necessary UDP tunnel port, is already created before adding a new one.
> This is a double check, because currently the userspace also verifies
> this, but it is necessary to avoid future issues.
> 
> Custom VXLAN tunnel port requires the addition of a new WFP filter
> with the new UDP tunnel port. The creation of a new WFP filter is
> triggered in OvsInitVxlanTunnel function and the removal of the WFP
> filter in OvsCleanupVxlanTunnel function.
> But the latter functions are running at IRQL = DISPATCH_LEVEL, due
> to the NDIS RW lock acquisition, and all WFP calls must be running at
> IRQL = PASSIVE_LEVEL. This is why I have created a system thread which
> records all filter addition/removal requests into a list for later
> processing by the system thread. The ThreadStart routine processes all
> received requests at IRQL = PASSIVE_LEVEL, which is the required IRQL
> for the necessary WFP calls for adding/removal of the WFP filters.
> 
> The WFP filter for the default VXLAN port 4789 is not added anymore at
> filter attach. All WFP filters for the tunnel ports are added when the
> tunnel ports are initialized and are removed at cleanup.
> 
> It is necessary that OvsTunnelFilterUninitialize function is called
> after OvsClearAllSwitchVports in order to allow for the added WFP
> filters to be removed. OvsTunnelFilterUninitialize function closes the
> global engine handle used by most of the WFP calls, including filter
> removal.
> 
> Signed-off-by: Sorin Vinturis <svintu...@cloudbasesolutions.com>
> Reported-by: Alin Gabriel Serdean <aserd...@cloudbasesolutions.com>
> Reported-at: 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvswitch_ovs-2Dissues_issues_66&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=pNHQcdr7B40b4h6Yb7FIedI1dnBsxdDuTLBYD3JqV80&m=iZE-ay1RfRGlzBiIK32zf67VOMWgV3Jo3R4_9nF5A9U&s=i18-64P5jwbBBujlfkhZ6JNQW2sZ2YDk1YHk1hrqP2U&e=
>  
> ---
> datapath-windows/ovsext/Datapath.c             |   8 +-
> datapath-windows/ovsext/Netlink/NetlinkError.h |   8 +-
> datapath-windows/ovsext/Switch.c               |   2 +-
> datapath-windows/ovsext/Tunnel.h               |   6 -
> datapath-windows/ovsext/TunnelFilter.c         | 803 +++++++++++++++++++++----
> datapath-windows/ovsext/TunnelIntf.h           |  15 +
> datapath-windows/ovsext/Vport.c                | 336 ++++++++++-
> datapath-windows/ovsext/Vport.h                |  13 +-
> datapath-windows/ovsext/Vxlan.c                |  86 ++-
> datapath-windows/ovsext/Vxlan.h                |  14 +-
> 10 files changed, 1107 insertions(+), 184 deletions(-)
> 
> diff --git a/datapath-windows/ovsext/Datapath.c 
> b/datapath-windows/ovsext/Datapath.c
> index 8eb13f1..0dc8a77 100644
> --- a/datapath-windows/ovsext/Datapath.c
> +++ b/datapath-windows/ovsext/Datapath.c
> @@ -653,7 +653,6 @@ OvsCleanupDevice(PDEVICE_OBJECT deviceObject,
>    return OvsCompleteIrpRequest(irp, (ULONG_PTR)0, status);
> }
> 
> -
> /*
> * --------------------------------------------------------------------------
> * IOCTL function handler for the device.
> @@ -877,9 +876,12 @@ done:
>    KeMemoryBarrier();
>    instance->inUse = 0;
> 
> -    /* Should not complete a pending IRP unless proceesing is completed */
> +    /* Should not complete a pending IRP unless proceesing is completed. */
>    if (status == STATUS_PENDING) {
> -        return status;
> +        /* STATUS_PENDING is returned by the NL handler when the request is
> +         * to be processed later, so we mark the IRP as pending and complete
> +         * it in another thread when the request is processed. */
> +        IoMarkIrpPending(irp);

Do we need to mark the IRP as pending explicitly? We already have working code 
where we returned STATUS_PENDING and completed the IRP later. My understanding 
is that as long as the IRP is not completed, it is in pending state.

I am fine if this is needed/nice-to-have/makes-things-clear.
SV: If we return from the Dispatch routine without completing the IRP, we MUST 
(a) mark the IRP pending, before the IRP is completed to the caller, by calling 
IoMarkIrpPending(), and (b) return STATUS_PENDING from the Dispatch routine. 
That is a rule, not a nice-to-have thing.

>       return status;
>    }
>    return OvsCompleteIrpRequest(irp, (ULONG_PTR)replyLen, status);
> }
> diff --git a/datapath-windows/ovsext/Netlink/NetlinkError.h 
> b/datapath-windows/ovsext/Netlink/NetlinkError.h
> index 53c935f..fde46b0 100644
> --- a/datapath-windows/ovsext/Netlink/NetlinkError.h
> +++ b/datapath-windows/ovsext/Netlink/NetlinkError.h
> @@ -195,8 +195,10 @@ typedef enum _NL_ERROR_
>    NL_ERROR_TIMEDOUT = ((ULONG)-138),
>    /* The given text file is busy */
>    NL_ERROR_TXTBSY = ((ULONG)-139),
> -    /*the operation would block */
> +    /* The operation would block */
>    NL_ERROR_WOULDBLOCK = ((ULONG)-140),
> +    /* The operation is not finished */
> +    NL_ERROR_PENDING = ((ULONG)-141),
> } NL_ERROR;
> 
> static __inline
> @@ -215,7 +217,11 @@ NlMapStatusToNlErr(NTSTATUS status)
>    case STATUS_SUCCESS:
>      ret = NL_ERROR_SUCCESS;
>      break;
> +    case STATUS_PENDING:
> +      ret = NL_ERROR_PENDING;
> +      break;

Do we need NL_ERROR_PENDING? Userspace does not really expect an output message 
if the return value is STATUS_PENDING, right?
SV: I have added the new NL error for pending in order to make the behavior of 
the OvsNewVportCmdHandler function consistent. In the latter function the 
OvsInitTunnelVport returns STATUS_PENDING and NlMapStatusToNlErr is called to 
map the status to a NL error. Then NL error is used in all across the function 
for result interpretation. Also, in the cleanup phase the NL error is checked, 
not the status.

>    default:
> +        ret = NL_ERROR_OTHER;
>      break;
>    }
> 
> diff --git a/datapath-windows/ovsext/Switch.c 
> b/datapath-windows/ovsext/Switch.c
> index a228d8e..cf5e3c4 100644
> --- a/datapath-windows/ovsext/Switch.c
> +++ b/datapath-windows/ovsext/Switch.c
> @@ -261,8 +261,8 @@ OvsDeleteSwitch(POVS_SWITCH_CONTEXT switchContext)
>    if (switchContext)
>    {
>        dpNo = switchContext->dpNo;
> -        OvsTunnelFilterUninitialize(gOvsExtDriverObject);
>        OvsClearAllSwitchVports(switchContext);
> +        OvsTunnelFilterUninitialize(gOvsExtDriverObject);

I see a little tricky situation here. If there are lots of tunnel ports, then 
the driver gets unloaded, OvsClearAllSwitchVports() would end up submitting a 
whole bunch of requests to the WFP threads. Soon after that, 
OvsTunnelFilterUninitialize() would try to stop the threads. Is this situation 
handled?
SV: The situation that you have mentioned is not handled correctly. In that 
specific case all requests will be completed with STATUS_CANCELLED. That is OK 
for the remaining VXLAN tunnel creation requests, but not OK for the remaining 
tunnel deletion requests. I will fix this in the next version of this patch.

>        OvsUninitSwitchContext(switchContext);
>        OvsFreeMemory(switchContext);
>    }
> diff --git a/datapath-windows/ovsext/Tunnel.h 
> b/datapath-windows/ovsext/Tunnel.h
> index 2978bb3..2c45e35 100644
> --- a/datapath-windows/ovsext/Tunnel.h
> +++ b/datapath-windows/ovsext/Tunnel.h
> @@ -32,12 +32,6 @@ typedef struct OVS_TUNNEL_PENDED_PACKET_
>   FWPS_CLASSIFY_OUT *classifyOut;
> } OVS_TUNNEL_PENDED_PACKET;
> 
> -/* Shared global data. */
> -
> -extern UINT16 configNewDestPort;
> -
> -extern UINT32 gCalloutIdV4;
> -
> //
> // Shared function prototypes
> //
> diff --git a/datapath-windows/ovsext/TunnelFilter.c 
> b/datapath-windows/ovsext/TunnelFilter.c
> index e0adc37..5a9c17c 100644
> --- a/datapath-windows/ovsext/TunnelFilter.c
> +++ b/datapath-windows/ovsext/TunnelFilter.c
> @@ -48,27 +48,24 @@
> /* Infinite timeout */
> #define INFINITE                        0xFFFFFFFF
> 
> -/*
> - * The provider name should always match the provider string from the install
> - * file.
> - */
> +/* The provider name should always match the provider string from the install
> + * file. */
> #define OVS_TUNNEL_PROVIDER_NAME        L"Open vSwitch"
> 
> -/*
> - * The provider description should always contain the OVS service description
> - * string from the install file.
> - */
> +/* The provider description should always contain the OVS service description
> + * string from the install file. */
> #define OVS_TUNNEL_PROVIDER_DESC        L"Open vSwitch Extension tunnel 
> provider"
> 
> /* The session name isn't required but it's useful for diagnostics. */
> #define OVS_TUNNEL_SESSION_NAME         L"OVS tunnel session"
> 
> -/* Configurable parameters (addresses and ports are in host order) */
> -UINT16   configNewDestPort = VXLAN_UDP_PORT;
> +/* Maximum number of tunnel threads to be created. */
> +#define OVS_TUNFLT_MAX_THREADS          4
> 
> /*
> * Callout and sublayer GUIDs
> */
> +
> // b16b0a6e-2b2a-41a3-8b39-bd3ffc855ff8
> DEFINE_GUID(
>    OVS_TUNNEL_CALLOUT_V4,
> @@ -106,15 +103,80 @@ DEFINE_GUID(
>    );
> 
> /*
> - * Callout driver global variables
> + * Callout driver type definitions
> */
> -PDEVICE_OBJECT gDeviceObject;
> +typedef enum _OVS_TUNFLT_OPERATION {
> +    OVS_TUN_FILTER_CREATE = 0,
> +    OVS_TUN_FILTER_DELETE
> +} OVS_TUNFLT_OPERATION;
> +
> +typedef struct _OVS_TUNFLT_REQUEST {
> +    LIST_ENTRY              entry;
> +    /* Tunnel filter destination port. */
> +    UINT16                  port;
> +    /* Tunnel filter identification used for filter deletion. */
> +    UINT64                  ID;
> +    /* Pointer used to return filter ID to the caller on filter creation. */
> +    UINT64*                 retID;

Nit: use
UINT64 *retID or
PUINT64 retID

I see why you have a retID and ID separated out. This seems to be because the 
filter create uses ‘retID', and filter delete uses ‘ID’. In 
OvsTunelFilterDelete(), you are passing the address of a stack variable, and 
storing it in ‘retID’. This can be confusing. If you want to use the same 
POVS_TUNFLT_REQUEST, you can create a union of these two values and use them 
accordingly. It can be confusing to debug later on.
SV: Here, the '*retID' points to the '((POVS_VXLAN_VPORT) vxlan)->filterID' 
memory address from heap and this is how I can return the filter ID to the 
caller, when creating the tunnel filter at a later time by one of the tunnel 
filter threads. The problem of using this pointer comes when the tunnel filter 
needs to be deleted. In the latter case the OvsCleanupVxlanTunnel is called, 
the tunnel filter delete request is recorded but not yet processed, and the 
POVS_VXLAN_VPORT structure is released. Thus, by the time the filter delete 
request is processed, the 'retID' pointer will point to a freed memory space.

> +    /* Requested operation to be performed. */
> +    OVS_TUNFLT_OPERATION    operation;
> +    /* Current I/O request to be completed when requested
> +     * operation is finished. */
> +    PIRP                    irp;

Is there any situation where we’d need to cancel an IRP? Let me also think.

> +    /* Callback function called before completing the IRP. */
> +    PFNTunnelVportPendingOp callback;
> +    /* Context passed to the callback function. */
> +    PVOID                   context;
> +} OVS_TUNFLT_REQUEST, *POVS_TUNFLT_REQUEST;

Pls. add a L4 protocol also along with the port. So, VXLAN would say UDP. You 
can do it later, no worries.
SV: I can add the L4 port also.

> +
> +typedef struct _OVS_TUNFLT_REQUEST_LIST {
> +    /* SpinLock for syncronizing access to the requests list. */
> +    NDIS_SPIN_LOCK spinlock;
> +    /* Head of the requests list. */
> +    LIST_ENTRY     head;
> +    /* Number of requests in the list. */
> +    UINT32         numEntries;
> +} OVS_TUNFLT_REQUEST_LIST, *POVS_TUNFLT_REQUEST_LIST;
> +
> +typedef struct _OVS_TUNFLT_THREAD_CONTEXT {
> +    /* Thread identification. */
> +    UINT                    threadID;
> +    /* Thread's engine session handle. */
> +    HANDLE                  engineSession;
> +    /* Reference of the thread object. */
> +    PVOID                   threadObject;
> +    /* Requests queue list. */
> +    OVS_TUNFLT_REQUEST_LIST listRequests;
> +    /* Event signaling that there are requests to process. */
> +    KEVENT                  requestEvent;
> +    /* Event for stopping thread execution. */
> +    KEVENT                  stopEvent;
> +} OVS_TUNFLT_THREAD_CONTEXT, *POVS_TUNFLT_THREAD_CONTEXT;
> +
> +KSTART_ROUTINE  OvsTunnelFilterThreadProc;

static?
SV: No need for that. A kernel ThreadStart routine is declared using the 
KSTART_ROUTINE function pointer definition type.

> +
> +NTSTATUS OvsTunnelFilterStartThreads();
> +NTSTATUS OvsTunnelFilterThreadStart(POVS_TUNFLT_THREAD_CONTEXT threadCtx);
> +VOID     OvsTunnelFilterStopThreads();
> +VOID     OvsTunnelFilterThreadStop(POVS_TUNFLT_THREAD_CONTEXT threadCtx);
> +NTSTATUS OvsTunnelFilterThreadInit(POVS_TUNFLT_THREAD_CONTEXT threadCtx);
> +VOID     OvsTunnelFilterThreadUninit(POVS_TUNFLT_THREAD_CONTEXT threadCtx);

some of these can be static.
SV: OK.

> 
> -HANDLE gEngineHandle = NULL;
> -UINT32 gCalloutIdV4;
> +/*
> + * Callout driver global variables
> + */
> 
> +PDEVICE_OBJECT            gDeviceObject = NULL;
> +HANDLE                    gEngineHandle = NULL;
> +UINT32                    gCalloutIdV4 = 0;
> +OVS_TUNFLT_THREAD_CONTEXT gTunnelThreadCtx[OVS_TUNFLT_MAX_THREADS] = { 0 };
> +UINT32                    gRoundRobinThreadIndex = 0;
> +/* Stores the last filter destination port. */
> +UINT16                    gLastQueuedFilterPort = 0;

Some of these variables can be made static.
SV: Will do.

> 
> -/* Callout driver implementation */
> +/*
> + * Callout driver implementation.
> + */
> 
> NTSTATUS
>       (HANDLE *handle)
> @@ -122,14 +184,11 @@ OvsTunnelEngineOpen(HANDLE *handle)
>    NTSTATUS status = STATUS_SUCCESS;
>    FWPM_SESSION session = { 0 };
> 
> -    /* The session name isn't required but may be useful for diagnostics. */
> -    session.displayData.name = OVS_TUNNEL_SESSION_NAME;
>    /*
>    * Set an infinite wait timeout, so we don't have to handle FWP_E_TIMEOUT
>    * errors while waiting to acquire the transaction lock.
>    */
>    session.txnWaitTimeoutInMSec = INFINITE;
> -    session.flags = FWPM_SESSION_FLAG_DYNAMIC;

Is there any specific reason to remove the ‘session.displayData.name’ and 
‘session.flags’?
SV: I have removed the session display name, because I am using the same 
function to create new engine sessions for each new thread and the session name 
should not be the same.
Also, I have removed the session dynamic flag because I needed a static 
session, which is the default session type. And that was required in order for 
another session to reference the callout by ID. If the session was dynamic, one 
cannot add a filter with a callout created in another session.

> 
>    /* The authentication service should always be RPC_C_AUTHN_DEFAULT. */
>    status = FwpmEngineOpen(NULL,
> @@ -138,7 +197,7 @@ OvsTunnelEngineOpen(HANDLE *handle)
>                            &session,
>                            handle);
>    if (!NT_SUCCESS(status)) {
> -        OVS_LOG_ERROR("Fail to open filtering engine session, status: %x.",
> +        OVS_LOG_ERROR("Failed to open filtering engine session, status: %x.",
>                      status);
>    }
> 
> @@ -173,16 +232,16 @@ OvsTunnelAddSystemProvider(HANDLE handle)
>        provider.displayData.name = OVS_TUNNEL_PROVIDER_NAME;
>        provider.displayData.description = OVS_TUNNEL_PROVIDER_DESC;
>        /*
> -        * Since we always want the provider to be present, it's easiest to 
> add
> -        * it as persistent object during driver load.
> -        */
> +         * Since we always want the provider to be present, it's easiest to 
> add
> +         * it as persistent object during driver load.
> +         */

Nit: unnecessary indentation change.

>        provider.flags = FWPM_PROVIDER_FLAG_PERSISTENT;
> 
>        status = FwpmProviderAdd(handle,
>                                 &provider,
>                                 NULL);
>        if (!NT_SUCCESS(status)) {
> -            OVS_LOG_ERROR("Fail to add WFP provider, status: %x.", status);
> +            OVS_LOG_ERROR("Failed to add WFP provider, status: %x.", status);
>            break;
>        }
> 
> @@ -232,14 +291,16 @@ OvsTunnelRemoveSystemProvider(HANDLE handle)
> }
> 
> NTSTATUS
> -OvsTunnelAddFilter(PWSTR filterName,
> +OvsTunnelAddFilter(HANDLE engineSession,

Can you also update the parameter names of OvsTunnelAddSystemProvider(), 
OvsTunnelRemoveSystemProvider(), OvsTunnelEngineOpen() and 
OvsTunnelEngineClose() from ‘handle’ to ‘engineSession’?
SV: Sure.

> +                   PWSTR filterName,
>                   const PWSTR filterDesc,
>                   USHORT remotePort,
>                   FWP_DIRECTION direction,
>                   UINT64 context,
>                   const GUID *filterKey,
>                   const GUID *layerKey,
> -                   const GUID *calloutKey)
> +                   const GUID *calloutKey,
> +                   UINT64 *filterID)
> {
>    NTSTATUS status = STATUS_SUCCESS;
>    FWPM_FILTER filter = {0};
> @@ -249,7 +310,9 @@ OvsTunnelAddFilter(PWSTR filterName,
>    UNREFERENCED_PARAMETER(remotePort);
>    UNREFERENCED_PARAMETER(direction);
> 
> -    filter.filterKey = *filterKey;
> +    if (filterKey) {
> +        filter.filterKey = *filterKey;
> +    }
>    filter.layerKey = *layerKey;
>    filter.displayData.name = (wchar_t*)filterName;
>    filter.displayData.description = (wchar_t*)filterDesc;
> @@ -279,64 +342,18 @@ OvsTunnelAddFilter(PWSTR filterName,
> 
>    filter.numFilterConditions = conditionIndex;
> 
> -    status = FwpmFilterAdd(gEngineHandle,
> +    status = FwpmFilterAdd(engineSession,
>                           &filter,
>                           NULL,
> -                           NULL);
> -
> -    return status;
> -}
> -
> -NTSTATUS
> -OvsTunnelRemoveFilter(const GUID *filterKey,
> -                      const GUID *sublayerKey)
> -{
> -    NTSTATUS status = STATUS_SUCCESS;
> -    BOOLEAN inTransaction = FALSE;
> -
> -    do {
> -        status = FwpmTransactionBegin(gEngineHandle, 0);
> -        if (!NT_SUCCESS(status)) {
> -            break;
> -        }
> -
> -        inTransaction = TRUE;
> -
> -        /*
> -         * We have to delete the filter first since it references the
> -         * sublayer. If we tried to delete the sublayer first, it would fail
> -         * with FWP_ERR_IN_USE.
> -         */
> -        status = FwpmFilterDeleteByKey(gEngineHandle,
> -                                       filterKey);
> -        if (!NT_SUCCESS(status)) {
> -            break;
> -        }
> -
> -        status = FwpmSubLayerDeleteByKey(gEngineHandle,
> -                                         sublayerKey);
> -        if (!NT_SUCCESS(status)) {
> -            break;
> -        }
> -
> -        status = FwpmTransactionCommit(gEngineHandle);
> -        if (!NT_SUCCESS(status)){
> -            break;
> -        }
> -
> -        inTransaction = FALSE;
> -    } while (inTransaction);
> +                           filterID);
> 
> -    if (inTransaction) {
> -        FwpmTransactionAbort(gEngineHandle);
> -    }
>    return status;
> }
> 
> /*
> * --------------------------------------------------------------------------
> - * This function registers callouts and filters that intercept UDP traffic at
> - * WFP FWPM_LAYER_DATAGRAM_DATA_V4
> + * This function registers callouts for intercepting UDP traffic at WFP
> + * FWPM_LAYER_DATAGRAM_DATA_V4 layer.
> * --------------------------------------------------------------------------
> */
> NTSTATUS
> @@ -363,10 +380,7 @@ OvsTunnelRegisterDatagramDataCallouts(const GUID 
> *layerKey,
>    sCallout.flags = FWP_CALLOUT_FLAG_CONDITIONAL_ON_FLOW;
> #endif
> 
> -    status = FwpsCalloutRegister(deviceObject,
> -                                 &sCallout,
> -                                 calloutId);
> -
> +    status = FwpsCalloutRegister(deviceObject, &sCallout, calloutId);
>    if (!NT_SUCCESS(status)) {
>        goto Exit;
>    }
> @@ -379,24 +393,11 @@ OvsTunnelRegisterDatagramDataCallouts(const GUID 
> *layerKey,
>    mCallout.displayData = displayData;
>    mCallout.applicableLayer = *layerKey;
> 
> -    status = FwpmCalloutAdd(gEngineHandle,
> -                            &mCallout,
> -                            NULL,
> -                            NULL);
> -
> +    status = FwpmCalloutAdd(gEngineHandle, &mCallout, NULL, NULL);
>    if (!NT_SUCCESS(status)) {
>        goto Exit;
>    }
> 
> -    status = OvsTunnelAddFilter(L"Datagram-Data OVS Filter (Inbound)",
> -                                L"address/port for UDP",
> -                                configNewDestPort,
> -                                FWP_DIRECTION_INBOUND,
> -                                0,
> -                                &OVS_TUNNEL_FILTER_KEY,
> -                                layerKey,
> -                                calloutKey);
> -
> Exit:
> 
>    if (!NT_SUCCESS(status)){
> @@ -411,24 +412,16 @@ Exit:
> 
> /*
> * --------------------------------------------------------------------------
> - * This function registers dynamic callouts and filters that intercept UDP
> - * Callouts and filters will be removed during De-Initialize.
> + * This function registers non-dynamic callouts for intercepting UDP traffic.
> + * Callouts will be removed during un-initializing phase.
> * --------------------------------------------------------------------------
> */
> NTSTATUS
> OvsTunnelRegisterCallouts(VOID *deviceObject)
> {
> -    NTSTATUS status = STATUS_SUCCESS;
> -    FWPM_SUBLAYER OvsTunnelSubLayer;
> -
> -    BOOLEAN engineOpened = FALSE;
> -    BOOLEAN inTransaction = FALSE;
> -
> -    status = OvsTunnelEngineOpen(&gEngineHandle);
> -    if (!NT_SUCCESS(status)) {
> -        goto Exit;
> -    }
> -    engineOpened = TRUE;
> +    NTSTATUS        status = STATUS_SUCCESS;
> +    BOOLEAN         inTransaction = FALSE;
> +    FWPM_SUBLAYER   OvsTunnelSubLayer;
> 
>    status = FwpmTransactionBegin(gEngineHandle, 0);

Nit: Can you add ASSERT for ‘gEngineHandle != NULL’.
SV: OK.

>    if (!NT_SUCCESS(status)) {
> @@ -476,22 +469,17 @@ Exit:
>        if (inTransaction) {
>            FwpmTransactionAbort(gEngineHandle);
>        }
> -        if (engineOpened) {
> -            OvsTunnelEngineClose(&gEngineHandle);
> -        }
>    }
> 
>    return status;
> }
> 
> VOID
> -OvsTunnelUnregisterCallouts(VOID)
> +OvsTunnelUnregisterCallouts()
> {
> -    OvsTunnelRemoveFilter(&OVS_TUNNEL_FILTER_KEY,
> -                          &OVS_TUNNEL_SUBLAYER);
>    FwpsCalloutUnregisterById(gCalloutIdV4);
> +    FwpmSubLayerDeleteByKey(gEngineHandle, &OVS_TUNNEL_SUBLAYER);
>    FwpmCalloutDeleteById(gEngineHandle, gCalloutIdV4);
> -    OvsTunnelEngineClose(&gEngineHandle);

There’s an assumption here that all the VXLAN ports and the corresponding 
filters have been removed. Can you pls. add some ASSERT to make sure this is 
true. i.e. all the filters have been removed before we unregister the callout 
and remove the sublayer. You can perhaps count the number of filters we added 
for this callout and ASSERT it is 0.
SV: No WFP filters are added using the gEngineHandle session. All filters are 
added in the tunnel filter threads, which have their own requests queue and 
engine session.

> }
> 
> VOID
> @@ -499,16 +487,22 @@ OvsTunnelFilterUninitialize(PDRIVER_OBJECT driverObject)
> {
>    UNREFERENCED_PARAMETER(driverObject);
> 
> +    OvsTunnelFilterStopThreads();
> +
>    OvsTunnelUnregisterCallouts();
> -    IoDeleteDevice(gDeviceObject);
> +    OvsTunnelEngineClose(&gEngineHandle);
> +
> +    if (gDeviceObject) {
> +        IoDeleteDevice(gDeviceObject);
> +    }
> }
> 
> 
> NTSTATUS
> OvsTunnelFilterInitialize(PDRIVER_OBJECT driverObject)
> {
> -    NTSTATUS status = STATUS_SUCCESS;
> -    UNICODE_STRING deviceName;
> +    NTSTATUS        status = STATUS_SUCCESS;
> +    UNICODE_STRING  deviceName;
> 
>    RtlInitUnicodeString(&deviceName,
>                         L"\\Device\\OvsTunnelFilter");
> @@ -522,22 +516,589 @@ OvsTunnelFilterInitialize(PDRIVER_OBJECT driverObject)
>                            &gDeviceObject);
> 
>    if (!NT_SUCCESS(status)){
> +        OVS_LOG_ERROR("Failed to create tunnel filter device, status: %x.",
> +                      status);
> +        goto Exit;
> +    }
> +
> +    status = OvsTunnelFilterStartThreads();
> +    if (!NT_SUCCESS(status)){
> +        goto Exit;
> +    }
> +
> +    status = OvsTunnelEngineOpen(&gEngineHandle);
> +    if (!NT_SUCCESS(status)){
>        goto Exit;
>    }
> 
>    status = OvsTunnelRegisterCallouts(gDeviceObject);
> +    if (!NT_SUCCESS(status)) {
> +        OVS_LOG_ERROR("Failed to register callout, status: %x.",
> +                      status);
> +    }
> 
> Exit:
> 
>    if (!NT_SUCCESS(status)){
> -        if (gEngineHandle != NULL) {
> -            OvsTunnelUnregisterCallouts();
> +        OvsTunnelFilterUninitialize(driverObject);
> +    }
> +
> +    return status;
> +}
> +
> +NTSTATUS
> +OvsTunnelAddFilterEx(HANDLE engineSession,
> +                     UINT32 filterPort,
> +                     UINT64 *filterID)
> +{
> +    NTSTATUS status = STATUS_SUCCESS;
> +
> +    status = OvsTunnelAddFilter(engineSession,
> +                                L"Datagram-Data OVS Filter (Inbound)",
> +                                L"address/port for UDP",
> +                                (USHORT)filterPort,
> +                                FWP_DIRECTION_INBOUND,
> +                                0,
> +                                NULL,
> +                                &FWPM_LAYER_DATAGRAM_DATA_V4,
> +                                &OVS_TUNNEL_CALLOUT_V4,
> +                                filterID);
> +    if (!NT_SUCCESS(status)) {
> +        OVS_LOG_ERROR("Failed to add tunnel filter for port: %d, status: 
> %x.",
> +                      filterPort, status);
> +    } else {
> +        OVS_LOG_INFO("Filter added, filter port: %d, filter ID: %d.",
> +                     filterPort, *filterID);
> +    }
> +
> +    return status;
> +}
> +
> +NTSTATUS
> +OvsTunnelRemoveFilterEx(HANDLE engineSession,
> +                        UINT64 filterID)
> +{
> +    NTSTATUS status = STATUS_SUCCESS;
> +    BOOLEAN  error = TRUE;
> +
> +    do {
> +        if (0 == filterID) {

Nit: if you are not particular, we can use ‘filterID == 0’.
SV: The reason I am accustomed to use this ordering is to avoid issues when 
adding 'if (filterID = 0) {' by mistake. The compiler won't complain, because 
the code is syntactically correct, but it is logically incorrect. I can change 
it if you don't like it.  

> +            OVS_LOG_INFO("No tunnel filter to remove.");
> +            break;
> +        }
> +
> +        status = FwpmFilterDeleteById(engineSession, filterID);
> +        if (!NT_SUCCESS(status)) {
> +            OVS_LOG_ERROR("Failed to remove tunnel with filter ID: %d,\
> +                           status: %x.", filterID, status);
> +            break;
> +        }
> +        OVS_LOG_INFO("Filter removed, filter ID: %d.",
> +                     filterID);
> +
> +        error = FALSE;
> +    } while (error);

‘error’ is not really being used to signal anything. You can use a do {} 
while(0); if you wish to.
SV: I did that, but the compiler won't let me. That is the reason for using the 
variable.

> +
> +    return status;
> +}
> +
> +NTSTATUS
> +OvsTunnelFilterExecuteAction(HANDLE engineSession,
> +                             POVS_TUNFLT_REQUEST request)
> +{
> +    NTSTATUS status = STATUS_SUCCESS;
> +
> +    switch (request->operation)
> +    {
> +    case OVS_TUN_FILTER_CREATE:
> +        status = OvsTunnelAddFilterEx(engineSession,
> +                                      request->port,
> +                                      request->retID);
> +        break;
> +    case OVS_TUN_FILTER_DELETE:
> +        status = OvsTunnelRemoveFilterEx(engineSession,
> +                                         request->ID);
> +        break;
> +    default:

Add ASSERT() here, reason being the caller will end up waiting forever if a 
wrong code was specified.
SV: I will set an error to the 'status' variable that will cause the request to 
be completed with the specified error.

> +        break;
> +    }
> +
> +    return status;
> +}
> +
> +VOID
> +OvsTunnelFilterRequestPopCrtList(POVS_TUNFLT_REQUEST_LIST listRequests,
> +                                 PLIST_ENTRY head,
> +                                 UINT32 *count)
> +{
> +    NdisAcquireSpinLock(&listRequests->spinlock);
> +
> +    if (!IsListEmpty(&listRequests->head)) {
> +        PLIST_ENTRY PrevEntry;
> +        PLIST_ENTRY NextEntry;
> +
> +        NextEntry = listRequests->head.Flink;
> +        PrevEntry = listRequests->head.Blink;
> +
> +        head->Flink = NextEntry;
> +        NextEntry->Blink = head;
> +
> +        head->Blink = PrevEntry;
> +        PrevEntry->Flink = head;
> +
> +        *count = listRequests->numEntries;
> +
> +        InitializeListHead(&listRequests->head);
> +        listRequests->numEntries = 0;
> +    }
> +
> +    NdisReleaseSpinLock(&listRequests->spinlock);
> +}
> +
> +POVS_TUNFLT_REQUEST
> +OvsTunnelFilterRequestPop(POVS_TUNFLT_REQUEST_LIST listRequests)
> +{
> +    POVS_TUNFLT_REQUEST request = NULL;
> +
> +    NdisAcquireSpinLock(&listRequests->spinlock);
> +
> +    if (!IsListEmpty(&listRequests->head)) {
> +        request = (POVS_TUNFLT_REQUEST) RemoveHeadList(&listRequests->head);
> +        listRequests->numEntries--;
> +    }
> +
> +    NdisReleaseSpinLock(&listRequests->spinlock);
> +
> +    return request;
> +}

OvsTunnelFilterRequestPopCrtList() is an optimized version of 
OvsTunnelFilterRequestPop(). I think we can make do with just the former. Less 
code the better and easier to read.
SV: The new function I have added is faster. It gets all current requests in 
the queue, while acquiring the spin lock only once. The thread's request queue 
is emptied and other requests are added as they come. There is no reason to pop 
the requests one by one, while waiting each time for the spin lock to be 
acquired, because all the requests are destined for the current thread and will 
be handled by the same thread.

Also, renaming OvsTunnelFilterRequestPopCrtList => 
OvsTunnelFilterRequestPopList() would make it easier to read.
SV: Sure, I can rename the function.

> +
> +VOID
> +OvsTunnelFilterRequestPush(POVS_TUNFLT_REQUEST_LIST listRequests,
> +                           POVS_TUNFLT_REQUEST request)
> +{
> +    NdisAcquireSpinLock(&listRequests->spinlock);
> +
> +    InsertTailList(&listRequests->head, &(request->entry));
> +    listRequests->numEntries++;
> +
> +    NdisReleaseSpinLock(&listRequests->spinlock);
> +}
> +
> +VOID
> +OvsTunnelFilterRoundRobinPush(POVS_TUNFLT_REQUEST request)
> +{
> +    /* If the new request contains the same filter port like the last one,
> +     * then send the request to the same/previous thread. */
> +    if ((0 != request->port) && (gLastQueuedFilterPort != request->port))

Is it possible for request->port to be 0? Shouldn’t it have been validated way 
before itself in VPORT add? You can add an ASSERT if you wish. Also, what 
happens if the request->port is 0? We go ahead and add the port anyway? This 
seems wrong.
SV: It is not wrong. All tunnel delete requests have the 'request->port' equal 
to zero and that is how I am identifying it. Sure, I could use the 
'request->operation' for that. I can change it.

I saw later that request->port == 0 is true during OVS_TUN_FILTER_DELETE. You 
might as well check for the request type than request->port == 0.
SV: I will change to check for the operation instead of the port. Sure.

Also, I don’t see the necessity for using the ‘gLastQueuedFilterPort’ port. 
This can happen in the following cases:
a. There are two different tunneling protocols that are using the same 
destination L4 port, and both of them are being created in succession.
b. A tunneling port is getting deleted, and re-created.

The possibility of this is remote. I think you can nuke 
‘gLastQueuedFilterPort’. The global variable is not protected anyway.

The hashing logic is good. I like it esp. if there’s a flurry of port-creation 
requests.

But, I am wondering if we really need a ‘gRoundRobinThreadIndex’. I see the 
following problems:
1. It is not protected.
2. You can just hash on the port number directly, and assuming that the port 
numbers are in host order and in a sequence, you should get good distribution.

In other words:
queueIndex = request->port % OVS_TUNFLT_MAX_THREADS.

Are you worried about a DOS attack involving create-port & delete-port for the 
same L4 port#? That should be handled at a higher layer in Vport.c. Fail a 
delete-port request until a previous crete-port is successful.

Maybe I am missing something about the need for ‘gRoundRobinThreadIndex’ and 
‘gLastQueuedFilterPort’. Pls. feel free to explain.
SV: In case an existing VXLAN interface is set with a different destination 
port, userspace will issue two vport command operations, OVS_VPORT_CMD_NEW and 
OVS_VPORT_CMD_DEL - in this order. If a tunnel filter create request, 
corresponding to an OVS_VPORT_CMD_NEW operation, is assigned to thread 1 and 
the tunnel delete request is assigned to thread 2, this could be a potential 
problem if the tunnel thread delete request is processed before the tunnel 
create request.

> +    {
> +        /* Increment the round-robin index. */
> +        gRoundRobinThreadIndex = (gRoundRobinThreadIndex + 1) %
> +            OVS_TUNFLT_MAX_THREADS;
> +    }
> +    gLastQueuedFilterPort = request->port;
> +
> +    OvsTunnelFilterRequestPush(
> +        &gTunnelThreadCtx[gRoundRobinThreadIndex].listRequests,
> +        request);
> +
> +    KeSetEvent(&gTunnelThreadCtx[gRoundRobinThreadIndex].requestEvent,
> +        IO_NO_INCREMENT,
> +        FALSE);
> +}
> +
> +VOID
> +OvsTunnelFilterCompleteRequest(POVS_TUNFLT_REQUEST request, NTSTATUS status)
> +{
> +    if (request->irp) {
> +        UINT32 replyLen = 0;
> +        request->callback(request->context, status, &replyLen);
> +        OvsCompleteIrpRequest(request->irp, (ULONG_PTR)replyLen, status);
> +
> +        /* Release the context passed to the callback function. */
> +        OvsFreeMemory(request->context);
> +    }
> +}
> +
> +VOID
> +OvsTunnelFilterRequestListCleanup(POVS_TUNFLT_REQUEST_LIST listRequests)
> +{
> +    POVS_TUNFLT_REQUEST request = NULL;
> +
> +    while (NULL != (request = OvsTunnelFilterRequestPop(listRequests))) {
> +        /* Complete the IRP with cancelled status. */
> +        OvsTunnelFilterCompleteRequest(request, STATUS_CANCELLED);
> +
> +        OvsFreeMemory(request);
> +        request = NULL;
> +    }
> +}

This can be re-written with ‘OvsTunnelFilterRequestPopCrtList()’.
SV: Sure.

> +
> +VOID
> +OvsTunnelFilterRequestListProcess(POVS_TUNFLT_THREAD_CONTEXT threadCtx)

A header would be useful.
SV: I will add it.

> +{
> +    POVS_TUNFLT_REQUEST request = NULL;
> +    PLIST_ENTRY         link = NULL;
> +    PLIST_ENTRY         next = NULL;
> +    LIST_ENTRY          head;
> +    NTSTATUS            status = STATUS_SUCCESS;
> +    UINT32              count = 0;
> +    BOOLEAN             inTransaction = FALSE;
> +    BOOLEAN             error = TRUE;
> +
> +    do
> +    {
> +        if (!InterlockedCompareExchange(
> +            (LONG volatile *)&threadCtx->listRequests.numEntries, 0, 0)) {
> +            OVS_LOG_INFO("Nothing to do... request list is empty.");
> +            break;
> +        }

Documentation for InterlockedCompareExchange says that numEntries should be 
aligned at 32-bit boundaries, which is the case here in ‘struct 
OVS_TUNFLT_REQUEST_LIST’. Can you pls. add a comment about this in the 
structure?

Also, this counts as a spurious wakeup. Can you make the log as a ERROR/WARN?
SV: Sure.

> +
> +        status = FwpmTransactionBegin(threadCtx->engineSession, 0);
> +        if (!NT_SUCCESS(status)) {
> +            OVS_LOG_ERROR("Failed to start transaction, status: %x.",
> +                          status);
> +            break;
> +        }
> +        inTransaction = TRUE;
> +
> +        InitializeListHead(&head);
> +        OvsTunnelFilterRequestPopCrtList(&threadCtx->listRequests, &head, 
> &count);
> +
> +        LIST_FORALL_SAFE(&head, link, next) {
> +            request = CONTAINING_RECORD(link, OVS_TUNFLT_REQUEST, entry);
> +
> +            status = OvsTunnelFilterExecuteAction(threadCtx->engineSession,
> +                                                  request);
> +            if (!NT_SUCCESS(status)) {
> +                RemoveEntryList(&request->entry);
> +                count--;
> +
> +                /* Complete the IRP with the failure status. */
> +                OvsTunnelFilterCompleteRequest(request, status);
> +
> +                OvsFreeMemory(request);
> +                request = NULL;
> +            } else {
> +                error = FALSE;
> +            }
> +        }
> +        
> +        if (error) {
> +            /* No successful requests were made, so there is no point to 
> commit
> +             * the transaction. */
> +            break;
> +        }
> +
> +        status = FwpmTransactionCommit(threadCtx->engineSession);
> +        if (!NT_SUCCESS(status)){
> +            OVS_LOG_ERROR("Failed to commit transaction, status: %x.",
> +                          status);
> +            break;
> +        }
> +
> +        inTransaction = FALSE;
> +    } while (inTransaction);
> +
> +    if (inTransaction) {
> +        FwpmTransactionAbort(threadCtx->engineSession);
> +        OVS_LOG_ERROR("Failed to execute request, status: %x.\
> +                       Transaction aborted.", status);
> +    }
> +
> +    /* Complete all remaining requests with the last operation status. */


I wondered what happens to the ports that got added via 
‘OvsTunnelFilterExecuteAction()’ if ‘FwpmTransactionCommit()’ fails. Good that 
you’ve handled the case.

Nit: The comment can probably updated to:
/* Complete all remaining requests with the last operation status. */
/* Complete the ports successfully added with the transaction commit status. */
SV: OK.

> +    while (count) {
> +        request = (POVS_TUNFLT_REQUEST)RemoveHeadList(&head);
> +        count--;
> +
> +        OvsTunnelFilterCompleteRequest(request, status);
> +
> +        OvsFreeMemory(request);
> +        request = NULL;
> +    }
> +}

> +
> +/*
> + 
> *----------------------------------------------------------------------------
> + *  System thread routine that handles tunnel filter create/delete requests.
> + 
> *----------------------------------------------------------------------------
> + */
> +_Use_decl_annotations_
> +VOID
> +OvsTunnelFilterThreadProc(PVOID context)
> +{
> +    NTSTATUS                   status = STATUS_SUCCESS;
> +    POVS_TUNFLT_THREAD_CONTEXT threadCtx = 
> (POVS_TUNFLT_THREAD_CONTEXT)context;
> +    BOOLEAN                    exit = FALSE;
> +    PKEVENT                    eventArray[2] = { 0 };
> +    ULONG                      count = 0;
> +
> +    OVS_LOG_INFO("Starting OVS Tunnel system thread.”);

Is there a thread id, that could also be printed?
SV: There is, yes. I will add it in the log message.

> +
> +    eventArray[0] = &threadCtx->stopEvent;
> +    eventArray[1] = &threadCtx->requestEvent;
> +    count = ARRAY_SIZE(eventArray);
> +
> +    do {
> +        status = KeWaitForMultipleObjects(count,
> +                                          (PVOID)eventArray,
> +                                          WaitAny,
> +                                          Executive,
> +                                          KernelMode,
> +                                          FALSE,
> +                                          NULL,
> +                                          NULL);
> +        switch (status) {
> +            case STATUS_WAIT_1:
> +                OvsTunnelFilterRequestListProcess(threadCtx);
> +                break;
> +            default:
> +                exit = TRUE;
> +                break;
> +        }
> +    } while (!exit);
> +
> +    OVS_LOG_INFO("Terminating OVS Tunnel system thread.”);

Is there a thread id, that could also be printed?

> +
> +    PsTerminateSystemThread(STATUS_SUCCESS);
> +};
> +
> +
> +NTSTATUS
> +OvsTunnelFilterStartThreads()

Probably deserves a description header comment. I know the name is obvious :)
SV: I will add a description header.

> +{
> +    NTSTATUS status = STATUS_SUCCESS;
> +
> +    for (UINT index = 0; index < OVS_TUNFLT_MAX_THREADS; index++) {
> +        gTunnelThreadCtx[index].threadID = index;
> +
> +        status = OvsTunnelFilterThreadStart(&gTunnelThreadCtx[index]);
> +        if (!NT_SUCCESS(status)) {
> +            OVS_LOG_ERROR("Failed to start tunnel filter thread %d.", index);
> +            break;
> +        }
> +
> +        OVS_LOG_INFO("Started tunnel filter thread %d.", index);
> +    }
> +
> +    return status;
> +}
> +
> +NTSTATUS
> +OvsTunnelFilterThreadStart(POVS_TUNFLT_THREAD_CONTEXT threadCtx)
> +{
> +    NTSTATUS    status = STATUS_SUCCESS;
> +    HANDLE      threadHandle = NULL;
> +    BOOLEAN     error = TRUE;
> +
> +    do {
> +        status = OvsTunnelFilterThreadInit(threadCtx);
> +        if (!NT_SUCCESS(status)) {
> +            OVS_LOG_ERROR("Failed to initialize tunnel filter thread %d.",
> +                          threadCtx->threadID);
> +            break;
> +        }
> +
> +        status = PsCreateSystemThread(&threadHandle,
> +                                      SYNCHRONIZE,
> +                                      NULL,
> +                                      NULL,
> +                                      NULL,
> +                                      OvsTunnelFilterThreadProc,
> +                                      threadCtx);
> +        if (!NT_SUCCESS(status)) {
> +            OVS_LOG_ERROR("Failed to create tunnel thread, status: %x.",
> +                          status);
> +            break;
> +        }
> +
> +        ObReferenceObjectByHandle(threadHandle,
> +                                  SYNCHRONIZE,
> +                                  NULL,
> +                                  KernelMode,
> +                                  &threadCtx->threadObject,
> +                                  NULL);
> +        ZwClose(threadHandle);
> +        threadHandle = NULL;
> +
> +        error = FALSE;
> +    } while (error);
> +
> +    if (error) {
> +        OvsTunnelFilterThreadStop(threadCtx);
> +    }
> +
> +    return status;
> +}
> +
> +VOID
> +OvsTunnelFilterStopThreads()
> +{
> +    for (UINT index = 0; index < OVS_TUNFLT_MAX_THREADS; index++) {
> +        OvsTunnelFilterThreadStop(&gTunnelThreadCtx[index]);
> +    }
> +}
> +
> +VOID
> +OvsTunnelFilterThreadStop(POVS_TUNFLT_THREAD_CONTEXT threadCtx)
> +{
> +    if (threadCtx->threadObject) {
> +        /* Signal stop thread event. */
> +        KeSetEvent(&threadCtx->stopEvent, IO_NO_INCREMENT, FALSE);
> +
> +        /* Wait for the tunnel thread to finish. */
> +        KeWaitForSingleObject(threadCtx->threadObject,
> +                              Executive,
> +                              KernelMode,
> +                              FALSE,
> +                              NULL);

Just for my information, when does 'threadCtx->threadObject' get signaled?
SV: The 'threadCtx->threadObject' is signaled when the thread exits.

> +
> +        ObDereferenceObject(threadCtx->threadObject);
> +    }
> +
> +    OvsTunnelFilterThreadUninit(threadCtx);
> +}
> +
> +NTSTATUS
> +OvsTunnelFilterThreadInit(POVS_TUNFLT_THREAD_CONTEXT threadCtx)
> +{
> +    NTSTATUS status = STATUS_SUCCESS;
> +    BOOLEAN error = TRUE;
> +
> +    do {
> +        /* Create thread's engine session object. */
> +        status = OvsTunnelEngineOpen(&threadCtx->engineSession);
> +        if (!NT_SUCCESS(status)) {
> +            break;
> +        }
> +
> +        NdisAllocateSpinLock(&threadCtx->listRequests.spinlock);
> +
> +        InitializeListHead(&threadCtx->listRequests.head);
> +
> +        KeInitializeEvent(&threadCtx->stopEvent,
> +            NotificationEvent,
> +            FALSE);
> +
> +        KeInitializeEvent(&threadCtx->requestEvent,
> +            SynchronizationEvent,
> +            FALSE);
> +
> +        error = FALSE;
> +    } while (error);
> +
> +    return status;
> +}
> +
> +VOID
> +OvsTunnelFilterThreadUninit(POVS_TUNFLT_THREAD_CONTEXT threadCtx)
> +{
> +    if (threadCtx->engineSession) {
> +        /* Close thread's FWPM session. */
> +        OvsTunnelEngineClose(&threadCtx->engineSession);
> +
> +        OvsTunnelFilterRequestListCleanup(&threadCtx->listRequests);
> +        NdisFreeSpinLock(&threadCtx->listRequests.spinlock);
> +    }

Nit: may be useful to set threadCtx->engineSession to NULL.
SV: Sure.

> +}
> +
> +NTSTATUS
> +OvsTunnelFilterQueueRequest(PIRP irp,
> +                            UINT16 remotePort,
> +                            UINT64 *filterID,
> +                            OVS_TUNFLT_OPERATION operation,
> +                            PFNTunnelVportPendingOp callback,
> +                            PVOID context)
> +{
> +    POVS_TUNFLT_REQUEST request = NULL;
> +    NTSTATUS            status = STATUS_PENDING;
> +    BOOLEAN             error = TRUE;
> +
> +    do {
> +        if (NULL == filterID) {
> +            OVS_LOG_ERROR("Invalid request.");
> +            status = STATUS_INVALID_PARAMETER;
> +            break;
>        }
> 
> -        if (gDeviceObject) {
> -            IoDeleteDevice(gDeviceObject);
> +        request = (POVS_TUNFLT_REQUEST) OvsAllocateMemory(sizeof(*request));
> +        if (NULL == request) {
> +            OVS_LOG_ERROR("Failed to allocate list item.");
> +            status = STATUS_INSUFFICIENT_RESOURCES;
> +            break;
> +        }
> +
> +        request->port = remotePort;
> +        request->operation = operation;
> +        request->ID = *filterID;
> +        request->retID = filterID;

Pls. use something like. It makes it clearer.
union {
   PUINT64 addId;
   UINT64 deleteId;
} filterID;
SV: OK.

> +        request->irp = irp;
> +        request->callback = callback;
> +        request->context = context;
> +
> +        OvsTunnelFilterRoundRobinPush(request);
> +
> +        error = FALSE;
> +    } while (error);
> +
> +    if (error) {
> +        if (request) {
> +            OvsTunnelFilterCompleteRequest(request, status);
> +            OvsFreeMemory(request);
> +            request = NULL;
>        }
>    }
> 
>    return status;
> }
> +
> +/*
> + * --------------------------------------------------------------------------
> + *  This function adds a new WFP filter for the received port and returns the
> + *  ID of the created WFP filter.
> + *
> + *  Note:
> + *  All necessary calls to the WFP filtering engine must be running at IRQL =
> + *  PASSIVE_LEVEL. Because the function is called at IRQL = DISPATCH_LEVEL,
> + *  we register an OVS_TUN_FILTER_CREATE request that will be processed by
> + *  the tunnel filter thread routine at IRQL = PASSIVE_LEVEL.
> + * --------------------------------------------------------------------------
> + */
> +NTSTATUS
> +OvsTunelFilterCreate(PIRP irp,
> +                     UINT16 filterPort,
> +                     UINT64 *filterID,
> +                     PFNTunnelVportPendingOp callback,
> +                     PVOID context)
> +{
> +    return OvsTunnelFilterQueueRequest(irp,
> +                                       filterPort,
> +                                       filterID,
> +                                       OVS_TUN_FILTER_CREATE,
> +                                       callback,
> +                                       context);
> +}
> +
> +/*
> + * --------------------------------------------------------------------------
> + *  This function removes a WFP filter using the received filter ID.
> + *
> + *  Note:
> + *  All necessary calls to the WFP filtering engine must be running at IRQL =
> + *  PASSIVE_LEVEL. Because the function is called at IRQL = DISPATCH_LEVEL,
> + *  we register an OVS_TUN_FILTER_DELETE request that will be processed by
> + *  the tunnel filter thread routine at IRQL = PASSIVE_LEVEL.
> + * --------------------------------------------------------------------------
> + */
> +NTSTATUS
> +OvsTunelFilterDelete(PIRP irp,
> +                     UINT64 filterID,
> +                     PFNTunnelVportPendingOp callback,
> +                     PVOID context)
> +{
> +    return OvsTunnelFilterQueueRequest(irp,
> +                                       0,
> +                                       &filterID,
> +                                       OVS_TUN_FILTER_DELETE,
> +                                       callback,
> +                                       context);
> +}
> \ No newline at end of file
> diff --git a/datapath-windows/ovsext/TunnelIntf.h 
> b/datapath-windows/ovsext/TunnelIntf.h
> index b2bba30..7ac24a9 100644
> --- a/datapath-windows/ovsext/TunnelIntf.h
> +++ b/datapath-windows/ovsext/TunnelIntf.h
> @@ -17,6 +17,10 @@
> #ifndef __TUNNEL_INTF_H_
> #define __TUNNEL_INTF_H_ 1
> 
> +typedef VOID(*PFNTunnelVportPendingOp)(PVOID context,
> +                                        NTSTATUS status,
> +                                        UINT32 *replyLen);
> +
> /* Tunnel callout driver load/unload functions */
> NTSTATUS OvsTunnelFilterInitialize(PDRIVER_OBJECT driverObject);
> 
> @@ -30,4 +34,15 @@ VOID OvsTunnelAddSystemProvider(HANDLE handle);
> 
> VOID OvsTunnelRemoveSystemProvider(HANDLE handle);
> 
> +NTSTATUS OvsTunelFilterCreate(PIRP irp,
> +                              UINT16 filterPort,
> +                              UINT64 *filterID,
> +                              PFNTunnelVportPendingOp callback,
> +                              PVOID context);
> +
> +NTSTATUS OvsTunelFilterDelete(PIRP irp,
> +                              UINT64 filterID,
> +                              PFNTunnelVportPendingOp callback,
> +                              PVOID context);
> +
> #endif /* __TUNNEL_INTF_H_ */

LG

> diff --git a/datapath-windows/ovsext/Vport.c b/datapath-windows/ovsext/Vport.c
> index c9dfaea..f3bfc93 100644
> --- a/datapath-windows/ovsext/Vport.c
> +++ b/datapath-windows/ovsext/Vport.c
> @@ -47,6 +47,14 @@
> 
> #define OVS_VPORT_DEFAULT_WAIT_TIME_MICROSEC    100
> 
> +typedef struct _OVS_TUNFLT_INIT_CONTEXT {
> +    UINT32 outputLength;
> +    PVOID outputBuffer;
> +    PVOID inputBuffer;
> +    POVS_VPORT_ENTRY vport;
> +} OVS_TUNFLT_INIT_CONTEXT, *POVS_TUNFLT_INIT_CONTEXT;

minor: A comment would certainly be useful here. You can say something like - 
context structure used to pass back and forth information to the tunnel filter.
SV: OK, I'll add a comment to the structure.

> +
> +
> extern POVS_SWITCH_CONTEXT gOvsSwitchContext;
> extern PNDIS_SPIN_LOCK gOvsCtrlLock;
> 
> @@ -70,6 +78,13 @@ static POVS_VPORT_ENTRY 
> OvsFindVportByHvNameW(POVS_SWITCH_CONTEXT switchContext,
> static NDIS_STATUS InitHvVportCommon(POVS_SWITCH_CONTEXT switchContext,
>                                     POVS_VPORT_ENTRY vport,
>                                     BOOLEAN newPort);
> +static VOID OvsTunnelVportPendingInit(PVOID context,
> +                                      NTSTATUS status,
> +                                      UINT32 *replyLen);
> +static VOID OvsTunnelVportPendingUninit(PVOID context,
> +                                        NTSTATUS status,
> +                                        UINT32 *replyLen);
> +
> 
> /*
> * Functions implemented in relaton to NDIS port manipulation.
> @@ -247,7 +262,7 @@ HvDeletePort(POVS_SWITCH_CONTEXT switchContext,
>     * delete will delete the vport.
>    */
>    if (vport) {
> -        OvsRemoveAndDeleteVport(switchContext, vport, TRUE, FALSE, NULL);
> +        OvsRemoveAndDeleteVport(NULL, switchContext, vport, TRUE, FALSE, 
> NULL);
>    } else {
>        OVS_LOG_WARN("Vport not present.");
>    }
> @@ -537,7 +552,7 @@ HvDeleteNic(POVS_SWITCH_CONTEXT switchContext,
>    portNo = vport->portNo;
>    if (vport->portType == NdisSwitchPortTypeExternal &&
>        vport->nicIndex != 0) {
> -        OvsRemoveAndDeleteVport(switchContext, vport, TRUE, FALSE, NULL);
> +        OvsRemoveAndDeleteVport(NULL, switchContext, vport, TRUE, FALSE, 
> NULL);
>    }
>    vport->nicState = NdisSwitchNicStateUnknown;
>    vport->ovsState = OVS_STATE_PORT_CREATED;
> @@ -849,11 +864,14 @@ OvsInitPhysNicVport(POVS_VPORT_ENTRY physExtVport,
> * --------------------------------------------------------------------------
> */
> NTSTATUS
> -OvsInitTunnelVport(POVS_VPORT_ENTRY vport,
> +OvsInitTunnelVport(PVOID context,
> +                   POVS_VPORT_ENTRY vport,

PVOID context can be
POVS_USER_PARAMS_CONTEXT usrParamsCtx.

No reason for making it PVOID.
SV: Yes, there is. The POVS_USER_PARAMS_CONTEXT is not available from the 
Vport.h header.

>                   OVS_VPORT_TYPE ovsType,
>                   UINT16 dstPort)
> {
>    NTSTATUS status = STATUS_SUCCESS;
> +    POVS_USER_PARAMS_CONTEXT usrParamsCtx =
> +        (POVS_USER_PARAMS_CONTEXT) context;
> 
>    vport->isBridgeInternal = FALSE;
>    vport->ovsType = ovsType;
> @@ -864,8 +882,26 @@ OvsInitTunnelVport(POVS_VPORT_ENTRY vport,
>    case OVS_VPORT_TYPE_GRE64:
>        break;
>    case OVS_VPORT_TYPE_VXLAN:
> -        status = OvsInitVxlanTunnel(vport, dstPort);
> +    {
> +        POVS_TUNFLT_INIT_CONTEXT tunnelContext = NULL;
> +
> +        tunnelContext = OvsAllocateMemory(sizeof(*tunnelContext));
> +        if (tunnelContext == NULL) {
> +            status = STATUS_INSUFFICIENT_RESOURCES;
> +            break;
> +        }
> +        tunnelContext->inputBuffer = usrParamsCtx->inputBuffer;
> +        tunnelContext->outputBuffer = usrParamsCtx->outputBuffer;
> +        tunnelContext->outputLength = usrParamsCtx->outputLength;
> +        tunnelContext->vport = vport;
> +
> +        status = OvsInitVxlanTunnel(usrParamsCtx->irp,
> +                                    vport,
> +                                    dstPort,
> +                                    OvsTunnelVportPendingInit,
> +                                    (PVOID)tunnelContext);
>        break;
> +    }
>    default:
>        ASSERT(0);
>    }
> @@ -1011,7 +1047,7 @@ InitOvsVportCommon(POVS_SWITCH_CONTEXT switchContext,
> 
>    switch(vport->ovsType) {
>    case OVS_VPORT_TYPE_VXLAN:
> -        ASSERT(switchContext->vxlanVport == NULL);
> +        //ASSERT(switchContext->vxlanVport == NULL);
>        switchContext->vxlanVport = vport;
>        switchContext->numNonHvVports++;
>        break;
> @@ -1054,13 +1090,17 @@ InitOvsVportCommon(POVS_SWITCH_CONTEXT switchContext,
> * port being removed from OVS userspace.
> * --------------------------------------------------------------------------
> */
> -VOID
> -OvsRemoveAndDeleteVport(POVS_SWITCH_CONTEXT switchContext,
> +NTSTATUS
> +OvsRemoveAndDeleteVport(PVOID usrParamsContext,
> +                        POVS_SWITCH_CONTEXT switchContext,
>                        POVS_VPORT_ENTRY vport,
>                        BOOLEAN hvDelete,
>                        BOOLEAN ovsDelete,
>                        BOOLEAN *vportDeallocated)
> {
> +    NTSTATUS status = STATUS_SUCCESS;
> +    POVS_USER_PARAMS_CONTEXT usrParamsCtx =
> +        (POVS_USER_PARAMS_CONTEXT)usrParamsContext;
>    BOOLEAN hvSwitchPort = FALSE;
>    BOOLEAN deletedOnOvs = FALSE, deletedOnHv = FALSE;
> 
> @@ -1077,7 +1117,7 @@ OvsRemoveAndDeleteVport(POVS_SWITCH_CONTEXT 
> switchContext,
>            if (vportDeallocated) {
>                *vportDeallocated = TRUE;
>            }
> -            return;
> +            return STATUS_SUCCESS;
>        } else {
>            ASSERT(switchContext->numPhysicalNics);
>            switchContext->numPhysicalNics--;
> @@ -1095,9 +1135,32 @@ OvsRemoveAndDeleteVport(POVS_SWITCH_CONTEXT 
> switchContext,
>        }
>        break;
>    case OVS_VPORT_TYPE_VXLAN:
> -        OvsCleanupVxlanTunnel(vport);
> +    {
> +        POVS_TUNFLT_INIT_CONTEXT tunnelContext = NULL;
> +        PIRP irp = NULL;
> +
> +        if (usrParamsCtx) {
> +            tunnelContext = OvsAllocateMemory(sizeof(*tunnelContext));
> +            if (tunnelContext == NULL) {
> +                status = STATUS_INSUFFICIENT_RESOURCES;
> +                break;
> +            }
> +            tunnelContext->inputBuffer = usrParamsCtx->inputBuffer;
> +            tunnelContext->outputBuffer = usrParamsCtx->outputBuffer;
> +            tunnelContext->outputLength = usrParamsCtx->outputLength;
> +            tunnelContext->vport = vport;
> +
> +            irp = usrParamsCtx->irp;
> +        }
> +
> +        status = OvsCleanupVxlanTunnel(irp,
> +                                       vport,
> +                                       OvsTunnelVportPendingUninit,
> +                                       tunnelContext);
> +
>        switchContext->vxlanVport = NULL;
>        break;
> +    }
>    case OVS_VPORT_TYPE_GRE:
>    case OVS_VPORT_TYPE_GRE64:
>        break;
> @@ -1156,6 +1219,8 @@ OvsRemoveAndDeleteVport(POVS_SWITCH_CONTEXT 
> switchContext,
>            *vportDeallocated = TRUE;
>        }
>    }
> +
> +    return status;
> }
> 
> NDIS_STATUS
> @@ -1293,7 +1358,7 @@ OvsClearAllSwitchVports(POVS_SWITCH_CONTEXT 
> switchContext)
>        LIST_FORALL_SAFE(head, link, next) {
>            POVS_VPORT_ENTRY vport;
>            vport = CONTAINING_RECORD(link, OVS_VPORT_ENTRY, portIdLink);
> -            OvsRemoveAndDeleteVport(switchContext, vport, TRUE, TRUE, NULL);
> +            OvsRemoveAndDeleteVport(NULL, switchContext, vport, TRUE, TRUE, 
> NULL);
>        }
>    }
>    /*
> @@ -1301,7 +1366,7 @@ OvsClearAllSwitchVports(POVS_SWITCH_CONTEXT 
> switchContext)
>     * 'portIdHashArray'.
>     */
>    if (switchContext->virtualExternalVport) {
> -        OvsRemoveAndDeleteVport(switchContext,
> +        OvsRemoveAndDeleteVport(NULL, switchContext,
>            (POVS_VPORT_ENTRY)switchContext->virtualExternalVport, TRUE, TRUE,
>            NULL);
>    }
> @@ -1316,7 +1381,7 @@ OvsClearAllSwitchVports(POVS_SWITCH_CONTEXT 
> switchContext)
>            ASSERT(OvsIsTunnelVportType(vport->ovsType) ||
>                   (vport->ovsType == OVS_VPORT_TYPE_INTERNAL &&
>                    vport->isBridgeInternal) || vport->isPresentOnHv == TRUE);
> -            OvsRemoveAndDeleteVport(switchContext, vport, TRUE, TRUE, NULL);
> +            OvsRemoveAndDeleteVport(NULL, switchContext, vport, TRUE, TRUE, 
> NULL);
>        }
>    }
> 
> @@ -1908,7 +1973,7 @@ Cleanup:
> 
> /*
> * --------------------------------------------------------------------------
> - *  Command Handler for 'OVS_VPORT_CMD_NEW'.
> + *  Command Handler for 'OVS_VPORT_CMD_GET'.
> *
> *  The function handles the initial call to setup the dump state, as well as
> *  subsequent calls to continue dumping data.
> @@ -2033,8 +2098,8 @@ OvsNewVportCmdHandler(POVS_USER_PARAMS_CONTEXT 
> usrParamsCtx,
>    } else {
>        ASSERT(OvsIsTunnelVportType(portType) ||
>               (portType == OVS_VPORT_TYPE_INTERNAL && isBridgeInternal));
> -        ASSERT(OvsGetTunnelVport(gOvsSwitchContext, portType) == NULL ||
> -               !OvsIsTunnelVportType(portType));
> +        //ASSERT(OvsGetTunnelVport(gOvsSwitchContext, portType) == NULL ||
> +        //       !OvsIsTunnelVportType(portType));
> 
>        vport = (POVS_VPORT_ENTRY)OvsAllocateVport();
>        if (vport == NULL) {
> @@ -2044,11 +2109,24 @@ OvsNewVportCmdHandler(POVS_USER_PARAMS_CONTEXT 
> usrParamsCtx,
>        vportAllocated = TRUE;
> 
>        if (OvsIsTunnelVportType(portType)) {
> -            status = OvsInitTunnelVport(vport, portType, VXLAN_UDP_PORT);
> +            PNL_ATTR attr = 
> NlAttrFindNested(vportAttrs[OVS_VPORT_ATTR_OPTIONS],
> +                                             OVS_TUNNEL_ATTR_DST_PORT);
> +            UINT16 udpPortDest = NlAttrGetU16(attr);

Since VPORT_ATTR_OPTIONS is optional attribute, there should be checks to 
handle a request without the option and use the default port #.
SV: OK, I'll modify this to use the default port 4789 if the attribute is not 
present.

> +
> +            status = OvsInitTunnelVport(usrParamsCtx,
> +                                        vport,
> +                                        portType,
> +                                        udpPortDest);
> +
>            nlError = NlMapStatusToNlErr(status);
> +
> +            if (nlError == NL_ERROR_PENDING) {
> +                goto Cleanup;
> +            }
>        } else {
>            OvsInitBridgeInternalVport(vport);
>        }
> +
>        vportInitialized = TRUE;
> 
>        if (nlError == NL_ERROR_SUCCESS) {
> @@ -2119,14 +2197,14 @@ OvsNewVportCmdHandler(POVS_USER_PARAMS_CONTEXT 
> usrParamsCtx,
> Cleanup:
>    NdisReleaseRWLock(gOvsSwitchContext->dispatchLock, &lockState);
> 
> -    if (nlError != NL_ERROR_SUCCESS) {
> +    if ((nlError != NL_ERROR_SUCCESS) && (nlError != NL_ERROR_PENDING)) {
>        POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR)
>            usrParamsCtx->outputBuffer;
> 
>        if (vport && vportAllocated == TRUE) {
>            if (vportInitialized == TRUE) {
>                if (OvsIsTunnelVportType(portType)) {
> -                    OvsCleanupVxlanTunnel(vport);
> +                    OvsCleanupVxlanTunnel(NULL, vport, NULL, NULL);
>                }
>            }
>            OvsFreeMemory(vport);
> @@ -2136,7 +2214,7 @@ Cleanup:
>        *replyLen = msgError->nlMsg.nlmsgLen;
>    }
> 
> -    return STATUS_SUCCESS;
> +    return (status == STATUS_PENDING) ? STATUS_PENDING : STATUS_SUCCESS;
> }
> 
> 
> @@ -2309,22 +2387,31 @@ OvsDeleteVportCmdHandler(POVS_USER_PARAMS_CONTEXT 
> usrParamsCtx,
>        goto Cleanup;
>    }
> 
> -    status = OvsCreateMsgFromVport(vport, msgIn, usrParamsCtx->outputBuffer,
> -                                   usrParamsCtx->outputLength,
> -                                   gOvsSwitchContext->dpNo);
> -
>    /*
>     * Mark the port as deleted from OVS userspace. If the port does not exist
>     * on the Hyper-V switch, it gets deallocated. Otherwise, it stays.
>     */
> -    OvsRemoveAndDeleteVport(gOvsSwitchContext, vport, FALSE, TRUE, NULL);
> +    status = OvsRemoveAndDeleteVport(usrParamsCtx,
> +                                     gOvsSwitchContext,
> +                                     vport,
> +                                     FALSE,
> +                                     TRUE,
> +                                     NULL);
> +    if (STATUS_PENDING == status) {
> +        nlError = NL_ERROR_PENDING;
> +        goto Cleanup;
> +    }
> +
> +    status = OvsCreateMsgFromVport(vport, msgIn, usrParamsCtx->outputBuffer,
> +                                   usrParamsCtx->outputLength,
> +                                   gOvsSwitchContext->dpNo);
> 
>    *replyLen = msgOut->nlMsg.nlmsgLen;
> 
> Cleanup:
>    NdisReleaseRWLock(gOvsSwitchContext->dispatchLock, &lockState);
> 
> -    if (nlError != NL_ERROR_SUCCESS) {
> +    if ((nlError != NL_ERROR_SUCCESS) && (nlError != NL_ERROR_PENDING)) {
>        POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR)
>            usrParamsCtx->outputBuffer;
> 
> @@ -2332,5 +2419,202 @@ Cleanup:
>        *replyLen = msgError->nlMsg.nlmsgLen;
>    }
> 
> -    return STATUS_SUCCESS;
> +    return (status == STATUS_PENDING) ? STATUS_PENDING : STATUS_SUCCESS;
> +}
> +
> +static VOID
> +OvsTunnelVportPendingUninit(PVOID context,
> +                            NTSTATUS status,
> +                            UINT32 *replyLen)
> +{
> +    POVS_TUNFLT_INIT_CONTEXT tunnelContext =
> +        (POVS_TUNFLT_INIT_CONTEXT) context;
> +    POVS_MESSAGE msgIn = NULL;
> +    POVS_MESSAGE msgOut = NULL;
> +    NL_ERROR nlError = NL_ERROR_SUCCESS;
> +    BOOLEAN error = TRUE;
> +
> +    do {
> +        if (NULL == tunnelContext) {
> +            nlError = NL_ERROR_INVAL;
> +            break;
> +        }

Can this happen? Shouldn’t this be an ASSERT?

> +
> +        if ((NULL == tunnelContext->vport) ||
> +            (NULL == tunnelContext->inputBuffer) ||
> +            (NULL == tunnelContext->outputBuffer)) {
> +            nlError = NL_ERROR_INVAL;
> +            break;
> +        }

Can this happen? Shouldn’t this be an ASSERT?

> +
> +        if (!NT_SUCCESS(status)) {
> +            nlError = NlMapStatusToNlErr(status);

ASSERT(nlError != NL_ERROR_INVAL) would be useful IMO.

> +            break;
> +        }
> +
> +        msgIn = (POVS_MESSAGE)tunnelContext->inputBuffer;
> +        msgOut = (POVS_MESSAGE)tunnelContext->outputBuffer;
> +
> +        OvsCreateMsgFromVport(tunnelContext->vport,
> +                              msgIn,
> +                              tunnelContext->outputBuffer,
> +                              tunnelContext->outputLength,
> +                              gOvsSwitchContext->dpNo);
> +
> +        *replyLen = msgOut->nlMsg.nlmsgLen;
> +
> +        error = FALSE;
> +    } while (error);

while (0)/while(FALSE) would be a better choice. There’s no reason to be using 
‘error’ as the while condition since we’ll not be looping for whatsoever reason.
SV: Using 'while(0)/while(FALSE)' will raise a compiler warning treated as 
error.
> +
> +    if (error && (NL_ERROR_INVAL != nlError)) {
> +        POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR)
> +            tunnelContext->outputBuffer;
> +
> +        NlBuildErrorMsg(msgIn, msgError, nlError);
> +        *replyLen = msgError->nlMsg.nlmsgLen;
> +    }
> +}
> +
> +static VOID
> +OvsTunnelVportPendingInit(PVOID context,
> +                          NTSTATUS status,
> +                          UINT32 *replyLen)
> +{
> +    POVS_TUNFLT_INIT_CONTEXT tunnelContext =
> +        (POVS_TUNFLT_INIT_CONTEXT) context;
> +    POVS_MESSAGE msgIn = NULL;
> +    POVS_MESSAGE msgOut = NULL;
> +    PCHAR portName;
> +    ULONG portNameLen = 0;
> +    UINT32 portType = 0;
> +    NL_ERROR nlError = NL_ERROR_SUCCESS;
> +    BOOLEAN error = TRUE;
> +
> +    do {
> +        if (NULL == tunnelContext) {
> +            nlError = NL_ERROR_INVAL;
> +            break;
> +        }
> +
> +        if ((NULL == tunnelContext->vport) ||
> +            (NULL == tunnelContext->inputBuffer) ||
> +            (NULL == tunnelContext->outputBuffer)) {
> +            nlError = NL_ERROR_INVAL;
> +            break;
> +        }

Can these two conditions happen? Shouldn’t they be ASSERT?

> +
> +        msgIn = (POVS_MESSAGE)tunnelContext->inputBuffer;
> +        msgOut = (POVS_MESSAGE)tunnelContext->outputBuffer;
> +
> +        if (!NT_SUCCESS(status)) {
> +            nlError = NlMapStatusToNlErr(status);
> +            break;
> +        }

ASSERT(nlError != NL_ERROR_INVAL) would be useful IMO.

> +
> +        static const NL_POLICY ovsVportPolicy[] = {
> +            [OVS_VPORT_ATTR_PORT_NO] = { .type = NL_A_U32, .optional = TRUE 
> },
> +            [OVS_VPORT_ATTR_TYPE] = { .type = NL_A_U32, .optional = FALSE },
> +            [OVS_VPORT_ATTR_NAME] = { .type = NL_A_STRING, .maxLen = 
> IFNAMSIZ,
> +            .optional = FALSE },
> +            [OVS_VPORT_ATTR_UPCALL_PID] = { .type = NL_A_UNSPEC,
> +            .optional = FALSE },
> +            [OVS_VPORT_ATTR_OPTIONS] = { .type = NL_A_NESTED, .optional = 
> TRUE },
> +        };
> +
> +        PNL_ATTR vportAttrs[ARRAY_SIZE(ovsVportPolicy)];
> +
> +        /* input buffer has been validated while validating write dev op. */
> +        ASSERT(tunnelContext->inputBuffer != NULL);
> +
> +        /* Output buffer has been validated while validating transact dev 
> op. */
> +        ASSERT(msgOut != NULL && tunnelContext->outputLength >= sizeof 
> *msgOut);
> +
> +        if (!NlAttrParse((PNL_MSG_HDR)msgIn,
> +            NLMSG_HDRLEN + GENL_HDRLEN + OVS_HDRLEN,
> +            NlMsgAttrsLen((PNL_MSG_HDR)msgIn),
> +            ovsVportPolicy, vportAttrs, ARRAY_SIZE(vportAttrs))) {
> +            nlError = NL_ERROR_INVAL;

I think we can ASSERT here. It just catches any corruptions.

> +            break;
> +        }
> +
> +        portName =
> +            NlAttrGet(vportAttrs[OVS_VPORT_ATTR_NAME]);
> +        portNameLen =
> +            NlAttrGetSize(vportAttrs[OVS_VPORT_ATTR_NAME]);
> +        portType =
> +            NlAttrGetU32(vportAttrs[OVS_VPORT_ATTR_TYPE]);

minor: extra new lines after ‘=‘.

> +
> +        if (tunnelContext->vport->portNo != OVS_DPPORT_NUMBER_INVALID) {
> +            nlError = NL_ERROR_EXIST;
> +            break;
> +        }

Can this happen? We have not assigned a value for ‘tunnel->vport->portNo’. We 
can ASSERT I think. This was applicable in OvsNewVportCmdHandler() for 
non-tunnel ports e.g. VIFs, if they have already been added from userspace. For 
tunnel ports, we allocate a new VPORT structure and vport->portNo is guaranteed 
to be invalid.

You can declare a stack variable ‘vport’ and get rid of using ‘tunnelContext->’.
SV: OK.

> +
> +        tunnelContext->vport->ovsState = OVS_STATE_CONNECTED;
> +        tunnelContext->vport->nicState = NdisSwitchNicStateConnected;
> +
> +        /*
> +         * Allow the vport to be deleted, because there is no
> +         * corresponding hyper-v switch part.
> +         */
> +        tunnelContext->vport->isPresentOnHv = TRUE;
> +
> +        if (vportAttrs[OVS_VPORT_ATTR_PORT_NO] != NULL) {
> +            /*
> +             * XXX: when we implement the limit for OVS port number to be
> +             * MAXUINT16, we'll need to check the port number received from 
> the
> +             * userspace.
> +             */
> +             tunnelContext->vport->portNo =
> +                NlAttrGetU32(vportAttrs[OVS_VPORT_ATTR_PORT_NO]);
> +        } else {
> +            tunnelContext->vport->portNo =
> +                OvsComputeVportNo(gOvsSwitchContext);
> +            if (tunnelContext->vport->portNo == OVS_DPPORT_NUMBER_INVALID) {
> +                nlError = NL_ERROR_NOMEM;
> +                break;
> +            }
> +        }
> +
> +        /* The ovs port name must be uninitialized. */
> +        ASSERT(tunnelContext->vport->ovsName[0] == '\0');
> +        ASSERT(portNameLen <= OVS_MAX_PORT_NAME_LENGTH);
> +
> +        RtlCopyMemory(tunnelContext->vport->ovsName, portName, portNameLen);
> +        /* if we don't have options, then vport->portOptions will be NULL */
> +        tunnelContext->vport->portOptions = 
> vportAttrs[OVS_VPORT_ATTR_OPTIONS];
> +
> +        /*
> +         * XXX: when we implement OVS_DP_ATTR_USER_FEATURES in datapath,
> +         * we'll need to check the OVS_DP_F_VPORT_PIDS flag: if it is set,
> +         * it means we have an array of pids, instead of a single pid.
> +         * ATM we assume we have one pid only.
> +         */
> +        tunnelContext->vport->upcallPid =
> +            NlAttrGetU32(vportAttrs[OVS_VPORT_ATTR_UPCALL_PID]);
> +
> +        status = InitOvsVportCommon(gOvsSwitchContext,
> +                                    tunnelContext->vport);
> +        ASSERT(status == STATUS_SUCCESS);
> +
> +        OvsCreateMsgFromVport(tunnelContext->vport,
> +                              msgIn,
> +                              tunnelContext->outputBuffer,
> +                              tunnelContext->outputLength,
> +                              gOvsSwitchContext->dpNo);
> +
> +        *replyLen = msgOut->nlMsg.nlmsgLen;
> +
> +        error = FALSE;
> +    } while (error);

while (0/FALSE) would be a better choice. There’s no reason to be using ‘error’ 
as the while condition since we’ll not be looping for whatsoever reason.

> +
> +    if (error && (NL_ERROR_INVAL != nlError)) {
> +        POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR)
> +            tunnelContext->outputBuffer;
> +
> +        OvsCleanupVxlanTunnel(NULL, tunnelContext->vport, NULL, NULL);
> +        OvsFreeMemory(tunnelContext->vport);
> +
> +        NlBuildErrorMsg(msgIn, msgError, nlError);
> +        *replyLen = msgError->nlMsg.nlmsgLen;
> +    }
> }

For both OvsTunnelVportPendingInit()/OvsTunnelVportPendingUninit(), a short 
description above the definition would help, IMO.
SV: I will add a description for the two functions.

> diff --git a/datapath-windows/ovsext/Vport.h b/datapath-windows/ovsext/Vport.h
> index 348fbfd..bca98d7 100644
> --- a/datapath-windows/ovsext/Vport.h
> +++ b/datapath-windows/ovsext/Vport.h
> @@ -207,15 +207,16 @@ OvsIsBridgeInternalVport(POVS_VPORT_ENTRY vport)
>    return vport->isBridgeInternal == TRUE;
> }
> 
> -VOID OvsRemoveAndDeleteVport(POVS_SWITCH_CONTEXT switchContext,
> -                             POVS_VPORT_ENTRY vport,
> -                             BOOLEAN hvDelete, BOOLEAN ovsDelete,
> -                             BOOLEAN *vportDeallocated);
> +NTSTATUS OvsRemoveAndDeleteVport(PVOID usrParamsCtx, 
> +                                 POVS_SWITCH_CONTEXT switchContext,
> +                                 POVS_VPORT_ENTRY vport,
> +                                 BOOLEAN hvDelete, BOOLEAN ovsDelete,
> +                                 BOOLEAN *vportDeallocated);

‘vportDeallocated’ is not being used. You can nuke it.
SV: OK.

> 
> NDIS_STATUS InitOvsVportCommon(POVS_SWITCH_CONTEXT switchContext,
>                               POVS_VPORT_ENTRY vport);
> -NTSTATUS OvsInitTunnelVport(POVS_VPORT_ENTRY vport, OVS_VPORT_TYPE ovsType,
> -                            UINT16 dstport);
> +NTSTATUS OvsInitTunnelVport(PVOID usrParamsCtx, POVS_VPORT_ENTRY vport,
> +                            OVS_VPORT_TYPE ovsType, UINT16 dstport);
> NTSTATUS OvsInitBridgeInternalVport(POVS_VPORT_ENTRY vport);
> 
> POVS_VPORT_ENTRY OvsAllocateVport(VOID);
> diff --git a/datapath-windows/ovsext/Vxlan.c b/datapath-windows/ovsext/Vxlan.c
> index 1ce5af2..cd3dc45 100644
> --- a/datapath-windows/ovsext/Vxlan.c
> +++ b/datapath-windows/ovsext/Vxlan.c
> @@ -49,15 +49,54 @@
> /* Move to a header file */
> extern POVS_SWITCH_CONTEXT gOvsSwitchContext;
> 
> +BOOLEAN
> +OvsIsTunnelFilterCreated(POVS_SWITCH_CONTEXT switchContext,
> +                         UINT16 udpPortDest)

Pls. add a description. Also, this is has static scope.

I am not clear as to the motivation for this function. It seems to be used in 
OvsInitVxlanTunnel() to decide on whether or not to insert the VXLAN WFP filter 
or not. If a VXLAN port exists in the hash tables, wouldn’t it have 
vxlanPort->filterID != NULL? The VXLAN vport would have got inserted into the 
hash tables in OvsTunnelVportPendingInit(), but what also means 
vxlanPort->filterID != NULL.

Are you trying to handle the case of duplicate request being made while the 
previous one is not complete yet? This won’t help for that.
SV: OvsIsTunnelFilterCreated function was added to verify if a VXLAN WFP port 
already exists. When this function is called, the current vxlan vport already 
exist in the hash tables, but the port is not added yet. The port addition is 
delayed, due to high IRQL. If the port was not added yet, 'vxlanPort->filterID 
== 0'.

> +{
> +    for (UINT hash = 0; hash < OVS_MAX_VPORT_ARRAY_SIZE; hash++) {
> +        PLIST_ENTRY head, link, next;
> +
> +        head = &(switchContext->portNoHashArray[hash & OVS_VPORT_MASK]);
> +        LIST_FORALL_SAFE(head, link, next) {
> +            POVS_VPORT_ENTRY vport = NULL;
> +            POVS_VXLAN_VPORT vxlanPort = NULL;
> +            vport = CONTAINING_RECORD(link, OVS_VPORT_ENTRY, portNoLink);
> +            vxlanPort = (POVS_VXLAN_VPORT)vport->priv;
> +            if (vxlanPort) {
> +                if ((udpPortDest == vxlanPort->dstPort) &&
> +                    (0 != vxlanPort->filterID)) {
> +                    /*
> +                     * VXLAN destination port matching is not enough to 
> decide
> +                     * if the WFP filter was created or not, because the 
> newly
> +                     * allocated VXLAN vport is already added to the
> +                     * portNoHashArray in InitOvsVportCommon function. If the
> +                     * filterID of the VXLAN vport is zero it means that the
> +                     * WFP filter is not created yet.
> +                     */
> +                    return TRUE;
> +                }
> +            }
> +        }
> +    }
> +
> +    return FALSE;
> +}
> +
> /*
> * udpDestPort: the vxlan is set as payload to a udp frame. If the destination
> * port of an udp frame is udpDestPort, we understand it to be vxlan.
> + * 
> + * add comments with the modifications regarding delayed WFP processing.

yes, indeed "add comments" :)

> */
> NTSTATUS
> -OvsInitVxlanTunnel(POVS_VPORT_ENTRY vport,
> -                   UINT16 udpDestPort)
> +OvsInitVxlanTunnel(PIRP irp,
> +                   POVS_VPORT_ENTRY vport,
> +                   UINT16 udpDestPort,
> +                   PFNTunnelVportPendingOp callback,
> +                   PVOID context)
> {
> -    POVS_VXLAN_VPORT vxlanPort;
> +    NTSTATUS status = STATUS_SUCCESS;
> +    POVS_VXLAN_VPORT vxlanPort = NULL;
> 
>    vxlanPort = OvsAllocateMemory(sizeof (*vxlanPort));
>    if (vxlanPort == NULL) {
> @@ -66,28 +105,46 @@ OvsInitVxlanTunnel(POVS_VPORT_ENTRY vport,
> 
>    RtlZeroMemory(vxlanPort, sizeof(*vxlanPort));
>    vxlanPort->dstPort = udpDestPort;
> -    /*
> -     * since we are installing the WFP filter before the port is created
> -     * We need to check if it is the same number
> -     * XXX should be removed later
> -     */
> -    ASSERT(vxlanPort->dstPort == VXLAN_UDP_PORT);
>    vport->priv = (PVOID)vxlanPort;
> 
> -    return STATUS_SUCCESS;
> +    if (!OvsIsTunnelFilterCreated(gOvsSwitchContext, udpDestPort)) {
> +        status = OvsTunelFilterCreate(irp,
> +                                      udpDestPort,
> +                                      &vxlanPort->filterID,
> +                                      callback,
> +                                      context);
> +    }
> +
> +    return status;
> }
> 
> 
> -VOID
> -OvsCleanupVxlanTunnel(POVS_VPORT_ENTRY vport)
> +NTSTATUS
> +OvsCleanupVxlanTunnel(PIRP irp,
> +                      POVS_VPORT_ENTRY vport,
> +                      PFNTunnelVportPendingOp callback,
> +                      PVOID context)
> {
> +    NTSTATUS status = STATUS_SUCCESS;
> +    POVS_VXLAN_VPORT vxlanPort = NULL;
> +
>    if (vport->ovsType != OVS_VPORT_TYPE_VXLAN ||
>        vport->priv == NULL) {
> -        return;
> +        return STATUS_SUCCESS;
> +    }
> +
> +    vxlanPort = (POVS_VXLAN_VPORT)vport->priv;
> +    if (0 != vxlanPort->filterID) {

=> if (vxlanPort->filterID != 0)

Though it is good to be graceful and handle the condition of 
'vxlanPort->filterID != 0’, it could be an ASSERT(). Based on looking at the 
code, it cannot happen.

> +        status = OvsTunelFilterDelete(irp,
> +                                      vxlanPort->filterID,
> +                                      callback,
> +                                      context);
>    }
> 
>    OvsFreeMemory(vport->priv);
>    vport->priv = NULL;
> +
> +    return status;
> }
> 
> 
> @@ -474,9 +531,6 @@ OvsSlowPathDecapVxlan(const PNET_BUFFER_LIST packet,
>            break;
>        }
> 
> -        /* XXX Should be tested against the dynamic port # in the VXLAN 
> vport */
> -        ASSERT(udp->dest == RtlUshortByteSwap(VXLAN_UDP_PORT));
> -
>        VxlanHeader = (VXLANHdr *)OvsGetPacketBytes(packet,
>                                                    sizeof(*VxlanHeader),
>                                                    layers.l7Offset,
> diff --git a/datapath-windows/ovsext/Vxlan.h b/datapath-windows/ovsext/Vxlan.h
> index d84796d..e2e03f9 100644
> --- a/datapath-windows/ovsext/Vxlan.h
> +++ b/datapath-windows/ovsext/Vxlan.h
> @@ -24,6 +24,7 @@ typedef struct _OVS_VXLAN_VPORT {
>    UINT64 outPkts;
>    UINT64 slowInPkts;
>    UINT64 slowOutPkts;
> +    UINT64 filterID;
>    /*
>     * To be filled
>     */
> @@ -47,10 +48,16 @@ typedef struct VXLANHdr {
>    UINT32   reserved2:8;
> } VXLANHdr;
> 
> -NTSTATUS OvsInitVxlanTunnel(POVS_VPORT_ENTRY vport,
> -                            UINT16 udpDestPort);
> +NTSTATUS OvsInitVxlanTunnel(PIRP irp,
> +                            POVS_VPORT_ENTRY vport,
> +                            UINT16 udpDestPort,
> +                            PFNTunnelVportPendingOp callback,
> +                            PVOID context);
> 
> -VOID OvsCleanupVxlanTunnel(POVS_VPORT_ENTRY vport);
> +NTSTATUS OvsCleanupVxlanTunnel(PIRP irp,
> +                               POVS_VPORT_ENTRY vport,
> +                               PFNTunnelVportPendingOp callback,
> +                               PVOID context);
> 
> NDIS_STATUS OvsSlowPathDecapVxlan(const PNET_BUFFER_LIST packet,
>                                  OvsIPv4TunnelKey *tunnelKey);
> @@ -75,7 +82,6 @@ OvsGetVxlanTunHdrSize(VOID)
>           sizeof (VXLANHdr);
> }
> 
> -#define VXLAN_UDP_PORT 4789
> #define VXLAN_UDP_PORT_NBO 0xB512
> 
> #endif /* __VXLAN_H_ */
> -- 
> 1.9.0.msysgit.0
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__openvswitch.org_mailman_listinfo_dev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=pNHQcdr7B40b4h6Yb7FIedI1dnBsxdDuTLBYD3JqV80&m=iZE-ay1RfRGlzBiIK32zf67VOMWgV3Jo3R4_9nF5A9U&s=S9xReyWzI_S-JxWBmmYonf6FJiOlNGpTDYpEm3VDiCE&e=
>  

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to