Thanks Numan & Amitabha, this may be the right direction to solve the bug [1].

It basically implements the Neutron API as an async call by queuing the request 
within a DB transaction. Ordering is preserved by the journal thread "lock", 
which is implemented with the PROCESSING state plus the DB transaction's 
"with_for_update", with the help of validation functions for dependency 
checking (e.g. the same object cannot be updated by two journal threads at the 
same time).
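
To make the locking idea concrete, here is a minimal sketch of how I read it. 
It is not the networking-odl code. The table schema and the function names 
(make_journal, enqueue, claim_oldest) are hypothetical, and because SQLite has 
no SELECT ... FOR UPDATE, the sketch swaps in a guarded UPDATE as the atomic 
claim step instead of "with_for_update".

```python
import sqlite3


def make_journal():
    """Create an in-memory journal table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal ("
        " seqnum INTEGER PRIMARY KEY AUTOINCREMENT,"
        " object_uuid TEXT NOT NULL,"
        " operation TEXT NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn, object_uuid, operation):
    """Record a request; a driver would do this inside the API call's transaction."""
    with conn:
        conn.execute(
            "INSERT INTO journal (object_uuid, operation) VALUES (?, ?)",
            (object_uuid, operation),
        )


def claim_oldest(conn):
    """Move the oldest eligible pending row to 'processing' and return it.

    A row is eligible only if no older row for the same object is still
    pending or processing (the dependency check).
    """
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT seqnum, object_uuid, operation FROM journal AS j"
            " WHERE j.state = 'pending' AND NOT EXISTS ("
            "   SELECT 1 FROM journal AS o"
            "   WHERE o.object_uuid = j.object_uuid"
            "     AND o.seqnum < j.seqnum"
            "     AND o.state IN ('pending', 'processing'))"
            " ORDER BY j.seqnum LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Guarded UPDATE: only one claimer can flip pending -> processing.
        cur = conn.execute(
            "UPDATE journal SET state = 'processing'"
            " WHERE seqnum = ? AND state = 'pending'",
            (row[0],),
        )
        return row if cur.rowcount == 1 else None
```

With a create and an update queued for the same port, the update stays 
unclaimed until the create leaves the processing state, while an entry for an 
unrelated object can still be claimed in the meantime.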

However, I didn't figure out how errors are handled with this approach. For 
example, a port is created in Neutron but the ODL controller fails to create 
it, even though the journal thread successfully sent the request to ODL. I also 
didn't see how the port states (UP and DOWN) are handled (I didn't see any call 
to ProvisioningBlock, so does that mean the port will just be UP from the 
beginning?). It would be great if anyone could help answer these questions.

[1] https://bugs.launchpad.net/networking-ovn/+bug/1605089

Thanks,
Han Zhou

From: Numan Siddique <nusid...@redhat.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Date: Friday, July 22, 2016 at 4:51 AM
To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][networking-ovn][networking-odl] Syncing 
neutron DB and OVN DB

Thanks for the comments Amitabha.
Please see comments inline

On Fri, Jul 22, 2016 at 5:50 AM, Amitabha Biswas <azbis...@gmail.com> wrote:
Hi Numan,

Thanks for the proposal. We have also been thinking about this use-case.

If I’m reading this accurately (and I may not be), it seems that the proposal 
is to have the OVN NB create/update/delete (CUD) operations done not by the 
api_worker threads but by a new journal thread (read operations are outside 
the scope).


Correct.

If this is indeed the case, I’d like to consider the scenario where there are N 
neutron nodes, each with M worker threads. The journal thread at each node 
contains a list of pending operations. Could there be a (sequence) dependency 
among the pending operations across the journal threads on different nodes that 
prevents them from being applied (e.g. the Logical_Router_Port and 
Logical_Switch_Port inter-dependency), given that we are returning success on 
neutron operations that have not yet been committed to the NB DB?


It's a valid scenario, and the design should handle such scenarios properly if 
we take this approach.

A couple of clarifications and thoughts below.

Thanks
Amitabha <abis...@us.ibm.com>

On Jul 13, 2016, at 1:20 AM, Numan Siddique <nusid...@redhat.com> wrote:

Adding the proper tags in subject

On Wed, Jul 13, 2016 at 1:22 PM, Numan Siddique <nusid...@redhat.com> wrote:
Hi Neutrinos,

Presently, in the OVN ML2 driver, we have two ways to sync the neutron DB and 
the OVN DB:
 - At neutron-server startup, the OVN ML2 driver syncs the neutron DB and the 
OVN DB if sync mode is set to repair.
 - An admin can run "neutron-ovn-db-sync-util" to sync the DBs.

Recently, in v2 of the networking-odl ML2 driver (please see (1) below for more 
details; ODL folks, please correct me if I am wrong here):

  - a journal thread is created which does the CRUD operations of neutron 
resources asynchronously (i.e. it sends the REST API requests to the ODL 
controller).

Would this be the equivalent of making OVSDB transactions to the OVN NB DB?

Correct.



  - a maintenance thread is created which does some cleanup periodically and, 
at startup, does a full sync if it detects an ODL controller cold reboot.


A few questions I have:
 - Can the OVN ML2 driver take the same or a similar approach? Are there any 
advantages in taking this approach? One advantage is that neutron resources can 
be created/updated/deleted even if the OVN ML2 driver has lost its connection 
to the ovsdb-server. The journal thread would eventually sync these resources 
to the OVN DB. I would like to know the community's thoughts on this.

If we can make it work, it would indeed be a huge plus for system-wide upgrades 
and for some corner cases in the code (ACL specifically), where the post_commit 
relies on all transactions being successful and doesn’t revert the neutron DB 
if something fails.

 - Are there other ML2 drivers that might have to handle DB syncs (cases where 
the other controllers also maintain their own DBs), and how are they handling 
it?

 - Can a common approach be taken to sync the neutron DB and the controller DBs?


-----------------------------------------------------------------------------------------------------------

(1)
Sync threads created by networking-odl ML2 driver
--------------------------------------------------
The ODL ML2 driver creates two threads (threading.Thread) at init:
 - Journal thread
 - Maintenance thread

Journal thread
----------------
The journal module creates a new journal table named “opendaylightjournal” - 
https://github.com/openstack/networking-odl/blob/master/networking_odl/db/models.py#L23

The journal thread runs in a loop, waiting for the sync event from the ODL ML2 
driver.

 - The ODL ML2 driver's resource (network, subnet, port) precommit functions, 
when called by the ML2 plugin, add an entry to the “opendaylightjournal” table 
with the resource data and set the journal operation state for this entry to 
“PENDING”.
 - The corresponding resource postcommit function of the ODL ML2 driver, when 
called, sets the sync event flag.
 - A timer is also created which sets the sync event flag when it expires (the 
default value is 10 seconds).
 - The journal thread wakes up, looks in the “opendaylightjournal” table for 
entries with state “pending”, and runs the CRUD operation on those resources in 
the ODL DB. Once done, it sets the state to “completed”.
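
The flow above can be sketched roughly as follows. This is only an 
illustration under my own assumptions, not the networking-odl implementation: 
the JournalThread class, its method names, and the send callable are 
hypothetical, the journal table is replaced by an in-memory list, and the 
separate timer is replaced by a timeout on threading.Event.wait().

```python
import threading


class JournalThread:
    """Toy journal: precommit queues an entry, postcommit wakes the worker."""

    def __init__(self, send, interval=10.0):
        self._send = send          # hypothetical callable pushing one entry to the backend
        self._interval = interval  # seconds; the wait timeout stands in for the timer
        self._rows = []            # stands in for the journal table
        self._lock = threading.Lock()
        self._sync_event = threading.Event()
        self._stopping = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def precommit(self, resource, operation, data):
        """Record the request as a pending journal entry."""
        with self._lock:
            self._rows.append({"resource": resource, "operation": operation,
                               "data": data, "state": "pending"})

    def postcommit(self):
        """Set the sync event flag so the worker wakes up promptly."""
        self._sync_event.set()

    def sync_pending(self):
        """Send every pending entry in order and mark it completed."""
        with self._lock:
            pending = [r for r in self._rows if r["state"] == "pending"]
        for row in pending:
            self._send(row)
            row["state"] = "completed"
        return len(pending)

    def _run(self):
        while not self._stopping:
            # Wake on the event or when the interval expires, whichever is first.
            self._sync_event.wait(timeout=self._interval)
            self._sync_event.clear()
            self.sync_pending()
        self.sync_pending()  # final drain so stop() does not strand entries

    def stop(self):
        self._stopping = True
        self._sync_event.set()
        self._thread.join()
```

Note that the sketch has no error handling: if send raises, the entry simply 
stays pending, and what should happen next is left open here.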

Maintenance thread
------------------
The maintenance thread does three operations:
 - JournalCleanup - Delete completed rows from the journal table 
“opendaylightjournal”.
 - CleanupProcessing - Mark orphaned processing rows as pending.
 - Full sync - Re-sync when detecting an ODL “cold reboot”.
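
As a rough illustration of the first two operations, here is a small sketch 
over an in-memory list of rows. The function names and the last_touched field 
are hypothetical, and treating a row as "orphaned" once it has been in 
processing longer than a timeout is my assumption; how orphaned rows are 
actually detected is not described above. Full sync is left out.

```python
def journal_cleanup(rows):
    """Return the rows that remain after dropping completed ones."""
    return [r for r in rows if r["state"] != "completed"]


def cleanup_processing(rows, now, timeout):
    """Reset rows stuck in 'processing' for longer than timeout to 'pending'.

    Returns the number of rows that were reset.
    """
    reset = 0
    for r in rows:
        if r["state"] == "processing" and now - r["last_touched"] > timeout:
            r["state"] = "pending"
            reset += 1
    return reset
```

A periodic maintenance loop could call cleanup_processing first and then 
journal_cleanup, so that rows reset to pending get picked up again by the 
journal thread on its next wake-up.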



Thanks
Numan


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

