Public bug reported:

This was observed during tests on an environment with several controllers: when 
routers with gateways and subnets are created at a high rate, port creation for 
the router gateway sometimes fails with DBDeadlock. In the several cases I 
investigated, the deadlock happened when a router port was created in parallel 
with DHCP port(s) on other servers. In general, the trigger is simultaneous 
port creation. Port creation involves locking rows in the 'ports' and 
'binding' tables via the get_locked_port_and_binding() ml2 db method, which 
essentially does:
        port = (session.query(models_v2.Port).
                enable_eagerloads(False).
                filter_by(id=port_id).
                with_lockmode('update').
                one())
        binding = (session.query(models.PortBinding).
                   enable_eagerloads(False).
                   filter_by(port_id=port_id).
                   with_lockmode('update').
                   one())

There are also locks taken during IP allocation for the port.
I'm not sure exactly how this leads to a deadlock. It is probably due to the 
specifics of Galera working in active-active mode: it throws deadlock errors 
when it fails to validate a change with the other members of the cluster.
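
If the Galera explanation is right, the error is transient and the usual
client-side response is to re-run the whole transaction. Below is a minimal
sketch of that idea, not neutron's actual code: the DBDeadlock class here is a
local stand-in for the real exception, and retry_on_deadlock() and
create_port_in_transaction() are hypothetical names used only for
illustration. The wrapped function is assumed to open a fresh transaction on
every call, since a deadlocked transaction has already been rolled back.

```python
import random
import time
from functools import wraps


class DBDeadlock(Exception):
    """Local stand-in for the deadlock error raised by the DB layer."""


def retry_on_deadlock(max_retries=5, base_delay=0.01, sleep=time.sleep):
    """Re-run the wrapped function when it raises DBDeadlock.

    The function is attempted at most max_retries + 1 times. Any other
    exception propagates immediately.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise  # out of retries, surface the error
                    # Exponential backoff with jitter, so that competing
                    # writers do not all retry at the same moment.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay))
        return wrapper
    return decorator


# Usage: a fake port-creation call that deadlocks twice, then succeeds.
attempts = {"count": 0}


@retry_on_deadlock(max_retries=3, sleep=lambda _: None)
def create_port_in_transaction():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise DBDeadlock("simulated deadlock")
    return "port-created"


result = create_port_in_transaction()
```

Retrying only hides the symptom; it does not explain which statements
conflict, so the tracebacks below are still needed to find the root cause.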

Examples of tracebacks:
http://paste.openstack.org/show/399624/
http://paste.openstack.org/show/405057/

** Affects: neutron
     Importance: Undecided
     Assignee: Oleg Bondarev (obondarev)
         Status: New


** Tags: db ml2

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1479738

Title:
  DB deadlocks on simultaneous port creation

Status in neutron:
  New

