Public bug reported:

Description:
Instance boot is stuck in BUILDING status if the user has roles whose combined name length exceeds 255 characters.

Details:
The nova-compute manager tries to store the full role list from the request context in InstanceSystemMetadata in the _build_and_run_instance method [1], but the DB model [2] limits a metadata value to 255 characters. If the role names, joined with ",", exceed 255 characters in total, the instance boot fails because the DB insert fails.

1. https://opendev.org/openstack/nova/src/commit/7a7427691e0bd4818bb7a2c5f5371e0244addbbb/nova/compute/manager.py#L2584-L2587
2. https://opendev.org/openstack/nova/src/commit/7a7427691e0bd4818bb7a2c5f5371e0244addbbb/nova/db/main/models.py#L980

Steps to reproduce:
Create roles whose combined name length exceeds 255 characters. This example uses three roles with 100-character names.

  $ openstack role create aaaaaa...<omit 90 a chars>..aaaa
  $ openstack role create bbbbbb...<omit 90 b chars>..bbbb
  $ openstack role create cccccc...<omit 90 c chars>..cccc
  $ openstack role add --project admin --user admin <aaaa role id>
  $ openstack role add --project admin --user admin <bbbb role id>
  $ openstack role add --project admin --user admin <cccc role id>
  $ openstack server create --image <image-id> --nic net-id=<net-id> --flavor <flavor-id> stuck-vm

The server, stuck-vm, stays in status "BUILD" with vm_state "building".
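The arithmetic behind the failure can be checked with a short sketch. This is illustrative only and is not nova code: the helper names boot_roles_value and fits_in_metadata are hypothetical, and the 255 limit and the ","-join are taken from the description above.

```python
# Illustrative sketch, not nova code. Helper names are hypothetical.
VALUE_LIMIT = 255  # length limit of the metadata value column, per this report


def boot_roles_value(roles):
    """Join role names with "," as described in this report."""
    return ",".join(roles)


def fits_in_metadata(roles, limit=VALUE_LIMIT):
    """Return True if the joined role string fits within the limit."""
    return len(boot_roles_value(roles)) <= limit


# Three 100-character role names, as in the reproduction steps.
roles = ["a" * 100, "b" * 100, "c" * 100]
value = boot_roles_value(roles)

print(len(value))               # 3 names * 100 chars + 2 commas = 302
print(fits_in_metadata(roles))  # False, because 302 > 255
print(fits_in_metadata(["admin", "member", "reader"]))  # True
```

Any other roles the user already holds are joined into the same string, so the stored value can be longer than the three new names alone.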
Error log by nova-compute:

  DBDataError (pymysql.err.DataError) (1406, "Data too long for column 'value' at row 1")
  [SQL: INSERT INTO instance_system_metadata (created_at, updated_at, deleted_at, deleted, `key`, value, instance_uuid) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(key)s, %(value)s, %(instance_uuid)s)]
  [parameters: {'created_at': datetime.datetime(2024, 7, 12, 18, 57, 35, 750579), 'updated_at': None, 'deleted_at': None, 'deleted': 0, 'key': 'boot_roles', 'value': 'ccccccccccbbbbbbbbbbccccccccccbbbbbbbbbbccccccccccbbbbbbbbbbccccccccccbbbbbbbbbbccccccccccbbbbbbbbbb,ddddddddddbbbbbbbbbbddddddddddbbbbbbbbbbdddddddd ... (32 characters truncated) ... ddddddddddbbbbbbbbbb,manager,admin,reader,member,aaaaaaaaaabbbbbbbbbbaaaaaaaaaabbbbbbbbbbaaaaaaaaaabbbbbbbbbbaaaaaaaaaabbbbbbbbbbaaaaaaaaaabbbbbbbbbb', 'instance_uuid': '1bbd12b0-62d1-4b03-88e2-0edc77517261'}]
  (Background on this error at: https://sqlalche.me/e/14/9h9h)

Environment:
Caracal release (2024.1) and master branch

** Affects: nova
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2075100

Title:
  Instance stuck on building status if user has many roles

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2075100/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp