Public bug reported:

This might be somewhat related to https://bugs.launchpad.net/nova/+bug/1800755 and the discussion there.

Recently the following problem was reported in one of our clouds:
- a homegrown, self-written monitoring script polls server diagnostics
- the monitoring script is naive and does not check the server state before requesting server diagnostics
- several servers are in the shutdown state
- the instance_faults table keeps growing and balloons the database size on disk

While handling a GET /servers/<uuid>/diagnostics call for any instance that is not RUNNING, nova raises an InstanceInvalidState exception, which is then:
- stored in the instance_faults table;
- returned to the user as HTTP 409 Conflict.

https://opendev.org/openstack/nova/src/commit/03d2715ed492350fa11908aea0fdd0265993e284/nova/compute/manager.py#L6550-L6558

Effectively, benign 'read-only' GET requests are recorded in the DB. Also, these instance_faults entries cannot be purged by standard means, since the instance is not deleted yet. What's more, they won't be shown in any API at all, since the server is also not in ERROR state.

This got me thinking: should InstanceInvalidState be saved in instance_faults at all? After all, this exception usually indicates not a problem (fault) with the instance, but a mismatch between the instance state and the action requested on the instance, which might not warrant storing it.

There's also a slight DoS potential here, but since the default policy for the get diagnostics call is admin-only, this is probably not worth worrying about.

** Affects: nova
     Importance: Undecided
     Assignee: Pavlo Shchelokovskyy (pshchelo)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Pavlo Shchelokovskyy (pshchelo)

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1992169

Title:
  instance_faults entries are created on InstanceInvalidState exceptions

Status in OpenStack Compute (nova):
  New
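
Independently of whether nova should record these faults, the naive polling described in the report can be avoided on the monitoring side. The following is only a minimal sketch, not the reporter's script: `poll_diagnostics` and the `get_diagnostics` callable are hypothetical names, and the sketch assumes the monitoring tool already has a detailed server listing that includes the `OS-EXT-STS:power_state` field, where 1 means RUNNING.

```python
# Nova reports power state as an integer; 1 means RUNNING.
POWER_STATE_RUNNING = 1


def poll_diagnostics(servers, get_diagnostics):
    """Request diagnostics only for servers that are currently running.

    `servers` is an iterable of dicts shaped like entries of a detailed
    server listing. `get_diagnostics` is a hypothetical callable that
    performs GET /servers/<uuid>/diagnostics for a single server ID.
    """
    results = {}
    for server in servers:
        if server.get("OS-EXT-STS:power_state") != POWER_STATE_RUNNING:
            # Skipping here avoids the HTTP 409 response and the
            # instance_faults row that the request would otherwise create.
            continue
        results[server["id"]] = get_diagnostics(server["id"])
    return results
```

A server can still stop between the listing and the diagnostics request, so a guard like this reduces the number of recorded faults but does not eliminate them.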