Public bug reported:

[Impact]
A recent change (issue#43975 [0]) was made to slow request logging to include detail on each operation in the cluster logs. With this change, detail for every slow request is always sent to the monitors and added to the cluster logs. This does not scale. Large, high-throughput clusters can overwhelm their monitors with spurious logs in the event of a performance issue. Disrupting the monitors can then cause further instability in the cluster.

This SRU reverts the cluster logging of every slow request the osd is processing.

The slow request clog change was added in nautilus (14.2.10) and octopus (15.2.0).

[Test Case]
Stress the cluster with a benchmarking tool to generate slow requests and observe the cluster logs.

[Where problems could occur]
The cluster logs contain detailed debug information on slow requests that is useful for smaller, low-throughput clusters. While these logs are not used by ceph, they may be used by the cluster administrators (for monitoring or alerts). Changing this logging behavior may be unexpected.

[Other Info]
The intent is to re-enable this feature behind a configurable setting, but the solution must be discussed upstream.

The same slow request detail can be enabled for each osd by raising the "debug osd" log level to 20.

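For the "observe the cluster logs" step of the test case, a small tally of per-operation slow request entries can make a before/after comparison easier. The following is only a minimal sketch: the helper name count_slow_requests is hypothetical, and it assumes that a cluster log line names the reporting daemon as "osd.<id>" and that each per-operation entry contains the phrase "slow request". Adjust the patterns to whatever your cluster log actually contains.

```python
import re
from collections import Counter

# Assumption: the reporting daemon appears in the line as "osd.<id>".
OSD_RE = re.compile(r"\bosd\.(\d+)\b")
# Assumption: per-operation entries contain the exact phrase "slow request";
# the trailing \b keeps summary lines such as "5 slow requests" out of the tally.
SLOW_RE = re.compile(r"\bslow request\b")


def count_slow_requests(lines):
    """Return a Counter mapping OSD id -> number of per-operation slow request lines."""
    counts = Counter()
    for line in lines:
        if not SLOW_RE.search(line):
            continue
        match = OSD_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Illustrative lines only, not captured from a real cluster.
    sample = [
        "osd.3 [WRN] slow request osd_op(...) currently waiting",
        "osd.3 [WRN] slow request osd_op(...) currently waiting",
        "osd.7 [WRN] slow request osd_op(...) currently waiting",
        "osd.7 [WRN] 5 slow requests, 1 included below",
        "mon.a [INF] overall HEALTH_OK",
    ]
    for osd_id, n in sorted(count_slow_requests(sample).items()):
        print(f"osd.{osd_id}: {n}")
```

To use it against a real cluster, pass an open file object for the cluster log on a monitor host in place of the sample list. With the reverted logging in place, one would expect this tally to stay at or near zero under the same benchmark load, though that should be confirmed on the cluster under test.
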
[0] https://tracker.ceph.com/issues/43975

** Affects: cloud-archive
     Importance: High
         Status: In Progress

** Affects: cloud-archive/train
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Affects: cloud-archive/ussuri
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Affects: ceph (Ubuntu)
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Affects: ceph (Ubuntu Focal)
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Affects: ceph (Ubuntu Groovy)
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Affects: ceph (Ubuntu Hirsute)
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Tags: seg sts

** Also affects: ceph (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: ceph (Ubuntu Hirsute)
   Importance: Undecided
       Status: New

** Also affects: ceph (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Tags added: seg sts

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/train
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
       Status: New

** Changed in: ceph (Ubuntu Hirsute)
       Status: New => In Progress

** Changed in: ceph (Ubuntu Hirsute)
   Importance: Undecided => High

** Changed in: ceph (Ubuntu Groovy)
   Importance: Undecided => High

** Changed in: ceph (Ubuntu Focal)
   Importance: Undecided => High

** Changed in: cloud-archive/ussuri
   Importance: Undecided => High

** Changed in: cloud-archive/train
   Importance: Undecided => High

** Changed in: cloud-archive
   Importance: Undecided => High

** Changed in: ceph (Ubuntu Groovy)
       Status: New => In Progress

** Changed in: ceph (Ubuntu Focal)
       Status: New => In Progress

** Changed in: cloud-archive/ussuri
       Status: New => In Progress

** Changed in: cloud-archive/train
       Status: New => In Progress

** Changed in: cloud-archive
       Status: New => In Progress

-- 
https://bugs.launchpad.net/bugs/1909162

Title:
  cluster log slow request spam