Yesterday I upgraded my Ceph cluster from Quincy (17.2.7) to Reef (18.2.4). 
Landing on 18.2.4 was purely by accident; I didn't realize I was getting it 
THE EXACT SECOND it was pushed out.

The upgrade resolved an issue I was having with HTTP 500 errors on the RGW UI, 
but seems to have created an issue in the dashboard with the Block / Images 
screen. I see errors that come up as alerts:

0 - Unknown Error
* Http failure response for api/prometheus: 0 Unknown Error
* Http failure response for api/prometheus/rules: 0 Unknown Error
* Http failure response for api/summary: 0 Unknown Error

All the other menu options on the dashboard are working.

I thought it was a UI issue, but having read the notes on 18.2.4, I wonder if 
I've hit a bug, or if there's something wonky with the RBD images in some of 
our pools. We're using Ceph mostly for OpenStack.

The syslog on the mgr server shows an error and traceback when I try to use 
the Block / Images menu link. Notably, I see:

Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: librbd::DiffIterate: 
fast diff enabled

just before the crash, and there's mention of this function in the release 
notes.

I'm going to try and use the rbd command and query some of the pools and images 
to see if I can narrow down specifically what's going on. If anyone's got some 
suggestions, I'd love to hear them. I can drop additional information from the 
logs if needed. I don't THINK there were errors prior to the upgrade, but I 
can't say for sure since I've been working in other parts of the Ceph UI.
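
A minimal sketch of that narrowing-down step, in Python. It runs `rbd du` 
against each image in a pool, one subprocess per image, and records which 
calls do not exit cleanly. It is an assumption, not something confirmed here, 
that `rbd du` exercises the same fast-diff code path as the dashboard. The 
function names and the `run` and `timeout` parameters are hypothetical helpers 
made up for illustration; only the `rbd ls` and `rbd du` commands themselves 
are real.

```python
import json
import subprocess


def list_images(pool, run=subprocess.run):
    """Return the image names in `pool` via `rbd ls --format json`."""
    result = run(["rbd", "ls", "--format", "json", pool],
                 capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def rbd_du_argv(pool, image):
    """Build the argv for a per-image disk-usage query."""
    return ["rbd", "du", f"{pool}/{image}"]


def probe_pool(pool, images, run=subprocess.run, timeout=300):
    """Run `rbd du` on each image; return (image, reason) for every failure."""
    failures = []
    for image in images:
        try:
            result = run(rbd_du_argv(pool, image),
                         capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            failures.append((image, "timeout"))
            continue
        # A negative return code means the child was killed by a signal,
        # e.g. -6 for SIGABRT after a failed ceph_assert.
        if result.returncode != 0:
            failures.append((image, f"exit {result.returncode}"))
    return failures
```

For example, `probe_pool("images-pubos", list_images("images-pubos"))` would 
list the images in that pool whose `rbd du` call failed or timed out; the pool 
name is the one that appears in the traceback. Running each query in its own 
process means a crash on one image does not stop the scan.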

Full traceback below

Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: asok(0x56360092c000) 
register_command rbd cache flush 
images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b hook 0x56361144ffc0
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: asok(0x56360092c000) 
register_command rbd cache invalidate 
images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b hook 0x56361144ffc0
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: librbd::ImageCtx: 
0x5636176bd000: disabling zero-copy writes
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::Dispatcher: 0x563615cf1170 register_dispatch: dispatch_layer=3
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::cache::WriteAroundObjectDispatch: 0x56360ce5efc0 init:
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::Dispatcher: 0x56360bd64900 register_dispatch: dispatch_layer=1
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::SimpleSchedulerObjectDispatch: 0x5636164b9e00 
SimpleSchedulerObjectDispatch: ictx=0x5636176bd000
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::SimpleSchedulerObjectDispatch: 0x5636164b9e00 init:
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::Dispatcher: 0x56360bd64900 register_dispatch: dispatch_layer=5
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::WriteBlockImageDispatch: 0x5636134e46e0 block_writes: 
0x5636176bd000, num=1
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::Dispatcher: 0x563615cf1170 shut_down_dispatch: dispatch_layer=3
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::WriteBlockImageDispatch: 0x5636134e46e0 unblock_writes: 
0x5636176bd000, num=0
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::WriteBlockImageDispatch: 0x5636134e46e0 block_writes: 
0x5636176bd000, num=1
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
librbd::io::WriteBlockImageDispatch: 0x5636134e46e0 unblock_writes: 
0x5636176bd000, num=0

Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: librbd::DiffIterate: 
fast diff enabled

Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.4/rpm/el9/BUILD/ceph-18.2.4/src/librbd/api/DiffIterate.cc:
 In function 'int librbd::api::DiffIterate<ImageCtxT>::execute() [with 
ImageCtxT = librbd::ImageCtx]' thread 7fd5897ac640 time 
2024-07-25T13:13:15.549430+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.4/rpm/el9/BUILD/ceph-18.2.4/src/librbd/api/DiffIterate.cc:
 341: FAILED ceph_assert(object_diff_state.size() == end_object_no - 
start_object_no)
ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) 
[0x7fd6f5b0104d]
2: /usr/lib64/ceph/libceph-common.so.2(+0x16b20b) [0x7fd6f5b0120b]
3: /lib64/librbd.so.1(+0x193403) [0x7fd6e5ce2403]
4: /lib64/librbd.so.1(+0x51ada7) [0x7fd6e6069da7]
5: rbd_diff_iterate2()
6: /lib64/python3.9/site-packages/rbd.cpython-39-x86_64-linux-gnu.so(+0x630bc) 
[0x7fd6e62f40bc]
7: /lib64/libpython3.9.so.1.0(+0x11d7a1) [0x7fd6f661e7a1]
8: PyVectorcall_Call()
9: /lib64/python3.9/site-packages/rbd.cpython-39-x86_64-linux-gnu.so(+0x44d50) 
[0x7fd6e62d5d50]
10: _PyObject_MakeTpCall()
11: /lib64/libpython3.9.so.1.0(+0x125133) [0x7fd6f6626133]
12: _PyEval_EvalFrameDefault()
13: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
14: _PyFunction_Vectorcall()
15: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
16: _PyEval_EvalFrameDefault()
17: /lib64/libpython3.9.so.1.0(+0x11cb73) [0x7fd6f661db73]
18: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
19: _PyEval_EvalFrameDefault()
20: /lib64/libpython3.9.so.1.0(+0x11cb73) [0x7fd6f661db73]
21: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
22: _PyEval_EvalFrameDefault()
23: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
24: _PyFunction_Vectorcall()
25: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
26: _PyEval_EvalFrameDefault()
27: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
28: _PyFunction_Vectorcall()
29: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
30: _PyEval_EvalFrameDefault()
31: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
Jul 25 09:13:15 ceph00.cecnet.gmu.edu ceph-mgr[2293091]: *** Caught signal 
(Aborted) **
in thread 7fd5897ac640 thread_name:dashboard
ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)
1: /lib64/libc.so.6(+0x3e6f0) [0x7fd6f54aa6f0]
2: /lib64/libc.so.6(+0x8b94c) [0x7fd6f54f794c]
3: raise()
4: abort()
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x188) 
[0x7fd6f5b010a7]
6: /usr/lib64/ceph/libceph-common.so.2(+0x16b20b) [0x7fd6f5b0120b]
7: /lib64/librbd.so.1(+0x193403) [0x7fd6e5ce2403]
8: /lib64/librbd.so.1(+0x51ada7) [0x7fd6e6069da7]
9: rbd_diff_iterate2()
10: /lib64/python3.9/site-packages/rbd.cpython-39-x86_64-linux-gnu.so(+0x630bc) 
[0x7fd6e62f40bc]
11: /lib64/libpython3.9.so.1.0(+0x11d7a1) [0x7fd6f661e7a1]
12: PyVectorcall_Call()
13: /lib64/python3.9/site-packages/rbd.cpython-39-x86_64-linux-gnu.so(+0x44d50) 
[0x7fd6e62d5d50]
14: _PyObject_MakeTpCall()
15: /lib64/libpython3.9.so.1.0(+0x125133) [0x7fd6f6626133]
16: _PyEval_EvalFrameDefault()
17: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
18: _PyFunction_Vectorcall()
19: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
20: _PyEval_EvalFrameDefault()
21: /lib64/libpython3.9.so.1.0(+0x11cb73) [0x7fd6f661db73]
22: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
23: _PyEval_EvalFrameDefault()
24: /lib64/libpython3.9.so.1.0(+0x11cb73) [0x7fd6f661db73]
25: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
26: _PyEval_EvalFrameDefault()
27: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
28: _PyFunction_Vectorcall()
29: /lib64/libpython3.9.so.1.0(+0x125031) [0x7fd6f6626031]
30: _PyEval_EvalFrameDefault()
31: /lib64/libpython3.9.so.1.0(+0x10ec35) [0x7fd6f660fc35]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.
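
One way to pull a candidate image out of a log like the one above is to look 
for the last `register_command rbd cache flush <pool>/<image>` line before the 
first `FAILED ceph_assert`. The sketch below does that in Python. It is a 
heuristic: it assumes the most recently registered image is the one being 
diffed when the assert fires, which the log does not prove. The function name 
is hypothetical.

```python
import re

# The regexes tolerate the hard line wraps seen in the mailed log by
# matching any whitespace (including newlines) between tokens.
FLUSH_RE = re.compile(r"register_command rbd cache flush\s+(\S+)\s+hook")
ASSERT_RE = re.compile(r"FAILED ceph_assert")


def image_before_assert(log_text):
    """Return the last pool/image spec registered before the first failed
    assert, or None if the log contains no failed assert or no such line."""
    hit = ASSERT_RE.search(log_text)
    if hit is None:
        return None
    last = None
    # Only scan the part of the log that precedes the assert.
    for match in FLUSH_RE.finditer(log_text, 0, hit.start()):
        last = match.group(1)
    return last
```

Fed the traceback above, this returns the `images-pubos/...` spec from the 
`rbd cache flush` line, which can then be checked directly with the `rbd` 
command.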


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
