I installed a small Ceph setup in order to test Ceph before building a larger production setup. The setup consists of 3 OSD nodes (with 2 OSDs per node), plus 1 MON daemon and 1 MDS daemon. The monitor runs on one of the OSD nodes and the MDS service runs on the admin node. All nodes run:

Ceph 13.2.6-0.el7.x86_64
CentOS Linux release 7.6.1810 (Core)
While testing the setup I ran into 3 problems:

1. Listing a directory is very slow while I am writing multiple files to that same directory:

[cephuser@xxxx ~]$ time ls /mnt/cephfs/dir2/
dir1  fileD

real    0m35.903s
user    0m0.000s
sys     0m0.002s

The MDS log shows:

2019-09-07 15:11:52.019 7fbae9521700  0 log_channel(cluster) log [WRN] : 2 slow requests, 2 included below; oldest blocked for > 33.475100 secs
2019-09-07 15:11:52.019 7fbae9521700  0 log_channel(cluster) log [WRN] : slow request 33.475099 seconds old, received at 2019-09-07 15:11:18.544642: client_request(client.4394:4818 getattr pAsLsXsFs #0x10000000e65 2019-09-07 15:11:18.543367 caller_uid=1000, caller_gid=1000{}) currently failed to rdlock, waiting
2019-09-07 15:11:52.019 7fbae9521700  0 log_channel(cluster) log [WRN] : slow request 30.370497 seconds old, received at 2019-09-07 15:11:21.649244: client_request(client.4394:4819 getattr pAsLsXsFs #0x10000000e65 2019-09-07 15:11:21.648417 caller_uid=1000, caller_gid=1000{}) currently failed to rdlock, waiting
2019-09-07 15:11:53.107 7fbaebfaf700  1 mds.stor1demo Updating MDS map to version 520 from mon.0
2019-09-07 15:11:57.019 7fbae9521700  0 log_channel(cluster) log [WRN] : 2 slow requests, 0 included below; oldest blocked for > 38.475154 secs
2019-09-07 15:12:05.115 7fbaebfaf700  1 mds.stor1demo Updating MDS map to version 521 from mon.0

However, from the same client node that is doing the writing, the listing is fast:

[cephuser@storage3demo ~]$ time ls /mnt/cephfs/dir2/
dir1  fileD

real    0m0.003s
user    0m0.000s
sys     0m0.003s

2. When writing files the bandwidth is around 110 MB/s, but when reading files it is only about 50-60 MB/s. Why this behavior?

3. I ran reweight-by-utilization twice. After waiting some hours Ceph finished the operation, but now it shows "1/2083316 objects misplaced (0.000%)" all the time. How can I fix it?
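For problem 1, this is a sketch of the commands I can run to inspect the blocked requests on the MDS (the daemon name stor1demo is taken from my MDS map; the commands must be run on the node hosting the MDS, via its admin socket):

```shell
#!/bin/sh
# Overall health, including the slow-request warnings
ceph health detail

# Requests currently in flight / blocked on the MDS
# (run on the node where mds.stor1demo lives)
ceph daemon mds.stor1demo dump_ops_in_flight
ceph daemon mds.stor1demo dump_blocked_ops

# List client sessions, to see which client is holding
# the capabilities on the directory being written to
ceph daemon mds.stor1demo session ls
```

The "failed to rdlock" lines suggest the listing client is waiting for the writing client to release capabilities on the directory, which would also explain why the listing is instant on the writer itself.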
Information about my Ceph cluster:

  cluster:
    id:     ad3f4a27-bceb-4635-acd9-691b792661af
    health: HEALTH_WARN
            1/1980216 objects misplaced (0.000%)

  services:
    mon: 1 daemons, quorum storage1demo
    mgr: storage1demo(active), standbys: storage2demo, storage3demo
    mds: cephfs-1/1/1 up  {0=stor1demo=up:active}
    osd: 6 osds: 6 up, 6 in; 2 remapped pgs

  data:
    pools:   2 pools, 256 pgs
    objects: 990.1 k objects, 3.7 TiB
    usage:   7.4 TiB used, 2.1 TiB / 9.6 TiB avail
    pgs:     1/1980216 objects misplaced (0.000%)
             254 active+clean
             2   active+clean+remapped

  io:
    client:   85 B/s wr, 0 op/s rd, 238 op/s wr

+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
| id | host         |  used | avail | wr ops | wr data | rd ops | rd data | state     |
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | storage1demo | 1522G |  340G |     29 |       0 |      0 |       0 | exists,up |
| 1  | storage1demo |  353G |  112G |     12 |       0 |      0 |       0 | exists,up |
| 2  | storage2demo | 1524G |  338G |     47 |       0 |      0 |       0 | exists,up |
| 3  | storage2demo | 1201G |  661G |     38 |       0 |      0 |       0 | exists,up |
| 4  | storage3demo | 1521G |  341G |     45 |       0 |      0 |       0 | exists,up |
| 5  | storage3demo | 1378G |  484G |     35 |       0 |      0 |       0 | exists,up |
+----+--------------+-------+-------+--------+---------+--------+---------+-----------+

Thanks in advance.
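P.S. For problem 3, this is a sketch of what I can run to identify the 2 active+clean+remapped PGs and to let the mgr balancer finish the data movement instead of further manual reweight-by-utilization runs (I am assuming the upmap mode is usable here, which requires all clients to be at least Luminous):

```shell
#!/bin/sh
# Which PGs are remapped, and how full each OSD is
ceph pg dump pgs_brief | grep remapped
ceph osd df tree

# Hand ongoing rebalancing to the mgr balancer module
ceph mgr module enable balancer
ceph osd set-require-min-compat-client luminous  # needed for upmap mode
ceph balancer mode upmap
ceph balancer on
ceph balancer status
```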
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com