Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-06 Thread Philippe D'Anjou
 I had to use the RocksDB repair tool before because the RocksDB files got
corrupted, for another reason (possibly another bug). Maybe that is why it
crash-loops now, although it ran fine for a day. What is meant by "turn it
off and rebuild from the remainder"?

On Saturday, October 5, 2019 at 02:03:44 OESZ, Gregory Farnum wrote:
 Hmm, that assert means the monitor tried to grab an OSDMap it had on
disk but it didn't work. (In particular, a "pinned" full map which we
kept around after trimming the others to save on disk space.)

That *could* be a bug where we didn't have the pinned map and should
have (or incorrectly thought we should have), but this code was in
Mimic as well as Nautilus and I haven't seen similar reports. So it
could also mean that something bad happened to the monitor's disk or
Rocksdb store. Can you turn it off and rebuild from the remainder, or
do they all exhibit this bug?
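
If there are still healthy monitors, "rebuild from the remainder" is roughly
the following; this is only a sketch (the mon ID is taken from your log, the
paths assume a default package install, adjust to your deployment):

# on the broken monitor's host: stop it and set the damaged store aside
systemctl stop ceph-mon@km-fsn-1-dc4-m1-797678
mv /var/lib/ceph/mon/ceph-km-fsn-1-dc4-m1-797678 \
   /var/lib/ceph/mon/ceph-km-fsn-1-dc4-m1-797678.bad

# from a node that can reach the surviving quorum
ceph mon remove km-fsn-1-dc4-m1-797678
ceph auth get mon. -o /tmp/mon.keyring
ceph mon getmap -o /tmp/monmap

# recreate an empty store; on start it syncs from the remaining monitors
ceph-mon --mkfs -i km-fsn-1-dc4-m1-797678 --monmap /tmp/monmap \
         --keyring /tmp/mon.keyring
systemctl start ceph-mon@km-fsn-1-dc4-m1-797678
# depending on your setup you may also need:
#   ceph mon add km-fsn-1-dc4-m1-797678 <mon-ip>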


On Fri, Oct 4, 2019 at 5:44 AM Philippe D'Anjou wrote:
>
> Hi,
> our mon is acting up all of a sudden and dying in a crash loop with the following:
>
>
> 2019-10-04 14:00:24.339583 lease_expire=0.00 has v0 lc 4549352
>    -3> 2019-10-04 14:00:24.335 7f6e5d461700  5 
>mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4548623..4549352) 
>is_readable = 1 - now=2019-10-04 14:00:24.339620 lease_expire=0.00 has v0 
>lc 4549352
>    -2> 2019-10-04 14:00:24.343 7f6e5d461700 -1 
>mon.km-fsn-1-dc4-m1-797678@0(leader).osd e257349 get_full_from_pinned_map 
>closest pinned map ver 252615 not available! error: (2) No such file or 
>directory
>    -1> 2019-10-04 14:00:24.343 7f6e5d461700 -1 
>/build/ceph-14.2.4/src/mon/OSDMonitor.cc: In function 'int 
>OSDMonitor::get_full_from_pinned_map(version_t, ceph::bufferlist&)' thread 
>7f6e5d461700 time 2019-10-04 14:00:24.347580
> /build/ceph-14.2.4/src/mon/OSDMonitor.cc: 3932: FAILED ceph_assert(err == 0)
>
>  ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus 
>(stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>const*)+0x152) [0x7f6e68eb064e]
>  2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
>char const*, ...)+0) [0x7f6e68eb0829]
>  3: (OSDMonitor::get_full_from_pinned_map(unsigned long, 
>ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
>  4: (OSDMonitor::get_version_full(unsigned long, unsigned long, 
>ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
>  5: (OSDMonitor::encode_trim_extra(std::shared_ptr<MonitorDBStore::Transaction>, unsigned long)+0x8c) [0x717c3c]
>  6: (PaxosService::maybe_trim()+0x473) [0x707443]
>  7: (Monitor::tick()+0xa9) [0x5ecf39]
>  8: (C_MonContext::finish(int)+0x39) [0x5c3f29]
>  9: (Context::complete(int)+0x9) [0x6070d9]
>  10: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
>  11: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
>  12: (()+0x76ba) [0x7f6e67cab6ba]
>  13: (clone()+0x6d) [0x7f6e674d441d]
>
>      0> 2019-10-04 14:00:24.347 7f6e5d461700 -1 *** Caught signal (Aborted) **
>  in thread 7f6e5d461700 thread_name:safe_timer
>
>  ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus 
>(stable)
>  1: (()+0x11390) [0x7f6e67cb5390]
>  2: (gsignal()+0x38) [0x7f6e67402428]
>  3: (abort()+0x16a) [0x7f6e6740402a]
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>const*)+0x1a3) [0x7f6e68eb069f]
>  5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
>char const*, ...)+0) [0x7f6e68eb0829]
>  6: (OSDMonitor::get_full_from_pinned_map(unsigned long, 
>ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
>  7: (OSDMonitor::get_version_full(unsigned long, unsigned long, 
>ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
>  8: (OSDMonitor::encode_trim_extra(std::shared_ptr<MonitorDBStore::Transaction>, unsigned long)+0x8c) [0x717c3c]
>  9: (PaxosService::maybe_trim()+0x473) [0x707443]
>  10: (Monitor::tick()+0xa9) [0x5ecf39]
>  11: (C_MonContext::finish(int)+0x39) [0x5c3f29]
>  12: (Context::complete(int)+0x9) [0x6070d9]
>  13: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
>  14: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
>  15: (()+0x76ba) [0x7f6e67cab6ba]
>  16: (clone()+0x6d) [0x7f6e674d441d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
>
> This was running fine for 2 months; it's a crashed cluster that is in recovery.
>
> Any suggestions?


[ceph-users] Hidden Objects

2019-10-06 Thread Lazuardi Nasution
Hi,

While inspecting a newly installed cluster (Nautilus), I found the following
result. The ssd-test pool is a cache pool for the hdd-test pool. After running
some RBD benchmarks and deleting all the RBD images used for benchmarking,
there are still some hidden objects inside both pools (besides rbd_directory,
rbd_info and rbd_trash). What are they, and how can I clear them?

[root@c10-ctrl ~]# ceph df detail
RAW STORAGE:
    CLASS SIZE    AVAIL   USED    RAW USED %RAW USED
    hdd   256 TiB 255 TiB 1.5 TiB  1.6 TiB      0.62
    ssd   7.8 TiB 7.8 TiB 5.9 GiB   24 GiB      0.30
    TOTAL 264 TiB 262 TiB 1.5 TiB  1.6 TiB      0.61

POOLS:
    POOL     ID STORED OBJECTS USED    %USED MAX AVAIL QUOTA OBJECTS QUOTA BYTES DIRTY USED COMPR UNDER COMPR
    hdd-test  8 16 KiB      15 384 KiB     0   161 TiB          N/A         N/A    15        0 B         0 B
    ssd-test  9 25 MiB  17.57k 2.7 GiB  0.04   2.5 TiB          N/A         N/A     3        0 B         0 B
[root@c10-ctrl ~]# rados -p hdd-test ls
rbd_directory
rbd_info
rbd_trash
[root@c10-ctrl ~]# rados -p ssd-test ls
[root@c10-ctrl ~]# ceph osd dump | grep test
pool 8 'hdd-test' erasure size 6 min_size 5 crush_rule 2 object_hash
rjenkins pg_num 2048 pgp_num 2048 autoscale_mode warn last_change 6701 lfor
1242/6094/6102 flags hashpspool,selfmanaged_snaps tiers 9 read_tier 9
write_tier 9 stripe_width 16384 application rbd
pool 9 'ssd-test' replicated size 3 min_size 2 crush_rule 1 object_hash
rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 6702 lfor
1242/3890/6106 flags hashpspool,incomplete_clones,selfmanaged_snaps tier_of
8 cache_mode writeback target_bytes 1374389534720 hit_set
bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x24
decay_rate 0 search_last_n 0 stripe_width 0
[root@c10-ctrl ~]#
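
One thing I plan to check: as far as I understand, the cache tier keeps its
hit_set archive objects as regular RADOS objects in the cache pool but in a
separate namespace, so they do not show up in a plain "rados ls". With 24 hit
sets retained per PG over 1024 PGs, that could account for most of the 17.57k
objects in ssd-test. A quick way to look, assuming the default
osd_hit_set_namespace of ".ceph-internal":

rados -p ssd-test ls --all                  # objects in all namespaces
rados -p ssd-test -N ".ceph-internal" ls    # only the internal namespace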

Best regards,


[ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
Out of the blue this popped up (on an otherwise healthy cluster):

HEALTH_WARN 1 large omap objects
LARGE_OMAP_OBJECTS 1 large omap objects
1 large objects found in pool 'cephfs_metadata'
Search the cluster log for 'Large omap object found' for more details.

"Search the cluster log" is somewhat opaque, there are logs for many
daemons, what is a "cluster" log? In the ML history some found it in the
OSD logs?
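
Is it the central log the monitors write on the mon hosts? Assuming default
paths, I was going to try something like:

grep 'Large omap object' /var/log/ceph/ceph.log
zgrep 'Large omap object' /var/log/ceph/ceph.log.*.gz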

Another post suggested removing lost+found, but using cephfs-shell I don't
see one at the top level. Is there another way to disable this "feature"?

thanks.


Re: [ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
I followed some other suggested steps and got this:

root@cnx-17:/var/log/ceph# zcat ceph-osd.178.log.?.gz|fgrep Large
2019-10-02 13:28:39.412 7f482ab1c700  0 log_channel(cluster) log [WRN] :
Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head Key
count: 306331 Size (bytes): 13993148
root@cnx-17:/var/log/ceph# ceph daemon osd.178 config show | grep osd_deep_scrub_large_omap
"osd_deep_scrub_large_omap_object_key_threshold": "200000",
"osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824",

root@cnx-11:~# rados -p cephfs_metadata stat 'mds0_openfiles.0'
cephfs_metadata/mds0_openfiles.0 mtime 2019-10-06 23:37:23.00, size 0
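
(The stat shows size 0 because the open-file table lives in omap rather than
in object data; counting the omap keys directly should match the key count
from the scrub warning:)

rados -p cephfs_metadata listomapkeys mds0_openfiles.0 | wc -l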


Re: [ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
I've adjusted the threshold:

ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 35

A colleague suggested that this will take effect on the next deep scrub.
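
Rather than waiting, I believe the warning can be re-evaluated sooner by
deep-scrubbing the PG that holds the object by hand (a sketch; the PG id is
whatever the map command reports):

ceph osd map cephfs_metadata mds0_openfiles.0   # shows which PG the object maps to
ceph pg deep-scrub <pgid>                       # the PG id from the output above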

Is the default of 200,000 too small? Will this be adjusted in future
releases, or is it meant to be tuned for some use cases?