ceph version: 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
I set up a Ceph cluster and I'm uploading objects through rgw at a rate
of 60 objects/s. I added some lifecycle rules to the buckets so that my disks
won't fill up.
However, after I set "debug_rgw" to 5 and run
> /var/log/ceph/ceph.log:2020-02-27 16:18:00.328869 osd.40 (osd.40) 1585 :
> cluster [WRN] Large omap object found. Object:
> 2:654134d2:::mds0_openfiles.0:head PG: 2.4b2c82a6 (2.26) Key count:
> 1048559 Size (bytes): 46407183
> /var/log/ceph/ceph.log-20200227.gz:2020-02-26 19:56:24.972431 osd
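For reference, raising that debug level at runtime looks roughly like this
(a sketch; the daemon name "gateway1" is only a placeholder):

    # Via the central config database, for one named rgw daemon:
    ceph config set client.rgw.gateway1 debug_rgw 5
    # Or through the daemon's admin socket on the rgw host:
    ceph daemon /var/run/ceph/ceph-client.rgw.gateway1.asok config set debug_rgw 5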
Christian;
What is your failure domain? If your failure domain is set to OSD / drive, and
2 OSDs share a DB / WAL device, and that DB / WAL device dies, then portions of
the data could drop to read-only (or be lost...).
Ceph is really set up to own the storage hardware directly. It doesn't
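A quick way to check is to look at the CRUSH rule's chooseleaf step (a
sketch; the rule name below is hypothetical):

    # The failure domain shows up as the "type" in the chooseleaf step:
    ceph osd crush rule dump
    # Creating a replicated rule with host as the failure domain:
    ceph osd crush rule create-replicated rep_host default host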
Hi everyone,
we currently have 6 OSDs with 8TB HDDs split across 3 hosts.
The main usage is KVM images.
To improve speed we planned on putting the block.db and WAL onto NVMe SSDs.
The plan was to put 2x 1TB in each host.
One option I thought of was to RAID 1 them for better redundancy; I don't
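For context, attaching an external block.db at OSD-creation time looks
roughly like this (a sketch; device paths are hypothetical, and the WAL
co-locates on the block.db device when no separate --block.wal is given):

    # Data on the HDD, RocksDB metadata on an NVMe partition:
    ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1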
It seems that one of the down PGs was able to recover just fine, but the other
went "incomplete" after export-and-removing the affected PG from
the down OSD.
I've still got the exported data from the PG, although re-importing it into the
OSD causes the crashes again.
What's the be
FTR, the root cause is now understood:
https://tracker.ceph.com/issues/39525#note-21
-- dan
On Thu, Feb 20, 2020 at 9:24 PM Dan van der Ster wrote:
>
> On Thu, Feb 20, 2020 at 9:20 PM Wido den Hollander wrote:
> >
> > > On 20 Feb. 2020, at 19:54, Dan van der Ster wrote the
> > > following
Hi all,
Could someone make luminous (or nautilus) available for buster (not the
container version)?
What are the reasons for not having these versions available from
eu.ceph.com? What would be the motivation needed to add the packages?
As far as I can see, the curl/libcurl4 version is the only thing needed to
Thanks Sage, I can try that. Admittedly, I'm not sure how to tell if these two
PGs can recover without this particular OSD.
Note that there still seems to be an underlying related issue, with hit set
archives popping up as unfound objects on my cluster, as in Paul's ticket. In
total I had about 1
If the PG in question can recover without that OSD, I would use
ceph-objectstore-tool to export and remove it, and then move on.
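The export-and-remove step, with the OSD stopped, looks roughly like this
(a sketch; the data path and pgid are hypothetical):

    # Export the PG to a file, then remove it from the OSD:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-40 \
        --pgid 1.2f --op export --file /root/pg1.2f.export
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-40 \
        --pgid 1.2f --op remove --force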
I hit a similar issue on my system (due to a bug in an early octopus
build) and it was super tedious to fix up manually (needed patched
code and manual modificat
Alternatively, it might be handy to have the passive mgrs issue an HTTP
redirect to the active mgr. Then a single DNS name pointing to all mgrs
would always work, even when the active mgr fails over.
Going a step further with some HA strategies, the cluster could have a
separate, floating IP/
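Until something like that exists, the active mgr and its service endpoints
can at least be discovered from the cluster itself (a sketch; the JSON line
is illustrative output):

    # Which mgr is currently active:
    ceph mgr stat
    # Where the active mgr's modules are serving:
    ceph mgr services
    # e.g. {"dashboard": "https://mgr1:8443/", "prometheus": "http://mgr1:9283/"}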
Also: make a backup using the PG export feature of objectstore-tool
before doing anything else.
Sometimes it's enough to export and delete the PG from the broken OSD
and import it into a different OSD using objectstore-tool.
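The import half of that, again with the target OSD stopped, would be along
these lines (a sketch; hypothetical paths and pgid):

    # Import the previously exported PG into a healthy OSD:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --op import --file /root/pg1.2f.export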
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
The crash happens in PG::activate, so it's unrelated to I/O etc.
My first approach here would be to read the code and try to understand
why it crashes/what the exact condition is that is violated here.
It looks like something that can probably be fixed by fiddling around
with ceph-objectstore-tool (but
Thanks Paul.
I was able to mark many of the unfound ones as lost, but I'm still stuck with
one unfound object and an OSD assert at this point.
I've tried setting many of the OSD options to pause all cluster I/O,
backfilling, rebalancing, the tiering agent, etc., to try to avoid hitting the
assert, but alas thi
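For reference, the flags and the mark-lost command being described are
roughly the following (a sketch; the pgid is hypothetical):

    # Pause client I/O and data movement:
    ceph osd set pause
    ceph osd set nobackfill
    ceph osd set norebalance
    ceph osd set norecover
    # Give up on unfound objects in a PG (destructive):
    ceph pg 1.2f mark_unfound_lost delete
    # Undo a flag afterwards:
    ceph osd unset pause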
I've also encountered this issue, but luckily without the crashing
OSDs, so marking as lost resolved it for us.
See https://tracker.ceph.com/issues/44286
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
w
Hi Mehmet,
In our case, ceph pg repair fixed the issues (read_error). I think the
read_error was just temporary, due to low available RAM.
You might want to check your actual issue with ceph pg query.
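That would be something along these lines (the pgid is a placeholder):

    # Peering and recovery state for the PG:
    ceph pg 1.2f query
    # List what scrub actually found:
    rados list-inconsistent-obj 1.2f --format=json-pretty
    # Then ask the primary to repair:
    ceph pg repair 1.2f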
Kind regards,
Caspar Smit
Systemengineer
SuperNAS
Dorsvlegelstraat 13
1445 PA Purmerend
t:
Hi all,
A related question: would it be possible to let a passive mgr do the data
collection?
We run 14.2.6 on a medium-size 2.5PB cluster with over 900M objects (rbd and
mainly S3). At the moment, we face an issue with the prometheus exporter when
it is under high load (e.g. while we insert a ne
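For what it's worth, the exporter's latency is easy to measure, and the
module's own collection interval is tunable (a sketch; the host is
hypothetical):

    # Time one scrape of the active mgr's exporter (9283 is the module default):
    time curl -s http://mgr1:9283/metrics > /dev/null
    # Relax the module's internal collection interval if it can't keep up:
    ceph config set mgr mgr/prometheus/scrape_interval 30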