I honestly haven't investigated the command line structure it would need, but that looks like roughly what I'd expect.
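
For what it's worth, here is a rough sketch of how I would expect the offline split to go. I haven't run this myself, so treat it as an outline and check the ceph-objectstore-tool --help output on your version first; the tool flags below are the ones from your own command. The OSD does need to be stopped before the tool can open its object store, and flushing the journal first is a sensible precaution:

    # keep the cluster from rebalancing while the OSD is down
    ceph osd set noout

    # stop the OSD and flush its journal (Upstart syntax for Ubuntu 14.04;
    # on systemd hosts it would be: systemctl stop ceph-osd@${osd_num})
    stop ceph-osd id=${osd_num}
    ceph-osd -i ${osd_num} --flush-journal

    # split the subfolders offline for the one pool that needs it
    ceph-objectstore-tool \
        --data-path /var/lib/ceph/osd/ceph-${osd_num} \
        --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal \
        --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log \
        --op apply-layout-settings \
        --pool default.rgw.buckets.data \
        --debug

    # bring the OSD back and clear the flag once it is up again
    start ceph-osd id=${osd_num}
    ceph osd unset noout

Done one failure domain at a time, with the cluster allowed to settle in between, that should avoid the heartbeat problems mentioned below.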
On Thu, May 11, 2017, 7:58 AM Anton Dmitriev <t...@enumnet.ru> wrote:

> I'm on Jewel 10.2.7.
> Do you mean this:
>
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${osd_num} --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log --op apply-layout-settings --pool default.rgw.buckets.data --debug
>
> ? And before running it, do I need to stop the OSD and flush its journal?
>
> On 11.05.2017 14:52, David Turner wrote:
>
> > If you are on the current release of Ceph Hammer 0.94.10 or Jewel 10.2.7, you have it already. I don't remember which release it came out in, but it's definitely in the current releases.
> >
> > On Thu, May 11, 2017, 12:24 AM Anton Dmitriev <t...@enumnet.ru> wrote:
>>
>> "Recent enough version of the ceph-objectstore-tool" - sounds very interesting. Will it be released in one of the next Jewel minor releases?
>>
>> On 10.05.2017 19:03, David Turner wrote:
>>
>> PG subfolder splitting is the primary reason people are going to be deploying Luminous and Bluestore much faster than any other major release of Ceph. Bluestore removes the concept of subfolders in PGs.
>>
>> I have had clusters that reached what seemed to be a hardcoded maximum of 12,800 objects in a subfolder. It would take an osd_heartbeat_grace of 240 or 300 to let them finish splitting their subfolders without being marked down. Recently I came across a cluster that had a setting of 240 objects per subfolder before splitting, so it was splitting all the time, and several of the OSDs took longer than 30 seconds to finish splitting into subfolders. That led to more problems as backfilling got added on top of everything, and we lost a significant amount of throughput on the cluster.
>>
>> I have yet to manage a cluster with a recent enough version of the ceph-objectstore-tool (hopefully I'll have one this month) that includes the ability to take an OSD offline, split the subfolders, then bring it back online. If you set up a way to monitor how big your subfolders are getting, you can leave the Ceph settings as high as you want, and then go in and perform maintenance on your cluster one failure domain at a time, splitting all of the PG subfolders on the OSDs. This approach would keep splits from ever happening in the wild.
>>
>> On Wed, May 10, 2017 at 5:37 AM Piotr Nowosielski <piotr.nowosiel...@allegrogroup.com> wrote:
>>
>>> It is difficult for me to say exactly why some PGs have not been migrated. Crushmap settings? OSD weights?
>>>
>>> One thing is certain - you will not find any information about the split process in the logs ...
>>>
>>> pn
>>>
>>> -----Original Message-----
>>> From: Anton Dmitriev [mailto:t...@enumnet.ru]
>>> Sent: Wednesday, May 10, 2017 10:14 AM
>>> To: Piotr Nowosielski <piotr.nowosiel...@allegrogroup.com>; ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] All OSD fails after few requests to RGW
>>>
>>> When I created the cluster I made a mistake in the configuration and set the split parameter to 32 and merge to 40, so 32*40*16 = 20480 files per folder. After that I changed split to 8 and increased pg_num and pgp_num from 2048 to 4096 for the pool where the problem occurs.
>>> While it was backfilling I observed that placement groups were backfilling from one set of 3 OSDs to another set of 3 OSDs (replicated size = 3), so I concluded that PGs are completely recreated when pg_num and pgp_num are increased for a pool, and that after this process the number of files per directory should be OK. But when backfilling finished I found many directories in this pool with ~20 000 files. Why did increasing pg_num not help? Or maybe some files will be deleted with some delay after this process?
>>>
>>> I couldn't find any information about the directory split process in the logs, even with osd and filestore debug at 20. What pattern, and in which log, do I need to grep for to find it?
>>>
>>> On 10.05.2017 10:36, Piotr Nowosielski wrote:
>>> > You can:
>>> > - change these parameters and use ceph-objectstore-tool
>>> > - add an OSD host - rebalancing the cluster will reduce the number of files in the directories
>>> > - wait until the "split" operations are over ;-)
>>> >
>>> > In our case we could afford to wait until the "split" operation was over (we have 2 clusters in slightly different configurations storing the same data).
>>> >
>>> > Hint: when creating a new pool, use the parameter "expected_num_objects"
>>> > https://www.suse.com/documentation/ses-4/book_storage_admin/data/ceph_pools_operate.html
>>> >
>>> > Piotr Nowosielski
>>> > Senior Systems Engineer
>>> > Zespół Infrastruktury 5
>>> > Grupa Allegro sp. z o.o.
>>> > Tel: +48 512 08 55 92
>>> >
>>> > -----Original Message-----
>>> > From: Anton Dmitriev [mailto:t...@enumnet.ru]
>>> > Sent: Wednesday, May 10, 2017 9:19 AM
>>> > To: Piotr Nowosielski <piotr.nowosiel...@allegrogroup.com>; ceph-users@lists.ceph.com
>>> > Subject: Re: [ceph-users] All OSD fails after few requests to RGW
>>> >
>>> > How did you solve it? Did you set new split/merge thresholds and then apply them manually on each OSD with:
>>> >
>>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${osd_num} --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log --op apply-layout-settings --pool default.rgw.buckets.data
>>> >
>>> > How can I see in the logs that a split occurs?
>>> >
>>> > On 10.05.2017 10:13, Piotr Nowosielski wrote:
>>> >> Hey,
>>> >> We had similar problems. Look for information on "filestore merge and split".
>>> >>
>>> >> Some explanation: after reaching a certain number of files in a directory (it depends on the 'filestore merge threshold' and 'filestore split multiple' parameters), the OSD rebuilds the structure of that directory. If files keep arriving, the OSD creates new subdirectories and moves some of the files into them. If files are removed, the OSD reduces the number of subdirectories again.
>>> >>
>>> >> --
>>> >> Piotr Nowosielski
>>> >> Senior Systems Engineer
>>> >> Zespół Infrastruktury 5
>>> >> Grupa Allegro sp. z o.o.
>>> >> Tel: +48 512 08 55 92
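
To tie the numbers in this thread together: a filestore subdirectory splits once it holds more than roughly filestore_split_multiple * abs(filestore_merge_threshold) * 16 files, which is where the 32 * 40 * 16 = 20480 figure above comes from; with split lowered to 8, the same pool splits at 8 * 40 * 16 = 5120 files. A minimal ceph.conf sketch of the two knobs (the values shown are Anton's, not a recommendation):

    [osd]
    filestore merge threshold = 40
    filestore split multiple = 8
    # subdirectories split once they exceed ~ 8 * 40 * 16 = 5120 files,
    # and are merged back once the file count falls well below the merge threshold

You can confirm what a running OSD is actually using via its admin socket, e.g.:

    ceph daemon osd.33 config get filestore_split_multiple
    ceph daemon osd.33 config get filestore_merge_threshold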
>>> >> -----Original Message-----
>>> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Anton Dmitriev
>>> >> Sent: Wednesday, May 10, 2017 8:14 AM
>>> >> To: ceph-users@lists.ceph.com
>>> >> Subject: Re: [ceph-users] All OSD fails after few requests to RGW
>>> >>
>>> >> Hi!
>>> >>
>>> >> I increased pg_num and pgp_num for pool default.rgw.buckets.data from 2048 to 4096, and it seems the situation became a bit better: the cluster now dies after 20-30 PUTs instead of after 1. Could someone please give me some recommendations on how to rescue the cluster?
>>> >>
>>> >> On 27.04.2017 09:59, Anton Dmitriev wrote:
>>> >>> The cluster was running well for a long time, but last week OSDs started to fail.
>>> >>> We use the cluster as image storage for OpenNebula with a small load and as object storage with a high load.
>>> >>> Sometimes the disks of some OSDs are utilized at 100%, iostat shows avgqu-sz over 1000 while only a few kilobytes per second are being read or written, the OSDs on these disks become unresponsive, and the cluster marks them down.
>>> >>> We lowered the load on the object storage and the situation got better.
>>> >>>
>>> >>> Yesterday the situation got worse: if the RGWs are disabled and there are no requests to the object storage, the cluster performs well, but if we enable the RGWs and make a few PUTs or GETs, all non-SSD OSDs on all storage nodes end up in the same situation described above.
>>> >>> iotop shows that xfsaild/<disk> is burning the disks.
>>> >>>
>>> >>> trace-cmd record -e xfs\* for 10 seconds shows 10 million objects; as I understand it, that means ~360 000 objects to push per OSD in 10 seconds:
>>> >>> $ wc -l t.t
>>> >>> 10256873 t.t
>>> >>>
>>> >>> Fragmentation on one of these disks is about 3%.
>>> >>>
>>> >>> More information about the cluster:
>>> >>> https://yadi.sk/d/Y63mXQhl3HPvwt
>>> >>>
>>> >>> Also debug logs for osd.33 while the problem occurs:
>>> >>> https://yadi.sk/d/kiqsMF9L3HPvte
>>> >>>
>>> >>> debug_osd = 20/20
>>> >>> debug_filestore = 20/20
>>> >>> debug_tp = 20/20
>>> >>>
>>> >>> Ubuntu 14.04
>>> >>> $ uname -a
>>> >>> Linux storage01 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>>> >>>
>>> >>> Ceph 10.2.7
>>> >>>
>>> >>> 7 storage nodes: Supermicro, 28 OSDs on 4 TB 7200 rpm disks (JBOD) + journals on a RAID10 of 4 Intel 3510 800 GB SSDs + 2 SSD OSDs (Intel 3710 400 GB) for RGW meta and index.
>>> >>> One of these nodes differs only in the number of OSDs: it has 26 OSDs on 4 TB instead of 28.
>>> >>>
>>> >>> Storage nodes are connected to each other by bonded 2x10 Gbit; clients connect to the storage nodes by bonded 2x1 Gbit.
>>> >>>
>>> >>> 5 nodes have 2x E5-2650v2 CPUs and 256 GB RAM; 2 nodes have 2x E5-2690v3 CPUs and 512 GB RAM.
>>> >>>
>>> >>> 7 mons
>>> >>> 3 rgw
>>> >>>
>>> >>> Please help me rescue the cluster.
>>> >>
>>> >> --
>>> >> Dmitriev Anton
>>> >
>>> > --
>>> > Dmitriev Anton
>>>
>>> --
>>> Dmitriev Anton
>>
>> --
>> Dmitriev Anton
>
> --
> Dmitriev Anton
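
PS - on the point above about monitoring how big your subfolders are getting: a crude sketch of the kind of check I have in mind, run directly on an OSD host (the path assumes the default filestore layout under /var/lib/ceph/osd, and osd 0 is just an example - adjust for your hosts):

    # list the ten fullest PG subdirectories on one filestore OSD
    find /var/lib/ceph/osd/ceph-0/current -mindepth 1 -type d \
        -exec sh -c 'printf "%s %s\n" "$(find "$1" -maxdepth 1 -type f | wc -l)" "$1"' _ {} \; \
        | sort -rn | head

If the top counts creep toward your split threshold, that is the cue to schedule the offline split, one failure domain at a time, rather than letting the OSDs hit it during client I/O.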
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com