Hi,
I am trying to understand what happens when an OSD fails.
A few days back I wanted to check what happens when an OSD goes down. To do
that, I went to the node and stopped one of the OSD services. When the OSD
went into the down state, PGs started recovering and after some time
everything s
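For reference, a minimal sketch of that kind of test on a systemd deployment
(the OSD id is a placeholder):

systemctl stop ceph-osd@3     # take one OSD down
ceph osd tree | grep down     # confirm which OSD is marked down
ceph -s                       # watch PGs go degraded and then recover
systemctl start ceph-osd@3    # bring the OSD back when done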
Hello Ceph Users,
We have upgraded all nodes to 12.2.7 now. We have 90 PGs (~2000 scrub errors)
to fix from the time when we ran 12.2.6. It doesn't seem to be affecting
production at this time.
Below is the log of a PG repair. What is the best way to correct these errors?
Is there any further
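For context, the usual inspection and repair sequence looks roughly like this
(pool name and PG id are placeholders, not taken from the log):

rados list-inconsistent-pg <pool>                        # list inconsistent PGs in a pool
rados list-inconsistent-obj <pgid> --format=json-pretty  # show what is wrong in one PG
ceph pg repair <pgid>                                    # ask the primary OSD to repair it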
Hi all,
I am facing a major issue where my OSD is down and not coming up after a
reboot.
These are the last OSD log lines:
2018-07-20 10:43:00.701904 7f02f1b53d80 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1532063580701900, "job": 1, "event": "recovery_finished"}
2018-07-20 10:43:00.735978 7f02f1b53d80
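One hedged next step for a case like this (my suggestion, not from the thread):
rerun the OSD in the foreground with higher debug levels so whatever follows
the rocksdb recovery becomes visible. The OSD id is a placeholder:

journalctl -u ceph-osd@12 --no-pager | tail -n 100        # last messages from the unit
ceph-osd -d --id 12 --debug_osd 20 --debug_bluestore 20   # foreground run, verbose logging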
Hi John:
Thanks for your reply. Yes, here are the details.
ibdev2netdev
mlx4_0 port 1 ==> ib0 (Down)
mlx4_0 port 2 ==> ib1 (Up)
sh show-gids.sh
DEV     PORT    INDEX   GID     IPv4    VER     DEV
---     ----    -----   ---     ----    ---     ---
I've updated the tracker.
On Thu, Jul 19, 2018 at 7:51 PM, Robert Sander
wrote:
> On 19.07.2018 11:15, Ronny Aasen wrote:
>
>> Did you upgrade from 12.2.5 or 12.2.6 ?
>
> Yes.
>
>> sounds like you hit the reason for the 12.2.7 release
>>
>> read : https://ceph.com/releases/12-2-7-luminous-released/
Search the cluster log for 'Large omap object found' for more details.
On Fri, Jul 20, 2018 at 5:13 AM, Brent Kennedy wrote:
> I just upgraded our cluster to 12.2.6 and now I see this warning about 1
> large omap object. I looked and it seems this warning was just added in
> 12.2.6. I found a f
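A short sketch of what "search the cluster log" amounts to in practice (the log
path assumes the default monitor location):

ceph health detail                                      # shows LARGE_OMAP_OBJECTS and the pool
grep 'Large omap object found' /var/log/ceph/ceph.log   # names the PG and the key count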
On 7/18/2018 10:27 PM, Konstantin Shalygin wrote:
So mostly I want to confirm that it is safe to change the crush rule for
the EC pool.
Changing crush rules for a replicated or EC pool is safe.
One thing: when I migrated from multiroot to device classes, I had to
recreate the EC pools and clone
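For illustration, pointing a pool at a different rule is a single command; the
rule, profile and pool names below are placeholders:

ceph osd crush rule create-replicated replicated-ssd default host ssd
ceph osd crush rule create-erasure ec-ssd myprofile   # for an EC pool, the rule comes from a profile
ceph osd pool set mypool crush_rule replicated-ssd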
I believe that the standard mechanisms for launching OSDs already set
the thread cache higher than the default. It's possible we might be able to
relax that now as async messenger doesn't thrash the cache as badly as
simple messenger did. I suspect there's probably still some value to
increasing
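For anyone looking for where that happens: as far as I know it is an environment
variable sourced by the OSD unit files, along these lines (paths and value are
the packaged defaults as I remember them, so treat this as an assumption):

# /etc/sysconfig/ceph (RPM-based) or /etc/default/ceph (Debian-based)
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728   # 128 MB thread cache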
12.2.6 has a regression. See "v12.2.7 Luminous released" and all of the
related disaster posts. Also in the release notes for .7 is a bug
disclosure for 12.2.5 that affects rgw users pretty badly during upgrade.
You might take a look there.
On Thu, Jul 19, 2018 at 2:13 PM Brent Kennedy wrote:
>
I just upgraded our cluster to 12.2.6 and now I see this warning about 1
large omap object. I looked and it seems this warning was just added in
12.2.6. I found a few discussions on what it was but not much information
on addressing it properly. Our cluster uses rgw exclusively with just a few
b
I don't think that's a default recommendation — Ceph is doing more
configuration of tcmalloc these days, tcmalloc has resolved a lot of bugs,
and that was only ever a thing that mattered for SSD-backed OSDs anyway.
-Greg
On Thu, Jul 19, 2018 at 5:50 AM Robert Stanford
wrote:
>
> It seems that t
Yes, I'd love to go with Optanes ... you think 480 GB will be
fine for WAL+DB for 15x12TB, long term? I only hesitate because
I've seen recommendations of "10 GB DB per 1 TB HDD" several times.
How much total HDD capacity do you have per Optane 900P 480GB?
Cheers,
Oliver
On 18.07.2018 10:23,
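For what it's worth, a quick back-of-the-envelope check on that rule of thumb
(my arithmetic, not from the thread):

15 OSDs x 12 TB            = 180 TB of HDD
180 TB x 10 GB DB per TB   = ~1.8 TB of DB by the quoted guideline
480 GB / 15 OSDs           = 32 GB per OSD, i.e. roughly 2.7 GB of DB per TB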
Hi,
I would appreciate any advice (with arguments, if possible) regarding the
best design approach, considering the facts below:
- budget is set to XX amount
- goal is to get as much performance / capacity as possible using XX
- 4 to 6 servers, DELL R620/R630 with 8 disk slots, 64 G RAM and 8 cores
Hi Guys,
We are running a Ceph Luminous 12.2.6 cluster.
The cluster is used both for RBD storage and Ceph Object Storage and has about
742 TB of raw space.
We have an application that pushes snapshots of our VMs through RGW. All seems
to be fine, except that we have a discrepancy between what the S3 A
>>
>> I'm on IRC (as MooingLemur) if more real-time communication would help :)
>
> Sure, I'll try to contact you there. In the meantime could you open up
> a tracker showing the crash stack trace above and a brief description
> of the current situation and the events leading up to it? Could yo
I am following your blog, which is awesome!
Based on your explanation, this is what I am thinking: I have hardware
and some consumer-grade SSDs in stock, so I will build my cluster using
those and keep journal+data on the same SSD. After that I will run
some load tests to see how it performs and lat
>Also, since I see this is a log directory, check that you don't have some
>processes that are holding their log files open even after they're unlinked.
Thank you very much - that was the case.
lsof /mnt/logs | grep deleted
After dealing with these, space was reclaimed in about 2-3min.
Thanks!
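For anyone hitting the same thing, the check and the fix look roughly like this
(the service name is only an example):

lsof +L1 /mnt/logs              # open files with zero links, i.e. deleted but still held
lsof /mnt/logs | grep deleted   # equivalent check, as used above
systemctl restart rsyslog       # restart whichever process holds the old descriptors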
On 18.07.2018 10:23, Linh Vu wrote:
I think the P4600 should be fine, although 2TB is probably way overkill
for 15 OSDs.
Our older nodes use the P3700 400GB for 16 OSDs. I have yet to see the
WAL and DB getting filled up at 2GB/10GB each. Our newer nodes use the
Intel Optane 900P 4
On Thu, Jul 19, 2018 at 1:58 PM Alexander Ryabov
wrote:
> Hello,
>
> I see that free space is not released after files are removed on CephFS.
>
> I'm using Luminous with replica=3 without any snapshots etc and with
> default settings.
>
>
> From client side:
> $ du -sh /mnt/logs/
> 4.1G /mnt/logs
Sounds like the typical configuration is just RocksDB on the
SSD, and both data and WAL on the OSD disk?
Not quite, WAL will be on the fastest available device. If you have
NVMe, SSD and HDD, your command should look something like this:
ceph-volume lvm create --bluestore --data /dev/$HDD --
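A hedged sketch of what the completed command can look like when a separate DB
(and optionally WAL) device is available; the device paths are placeholders:

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
# with three tiers, the WAL can be split out explicitly as well:
# ceph-volume lvm create --bluestore --data /dev/sdb \
#     --block.db /dev/sdc1 --block.wal /dev/nvme0n1p1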
Thank you. Sounds like the typical configuration is just RocksDB on the
SSD, and both data and WAL on the OSD disk?
On Thu, Jul 19, 2018 at 9:00 AM, Eugen Block wrote:
> Hi,
>
> if you have SSDs for RocksDB, you should provide that in the command
> (--block.db $DEV), otherwise Ceph will use th
We're looking to replace our existing RBD cluster, which makes and stores our
backups. At the moment we've got one machine running BackupPC, where the RBD is
mounted, and 8 Ceph nodes.
The idea is to gain in speed and/or pay less (or pay the same for more speed).
We're debating whether to get SSDs in the mix. Have I unde
Hi,
if you have SSDs for RocksDB, you should provide that in the command
(--block.db $DEV), otherwise Ceph will use the one provided disk for
all data and RocksDB/WAL.
Before you create that OSD you probably should check out the help page
for that command, maybe there are more options you s
I am following the steps here:
http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/
The final step is:
ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID
I notice this command doesn't specify a device to use as the journal. Is
it implied that BlueStore will use
Hello,
I see that free space is not released after files are removed on CephFS.
I'm using Luminous with replica=3 without any snapshots etc and with default
settings.
From client side:
$ du -sh /mnt/logs/
4.1G /mnt/logs/
$ df -h /mnt/logs/
Filesystem Size Used Avail Use% Mounted on
h1,h2:
It seems that the Ceph community no longer recommends changing to
jemalloc. However, the same post also recommends doing what's in this email's
subject:
https://ceph.com/geen-categorie/the-ceph-and-tcmalloc-performance-story/
Is it still recommended to increase the tcmalloc thread cache bytes, or is
that
On 19/07/2018 13:28, Satish Patel wrote:
Thanks for the massive detail. So what options do I have? Can I disable the RAID
controller, run the system without RAID, and use software RAID for the OS?
Not sure what kind of RAID controller you have. I seem to recall an HP
thingy? And those I don't trust a
Thanks for the massive detail. So what options do I have? Can I disable the RAID
controller, run the system without RAID, and use software RAID for the OS?
Does that make sense?
Sent from my iPhone
> On Jul 19, 2018, at 6:33 AM, Willem Jan Withagen wrote:
>
>> On 19/07/2018 10:53, Simon Ironside wrot
Hi,
on upgrade from 12.2.4 to 12.2.5 the balancer module broke (the mgr crashed
minutes after the service started).
The only solution was to disable the balancer (the service has been running fine since).
Is this fixed in 12.2.7?
I was unable to locate the bug in bugtracker.
Kevin
2018-07-17 18:28 GMT+02:00 Abhishek
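For reference, the commands involved in turning the balancer off and on again
(a sketch, in case it helps anyone hitting the same crash):

ceph balancer status
ceph balancer off                   # stop balancing without disabling the module
ceph mgr module disable balancer    # or disable the mgr module entirely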
On 19/07/2018 10:53, Simon Ironside wrote:
On 19/07/18 07:59, Dietmar Rieder wrote:
We have P840ar controllers with battery backed cache in our OSD nodes
and configured an individual RAID-0 for each OSD (ceph luminous +
bluestore). We have not seen any problems with this setup so far and
perfor
On 19.07.2018 11:15, Ronny Aasen wrote:
> Did you upgrade from 12.2.5 or 12.2.6 ?
Yes.
> sounds like you hit the reason for the 12.2.7 release
>
> read : https://ceph.com/releases/12-2-7-luminous-released/
>
> there should be features coming in 12.2.8 that can deal with the "objects are
> in sync
Hello again,
It is still early to say that it is working fine now, but it looks like the MDS
memory is now under 20% of RAM, most of the time between 6-9%. Maybe it was a
mistake in the configuration.
As a note, I've changed this client config:
[global]
...
bluestore_cache_size_ssd = 805306360
bluesto
Hi all:
Has anyone successfully set up Ceph with RDMA over IB?
By following the instructions:
(https://community.mellanox.com/docs/DOC-2721)
(https://community.mellanox.com/docs/DOC-2693)
(http://hwchiu.com/2017-05-03-ceph-with-rdma.html)
I'm trying to configure Ceph with the RDMA feature
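For context, the ceph.conf settings those guides revolve around look roughly
like this; the device name is taken from the ibdev2netdev output quoted above,
and the exact option set is to the best of my knowledge, so treat it as a sketch:

[global]
ms_type = async+rdma
ms_async_rdma_device_name = mlx4_0
# the GID to use can be pinned with ms_async_rdma_local_gid if needed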
On 19. juli 2018 10:37, Robert Sander wrote:
Hi,
just a quick warning: We currently see active+clean+inconsistent PGs on
two clusters after upgrading to 12.2.7.
I created http://tracker.ceph.com/issues/24994
Regards
Did you upgrade from 12.2.5 or 12.2.6 ?
sounds like you hit the reason for
On 19/07/18 07:59, Dietmar Rieder wrote:
We have P840ar controllers with battery backed cache in our OSD nodes
and configured an individual RAID-0 for each OSD (ceph luminous +
bluestore). We have not seen any problems with this setup so far and
performance is great at least for our workload.
Hi,
just a quick warning: We currently see active+clean+inconsistent PGs on
two clusters after upgrading to 12.2.7.
I created http://tracker.ceph.com/issues/24994
Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-
Hello,
Finally I had to remove CephFS and use simple NFS, because the MDS daemon
started to use a lot of memory and was unstable. After rebooting one node
because it started to swap (the cluster should be able to survive without a
node), the cluster went down because one of the other MDS daemons started to use
a
Am 19.07.2018 um 08:43 schrieb Linh Vu:
> Since the new NVMes are meant to replace the existing SSDs, why don't you
> assign class "ssd" to the new NVMe OSDs? That way you don't need to change
> the existing OSDs nor the existing crush rule. And the new NVMe OSDs won't
> lose any performance, "s
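The reassignment Linh describes is, if I understand it correctly, roughly this
(OSD ids are placeholders):

ceph osd crush rm-device-class osd.42          # clear the auto-detected class (nvme)
ceph osd crush set-device-class ssd osd.42     # assign class "ssd" so existing rules match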
> > opts="--randrepeat=1 --ioengine=rbd --direct=1 --numjobs=${numjobs}
> > --gtod_reduce=1 --name=test --pool=${pool} --rbdname=${vol} --invalidate=0
> > --bs=4k --iodepth=64 --time_based --runtime=$time --group_reporting"
> >
>
> So that "--numjobs" parameter is what I was referring to when I sa
Am 19.07.2018 um 05:57 schrieb Konstantin Shalygin:
>> Now my first question is:
>> 1) Is there a way to specify "take default class (ssd or nvme)"?
>>Then we could just do this for the migration period, and at some point
>> remove "ssd".
>>
>> If multi-device-class in a crush rule is not s
Hi, Troy Ablan!
On that day, you wrote...
> Even worse, the P410i doesn't appear to support a pass-thru (JBOD/HBA)
> mode, so your only sane option for using this card is to create RAID-0s.
I confirm. Even worse, the P410i can define a maximum of 2 'arrays' (even a
fake array composed of one disk
On 07/19/2018 04:44 AM, Satish Patel wrote:
> If I have 8 OSD drives in a server with a P410i RAID controller (HP), and I
> want to use this server as an OSD node, how should I configure RAID in that
> case?
>
> 1. Put all drives in RAID-0?
> 2. Put each individual HDD in RAID-0 and create 8 individual RAID-0
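If option 2 is chosen, the per-disk RAID-0 volumes on HP Smart Array controllers
are usually created with ssacli/hpssacli; I'm not sure of the exact drive
addressing on a P410i, so treat this purely as an assumption:

ssacli ctrl slot=0 create type=ld drives=1I:1:1 raid=0   # one single-drive RAID-0 per disk
ssacli ctrl slot=0 create type=ld drives=1I:1:2 raid=0   # ... repeat for each bay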