Hi,
I am trying to understand what happens when an OSD fails.
A few days back I wanted to check what happens when an OSD goes down. To do
that, I went to the node and stopped one of the OSD services. When the OSD
went into the down state, PGs started recovering and after some time
everything s
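For reference, a minimal sketch of that kind of test on a systemd deployment
(the OSD id is a placeholder):

systemctl stop ceph-osd@3     # take one OSD down
ceph osd tree | grep down     # confirm which OSD is marked down
ceph -s                       # watch PGs go degraded and then recover
systemctl start ceph-osd@3    # bring the OSD back when done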
Hello Ceph Users,
We have upgraded all nodes to 12.2.7 now. We have 90 PGs (~2000 scrub errors)
to fix from the time when we ran 12.2.6. It doesn't seem to be affecting
production at this time.
Below is the log of a PG repair. What is the best way to correct these errors?
Is there any further
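For context, the usual inspection and repair sequence looks roughly like this
(pool name and PG id are placeholders, not taken from the log):

rados list-inconsistent-pg <pool>                        # list inconsistent PGs in a pool
rados list-inconsistent-obj <pgid> --format=json-pretty  # show what is wrong in one PG
ceph pg repair <pgid>                                    # ask the primary OSD to repair it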
Hi all,
I am facing a major issue where my OSD is down and not coming up after a
reboot.
These are the last OSD log lines:
2018-07-20 10:43:00.701904 7f02f1b53d80 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1532063580701900, "job": 1, "event": "recovery_finished"}
2018-07-20 10:43:00.735978 7f02f1b53d80
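One hedged next step for a case like this (my suggestion, not from the thread):
rerun the OSD in the foreground with higher debug levels so whatever follows
the rocksdb recovery becomes visible. The OSD id is a placeholder:

journalctl -u ceph-osd@12 --no-pager | tail -n 100        # last messages from the unit
ceph-osd -d --id 12 --debug_osd 20 --debug_bluestore 20   # foreground run, verbose logging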
Hi John:
Thanks for your reply. Yes, here are the details.
ibdev2netdev
mlx4_0 port 1 ==> ib0 (Down)
mlx4_0 port 2 ==> ib1 (Up)
sh show-gids.sh
DEV     PORT    INDEX   GID     IPv4    VER     DEV
---     ----    -----   ---     ----    ---     ---
I've updated the tracker.
On Thu, Jul 19, 2018 at 7:51 PM, Robert Sander
wrote:
> On 19.07.2018 11:15, Ronny Aasen wrote:
>
>> Did you upgrade from 12.2.5 or 12.2.6 ?
>
> Yes.
>
>> sounds like you hit the reason for the 12.2.7 release
>>
>> read : https://ceph.com/releases/12-2-7-luminous-released/
Search the cluster log for 'Large omap object found' for more details.
On Fri, Jul 20, 2018 at 5:13 AM, Brent Kennedy wrote:
> I just upgraded our cluster to 12.2.6 and now I see this warning about 1
> large omap object. I looked and it seems this warning was just added in
> 12.2.6. I found a f
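A short sketch of what "search the cluster log" amounts to in practice (the log
path assumes the default monitor location):

ceph health detail                                      # shows LARGE_OMAP_OBJECTS and the pool
grep 'Large omap object found' /var/log/ceph/ceph.log   # names the PG and the key count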
On 7/18/2018 10:27 PM, Konstantin Shalygin wrote:
So mostly I want to confirm that it is safe to change the crush rule for
the EC pool.
Changing crush rules for a replicated or EC pool is safe.
One thing: when I migrated from multiroot to device classes, I had to
recreate the EC pools and clone
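For illustration, pointing a pool at a different rule is a single command; the
rule, profile and pool names below are placeholders:

ceph osd crush rule create-replicated replicated-ssd default host ssd
ceph osd crush rule create-erasure ec-ssd myprofile   # for an EC pool, the rule comes from a profile
ceph osd pool set mypool crush_rule replicated-ssd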
I believe that the standard mechanisms for launching OSDs already set
the thread cache higher than the default. It's possible we might be able to
relax that now as async messenger doesn't thrash the cache as badly as
simple messenger did. I suspect there's probably still some value to
increasing
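For anyone looking for where that happens: as far as I know it is an environment
variable sourced by the OSD unit files, along these lines (paths and value are
the packaged defaults as I remember them, so treat this as an assumption):

# /etc/sysconfig/ceph (RPM-based) or /etc/default/ceph (Debian-based)
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728   # 128 MB thread cache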
12.2.6 has a regression. See "v12.2.7 Luminous released" and all of the
related disaster posts. Also in the release notes for .7 is a bug
disclosure for 12.2.5 that affects rgw users pretty badly during upgrade.
You might take a look there.
On Thu, Jul 19, 2018 at 2:13 PM Brent Kennedy wrote:
>
I just upgraded our cluster to 12.2.6 and now I see this warning about 1
large omap object. I looked and it seems this warning was just added in
12.2.6. I found a few discussions on what it was but not much information
on addressing it properly. Our cluster uses rgw exclusively with just a few
b
I don't think that's a default recommendation — Ceph is doing more
configuration of tcmalloc these days, tcmalloc has resolved a lot of bugs,
and that was only ever a thing that mattered for SSD-backed OSDs anyway.
-Greg
On Thu, Jul 19, 2018 at 5:50 AM Robert Stanford
wrote:
>
> It seems that t
Yes, I'd love to go with Optanes ... you think 480 GB will be
fine for WAL+DB for 15x12TB, long term? I only hesitate because
I've seen recommendations of "10 GB DB per 1 TB HDD" several times.
How much total HDD capacity do you have per Optane 900P 480GB?
Cheers,
Oliver
On 18.07.2018 10:23,
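For what it's worth, a quick back-of-the-envelope check on that rule of thumb
(my arithmetic, not from the thread):

15 OSDs x 12 TB            = 180 TB of HDD
180 TB x 10 GB DB per TB   = ~1.8 TB of DB by the quoted guideline
480 GB / 15 OSDs           = 32 GB per OSD, i.e. roughly 2.7 GB of DB per TB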
Hi,
I would appreciate any advice (with arguments, if possible) regarding the
best design approach, considering the facts below:
- budget is set to XX amount
- goal is to get as much performance / capacity as possible using XX
- 4 to 6 servers, DELL R620/R630 with 8 disk slots, 64 G RAM and 8 cores
Hi Guys,
We are running a Ceph Luminous 12.2.6 cluster.
The cluster is used both for RBD storage and Ceph Object Storage and has about
742 TB of raw space.
We have an application that pushes snapshots of our VMs through RGW. All seems
to be fine, except that we have a discrepancy between what the S3 A
>>
>> I'm on IRC (as MooingLemur) if more real-time communication would help :)
>
> Sure, I'll try to contact you there. In the meantime could you open up
> a tracker showing the crash stack trace above and a brief description
> of the current situation and the events leading up to it? Could yo
I am following your blog, which is awesome!
Based on your explanation, this is what I am thinking: I have hardware
and some consumer-grade SSDs in stock, so I will build my cluster using
those and keep journal+data on the same SSD. After that I will run
some load tests to see how it performs and lat
>Also, since I see this is a log directory, check that you don't have some
>processes that are holding their log files open even after they're unlinked.
Thank you very much - that was the case.
lsof /mnt/logs | grep deleted
After dealing with these, space was reclaimed in about 2-3min.
Thanks!
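For anyone hitting the same thing, the check and the fix look roughly like this
(the service name is only an example):

lsof +L1 /mnt/logs              # open files with zero links, i.e. deleted but still held
lsof /mnt/logs | grep deleted   # equivalent check, as used above
systemctl restart rsyslog       # restart whichever process holds the old descriptors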
On 18.07.2018 10:23, Linh Vu wrote:
I think the P4600 should be fine, although 2TB is probably way overkill
for 15 OSDs.
Our older nodes use the P3700 400GB for 16 OSDs. I have yet to see the
WAL and DB getting filled up at 2GB/10GB each. Our newer nodes use the
Intel Optane 900P 4
On Thu, Jul 19, 2018 at 1:58 PM Alexander Ryabov
wrote:
> Hello,
>
> I see that free space is not released after files are removed on CephFS.
>
> I'm using Luminous with replica=3 without any snapshots etc and with
> default settings.
>
>
> From client side:
> $ du -sh /mnt/logs/
> 4.1G /mnt/logs
Sounds like the typical configuration is just RocksDB on the
SSD, and both data and WAL on the OSD disk?
Not quite, WAL will be on the fastest available device. If you have
NVMe, SSD and HDD, your command should look something like this:
ceph-volume lvm create --bluestore --data /dev/$HDD --
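A hedged sketch of what the completed command can look like when a separate DB
(and optionally WAL) device is available; the device paths are placeholders:

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
# with three tiers, the WAL can be split out explicitly as well:
# ceph-volume lvm create --bluestore --data /dev/sdb \
#     --block.db /dev/sdc1 --block.wal /dev/nvme0n1p1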
Thank you. Sounds like the typical configuration is just RocksDB on the
SSD, and both data and WAL on the OSD disk?
On Thu, Jul 19, 2018 at 9:00 AM, Eugen Block wrote:
> Hi,
>
> if you have SSDs for RocksDB, you should provide that in the command
> (--block.db $DEV), otherwise Ceph will use th
We're looking to replace our existing RBD cluster, which makes and stores our
backups. At the moment we've got one machine running BackupPC, where the RBD is
mounted, and 8 Ceph nodes.
The idea is to gain in speed and/or pay less (or pay the same for more speed).
We're debating whether to get SSDs in the mix. Have I unde
Hi,
if you have SSDs for RocksDB, you should provide that in the command
(--block.db $DEV), otherwise Ceph will use the one provided disk for
all data and RocksDB/WAL.
Before you create that OSD you probably should check out the help page
for that command, maybe there are more options you s
I am following the steps here:
http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/
The final step is:
ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID
I notice this command doesn't specify a device to use as the journal. Is
it implied that BlueStore will use
Hello,
I see that free space is not released after files are removed on CephFS.
I'm using Luminous with replica=3 without any snapshots etc and with default
settings.
From client side:
$ du -sh /mnt/logs/
4.1G /mnt/logs/
$ df -h /mnt/logs/
Filesystem Size Used Avail Use% Mounted on
h1,h2:
It seems that the Ceph community no longer recommends changing to
jemalloc. However, the same post also recommends doing what's in this email's
subject:
https://ceph.com/geen-categorie/the-ceph-and-tcmalloc-performance-story/
Is it still recommended to increase the tcmalloc thread cache bytes, or is
that
On 19/07/2018 13:28, Satish Patel wrote:
Thanks for the massive detail. So what options do I have? Can I disable the RAID
controller, run the system without RAID, and use software RAID for the OS?
Not sure what kind of RAID controller you have. I seem to recall an HP
thingy? And those I don't trust a
Thanks for the massive detail. So what options do I have? Can I disable the RAID
controller, run the system without RAID, and use software RAID for the OS?
Does that make sense?
Sent from my iPhone
> On Jul 19, 2018, at 6:33 AM, Willem Jan Withagen wrote:
>
>> On 19/07/2018 10:53, Simon Ironside wrot
Hi,
on upgrade from 12.2.4 to 12.2.5 the balancer module broke (the mgr crashed
minutes after the service started).
The only solution was to disable the balancer (the service has been running fine since).
Is this fixed in 12.2.7?
I was unable to locate the bug in bugtracker.
Kevin
2018-07-17 18:28 GMT+02:00 Abhishek
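For reference, the commands involved in turning the balancer off and on again
(a sketch, in case it helps anyone hitting the same crash):

ceph balancer status
ceph balancer off                   # stop balancing without disabling the module
ceph mgr module disable balancer    # or disable the mgr module entirely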
On 19/07/2018 10:53, Simon Ironside wrote:
On 19/07/18 07:59, Dietmar Rieder wrote:
We have P840ar controllers with battery backed cache in our OSD nodes
and configured an individual RAID-0 for each OSD (ceph luminous +
bluestore). We have not seen any problems with this setup so far and
perfor
On 19.07.2018 11:15, Ronny Aasen wrote:
> Did you upgrade from 12.2.5 or 12.2.6 ?
Yes.
> sounds like you hit the reason for the 12.2.7 release
>
> read : https://ceph.com/releases/12-2-7-luminous-released/
>
> there should be features coming in 12.2.8 that can deal with the "objects are
> in sync
Hello again,
It is still early to say that it is working fine now, but it looks like the MDS
memory is now under 20% of RAM, most of the time between 6-9%. Maybe it was a
mistake in the configuration.
As a note, I've changed this client config:
[global]
...
bluestore_cache_size_ssd = 805306360
bluesto
Hi all:
Has anyone successfully set up Ceph with RDMA over IB?
By following the instructions:
(https://community.mellanox.com/docs/DOC-2721)
(https://community.mellanox.com/docs/DOC-2693)
(http://hwchiu.com/2017-05-03-ceph-with-rdma.html)
I'm trying to configure Ceph with the RDMA feature
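For context, the ceph.conf settings those guides revolve around look roughly
like this; the device name is taken from the ibdev2netdev output quoted above,
and the exact option set is to the best of my knowledge, so treat it as a sketch:

[global]
ms_type = async+rdma
ms_async_rdma_device_name = mlx4_0
# the GID to use can be pinned with ms_async_rdma_local_gid if needed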
On 19. juli 2018 10:37, Robert Sander wrote:
Hi,
just a quick warning: We currently see active+clean+inconsistent PGs on
two clusters after upgrading to 12.2.7.
I created http://tracker.ceph.com/issues/24994
Regards
Did you upgrade from 12.2.5 or 12.2.6 ?
sounds like you hit the reason for
On 19/07/18 07:59, Dietmar Rieder wrote:
We have P840ar controllers with battery backed cache in our OSD nodes
and configured an individual RAID-0 for each OSD (ceph luminous +
bluestore). We have not seen any problems with this setup so far and
performance is great at least for our workload.
Hi,
just a quick warning: We currently see active+clean+inconsistent PGs on
two clusters after upgrading to 12.2.7.
I created http://tracker.ceph.com/issues/24994
Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-
Hello,
Finally I had to remove CephFS and use simple NFS, because the MDS daemon
started to use a lot of memory and was unstable. After rebooting one node
because it started to swap (the cluster should be able to survive without a
node), the cluster went down because one of the other MDS daemons started to use
a
Am 19.07.2018 um 08:43 schrieb Linh Vu:
> Since the new NVMes are meant to replace the existing SSDs, why don't you
> assign class "ssd" to the new NVMe OSDs? That way you don't need to change
> the existing OSDs nor the existing crush rule. And the new NVMe OSDs won't
> lose any performance, "s
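The reassignment Linh describes is, if I understand it correctly, roughly this
(OSD ids are placeholders):

ceph osd crush rm-device-class osd.42          # clear the auto-detected class (nvme)
ceph osd crush set-device-class ssd osd.42     # assign class "ssd" so existing rules match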
> > opts="--randrepeat=1 --ioengine=rbd --direct=1 --numjobs=${numjobs}
> > --gtod_reduce=1 --name=test --pool=${pool} --rbdname=${vol} --invalidate=0
> > --bs=4k --iodepth=64 --time_based --runtime=$time --group_reporting"
> >
>
> So that "--numjobs" parameter is what I was referring to when I sa
Am 19.07.2018 um 05:57 schrieb Konstantin Shalygin:
>> Now my first question is:
>> 1) Is there a way to specify "take default class (ssd or nvme)"?
>>Then we could just do this for the migration period, and at some point
>> remove "ssd".
>>
>> If multi-device-class in a crush rule is not s
Hi, Troy Ablan!
On that day, you wrote...
> Even worse, the P410i doesn't appear to support a pass-thru (JBOD/HBA)
> mode, so your only sane option for using this card is to create RAID-0s.
I confirm. Even worse, the P410i can define a maximum of 2 'arrays' (even a
fake array composed of one disk
On 07/19/2018 04:44 AM, Satish Patel wrote:
> If I have 8 OSD drives in a server with a P410i RAID controller (HP), and I
> want to use this server as an OSD node, how should I configure RAID in that
> case?
>
> 1. Put all drives in RAID-0?
> 2. Put each individual HDD in RAID-0 and create 8 individual RAID-0
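If option 2 is chosen, the per-disk RAID-0 volumes on HP Smart Array controllers
are usually created with ssacli/hpssacli; I'm not sure of the exact drive
addressing on a P410i, so treat this purely as an assumption:

ssacli ctrl slot=0 create type=ld drives=1I:1:1 raid=0   # one single-drive RAID-0 per disk
ssacli ctrl slot=0 create type=ld drives=1I:1:2 raid=0   # ... repeat for each bay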