[ceph-users] Re: Persistent problem with slow metadata

2020-08-26 Thread Eugen Block

Hi,


root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue
wpq
root@0cephosd01:~# ceph config get mds.cephosd01 osd_op_queue_cut_off
high


just to make sure: I was referring to the OSD settings, not the MDS settings - maybe check again?
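
For example, the OSD-side values can be checked against a specific OSD
(osd.0 is just a placeholder here):

   ceph config get osd.0 osd_op_queue
   ceph config get osd.0 osd_op_queue_cut_off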

I wouldn't focus too much on the MDS service; 64 GB RAM should be  
enough, but you could and should also check the actual RAM usage, of  
course. In our case it's pretty clear that the hard disks are the  
bottleneck, although we have RocksDB on SSD for all OSDs. We seem to  
have a similar use case (we have nightly compile jobs running in  
cephfs), just with fewer clients. Our HDDs are saturated, especially if  
we also run deep-scrubs during the night, but the slow requests have  
been reduced since we changed the osd_op_queue settings for our OSDs.


Have you checked your disk utilization?
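
If not, something like iostat (from the sysstat package) on the OSD hosts
gives a quick impression, e.g.:

   iostat -x 5

and then watching the %util and r_await/w_await columns of the OSD disks
during the backup cleanup window.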

Regards,
Eugen


Zitat von Momčilo Medić :


Hi friends,

I was re-reading the documentation[1] when I noticed that 64GiB of RAM
should suffice even for 1000 clients.
That really makes our issue that much more difficult to troubleshoot.

There are no assumptions I can make that encompass all of the
details I observe.
With no assumptions, there is nothing to test and nothing to start from -
it feels like a dead end.

If anyone has any ideas that we could explore and look into, I'd
appreciate it.

We made little to no configuration changes, and we believe we followed
all the best practices.
The cluster is by no means under extreme stress; I would even argue that it
is a very dormant one.

For the time being, automated cleanup of outdated backups is disabled,
and is to be performed manually.

[1]
https://docs.ceph.com/docs/master/cephfs/add-remove-mds/#provisioning-hardware-for-an-mds

Kind regards,
Momo.

On Mon, 2020-08-24 at 16:39 +0200, Momčilo Medić wrote:

Hi Eugen,

On Mon, 2020-08-24 at 14:26 +, Eugen Block wrote:
> Hi,
>
> there have been several threads about this topic [1], most likely
> it's
> the metadata operation during the cleanup that saturates your
> disks.
>
> The recommended settings seem to be:
>
> [osd]
> osd op queue = wpq
> osd op queue cut off = high

Yeah, I've stumbled upon those settings recently.
However, it seems to be the default nowadays...

root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue
wpq
root@0cephosd01:~# ceph config get mds.cephosd01 osd_op_queue_cut_off
high
root@cephosd01:~#

I do appreciate your input anyway.

> This helped us a lot, the number of slow requests has decreased
> significantly.
>
> Regards,
> Eugen
>
>
> [1]
>  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/MK672ROJSW3X56PC2KWOK2GX7ENQP2LS/#FF3FMP5EEMOBCXAYB4ZVFIAAN6U4IRS3

>
>
> Zitat von Momčilo Medić :
>
> > Hi friends,
> >
> > Since deployment of our Ceph cluster we've been plagued by slow
> > metadata error.
> > Namely, cluster goes into HEALTH_WARN with a message similar to
> > this
> > one:
> >
> > 2 MDSs report slow metadata IOs
> > 1 MDSs report slow requests
> > 1 slow ops, oldest one blocked for 32 sec, daemons [osd.22,osd.4]
> > have
> > slow ops.
> >
> > Here is a brief overview of our setup:
> > - 7 OSD nodes with 6 OSD drives each
> > - three of those are also monitors, managers and MDS
> > - there is a single Ceph client (at the moment)
> > - there is only CephFS being used (at the moment)
> > - metadata for CephFS is on HDD (was on HDD, but we moved it as
> > suggested - no improvement)
> >
> > Our expectation is that this is not a RAM issue, as we have 64GiB
> > of memory and it is never fully utilized.
> >
> > It might be a CPU problem, as the issue happens mostly during high
> > loads (load of ~12 on an 8-core Intel Xeon Bronze 3106).
> > However, the load is present on all OSD nodes, not just MDS ones.
> >
> > Cluster is used for (mostly nightly) backups and has no critical
> > performance requirement.
> > Interestingly, significant load across all nodes appears when
> > running
> > cleanup of outdated backups.
> > This boils down to mostly truncating files and some removal, but it
> > is usually a small number of large files.
> >
> > Below you can find an example of "dump_ops_in_flight" output during
> > the problem (which you may find useful - I couldn't make sense out
> > of it).
> >
> > Should we invest in more powerful CPU hardware (or should we move
> > MDS roles to more powerful nodes)?
> >
> > Please let me know if I can share any more information to help
> > resolve
> > this thing.
> >
> > Thanks in advance!
> >
> > Kind regards,
> > Momo.
> >
> > ===
> >
> > {
> > "ops": [
> > {
> > "description":
> > "client_request(client.22661659:706483006
> > create #0x1002742/a-random-file 2020-08-
> > 23T23:09:33.919740+0200
> > caller_uid=117, caller_gid=121{})",
> > "initiated_at": "2020-08-23T23:09:33.926509+0200",
> > "age": 30.193027896,
> > "duration": 30.19308393401,
> > "type_data": {
> > "flag_point": "failed to 

[ceph-users] Re: Undo ceph osd destroy?

2020-08-26 Thread Eugen Block

Hi,

I don't know if the ceph version is relevant here but I could undo  
that quite quickly in my small test cluster (Octopus native, no docker).
After the OSD was marked as "destroyed" I recreated the auth caps for  
that OSD_ID (marking as destroyed removes cephx keys etc.), changed  
the keyring in /var/lib/ceph/osd/ceph-1/keyring to reflect that and  
restarted the OSD, now it's up and in again. Is the OSD in your case  
actually up and running?
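
A rough sketch of those steps (osd.1 as the example id, non-containerized
systemd units; compare the caps against "ceph auth ls" output for a healthy
OSD before applying):

   ceph auth get-or-create osd.1 mon 'allow profile osd' \
     mgr 'allow profile osd' osd 'allow *'
   ceph auth get osd.1 -o /var/lib/ceph/osd/ceph-1/keyring
   systemctl restart ceph-osd@1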


Regards,
Eugen


Zitat von Michael Fladischer :


Hi,

I accidentally destroyed the wrong OSD in my cluster. It is now  
marked as "destroyed" but the HDD is still there and the data was not  
touched AFAICT. I was able to activate it again using ceph-volume lvm  
activate and I can mark the OSD as "in", but its status is not  
changing from "destroyed".


Is there a way to unmark it so I can reintegrate it in the cluster?

Regards,
Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] can not remove orch service

2020-08-26 Thread Ml Ml
Hello,

root@ceph02:~# ceph orch ps
NAME  HOSTSTATUS  REFRESHED  AGE
VERSIONIMAGE NAME   IMAGE ID  CONTAINER ID
mgr.ceph01ceph01  running (18m)   6s ago 4w
15.2.4 docker.io/ceph/ceph:v15.2.4  54fa7e66fb03  7deebe09f6fd
(...)
mgr.cph02 ceph02  error   3s ago 4w
  docker.io/ceph/ceph:v15.2.4   

I must have been drunk when I added "mgr.cph02".

Now I am sober again, but I get:

root@ceph02:~# ceph orch rm mgr.cph02 --force
Failed to remove service.  was not found.

:-(

Any hint?
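
(In case it is relevant: "ceph orch rm" removes whole service specs, while a
single stray daemon like this one would normally be removed with the daemon
command, e.g.

   ceph orch daemon rm mgr.cph02 --force

Whether that works for this broken entry is another question.)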

Cheers,
Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs needs access from two networks

2020-08-26 Thread Simon Sutter
Hello,

So I know, the mon services can only bind to one IP.
But I have to make it accessible to two networks, because internal and external 
servers have to mount the cephfs.
The internal IP is 10.99.10.1 and the external is some public IP.
I tried nat'ing it with this: "firewall-cmd --zone=public 
--add-forward-port=port=6789:proto=tcp:toport=6789:toaddr=10.99.10.1 --permanent"

So the NAT is working, because I get a "ceph v027" (along with some 
gibberish) when I do a telnet "telnet *public-ip* 6789".
But when I try to mount it, I get just a timeout:
mount -t ceph *public-ip*:6789:/testing /mnt -o 
name=test,secretfile=/root/ceph.client.test.key
mount error 110 = Connection timed out

The tcpdump also recognizes a "Ceph Connect" packet, coming from the mon.

How can I get around this problem?
Is there something I have missed?

Specs:
Latest Octopus 15.2.4
Centos 8
8 Nodes
No health warnings.

Thanks in advance,
Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RandomCrashes on OSDs Attached to Mon Hosts with Octopus

2020-08-26 Thread Denis Krienbühl
Hi!

We've recently upgraded all our clusters from Mimic to Octopus (15.2.4). Since
then, our largest cluster is experiencing random crashes on OSDs attached to the
mon hosts.

This is the crash we are seeing (cut for brevity, see links in post scriptum):

   {
   "ceph_version": "15.2.4",
   "utsname_release": "4.15.0-72-generic",
   "assert_condition": "r == 0",
   "assert_func": "void BlueStore::_txc_apply_kv(BlueStore::TransContext*, 
bool)",
   "assert_file": "/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
",
   "assert_line": 11430,
   "assert_thread_name": "bstore_kv_sync",
   "assert_msg": "/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
: In function 'void 
BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fc56311a700 
time 
2020-08-26T08:52:24.917083+0200\n/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc
 : 11430: FAILED ceph_assert(r == 0)\n",
   "backtrace": [
   "(()+0x12890) [0x7fc576875890]",
   "(gsignal()+0xc7) [0x7fc575527e97]",
   "(abort()+0x141) [0x7fc575529801]",
   "(ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1a5) [0x559ef9ae97b5]",
   "(ceph::__ceph_assertf_fail(char const*, char const*, int, char 
const*, char const*, ...)+0) [0x559ef9ae993f]",
   "(BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x3a0) 
[0x559efa0245b0]",
   "(BlueStore::_kv_sync_thread()+0xbdd) [0x559efa07745d]",
   "(BlueStore::KVSyncThread::entry()+0xd) [0x559efa09cd3d]",
   "(()+0x76db) [0x7fc57686a6db]",
   "(clone()+0x3f) [0x7fc57560a88f]"
   ]
   }

Right before the crash occurs, we see the following message in the crash log:

   -3> 2020-08-26T08:52:24.787+0200 7fc569b2d700  2 rocksdb: 
[db/db_impl_compaction_flush.cc:2212 
] Waiting after background compaction 
error: Corruption: block checksum mismatch: expected 2548200440, got 2324967102 
 in db/815839.sst offset 67107066 size 3808, Accumulated background error 
counts: 1
   -2> 2020-08-26T08:52:24.852+0200 7fc56311a700 -1 rocksdb: submit_common 
error: Corruption: block checksum mismatch: expected 2548200440, got 2324967102 
 in db/815839.sst offset 67107066 size 3808 code = 2 Rocksdb transaction:

In short, we see a Rocksdb corruption error after background compaction, when 
this happens.

When an OSD crashes, which happens about 10-15 times a day, it restarts and
resumes work without any further problems.

We are pretty confident that this is not a hardware issue, due to the following 
facts:

* The crashes occur on 5 different hosts over 3 different racks.
* There is no smartctl/dmesg output that could explain it.
* It usually happens to a different OSD that did not crash before.

Still we checked the following on a few OSDs/hosts:

* We can do a manual compaction, both offline and online.
* We successfully ran "ceph-bluestore-tool fsck --deep yes" on one of the OSDs.
* We manually compacted a number of OSDs, one of which crashed hours later.

The only thing we have noticed so far: It only happens to OSDs that are attached
to a mon host. *None* of the non-mon host OSDs have had a crash!

Does anyone have a hint what could be causing this? We currently have no good
theory that could explain this, much less have a fix or workaround.

Any help would be greatly appreciated.

Denis

Crash: https://public-resources.objects.lpg.cloudscale.ch/osd-crash/meta.txt 

Log: https://public-resources.objects.lpg.cloudscale.ch/osd-crash/log.txt 


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Storage class usage stats

2020-08-26 Thread Tobias Urdin
Hello,

I've been trying to understand if there is any way to get usage information 
based on storage classes for buckets.

Since there is no information available from the "radosgw-admin bucket stats" 
commands nor any other endpoint I
tried to browse the source code but couldn't find any references where the 
storage class would be exposed in such a way.

It also seems that RadosGW today is not saving any counters on the number of 
objects stored in storage classes when it's
collecting usage stats, which means there is no such metadata saved for a 
bucket.


I was hoping it was at least saved but not exposed, because then it would have 
been an easier fix than adding support to count the number of objects in storage 
classes based on operations, which would involve a lot of places and mean 
writing to the bucket metadata on each op :(


Is my assumption correct that there is no way to retrieve such information, 
meaning there is no way to measure such usage?

If the answer is yes, I assume the only way to get something that could be 
measured would be to instead have multiple placement
targets, since that is exposed in the bucket info. The bad thing, though, is 
that you would lose a lot of functionality related to lifecycle
and to moving a single object to another storage class.

Best regards
Tobias
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] slow "rados ls"

2020-08-26 Thread Marcel Kuiper
Hi

One of my clusters running nautilus 14.2.8 is very slow (13 seconds or so,
where my other clusters return almost instantaneously) when doing a
'rados --pool rc3-se.rgw.buckets.index ls' from one of the monitors.

I checked
- ceph status => OK
- routing to/from osds ok (I see a lot of established connections to osds
due to the command, nothing in syn_sent indicating incomplete handshake)
- ping times are OK
- no interface errors
- no packet drops
- no increasing send queues
- and as far as I can see nothing out of the ordinary in mon and osd logs

I have no clue how to debug the issue. If someone has pointers it would be
much appreciated

Kind Regards

Marcel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Infiniband support

2020-08-26 Thread Rafael Quaglio
Hi,
     I could not see in the doc if Ceph has infiniband support. Is there someone using it?
     Also, is there any rdma support working natively?

     Can anyone point me to where I can find more information about it?


Thanks,
Rafael.___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs needs access from two networks

2020-08-26 Thread Janne Johansson
Den ons 26 aug. 2020 kl 14:16 skrev Simon Sutter :

> Hello,
> So I know, the mon services can only bind to just one ip.
> But I have to make it accessible to two networks because internal and
> external servers have to mount the cephfs.
> The internal ip is 10.99.10.1 and the external is some public-ip.
> I tried nat'ing it  with this: "firewall-cmd --zone=public
> --add-forward-port=port=6789:proto=tcp:toport=6789:toaddr=10.99.10.1
> -permanent"
>
> So the nat is working, because I get a "ceph v027" (alongside with some
> gibberish) when I do a telnet "telnet *public-ip* 6789"
> But when I try to mount it, I get just a timeout:
> mount - -t ceph *public-ip*:6789:/testing /mnt -o
> name=test,secretfile=/root/ceph.client. test.key
> mount error 110 = Connection timed out
>
> The tcpdump also recognizes a "Ceph Connect" packet, coming from the mon.
>
> How can I get around this problem?
> Is there something I have missed?


Any ceph client will need direct access to all OSDs involved also. Your
mail doesn't really say if the cephfs-mounting client can talk to OSDs?

In ceph, traffic is not shuffled via mons, mons only tell the client which
OSDs it needs to talk to, then all IO goes directly from client to any
involved OSD servers.
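
A quick way to see which addresses a client must be able to reach is the
osdmap, e.g.:

   ceph osd dump | grep '^osd\.'

Every OSD address listed there (ports 6800-7300 by default), plus the mon
address, has to be reachable from the mounting client; forwarding only
port 6789 is not enough.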

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Infiniband support

2020-08-26 Thread Fabrizio Cuseo
I used ceph with proxmox server and IP over Infiniband without any problem. 

- Il 26-ago-20, alle 15:08, Rafael Quaglio  ha scritto: 

> Hi,
> I could not see in the doc if Ceph has infiniband support. Is there someone
> using it?
> Also, is there any rdma support working natively?

> Can anyoune point me where to find more information about it?

> Thanks,
> Rafael.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

-- 
--- 
Fabrizio Cuseo - mailto:f.cu...@panservice.it 
Direzione Generale - Panservice InterNetWorking 
Servizi Professionali per Internet ed il Networking 
Panservice e' associata AIIP - RIPE Local Registry 
Phone: +39 0773 410020 - Fax: +39 0773 470219 
http://www.panservice.it mailto:i...@panservice.it 
Numero verde nazionale: 800 901492 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: slow "rados ls"

2020-08-26 Thread Vladimir Sigunov
Hi Marcel,
Is this issue related to only one monitor?
If yes, check the overall node status: average load, disk I/O, RAM
consumption, swap size, etc. It might not be a ceph-related issue.

Regards,
Vladimir.

On Wed, Aug 26, 2020 at 9:07 AM Marcel Kuiper  wrote:

> Hi
>
> One of my clusters running nautilus 14.2.8 is very slow (13 seconds or so
> where my other clusters are returning almost instantanious) when doing a
> 'rados --pool rc3-se.rgw.buckets.index ls' from one of the monitors.
>
> I checked
> - ceph status => OK
> - routing to/from osds ok (I see a lot of established connections to osds
> due to the command, nothing in syn_sent indicating incomplete handshake)
> - ping times are OK
> - no interface errors
> - no packet drops
> - no increasing send queus
> - and as far as I can see nothing out of the ordinary in mon and osd logs
>
> I have no clue how to debug the issue. If someone has pointers it would be
> much appreciated
>
> Kind Regards
>
> Marcel
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] iSCSI gateways in nautilus dashboard in state down

2020-08-26 Thread Willi Schiegel

Hello All,

I have a Nautilus (14.2.11) cluster which is running fine on CentOS 7 
servers. 4 OSD nodes, 3 MON/MGR hosts. Now I wanted to enable iSCSI 
gateway functionality to be used by some Solaris and FreeBSD clients. I 
followed the instructions under


https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli-manual-install

and

https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli/

I setup two iSCSI targets and they both show up with gwcli on both gateways:

[root@osd1 ~]# gwcli ls
...
o- gateways . [Up: 2/2, Portals: 2]
| o- osd1.mydomain.pri . [172.29.1.171 (UP)]
| o- osd2.mydomain.pri . [172.29.1.172 (UP)]
...

tcmu-runner, rbd-target-gw and rbd-target-api are active (running) on 
both gateways, there is no firewall and SELinux is disabled, but on the 
dashboard the state of the gateways is "down", and ceph-mgr.mon2.log shows


mgr[dashboard] iscsi REST API failed GET req status: 403

Any hints? Thank you.

Best
Willi
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: slow "rados ls"

2020-08-26 Thread Marcel Kuiper
Hi Vladimir,

no, it is the same on all monitors. Actually I got triggered because I got
slow responses on my rados gateway with the radosgw-admin command and
narrowed it down to slow responses for rados commands anywhere in the
cluster.

The cluster is not that busy and all osds and monitors use very little
resources compared to what they have onboard

Kind Regards

Marcel Kuiper

> Hi Marcel,
> If this issue related to only one monitor?
> If yes, check the overall node status: average load, disk I/O, RAM
> consumption, swap size, etc. Could be not a ceph-related issue.
>
> Regards,
> Vladimir.
>
> On Wed, Aug 26, 2020 at 9:07 AM Marcel Kuiper  wrote:
>
>> Hi
>>
>> One of my clusters running nautilus 14.2.8 is very slow (13 seconds or
>> so
>> where my other clusters are returning almost instantanious) when doing a
>> 'rados --pool rc3-se.rgw.buckets.index ls' from one of the monitors.
>>
>> I checked
>> - ceph status => OK
>> - routing to/from osds ok (I see a lot of established connections to
>> osds
>> due to the command, nothing in syn_sent indicating incomplete handshake)
>> - ping times are OK
>> - no interface errors
>> - no packet drops
>> - no increasing send queus
>> - and as far as I can see nothing out of the ordinary in mon and osd
>> logs
>>
>> I have no clue how to debug the issue. If someone has pointers it would
>> be
>> much appreciated
>>
>> Kind Regards
>>
>> Marcel
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RandomCrashes on OSDs Attached to Mon Hosts with Octopus

2020-08-26 Thread Igor Fedotov

Hi Denis,

this reminds me of the following ticket: https://tracker.ceph.com/issues/37282

Please note they mentioned co-location with mon in comment #29.


The working hypothesis for this ticket is interim disk read failures 
which cause RocksDB checksum failures. Earlier we observed such a 
problem for the main device. Presumably it's heavy memory pressure which 
causes the kernel to fail this way. See my comment #38 there.


So I'd like to see answers/comments for the following questions:

0) What is backing disks layout for OSDs in question (main device type?, 
additional DB/WAL devices?).


1) Please check all the existing logs for OSDs at "failing" nodes for 
other checksum errors (as per my comment #38)


2) Check if BlueFS spillover is observed for any failing OSDs.

3) Check "bluestore_reads_with_retries" performance counters for all 
OSDs at nodes in question. See comments 38-42 on the details. Any 
non-zero values? (A quick way to check 2) and 3) is sketched below.)


4) Start monitoring RAM usage and swapping for these nodes. Comment 39.
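
For example (osd.0 as a placeholder, run on the affected hosts):

   # BlueFS spillover is reported in health detail as BLUEFS_SPILLOVER
   ceph health detail | grep -i spillover

   # per-OSD read retry counter via the admin socket
   ceph daemon osd.0 perf dump | grep bluestore_reads_with_retries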


Thanks,

Igor






On 8/26/2020 3:47 PM, Denis Krienbühl wrote:

Hi!

We've recently upgraded all our clusters from Mimic to Octopus (15.2.4). Since
then, our largest cluster is experiencing random crashes on OSDs attached to the
mon hosts.

This is the crash we are seeing (cut for brevity, see links in post scriptum):

{
"ceph_version": "15.2.4",
"utsname_release": "4.15.0-72-generic",
"assert_condition": "r == 0",
"assert_func": "void BlueStore::_txc_apply_kv(BlueStore::TransContext*, 
bool)",
"assert_file": "/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
",
"assert_line": 11430,
"assert_thread_name": "bstore_kv_sync",
"assert_msg": "/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' 
thread 7fc56311a700 time 2020-08-26T08:52:24.917083+0200\n/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
: 11430: FAILED ceph_assert(r == 0)\n",
"backtrace": [
"(()+0x12890) [0x7fc576875890]",
"(gsignal()+0xc7) [0x7fc575527e97]",
"(abort()+0x141) [0x7fc575529801]",
"(ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1a5) [0x559ef9ae97b5]",
"(ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
char const*, ...)+0) [0x559ef9ae993f]",
"(BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x3a0) 
[0x559efa0245b0]",
"(BlueStore::_kv_sync_thread()+0xbdd) [0x559efa07745d]",
"(BlueStore::KVSyncThread::entry()+0xd) [0x559efa09cd3d]",
"(()+0x76db) [0x7fc57686a6db]",
"(clone()+0x3f) [0x7fc57560a88f]"
]
}

Right before the crash occurs, we see the following message in the crash log:

-3> 2020-08-26T08:52:24.787+0200 7fc569b2d700  2 rocksdb: 
[db/db_impl_compaction_flush.cc:2212 ] 
Waiting after background compaction error: Corruption: block checksum mismatch: expected 
2548200440, got 2324967102  in db/815839.sst offset 67107066 size 3808, Accumulated 
background error counts: 1
-2> 2020-08-26T08:52:24.852+0200 7fc56311a700 -1 rocksdb: submit_common 
error: Corruption: block checksum mismatch: expected 2548200440, got 2324967102  
in db/815839.sst offset 67107066 size 3808 code = 2 Rocksdb transaction:

In short, we see a Rocksdb corruption error after background compaction, when 
this happens.

When an OSD crashes, which happens about 10-15 times a day, it restarts and
resumes work without any further problems.

We are pretty confident that this is not a hardware issue, due to the following 
facts:

* The crashes occur on 5 different hosts over 3 different racks.
* There is no smartctl/dmesg output that could explain it.
* It usually happens to a different OSD that did not crash before.

Still we checked the following on a few OSDs/hosts:

* We can do a manual compaction, both offline and online.
* We successfully ran "ceph-bluestore-tool fsck --deep yes" on one of the OSDs.
* We manually compacted a number of OSDs, one of which crashed hours later.

The only thing we have noticed so far: It only happens to OSDs that are attached
to a mon host. *None* of the non-mon host OSDs have had a crash!

Does anyone have a hint what could be causing this? We currently have no good
theory that could explain this, much less have a fix or workaround.

Any help would be greatly appreciated.

Denis

Crash: https://public-resources.objects.lpg.cloudscale.ch/osd-crash/meta.txt 

Log: https://public-resources.objects.lpg.cloudscale.ch/osd-crash/log.txt 


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send 

[ceph-users] Re: RandomCrashes on OSDs Attached to Mon Hosts with Octopus

2020-08-26 Thread Igor Fedotov

just to add

a hypothesis for why only the mon hosts are affected - higher memory 
utilization on these nodes is what causes the disk read failures to 
appear. A RAM leak (or excessive utilization) in the MON processes, or 
something similar?


On 8/26/2020 4:29 PM, Igor Fedotov wrote:

Hi Denis,

this reminds me the following ticket: 
https://tracker.ceph.com/issues/37282


Please note they mentioned co-location with mon in comment #29.


Working hypothesis for this ticket is the interim disk read failures 
which cause RocksDB checksum failures. Earlier we observed such a 
problem for main device. Presumably it's heavy memory pressure which 
causes kernel to be failing this way.  See my comment #38 there.


So I'd like to see answers/comments for the following questions:

0) What is backing disks layout for OSDs in question (main device 
type?, additional DB/WAL devices?).


1) Please check all the existing logs for OSDs at "failing" nodes for 
other checksum errors (as per my comment #38)


2) Check if BlueFS spillover is observed for any failing OSDs.

3) Check "bluestore_reads_with_retries" performance counters for all 
OSDs at nodes in question. See comments 38-42 on the details. Any 
non-zero values?


4) Start monitoring RAM usage and swapping for these nodes. Comment 39.


Thanks,

Igor






On 8/26/2020 3:47 PM, Denis Krienbühl wrote:

Hi!

We've recently upgraded all our clusters from Mimic to Octopus 
(15.2.4). Since
then, our largest cluster is experiencing random crashes on OSDs 
attached to the

mon hosts.

This is the crash we are seeing (cut for brevity, see links in post 
scriptum):


    {
    "ceph_version": "15.2.4",
    "utsname_release": "4.15.0-72-generic",
    "assert_condition": "r == 0",
    "assert_func": "void 
BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)",
    "assert_file": 
"/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
",

    "assert_line": 11430,
    "assert_thread_name": "bstore_kv_sync",
    "assert_msg": 
"/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
: In function 'void 
BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 
7fc56311a700 time 
2020-08-26T08:52:24.917083+0200\n/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc 
: 11430: FAILED ceph_assert(r == 0)\n",

    "backtrace": [
    "(()+0x12890) [0x7fc576875890]",
    "(gsignal()+0xc7) [0x7fc575527e97]",
    "(abort()+0x141) [0x7fc575529801]",
    "(ceph::__ceph_assert_fail(char const*, char const*, int, 
char const*)+0x1a5) [0x559ef9ae97b5]",
    "(ceph::__ceph_assertf_fail(char const*, char const*, 
int, char const*, char const*, ...)+0) [0x559ef9ae993f]",
    "(BlueStore::_txc_apply_kv(BlueStore::TransContext*, 
bool)+0x3a0) [0x559efa0245b0]",

    "(BlueStore::_kv_sync_thread()+0xbdd) [0x559efa07745d]",
    "(BlueStore::KVSyncThread::entry()+0xd) [0x559efa09cd3d]",
    "(()+0x76db) [0x7fc57686a6db]",
    "(clone()+0x3f) [0x7fc57560a88f]"
    ]
    }

Right before the crash occurs, we see the following message in the 
crash log:


    -3> 2020-08-26T08:52:24.787+0200 7fc569b2d700  2 rocksdb: 
[db/db_impl_compaction_flush.cc:2212 
] Waiting after background 
compaction error: Corruption: block checksum mismatch: expected 
2548200440, got 2324967102  in db/815839.sst offset 67107066 size 
3808, Accumulated background error counts: 1
    -2> 2020-08-26T08:52:24.852+0200 7fc56311a700 -1 rocksdb: 
submit_common error: Corruption: block checksum mismatch: expected 
2548200440, got 2324967102  in db/815839.sst offset 67107066 size 
3808 code = 2 Rocksdb transaction:


In short, we see a Rocksdb corruption error after background 
compaction, when this happens.


When an OSD crashes, which happens about 10-15 times a day, it 
restarts and

resumes work without any further problems.

We are pretty confident that this is not a hardware issue, due to the 
following facts:


* The crashes occur on 5 different hosts over 3 different racks.
* There is no smartctl/dmesg output that could explain it.
* It usually happens to a different OSD that did not crash before.

Still we checked the following on a few OSDs/hosts:

* We can do a manual compaction, both offline and online.
* We successfully ran "ceph-bluestore-tool fsck --deep yes" on one of 
the OSDs.
* We manually compacted a number of OSDs, one of which crashed hours 
later.


The only thing we have noticed so far: It only happens to OSDs that 
are attached

to a mon host. *None* of the non-mon host OSDs have had a crash!

Does anyone have a hint what could be causing this? We currently have 
no good

theory that could explain this, much less have a fix or workaround.

Any help would be greatly appreciated.

Denis

Crash: 
https://public-resources.objects.lpg.cloudscale.ch/osd-crash/meta.txt 


[ceph-users] Re: iSCSI gateways in nautilus dashboard in state down

2020-08-26 Thread Jason Dillaman
On Wed, Aug 26, 2020 at 9:15 AM Willi Schiegel
 wrote:
>
> Hello All,
>
> I have a Nautilus (14.2.11) cluster which is running fine on CentOS 7
> servers. 4 OSD nodes, 3 MON/MGR hosts. Now I wanted to enable iSCSI
> gateway functionality to be used by some Solaris and FreeBSD clients. I
> followed the instructions under
>
> https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli-manual-install
>
> and
>
> https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli/
>
> I setup two iSCSI targets and they both show up with gwcli on both gateways:
>
> [root@osd1 ~]# gwcli ls
> ...
> o- gateways . [Up: 2/2, Portals: 2]
> | o- osd1.mydomain.pri . [172.29.1.171 (UP)]
> | o- osd2.mydomain.pri . [172.29.1.172 (UP)]
> ...
>
> tcmu-runner, rbd-target-gw and rbd-target-api are active (running) on
> both gateways, there is no firewall and SELinux is disabled but on the
> dashboard the state of the gateways is "down".  and ceph-mgr.mon2.log shows
>
> mgr[dashboard] iscsi REST API failed GET req status: 403
>
> Any hints? Thank you.

Have you ensured that your MGR ip addresses are in "trusted_ip_list"
on the GWs [1]?

> Best
> Willi
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

[1] 
https://github.com/ceph/ceph-iscsi/blob/master/ceph_iscsi_config/gateway_setting.py#L179

-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] anyone using ceph csi

2020-08-26 Thread Marc Roos



I was wondering if anyone is using ceph csi plugins[1]? I would like to 
know how to configure credentials, that is not really described for 
testing on the console. 

I am running 
./csiceph --endpoint unix:///tmp/mesos-csi-XSJWlY/endpoint.sock --type 
rbd --drivername rbd.csi.ceph.com --nodeid test

Connection is fine
[ ~]# csc identity plugin-info
"rbd.csi.ceph.com""canary"

However I have no idea how to configure the clientid, pool etc in the 
volumes


[1]
https://github.com/ceph/ceph-csi

Ps. I am not using kubernetes.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: anyone using ceph csi

2020-08-26 Thread Jason Dillaman
On Wed, Aug 26, 2020 at 10:11 AM Marc Roos  wrote:
>
>
>
> I was wondering if anyone is using ceph csi plugins[1]? I would like to
> know how to configure credentials, that is not really described for
> testing on the console.
>
> I am running
> ./csiceph --endpoint unix:///tmp/mesos-csi-XSJWlY/endpoint.sock --type
> rbd --drivername rbd.csi.ceph.com --nodeid test
>
> Connection is fine
> [ ~]# csc identity plugin-info
> "rbd.csi.ceph.com""canary"
>
> However I have no idea how to configure the clientid, pool etc in the
> volumes
>
>
> [1]
> https://github.com/ceph/ceph-csi
>
> Ps. I am not using kubernetes.

The credentials and Ceph cluster configuration metadata are passed via
the RPC calls as per the CSI spec. In k8s, these details would be
stored in StorageClass and Secret objects.

> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: iSCSI gateways in nautilus dashboard in state down

2020-08-26 Thread Willi Schiegel




On 8/26/20 3:56 PM, Jason Dillaman wrote:

On Wed, Aug 26, 2020 at 9:15 AM Willi Schiegel
 wrote:


Hello All,

I have a Nautilus (14.2.11) cluster which is running fine on CentOS 7
servers. 4 OSD nodes, 3 MON/MGR hosts. Now I wanted to enable iSCSI
gateway functionality to be used by some Solaris and FreeBSD clients. I
followed the instructions under

https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli-manual-install

and

https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli/

I setup two iSCSI targets and they both show up with gwcli on both gateways:

[root@osd1 ~]# gwcli ls
...
o- gateways . [Up: 2/2, Portals: 2]
| o- osd1.mydomain.pri . [172.29.1.171 (UP)]
| o- osd2.mydomain.pri . [172.29.1.172 (UP)]
...

tcmu-runner, rbd-target-gw and rbd-target-api are active (running) on
both gateways, there is no firewall and SELinux is disabled but on the
dashboard the state of the gateways is "down".  and ceph-mgr.mon2.log shows

mgr[dashboard] iscsi REST API failed GET req status: 403

Any hints? Thank you.


Have you ensured that your MGR ip addresses are in "trusted_ip_list"
on the GWs [1]?


Stupid me, that was it! trusted_ip_list held only the gateways.

Thank you!




Best
Willi
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[1] 
https://github.com/ceph/ceph-iscsi/blob/master/ceph_iscsi_config/gateway_setting.py#L179


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: slow "rados ls"

2020-08-26 Thread Wido den Hollander




On 26/08/2020 15:59, Stefan Kooman wrote:

On 2020-08-26 15:20, Marcel Kuiper wrote:

Hi Vladimir,

no it is the same on all monitors. Actually I got triggered because I got
slow responses on my rados gateway with the radosgw-admin command and
narrowed it down to slow respons for rados commands anywhere in the
cluster.


Do you have a very large amount of objects. And / or a lot of OMAP data
and thus large rocksdb databases? We have seen slowness (and slow ops)
from having very large rocksdb databases due to a lot of OMAP data
concentrated on only a few nodes (cephfs metadata only). You might
suffer from the same thing.

Manual rocksdb compaction on the OSDs might help.


In addition: Keep in mind that RADOS was never designed to list objects 
fast. The more Placement Groups you have the slower a listing will be.


Wido



Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: anyone using ceph csi

2020-08-26 Thread Marc Roos
 >>
 >>
 >> I was wondering if anyone is using ceph csi plugins[1]? I would like 
to
 >> know how to configure credentials, that is not really described for
 >> testing on the console.
 >>
 >> I am running
 >> ./csiceph --endpoint unix:///tmp/mesos-csi-XSJWlY/endpoint.sock 
--type
 >> rbd --drivername rbd.csi.ceph.com --nodeid test
 >>
 >> Connection is fine
 >> [ ~]# csc identity plugin-info
 >> "rbd.csi.ceph.com""canary"
 >>
 >> However I have no idea how to configure the clientid, pool etc in 
the
 >> volumes
 >>
 >>
 >> [1]
 >> https://github.com/ceph/ceph-csi
 >>
 >> Ps. I am not using kubernetes.
 >
 >The credentials and Ceph cluster configuration metadata are passed via
 >the RPC calls as per the CSI spec. In k8s, these details would be
 >stored in StorageClass and Secret objects.
 
So there is no way of testing this driver via the commandline, with
some generic grpc client?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Fwd: Upgrade Path Advice Nautilus (CentOS 7) -> Octopus (new OS)

2020-08-26 Thread Cloud Guy
Hello,

Looking for a bit of guidance / approach to upgrading from Nautilus to
Octopus considering CentOS and Ceph-Ansible.

We're presently running a Nautilus cluster (all nodes / daemons 14.2.11 as
of this post).
- There are 4 monitor-hosts with mon, mgr, and dashboard functions
consolidated;
- 4 RGW hosts
- 4 OSD hosts, with 10 OSDs each.   This is planned to scale to 7 nodes
with additional OSDs and capacity (considering doing this as part of the
upgrade process)
- Currently using ceph-ansible (however, it's a chore to maintain scripts
/ configs between playbook versions - although a great framework, it's not
ideal in our case);
- All hosts run CentOS 7.x;
- dm-crypt in use on LVM OSDs (via ceph-ansible);
- Deployment IS NOT containerized.

Octopus support on CentOS 7 is limited due to Python dependencies; as a
result we want to move to CentOS 8 or Ubuntu 20.04.   The other outlier is
CentOS's native kernel support for LSI 2008 (e.g. 9211) HBAs, which some of our
OSD nodes use.

Irrespective of OS considerations above, the upgrade will be to an OS that
fully supports Octopus.

We'd like to make use of ceph orchestrator for on-going cluster management.


Here's an upgrade path scenario that is being considered.   At a high-level:
1.  Deploy a new monitor on CentOS 8.   May be Nautilus via established
ceph-ansible playbook.
2.  Upgrade new monitor to Octopus using dnf / ceph package upgrade.
3.  Decommission individual monitor hosts (existing on CentOS 7) and
redeploy on CentOS 8 via ceph orchestrator from new monitor node;
4.  Repeat until all monitors are on new OS + Octopus (all deployed via
Ceph Orchestrator.
5.  Add additional OSD nodes / drives / capacity via orchestrator on
Octopus;
6.  Upgrade existing OSD hosts by keeping OSDs intact, reinstalling new OS
(CentOS 8 or Ubuntu 20.04);
7.  Deploy ceph octopus on new nodes via orchestrator;
8.  Reactivate / rescan intact OSDs on newly redeployed node (i.e.
ceph-volume lvm activate --all; see the sketch below).
lvm activate --all)
9.  Rinse / repeat for remaining Nautilus nodes.
10.  Manually upgrade RGW packages on gateway nodes.
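
A rough sketch of step 8 on a reinstalled node, assuming the OSD LVs survive
the reinstall and the node can already reach the cluster (packages installed,
/etc/ceph and the bootstrap-osd keyring restored):

   ceph-volume lvm list
   ceph-volume lvm activate --all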

Thank you.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: slow "rados ls"

2020-08-26 Thread Stefan Kooman
On 2020-08-26 15:20, Marcel Kuiper wrote:
> Hi Vladimir,
> 
> no it is the same on all monitors. Actually I got triggered because I got
> slow responses on my rados gateway with the radosgw-admin command and
> narrowed it down to slow respons for rados commands anywhere in the
> cluster.

Do you have a very large amount of objects. And / or a lot of OMAP data
and thus large rocksdb databases? We have seen slowness (and slow ops)
from having very large rocksdb databases due to a lot of OMAP data
concentrated on only a few nodes (cephfs metadata only). You might
suffer from the same thing.

Manual rocksdb compaction on the OSDs might help.
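
If you want to try that: online compaction can be triggered per OSD via the
admin socket, or offline with ceph-kvstore-tool (osd.0 / ceph-0 are just
placeholders):

   ceph daemon osd.0 compact
   # or, with the OSD stopped:
   ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact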

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: anyone using ceph csi

2020-08-26 Thread Jason Dillaman
On Wed, Aug 26, 2020 at 10:33 AM Marc Roos  wrote:
>
>  >>
>  >>
>  >> I was wondering if anyone is using ceph csi plugins[1]? I would like
> to
>  >> know how to configure credentials, that is not really described for
>  >> testing on the console.
>  >>
>  >> I am running
>  >> ./csiceph --endpoint unix:///tmp/mesos-csi-XSJWlY/endpoint.sock
> --type
>  >> rbd --drivername rbd.csi.ceph.com --nodeid test
>  >>
>  >> Connection is fine
>  >> [ ~]# csc identity plugin-info
>  >> "rbd.csi.ceph.com""canary"
>  >>
>  >> However I have no idea how to configure the clientid, pool etc in
> the
>  >> volumes
>  >>
>  >>
>  >> [1]
>  >> https://github.com/ceph/ceph-csi
>  >>
>  >> Ps. I am not using kubernetes.
>  >
>  >The credentials and Ceph cluster configuration metadata are passed via
>  >the RPC calls as per the CSI spec. In k8s, these details would be
>  >stored in StorageClass and Secret objects.
>
> So there is no way of testing this driver via the commandline, with
> some generic grpc client?
>

You would need some way to inject the correct/expected gRPC calls as
per the CSI spec [1]. Data from the StorageClass mostly gets
translated and sent via the "parameters" argument and data from the
Secret gets translated and sent via the "secrets" argument of various
gRPC requests.



[1] https://github.com/container-storage-interface/spec/blob/master/spec.md

-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: iSCSI gateways in nautilus dashboard in state down

2020-08-26 Thread Ricardo Marques
Hi Willi,

Check the 'iscsi-gateway.cfg' file on your iSCSI gateways to make sure that the 
mgr IP (where the dashboard is running) is included in the 'trusted_ip_list' 
config.

After adding the IP to the config file, you need to restart the 
'rbd-target-api' service.
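
For example, on each gateway (the mgr IPs below are placeholders):

   # /etc/ceph/iscsi-gateway.cfg
   [config]
   trusted_ip_list = 172.29.1.171,172.29.1.172,<mgr-ip-1>,<mgr-ip-2>

   systemctl restart rbd-target-api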

Ricardo Marques


From: Willi Schiegel 
Sent: Wednesday, August 26, 2020 2:12 PM
To: ceph-users@ceph.io 
Subject: [ceph-users] iSCSI gateways in nautilus dashboard in state down

Hello All,

I have a Nautilus (14.2.11) cluster which is running fine on CentOS 7
servers. 4 OSD nodes, 3 MON/MGR hosts. Now I wanted to enable iSCSI
gateway functionality to be used by some Solaris and FreeBSD clients. I
followed the instructions under

https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli-manual-install

and

https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli/

I setup two iSCSI targets and they both show up with gwcli on both gateways:

[root@osd1 ~]# gwcli ls
...
o- gateways . [Up: 2/2, Portals: 2]
| o- osd1.mydomain.pri . [172.29.1.171 (UP)]
| o- osd2.mydomain.pri . [172.29.1.172 (UP)]
...

tcmu-runner, rbd-target-gw and rbd-target-api are active (running) on
both gateways, there is no firewall and SELinux is disabled but on the
dashboard the state of the gateways is "down".  and ceph-mgr.mon2.log shows

mgr[dashboard] iscsi REST API failed GET req status: 403

Any hints? Thank you.

Best
Willi
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Infiniband support

2020-08-26 Thread Paul Mezzanini
We've used RDMA via RoCEv2 on 100GbE.  It ran in production that way for at 
least 6 months before I had to turn it off when doing some migrations using 
hardware that didn't support it.  We noticed no performance change in our 
environment so once we were done I just never turned it back on.  I'm not even 
sure we could right now with how we have our network topology / bond interfaces.

The biggest annoyance was making sure the device name and gid were correct.  
This was before the ceph config stuff existed so it may be easier now to roll 
that one out.

Example config section for one of my nodes (in the global part under 
public+cluster network):

ms_cluster_type = async+rdma
ms_async_rdma_device_name = mlx5_1
ms_async_rdma_polling_us = 0
ms_async_rdma_local_gid = ::::::c1b8:4fa0
ms_async_rdma_roce_ver = 1

We pulled the GID in ansible with:

- name: "Insert RDMA GID into ceph.conf"
  shell: sed -i s/GIDGOESHERE/$(cat 
/sys/class/infiniband/mlx5_1/ports/1/gids/5)/g /etc/ceph/ceph.conf
  args:
warn: no 

The stub config file we pushed had "GIDGOESHERE" in it.

I hope that helps someone out there.  Not all of the settings were obvious and 
it took some trial and error.   Now that we have a pure NVMe tier I'll probably 
try and turn it back on to see if we notice any changes.

Netdata also proved to be a valuable tool to make sure we had traffic in both 
TCP and RDMA
https://www.netdata.cloud/


--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfm...@rit.edu

CONFIDENTIALITY NOTE: The information transmitted, including attachments, is
intended only for the person(s) or entity to which it is addressed and may
contain confidential and/or privileged material. Any review, retransmission,
dissemination or other use of, or taking of any action in reliance upon this
information by persons or entities other than the intended recipient is
prohibited. If you received this in error, please contact the sender and
destroy any copies of this information.



From: Andrei Mikhailovsky 
Sent: Wednesday, August 26, 2020 5:55 PM
To: Rafael Quaglio
Cc: ceph-users
Subject: [ceph-users] Re: Infiniband support

Rafael, We've been using ceph with ipoib for over 7 years and it's been 
supported. However, I am not too sure of the the native rdma support. There has 
been discussions on/off for a while now, but I've not seen much. Perhaps others 
know.

Cheers

> From: "Rafael Quaglio" 
> To: "ceph-users" 
> Sent: Wednesday, 26 August, 2020 14:08:57
> Subject: [ceph-users] Infiniband support

> Hi,
> I could not see in the doc if Ceph has infiniband support. Is there someone
> using it?
> Also, is there any rdma support working natively?

> Can anyoune point me where to find more information about it?

> Thanks,
> Rafael.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Infiniband support

2020-08-26 Thread Andrei Mikhailovsky
Rafael, we've been using ceph with ipoib for over 7 years and it's been 
supported. However, I am not too sure about the native rdma support. There have 
been discussions on/off for a while now, but I've not seen much. Perhaps others 
know. 

Cheers 

> From: "Rafael Quaglio" 
> To: "ceph-users" 
> Sent: Wednesday, 26 August, 2020 14:08:57
> Subject: [ceph-users] Infiniband support

> Hi,
> I could not see in the doc if Ceph has infiniband support. Is there someone
> using it?
> Also, is there any rdma support working natively?

> Can anyoune point me where to find more information about it?

> Thanks,
> Rafael.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: pg stuck in unknown state

2020-08-26 Thread steven prothero
Hello,

I started a new fresh ceph cluster and have the exact same problem and
also the slow op warnings.

I found this bug report that seems to be about this problem:
https://158.69.68.89/issues/46743

"... mgr/devicehealth: device_health_metrics pool gets created even
without any OSDs in the cluster..."this creates the 1 pg inactive

I now have my OSDs online and it has been a few days but I still have
that "1 pg inactive" and as I am a rookie I don't know how to repair
this.
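
For anyone else searching, the stuck pg can be inspected with e.g.:

   ceph health detail
   ceph pg dump_stuck inactive
   ceph osd pool ls detail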

Thanks and cheers for all the good work
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Add OSD with primary on HDD, WAL and DB on SSD

2020-08-26 Thread Zhenshi Zhou
The official documentation says that you should allocate 4% of the slow device
space for block.db.
But the main problem is that Bluestore uses RocksDB, and RocksDB puts a file
on the fast device only if it thinks that the whole layer will fit there.
As for RocksDB, L1 is about 300M, L2 is about 3G, L3 is near 30G, and L4 is
about 300G.
For instance, RocksDB puts L2 files on block.db only if it's at least 3G
there.
As a result, 30G is an acceptable value.
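
With a dedicated DB LV/partition the OSD would then be created with something
like (device names are only an example):

   ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1

When only block.db is given, the WAL is placed on that same fast device, so
no separate --block.wal is needed.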

Tony Liu  wrote on Tue, Aug 25, 2020 at 10:49 AM:

> > -Original Message-
> > From: Anthony D'Atri 
> > Sent: Monday, August 24, 2020 7:30 PM
> > To: Tony Liu 
> > Subject: Re: [ceph-users] Re: Add OSD with primary on HDD, WAL and DB on
> > SSD
> >
> > Why such small HDDs?  Kinda not worth the drive bays and power, instead
> > of the complexity of putting WAL+DB on a shared SSD, might you have been
> > able to just buy SSDs and not split? ymmv.
>
> 2TB is for testing, it will bump up to 10TB for production.
>
> > The limit is a function of the way the DB levels work, it’s not
> > intentional.
> >
> > WAL by default takes a fixed size, like 512 MB or something.
> >
> > 64 GB is a reasonable size, it accomodates the WAL and allows space for
> > DB compaction without overflowing.
>
> For each 10TB HDD, what's the recommended DB device size for both
> DB and WAL? The doc recommends 1% - 4%, meaning 100GB - 400GB for
> each 10TB HDD. But given the WAL data size and DB data size, I am
> not sure if that 100GB - 400GB will be used efficiently.
>
> > With this commit the situation should be improved, though you don’t
> > mention what release you’re running
> >
> > https://github.com/ceph/ceph/pull/29687
>
> I am using ceph version 15.2.4 octopus (stable).
>
> Thanks!
> Tony
>
> > >>>  I don't need to create
> > >>> WAL device, just primary on HDD and DB on SSD, and WAL will be using
> > >>> DB device cause it's faster. Is that correct?
> > >>
> > >> Yes.
> > >>
> > >>
> > >> But be aware that the DB sizes are limited to 3GB, 30GB and 300GB.
> > >> Anything less than those sizes will have a lot of untilised space,
> > >> e.g a 20GB device will only utilise 3GB.
> > >
> > > I have 1 480GB SSD and 7 2TB HDDs. 7 LVs are created on SSD, each is
> > > about 64GB, for 7 OSDs.
> > >
> > > Since it's shared by DB and WAL, DB will take 30GB and WAL will take
> > > the rest 34GB. Is that correct?
> > >
> > > Is that size of DB and WAL good for 2TB HDD (block store and object
> > > store cases)?
> > >
> > > Could you share a bit more about the intention of such limit?
> > >
> > >
> > > Thanks!
> > > Tony
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > > email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How To Configure Bellsouth Email Settings in a Right Way?

2020-08-26 Thread sofi Hayat
It is required to use the right server and port settings to enjoy all the 
benefits of the Bellsouth email service. It is also recommended everyone to 
configure Bellsouth Email Settings and correctly and appropriately. There are 
few users unable to setup Bellsouth email on Android phone, iPhone, or computer 
device. For such helpless candidates, we provide helpline number by which they 
connect with top-most technicians for quality assistance. Once you contact to 
tech-savvy, your Bellsouth email settings will easily be configured in a 
second.   
https://www.emailsupport.us/blog/bellsouth-email-settings-for-outlook/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

