hi there,
we are seeing osds occasionally getting kicked out of our cluster after
having been marked down by other osds. most of the time, the affected
osd rejoins the cluster after ~5 minutes, but sometimes this takes
much longer. during that time, the osd seems to run just fine.
this happ
Hi,
I have the following Ceph Mimic setup:
- a bunch of old servers with 3-4 SATA drives each (74 OSDs in total)
- index/leveldb is stored on each OSD (so no SSD drives, just SATA)
- the current usage is:
GLOBAL:
    SIZE       AVAIL     RAW USED     %RAW USED
    542 TiB    105 TiB
Hi Andreas,
I made exactly the same observation in another scenario. I added some OSDs
while other OSDs were down.
This is expected.
The crush map drives an a-priori algorithm that computes the location of
objects without contacting a central server. Hence, *any* change of a crush
map while an OSD is
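As an aside, you can watch this client-side computation in action; the pool
and object names below are just placeholders:

# ask the cluster to run CRUSH for a given object and print the resulting
# PG and OSD set (computed from the osdmap/crushmap, not by asking the OSDs)
$ ceph osd map rbd my-object
# output looks something like:
# osdmap eNNN pool 'rbd' (1) object 'my-object' -> pg 1.xxxxxxxx (1.x) -> up ([3,7,12], p3) acting ([3,7,12], p3)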
Hi Thoralf,
given the following indication from your logs:
May 18 21:12:34 ceph-osd-05 ceph-osd[2356578]: 2020-05-18 21:12:34.211
7fb25cc80700 0 bluestore(/var/lib/ceph/osd/ceph-293) log_latency_fn
slow operation observed for _collection_list, latency = 96.337s, lat =
96s cid =2.0s2_head start
On Tue, May 19, 2020 at 2:06 PM Igor Fedotov wrote:
> Hi Thoralf,
>
> given the following indication from your logs:
>
> May 18 21:12:34 ceph-osd-05 ceph-osd[2356578]: 2020-05-18 21:12:34.211
> 7fb25cc80700 0 bluestore(/var/lib/ceph/osd/ceph-293) log_latency_fn
> slow operation observed for _collection_list, latency = 96.337s, lat =
> 96s cid =2.0s2_head start
hi igor, hi paul -
thank you for your answers.
On 5/19/20 2:05 PM, Igor Fedotov wrote:
> I presume that your OSDs suffer from slow RocksDB access,
> collection_listing operation is the culprit in this case - listing 30 items
> takes 96 seconds to complete.
> From my experience such issues tend to ha
On Tue, May 19, 2020 at 3:11 PM thoralf schulze
wrote:
>
> On 5/19/20 2:13 PM, Paul Emmerich wrote:
> > 3) if necessary add more OSDs; a common problem is having very
> > few dedicated OSDs for the index pool; running the index on
> > all OSDs (and having a fast DB device for every disk) is
> > bet
Thoralf,
from your perf counter's dump:
"db_total_bytes": 15032377344,
"db_used_bytes": 411033600,
"wal_total_bytes": 0,
"wal_used_bytes": 0,
"slow_total_bytes": 94737203200,
"slow_used_bytes": 10714480640,
slow_used_bytes is non-zero hence you have a spillover.
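For anyone who wants to check their own OSDs for this, a quick (untested)
sketch; substitute your OSD id and run the daemon command on the host where
that OSD lives:

# on Nautilus and later, spillover also shows up as a health warning
$ ceph health detail | grep -i spillover
# per-OSD view: slow_used_bytes > 0 in the bluefs counters means DB data
# has spilled over to the slow device
$ ceph daemon osd.293 perf dump | grep -E '"(db|wal|slow)_(total|used)_bytes"'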
hi igor -
On 5/19/20 3:23 PM, Igor Fedotov wrote:
> slow_used_bytes is non-zero hence you have a spillover.
you are absolutely right, we do have spillovers on a large number of
osds. ceph tell osd.* compact is running right now.
> Additionally your DB volume size selection isn't perfect. For op
Hello everyone,
I'd like to set up a multisite ceph cluster.
Are there any sample setups that you can recommend studying?
I want to achieve fault tolerance but also I want to avoid split brain
scenarios.
I'm not that familiar with systems like ceph, so I would consider myself a
beginner.
Thanks
The updated images have not been pushed to Dockerhub yet. I ran into the
same problem yesterday trying to update. Hopefully updated images will be
pushed on release (at the same time as the tarball release/prior to
announcement) moving forward in order to avoid this issue.
See here for latest tags
In the docs (https://docs.ceph.com/docs/master/radosgw/multisite/), in the
section "Requirements and Assumptions", there is this warning:
"Running a single Ceph storage cluster is NOT recommended unless you have low
latency WAN connections."
What exactly does "single Ceph storage cluster" mean?
On Tue, May 12, 2020 at 6:03 AM Wido den Hollander wrote:
> And to add to this: No, a newly created RBD image will never have 'left
> over' bits and bytes from a previous RBD image.
>
> I had to explain this multiple times to people who were used to old
> (i)SCSI setups where partitions could ha
It is my understanding that it refers to running a single, normal ceph
cluster with its component hosts connected over WAN. This would
require OSDs to connect to other OSDs and mons over WAN for nearly
every operation, and is not likely to perform acceptably.
It is possible to run a ceph cluster over a WAN if you have a reliable
enough WAN with sites close enough for low-ish latency. The OSiRIS
project is architected that way with Ceph services spread evenly
across three university sites in Michigan. There's more information
and contact on their website
Hi Frank,
My understanding was that once a cluster is in a degraded state (an OSD
is down), ceph stores all changed cluster maps until the cluster is
healthy again, precisely so that missing objects can be found. If
there is a real disaster of some kind, and many OSDs go up and down at
vari
I have been running Ceph over a gigabit WAN for a few months now and have been
happy with it. Mine is set up with Strongswan tunnels and dynamic routing with
BIRD (although I would have used transport mode and iBGP in hindsight). I
generally have a 300-500 kbps flow with 5 ms latency.
What I spec
On Tue, May 19, 2020 at 10:34 AM Benjeman Meekhof wrote:
>
> It is possible to run a ceph cluster over a WAN if you have a reliable
> enough WAN with sites close enough for low-ish latency. The OSiRIS
> project is architected that way with Ceph services spread evenly
> across three university sites in Michigan.
Greg,
My name's Zac and I'm the docs guy for the Ceph Foundation. I have a
long-term plan to create a document that collects error codes and failure
cases, but I am only one man and it will be a few months before I can begin
on it.
Zac Dover
Ceph Docs Guy
On Wed, May 20, 2020 at 4:32 AM Gregory
Great, thanks already.
I will study the publications of the project :)
Zac, can you confirm that this assumption is true?
What does tiebreaker monitor mean? What exactly is its purpose?
You need a third monitor in order to form a quorum if one of the two
sites goes down. With only two sites, there is no safe way for them to
decide who is down.
On Tue, May 19, 2020 at 3:11 PM CodingSpiderFox
wrote:
>
> What does tiebreaker monitor mean? What exactly is its purpose?
>
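To make the layout concrete, a minimal sketch of a monitor placement with a
tiebreaker at a small third site (names and addresses are made up):

# ceph.conf fragment: two "data" sites plus a third site that only hosts the
# tiebreaker mon; any two of the three mons still form a majority (2 of 3)
# if one site is lost
[global]
mon initial members = site-a-mon, site-b-mon, site-c-tiebreaker
mon host = 10.0.1.10, 10.0.2.10, 10.0.3.10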
Hi Andreas,
the cluster map and the crush map are not the same thing. If you change the
crush map while the cluster is in a degraded state, you basically modify the
history of cluster maps explicitly and have to live with the consequences
(keeping history under crush map changes is limited to up+in
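Untested, but if you do have to touch the crush map, it is worth keeping a
copy of the current one first, along these lines:

# export the current (binary) crush map and decompile it to text for backup/review
$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt
# after editing crushmap.txt, recompile and - ideally only while the cluster
# is healthy - inject it back
$ crushtool -c crushmap.txt -o crushmap.new
$ ceph osd setcrushmap -i crushmap.new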
Hello Everyone,
I have installed both Prometheus and Grafana on one of my manager nodes (Ubuntu
18.04), and have configured both according to the documentation. The Grafana
dashboards are visible when visiting http://mon1:3000, but no data appears on
them. Python errors are shown for the
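(In case it helps anyone hitting the same symptom: the dashboards stay empty
whenever Prometheus cannot scrape the mgr module. A rough, untested checklist:)

# make sure the mgr prometheus module is enabled
$ ceph mgr module ls | grep -i prometheus
$ ceph mgr module enable prometheus
# the module listens on port 9283 by default; run this on the active mgr host
$ curl -s http://localhost:9283/metrics | head
# finally, check in the Prometheus web UI that this target is scraped and "UP"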
Hello Zac,
I have some further questions on that page:
Right before the section "Delete Default Zone Group and Zone" there is another
warning that says:
"The following steps assume a multi-site configuration using newly installed
systems that aren’t storing data yet. DO NOT DELETE the default
Hi,
I was browsing the dashboard today. Then suddenly it stopped working and I got
502 errors. I checked via root login and saw that ceph health is down to WARN.
I can access all RBD devices and CephFS; they work. All OSDs in server-1 are up.
health: HEALTH_WARN
1 hosts fail cepha
Hi again,
One more update:
I connected to server-2 and ran ceph -s there. I got:
Error initializing cluster client: ObjectNotFound('RADOS object not found
(error calling conf_read_file)')
Today I created an RBD pool and created 2 RBD images in this pool. Could this
be the reason for all dashboard
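(Side note on the ObjectNotFound / conf_read_file error: that usually just
means the ceph CLI on server-2 cannot find a config file. Roughly, and with
default paths that may differ on your setup:)

# the client needs a config file and a keyring to reach the cluster
$ ls -l /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring
# if they are missing, copy them from a working node, e.g.:
$ scp server-1:/etc/ceph/ceph.conf /etc/ceph/
$ scp server-1:/etc/ceph/ceph.client.admin.keyring /etc/ceph/
$ ceph -s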
Hi,
I'm using Nautilus and I'm using the whole cluster mainly for a single
bucket in RadosGW.
There is a lot of data in this bucket (petabyte scale) and I don't want to
waste all of my SSDs on it.
Is there any way to automatically set some aging threshold for this data and
e.g. move any data older than
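(One possible approach, not a definitive answer: since Nautilus, RGW supports
S3 lifecycle transitions between storage classes, provided the target storage
class has been defined in your zonegroup placement. Bucket name, class name
and the 90-day threshold below are placeholders.)

# lifecycle rule that moves objects older than 90 days to a colder storage class
$ cat > lifecycle.xml <<'EOF'
<LifecycleConfiguration>
  <Rule>
    <ID>age-out-to-cold</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>90</Days>
      <StorageClass>COLD</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
EOF
# apply it to the bucket with any S3 client, e.g. s3cmd
$ s3cmd setlifecycle lifecycle.xml s3://my-big-bucket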
This happens (unfortunately) frequently to me. Look for the active mgr
(ceph -s), and go restart the mgr service there (systemctl list-units | grep
mgr, then systemctl restart NAMEOFSERVICE). This normally resolves that
error for me. You can look at the journalctl output and you'll likely see
errors
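For reference, the sequence looks roughly like this (unit names follow the
usual ceph-mgr@<short hostname> convention; adjust if yours differ):

# find the currently active mgr
$ ceph -s | grep mgr
# on that host, find and restart the mgr unit
$ systemctl list-units | grep ceph-mgr
$ sudo systemctl restart ceph-mgr@$(hostname -s)
# watch the log while it comes back up
$ journalctl -u ceph-mgr@$(hostname -s) -f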
Here's what I learned about PG maps from my investigation of the code.
First, they don't seem to be involved in deciding what needs reconstruction
when a dead OSD is revived. There is a version number stored with the PGs
that is probably used for that.
It looks like nothing but statistics - the
Hi,
take a look at 'ceph osd df' (maybe share the output) to see which
OSD(s) are full; they determine when a pool becomes full.
Did you delete lots of objects from that pool recently? It can take
some time until the space is actually freed.
Quoting "Szabo, Istvan (Agoda)":
Hi,
Please add 'ceph osd df' output, not 'ceph df'.
Quoting "Szabo, Istvan (Agoda)":
Hello,
No, I haven't deleted anything; this warning has been around for quite a long time.
ceph health detail
HEALTH_WARN 1 pool(s) full
POOL_FULL 1 pool(s) full
pool 'k8s' is full (no quota)
ceph df
GLOBAL:
SIZE AVAIL
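(Untested suggestion: since the warning says "no quota", the pool is most
likely marked full because one or more of its OSDs are (nearly) full rather
than because of a quota. Worth double-checking:)

# show any quota set on the pool (should be empty given the "no quota" message)
$ ceph osd pool get-quota k8s
# per-pool usage, including objects and raw space
$ ceph df detail
# per-OSD fill level - a single nearly-full OSD is enough to mark a pool full
$ ceph osd df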
Hi Eugen,
Thanks for your reply.
The problem is that all rbd images were removed from the rbd pool days ago,
i.e. both commands below return empty output:
$ rbd ls rbd
$ rados -p rbd listomapkeys rbd_directory
But rados df below still shows ~430K objects. Are there any other methods I
can use to dig out those ghost object
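A hedged suggestion for finding out what those objects actually are (the
trash check only applies if the images were moved to trash rather than
removed outright):

# sample the object names still present in the pool - the prefixes usually
# tell you what they belong to (rbd_data.*, rbd_header.*, journal_*, ...)
$ rados -p rbd ls | head -n 50
# images that were trashed / deferred-deleted don't show up in 'rbd ls'
$ rbd trash ls rbd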