Hi all,
Our cluster is experiencing a very odd issue and I'm hoping for some
guidance on troubleshooting steps and/or suggestions to mitigate the issue.
tl;dr: Individual ceph-osd processes try to allocate > 90GiB of RAM and are
eventually nuked by oom_killer.
I'll try to explain the situation in as much detail as I can below.
Hi to all
Sorry if this question was already asked but I didn't find anything related
AFAIK the MDS is a fundamental component of CephFS.
What happens if an MDS crashes between replications from the active MDS to
the slaves?
Are the changes made between the last replication and the crash lost?
How many PGs do you have? And did you change any config, like mds cache
size? Show your ceph.conf.
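(For reference, if the cache size was changed it would show up in ceph.conf as a
stanza along these lines; 100000 inodes is, as far as I recall, the default, and
is shown here purely as an illustration.)

    [mds]
        # mds cache size is counted in inodes, not bytes;
        # 100000 is only an example value
        mds cache size = 100000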
On 04/15/17 07:34, Aaron Ten Clay wrote:
> Hi all,
>
> Our cluster is experiencing a very odd issue and I'm hoping for some
> guidance on troubleshooting steps and/or suggestions to mitigate the
> issue.
On Sat, Apr 15, 2017 at 8:49 AM, Gandalf Corvotempesta wrote:
> Hi to all
> Sorry if this question was already asked but I didn't find anything related
>
>
> AFAIK the MDS is a fundamental component of CephFS.
> What happens if an MDS crashes between replications from the active MDS to
> the slaves?
I'd recommend running through these steps and posting the output as well
http://docs.ceph.com/docs/master/rados/troubleshooting/memory-profiling/
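In case it saves a lookup, the steps there boil down to roughly the following;
osd.0 and the log path are only examples, and the pprof binary is sometimes
packaged as google-pprof:

    ceph tell osd.0 heap start_profiler   # start a tcmalloc heap profile (osd.0 is just an example id)
    ceph tell osd.0 heap dump             # write a .heap file into the OSD's log directory
    ceph tell osd.0 heap stats            # quick summary of current heap usage
    ceph tell osd.0 heap stop_profiler    # stop profiling once you have the dumps
    pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.0.profile.0001.heap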
Bob
On Sat, Apr 15, 2017 at 5:39 AM, Peter Maloney <
peter.malo...@brockmann-consult.de> wrote:
> How many PGs do you have? And did you change any config, like mds cache
> size? Show your ceph.conf.
Peter,
There are 624 PGs across 4 pools:
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 2505 flags hashpspool
stripe_width 0
removed_snaps [1~3]
pool 3 'fsdata' erasure size 14 min_size 11 crush_ruleset 3 object_hash
rjenkins
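For anyone who wants to pull the same view from their own cluster, a listing in
this format should come out of something like:

    ceph osd dump | grep '^pool'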
Thanks for the recommendation, Bob! I'll try to get this data later today
and reply with it.
-Aaron
On Sat, Apr 15, 2017 at 9:46 AM, Bob R wrote:
> I'd recommend running through these steps and posting the output as well
> http://docs.ceph.com/docs/master/rados/troubleshooting/memory-profiling/
On 15 Apr 2017 5:48 PM, "John Spray" wrote:
MDSs do not replicate to one another. They write all metadata to a
RADOS pool (i.e. to the OSDs), and when a failover happens, the new
active MDS reads the metadata in.
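(For anyone following along, the layout John describes can be inspected with
commands along these lines; the names will of course differ per cluster:)

    ceph fs ls      # lists each filesystem with its metadata pool and data pool(s)
    ceph mds stat   # shows which MDS is active and which are standby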
Is the MDS atomic? Is a successful ack sent only after the data is properly
written to the RADOS pool?
On Sat, Apr 15, 2017 at 7:19 PM, Gandalf Corvotempesta wrote:
> On 15 Apr 2017 5:48 PM, "John Spray" wrote:
>
> MDSs do not replicate to one another. They write all metadata to a
> RADOS pool (i.e. to the OSDs), and when a failover happens, the new
> active MDS reads the metadata in.
>
> Is the MDS atomic? Is a successful ack sent only after the data is properly
> written to the RADOS pool?