75
Min IOPS: 61
Average Latency(s): 0.227538
Stddev Latency(s): 0.0843661
Max latency(s): 0.48464
Min latency(s): 0.0467124
On 2020-01-06 20:44, Jelle de Jong wrote:
Hello everybody,
I have issues with very slow requests on a simple three-node cluster here,
Hi,
What are the full commands you used to setup this iptables config?
iptables --table raw --append OUTPUT --jump NOTRACK
iptables --table raw --append PREROUTING --jump NOTRACK
These alone do not create the same output; it needs some more rules.
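For reference, a NOTRACK setup for Ceph traffic usually also scopes the rules to the Ceph ports and covers both directions. The port range and exact rule set below are assumptions for illustration, not the poster's actual config:

```shell
# Hedged sketch: disable conntrack only for Ceph traffic.
# 6789:7300 covers the default mon and OSD port ranges; adjust to your cluster.
iptables --table raw --append PREROUTING --protocol tcp --dport 6789:7300 --jump NOTRACK
iptables --table raw --append OUTPUT --protocol tcp --sport 6789:7300 --jump NOTRACK
iptables --table raw --append OUTPUT --protocol tcp --dport 6789:7300 --jump NOTRACK
```

Scoping NOTRACK to specific ports (rather than all traffic) keeps conntrack working for everything else on the host.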
Kind regards,
Jelle de Jong
On 2019-07-17 14:59, Kees
Hello everybody,
I have issues with very slow requests on a simple three-node cluster here,
four WDC enterprise disks and Intel Optane NVMe journal on identical
high memory nodes, with 10GB networking.
It was working all good with Ceph Hammer on Debian Wheezy, but I wanted
to upgrade to a suppor
Hello everybody,
I got a three-node Ceph cluster made of E3-1220v3, 24GB RAM, 6 HDD OSDs
with 32GB Intel Optane NVMe journal, 10GB networking.
I wanted to move to bluestore due to dropping support of filestore, our
cluster was working fine with filestore and we could take complete nodes
out
Hello everybody,
[fix confusing typo]
I got a three-node Ceph cluster made of E3-1220v3, 24GB RAM, 6 HDD OSDs
with 32GB Intel Optane NVMe journal, 10GB networking.
I wanted to move to bluestore due to dropping support of filestore, our
cluster was working fine with filestore and we could tak
Hello everybody,
I got a three-node Ceph cluster made of E3-1220v3, 24GB RAM, 6 HDD OSDs
with 32GB Intel Optane NVMe journal, 10GB networking.
I wanted to move to bluestore due to dropping support of filestore, our
cluster was working fine with filestore and we could take complete nodes
out
t rule.
>
> On Thu, Nov 21, 2019 at 7:46 AM Alfredo De Luca
> wrote:
> >
> > Hi all.
> > We are doing some tests on how to scale out nodes on Ceph Nautilus.
> > Basically we want to try to install Ceph on one node and scale up to 2+
> nodes. How to do so?
>
Hi all.
We are doing some tests on how to scale out nodes on Ceph Nautilus.
Basically we want to try to install Ceph on one node and scale up to 2+
nodes. How to do so?
Every node has 6 disks; maybe we can use the crush map to achieve this?
Any thoughts/ideas/recommendations?
Cheers
Hi, does anyone have any feedback for me regarding this?
Here's the log I get when trying to restart the OSD via systemctl:
https://pastebin.com/tshuqsLP
On Mon, 4 Nov 2019 at 12:42, Eugene de Beste <eug...@sanbi.ac.za> wrote:
> Hi everyone
>
> I have a cluster that was
Hi everyone
I have a cluster that was initially set up with bad defaults in Luminous. After
upgrading to Nautilus I've had a few OSDs crash on me, due to errors seemingly
related to https://tracker.ceph.com/issues/42223 and
https://tracker.ceph.com/issues/22678.
One of my pools has been runnin
hi all,
maybe to clarify a bit, e.g.
https://indico.cern.ch/event/755842/contributions/3243386/attachments/1784159/2904041/2019-jcollet-openlab.pdf
clearly shows that the db+wal disks are not saturated,
but we are wondering what is really needed/acceptable wrt throughput and
latency (eg is a 6gbps
hi marc,
> - how to prevent the D state process to accumulate so much load?
you can't. In Linux, uninterruptible tasks themselves count as "load";
this does not mean you e.g. ran out of CPU resources.
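As an illustration of the point above (a sketch, not Ceph-specific): the Linux load average includes tasks in uninterruptible "D" state, which you can tally directly from /proc:

```python
import os

def dstate_tasks():
    """Count tasks in uninterruptible sleep ("D" state).

    On Linux these are included in the load average even when the CPUs
    are idle, which is why a stuck cephfs mount can drive "load" up
    without any CPU pressure.
    """
    if not os.path.isdir("/proc"):  # non-Linux fallback
        return 0
    count = 0
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open("/proc/%s/stat" % pid) as f:
                # the state field follows the parenthesised command name
                state = f.read().rsplit(") ", 1)[-1].split()[0]
        except (OSError, IndexError):
            continue  # process exited while we were scanning
        if state == "D":
            count += 1
    return count

print(dstate_tasks())
```

A burst of blocked-IO processes (e.g. hung cephfs mounts) shows up here long before any CPU metric moves.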
stijn
>
> Thanks,
>
>
>
>
>
Thanks for the answers, guys!
Am I right to assume msgr2 (http://docs.ceph.com/docs/mimic/dev/msgr2/)
will provide encryption between Ceph daemons as well as between clients and
daemons?
Does anybody know if it will be available in Nautilus?
On Fri, Jan 11, 2019 at 8:10 AM Tobias Florek wrote:
Hi everyone, I have some questions about encryption in Ceph.
1) Are RBD connections encrypted or is there an option to use encryption
between clients and Ceph? From reading the documentation, I have the
impression that the only option to guarantee encryption in transit is to
force clients to encry
Hi,
I have 3 machines with a Ceph config with CephFS. But I lost one machine, the
one with just the mon and mds. Is it possible to recover CephFS? If yes, how?
ceph: Ubuntu 16.04.5 (lost this machine)
- mon
- mds
- osd
ceph-osd-1: Ubuntu 16.04.5
- osd
ceph-osd-2: Ubuntu 16.04.5
- osd
[]´s
Maiko de Andrad
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Wed, May 16, 2018 at 5:15 PM Webert de Souza Lima
wrote:
> Thanks Jack.
>
> That's good to know. It is definitely something to consider.
> In a distributed storage scenario we might build a dedica
Perhaps using a cache tier pool?
The pool had 2 snaps. After removing those, the ls command returned no
'non-existing' objects. I expected that ls would only return objects of
the current contents, I did not specify -s for working with snaps of the
pool.
>
> John
>
>>
Hi Eugen.
Just tried everything again here by removing the /sda4 partitions and
letting it so that either salt-run proposal-populate or salt-run state.orch
ceph.stage.configure could try to find the free space on the partitions to
work with: unsuccessfully again. :(
Just to make things clear: are
z)
>
> If that doesn't reveal anything run stage.3 again and watch the logs.
>
> Regards,
> Eugen
>
>
> Zitat von Jones de Andrade :
>
> > Hi Eugen.
> >
> > Ok, edited the file /etc/salt/minion, uncommented the "log_level_logfile"
> > line an
cer
> > node01 is also the master, I was expecting it to have logs from the other
> > too) and, moreover, no ceph-osd* files. Also, I'm looking the logs I have
> > available, and nothing "shines out" (sorry for my poor english) as a
> > possible error.
>
> t
Regards,
> Eugen
>
> [1] https://forums.suse.com/forumdisplay.php?99-SUSE-Enterprise-Storage
>
> Zitat von Jones de Andrade :
>
> > Hi Eugen.
> >
> > Thanks for the suggestion. I'll look for the logs (since it's our first
> > attempt with ceph
ou in the right direction.
> Since the deployment stage fails at the OSD level, start with the OSD
> logs. Something's not right with the disks/partitions, did you wipe
> the partition from previous attempts?
>
> Regards,
> Eugen
>
> Zitat von Jones de Andrade :
>
>
(Please forgive my previous email: I was using another message and
completely forget to update the subject)
Hi all.
I'm new to ceph, and after having serious problems in ceph stages 0, 1 and
2 that I could solve myself, now it seems that I have hit a wall harder
than my head. :)
When I run salt-
Hi all.
I'm new to ceph, and after having serious problems in ceph stages 0, 1 and
2 that I could solve myself, now it seems that I have hit a wall harder
than my head. :)
When I run salt-run state.orch ceph.stage.deploy and monitor it, I see it
going up to here:
###
[14/71] ceph.sysctl on
nt, I can't restart it every time.
>
> On Wed, Aug 8, 2018 at 10:33 PM, Webert de Souza Lima wrote:
>
>> Hi Zhenshi,
>>
>> if you still have the client mount hanging but no session is connected,
>> you probably have some PID waiting with blocked IO from cephfs mount.
>>
g time.
> So I cannot get useful infomation from the command you provide.
>
> Thanks
>
> On Wed, Aug 8, 2018 at 10:10 PM, Webert de Souza Lima wrote:
>
>> You could also see open sessions at the MDS server by issuing `ceph
>> daemon mds.XX session ls`
>>
>> Regards,
>>
>>>
>>> >>> This is not a Ceph-specific thing -- it can also affect similar
>>> >>> systems like Lustre.
>>> >>>
>>> >>> The classic case is when under some memory pressure, the kernel tries
>>> >>> to f
p is_healthy 'OSD::osd_op_tp thread 0x7fdabd897700' had
> timed out after 90
>
>
>
> (I updated it to 90s instead of 15s)
>
>
>
> Regards,
>
>
>
>
>
>
>
> *From:* ceph-users *On behalf of*
> Webert de Souza Lima
> *Sent:* 07 August
he kernel client at this
> point, but that isn’t etched in stone.
> >
> > Curious if there is more to share.
> >
> > Reed
> >
> > On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima
> wrote:
> >
> >
> > On Tue, Aug 7, 2018 at 7:51 PM, Yan, Zheng wrote:
>
On Tue, Aug 7, 2018 at 7:51 PM, Yan, Zheng wrote:
> On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou wrote:
> this can cause memory deadlock. you should avoid doing this
>
> > On Tue, Aug 7, 2018 at 7:12 PM, Yan, Zheng wrote:
> >>
> >> did you mount cephfs on the same machines that run ceph-osd?
> >>
I didn't know about this. I ru
Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Tue, Aug 7, 2018 at 10:47 AM CUZA Frédéric wrote:
> Pool is already deleted and no longer present in stats.
>
>
>
> Regards,
>
>
>
> *From:* ceph-users *On behalf of*
> Webert de Souz
ole cluster keeps flapping, it is
> never the same OSDs that go down.
>
> Is there a way to get the progress of this recovery ? (The pool hat I
> deleted is no longer present (for a while now))
>
> In fact, there is a lot of i/o activity on the server where osds go down.
>
>
>
The pool deletion might have triggered a lot of IO operations on the disks,
and the processes might be too busy to respond to heartbeats, so the mons mark
them as down due to no response.
Check also the OSD logs to see if they are actually crashing and
restarting, and disk IO usage (e.g. with iostat).
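A hedged sketch of those checks; the log path and heartbeat message wording below are the usual defaults and may differ on your setup:

```shell
# Watch per-disk utilization; sustained %util near 100 and high await
# during the deletion window would support the "too busy" theory.
iostat -x 2 5

# Look for missed-heartbeat evidence in the OSD logs:
grep -i "heartbeat_check: no reply" /var/log/ceph/ceph-osd.*.log
```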
Rega
07 PM Alessandro De Salvo
wrote:
However, I cannot reduce the number of MDSes anymore; I used to do
that with e.g.:
ceph fs set cephfs max_mds 1
Trying this with 12.2.6 has apparently no effect, I am left with 2
active mdses. Is this another bug?
Are you following this procedure?
http://docs.ceph.com
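For what it's worth, on Luminous lowering max_mds did not stop extra active ranks by itself; the extra rank had to be deactivated explicitly. A sketch, assuming the filesystem has ranks 0 and 1:

```shell
ceph fs set cephfs max_mds 1
# On Luminous the surplus rank must then be deactivated by hand
# (this step became automatic in later releases):
ceph mds deactivate 1
ceph fs status cephfs   # verify only rank 0 remains active
```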
, Jul 12, 2018 at 11:39 PM Alessandro De Salvo
wrote:
Some progress, and more pain...
I was able to recover the 200. using the ceph-objectstore-tool for one
of the OSDs (all identical copies) but trying to re-inject it just with rados
put was giving no error while the get was still
5) Input/output
error)
Can I safely try to do the same as for object 200.? Should I
check something before trying it? Again, checking the copies of the
object, they have identical md5sums on all the replicas.
Thanks,
Alessandro
On 12/07/18 16:46, Alessandro De Salvo wrote:
howed up when
trying to read an object,
but not on scrubbing, that magically disappeared after restarting the
OSD.
However, in my case it was clearly related to
https://tracker.ceph.com/issues/22464 which doesn't
seem to be the issue here.
Paul
2018-07-12 13:53 GMT+02:00 Alessandr
On 12/07/18 11:20, Alessandro De Salvo wrote:
On 12/07/18 10:58, Dan van der Ster wrote:
On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum
wrote:
On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo
wrote:
OK, I found where the object is:
ceph osd map cephfs_metadata
On 12/07/18 10:58, Dan van der Ster wrote:
On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum wrote:
On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo
wrote:
OK, I found where the object is:
ceph osd map cephfs_metadata 200.
osdmap e632418 pool 'cephfs_metadata' (
> On 11 Jul 2018, at 23:25, Gregory Farnum
> wrote:
>
>> On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo
>> wrote:
>> OK, I found where the object is:
>>
>>
>> ceph osd map cephfs_metadata 200.
>>
Cheers!
Thanks for all the backports and fixes.
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Wed, Jul 11, 2018 at 1:46 PM Abhishek Lekshmanan
wrote:
>
> We're glad to announce v10.2.11 release of the Jewel stable release
> series.
controllers, but 2 of
the OSDs with 10.14 are on a SAN system and one on a different one, so I
would tend to exclude they both had (silent) errors at the same time.
Thanks,
Alessandro
On 11/07/18 18:56, John Spray wrote:
On Wed, Jul 11, 2018 at 4:49 PM Alessandro De Salvo
wrote:
, 2018 at 4:10 PM Alessandro De Salvo
wrote:
Hi,
after the upgrade to luminous 12.2.6 today, all our MDSes have been
marked as damaged. Trying to restart the instances only result in
standby MDSes. We currently have 2 filesystems active and 2 MDSes each.
I found the following error messages in the
e damage before
issuing the "repaired" command?
What is the history of the filesystems on this cluster?
On Wed, Jul 11, 2018 at 8:10 AM Alessandro De Salvo
<mailto:alessandro.desa...@roma1.infn.it>> wrote:
Hi,
after the upgrade to luminous 12.2.6 today, all our MDS
Hi,
after the upgrade to luminous 12.2.6 today, all our MDSes have been
marked as damaged. Trying to restart the instances only result in
standby MDSes. We currently have 2 filesystems active and 2 MDSes each.
I found the following error messages in the mon:
mds.0 :6800/2412911269 down:dama
Hi all.
I'm looking for some information on several distributed filesystems for our
application.
It looks like it finally came down to two candidates, Ceph being one of
them. But there are still a few questions about it that I would really like
to clarify, if possible.
Our plan, initially on 6 w
Bluestore doesn't have a journal like filestore does, but there is the
WAL (write-ahead log), which looks like a journal but works differently.
You can (or must, depending on your needs) have SSDs to serve this WAL (and
RocksDB).
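As a sketch of that layout (device names are placeholders, not a recommendation):

```shell
# Create a bluestore OSD with data on an HDD and the RocksDB/WAL
# carved out on a faster NVMe device:
ceph-volume lvm create --bluestore \
    --data /dev/sdb \
    --block.db /dev/nvme0n1p1 \
    --block.wal /dev/nvme0n1p2
```

If the WAL lives on the same fast device as the DB, a separate --block.wal is unnecessary.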
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*
I
> don't run FreeBSD, but any particular issue you are seeing?
>
> On Tue, Jun 26, 2018 at 6:06 PM Frank de Bot (lists) <mailto:li...@searchy.net>> wrote:
>
> Hi,
>
> In my test setup I have a ceph iscsi gateway (configured as in
> http://docs
Hi,
In my test setup I have a ceph iscsi gateway (configured as in
http://docs.ceph.com/docs/luminous/rbd/iscsi-overview/ )
I would like to use this with a FreeBSD (11.1) initiator, but I fail to
make a working setup in FreeBSD. Is it known if the FreeBSD initiator
(with gmultipath) can work with
I want to try using NUMA to also run KVM guests besides the OSD. I
should have enough cores and only have a few osd processes.
Kind regards,
Jelle de Jong
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph
further testing, all slow requests are blocked by OSDs on
a single host. How can I debug this problem further? I can't find any
errors or other strange things on the host with osd's that are seemingly
not sending a response to an op.
Regards,
Frank de Bot
Keep in mind that the mds server is cpu-bound, so during heavy workloads it
will eat up CPU usage, so the OSD daemons can affect or be affected by the
MDS daemon.
But it does work well. We've been running a few clusters with MON, MDS and
OSDs sharing the same hosts for a couple of years now.
Regar
Hi,
On 14/06/18 06:13, Yan, Zheng wrote:
On Wed, Jun 13, 2018 at 9:35 PM Alessandro De Salvo
wrote:
Hi,
On 13/06/18 14:40, Yan, Zheng wrote:
On Wed, Jun 13, 2018 at 7:06 PM Alessandro De Salvo
wrote:
Hi,
I'm trying to migrate a cephfs data pool to a different one in ord
n’t. The backtrace does
> create another object but IIRC it’s a maximum one IO per create/rename (on
> the file).
> On Wed, Jun 13, 2018 at 1:12 PM Webert de Souza Lima <
> webert.b...@gmail.com> wrote:
>
>> Thanks for clarifying that, Gregory.
>>
>> As said bef
pool isn’t
> available you would stack up pending RADOS writes inside of your mds but
> the rest of the system would continue unless you manage to run the mds out
> of memory.
> -Greg
> On Wed, Jun 13, 2018 at 9:25 AM Webert de Souza Lima <
> webert.b...@gmail.com> wrote:
>
>>
Hi,
On 13/06/18 14:40, Yan, Zheng wrote:
On Wed, Jun 13, 2018 at 7:06 PM Alessandro De Salvo
wrote:
Hi,
I'm trying to migrate a cephfs data pool to a different one in order to
reconfigure with new pool parameters. I've found some hints but no
specific documentation to mig
I think in this scenario the overhead may be acceptable for us.
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Wed, Jun 13, 2018 at 9:51 AM Yan, Zheng wrote:
> On Wed, Jun 13, 2018 at 3:34 AM Webert de Souza Lima
> wrote:
>
Hi,
I'm trying to migrate a cephfs data pool to a different one in order to
reconfigure with new pool parameters. I've found some hints but no
specific documentation to migrate pools.
I'm currently trying with rados export + import, but I get errors like
these:
Write #-9223372036854775808:
hello,
is there any performance impact on cephfs for using file layouts to bind a
specific directory in cephfs to a given pool? Of course, such pool is not
the default data pool for this cephfs.
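For context, this is how such a binding is done; the pool name and mount point below are placeholders:

```shell
# Make the pool usable by the filesystem, then pin a directory to it;
# only files created after the change land in the new pool.
ceph fs add_data_pool cephfs archive_pool
setfattr -n ceph.dir.layout.pool -v archive_pool /mnt/cephfs/archive
getfattr -n ceph.dir.layout /mnt/cephfs/archive   # verify the layout
```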
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRL
Hi Daniel,
Thanks for clarifying.
I'll have a look at dirfrag option.
Regards,
Webert Lima
On Sat, May 19, 2018 at 01:18, Daniel Baumann
wrote:
> On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> > New question: will it make any difference in the balancing if instead
Hi Patrick
On Fri, May 18, 2018 at 6:20 PM Patrick Donnelly
wrote:
> Each MDS may have multiple subtrees they are authoritative for. Each
> MDS may also replicate metadata from another MDS as a form of load
> balancing.
Ok, its good to know that it actually does some load balance. Thanks.
New
Hi,
We're migrating from a Jewel / filestore based cephfs archicture to a
Luminous / buestore based one.
One MUST HAVE is multiple Active MDS daemons. I'm still lacking knowledge
of how it actually works.
After reading the docs and ML we learned that they work by sort of dividing
the responsibili
Hello,
On Mon, Apr 30, 2018 at 7:16 AM Daniel Baumann
wrote:
> additionally: if rank 0 is lost, the whole FS stands still (no new
> client can mount the fs; no existing client can change a directory, etc.).
>
> my guess is that the root of a cephfs (/; which is always served by rank
> 0) is nee
*IRC NICK - WebertRLZ*
On Wed, May 16, 2018 at 4:45 PM Jack wrote:
> On 05/16/2018 09:35 PM, Webert de Souza Lima wrote:
> > We'll soon do benchmarks of sdbox vs mdbox over cephfs with bluestore
> > backend.
> > We'll have to do some work on how to simulat
ction, but you can try it to run a POC.
>
> For more information check out my slides from Ceph Day London 2018:
> https://dalgaaf.github.io/cephday-london2018-emailstorage/#/cover-page
>
> The project can be found on github:
> https://github.com/ceph-dovecot/
>
> -Danny
>
and will help you a lot:
> - Compression (classic, https://wiki.dovecot.org/Plugins/Zlib)
> - Single-Instance-Storage (aka sis, aka "attachment deduplication" :
> https://www.dovecot.org/list/dovecot/2013-December/094276.html)
>
> Regards,
> On 05/16/2018 08:37 PM, Webert d
I'm sending this message to both dovecot and ceph-users ML so please don't
mind if something seems too obvious for you.
Hi,
I have a question for both dovecot and ceph lists and below I'll explain
what's going on.
Regarding dbox format (https://wiki2.dovecot.org/MailboxFormat/dbox), when
using s
0",
> "osd_op_num_threads_per_shard_hdd": "1",
> "osd_op_num_threads_per_shard_ssd": "2",
> "osd_op_thread_suicide_timeout": "150",
> "osd_op_thread_timeout": "15",
> "os
On Sat, May 12, 2018 at 3:11 AM Alexandre DERUMIER
wrote:
> The documentation (luminous) say:
>
> >mds cache size
> >
> >Description:The number of inodes to cache. A value of 0 indicates an
> unlimited number. It is recommended to use mds_cache_memory_limit to limit
> the amount of memory t
11, 2018 at 2:39 PM Webert de Souza Lima <
> webert.b...@gmail.com> wrote:
>
>> I think ceph doesn't have IO metrics with filters by pool, right? I see IO
>> metrics from clients only:
>>
>> ceph_client_io_ops
>> ceph_client_io_read_bytes
>> ceph_cli
(write/read)_bytes(_total)
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Wed, May 9, 2018 at 2:23 PM Webert de Souza Lima
wrote:
> Hey Jon!
>
> On Wed, May 9, 2018 at 12:11 PM, John Spray wrote:
>
>> It depends
This message seems to be very concerning:
>mds0: Metadata damage detected
but for the rest, the cluster seems still to be recovering. You could try
to speed things up with ceph tell, like:
ceph tell osd.* injectargs --osd_max_backfills=10
ceph tell osd.* injectargs --osd_recovery_sleep
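A hedged sketch of that tuning; the values are illustrative only (the truncated sleep value above is not known) and should be reverted once recovery completes:

```shell
# Temporarily prioritise recovery over client IO; quote the args so the
# shell doesn't eat them, and dial back afterwards.
ceph tell osd.* injectargs '--osd_max_backfills=10'
ceph tell osd.* injectargs '--osd_recovery_sleep=0'
ceph tell osd.* injectargs '--osd_recovery_max_active=8'
```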
"imported": 0,
"imported_inodes": 0
}
}
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Fri, May 11, 2018 at 3:13 PM Alexandre DERUMIER
wrote:
> Hi,
>
> I'm still seeing memory leak with 12
Basically what we're trying to figure out looks like what is being done
here:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020958.html
But instead of using LIBRADOS to store EMAILs directly into RADOS we're
still using CEPHFS for it, just figuring out if it makes sense to sep
Hey Jon!
On Wed, May 9, 2018 at 12:11 PM, John Spray wrote:
> It depends on the metadata intensity of your workload. It might be
> quite interesting to gather some drive stats on how many IOPS are
> currently hitting your metadata pool over a week of normal activity.
>
Any ceph built-in tool f
I'm sorry, I have mixed up some information. The actual ratio I have now
is 0.0005% (*100MB for 20TB data*).
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Wed, May 9, 2018 at 11:32 AM, Webert de Souza Lima wrote:
&g
Hello,
Currently, I run Jewel + Filestore for cephfs, with SSD-only pools used for
cephfs-metadata, and HDD-only pools for cephfs-data. The current
metadata/data ratio is something like 0.25% (50GB metadata for 20TB data).
Regarding bluestore architecture, assuming I have:
- SSDs for WAL+DB
-
I'd also try to boot up only one mds until it's fully up and running. Not
both of them.
Sometimes they keep switching states with each other.
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Thu, Mar 29, 2018 at 7:32 AM, John Spray wro
hi,
can you give some more details on the setup? number and size of OSDs.
are you using EC or not? and if so, what EC parameters?
thanks,
stijn
On 02/26/2018 08:15 AM, Linh Vu wrote:
> Sounds like you just need more RAM on your MDS. Ours have 256GB each, and the
> OSD nodes have 128GB each. Ne
hi oliver,
>>> in preparation for production, we have run very successful tests with large
>>> sequential data,
>>> and just now a stress-test creating many small files on CephFS.
>>>
>>> We use a replicated metadata pool (4 SSDs, 4 replicas) and a data pool with
>>> 6 hosts with 32 OSDs each,
hi oliver,
the IPoIB network is not 56gb, it's probably a lot less (20gb or so).
The ib_write_bw test is verbs/RDMA based. Do you have iperf tests
between hosts, and if so, can you share those results?
stijn
> we are just getting started with our first Ceph cluster (Luminous 12.2.2) and
> doing
itto:
On Tue, Jan 30, 2018 at 5:49 AM Alessandro De Salvo
<mailto:alessandro.desa...@roma1.infn.it>> wrote:
Hi,
we have several times a day different OSDs running Luminous 12.2.2 and
Bluestore crashing with errors like this:
starting osd.2 at - osd_data /var/lib/ceph
Hi,
we have several times a day different OSDs running Luminous 12.2.2 and
Bluestore crashing with errors like this:
starting osd.2 at - osd_data /var/lib/ceph/osd/ceph-2
/var/lib/ceph/osd/ceph-2/journal
2018-01-30 13:45:28.440883 7f1e193cbd00 -1 osd.2 107082 log_to_monitors
{default=true}
Hi,
On Fri, Jan 19, 2018 at 8:31 PM, zhangbingyin
wrote:
> 'MAX AVAIL' in the 'ceph df' output represents the amount of data that can
> be used before the first OSD becomes full, and not the sum of all free
> space across a set of OSDs.
>
Thank you very much. I figured this out by the end of t
available space.
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Thu, Jan 18, 2018 at 8:21 PM, Webert de Souza Lima wrote:
> With the help of robbat2 and llua on IRC channel I was able to solve this
> situation by taking down the 2-OS
With the help of robbat2 and llua on IRC channel I was able to solve this
situation by taking down the 2-OSD only hosts.
After crush reweighting OSDs 8 and 23 from host mia1-master-fe02 to 0, ceph
df showed the expected storage capacity usage (about 70%).
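The reweighting described can be sketched as follows (osd ids 8 and 23 come from the message; the zero weight drains them):

```shell
# Move data off the two-OSD host by zeroing its OSDs' CRUSH weights:
ceph osd crush reweight osd.8 0
ceph osd crush reweight osd.23 0
ceph df   # capacity should now reflect only the remaining hosts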
With this in mind, those guys have told me
WebertRLZ*
On Thu, Jan 18, 2018 at 8:05 PM, David Turner wrote:
> `ceph osd df` is a good command for you to see what's going on. Compare
> the osd numbers with `ceph osd tree`.
>
>
>>
>> On Thu, Jan 18, 2018 at 3:34 PM Webert de Souza Lima <
>>
Sorry I forgot, this is a ceph jewel 10.2.10
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
Also, there is no quota set for the pools
Here is "ceph osd pool get xxx all": http://termbin.com/ix0n
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
Hello,
I'm running a nearly out-of-service radosgw (very slow to write new objects)
and I suspect it's because ceph df is showing 100% usage in some pools,
though I don't know where that information comes from.
Pools:
#~ ceph osd pool ls detail -> http://termbin.com/lsd0
Crush Rules (important is
On 01/08/2018 05:40 PM, Alessandro De Salvo wrote:
> > Thanks Lincoln,
> >
> > indeed, as I said the cluster is recovering, so there are pending ops:
> >
> >
> > pgs: 21.034% pgs not active
> > 1692310/24980804 objects degraded (6.7
Good to know. I don't think this should trigger HEALTH_ERR though, but
HEALTH_WARN makes sense.
It makes sense to keep the backfillfull_ratio greater than nearfull_ratio
as one might need backfilling to avoid OSD getting full on reweight
operations.
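The ordering discussed above (nearfull < backfillfull < full) can be checked and adjusted at runtime; the values below are the common defaults, stated as an assumption:

```shell
ceph osd dump | grep ratio          # show current full/backfillfull/nearfull
ceph osd set-nearfull-ratio 0.85
ceph osd set-backfillfull-ratio 0.90
ceph osd set-full-ratio 0.95
```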
Regards,
Webert Lima
DevOps Engineer at MAV Te
On Wed, Jan 10, 2018 at 12:44 PM, Mark Schouten wrote:
> > Thanks, that's a good suggestion. Just one question, will this affect
> RBD-
> > access from the same (client)host?
I'm sorry that this didn't help. No, it does not affect RBD clients, as the MDS
is related only to CephFS.
Regards,
Webert
now the simple solution is to just reboot the server, but the server
> holds
> quite a lot of VM's and Containers, so I'd prefer to fix this without a
> reboot.
>
> Anybody with some clever ideas? :)
>
> --
> Kerio Operator in de Cloud? https://www.kerioindeclo
Mon, 2018-01-08 at 17:21 +0100, Alessandro De Salvo wrote:
Hi,
I'm running on ceph luminous 12.2.2 and my cephfs suddenly degraded.
I have 2 active mds instances and 1 standby. All the active
instances
are now in replay state and show the same error in the logs:
mds1
2018-01-08 1
Hi,
I'm running on ceph luminous 12.2.2 and my cephfs suddenly degraded.
I have 2 active mds instances and 1 standby. All the active instances
are now in replay state and show the same error in the logs:
mds1
2018-01-08 16:04:15.765637 7fc2e92451c0 0 ceph version 12.2.2
(cf0baee
- will affect
>> librbd performance in the hypervisors.
>>
>> Does anybody have some information about how Meltdown or Spectre affect ceph
>> OSDs and clients?
>>
>> Also, regarding Meltdown patch, seems to be a compilation option, meaning
>> you coul
Hello all,
I have ceph Luminous setup with filestore and bluestore OSDs. This cluster
was deployed initially as Hammer, then I upgraded it to Jewel and
eventually to Luminous. It's heterogeneous; we have SSDs, SAS 15K and 7.2K
HDDs in it (see crush map attached). Earlier I converted 7.2K HDD from
f
On Thu, Dec 21, 2017 at 12:52 PM, shadow_lin wrote:
>
> After 18:00 suddenly the write throughput dropped and the osd latency
> increased. TCMalloc started reclaiming its page heap freelist much more
> frequently. All of this happened very fast and every osd had the identical
> pattern.
>
Could that be c
It depends on how you use it. For me, it runs fine on the OSD hosts, but the
MDS server consumes loads of RAM, so be aware of that.
If the system load average goes too high due to OSD disk utilization, the
MDS server might run into trouble too, as a delayed response from the host
could cause the MDS t