"sudo start ceph-osd-all" isn't working well and doesn't like the idea
of "sudo start ceph-osd id=1" for each OSD in rc file.
This needs to work for both Hammer (Ubuntu 14.04) and Luminous (Ubuntu 16.04).
Pardhiv Karri
We have a large 1PB Ceph cluster. We recently added 6 nodes with 16 2TB
disks each to the cluster. The first 5 nodes rebalanced well without any
issues, but the OSDs on the sixth (last) node started acting weird: when I
increase the weight of one OSD, its utilization doesn't change, but a different OSD on the
use the straw2 data placement algorithm:
> http://docs.ceph.com/docs/master/rados/operations/crush-map/#hammer-crush-v4
> That should help as well. Once that's enabled you can convert your existing
> buckets to straw2 as well. Just be careful you don't have any
Pardhiv K
On Fri, May 11, 2018 at 7:14 PM, David Turner wrote:
> What's your `ceph osd tree`, `ceph df`, `ceph osd df`? You sound like you
> just have a fairly full cluster that you haven't balanced the crush weights
> on.
> On Fri, May 11, 2018, 10:
STDDEV: 8.26
Pardhiv karri
ceph-users mailing list
Our Ceph cluster has 12 pools, and only 3 pools are really used. How can I
see the number of PGs on an OSD and which PGs belong to which pool on that OSD?
Something like below,
OSD 0 = 1000PGs (500PGs belong to PoolA, 200PGs belong to PoolB, 300PGs
belong to PoolC)
Pardhiv Karri
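A minimal sketch of the breakdown asked for above, in Python. It assumes you have already collected a list of (PG id, OSD list) pairs, for example from a PG dump; the function names and the input shape are hypothetical, not part of any Ceph tool. It relies only on the fact that a PG id has the form <pool id>.<hex>, so the pool id is the part before the dot.

```python
from collections import defaultdict


def pgs_per_osd_per_pool(pg_map):
    """Count PGs per OSD, broken down by pool id.

    pg_map: iterable of (pgid, osd_ids) pairs, e.g. ("13.110c", [490, 16, 120]).
    Returns {osd_id: {pool_id: pg_count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for pgid, osd_ids in pg_map:
        # The pool id is everything before the first dot in the PG id.
        pool_id = int(pgid.split(".", 1)[0])
        for osd in osd_ids:
            counts[osd][pool_id] += 1
    return {osd: dict(pools) for osd, pools in counts.items()}


def format_osd_line(osd, pools):
    """Render one OSD's counts in the 'OSD 0 = N PGs (...)' style."""
    total = sum(pools.values())
    detail = ", ".join(f"{n} PGs in pool {p}" for p, n in sorted(pools.items()))
    return f"OSD {osd} = {total} PGs ({detail})"
```

Pool ids can then be mapped to pool names separately; an OSD that never appears in the result holds no PGs at all.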
On Tue, May 22, 2018 at 5:01 AM, David Turner wrote:
> What are your `ceph osd tree` and `ceph status` as well?
> On Tue, May 22, 2018, 3:05 AM Pardhiv Karri wrote:
>> Hi,
>> We are using Ceph Hammer 0.94.9. Some of our OSDs never get an
This is exactly what I'm looking for. Tested it in our lab and it works
great. Thank you, Caspar!
Pardhiv Karri
On Tue, May 22, 2018 at 3:42 AM, Caspar Smit wrote:
> Here you go:
> ps. You might have to map your poolnames to pool ids
> http://cephnotes.ksperis
gp_num: 64
= Working on pool: compute pg_num: 512 pgp_num: 512
= Working on pool: volumes pg_num: 1024 pgp_num: 1024
= Working on pool: images pg_num: 128 pgp_num: 128
root@or1010051251044:~#
Pardhiv Karri
On Tue, May 22, 2018 at 9:16 AM, David Tur
Hi David,
We are using tree algorithm.
Pardhiv Karri
On Tue, May 22, 2018 at 9:42 AM, David Turner wrote:
> Your PG counts per pool per OSD don't show any PGs on osd.38. That
> definitely matches what you're seeing, but I've never seen this happen
> before. The
ruleset 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type rack
step emit
# end crush map
Pardhiv Karri
On Tue, May 22, 2018 at 9:58 AM, Pardhiv Karri
> Hi David,
> We are using tree algorithm.
> Thanks,
We finally figured out that this was happening because of an unbalanced rack
structure. When we moved the host/OSDs to another rack, they worked just fine.
We have now balanced the racks by moving hosts; some rebalancing happened due
to that, but everything is fine now.
Pardhiv Karri
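Since the rule quoted earlier places replicas with "step chooseleaf firstn 0 type rack", the total weight in each rack matters. Below is a rough sketch of a balance check in Python, assuming you have the crush weight of each host and a host-to-rack mapping at hand; the function names, host names, and weights are hypothetical.

```python
def rack_weights(host_weights, host_rack):
    """Sum host crush weights per rack.

    host_weights: {host: crush_weight}
    host_rack:    {host: rack_name}
    """
    totals = {}
    for host, weight in host_weights.items():
        rack = host_rack[host]
        totals[rack] = totals.get(rack, 0.0) + weight
    return totals


def rack_imbalance(totals):
    """Return heaviest rack weight / lightest rack weight (1.0 is balanced)."""
    lightest = min(totals.values())
    heaviest = max(totals.values())
    if lightest == 0:
        return float("inf")
    return heaviest / lightest


# Hypothetical layout: two hosts in rack1, one host in rack2.
totals = rack_weights(
    {"host1": 32.0, "host2": 32.0, "host3": 32.0},
    {"host1": "rack1", "host2": "rack1", "host3": "rack2"},
)
print(totals, rack_imbalance(totals))
```

A ratio well above 1.0 suggests that moving hosts between racks is worth considering before adjusting individual OSD weights.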
On Tue, May 22
Bluestore-Luminous all SSD. Due to the cost of
SSDs, we want to know whether 2 replicas are good enough or whether we still need 3.
Pardhiv Karri
Thank you, Linh, for the info. I have started reading about this solution. It could
mean a lot of cost savings, though I need to check the limitations. I am not sure
how it works with OpenStack as a front end to Ceph with erasure-coded pools
in Luminous.
Pardhiv Karri
On Thu, May 24, 2018 at 6:39 PM
Is anyone using OpenStack with Ceph erasure-coded pools, now that they
support RBD in Luminous? If so, how is the performance?
Pardhiv Karri
Thank you, Andrew and Jason for replying.
Do you have a sample ceph config file that you can share which works with
RBD and EC pools?
Pardhiv Karri
On Thu, Jun 7, 2018 at 9:08 AM, Jason Dillaman wrote:
> On Thu, Jun 7, 2018 at 11:54 AM, Andrew Denton
> wrote:
> >
args '--osd-recovery-op-priority 20'
--Pardhiv Karri
On Thu, Jun 7, 2018 at 2:23 PM, Paul Emmerich
> Hi,
> the "osd_recovery_sleep_hdd/ssd" options are way better to fine-tune the
> impact of a backfill operation in this case.
> Paul
> 2018-0
using it in production.
Script Name: osd_crush_reweight.py
Config File Name: rebalance_config.ini
Script: https://jpst.it/1gwrk
Config File: https://jpst.it/1gwsh
--Pardhiv Karri
On Fri, Jun 8, 2018 at 12:20 AM, mj wrote:
> Hi Pardhiv,
> On 06/08/2018 05:07 AM, Pardhiv Ka
pool and reduce slow storage pool it will be easier in migration
and also currently works for our budget in getting ceph faster for heavy
I also looked at storage tiering but that won't be of much help as the
usage cannot be combined between storage tiers.
I am playing with Ceph Luminous and getting confusing information about the
usage of WAL vs. RocksDB.
I have a 2TB NVMe drive which I want to use for WAL/RocksDB, and 5 2TB
SSDs for OSDs.
I am planning to create 5 30GB partitions for RocksDB on the NVMe drive. Do I
need to create partitions of Wa
case IMO it's optimal to have merged WAL+DB at NVME and data at
> SSD. Hence no need for separate WAL volume.
> Regards,
> Igor
> On 6/26/2018 10:22 PM, Pardhiv Karri wrote:
> Hi,
> I am playing with Ceph Luminous and getting confused information ar
disk space unused as some
OSDs are above 87% and some are below 50%. If the above 87% OSDs reach 95%
then the cluster will have issues. What is the best way to mitigate this
*Pardhiv Karri*
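A small sketch in Python of how the imbalance described above could be quantified before choosing a fix. It assumes a mapping of OSD id to percent used has already been collected (for example from `ceph osd df`); the function names and the sample values are hypothetical, while the 87%, 50%, and 95% thresholds come from the message.

```python
def flag_unbalanced(utilization, high=87.0, low=50.0):
    """Split OSDs into overfull and underfull lists.

    utilization: {osd_id: percent_used}
    Returns (overfull_ids, underfull_ids), each sorted.
    """
    overfull = sorted(o for o, used in utilization.items() if used > high)
    underfull = sorted(o for o, used in utilization.items() if used < low)
    return overfull, underfull


def headroom_to_full(utilization, full_ratio=95.0):
    """Percentage points left before the fullest OSD reaches full_ratio."""
    return full_ratio - max(utilization.values())


# Hypothetical sample: one OSD above 87%, one below 50%.
sample = {0: 88.0, 1: 49.0, 2: 70.0}
print(flag_unbalanced(sample), headroom_to_full(sample))
```

The overfull list is where weight would come down and the underfull list is where it could go up; the headroom figure shows how urgent the rebalance is.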
Thank you, Dyweni, for the quick response. We have 2 Hammer clusters, which are due for
an upgrade to Luminous next month, and 1 Luminous 12.2.8 cluster. We will try this on
Luminous and, if it works, apply the same once the Hammer clusters
are upgraded rather than adjusting the weights.
Pardhiv Karri
now we are actively
rebooting the nodes in a timely manner to avoid crashes. On one R740xd node we
set all the OSDs to 0.0, and there is no memory leak there. Any pointers to
fix the issue would be helpful.
*Pardhiv Karri*
Thank You for the quick response Dyweni!
We are using FileStore, as this cluster was upgraded from
Hammer-->Jewel-->Luminous 12.2.8. 16x2TB HDD per node for all nodes. R730xd
has 128GB and R740xd has 96GB of RAM. Everything else is the same.
Pardhiv Karri
On Fri, Dec 21, 2018 at 1
ze of the
mon store to 32GB or something to avoid putting Ceph health into a warning
state due to the mon store growing too big?
*Pardhiv Karri*
" to get
it back earlier this week. Currently the mon store is around 12G on each
monitor. If it doesn't grow then I won't change the value but if it grows
and gives the warning then I will increase it using "mon_data_size_warn".
Pardhiv Karri
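For reference, a tiny Python sketch of the arithmetic behind raising "mon_data_size_warn" to the 32GB mentioned earlier. It assumes the option takes a value in bytes and that 32GB is meant as 32 GiB; the helper name is hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def mon_data_size_warn_line(gib):
    """Build a config line for the given threshold in GiB (assumed to be bytes)."""
    return f"mon_data_size_warn = {gib * GIB}"


print(mon_data_size_warn_line(32))
```

The resulting line would go into the monitor section of the config, or the same byte value could be set at runtime.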
On Mon, Jan 7, 2019 at
quests; recovery 12/240361653 objects degraded (0.000%);
recovery 151527/240361653 objects misplaced (0.063%)
pg 13.110c is stuck inactive since forever, current state incomplete, last
acting [490,16,120]
pg 7.9b7 is stuck inactive since forever, current state incomplete, last
acting [492,680,265]
id it?
*Pardhiv Karri*
"Rise and Rise again until LAMBS become LIONS"
Pardhiv Karri
mon.sh1ora1301 mon.1 295 :
cluster [INF] mon.sh1ora1301 calling monitor election
2019-04-02 00:52:39.810572 mon.sh1ora1301 mon.1 296 :
cluster [INF] mon.sh1ora1301 is new leader, mons sh1ora1301,sh1ora1302 in
quorum (ranks 1,2)
Pardhiv Karri
On Mon, Apr
um": 619
"client": {
"group": {
"features": "0x81dff8eeacfffb",
"release": "hammer",
"num": 3316
"group": {
Hi Paul,
All the underlying compute nodes Ceph packages were upgraded already but
the instances were not. So are you saying that live-migrate will get them
Pardhiv Karri
On Tue, Jun 18, 2019 at 7:34 AM Paul Emmerich
> You can live-migrate VMs to a server w
When I set the crush weight of two OSDs to zero in a cluster with 575 OSDs,
with all flags set to prevent rebalancing, there is an insane spike in client IO
and bandwidth for a few seconds. Then, when the flags are removed, there are too many
slow requests every few seconds. Does anyone know why this happens? Is it
We upgraded our Ceph cluster from Hammer to Luminous and it is running
fine. Post upgrade we live migrated all our OpenStack instances (not 100%
sure). Currently we see 1658 clients still on the Hammer version. To track the
clients we increased the debug levels to debug_mon=10/10, debug_ms=1/5,
36 matches