Hi
I'm about to change some SATA SSD disks to NVMe disks, and for Ceph I too
would like to know how to assign space. I have 3 1TB SATA OSDs, so I'll
split the NVMe disks into 3 partitions of equal size. I'm not going to
assign a separate WAL partition because, if the docs are right, the WAL
is a
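As a rough sketch of what I have in mind with ceph-volume (device names below are placeholders, not my actual layout):
# one DB partition per OSD on the NVMe; with no --block.wal given, the WAL
# should end up on the same partition as the DB
ceph-volume lvm create --bluestore --data /dev/sda --block.db /dev/nvme0n1p1
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p2
ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p3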
Hello,
I have been trying cephfs on the latest 12.x release.
Performance under CephFS mounted via the kernel client seems to be as expected,
maxing out the underlying storage / resources using kernel version 4.13.4.
However, when it comes to mounting CephFS via ceph-fuse, I'm looking at
performance of 5-10% for
Hi Nico,
I'm sorry I forgot about your issue. Crazy few weeks.
I checked the log you initially sent to the list, but it only contains
the log from one of the monitors, and it's from the one synchronizing.
This monitor is not stuck however - synchronizing is progressing, albeit
slowly.
Can y
Hello Joao,
thanks for coming back!
I copied the log of the crashing monitor to
http://www.nico.schottelius.org/cephmonlog-2017-10-08-v2.xz
Can I somehow get access to the logs of the other monitors, without
restarting them?
I would like to not stop them, as currently we are running with 2/3
Hi All,
I've converted 2 nodes with 4 HDD/OSDs each from Filestore to Bluestore. I
expected somewhat higher memory usage/RSS values; however, I see, imo,
huge memory usage for all OSDs on both nodes.
Small snippet from `top`
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
C
On 10/18/2017 10:38 AM, Nico Schottelius wrote:
Hello Joao,
thanks for coming back!
I copied the log of the crashing monitor to
http://www.nico.schottelius.org/cephmonlog-2017-10-08-v2.xz
Can I somehow get access to the logs of the other monitors, without
restarting them?
If you mean incre
Finally I've disabled the mon_osd_report_timeout option and it seems to work
fine.
Greetings!
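For reference, in case it helps someone else: the option can also be changed at runtime instead of via ceph.conf; 900 is the documented default, not necessarily the value I ended up with:
ceph tell mon.* injectargs '--mon_osd_report_timeout=900'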
2017-10-17 19:02 GMT+02:00 Daniel Carrasco :
> Thanks!!
>
> I'll take a look later.
>
> Anyway, all my Ceph daemons are in same version on all nodes (I've
> upgraded the whole cluster).
>
> Cheers!!
>
>
On 10/18/2017 10:38 AM, Nico Schottelius wrote:
Hello Joao,
thanks for coming back!
I copied the log of the crashing monitor to
http://www.nico.schottelius.org/cephmonlog-2017-10-08-v2.xz
The monitor is crashing as part of bug #21300
http://tracker.ceph.com/issues/21300
And the fix is c
> On 18 October 2017 at 11:41, Hans van den Bogert
> wrote:
>
>
> Hi All,
>
> I've converted 2 nodes with 4 HDD/OSDs each from Filestore to Bluestore. I
> expected somewhat higher memory usage/RSS values, however I see, imo, a
> huge memory usage for all OSDs on both nodes.
>
> Small snippe
I can only speak for some environments, but sometimes you want to make
sure that a cluster cannot fill up before you can add more capacity.
Some organizations are unable to purchase new capacity rapidly, and if you
make sure you cannot exceed your current capacity, you can't run into
problems
Hello,
We created bug #21827. Also updated the log file of the OSD with
debug 20. Reference is 6e4dba6f-2c15-4920-b591-fe380bbca200
Thanks,
Ana
On 18/10/17 00:46, Mart van Santen wrote:
>
>
> Hi Greg,
>
> (I'm a colleague of Ana), Thank you for your reply
>
>
> On 10/17/2017 11:57 PM, Gregory
Hi Bryan.
I hope that solved it for you.
Another thing you can do in situations like this is to set the full_ratio
higher so you can work on the problem. Always set it back to a safe value
after the issue is solved.
ceph pg set_full_ratio 0.98
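If I recall correctly, the usual default is 0.95, and on Luminous the equivalent command is ceph osd set-full-ratio, so setting it back would look like:
ceph pg set_full_ratio 0.95
ceph osd set-full-ratio 0.95   # Luminous and later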
Regards,
Webert Lima
DevOps Engineer at MAV Te
Hey Joao,
thanks for the pointer! Do you have a timeline for the release of
v12.2.2?
Best,
Nico
--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
Indeed it shows ssd in the OSD's metadata.
"bluestore_bdev_type": "ssd",
Then I misunderstood the role of the device class in CRUSH; I expected the
OSD to actually set its settings according to the CRUSH device class.
I'll try to force the OSDs to behave like HDDs and monitor the memory
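If I understand the Luminous defaults correctly (3 GiB BlueStore cache for SSD-backed OSDs vs 1 GiB for HDD), forcing the HDD-sized cache would look roughly like this in ceph.conf, followed by an OSD restart:
[osd]
bluestore_cache_size = 1073741824   # 1 GiB, the bluestore_cache_size_hdd default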
Memory usage is still quite high here even with a large onode cache!
Are you using erasure coding? I recently was able to reproduce a bug in
bluestore causing excessive memory usage during large writes with EC,
but have not tracked down exactly what's going on yet.
Mark
On 10/18/2017 06:48 A
Hi!
I have a problem with Ceph Luminous 12.2.1. It was upgraded from Kraken,
but I'm not sure if the problem already existed in Kraken.
I have slow requests on different OSDs at random times (for example at
night), but I don't see any problems with disks or CPU at the time of the
problem; there is a possibility of net
> On 18 October 2017 at 13:48, Hans van den Bogert
> wrote:
>
>
> Indeed it shows ssd in the OSD's metadata.
>
> "bluestore_bdev_type": "ssd",
>
>
> Then I misunderstood the role of the device class in CRUSH, I expected the
> OSD would actually set its settings according to the CRUSH d
Hi all,
We have a Ceph 10.2.7 cluster with an 8+3 EC pool.
In that pool, there is a PG in an inconsistent state.
We followed http://ceph.com/geen-categorie/ceph-manually-repair-object/;
however, we are unable to solve our issue.
From the primary OSD logs, the reported PG had a missing object.
We fo
On Wed, 18 Oct 2017, Wido den Hollander wrote:
> > On 18 October 2017 at 13:48, Hans van den Bogert
> > wrote:
> >
> >
> > Indeed it shows ssd in the OSD's metadata.
> >
> > "bluestore_bdev_type": "ssd",
> >
> >
> > Then I misunderstood the role of the device class in CRUSH, I expected
hello,
For 2 weeks, I have sometimes been losing some OSDs.
Here is the trace:
0> 2017-10-18 05:16:40.873511 7f7c1e497700 -1 osd/ReplicatedPG.cc: In
function 'void ReplicatedPG::hit_set_trim(ReplicatedPG::OpContextUPtr&,
unsigned int)' thread 7f7c1e497700 time 2017-10-18 05:16:40.869962
osd/ReplicatedPG.c
First a general comment: local RAID will be faster than Ceph for a
single-threaded (queue depth = 1) I/O test. A single-threaded Ceph
client will at best see the same disk speed for reads, and writes will be
4-6 times slower than a single disk. Not to mention that the latency of
local disks will be much better.
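To see the queue depth=1 case on the Ceph side, something along these lines should do (pool name is just an example):
rados bench -p rbd 60 write -t 1 -b 4096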
I'm running into a permission error when attempting to use ceph-deploy
to create an mgr on a recently upgraded jewel->luminous ceph cluster.
I've attempted to track down the permission problem, but so far no success.
I'm doing this on a dev environment so I can replicate:
Start with a sample jewel
In my previous post, in one of my points I was wondering whether the request
size would increase if I enabled jumbo frames; currently it is disabled.
@jdillama: The qemu settings for both of these guest machines, with
RAID/LVM and Ceph/rbd images, are the same. I am not thinking that changing
the qem
Dear all,
We are still struggling with this issue. By now, one OSD crashes all the
time (a different one than yesterday), but now on a different assert.
Namely with this one:
#0 0x75464428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1 0x7546602a in _
Check out the following link: some SSDs perform badly in Ceph due to sync
writes to the journal:
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
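If I remember the test from that post correctly, it is roughly the following; note it writes to the raw device, so only point it at a disk you can wipe:
fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test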
Another thing that can help is to re-run the 32-thread rados test as a
stress test and view resource usage usi
The SSD drives are Crucial M500s.
A Ceph user did some benchmarks and found they had good performance:
https://forum.proxmox.com/threads/ceph-bad-performance-in-qemu-guests.21551/
However, a user comment from 3 years ago on the blog post you linked to
says to avoid the Crucial M500.
Yet, this performan
Sorry to reply to my own question, but I noticed that the cephx key for
client.bootstrap-mgr was inconsistent with the key in
/var/lib/ceph/bootstrap-mgr/ceph.keyring.
I deleted the entry in ceph:
ceph auth del client.bootstrap-mgr
reran the ceph-deploy gather keys:
ceph-deploy gathe
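In other words, roughly this sequence (hostnames are placeholders):
ceph auth del client.bootstrap-mgr
ceph-deploy gatherkeys MON_HOST
ceph-deploy mgr create MGR_HOST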
36 OSDs
Each of 4 storage servers has 9 1TB SSD drives, each drive used as 1 OSD (no
RAID) == 36 OSDs.
Each drive is one LVM volume group with two logical volumes - one volume for
the OSD data, one volume for the journal.
Each OSD is formatted with XFS.
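Per drive that is roughly the following (names and journal size here are illustrative only, not our exact values):
pvcreate /dev/sdb
vgcreate ceph-sdb /dev/sdb
lvcreate -L 10G -n journal ceph-sdb
lvcreate -l 100%FREE -n osd ceph-sdb
mkfs.xfs /dev/ceph-sdb/osd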
On Wed, Oct 18, 2017 at 1:33 PM, Maged Mokhtar wrote:
> measuring r
Hello Ashley,
On Wed, Oct 18, 2017 at 12:45 AM, Ashley Merrick wrote:
> 1/ Is there any options or optimizations that anyone has used or can suggest
> to increase ceph-fuse performance?
You may try playing with the sizes of reads/writes. Another
alternative is to use libcephfs directly to avoid
Measuring resource load as outlined earlier will show if the drives are
performing well or not. Also, how many OSDs do you have?
On 2017-10-18 19:26, Russell Glaue wrote:
> The SSD drives are Crucial M500
> A Ceph user did some benchmarks and found it had good performance
> https://forum.prox
I cannot run the write test reviewed at
the ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device blog. The
tests write directly to the raw disk device.
Reading an infile (created with urandom) from one SSD and writing the outfile
to another OSD yields about 17MB/s.
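A sketch of that kind of copy test (mount points, sizes and dd flags are illustrative, not my exact invocation):
dd if=/dev/urandom of=/mnt/ssd0/infile bs=1M count=4096
dd if=/mnt/ssd0/infile of=/mnt/ssd1/outfile bs=1M oflag=direct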
But isn't this write speed li
Hi all,
Thanks for the replies.
The main reason why I was looking for the thin/thick provisioning setting
is that I want to be sure that the provisioned space does not exceed the
cluster capacity.
With thin provisioning there is a risk that more space is provisioned than
the cluster capacity. When
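In the meantime I can only sum the provisioned sizes by hand, something like (pool name is a placeholder):
rbd du -p rbd    # provisioned vs used per image, plus a total
ceph df          # actual cluster usage and capacity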
It would help if you could provide the exact output of "ceph -s", "pg query",
and any other relevant data. You shouldn't need to do manual repair of
erasure-coded pools, since they have checksums and can tell which bits are
bad. Following that article may not have done you any good (though I
wouldn't ex
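Concretely, output along these lines would help (the pg id is a placeholder):
ceph -s
ceph pg 1.2f query
rados list-inconsistent-obj 1.2f --format=json-pretty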
Hi All,
The linked document is for filestore, which in your case is correct as I
understand it, but I wonder if a similar document exists for bluestore?
Thanks,
Denes.
On 10/18/2017 02:56 PM, Stijn De Weirdt wrote:
hi all,
we have a ceph 10.2.7 cluster with a 8+3 EC pool.
in that pool, the
Just run the same 32-thread rados test as you did before, and this time
run atop while the test is running, looking for %busy of CPU/disks. It
should give an idea of whether there is a bottleneck in them.
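That is, something along these lines (pool name and duration are just examples):
rados bench -p rbd 60 write -t 32
atop 2    # on each OSD node while the bench runs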
On 2017-10-18 21:35, Russell Glaue wrote:
> I cannot run the write test reviewed at the
> ceph-ho
I've created a ticket http://tracker.ceph.com/issues/21833
Hopefully we can work this out.
On Mon, Oct 16, 2017 at 6:03 PM Dejan Lesjak wrote:
>
> > On 17. okt. 2017, at 00:59, Gregory Farnum wrote:
> >
> > On Mon, Oct 16, 2017 at 3:49 PM Dejan Lesjak
> wrote:
> >
> > > On 17. okt. 2017, at 0
On Wed, Oct 18, 2017 at 11:16 PM, pascal.pu...@pci-conseil.net
wrote:
> hello,
>
> For 2 week, I lost sometime some OSD :
> Here trace :
>
> 0> 2017-10-18 05:16:40.873511 7f7c1e497700 -1 osd/ReplicatedPG.cc: In
> function '*void ReplicatedPG::hit_set_trim(*ReplicatedPG::OpContextUPtr&,
> unsig
I updated the ticket with some findings. It appears that osd.93 has that
snapshot object in its missing set that gets sent to osd.78, and then
osd.69 claims to have the object. Can you upload debug logs of those OSDs
that go along with this log? (Or just generate a new set of them together.)
-Greg
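A sketch of how to bump the debug levels on those OSDs before reproducing (levels are the usual ones; adjust as needed):
ceph tell osd.93 injectargs '--debug_osd 20 --debug_ms 1'
ceph tell osd.78 injectargs '--debug_osd 20 --debug_ms 1'
ceph tell osd.69 injectargs '--debug_osd 20 --debug_ms 1'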
I concur - at the moment we need to manually sum the RBD images to look at how
much we have "provisioned" vs what ceph df shows. In our case we had a rapid
run of provisioning new LUNs, but it took a while before usage started to catch
up with what was provisioned as data was migrated in. Cep
I am trying to install Ceph Luminous (ceph version 12.2.1) on 4 Ubuntu
16.04 servers, each with 74 disks, 60 of which are HGST 7200rpm SAS drives:
HGST HUS724040AL sdbv sas
root@kg15-2:~# lsblk --output MODEL,KNAME,TRAN | grep HGST | wc -l
60
I am trying to deploy them all with:
a line like th