ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Impact of fancy striping
Hi James, Robert, Craig,
Thank you for those informative answers! You all pointed out
interesting issues.
I know losing 1 SAS disk in RAID0 means losing all journals, but this is
for testing, so I do not care.
I do not think sequential write speed to the RAID0 array is the
bottleneck (I be…
A general rule of thumb for separate journal devices is to use 1 SSD for
every 4 OSDs. Since SSDs have no seek penalty, 4 partitions are fine.
Going much above the 1:4 ratio can saturate the SSD.
On your SAS journal device, by creating 9 partitions, you're forcing
head seeks for every journal…
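To illustrate the 1:4 rule of thumb above, a pre-BlueStore (filestore) ceph.conf can point each OSD's journal at its own partition on the SSD. The device names here are hypothetical; only `osd journal` and `osd journal size` are real options:

```ini
[osd]
; 5 GB journal, a common size for filestore OSDs
osd journal size = 5120

; one SSD (/dev/sdb here) split into 4 partitions,
; serving 4 OSDs, keeps to the 1:4 ratio
[osd.0]
osd journal = /dev/sdb1
[osd.1]
osd journal = /dev/sdb2
[osd.2]
osd journal = /dev/sdb3
[osd.3]
osd journal = /dev/sdb4
```

With 9 partitions on a single spinning SAS pair, as in this thread, the same layout forces the heads to seek between journal regions instead of writing sequentially.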
If I understand correctly, you have one SAS disk as a journal for multiple OSDs.
If you do small synchronous writes, it will become an I/O bottleneck pretty
quickly:
due to multiple journals on the same disk, it will no longer be sequential
writes to one journal but 4k writes to x journals, mak…
Hopefully a Ceph developer will be able to clarify how small writes are
journaled?
The write-through 'bug' seems to explain the small-block performance I've
measured in various configurations (I find results similar to yours).
I have not yet tested the patch cited, but it would be *very*
interesti…
Hi James,
Thank you for this clarification. I am quite aware of that, which is why
the journals are on SAS disks in RAID0 (SSDs out of scope).
I still have trouble believing that fast-but-not-super-fast journals are
the main reason for the poor performance observed. Maybe I am mistaken?
Bes…
I would really appreciate it if someone could:
- explain why the journal setup is way more important than striping
settings;
I'm not sure if it's what you're asking, but any write must be
physically written to the journal before the operation is acknowledged.
So the overall cluster performa…
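A quick way to feel the cost of that acknowledgment rule is to compare dd doing buffered 4k writes with dd forcing each write to stable storage via oflag=dsync, which mimics the journal's "write before ack" constraint. The target path is just an example scratch file:

```shell
#!/bin/sh
# Buffered 4k writes: the kernel acknowledges from the page cache,
# so throughput looks high.
dd if=/dev/zero of=/tmp/ack_test bs=4k count=1000 2>&1 | tail -n 1

# Synced 4k writes: each write must reach stable storage before dd
# continues, mimicking the journal's acknowledgment requirement.
dd if=/dev/zero of=/tmp/ack_test bs=4k count=1000 oflag=dsync 2>&1 | tail -n 1

rm -f /tmp/ack_test
```

On a spinning disk the dsync run is typically orders of magnitude slower, which is why journal latency dominates small synchronous writes.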
Hi Kyle,
All OSDs are SATA drives in JBOD. The journals are all on a pair of SAS
in RAID0. All of those are on a shared backplane with a single RAID
controller (8 ports -> 12 disks).
I also have a pair of SAS in RAID1 holding the OS, which may be on a
different port/data-path. I am going to…
> This journal problem is a bit of wizardry to me, I even had weird
> intermittent issues with OSDs not starting because the journal was not
> found, so please do not hesitate to suggest a better journal setup.
You mentioned using SAS for journals; if your OSDs are SATA and an expander
is in the data pa…
I will try to look into this issue of device cache flush. Do you have
a tracker link for the bug?
How I wish this were a forum! But here is a link:
http://www.spinics.net/lists/ceph-users/msg05966.html
And this:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?
Hi James,
Unfortunately, SSDs are out of budget. Currently there are 2 SAS disks
in RAID0 on each node, split into 9 partitions: one for each OSD journal
on the node. I benchmarked the RAID0 volumes at around 500MB/s in
sequential sustained write, so that's not bad — maybe access latency is
a…
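For reference, a sustained sequential write figure like the 500MB/s above can be measured with plain dd. The target here is a hypothetical scratch file; pointing dd at the raw RAID0 device would give the truest number but destroys its contents. conv=fsync makes dd flush before reporting, so the page cache does not inflate the figure:

```shell
#!/bin/sh
# Sequential sustained write: large blocks, flushed before dd reports.
# TARGET is hypothetical; replace with a scratch file on the volume
# under test (a raw device path would be wiped).
TARGET=${TARGET:-/tmp/seq_write_test}

dd if=/dev/zero of="$TARGET" bs=1M count=256 conv=fsync 2>&1 | tail -n 1
rm -f "$TARGET"
```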
Did you try moving the journals to separate SSDs?
It was recently discovered that due to a kernel bug/design, the journal
writes are translated into device cache flush commands, so thinking
about that I wonder also whether there would be performance improvement
in the case that journal and OSD
Hi everyone,
I am currently testing a use-case with large rbd images (several TB),
each containing an XFS filesystem, which I mount on local clients. I
have been testing the throughput writing on a single file in the XFS
mount, using "dd oflag=direct", for various block sizes.
With a defaul…
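The block-size sweep described above can be scripted; TARGET below is a hypothetical path that should point at a file inside the mounted XFS filesystem on the rbd image:

```shell
#!/bin/sh
# Measure direct-I/O write throughput for several block sizes.
# TARGET is hypothetical: set it to a file inside the XFS mount.
TARGET=${TARGET:-/tmp/dd_sweep_test}

for bs in 4k 64k 1M 4M; do
    printf '%s: ' "$bs"
    # oflag=direct bypasses the client page cache, so every block
    # goes straight to rbd; dd prints throughput on its last line.
    dd if=/dev/zero of="$TARGET" bs="$bs" count=64 oflag=direct 2>&1 | tail -n 1
done
rm -f "$TARGET"
```

Note that O_DIRECT requires filesystem support and aligned block sizes; on XFS the sizes above are fine.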