Hi list,
I deployed a Windows 7 VM with a qemu-rbd disk and ran into unexpected
performance behaviour during the boot phase.
I noticed that while the Windows VM boots, there are roughly two consecutive
minutes during which `ceph -w` shows interesting lines like: "... 567 KB/s rd,
567 op/s", "... 789 KB/s rd, 789 op/s" and
The OSD should have logged the identities of the inconsistent objects
to the central log on the monitors, as well as to its own local log
file. You'll need to identify for yourself which version is correct,
which will probably involve going and looking at them inside each
OSD's data store. If the p
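The hunt described above can be sketched in the shell; the PG id and the on-disk path below are hypothetical, so adjust them to your cluster and FileStore layout:

```shell
# Find which PGs are inconsistent; the object names should be in the
# cluster log on the monitors and in the primary OSD's local log.
ceph health detail | grep inconsistent   # e.g. "pg 17.1f is active+clean+inconsistent"
ceph pg dump | grep inconsistent

# On each host holding a replica, compare the on-disk copies of the
# reported object by hand (path is an assumption for a FileStore OSD):
md5sum /var/lib/ceph/osd/ceph-*/current/17.1f_head/<object>*
```

Only after you have decided which replica is good should you consider `ceph pg repair`, since repair trusts the primary's copy.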
I'm having trouble finding a concise set of steps to repair inconsistent
placement groups. I know from other threads that issuing a 'ceph pg repair
...' command could cause loss of data integrity if the primary OSD happens
to have the bad copy of the placement group. I know how to find which PG's
a
This was commented on recently on ceph-users, but I’ll explain the scenario.
If the kernel needs to flush rbd blocks to reclaim memory and the OSD
process needs memory to handle those flushes, you end up deadlocked.
If you run the rbd client in a VM with dedicated memory allocation from th
I remember reading somewhere that the kernel ceph clients (rbd/fs) could
not run on the same host as the OSD. I tried finding where I saw that,
and could only come up with some irc chat logs.
The issue stated there is that there can be some kind of deadlock. Is
this true, and if so, would you ha
On Fri, 13 Jun 2014, Charles 'Boyo wrote:
> Aha! Thanks Sage.
>
> I completely get it now. So I can use a ramdisk provided it is always
> flushed to disk during shutdowns and I never have unplanned outages
> right?
i.e., never! :)
> Does this hard OSD consistency also explain why eventual con
Aha! Thanks Sage.
I completely get it now. So I can use a ramdisk provided it is always flushed
to disk during shutdowns and I never have unplanned outages right?
Does this hard OSD consistency also explain why eventual consistency at the
RADOS level was "designed" out? Having all OSDs in a rep
On Fri, 13 Jun 2014, Charles 'Boyo wrote:
> Hello Sage.
>
> I'm running xfs and crashes are rare enough. When they do happen, I
> would rather just rebuild the entire cluster than bother with fsck
> anyway.
I mean any unclean/abrupt shutdown of ceph-osd, not an XFS error. Like a
power failu
Hello Sage.
I'm running xfs and crashes are rare enough. When they do happen, I would
rather just rebuild the entire cluster than bother with fsck anyway.
So can you show me how to turn off journalling using the xfs FileStore backend?
:)
Charles
--Original Message--
From: Sage Weil
T
On Thu, 12 Jun 2014, Charles 'Boyo wrote:
> Hello list.
>
> Is it possible, or will it ever be possible to disable the OSD's
> journalling activity?
>
> I understand it is risky and has the potential for data loss but in my
> use case, the data is easily re-built from scratch and I'm really
>
Hello list.
Is it possible, or will it ever be possible to disable the OSD's journalling
activity?
I understand it is risky and has the potential for data loss, but in my use
case the data is easily re-built from scratch and I'm really bothered by the
reduced throughput "wasted" on journall
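The journal can't simply be switched off with the FileStore backend, but (as discussed later in the thread) it can be pointed at a ramdisk if you accept that any unclean shutdown loses the OSD. A hedged ceph.conf sketch; the `/dev/ram0` device and the size are assumptions:

```ini
; ceph.conf fragment (sketch): journal on a ramdisk.
; WARNING: losing the journal on an unclean shutdown loses the OSD's
; consistency -- only viable if the data is trivially rebuildable.
[osd]
osd journal = /dev/ram0
osd journal size = 1024    ; MB, sized to taste
```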
We actually disabled swap altogether on these machines...
On Thu, Jun 12, 2014 at 5:06 PM, Gregory Farnum wrote:
> To be clear, that's the solution to one of the causes of this issue.
> The log message is very general, and just means that a disk access
> thread has been gone for a long time (
To be clear, that's the solution to one of the causes of this issue.
The log message is very general, and just means that a disk access
thread has been gone for a long time (15 seconds, in this case)
without checking in (so usually, it's been inside of a read/write
syscall for >=15 seconds).
Other
Can you check and see if swap is being used on your OSD servers when
this happens, and even better, use something like collectl or another
tool to look for major page faults?
If you see anything like this, you may want to tweak swappiness to be
lower (say 10).
Mark
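Mark's checks above can be run like this; the `pidof` lookup is a sketch and assumes a single ceph-osd process per host:

```shell
# Is the OSD host dipping into swap?
free -m                                 # look at the "Swap: used" column
grep -i VmSwap /proc/$(pidof ceph-osd | awk '{print $1}')/status 2>/dev/null

# Lower swappiness so the kernel prefers dropping cache over swapping.
cat /proc/sys/vm/swappiness             # default is usually 60
sysctl -w vm.swappiness=10              # apply now
echo 'vm.swappiness = 10' >> /etc/sysctl.conf   # persist across reboots
```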
On 06/12/2014 03:17 PM, X
I've done some more tracing. It looks like the high IO wait in VMs is
somewhat correlated with periods when some OSDs have high in-flight ops
(ceph admin socket, dump_ops_in_flight).
When in_flight_ops is high, I see something like this in the OSD log:
2014-06-12 19:57:24.572338 7f4db6bdf700 1 heartbeat_map r
Hi JC,
The cluster already has 1024 PGs on only 15 OSDs, which is above the
formula of (100 x #OSDs)/size. How large should I make it?
# ceph osd dump | grep Ray
pool 17 'Ray' replicated size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 1024 pgp_num 1024 last_change 7785 owner 0 f
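Plugging this cluster's numbers into the rule of thumb confirms 1024 is already generous:

```shell
# (100 * #OSDs) / size, rounded up to the next power of two.
osds=15
size=3
target=$(( (100 * osds) / size ))
echo "target: $target"   # 500 -> next power of two is 512; 1024 is one step above that
```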
Hi,
I am following the standard deployment guide for ceph firefly. When I try
to do step 5 for collecting the keys, it gives me warnings saying that
keyrings were not found for bootstrap-mds, bootstrap-osd and admin, due to
which the next step for deploying OSDs fails. Other people on this forum have
You can set up pools which have all their primaries in one data
center, and point the clients at those pools. But writes will still
have to traverse the network link because Ceph does synchronous
replication for strong consistency.
If you want them to both write to the same pool, but use local OSD
ulimit -Sa
ulimit -Ha
Which will show you your limits.
If you are hitting this limit and it's 16k, I would say that the server
is not tuned for your needs; raise it.
If it's more than that but not reaching 1 million or any other very high
number, I would say use `lsof -n | wc -l` to get some sta
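A sketch of checking and raising the descriptor limits discussed above; the `pidof` lookup and the limits.conf values are assumptions to adapt:

```shell
# Soft and hard open-file limits for the current shell:
ulimit -Sn
ulimit -Hn

# Limits of a running ceph-osd (assumes one ceph-osd process on the host):
grep 'open files' /proc/$(pidof ceph-osd | awk '{print $1}')/limits

# Raise the limit persistently for the daemon's user (values are a sketch):
cat >> /etc/security/limits.conf <<'EOF'
root  soft  nofile  65536
root  hard  nofile  65536
EOF
```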
Fred,
I'm not sure it will completely answer your question, but I would
definitely have a look at:
http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address
There are some important steps in there for monitors.
On Wed, Jun 11, 2014 at 12:08 PM, Fred Yang wrot
You probably just want to increase the ulimit settings. You can change the
OSD setting, but that only covers file descriptors against the backing
store, not sockets for network communication -- the latter is more often
the one that runs out.
-Greg
On Thursday, June 12, 2014, Christian Kauhaus > wr
Hi,
Thanks for your information!
I will check it soon, and will post results later,
Thanks a lot and best regards,
Yamashita
=
OSS Laboratories Inc.
Yoshitami Yamashita
Mail:yamash...@ossl.co.jp
- Original Message -
From: "Karan Singh"
To: "Yoshitami Yamashita"
Cc: ceph-users@list
On 06/12/2014 08:47 AM, Xu (Simon) Chen wrote:
1) I did check iostat on all OSDs, and iowait seems normal.
2) ceph -w shows no correlation between high io wait and high iops.
Sometimes the reverse is true: when io wait is high (since it's a
cluster wide thing), the overall ceph iops drops too.
On Thu, Jun 12, 2014 at 2:21 AM, VELARTIS Philipp Dürhammer
wrote:
> Hi,
>
> Will ceph support mixing different disk pools (example spinners and ssds) in
> the future a little bit better (more safe)?
There are no immediate plans to do so, but this is an extension to the
CRUSH language that we're
Hi,
we have a Ceph cluster with 32 OSDs running on 4 servers (8 OSDs per server,
one for each disk).
From time to time, I see Ceph servers running out of file descriptors. It logs
lines like:
> 2014-06-08 22:15:35.154759 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20)
write couldn't open
86.37_
1) I did check iostat on all OSDs, and iowait seems normal.
2) ceph -w shows no correlation between high io wait and high iops.
Sometimes the reverse is true: when io wait is high (since it's a cluster
wide thing), the overall ceph iops drops too.
3) We have collectd running in VMs, and that's how
Hi Simon,
Did you check iostat on the OSDs to check their utilization? What does your
ceph -w say - perhaps you're maxing out your cluster's IOPS?
Also, are you running any monitoring of your VMs' iostats? We've often found
some culprits overusing IO...
Kind Regards,
David Majchrzak
12 jun 2014 kl.
Hi folks,
We have two similar ceph deployments, but one of them is having trouble:
VMs running with ceph-provided block devices are seeing frequent high IO
wait every few minutes, usually 15-20% but as high as 60-70%. This is
cluster-wide and not correlated with the VMs' IO load. We turned on rbd
On Thu, Jun 12, 2014 at 5:02 PM, David wrote:
> Thanks Mark!
>
> Well, our workload has more IOs and quite low throughput, perhaps 10MB/s ->
> 100MB/s. It’s a quite mixed workload, but mostly small files (http / mail /
> sql).
> During the recovery we had ranged between 600-1000MB/s throughput.
Thanks Mark!
Well, our workload has more IOs and quite low throughput, perhaps 10MB/s ->
100MB/s. It’s a quite mixed workload, but mostly small files (http / mail /
sql).
During the recovery we had ranged between 600-1000MB/s throughput.
So the only way to currently ”fix” this is to have enough
On 06/12/2014 07:27 AM, Christian Kauhaus wrote:
Am 12.06.2014 14:09, schrieb Loic Dachary:
With the replication factor set to three (which is the default), it can
tolerate two OSDs failing at the same time.
I've noticed that a replication factor of 3 is the new default in firefly.
What rati
On 06/12/2014 03:44 AM, David wrote:
Hi,
We have 5 OSD servers, with 10 OSDs each (journals on enterprise SSDs).
We lost an OSD and the cluster started to backfill the data to the rest of the
OSDs - during which the latency skyrocketed on some OSDs and connected clients
experienced massive IO
Am 12.06.2014 14:09, schrieb Loic Dachary:
> With the replication factor set to three (which is the default), it can
> tolerate two OSDs failing at the same time.
I've noticed that a replication factor of 3 is the new default in firefly.
What rationale led to changing the default? It used to be
Hi,
With the replication factor set to three (which is the default), it can
tolerate two OSDs failing at the same time.
Cheers
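A hedged sketch of inspecting and setting this per pool; the pool name `rbd` is an assumption:

```shell
ceph osd dump | grep 'replicated size'   # inspect current pool sizes
ceph osd pool set rbd size 3       # three copies: survives 2 simultaneous OSD failures
ceph osd pool set rbd min_size 2   # keep serving IO while one copy is missing
```

With size 3 and min_size 2, the pool keeps serving IO after one failure and loses no data after two, as long as the failures don't outpace recovery.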
On 12/06/2014 13:43, yalla.gnan.ku...@accenture.com wrote:
> Hi All,
>
>
>
> Up to how many OSD failures can Ceph tolerate?
>
>
>
>
>
> Thanks
>
> Kumar
>
Hi All,
Up to how many OSD failures can Ceph tolerate?
Thanks
Kumar
This message is for the designated recipient only and may contain privileged,
proprietary, or otherwise confidential information. If you have received it in
error, please notify the sender im
Hi,
I added the fourth monitor in the cluster, but it's always down even after I
restart the mon service. I attached the logs below, could you help?
[root@ceph1]# ceph health detail
HEALTH_WARN 1 mons down, quorum 0,1,2 1,3,0; clock skew detected on mon.3, mon.0
mon.8 (rank 3) addr 192.168.1.4:6
Hi,
Will ceph support mixing different disk pools (example spinners and ssds) in
the future a little bit better (more safe)?
Thank you
philipp
On Wed, Jun 11, 2014 at 5:18 AM, Davide Fanciola wrote:
> Hi,
>
> we have a similar setup where we have SSD and HDD in the same hosts.
> Our very basic
Hi,
We have 5 OSD servers, with 10 OSDs each (journals on enterprise SSDs).
We lost an OSD and the cluster started to backfill the data to the rest of the
OSDs - during which the latency skyrocketed on some OSDs and connected clients
experienced massive IO wait.
I’m trying to rectify the situa
Hi,
Depends on what you mean by a "user". You can set up pools with different
replication / erasure coding etc.:
http://ceph.com/docs/master/rados/operations/pools/
Kind Regards,
David Majchrzak
12 jun 2014 kl. 10:22 skrev
:
> Hi All,
>
>
> I have a ceph cluster. If a user wants just st
Hi All,
I have a ceph cluster. If a user wants just striped, distributed or
replicated storage, can we provide these types of storage exclusively?
Thanks
Kumar
Hi all,
One short question that would be quite useful for me:
Is there a way to set a higher osd/host priority for some clients in one
datacenter and do the opposite in another datacenter? I mean, my network links
between those datacenters will be used in case of failover for clients
accessing data on ce
Am 11.06.2014 16:47, schrieb Alfredo Deza:
On Wed, Jun 11, 2014 at 9:29 AM, Markus Goldberg
wrote:
Hi,
ceph-deploy-1.5.3 can cause trouble if a reboot is done between preparation
and activation of an OSD:
The OSD disk was /dev/sdb at this time; the OSD itself should go to sdb1,
formatted to cleare