Hello friends,
I had RBD images mapped to a Windows client through iSCSI; however, the
MON got OOM-killed for some unknown reason. After rebooting the MON, I am able
to mount one of the images / iSCSI LUNs back on the client, but the second image,
when mapped, shows up as unallocated on the Windows client. I have data on that
Hi all, I get these error messages daily from radosgw, for multiple users:
2016-11-12 13:49:08.905114 7fbba7fff700 20 RGWUserStatsCache: sync
user=myuserid1
2016-11-12 13:49:08.905956 7fbba7fff700 0 ERROR: can't read user header: ret=-2
2016-11-12 13:49:08.905978 7fbba7fff700 0 ERROR: sync_user() f
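One possible way to rebuild the missing user stats header, assuming a Jewel-era
radosgw-admin, is to force a stats sync for the affected user, for example:
radosgw-admin user stats --uid=myuserid1 --sync-stats
ret=-2 is ENOENT, so the header object simply does not exist yet; whether the
sync makes the messages go away depends on why it is missing in the first place.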
Hi,
What's your Ceph version?
I am using Jewel 10.2.3 and systemd seems to work normally. I deployed Ceph with
ansible, too.
You can check whether you have the /lib/systemd/system/ceph-mon.target file.
I believe it was a bug that existed in 10.2.1 before
cfa2d0a08a0bcd0fac153041b9eff17cb6f7c9af has been
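A quick sanity check along those lines (example commands for a systemd host):
ls -l /lib/systemd/system/ceph-mon.target /lib/systemd/system/ceph-mon@.service
systemctl is-enabled ceph-mon.target ceph-mon@$(hostname -s)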
Maybe a long shot, but have you checked OSD memory usage? Are the OSD
hosts low on RAM and swapping to disk?
I am not familiar with your issue, but thought that might cause it.
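A quick way to check, with standard Linux tools rather than anything Ceph-specific:
free -h      # total vs. available RAM on each OSD host
vmstat 1 5   # non-zero si/so columns mean the host is actively swapping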
Chris
On 2016-11-14 3:29 pm, Brad Hubbard wrote:
> Have you looked for clues in the output of dump_historic_ops
Hi John...
Thanks for replying.
Some of the requested input is inline.
Cheers
Goncalo
We are currently undergoing an infrastructure migration. One of the first
machines to go through this migration process is our standby-replay mds. We
are running 10.2.2. My plan is to:
Is the 10.2.2 her
Have you looked for clues in the output of dump_historic_ops ?
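For example, against one of the OSDs that logged slow requests (the osd id below
is a placeholder), run on that OSD's host via the admin socket:
ceph daemon osd.12 dump_historic_ops    # recent completed ops with per-stage timings
ceph daemon osd.12 dump_ops_in_flight   # ops currently stuck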
On Tue, Nov 15, 2016 at 1:45 AM, Thomas Danan
wrote:
> Thanks Luis,
>
>
>
> Here are some answers ….
>
>
>
> Journals are not on SSD and collocated with OSD daemons host.
>
> We look at the disk performances and did not notice anythi
On Mon, Nov 14, 2016 at 10:05 PM, John Spray wrote:
> Hi folks,
>
> For those with cephfs filesystems created using older versions of
> Ceph, you may be affected by this issue if you try to access your
> filesystem using the 4.8 or 4.9-rc kernels:
> http://tracker.ceph.com/issues/17825
>
> If your
Hi folks,
For those with cephfs filesystems created using older versions of
Ceph, you may be affected by this issue if you try to access your
filesystem using the 4.8 or 4.9-rc kernels:
http://tracker.ceph.com/issues/17825
If your data pool does not have ID 0 then you don't need to worry. If
you
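One way to check which pool IDs are in use (standard CLI; exact output format
varies by release):
ceph osd lspools   # prints "<id> <name>" pairs; look for the cephfs data pool's id
ceph fs ls         # shows which pools the filesystem actually uses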
On Mon, Nov 14, 2016 at 9:38 AM, Ridwan Rashid Noel wrote:
> Hi Ilya,
>
> I tried to test the primary-affinity change so I have setup a small cluster
> to test. I am trying to understand how the different components of Ceph
> interacts in the event of change of primary-affinity of any osd. I am
>
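A minimal sketch of the kind of change being tested (osd id and weight are
placeholders; older releases may also need "mon osd allow primary affinity = true"
set on the mons):
ceph osd primary-affinity osd.3 0.5   # make osd.3 less likely to be chosen as primary
ceph pg dump pgs_brief                # shows the acting set and primary for each PG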
On Mon, Nov 14, 2016 at 9:20 AM, Brian Andrus
wrote:
> Hi William,
>
> "rgw print continue = true" is an apache specific setting, as mentioned
> here:
>
> http://docs.ceph.com/docs/master/install/install-ceph-gateway/#migrating-from-apache-to-civetweb
>
> I do not believe it is needed for civetweb
I had to set my MONs to sysvinit while my OSDs are systemd. That allows
everything to start up when my system boots. I don't know why the OSDs don't
work with sysvinit and the MON doesn't work with systemd... but that worked to
get me running.
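As far as I know, the init scripts decide per daemon based on a marker file in
the daemon's data directory, so the mixed setup above amounts to something like
(paths are examples; adjust cluster and host names):
ls /var/lib/ceph/mon/<cluster>-<host>/            # look for a file named 'sysvinit' or 'systemd'
touch /var/lib/ceph/mon/<cluster>-<host>/sysvinit # mark this mon as sysvinit-managed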
Hi William,
"rgw print continue = true" is an apache specific setting, as mentioned
here:
http://docs.ceph.com/docs/master/install/install-ceph-gateway/#migrating-from-apache-to-civetweb
I do not believe it is needed for civetweb. For documentation, you can see
or change the version branch in t
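For reference, a minimal civetweb-style ceph.conf snippet of the kind described
(section name and port are only examples):
[client.rgw.gateway1]
rgw frontends = civetweb port=7480
# no "rgw print continue" needed here; that setting only matters for the apache/fastcgi setup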
Hi,
I have a problem that my ceph-mon isn't getting started when my machine
boots; the OSDs start up just fine. Checking logs, there's no sign of
systemd making any attempt to start it, although it is seemingly enabled:
root@sto-1-1:~# systemctl status ceph-mon@sto-1-1
● ceph-mon@sto-1-1.service
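A hedged check of the enablement chain, using the unit names from the status
output above:
systemctl is-enabled ceph-mon@sto-1-1 ceph-mon.target
systemctl enable ceph-mon.target      # the target also needs to be enabled to pull the mon in at boot
systemctl enable ceph-mon@sto-1-1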
Hi,
Thanks for looking into this; it seems to mitigate the problem.
Do you think this is just related to httpd, or is it going to impact other
services such as nginx and be a wider point to know about / document?
Thanks
On Mon, Oct 24, 2016 at 2:28 PM, Yan, Zheng wrote:
> I finally reproduced t
Thanks Luis,
Here are some answers:
Journals are not on SSD and are collocated on the OSD daemon hosts.
We looked at the disk performance and did not notice anything wrong;
r/w latency is acceptable (< 20 ms).
No issue on the network either, from what we have seen.
There is only one pool in the cluster
Without knowing the cluster architecture it's hard to know exactly
what may be happening. And you sent no information on your cluster...
How is the cluster hardware? Where are the journals? How busy are the
disks (% time busy)? What is the pool size? Are these replicated or EC
pools?
On Mon, No
Without knowing the cluster architecture it's hard to know exactly what may
be happening.
How is the cluster hardware? Where are the journals? How busy are the disks
(% time busy)? What is the pool size? Are these replicated or EC pools?
Have you tried tuning the deep-scrub processes? Have you tr
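Most of those numbers can be pulled with something along these lines (sysstat's
iostat plus the standard CLI):
iostat -x 5                 # %util per disk, i.e. how busy the spindles are
ceph osd dump | grep pool   # pool size/min_size and whether it is replicated or erasure coded
ceph df                     # per-pool usage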
Try looking at the specific logs for those particular OSDs to see if
anything is there; also take a close look at the PGs that those OSDs hold.
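Something like this, with a hypothetical OSD id, covers both suggestions:
grep -i 'slow request' /var/log/ceph/ceph-osd.12.log   # default log location
ceph pg ls-by-osd 12                                   # PGs that currently map to that OSD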
Best,
*German*
2016-11-14 12:04 GMT-03:00 M Ranga Swami Reddy :
> When this issue seen, ceph logs shows "slow requests to OSD"
>
> But Ceph status i
Hi All,
We have a cluster in production that is suffering from intermittent blocked
requests (25 requests are blocked > 32 sec). The blocked request occurrences are
frequent and global to all OSDs.
From the OSD daemon logs, I can see related messages:
16-11-11 18:25:29.917518 7fd28b989700 0 log_
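The affected OSDs and the stuck ops themselves can usually be inspected with:
ceph health detail                        # lists which OSDs have blocked requests
ceph daemon osd.<id> dump_ops_in_flight   # run on the OSD host, shows where each op is stuck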
When this issue is seen, the Ceph logs show "slow requests to OSD",
but Ceph status is in the OK state.
Thanks
Swami
On Mon, Nov 14, 2016 at 8:27 PM, German Anders wrote:
> Could you share some info about the ceph cluster? logs? did you see
> anything different from normal op on the logs?
>
> Best,
>
>
>
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> William Josefsson
> Sent: 14 November 2016 14:46
> To: Nick Fisk
> Cc: Ceph Users
> Subject: Re: [ceph-users] Ceph Blog Articles
>
> Hi Nick, I found the graph very useful explaining the
Could you share some info about the Ceph cluster? Logs? Did you see
anything different from normal operation in the logs?
Best,
*German*
2016-11-14 11:46 GMT-03:00 M Ranga Swami Reddy :
> +ceph-devel
>
> On Fri, Nov 11, 2016 at 5:09 PM, M Ranga Swami Reddy wrote:
>
>> Hello,
>> I am using the ceph
Hi Nick, I found the graph very useful for explaining the concept. Thanks for sharing.
I'm currently planning to set up a new cluster and want to get low
latency by using:
2U servers,
6x Intel P3700 400GB for journals, and
18x 1.8TB Hitachi 10k SAS spinning disks. My OSD:journal ratio would be 3:1.
All over 10Gbi
+ceph-devel
On Fri, Nov 11, 2016 at 5:09 PM, M Ranga Swami Reddy
wrote:
> Hello,
> I am using the ceph volumes with a VM. Details are below:
>
> VM:
>   OS: Ubuntu 14.04
>   CPU: 12 cores
>   RAM: 40 GB
>
> Volumes:
>   Size: 1 TB
>   No: 6 volumes
>
>
> With above, VM got hung without a
I have a pool that crashes 2 out of my 3 mons every time I try to change its
crush_ruleset, and it's always the same ones. I've tried
leaving the first one down and it crashes the second.
It's a replicated pool, and I have other pools that look exactly the same.
I've deep-scrubbed all the PGs to m
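For clarity, the command that triggers the crash is of this form (pool name and
ruleset id are placeholders):
ceph osd pool set <poolname> crush_ruleset 1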
On Fri, 11 Nov 2016, Sage Weil wrote:
> Currently the distros we use for upstream testing are
>
> centos 7.x
> ubuntu 16.04 (xenial)
> ubuntu 14.04 (trusty)
>
> We also do some basic testing for Debian 8 and Fedora (some old version).
>
> Jewel was the first release that had native systemd an
On Mon, Nov 14, 2016 at 12:46 AM, Goncalo Borges
wrote:
> Hi Greg, Jonh, Zheng, CephFSers
>
> Maybe a simple question but I think it is better to ask first than to
> complain after.
>
> We are currently undergoing an infrastructure migration. One of the first
> machines to go through this migratio
On Thu, Nov 10, 2016 at 3:41 PM, Dan van der Ster wrote:
> Hi all, Hi Zheng,
>
> We're seeing a strange issue with the kernel cephfs clients, combined
> with a path restricted mds cap. It seems that files/dirs are
> intermittently not created due to permission denied.
>
> For example, when I untar
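A sketch of the kind of path-restricted cap in question (client name, path, and
pool are placeholders):
ceph auth get-or-create client.foo \
    mon 'allow r' \
    mds 'allow r, allow rw path=/foo' \
    osd 'allow rw pool=cephfs_data'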
Oh right, yes, you would still see an increase in latency once the
SSDs+CPU+network start getting increased load. But I guess you
could scale out with more nodes/SSDs to combat this. This figure is more about
finding out the minimum latency possible; to
maintain it under load probably just requi
Hi Nick,
Actually I was referring to an all SSD cluster. I expect the latency to
increase from when you have a low load / queue depth to when you have a
cluster under heavy load at/near its maximum iops throughput when the cpu
cores are near peak utilization.
Cheers /Maged
-
I vote for 1, for as long as Ubuntu 14.04 is supported.
On 11.11.2016 19:43, Sage Weil wrote:
> Currently the distros we use for upstream testing are
>
> centos 7.x
> ubuntu 16.04 (xenial)
> ubuntu 14.04 (trusty)
>
> We also do some basic testing for Debian 8 and Fedora (some old version).
>
> Jewel
Hi;
There are still lots of people who are using 14.04, and it's also supported
until 2019, so +1 for option 1.
Thanks
Özhan
On Fri, Nov 11, 2016 at 11:22 PM, Blair Bethwaite wrote:
> Worth considering OpenStack and Ubuntu cloudarchive release cycles
> here. Mitaka is the release where all Ubuntu O
Hi Maged,
I would imagine that as soon as you start saturating the disks, the latency impact
would make the savings from the fast CPUs pointless.
Really, you would only try to optimise the latency if you are using an SSD-based
cluster.
This was only done with spinning disks in our case, with a low que
Hi,
Yes, I used fio; here is the fio file I used for the latency test:
[global]
ioengine=rbd        # drive the RBD image directly through librbd
randrepeat=0
clientname=admin    # cephx user (client.admin)
rbdname=test2       # RBD image to test against
invalidate=0        # mandatory
rw=write
bs=4k               # 4 KiB sequential writes
direct=1
time_based=1
runtime=360         # run for 6 minutes
numjobs=1
[rbd_iodepth1]
iodepth=1           # queue depth 1, i.e. a pure latency test
> -Original Message-
> From: Fulvio G
Hello Nick, very interesting reading, thanks!
What are you using to measure performance? Plain "fio" or something
else? Would you be willing to attach the relevant part of
the benchmark tool configuration to the article?
Thanks!
Fulvio
Original Message