Hello!
On Fri, Oct 09, 2015 at 01:45:42PM +0200, jan wrote:
> Have you tried running iperf between the nodes? Capturing a pcap of the
> (failing) Ceph comms from both sides could help narrow it down.
> Is there any SDN layer involved that could add overhead/padding to the frames?
> What about s
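For anyone following the same checks, the suggestions above boil down to something like this (a rough sketch; the interface name, node name and output file are assumptions rather than details from the thread, and 6789 / 6800-7300 are the default monitor and OSD ports):
iperf -s                                                            # on the first node
iperf -c node2                                                      # on the second node, to measure raw throughput between them
tcpdump -i eth0 -w ceph.pcap 'port 6789 or portrange 6800-7300'    # capture Ceph traffic on both sides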
Hi Everyone,
I upgraded our cluster to Hammer 0.94.3 a couple of days ago and today
we've had one monitor crash twice and another one once. We have 3 monitors
total and have been running Firefly 0.80.10 for quite some time without any
monitor issues.
When the monitor crashes it leaves a core file a
Hi,
Is there a backtrace in /var/log/ceph/ceph-mon.*.log ?
Cheers, Dan
On Fri, Oct 16, 2015 at 12:46 PM, Richard Bade wrote:
> Hi Everyone,
> I upgraded our cluster to Hammer 0.94.3 a couple of days ago and today we've
> had one monitor crash twice and another one once. We have 3 monitors total
> and have been running Firefly 0.80.10 for quite some time without any monitor issues.
Thanks for your quick response Dan, but no. All the ceph-mon.*.log files
are empty.
I did track this down in syslog though, in case it helps:
ceph-mon: 2015-10-16 21:25:00.117115 7f4c9f458700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f4c9f458700

 ceph version 0.94.3 (95cefe
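If it crashes again, the lines around that syslog entry usually contain the full backtrace frames; they can be pulled out with something along these lines (the syslog path is the Ubuntu default and an assumption here):
grep -A60 'Caught signal' /var/log/syslog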
Hmm, that's strange. I didn't see anything in the tracker that looks
related. Hopefully an expert can chime in...
Cheers, Dan
On Fri, Oct 16, 2015 at 1:38 PM, Richard Bade wrote:
> Thanks for your quick response Dan, but no. All the ceph-mon.*.log files are
> empty.
> I did track this down in syslog though, in case it helps:
This doesn't look familiar. Are you able to enable a higher log level so
that if it happens again we'll have more info?
debug mon = 20
debug ms = 1
Thanks!
sage
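For reference, the same settings can be made persistent by putting them in the [mon] section of ceph.conf on each monitor host; a sketch of the placement (not quoted from the thread):
[mon]
    debug mon = 20
    debug ms = 1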
On Fri, 16 Oct 2015, Dan van der Ster wrote:
> Hmm, that's strange. I didn't see anything in the tracker that looks
> related. Hopefully an expert can chime in...
Hi all,
I'm trying to upgrade a Ceph cluster (previously on the Hammer release 0.94.3) to the
latest release of *Infernalis* (9.1.0-61-gf2b9f89). So far so good while
upgrading the mon servers; they all work fine. But then, when trying to upgrade
the OSD servers, I got an error while trying to start the osd services ag
You need to make sure that you go through 0.94.4 (the not-yet-released
version) before the OSDs will boot on the latest Infernalis. You can
get the packages from gitbuilder.ceph.com in the Hammer branch.
Install the packages (downgrade), and start up
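Once those packages are installed, a quick way to confirm what each daemon is actually running (standard admin commands, not quoted from the thread; the local monitor name is an assumption):
ceph tell osd.\* version
ceph daemon mon.$(hostname -s) version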
OK, I've set this up and now all I/O is locked up. I've reduced
target_max_bytes because one OSD was reporting 97% usage; there was
some I/O for a few seconds as things flushed, but client I/O is still
blocked. Anyone have some thoughts?
ceph osd cr
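For context, the cache-tier limits being tuned here are per-pool settings, along the lines of (a sketch; the pool name and values are made up for illustration):
ceph osd pool set hot-pool target_max_bytes 400000000000     # start flushing/evicting as the cache pool nears ~400 GB
ceph osd pool set hot-pool cache_target_full_ratio 0.8       # treat the cache pool as full at 80%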
I started another fio test to one of the same RBDs (leaving the hung
ones still hung) and it is working OK, but the hung ones are still
just hung.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, 16 Oct 2015, Robert LeBlanc wrote:
>
> I started another fio test to one of the same RBDs (leaving the hung
> ones still hung) and it is working OK, but the hung ones are still
> just hung.
There is a full-disk failsafe that is still so
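The failsafe referred to is presumably the OSD full-ratio protection; its current value can be checked, and temporarily raised, roughly like this (a sketch, not taken from the thread; 0.98 is only an example value):
ceph daemon osd.0 config get osd_failsafe_full_ratio
ceph tell osd.\* injectargs '--osd-failsafe-full-ratio 0.98'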
Is the only option to restart the librbd client in this case? Anything
I can do to help resolve it?
Thanks,
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, Oct 16, 2015 at 10:17 AM, Sage
On Fri, 16 Oct 2015, Robert LeBlanc wrote:
>
> Is the only option to restart the librbd client in this case? Anything
> I can do to help resolve it?
If you know which OSD the request is outstanding against (ceph daemon
objecter_requests) you c
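For a librbd client that means querying its admin socket, roughly like this (a sketch; the socket path depends on the client's 'admin socket' setting and is an assumption here):
ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok objecter_requests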
Ok, debugging increased
ceph tell mon.[abc] injectargs --debug-mon 20
ceph tell mon.[abc] injectargs --debug-ms 1
Regards,
Richard
On 17 October 2015 at 01:38, Sage Weil wrote:
> This doesn't look familiar. Are you able to enable a higher log level so
> that if it happens again we'll have more info?
Hi,
I've noticed that CephFS (both ceph-fuse and the kernel client in version
4.2.3) removes files from the page cache as soon as they are no longer in use
by any process.
Is this intended behaviour? We use CephFS as a replacement for NFS in
our HPC cluster. It should serve large files which are read
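One way to observe the behaviour is to read a file and then check how much of it remains resident, e.g. with vmtouch (the mount point and file name below are hypothetical):
cat /mnt/cephfs/bigfile > /dev/null
vmtouch /mnt/cephfs/bigfile     # reports how many of the file's pages are still in the page cache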
The Trusty packages have just been pushed out and should be ready to use right
away. We didn't realize this until today, sorry!
-Alfredo
On Wed, Oct 14, 2015 at 9:24 PM, Sage Weil wrote:
> On Thu, 15 Oct 2015, Francois Lafont wrote:
>
>> Sorry, another remark.
>>
>> On 13/10/2015 23:01, Sage Weil wrote:
>
After setting up the Ceph cluster, I tried to create a block device image from
QEMU, but I got this:
$ qemu-img create -f raw rbd:rbd/test 20G
Formatting 'rbd:rbd/test', fmt=raw size=21474836480
qemu-img: rbd:rbd/test: error connecting
There is a pool named rbd, and the output of ceph -s is:
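When qemu-img reports "error connecting", it usually means librbd could not reach the monitors or find credentials; passing the client id and config path explicitly in the rbd URI is a quick way to test that (a sketch; the id and conf path are assumptions, and the keyring location is taken from ceph.conf):
qemu-img create -f raw 'rbd:rbd/test:id=admin:conf=/etc/ceph/ceph.conf' 20G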