Hello,
I am currently working on an NFS HA cluster to export RBD images via NFS.
To test failover, I tried the following:
https://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/
I set the RBD image to exclusive-lock and the OSD and MON timeouts to 20
seconds.
On one NFS server I mapped the RBD
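A minimal sketch of that mapping step, assuming a hypothetical pool/image
name (rbd/nfs01), mount point and export subnet; the exact names and the
client timeout settings will differ per setup:
# rbd feature enable rbd/nfs01 exclusive-lock    # turn on exclusive locking for the image
# rbd map rbd/nfs01                              # shows up as e.g. /dev/rbd0
# mkfs.xfs /dev/rbd0                             # first setup only
# mount /dev/rbd0 /export/nfs01
# echo '/export/nfs01 192.168.0.0/24(rw,sync,no_root_squash)' >> /etc/exports
# exportfs -ra                                   # publish the export on this NFS head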
Hello,
I'm following
http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#ceph-volume-lvm-prepare-bluestore
to create new OSDs.
I took the latest branch from https://shaman.ceph.com/repos/ceph/luminous/
# ceph -v
ceph version 12.2.1-851-g6d9f216
What I did first was format the device:
# sgdisk
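For reference, the prepare flow from the linked documentation looks roughly
like this; /dev/sdX stands in for the freshly wiped device, and the OSD id
and fsid come from the prepare output:
# sgdisk --zap-all /dev/sdX                      # wipe any old partition table
# ceph-volume lvm prepare --bluestore --data /dev/sdX
# ceph-volume lvm list                           # shows the new OSD id and fsid
# ceph-volume lvm activate <osd-id> <osd-fsid>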
# ps axHo %cpu,stat,pid,tid,pgid,ppid,comm,wchan | grep ceph-osd
To find the actual thread that is using 100% CPU.
# for x in `seq 1 5`; do gdb -batch -p [PID] -ex "thr appl all bt"; echo; done > /tmp/osd.stack.dump
Then look at the stacks for the thread that was using all the CPU and
see what i
Hello,
You might consider checking iowait (during the problem) and the dmesg
output (after it recovered). Maybe an issue with the SATA/SAS/NVMe
port?
Regards,
Denes
On 11/29/2017 06:24 PM, Matthew Vernon wrote:
Hi,
We have a 3,060 OSD ceph cluster (running Jewel
10.2.7-0ubuntu0.16.04.1)
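A quick way to do the checks suggested above, with standard Linux tooling
(device name is an example; smartctl needs smartmontools installed):
# iostat -x 1                # watch %iowait and per-device await/%util while the problem occurs
# dmesg -T | tail -n 100     # look for SATA/SAS/NVMe resets or link errors after it recovers
# smartctl -a /dev/sdX       # optionally check the drive's own error counters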
Hello,
We are trying out Ceph on a small cluster and are observing a memory
leak in the OSD processes. The leak seems to be in addition to the known
leak related to the "buffer_anon" pool, and is large enough for the processes
to run into their memory limits within a few hours.
The following tab
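One way to see which pool the leaked memory is attributed to is the OSD
admin socket; osd.0 is just an example id, and 'heap stats' assumes the
build uses tcmalloc:
# ceph daemon osd.0 dump_mempools    # per-mempool usage, including buffer_anon
# ceph daemon osd.0 heap stats       # tcmalloc view of in-use vs. unmapped memory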
On 2017-11-27 14:02, German Anders wrote:
> 4x 2U servers:
> 1x 82599ES 10-Gigabit SFI/SFP+ Network Connection
> 1x Mellanox ConnectX-3 InfiniBand FDR 56Gb/s Adapter (dual port)
so I assume you are using IPoIB as the cluster network for the
replication...
> 1x OneConnect 10Gb NIC (quad-port) - in
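If IPoIB is indeed carrying the replication traffic, the split normally
just shows up in ceph.conf; the subnets below are placeholders:
[global]
public network  = 10.0.0.0/24     # client-facing 10GbE
cluster network = 10.0.1.0/24     # IPoIB subnet used for replication/recovery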
On Tue, Nov 28, 2017 at 1:50 PM, Jens-U. Mozdzen wrote:
> Hi David,
>
> Quoting David C:
>
>> On 27 Nov 2017 1:06 p.m., "Jens-U. Mozdzen" wrote:
>>
>> Hi David,
>>
>> Quoting David C:
>>
>> Hi Jens
>>
>>>
>>> We also see these messages quite frequently, mainly the "replicating
>>> dir...".
Hi Matthew,
is anything special happening on the NIC side that could cause a problem? Packet
drops? Incorrect jumbo frame settings causing fragmentation?
Have you checked the cstate settings on the box?
Have you disabled energy saving settings differently from the other boxes?
Any unexpected wait
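Some quick checks along those lines; eth0 is a placeholder interface name:
# ip -s link show dev eth0                         # rx/tx drops and errors
# ethtool -S eth0 | grep -iE 'drop|err|miss'       # NIC-level counters
# ip link show dev eth0 | grep mtu                 # confirm the MTU matches the other boxes
# cpupower idle-info                               # which C-states are enabled on this box
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor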
Hi *,
while tracking down a different performance issue with CephFS
(creating tarballs from CephFS-based directories takes several times
as long as backing up the same data from local disks, i.e. 56
hours instead of 7), we had a look at CephFS performance related to
the size of the
On Wed, Nov 29, 2017 at 6:52 PM, Aristeu Gil Alves Jr
wrote:
>> > Does s3 or swifta (for Hadoop or Spark) have integrated data-layout
>> > APIs for processing data locally, the way the CephFS Hadoop plugin does?
>> >
>> With s3 and swift you won't have data locality, as they were designed for
>> the public cloud
On Wed, Nov 29, 2017 at 6:54 PM, Gregory Farnum wrote:
> On Wed, Nov 29, 2017 at 8:52 AM Aristeu Gil Alves Jr
> wrote:
>>>
>>> > Does s3 or swifta (for Hadoop or Spark) have integrated data-layout
>>> > APIs for processing data locally, the way the CephFS Hadoop plugin does?
>>> >
>>> With s3 and swift
Hi,
We have a 3,060 OSD ceph cluster (running Jewel
10.2.7-0ubuntu0.16.04.1), and one OSD on one host keeps misbehaving - by
which I mean it keeps spinning ~100% CPU (cf ~5% for other OSDs on that
host), and having ops blocking on it for some time. It will then behave
for a bit, and then go back t
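One way to see what those blocked ops are waiting on is the OSD's admin
socket on that host; osd.NNN stands for the misbehaving OSD's id:
# ceph daemon osd.NNN dump_ops_in_flight    # currently outstanding/slow requests
# ceph daemon osd.NNN dump_historic_ops     # recent slow ops with per-step timings
# ceph daemon osd.NNN perf dump             # performance counters (queue lengths etc.)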
On Wed, Nov 29, 2017 at 3:44 AM, Jens-U. Mozdzen wrote:
> Hi *,
>
> we recently switched to using CephFS (with Luminous 12.2.1). On one
> node, we're kernel-mounting the CephFS (kernel 4.4.75, openSUSE version) and
> exporting it via kernel nfsd. As we're transitioning right now, a number of
> m
On Wed, Nov 29, 2017 at 8:52 AM Aristeu Gil Alves Jr
wrote:
>> > Does s3 or swifta (for Hadoop or Spark) have integrated data-layout
>> > APIs for processing data locally, the way the CephFS Hadoop plugin does?
>> >
>> With s3 and swift you won't have data locality, as they were designed for
>> the public cloud.
>
>
> > Does s3 or swifta (for Hadoop or Spark) have integrated data-layout
> > APIs for processing data locally, the way the CephFS Hadoop plugin does?
> >
> With s3 and swift you won't have data locality, as they were designed for
> the public cloud.
> We recommend disabling locality-based scheduling in Hadoop when running
Hi,
On Wed, Nov 29, 2017 at 5:32 PM, Aristeu Gil Alves Jr
wrote:
> Orit,
>
> As I mentioned, I have had CephFS in production for almost two years.
> Can I use this existing filesystem, or do I need to start from scratch? If the
> former, is there any tutorial you recommend for adding s3 to an
Orit,
As I mentioned, I have had CephFS in production for almost two years.
Can I use this existing filesystem, or do I need to start from scratch? If the
former, is there any tutorial you recommend for adding s3 to an
existing installation, or to Ceph in general?
Does s3 or swifta (for Hadoop or Spark) have integrated data-layout APIs for
processing data locally, the way the CephFS Hadoop plugin does?
We experienced this problem in the past on older (pre-Jewel) releases
where a PG split that affected the RBD header object would result in
the watch getting lost by librados. Any chance you know if the
affected RBD header objects were involved in a PG split? Can you
generate a gcore dump of one of
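For reference, a core of the running librbd client can be taken with gdb's
gcore; the pgrep pattern is only a hypothetical way to find the qemu process:
# pid=$(pgrep -f 'qemu.*volume-79773f2e' | head -n1)
# gcore -o /tmp/qemu-librados $pid          # writes /tmp/qemu-librados.<pid>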
We've seen this. Our environment isn't identical, though: we use oVirt and
connect to Ceph (11.2.1) via Cinder (9.2.1), but it's so very rare that we've
never had any luck in pinpointing it, and we have far fewer VMs, <300.
Regards,
Logan
- On Nov 29, 2017, at 7:48 AM, Wido den Hollander wrote:
Hi,
In an OpenStack environment I encountered a VM which went into R/O mode after an
RBD snapshot was created.
Digging into this I found tens (out of thousands) of RBD images which DO have a
running VM, but do NOT have a watcher on the RBD image.
For example:
$ rbd status volumes/volume-79773f2e-1f
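To cross-check from the cluster side, the watch is registered on the image's
header object; <volume> and <id> below are placeholders taken from rbd info:
$ rbd info volumes/<volume> | grep block_name_prefix   # e.g. rbd_data.<id>
$ rados -p volumes listwatchers rbd_header.<id>        # should list the client holding the watch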
On 11/29/17 00:06, Nigel Williams wrote:
> Are there opinions on how stable multiple filesystems per single Ceph
> cluster are in practice?
We have been using a single CephFS in production since February, and switched
to three CephFS filesystems in September, without any problem so far (running
12.2.1). The workload is
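For reference, running more than one filesystem per cluster has to be enabled
explicitly; pool names, PG counts and the fs name below are placeholders, and
each extra filesystem needs its own MDS rank:
# ceph fs flag set enable_multiple true --yes-i-really-mean-it
# ceph osd pool create cephfs2_metadata 64
# ceph osd pool create cephfs2_data 256
# ceph fs new cephfs2 cephfs2_metadata cephfs2_data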
On Wed, Nov 29, 2017 at 7:06 AM, Nigel Williams
wrote:
> On 29 November 2017 at 01:51, Daniel Baumann wrote:
>> On 11/28/17 15:09, Geoffrey Rhodes wrote:
>>> I'd like to run more than one Ceph file system in the same cluster.
>
> Are there opinions on how stable multiple filesystems per single Ce
On 27/11/2017 at 14:36, Alfredo Deza wrote:
> For the upcoming Luminous release (12.2.2), ceph-disk will be
> officially in 'deprecated' mode (bug fixes only). A large banner with
> deprecation information has been added, which will try to raise
> awareness.
>
> We are strongly suggesting using ceph-volume
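For existing ceph-disk OSDs the suggested takeover path is ceph-volume's
'simple' mode, roughly as follows (12.2.2+ assumed; device and OSD id/fsid
are placeholders), while new OSDs would use the lvm subcommand:
# ceph-volume simple scan /dev/sdX1              # capture the ceph-disk OSD's metadata as JSON
# ceph-volume simple activate <osd-id> <osd-fsid>
# ceph-volume lvm create --bluestore --data /dev/sdY   # new OSDs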
Is it possible that in Ubuntu, with kernel version 4.12.14 at least, the
parameter comes enabled as [madvise] by default?
German
2017-11-28 12:07 GMT-03:00 Nigel Williams :
> Given that memory is a key resource for Ceph, this advice about switching
> Transparent Huge Pages kernel setting
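This is easy to verify on the box in question; the output line shown is
illustrative, with the brackets marking the active mode:
# cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
# echo never > /sys/kernel/mm/transparent_hugepage/enabled   # disable for a quick test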
Hi *,
we recently switched to using CephFS (with Luminous 12.2.1). On
one node, we're kernel-mounting the CephFS (kernel 4.4.75, openSUSE
version) and exporting it via kernel nfsd. As we're transitioning right
now, a number of machines still auto-mount users' home directories from
that n
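For context, such a setup typically looks roughly like this; the monitor
address, secret file and fsid value are placeholders, and kernel nfsd needs
an explicit fsid= when exporting a CephFS mount:
# mount -t ceph 192.168.0.1:6789:/ /cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
# echo '/cephfs *(rw,sync,fsid=100,no_subtree_check)' >> /etc/exports
# exportfs -ra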
Hi German,
I would personally prefer to use rados bench / fio, which are more commonly
used, to benchmark the cluster first and then later do MySQL-specific tests
using sysbench. Another suggestion is to run the client test simultaneously on
more than one machine and aggregate/add the performance numbers of each; the
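A couple of starting points along those lines, with placeholder pool and
image names (the fio line assumes fio was built with the rbd engine):
# rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup   # 60s of 4 MB writes, 16 threads
# rados bench -p testpool 60 rand                             # random reads against the objects left behind
# fio --name=rbdtest --ioengine=rbd --clientname=admin --pool=testpool \
      --rbdname=testimg --rw=randwrite --bs=4k --iodepth=32 \
      --runtime=60 --time_based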