I am a user of CephFS.
Recently I ran into a problem while using cephfs-journal-tool.
Some strange things happened, described below.
1. After using cephfs-journal-tool and cephfs-table-tool (I had hit the
"negative object nums" issue, so I tried these tools to repair the CephFS), I
remount t
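For context, a repair attempt with these tools normally starts by exporting a journal backup and only then running the recovery/reset steps; a rough sketch (the backup path is just an example, and whether a journal/session reset is appropriate depends on the damage):
cephfs-journal-tool journal export /root/mds-journal.backup
cephfs-journal-tool event recover_dentries summary
cephfs-journal-tool journal reset
cephfs-table-tool all reset session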
Using ceph-deploy:
I have ceph-node1 as admin and mon, and I would like to add another mon
ceph-node2.
On ceph-node1:
ceph-deploy mon create ceph-node2
ceph-deploy mon add ceph-node2
The first command warns:
[ceph-node2][WARNIN] ceph-node2 is not defined in `mon initial members`
[ceph-node2][WAR
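One likely way around that warning (a sketch, assuming the admin node carries the cluster's ceph.conf) is to declare the new monitor before adding it:
# ceph.conf on ceph-node1, [global] section
mon_initial_members = ceph-node1, ceph-node2
mon_host = <ip-of-ceph-node1>, <ip-of-ceph-node2>
ceph-deploy --overwrite-conf config push ceph-node1 ceph-node2
ceph-deploy mon add ceph-node2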
I've seen something similar when using RBD caching: I found that if you
can fill the RBD cache faster than it can flush, you
get these stalls. I increased the size of the cache and also the flush
threshold and this solved the problem. I didn't spend much
time looking into it, but it seemed l
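For reference, the knobs involved here are the librbd client cache settings; a sketch with enlarged values (the numbers are illustrative, not tuned recommendations):
[client]
rbd_cache = true
rbd_cache_size = 268435456          # 256 MB, up from the 32 MB default
rbd_cache_max_dirty = 201326592     # flush threshold
rbd_cache_target_dirty = 134217728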
Hello
I don't really find any hardware problems. I have done disk checks and
looked at log files.
Should the OSD fail with a core dump if there are hardware problems?
All my data seems intact; I only have:
HEALTH_ERR 915 pgs are stuck inactive for more than 300 seconds; 915 pgs
down; 915 pgs peeri
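A few commands that usually help narrow down where the stuck PGs sit (nothing cluster-specific assumed, <pgid> is a placeholder):
ceph health detail | head -50
ceph pg dump_stuck inactive
ceph pg dump_stuck unclean
ceph osd tree
ceph pg <pgid> query     # for one of the stuck PGs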
On 13.07.16 at 17:44, David wrote:
> Aside from the 10GbE vs 40GbE question, if you're planning to export
> an RBD image over smb/nfs I think you are going to struggle to reach
> anywhere near 1GB/s in a single threaded read. This is because even
> with readahead cranked right up you're still only
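For what it's worth, "cranking up readahead" on a kernel-mapped RBD device is usually done along these lines (device name and values are only examples):
echo 4096 > /sys/block/rbd0/queue/read_ahead_kb
# or, equivalently, in 512-byte sectors:
blockdev --setra 8192 /dev/rbd0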
On 13.07.16 at 17:08, c...@jack.fr.eu.org wrote:
> I am using these for other stuff:
> http://www.supermicro.com/products/accessories/addon/AOC-STG-b4S.cfm
>
> If you want NICs, also think of the "network side": SFP+ switches are very
> common, 40G is less common, 25G is really new (= really few pro
>>> Christian Balzer wrote on Thursday, 14 July 2016 at 05:05:
Hello,
> Hello,
>
> On Wed, 13 Jul 2016 09:34:35 + Ashley Merrick wrote:
>
>> Hello,
>>
>> Looking at using 2 x 960GB SSD's (SM863)
>>
> Massive overkill.
>
>> Reason for larger is I was thinking would be better off w
Hi,
we have a problem with drastic performance degradation on a cluster. We use
radosgw with the S3 protocol. Our configuration:
153 OSDs on 1.2TB SAS disks with journals on SSD disks (ratio 4:1)
- no problems with networking, no hardware issues, etc.
Output from "ceph df":
GLOBAL:
SIZE AVAIL RAW
Hi Jaroslaw,
several things spring to mind. I'm assuming the cluster is
healthy (other than the slow requests), right?
From the (little) information you sent, it seems the pools are
replicated with size 3, is that correct?
Are there any long running delete processes? They usually have a
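One way to check whether deletes are still being worked through in the background on the RGW side (a sketch, assuming default gc settings):
radosgw-admin gc list --include-all | head
radosgw-admin gc process      # force a gc pass if the backlog is large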
2016-07-14 15:26 GMT+02:00 Luis Periquito :
> Hi Jaroslaw,
>
> several things spring to mind. I'm assuming the cluster is
> healthy (other than the slow requests), right?
>
>
Yes.
> From the (little) information you sent, it seems the pools are
> replicated with size 3, is that correct
I think the first symptoms of our problems occurred when we posted this
issue:
http://tracker.ceph.com/issues/15727
Regards
--
Jarek
--
Jarosław Owsiewski
2016-07-14 15:43 GMT+02:00 Jaroslaw Owsiewski <
jaroslaw.owsiew...@allegrogroup.com>:
> 2016-07-14 15:26 GMT+02:00 Luis Periquito :
>
>>
Hello,
On Thu, 14 Jul 2016 13:37:54 +0200 Steffen Weißgerber wrote:
>
>
> >>> Christian Balzer wrote on Thursday, 14 July 2016 at 05:05:
>
> Hello,
>
> > Hello,
> >
> > On Wed, 13 Jul 2016 09:34:35 + Ashley Merrick wrote:
> >
> >> Hello,
> >>
> >> Looking at using 2 x 960GB SS
This is fairly standard for container deployment: one app per container
instance. This is how we're deploying docker in our upstream
ceph-docker / ceph-ansible as well.
Daniel
On 07/13/2016 08:41 PM, Łukasz Jagiełło wrote:
Hi,
Just wondering why you want each OSD inside a separate LXC container?
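For illustration, one-OSD-per-container with the ceph/daemon image looks roughly like the following; the exact environment variables and mounts should be checked against the ceph-docker README, this is only a sketch from memory:
docker run -d --net=host --privileged=true --pid=host \
  -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev \
  -e OSD_DEVICE=/dev/sdb \
  ceph/daemon osd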
Something in this section is causing the 0 IOPS issue. I have not been able
to nail it down yet. (I did comment out the filestore_max_inline_xattr_size
entries, and the problem still exists.)
If I take out the whole [osd] section, I am able to get rid of IOPS staying at
0 for long periods of time
Try increasing the following to say 10
osd_op_num_shards = 10
filestore_fd_cache_size = 128
I hope the following is something you introduced after I told you, so it
shouldn't be the cause, it seems (?)
filestore_odsync_write = true
Also, comment out the following.
filestore_wbthrottle_enable = false
Fr
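Put together, the suggested [osd] section would look roughly like this (the values are the ones proposed above, not general-purpose recommendations):
[osd]
osd_op_num_shards = 10
filestore_fd_cache_size = 128
filestore_odsync_write = true
#filestore_wbthrottle_enable = false    <- commented out for now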
Disregard the last msg. Still getting long 0 IOPS periods.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg,
Pankaj
Sent: Thursday, July 14, 2016 10:05 AM
To: Somnath Roy; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Terrible RBD performance with Jewel
Someth
We have been observing this similar behavior. Usually it is the case where
we create a new rbd image, expose it to the guest and perform any
operation that issues discards to the device.
A typical command that's first run on a given device is mkfs, usually with
discard on.
# time mkfs.xfs -s siz
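A quick way to confirm it really is the discard pass that hurts is to repeat the same mkfs with discard disabled and compare the timings (device name is just an example):
time mkfs.xfs -K /dev/vdb     # -K: skip discarding blocks at mkfs time
time mkfs.xfs /dev/vdb        # default: discards the whole device first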
I would probably be able to resolve the issue fairly quickly if it
would be possible for you to provide a RBD replay trace from a slow
and fast mkfs.xfs test run and attach it to the tracker ticket I just
opened for this issue [1]. You can follow the instructions here [2]
but would only need to per
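Roughly, capturing such a trace means running the workload under LTTng with the librbd tracepoints enabled (and, if I remember correctly, rbd_tracing = true on the client) and then converting it; a sketch from memory, the exact steps are in the instructions referenced above:
lttng create rbd-mkfs
lttng enable-event -u 'librbd:*'
lttng start
# ... run the slow/fast mkfs.xfs test against the image ...
lttng stop
rbd-replay-prep ~/lttng-traces/rbd-mkfs*/ust/uid/*/* mkfs-replay.bin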
Hi,
thanks for the suggestion. I tried it out.
No effect.
My ceph.conf looks like:
[osd]
osd_pool_default_crush_replicated_ruleset = 2
osd_pool_default_size = 2
osd_pool_default_min_size = 1
The complete config: http://pastebin.com/sG4cPYCY
But the config is completely ignored.
If I run
# ceph osd
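One thing worth checking: the osd_pool_default_* values are applied when a pool is created, i.e. they are read on the monitor/client side, not by the OSD daemons, so placing them under [osd] effectively ignores them for pool creation. A sketch with them moved to [global]:
[global]
osd_pool_default_crush_replicated_ruleset = 2
osd_pool_default_size = 2
osd_pool_default_min_size = 1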
Thanks for all your answers,
Today people dedicate servers to act as Ceph OSD nodes, which serve the data
stored on them to other dedicated servers that run applications or VMs. Can
we think about squashing the two into one?
On 14 Jul 2016 at 18:15, "Daniel Gryniewicz" wrote:
> This is fairly standa
Hi,
I have a cluster with 3 MON nodes and 5 OSD nodes. If I reboot
one of the OSD nodes I get slow requests waiting for active.
2016-07-14 19:39:07.996942 osd.33 10.255.128.32:6824/7404 888 : cluster
[WRN] slow request 60.627789 seconds old, received at 2016-07-14
19:38:07.369009: o
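For a planned reboot the usual first step is to set noout so the cluster does not start rebalancing while the node is down; peering will still cause a short blip, but the impact is much smaller. A sketch:
ceph osd set noout
# reboot the OSD node and wait for its OSDs to rejoin
ceph osd unset noout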
Hi,
Wow, figured it out.
If you don't have a ruleset with id 0, you are in trouble.
So the solution is that you >MUST< have a ruleset with id 0.
--
Mit freundlichen Gruessen / Best regards
Oliver Dzombic
IP-Interactive
mailto:i...@ip-interactive.de
Address:
IP Interactive UG ( haftungsbeschraenkt
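To verify which ruleset ids actually exist, and which one a pool uses (pool name is a placeholder):
ceph osd crush rule dump
ceph osd pool get <pool> crush_ruleset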
Hi list,
I ran into an issue customizing librbd (linked with jemalloc) to work with the
stock qemu in Ubuntu Trusty.
Stock qemu depends on librbd1 and librados2 (0.80.x). These two libraries
will be installed at /usr/lib/x86_64-linux-gnu/lib{rbd,rados}.so. The path
is included in /etc/ld.so.conf.
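The usual workarounds are to give the custom library path priority in the loader configuration, or to point the qemu process at it explicitly; a sketch, with /opt/ceph-jemalloc as an assumed install prefix:
# option 1: loader config listed before the multiarch path
echo '/opt/ceph-jemalloc/lib' > /etc/ld.so.conf.d/00-ceph-custom.conf
ldconfig
# option 2: per-process override
LD_LIBRARY_PATH=/opt/ceph-jemalloc/lib qemu-system-x86_64 ...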
I would suggest caution with "filestore_odsync_write" - it's fine on good SSDs,
but on poor SSDs or spinning disks it will kill performance.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Somnath Roy
Sent: Friday, 15 July 2016 3:12 AM
To: Garg, Pankaj; ceph-users@list
Good job, thank you for sharing, Wido ~
It's very useful ~
2016-07-14 14:33 GMT+08:00 Wido den Hollander :
> To add, the RGWs upgraded just fine as well.
>
> No regions in use here (yet!), so that upgraded as it should.
>
> Wido
>
> > On 13 July 2016 at 16:56, Wido den Hollander wrote:
> >
> >
>
Hi All...
I've seen that Zheng, Brad, Pat and Greg already updated or made some
comments on the bug issue. Zheng also proposes a simple patch. However,
I do have a bit more information. We do think we have identified the
source of the problem and that we can correct it. Therefore, I would
pro
On Fri, Jul 15, 2016 at 11:35:10AM +1000, Goncalo Borges wrote:
> Hi All...
>
> I've seen that Zheng, Brad, Pat and Greg already updated or made some
> comments on the bug issue. Zheng also proposes a simple patch. However, I do
> have a bit more information. We do think we have identified the sou
On Fri, Jul 15, 2016 at 9:35 AM, Goncalo Borges
wrote:
> Hi All...
>
> I've seen that Zheng, Brad, Pat and Greg already updated or made some
> comments on the bug issue. Zheng also proposes a simple patch. However, I do
> have a bit more information. We do think we have identified the source of
>
On Fri, Jul 15, 2016 at 11:19:12AM +0800, Yan, Zheng wrote:
> On Fri, Jul 15, 2016 at 9:35 AM, Goncalo Borges
> wrote:
> > So, we are hoping that compiling 10.2.2 on an Intel processor without the
> > AVX extensions will solve our problem.
> >
> > Does this make sense?
>
> I have a different the
Thanks Zheng...
Now that we have identified the exact context in which the segfault appears
(only on AMD 62XX), I think it should be possible to understand in which
situations the crash appears.
My current compilation is ongoing and I will then test it.
If it fails, I will recompile including your
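A quick way to test the CPU-extension theory while the build runs (a sketch; the -mno-avx flags are my assumption about how one would disable AVX code generation, not the patch Zheng proposed):
grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u          # compare the build host vs. the AMD 62xx nodes
./configure CFLAGS='-mno-avx' CXXFLAGS='-mno-avx'   # build explicitly without AVX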
You may want to change value of "osd_pool_default_crush_replicated_ruleset".
shinobu
On Fri, Jul 15, 2016 at 7:38 AM, Oliver Dzombic
wrote:
> Hi,
>
> wow, figured it out.
>
> If you dont have a ruleset 0 id, you are in trouble.
>
> So the solution is, that you >MUST< have a ruleset id 0.
>
> -
Hello George,
I did what you suggested, but it didn't help... no autostart - I have to
start them manually
root@cephosd01:~# sgdisk -i 1 /dev/sdb
Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
Partition unique GUID: 48B7EC4E-A582-4B84-B823-8C3A36D9BB0A
First sector: 1048
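Note that 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D is in fact the Ceph OSD data partition type GUID (sgdisk just prints "Unknown" because it has no name for it), so the typecode itself looks fine. What I would check next is whether the udev / ceph-disk activation path actually runs at boot; a sketch:
ceph-disk activate-all                                # does manual activation work?
udevadm trigger --subsystem-match=block --action=add  # replay the udev add events
ls /lib/udev/rules.d/ | grep ceph                     # are the 95-ceph-*.rules installed?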