>
> So, questions: does that really matter? What are the possible impacts? What
> could have caused these 2 hosts to hold so many capabilities?
> One of the hosts is for test purposes; traffic is close to zero. The other
> host wasn't using cephfs at all. All services stopped.
>
The reason might be upd
On Thu, Dec 14, 2017 at 4:44 PM, Webert de Souza Lima
wrote:
> Hi Patrick,
>
> On Thu, Dec 14, 2017 at 7:52 PM, Patrick Donnelly
> wrote:
>>
>>
>> It's likely you're a victim of a kernel backport that removed a dentry
>> invalidation mechanism for FUSE mounts. The result is that ceph-fuse
>> can't trim dentries.
> Is this useful for someone?
Yes!
See http://tracker.ceph.com/issues/21259
The latest luminous branch (which you can get from
https://shaman.ceph.com/builds/ceph/luminous/) has some additional
debugging on OSD shutdown that should help me figure out what is causing
this. If this is somethin
On Wed, Dec 13, 2017 at 11:39 PM, Nick Fisk wrote:
> Boom!! Fixed it. Not sure if the behavior I stumbled upon is correct, but
> this has the potential to break a few things for people moving from Jewel to
> Luminous if they had a few too many PGs.
>
>
>
> Firstly, how I stumbled across
Hi,
I used 3 nodes to deploy MDS (each node also has a mon on it).
My config:
[mds.ceph-node-10-101-4-17]
mds_standby_replay = true
mds_standby_for_rank = 0
[mds.ceph-node-10-101-4-21]
mds_standby_replay = true
mds_standby_for_rank = 0
[mds.ceph-node-10-101-4-22]
mds_standby_replay = true
mds_stand
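With a config like the above, a quick way to verify which daemon ends up
active and which in standby-replay is the status output (a sketch; it assumes
a Luminous cluster and that the filesystem is named "cephfs"):

ceph fs status cephfs   # lists active / standby-replay / standby per MDS
ceph mds stat           # compact one-line summary of MDS states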
On Fri, Dec 15, 2017 at 1:18 AM, Webert de Souza Lima
wrote:
> Hi,
>
> I've been looking at ceph mds perf counters and I saw that one of my clusters
> was hugely different from the others in number of caps:
>
> rlat inos caps | hsr hcs hcr | writ read actv | recd recy stry purg | segs evts subm
>
Hi Patrick,
On Thu, Dec 14, 2017 at 7:52 PM, Patrick Donnelly
wrote:
>
> It's likely you're a victim of a kernel backport that removed a dentry
> invalidation mechanism for FUSE mounts. The result is that ceph-fuse
> can't trim dentries.
>
Even though I'm not using FUSE? I'm using a kernel mount.
Hi there all,
Perhaps someone can help.
We tried to free some storage, so we deleted a lot of S3 objects. The bucket
also has valuable data, so we can't delete the whole bucket.
Everything went fine, but used storage space isn't decreasing. We are
expecting several TB of data to be freed.
We th
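Deleted S3 objects are reclaimed asynchronously by RGW garbage collection,
so one thing worth checking (a sketch, not a guaranteed fix) is whether the
GC queue is backed up:

radosgw-admin gc list --include-all   # show pending garbage collection entries
radosgw-admin gc process              # kick off a GC cycle immediately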
On 12/14/2017 04:00 AM, Martin Emrich wrote:
Hi!
On 13.12.17 at 20:50, Graham Allan wrote:
After our Jewel to Luminous 12.2.2 upgrade, I ran into some of the
same issues reported earlier on the list under "rgw resharding
operation seemingly won't end".
Yes, those were/are my threads; I also have this issue.
James,
Usually once the misplaced data has balanced out, the cluster should
reach a healthy state. If you run "ceph health detail", Ceph will
show you some more detail about what is happening. Is Ceph still
recovering, or has it stalled? Has the "objects misplaced (62.511%)"
figure changed to a lower %?
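A minimal way to keep an eye on that, assuming admin CLI access:

ceph health detail   # per-PG detail on degraded/misplaced objects
ceph -s              # one-shot status, including recovery throughput
ceph -w              # follow status updates as recovery proceeds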
Thanks Cary!
Your directions worked on my first server (once I found the missing carriage
return in your list of commands; the email must have messed it up).
For anyone else:
chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.
On 14.12.2017 18:34, James Okken wrote:
Hi all,
Please let me know if I am missing steps or using the wrong steps.
I'm hoping to expand my small Ceph cluster by adding 4TB hard drives to each of
the 3 servers in the cluster.
I also need to change my replication factor from 1 to 3.
This is part
On Thu, Dec 14, 2017 at 9:18 AM, Webert de Souza Lima
wrote:
> So, questions: does that really matter? What are the possible impacts? What
> could have caused these 2 hosts to hold so many capabilities?
> One of the hosts is for test purposes; traffic is close to zero. The other
> host wasn't using cephfs at all. All services stopped.
Jim,
I am not an expert, but I believe I can assist.
Normally you will only have 1 OSD per drive. I have heard discussions
about using multiple OSDs per disk when using SSDs, though.
Once your drives have been installed, you will have to format them
unless you are using Bluestore. My steps for
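Cary's list is cut off above; purely as an illustration (not his actual
steps), a Luminous-era OSD can be brought up with ceph-volume, assuming
/dev/sdd is the new 4TB drive:

ceph-volume lvm create --data /dev/sdd   # prepares and activates the OSD in one step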
Hi all,
Please let me know if I am missing steps or using the wrong steps.
I'm hoping to expand my small Ceph cluster by adding 4TB hard drives to each of
the 3 servers in the cluster.
I also need to change my replication factor from 1 to 3.
This is part of an Openstack environment deployed by F
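For the replication factor change, it is the per-pool size setting that
changes (a sketch; "mypool" is a placeholder for the real pool name):

ceph osd pool set mypool size 3       # target 3 replicas; triggers backfill
ceph osd pool set mypool min_size 2   # keep serving I/O with 2 of 3 replicas up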
Hi,
I've been looking at ceph mds perf counters and I saw that one of my clusters
was hugely different from the others in number of caps:
rlat inos caps | hsr hcs hcr | writ read actv | recd recy stry purg | segs evts subm
0 3.0M 5.1M | 0 0 595 | 30440 | 0 0 13k
0
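Those columns come from the MDS perf counters and can be watched live on the
MDS host (a sketch; "a" stands in for the MDS daemon id):

ceph daemonperf mds.a         # live columnar view like the one above
ceph daemon mds.a perf dump   # full JSON dump of the same counters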
We see high disk latencies on a node when the controller's cache battery
dies. This is assuming that you're using a controller with cache enabled
for your disks. In any case, I would look at the hardware on the server.
On Thu, Dec 14, 2017 at 10:15 AM John Petrini wrote:
> Anyone have any ideas on this?
I've tracked this in a much more manual way. I would grab a random subset
of PGs in the pool and query the PGs, counting how much was in their
queues. After that, you average it out by how many PGs you queried and how
many objects there were, and multiply it back out by how many PGs are in the
pool.
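A rough shell sketch of that sampling, assuming the snap trim queue is the
queue in question; "mypool" and the sample size of 20 are placeholders:

for pg in $(ceph pg ls-by-pool mypool | awk 'NR>1 {print $1}' | shuf -n 20); do
  echo -n "$pg "
  ceph pg "$pg" query | grep -m1 snap_trimq   # per-PG snap trim queue
done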
On 12/14/2017 09:46 AM, nigel davies wrote:
Is this nfs-ganesha exporting Cephfs? Yes
Are you using NFS for a VMware Datastore? Yes
What are you using for the NFS failover? (this is where I could be going
wrong)
When creating the NFS Datastore I added the two NFS servers' IP addresses in
NFS
Anyone have any ideas on this?
Is this nfs-ganesha exporting Cephfs? Yes
Are you using NFS for a VMware Datastore? Yes
What are you using for the NFS failover? (this is where I could be going
wrong)
When creating the NFS Datastore I added the two NFS servers' IP addresses in
On Thu, Dec 14, 2017 at 2:29 PM, David C wrote:
On Thu, Dec 14, 2017 at 8:52 PM, Florent B wrote:
> On 14/12/2017 03:38, Yan, Zheng wrote:
>> On Thu, Dec 14, 2017 at 12:49 AM, Florent B wrote:
>>>
>>> Systems are on Debian Jessie: kernel 3.16.0-4-amd64 & libfuse 2.9.3-15.
>>>
>>> I don't know pattern of corruption, but according to error mess
Hi,
We recently ran into low disk space issues on our clusters, and it wasn't
because of actual data. On those affected clusters we're hosting VMs and
volumes, so naturally there are snapshots involved. For some time, we
observed increased disk space usage that we couldn't explain, as there wa
Is this nfs-ganesha exporting Cephfs?
Are you using NFS for a VMware Datastore?
What are you using for the NFS failover?
We need more info, but this does sound like a VMware/NFS question rather
than specifically a Ceph/nfs-ganesha one.
On Thu, Dec 14, 2017 at 1:47 PM, nigel davies wrote:
> Hey all
>
>
Hello,
I have the following doubts; could you please help me out?
I am using the S3 APIs. What is the max number of objects a bucket can have
when using an indexless bucket?
What is the max number of buckets a user can create?
Can we have both indexless and indexed buckets at the same time? Do we have
any co
On Thu, 14 Dec 2017, Stefan Priebe - Profihost AG wrote:
>
> On 14.12.2017 at 13:22, Sage Weil wrote:
> > On Thu, 14 Dec 2017, Stefan Priebe - Profihost AG wrote:
> >> Hello,
> >>
> >> On 21.11.2017 at 11:06, Stefan Priebe - Profihost AG wrote:
> >>> Hello,
> >>>
> >>> to measure performance / l
On 14.12.2017 at 13:22, Sage Weil wrote:
> On Thu, 14 Dec 2017, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> On 21.11.2017 at 11:06, Stefan Priebe - Profihost AG wrote:
>>> Hello,
>>>
>>> to measure performance / latency for filestore we used:
>>> filestore:apply_latency
>>> filestore:commitcycle_latency
Hey all,
I am in the process of trying to set up a VMware storage environment.
I've been reading and found that iSCSI (on the Jewel release) can cause issues
and the datastore can drop out.
I've been looking at using nfs-ganesha with my Ceph platform; it all looked
good until I looked at failover to our 2
On 29/11/17 17:24, Matthew Vernon wrote:
> We have a 3,060 OSD ceph cluster (running Jewel
> 10.2.7-0ubuntu0.16.04.1), and one OSD on one host keeps misbehaving - by
> which I mean it keeps spinning ~100% CPU (cf ~5% for other OSDs on that
> host), and having ops blocking on it for some time. It w
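For a daemon in that state, the admin socket can show what the ops are stuck
on (a sketch; osd.123 stands in for the misbehaving OSD, run on its host):

ceph daemon osd.123 dump_ops_in_flight   # ops currently blocked in the OSD
ceph daemon osd.123 dump_historic_ops    # recent slow ops with per-stage timings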
On Thu, 14 Dec 2017, Stefan Priebe - Profihost AG wrote:
> Hello,
>
> On 21.11.2017 at 11:06, Stefan Priebe - Profihost AG wrote:
> > Hello,
> >
> > to measure performance / latency for filestore we used:
> > filestore:apply_latency
> > filestore:commitcycle_latency
> > filestore:journal_latency
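Those counters can be read per OSD via the admin socket (a sketch; osd.0 and
the jq filter are illustrative):

ceph daemon osd.0 perf dump | jq '.filestore | {apply_latency, commitcycle_latency, journal_latency}'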
On Thu, Dec 14, 2017 at 12:52 AM, Jens-U. Mozdzen wrote:
> Hi Yan,
>
> Quoting "Yan, Zheng":
>>
>> [...]
>>
>> It's likely some clients had caps on unlinked inodes, which prevent
>> the MDS from purging objects. When a file gets deleted, the MDS notifies all
>> clients; clients are supposed to drop cor
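One way to see whether strays are piling up because of such caps (a sketch;
"a" stands in for the MDS id, and exact counter names vary a bit by release):

ceph daemon mds.a perf dump | jq '.mds_cache | {num_strays, num_strays_delayed}'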
Hi Jared,
did you find a solution to your problem? It appears that I
have the same OSD problem, and tcpdump captures don't point to a solution.
All OSD nodes produced logs like
2017-12-14 11:25:11.756552 7f0cc5905700 -1 osd.49 29546 heartbeat_check:
no reply from 172.16.5.155:6817 osd.
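Since the heartbeats name a concrete address and port, a basic reachability
check between the OSD hosts is a cheap next step (a sketch; the address is
taken from the log line above):

ping -c 3 172.16.5.155    # basic reachability on that network
nc -vz 172.16.5.155 6817  # can the heartbeat port itself be reached?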
Hi!
On 13.12.17 at 20:50, Graham Allan wrote:
After our Jewel to Luminous 12.2.2 upgrade, I ran into some of the same
issues reported earlier on the list under "rgw resharding operation
seemingly won't end".
Yes, those were/are my threads; I also have this issue.
I was able to correct the
Hi Gregory,
Thank you for your answer! Is there a way to not promote on "locking" when not
using EC pools? Is it possible to make this configurable? We don't use an EC
pool, so for us this mechanism is overhead. It only adds more load on both
pools and the network.
14.12.2017, 01:16, "Gregory Farnum":
Hi,
We see the following in the logs after we start a scrub for some osds:
ceph-osd.2.log:2017-12-14 06:50:47.180344 7f0f47db2700 0 log_channel(cluster)
log [DBG] : 1.2d8 scrub starts
ceph-osd.2.log:2017-12-14 06:50:47.180915 7f0f47db2700 -1 osd.2 pg_epoch: 11897
pg[1.2d8( v 11890'165209 (3221
Hello Matthew, thanks for your feedback!
Please clarify one point: do you mean that you recreated the pool as an
erasure-coded one, or that you recreated it as a regular replicated one?
I mean, you now have an erasure-coded pool in production as a gnocchi
backend?
In any case, from the insta