From: Dennis Kramer (DBS) [den...@holmes.nl]
Sent: 30 August 2016 20:59
To: Goncalo Borges; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-fuse "Transport endpoint is not connected" on
Jewel 10.2.2
Hi Goncalo,
Thank you for providing the info below. I'm getting
Sorry for the extra email
Cheers
Goncalo
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Goncalo
Borges [goncalo.bor...@sydney.edu.au]
Sent: 30 August 2016 18:53
To: Brad Hubbard
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-use
Hi Brad...
Thanks for the feedback. I think we are making some progress.
I have opened the following tracker issue: http://tracker.ceph.com/issues/17177.
There I give pointers to all the logs, namely the result of the pg query and
all OSD logs after increasing the log levels (debug_ms=1, de
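For reference, a hedged sketch of how such debug levels can be raised at runtime
(the exact levels beyond debug_ms=1 are truncated above, so debug_osd=20 is only
an assumption on my part):

  # inject higher debug levels into all running OSDs without restarting them
  ceph tell osd.* injectargs '--debug_ms 1 --debug_osd 20'
  # the resulting logs end up in /var/log/ceph/ceph-osd.<id>.log on each OSD host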
Here it goes:
# xfs_info /var/lib/ceph/osd/ceph-78
meta-data=/dev/sdu1    isize=2048   agcount=4, agsize=183107519 blks
         =             sectsz=512   attr=2, projid32bit=1
         =             crc=0        finobt=0
data     =             bsize=4096
Hi Kenneth, All
Just an update for completeness on this topic.
We have been hit again by this issue.
I have been discussing it with Brad (RH staff) in another ML thread, and
I have opened a tracker issue: http://tracker.ceph.com/issues/17177
I believe this is a bug since there are other peop
Hi Dan.
It might be worthwhile to read:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg17820.html.
Have you seen that one?
From Sam Just: "For each of those pgs, you'll need to identify the pg copy you
want to be the winner and either 1) Remove all of the other ones using ceph-
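In case it helps, a hedged sketch of the kind of removal Sam is describing, using
ceph-objectstore-tool (the paths and the pg id are placeholders, and the OSD
daemon holding that copy must be stopped first):

  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
      --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
      --pgid <pgid> --op remove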
Hi Simon.
The simple answer is that you can upgrade directly to 10.2.2; we did it from 9.2.0.
In cases where you have to go through an intermediate release, the release notes
should say so clearly.
Cheers
Goncalo
From: ceph-users [ceph-users-boun...@lists
Hi Greg...
I've had to force recreate some PGs on my cephfs data pool due to some
cascading disk failures in my homelab cluster. Is there a way to easily
determine which files I need to restore from backup? My metadata pool is
completely intact.
Assuming you're on Jewel, run a recursive "scru
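(For reference, a hedged guess at the kind of scrub meant here: on Jewel the MDS
exposes a forward scrub through its admin socket, with the daemon name as a
placeholder:

  ceph daemon mds.<name> scrub_path / recursive

which walks the metadata tree and flags entries whose backing objects are damaged
or missing.)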
Can you please share the result of 'ceph pg 11.34a query'?
On 09/08/2016 05:03 PM, Arvydas Opulskis wrote:
2016-09-08 08:45:01.441945 osd.24 [INF] 11.34a scrub starts
2016-09-08 08:45:03.585039 osd.24 [INF] 11.34a scrub ok
            : 0,
            "omap_digest": "0xaa3fd281",
            "data_digest": "0x"
        },
        {
            "osd": 78,
            "missing": false,
            "read_error": false,
Hi Daznis...
Something is not quite right. You have pools with 2 replicas (right?). The fact
that you have 18 down PGs says that both OSDs acting on those PGs are having
problems.
You should try to find out which PGs are down and which OSDs are acting on
them ('ceph pg dump_stuck' or 'ceph
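A hedged sketch of the kind of commands I mean (the pg id is a placeholder):

  ceph health detail             # lists the down/incomplete pgs
  ceph pg dump_stuck inactive    # stuck pgs and the OSDs acting on them
  ceph pg <pgid> query           # detailed state of one pg, including what it is waiting for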
Hi Dennis
Have you checked
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-February/007207.html ?
The issue there was a near-full OSD blocking IO.
Cheers
G.
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Dennis Kramer
(DBS
Hi
I am assuming that you do not have any near-full OSD (either before or during
the PG splitting process) and that your cluster is healthy.
To minimize the impact on the clients during recovery or operations like PG
splitting, it is good to set the following configs. Obviously the whole
operat
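A hedged example of the sort of throttling I have in mind (these values are
conservative choices of mine, not necessarily the ones the truncated text was
about to list):

  ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'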
Hi...
I think you are seeing an issue we saw some time ago. Your segfault seems the
same as the one we had, but please confirm against the info in
https://github.com/ceph/ceph/pull/10027
We solved it by recompiling Ceph with the patch described above.
I think it should be fixed in the next bugfix release ve
Hi Mike...
I was hoping that someone with a bit more experience would answer you, since I
have never had a similar situation. So, I'll try to step in and help.
The peering process means that the OSDs are agreeing on the state of objects in
the PGs they share. The peering process can take some time and
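A hedged sketch of how to see where peering is stuck (the pg id is a placeholder):

  ceph pg dump_stuck inactive    # pgs that have not finished peering
  ceph pg <pgid> query           # the 'recovery_state' section shows what the pg is waiting on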
Hi John.
That would be good.
In our case we are just picking that up through Nagios and some fancy
scripts that parse the dump of the MDS maps.
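Purely as an illustration of that kind of check (not our actual script): alert
when the MDS map no longer reports an active MDS, e.g.

  ceph mds stat | grep -q 'up:active' || echo 'CRITICAL: no active MDS'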
Cheers
Goncalo
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John Spray
[jsp...@red
Hi Kostis...
That is a tale from the dark side. Glad you recovered it and that you were
willing to document it all and share it. Thank you for that.
Can I also ask which tool you used to recover the leveldb?
Cheers
Goncalo
From: ceph-users [ceph-users-boun.
Hi Dan...
Have you tried 'rados df' to see if it agrees with 'ceph df'?
Cheers
G.
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Dan van der
Ster [d...@vanderster.com]
Sent: 28 October 2016 03:01
To: ceph-users
Subject: [ceph-users] cep
Hi
"ceph daemon mds. session ls", executed in your mds server, should give you
hostname and client id of all your cephfs clients.
"ceph daemon mds. dump_ops_in_flight" should give you operations not
completed or pending to complete for certain clients ids. In case of problems,
that those probl
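For clarity, the full invocations, run on the MDS host with the daemon name as a
placeholder:

  ceph daemon mds.<name> session ls
  ceph daemon mds.<name> dump_ops_in_flight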
Hi Dan...
I know there are path restriction issues in the kernel client. See the
discussion here:
http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2016-June/010656.html
http://tracker.ceph.com/issues/16358
Cheers
Goncalo
From: ceph-users [ceph
Doesn't the MDS log tell you which client ids are having problems?
Does your MDS have enough RAM so that you can increase the mds cache size beyond
its default value of 10?
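A hedged example of raising it at runtime through the admin socket on the MDS
host (the daemon name and the new value are placeholders; each cached inode costs
roughly a few KB of MDS memory):

  ceph daemon mds.<name> config set mds_cache_size 2000000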
Cheers
G.
From: Yutian Li [l...@megvii.com]
Sent: 11 November 2016 14:03
To: Goncalo B
Hi Joel.
The PGs of a given pool start with the id of the pool, so the 19.xx PGs are from
pool 19. A 'ceph osd dump' should give you a summary of all pools and their ids
at the very beginning of its output. My guess is that this will confirm that
your volume or im
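A hedged sketch of how to check (no placeholders beyond your own pool names):

  ceph osd lspools              # pool ids and names
  ceph osd dump | grep '^pool'  # the same mapping with more detail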
Hi Greg, John, Zheng, CephFSers
Maybe a simple question, but I think it is better to ask first than to complain
afterwards.
We are currently undergoing an infrastructure migration. One of the first
machines to go through this migration process is our standby-replay mds. We are
running 10.2.2. My pla
Hello Pedro...
These are extremely generic questions and, therefore, hard to answer. Nick did a
good job of outlining the risks.
In our case, we have been running a Ceph/CephFS system in production for over a
year, and before that we spent another year getting to understand Ceph.
Ceph is incredibly go
Hello Bruno
I do not understand your outputs.
The first 'ceph -s' says one mon is down, but your 'ceph health detail'
does not report it further.
On your crush map I count 7 OSDs (0,1,2,3,4,6,7), but 'ceph -s' says only 6 are
active.
Can you send the output of 'ceph osd tree' and 'ceph osd df'
        }
    ],
    "mdsmap_epoch": 5224
}
---> Running the following command on the MDS:
{
    "id": 616338,
    "num_leases": 0,
    "num_caps": 16078,
    "state": "open",
153 TB of used space out of 306 TB in total (case 1)
51 TB of used space out of 81 TB in total (case 2)
Am I doing something wrong here?
Cheers
Goncalo
Hi Greg, John...
To John: Nothing is done in the background between two consecutive df commands.
I have opened the following tracker issue: http://tracker.ceph.com/issues/18151
(sorry, all the issue headers are empty apart from the title. I hit enter
before actually filling all the appropr
Hi John...
>> We are running ceph/cephfs in 10.2.2. All infrastructure is in the same
>> version (rados cluster, mons, mds and cephfs clients). We mount cephfs using
>> ceph-fuse.
>>
>> Last week I triggered some of my heavy users to delete data. In the
>> following example, the user in question
Hi John, Greg, Zheng
And now a much more relevant problem. Once again, my environment:
- ceph/cephfs in 10.2.2 but patched for
o client: add missing client_lock for get_root
(https://github.com/ceph/ceph/pull/10027)
o Jewel: segfault in ObjectCacher::FlusherThread
(http://tracker.ceph.com/
Thanks Dan for your critical eye.
Somehow I did not notice that there was already a tracker for it.
Cheers
G.
From: Dan van der Ster [d...@vanderster.com]
Sent: 06 December 2016 19:30
To: Goncalo Borges
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users
Hi John.
I have been hitting that issue also, although I have not seen any asserts in my
MDS yet.
Could you please clarify a bit further your proposal about manually removing
omap info from strays? Should it be applied:
- to the problematic replicas of the stray object which triggered the
inconsi
Hi Sean, Rob.
I saw on the tracker that you were able to resolve the MDS assert by manually
cleaning the corrupted metadata. Since I am also hitting that issue, and I
suspect that I will face an MDS assert of the same type sooner or later, can
you please explain a bit further what operations did
"dir_split": 0,
"inode_max": 200,
"inodes": 258,
"inodes_top": 0,
"inodes_bottom": 1993207,
"inodes_pin_tail": 6851,
"inodes_pinned": 12413
Borges
Cc: John Spray; ceph-us...@ceph.com
Subject: Re: [ceph-users] cephfs quotas reporting
On Mon, Dec 5, 2016 at 5:24 PM, Goncalo Borges
wrote:
> Hi Greg, John...
>
> To John: Nothing is done in the background between two consecutive df
> commands.
>
> I have opened the follo
"forward": 0,
"dir_fetch": 0,
"dir_commit": 0,
"dir_split": 0,
"inode_max": 200,
"inodes": 2000058,
"inodes_top": 0,
"inodes_bottom": 1993207,
problematic by the MDS although inodes < inodes_max. Looking at the number
of inodes of that machine, I get "inode_count": 13862. So, it seems that
the client is still tagged as problematic although it has an inode_count
below 16384 and inodes < inodes_max. Maybe a consequence of
Hi all
Even when using ceph-fuse, quotas are only enabled once you mount with the
--client-quota option.
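A hedged example of the full workflow (paths and the quota value are placeholders):

  # set a 100 GB quota on a directory, via its extended attribute
  setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/some_dir
  # mount with quota enforcement enabled (ceph-fuse only)
  ceph-fuse -m MON_IP:6789 --client-quota /mnt/cephfs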
Cheers
Goncalo
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of gjprabu
[gjpr...@zohocorp.com]
Sent: 16 December 2016 18:18
To: gjprab
Hi Sean
In our case, the last time we had this error, we stopped the OSD, marked it out,
let Ceph recover, and then reinstalled it. We did it because we suspected
issues with the OSD, and that was why we decided to take this approach. The fact
is that the PG we were seeing constantly declared
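A hedged sketch of that sequence (the OSD id is a placeholder and assumes systemd):

  systemctl stop ceph-osd@<id>     # on the OSD host
  ceph osd out <id>                # let the cluster backfill/recover
  # once 'ceph -s' is healthy again, remove the OSD before reinstalling it
  ceph osd crush remove osd.<id>
  ceph auth del osd.<id>
  ceph osd rm <id>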
Hi Manuel
I am Goncalo Borges (Portuguese) and I work at the University of Sydney. We
have been using Ceph and CephFS for almost two years. If you think it
worthwhile, we can just talk and discuss our experiences. There is a good Ceph
community in Melbourne, but you are actually the first one in
Hi Dan
Hope this finds you well.
Here goes a suggestion from someone who has been sitting on the sidelines
for the last two years but following things as much as possible.
Would a weight set per pool help?
This is only possible in Luminous, but according to the docs there is the
possibility to adjust positi
Hi Jorge
Indeed, my advice is to configure your high-memory MDS as a standby MDS. Once
you restart the service on the low-memory MDS, the standby should take over
without downtime and the first one becomes the standby.
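A minimal hedged sketch of that sequence (the daemon name is a placeholder and
assumes systemd):

  ceph mds stat                      # confirm one up:active and at least one up:standby
  systemctl restart ceph-mds@<name>  # on the low-memory (currently active) MDS host
  ceph mds stat                      # the former standby should now be up:active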
Cheers
Goncalo
From: ceph-user
ully understanding how to properly do it?
TIA
Goncalo
I think I've understood how to run it...
ceph-fuse -m MON_IP:6789 -r /syd /coepp/cephfs/syd
does what I want
Cheers
Goncalo
On 12/15/2015 12:04 PM, Goncalo Borges wrote:
Dear CephFS experts
Before it was possible to mount a subtree of a filesystem using
ceph-fuse and the -r option.
Dear Cephfs gurus.
I have two questions regarding ACL support on cephfs.
1) The last time we tried ACLs, we saw that they were only working properly with
the kernel module, and I wonder what the present status of ACL support in
ceph-fuse is. Can you clarify that?
2) If ceph-fuse is still not proper
osd use.
I am not 100% sure whether this is a problem with Ceph v9.2.0 or
to do with the recent update to CentOS 7.2.
Has anyone else encountered a similar problem?
Also, should I be posting this on the ceph-devel mailing list, or is
here OK?
Thanks!
Regards,
Matthew Taylor.
To: Goncalo Borges; Loic Dachary
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] CentOS 7.2, Infernalis, preparing osd's and partprobe
issues.
I commented out partprobe and everything seems to work just fine.
*If someone has experience with why this is very bad, please advise.
Make sur
Dear CephFS experts...
We are using Ceph and CephFS 9.2.0. CephFS clients are being mounted via
ceph-fuse.
We recently noticed the firewall on certain CephFS clients dropping
connections with OSDs as SRC. This is not systematic, but we noticed it
happening at least once. Here
Hi Greg.
We are using Ceph and CephFS 9.2.0. CephFS clients are being mounted via
ceph-fuse.
We recently noticed the firewall on certain CephFS clients dropping
connections with OSDs as SRC. This is not systematic, but we noticed it
happening at least once. Here is an example
Hi CephFS experts.
1./ We are using Ceph and CephFS 9.2.0 with an active mds and a standby-replay
mds (standard config)
# ceph -s
    cluster
     health HEALTH_OK
     monmap e1: 3 mons at {mon1=:6789/0,mon2=:6789/0,mon3=:6789/0}
            election epoch 98, quorum 0,1,2 mon1,mon3,mon2
Hi...
Seems very similar to
http://tracker.ceph.com/issues/14144
Can you confirm it is the same issue?
Cheers
G.
From: Goncalo Borges
Sent: 02 February 2016 15:30
To: ceph-us...@ceph.com
Cc: rct...@coepp.org.au
Subject: CEPHFS: standby-replay mds crash
Hi
Hi X
Have you tried to inspect the MDS for problematic sessions still connected from
those clients?
To check which sessions are still connected to the MDS, run (in Ceph 9.2.0; the
command might be different or might not even exist in older versions):
ceph daemon mds. session ls
Cheers
G.
: Zhao Xu [xuzh....@gmail.com]
Sent: 03 February 2016 11:31
To: Goncalo Borges
Cc: Mykola Dvornik; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Urgent help needed for ceph storage "mount error 5 =
Input/output error"
I see a lot of sessions. How can I clear these sessions? Since I'
Dear CephFS gurus...
I would like your advice on how to improve performance without compromising
reliability for CephFS clients deployed over a WAN.
Currently, our infrastructure relies on:
- ceph infernalis
- a ceph object cluster, with all core infrastructure components sitting in the
same d
Hi Zhang...
If I can add some more info, changing the PG count is a heavy operation, and as far
as I know, you should NEVER decrease PGs. From the notes in pgcalc
(http://ceph.com/pgcalc/):
"It's also important to know that the PG count can be increased, but NEVER
decreased without destroying / re
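For completeness, increasing the count is a two-step change (pool name and target
count are placeholders) and is best done gradually on a busy cluster:

  ceph osd pool set <pool> pg_num 1024
  ceph osd pool set <pool> pgp_num 1024   # only after the new pgs have been created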
From: Zhang Qiang [dotslash...@gmail.com]
Sent: 23 March 2016 23:17
To: Goncalo Borges
Cc: Oliver Dzombic; ceph-users
Subject: Re: [ceph-users] Need help for PG problem
And here's the osd tree if it matters.
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINI
ning Hammer 0.94.5 in this case.
From what I know, an OSD had a failing disk and was restarted a couple of times
while the disk gave errors. This caused the PG to become incomplete.
I've set debug osd to 20, but I can't really tell what is going wrong on osd.68,
which causes it to stall this
Cheers
Goncalo