Re: [ceph-users] cephfs file block size: must it be so big?

2018-12-15 Thread Paul Emmerich
Bryan Henderson :
> In some NFS experiments of mine, the blocksize reported by 'stat' appears to
> be controlled by the rsize and wsize mount options.  Without such options, in
> the one case I tried (Linux 4.9), the blocksize was 32K.  Maybe it's affected by
> the server or by the filesystem the NFS server is serving.  This was NFS 3.
>

NFS servers advertise both a maximum and a preferred read/write size, and an
NFS client should use the preferred size by default. 1 MB is a common default
for the preferred size on modern NFS servers.
That value (if not overridden by rsize/wsize) should then show up as the block
size in the file system.
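
For example, on a client you can check what was actually negotiated and what
stat reports (the mount point /mnt/nfs and the file name below are only
illustrative):

grep /mnt/nfs /proc/mounts                  # shows the effective rsize/wsize for the mount
stat --format='%o' /mnt/nfs/somefile        # st_blksize, the preferred I/O size
mount -t nfs -o rsize=32768,wsize=32768 server:/export /mnt/nfs   # forces a smaller value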

Paul

> > This patch should address this issue [massive reads of e.g. /dev/urandom].
>
> Thanks!
>
> > mount option should work.
>
> And thanks again.
>
> --
> Bryan Henderson   San Jose, California


Re: [ceph-users] mirroring global id mismatch

2018-12-15 Thread Jason Dillaman
On Fri, Dec 14, 2018 at 4:27 PM Vikas Rana  wrote:
>
> Hi there,
>
> We are replicating an RBD image from the primary to the DR site using RBD
> mirroring. We were using 10.2.10.
>
> We decided to upgrade the DR site to Luminous; the upgrade went fine and the
> mirroring status also looked good.
> We then promoted the DR copy to test failover. Everything checked out good.
>
> The issue now is that we are not able to resume our replication. It's
> complaining about "description: remote image does not exist".
> This is the same image that was in a mirroring relationship before the
> promotion.
> We compared the mirroring global ids and they do not match. When we did this
> testing in the lab, the value was the same on both sides.
>
> rbd info nfs/dir_research
> rbd image 'dir_research':
> size 200 TB in 52428800 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.edd65238e1f29
> format: 2
> features: layering, exclusive-lock, journaling
> flags:
> journal: edd65238e1f29
> mirroring state: enabled
> mirroring global id: a8522ed7-70ff-4966-9edc-e7ef41906fd9
> mirroring primary: true
>
> rbd --cluster cephdr info nfs/dir_research
> rbd image 'dir_research':
> size 200TiB in 52428800 objects
> order 22 (4MiB objects)
> block_name_prefix: rbd_data.58e76109cf92e
> format: 2
> features: layering, exclusive-lock, journaling
> flags:
> journal: 58e76109cf92e
> mirroring state: enabled
> mirroring global id: 1490c637-21f9-4eff-bef6-54defc1e0988
> mirroring primary: false
>
> rbd mirror image status nfs/dir_research
> dir_research:
>   global_id:   a8522ed7-70ff-4966-9edc-e7ef41906fd9
>   state:   down+unknown
>   description: status not found
>   last_update: 1969-12-31 19:00:00

It's odd that this image is not reporting any status. Do you have an
"rbd-mirror" daemon still running against this cluster?

>
> rbd --cluster cephdr mirror image status nfs/dir_research
> dir_research:
>   global_id:   1490c637-21f9-4eff-bef6-54defc1e0988
>   state:   down+error
>   description: remote image does not exist
>   last_update: 2018-11-30 11:28:49
>
> So the question is: is it possible that the mirroring global id changed after
> the upgrade, and is there any way to change the global id to match production
> so that replication can be resumed?

Can you provide the output from the following commands (run against
both clusters)?

rados -p nfs getomapval rbd_mirroring mirror_uuid
rbd mirror pool info --pool nfs
rbd journal status --pool nfs --image dir_research
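
For the DR side the same commands can be pointed at the second cluster with
--cluster, e.g.:

rados --cluster cephdr -p nfs getomapval rbd_mirroring mirror_uuid
rbd --cluster cephdr mirror pool info --pool nfs
rbd --cluster cephdr journal status --pool nfs --image dir_research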


>
>
> Thanks,
>
> -Vikas
>
>



-- 
Jason


[ceph-users] active+recovering+degraded after cluster reboot

2018-12-15 Thread David C
Hi All

I have what feels like a bit of a rookie question

I shut down a Luminous 12.2.1 cluster with noout, nobackfill and norecover set

Before shutting down, all PGs were active+clean

I brought the cluster up, all daemons started and all but 2 PGs are
active+clean

I have 2 pgs showing: "active+recovering+degraded"

It's been reporting this for about an hour with no signs of clearing on its
own.

Ceph health detail shows: PG_DEGRADED Degraded data redundancy: 2/131709267
objects degraded (0.000%), 2 pgs unclean, 2 pgs degraded

I've tried restarting MONs and all OSDs in the cluster.
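
For reference, the individual PGs can be inspected with something like this
(the pg id 1.2f is just a placeholder):

ceph health detail        # lists the degraded PG ids
ceph pg 1.2f query        # per-PG state, acting set and recovery progress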

How would you recommend I proceed at this point?

Thanks
David


Re: [ceph-users] active+recovering+degraded after cluster reboot

2018-12-15 Thread Paul Emmerich
Did you unset norecover?
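
For reference, the flags set before the shutdown are cleared with:

ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout

"ceph -s" (or "ceph osd dump | grep flags") will confirm they are gone.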


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sun, Dec 16, 2018 at 12:22 AM David C  wrote:
>
> Hi All
>
> I have what feels like a bit of a rookie question
>
> I shutdown a Luminous 12.2.1 cluster with noout,nobackfill,norecover set
>
> Before shutting down, all PGs were active+clean
>
> I brought the cluster up, all daemons started and all but 2 PGs are 
> active+clean
>
> I have 2 pgs showing: "active+recovering+degraded"
>
> It's been reporting this for about an hour with no signs of clearing on it's 
> own
>
> Ceph health detail shows: PG_DEGRADED Degraded data redundancy: 2/131709267 
> objects degraded (0.000%), 2 pgs unclean, 2 pgs degraded
>
> I've tried restarting MONs and all OSDs in the cluster.
>
> How would you recommend I proceed at this point?
>
> Thanks
> David
>
>
>
>


Re: [ceph-users] active+recovering+degraded after cluster reboot

2018-12-15 Thread David C
Hi Paul

Thanks for the response. Not yet, just being a bit cautious ;) I'll go
ahead and do that.

Thanks
David


On Sat, 15 Dec 2018, 23:39 Paul Emmerich  wrote:
> Did you unset norecover?
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Sun, Dec 16, 2018 at 12:22 AM David C  wrote:
> >
> > Hi All
> >
> > I have what feels like a bit of a rookie question
> >
> > I shutdown a Luminous 12.2.1 cluster with noout,nobackfill,norecover set
> >
> > Before shutting down, all PGs were active+clean
> >
> > I brought the cluster up, all daemons started and all but 2 PGs are
> active+clean
> >
> > I have 2 pgs showing: "active+recovering+degraded"
> >
> > It's been reporting this for about an hour with no signs of clearing on
> it's own
> >
> > Ceph health detail shows: PG_DEGRADED Degraded data redundancy:
> 2/131709267 objects degraded (0.000%), 2 pgs unclean, 2 pgs degraded
> >
> > I've tried restarting MONs and all OSDs in the cluster.
> >
> > How would you recommend I proceed at this point?
> >
> > Thanks
> > David
> >
> >
> >
> >


Re: [ceph-users] active+recovering+degraded after cluster reboot

2018-12-15 Thread David C
Yep, that cleared it. Sorry for the noise!

On Sun, Dec 16, 2018 at 12:16 AM David C  wrote:

> Hi Paul
>
> Thanks for the response. Not yet, just being a bit cautious ;) I'll go
> ahead and do that.
>
> Thanks
> David
>
>
> On Sat, 15 Dec 2018, 23:39 Paul Emmerich 
>> Did you unset norecover?
>>
>>
>> Paul
>>
>> --
>> Paul Emmerich
>>
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>
>> croit GmbH
>> Freseniusstr. 31h
>> 81247 München
>> www.croit.io
>> Tel: +49 89 1896585 90
>>
>> On Sun, Dec 16, 2018 at 12:22 AM David C  wrote:
>> >
>> > Hi All
>> >
>> > I have what feels like a bit of a rookie question
>> >
>> > I shutdown a Luminous 12.2.1 cluster with noout,nobackfill,norecover set
>> >
>> > Before shutting down, all PGs were active+clean
>> >
>> > I brought the cluster up, all daemons started and all but 2 PGs are
>> active+clean
>> >
>> > I have 2 pgs showing: "active+recovering+degraded"
>> >
>> > It's been reporting this for about an hour with no signs of clearing on
>> it's own
>> >
>> > Ceph health detail shows: PG_DEGRADED Degraded data redundancy:
>> 2/131709267 objects degraded (0.000%), 2 pgs unclean, 2 pgs degraded
>> >
>> > I've tried restarting MONs and all OSDs in the cluster.
>> >
>> > How would you recommend I proceed at this point?
>> >
>> > Thanks
>> > David
>> >
>> >
>> >
>> >
>>
>