All are relatively recent Ubuntu 16.04.1 kernels. I upgraded ka05 last night, but still see the issue. I'm happy to upgrade the rest.

$ for h in ka00 ka01 ka02 ka03 ka04 ka05; do ssh $h uname -a; done
Linux ka00 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:00:59 UTC 2016 i686 i686 i686 GNU/Linux
Linux ka01 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Linux ka02 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Linux ka03 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Linux ka04 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Linux ka05 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
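If it's useful, the remaining hosts can be brought to the same 4.4.0-38 kernel in one pass, along these lines (a sketch I haven't run yet; assumes passwordless sudo and that an immediate reboot of each host is acceptable):

$ for h in ka00 ka01 ka02 ka03 ka04; do ssh -t $h 'sudo apt-get update && sudo apt-get -y dist-upgrade && sudo reboot'; done   # upgrade + reboot each remaining host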
k8

On Thu, Oct 20, 2016 at 11:39 PM John Spray <jsp...@redhat.com> wrote:
> On Thu, Oct 20, 2016 at 10:15 PM, Kate Ward <kate.w...@forestent.com>
> wrote:
> > I have a strange problem that began manifesting after I rebuilt my
> > cluster a month or so back. A tiny subset of my files on CephFS are
> > being zero-padded out to the length of ceph.dir.layout.stripe_unit
> > when the files are later *read* (not when they are written). Tonight
> > I realized the padding matched the stripe_unit value of 1048576, and
> > changed it to 4194304, which resulted in those files that get padded
> > taking on the new stripe_unit value. I've since changed it back. I've
> > searched Google and the Ceph bug tracker for answers, but have had no
> > luck so far.
> >
> > Current ceph.dir.layout setting for the entire cluster:
> > $ getfattr -n ceph.dir.layout /ceph/ka
> > getfattr: Removing leading '/' from absolute path names
> > # file: ceph/ka
> > ceph.dir.layout="stripe_unit=1048576 stripe_count=2 object_size=8388608 pool=cephfs_data"
> >
> > Ceph is mounted on all machines using the kernel driver. The problem
> > is not isolated to a single machine.
> > $ grep /ceph/ka /etc/mtab
> > backupz/ceph/ka /backupz/ceph/ka zfs rw,noatime,xattr,noacl 0 0
> > 172.16.0.11:6789,172.16.0.19:6789:/ /ceph/ka ceph rw,noatime,nodiratime,name=admin,secret=<hidden>,acl 0 0
> >
> > Files from a Subversion repository, where the last one was padded
> > after I tried to check out the repo.
> > [kward@ka02 2016-10-20T22:57:13 %0]/ceph/ka/data/repoz/forestent/forestent/db/revs/0
> > $ ls -lrt | tail -5
> > -rw-r--r-- 1 www-data www-data    1079 Oct 20 08:53 877
> > -rw-r--r-- 1 www-data www-data    1415 Oct 20 08:55 878
> > -rw-r--r-- 1 www-data www-data    1059 Oct 20 09:01 879
> > -rw-r--r-- 1 www-data www-data    1318 Oct 20 09:36 880
> > -rw-r--r-- 1 www-data www-data 4194304 Oct 20 19:18 881
>
> This issue isn't one I immediately recognise. What kernel version is
> in use on the clients?
>
> I would guess an issue like this comes from the file size recovery
> that we do if/when a client loses contact while writing a file, which
> I don't know that we ever test with non-default layouts, and which
> might not work very well with layouts that have stripe_unit !=
> object_size.
>
> If you set "debug mds = 10" and "debug filer = 10" on the MDS, and
> then capture the log from the point in time where you write a file to
> the point in time where the file is statted and gives the incorrect
> size (if it's reproducible that readily), that should give us a
> better idea.
>
> John
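Will do. For my own notes, I believe those can be bumped at runtime on the active MDS (mds.ka01, per the fsmap below) with something like the following; the injectargs option spelling is my assumption, and the same values could instead go into ceph.conf on the MDS host followed by a restart:

$ ceph tell mds.ka01 injectargs '--debug_mds 10 --debug_filer 10'   # raise logging
$ # reproduce here: write a file, then stat it once it reports the padded size
$ ceph tell mds.ka01 injectargs '--debug_mds 1 --debug_filer 1'     # back down afterwards

I'll then grab /var/log/ceph/ceph-mds.ka01.log (the default log location) covering that window.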
> > Files stored, then later accessed via WebDAV. Only those files that
> > were accessed were subsequently padded.
> > [kward@ka02 2016-10-20T23:01:42 %0]~/www/webdav/OmniFocus.ofocus
> > $ ls -l
> > total 16389
> > -rw-r--r-- 1 www-data www-data 4194304 Oct 20 20:08 00000000000000=ay-_KSusSw8+jOtYClSC2kx.zip
> > -rw-r--r-- 1 www-data www-data    1383 Oct 20 19:22 20161020172209=pP4DpDOXAaA.client
> > -rw-r--r-- 1 www-data www-data 4194304 Oct 20 20:20 20161020182047=pP4DpDOXAaA.client
> > -rw-r--r-- 1 www-data www-data 4194304 Oct 20 21:11 20161020191117=pP4DpDOXAaA.client
> > -rw-r--r-- 1 www-data www-data    1309 Oct 20 21:56 20161020195647=jY9iwiPfUhB.client
> > -rw-r--r-- 1 www-data www-data 4194304 Oct 20 22:04 20161020200427=pP4DpDOXAaA.client
> > -rw-r--r-- 1 www-data www-data    1309 Oct 20 22:54 20161020205415=jY9iwiPfUhB.client
> >
> > The cluster reports as healthy. (Yes, I'm aware one of the OSDs is
> > currently down. The issue was present two months before it went down.)
> > $ ceph status
> >     cluster f13b6373-0cdc-4372-85a2-66bf2841e313
> >      health HEALTH_OK
> >      monmap e3: 3 mons at {ka01=172.16.0.11:6789/0,ka03=172.16.0.15:6789/0,ka04=172.16.0.17:6789/0}
> >             election epoch 36, quorum 0,1,2 ka01,ka03,ka04
> >       fsmap e1140219: 1/1/1 up {0=ka01=up:active}, 2 up:standby
> >      osdmap e1234338: 16 osds: 15 up, 15 in
> >             flags sortbitwise
> >       pgmap v2296058: 1216 pgs, 3 pools, 7343 GB data, 1718 kobjects
> >             14801 GB used, 19360 GB / 34161 GB avail
> >                 1216 active+clean
> >
> > Details:
> > - Ceph 10.2.2 (Ubuntu 16.04.1 packages)
> > - 4x servers, each with 4x OSDs on HDDs (mixture of 2T and 3T drives);
> >   journals on SSD
> > - 3x Mons, and 3x MDSs
> > - Data is replicated 2x
> > - The only usage of the cluster is via CephFS
> >
> > Kate
> > https://ch.linkedin.com/in/kate-ward-1119b9
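In the meantime, one thing I can check is whether the padding is real zero data in the objects themselves or only an inflated size in the MDS metadata, which would fit the file size recovery John describes. If I have the CephFS data-object naming right (an assumption on my part: objects are named <inode in hex>.<stripe index>), something like this against the padded revision file above should show the on-disk object size:

$ cd /ceph/ka/data/repoz/forestent/forestent/db/revs/0
$ ino=$(printf '%x' $(stat -c %i 881))        # inode of the padded file, in hex
$ rados -p cephfs_data stat ${ino}.00000000   # first data object for that inode

If the object is still only ~1 KB while ls reports 4194304, the zeros exist only in the reported size rather than on disk.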
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com