Hi Frederic,

I think the problem is that you have a package that has this bug: http://tracker.ceph.com/issues/9747
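A quick way to check which build is actually installed, and whether it is one where the udev rules still go through the ceph-disk-udev helper, is something along these lines (a rough sketch; the rules directory differs between distributions, hence both paths):

# which ceph build is installed, and what does it ship for udev?
$ rpm -q ceph
$ rpm -ql ceph | grep -E 'udev|ceph-disk'
# do the installed rules still invoke ceph-disk-udev?
$ grep -rn ceph-disk-udev /lib/udev/rules.d/ /usr/lib/udev/rules.d/ 2>/dev/null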
Let me know if using the latest from EPEL (i.e. what was created from https://dl.fedoraproject.org/pub/epel/7/SRPMS/c/ceph-0.80.5-8.el7.src.rpm) solves the problem. I'm learning a lot about udev, centos and RHEL in the process ;-)

On 11/10/2014 17:31, Loic Dachary wrote:
> Hi,
>
> On RHEL7 it works as expected:
>
> [ubuntu@mira042 ~]$ sudo ceph-disk prepare /dev/sdg
>
> ***************************************************************
> Found invalid GPT and valid MBR; converting MBR to GPT format.
> ***************************************************************
>
> Information: Moved requested sector from 34 to 2048 in
> order to align on 2048-sector boundaries.
> The operation has completed successfully.
> partx: /dev/sdg: error adding partition 2
> Information: Moved requested sector from 10485761 to 10487808 in
> order to align on 2048-sector boundaries.
> The operation has completed successfully.
> meta-data=/dev/sdg1    isize=2048   agcount=4, agsize=60719917 blks
>          =             sectsz=512   attr=2, projid32bit=1
>          =             crc=0
> data     =             bsize=4096   blocks=242879665, imaxpct=25
>          =             sunit=0      swidth=0 blks
> naming   =version 2    bsize=4096   ascii-ci=0 ftype=0
> log      =internal log bsize=4096   blocks=118593, version=2
>          =             sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none         extsz=4096   blocks=0, rtextents=0
> The operation has completed successfully.
> partx: /dev/sdg: error adding partitions 1-2
> [ubuntu@mira042 ~]$ df
> Filesystem     1K-blocks    Used Available Use% Mounted on
> /dev/sda1      974540108 3412868 971110856   1% /
> devtmpfs         8110996       0   8110996   0% /dev
> tmpfs            8130388       0   8130388   0% /dev/shm
> tmpfs            8130388   58188   8072200   1% /run
> tmpfs            8130388       0   8130388   0% /sys/fs/cgroup
> /dev/sdh1      971044288   34740 971009548   1% /var/lib/ceph/osd/ceph-0
> /dev/sdg1      971044288   33700 971010588   1% /var/lib/ceph/osd/ceph-1
>
> There is an important difference though: RHEL7 does not use
> https://github.com/ceph/ceph/blob/giant/src/ceph-disk-udev . It should not be
> necessary for centos7 but it looks like it is in use since the debug you get
> comes from it. There must be something wrong in the source package you are
> using around this point:
> https://github.com/ceph/ceph/blob/giant/ceph.spec.in#L382
>
> I checked
> http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHOS/SRPMS/ceph-0.80.5-1.el7ost.src.rpm
> and it is as expected. Could you let me know where you got the package from?
> And what is the version according to
>
> $ yum info ceph
> Installed Packages
> Name        : ceph
> Arch        : x86_64
> Epoch       : 1
> Version     : 0.80.5
> Release     : 8.el7
> Size        : 37 M
> Repo        : installed
> From repo   : epel
> Summary     : User space components of the Ceph file system
> URL         : http://ceph.com/
> License     : GPL-2.0
> Description : Ceph is a massively scalable, open-source, distributed
>             : storage system that runs on commodity hardware and delivers
>             : object, block and file system storage.
>
> I'm not very familiar with RPMs and maybe Release: 8.el7 means it is a
> more recent version of the package: ceph-0.80.5-1.el7ost.src.rpm suggests
> the Release should be 1.el7 and not 8.el7.
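>
> If it helps, the exact build and the source RPM it came from can be read from
> the rpm database; with the same Version, a higher Release normally just means
> a more recent rebuild, although builds from different repositories (el7 vs
> el7ost) are not directly comparable. A rough sketch:
>
> $ rpm -q --queryformat '%{NAME}-%{VERSION}-%{RELEASE} built from %{SOURCERPM}\n' ceph
> $ rpm -q --changelog ceph | head -20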
>
> Cheers
>
> On 10/10/2014 15:59, SCHAER Frederic wrote:
>> Hi Loic,
>>
>> Patched, and still not working (sorry)...
>> I'm attaching the prepare output, and also a different, "real" udev debug
>> output I captured using "udevadm monitor --environment" (udev.log file).
>>
>> I added a "sync" command in ceph-disk-udev (this did not change a thing),
>> and I noticed that the udev script is called 3 times when adding one disk,
>> and that the debug output was captured and then mixed all into one file.
>> This may lead to log mis-interpretation (race conditions?)...
>> I changed the logging a bit in order to get one file per call and attached
>> those logs to this mail.
>>
>> File timestamps are as follows:
>> File: '/var/log/udev_ceph.log.out.22706'
>> Change: 2014-10-10 15:48:09.136386306 +0200
>> File: '/var/log/udev_ceph.log.out.22749'
>> Change: 2014-10-10 15:48:11.502425395 +0200
>> File: '/var/log/udev_ceph.log.out.22750'
>> Change: 2014-10-10 15:48:11.606427113 +0200
>>
>> Actually, I can reproduce the UUID=0 thing with this command:
>>
>> [root@ceph1 ~]# /usr/sbin/ceph-disk -v activate-journal /dev/sdc2
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc2
>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID 00000000-0000-0000-0000-000000000000
>> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
>> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
>> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
>>
>> Ah - to answer previous mails:
>> - I tried to manually create the GPT partition table to see if things would
>>   improve, but this was not the case (I also tried to zero out the start and
>>   end of disks, and also to add random data)
>> - running ceph-disk prepare twice does not work, it's just that once every
>>   20 (?) times it "surprisingly does not fail" on this hardware/OS
>>   combination ;)
>>
>> Regards
>>
>> -----Original Message-----
>> From: Loic Dachary [mailto:[email protected]]
>> Sent: Friday, October 10, 2014 14:37
>> To: SCHAER Frederic; [email protected]
>> Subject: Re: [ceph-users] ceph-dis prepare : UUID=00000000-0000-0000-0000-000000000000
>>
>> Hi Frederic,
>>
>> To be 100% sure, it would be great if you could manually patch your local
>> ceph-disk script and change 'partprobe', into 'partx', '-a', in
>> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284
>>
>> ceph-disk zap
>> ceph-disk prepare
>>
>> and hopefully it will show up as it should. It works for me on centos7 but ...
>>
>> Cheers
>>
>> On 10/10/2014 14:33, Loic Dachary wrote:
>>> Hi Frederic,
>>>
>>> It looks like this is just because
>>> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284 should call
>>> partx instead of partprobe.
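>>>
>>> For reference, the swap can be tried directly on the installed script, along
>>> these lines (a rough sketch; check what grep reports first, since the exact
>>> wording of that line can differ between builds, and keep a backup):
>>>
>>> grep -n partprobe /usr/sbin/ceph-disk
>>> cp -a /usr/sbin/ceph-disk /usr/sbin/ceph-disk.orig
>>> sed -i "s/'partprobe',/'partx', '-a',/" /usr/sbin/ceph-disk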
>>>
>>> The udev debug output makes this quite clear:
>>> http://tracker.ceph.com/issues/9721
>>>
>>> I think
>>> https://github.com/dachary/ceph/commit/8d914001420e5bfc1e12df2d4882bfe2e1719a5c#diff-788c3cea6213c27f5fdb22f8337096d5R1285
>>> fixes it.
>>>
>>> Cheers
>>>
>>> On 09/10/2014 16:29, SCHAER Frederic wrote:
>>>>
>>>> -----Original Message-----
>>>> From: Loic Dachary [mailto:[email protected]]
>>>> Sent: Thursday, October 9, 2014 16:20
>>>> To: SCHAER Frederic; [email protected]
>>>> Subject: Re: [ceph-users] ceph-dis prepare : UUID=00000000-0000-0000-0000-000000000000
>>>>
>>>> On 09/10/2014 16:04, SCHAER Frederic wrote:
>>>>> Hi Loic,
>>>>>
>>>>> Back on sdb, as the sde output was from another machine on which I ran
>>>>> partx -u afterwards.
>>>>> To reply to your last question first: I think the SG_IO error comes from
>>>>> the fact that the disks are exported as single-disk RAID0 volumes on a
>>>>> PERC 6/E, which does not support JBOD - this is decommissioned hardware
>>>>> on which I'd like to test and validate that we can use ceph for our use case...
>>>>>
>>>>> So back to the UUID.
>>>>> It's funny: I retried and ceph-disk prepare worked this time. I tried on
>>>>> another disk, and it failed.
>>>>> There is a difference in the output from ceph-disk: on the failing disk,
>>>>> I have these extra lines after the disks are prepared:
>>>>>
>>>>> (...)
>>>>> realtime =none         extsz=4096   blocks=0, rtextents=0
>>>>> Warning: The kernel is still using the old partition table.
>>>>> The new table will be used at the next reboot.
>>>>> The operation has completed successfully.
>>>>> partx: /dev/sdc: error adding partitions 1-2
>>>>>
>>>>> I didn't have the warning about the old partition table on the disk that
>>>>> worked.
>>>>> So on this new disk, I have:
>>>>>
>>>>> [root@ceph1 ~]# mount /dev/sdc1 /mnt
>>>>> [root@ceph1 ~]# ll /mnt/
>>>>> total 16
>>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 ceph_fsid
>>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 fsid
>>>>> lrwxrwxrwx 1 root root 58 Oct  9 15:58 journal -> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 journal_uuid
>>>>> -rw-r--r-- 1 root root 21 Oct  9 15:58 magic
>>>>>
>>>>> [root@ceph1 ~]# cat /mnt/journal_uuid
>>>>> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>>
>>>>> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
>>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>>> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
>>>>> First sector: 10487808 (at 5.0 GiB)
>>>>> Last sector: 1952448478 (at 931.0 GiB)
>>>>> Partition size: 1941960671 sectors (926.0 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph data'
>>>>>
>>>>> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
>>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>>> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
>>>>> First sector: 2048 (at 1024.0 KiB)
>>>>> Last sector: 10485760 (at 5.0 GiB)
>>>>> Partition size: 10483713 sectors (5.0 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph journal'
>>>>>
>>>>> Puzzling, isn't it?
>>>>>
>>>>
>>>> Yes :-) Just to be 100% sure, when you try to activate this /dev/sdc it
>>>> shows an error and complains that the journal uuid is 0000-000* etc.?
>>>> If so, could you copy your udev debug output?
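>>>>
>>>> One way to capture it, using the same device names as above (a rough
>>>> sketch; the log paths are arbitrary):
>>>>
>>>> # record the events and their environment while re-reading the partition table
>>>> udevadm monitor --environment > /tmp/udev-monitor.log 2>&1 &
>>>> partx -u /dev/sdc
>>>> udevadm settle
>>>> # dry-run the rules for the data partition to see which RUN commands would fire
>>>> udevadm test $(udevadm info --query=path --name=/dev/sdc1) > /tmp/udev-test.log 2>&1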
>>>>
>>>> Cheers
>>>>
>>>> [>- FS : -<]
>>>>
>>>> No, when I manually activate the disk instead of attempting to go the udev
>>>> way, it seems to work:
>>>> [root@ceph1 ~]# ceph-disk activate /dev/sdc1
>>>> got monmap epoch 1
>>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 2014-10-09 16:21:43.286288 7f2be6a027c0 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 244973de-7472-421c-bb25-4b09d3f8d441, invalid (someone else's?) journal
>>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 2014-10-09 16:21:43.301957 7f2be6a027c0 -1 filestore(/var/lib/ceph/tmp/mnt.4lJlzP) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
>>>> 2014-10-09 16:21:43.305941 7f2be6a027c0 -1 created object store /var/lib/ceph/tmp/mnt.4lJlzP journal /var/lib/ceph/tmp/mnt.4lJlzP/journal for osd.47 fsid 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
>>>> 2014-10-09 16:21:43.305992 7f2be6a027c0 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.4lJlzP/keyring: can't open /var/lib/ceph/tmp/mnt.4lJlzP/keyring: (2) No such file or directory
>>>> 2014-10-09 16:21:43.306099 7f2be6a027c0 -1 created new key in keyring /var/lib/ceph/tmp/mnt.4lJlzP/keyring
>>>> added key for osd.47
>>>> === osd.47 ===
>>>> create-or-move updating item name 'osd.47' weight 0.9 at location {host=ceph1,root=default} to crush map
>>>> Starting Ceph osd.47 on ceph1...
>>>> Running as unit run-12392.service.
>>>>
>>>> The osd then appeared in the osd tree...
>>>> I attached the logs to this email (I just added a set -x in the script
>>>> called by udev, and redirected the output).
>>>>
>>>> Regards
>>>>

--
Loïc Dachary, Artisan Logiciel Libre
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
