Hi,

On RHEL7 it works as expected:
[ubuntu@mira042 ~]$ sudo ceph-disk prepare /dev/sdg
***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format.
***************************************************************
Information: Moved requested sector from 34 to 2048 in order to
align on 2048-sector boundaries.
The operation has completed successfully.
partx: /dev/sdg: error adding partition 2
Information: Moved requested sector from 10485761 to 10487808 in order to
align on 2048-sector boundaries.
The operation has completed successfully.
meta-data=/dev/sdg1              isize=2048   agcount=4, agsize=60719917 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0
data     =                       bsize=4096   blocks=242879665, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=118593, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
The operation has completed successfully.
partx: /dev/sdg: error adding partitions 1-2

[ubuntu@mira042 ~]$ df
Filesystem      1K-blocks    Used Available Use% Mounted on
/dev/sda1       974540108 3412868 971110856   1% /
devtmpfs          8110996       0   8110996   0% /dev
tmpfs             8130388       0   8130388   0% /dev/shm
tmpfs             8130388   58188   8072200   1% /run
tmpfs             8130388       0   8130388   0% /sys/fs/cgroup
/dev/sdh1       971044288   34740 971009548   1% /var/lib/ceph/osd/ceph-0
/dev/sdg1       971044288   33700 971010588   1% /var/lib/ceph/osd/ceph-1

There is an important difference though: RHEL7 does not use
https://github.com/ceph/ceph/blob/giant/src/ceph-disk-udev . It should not
be necessary for CentOS 7 either, but it looks like it is in use, since the
debug output you get comes from it. There must be something wrong in the
source package you are using around this point:
https://github.com/ceph/ceph/blob/giant/ceph.spec.in#L382

I checked
http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHOS/SRPMS/ceph-0.80.5-1.el7ost.src.rpm
and it is as expected. Could you let me know where you got the package from?
And what is the version according to

$ yum info ceph
Installed Packages
Name        : ceph
Arch        : x86_64
Epoch       : 1
Version     : 0.80.5
Release     : 8.el7
Size        : 37 M
Repo        : installed
From repo   : epel
Summary     : User space components of the Ceph file system
URL         : http://ceph.com/
License     : GPL-2.0
Description : Ceph is a massively scalable, open-source, distributed
            : storage system that runs on commodity hardware and delivers
            : object, block and file system storage.

I'm not very familiar with RPMs, and maybe "Release : 8.el7" means it is a
more recent version of the package, but ceph-0.80.5-1.el7ost.src.rpm suggests
the Release should be 1.el7 and not 8.el7.
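In case it helps, here is roughly how I would check the provenance of the
installed package; plain rpm/yum queries, nothing Ceph-specific:

# Which source RPM was the installed binary package built from?
# The Release embedded in the answer should match the .src.rpm you expect.
rpm -q --queryformat '%{NAME}-%{VERSION}-%{RELEASE}\nbuilt from %{SOURCERPM}\n' ceph

# Which repository did the package come from?
yum info ceph | grep -E '^(Version|Release|From repo)'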
Cheers

On 10/10/2014 15:59, SCHAER Frederic wrote:
> Hi Loic,
>
> Patched, and still not working (sorry)...
> I'm attaching the prepare output, and also a "real" udev debug output I
> captured using "udevadm monitor --environment" (udev.log file)
>
> I added a "sync" command in ceph-disk-udev (this did not change a thing),
> and I noticed that the udev script is called 3 times when adding one disk,
> and that the debug output was captured and then mixed all into one file.
> This may lead to log misinterpretation (race conditions?)...
> I changed the logging a bit in order to get one file per call, and attached
> those logs to this mail.
>
> File timestamps are as follows:
> File: '/var/log/udev_ceph.log.out.22706'
> Change: 2014-10-10 15:48:09.136386306 +0200
> File: '/var/log/udev_ceph.log.out.22749'
> Change: 2014-10-10 15:48:11.502425395 +0200
> File: '/var/log/udev_ceph.log.out.22750'
> Change: 2014-10-10 15:48:11.606427113 +0200
>
> Actually, I can reproduce the UUID=0 thing with this command:
>
> [root@ceph1 ~]# /usr/sbin/ceph-disk -v activate-journal /dev/sdc2
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid
> --osd-journal /dev/sdc2
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID
> 00000000-0000-0000-0000-000000000000
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue --
> /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such
> file or directory
> ceph-disk: Cannot discover filesystem type: device
> /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command
> '/sbin/blkid' returned non-zero exit status 2
>
> Ah - to answer previous mails:
> - I tried to manually create the GPT partition table to see if things would
>   improve, but this was not the case (I also tried to zero out the start
>   and end of the disks, and also to add random data)
> - running ceph-disk prepare twice does not work; it's just that once every
>   20 (?) times it "surprisingly does not fail" on this hardware/OS
>   combination ;)
>
> Regards
>
> -----Original Message-----
> From: Loic Dachary [mailto:l...@dachary.org]
> Sent: Friday, October 10, 2014 14:37
> To: SCHAER Frederic; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-disk prepare:
> UUID=00000000-0000-0000-0000-000000000000
>
> Hi Frederic,
>
> To be 100% sure, it would be great if you could manually patch your local
> ceph-disk script and change 'partprobe', into 'partx', '-a', in
> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284
>
> ceph-disk zap
> ceph-disk prepare
>
> and hopefully it will show up as it should. It works for me on CentOS 7
> but...
>
> Cheers
>
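The gist of that one-liner, expressed as the equivalent shell commands (a
sketch of the idea only; the linked commit is the authoritative diff):

# Before the patch: partprobe asks the kernel to re-read the whole
# partition table, which can fail with "Device or resource busy" if any
# partition on the device is still in use.
partprobe /dev/sdc

# After the patch: partx -a only asks the kernel to register partitions
# that are in the on-disk table but not yet known, avoiding that failure.
partx -a /dev/sdc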
> On 10/10/2014 14:33, Loic Dachary wrote:
>> Hi Frederic,
>>
>> It looks like this is just because
>> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284 should call
>> partx instead of partprobe. The udev debug output makes this quite clear:
>> http://tracker.ceph.com/issues/9721
>>
>> I think
>> https://github.com/dachary/ceph/commit/8d914001420e5bfc1e12df2d4882bfe2e1719a5c#diff-788c3cea6213c27f5fdb22f8337096d5R1285
>> fixes it.
>>
>> Cheers
>>
>> On 09/10/2014 16:29, SCHAER Frederic wrote:
>>>
>>> -----Original Message-----
>>> From: Loic Dachary [mailto:l...@dachary.org]
>>> Sent: Thursday, October 9, 2014 16:20
>>> To: SCHAER Frederic; ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] ceph-disk prepare:
>>> UUID=00000000-0000-0000-0000-000000000000
>>>
>>> On 09/10/2014 16:04, SCHAER Frederic wrote:
>>>> Hi Loic,
>>>>
>>>> Back on sdb, as the sde output was from another machine on which I ran
>>>> partx -u afterwards.
>>>> To reply to your last question first: I think the SG_IO error comes from
>>>> the fact that the disks are exported as single-disk RAID0 on a PERC 6/E,
>>>> which does not support JBOD - this is decommissioned hardware on which
>>>> I'd like to test and validate that we can use Ceph for our use case...
>>>>
>>>> So back to the UUID.
>>>> It's funny: I retried and ceph-disk prepare worked this time. I tried on
>>>> another disk, and it failed.
>>>> There is a difference in the output from ceph-disk: on the failing disk,
>>>> I have these extra lines after the disks are prepared:
>>>>
>>>> (...)
>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>> Warning: The kernel is still using the old partition table.
>>>> The new table will be used at the next reboot.
>>>> The operation has completed successfully.
>>>> partx: /dev/sdc: error adding partitions 1-2
>>>>
>>>> I didn't have the warning about the old partition table on the disk that
>>>> worked.
>>>> So on this new disk, I have:
>>>>
>>>> [root@ceph1 ~]# mount /dev/sdc1 /mnt
>>>> [root@ceph1 ~]# ll /mnt/
>>>> total 16
>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 ceph_fsid
>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 fsid
>>>> lrwxrwxrwx 1 root root 58 Oct  9 15:58 journal ->
>>>> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 journal_uuid
>>>> -rw-r--r-- 1 root root 21 Oct  9 15:58 magic
>>>>
>>>> [root@ceph1 ~]# cat /mnt/journal_uuid
>>>> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>
>>>> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
>>>> First sector: 10487808 (at 5.0 GiB)
>>>> Last sector: 1952448478 (at 931.0 GiB)
>>>> Partition size: 1941960671 sectors (926.0 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph data'
>>>>
>>>> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
>>>> First sector: 2048 (at 1024.0 KiB)
>>>> Last sector: 10485760 (at 5.0 GiB)
>>>> Partition size: 10483713 sectors (5.0 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph journal'
>>>>
>>>> Puzzling, isn't it?
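For reference, the consistency check those outputs amount to can be run in
one go; a sketch, reusing /dev/sdc and the UUIDs from above:

# The journal UUID recorded in the data partition...
mount /dev/sdc1 /mnt
cat /mnt/journal_uuid
# ...must match the unique GUID of the journal partition...
sgdisk --info=2 /dev/sdc | grep 'unique GUID'
# ...and udev must have created the matching by-partuuid symlink; if it is
# missing, the blkid lookup during activation fails as shown earlier.
ls -l /dev/disk/by-partuuid/ | grep -i "$(cat /mnt/journal_uuid)"
umount /mnt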
>>>
>>> Yes :-) Just to be 100% sure: when you try to activate this /dev/sdc,
>>> does it show an error and complain that the journal uuid is 0000-000*
>>> etc.? If so, could you copy your udev debug output?
>>>
>>> Cheers
>>>
>>> [>- FS : -<]
>>>
>>> No, when I manually activate the disk instead of attempting to go the
>>> udev way, it seems to work:
>>> [root@ceph1 ~]# ceph-disk activate /dev/sdc1
>>> got monmap epoch 1
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 2014-10-09 16:21:43.286288 7f2be6a027c0 -1 journal check: ondisk fsid
>>> 00000000-0000-0000-0000-000000000000 doesn't match expected
>>> 244973de-7472-421c-bb25-4b09d3f8d441, invalid (someone else's?) journal
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 2014-10-09 16:21:43.301957 7f2be6a027c0 -1
>>> filestore(/var/lib/ceph/tmp/mnt.4lJlzP) could not find
>>> 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
>>> 2014-10-09 16:21:43.305941 7f2be6a027c0 -1 created object store
>>> /var/lib/ceph/tmp/mnt.4lJlzP journal /var/lib/ceph/tmp/mnt.4lJlzP/journal
>>> for osd.47 fsid 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
>>> 2014-10-09 16:21:43.305992 7f2be6a027c0 -1 auth: error reading file:
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring: can't open
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring: (2) No such file or directory
>>> 2014-10-09 16:21:43.306099 7f2be6a027c0 -1 created new key in keyring
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring
>>> added key for osd.47
>>> === osd.47 ===
>>> create-or-move updating item name 'osd.47' weight 0.9 at location
>>> {host=ceph1,root=default} to crush map
>>> Starting Ceph osd.47 on ceph1...
>>> Running as unit run-12392.service.
>>>
>>> The osd then appeared in the osd tree...
>>> I attached the logs to this email (I just added a set -x in the script
>>> called by udev, and redirected the output)
>>>
>>> Regards
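For anyone trying to reproduce this, the one-file-per-call logging used for
the attached logs boils down to something like this at the top of the script
called by udev; a sketch, assuming a POSIX shell script:

#!/bin/sh
# Trace every command (set -x writes to stderr) into a per-PID file, so the
# three concurrent udev invocations do not interleave in a single log.
exec 2>"/var/log/udev_ceph.log.out.$$"
set -x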
-- 
Loïc Dachary, Artisan Logiciel Libre