Hi,

On RHEL7 it works as expected:
[ubuntu@mira042 ~]$ sudo ceph-disk prepare /dev/sdg
***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format.
***************************************************************
Information: Moved requested sector from 34 to 2048 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
partx: /dev/sdg: error adding partition 2
Information: Moved requested sector from 10485761 to 10487808 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
meta-data=/dev/sdg1 isize=2048 agcount=4, agsize=60719917 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0
data = bsize=4096 blocks=242879665, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=118593, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.
partx: /dev/sdg: error adding partitions 1-2
[ubuntu@mira042 ~]$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 974540108 3412868 971110856 1% /
devtmpfs 8110996 0 8110996 0% /dev
tmpfs 8130388 0 8130388 0% /dev/shm
tmpfs 8130388 58188 8072200 1% /run
tmpfs 8130388 0 8130388 0% /sys/fs/cgroup
/dev/sdh1 971044288 34740 971009548 1% /var/lib/ceph/osd/ceph-0
/dev/sdg1 971044288 33700 971010588 1% /var/lib/ceph/osd/ceph-1
There is an important difference, though: RHEL7 does not use
https://github.com/ceph/ceph/blob/giant/src/ceph-disk-udev . It should not be
necessary on CentOS 7 either, but it looks like it is in use, since the debug
output you get comes from it. There must be something wrong in the source
package you are using around this point:
https://github.com/ceph/ceph/blob/giant/ceph.spec.in#L382
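A quick way to check whether ceph-disk-udev is actually wired into udev on your machine (the rules.d paths below are the usual locations, adjust if your layout differs):

```shell
# Look for installed udev rules that reference ceph-disk-udev.
# On RHEL7 this should find nothing; on your CentOS 7 box it apparently does.
grep -rl ceph-disk-udev /lib/udev/rules.d/ /etc/udev/rules.d/ 2>/dev/null \
  || echo "no ceph-disk-udev references found"
```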
I checked
http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHOS/SRPMS/ceph-0.80.5-1.el7ost.src.rpm
and it is as expected. Could you let me know where you got the package from?
And what is its version according to:
$ yum info ceph
Installed Packages
Name : ceph
Arch : x86_64
Epoch : 1
Version : 0.80.5
Release : 8.el7
Size : 37 M
Repo : installed
From repo : epel
Summary : User space components of the Ceph file system
URL : http://ceph.com/
License : GPL-2.0
Description : Ceph is a massively scalable, open-source, distributed
: storage system that runs on commodity hardware and delivers
: object, block and file system storage.
I'm not very familiar with RPMs, and maybe Release : 8.el7 means it is a
more recent build of the package: ceph-0.80.5-1.el7ost.src.rpm suggests the
Release should be 1.el7ost, not 8.el7.
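To make the mismatch concrete, here is the Release field of each package pulled out with plain string handling (no rpm needed; the sed patterns just assume the usual name-version-release naming):

```shell
# Installed package, reassembled from the yum info output above:
installed="ceph-0.80.5-8.el7.x86_64"
# Source RPM I checked on ftp.redhat.com:
srpm="ceph-0.80.5-1.el7ost.src.rpm"

# Pull out the Release field of each (the part after the last '-').
rel_installed=$(echo "$installed" | sed 's/.*-\([^-]*\)\.x86_64$/\1/')
rel_srpm=$(echo "$srpm" | sed 's/.*-\([^-]*\)\.src\.rpm$/\1/')

echo "installed release: $rel_installed"   # 8.el7
echo "SRPM release:      $rel_srpm"        # 1.el7ost
```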
Cheers
On 10/10/2014 15:59, SCHAER Frederic wrote:
> Hi Loic,
>
> Patched, and still not working (sorry)...
> I'm attaching the prepare output, and also a different, "real" udev debug
> output I captured using "udevadm monitor --environment" (the udev.log file).
>
> I added a "sync" command in ceph-disk-udev (this did not change a thing), and
> I noticed that the udev script is called 3 times when adding one disk, and that
> the debug output was captured and then mixed all into one file.
> This may lead to log misinterpretation (race conditions?)...
> I changed the logging a bit in order to get one file per call, and attached
> those logs to this mail.
>
> File timestamps are as follows:
> File: '/var/log/udev_ceph.log.out.22706'
> Change: 2014-10-10 15:48:09.136386306 +0200
> File: '/var/log/udev_ceph.log.out.22749'
> Change: 2014-10-10 15:48:11.502425395 +0200
> File: '/var/log/udev_ceph.log.out.22750'
> Change: 2014-10-10 15:48:11.606427113 +0200
>
> Actually, I can reproduce the UUID=0 thing with this command:
>
> [root@ceph1 ~]# /usr/sbin/ceph-disk -v activate-journal /dev/sdc2
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid
> --osd-journal /dev/sdc2
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID
> 00000000-0000-0000-0000-000000000000
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue --
> /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such
> file or directory
> ceph-disk: Cannot discover filesystem type: device
> /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command
> '/sbin/blkid' returned non-zero exit status 2
>
> Ah - to answer the previous mails:
> - I tried to manually create the GPT partition table to see if things would
> improve, but this was not the case (I also tried to zero out the start and
> end of the disks, and also to write random data)
> - running ceph-disk prepare twice does not work; it's just that once every 20
> (?) times it "surprisingly does not fail" on this hardware/OS combination ;)
>
> Regards
>
> -----Original Message-----
> From: Loic Dachary [mailto:[email protected]]
> Sent: Friday, October 10, 2014 14:37
> To: SCHAER Frederic; [email protected]
> Subject: Re: [ceph-users] ceph-disk prepare :
> UUID=00000000-0000-0000-0000-000000000000
>
> Hi Frederic,
>
> To be 100% sure, it would be great if you could manually patch your local
> ceph-disk script and change 'partprobe' into 'partx', '-a' in
> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284
>
> ceph-disk zap
> ceph-disk prepare
>
> and hopefully it will show up as it should. It works for me on CentOS 7, but ...
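For clarity, the change I'm suggesting above boils down to this (a python sketch, not the exact ceph-disk code; the function name is made up):

```python
import subprocess

def update_partition_table(dev):
    """Tell the kernel about a new partition table on dev (e.g. '/dev/sdc').

    ceph-disk currently runs 'partprobe', which can fail while the kernel
    still holds the old table; 'partx -a' asks the kernel to add the new
    partition entries instead.
    """
    subprocess.check_call(['partx', '-a', dev])
```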
>
> Cheers
>
> On 10/10/2014 14:33, Loic Dachary wrote:
>> Hi Frederic,
>>
>> It looks like this is just because
>> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284 should call
>> partx instead of partprobe. The udev debug output makes this quite clear:
>> http://tracker.ceph.com/issues/9721
>>
>> I think
>> https://github.com/dachary/ceph/commit/8d914001420e5bfc1e12df2d4882bfe2e1719a5c#diff-788c3cea6213c27f5fdb22f8337096d5R1285
>> fixes it
>>
>> Cheers
>>
>> On 09/10/2014 16:29, SCHAER Frederic wrote:
>>>
>>>
>>> -----Original Message-----
>>> From: Loic Dachary [mailto:[email protected]]
>>> Sent: Thursday, October 9, 2014 16:20
>>> To: SCHAER Frederic; [email protected]
>>> Subject: Re: [ceph-users] ceph-disk prepare :
>>> UUID=00000000-0000-0000-0000-000000000000
>>>
>>>
>>>
>>> On 09/10/2014 16:04, SCHAER Frederic wrote:
>>>> Hi Loic,
>>>>
>>>> Back on sdb, as the sde output was from another machine on which I ran
>>>> partx -u afterwards.
>>>> To reply to your last question first: I think the SG_IO error comes from the
>>>> fact that the disks are exported as single-disk RAID0 volumes on a PERC 6/E,
>>>> which does not support JBOD - this is decommissioned hardware on which I'd
>>>> like to test and validate that we can use Ceph for our use case...
>>>>
>>>> So back to the UUID.
>>>> It's funny: I retried, and ceph-disk prepare worked this time. I tried on
>>>> another disk, and it failed.
>>>> There is a difference in the output from ceph-disk: on the failing disk,
>>>> I have these extra lines after the disks are prepared:
>>>>
>>>> (...)
>>>> realtime =none extsz=4096 blocks=0, rtextents=0
>>>> Warning: The kernel is still using the old partition table.
>>>> The new table will be used at the next reboot.
>>>> The operation has completed successfully.
>>>> partx: /dev/sdc: error adding partitions 1-2
>>>>
>>>> I didn't have the warning about the old partition table on the disk that
>>>> worked.
>>>> So on this new disk, I have:
>>>>
>>>> [root@ceph1 ~]# mount /dev/sdc1 /mnt
>>>> [root@ceph1 ~]# ll /mnt/
>>>> total 16
>>>> -rw-r--r-- 1 root root 37 Oct 9 15:58 ceph_fsid
>>>> -rw-r--r-- 1 root root 37 Oct 9 15:58 fsid
>>>> lrwxrwxrwx 1 root root 58 Oct 9 15:58 journal ->
>>>> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>> -rw-r--r-- 1 root root 37 Oct 9 15:58 journal_uuid
>>>> -rw-r--r-- 1 root root 21 Oct 9 15:58 magic
>>>>
>>>> [root@ceph1 ~]# cat /mnt/journal_uuid
>>>> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>
>>>> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
>>>> First sector: 10487808 (at 5.0 GiB)
>>>> Last sector: 1952448478 (at 931.0 GiB)
>>>> Partition size: 1941960671 sectors (926.0 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph data'
>>>>
>>>> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
>>>> First sector: 2048 (at 1024.0 KiB)
>>>> Last sector: 10485760 (at 5.0 GiB)
>>>> Partition size: 10483713 sectors (5.0 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph journal'
>>>>
>>>> Puzzling, isn't it?
>>>>
>>>>
>>>
>>> Yes :-) Just to be 100% sure: when you try to activate this /dev/sdc, does
>>> it show an error and complain that the journal uuid is 0000-000* etc.? If
>>> so, could you copy your udev debug output?
>>>
>>> Cheers
>>>
>>> [>- FS : -<]
>>>
>>> No, when I manually activate the disk instead of attempting to go the udev
>>> way, it seems to work:
>>> [root@ceph1 ~]# ceph-disk activate /dev/sdc1
>>> got monmap epoch 1
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 2014-10-09 16:21:43.286288 7f2be6a027c0 -1 journal check: ondisk fsid
>>> 00000000-0000-0000-0000-000000000000 doesn't match expected
>>> 244973de-7472-421c-bb25-4b09d3f8d441, invalid (someone else's?) journal
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 2014-10-09 16:21:43.301957 7f2be6a027c0 -1
>>> filestore(/var/lib/ceph/tmp/mnt.4lJlzP) could not find
>>> 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
>>> 2014-10-09 16:21:43.305941 7f2be6a027c0 -1 created object store
>>> /var/lib/ceph/tmp/mnt.4lJlzP journal /var/lib/ceph/tmp/mnt.4lJlzP/journal
>>> for osd.47 fsid 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
>>> 2014-10-09 16:21:43.305992 7f2be6a027c0 -1 auth: error reading file:
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring: can't open
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring: (2) No such file or directory
>>> 2014-10-09 16:21:43.306099 7f2be6a027c0 -1 created new key in keyring
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring
>>> added key for osd.47
>>> === osd.47 ===
>>> create-or-move updating item name 'osd.47' weight 0.9 at location
>>> {host=ceph1,root=default} to crush map
>>> Starting Ceph osd.47 on ceph1...
>>> Running as unit run-12392.service.
>>>
>>> The osd then appeared in the osd tree...
>>> I attached the logs to this email (I just added a set -x in the script
>>> called by udev, and redirected the output).
>>>
>>> Regards
>>>
>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
--
Loïc Dachary, Artisan Logiciel Libre
