Hi,

On RHEL7 it works as expected:

[ubuntu@mira042 ~]$ sudo ceph-disk prepare /dev/sdg

***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format.
***************************************************************

Information: Moved requested sector from 34 to 2048 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
partx: /dev/sdg: error adding partition 2
Information: Moved requested sector from 10485761 to 10487808 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
meta-data=/dev/sdg1              isize=2048   agcount=4, agsize=60719917 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0
data     =                       bsize=4096   blocks=242879665, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=118593, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
The operation has completed successfully.
partx: /dev/sdg: error adding partitions 1-2
[ubuntu@mira042 ~]$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1      974540108 3412868 971110856   1% /
devtmpfs         8110996       0   8110996   0% /dev
tmpfs            8130388       0   8130388   0% /dev/shm
tmpfs            8130388   58188   8072200   1% /run
tmpfs            8130388       0   8130388   0% /sys/fs/cgroup
/dev/sdh1      971044288   34740 971009548   1% /var/lib/ceph/osd/ceph-0
/dev/sdg1      971044288   33700 971010588   1% /var/lib/ceph/osd/ceph-1

There is an important difference though: RHEL7 does not use 
https://github.com/ceph/ceph/blob/giant/src/ceph-disk-udev . It should not be 
necessary on CentOS 7 either, but it looks like it is in use, since the debug 
output you get comes from it. There must be something wrong in the source 
package you are using, around this point: 
https://github.com/ceph/ceph/blob/giant/ceph.spec.in#L382
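
You can double-check whether ceph-disk-udev is wired in on your machine with 
something like this (the rules directory may be /usr/lib/udev or /lib/udev 
depending on the distribution):

$ grep -l ceph-disk-udev /usr/lib/udev/rules.d/*ceph* /lib/udev/rules.d/*ceph* 2>/dev/null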

I checked 
http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHOS/SRPMS/ceph-0.80.5-1.el7ost.src.rpm
and it is as expected. Could you let me know where you got the package from?
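
For the record, this is roughly how I looked at the spec inside the src.rpm 
(assuming rpm2cpio and GNU cpio are available):

$ rpm2cpio ceph-0.80.5-1.el7ost.src.rpm | cpio -i --to-stdout '*.spec' | grep -n udev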
Also, what is the version according to:

$ yum info ceph
Installed Packages
Name        : ceph
Arch        : x86_64
Epoch       : 1
Version     : 0.80.5
Release     : 8.el7
Size        : 37 M
Repo        : installed
From repo   : epel
Summary     : User space components of the Ceph file system
URL         : http://ceph.com/
License     : GPL-2.0
Description : Ceph is a massively scalable, open-source, distributed
            : storage system that runs on commodity hardware and delivers
            : object, block and file system storage.

I'm not very familiar with RPMs; maybe "Release: 8.el7" means it is a 
more recent build of the package, but ceph-0.80.5-1.el7ost.src.rpm suggests 
the Release should be 1.el7 and not 8.el7.
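
If it helps, this prints the exact build you have installed (Release is 
bumped by the packager when the same upstream Version is rebuilt, and the 
.el7 / .el7ost suffixes hint at different build origins):

$ rpm -q --queryformat '%{EPOCH}:%{VERSION}-%{RELEASE}\n' ceph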

Cheers

On 10/10/2014 15:59, SCHAER Frederic wrote:
> Hi Loic,
> 
> Patched, and still not working (sorry)...
> I'm attaching the prepare output, and also a different, "real" udev debug 
> output I captured using "udevadm monitor --environment" (the udev.log file).
> 
> I added a "sync" command in ceph-disk-udev (this did not change a thing), and 
> I noticed that the udev script is called 3 times when adding one disk, and 
> that the debug output was captured and then all mixed into one file.
> This may lead to log misinterpretation (race conditions?)...
> I changed the logging a bit in order to get one file per call and attached 
> those logs to this mail.
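> 
> In case it is useful, the per-call logging is just a redirection at the top 
> of ceph-disk-udev, something like this ($$ being the PID of each invocation, 
> hence the numeric suffixes on the log file names):
> 
> exec >> /var/log/udev_ceph.log.out.$$ 2>&1
> set -x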
> 
> File timestamps are as follows:
>   File: '/var/log/udev_ceph.log.out.22706'
> Change: 2014-10-10 15:48:09.136386306 +0200
>   File: '/var/log/udev_ceph.log.out.22749'
> Change: 2014-10-10 15:48:11.502425395 +0200
>   File: '/var/log/udev_ceph.log.out.22750'
> Change: 2014-10-10 15:48:11.606427113 +0200
> 
> Actually, I can reproduce the UUID=0 thing with this command:
> 
> [root@ceph1 ~]# /usr/sbin/ceph-disk -v activate-journal /dev/sdc2
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid 
> --osd-journal /dev/sdc2
> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID 
> 00000000-0000-0000-0000-000000000000
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- 
> /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such 
> file or directory
> ceph-disk: Cannot discover filesystem type: device 
> /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command 
> '/sbin/blkid' returned non-zero exit status 2
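> 
> For what it's worth, one can also list what udev actually created, and query 
> the journal partition directly instead of going through the by-partuuid path:
> 
> ls -l /dev/disk/by-partuuid/
> /sbin/blkid -p -s TYPE -ovalue -- /dev/sdc2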
> 
> Ah - to answer previous mails:
> - I tried to manually create the GPT partition table to see if things would 
> improve, but this was not the case (I also tried to zero out the start and 
> end of the disks, and also to write random data)
> - running ceph-disk prepare twice does not work; it's just that once every 20 
> (?) times it "surprisingly does not fail" on this hardware/OS combination ;)
> 
> Regards
> 
> -----Original Message-----
> From: Loic Dachary [mailto:l...@dachary.org] 
> Sent: Friday, October 10, 2014 14:37
> To: SCHAER Frederic; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-disk prepare : 
> UUID=00000000-0000-0000-0000-000000000000
> 
> Hi Frederic,
> 
> To be 100% sure, it would be great if you could manually patch your local 
> ceph-disk script and change 'partprobe', into 'partx', '-a', in 
> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284
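> 
> Something like this should do it (assuming ceph-disk is installed as 
> /usr/sbin/ceph-disk; -i.bak keeps a backup copy):
> 
> sed -i.bak "s/'partprobe',/'partx', '-a',/" /usr/sbin/ceph-disk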
> 
> ceph-disk zap
> ceph-disk prepare
> 
> and hopefully it will show up as it should. It works for me on CentOS 7 but ...
> 
> Cheers
> 
> On 10/10/2014 14:33, Loic Dachary wrote:
>> Hi Frederic,
>>
>> It looks like this is just because 
>> https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284 should call 
>> partx instead of partprobe. The udev debug output makes this quite clear: 
>> http://tracker.ceph.com/issues/9721
>>
>> I think 
>> https://github.com/dachary/ceph/commit/8d914001420e5bfc1e12df2d4882bfe2e1719a5c#diff-788c3cea6213c27f5fdb22f8337096d5R1285
>>  fixes it
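>>
>> The difference, in short: partprobe asks the kernel to re-read the whole 
>> partition table (which fails while a partition is busy, hence the "kernel 
>> is still using the old partition table" warning), while partx adds or 
>> updates individual entries. Roughly (assuming the disk is /dev/sdc):
>>
>> partx -a /dev/sdc   # tell the kernel about partitions it does not know yet
>> partx -u /dev/sdc   # update the entries it already has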
>>
>> Cheers
>>
>> On 09/10/2014 16:29, SCHAER Frederic wrote:
>>>
>>>
>>> -----Original Message-----
>>> From: Loic Dachary [mailto:l...@dachary.org] 
>>> Sent: Thursday, October 9, 2014 16:20
>>> To: SCHAER Frederic; ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] ceph-disk prepare : 
>>> UUID=00000000-0000-0000-0000-000000000000
>>>
>>>
>>>
>>> On 09/10/2014 16:04, SCHAER Frederic wrote:
>>>> Hi Loic,
>>>>
>>>> Back on sdb, as the sde output was from another machine on which I ran 
>>>> partx -u afterwards.
>>>> To reply to your last question first: I think the SG_IO error comes from 
>>>> the fact that the disks are exported as single-disk RAID0 volumes on a 
>>>> PERC 6/E, which does not support JBOD - this is decommissioned hardware on 
>>>> which I'd like to test and validate that we can use Ceph for our use case...
>>>>
>>>> So, back to the UUID.
>>>> It's funny: I retried and ceph-disk prepare worked this time. I tried on 
>>>> another disk, and it failed.
>>>> There is a difference in the output from ceph-disk: on the failing disk, 
>>>> I have these extra lines after the disks are prepared:
>>>>
>>>> (...)
>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>> Warning: The kernel is still using the old partition table.
>>>> The new table will be used at the next reboot.
>>>> The operation has completed successfully.
>>>> partx: /dev/sdc: error adding partitions 1-2
>>>>
>>>> I didn't have the warning about the old partition table on the disk that 
>>>> worked.
>>>> So on this new disk, I have:
>>>>
>>>> [root@ceph1 ~]# mount /dev/sdc1 /mnt
>>>> [root@ceph1 ~]# ll /mnt/
>>>> total 16
>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 ceph_fsid
>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 fsid
>>>> lrwxrwxrwx 1 root root 58 Oct  9 15:58 journal -> 
>>>> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 journal_uuid
>>>> -rw-r--r-- 1 root root 21 Oct  9 15:58 magic
>>>>
>>>> [root@ceph1 ~]# cat /mnt/journal_uuid
>>>> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>
>>>> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
>>>> First sector: 10487808 (at 5.0 GiB)
>>>> Last sector: 1952448478 (at 931.0 GiB)
>>>> Partition size: 1941960671 sectors (926.0 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph data'
>>>>
>>>> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
>>>> First sector: 2048 (at 1024.0 KiB)
>>>> Last sector: 10485760 (at 5.0 GiB)
>>>> Partition size: 10483713 sectors (5.0 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph journal'
>>>>
>>>> Puzzling, isn't it?
>>>>
>>>>
>>>
>>> Yes :-) Just to be 100% sure: when you try to activate this /dev/sdc, does 
>>> it show an error and complain that the journal uuid is 0000-000* etc.? If 
>>> so, could you copy your udev debug output?
>>>
>>> Cheers
>>>
>>> [>- FS : -<]  
>>>
>>> No, when I manually activate the disk instead of attempting to go the udev 
>>> way, it seems to work:
>>> [root@ceph1 ~]# ceph-disk activate /dev/sdc1
>>> got monmap epoch 1
>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 2014-10-09 16:21:43.286288 7f2be6a027c0 -1 journal check: ondisk fsid 
>>> 00000000-0000-0000-0000-000000000000 doesn't match expected 
>>> 244973de-7472-421c-bb25-4b09d3f8d441, invalid (someone else's?) journal
>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 
>>> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 2014-10-09 16:21:43.301957 7f2be6a027c0 -1 
>>> filestore(/var/lib/ceph/tmp/mnt.4lJlzP) could not find 
>>> 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
>>> 2014-10-09 16:21:43.305941 7f2be6a027c0 -1 created object store 
>>> /var/lib/ceph/tmp/mnt.4lJlzP journal /var/lib/ceph/tmp/mnt.4lJlzP/journal 
>>> for osd.47 fsid 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
>>> 2014-10-09 16:21:43.305992 7f2be6a027c0 -1 auth: error reading file: 
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring: can't open 
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring: (2) No such file or directory
>>> 2014-10-09 16:21:43.306099 7f2be6a027c0 -1 created new key in keyring 
>>> /var/lib/ceph/tmp/mnt.4lJlzP/keyring
>>> added key for osd.47
>>> === osd.47 ===
>>> create-or-move updating item name 'osd.47' weight 0.9 at location 
>>> {host=ceph1,root=default} to crush map
>>> Starting Ceph osd.47 on ceph1...
>>> Running as unit run-12392.service.
>>>
>>> The osd then appeared in the osd tree...
>>> I attached the logs to this email (I just added a set -x in the script 
>>> called by udev, and redirected the output)
>>>
>>> Regards
>>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
