Re: [ceph-users] Intel 520/530 SSD for ceph

2013-11-21 Thread Stefan Priebe - Profihost AG
Hi,

On 21.11.2013 01:29, m...@linuxbox.com wrote:
> On Tue, Nov 19, 2013 at 09:02:41AM +0100, Stefan Priebe wrote:
> ...
>>> You might be able to vary this behavior by experimenting with sdparm,
>>> smartctl or other tools, or possibly with different microcode in the drive.
>> Which values or which settings do you think of?
> ...
> 
> Off-hand, I don't know.  Probably the first thing would be
> to compare the configuration of your 520 & 530; anything that's
> different is certainly worth investigating.
> 
> This should display all pages,
>   sdparm --all --long /dev/sdX
> the 520 only appears to have 3 pages, which can be fetched directly w/
>   sdparm --page=ca --long /dev/sdX
>   sdparm --page=co --long /dev/sdX
>   sdparm --page=rw --long /dev/sdX
> 
> The sample machine I'm looking at has an intel 520, and on ours,
> most options show as 0 except for
>   AWRE1  [cha: n, def:  1]  Automatic write reallocation enabled
>   WCE 1  [cha: y, def:  1]  Write cache enable
>   DRA 1  [cha: n, def:  1]  Disable read ahead
>   GLTSD   1  [cha: n, def:  1]  Global logging target save disable
>   BTP-1  [cha: n, def: -1]  Busy timeout period (100us)
>   ESTCT  30  [cha: n, def: 30]  Extended self test completion time (sec)
> Perhaps that's an interesting data point to compare with yours.
> 
> Figuring out if you have up-to-date intel firmware appears to require
> burning and running an iso image from
> https://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=18455
> 
> The results of sdparm --page= --long /dev/sdc
> show the intel firmware, but this labels it better:
> smartctl -i /dev/sdc
> Our 520 has firmware "400i" loaded.

Firmware is up2date and all values are the same. I expect that the 520
firmware just ignores CMD_FLUSH commands and the 530 does not.
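
A crude way to sanity-check that would be to compare small synchronous writes
on the two drives: a drive that ignores flushes reports implausibly high
rates. Just a sketch (the mount points /mnt/520 and /mnt/530 are placeholders
for filesystems on the two SSDs), and it only hints at what the firmware does,
it proves nothing about durability:

  dd if=/dev/zero of=/mnt/520/syncfile bs=4k count=2000 oflag=dsync
  dd if=/dev/zero of=/mnt/530/syncfile bs=4k count=2000 oflag=dsync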

Greets,
Stefan


Re: [ceph-users] Size of RBD images

2013-11-21 Thread nicolasc

Thanks Josh! This is a lot clearer now.

I understand that librbd is low-level, but still, a warning wouldn't 
hurt, would it? Just check if the size parameter is larger than the 
cluster capacity, no?


Thank you for pointing out the trick of simply deleting the rbd_header, 
I will try that now.


Best regards,

Nicolas Canceill
Scalable Storage Systems
SURFsara (Amsterdam, NL)



On 11/20/2013 06:33 PM, Josh Durgin wrote:

On 11/20/2013 06:53 AM, nicolasc wrote:

Thank you Bernhard and Wogri. My old kernel version also explains the
format issue. Once again, sorry to have mixed that in the problem.

Back to my original inquiries, I hope someone can help me understand 
why:

* it is possible to create an RBD image larger than the total capacity
of the cluster


There's simply no checking of the size of the cluster by librbd.
rbd does not know whether you're about to add a bunch of capacity to 
the cluster, or whether you want your storage overcommitted (and by 
how much).


Higher level tools like openstack cinder can provide that kind of 
logic, but 'rbd create' is more of a low level tool at this time.



* a large empty image takes longer to shrink/delete than a small one


rbd doesn't keep an index of which objects exist (since doing so would
hurt write performance). The downside is, as you saw, that when shrinking or
deleting an image it must look for all objects above the shrink size
(deleting is like shrinking to 0 of course).

In dumpling or later rbd can do this in parallel controlled by 
--rbd-concurrent-management-ops, which defaults to 10.
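
For example (a sketch with a hypothetical image name; ceph tools generally
accept config options like this on the command line):

  rbd rm rbd/bigimage --rbd-concurrent-management-ops 20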


If you've never written to the image, you can just delete the rbd_header
and rbd_id objects for it (or just the $imagename.rbd object for format 1
images), then 'rbd rm' will be fast since it'll just remove its entry from
the rbd_directory object.
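
A minimal sketch, assuming a never-written format 1 image called bigimage in
the default rbd pool:

  rados -p rbd rm bigimage.rbd    # drop the header object directly
  rbd rm rbd/bigimage             # now fast: only the rbd_directory entry remains

For a format 2 image the objects to remove are rbd_id.bigimage and
rbd_header.<image id> instead.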

Josh


Best regards,

Nicolas Canceill
Scalable Storage Systems
SURFsara (Amsterdam, NL)



On 11/20/2013 01:56 PM, Bernhard Glomm wrote:

That might be; the manpage of
ceph version 0.72.1
tells me it isn't, though.
Anyhow, I'm still running kernel 3.8.xx.

Bernhard

On 19.11.2013 20:10:04, Wolfgang Hennerbichler wrote:


On Nov 19, 2013, at 3:47 PM, Bernhard Glomm
 wrote:

Hi Nicolas
just fyi
rbd format 2 is not supported yet by the linux kernel (module)


I believe this is wrong. I think linux supports rbd format 2
images since 3.10.

wogri




--

*Ecologic Institute* *Bernhard Glomm*
IT Administration

Phone: +49 (30) 86880 134
Fax: +49 (30) 86880 100
Skype: bernhard.glomm.ecologic

Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717
Berlin | Germany
GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.:
DE811963464
Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH














[ceph-users] how to monitor osd?

2013-11-21 Thread tianqing lee
hello,
   Are there any methods to monitor osd nodes? For example, the free space of
one osd node.


Re: [ceph-users] Size of RBD images

2013-11-21 Thread Wolfgang Hennerbichler

-- 
http://www.wogri.at

On Nov 21, 2013, at 10:30 AM, nicolasc  wrote:

> Thanks Josh! This is a lot clearer now.
> 
> I understand that librbd is low-level, but still, a warning wouldn't hurt, 
> would it? Just check if the size parameter is larger than the cluster 
> capacity, no?

Maybe I want to create a huge image now, and add the OSD capacity later, so
this makes sense.

> Thank you for pointing out the trick of simply deleting the rbd_header, I 
> will try that now.




Re: [ceph-users] Size of RBD images

2013-11-21 Thread nicolasc
Yes, I understand that creating an image larger than the cluster may 
sometimes be considered a feature. I am not suggesting it should be 
forbidden, simply that it should display a warning message to the operator.


Full disclosure: I am not a Ceph dev, this is a simple user's opinion.

Best regards,

Nicolas Canceill
Scalable Storage Systems
SURFsara (Amsterdam, NL)



Re: [ceph-users] Size of RBD images

2013-11-21 Thread Damien Churchill
On 19 November 2013 20:12, LaSalle, Jurvis
 wrote:
>
> On 11/19/13, 2:10 PM, "Wolfgang Hennerbichler"  wrote:
>
> >
> >On Nov 19, 2013, at 3:47 PM, Bernhard Glomm 
> >wrote:
> >
> >> Hi Nicolas
> >> just fyi
> >> rbd format 2 is not supported yet by the linux kernel (module)
> >
> >I believe this is wrong. I think linux supports rbd format 2 images since
> >3.10.
>
> One more reason to cross our fingers for official Saucy Salamander support
> soon…
>

Or you could just use precise with linux-image-generic-lts-saucy.
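
On precise that would be something like:

  sudo apt-get install linux-image-generic-lts-saucy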


Re: [ceph-users] Is Ceph a provider of block device too ?

2013-11-21 Thread Dimitri Maziuk
On 11/21/2013 12:52 PM, Gregory Farnum wrote:

> If you want a logically distinct copy (? this seems to be what Dimitri
> is referring to with adding a 3rd DRBD copy on another node)

Disclaimer: I haven't done "stacked drbd", this is from my reading of
the fine manual -- I was referring to a "stacked" setup where you make a
drbd raid-1 w/ 2 hosts and then a drbd raid-1 w/ that drbd device
and another host. I don't believe drbd can keep 3 replicas any other way
-- unlike ceph, obviously.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [ceph-users] Failed to fetch files

2013-11-21 Thread Smart Weblications GmbH - Florian Wiessner
Hi Knut Moe,

On 21.11.2013 22:51, Knut Moe wrote:
> Thanks, Alfredo.
> 
> The challenge is that it is calling those links when issuing the following 
> command:
> 
> sudo apt-get update && sudo apt-get install ceph-deploy
> 
> It then goes through a lot of different steps before displaying those error
> messages. See more of the error in this screenshot link:
> 
> http://www.specdata.net/ceph-message.gif
> 
> Is there a configuration file that should be modified before running the 
> update
> and deploy command?
> 


Check your /etc/apt/sources.list and /etc/apt/sources.list.d/ceph.list

for "{ceph-stable-release}" and replace that with "emperor".

then do:

sudo apt-get update && sudo apt-get install ceph-deploy
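
For example, assuming the unreplaced placeholder ended up in
/etc/apt/sources.list.d/ceph.list (as the 404 URLs suggest), a one-liner like
this does the substitution (just a sketch, check the file first):

  sudo sed -i 's/{ceph-stable-release}/emperor/g' /etc/apt/sources.list.d/ceph.list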




-- 

Kind regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Sitz der Gesellschaft: Naila
Geschäftsführer: Florian Wiessner
HRB-Nr.: HRB 3840 Amtsgericht Hof
*aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz


Re: [ceph-users] Openstack Havana, boot from volume fails

2013-11-21 Thread Jens-Christian Fischer
The weird thing is that I have some volumes that were created from a snapshot
that actually boot (they complain about not being able to connect to the
metadata server, which I guess is a totally different problem), but in the end
they come up.

I haven't been able to see the difference between the volumes….

I re-snapshotted the instance whose volume wouldn't boot, and made a volume out 
of it, and this instance booted nicely from the volume.

weirder and weirder…

/jc

-- 
SWITCH
Jens-Christian Fischer, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 15 71
jens-christian.fisc...@switch.ch
http://www.switch.ch

http://www.switch.ch/socialmedia

On 21.11.2013, at 15:05, Jens-Christian Fischer 
 wrote:

> Hi all
> 
> I'm playing with the boot from volume options in Havana and have run into 
> problems:
> 
> (Openstack Havana, Ceph Dumpling (0.67.4), rbd for glance, cinder and 
> experimental ephemeral disk support)
> 
> The following things do work:
> - glance images are in rbd
> - cinder volumes are in rbd
> - creating a VM from an image works
> - creating a VM from a snapshot works
> 
> 
> However, the booting from volume fails:
> 
> Steps to reproduce:
> 
> Boot from image
> Create snapshot from running instance
> Create volume from this snapshot
> Start a new instance with "boot from volume" and the volume just created:
> 
> The boot process hangs after around 3 seconds, and the console.log of the 
> instance shows this:
> 
> [0.00] Linux version 3.11.0-12-generic (buildd@allspice) (gcc version 
> 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu7) ) #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 
> 2013 (Ubuntu 3.11.0-12.19-generic 3.11.3)
> [0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-3.11.0-12-generic 
> root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
> ...
> [0.098221] Brought up 1 CPUs
> [0.098964] smpboot: Total of 1 processors activated (4588.94 BogoMIPS)
> [0.100408] NMI watchdog: enabled on all CPUs, permanently consumes one 
> hw-PMU counter.
> [0.102667] devtmpfs: initialized
> …
> [0.560202] Linux agpgart interface v0.103
> [0.562276] brd: module loaded
> [0.563599] loop: module loaded
> [0.565315]  vda: vda1
> [0.568386] scsi0 : ata_piix
> [0.569217] scsi1 : ata_piix
> [0.569972] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc0a0 irq 14
> [0.571289] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc0a8 irq 15
> …
> [0.742082] Freeing unused kernel memory: 1040K (8800016fc000 - 
> 88000180)
> [0.746153] Freeing unused kernel memory: 836K (880001b2f000 - 
> 880001c0)
> Loading, please wait...
> [0.764177] systemd-udevd[95]: starting version 204
> [0.787913] floppy: module verification failed: signature and/or required 
> key missing - tainting kernel
> [0.825174] FDC 0 is a S82078B
> …
> [1.448178] tsc: Refined TSC clocksource calibration: 2294.376 MHz
> error: unexpectedly disconnected from boot status daemon
> Begin: Loading essential drivers ... done.
> Begin: Running /scripts/init-premount ... done.
> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... 
> done.
> Begin: Running /scripts/local-premount ... done.
> [2.384452] EXT4-fs (vda1): mounted filesystem with ordered data mode. 
> Opts: (null)
> Begin: Running /scripts/local-bottom ... done.
> done.
> Begin: Running /scripts/init-bottom ... done.
> [3.021268] init: mountall main process (193) killed by FPE signal
> General error mounting filesystems.
> A maintenance shell will now be started.
> CONTROL-D will terminate this shell and reboot the system.
> root@box-web1:~# 
> The console is stuck, I can't get to the rescue shell
> 
> I can "rbd map" the volume and mount it from a physical host - the filesystem 
> etc all is in good order.
> 
> Any ideas?
> 
> cheers
> jc
> 
> -- 
> SWITCH
> Jens-Christian Fischer, Peta Solutions
> Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
> phone +41 44 268 15 15, direct +41 44 268 15 71
> jens-christian.fisc...@switch.ch
> http://www.switch.ch
> 
> http://www.switch.ch/socialmedia
> 



Re: [ceph-users] Failed to fetch files

2013-11-21 Thread Dan Mick
Perhaps you mean these instructions, from 
http://ceph.com/docs/master/start/quick-start-preflight/#ceph-deploy-setup?


---clip---
2. Add the Ceph packages to your repository. Replace 
{ceph-stable-release} with a stable Ceph release (e.g., cuttlefish, 
dumpling, etc.).


For example:
echo deb http://ceph.com/debian-{ceph-stable-release}/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list

---clip---

See that second sentence there?
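
With the current stable release (emperor) filled in, that line becomes:

  echo deb http://ceph.com/debian-emperor/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list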


On 11/21/2013 01:01 PM, Knut Moe wrote:

Hi all,

I am trying to install Ceph using the Preflight Checklist and when I
issue the following command

sudo apt-get update && sudo apt-get install ceph-deploy

I get the following error after it goes through a lot of different steps:

Failed to fetch
http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-amd64/Packages
404 Not Found

Failed to fetch
http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-i386/Packages
404 Not Found

I am using Ubuntu 12.04, 64-bit.

Thanks,
Kurt








--
Dan Mick, Filesystem Engineering
Inktank Storage, Inc.   http://inktank.com
Ceph docs: http://ceph.com/docs


[ceph-users] OpenStack, Boot from image (create volume) failed with volumes in rbd

2013-11-21 Thread Jens-Christian Fischer
Hi all

I'm playing with the boot from volume options in Havana and have run into 
problems:

(Openstack Havana, Ceph Dumpling (0.67.4), rbd for glance, cinder and 
experimental ephemeral disk support)

The following things do work:
- glance images are in rbd
- cinder volumes are in rbd
- creating a VM from an image works
- creating a VM from a snapshot works


However, the boot from volume options fail in various ways:


* Select "Boot from Image (create volume)" 

Booting fails, with the VM complaining that there is no bootable device: "Boot
failed: not a bootable disk".

The libvirt.xml definition is as follows (the XML tags were stripped by the
list archive; only the disk serial survives):

  fa635ee4-5ea5-429d-bfab-6e53aa687245

The qemu command line is this: 

 qemu-system-x86_64 -machine accel=kvm:tcg -name instance-0187 -S -machine 
pc-i440fx-1.5,accel=kvm,usb=off -cpu 
SandyBridge,+pdpe1gb,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
 -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 
f28f9b90-9e75-45a7-ac34-c8dd2c6e3c5b -smbios type=1,manufacturer=OpenStack 
Foundation,product=OpenStack 
Nova,version=2013.2,serial=078965e4-1a79-0010-82d4-089e015a2b41,uuid=f28f9b90-9e75-45a7-ac34-c8dd2c6e3c5b
 -no-user-config -nodefaults -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-0187.monitor,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew 
-no-kvm-pit-reinjection -no-shutdown -device 
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
file=rbd:volumes/volume-fa635ee4-5ea5-429d-bfab-6e53aa687245:id=volumes:key=AQAsO2pSYEjXNBAAYB02+zSa2boqFcl+aZNwLw==:auth_supported=cephx\;none:mon_host=[\:\:0\:6\:\:11c]\:6789\;[\:\:0\:6\:\:11d]\:6789\;[\:\:0\:6\:\:11e]\:6789,if=none,id=drive-virtio-disk0,format=raw,serial=fa635ee4-5ea5-429d-bfab-6e53aa687245,cache=none
 -device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:93:a1:88,bus=pci.0,addr=0x3 
-chardev 
file,id=charserial0,path=/var/lib/nova/instances/f28f9b90-9e75-45a7-ac34-c8dd2c6e3c5b/console.log
 -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 
-device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 
-vnc 0.0.0.0:4 -k en-us -vga cirrus -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

The volume is known to cinder:

cinder list --all-tenants | grep fa635ee4
| fa635ee4-5ea5-429d-bfab-6e53aa687245 |   in-use  |
   |  10  | None|   true   | 
f28f9b90-9e75-45a7-ac34-c8dd2c6e3c5b |

and rbd

root@hxs:~# rbd --pool volumes ls | grep fa635ee4
volume-fa635ee4-5ea5-429d-bfab-6e53aa687245

the file is a qcow2 file:

root@hxs:~# rbd map --pool volumes volume-fa635ee4-5ea5-429d-bfab-6e53aa687245
root@hxs:~# mount /dev/rbd2p1 /dev/rbd
mount: special device /dev/rbd2p1 does not exist

root@hxs:~#  dd if=/dev/rbd2 of=foo count=100
root@hxs:~# file foo
foo: QEMU QCOW Image (v2), 2147483648 bytes

It is our understanding that we need raw volumes to get the boot process
working. Why is the volume created as a qcow2 volume?
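
One possible explanation (an assumption on my part, not confirmed here): the
Cinder RBD driver only clones Glance images that are stored as raw; a qcow2
image gets copied into the volume byte for byte, which QEMU then cannot boot
when the volume is attached as a raw device. A sketch of the check and the
usual workaround, with hypothetical image names:

  qemu-img info /dev/rbd2      # reports "qcow2" if the image was copied verbatim
  qemu-img convert -f qcow2 -O raw ubuntu.qcow2 ubuntu.raw
  glance image-create --name ubuntu-raw --disk-format raw \
      --container-format bare --file ubuntu.raw

Volumes created from the raw image should then boot directly.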

cheers
jc

-- 
SWITCH
Jens-Christian Fischer, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 15 71
jens-christian.fisc...@switch.ch
http://www.switch.ch

http://www.switch.ch/socialmedia



Re: [ceph-users] Is Ceph a provider of block device too ?

2013-11-21 Thread John-Paul Robinson
Is this statement accurate?

As I understand DRBD, you can replicate online block devices reliably,
but with Ceph the replication for RBD images requires that the file
system be offline.

Thanks for the clarification,

~jpr


On 11/08/2013 03:46 PM, Gregory Farnum wrote:
>> Does Ceph provides the distributed filesystem and block device?
> Ceph's RBD is a distributed block device and works very well; you
> could use it to replace DRBD. 



[ceph-users] Failed to fetch files

2013-11-21 Thread Knut Moe
Hi all,

I am trying to install Ceph using the Preflight Checklist and when I issue
the following command

sudo apt-get update && sudo apt-get install ceph-deploy

I get the following error after it goes through a lot of different steps:

Failed to fetch
http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-amd64/Packages
404 Not Found

Failed to fetch
http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-i386/Packages
404 Not Found

I am using Ubuntu 12.04, 64-bit.

Thanks,
Kurt


Re: [ceph-users] Failed to fetch files

2013-11-21 Thread Knut Moe
Thanks, Alfredo.

The challenge is that it is calling those links when issuing the following
command:

sudo apt-get update && sudo apt-get install ceph-deploy

It then goes through a lot of different steps before displaying those error
messages. See more of the error in this screenshot link:

http://www.specdata.net/ceph-message.gif

Is there a configuration file that should be modified before running the
update and deploy command?



On Thu, Nov 21, 2013 at 2:11 PM, Alfredo Deza wrote:

> On Thu, Nov 21, 2013 at 4:01 PM, Knut Moe  wrote:
> > Hi all,
> >
> > I am trying to install Ceph using the Preflight Checklist and when I
> issue
> > the following command
> >
> > sudo apt-get update && sudo apt-get install ceph-deploy
> >
> > I get the following error after it goes through a lot of different steps:
> >
> > Failed to fetch
> >
> http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-amd64/Packages
> > 404 Not Found
> >
> > Failed to fetch
> >
> http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-i386/Packages
> > 404 Not Found
> >
> > I am using Ubuntu 12.04, 64-bit.
>
> It looks like you've added a URL that doesn't exist and is used to
> explain that you need to replace "{ceph-stable-release}"
> with the actual ceph stable release ("emperor" at the moment).
>
> So with "emperor" that URL would be:
>
> http://ceph.com/debian-emperor/dists/precise/main/binary-i386/Packages
>
>
> >
> > Thanks,
> > Kurt
> >
> >
> >
> >
> >
>


[ceph-users] CephFS filesystem disapear!

2013-11-21 Thread Alphe Salas Michels
Hello all!

I have been experiencing a strange issue since the last update to ubuntu 13.10
(saucy) and ceph emperor 0.72.1.

kernel version  3.11.0-13-generic #20-Ubuntu

The ceph packages installed are the ones for RARING.

When I mount my ceph cluster using cephfs and I upload tons of
data, or do a directory listing (find . -printf "%d %k"), or do
a chown -R user:user *, at some point the filesystem disappears!

I don't know how to solve this issue. There is no entry in any log; the
only thing that seems to be affected is ceph-watch-notice, which gets
stuck and blocks the unmount (I have to kill -9 that process to umount /
mount the ceph cluster on the client proxy and start the process over).
In the chown case, if I add --changes to slow it down just enough, the
problem seems to disappear.


Any suggestions are welcome.
Regards,

-- 
Alphé Salas
Ingeniero T.I

Kepler Data Recovery
Asturias 97, Las Condes
Santiago - Chile
(56 2) 2362 7504
asa...@kepler.cl
www.kepler.cl



[ceph-users] Openstack Havana, boot from volume fails

2013-11-21 Thread Jens-Christian Fischer
Hi all

I'm playing with the boot from volume options in Havana and have run into 
problems:

(Openstack Havana, Ceph Dumpling (0.67.4), rbd for glance, cinder and 
experimental ephemeral disk support)

The following things do work:
- glance images are in rbd
- cinder volumes are in rbd
- creating a VM from an image works
- creating a VM from a snapshot works


However, the booting from volume fails:

Steps to reproduce:

Boot from image
Create snapshot from running instance
Create volume from this snapshot
Start a new instance with "boot from volume" and the volume just created:

The boot process hangs after around 3 seconds, and the console.log of the 
instance shows this:

[0.00] Linux version 3.11.0-12-generic (buildd@allspice) (gcc version 
4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu7) ) #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 
2013 (Ubuntu 3.11.0-12.19-generic 3.11.3)
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-3.11.0-12-generic 
root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
...
[0.098221] Brought up 1 CPUs
[0.098964] smpboot: Total of 1 processors activated (4588.94 BogoMIPS)
[0.100408] NMI watchdog: enabled on all CPUs, permanently consumes one 
hw-PMU counter.
[0.102667] devtmpfs: initialized
…
[0.560202] Linux agpgart interface v0.103
[0.562276] brd: module loaded
[0.563599] loop: module loaded
[0.565315]  vda: vda1
[0.568386] scsi0 : ata_piix
[0.569217] scsi1 : ata_piix
[0.569972] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc0a0 irq 14
[0.571289] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc0a8 irq 15
…
[0.742082] Freeing unused kernel memory: 1040K (8800016fc000 - 
88000180)
[0.746153] Freeing unused kernel memory: 836K (880001b2f000 - 
880001c0)
Loading, please wait...
[0.764177] systemd-udevd[95]: starting version 204
[0.787913] floppy: module verification failed: signature and/or required 
key missing - tainting kernel
[0.825174] FDC 0 is a S82078B
…
[1.448178] tsc: Refined TSC clocksource calibration: 2294.376 MHz
error: unexpectedly disconnected from boot status daemon
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
[2.384452] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: 
(null)
Begin: Running /scripts/local-bottom ... done.
done.
Begin: Running /scripts/init-bottom ... done.
[3.021268] init: mountall main process (193) killed by FPE signal
General error mounting filesystems.
A maintenance shell will now be started.
CONTROL-D will terminate this shell and reboot the system.
root@box-web1:~# 
The console is stuck, I can't get to the rescue shell

I can "rbd map" the volume and mount it from a physical host - the filesystem 
etc all is in good order.

Any ideas?

cheers
jc

-- 
SWITCH
Jens-Christian Fischer, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 15 71
jens-christian.fisc...@switch.ch
http://www.switch.ch

http://www.switch.ch/socialmedia



Re: [ceph-users] how to fix active+remapped pg

2013-11-21 Thread John Wilkins
Ugis,

Can you provide the results for:

ceph osd tree
ceph osd crush dump
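
To get the full CRUSH map in readable form as well, the standard extraction
is:

  ceph osd getcrushmap -o /tmp/crushmap
  crushtool -d /tmp/crushmap -o /tmp/crushmap.txt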






On Thu, Nov 21, 2013 at 7:59 AM, Gregory Farnum  wrote:
> On Thu, Nov 21, 2013 at 7:52 AM, Ugis  wrote:
>> Thanks, reread that section in docs and found tunables profile - nice
>> to have, hadn't noticed it before(ceph docs develop so fast that you
>> need RSS to follow all changes :) )
>>
>> Still problem persists in a different way.
>> Did set profile "optimal", reballancing started, but I had "rbd
>> delete" in background, in the end cluster ended up with negative
>> degradation %
>> I think I have hit bug http://tracker.ceph.com/issues/3720   which is
>> still open.
>> I did restart osds one by one and negative degradation dissapeared.
>>
>> Afterwards I added extra ~900GB data, degradation growed in process to 0.071%
>> This is rather http://tracker.ceph.com/issues/3747  which is closed,
>> but seems to happen still.
>> I did "ceph osd out X; sleep 40; ceph osd in X" for all osds,
>> degradation % went away.
>>
>> In the end I still have "55 active+remapped" pgs and no degradation %.
>> "pgmap v1853405: 2662 pgs: 2607 active+clean, 55 active+remapped; 5361
>> GB data, 10743 GB used, 10852 GB / 21595 GB avail; 25230KB/s rd,
>> 203op/s"
>>
>> I queried some of remapped pgs, do not see why they do not
>> reballance(tunables are optimal now, checked).
>>
>> Where to look for the reason they are not reballancing? Is there
>> something to look for in osd logs if debug level is increased?
>>
>> one of those:
>> # ceph pg 4.5e query
>> { "state": "active+remapped",
>>   "epoch": 9165,
>>   "up": [
>> 9],
>>   "acting": [
>> 9,
>> 5],
>
> For some reason CRUSH is still failing to map all the PGs to two hosts
> (notice how the "up" set is only one OSD, so it's adding another one
> in "acting") — what's your CRUSH map look like?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com



-- 
John Wilkins
Senior Technical Writer
Intank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com


Re: [ceph-users] Is Ceph a provider of block device too ?

2013-11-21 Thread Gregory Farnum
On Thu, Nov 21, 2013 at 10:13 AM, John-Paul Robinson  wrote:
> Is this statement accurate?
>
> As I understand DRBD, you can replicate online block devices reliably,
> but with Ceph the replication for RBD images requires that the file
> system be offline.

It's not clear to me what replication you're asking about with Ceph
here. Ceph is internally replicating the block device to keep it
available in case of server failures, and that's all happening online.
If you want a logically distinct copy (? this seems to be what Dimitri
is referring to with adding a 3rd DRBD copy on another node) then that
model is a lot different than in the classical storage world — it
might be that you're saying "I want 3-copy durability instead of
2-copy" which you can do online (for all images stored in a given
pool); or you might be after a copy you can use as a stable backup
point, which you can do by taking a snapshot of the image (but here
you probably want to quiesce the filesystem, yes); or you might be
after something completely different which I'm not thinking of...
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
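
For reference, the first two options map onto standard commands (a sketch;
pool and image names are placeholders):

  ceph osd pool set rbd size 3            # 3-copy durability for everything in the pool
  rbd snap create rbd/myimage@backup1     # point-in-time snapshot (quiesce the FS first)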


Re: [ceph-users] Intel 520/530 SSD for ceph

2013-11-21 Thread Mark Nelson

On 11/21/2013 02:36 AM, Stefan Priebe - Profihost AG wrote:

Hi,

On 21.11.2013 01:29, m...@linuxbox.com wrote:

On Tue, Nov 19, 2013 at 09:02:41AM +0100, Stefan Priebe wrote:
...

You might be able to vary this behavior by experimenting with sdparm,
smartctl or other tools, or possibly with different microcode in the drive.

Which values or which settings do you think of?

...

Off-hand, I don't know.  Probably the first thing would be
to compare the configuration of your 520 & 530; anything that's
different is certainly worth investigating.

This should display all pages,
sdparm --all --long /dev/sdX
the 520 only appears to have 3 pages, which can be fetched directly w/
sdparm --page=ca --long /dev/sdX
sdparm --page=co --long /dev/sdX
sdparm --page=rw --long /dev/sdX

The sample machine I'm looking at has an intel 520, and on ours,
most options show as 0 except for
   AWRE1  [cha: n, def:  1]  Automatic write reallocation enabled
   WCE 1  [cha: y, def:  1]  Write cache enable
   DRA 1  [cha: n, def:  1]  Disable read ahead
   GLTSD   1  [cha: n, def:  1]  Global logging target save disable
   BTP-1  [cha: n, def: -1]  Busy timeout period (100us)
   ESTCT  30  [cha: n, def: 30]  Extended self test completion time (sec)
Perhaps that's an interesting data point to compare with yours.

Figuring out if you have up-to-date intel firmware appears to require
burning and running an iso image from
https://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=18455

The results of sdparm --page= --long /dev/sdc
show the intel firmware, but this labels it better:
smartctl -i /dev/sdc
Our 520 has firmware "400i" loaded.


Firmware is up2date and all values are the same. I expect that the 520
firmware just ignores CMD_FLUSH commands and the 530 does not.


For those of you that don't follow LKML, there is some interesting 
discussion going on regarding this same issue (Hi Stefan!)


https://lkml.org/lkml/2013/11/20/158

Can anyone think of a reasonable (ie not yanking power out) way to test 
what CMD_FLUSH is actually doing?  I have some 520s in our test rig I 
can play with.  Otherwise, maybe an Intel engineer can chime in and let 
us know what's going on?
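
One indirect check (destructive to the target device, so use a scratch drive)
is to issue small synchronous writes and watch the latency: a drive that
acknowledges CMD_FLUSH without actually flushing shows implausibly low fsync
latency. A sketch with fio, assuming /dev/sdX is a disposable SSD:

  fio --name=flushtest --filename=/dev/sdX --rw=randwrite --bs=4k \
      --ioengine=sync --direct=1 --fdatasync=1 --runtime=30 --time_based

This only infers behaviour from timing, though; without a power-cut test it
proves nothing about durability.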




Greets,
Stefan





Re: [ceph-users] how to monitor osd?

2013-11-21 Thread John Kinsella
As an OSD is just a partition, you could use any of the monitoring packages out 
there? (I like opsview…)

We use the check-ceph-status nagios plugin[1] to monitor overall cluster 
status, but I'm planning on adding/finding more monitoring functionality soon 
(e.g. ceph df)

John
1: https://github.com/dreamhost/ceph-nagios-plugin
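
For a quick manual check of free space per OSD, something along these lines
works (a sketch; the path assumes the default OSD data location):

  ceph df                        # cluster-wide and per-pool usage
  ceph pg dump osds              # per-OSD kb_used / kb_avail columns
  df -h /var/lib/ceph/osd/*      # plain filesystem usage on the OSD host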

On Nov 21, 2013, at 1:59 AM, tianqing lee 
 wrote:

> hello,
>is there some methods to monitor osd nodes? for example the free size of 
> one osd node.



Re: [ceph-users] how to fix active+remapped pg

2013-11-21 Thread Ugis
Thanks, I reread that section in the docs and found the tunables profile - nice
to have, I hadn't noticed it before (the ceph docs develop so fast that you
need RSS to follow all changes :) )

Still, the problem persists in a different way.
I set the profile to "optimal" and rebalancing started, but I had an "rbd
delete" running in the background, and in the end the cluster ended up with a
negative degradation %.
I think I have hit bug http://tracker.ceph.com/issues/3720, which is
still open.
I restarted the osds one by one and the negative degradation disappeared.

Afterwards I added an extra ~900GB of data; degradation grew during the process
to 0.071%.
This is rather http://tracker.ceph.com/issues/3747, which is closed
but seems to still happen.
I did "ceph osd out X; sleep 40; ceph osd in X" for all osds, and the
degradation % went away.

In the end I still have "55 active+remapped" pgs and no degradation %:
"pgmap v1853405: 2662 pgs: 2607 active+clean, 55 active+remapped; 5361
GB data, 10743 GB used, 10852 GB / 21595 GB avail; 25230KB/s rd,
203op/s"

I queried some of the remapped pgs and do not see why they do not
rebalance (tunables are optimal now, checked).

Where should I look for the reason they are not rebalancing? Is there
something to look for in the osd logs if the debug level is increased?

one of those:
# ceph pg 4.5e query
{ "state": "active+remapped",
  "epoch": 9165,
  "up": [
9],
  "acting": [
9,
5],
  "info": { "pgid": "4.5e",
  "last_update": "8838'106544",
  "last_complete": "8838'106544",
  "log_tail": "6874'103543",
  "last_backfill": "MAX",
  "purged_snaps": "[1~3]",
  "history": { "epoch_created": 452,
  "last_epoch_started": 9159,
  "last_epoch_clean": 9160,
  "last_epoch_split": 0,
  "same_up_since": 9128,
  "same_interval_since": 9155,
  "same_primary_since": 8797,
  "last_scrub": "7696'104899",
  "last_scrub_stamp": "2013-11-14 11:45:43.492512",
  "last_deep_scrub": "6874'103975",
  "last_deep_scrub_stamp": "2013-11-10 05:16:26.380028",
  "last_clean_scrub_stamp": "2013-11-14 11:45:43.492512"},
  "stats": { "version": "8838'106544",
  "reported_seq": "2172386",
  "reported_epoch": "9165",
  "state": "active+remapped",
  "last_fresh": "2013-11-21 17:39:51.055332",
  "last_change": "2013-11-21 16:04:04.786995",
  "last_active": "2013-11-21 17:39:51.055332",
  "last_clean": "2013-11-15 11:51:42.883916",
  "last_became_active": "0.00",
  "last_unstale": "2013-11-21 17:39:51.055332",
  "mapping_epoch": 9128,
  "log_start": "6874'103543",
  "ondisk_log_start": "6874'103543",
  "created": 452,
  "last_epoch_clean": 9160,
  "parent": "0.0",
  "parent_split_bits": 0,
  "last_scrub": "7696'104899",
  "last_scrub_stamp": "2013-11-14 11:45:43.492512",
  "last_deep_scrub": "6874'103975",
  "last_deep_scrub_stamp": "2013-11-10 05:16:26.380028",
  "last_clean_scrub_stamp": "2013-11-14 11:45:43.492512",
  "log_size": 3001,
  "ondisk_log_size": 3001,
  "stats_invalid": "0",
  "stat_sum": { "num_bytes": 2814377984,
  "num_objects": 671,
  "num_object_clones": 0,
  "num_object_copies": 0,
  "num_objects_missing_on_primary": 0,
  "num_objects_degraded": 0,
  "num_objects_unfound": 0,
  "num_read": 264075,
  "num_read_kb": 1203000,
  "num_write": 8363,
  "num_write_kb": 3850240,
  "num_scrub_errors": 0,
  "num_shallow_scrub_errors": 0,
  "num_deep_scrub_errors": 0,
  "num_objects_recovered": 0,
  "num_bytes_recovered": 0,
  "num_keys_recovered": 0},
  "stat_cat_sum": {},
  "up": [
9],
  "acting": [
9,
5]},
  "empty": 0,
  "dne": 0,
  "incomplete": 0,
  "last_epoch_started": 9159},
  "recovery_state": [
{ "name": "Started\/Primary\/Active",
  "enter_time": "2013-11-21 16:04:04.786697",
  "might_have_unfound": [],
  "recovery_progress": { "backfill_target": -1,
  "waiting_on_backfill": 0,
  "backfill_pos": "0\/\/0\/\/-1",
  "backfill_info": { "begin": "0\/\/0\/\/-1",
  "end": "0\/\/0\/\/-1",
  "objects": []},
  "peer_backfill_info": { "begin": "0\/\/0\/\/-1",
  "end": "0\/\/0\/\/-1",
  "objects": []},
  "backfills_in_flight": [],
  "pull_from_peer": [],
  "pushing": []},
  "scrub": { "scrubber.epoch_start": "0",
  "scrubber.active": 0,
  "scrubber.block_writes": 0,
  "scrubber.finalizing": 0,
  "scrubber.waiting_on": 0,
  "scrubber.waiting_

Re: [ceph-users] Is Ceph a provider of block device too ?

2013-11-21 Thread John-Paul Robinson
Sorry, I was mixing concepts.

I was thinking of RBD snapshots, which require the file system to be
consistent before creating.
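
For reference, the usual pattern around a snapshot is to freeze the mounted
filesystem first (a sketch; the mount point and image name are hypothetical):

  fsfreeze -f /mnt/myrbd                    # quiesce the FS on the host using the image
  rbd snap create rbd/myimage@consistent    # take the snapshot
  fsfreeze -u /mnt/myrbd                    # thaw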

I have been exploring an idea for creating remote, asynchronous copies
of RBD images, hopefully with some form of delayed state updating.  I've
been reviewing the features of DRBD and LVM to see if these components
might add features to make this feasible.

The RBD image replication features within a pool do work well and
clearly maintain a consistent state for the contained file systems.  I
use them daily. :)

~jpr

On 11/21/2013 12:52 PM, Gregory Farnum wrote:
> On Thu, Nov 21, 2013 at 10:13 AM, John-Paul Robinson  wrote:
>> Is this statement accurate?
>>
>> As I understand DRBD, you can replicate online block devices reliably,
>> but with Ceph the replication for RBD images requires that the file
>> system be offline.
> It's not clear to me what replication you're asking about with Ceph
> here. Ceph is internally replicating the block device to keep it
> available in case of server failures, and that's all happening online.
> If you want a logically distinct copy (? this seems to be what Dimitri
> is referring to with adding a 3rd DRBD copy on another node) then that
> model is a lot different than in the classical storage world — it
> might be that you're saying "I want 3-copy durability instead of
> 2-copy" which you can do online (for all images stored in a given
> pool); or you might be after a copy you can use as a stable backup
> point, which you can do by taking a snapshot of the image (but here
> you probably want to quiesce the filesystem, yes); or you might be
> after something completely different which I'm not thinking of...
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com



Re: [ceph-users] CephFS filesystem disapear!

2013-11-21 Thread Yan, Zheng
On Fri, Nov 22, 2013 at 9:19 AM, Alphe Salas Michels wrote:

>  Hello all!
>
> I experience a strange issue since last update to ubuntu 13.10 (saucy) and
> ceph emperor 0.72.1
>
> kernel version  3.11.0-13-generic #20-Ubuntu
>
> ceph packages installed are the ones for RARING
>
> when I mount my ceph cluster using cephfs and I upload a tons of data or
> do a directory listing (find . -printf "%d %k" ) or do
> a chown -R user:user * at some point the filesystem disapear!
>
> I don t know how to solve this issue there is no entry in anylog the only
> thing that seems to be affected is ceph-watch-notice that get stuckl
> and forbid the unmount "have to pid kill .-9 that process to umount /
> mount the ceph cluster on client proxy to start over the process.
> in the chown if I put --changes to slow it down just enought then the
> problem seems to disapear.
>
>
> sounds like the d_prune_aliases() bug. please try updating 3.12 kernel or
using ceph-fuse

Yan, Zheng


> Any suggestions are welcome
> Atte,
>
> --
>
> *Alphé Salas*
> Ingeniero T.I
>
> *Kepler Data Recovery*
>
> *Asturias 97, Las Condes*
> * Santiago- Chile*
> *(56 2) 2362 7504*
> asa...@kepler.cl
> *www.kepler.cl *
>
>
>
>


[ceph-users] 2 probs after upgrading to emperor

2013-11-21 Thread ebay
Hi,
maybe you can help us with the following problems;
if you need more info about our cluster or any debugging logs, I will be
happy to help.

Environment:
---

Small test cluster with 7 nodes, 1 osd per node
Upgrade from dumpling to emperor 0.72.1


2 Problems after upgrade:
---

- ceph -s shows "HEALTH_WARN pool images has too few pgs" (full output below)
  - changing pg_num etc. from 1024 to 1600 has no effect
  - after setting  to a larger value the cluster moves to HEALTH_OK
    (see also the generic pg_num/pgp_num note after this list)

- 20 pools
  create new pool
  wait until pool is ready
  delete pool again
  ceph -s still shows 21 pools

  stop first mon, wait, start mon again
  ceph -s shows 20 pools, 21 pools, 20 pools, ... changing every n (~ 2)
secs

  stop second mon, wait, start mon again
  ceph -s shows 20 pools, 21 pools, 20 pools, ... changing every m > n secs

  stop third mon, wait, start mon again
  ceph -s shows 20 pools ... stable
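
(Generic note, not necessarily the fix for the warning above: pg_num and
pgp_num normally have to be raised together, and data only rebalances once
pgp_num is increased as well, e.g.

  ceph osd pool set images pg_num 1600
  ceph osd pool set images pgp_num 1600 )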


Some infos about the cluster:


# uname -a
Linux host1 3.2.0-53-generic #81-Ubuntu SMP Thu Aug 22 21:01:03 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux


# ceph --version
ceph version 0.72.1 (4d923861868f6a15dcb33fef7f50f674997322de)

# ceph -s
cluster 3f4a289a-ad40-40e7-8204-ab38affb18f8
 health HEALTH_WARN pool images has too few pgs
 monmap e1: 3 mons at
{host1=10.255.NNN.NN:6789/0,host2=10.255.NNN.NN:6789/0,host3=10.255.NNN.NN:6789/0},\
election epoch 306, quorum 0,1,2 host1,host2,host3
 mdsmap e21: 1/1/1 up {0=host4=up:active}
 osdmap e419: 7 osds: 7 up, 7 in
  pgmap v815042: 17152 pgs, 21 pools, 97795 MB data, 50433 objects
146 GB used, 693 GB / 839 GB avail
   17152 active+clean
  client io 25329 kB/s wr, 39 op/s


# ceph osd dump
epoch 419
fsid 3f4a289a-ad40-40e7-8204-ab38affb18f8
created 2013-09-27 14:37:13.601605
modified 2013-11-20 16:55:35.791740
flags

pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 2 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 owner 0
pool 3 'images' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 1600 pgp_num 1600 last_change 367 owner 0
  removed_snaps [1~1,3~2]
...
...

# rados lspools | wc -l
20


Greetings
Michael


Re: [ceph-users] how to fix active+remapped pg

2013-11-21 Thread Gregory Farnum
On Thu, Nov 21, 2013 at 7:52 AM, Ugis  wrote:
> Thanks, reread that section in docs and found tunables profile - nice
> to have, hadn't noticed it before(ceph docs develop so fast that you
> need RSS to follow all changes :) )
>
> Still problem persists in a different way.
> Did set profile "optimal", reballancing started, but I had "rbd
> delete" in background, in the end cluster ended up with negative
> degradation %
> I think I have hit bug http://tracker.ceph.com/issues/3720   which is
> still open.
> I did restart osds one by one and negative degradation dissapeared.
>
> Afterwards I added extra ~900GB data, degradation growed in process to 0.071%
> This is rather http://tracker.ceph.com/issues/3747  which is closed,
> but seems to happen still.
> I did "ceph osd out X; sleep 40; ceph osd in X" for all osds,
> degradation % went away.
>
> In the end I still have "55 active+remapped" pgs and no degradation %.
> "pgmap v1853405: 2662 pgs: 2607 active+clean, 55 active+remapped; 5361
> GB data, 10743 GB used, 10852 GB / 21595 GB avail; 25230KB/s rd,
> 203op/s"
>
> I queried some of remapped pgs, do not see why they do not
> reballance(tunables are optimal now, checked).
>
> Where to look for the reason they are not reballancing? Is there
> something to look for in osd logs if debug level is increased?
>
> one of those:
> # ceph pg 4.5e query
> { "state": "active+remapped",
>   "epoch": 9165,
>   "up": [
> 9],
>   "acting": [
> 9,
> 5],

For some reason CRUSH is still failing to map all the PGs to two hosts
(notice how the "up" set is only one OSD, so it's adding another one
in "acting") — what's your CRUSH map look like?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


Re: [ceph-users] Failed to fetch files

2013-11-21 Thread Alfredo Deza
On Thu, Nov 21, 2013 at 4:01 PM, Knut Moe  wrote:
> Hi all,
>
> I am trying to install Ceph using the Preflight Checklist and when I issue
> the following command
>
> sudo apt-get update && sudo apt-get install ceph-deploy
>
> I get the following error after it goes through a lot of different steps:
>
> Failed to fetch
> http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-amd64/Packages
> 404 Not Found
>
> Failed to fetch
> http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-i386/Packages
> 404 Not Found
>
> I am using Ubuntu 12.04, 64-bit.

It looks like you've added a URL that doesn't exist and is used to
explain that you need to replace "{ceph-stable-release}"
with the actual ceph stable release ("emperor" at the moment).

So with "emperor" that URL would be:

http://ceph.com/debian-emperor/dists/precise/main/binary-i386/Packages


>
> Thanks,
> Kurt
>
>
>
>
>


Re: [ceph-users] Failed to fetch files

2013-11-21 Thread Alfredo Deza
On Thu, Nov 21, 2013 at 4:51 PM, Knut Moe  wrote:
> Thanks, Alfredo.
>
> The challenge is that it is calling those links when issuing the following
> command:
>
> sudo apt-get update && sudo apt-get install ceph-deploy
>
> It then goes through a lot of different steps before displaying those error
> messages. See more of the error in this screenshot link:
>
> http://www.specdata.net/ceph-message.gif
>
> Is there a configuration file that should be modified before running the
> update and deploy command?

I mean, something/someone added those links to your
/etc/apt/sources.list or /etc/apt/sources.list.d/ceph.list
You need to fix those to point to the correct location.



>
>
>
> On Thu, Nov 21, 2013 at 2:11 PM, Alfredo Deza 
> wrote:
>>
>> On Thu, Nov 21, 2013 at 4:01 PM, Knut Moe  wrote:
>> > Hi all,
>> >
>> > I am trying to install Ceph using the Preflight Checklist and when I
>> > issue
>> > the following command
>> >
>> > sudo apt-get update && sudo apt-get install ceph-deploy
>> >
> > I get the following error after it goes through a lot of different steps:
>> >
>> > Failed to fetch
>> >
>> > http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-amd64/Packages
>> > 404 Not Found
>> >
>> > Failed to fetch
>> >
>> > http://ceph.com/debian-{ceph-stable-release}/dists/precise/main/binary-i386/Packages
>> > 404 Not Found
>> >
>> > I am using Ubuntu 12.04, 64-bit.
>>
>> It looks like you've added a URL that doesn't exist and is used to
>> explain that you need to replace "{ceph-stable-release}"
>> with the actual ceph stable release ("emperor" at the moment).
>>
>> So with "emperor" that URL would be:
>>
>> http://ceph.com/debian-emperor/dists/precise/main/binary-i386/Packages
>>
>>
>> >
>> > Thanks,
>> > Kurt
>> >
>> >
>> >
>> >
>> >
>
>
>
>


Re: [ceph-users] CephFS filesystem disapear!

2013-11-21 Thread Gregory Farnum
What do you mean the filesystem disappears? Is it possible you're just
pushing more traffic to the disks than they can handle, and not waiting
long enough for them to catch up?
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Nov 21, 2013 at 5:19 PM, Alphe Salas Michels wrote:

>  Hello all!
>
> I experience a strange issue since last update to ubuntu 13.10 (saucy) and
> ceph emperor 0.72.1
>
> kernel version  3.11.0-13-generic #20-Ubuntu
>
> ceph packages installed are the ones for RARING
>
> when I mount my ceph cluster using cephfs and I upload a tons of data or
> do a directory listing (find . -printf "%d %k" ) or do
> a chown -R user:user * at some point the filesystem disapear!
>
> I don t know how to solve this issue there is no entry in anylog the only
> thing that seems to be affected is ceph-watch-notice that get stuckl
> and forbid the unmount "have to pid kill .-9 that process to umount /
> mount the ceph cluster on client proxy to start over the process.
> in the chown if I put --changes to slow it down just enought then the
> problem seems to disapear.
>
>
> Any suggestions are welcome
> Atte,
>
> --
>
> *Alphé Salas*
> Ingeniero T.I
>
> *Kepler Data Recovery*
>
> *Asturias 97, Las Condes*
> * Santiago- Chile*
> *(56 2) 2362 7504*
> asa...@kepler.cl
> *www.kepler.cl *
>
>
>
>


Re: [ceph-users] Is Ceph a provider of block device too ?

2013-11-21 Thread Dimitri Maziuk
On 11/21/2013 12:13 PM, John-Paul Robinson wrote:
> Is this statement accurate?
> 
> As I understand DRBD, you can replicate online block devices reliably,
> but with Ceph the replication for RBD images requires that the file
> system be offline.
> 
> Thanks for the clarification,

Basic DRBD is RAID-1 over network. You don't "replicate" the filesystem,
you have it backed by 2 devices one of which happens to be on another
computer.

Less basic DRBD allows you to mount a cluster fs on both hosts or add
another DRBD on top to mirror your filesystem to a 3rd node.

HTH
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu


