
[ceph-users] failed to create ceph monitor with ceph-deploy.

2013-08-02 Thread Sean Cao
Hi everyone

 

Creating a ceph monitor with ceph-deploy on the admin node failed; the
errors are listed below.

I recall that I never encountered this issue on prior versions, for
example 0.61-2.

Is it a bug in the current version?

 

root@ubuntu1:/cluster# ceph-deploy mon create cephcluster2-0

Traceback (most recent call last):

  File "/usr/bin/ceph-deploy", line 21, in <module>

main()

  File "/usr/lib/pymodules/python2.7/ceph_deploy/cli.py", line 112, in main

return args.func(args)

  File "/usr/lib/pymodules/python2.7/ceph_deploy/mon.py", line 234, in mon

mon_create(args)

  File "/usr/lib/pymodules/python2.7/ceph_deploy/mon.py", line 138, in
mon_create

init=init,

  File "/usr/lib/python2.7/dist-packages/pushy/protocol/proxy.py", line 255,
in 

(conn.operator(type_, self, args, kwargs))

  File "/usr/lib/python2.7/dist-packages/pushy/protocol/connection.py", line
66, in operator

return self.send_request(type_, (object, args, kwargs))

  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py",
line 323, in send_request

return self.__handle(m)

  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py",
line 639, in __handle

raise e

pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory:
'/var/lib/ceph/mon/ceph-cephcluster2-0'
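[Editor's note: the `[Errno 2]` above means the monitor data directory was never created on the target host before ceph-deploy tried to use it. A minimal sketch of the missing step, assuming the path format from the traceback; `ensure_mon_dir` is an illustrative name, not part of ceph-deploy:]

```python
import os

def ensure_mon_dir(cluster, mon_id, base="/var/lib/ceph/mon"):
    """Create the monitor data directory that ceph-deploy expects, if missing."""
    path = os.path.join(base, "{0}-{1}".format(cluster, mon_id))
    if not os.path.isdir(path):
        # equivalent to: mkdir -p /var/lib/ceph/mon/ceph-cephcluster2-0
        os.makedirs(path)
    return path
```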

 

root@ubuntu1:/cluster# dpkg -l|grep ceph

ii  ceph-deploy   1.0-1
Ceph-deploy is an easy to use configuration tool

 

root@ubuntu1:~# tail -f /cluster/ceph.log

2013-08-02 15:45:19,855 ceph_deploy.mon DEBUG Deploying mon, cluster ceph
hosts cephcluster2-0

2013-08-02 15:45:19,856 ceph_deploy.mon DEBUG Deploying mon to
cephcluster2-0

2013-08-02 15:45:20,887 ceph_deploy.mon DEBUG Distro Ubuntu codename
precise, will use upstart

2013-08-02 15:46:52,733 ceph_deploy.mon DEBUG Deploying mon, cluster ceph
hosts cephcluster2-0

2013-08-02 15:46:52,733 ceph_deploy.mon DEBUG Deploying mon to
cephcluster2-0

2013-08-02 15:46:53,655 ceph_deploy.mon DEBUG Distro Ubuntu codename
precise, will use upstart

 

root@cephcluster2-0:/var/lib/ceph# dpkg -l|grep ceph

ii  ceph  0.61.7-1precise
distributed storage and file system

ii  ceph-common   0.61.7-1precise
common utilities to mount and interact with a ceph storage cluster

ii  ceph-fs-common0.61.7-1precise
common utilities to mount and interact with a ceph file system

ii  ceph-mds  0.61.7-1precise
metadata server for the ceph distributed file system

 

Sean Cao

  http://www.lecast.com.cn

 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process

2013-08-02 Thread Oliver Francke

Well,

I believe, I'm the winner of buzzwords-bingo for today.

But seriously speaking... since I don't see this particular problem with 
qcow2 on kernel 3.2, with qemu-1.2.2, or with newer kernels, I hope I'm not 
alone here?
We have a rising number of tickets from people reinstalling from ISOs 
with the 3.2 kernel.


The quick fallback is to start all VMs with qemu-1.2.2, but we then lose 
some features, e.g. the latency-free RBD cache ;)


I just opened a bug for qemu per:

https://bugs.launchpad.net/qemu/+bug/1207686

with all dirty details.

Installing a 3.9.x backport kernel or upgrading the Ubuntu kernel to 3.8.x 
"fixes" it. So we have a bad combination for all distros with a 3.2 kernel 
and rbd as the storage backend, I assume.


Any similar findings?
Any idea of tracing/debugging ( Josh? ;) ) very welcome,

Oliver.

--

Oliver Francke

filoo GmbH
Moltkestraße 25a
0 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Idle OSD's keep using a lot of CPU

2013-08-02 Thread Mark Nelson

Hi Erik,

Is your mon still running properly?

Mark

On 08/01/2013 05:06 PM, Erik Logtenberg wrote:

Hi,

I think the high CPU usage was due to the system time not being right. I
activated ntp and it had to do quite a big adjustment, and after that the
high CPU usage was gone.

Anyway, I immediately ran into another issue. I ran a simple benchmark:
# rados bench --pool benchmark 300 write --no-cleanup

During the benchmark, one of my osd's went down. I checked the logs and
apparently there was no hardware failure (the disk is still nicely
mounted and the osd is still running), but the logfile fills up rapidly
with these messages:

2013-08-02 00:03:40.014982 7fe7336fd700  0 -- 192.168.1.15:6801/1229 >>
192.168.1.16:6801/3001 pipe(0x39e9680 sd=28 :36884 s=2 pgs=86874
cs=173547 l=0).fault, initiating reconnect
2013-08-02 00:03:40.016682 7fe7336fd700  0 -- 192.168.1.15:6801/1229 >>
192.168.1.16:6801/3001 pipe(0x39e9680 sd=28 :36885 s=2 pgs=86875
cs=173549 l=0).fault, initiating reconnect
2013-08-02 00:03:40.019241 7fe7336fd700  0 -- 192.168.1.15:6801/1229 >>
192.168.1.16:6801/3001 pipe(0x39e9680 sd=28 :36886 s=2 pgs=86876
cs=173551 l=0).fault, initiating reconnect

What could be wrong here?

Kind regards,

Erik.
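[Editor's note: bursts of `fault, initiating reconnect` lines like the ones above can be tallied per peer to see how quickly the messenger is cycling. A rough sketch; the regex is an assumption about this log excerpt's shape, not an official parser:]

```python
import re
from collections import Counter

# One messenger fault entry looks like:
#   ... 192.168.1.15:6801/1229 >> 192.168.1.16:6801/3001 pipe(...).fault, initiating reconnect
# The address after ">>" is the remote peer being reconnected to.
FAULT_RE = re.compile(r">>\s+(\S+)\s+pipe\([^)]*\)\.fault, initiating reconnect")

def count_reconnects(log_text):
    """Count 'fault, initiating reconnect' entries per remote peer."""
    return Counter(m.group(1) for m in FAULT_RE.finditer(log_text))
```

Feeding it a log file quickly shows whether one peer dominates the reconnect storm.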



On 08/01/2013 08:00 AM, Dan Mick wrote:

Logging might well help.

http://ceph.com/docs/master/rados/troubleshooting/log-and-debug/



On 07/31/2013 03:51 PM, Erik Logtenberg wrote:

Hi,

I just added a second node to my ceph test platform. The first node has
a mon and three osd's, the second node only has three osd's. Adding the
osd's was pretty painless, and ceph distributed the data from the first
node evenly over both nodes so everything seems to be fine. The monitor
also thinks everything is fine:

2013-08-01 00:41:12.719640 mon.0 [INF] pgmap v1283: 292 pgs: 292
active+clean; 9264 MB data, 24826 MB used, 5541 GB / 5578 GB avail

Unfortunately, the three osd's on the second node keep eating a lot of
cpu, while there is no activity whatsoever:

PID USER  VIRTRESSHR S  %CPU %MEM TIME+ COMMAND
21272 root  441440  34632   7848 S  61.8  0.9   4:08.62 ceph-osd
21145 root  439852  29316   8360 S  60.4  0.7   4:04.31 ceph-osd
21036 root  443828  31324   8336 S  60.1  0.8   4:07.55 ceph-osd

Any idea why that is and how I can even ask an osd what it's doing?
There is no corresponding hdd activity, it's only cpu and hardly any
memory usage.
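[Editor's note: snapshots like the `top` output above are easy to sift mechanically. A throwaway sketch; the column layout is assumed from the listing in this message, not from any fixed `top` format:]

```python
def busy_procs(top_lines, cpu_threshold=50.0, name="ceph-osd"):
    """Pick (pid, %cpu) pairs for `name` processes above the CPU threshold.

    Columns assumed as in the listing above:
    PID USER VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    hits = []
    for line in top_lines:
        parts = line.split()
        if len(parts) >= 5 and parts[-1] == name:
            try:
                # PID is the first column, %CPU is fourth from the end.
                pid, cpu = int(parts[0]), float(parts[-4])
            except ValueError:
                continue
            if cpu >= cpu_threshold:
                hits.append((pid, cpu))
    return hits
```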

Also the monitor on the first node is doing the same thing:

PID USERVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND
12825 root186900  23492   5540 S 141.1 0.590   9:47.64 ceph-mon

I tried stopping the three osd's: that makes the monitor calm down, but
after restarting the osd's, the monitor resumes its cpu usage. I also
tried stopping the monitor, which makes the three osd's calm down, but
once again they will start eating cpu again as soon as the monitor is
back online.

In the mean time, the first three osd's, the ones on the same machine as
the monitor, don't behave like this at all. Currently as there is no
activity, they are just idling on low cpu usage, as expected.

Kind regards,

Erik.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Idle OSD's keep using a lot of CPU

2013-08-02 Thread Erik Logtenberg
Hi Mark,

Yes, I do believe so. When I run ceph -w now, I see a healthy cluster,
but during the benchmark one of the osd's went "down". The osd daemon
process was never down, and eventually it was marked "up" again some
time after the benchmark finished. There was some rebuilding/checking
because some of the pg's were stale+active+rebuilding or something like
that, but in the end all pg's were active+clean again.
During all this, I do believe the monitor was working properly.

Still, the osd's on the second node all report "hunting for new mon"
every now and then. But I don't see any cause for this. Apart from the
few benchmarks I ran, there is no activity whatsoever.

Erik.


On 08/02/2013 01:34 PM, Mark Nelson wrote:
> Hi Erik,
> 
> Is your mon still running properly?
> 
> Mark
> 
> On 08/01/2013 05:06 PM, Erik Logtenberg wrote:
>> Hi,
>>
>> I think the high CPU usage was due to the system time not being right. I
>> activated ntp and it had to do quite a big adjustment, and after that the
>> high CPU usage was gone.
>>
>> Anyway, I immediately ran into another issue. I ran a simple benchmark:
>> # rados bench --pool benchmark 300 write --no-cleanup
>>
>> During the benchmark, one of my osd's went down. I checked the logs and
>> apparently there was no hardware failure (the disk is still nicely
>> mounted and the osd is still running), but the logfile fills up rapidly
>> with these messages:
>>
>> 2013-08-02 00:03:40.014982 7fe7336fd700  0 -- 192.168.1.15:6801/1229 >>
>> 192.168.1.16:6801/3001 pipe(0x39e9680 sd=28 :36884 s=2 pgs=86874
>> cs=173547 l=0).fault, initiating reconnect
>> 2013-08-02 00:03:40.016682 7fe7336fd700  0 -- 192.168.1.15:6801/1229 >>
>> 192.168.1.16:6801/3001 pipe(0x39e9680 sd=28 :36885 s=2 pgs=86875
>> cs=173549 l=0).fault, initiating reconnect
>> 2013-08-02 00:03:40.019241 7fe7336fd700  0 -- 192.168.1.15:6801/1229 >>
>> 192.168.1.16:6801/3001 pipe(0x39e9680 sd=28 :36886 s=2 pgs=86876
>> cs=173551 l=0).fault, initiating reconnect
>>
>> What could be wrong here?
>>
>> Kind regards,
>>
>> Erik.
>>
>>
>>
>> On 08/01/2013 08:00 AM, Dan Mick wrote:
>>> Logging might well help.
>>>
>>> http://ceph.com/docs/master/rados/troubleshooting/log-and-debug/
>>>
>>>
>>>
>>> On 07/31/2013 03:51 PM, Erik Logtenberg wrote:
 Hi,

 I just added a second node to my ceph test platform. The first node has
 a mon and three osd's, the second node only has three osd's. Adding the
 osd's was pretty painless, and ceph distributed the data from the first
 node evenly over both nodes so everything seems to be fine. The monitor
 also thinks everything is fine:

 2013-08-01 00:41:12.719640 mon.0 [INF] pgmap v1283: 292 pgs: 292
 active+clean; 9264 MB data, 24826 MB used, 5541 GB / 5578 GB avail

 Unfortunately, the three osd's on the second node keep eating a lot of
 cpu, while there is no activity whatsoever:

 PID USER  VIRTRESSHR S  %CPU %MEM TIME+ COMMAND
 21272 root  441440  34632   7848 S  61.8  0.9   4:08.62 ceph-osd
 21145 root  439852  29316   8360 S  60.4  0.7   4:04.31 ceph-osd
 21036 root  443828  31324   8336 S  60.1  0.8   4:07.55 ceph-osd

 Any idea why that is and how I can even ask an osd what it's doing?
 There is no corresponding hdd activity, it's only cpu and hardly any
 memory usage.

 Also the monitor on the first node is doing the same thing:

 PID USERVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND
 12825 root186900  23492   5540 S 141.1 0.590   9:47.64 ceph-mon

 I tried stopping the three osd's: that makes the monitor calm down, but
 after restarting the osd's, the monitor resumes its cpu usage. I also
 tried stopping the monitor, which makes the three osd's calm down, but
 once again they will start eating cpu again as soon as the monitor is
 back online.

 In the mean time, the first three osd's, the ones on the same
 machine as
 the monitor, don't behave like this at all. Currently as there is no
 activity, they are just idling on low cpu usage, as expected.

 Kind regards,

 Erik.
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy ceph-create-keys hangs

2013-08-02 Thread Mathias Lindberg
Hi, 

I have tried this with almost the same result; there seems to be one more set 
of processes. 

cloud-2: root 27782 1  0 14:10 ?00:00:00 /usr/bin/ceph-mon -i 
cloud-2 --pid-file /var/run/ceph/mon.cloud-2.pid -c /etc/ceph/ceph.conf
cloud-2: root 27802 1  0 14:10 ?00:00:00 /usr/bin/python 
/usr/sbin/ceph-create-keys -i cloud-2
cloud-2: root 27824 27802  0 14:10 ?00:00:00 ceph --cluster=ceph 
--name=mon. --keyring=/var/lib/ceph/mon/ceph-cloud-2/keyring auth get-or-create 
client.admin mon allow * osd allow * mds allow
cloud-1: root 31797 1  0 14:10 ?00:00:00 /usr/bin/ceph-mon -i 
cloud-1 --pid-file /var/run/ceph/mon.cloud-1.pid -c /etc/ceph/ceph.conf
cloud-1: root 31815 1  0 14:10 ?00:00:00 /usr/bin/python 
/usr/sbin/ceph-create-keys -i cloud-1
cloud-1: root 31839 31815  0 14:10 ?00:00:00 ceph --cluster=ceph 
--name=mon. --keyring=/var/lib/ceph/mon/ceph-cloud-1/keyring auth get-or-create 
client.admin mon allow * osd allow * mds allow
cloud-0: root 19629  8546  0 Aug01 pts/000:00:00 su - ceph
cloud-0: ceph 19630 19629  0 Aug01 pts/000:00:00 -bash
cloud-0: root 24090 1  0 14:10 pts/000:00:00 /usr/bin/ceph-mon -i 
cloud-0 --pid-file /var/run/ceph/mon.cloud-0.pid -c /etc/ceph/ceph.conf
cloud-0: root 24106 1  0 14:10 pts/000:00:00 /usr/bin/python 
/usr/sbin/ceph-create-keys -i cloud-0
cloud-0: root 24143 24106  0 14:10 pts/000:00:00 ceph --cluster=ceph 
--name=mon. --keyring=/var/lib/ceph/mon/ceph-cloud-0/keyring auth get-or-create 
client.admin mon allow * osd allow * mds allow
cloud-0: ceph 24793 19630  0 14:25 pts/000:00:00 grep ceph

Regards,
Mathias
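[Editor's note: for context, `ceph-create-keys` essentially polls its local monitor until the mon reports it is in quorum, and only then runs the `auth get-or-create` seen in the process listings; if the monitors can never form quorum (e.g. because they bind to the wrong interface), it retries forever and hangs exactly as shown. A simplified model of that loop; the real tool's internals differ, and `wait_for_quorum`/`probe` are illustrative names:]

```python
import time

def wait_for_quorum(probe, attempts=5, delay=0.0):
    """Poll `probe()` until the monitor reports it is in quorum.

    `probe` stands in for asking the local mon for its status (as
    ceph-create-keys does); it returns a dict with a 'state' key.
    Returns True once the mon is leader or peon, False after giving up.
    """
    for _ in range(attempts):
        if probe().get("state") in ("leader", "peon"):
            return True  # in quorum: safe to run `auth get-or-create`
        time.sleep(delay)
    return False
```

A mon stuck in the `probing`/`electing` states never satisfies the check, which is what a hung `ceph-create-keys` looks like from the outside.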

On Aug 1, 2013, at 18:05 , Alfredo Deza  wrote:

> 
> 
> Hi Mathias,
> 
> Have you tried these steps while sticking to the slow interfaces? I would be 
> curious to see if this is just a problem of how those interfaces are able to 
> talk to each other.
> 
> 
>  
>> From: Mathias Lindberg 
>> Date: August 1, 2013, 4:01:38 MDT
>> To: "ceph-users@lists.ceph.com" 
>> Subject: [ceph-users] ceph-deploy ceph-create-keys hangs
>> 
>> Hi
>> 
>> Having previously had problems during startup with "creating keys" 
>> (otherwise a working setup) on one node when using mkcephfs, I have given 
>> ceph-deploy a try and get stuck on what feels like the same step.
>> Ceph version is 0.61.7 and the OS is CentOS 6.4.
>> 
>> The steps I have taken are:
>> #ceph-deploy new cloud-{0,1,2}-fast
>> #ceph-deploy --overwrite-conf mon create cloud-{0,1,2}-fast 
>> ceph-mon starts OK on all nodes, but ceph-create-keys seems to be stuck.
>> 
>> cloud-1-fast: root  6580 1  0 11:45 ?00:00:00 
>> /usr/bin/ceph-mon -i cloud-1 --pid-file /var/run/ceph/mon.cloud-1.pid -c 
>> /etc/ceph/ceph.conf
>> cloud-1-fast: root  6601 1  0 11:45 ?00:00:00 
>> /usr/bin/python /usr/sbin/ceph-create-keys -i cloud-1
>> cloud-2-fast: root 18724 1  0 11:45 ?00:00:00 
>> /usr/bin/ceph-mon -i cloud-2 --pid-file /var/run/ceph/mon.cloud-2.pid -c 
>> /etc/ceph/ceph.conf
>> cloud-2-fast: root 18747 1  0 11:45 ?00:00:00 
>> /usr/bin/python /usr/sbin/ceph-create-keys -i cloud-2
>> cloud-0-fast: root 19629  8546  0 11:44 pts/000:00:00 su - ceph
>> cloud-0-fast: ceph 19630 19629  0 11:44 pts/000:00:00 -bash
>> cloud-0-fast: root 19853 1  0 11:45 ?00:00:00 
>> /usr/bin/ceph-mon -i cloud-0 --pid-file /var/run/ceph/mon.cloud-0.pid -c 
>> /etc/ceph/ceph.conf
>> cloud-0-fast: root 19872 1  0 11:45 ?00:00:00 
>> /usr/bin/python /usr/sbin/ceph-create-keys -i cloud-0
>> cloud-0-fast: ceph 20282 19630  0 11:46 pts/000:00:00 grep ceph
>> 
>> I have the /var/log/ceph/ceph-mon.cloud-0.log with debug turned on at 
>> http://pastebin.com/DivE95mK 
>> And the output from a strace of the ceph-create-keys from one of the nodes 
>> at http://pastebin.com/JQak151Z
>> 
>> Worth mentioning is that each server has two interfaces: cloud-* (the "normal" 
>> interface) and cloud-*-fast (the 10GbE one I want to use); hostname resolves to 
>> cloud-*.
>> The small ceph.conf file created so far:
>> 
>> [global]
>> debug_ms = 1
>> filestore_xattr_use_omap = true
>> debug_monc = 20
>> mon_host = 10.12.1.160,10.12.1.161,10.12.1.162
>> osd_journal_size = 1024
>> debug_mon = 20
>> mon_initial_members = cloud-0-fast, cloud-1-fast, cloud-2-fast
>> auth_supported = cephx
>> fsid = 90578caa-3c63-4183-96c7-176467a98ddb
>> 
>> Regards,
>> Mathias
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] ceph-deploy ceph-create-keys hangs

2013-08-02 Thread Alfredo Deza
On Fri, Aug 2, 2013 at 9:17 AM, Mathias Lindberg wrote:

> Hi,
>
> I have tried this with almost the same result, there seems to be one more
> set of processes.
>
> cloud-2: root 27782 1  0 14:10 ?00:00:00 /usr/bin/ceph-mon
> -i cloud-2 --pid-file /var/run/ceph/mon.cloud-2.pid -c /etc/ceph/ceph.conf
> cloud-2: root 27802 1  0 14:10 ?00:00:00 /usr/bin/python
> /usr/sbin/ceph-create-keys -i cloud-2
> cloud-2: root 27824 27802  0 14:10 ?00:00:00 ceph
> --cluster=ceph --name=mon. --keyring=/var/lib/ceph/mon/ceph-cloud-2/keyring
> auth get-or-create client.admin mon allow * osd allow * mds allow
> cloud-1: root 31797 1  0 14:10 ?00:00:00 /usr/bin/ceph-mon
> -i cloud-1 --pid-file /var/run/ceph/mon.cloud-1.pid -c /etc/ceph/ceph.conf
> cloud-1: root 31815 1  0 14:10 ?00:00:00 /usr/bin/python
> /usr/sbin/ceph-create-keys -i cloud-1
> cloud-1: root 31839 31815  0 14:10 ?00:00:00 ceph
> --cluster=ceph --name=mon. --keyring=/var/lib/ceph/mon/ceph-cloud-1/keyring
> auth get-or-create client.admin mon allow * osd allow * mds allow
> cloud-0: root 19629  8546  0 Aug01 pts/000:00:00 su - ceph
> cloud-0: ceph 19630 19629  0 Aug01 pts/000:00:00 -bash
> cloud-0: root 24090 1  0 14:10 pts/000:00:00 /usr/bin/ceph-mon
> -i cloud-0 --pid-file /var/run/ceph/mon.cloud-0.pid -c /etc/ceph/ceph.conf
> cloud-0: root 24106 1  0 14:10 pts/000:00:00 /usr/bin/python
> /usr/sbin/ceph-create-keys -i cloud-0
> cloud-0: root 24143 24106  0 14:10 pts/000:00:00 ceph
> --cluster=ceph --name=mon. --keyring=/var/lib/ceph/mon/ceph-cloud-0/keyring
> auth get-or-create client.admin mon allow * osd allow * mds allow
> cloud-0: ceph 24793 19630  0 14:25 pts/000:00:00 grep ceph
>
> Regards,
> Mathias
>
> On Aug 1, 2013, at 18:05 , Alfredo Deza  wrote:
>
>
>
> Hi Mathias,
>
> Have you tried these steps while sticking to the slow interfaces? I would be
> curious to see if this is just a problem of how those interfaces are able
> to talk to each other.
>
>
>
>
>> From: Mathias Lindberg 
>> Date: August 1, 2013, 4:01:38 MDT
>> To: "ceph-users@lists.ceph.com" 
>> Subject: [ceph-users] ceph-deploy ceph-create-keys hangs
>>
>> Hi
>>
>> Having previously had problems during startup with "creating keys"
>> (otherwise a working setup) on one node when using mkcephfs, I have given
>> ceph-deploy a try and get stuck on what feels like the same step.
>> Ceph version is 0.61.7 and the OS is CentOS 6.4.
>>
>> The steps I have taken are:
>> #ceph-deploy new cloud-{0,1,2}-fast
>> #ceph-deploy --overwrite-conf mon create cloud-{0,1,2}-fast
>> ceph-mon starts OK on all nodes, but ceph-create-keys seems to be stuck.
>>
>> cloud-1-fast: root  6580 1  0 11:45 ?00:00:00
>> /usr/bin/ceph-mon -i cloud-1 --pid-file /var/run/ceph/mon.cloud-1.pid -c
>> /etc/ceph/ceph.conf
>> cloud-1-fast: root  6601 1  0 11:45 ?00:00:00
>> /usr/bin/python /usr/sbin/ceph-create-keys -i cloud-1
>> cloud-2-fast: root 18724 1  0 11:45 ?00:00:00
>> /usr/bin/ceph-mon -i cloud-2 --pid-file /var/run/ceph/mon.cloud-2.pid -c
>> /etc/ceph/ceph.conf
>> cloud-2-fast: root 18747 1  0 11:45 ?00:00:00
>> /usr/bin/python /usr/sbin/ceph-create-keys -i cloud-2
>> cloud-0-fast: root 19629  8546  0 11:44 pts/000:00:00 su - ceph
>> cloud-0-fast: ceph 19630 19629  0 11:44 pts/000:00:00 -bash
>> cloud-0-fast: root 19853 1  0 11:45 ?00:00:00
>> /usr/bin/ceph-mon -i cloud-0 --pid-file /var/run/ceph/mon.cloud-0.pid -c
>> /etc/ceph/ceph.conf
>> cloud-0-fast: root 19872 1  0 11:45 ?00:00:00
>> /usr/bin/python /usr/sbin/ceph-create-keys -i cloud-0
>> cloud-0-fast: ceph 20282 19630  0 11:46 pts/000:00:00 grep ceph
>>
>> I have the /var/log/ceph/ceph-mon.cloud-0.log with debug turned on at
>> http://pastebin.com/DivE95mK
>> And the output from a strace of the ceph-create-keys from one of the
>> nodes at http://pastebin.com/JQak151Z
>>
>> Worth mentioning is that each server has two interfaces: cloud-* (the "normal"
>> interface) and cloud-*-fast (the 10GbE one I want to use); hostname resolves
>> to cloud-*.
>> The small ceph.conf file created so far:
>>
>> [global]
>> debug_ms = 1
>> filestore_xattr_use_omap = true
>> debug_monc = 20
>> mon_host = 10.12.1.160,10.12.1.161,10.12.1.162
>> osd_journal_size = 1024
>> debug_mon = 20
>> mon_initial_members = cloud-0-fast, cloud-1-fast, cloud-2-fast
>> auth_supported = cephx
>> fsid = 90578caa-3c63-4183-96c7-176467a98ddb
>>
>> Regards,
>> Mathias
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] failed to create ceph monitor with ceph-deploy.

2013-08-02 Thread Alfredo Deza
On Fri, Aug 2, 2013 at 4:16 AM, Sean Cao  wrote:

> Hi everyone
>
>
> Creating a ceph monitor with ceph-deploy on the admin node failed; the
> errors are listed below.
>
> I recall that I never encountered this issue on prior versions, for
> example 0.61-2.
>
> Is it a bug in the current version?
>
>
> root@ubuntu1:/cluster# ceph-deploy mon create cephcluster2-0
>
> Traceback (most recent call last):
>
>   File "/usr/bin/ceph-deploy", line 21, in <module>
>
> main()
>
>   File "/usr/lib/pymodules/python2.7/ceph_deploy/cli.py", line 112, in main
> 
>
> return args.func(args)
>
>   File "/usr/lib/pymodules/python2.7/ceph_deploy/mon.py", line 234, in mon
> 
>
> mon_create(args)
>
>   File "/usr/lib/pymodules/python2.7/ceph_deploy/mon.py", line 138, in
> mon_create
>
> init=init,
>
>   File "/usr/lib/python2.7/dist-packages/pushy/protocol/proxy.py", line
> 255, in 
>
> (conn.operator(type_, self, args, kwargs))
>
>   File "/usr/lib/python2.7/dist-packages/pushy/protocol/connection.py",
> line 66, in operator
>
> return self.send_request(type_, (object, args, kwargs))
>
>   File
> "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line
> 323, in send_request
>
> return self.__handle(m)
>
>   File
> "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line
> 639, in __handle
>
> raise e
>
> pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory:
> '/var/lib/ceph/mon/ceph-cephcluster2-0'
>
>
> root@ubuntu1:/cluster# dpkg -l|grep ceph
>
> ii  ceph-deploy   1.0-1
> Ceph-deploy is an easy to use configuration tool
>
>
> root@ubuntu1:~# tail -f /cluster/ceph.log
>
> 2013-08-02 15:45:19,855 ceph_deploy.mon DEBUG Deploying mon, cluster ceph
> hosts cephcluster2-0
>
> 2013-08-02 15:45:19,856 ceph_deploy.mon DEBUG Deploying mon to
> cephcluster2-0
>
> 2013-08-02 15:45:20,887 ceph_deploy.mon DEBUG Distro Ubuntu codename
> precise, will use upstart
>
> 2013-08-02 15:46:52,733 ceph_deploy.mon DEBUG Deploying mon, cluster ceph
> hosts cephcluster2-0
>
> 2013-08-02 15:46:52,733 ceph_deploy.mon DEBUG Deploying mon to
> cephcluster2-0
>
> 2013-08-02 15:46:53,655 ceph_deploy.mon DEBUG Distro Ubuntu codename
> precise, will use upstart
>
>
> root@cephcluster2-0:/var/lib/ceph# dpkg -l|grep ceph
>
> ii  ceph
> 0.61.7-1precise distributed storage and
> file system
>
> ii  ceph-common   0.61.7-1precise
> common utilities to mount and interact with
> a ceph storage cluster
>
> ii  ceph-fs-common
> 0.61.7-1precise common utilities to
> mount and interact with a ceph file system
>
> ii  ceph-mds
> 0.61.7-1precise metadata server for the
> ceph distributed file system
>
>
> Sean Cao
>
> http://www.lecast.com.cn
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>

This looks like a bug to me; I just replicated the problem.

Right now we are in the process of migrating a few of ceph-deploy's remote
tasks into a structure that will let us know exactly what is happening on
the remote end, but this is not yet the case for `mon` actions.

I've created a ticket [0] to get that done and see if we can better tell
why `/var/lib/ceph/mon/{cluster-name}` is not being created.

[0] http://tracker.ceph.com/issues/5839
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] trouble authenticating after bootstrapping monitors

2013-08-02 Thread Kevin Weiler
I'm having some trouble bootstrapping my monitors using this page as a guide:

http://ceph.com/docs/next/dev/mon-bootstrap/

I can't seem to authenticate to my monitors with client.admin after I've 
created them and started them:


[root@camelot ~]# cat /etc/ceph/ceph.keyring
[mon.]
key = AQD6yftRkKY3NxAA5VNbtUM23C3uPqUUXYSHeQ==
[client.admin]
key = AQANyvtRYDHCCxAAwgcgdMJ9ue64m6+enYONOw==


[root@camelot ~]# monmaptool --create --add camelot 10.198.1.3:6789 monmap
monmaptool: monmap file monmap
monmaptool: generated fsid 87a5f355-f7be-43aa-b26c-b6ad23f371bb
monmaptool: writing epoch 0 to monmap (1 monitors)


[root@camelot ~]# ceph-mon --mkfs -i camelot --monmap monmap --keyring 
/etc/ceph/ceph.keyring
ceph-mon: created monfs at /srv/mon.camelot for mon.camelot


[root@camelot ~]# service ceph start
=== mon.camelot ===
Starting Ceph mon.camelot on camelot...
=== mds.camelot ===
Starting Ceph mds.camelot on camelot...
starting mds.camelot at :/0


[root@camelot ~]# ceph auth get mon.
access denied

If someone could tell me what I'm doing wrong, it would be greatly appreciated. 
Thanks!
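[Editor's note: the keyring shown above is plain INI-style text, so it can be sanity-checked before bootstrapping. A sketch using Python's stdlib parser; real deployments would use `ceph-authtool` for this, so treat it purely as an illustration:]

```python
import configparser  # the keyring format is INI-style (Python 3 stdlib)

def keyring_keys(text):
    """Map each entity in a ceph keyring (e.g. 'mon.', 'client.admin') to its key."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    return {section: cp[section]["key"] for section in cp.sections()}
```

If `mon.` or `client.admin` is missing from the result, `ceph auth` failures like the `access denied` above are the expected outcome.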

--
Kevin Weiler
IT

IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | 
http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: 
kevin.wei...@imc-chicago.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy progress and CDS session

2013-08-02 Thread Sage Weil
There is a session at CDS scheduled to discuss ceph-deploy (4:40pm PDT on 
Monday).  We'll be going over what we currently have in backlog for 
improvements, but if you have any opinions about what else ceph-deploy 
should or should not do or areas where it is problematic, please reply to 
this thread to let us know what you think, and/or join the CDS discussion 
hangout.

For those who haven't noticed, we now have a full-time developer,
Alfredo Deza, who is working on ceph-deploy.  He's been making huge
progress over the last couple of weeks improving error reporting,
visibility into what ceph-deploy is doing, and fixing various bugs.  We
have a long list of things we want to do with the tool, but any feedback
from users is helpful to make sure we're working on the right things 
first!

sage


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy progress and CDS session

2013-08-02 Thread Dewan Shamsul Alam
There should be a like button for emails too, so that I can like Sage's
update on ceph-deploy. :)



On Sat, Aug 3, 2013 at 12:02 AM, Sage Weil  wrote:

> There is a session at CDS scheduled to discuss ceph-deploy (4:40pm PDT on
> Monday).  We'll be going over what we currently have in backlog for
> improvements, but if you have any opinions about what else ceph-deploy
> should or should not do or areas where it is problematic, please reply to
> this thread to let us know what you think, and/or join the CDS discussion
> hangout.
>
> For those who haven't noticed, we now have a full-time developer,
> Alfredo Deza, who is working on ceph-deploy.  He's been making huge
> progress over the last couple of weeks improving error reporting,
> visibility into what ceph-deploy is doing, and fixing various bugs.  We
> have a long list of things we want to do with the tool, but any feedback
> from users is helpful to make sure we're working on the right things
> first!
>
> sage
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy progress and CDS session

2013-08-02 Thread Eric Eastman

Hi,

First, I would like to state that with all its limitations, I have managed
to build multiple clusters with ceph-deploy; without it, I would have been
totally lost. Things that I feel would improve it include:

A debug mode where it lists everything it is doing. This will be helpful
in the future when I move to a more integrated tool than ceph-deploy, as I
could see exactly how ceph-deploy built my test cluster.

Understanding more types of Linux storage devices. I have spent hours
trying to make it understand multipath devices, as I happen to have a
large number of these in my lab, but so far I have not made it work.

Really good documentation on all the ceph-deploy options.

Lastly, this is not just a ceph-deploy thing, but documentation explaining
how things boot up and interact. ceph-deploy depends on tools like
ceph-disk to mount OSD disks on the servers during boot, and I learned the
hard way that if an OSD is on a LUN that is seen by more than one OSD
node, you can corrupt data, as each OSD node tries to mount all the OSDs
it can find.

> There is a session at CDS scheduled to discuss ceph-deploy (4:40pm PDT on
> Monday).  We'll be going over what we currently have in backlog for
> improvements, but if you have any opinions about what else ceph-deploy
> should or should not do or areas where it is problematic, please reply to
> this thread to let us know what you think, and/or join the CDS discussion
> hangout.



Thanks,
Eric


Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process

2013-08-02 Thread Mike Dawson

Oliver,

We've had a similar situation occur. For about three months, we've run 
several Windows 2008 R2 guests with virtio drivers that record video 
surveillance. We have long suffered an issue where the guest appears to 
hang indefinitely (or until we intervene). For the sake of this 
conversation, we call this state "wedged", because it appears something 
(rbd, qemu, virtio, etc) gets stuck on a deadlock. When a guest gets 
wedged, we see the following:


- the guest will not respond to pings
- the qemu-system-x86_64 process drops to 0% cpu
- graphite graphs show the interface traffic dropping to 0bps
- the guest will stay wedged forever (or until we intervene)
- strace of qemu-system-x86_64 shows QEMU is making progress [1][2]
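Those signals can be combined into a trivial predicate for monitoring; a toy sketch (the function name and the exact zero thresholds are my own, not from our tooling):

```shell
# looks_wedged PING_OK CPU_PCT IFACE_BPS: succeeds when all symptoms line up
looks_wedged() {
    [ "$1" -eq 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ]
}

looks_wedged 0 0 0     && echo "wedged"
looks_wedged 1 40 9600 || echo "healthy"
```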

We can "un-wedge" the guest by opening a NoVNC session or running a 
'virsh screenshot' command. After that, the guest resumes and runs as 
expected. At that point we can examine the guest. Each time we'll see:


- No Windows error logs whatsoever while the guest is wedged
- A time sync typically occurs right after the guest gets un-wedged
- Scheduled tasks do not run while wedged
- Windows error logs do not show any evidence of suspend, sleep, etc

We had so many issues with guests becoming wedged that we wrote a script to 
'virsh screenshot' them via cron. Then we installed some updates and had 
a month or so of higher stability (wedging happened maybe 1/10th as 
often). Until today we couldn't figure out why.
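That workaround can be sketched as a small script driven by cron (the paths and the idea of running it once a minute are assumptions):

```shell
#!/bin/sh
# For each running libvirt domain, take a screenshot; the display
# activity from 'virsh screenshot' is what un-wedges a stuck guest.
for dom in $(virsh list --name); do
    virsh screenshot "$dom" "/tmp/$dom.ppm" >/dev/null 2>&1
done
```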


Yesterday, I realized qemu was starting the instances without specifying 
cache=writeback. We corrected that, and let them run overnight. With RBD 
writeback re-enabled, wedging came back as often as we had seen in the 
past. I've counted ~40 occurrences in the past 12-hour period. So I feel 
like writeback caching in RBD certainly makes the deadlock more likely 
to occur.
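For context, the cache mode in question lives on the guest's disk definition; a libvirt RBD disk stanza with writeback enabled looks roughly like this (pool/image name and monitor address are placeholders):

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <source protocol='rbd' name='rbd/guest-disk'>
    <host name='192.168.0.1' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```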


Joshd asked us to gather RBD client logs:

"joshd> it could very well be the writeback cache not doing a callback 
at some point - if you could gather logs of a vm getting stuck with 
debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be 
great"
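Those debug settings go in the [client] section of ceph.conf on the hypervisor; a sketch (the log file path is an assumption):

```ini
[client]
    debug rbd = 20
    debug ms = 1
    debug objectcacher = 30
    log file = /var/log/ceph/client.$pid.log
```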


We'll do that over the weekend. If you could as well, we'd love the help!

[1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
[2] http://www.gammacode.com/kvm/not-wedged.txt

Thanks,

Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/2/2013 6:22 AM, Oliver Francke wrote:

Well,

I believe, I'm the winner of buzzwords-bingo for today.

But seriously speaking... as I don't have this particular problem with
qcow2 with kernel 3.2 nor qemu-1.2.2 nor newer kernels, I hope I'm not
alone here?
We have a rising number of tickets from people reinstalling from ISOs
with the 3.2 kernel.

A fast fallback is to start all VMs with qemu-1.2.2, but we then lose
some features like the latency-free RBD cache ;)

I just opened a bug for qemu per:

https://bugs.launchpad.net/qemu/+bug/1207686

with all dirty details.

Installing a backport kernel (3.9.x) or upgrading the Ubuntu kernel to 3.8.x
"fixes" it. So we have a bad combination for all distros with a 3.2 kernel
and rbd as the storage backend, I assume.

Any similar findings?
Any idea of tracing/debugging ( Josh? ;) ) very welcome,

Oliver.




Re: [ceph-users] Issues going from 1 to 3 mons

2013-08-02 Thread Mikaël Cluseau

Hi Nelson,

On 07/31/13 18:11, Jeppesen, Nelson wrote:


ceph mon add 10.198.141.203:6789



was the monmap modified after the mon add ?

I had a problem with bobtail in my lab, going from 1 to 2 mons and back,
because of quorum loss; maybe it's the same. I had to get the monmap from
the mon filesystem, add the host, and inject it back manually to make it
work.
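That manual fix looks roughly like this (the mon id "a", the new mon's name, and the temp path are assumptions; the monitor must be stopped before extracting):

```shell
service ceph stop mon.a
ceph-mon -i a --extract-monmap /tmp/monmap            # pull the map from the mon's store
monmaptool --add b 10.198.141.203:6789 /tmp/monmap    # add the new monitor
ceph-mon -i a --inject-monmap /tmp/monmap             # write the edited map back
service ceph start mon.a
```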