[ceph-users] get/put files with radosgw once MDS crash

2014-10-24 Thread 廖建锋
Dear cephers,
 Today I use the MDS to put/get files from the Ceph storage cluster, as it is
very easy to use for every side of a company.
But the Ceph MDS is not very stable, so my question:
is it possible to get the file names and contents from the OSDs with radosgw
once the MDS crashes, and how?



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Continuous OSD crash with kv backend (firefly)

2014-10-24 Thread Andrey Korolyov
Hi,

during recovery testing on the latest firefly with the leveldb backend we
found that the OSDs on a given host may all crash at once, leaving the
attached backtrace. Otherwise, recovery goes more or less smoothly
for hours.

The timestamps show how the issue is correlated between different
processes on the same node:

core.ceph-osd.25426.node01.1414148261
core.ceph-osd.25734.node01.1414148263
core.ceph-osd.25566.node01.1414148345

The question is about the state of the kv backend in Firefly - is it considered
stable enough to run a production test against, or should we rather
move to giant/master for this?

Thanks!
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/bin/ceph-osd...Reading symbols from 
/usr/lib/debug/usr/bin/ceph-osd...done.
done.
[New LWP 10182]
[New LWP 10183]
[New LWP 10699]
[New LWP 10184]
[New LWP 10703]
[New LWP 10704]
[New LWP 10702]
[New LWP 10708]
[New LWP 10707]
[New LWP 10710]
[New LWP 10700]
[New LWP 10717]
[New LWP 10765]
[New LWP 10705]
[New LWP 10706]
[New LWP 10701]
[New LWP 10712]
[New LWP 10735]
[New LWP 10713]
[New LWP 10750]
[New LWP 10718]
[New LWP 10711]
[New LWP 10716]
[New LWP 10715]
[New LWP 10785]
[New LWP 10766]
[New LWP 10796]
[New LWP 10720]
[New LWP 10725]
[New LWP 10736]
[New LWP 10709]
[New LWP 10730]
[New LWP 11541]
[New LWP 10770]
[New LWP 11573]
[New LWP 10778]
[New LWP 10804]
[New LWP 11561]
[New LWP 9388]
[New LWP 9398]
[New LWP 11538]
[New LWP 10790]
[New LWP 11586]
[New LWP 10798]
[New LWP 9910]
[New LWP 10726]
[New LWP 21823]
[New LWP 10815]
[New LWP 9397]
[New LWP 11248]
[New LWP 10723]
[New LWP 11253]
[New LWP 10728]
[New LWP 10791]
[New LWP 9389]
[New LWP 10724]
[New LWP 10780]
[New LWP 11287]
[New LWP 11592]
[New LWP 10816]
[New LWP 10812]
[New LWP 10787]
[New LWP 20622]
[New LWP 21822]
[New LWP 10751]
[New LWP 10768]
[New LWP 10767]
[New LWP 11874]
[New LWP 10733]
[New LWP 10811]
[New LWP 11574]
[New LWP 11873]
[New LWP 10771]
[New LWP 11551]
[New LWP 10799]
[New LWP 10729]
[New LWP 18254]
[New LWP 10792]
[New LWP 10803]
[New LWP 9912]
[New LWP 11293]
[New LWP 20623]
[New LWP 14805]
[New LWP 10773]
[New LWP 11298]
[New LWP 11872]
[New LWP 10763]
[New LWP 10783]
[New LWP 10769]
[New LWP 11300]
[New LWP 10777]
[New LWP 10764]
[New LWP 10802]
[New LWP 10749]
[New LWP 14806]
[New LWP 10806]
[New LWP 10805]
[New LWP 18255]
[New LWP 10181]
[New LWP 11277]
[New LWP 9913]
[New LWP 10800]
[New LWP 10801]
[New LWP 11555]
[New LWP 11871]
[New LWP 10748]
[New LWP 9915]
[New LWP 10779]
[New LWP 11294]
[New LWP 9916]
[New LWP 10757]
[New LWP 10734]
[New LWP 10786]
[New LWP 10727]
[New LWP 19063]
[New LWP 11279]
[New LWP 9905]
[New LWP 9911]
[New LWP 10772]
[New LWP 10722]
[New LWP 9914]
[New LWP 10789]
[New LWP 11540]
[New LWP 9917]
[New LWP 11289]
[New LWP 10714]
[New LWP 10721]
[New LWP 10719]
[New LWP 10788]
[New LWP 10782]
[New LWP 10784]
[New LWP 10776]
[New LWP 10774]
[New LWP 10737]
[New LWP 19064]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/ceph-osd -i 1 --pid-file 
/var/run/ceph/osd.1.pid -c /etc/ceph/ceph.con'.
Program terminated with signal 6, Aborted.
#0  0x7ff9ad91eb7b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) 
Thread 135 (Thread 0x7ff99a492700 (LWP 19064)):
#0  0x7ff9ad91ad84 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00c496da in Wait (mutex=..., this=0x108cd110) at 
./common/Cond.h:55
#2  Pipe::writer (this=0x108ccf00) at msg/Pipe.cc:1730
#3  0x00c5485d in Pipe::Writer::entry (this=) at 
msg/Pipe.h:61
#4  0x7ff9ad916e9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#5  0x7ff9ac4a43dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x in ?? ()

Thread 134 (Thread 0x7ff975e1d700 (LWP 10737)):
#0  0x7ff9ac498a13 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00c3e73c in Pipe::tcp_read_wait (this=this@entry=0x4a53180) at 
msg/Pipe.cc:2282
#2  0x00c3e9d0 in Pipe::tcp_read (this=this@entry=0x4a53180, 
buf=, buf@entry=0x7ff975e1cccf "\377", len=len@entry=1)
at msg/Pipe.cc:2255
#3  0x00c5095f in Pipe::reader (this=0x4a53180) at msg/Pipe.cc:1421
#4  0x00c5497d in Pipe::Reader::entry (this=) at 
msg/Pipe.h:49
#5  0x7ff9ad916e9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#6  0x7ff9ac4a43dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x in ?? ()

Thread 133 (Thread 0x7ff972dda700 (LWP 10774)):
#0  0x7ff9ac49

Re: [ceph-users] Continuous OSD crash with kv backend (firefly)

2014-10-24 Thread Haomai Wang
The kvstore backend is not stable in Firefly. But for the master branch, there
should be no existing/known bugs.
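
For reference, this is roughly how the experimental backend gets selected in
ceph.conf (the exact option value has varied between releases, so treat it as
an assumption):

  [osd]
      # experimental key/value (leveldb) backend instead of the default FileStore
      osd objectstore = keyvaluestore-dev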

On Fri, Oct 24, 2014 at 7:41 PM, Andrey Korolyov  wrote:
> Hi,
>
> during recovery testing on a latest firefly with leveldb backend we
> found that the OSDs on a selected host may crash at once, leaving
> attached backtrace. In other ways, recovery goes more or less smoothly
> for hours.
>
> Timestamps shows how the issue is correlated between different
> processes on same node:
>
> core.ceph-osd.25426.node01.1414148261
> core.ceph-osd.25734.node01.1414148263
> core.ceph-osd.25566.node01.1414148345
>
> The question is about kv backend state in Firefly - is it considered
> stable enough to run production test against it or we should better
> move to giant/master for this?
>
> Thanks!
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best Regards,

Wheat
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-24 Thread Jasper Siero
Hello Greg and John,

I used the patch on the ceph cluster and tried it again:
 /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf --cluster ceph 
--undump-journal 0 journaldumptgho-mon001
undump journaldumptgho-mon001
start 9483323613 len 134213311
writing header 200.
writing 9483323613~1048576
writing 9484372189~1048576


writing 9614395613~1048576
writing 9615444189~1048576
writing 9616492765~1044159
done.

It went well without errors, and after that I restarted the mds.
The status went from up:replay to up:reconnect to up:rejoin (lagged or crashed).

In the log there is an error about trim_to > trimming_pos, and it's like Greg
mentioned: maybe the dump file needs to be truncated to the proper length
before resetting and undumping again.

How can I truncate the dumped file to the correct length?

The mds log during the undumping and starting the mds:
http://pastebin.com/y14pSvM0
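
For what it's worth, coreutils can do the truncation itself; what I'm unsure
about is the right byte count, so the size below (start + len as reported by
the dump, 9483323613 + 134213311 = 9617536924) is only a guess:

  # truncate the dump file to an exact byte length (target size is a guess)
  truncate -s 9617536924 journaldumptgho-mon001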

Kind Regards,

Jasper

Van: john.sp...@inktank.com [john.sp...@inktank.com] namens John Spray 
[john.sp...@redhat.com]
Verzonden: donderdag 16 oktober 2014 12:23
Aan: Jasper Siero
CC: Gregory Farnum; ceph-users
Onderwerp: Re: [ceph-users] mds isn't working anymore after osd's running full

Following up: firefly fix for undump is: https://github.com/ceph/ceph/pull/2734

Jasper: if you still need to try undumping on this existing firefly
cluster, then you can download ceph-mds packages from this
wip-firefly-undump branch from
http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/

Cheers,
John

On Wed, Oct 15, 2014 at 8:15 PM, John Spray  wrote:
> Sadly undump has been broken for quite some time (it was fixed in
> giant as part of creating cephfs-journal-tool).  If there's a one line
> fix for this then it's probably worth putting in firefly since it's a
> long term supported branch -- I'll do that now.
>
> John
>
> On Wed, Oct 15, 2014 at 8:23 AM, Jasper Siero
>  wrote:
>> Hello Greg,
>>
>> The dump and reset of the journal was succesful:
>>
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
>> /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
>> --dump-journal 0 journaldumptgho-mon001
>> journal is 9483323613~134215459
>> read 134213311 bytes at offset 9483323613
>> wrote 134213311 bytes at offset 9483323613 to journaldumptgho-mon001
>> NOTE: this is a _sparse_ file; you can
>> $ tar cSzf journaldumptgho-mon001.tgz journaldumptgho-mon001
>>   to efficiently compress it while preserving sparseness.
>>
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
>> /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
>> --reset-journal 0
>> old journal was 9483323613~134215459
>> new journal start will be 9621733376 (4194304 bytes past old end)
>> writing journal head
>> writing EResetJournal entry
>> done
>>
>>
>> Undumping the journal was not successful; looking into the error,
>> "client_lock.is_locked()" shows up several times. The mds is not running 
>> when I start the undumping, so maybe I have forgotten something?
>>
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
>> /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
>> --undump-journal 0 journaldumptgho-mon001
>> undump journaldumptgho-mon001
>> start 9483323613 len 134213311
>> writing header 200.
>> osdc/Objecter.cc: In function 'ceph_tid_t 
>> Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 2014-10-15 
>> 09:09:32.020287
>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>  1: /usr/bin/ceph-mds() [0x80f15e]
>>  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>  3: (main()+0x1632) [0x569c62]
>>  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>  5: /usr/bin/ceph-mds() [0x567d99]
>>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
>> interpret this.
>> 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In function 
>> 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 
>> 2014-10-15 09:09:32.020287
>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>
>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>  1: /usr/bin/ceph-mds() [0x80f15e]
>>  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>  3: (main()+0x1632) [0x569c62]
>>  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>  5: /usr/bin/ceph-mds() [0x567d99]
>>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
>> interpret this.
>>
>>  0> 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In 
>> function 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 
>> time 2014-10-15 09:09:32.020287
>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>
>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --p8a65c2c0feba6)
>>  1: /usr/bin/ceph-mds() [

[ceph-users] Object Storage Statistics

2014-10-24 Thread Dane Elwell
Hi list,

We're using the object storage in production and billing people based
on their usage, much like S3. We're also trying to produce things like
hourly bandwidth graphs for our clients.

We're having some issues with the API not returning the correct
statistics. I can see that there is a --sync-stats option for the
command line radosgw-admin, but there doesn't appear to be anything
similar for the admin REST API. Is there an equivalent feature for the
API that hasn't been documented by chance?
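
For context, the CLI path we fall back to today looks roughly like this (the
uid is illustrative):

  # refresh and show the per-user stats via the command line tool
  radosgw-admin user stats --uid=exampleuser --sync-stats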

Thanks

Dane
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] librados crash in nova-compute

2014-10-24 Thread Xu (Simon) Chen
Hey folks,

I am trying to enable OpenStack to use RBD as image backend:
https://bugs.launchpad.net/nova/+bug/1226351

For some reason, nova-compute segfaults due to librados crash:

./log/SubsystemMap.h: In function 'bool
ceph::log::SubsystemMap::should_gather(unsigned
int, int)' thread 7f1b477fe700 time 2014-10-24 03:20:17.382769
./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
1: (()+0x42785) [0x7f1b4c4db785]
2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
4: (()+0x6b50) [0x7f1b6ea93b50]
5: (clone()+0x6d) [0x7f1b6df3e0ed]
NOTE: a copy of the executable, or `objdump -rdS ` is needed to
interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

I feel that there is some concurrency issue, since this sometimes happens
before and sometimes after this line:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/rbd_utils.py#L208

Any idea what are the potential causes of the crash?

Thanks.
-Simon
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread HURTEVENT VINCENT
Hello,

I was running a multi-mon (3) Ceph cluster and, during a migration, I reinstalled 
2 of the 3 monitor nodes without properly removing them from the cluster.

So there is only one monitor left, which is stuck in the probing phase, and the 
cluster is down.

As I can only connect to the mon admin socket, I don't know if it's possible to add a 
monitor, or to get and edit the monmap.

This cluster is running Ceph version 0.67.1.

Is there a way to force my last monitor into a leader state, or to rebuild a lost 
monitor so it can pass the probe and election phases?

Thank you,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-24 Thread Sergey Nazarov
Any update?

On Tue, Oct 21, 2014 at 3:32 PM, Sergey Nazarov  wrote:
> Ouch, I think client log is missing.
> Here it goes:
> https://www.dropbox.com/s/650mjim2ldusr66/ceph-client.admin.log.gz?dl=0
>
> On Tue, Oct 21, 2014 at 3:22 PM, Sergey Nazarov  wrote:
>> I enabled logging and performed same tests.
>> Here is the link on archive with logs, they are only from one node
>> (from the node where active MDS was sitting):
>> https://www.dropbox.com/s/80axovtoofesx5e/logs.tar.gz?dl=0
>>
>> Rados bench results:
>>
>> # rados bench -p test 10 write
>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 10
>> seconds or 0 objects
>>  Object prefix: benchmark_data_atl-fs11_4630
>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>      0       0         0         0         0         0         -         0
>>      1      16        46        30   119.967       120  0.201327  0.348463
>>      2      16        88        72   143.969       168  0.132983  0.353677
>>      3      16       124       108   143.972       144  0.930837  0.383018
>>      4      16       155       139   138.976       124  0.899468  0.426396
>>      5      16       203       187   149.575       192  0.236534  0.400806
>>      6      16       243       227   151.309       160  0.835213  0.397673
>>      7      16       276       260   148.549       132  0.905989  0.406849
>>      8      16       306       290   144.978       120  0.353279  0.422106
>>      9      16       335       319   141.757       116   1.12114  0.428268
>>     10      16       376       360    143.98       164  0.418921   0.43351
>>     11      16       377       361   131.254         4  0.499769  0.433693
>>  Total time run: 11.206306
>> Total writes made:  377
>> Write size: 4194304
>> Bandwidth (MB/sec): 134.567
>>
>> Stddev Bandwidth:   60.0232
>> Max bandwidth (MB/sec): 192
>> Min bandwidth (MB/sec): 0
>> Average Latency:0.474923
>> Stddev Latency: 0.376038
>> Max latency:1.82171
>> Min latency:0.060877
>>
>>
>> # rados bench -p test 10 seq
>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>      0       0         0         0         0         0         -         0
>>      1      16        61        45   179.957       180  0.010405   0.25243
>>      2      16       109        93   185.962       192  0.908263  0.284303
>>      3      16       151       135   179.965       168  0.255312  0.297283
>>      4      16       191       175    174.97       160  0.836727  0.330659
>>      5      16       236       220   175.971       180  0.009995  0.330832
>>      6      16       275       259   172.639       156   1.06855  0.345418
>>      7      16       311       295   168.545       144  0.907648  0.361689
>>      8      16       351       335   167.474       160  0.947688  0.363552
>>      9      16       390       374   166.196       156  0.140539  0.369057
>>  Total time run:        9.755367
>> Total reads made:     401
>> Read size:            4194304
>> Bandwidth (MB/sec):    164.422
>>
>> Average Latency:   0.387705
>> Max latency:   1.33852
>> Min latency:   0.008064
>>
>> # rados bench -p test 10 rand
>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>      0       0         0         0         0         0         -         0
>>      1      16        55        39   155.938       156  0.773716  0.257267
>>      2      16        93        77   153.957       152  0.006573  0.339199
>>      3      16       135       119   158.629       168  0.009851  0.359675
>>      4      16       171       155   154.967       144  0.892027  0.359015
>>      5      16       209       193   154.369       152   1.13945  0.378618
>>      6      16       256       240    159.97       188  0.009965  0.368439
>>      7      16       295       279     159.4       156  0.195812  0.371259
>>      8      16       343       327   163.472       192  0.880587  0.370759
>>      9      16       380       364    161.75       148  0.113111  0.377983
>>     10      16       424       408   163.173       176  0.772274  0.379497
>>  Total time run:       10.518482
>> Total reads made:     425
>> Read size:            4194304
>> Bandwidth (MB/sec):    161.620
>>
>> Average Latency:   0.393978
>> Max latency:   1.36572
>> Min latency:   0.006448
>>
>> On Tue, Oct 21, 2014 at 2:03 PM, Gregory Farnum  wrote:
>>> Can you enable debugging on the client ("debug ms = 1", "debug client
>>> = 20") and mds ("debug ms = 1", "debug mds = 20"), run this test
>>> again, and post them somewhere for me to look at?
>>>
>>> While you're at it, can you try rados bench and see what sort of
>>> results you get?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Tue, Oct 21, 2014 at 10:57 AM, Sergey Nazarov  
>>> wrote:
 It is CephFS mounted via ceph-fuse.
 I am getting the same

Re: [ceph-users] RGW Federated Gateways and Apache 2.4 problems

2014-10-24 Thread Yehuda Sadeh
On Thu, Oct 23, 2014 at 3:51 PM, Craig Lewis  wrote:
> I'm having a problem getting RadosGW replication to work after upgrading to
> Apache 2.4 on my primary test cluster.  Upgrading the secondary cluster to
> Apache 2.4 doesn't cause any problems. Both Ceph's apache packages and
> Ubuntu's packages cause the same problem.
>
> I'm pretty sure I'm missing something obvious, but I'm not seeing it.
>
> Has anybody else upgraded their federated gateways to apache 2.4?
>
>
>
> My setup
> 2 VMs, each running their own ceph cluster with replication=1
> test0-ceph.cdlocal is the primary zone, named us-west
> test1-ceph.cdlocal is the secondary zone, named us-central
> Before I start, replication works, and I'm running
>
> Ubuntu 14.04 LTS
> Emperor (0.72.2-1precise, retained using apt-hold)
> Apache 2.2 (2.2.22-2precise.ceph, retained using apt-hold)
>
>
> As soon as I upgrade Apache to 2.4 in the primary cluster, replication gets
> permission errors.  radosgw-agent.log:
> 2014-10-23T15:13:43.022 31106:ERROR:radosgw_agent.worker:failed to sync
> object bucket3/test6.jpg: state is error
>
> The access logs from the primary say (using vhost_combined log format):
> test0-ceph.cdlocal:80 172.16.205.1 - - [23/Oct/2014:15:16:51 -0700] "PUT
> /test6.jpg HTTP/1.1" 200 209 "-" "-"- - - [23/Oct/2014:13:24:18 -0700] "GET
> /?delimiter=/ HTTP/1.1" 200 1254 "-" "-" "bucket3.test0-ceph.cdlocal"
> 
> test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] "GET
> /admin/log?marker=089.89.3&type=bucket-index&bucket-instance=bucket3%3Aus-west.5697.2&max-entries=1000
> HTTP/1.1" 200 398 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] "GET
> /bucket3/test6.jpg?rgwx-uid=us-central&rgwx-region=us&rgwx-prepend-metadata=us
> HTTP/1.1" 403 249 "-" "-"
>
> 172.16.205.143 is the primary cluster, .144 is the secondary cluster, and .1
> is my workstation.
>
>
> The access logs on the secondary show:
> test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] "GET
> /admin/replica_log?bounds&type=bucket-index&bucket-instance=bucket3%3Aus-west.5697.2
> HTTP/1.1" 200 643 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] "PUT
> /bucket3/test6.jpg?rgwx-op-id=test1-ceph0.cdlocal%3A6484%3A3&rgwx-source-zone=us-west&rgwx-client-id=radosgw-agent
> HTTP/1.1" 403 286 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] "GET
> /admin/opstate?client-id=radosgw-agent&object=bucket3%2Ftest6.jpg&op-id=test1-ceph0.cdlocal%3A6484%3A3
> HTTP/1.1" 200 355 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
>
> If I crank up radosgw debugging, it tells me that the calculated digest is
> correct for the /admin/* requests, but fails for the object GET:
> /admin/log
> 2014-10-23 15:44:29.257688 7fa6fcfb9700 15 calculated
> digest=6Tt13P6naWJEc0mJmYyDj6NzBS8=
> 2014-10-23 15:44:29.257690 7fa6fcfb9700 15
> auth_sign=6Tt13P6naWJEc0mJmYyDj6NzBS8=
> /bucket3/test6.jpg
> 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
> digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
> 2014-10-23 15:44:29.257691 7fa6fcfb9700 15 compare=0
> 2014-10-23 15:44:29.257693 7fa6fcfb9700 20 system request
> 
> /bucket3/test6.jpg
> 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
> digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
> 2014-10-23 15:44:29.411573 7fa6fc7b8700 15
> auth_sign=Gv398QNc6gLig9/0QbdO+1UZUq0=
> 2014-10-23 15:44:29.411574 7fa6fc7b8700 15 compare=-41
> 2014-10-23 15:44:29.411577 7fa6fc7b8700 10 failed to authorize request
>
> That explains the 403 responses.
>
> So I have metadata replication working, but the data replication is failing
> with permission problems.  I verified that I can create users and buckets in
> the primary, and have them replicate to the secondary.
>
>
> A similar situation was posted to the list before.  That time, the problem
> was that the system users weren't correctly deployed to both the primary and
> secondary clusters.  I verified that both users exist in both clusters, with
> the same access and secret.
>
> Just to test, I used s3cmd.  I can read and write to both clusters using
> both system user's credentials.
>
>
> Anybody have any ideas?
>

You're hitting issue #9206. Apache 2.4 filters out certain http
headers because they use underscores instead of dashes. There's a fix
for that for firefly, although it hasn't made it to an officially
released version.

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread Dan van der Ster
Hi,

October 24 2014 5:28 PM, "HURTEVENT VINCENT"  
wrote: 
> Hello,
> 
> I was running a multi mon (3) Ceph cluster and in a migration move, I 
> reinstall 2 of the 3 monitors
> nodes without deleting them properly into the cluster.
> 
> So, there is only one monitor left which is stuck in probing phase and the 
> cluster is down.
> 
> As I can only connect to mon socket, I don't how if it's possible to add a 
> monitor, get and edit
> monmap.
> 
> This cluster is running Ceph version 0.67.1.
> 
> Is there a way to force my last monitor into a leader state or re build a 
> lost monitor to pass the
> probe and election phases ?

Did you already try to remake one of the lost monitors? Assuming your ceph.conf 
has the addresses of the mons, and the keyrings are in place, maybe this will 
work:

ceph-mon --mkfs -i  

then start the process?

I've never been in this situation before, so I don't know if it will work.

Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread Loic Dachary
Bonjour,

Maybe http://ceph.com/docs/giant/rados/troubleshooting/troubleshooting-mon/ can 
help? Joao wrote that a few months ago and it covers a number of scenarios.
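
In particular, the monmap-editing path described there may apply; a rough
sketch, assuming the surviving monitor's id is mon01 (purely illustrative)
and that its daemon is stopped first:

  # extract the monmap from the surviving monitor and inspect it
  ceph-mon -i mon01 --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap
  # drop the two reinstalled monitors, then inject the edited map back
  monmaptool /tmp/monmap --rm mon02
  monmaptool /tmp/monmap --rm mon03
  ceph-mon -i mon01 --inject-monmap /tmp/monmap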

Cheers

On 24/10/2014 08:27, HURTEVENT VINCENT wrote:
> Hello,
> 
> I was running a multi mon (3) Ceph cluster and in a migration move, I 
> reinstall 2 of the 3 monitors nodes without deleting them properly into the 
> cluster.
> 
> So, there is only one monitor left which is stuck in probing phase and the 
> cluster is down.
> 
> As I can only connect to mon socket, I don't how if it's possible to add a 
> monitor, get and edit monmap.
> 
> This cluster is running Ceph version 0.67.1.
> 
> Is there a way to force my last monitor into a leader state or re build a 
> lost monitor to pass the probe and election phases ?
> 
> Thank you,
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-24 Thread Yan, Zheng
On Fri, Oct 24, 2014 at 8:47 AM, Sergey Nazarov  wrote:
> Any update?
>

The short answer is that when the command is executed for the second time,
the MDS needs to truncate the file to zero length. The speed of truncating
a file is limited by the OSD speed. (Creating a file and writing data to
the file are async operations, but truncating a file is a sync
operation.)

Regards
Yan, Zheng


> On Tue, Oct 21, 2014 at 3:32 PM, Sergey Nazarov  wrote:
>> Ouch, I think client log is missing.
>> Here it goes:
>> https://www.dropbox.com/s/650mjim2ldusr66/ceph-client.admin.log.gz?dl=0
>>
>> On Tue, Oct 21, 2014 at 3:22 PM, Sergey Nazarov  wrote:
>>> I enabled logging and performed same tests.
>>> Here is the link on archive with logs, they are only from one node
>>> (from the node where active MDS was sitting):
>>> https://www.dropbox.com/s/80axovtoofesx5e/logs.tar.gz?dl=0
>>>
>>> Rados bench results:
>>>
>>> # rados bench -p test 10 write
>>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 10
>>> seconds or 0 objects
>>>  Object prefix: benchmark_data_atl-fs11_4630
>>>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>  0   0 0 0 0 0 - 0
>>>  1  164630   119.967   120  0.201327  0.348463
>>>  2  168872   143.969   168  0.132983  0.353677
>>>  3  16   124   108   143.972   144  0.930837  0.383018
>>>  4  16   155   139   138.976   124  0.899468  0.426396
>>>  5  16   203   187   149.575   192  0.236534  0.400806
>>>  6  16   243   227   151.309   160  0.835213  0.397673
>>>  7  16   276   260   148.549   132  0.905989  0.406849
>>>  8  16   306   290   144.978   120  0.353279  0.422106
>>>  9  16   335   319   141.757   116   1.12114  0.428268
>>> 10  16   376   360143.98   164  0.418921   0.43351
>>> 11  16   377   361   131.254 4  0.499769  0.433693
>>>  Total time run: 11.206306
>>> Total writes made:  377
>>> Write size: 4194304
>>> Bandwidth (MB/sec): 134.567
>>>
>>> Stddev Bandwidth:   60.0232
>>> Max bandwidth (MB/sec): 192
>>> Min bandwidth (MB/sec): 0
>>> Average Latency:0.474923
>>> Stddev Latency: 0.376038
>>> Max latency:1.82171
>>> Min latency:0.060877
>>>
>>>
>>> # rados bench -p test 10 seq
>>>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>  0   0 0 0 0 0 - 0
>>>  1  166145   179.957   180  0.010405   0.25243
>>>  2  16   10993   185.962   192  0.908263  0.284303
>>>  3  16   151   135   179.965   168  0.255312  0.297283
>>>  4  16   191   175174.97   160  0.836727  0.330659
>>>  5  16   236   220   175.971   180  0.009995  0.330832
>>>  6  16   275   259   172.639   156   1.06855  0.345418
>>>  7  16   311   295   168.545   144  0.907648  0.361689
>>>  8  16   351   335   167.474   160  0.947688  0.363552
>>>  9  16   390   374   166.196   156  0.140539  0.369057
>>>  Total time run:9.755367
>>> Total reads made: 401
>>> Read size:4194304
>>> Bandwidth (MB/sec):164.422
>>>
>>> Average Latency:   0.387705
>>> Max latency:   1.33852
>>> Min latency:   0.008064
>>>
>>> # rados bench -p test 10 rand
>>>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>  0   0 0 0 0 0 - 0
>>>  1  165539   155.938   156  0.773716  0.257267
>>>  2  169377   153.957   152  0.006573  0.339199
>>>  3  16   135   119   158.629   168  0.009851  0.359675
>>>  4  16   171   155   154.967   144  0.892027  0.359015
>>>  5  16   209   193   154.369   152   1.13945  0.378618
>>>  6  16   256   240159.97   188  0.009965  0.368439
>>>  7  16   295   279 159.4   156  0.195812  0.371259
>>>  8  16   343   327   163.472   192  0.880587  0.370759
>>>  9  16   380   364161.75   148  0.113111  0.377983
>>> 10  16   424   408   163.173   176  0.772274  0.379497
>>>  Total time run:10.518482
>>> Total reads made: 425
>>> Read size:4194304
>>> Bandwidth (MB/sec):161.620
>>>
>>> Average Latency:   0.393978
>>> Max latency:   1.36572
>>> Min latency:   0.006448
>>>
>>> On Tue, Oct 21, 2014 at 2:03 PM, Gregory Farnum  wrote:
 Can you enable debugging on the client ("debug ms

Re: [ceph-users] RGW Federated Gateways and Apache 2.4 problems

2014-10-24 Thread Craig Lewis
Thanks!  I'll continue with Apache 2.2 until the next release.

On Fri, Oct 24, 2014 at 8:58 AM, Yehuda Sadeh  wrote:

> On Thu, Oct 23, 2014 at 3:51 PM, Craig Lewis 
> wrote:
> > I'm having a problem getting RadosGW replication to work after upgrading
> to
> > Apache 2.4 on my primary test cluster.  Upgrading the secondary cluster
> to
> > Apache 2.4 doesn't cause any problems. Both Ceph's apache packages and
> > Ubuntu's packages cause the same problem.
> >
> > I'm pretty sure I'm missing something obvious, but I'm not seeing it.
> >
> > Has anybody else upgraded their federated gateways to apache 2.4?
> >
> >
> >
> > My setup
> > 2 VMs, each running their own ceph cluster with replication=1
> > test0-ceph.cdlocal is the primary zone, named us-west
> > test1-ceph.cdlocal is the secondary zone, named us-central
> > Before I start, replication works, and I'm running
> >
> > Ubuntu 14.04 LTS
> > Emperor (0.72.2-1precise, retained using apt-hold)
> > Apache 2.2 (2.2.22-2precise.ceph, retained using apt-hold)
> >
> >
> > As soon as I upgrade Apache to 2.4 in the primary cluster, replication
> gets
> > permission errors.  radosgw-agent.log:
> > 2014-10-23T15:13:43.022 31106:ERROR:radosgw_agent.worker:failed to sync
> > object bucket3/test6.jpg: state is error
> >
> > The access logs from the primary say (using vhost_combined log format):
> > test0-ceph.cdlocal:80 172.16.205.1 - - [23/Oct/2014:15:16:51 -0700] "PUT
> > /test6.jpg HTTP/1.1" 200 209 "-" "-"- - - [23/Oct/2014:13:24:18 -0700]
> "GET
> > /?delimiter=/ HTTP/1.1" 200 1254 "-" "-" "bucket3.test0-ceph.cdlocal"
> > 
> > test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700]
> "GET
> >
> /admin/log?marker=089.89.3&type=bucket-index&bucket-instance=bucket3%3Aus-west.5697.2&max-entries=1000
> > HTTP/1.1" 200 398 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> > test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700]
> "GET
> >
> /bucket3/test6.jpg?rgwx-uid=us-central&rgwx-region=us&rgwx-prepend-metadata=us
> > HTTP/1.1" 403 249 "-" "-"
> >
> > 172.16.205.143 is the primary cluster, .144 is the secondary cluster,
> and .1
> > is my workstation.
> >
> >
> > The access logs on the secondary show:
> > test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700]
> "GET
> >
> /admin/replica_log?bounds&type=bucket-index&bucket-instance=bucket3%3Aus-west.5697.2
> > HTTP/1.1" 200 643 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> > test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700]
> "PUT
> >
> /bucket3/test6.jpg?rgwx-op-id=test1-ceph0.cdlocal%3A6484%3A3&rgwx-source-zone=us-west&rgwx-client-id=radosgw-agent
> > HTTP/1.1" 403 286 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> > test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700]
> "GET
> >
> /admin/opstate?client-id=radosgw-agent&object=bucket3%2Ftest6.jpg&op-id=test1-ceph0.cdlocal%3A6484%3A3
> > HTTP/1.1" 200 355 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> >
> > If I crank up radosgw debugging, it tells me that the calculated digest
> is
> > correct for the /admin/* requests, but fails for the object GET:
> > /admin/log
> > 2014-10-23 15:44:29.257688 7fa6fcfb9700 15 calculated
> > digest=6Tt13P6naWJEc0mJmYyDj6NzBS8=
> > 2014-10-23 15:44:29.257690 7fa6fcfb9700 15
> > auth_sign=6Tt13P6naWJEc0mJmYyDj6NzBS8=
> > /bucket3/test6.jpg
> > 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
> > digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
> > 2014-10-23 15:44:29.257691 7fa6fcfb9700 15 compare=0
> > 2014-10-23 15:44:29.257693 7fa6fcfb9700 20 system request
> > 
> > /bucket3/test6.jpg
> > 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
> > digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
> > 2014-10-23 15:44:29.411573 7fa6fc7b8700 15
> > auth_sign=Gv398QNc6gLig9/0QbdO+1UZUq0=
> > 2014-10-23 15:44:29.411574 7fa6fc7b8700 15 compare=-41
> > 2014-10-23 15:44:29.411577 7fa6fc7b8700 10 failed to authorize request
> >
> > That explains the 403 responses.
> >
> > So I have metadata replication working, but the data replication is
> failing
> > with permission problems.  I verified that I can create users and
> buckets in
> > the primary, and have them replicate to the secondary.
> >
> >
> > A similar situation was posted to the list before.  That time, the
> problem
> > was that the system users weren't correctly deployed to both the primary
> and
> > secondary clusters.  I verified that both users exist in both clusters,
> with
> > the same access and secret.
> >
> > Just to test, I used s3cmd.  I can read and write to both clusters using
> > both system user's credentials.
> >
> >
> > Anybody have any ideas?
> >
>
> You're hitting issue #9206. Apache 2.4 filters out certain http
> headers because they use underscores instead of dashes. There's a fix
> for that for firefly, although it hasn't made it to an officially
> released version.
>
> Yehuda
>
___
ceph-users mailing list
ceph-

Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Gregory Farnum
There's a temporary issue in the master branch that makes rbd reads
larger than the cache size hang (if the cache is on). This might be
that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood
 wrote:
> I'm doing some fio tests on Giant using the fio rbd driver to measure
> performance on a new ceph cluster.
>
> However with block sizes > 1M (initially noticed with 4M) I am seeing
> absolutely no IOPS for *reads* - and the fio process becomes
> non-interruptible (needs kill -9):
>
> $ ceph -v
> ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)
>
> $ fio --version
> fio-2.1.11-20-g9a44
>
> $ fio read-busted.fio
> env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
> fio-2.1.11-20-g9a44
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> 1158050441d:06h:58m:03s]
>
> This appears to be a pure fio rbd driver issue, as I can attach the relevant
> rbd volume to a vm and dd from it using 4M blocks no problem.
>
> Any ideas?
>
> Cheers
>
> Mark
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Mark Nelson
FWIW the specific fio read problem appears to have started after 0.86 
and before commit 42bcabf.


Mark

On 10/24/2014 12:56 PM, Gregory Farnum wrote:

There's an issue in master branch temporarily that makes rbd reads
greater than the cache size hang (if the cache was on). This might be
that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood
 wrote:

I'm doing some fio tests on Giant using fio rbd driver to measure
performance on a new ceph cluster.

However with block sizes > 1M (initially noticed with 4M) I am seeing
absolutely no IOPS for *reads* - and the fio process becomes non
interrupteable (needs kill -9):

$ ceph -v
ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)

$ fio --version
fio-2.1.11-20-g9a44

$ fio read-busted.fio
env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
fio-2.1.11-20-g9a44
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
1158050441d:06h:58m:03s]

This appears to be a pure fio rbd driver issue, as I can attach the relevant
rbd volume to a vm and dd from it using 4M blocks no problem.

Any ideas?

Cheers

Mark

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librados crash in nova-compute

2014-10-24 Thread Josh Durgin

On 10/24/2014 08:21 AM, Xu (Simon) Chen wrote:

Hey folks,

I am trying to enable OpenStack to use RBD as image backend:
https://bugs.launchpad.net/nova/+bug/1226351

For some reason, nova-compute segfaults due to librados crash:

./log/SubsystemMap.h: In function 'bool
ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
7f1b477fe700 time 2014-10-24 03:20:17.382769
./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
1: (()+0x42785) [0x7f1b4c4db785]
2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
4: (()+0x6b50) [0x7f1b6ea93b50]
5: (clone()+0x6d) [0x7f1b6df3e0ed]
NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

I feel that there is some concurrency issue, since this sometimes happen
before and sometimes after this line:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/rbd_utils.py#L208

Any idea what are the potential causes of the crash?

Thanks.
-Simon


This is http://tracker.ceph.com/issues/8912, fixed in the latest
firefly and dumpling releases.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Object Storage Statistics

2014-10-24 Thread Yehuda Sadeh
On Fri, Oct 24, 2014 at 8:17 AM, Dane Elwell  wrote:
> Hi list,
>
> We're using the object storage in production and billing people based
> on their usage, much like S3. We're also trying to produce things like
> hourly bandwidth graphs for our clients.
>
> We're having some issues with the API not returning the correct
> statistics. I can see that there is a --sync-stats option for the
> command line radosgw-admin, but there doesn't appear to be anything
> similar for the admin REST API. Is there an equivalent feature for the
> API that hasn't been documented by chance?
>

There are two different kinds of statistics that are collected. One is the
'usage' information, which records the actual operations that
clients perform over a period of time; this information can be accessed
through the admin API. The other one is the user stats info that is
part of the user quota system, which at the moment is not hooked into
a REST interface.
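
As a rough sketch of the usage side (the uid and dates are illustrative):

  # per-user operation and byte counts for a time window, via the CLI
  radosgw-admin usage show --uid=exampleuser --start-date=2014-10-01 --end-date=2014-10-24

  # roughly the same data over the admin REST API
  GET /admin/usage?uid=exampleuser&start=2014-10-01&end=2014-10-24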

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph and hadoop

2014-10-24 Thread Matan Safriel
Hi,

Given that HDFS is far from ideal for small files, I am examining the
possibility of using Hadoop on top of Ceph. I found mainly one online resource
about it: https://ceph.com/docs/v0.79/cephfs/hadoop/. I am wondering whether
there is any reference implementation or blog post you are aware of about
Hadoop on top of Ceph. Likewise, I'm happy to have any pointers about why _not_
to attempt just that.

Thanks!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How to recover Incomplete PGs from "lost time" symptom?

2014-10-24 Thread Chris Kitzmiller
I have a number of PGs which are marked as incomplete. I'm at a loss for how to 
go about recovering these PGs and believe they're suffering from the "lost 
time" symptom. How do I recover these PGs? I'd settle for sacrificing the "lost 
time" and just going with what I've got. I've lost the ability to mount the RBD 
within this pool and I'm afraid that unless I can resolve this I'll have lost 
all my data.

A query from one of my incomplete PGs: http://pastebin.com/raw.php?i=AJ3RMjz6

My CRUSH map: http://pastebin.com/raw.php?i=gWtJuhsy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] get/put files with radosgw once MDS crash

2014-10-24 Thread Craig Lewis
No, MDS and RadosGW store their data in different pools.  There's no way
for them to access the other's data.

All of the data is stored in RADOS, and can be accessed via the rados CLI.
It's not easy, and you'd probably have to spend a lot of time reading the
source code to do it.
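
If you do end up needing to, a minimal sketch of poking at the raw objects
with the rados CLI (pool and object names below are illustrative; CephFS
stores file data as <inode>.<block> objects, not by file name):

  # list raw objects in the CephFS data pool
  rados -p data ls | head
  # fetch one object's contents to a local file
  rados -p data get 10000000000.00000000 /tmp/chunk0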


On Fri, Oct 24, 2014 at 1:49 AM, 廖建锋  wrote:

>  dear cepher,
>  Today, I use mds to put/get files from ceph storgate cluster as
> it is very easy to use for each side of a company.
> But ceph mds is not very stable, So my question:
> is it possbile to get the file name and contentes from OSD with
> radosgw once MDS crash and how ?
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-24 Thread Craig Lewis
You can deploy multiple RadosGW in a single cluster.  You'll need to setup
zones (see http://ceph.com/docs/master/radosgw/federated-config/).  Most
people seem to be using zones for geo-replication, but local replication
works even better.  Multiple zones don't have to be replicated either.  For
example, you could use multiple zones for tiered services.  For example, a
service with 4x replication on pure SSDs, and a cheaper service with 2x
replication on HDDs.

If you do have separate zones in a single cluster, you'll want to configure
different OSDs to serve the different zones.  You want fault isolation
between the zones. The problems this brings are mostly management of the
extra complexity.


CivetWeb is embedded into the RadosGW daemon, whereas Apache talks to
RadosGW using FastCGI.  Overall, CivetWeb should be simpler to set up and
manage, since it doesn't require Apache, its configuration, or the
overhead.

I don't know if Civetweb is considered production ready.  Giant has a bunch
of fixes for Civetweb, so I'm leaning towards "not on Firefly" unless
somebody more knowledgeable tells me otherwise.
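
For reference, a minimal sketch of what one per-zone gateway instance might
look like in ceph.conf with the embedded frontend (the instance, host, and
zone names are illustrative):

  [client.radosgw.us-west-1]
      host = gw01
      rgw region = us
      rgw zone = us-west
      rgw frontends = "civetweb port=7480"
      keyring = /etc/ceph/ceph.client.radosgw.us-west-1.keyring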


On Thu, Oct 23, 2014 at 11:04 PM, yuelongguang  wrote:

> hi,yehuda
>
> 1.
> can we deploy multi-rgws on one ceph cluster?
> if so, does it bring us any problems?
>
> 2. what is the major difference between apache and civetweb?
> what is civetweb's advantage?
>
> thanks
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't start osd- one osd alway be down.

2014-10-24 Thread Craig Lewis
It looks like you're running into http://tracker.ceph.com/issues/5699

You're running 0.80.7, which has a fix for that bug.  From my reading of
the code, I believe the fix only prevents the issue from occurring.  It
doesn't work around or repair bad snapshots created on older versions of
Ceph.

Were any of the snapshots you're removing created on older versions of
Ceph?  If they were all created on Firefly, then you should open a new
tracker issue, and try to get some help on IRC or the developers mailing
list.


On Thu, Oct 23, 2014 at 10:21 PM, Ta Ba Tuan  wrote:

> Dear everyone
>
> I can't start osd.21 (log file attached).
> Some pgs can't be repaired. I'm using replication 3 for my data pool.
> It seems some objects in those pgs are damaged.
>
> I tried to delete the data related to those objects, but osd.21 still
> won't start.
> I also removed osd.21, but then other osds fail too (e.g. osd.86 goes down and won't start).
>
> Guide me to debug it, please! Thanks!
>
> --
> Tuan
> Ha Noi - VietNam
>
>
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Mark Kirkwood

Yeah, looks like it. If I disable the rbd cache:

$ tail /etc/ceph/ceph.conf
...
[client]
rbd cache = false

then the 2-4M reads work fine (no invalid reads in valgrind either). 
I'll let the fio guys know.


Cheers

Mark

On 25/10/14 06:56, Gregory Farnum wrote:

There's an issue in master branch temporarily that makes rbd reads
greater than the cache size hang (if the cache was on). This might be
that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood
 wrote:

I'm doing some fio tests on Giant using fio rbd driver to measure
performance on a new ceph cluster.

However with block sizes > 1M (initially noticed with 4M) I am seeing
absolutely no IOPS for *reads* - and the fio process becomes non
interrupteable (needs kill -9):

$ ceph -v
ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)

$ fio --version
fio-2.1.11-20-g9a44

$ fio read-busted.fio
env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
fio-2.1.11-20-g9a44
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
1158050441d:06h:58m:03s]

This appears to be a pure fio rbd driver issue, as I can attach the relevant
rbd volume to a vm and dd from it using 4M blocks no problem.

Any ideas?

Cheers

Mark

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librados crash in nova-compute

2014-10-24 Thread Xu (Simon) Chen
Thanks. I found the commit in git and confirmed that 0.80.7 fixes the issue.

On Friday, October 24, 2014, Josh Durgin  wrote:

> On 10/24/2014 08:21 AM, Xu (Simon) Chen wrote:
>
>> Hey folks,
>>
>> I am trying to enable OpenStack to use RBD as image backend:
>> https://bugs.launchpad.net/nova/+bug/1226351
>>
>> For some reason, nova-compute segfaults due to librados crash:
>>
>> ./log/SubsystemMap.h: In function 'bool
>> ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
>> 7f1b477fe700 time 2014-10-24 03:20:17.382769
>> ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>> 1: (()+0x42785) [0x7f1b4c4db785]
>> 2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
>> 3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
>> 4: (()+0x6b50) [0x7f1b6ea93b50]
>> 5: (clone()+0x6d) [0x7f1b6df3e0ed]
>> NOTE: a copy of the executable, or `objdump -rdS ` is needed
>> to interpret this.
>> terminate called after throwing an instance of 'ceph::FailedAssertion'
>> Aborted
>>
>> I feel that there is some concurrency issue, since this sometimes happen
>> before and sometimes after this line:
>> https://github.com/openstack/nova/blob/master/nova/virt/
>> libvirt/rbd_utils.py#L208
>>
>> Any idea what are the potential causes of the crash?
>>
>> Thanks.
>> -Simon
>>
>
> This is http://tracker.ceph.com/issues/8912, fixed in the latest
> firefly and dumpling releases.
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librados crash in nova-compute

2014-10-24 Thread Xu (Simon) Chen
I am actually curious about one more thing.

In the image -> rbd case, is the rbd_secret_uuid config option really used? I
am running nova-compute as a non-root user, so the virsh secret shouldn't be
accessible unless we get it via rootwrap. I had to make the ceph keyring file
readable by the nova-compute user for the whole thing to work...
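
For context, the relevant nova.conf bits on our side look roughly like this
(the pool name and secret uuid are placeholders):

  [libvirt]
  images_type = rbd
  images_rbd_pool = vms
  images_rbd_ceph_conf = /etc/ceph/ceph.conf
  rbd_user = cinder
  rbd_secret_uuid = <libvirt-secret-uuid>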


On Friday, October 24, 2014, Xu (Simon) Chen  wrote:

> Thanks. I found the commit on git and confirms 0.80.7 fixes the issue.
>
> On Friday, October 24, 2014, Josh Durgin  > wrote:
>
>> On 10/24/2014 08:21 AM, Xu (Simon) Chen wrote:
>>
>>> Hey folks,
>>>
>>> I am trying to enable OpenStack to use RBD as image backend:
>>> https://bugs.launchpad.net/nova/+bug/1226351
>>>
>>> For some reason, nova-compute segfaults due to librados crash:
>>>
>>> ./log/SubsystemMap.h: In function 'bool
>>> ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
>>> 7f1b477fe700 time 2014-10-24 03:20:17.382769
>>> ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
>>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>> 1: (()+0x42785) [0x7f1b4c4db785]
>>> 2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
>>> 3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
>>> 4: (()+0x6b50) [0x7f1b6ea93b50]
>>> 5: (clone()+0x6d) [0x7f1b6df3e0ed]
>>> NOTE: a copy of the executable, or `objdump -rdS ` is needed
>>> to interpret this.
>>> terminate called after throwing an instance of 'ceph::FailedAssertion'
>>> Aborted
>>>
>>> I feel that there is some concurrency issue, since this sometimes happen
>>> before and sometimes after this line:
>>> https://github.com/openstack/nova/blob/master/nova/virt/
>>> libvirt/rbd_utils.py#L208
>>>
>>> Any idea what are the potential causes of the crash?
>>>
>>> Thanks.
>>> -Simon
>>>
>>
>> This is http://tracker.ceph.com/issues/8912, fixed in the latest
>> firefly and dumpling releases.
>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] journals relabeled by OS, symlinks broken

2014-10-24 Thread Steve Anthony
Hello,

I was having problems with a node in my cluster (Ceph v0.80.7/Debian
Wheezy/Kernel 3.12), so I rebooted it and the disks were relabeled when
it came back up. Now all the symlinks to the journals are broken. The
SSDs are now sda, sdb, and sdc, but the journals were on sdc, sdd, and sde:

root@ceph17:~# ls -l /var/lib/ceph/osd/ceph-*/journal
lrwxrwxrwx 1 root root 9 Oct 20 16:47 /var/lib/ceph/osd/ceph-150/journal
-> /dev/sde1
lrwxrwxrwx 1 root root 9 Oct 20 16:53 /var/lib/ceph/osd/ceph-157/journal
-> /dev/sdd1
lrwxrwxrwx 1 root root 9 Oct 21 08:31 /var/lib/ceph/osd/ceph-164/journal
-> /dev/sdc1
lrwxrwxrwx 1 root root 9 Oct 21 16:33 /var/lib/ceph/osd/ceph-171/journal
-> /dev/sde2
lrwxrwxrwx 1 root root 9 Oct 22 10:50 /var/lib/ceph/osd/ceph-178/journal
-> /dev/sdc2
lrwxrwxrwx 1 root root 9 Oct 22 15:48 /var/lib/ceph/osd/ceph-184/journal
-> /dev/sdd2
lrwxrwxrwx 1 root root 9 Oct 23 10:46 /var/lib/ceph/osd/ceph-191/journal
-> /dev/sde3
lrwxrwxrwx 1 root root 9 Oct 23 15:22 /var/lib/ceph/osd/ceph-195/journal
-> /dev/sdc3
lrwxrwxrwx 1 root root 9 Oct 23 16:59 /var/lib/ceph/osd/ceph-201/journal
-> /dev/sdd3
lrwxrwxrwx 1 root root 9 Oct 24 21:32 /var/lib/ceph/osd/ceph-214/journal
-> /dev/sde4
lrwxrwxrwx 1 root root 9 Oct 24 21:33 /var/lib/ceph/osd/ceph-215/journal
-> /dev/sdd4

Any way to fix this without just removing all the OSDs and re-adding
them? I thought about recreating the symlinks to point at the new SSD
labels, but I figured I'd check here first. Thanks!
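
For what it's worth, the fix I had in mind is roughly the sketch below, done
with the OSD stopped and pointing at a stable by-partuuid name instead of
/dev/sdX (the OSD id and partition uuid are just placeholders):

  # find stable names for the journal partitions
  ls -l /dev/disk/by-partuuid/
  # repoint one OSD's journal symlink (osd.150 here) at the stable name
  ln -sfn /dev/disk/by-partuuid/<journal-partition-uuid> /var/lib/ceph/osd/ceph-150/journal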

-Steve

-- 
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma...@lehigh.edu

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't start osd- one osd alway be down.

2014-10-24 Thread Ta Ba Tuan

Hi Craig, thanks for replying.
When I started that osd, the Ceph log from "ceph -w" warned that pgs 7.9d8,
23.596, 23.9c6 and 23.63c can't recover, as in the pasted log below.


Those pgs are in "active+degraded" state.
#ceph pg map 7.9d8
osdmap e102808 pg 7.9d8 (7.9d8) -> up [93,49] acting [93,49]
(When I start osd.21, pg 7.9d8 and the three remaining pgs change to the state
"active+recovering".) osd.21 is still down after the following logs:



2014-10-25 10:57:48.415920 osd.21 [WRN] slow request 30.835731 seconds 
old, received at 2014-10-25 10:57:17.580013: MOSDPGPush(*7.9d8 *102803 [Push
Op(e13589d8/rbd_data.4b843b2ae8944a.0c00/head//6, version: 
102798'7794851, data_included: [0~4194304], data_size: 4194304, omap_heade
r_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(e13589d8/rbd_data.4b843b2ae8944a.0c00/head//6@102
798'7794851, copy_subset: [0~4194304], clone_subset: {}), 
after_progress: ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complete
:true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_rec

overed_to:, omap_complete:false))]) v2 currently no flag points reached

2014-10-25 10:57:48.415927 osd.21 [WRN] slow request 30.275588 seconds 
old, received at 2014-10-25 10:57:18.140156: MOSDPGPush(*23.596* 102803 [Pus
hOp(4ca76d96/rbd_data.5dd32f2ae8944a.0385/head//24, version: 
102798'295732, data_included: [0~4194304], data_size: 4194304, omap_head
er_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(4ca76d96/rbd_data.5dd32f2ae8944a.0385/head//24@1
02798'295732, copy_subset: [0~4194304], clone_subset: {}), 
after_progress: ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complet
e:true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_re

covered_to:, omap_complete:false))]) v2 currently no flag points reached

2014-10-25 10:57:48.415910 osd.21 [WRN] slow request 30.860696 seconds 
old, received at 2014-10-25 10:57:17.555048: MOSDPGPush(*23.9c6* 102803 [Pus
hOp(efdde9c6/rbd_data.5b64062ae8944a.0b15/head//24, version: 
102798'66056, data_included: [0~4194304], data_size: 4194304, omap_heade
r_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(efdde9c6/rbd_data.5b64062ae8944a.0b15/head//24@10
2798'66056, copy_subset: [0~4194304], clone_subset: {}), after_progress: 
ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:
true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_reco

vered_to:, omap_complete:false))]) v2 currently no flag points reached

2014-10-25 10:57:58.418847 osd.21 [WRN] 26 slow requests, 1 included 
below; oldest blocked for > 54.967456 secs
2014-10-25 10:57:58.418859 osd.21 [WRN] slow request 30.967294 seconds 
old, received at 2014-10-25 10:57:27.451488: MOSDPGPush(*23.63c* 102803 [Pus
hOp(40e4b63c/rbd_data.57ed612ae8944a.0c00/head//24, version: 
102748'145637, data_included: [0~4194304], data_size: 4194304, omap_head
er_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(40e4b63c/rbd_data.57ed612ae8944a.0c00/head//24@1
02748'145637, copy_subset: [0~4194304], clone_subset: {}), 
after_progress: ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complet
e:true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_re

covered_to:, omap_complete:false))]) v2 currently no flag points reached

Thanks!
--
Tuan
HaNoi-VietNam

On 10/25/2014 05:07 AM, Craig Lewis wrote:

It looks like you're running into http://tracker.ceph.com/issues/5699

You're running 0.80.7, which has a fix for that bug.  From my reading 
of the code, I believe the fix only prevents the issue from 
occurring.  It doesn't work around or repair bad snapshots created on 
older versions of Ceph.


Were any of the snapshots you're removing up created on older versions 
of Ceph?  If they were all created on Firefly, then you should open a 
new tracker issue, and try to get some help on IRC or the developers 
mailing list.


On Thu, Oct 23, 2014 at 10:21 PM, Ta Ba Tuan > wrote:


Dear everyone

I can't start osd.21, (attached log file).
some pgs can't be repair. I'm using replicate 3 for my data pool.
Feel some objects in those pgs be failed,

I tried to delete some data that related above objects, but still
not start osd.21
and, removed osd.21, but other osds (eg: osd.86 down, not start
osd.86).

Guide me to debug it, please! Thanks!

--
Tuan
Ha Noi - VietNam










___
ceph-users mailing list
ceph