Re: [ceph-users] Ceph RGW + S3 Client (s3cmd)

2014-06-24 Thread Francois Deppierraz
Hi Vickey,

This really looks like a DNS issue. Are you sure that the host from
which s3cmd is running is able to resolve the host 'bmi-pocfe2.scc.fi'?

Does a regular ping work?

$ ping bmi-pocfe2.scc.fi
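
It can also help to double-check name resolution and which endpoint s3cmd
is actually configured to use. A minimal sketch (host_base and host_bucket
are the standard ~/.s3cfg keys; the hostname is the one from your output):

$ getent hosts bmi-pocfe2.scc.fi
$ grep -E '^host_(base|bucket)' ~/.s3cfg
# host_base/host_bucket should point at the RGW hostname, e.g.
# host_base = bmi-pocfe2.scc.fi
# host_bucket = %(bucket)s.bmi-pocfe2.scc.fi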

François

On 23. 06. 14 16:24, Vickey Singh wrote:
> # s3cmd ls
> 
> WARNING: Retrying failed request: / ([Errno -2] Name or service not known)
> 
> WARNING: Waiting 3 sec...
> 
> WARNING: Retrying failed request: / ([Errno -2] Name or service not known)
> 
> WARNING: Waiting 6 sec...
> 
> WARNING: Retrying failed request: / ([Errno -2] Name or service not known)
> 
> WARNING: Waiting 9 sec...
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] osd crash: trim_object could not find coid

2014-09-08 Thread Francois Deppierraz
Hi,

This issue is on a small two-server (44 OSDs) Ceph cluster running 0.72.2
under Ubuntu 12.04. The cluster was filling up (a few OSDs near full), so
I tried to increase the number of PGs per pool to 1024 for each of the 14
pools to improve storage space balancing. This increase triggered high
memory usage on the servers, which were unfortunately under-provisioned
(16 GB of RAM for 22 OSDs), and they started to swap and crash.
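
For reference, the increase was done per pool along these lines (a sketch;
'volumes' stands in for each of the 14 pool names):

$ ceph osd pool set volumes pg_num 1024
$ ceph osd pool set volumes pgp_num 1024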

After installing more memory in the servers, the result is a broken
cluster with unfound objects and two OSDs (osd.6 and osd.43) crashing at
startup.

$ ceph health
HEALTH_WARN 166 pgs backfill; 326 pgs backfill_toofull; 2 pgs
backfilling; 765 pgs degraded; 715 pgs down; 1 pgs incomplete; 715 pgs
peering; 5 pgs recovering; 2 pgs recovery_wait; 716 pgs stuck inactive;
1856 pgs stuck unclean; 164 requests are blocked > 32 sec; recovery
517735/15915673 objects degraded (3.253%); 1241/7910367 unfound
(0.016%); 3 near full osd(s); 1/43 in osds are down; noout flag(s) set

osd.6 is crashing due to an assertion ("trim_object could not find coid")
which led me to a resolved bug report that unfortunately doesn't give
any advice on how to repair the OSD.

http://tracker.ceph.com/issues/5473

It is much less obvious why osd.43 is crashing; please have a look at
the following OSD logs:

http://paste.ubuntu.com/8288607/
http://paste.ubuntu.com/8288609/

Any advice on how to repair both OSDs and recover the unfound objects
would be more than welcome.

Thanks!

François

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd crash: trim_object could not find coid

2014-09-08 Thread Francois Deppierraz
Hi Greg,

Thanks for your support!

On 08. 09. 14 20:20, Gregory Farnum wrote:

> The first one is not caused by the same thing as the ticket you
> reference (it was fixed well before emperor), so it appears to be some
> kind of disk corruption.
> The second one is definitely corruption of some kind as it's missing
> an OSDMap it thinks it should have. It's possible that you're running
> into bugs in emperor that were fixed after we stopped doing regular
> support releases of it, but I'm more concerned that you've got disk
> corruption in the stores. What kind of crashes did you see previously;
> are there any relevant messages in dmesg, etc?

Nothing special in dmesg except probably irrelevant XFS warnings:

XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

All logs from before the disaster are still there; do you have any
advice on what would be relevant?

> Given these issues, you might be best off identifying exactly which
> PGs are missing, carefully copying them to working OSDs (use the osd
> store tool), and killing these OSDs. Do lots of backups at each
> stage...

This sounds scary; I'll keep my fingers crossed and do a bunch of
backups. There are 17 PGs with missing objects.
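
For reference, a quick way to list them is simply to grep the health
output (a sketch, nothing fancier than this):

$ ceph health detail | grep unfound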

What exactly do you mean by the osd store tool? Is it the
'ceph_filestore_tool' binary?

François

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd crash: trim_object could not find coid

2014-09-11 Thread Francois Deppierraz
Hi Greg,

An attempt to recover pg 3.3ef by copying it from broken osd.6 to
working osd.32 resulted in one more broken osd :(

Here's what was actually done:

root@storage1:~# ceph pg 3.3ef list_missing | head
{ "offset": { "oid": "",
  "key": "",
  "snapid": 0,
  "hash": 0,
  "max": 0,
  "pool": -1,
  "namespace": ""},
  "num_missing": 219,
  "num_unfound": 219,
  "objects": [
[...]
root@storage1:~# ceph pg 3.3ef query
[...]
  "might_have_unfound": [
{ "osd": 6,
  "status": "osd is down"},
{ "osd": 19,
  "status": "already probed"},
{ "osd": 32,
  "status": "already probed"},
{ "osd": 42,
  "status": "already probed"}],
[...]

# Exporting pg 3.3ef from broken osd.6

root@storage2:~# ceph_objectstore_tool --data-path
/var/lib/ceph/osd/ceph-6/ --journal-path
/var/lib/ceph/osd/ssd0/6.journal --pgid 3.3ef --op export --file
~/backup/osd-6.pg-3.3ef.export

# Remove an empty pg 3.3ef which was already present on this OSD

root@storage2:~# service ceph stop osd.32
root@storage2:~# ceph_objectstore_tool --data-path
/var/lib/ceph/osd/ceph-32/ --journal-path
/var/lib/ceph/osd/ssd0/32.journal --pgid 3.3ef --op remove

# Import pg 3.3ef from dump

root@storage2:~# ceph_objectstore_tool --data-path
/var/lib/ceph/osd/ceph-32/ --journal-path
/var/lib/ceph/osd/ssd0/32.journal --op import --file
~/backup/osd-6.pg-3.3ef.export
root@storage2:~# service ceph start osd.32

-1> 2014-09-10 18:53:37.196262 7f13fdd7d780  5 osd.32 pg_epoch:
48366 pg[3.3ef(unlocked)] enter Initial
 0> 2014-09-10 18:53:37.239479 7f13fdd7d780 -1 *** Caught signal
(Aborted) **
 in thread 7f13fdd7d780

 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
 1: /usr/bin/ceph-osd() [0x8843da]
 2: (()+0xfcb0) [0x7f13fcfabcb0]
 3: (gsignal()+0x35) [0x7f13fb98a0d5]
 4: (abort()+0x17b) [0x7f13fb98d83b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f13fc2dc69d]
 6: (()+0xb5846) [0x7f13fc2da846]
 7: (()+0xb5873) [0x7f13fc2da873]
 8: (()+0xb596e) [0x7f13fc2da96e]
 9: /usr/bin/ceph-osd() [0x94b34f]
 10:
(pg_log_entry_t::decode_with_checksum(ceph::buffer::list::iterator&)+0x12c)
[0x691b6c]
 11: (PGLog::read_log(ObjectStore*, coll_t, hobject_t, pg_info_t const&,
std::map,
std::allocator > >&, PGLog::IndexedLog&, pg_missing_t&,
std::basic_ostringstream,
std::allocator >&, std::set, std::allocator >*)+0x16d4) [0x7d3ef4]
 12: (PG::read_state(ObjectStore*, ceph::buffer::list&)+0x2c1) [0x7951b1]
 13: (OSD::load_pgs()+0x18f3) [0x61e143]
 14: (OSD::init()+0x1b9a) [0x62726a]
 15: (main()+0x1e8d) [0x5d2d0d]
 16: (__libc_start_main()+0xed) [0x7f13fb97576d]
 17: /usr/bin/ceph-osd() [0x5d69d9]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

Fortunately it was possible to bring osd.32 back into a working state
simply by removing this pg.

root@storage2:~# ceph_objectstore_tool --data-path
/var/lib/ceph/osd/ceph-32/ --journal-path
/var/lib/ceph/osd/ssd0/32.journal --pgid 3.3ef --op remove

Did I miss something in this procedure, or does it mean that this pg is
definitely lost?

Thanks!

François

On 09. 09. 14 00:23, Gregory Farnum wrote:
> On Mon, Sep 8, 2014 at 2:53 PM, Francois Deppierraz
>  wrote:
>> Hi Greg,
>>
>> Thanks for your support!
>>
>> On 08. 09. 14 20:20, Gregory Farnum wrote:
>>
>>> The first one is not caused by the same thing as the ticket you
>>> reference (it was fixed well before emperor), so it appears to be some
>>> kind of disk corruption.
>>> The second one is definitely corruption of some kind as it's missing
>>> an OSDMap it thinks it should have. It's possible that you're running
>>> into bugs in emperor that were fixed after we stopped doing regular
>>> support releases of it, but I'm more concerned that you've got disk
>>> corruption in the stores. What kind of crashes did you see previously;
>>> are there any relevant messages in dmesg, etc?
>>
>> Nothing special in dmesg except probably irrelevant XFS warnings:
>>
>> XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
> 
> Hmm, I'm not sure what the outcome of that could be. Googling for the
> error message returns this as the first result, though:
> http://comments.gmane.org/gmane.comp.file-systems.xfs.general/58429
> Which indicates that it's a real deadlock and capable of messing up
> your OSDs pretty good.
> 
>>
>> All logs from be

Re: [ceph-users] osd crash: trim_object could not find coid

2014-09-12 Thread Francois Deppierraz
Hi,

Following up on this issue, I've identified that almost all of the
unfound objects belong to a single RBD volume (with the help of the
script [1] below).

Now what's the best way to try to recover the filesystem stored on this
RBD volume?

'mark_unfound_lost revert' or 'mark_unfound_lost lost' and then running
fsck?
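
For reference, here's a minimal sketch of what I have in mind (pg 3.3ef is
the example from earlier in this thread; the pool and volume names are
placeholders):

$ ceph pg 3.3ef mark_unfound_lost revert
$ rbd map <pool>/<volume>
$ rbd showmapped              # find the /dev/rbdX device that was mapped
$ fsck -n /dev/rbd0           # read-only check first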

By the way, I'm also still interested to know whether the procedure I
tried with ceph_objectstore_tool was correct.

Thanks!

François

[1] ceph-list-unfound.sh

#!/bin/sh
# List the object IDs of all unfound objects, iterating over every PG
# that 'ceph health detail' reports as having unfound objects.
for pg in $(ceph health detail | awk '/unfound$/ { print $2; }'); do
    ceph pg $pg list_missing | jq .objects
done | jq -s add | jq '.[] | .oid.oid'
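
To match the listed object names against a particular RBD volume, the
prefix in each oid (rb.0.* for format 1 images, rbd_data.* for format 2)
can be compared with the block_name_prefix reported by rbd info, along
these lines (a sketch; pool and image names are placeholders):

$ rbd -p <pool> info <image> | grep block_name_prefix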



Re: [ceph-users] osd crash: trim_object could not find coid

2014-09-19 Thread Francois Deppierraz
Hi Craig,

I'm planning to completely re-install this cluster with firefly, because
I started to see other OSD crashes with the same trim_object error...

So now I'm more interested in figuring out exactly why the data
corruption happened in the first place than in repairing the cluster.

Comments in-line.

On 16. 09. 14 23:53, Craig Lewis wrote:
> On Mon, Sep 8, 2014 at 2:53 PM, Francois Deppierraz
> <franc...@ctrlaltdel.ch> wrote:
> 
> XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
> 
> All logs from before the disaster are still there; do you have any
> advice on what would be relevant?
> 
> This is a problem.  It's not necessarily a deadlock.  The warning is
> printed if the XFS memory allocator has to retry more than 100 times
> when it's trying to allocate memory.  It either indicates extremely low
> memory, or extremely fragmented memory.  Either way, your OSDs are
> sitting there trying to allocate memory instead of doing something useful.

Do you mean that this particular error doesn't imply data corruption but
only bad OSD performance?

> By any chance, does your ceph.conf have:
> osd mkfs options xfs = -n size=64k
> 
> If so, you should start planning to remove that arg, and reformat every
> OSD.  Here's a thread where I discuss my (mis)adventures with XFS
> allocation deadlocks:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-July/041336.html

Yes! Thanks for the details. I'm actually using the puppet-ceph module
from enovance, which indeed uses [1] the '-n size=64k' option when
formatting a new disk.
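
For anyone wanting to check whether an existing OSD filesystem was
formatted that way, the 'naming' line of xfs_info shows the directory
block size (a sketch; the mount point is just an example from this
cluster):

$ xfs_info /var/lib/ceph/osd/ceph-6 | grep naming
# bsize=65536 here means the 64k directory block size is in use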

François

[1]
https://github.com/enovance/puppet-ceph/blob/master/manifests/osd/device.pp#L44
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com