I forgot to register before posting so reposting.
I think I have a split issue or I can't seem to get rid of these objects.
How can I tell ceph to forget the objects and revert?
How this happened: due to the python 2.7.8/ceph bug, a whole rack
of ceph went down (it had ubuntu 14.10 and that seemed to have 2.7.8 before
14.04). I didn't kno
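For what it's worth, if the PGs are left reporting unfound objects, the knob for telling ceph to give up on them is mark_unfound_lost; something along these lines (the pg id here is only a placeholder)::
    ceph health detail | grep unfound          # find which pgs have unfound objects
    ceph pg 2.5 mark_unfound_lost revert       # roll back to the previous version where one exists
    ceph pg 2.5 mark_unfound_lost delete       # or forget the objects entirely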
So this was working a moment ago and I was running rados benchmarks as
well as swift benchmarks to try to see how my install was doing. Now
when I try to download an object I get this read_length error::
http://pastebin.com/R4CW8Cgj
To try to poke at this I wiped all of the .rgw pools, removed
I had this happen to me as well. Turned out to be a connlimit thing for me.
I would check dmesg/the kernel log for any "conntrack limit reached,
connection dropped" messages, then increase the connlimit. Odd, as I
connected over ssh for this, but I can't deny syslog.
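Something like this is what I had in mind, assuming the sysctl names on your kernel match::
    dmesg | grep -i conntrack                          # look for "table full, dropping packet"
    sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
    sysctl -w net.netfilter.nf_conntrack_max=262144    # then persist the value in /etc/sysctl.conf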
I have a single radosgw user with 2 s3 keys and 1 swift key. I have created a
few buckets and I can list all of the contents of bucket A and C but not B with
either S3 (boto) or python-swiftclient. I am able to list the first 1000
entries using radosgw-admin 'bucket list --bucket=bucketB' withou
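If it is the bucket index that is unhappy, something along these lines may show it; a sketch only, and the exact flags vary a bit by release::
    radosgw-admin bucket stats --bucket=bucketB
    radosgw-admin bucket check --bucket=bucketB                         # report index problems
    radosgw-admin bucket check --bucket=bucketB --check-objects --fix   # attempt a repair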
Will do. The reason for the partial request is that the total size of the
file is close to 1TB so attempting a download would take quite some time on
our 10Gb connection. What is odd is that if I request the range from the last
byte received to the end of the file, we get a 406 "cannot be satisfied" response
Sorry for the delay. It took me a while to figure out how to do a range request
and append the data to a single file. The good news is that the resulting file
seems to be 14G in size, which matches the file's manifest size. The bad news is that
the file is completely corrupt and the radosgw log has erro
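For anyone doing the same, a range-request loop of roughly this shape is enough to pull a large object down in pieces over the Swift API; the token, URL, and output name are all placeholders::
    # fetch the object size, then request 1 GiB ranges and append them in order
    size=$(curl -sI -H "X-Auth-Token: $token" "$url" | awk 'tolower($1)=="content-length:" {print $2}' | tr -d '\r')
    chunk=$((1024*1024*1024))
    for ((off=0; off<size; off+=chunk)); do
      end=$((off + chunk - 1)); [ "$end" -ge "$size" ] && end=$((size - 1))
      curl -s -H "X-Auth-Token: $token" -r "$off-$end" "$url" >> bigfile
    done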
fest, I'll need to take a look
> at the code. Could such a sequence happen with the client that you're using
> to upload?
>
> Yehuda
>
Some context. I have a small cluster running ubuntu 14.04 and giant (now
hammer). I ran some updates and everything was fine. Rebooted a node and a
drive must have failed as it no longer shows up.
I use --dmcrypt with ceph deploy and 5 osds per ssd journal. To do this I
created the ssd partit
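Roughly, the idea is to carve one partition per journal on the SSD and then point ceph-deploy at each data-disk:journal pair; a sketch only, with device names, sizes, and the host name as placeholders::
    for i in 1 2 3 4 5; do sgdisk --new=${i}:0:+20G /dev/sdX; done    # 5 journal partitions on the SSD
    ceph-deploy osd prepare --dmcrypt node1:/dev/sdb:/dev/sdX1        # data disk plus its journal partition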
Hello Yall!
I can't figure out why my gateways are performing so poorly and I am not
sure where to start looking. My RBD mounts seem to be performing fine
(over 300 MB/s) while uploading a 5G file to Swift/S3 takes 2m32s
(about 32 MB/s, I believe). If we try a 1G file it's closer to 8 MB/s. Testing
with nu
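For a raw-cluster baseline to compare the gateway numbers against, rados bench on a pool is a quick check (the pool name is just an example)::
    rados bench -p rbd 60 write -t 32 --no-cleanup   # 60s write test, 32 concurrent ops
    rados bench -p rbd 60 seq -t 32                  # sequential read of the objects just written
    rados -p rbd cleanup                             # remove the benchmark objects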
t? Examining the
> OSDs to see if they're behaving differently on the different requests
> is one angle of attack. The other is to look into whether the RGW daemons
> are hitting throttler limits or something that the RBD clients aren't.
> -Greg
> On Thu, Dec 18, 2014 at 7:35 PM Sea
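One way to look at the throttler angle is to pull the gateway's perf counters over its admin socket; the socket name below is only a guess at the usual naming on the gateway host::
    ceph daemon /var/run/ceph/ceph-client.rgw.gateway1.asok perf dump | grep -i -A 5 throttle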
third gateway was made last minute to test and rule out the hardware.
On December 18, 2014 10:57:41 PM Christian Balzer wrote:
Hello,
Nice cluster, I wouldn't mind getting my hands on her ample nacelles, er,
wrong movie. ^o^
On Thu, 18 Dec 2014 21:35:36 -0600 Sean Sullivan wrote:
> Hell
Wow Christian,
Sorry I missed these inline replies. Give me a minute to gather some data.
Thanks a million for the in depth responses!
I thought about RAIDing it but I needed the space, unfortunately. I had a
3x60 osd node test cluster that we tried before this and it didn't have
this floppi
lzer wrote:
Hello,
On Thu, 18 Dec 2014 23:45:57 -0600 Sean Sullivan wrote:
Wow Christian,
Sorry I missed these inline replies. Give me a minute to gather some
data. Thanks a million for the in depth responses!
No worries.
I thought about RAIDing it but I needed the space, unfortunately. I had
/q5E6JjkG
On 12/19/2014 08:10 PM, Christian Balzer wrote
> Hello Sean,
>
> On Fri, 19 Dec 2014 02:47:41 -0600 Sean Sullivan wrote:
>
>> Hello Christian,
>>
>> Thanks again for all of your help! I started a bonnie test using the
>> following::
>> bonnie -
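A bonnie++ run of roughly this shape is what I'd expect here; the mount point and size are examples, and -s should be about twice the machine's RAM so the page cache doesn't hide the result::
    bonnie++ -d /mnt/cephtest -s 64g -n 0 -m $(hostname) -u root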
014 at 2:57 PM, Sean Sullivan
> wrote:
>
> Thanks Craig!
>
> I think that this may very well be my issue with osds dropping out
> but I am still not certain as I had the cluster up for a small
> period while running rado
I am trying to understand these drive throttle markers that were
mentioned, to get an idea of why these drives are marked as slow::
here is the iostat of the drive /dev/sdbm
http://paste.ubuntu.com/9607168/
an IO wait of 0.79 doesn't seem bad but a write wait of 21.52 seems
really high. Looking
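For reference, the per-device waits come from extended iostat output; something like this keeps printing them every 5 seconds::
    iostat -xmt 5 /dev/sdbm    # watch r_await / w_await / %util over time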
So we recently had a power outage and I seem to have lost 2 of 3 of my
monitors. I have since copied /var/lib/ceph/mon/ceph-$(hostname){,.BAK} and
then created a new cephfs and finally generated a new filesystem via
''' sudo ceph-mon -i {mon-id} --mkfs --monmap {tmp}/{map-filename}
--keyring {tmp}
So our datacenter lost power and 2/3 of our monitors died with FS
corruption. I tried fixing it but it looks like the store.db didn't make
it.
I copied the working journal via
1. sudo mv /var/lib/ceph/mon/ceph-$(hostname){,.BAK}
2. sudo ceph-mon -i {mon-id} --mkfs --monmap {tmp}/
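Assuming the surviving monitor's store is still healthy, the usual shape of that recovery is to pull the monmap from the survivor and mkfs the dead ones against it; a sketch only, ids and paths are placeholders::
    ceph-mon -i <surviving-mon-id> --extract-monmap /tmp/monmap    # run on the healthy monitor while it is stopped
    mv /var/lib/ceph/mon/ceph-$(hostname){,.BAK}                   # on the dead monitor host
    ceph-mon -i $(hostname) --mkfs --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring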
I think it just got worse::
all three monitors on my other cluster say that ceph-mon can't open
/var/lib/ceph/mon/$(hostname). Is there any way to recover if you lose all
3 monitors? I saw a post by Sage saying that the data can be recovered as
all of the data is held on other servers. Is this pos
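If that post is the monitor-store-from-OSDs recovery, later releases can scrape the maps back out of the OSDs with ceph-objectstore-tool; roughly, with the OSDs stopped and all paths as placeholders::
    for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path "$osd" --op update-mon-db --mon-store-path /tmp/mon-store
    done
    # the accumulated /tmp/mon-store is then rebuilt into a monitor store
    # (e.g. with ceph-monstore-tool rebuild) before being copied into place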
previous osd maps are there.
I just don't understand what key/values I need inside.
On Aug 11, 2016 1:33 AM, "Wido den Hollander" wrote:
>
> > Op 11 augustus 2016 om 0:10 schreef Sean Sullivan <
> seapasu...@uchicago.edu>:
> >
> >
> > I think i
hat lost all 3 monitors in a power failure? If I am going
down the right path is there any advice on how I can assemble/repair the
database?
I see that there is an RBD recovery-from-a-dead-cluster tool. Is it possible
to do the same with S3 objects?
On Thu, Aug 11, 2016 at 11:15 AM, Wido den Hollander w
other?
All should have the same keys/values although constructed differently
right? I can't blindly copy /var/lib/ceph/mon/ceph-$(hostname)/store.db/
from one host to another right? But can I copy the keys/values from one to
another?
On Fri, Aug 12, 2016 at 12:45 PM, Sean Sullivan
wrote:
>
log_file
--- end dump of recent events ---
Aborted (core dumped)
---
---
I feel like I am so close yet so far. Can anyone give me a nudge as to what
I can do next? It looks like it is bombing out on trying to get an updated
p
We have a hammer cluster that experienced a similar power failure and ended
up corrupting our monitors' leveldb stores. I am still trying to repair ours
but I can give you a few tips that seem to help.
1.) I would copy the database off to somewhere safe right away. Just
opening it seems to change
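For that first tip, a plain copy with the monitor stopped is enough; something like::
    cp -a /var/lib/ceph/mon/ceph-$(hostname) /root/mon-backup-$(date +%F)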
I am trying to install Ceph luminous (ceph version 12.2.1) on 4 ubuntu
16.04 servers each with 74 disks, 60 of which are HGST 7200rpm sas drives::
HGST HUS724040AL sdbv sas
root@kg15-2:~# lsblk --output MODEL,KNAME,TRAN | grep HGST | wc -l
60
I am trying to deploy them all with ::
a line like th
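Something along these lines would loop over those drives; a sketch only, with the device selection and prepare flags as examples (the model name spans two lsblk columns, hence KNAME first)::
    for dev in $(lsblk --noheadings --output KNAME,MODEL | awk '/HGST/ {print $1}'); do
      ceph-disk -v prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys \
        --bluestore --cluster ceph -- "/dev/$dev"
    done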
I have tried using ceph-disk directly and I'm running into all sorts of
trouble but I'm trying my best. Currently I am using the following cobbled
script which seems to be working:
https://github.com/seapasulli/CephScripts/blob/master/provision_storage.sh
I'm at 11 right now. I hope this works.
I am trying to stand up ceph (luminous) on three 72-disk Supermicro servers
running ubuntu 16.04 with HWE enabled (for a 4.10 kernel for cephfs). I am
not sure how this is possible but even though I am running the following
line to wipe all disks of their partitions, once I run ceph-disk to
partition t
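A wipe along these lines is what I'd expect to clear old partitions; ceph-disk has a zap subcommand, or it can be done by hand (/dev/sdX is a placeholder)::
    ceph-disk zap /dev/sdX
    # or, manually:
    sgdisk --zap-all /dev/sdX
    wipefs --all /dev/sdX
    dd if=/dev/zero of=/dev/sdX bs=1M count=100 oflag=direct
    partprobe /dev/sdX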
On freshly installed ubuntu 16.04 servers with the HWE kernel selected
(4.10), I cannot use ceph-deploy or ceph-disk to provision OSDs.
Whenever I try I get the following::
ceph-disk -v prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys
--bluestore --cluster ceph --fs-type xfs -- /dev/s
I was curious if anyone has filled ceph storage beyond 75%. Admittedly we
lost a single host due to power failure and are down 1 host until the
replacement parts arrive, but outside of that I am seeing disparity between
the most and least full OSDs::
ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR
M
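For looking at the spread, ceph osd df plus reweight-by-utilization is the usual lever; the 110 threshold is only an example, and newer releases also have a dry-run variant::
    ceph osd df tree
    ceph osd test-reweight-by-utilization 110    # dry run, where available
    ceph osd reweight-by-utilization 110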
sure I've been hesitant. I'll give it another shot in a test instance
and see how it goes.
Thanks for your help as always Mr. Balzer.
On Aug 28, 2016 8:59 PM, "Christian Balzer" wrote:
>
> Hello,
>
> On Sun, 28 Aug 2016 14:34:25 -0500 Sean Sullivan wrote:
>
>
I was creating a new user and mount point. On another hardware node I
mounted CephFS as admin to mount as root. I created /aufstest and then
unmounted. From there it seems that both of my mds nodes crashed for some
reason and I can't start them any more.
https://pastebin.com/1ZgkL9fa -- my mds log
4:32 PM, Sean Sullivan wrote:
> I was creating a new user and mount point. On another hardware node I
> mounted CephFS as admin to mount as root. I created /aufstest and then
> unmounted. From there it seems that both of my mds nodes crashed for some
> reason and I can't st
can't seem to get them to start again.
On Mon, Apr 30, 2018 at 5:06 PM, Sean Sullivan wrote:
> I had 2 MDS servers (one active one standby) and both were down. I took a
> dumb chance and marked the active as down (it said it was up but laggy).
> Then started the primary again and now
r 30, 2018 at 7:24 PM, Sean Sullivan wrote:
> So I think I can reliably reproduce this crash from a ceph client.
>
> ```
> root@kh08-8:~# ceph -s
> cluster:
> id: 9f58ee5a-7c5d-4d68-81ee-debe16322544
> health: HEALTH_OK
>
> services:
> mon: 3 dae
, May 1, 2018 at 12:09 AM, Patrick Donnelly
wrote:
> Hello Sean,
>
> On Mon, Apr 30, 2018 at 2:32 PM, Sean Sullivan
> wrote:
> > I was creating a new user and mount point. On another hardware node I
> > mounted CephFS as admin to mount as root. I created /aufstest and then
>
arge (one 4.1G and the other 200MB)
>
> kh10-8 (200MB) mds log -- https://griffin-objstore.opensciencedatacloud.org/logs/ceph-mds.kh10-8.log
> kh09-8 (4.1GB) mds log -- https://griffin-objstore.opensciencedatacloud.org/logs/ceph-mds.kh09-8.log
>
> On Tue, May 1, 2018 at 12:0
iable is a different type or a missing delimiter. womp. I am
definitely out of my depth but now is a great time to learn! Can anyone
shed some more light as to what may be wrong?
On Fri, May 4, 2018 at 7:49 PM, Yan, Zheng wrote:
> On Wed, May 2, 2018 at 7:19 AM, Sean Sullivan wrote:
> >
I am sorry for posting this if this has been addressed already. I am not
sure how to search through old ceph-users mailing list posts. I used to
use gmane.org but that seems to be down.
My setup::
I have a moderate ceph cluster (ceph hammer 0.94.9
- fe6d859066244b97b24f09d46552afc2071e6f90). Th
reshold)
max_recent 500
max_new 1000
log_file
--- end dump of recent events ---
Segmentation fault (core dumped)
--
I have tried copying my monitor and admin keyring into the admin.keyring
used to try to r
I have a hammer cluster that died a bit ago (hammer 0.94.9) consisting of 3
monitors and 630 osds spread across 21 storage hosts. The cluster's monitors
all died due to leveldb corruption and the cluster was shut down. I was
finally given word that I could try to revive the cluster this week!
https:/
Hi Ben!
I'm using ubuntu 14.04
I have restarted the gateways with the numthreads line you suggested. I
hope this helps. I would think I would get some kind of throttle log or
something.
500 seems really strange as well. Do you have a thread for this? RGW still
has a weird race condition w
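For the archives, the thread setting lives on the civetweb frontend line in ceph.conf; something like the following, with the section name and value only as examples::
    [client.radosgw.gateway]
    rgw frontends = civetweb port=7480 num_threads=512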