Now I have also discovered that, by mistake, someone has put production
data on a virtual machine of the cluster. I need Ceph to resume I/O so I
can boot that virtual machine.
Can I mark the incomplete PGs as valid?
If needed, where can I buy some paid support?
Thanks again,
Mario
On Wed
Hi Oliver,
This is my problem:
I have deployed Ceph AIO with two interfaces, 192.168.1.67 and 10.0.0.67, but
at the moment of installation I used 192.168.1.67, and I have an OpenStack
installed with two interfaces, 192.168.1.65 and 10.0.0.65.
OpenStack has its storage in Ceph but it is working on 192
Hi,
if you need fast access to your remaining data you can use
ceph-objectstore-tool to mark those PGs as complete; however, this will
irreversibly lose the missing data.
If you understand the risks, this procedure is explained pretty well here:
http://ceph.com/community/incomplete-pgs-oh-my/
Sinc
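For illustration only, the mark-complete path looks roughly like the following
(a hedged sketch: osd.N, the data/journal paths and the PG id 6.263 are
placeholders, the mark-complete op has to exist in your ceph-objectstore-tool
build, and anything Ceph considers missing in that PG is gone for good):

systemctl stop ceph-osd@N
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N \
  --journal-path /var/lib/ceph/osd/ceph-N/journal --pgid 6.263 --op info
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N \
  --journal-path /var/lib/ceph/osd/ceph-N/journal --pgid 6.263 --op mark-complete
systemctl start ceph-osd@N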
I have read the post "incomplete pgs, oh my" many times.
I think my case is different.
The broken disk is completely broken.
So how can I simply mark the incomplete PGs as complete?
Should I stop Ceph first?
On Wed, 29 Jun 2016 at 09:36, Tomasz Kuzemko <
tomasz.kuze...@corp.ovh.com> wrote:
I have searched Google and I see that there is no official procedure.
On Wed, 29 Jun 2016 at 09:43, Mario Giammarco <
mgiamma...@gmail.com> wrote:
> I have read the post "incomplete pgs, oh my" many times.
> I think my case is different.
> The broken disk is completely broken.
> So
As far as I know there isn't, which is a shame. We have covered a
situation like this in our dev environment to be ready for it in
production, and it worked; however, be aware that the data that Ceph
believes is missing will be lost after you mark a PG complete.
In your situation I would find OSD wh
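A hedged sketch of how to find the incomplete PGs and the OSDs that currently
hold them (6.263 is just the PG id that appears later in this thread):

ceph health detail | grep incomplete
ceph pg dump_stuck inactive
ceph pg 6.263 query

The query output lists the up/acting OSD sets and, in the recovery_state
section, hints such as "blocked_by" or "down_osds_we_would_probe".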
Hi Mario,
in my opinion you should
1. fix the "too many PGs per OSD (307 > max 300)" warning (see the note
below this list)
2. stop scrubbing / deep scrubbing
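For point 1, the PG count of a pool cannot be reduced, so the realistic options
are adding OSDs or, as a stopgap, raising the warning threshold. A hedged
sketch (the option name may differ between releases):

ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 400'

and the same setting in the [mon] or [global] section of ceph.conf to make it
persistent across restarts.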
--
What does your current
ceph osd tree
look like?
--
Mit freundlichen Gruessen / Best regards
Oliver Dzombic
IP-Interactive
mailto:i...@ip-interactive.de
Anschrift:
Hello,
On Wed, 29 Jun 2016 06:02:59 + Mario Giammarco wrote:
> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
^
And that's the root cause of all your woes.
The default replication size is 3 for a reason and while I do run pools
with repli
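Once the cluster is healthy again, the pool can be moved back towards the
defaults at runtime; a hedged sketch using the rbd pool from the dump above
(expect recovery traffic while the extra replicas are created):

ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2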
Thank you for your reply, so I can add my experience:
1) the other time this happened to me I had a cluster with min_size=2
and size=3 and the problem was the same. That time I set min_size=1 to
recover the pool but it did not help. So I do not understand where the
advantage is in putting three
Just losing one disk doesn't automagically delete it from CRUSH, but in the
output you had 10 disks listed, so there must be something else going on - did you
delete the disk from the crush map as well?
Ceph waits 300 secs by default, AFAIK, to mark an OSD out; after that it will start to
recover.
> On
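That 300 s figure matches the default of mon_osd_down_out_interval; a hedged
way to check or raise it at runtime (the mon id and the new value are examples):

ceph daemon mon.<id> config get mon_osd_down_out_interval
ceph tell mon.* injectargs '--mon_osd_down_out_interval 600'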
Yes, I removed it from CRUSH because it was broken. I waited 24
hours to see if Ceph would heal itself. Then I removed the disk
completely (it was broken...) and I waited 24 hours again. Then I started
getting worried.
Are you saying that I should not remove a broken disk fro
Hi,
removing ONE disk while your replication is 2 is no problem.
You don't need to wait a single second to replace or remove it. It is
not used anyway and is out/down, so from Ceph's point of view it does not exist.
But as Christian told you already, what we see now fits a szenari
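For reference, the usual removal sequence for a dead OSD looks roughly like
this (a hedged sketch, with N standing for the id of the broken OSD):

ceph osd out N
ceph osd crush remove osd.N
ceph auth del osd.N
ceph osd rm N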
In fact I am worried because:
1) Ceph is under Proxmox, and Proxmox may decide to reboot a server if it
is not responding
2) probably a server was rebooted while Ceph was reconstructing
3) even using max=3 does not help
Anyway, this is the "unofficial" procedure that I am using, much simpler
than blo
xiaoxi chen writes:
>
> Hmm, I asked in the ML some days ago :) Likely you hit the kernel bug
fixed by commit 5e804ac482 "ceph: don't invalidate page cache when
inode is no longer used". This fix is in 4.4 but not in 4.2. I haven't had a
chance to play with 4.4; it would be great i
Dear ceph-users,
Are there any expressions / calculators available to estimate the
maximum expected random write IOPS of a Ceph cluster?
To my understanding of Ceph I/O, this should be something like
MAXIOPS = (1-OVERHEAD) * OSD_BACKENDSTORAGE_IOPS * NUM_OSD /
REPLICA_COUNT
So the questio
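As a purely illustrative, hedged plug-in of numbers (none of these figures come
from a real cluster): with 30 OSDs that each sustain about 150 write IOPS, a
replica count of 3 and an assumed 50% overhead for journaling and metadata,
MAXIOPS = 0.5 * 150 * 30 / 3 = 750 random write IOPS.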
Hi Alex/Stefan,
I'm in the middle of testing 4.7rc5 on our test cluster to confirm
once and for all this particular issue has been completely resolved by
Peter's recent patch to sched/fair.c referred to by Stefan above. For
us anyway the patches that Stefan applied did not solve the issue and
neit
Now the problem is that Ceph has put two disks out because scrubbing has
failed (I think it is not a disk fault but due to the mark-complete).
How can I:
- disable scrubbing
- put the two disks back in
I will wait for the end of recovery anyway, to be sure it really works again.
On Wed, 29 Jun 2016 at
Hi,
to be precise, I have far more patches applied to the sched part of the
kernel (around 20). So maybe that's the reason why it helps for me.
Could you please post a complete stack trace? Qemu / KVM also triggers this.
Stefan
On 29.06.2016 at 11:41, Campbell Steven wrote:
> Hi Alex/Stefan,
>
>
hi,
ceph osd set noscrub
ceph osd set nodeep-scrub
ceph osd in <osd-id>
--
Mit freundlichen Gruessen / Best regards
Oliver Dzombic
IP-Interactive
mailto:i...@ip-interactive.de
Anschrift:
IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen
HRB 93402 beim Amtsgericht Hanau
Thanks,
I can put the OSDs in but they do not stay in, and I am pretty sure they are not
broken.
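If an OSD keeps dropping out, the usual hedged checks are whether the daemon
actually stays up and what its log says at the moment it is marked down (N is
the OSD id; service names and paths may differ on Proxmox):

systemctl status ceph-osd@N
tail -n 200 /var/log/ceph/ceph-osd.N.log
ceph osd set noout    (stops down OSDs from being marked out while debugging)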
On Wed, 29 Jun 2016 at 12:07, Oliver Dzombic <
i...@ip-interactive.de> wrote:
> hi,
>
> ceph osd set noscrub
> ceph osd set nodeep-scrub
>
> ceph osd in <osd-id>
>
>
> --
> Mit freundlichen Gruesse
Hi,
again:
You >must< check all your logs ( as fucky as it is for sure ).
That means: on the Ceph nodes, in /var/log/ceph/*
And go back to the time when things went downhill.
There must be something else going on, beyond a normal OSD crash.
And your manual pg repair/pg remove/pg set complete is,
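A hedged example of that kind of log sweep, run on every Ceph node:

grep -iE 'err|fail|abort|assert' /var/log/ceph/ceph-osd.*.log
grep -i 'wrongly marked me down' /var/log/ceph/ceph-osd.*.log
grep -i 'slow request' /var/log/ceph/ceph.log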
Just one question: why, when Ceph has some incomplete PGs, does it refuse to do
I/O on good PGs?
On Wed, 29 Jun 2016 at 12:55, Oliver Dzombic <
i...@ip-interactive.de> wrote:
> Hi,
>
> again:
>
> You >must< check all your logs ( as fucky as it is for sure ).
>
> Means on the ceph nodes
Hi,
it does not.
But in your case you have 10 OSDs, and 7 of them have incomplete PGs.
So since your Proxmox VPSs are not on a single PG but spread across
many PGs, you have a good chance that at least some data of any VPS is
on one of the defective PGs.
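A hedged way to see that spreading, using the object-name convention of RBD
images (pool, image and prefix below are placeholders):

rbd info rbd/myvm-disk1        (note the block_name_prefix, e.g. rbd_data.1234abcd)
ceph osd map rbd rbd_data.1234abcd.0000000000000000

Each data object of the image maps to its own PG, so one VM disk touches many
PGs and therefore many OSDs.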
--
Mit freundlichen Gruessen / Best reg
Hi,
On 29/06/2016 at 12:00, Mario Giammarco wrote:
> Now the problem is that ceph has put out two disks because scrub has
> failed (I think it is not a disk fault but due to mark-complete)
There is something odd going on. I've only seen deep-scrub failing (i.e.
detecting one inconsistency and marking
This time, at the end of the recovery procedure you described, it ended with
most PGs active+clean and 20 PGs incomplete.
After that, when trying to use the cluster, I got "request blocked more than"
warnings and no VM can start.
I know that something happened after the broken disk, probably a server
reboot. I am inv
> On 28.06.2016 at 09:43, Lionel Bouton
> wrote:
>
> Hi,
>
>> On 28/06/2016 at 08:34, Stefan Priebe - Profihost AG wrote:
>> [...]
>> Yes but at least BTRFS is still not working for ceph due to
>> fragmentation. I've even tested a 4.6 kernel a few weeks ago. But it
>> doubles its I/O after a
Greetings,
I have a lab cluster running Hammer 0.94.6 and being used exclusively for
object storage. The cluster consists of four servers running 60 6TB OSDs
each. The main .rgw.buckets pool is using k=3 m=1 erasure coding and
contains 8192 placement groups.
Last week, one of our guys out-ed an
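For context, a pool with that layout is typically created along these lines (a
hedged sketch with a placeholder profile name; the actual cluster may have been
set up differently):

ceph osd erasure-code-profile set ec-3-1 k=3 m=1
ceph osd pool create .rgw.buckets 8192 8192 erasure ec-3-1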
Hi,
On 29/06/2016 at 18:33, Stefan Priebe - Profihost AG wrote:
>> On 28.06.2016 at 09:43, Lionel Bouton
>> wrote:
>>
>> Hi,
>>
>> On 28/06/2016 at 08:34, Stefan Priebe - Profihost AG wrote:
>>> [...]
>>> Yes but at least BTRFS is still not working for ceph due to
>>> fragmentation. I've even t
Hi all,
Is there anyone using rbd for XenServer VM storage? I have XenServer 7 and the
latest Ceph, and I am looking for the best way to mount the rbd volume under
XenServer. There is not much recent info out there that I have found, except for
this:
http://www.mad-hacking.net/documentation/linux/h
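In the generic, non-XenServer-specific case a kernel-mapped RBD looks roughly
like this (a hedged sketch; pool and image names are placeholders and the dom0
kernel needs the rbd module):

rbd create rbd/xen-disk1 --size 102400
rbd map rbd/xen-disk1

The image then appears as /dev/rbd0 (or /dev/rbd/rbd/xen-disk1) and can be
handed to whatever storage repository type XenServer accepts for a block device.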
I am starting to work with and benchmark our Ceph cluster. While
throughput so far looks good, metadata performance so far looks to
be suffering. Is there anything that can be done to speed up the
response time of looking through a lot of small files and folders?
Right now, I am running
On Wednesday, June 29, 2016, Mike Jacobacci wrote:
> Hi all,
>
> Is there anyone using rbd for xenserver vm storage? I have XenServer 7
> and the latest Ceph, I am looking for the the best way to mount the rbd
> volume under XenServer. There is not much recent info out there I have
> found exce
On Thu, Jun 30, 2016 at 3:22 AM, Brian Felton wrote:
> Greetings,
>
> I have a lab cluster running Hammer 0.94.6 and being used exclusively for
> object storage. The cluster consists of four servers running 60 6TB OSDs
> each. The main .rgw.buckets pool is using k=3 m=1 erasure coding and
> cont
Hello, everyone.
When I want to modify the access_key using the following command:
radosgw-admin user modify --uid=user --access_key="userak"
I got:
{
"user_id": "user",
"display_name": "User name",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": []
were only appearing in osd.56
logs but not in others.
# cat ceph-osd.56.log-20160629 | grep -Hn 'ERR'
(standard input):8569:2016-06-29 08:09:50.952397 7fd023322700 -1
log_channel(cluster) log [ERR] : scrub 6.263
6:c645f18e:::12a343d.:head on disk size (1836) does not mat
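For a scrub error of that kind the usual hedged first step is to let Ceph try
to repair the PG from the other copy; with size 2 there is no majority, so
check first which replica holds the correct object:

ceph pg repair 6.263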
errors
> crush map has legacy tunables (require bobtail, min is firefly); see
> http://ceph.com/docs/master/rados/operations/crush-map/#tunables
>
> We have started by looking to pg 6.263. Errors were only appearing in
> osd.56 logs but not in others.
>
> # cat ceph-osd.56.log-
": "6.263",
- "last_update": "1005'2273061",
-"last_complete": "1005'2273061",
-"log_tail": "1005'227",
-"last_user_version": 2273061,
+"last_update&quo
[],
> "pushing": []
> }
> },
> "scrub": {
> "scrubber.epoch_start": "995",
> "scrubber.active": 0,
> "scrubber.st
"1005'2273061",
-"last_complete": "1005'2273061",
-"log_tail": "1005'227",
-"last_user_version": 2273061,
+"last_update": "1005'2273745",
+"last_comp
I've had two OSDs fail and I'm pretty sure they won't recover from
this. I'm looking for help trying to get them back online if
possible...
terminate called after throwing an instance of 'ceph::buffer::malformed_input'
what(): buffer::malformed_input: bad checksum on pg_log_entry_t
- I'm having
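When an OSD dies on a corrupt pg_log entry like that, one hedged avenue before
giving up on it is to see what ceph-objectstore-tool can still read while the
daemon is stopped (paths, the osd id N and the PG id X.YY are placeholders):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N \
  --journal-path /var/lib/ceph/osd/ceph-N/journal --op list-pgs
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N \
  --journal-path /var/lib/ceph/osd/ceph-N/journal \
  --pgid X.YY --op export --file /root/X.YY.export

An exported PG can later be imported into another OSD with --op import, though
the export may trip over the same checksum error.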
Hey all,
I am interested in running Ceph in Docker containers. This is extremely
attractive given the recent integration of Swarm into the Docker engine,
making it really easy to set up a Docker cluster.
When running Ceph in Docker, should monitors, radosgw and OSDs all be on
separate physic
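For a rough idea of what a containerized monitor looks like with the ceph/daemon
image from the ceph-docker project (a hedged sketch; image tag and environment
variables are from memory and may differ):

docker run -d --net=host \
  -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph \
  -e MON_IP=192.168.1.10 -e CEPH_PUBLIC_NETWORK=192.168.1.0/24 \
  ceph/daemon mon

OSD containers generally need --net=host plus --privileged (or the disks passed
through explicitly), which is one reason many people keep OSDs on dedicated hosts.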
Last two questions:
1) I have used other systems in the past. In case of split brain or serious
problems they let me choose which copy is "good" and then work
again. Is there a way to tell Ceph that all is OK? This morning I again have
19 incomplete PGs after recovery.
2) Where can I find pai
kend": {
> "pull_from_peer": [],
> "pushing": []
> }
> },
> "scrub": {
> "scrubber.epoch_start": "995",
> "scrubb