Hello,
Re-cap of my new test and staging cluster:
4 nodes running latest Hammer under Debian Jessie (with sysvinit, kernel
4.6) and manually created OSDs.
Infiniband (IPoIB) QDR (40Gb/s, about 30Gb/s effective) between all nodes.
2 HDD OSD nodes with 32GB RAM, fast enough CPU (E5-2620 v3), 2x 2
Dear all,
I would like your help with an emergency issue but first let me
describe our environment.
Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON
nodes (2 of them are the OSD nodes as well), all with Ceph version 0.80.9
(b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
Thi
Good day.
My CephFS switched to read-only.
This problem occurred previously on Hammer, but I recreated CephFS, upgraded to
Jewel and the problem was solved, but it reappeared after some time.
ceph.log
2016-08-07 18:11:31.226960 mon.0 192.168.13.100:6789/0 148601 : cluster [INF]
HEALTH_WARN; mds0: MDS in read-
Hi,
On 08.08.2016 09:58, Georgios Dimitrakakis wrote:
Dear all,
I would like your help with an emergency issue but first let me
describe our environment.
Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON
nodes (2 of them are the OSD nodes as well) all with ceph version
Hello,
I read in a blog article that with the rbd cache enabled, it can affect
data consistency.
Is this true? For better consistency, should we disable the rbd cache?
Thanks.
--
Ops Cloud
o...@19cloud.net
That will be down to the pool the RBD was in; the CRUSH rule for that pool
will dictate which OSDs store objects. In a standard config that RBD will
likely have objects on every OSD in your cluster.
On 8 Aug 2016 9:51 a.m., "Georgios Dimitrakakis"
wrote:
> Hi,
>>
>>
>> On 08.08.2016 09:58, Geor
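For reference, "ceph osd map" shows exactly where any single object lands; a
minimal sketch, assuming a pool named rbd and a hypothetical object name:

$ ceph osd map rbd rbd_id.myvolume
# prints the osdmap epoch, the PG the object hashes to, and the up/acting
# OSD sets for that PG under the pool's CRUSH rule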
Hi,
On 08.08.2016 10:50, Georgios Dimitrakakis wrote:
Hi,
On 08.08.2016 09:58, Georgios Dimitrakakis wrote:
Dear all,
I would like your help with an emergency issue but first let me
describe our environment.
Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and
3 MON nodes (2
Hi Christian,
Thank you very much for the reply. Please find my comments in-line.
Thanks & Regards,
Manoj
On Sun, Aug 7, 2016 at 3:26 PM, Christian Balzer wrote:
>
> [Reduced to ceph-users, this isn't community related]
>
> Hello,
>
> On Sat, 6 Aug 2016 20:23:41 +0530 Venkata Manojawa Paritala
On Mon, Aug 8, 2016 at 9:26 AM, Dmitriy Lysenko wrote:
> Good day.
>
> My CephFS switched to read-only
> This problem occurred previously on Hammer, but I recreated CephFS, upgraded to
> Jewel and the problem was solved, but it reappeared after some time.
>
> ceph.log
> 2016-08-07 18:11:31.226960 mon.0 192.168
> Op 8 augustus 2016 om 12:49 schreef John Spray :
>
>
> On Mon, Aug 8, 2016 at 9:26 AM, Dmitriy Lysenko wrote:
> > Good day.
> >
> > My CephFS switched to read-only
> > This problem occurred previously on Hammer, but I recreated CephFS, upgraded to
> > Jewel and the problem was solved, but it reappeared af
Dear ceph community,
One of the OSDs in my cluster cannot start due to the
*ERROR: osd init failed: (28) No space left on device*
A while ago it was recommended to manually delete PGs on the OSD to let it
start.
So I am wondering was is the recommended way to fix this issue for the
cluster running Jewel release (10.2.2)?
On Mon, Aug 8, 2016 at 8:01 PM, Mykola Dvornik wrote:
> Dear ceph community,
>
> One of the OSDs in my cluster cannot start due to the
>
> ERROR: osd init failed: (28) No space left on device
>
> A while ago it was recommended to manually delete PGs on the OSD to let it
> start.
Who recommended that?
@Shinobu
According to
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
"If you cannot start an OSD because it is full, you may delete some data by
deleting some placement group directories in the full OSD."
On 8 August 2016 at 13:16, Shinobu Kinjo wrote:
> On Mon, A
So I am wondering ``was`` is the recommended way to fix this issue for
the cluster running Jewel release (10.2.2)?
So I am wondering ``what`` is the recommended way to fix this issue
for the cluster running Jewel release (10.2.2)?
typo?
On Mon, Aug 8, 2016 at 8:19 PM, Mykola Dvornik wrote:
> @
Hi all,
Since last week, some PGs have been going into the inconsistent state after a
scrub error. Last week we had 4 PGs in that state; they were on
different OSDs, but all in the metadata pool.
I did a pg repair on them, and all were healthy again. But now one
PG is inconsistent again,
with health detail
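On Jewel, the inconsistent objects can be listed directly before deciding how
to repair; a quick sketch, with a hypothetical PG id:

$ ceph health detail | grep inconsistent                  # find the affected PG ids
$ rados list-inconsistent-obj 5.3a --format=json-pretty   # Jewel and later; PG id hypothetical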
On Sun, Aug 7, 2016 at 7:57 PM, Alex Gorbachev wrote:
>> I'm confused. How can a 4M discard not free anything? It's either
>> going to hit an entire object or two adjacent objects, truncating the
>> tail of one and zeroing the head of another. Using rbd diff:
>>
>> $ rbd diff test | grep -A 1 2
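One rough way to see what a discard actually freed is to sum the allocated
extents before and after; a sketch, assuming an image named test mapped at
/dev/rbd0 (device and offsets hypothetical):

$ rbd diff test | awk '{sum += $2} END {print sum, "bytes allocated"}'
$ blkdiscard -o 4194304 -l 4194304 /dev/rbd0    # discard exactly one 4M object's worth
$ rbd diff test | awk '{sum += $2} END {print sum, "bytes allocated"}'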
Hi David,
We haven't done any direct giant to jewel comparisons, but I wouldn't
expect a drop that big, even for cached tests. How long are you running
the test for, and how large are the IOs? Did you upgrade anything else
at the same time Ceph was updated?
Mark
On 08/06/2016 03:38 PM, Da
On 08.08.2016 13:51, Wido den Hollander wrote:
>
>> Op 8 augustus 2016 om 12:49 schreef John Spray :
>>
>>
>> On Mon, Aug 8, 2016 at 9:26 AM, Dmitriy Lysenko wrote:
>>> Good day.
>>>
>>> My CephFS switched to read-only
>>> This problem occurred previously on Hammer, but I recreated CephFS, upgraded to
>
I got into this situation several times, due to strange behavior in the
XFS filesystem. I initially ran on Debian, then reinstalled the
nodes with CentOS 7, kernel 3.10.0-229.14.1.el7.x86_64, package
xfsprogs-3.2.1-6.el7.x86_64. At around 75-80% of usage shown by df, the
disk is already full
Hello dear community!
I'm new to Ceph and only recently took up building clusters, so your
opinion is very important to me.
It is necessary to create a cluster with 1.2 PB of storage and very rapid access
to the data. Earlier, disks of the "Intel® SSD DC P3608 Series 1.6TB NVMe PCIe 3.
Hi all,
we are in the process of expanding our cluster and I would like to
know if there are some best practices in doing so.
Our current cluster is composed as follows:
- 195 OSDs (14 Storage Nodes)
- 3 Monitors
- Total capacity 620 TB
- Used 360 TB
We will expand the cluster by another 14 Storage Nodes
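One commonly suggested way to do this (an assumption on my part, not advice
from this thread) is to add the new OSDs at CRUSH weight 0 and ramp them up in
small steps so backfill stays manageable; OSD id, host and weights hypothetical:

$ ceph osd crush add osd.195 0 host=node15    # joins the map, no data moves yet
$ ceph osd crush reweight osd.195 0.5         # small step; wait for HEALTH_OK
$ ceph osd crush reweight osd.195 1.0         # repeat until the target weight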
librbd / QEMU advertise to the guest OS that the disk has writeback
cache enabled so that the guest OS will send any necessary flush
requests to inject write barriers and ensure data consistency. As a
safety precaution, librbd will treat the cache as writethrough until
it receives the first flush request.
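For reference, the client-side settings involved look like this (a sketch; the
values shown are my understanding of the defaults, so check the docs for your
release):

[client]
rbd cache = true                            # enable the librbd cache
rbd cache writethrough until flush = true   # stay writethrough until the guest flushes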
Under Jewel 10.2.2 I have also had to delete PG directories to get very
full OSDs to restart. I first use "du -sh *" under the "current" directory
to find which OSD directories are the fullest on the full OSD disk, and
pick 1 of the fullest. I then look at the PG map and verify the PG is
replicated.
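Roughly, the sequence looks like this (OSD id, paths and PG id all
hypothetical; moving the data aside to a different disk instead of deleting it
keeps a way back):

$ ceph osd set noout                                    # don't rebalance while the OSD is down
$ du -sh /var/lib/ceph/osd/ceph-12/current/* | sort -h | tail
$ ceph pg map 20.3f                                     # confirm healthy copies exist elsewhere
$ mv /var/lib/ceph/osd/ceph-12/current/20.3f_head /mnt/spare-disk/   # move, don't rm
$ service ceph start osd.12                             # or systemctl start ceph-osd@12
$ ceph osd unset noout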
I don't think there's a way of getting the prefix from the cluster at this
point.
If the deleted image was a similar size to the example you've given, you
will likely have had objects on every OSD. If this data is absolutely
critical you need to stop your cluster immediately or make copies of all
Hi, hope someone can help me with this.
Below is my env:
a) CloudStack 4.9 on CentOS 7.2,
b) running on 2 compute nodes and one controller node
c) NFS as shared storage
d) Basic networking
f) I am able to ping hosts and all guests; system VMs and hosts are in
the
same range
Hi,
I don't see how this is Ceph related.
You should ask your question on a CloudStack mailing list/forum/website.
--
Mit freundlichen Gruessen / Best regards
Oliver Dzombic
IP-Interactive
mailto:i...@ip-interactive.de
Address:
IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
6
Dear David (and all),
the data are considered very critical, hence all this effort to
recover them.
Although the cluster hasn't been fully stopped, all user actions have.
I mean services are running but users are not able to read/write/delete.
The deleted image was the exact same size o
Look in the Cinder DB, in the volumes table, to find the UUID of the deleted
volume.
If you go through your OSDs and look at the directories for PG index 20, you
might find some fragments from the deleted volume, but it's a long shot...
> On Aug 8, 2016, at 4:39 PM, Georgios Dimitrakakis
> wrote:
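A hypothetical lookup, assuming the stock Cinder schema on a MySQL backend
(table and column names from the standard schema; the RBD image would have
been named volume-<uuid>):

$ mysql cinder -e "SELECT id, display_name, size, deleted_at FROM volumes \
    WHERE deleted = 1 ORDER BY deleted_at DESC;"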
All RBD images use a backing RADOS object to facilitate mapping
between the external image name and the internal image id. For v1
images this object would be named "<image name>.rbd" and for v2 images
this object would be named "rbd_id.<image name>". You would need to
find this deleted object first in order to start fig
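For an image that still exists, the mapping can be read back with rados; a
sketch with a hypothetical image name, to show what the object looks like:

$ rados -p rbd ls | grep '^rbd_id\.'          # list the v2 name-to-id objects
$ rados -p rbd get rbd_id.myvolume /tmp/id    # the payload encodes the internal id
$ strings /tmp/id                             # that id is the prefix of the rbd_data.<id>.* objects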
On Mon, Aug 8, 2016 at 5:39 PM, Jason Dillaman wrote:
> Unfortunately, for v2 RBD images, this image name to image id mapping
> is stored in the LevelDB database within the OSDs and I don't know,
> offhand, how to attempt to recover deleted values from there.
Actually, to correct myself, the "rbd
On Thu, Aug 4, 2016 at 8:57 PM, Goncalo Borges
wrote:
> Dear cephers...
>
> I am looking for some advice on migrating from legacy tunables to Jewel
> tunables.
>
> What would be the best strategy?
>
> 1) A step by step approach?
> - starting with the transition from bobtail to firefly (and, in
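For reference, the step-by-step variant would be something like the following
(each step triggers data movement, so wait for HEALTH_OK between them; a
sketch, not a recommendation):

$ ceph osd crush tunables firefly
$ ceph osd crush tunables hammer
$ ceph osd crush tunables jewel    # equivalent to "optimal" on a Jewel cluster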
Thanks for replying Greg.
I am trying to figure out what parameters I should tune to mitigate the
impact of the data movement. For now, I've set
osd max backfills = 1
Are there others you think we should set?
What do you reckon?
Cheers
Goncalo
On 08/09/2016 09:26 AM, Gregory Farnum wrote:
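For what it's worth, the throttles people commonly set alongside it look like
this (values are the conservative low end, an assumption rather than advice
from this thread):

[osd]
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 1
osd client op priority = 63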
On Mon, Aug 8, 2016 at 5:14 PM, Goncalo Borges
wrote:
> Thanks for replying Greg.
>
> I am trying to figure out what parameters I should tune to mitigate the
> impact of the data movement. For now, I've set
>
>osd max backfills = 1
>
> Are there others you think we should set?
>
> What do you
Hello,
On Mon, 08 Aug 2016 17:39:07 +0300 Александр Пивушков wrote:
>
> Hello dear community!
> I'm new to Ceph and only recently took up building clusters, so your
> opinion is very important to me.
> It is necessary to create a cluster with 1.2 PB of storage and very rapid
Hi Kenneth...
The previous default behavior of 'ceph pg repair' was to copy the PG
objects from the primary OSD to the others. Not sure if it is still the case
in Jewel. For this reason, once we get this kind of error in a data
pool, the best practice is to compare the md5 checksums of the damaged
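Concretely, something like this (PG id and object name hypothetical; run the
checksum on every host holding a replica before deciding whether the primary's
copy is the good one):

$ ceph pg map 5.3a                    # shows the acting set, e.g. [1,7,12]
$ find /var/lib/ceph/osd/ceph-1/current/5.3a_head -name '*myobject*' -exec md5sum {} \;
$ ceph pg repair 5.3a                 # only once you know the primary's copy is good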
Hi, I am sorry, please ignore this!
Best Regards
*Asanka Gunasekara*
*Asst. Engineering Manager - Systems Infra Services*
*Global IT Infrastructure Services Division*
| Informatics International Limited | 89/57 | Jampettah Lane | Colombo 13 |
Sri Lanka |
| T: +94-115-794-942 (Dir)| F: +94-112-
I am sorry for the inconvenience; I cross-sent two mails in the middle of
the night.
Best Regards
*Asanka Gunasekara*
*Asst. Engineering Manager - Systems Infra Services*
*Global IT Infrastructure Services Division*
| Informatics International Limited | 89/57 | Jampettah Lane | Colombo 13 |