Hi Brian,
The issue started with the mon service failing, and after that both OSDs on
that node failed to start.
The mon service is on an SSD, and the WAL/DB of the OSDs are on that same SSD via LVM.
I have replaced the SSD with a new one and changed the SATA port and cable,
but the problem remains.
All disk te
On Wed, Feb 21, 2018 at 11:56 PM, Oliver Freyermuth
wrote:
> On 21.02.2018 at 15:58, Alfredo Deza wrote:
>> On Wed, Feb 21, 2018 at 9:40 AM, Dan van der Ster
>> wrote:
>>> On Wed, Feb 21, 2018 at 2:24 PM, Alfredo Deza wrote:
On Tue, Feb 20, 2018 at 9:05 PM, Oliver Freyermuth
wrote:
Did you remove and recreate the OSDs that used the SSD for their WAL/DB?
Or did you try to do something to not have to do that? That is an integral
part of the OSD and changing the SSD would destroy the OSDs involved unless
you attempted some sort of dd. If you did that, then any corruption for
t
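For anyone hitting the same thing, the rebuild path looks roughly like the
following (a sketch only; the OSD id, data device and DB LV names are
placeholders, adjust them to your layout):

  # remove the broken OSDs whose WAL/DB lived on the failed SSD
  ceph osd purge 12 --yes-i-really-mean-it
  # recreate them against the new SSD (ceph-volume, Luminous)
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db ssd-vg/db-osd12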
On Thu, Feb 22, 2018 at 06:00:12AM +0000, Robin H. Johnson wrote:
> You need to create a RGW user with the system flag set (it might be
> possible with the newer admin flag as well).
That was the missing piece - thanks very much! I have it working now.
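For the record, roughly what did the trick (a sketch; the uid and display
name are placeholders I picked, the --system flag is the important part):

  radosgw-admin user create --uid=sync-user --display-name="Sync User" --system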
Cheers,
Dave
--
** Dave Holland ** Systems
Hi Oliver,
I also use Infiniband and CephFS for HPC purposes.
My setup:
* 4x Dell R730xd with expansion shelf, 24 OSDs of 8 TB each, 128 GB RAM,
2x 10-core Intel 4th Gen, Mellanox ConnectX-3, no SSD cache
* 7x Dell R630 clients
* Ceph cluster running on Ubuntu Xenial and Ceph Jewel deployed with
Hi Sean and David,
Do you have any follow-ups / news on the Intel DC S4600 case? We are
looking into these drives to use as DB/WAL devices for a new cluster we
are about to build.
Did Intel provide anything (like new firmware) that should fix the issues
you were having, or are these drives still unreliable?
For the recent 12.2.3 Luminous release we've found that it is not
possible to mount a device when using filestore [0].
We are trying to find out how this was able to get past our functional
testing; in the meantime, the workaround for this problem is to add the
mount options in ceph.conf (as explain
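A sketch of that workaround, assuming XFS filestore OSDs (the options below
are the usual defaults; use whatever your OSDs were created with):

  [osd]
  osd mount options xfs = rw,noatime,inode64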
On Wed, Feb 21, 2018 at 7:27 PM, Anthony D'Atri wrote:
>>> I was thinking we might be able to configure/hack rbd mirroring to mirror to
>>> a pool on the same cluster but I gather from the OP and your post that this
>>> is not really possible?
>>
>> No, it's not really possible currently and we ha
On 22/02/2018 at 05:23, Brad Hubbard wrote:
> On Wed, Feb 21, 2018 at 6:40 PM, Yoann Moulin wrote:
>> Hello,
>>
>> I migrated my cluster from jewel to luminous 3 weeks ago (using the ceph-ansible
>> playbook), and a few days later ceph status told me "PG_DAMAGED
>> Possible data damage: 1 pg inconsist
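For reference, the usual first steps for an inconsistent PG look something
like this (the PG id is a placeholder, and repair should only be run once you
understand what is actually inconsistent):

  rados list-inconsistent-obj <pgid> --format=json-pretty
  ceph pg repair <pgid>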
Hi,
Debian Packages for stretch have broken dependencies:
The following packages have unmet dependencies:
 ceph-common : Depends: libleveldb1 but it is not installable
               Depends: libsnappy1 but it is not installable
 ceph-mon : Depends: libleveldb1 but it is not installable
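A quick way to see what (if anything) provides those packages on stretch
(just a diagnostic sketch):

  apt-cache policy libleveldb1 libsnappy1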
On Thu, Feb 22, 2018 at 8:03 AM, Alfredo Deza wrote:
> For the recent 12.2.3 Luminous release we've found that it is not
> possible to mount a device when using filestore [0]
>
> We are trying to find out how this was able to get past our functional
> testing,
Our functional testing was not affec
On 22.02.2018 at 02:54, David Turner wrote:
> You could set the flag noin to prevent the new osds from being calculated by
> crush until you are ready for all of them in the host to be marked in.
> You can also set initial crush weight to 0 for new osds so that they won't
> receive any PGs unti
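A sketch of both approaches (the initial weight setting has to be in
ceph.conf on the new OSD hosts before the OSDs are created):

  # prevent newly created OSDs from being marked "in" automatically
  ceph osd set noin

  # and/or in ceph.conf on the new hosts:
  [osd]
  osd crush initial weight = 0

  # later, bring them in at your own pace:
  ceph osd crush reweight osd.<id> <target-weight>
  ceph osd unset noin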
Hi Caspar,
Sean and I replaced the problematic DC S4600 disks (after all but one had
failed) in our cluster with Samsung SM863a disks.
There was an NDA for new Intel firmware (as mentioned earlier in the thread
by David) but given the problems we were experiencing we moved all Intel
disks to a sin
Hi,
I have a situation with a cluster which was recently upgraded to
Luminous and has a PG mapped to OSDs on the same host.
root@man:~# ceph pg map 1.41
osdmap e21543 pg 1.41 (1.41) -> up [15,7,4] acting [15,7,4]
root@man:~#
root@man:~# ceph osd find 15|jq -r '.crush_location.host'
n02
root@m
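A quick loop over the up set to compare the hosts (same commands as above,
nothing else assumed):

  for id in 15 7 4; do
    echo -n "osd.$id -> "
    ceph osd find $id | jq -r '.crush_location.host'
  done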
Hey Cephers,
Sorry for the short notice, but the Ceph Tech Talk schedule for today
(Thursday, February 22) has been canceled.
Kindest regards,
Leo
--
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team
On Wed, Feb 21, 2018 at 11:17 PM, Daniel Carrasco wrote:
> I also want to find out whether there is any way to cache file metadata on the
> client, to lower the MDS load. I suppose that files are cached, but the client
> checks with the MDS whether the files have changed. On my server the files are
> most of the time r
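If it helps, for the ceph-fuse/libcephfs client the inode/dentry cache size
can be raised (a sketch only; whether this actually reduces MDS round-trips
depends on the capabilities the MDS hands out):

  [client]
  # default is 16384 inodes
  client cache size = 32768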
Cumulative followup to various insightful replies.
I wrote:
>>> No, it's not really possible currently and we have no plans to add
>>> such support since it would not be of any long-term value.
>>
>> The long-term value would be the ability to migrate volumes from, say, a
>> replicated pool to
has anyone tried with the most recent firmwares from intel? i've had a
number of s4600 960gb drives that have been waiting for me to get around to
adding them to a ceph cluster. this as well as having 2 die almost
simultaneously in a different storage box is giving me pause. i noticed
that David li
Hi Mike,
I eventually got hold of a customer relations manager at Intel, but his attitude
was lackluster and Intel never officially responded to any correspondence we
sent them. The Intel S4600 drives all passed our standard burn-in tests; they
exclusively appear to fail once they handle produc
Dear all,
I would like to know if there are additional risks when running CEPH
with "Min Size" equal to "Replicated Size" for a given pool.
What are the drawbacks and what could go wrong in such a scenario?
Best regards,
G.
If min_size == size, a single OSD failure will place your pool read-only.
On 02/22/2018 11:06 PM, Georgios Dimitrakakis wrote:
> Dear all,
>
> I would like to know if there are additional risks when running CEPH
> with "Min Size" equal to "Replicated Size" for a given pool.
>
> What are the drawb
All right! Thank you very much Jack!
The way I understand this is that it's not necessarily a bad thing. I
mean, as long as it doesn't harm any data or cause any other issue.
Unfortunately my scenario consists of only two OSDs, therefore there is
a replication factor of 2 and min_size=1.
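(For reference, the effective values of a pool can be double-checked with the
following; the pool name is a placeholder:)

  ceph osd pool get <pool> size
  ceph osd pool get <pool> min_size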
hrm. intel has, until a year ago, been very good with ssds. the description
of your experience definitely doesn't inspire confidence. intel also
dropping the entire s3xxx and p3xxx series last year before having a viable
replacement has been driving me nuts.
i don't know that i have the luxury of
adding ceph-users back on.
it sounds like the enterprise samsungs and hitachis have been mentioned on
the list as alternatives. i have 2 micron 5200 (pro i think) that i'm
beginning testing on and have some micron 9100 nvme drives to use as
journals. so the enterprise micron might be good. i did t
On Thu, Feb 22, 2018 at 9:29 AM Wido den Hollander wrote:
> Hi,
>
> I have a situation with a cluster which was recently upgraded to
> Luminous and has a PG mapped to OSDs on the same host.
>
> root@man:~# ceph pg map 1.41
> osdmap e21543 pg 1.41 (1.41) -> up [15,7,4] acting [15,7,4]
> root@man:~
was the pg-upmap feature used to force a pg to get mapped to a particular
osd?
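(A quick check, nothing cluster-specific assumed: any upmap exceptions show up
in the osdmap dump.)

  ceph osd dump | grep upmap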
mike
On Thu, Feb 22, 2018 at 10:28 AM, Wido den Hollander wrote:
> Hi,
>
> I have a situation with a cluster which was recently upgraded to Luminous
> and has a PG mapped to OSDs on the same host.
>
> root@man:~# cep
On Wed, Feb 21, 2018 at 2:46 PM Oliver Freyermuth <
freyerm...@physik.uni-bonn.de> wrote:
> Dear Cephalopodians,
>
> in a Luminous 12.2.3 cluster with a pool with:
> - 192 Bluestore OSDs total
> - 6 hosts (32 OSDs per host)
> - 2048 total PGs
> - EC profile k=4, m=2
> - CRUSH failure domain = host
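(For readers following along, a pool like the one described is typically set
up roughly as below; the profile and pool names are placeholders:)

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd pool create ec-data 2048 2048 erasure ec-4-2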
Hi Vadim,
many thanks for these benchmark results!
This indeed looks extremely similar to what we achieve after enabling connected
mode.
Our 6 OSD-hosts are Supermicro systems with 2 HDDs (Raid 1) for the OS, and 32
HDDs (4 TB) + 2 SSDs for the OSDs.
The 2 SSDs have 16 LVM volumes each (whi
On 23.02.2018 at 01:05, Gregory Farnum wrote:
>
>
> On Wed, Feb 21, 2018 at 2:46 PM Oliver Freyermuth
> <freyerm...@physik.uni-bonn.de> wrote:
>
> Dear Cephalopodians,
>
> in a Luminous 12.2.3 cluster with a pool with:
> - 192 Bluestore OSDs total
> - 6 hosts (32 OSDs p
What's the output of "ceph -s" while this is happening?
Is there some identifiable difference between these two states, like you
get a lot of throughput on the data pools but then metadata recovery is
slower?
Are you sure the recovery is actually going slower, or are the individual
ops larger or
On Wed, Feb 21, 2018 at 10:54 AM, Enrico Kern
wrote:
> Hey all,
>
> i would suggest some changes to the ceph auth caps command.
>
> Today i almost fucked up half of one of our openstack regions with i/o
> errors because of user failure.
>
> I tried to add osd blacklist caps to a cinder keyring aft
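For context, the sharp edge here is that 'ceph auth caps' replaces the
entity's entire cap set rather than appending to it, so anything not restated
is lost. A sketch (the pool names are placeholders for a typical cinder
keyring):

  ceph auth get client.cinder        # note the existing caps first
  ceph auth caps client.cinder \
    mon 'allow r, allow command "osd blacklist"' \
    osd 'allow rwx pool=volumes, allow rwx pool=vms'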
The pool will not actually go read-only. All read and write requests will
block until both OSDs are back up. If I were you, I would use min_size=2
and change it to 1 temporarily if needed to do maintenance or
troubleshooting where downtime is not an option.
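A sketch of that procedure (pool name is a placeholder):

  ceph osd pool set <pool> min_size 1
  # ... maintenance ...
  ceph osd pool set <pool> min_size 2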
On Thu, Feb 22, 2018, 5:31 PM Georgios