[ceph-users] Re: Nautilus packaging on stretch

2019-09-03 Thread Tobias Gall
Hello, Luminous was the last build for stretch. https://github.com/ceph/ceph/pull/22602/files There are currently no builds for buster either :( Tobias On 04.09.19 at 03:11, mjclark...@gmail.com wrote: Hello, I'm trying to install nautilus on stretch following the directions here https:

[ceph-users] Re: rgw auth error with self region name

2019-09-03 Thread Wesley Peng
Hi, on 2019/9/4 11:40, 黄明友 wrote: I use the AWS S3 Java SDK; when I make a new bucket with the hostname "s3.my-self.mydomain.com" I get an auth error, but when I use the hostname "s3.us-east-1.mydomian.com" it works. Why? Can both domains be resolved by DNS? rega

[ceph-users] rgw auth error with self region name

2019-09-03 Thread 黄明友
Hi all, I use the AWS S3 Java SDK; when I make a new bucket with the hostname "s3.my-self.mydomain.com" I get an auth error, but when I use the hostname "s3.us-east-1.mydomian.com" it works. Why? 黄明友 (Huang Mingyou), IT Infrastructure Manager, V.Photos Cloud Photography. Mobile: +86 13540630430; customer service: 400-806-5775

[ceph-users] Re: ceph mons stuck in electing state

2019-09-03 Thread huang jun
Can you set debug_mon=20, debug_paxos=20 and debug_ms=1 on all mons and collect the logs? Ashley Merrick wrote on Tue, Sep 3, 2019 at 9:35 PM: > > What change did you make in ceph.conf > > Id check that hasn't caused an issue first. > > > On Tue, 27 Aug 2019 04:37:15 +0800 nkern...@gmail.com wrote > > Hello,
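A minimal sketch of one way to apply those levels on each mon host (assuming the local mon admin sockets are reachable; with mons stuck out of quorum, the admin socket or ceph.conf route is needed rather than "ceph tell", and the mon id shown is a placeholder):

    # per mon host, via the local admin socket (works without quorum)
    ceph daemon mon.<id> config set debug_mon 20/20
    ceph daemon mon.<id> config set debug_paxos 20/20
    ceph daemon mon.<id> config set debug_ms 1/1
    # or persistently in the [mon] section of ceph.conf, then restart the mons:
    #   debug mon = 20
    #   debug paxos = 20
    #   debug ms = 1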

[ceph-users] Nautilus 14.2.3 packages appearing on the mirrors

2019-09-03 Thread Sasha Litvak
Is there an actual release or an accident?

[ceph-users] Nautilus packaging on stretch

2019-09-03 Thread mjclark . 00
Hello, I'm trying to install nautilus on stretch following the directions here https://docs.ceph.com/docs/master/install/get-packages/ . However, it seems the stretch repo only includes ceph-deploy. Are the rest of the packages missing on purpose or have I missed something obvious? Thanks

[ceph-users] Re: Ceph FS not releasing space after file deletion

2019-09-03 Thread Guilherme
Dear CEPHers, Adding some comments to my colleague's post: we are running Mimic 13.2.6 and struggling with 2 issues (that might be related): 1) After a "lack of space" event we've tried to remove a 40TB file. The file is not there anymore, but no space was released. No process is using the file

[ceph-users] Ceph FS not releasing space after file deletion

2019-09-03 Thread Gustavo Tonini
Dear Ceph users, we have been using Ceph since Jewel and now (with Mimic) are facing a problem that prevents us from using it properly. After deleting some big backup files, we noticed that their space has not been returned to the pool's free space. Considering the mentioned pool, du shows that we h
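A hedged way to check whether the deleted data is simply waiting in the MDS stray directories to be purged (the MDS name is a placeholder; counter names can differ slightly between releases):

    # number of stray (deleted but not yet purged) inodes held by the MDS
    ceph daemon mds.<name> perf dump | grep -i stray
    # overall pool usage, to watch whether space is actually being released
    ceph df detail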

[ceph-users] Re: forcing an osd down

2019-09-03 Thread solarflow99
Hi, thanks for the replies. I guess this would also explain why every time an OSD failed it wouldn't stay down, and would add itself back again. On Tue, Sep 3, 2019 at 11:45 AM Frank Schilder wrote: > "ceph osd down" will mark an OSD down once, but not shut it down. Hence, > it will continue to

[ceph-users] Re: MDS blocked ops; kernel: Workqueue: ceph-pg-invalid ceph_invalidate_work [ceph]

2019-09-03 Thread jesper
> Hi, I encountered a problem with blocked MDS operations and a client > becoming unresponsive. I dumped the MDS cache, ops, blocked ops and some > further log information here: > > https://files.dtu.dk/u/peQSOY1kEja35BI5/2010-09-03-mds-blocked-ops?l > > A user of our HPC system was running a job t

[ceph-users] Re: forcing an osd down

2019-09-03 Thread Frank Schilder
"ceph osd down" will mark an OSD down once, but not shut it down. Hence, it will continue to send heartbeats and request to be marked up again after a couple of seconds. To keep it down, there are 2 ways: - either set "ceph osd set noup", - or actually shut the OSD down. The first version will

[ceph-users] Re: ceph fs crashes on simple fio test

2019-09-03 Thread Frank Schilder
Hi Robert and Paul, sad news. I did a 5-second single-thread test after setting osd_op_queue_cut_off=high on all OSDs and MDSs. Here are the current settings: [root@ceph-01 ~]# ceph config show osd.0 NAME VALUE SOURCE OVERRIDES IGNORES bluestore_c
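For reference, a hedged sketch of how such a setting is typically applied and verified via the config database (as suggested by the "ceph config show" output above); note that osd_op_queue_cut_off generally only takes effect after an OSD restart:

    ceph config set osd osd_op_queue_cut_off high
    ceph config show osd.0 | grep osd_op_queue_cut_off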

[ceph-users] Re: TASK_UNINTERRUPTIBLE kernel client threads

2019-09-03 Thread Ilya Dryomov
On Mon, Sep 2, 2019 at 5:39 PM Toby Darling wrote: > > Hi > > We have a couple of RHEL 7.6 (3.10.0-957.21.3.el7.x86_64) clients that > have a number of uninterruptible threads and I'm wondering if we're > looking at the issue fixed by > https://www.spinics.net/lists/ceph-devel/msg45467.html (the f

[ceph-users] Re: Heavily-linked lists.ceph.com pipermail archive now appears to lead to 404s

2019-09-03 Thread Ilya Dryomov
On Tue, Sep 3, 2019 at 6:29 PM Florian Haas wrote: > > Hi, > > replying to my own message here in a shameless attempt to re-up this. I > really hope that the list archive can be resurrected in one way or > another... Adding David, who managed the transition. Thanks, Ilya

[ceph-users] Re: Heavily-linked lists.ceph.com pipermail archive now appears to lead to 404s

2019-09-03 Thread Florian Haas
Hi, replying to my own message here in a shameless attempt to re-up this. I really hope that the list archive can be resurrected in one way or another... Cheers, Florian On 29/08/2019 15:00, Florian Haas wrote: > Hi, > > is there any chance the list admins could copy the pipermail archive > fr

[ceph-users] Re: FileStore OSD, journal direct symlinked, permission troubles.

2019-09-03 Thread Marco Gaiarin
Hello! Alwin Antreich wrote: > > I'm not a ceph expert, but solution iii) seems decent for me, with a > > little overhead (a readlink and a stat for every osd start). > However you like it. But note that in Ceph Nautilus the udev rules > aren't shipped anymore. Ok. I mak

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread Mark Nelson
That was my thought as well.  It would be interesting to see the results of a much longer write test (say 100GB). Mark On 9/3/19 9:40 AM, Fabian Niepelt wrote: Hey, are these drives connected to a RAID controller with a write cache? I've seen lots of weird behaviors with them. You said the

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread Fabian Niepelt
Hey, are these drives connected to a RAID controller with a write cache? I've seen lots of weird behaviors with them. You said the problem persists when rebooting but not when power cycling, which would reinforce a hardware component being the culprit in this case. Greetings, Fabian. On Tuesday, the

[ceph-users] Manual pg repair help

2019-09-03 Thread Marc Roos
Is there no Ceph wiki page with examples of manual repairs with the ceph-objectstore-tool (e.g. where pg repair and pg scrub don't work)? I have been having this issue for quite some time. 2019-09-02 14:17:34.175139 7f9b3f061700 -1 log_channel(cluster) log [ERR] : deep-scrub 17.36 17:6ca1f70a:::rbd
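A hedged sketch of the usual starting points for pg 17.36 (assuming osd.7 is the primary as in the related thread; the OSD data path is a placeholder, and ceph-objectstore-tool must only be run against a stopped OSD):

    # see what the deep-scrub actually found
    rados list-inconsistent-obj 17.36 --format=json-pretty
    # try the built-in repair first
    ceph pg repair 17.36
    # only if that fails: inspect the objects on the (stopped) primary OSD
    systemctl stop ceph-osd@7
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 --pgid 17.36 --op list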

[ceph-users] Fwd: Applications slow in VMs running RBD disks

2019-09-03 Thread Gesiel Galvão Bernardes
Hi, On Thu, 29 Aug 2019 at 22:32, fengyd wrote: > Hi, > > Is the issue still there? > Yes, still. > I ran into an IO performance issue recently and found that the > max fd limit for Qemu/KVM was not big enough; the fds for Qemu/KVM were > exhausted, and the issue was solved after
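A hedged sketch of how the fd ceiling for QEMU/KVM guests is often raised (the value is an example; the exact mechanism depends on the distribution and on whether libvirt manages the guests):

    # check the current limit of a running qemu process
    grep 'open files' /proc/$(pgrep -f qemu | head -1)/limits
    # for libvirt-managed guests, raise max_files in /etc/libvirt/qemu.conf, e.g.:
    #   max_files = 32768
    # then restart libvirtd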

[ceph-users] Re: ceph mons stuck in electing state

2019-09-03 Thread Ashley Merrick
What change did you make in ceph.conf? I'd check that hasn't caused an issue first. On Tue, 27 Aug 2019 04:37:15 +0800 nkern...@gmail.com wrote: Hello, I have an old ceph 0.94.10 cluster that had 10 storage nodes with one extra management node used for running commands on the cluste

[ceph-users] Re: ceph mons stuck in electing state

2019-09-03 Thread Ashley Merrick
What change did you make in your ceph.conf? I'd say it would be a good idea to check and make sure that hasn't caused the issue. Ashley. On Tue, 27 Aug 2019 04:37:15 +0800 nkern...@gmail.com wrote: Hello, I have an old ceph 0.94.10 cluster that had 10 storage nodes with one extra manage

[ceph-users] ceph-volume 'ascii' codec can't decode byte 0xe2

2019-09-03 Thread changcheng.liu
Hi Alfredo Deza, I see your PR in Ceph, PR 23289 "ceph-volume ensure encoded bytes are always used". I ran into a problem when using ceph-deploy to deploy a Ceph cluster. While running the ceph-volume command to create an OSD through ceph-deploy, it always hits the problem below: nstcc1@n
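Byte 0xe2 is the first byte of a UTF-8 multibyte character (e.g. a typographic quote or dash in device output), so a common hedged workaround, besides the fix in that PR, is to run ceph-deploy/ceph-volume under a UTF-8 locale instead of the C/ASCII one (the locale name and device are examples):

    export LANG=en_US.UTF-8
    export LC_ALL=en_US.UTF-8
    ceph-deploy osd create --data /dev/sdb <host>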

[ceph-users] Re: Heavily-linked lists.ceph.com pipermail archive now appears to lead to 404s

2019-09-03 Thread Rich Kulawiec
A better course of action would be to copy the "mbox" file(s) that Mailman uses to archive all messages which pass through any mailing list that it runs, and then use Mailman's "arch" command to regenerate the archives in the new location -- that way all of the links on the archive pages themselves
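For example (hedged; the list name and mbox path are placeholders, and this assumes Mailman 2's bin/ tools are available on the new host):

    # rebuild the pipermail archive from the list's cumulative mbox
    bin/arch --wipe ceph-users /var/lib/mailman/archives/private/ceph-users.mbox/ceph-users.mbox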

[ceph-users] Re: Heavily-linked lists.ceph.com pipermail archive now appears to lead to 404s

2019-09-03 Thread Danny Abukalam
Yes, I'm having the same problem - resorting to Google's cached pages for the time being! br, Danny > On 29 Aug 2019, at 14:00, Florian Haas wrote: > > Hi, > > is there any chance the list admins could copy the pipermail archive > from lists.ceph.com over to lists.ceph.io? It seems to contai

[ceph-users] Re: Ceph Ansible - - name: set grafana_server_addr fact - ipv4

2019-09-03 Thread Dimitri Savineau
This is probably a duplicate of https://github.com/ceph/ceph-ansible/issues/4404 Regards, Dimitri On Thu, Aug 29, 2019 at 9:42 AM Sebastien Han wrote: > +Guillaume Abrioux and +Dimitri Savineau > Thanks! > – > Sébastien Han > Principal Software Engineer, Storage Architect > > "Always g

[ceph-users] Re: Ceph Ansible - - name: set grafana_server_addr fact - ipv4

2019-09-03 Thread Sebastien Han
+Guillaume Abrioux and +Dimitri Savineau Thanks! – Sébastien Han Principal Software Engineer, Storage Architect "Always give 100%. Unless you're giving blood." On Wed, Aug 28, 2019 at 3:32 PM Lee Norvall wrote: > > Hi > > Ceph: nautilus (14.2.2) > NFS-Ganesha v 2.8 > ceph-ansible stable

[ceph-users] Re: ceph's replicas question

2019-09-03 Thread Christian Theune
Hi, > On 27. Aug 2019, at 14:43, Paul Emmerich wrote: > > 100% agree, this happens *all the time* with min_size 1. > > If you really care about your data then 2/1 just doesn't cut it. Just to make this more specific and less fictional: a very easy way to trigger this is by shutting down your
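For completeness, the relevant knobs are set per pool (the pool name is a placeholder; size 3 / min_size 2 is the commonly recommended replicated setup):

    ceph osd pool set <pool> size 3
    ceph osd pool set <pool> min_size 2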

[ceph-users] ceph mons stuck in electing state

2019-09-03 Thread Nick
Hello, I have an old ceph 0.94.10 cluster that had 10 storage nodes with one extra management node used for running commands on the cluster. Over time we'd had some hardware failures on some of the storage nodes, so we're down to 6, with ceph-mon running on the management server and 4 of the stora

[ceph-users] MDS blocked ops; kernel: Workqueue: ceph-pg-invalid ceph_invalidate_work [ceph]

2019-09-03 Thread Frank Schilder
Hi, I encountered a problem with blocked MDS operations and a client becoming unresponsive. I dumped the MDS cache, ops, blocked ops and some further log information here: https://files.dtu.dk/u/peQSOY1kEja35BI5/2010-09-03-mds-blocked-ops?l A user of our HPC system was running a job that create
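For reference, a hedged sketch of the admin-socket commands typically used to produce such dumps (the MDS name and output path are placeholders):

    ceph daemon mds.<name> dump_blocked_ops
    ceph daemon mds.<name> dump_ops_in_flight
    ceph daemon mds.<name> dump cache /tmp/mds-cache.txt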

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread Fyodor Ustinov
Hi! In this case, using dd is quite acceptable. - Original Message - > From: vita...@yourcmc.ru > To: "Fyodor Ustinov" > Cc: "EDH - Manuel Rios Fernandez" , "ceph-users" > > Sent: Tuesday, 3 September, 2019 15:18:23 > Subject: Re: [ceph-users] Re: Strange hardware behavior > Please ne

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread Fyodor Ustinov
Hi! Absolutely. In reality, the sequence was even longer: 1. I see this strange behavior 2. Reboot 3. Nothing changes 4. I upgrade the kernel from 4.20.7 to 5.2.11 and reboot 5. Nothing changes 6. Power off / power on 7. Everything starts working as it should. And in fact, steps 2 to 5 are op

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread vitalif
Please never use dd for disk benchmarks. Use fio. For linear write: fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -iodepth=32 -rw=write -runtime=60 -filename=/dev/sdX
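And a hedged companion for random-IOPS testing in the same style (like the command above, this writes directly to /dev/sdX and destroys the data on it):

    fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128 -rw=randwrite -runtime=60 -filename=/dev/sdX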

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread Marc Roos
Yes, indeed a very funny case. Are you sure sdd/sdc etc. are not being reconnected (renumbered) to different drives because of some bus reset or other failure? Or maybe some udev rule is messing things up? -Original Message- From: Fyodor Ustinov [mailto:u...@ufm.su] Sent: Tuesday, 3 Sept

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread Fyodor Ustinov
Hi! Micron_1100_MTFD. But it's not only the SSD that is "too slow"; the HDD is also "too fast". > Hi Fyodor > > What's the model of the SSD? > > Regards > > > -Original Message- > From: Fyodor Ustinov > Sent: Tuesday, 3 September 2019 13:13 > To: ceph-users > Subject: [ceph-users] Strange hardw

[ceph-users] Re: Best osd scenario + ansible config?

2019-09-03 Thread Yoann Moulin
Hi, > So you need to think about failure domains. Failure domains will be set to host. > If you put all the DBs on one SSD and all the WALs on another SSD then a > failure of either of those SSDs will result in a failure of all the OSDs > behind them. So in this case all 10 OSDs would hav

[ceph-users] Re: Strange hardware behavior

2019-09-03 Thread EDH - Manuel Rios Fernandez
Hi Fyodor, What's the model of the SSD? Regards -Original Message- From: Fyodor Ustinov Sent: Tuesday, 3 September 2019 13:13 To: ceph-users Subject: [ceph-users] Strange hardware behavior Hi! I understand that this question is not quite for this mailing list, but nonetheless

[ceph-users] Re: Best osd scenario + ansible config?

2019-09-03 Thread Yoann Moulin
Hello, > Just a note: > With 7+5 you will need 13 hosts to access your data in case one goes down. As far as I know, EC 7+5 implies 'erasure size 12 min_size 8'. So I need at least 8 servers up to access my data: k=7, m=5, size = k+m = 12, min_size = k+1 = 8. Am I wrong? > It is expected that future vers
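As a hedged illustration of how such a profile is usually declared (profile name, pool name and pg count are placeholders):

    ceph osd erasure-code-profile set ec-7-5 k=7 m=5 crush-failure-domain=host
    ceph osd pool create ecpool 1024 1024 erasure ec-7-5
    # on recent releases min_size should default to k+1 = 8; verify with:
    ceph osd pool get ecpool min_size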

[ceph-users] Strange hardware behavior

2019-09-03 Thread Fyodor Ustinov
Hi! I understand that this question is not quite for this mailing list, but nonetheless, experts who may be encountered this have gathered here. I have 24 servers, and on each, after six months of work, the following began to happen: [root@S-26-5-1-2 cph]# uname -a Linux S-26-5-1-2 5.2.11-1.el

[ceph-users] Bug: ceph-objectstore-tool ceph version 12.2.12

2019-09-03 Thread Marc Roos
Max 2x listed ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

[ceph-users] Re: Best osd scenario + ansible config?

2019-09-03 Thread Darren Soothill
Hi Yoann, so you need to think about failure domains. If you put all the DBs on one SSD and all the WALs on another SSD then a failure of either of those SSDs will result in a failure of all the OSDs behind them. So in this case all 10 OSDs would have failed. Splitting it to 5 OSDs you h
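For context, a hedged sketch of how a per-OSD DB/WAL split is commonly expressed with ceph-volume (device names are placeholders; ceph-ansible generates equivalent calls from its OSD configuration):

    # one data HDD, DB on a partition/LV of one SSD, WAL on the other SSD
    ceph-volume lvm create --bluestore --data /dev/sdc \
        --block.db /dev/sda1 --block.wal /dev/sdb1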

[ceph-users] Placement Groups - default.rgw.metadata pool.

2019-09-03 Thread Jaroslaw Owsiewski
Hi, we have a cluster running Ceph Luminous 12.2.12, RADOS Gateway only (S3). The data pool is placed on SAS HDDs (1430 drives) and the rest of the pools are placed on SSDs (72 drives) - 72 hosts with the OSD role (3 rows, 2 racks per row, and 12 hosts per rack). BlueStore of course. The question

[ceph-users] Re: pg 17.36 is active+clean+inconsistent head expected clone 1 missing?

2019-09-03 Thread Marc Roos
Hi Steve, I was just about to follow your steps[0] with the ceph-objectstore-tool (I do not want to remove more snapshots). So I have this error: pg 17.36 is active+clean+inconsistent, acting [7,29,12] 2019-09-02 14:17:34.175139 7f9b3f061700 -1 log_channel(cluster) log [ERR] : deep-scrub 17.

[ceph-users] Re: Best osd scenario + ansible config?

2019-09-03 Thread EDH - Manuel Rios Fernandez
Just a note: With 7+5 you will need 13 hosts to access your data in case one goes down. It is expected that future versions will allow accessing data with only the EC number of chunks. -Original Message- From: Yoann Moulin Sent: Tuesday, 3 September 2019 11:28 To: ceph-users@ceph.io Subject: [ceph-users

[ceph-users] Best osd scenario + ansible config?

2019-09-03 Thread Yoann Moulin
Hello, I am deploying a new Nautilus cluster and I would like to know what would be the best OSD scenario/config in this case: 10x 6TB disk OSDs (data), 2x 480G SSDs previously used for journals that can be used for WAL and/or DB. Is it better to put all WALs on one SSD and all DBs on the other o