On 6/11/19 9:48 PM, J. Eric Ivancich wrote:
> Hi Wido,
>
> Interleaving below
>
> On 6/11/19 3:10 AM, Wido den Hollander wrote:
>>
>> I thought it was resolved, but it isn't.
>>
>> I counted all the OMAP values for the GC objects and I got back:
>>
>> gc.0: 0
>> gc.11: 0
>> gc.14: 0
>> gc.
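For anyone wanting to reproduce that count, a minimal sketch, assuming the GC objects
live in the default.rgw.log pool under the "gc" namespace (the exact location varies by
deployment and version):

# count the pending GC omap entries on one gc shard object
rados -p default.rgw.log --namespace gc listomapkeys gc.0 | wc -l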
Hi list,
I have a setup where two clients mount the same filesystem and
read/write from mostly non-overlapping subsets of files (Dovecot mail
storage/indices). There is a third client that takes backups by
snapshotting the top-level directory, then rsyncing the snapshot over to
another location.
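As an aside, that backup pattern in command form would look roughly like this. A minimal
sketch, assuming a CephFS mount at /mnt/cephfs and a hypothetical rsync destination
backup:/srv/mail-backup (CephFS creates a snapshot when you mkdir inside the special
.snap directory):

# create a dated, read-only snapshot of the top-level directory
mkdir /mnt/cephfs/.snap/backup-2019-06-12
# copy the snapshot contents to the other location
rsync -a /mnt/cephfs/.snap/backup-2019-06-12/ backup:/srv/mail-backup/
# remove the snapshot once the copy has completed
rmdir /mnt/cephfs/.snap/backup-2019-06-12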
Hi all,
we have a 5 node ceph cluster with 44 OSDs
where all nodes also serve as virtualization hosts,
running about 22 virtual machines with, all in all, about 75 RBDs
(158 including snapshots).
We experience absurdly slow i/o in the VMs and I suspect
our thread settings in ceph.conf to be one of t
Hi Felix,
Better to use fio.
Like fio -ioengine=rbd -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128
-rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg (for peak parallel
random iops)
Or the same with -iodepth=1 for the latency test. Here you usually get
Or the same with -ioengine=
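For completeness, the -iodepth=1 latency variant mentioned above would be along these
lines (same hypothetical pool rpool_hdd and test image testimg as in the example):

fio -ioengine=rbd -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=1 -rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg

With a queue depth of 1 the reported IOPS are essentially 1 / (average write latency),
so this measures per-request latency rather than parallel throughput.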
Hi all,
Here is our story; perhaps someday it could help someone. Bear in mind that English is
not my native language, so sorry if I make mistakes.
Our system is: Ceph 0.87.2 (Giant), with 5 OSD servers (116 1TB osd total) and
3 monitors.
After a nightmarish time, we initially "corrected" the ceph monitor prob
On Wed, Jun 12, 2019 at 6:48 AM Glen Baars
wrote:
> Interesting performance increase! I'm using iSCSI at a few installations and
> now I wonder what version of CentOS is required to improve performance! Did
> the cluster go from Luminous to Mimic?
>
wild guess: probably related to updating tcmu-run
On Wed, Jun 12, 2019 at 11:45 AM Lluis Arasanz i Nonell - Adam <
lluis.aras...@adam.es> wrote:
> - Be careful adding or removing monitors in an unhealthy monitor cluster:
> if they lose quorum you will be in trouble.
>
safe procedure: remove the dead monitor before adding a new one
>
>
> No
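Regarding the "remove the dead monitor before adding a new one" advice: a minimal
command-level sketch, assuming the dead monitor's name is mon01 (a hypothetical name):

# drop the dead monitor from the monmap while the surviving mons still hold quorum
ceph mon remove mon01
# only afterwards deploy the replacement monitor with the usual add-a-monitor procedure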
On Wed, Jun 12, 2019 at 10:57 AM tim taler wrote:
> We experience absurdly slow i/o in the VMs and I suspect
> our thread settings in ceph.conf to be one of the culprits.
>
This is probably not the cause, but someone might be able to help you if you
share details of your setup (hardware, software).
Hi all,
we have a 5 node ceph cluster with 44 OSDs
where all nodes also serve as virtualization hosts,
running about 22 virtual machines with, all in all, about 75 RBDs
(158 including snapshots).
We experience absurdly slow i/o in the VMs and I suspect
our thread settings in ceph.conf to be one of t
On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote:
>
> Hi list,
>
> I have a setup where two clients mount the same filesystem and
> read/write from mostly non-overlapping subsets of files (Dovecot mail
> storage/indices). There is a third client that takes backups by
> snapshotting the top-leve
Hello Jason,
Le 11/06/2019 à 15:31, Jason Dillaman a écrit :
4- I export the snapshot from the source pool and I import the snapshot
towards the destination pool (in the pipe)
rbd export-diff --from-snap ${LAST-SNAP}
${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - | rbd -c ${BACKUP-CLUSTER}
import-d
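For context, the usual shape of such an incremental copy pipeline is sketched below;
this is not the poster's exact (truncated) command, and ${POOL-DEST} is a hypothetical
destination pool:

rbd export-diff --from-snap ${LAST-SNAP} ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - \
  | rbd -c ${BACKUP-CLUSTER} import-diff - ${POOL-DEST}/${KVM-IMAGE}

The "-" on each side makes export-diff write the diff to stdout and import-diff read it
from stdin, so no intermediate file is needed.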
I will look into that, but:
Is there a rule of thumb to determine the optimal setting for
osd disk threads
and
osd op threads?
TIA
On Wed, Jun 12, 2019 at 3:22 PM Paul Emmerich wrote:
>
>
>
> On Wed, Jun 12, 2019 at 10:57 AM tim taler wrote:
>>
>> We experience absurdly slow i/o in the VMs and I
On both larger and smaller clusters I have never had problems with the default
values.
So I guess that's a pretty good start.
- Original Message -
From: "tim taler"
To: "Paul Emmerich"
Cc: "ceph-users"
Sent: Wednesday, June 12, 2019 3:51:43 PM
Subject: Re: [ceph-users] ceph threads and
On Wed, Jun 12, 2019 at 9:50 AM Rafael Diaz Maurin
wrote:
>
> Hello Jason,
>
> Le 11/06/2019 à 15:31, Jason Dillaman a écrit :
> >> 4- I export the snapshot from the source pool and I import the snapshot
> >> towards the destination pool (in the pipe)
> >> rbd export-diff --from-snap ${LAST-SNAP}
If there were an optimal setting, it would be the default.
Also, both of these options were removed in Luminous, ~2 years ago.
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +
We ended up in a bad situation with our RadosGW (Cluster is Nautilus
14.2.1, 350 OSDs with BlueStore):
1. There is a bucket with about 60 million objects, without shards.
2. radosgw-admin bucket reshard --bucket $BIG_BUCKET --num-shards 1024
3. Resharding looked fine at first; it counted up to the n
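As a side note, the state of such a reshard can be inspected with commands along these
lines (a sketch; $BIG_BUCKET as in the steps above):

# list scheduled/ongoing reshard operations
radosgw-admin reshard list
# show the reshard status of this bucket
radosgw-admin reshard status --bucket $BIG_BUCKET
# current bucket instance id and object counts
radosgw-admin bucket stats --bucket $BIG_BUCKET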
I have run into a similar hang on 'ls .snap' recently:
https://tracker.ceph.com/issues/40101#note-2
On Wed, Jun 12, 2019 at 9:33 AM Yan, Zheng wrote:
>
> On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote:
> >
> > Hi list,
> >
> > I have a setup where two clients mount the same filesystem and
>
Also opened an issue about the rocksdb problem:
https://tracker.ceph.com/issues/40300
On 12.06.19 16:06, Harald Staub wrote:
We ended up in a bad situation with our RadosGW (Cluster is Nautilus
14.2.1, 350 OSDs with BlueStore):
1. There is a bucket with about 60 million objects, without shards.
Hi,
If there is nothing special about the defined “initial monitors” on the cluster, we’ll
try to remove mon01 from the cluster.
I mention the “initial monitor” because in our ceph implementation there is
only one monitor defined as “initial”:
[root@mon01 ceph]# cat /etc/ceph/ceph.conf
[global]
fsid = ---44
On Fri, 10 May 2019, Sage Weil wrote:
> Hi everyone,
>
> -- What --
>
> The Ceph Leadership Team[1] is proposing a change of license from
> *LGPL-2.1* to *LGPL-2.1 or LGPL-3.0* (dual license). The specific changes
> are described by this pull request:
>
> https://github.com/ceph/ceph/pul
Hi Aaron,
The data_log objects are storing logs for multisite replication. Judging
by the pool name '.us-phx2.log', this cluster was created before jewel.
Are you (or were you) using multisite or radosgw-agent?
If not, you'll want to turn off the logging (log_meta and log_data ->
false) in y
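The usual way to flip those flags is to edit the zonegroup and commit a new period; a
rough sketch only, to be adapted to the actual zonegroup layout:

radosgw-admin zonegroup get > zonegroup.json
# edit zonegroup.json: set "log_meta": "false" and "log_data": "false"
radosgw-admin zonegroup set < zonegroup.json
radosgw-admin period update --commit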
Hi Harald,
If the bucket reshard didn't complete, it's most likely one of the new
bucket index shards that got corrupted here and the original index shard
should still be intact. Does $BAD_BUCKET_ID correspond to the
new/resharded instance id? If so, once the rocksdb/osd issues are
resolved,
On Wed, 12 Jun 2019, Harald Staub wrote:
> Also opened an issue about the rocksdb problem:
> https://tracker.ceph.com/issues/40300
Thanks!
The 'rocksdb: Corruption: file is too short' is the root of the problem
here. Can you try starting the OSD with 'debug_bluestore=20' and
'debug_bluefs=20'? (A
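For reference, those debug levels can be supplied either in ceph.conf or directly on a
one-off foreground start; a sketch with a placeholder OSD id of 123:

# in ceph.conf, under the [osd] (or a specific [osd.123]) section:
#   debug bluestore = 20
#   debug bluefs = 20
# or for a single foreground run:
ceph-osd -f -i 123 --debug-bluestore 20 --debug-bluefs 20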
Hi,
Could someone point me to a blog or documentation page which would help
me resolve the issues noted below?
All nodes are Luminous, 12.2.12; one realm, one zonegroup (clustered haproxies
fronting), two zones (three rgw in each); All endpoint references to each zone
go through an haproxy.
Correct, it was pre-jewel. I believe we toyed with multisite replication back
then so it may have gotten baked into the zonegroup inadvertently. Thanks for
the info!
> On Jun 12, 2019, at 11:08 AM, Casey Bodley wrote:
>
> Hi Aaron,
>
> The data_log objects are storing logs for multisite repli
All;
I'm testing and evaluating Ceph for the next generation of storage architecture
for our company, and so far I'm fairly impressed, but I've got a couple of
questions around cluster replication and disaster recovery.
First; intended uses.
Ceph Object Gateway will be used to support new softw
Le 12/06/2019 à 16:01, Jason Dillaman a écrit :
On Wed, Jun 12, 2019 at 9:50 AM Rafael Diaz Maurin
wrote:
Hello Jason,
Le 11/06/2019 à 15:31, Jason Dillaman a écrit :
4- I export the snapshot from the source pool and I import the snapshot
towards the destination pool (in the pipe)
rbd export-
On 12.06.19 17:40, Sage Weil wrote:
On Wed, 12 Jun 2019, Harald Staub wrote:
Also opened an issue about the rocksdb problem:
https://tracker.ceph.com/issues/40300
Thanks!
The 'rocksdb: Corruption: file is too short' is the root of the problem
here. Can you try starting the OSD with 'debug_bluest
On Wed, 12 Jun 2019, Harald Staub wrote:
> On 12.06.19 17:40, Sage Weil wrote:
> > On Wed, 12 Jun 2019, Harald Staub wrote:
> > > Also opened an issue about the rocksdb problem:
> > > https://tracker.ceph.com/issues/40300
> >
> > Thanks!
> >
> > The 'rocksdb: Corruption: file is too short' is the ro
Dear Sage,
> Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm
> pretty sure it'll crash in the same spot, but just want to confirm
> it's a bluefs issue.
To my surprise, this actually seems to have worked:
$ time sudo ceph-bluestore-tool --out-dir /mnt/ceph bluefs-export -
On Wed, 12 Jun 2019, Simon Leinen wrote:
> Dear Sage,
>
> > Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm
> > pretty sure it'll crash in the same spot, but just want to confirm
> > it's a bluefs issue.
>
> To my surprise, this actually seems to have worked:
>
> $ time s
Sage Weil writes:
> What happens if you do
> ceph-kvstore-tool rocksdb /mnt/ceph/db stats
(I'm afraid that our ceph-kvstore-tool doesn't know about a "stats"
command; but it still tries to open the database.)
That aborts after complaining about many missing files in /mnt/ceph/db.
When I ( cd /
On Wed, 12 Jun 2019, Simon Leinen wrote:
> Sage Weil writes:
> > What happens if you do
>
> > ceph-kvstore-tool rocksdb /mnt/ceph/db stats
>
> (I'm afraid that our ceph-kvstore-tool doesn't know about a "stats"
> command; but it still tries to open the database.)
>
> That aborts after complaini
On Wed, 12 Jun 2019, Simon Leinen wrote:
> We hope that we can get some access to S3 bucket indexes back, possibly
> by somehow dropping and re-creating those indexes.
Are all 3 OSDs crashing in the same way?
My guess is that the reshard process triggered some massive rocksdb
transaction that in
Sage Weil writes:
>> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families:
>> [default]
>> Unrecognized command: stats
>> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356:
>> rocksdb::Version::~Version(): Assertion `path_id <
>> cfd_->ioptions()->cf_pat
Simon Leinen writes:
> Sage Weil writes:
>> Try 'compact' instead of 'stats'?
> That ran for a while and then crashed, also in the destructor for
> rocksdb::Version, but with an otherwise different backtrace. [...]
Oops, I forgot: Before it crashed, it did modify /mnt/ceph/db; the
overall size of
On Thu, 13 Jun 2019, Simon Leinen wrote:
> Sage Weil writes:
> >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families:
> >> [default]
> >> Unrecognized command: stats
> >> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356:
> >> rocksdb::Version::~Version
[Sorry for the piecemeal information... it's getting late here]
> Oops, I forgot: Before it crashed, it did modify /mnt/ceph/db; the
> overall size of that directory increased(!) from 3.9GB to 12GB. The
> compaction seems to have eaten two .log files, but created many more
> .sst files.
...and i
On Wed, 12 Jun 2019, Sage Weil wrote:
> On Thu, 13 Jun 2019, Simon Leinen wrote:
> > Sage Weil writes:
> > >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column
> > >> families: [default]
> > >> Unrecognized command: stats
> > >> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/ve
I'm following the bluestore config reference guide and trying to change
the value for osd_memory_target. I added the following entry in the
/etc/ceph/ceph.conf file:
[osd]
osd_memory_target = 2147483648
and restarted the osd daemons with "systemctl restart ceph-osd.target".
Now, how do I
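A common way to confirm the new value was picked up is to ask a running daemon directly;
a sketch, assuming you run it on the host of a hypothetical osd.0:

# query the live value via the admin socket
ceph daemon osd.0 config get osd_memory_target
# or list everything and filter
ceph daemon osd.0 config show | grep osd_memory_target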
On 6/12/19 5:51 PM, Jorge Garcia wrote:
I'm following the bluestore config reference guide and trying to
change the value for osd_memory_target. I added the following entry in
the /etc/ceph/ceph.conf file:
[osd]
osd_memory_target = 2147483648
and restarted the osd daemons with "systemct
Hi
How can we enable bluestore_default_buffered_write using the ceph-conf utility?
Any pointers would be appreciated.
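For what it's worth, ceph-conf only reads configuration files; the option itself is set
in ceph.conf, and ceph-conf can then confirm what a daemon would see. A sketch, with the
[osd] section placement being an assumption:

# /etc/ceph/ceph.conf
[osd]
bluestore default buffered write = true

# confirm what osd.0 would read from that file
ceph-conf -c /etc/ceph/ceph.conf --name osd.0 --lookup bluestore_default_buffered_write

The OSDs need a restart to pick up a ceph.conf change.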
On 12/06/2019 22.33, Yan, Zheng wrote:
> I have tracked down the bug. Thank you for reporting this. 'echo 2 >
> /proc/sys/vm/drop_caches' should fix the hang. If you can compile ceph
> from source, please try the following patch.
>
> diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc
> index ecd06294