Hi all,
I am trying to use Ceph with RGW to store lots (>300M) of small files (80%
2-15kB, 20% up to 500kB).
After some testing, I wonder if Ceph is the right tool for that.
Do any of you have experience with this use case?
Things I came across:
- EC pools: default stripe-width is 4kB. Doe
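For context on the stripe-width point above: the stripe unit is set per erasure-code profile, and the effective stripe width is stripe_unit * k. A minimal sketch of setting and checking it; the profile name "smallobj" and the k/m values are only examples, not a recommendation for this workload:

# Create a profile with an explicit stripe unit (all values here are examples)
ceph osd erasure-code-profile set smallobj k=4 m=2 stripe_unit=4K
# Verify what the profile ended up with
ceph osd erasure-code-profile get smallobj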
Hello Pavel,
I don't have all that much info (fairly new to Ceph) but we are facing a
similar issue. If the cluster is fairly idle we get slow requests - if I'm
backfilling a new node there are no slow requests. Same X540 network cards but
ceph 12.2.5 and Ubuntu 16.04. 4.4.0 kernel. LACP with
Hi Oliver,
We have several CephFS on EC pool deployments; one has been in production for a
while, the others are about to be, pending all the Bluestore+EC fixes in 12.2.7 😊
Firstly, as John and Greg have said, you don't need an SSD cache pool at all.
Secondly, regarding k/m, it depends on how many hosts or
Your issue is different since not only do the omap digests of all
replicas not match the omap digest from the auth object info, but they
are all different from each other.
What is min_size of pool 67 and what can you tell us about the events
leading up to this?
On Mon, Jul 16, 2018 at 7:06 PM, Matth
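For reference, min_size can be read straight off the pool; a quick sketch (substitute the name of the pool with id 67):

# Map the pool id to a name, then query min_size
ceph osd lspools
ceph osd pool get <poolname> min_size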
Dear cephers,
Could someone tell me how to check the RBD volumes' modification times in a Ceph
pool? I am currently in the process of trimming our Ceph pool and would like to
start with volumes which have not been modified for a long time. How do I get that
information?
Cheers
Andrei
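For what it's worth, one rough way to approximate this is to look at the mtime of the image's RADOS objects. A sketch, assuming a pool called "rbd", an image called "vm-disk" and a format-2 image; all names and the <id> are placeholders:

# Find the image's internal id / object prefix (rbd_data.<id>)
rbd info rbd/vm-disk | grep block_name_prefix
# The header object's mtime reflects the last metadata change
rados -p rbd stat rbd_header.<id>
# The newest mtime across the data objects approximates the last write;
# this lists every object, so it can be slow for large images
rados -p rbd ls | grep '^rbd_data.<id>' | xargs -n1 rados -p rbd stat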
Hi there,
I would just like to note that for some scenarios the defaults are not good enough.
Recently we upgraded one of our clusters from Jewel to Luminous; during the
upgrade we removed all the custom tuning we had done on it over the years from
ceph.conf - I was extremely excited to get rid of
Hi Pavel,
Any strange messages on dmesg, syslog, etc?
I would recommend profiling the kernel with perf and checking for the calls
that are consuming the most CPU.
We had several problems like the one you are describing; for example, one of
them was fixed by increasing vm.min_free_kbytes to 4GB.
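A sketch of both suggestions; the 4GB value is just what happened to work in the case above, not a general recommendation:

# Sample call stacks system-wide for ~30s, then look at the hottest paths
perf record -a -g -- sleep 30
perf report
# Raise the free-memory watermark to 4GB (value is in kB); persist via /etc/sysctl.d if it helps
sysctl -w vm.min_free_kbytes=4194304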
Hello folks,
We've been having issues with slow requests cropping up on practically
idle ceph clusters. From what I can tell the requests are hanging
waiting for subops, and the OSD on the other end receives requests
minutes later! Below it started waiting for subops at 12:09:51 and the
subop was
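For anyone debugging the same thing: the ops and their "waiting for subops from ..." events can be pulled from the OSD admin socket on both ends; a sketch (the osd id is an example):

# On the primary that reports the slow request
ceph daemon osd.12 dump_ops_in_flight
ceph daemon osd.12 dump_historic_ops
# Then compare timestamps with the same dumps on the replica OSD named in the subop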
Golden advice. Thank you Greg
On Mon, Jul 16, 2018 at 1:45 PM, Gregory Farnum wrote:
> On Fri, Jul 13, 2018 at 2:50 AM Robert Stanford
> wrote:
>
>>
>> This is what leads me to believe it's other settings being referred to
>> as well:
>> https://ceph.com/community/new-luminous-rados-improvem
On Mon, Jul 16, 2018 at 1:25 AM John Spray wrote:
> On Sun, Jul 15, 2018 at 12:46 PM Oliver Schulz
> wrote:
> >
> > Dear all,
> >
> > we're planning a new Ceph-Cluster, with CephFS as the
> > main workload, and would like to use erasure coding to
> > use the disks more efficiently. Access patte
It's *repeatedly* crashing and restarting? I think the other times we've
seen this it was entirely ephemeral and went away on restart, and I really
don't know what about this state *could* be made persistent, so that's
quite strange. If you can set "debug monc = 20", reproduce this, and post
the lo
On Fri, Jul 13, 2018 at 2:50 AM Robert Stanford
wrote:
>
> This is what leads me to believe it's other settings being referred to as
> well:
> https://ceph.com/community/new-luminous-rados-improvements/
>
> *"There are dozens of documents floating around with long lists of Ceph
> configurables t
We are in the process of building the 12.2.7 release now that will fix
this. (If you don't want to wait you can also install the autobuilt
packages from shaman.ceph.com... official packages are only a few hours
away from being ready though).
I would pause data migration for the time being (noreb
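The advice above is truncated, but pausing data movement is usually done with cluster-wide OSD flags; a sketch, assuming the goal is to stop rebalancing/backfill until the fixed packages are installed:

# Stop data migration for the time being
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover
# Clear them again afterwards with "ceph osd unset <flag>"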
hello guys,
unfortunately I missed the warning on Friday and upgraded my cluster on
Saturday to 12.2.6.
The cluster is in a migration state from filestore to bluestore (10/2)
and I constantly get inconsistent PGs, only on the two bluestore OSDs.
If I run a rados list-inconsistent-obj 2.17 --fo
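The full details of the inconsistency can be dumped per PG; a sketch (the PG id is the one mentioned above, <pool> is a placeholder):

# Show which shards disagree and on what (size, data digest, omap digest, ...)
rados list-inconsistent-obj 2.17 --format=json-pretty
# List every PG currently flagged inconsistent in a pool
rados list-inconsistent-pg <pool>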
I just ran a test on a Samsung 850 Pro 500GB (how do I interpret the result of
the following output?)
[root@compute-01 tmp]# fio --filename=/dev/sda --direct=1 --sync=1
--rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based
--group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=
I dunno, to me benchmark tests are only really useful to compare different
drives.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Paul
Emmerich
Sent: Monday, July 16, 2018 8:41 AM
To: Satish Patel
Cc: ceph-users
Subject: Re: [ceph-users] SSDs for data drives
This does
On 07/15/2018 08:08 AM, Wladimir Mutel wrote:
> Hi,
>
> I cloned a NTFS with bad blocks from USB HDD onto Ceph RBD volume
> (using ntfsclone, so the copy has sparse regions), and decided to clean
> bad blocks within the copy. I run chkdsk /b from Windows and it fails on
> free space verifi
This doesn't look like a good benchmark:
(from the blog post)
dd if=/dev/zero of=/mnt/rawdisk/data.bin bs=1G count=20 oflag=direct
1. It writes compressible data, which some SSDs might compress; you should
use urandom instead.
2. That workload does not look like something Ceph will do to your disk,
like n
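A sketch of something closer to what Ceph actually does to a journal/WAL device, essentially the sync-write fio test quoted earlier in the thread (the device path is an example, and the test writes to the device destructively):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test
# Look at the sustained IOPS and the completion-latency percentiles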
Hi,
We run 5 RADOS Gateways on Luminous 12.2.5 as upstream servers in an nginx
active-active setup, based on keepalived.
Cluster is 12x Ceph nodes (16x 10TB OSD(bluestore) per node, 2x 10Gb
network link shared by access and cluster networks), RGW pool is EC 9+3.
We recently noticed below entries in R
Yes, that suggestion worked for us, although we hit this when we upgraded to
10.2.10 from 10.2.7.
I guess this was fixed via http://tracker.ceph.com/issues/21440 and
http://tracker.ceph.com/issues/19404
Thanks,
-Pavan.
On 7/16/18, 5:07 AM, "ceph-users on behalf of Matthew Vernon"
wrote:
Yes, I have tasks in `radosgw-admin reshard list`.
And the object count in .rgw.buckets.index is increasing, slowly.
But I am a bit confused. I have one big bucket with 161 shards.
…
"max_marker":
"0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#,11#,12#,13#,14#,15#,16#,17#,18#,19#,20#,21#,22#,23#,24#,25#,26#,2
Hi Zhang,
There is no way to resize the DB while the OSD is running. There is a somewhat
shorter "unofficial" but risky way than redeploying the OSD, though. But
you'll need to take the specific OSD out for a while in any case. You will
also need either additional free partition(s), or the initial deployment had
to be d
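Before attempting anything, it can help to see how full the DB volume actually is; a sketch using the admin socket (the osd id is an example):

# BlueFS counters include db_total_bytes and db_used_bytes
ceph daemon osd.3 perf dump | grep -E 'db_total_bytes|db_used_bytes'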
https://blog.cypressxt.net/hello-ceph-and-samsung-850-evo/
On Thu, Jul 12, 2018 at 3:37 AM, Adrian Saul
wrote:
>
>
> We started our cluster with consumer (Samsung EVO) disks and the write
> performance was pitiful; they had periodic spikes in latency (average of
> 8ms, but much higher spikes) and
Hi,
Do you have ongoing resharding? 'radosgw-admin reshard list' should show you
the status.
Do you see the number of objects in the .rgw.buckets.index pool increasing?
I hit a lot of problems trying to use auto resharding in 12.2.5 - I have
disabled it for the moment.
Thanks
[1] https://tracker.cep
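For reference, a sketch of how disabling it and watching the index pool typically looks (the config section and pool name may differ per cluster):

# In ceph.conf on the RGW hosts (restart the radosgw instances afterwards):
#   rgw_dynamic_resharding = false
# Check for queued/ongoing reshards and watch the index pool's object count:
radosgw-admin reshard list
rados df | grep buckets.index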
Dear John,
On 16.07.2018 16:25, John Spray wrote:
Since Luminous, you can use an erasure coded pool (on bluestore)
directly as a CephFS data pool, no cache pool needed.
Great! I'll be happy to go without
a cache pool then.
Thanks for your help, John,
Oliver
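A sketch of what that looks like on Luminous; the pool name, PG count, profile name and mount path are examples only:

# EC data pool for CephFS; overwrites must be enabled to use it as a data pool
ceph osd pool create cephfs_data_ec 1024 1024 erasure myprofile
ceph osd pool set cephfs_data_ec allow_ec_overwrites true
ceph fs add_data_pool <fsname> cephfs_data_ec
# Direct a directory's new files to the EC pool via a file layout
setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/bulk-data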
On 18-07-16 01:40 PM, Wido den Hollander wrote:
On 07/15/2018 11:12 AM, Mehmet wrote:
hello guys,
in my production cluster I have many objects like this
"#> rados -p rbd ls | grep 'benchmark'"
... .. .
benchmark_data_inkscope.example.net_32654_object1918
benchmark_data_server_26414_object1990
On 18-07-15 11:12 AM, Mehmet wrote:
hello guys,
in my production cluster I have many objects like this
"#> rados -p rbd ls | grep 'benchmark'"
... .. .
benchmark_data_inkscope.example.net_32654_object1918
benchmark_data_server_26414_object1990
... .. .
Is it safe to run "rados -p rbd cleanup" or
On 07/15/2018 11:12 AM, Mehmet wrote:
> hello guys,
>
> in my production cluster I have many objects like this
>
> "#> rados -p rbd ls | grep 'benchmark'"
> ... .. .
> benchmark_data_inkscope.example.net_32654_object1918
> benchmark_data_server_26414_object1990
> ... .. .
>
> Is it safe to run
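For what it's worth, rados cleanup can be limited to the leftover bench objects by prefix; a sketch (check the listing first):

# See what would be matched
rados -p rbd ls | grep '^benchmark_data' | head
# Remove only the rados bench leftovers with that prefix
rados -p rbd cleanup --prefix benchmark_data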
I am upgrading my clusters to Luminous. We are already using rados
gateway, and index max shards has been set for the rgw data pools. Now we
want to use Luminous dynamic index resharding. How do we make this
transition?
Regards
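A sketch of the pieces involved on Luminous (the bucket name and shard count are examples; dynamic resharding itself is governed by rgw_dynamic_resharding and rgw_max_objs_per_shard):

# Which buckets exceed the per-shard object threshold
radosgw-admin bucket limit check
# Buckets queued for automatic resharding
radosgw-admin reshard list
# Manual resharding of a single bucket is still possible
radosgw-admin bucket reshard --bucket=mybucket --num-shards=64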
Hi, guys.
I use Luminous 12.2.5.
Automatic bucket index resharding had not been activated in the past.
A few days ago I activated automatic resharding.
Since then I see:
- very high Ceph read I/O (~300 IOPS before activating resharding, ~4k now),
- very high Ceph read bandwidth (50 MB/s befor
Hi,
Recently, about two weeks ago, something strange started happening with
one of the Ceph clusters I'm managing. It's running Ceph Jewel 10.2.10
with a cache tier. Some OSDs started crashing with a "too many open
files" error. From looking at the issue I have found that it keeps a
lot of links in /pro
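A sketch of how to confirm the descriptor usage on a running OSD (assumes a single ceph-osd process on the host):

# Open file descriptors vs. the process limit
ls /proc/$(pidof -s ceph-osd)/fd | wc -l
grep 'Max open files' /proc/$(pidof -s ceph-osd)/limits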
Hi,
Our cluster is running 10.2.9 (from Ubuntu; on 16.04 LTS), and we have a
pg that's stuck inconsistent; if I repair it, it logs "failed to pick
suitable auth object" (repair log attached, to try and stop my MUA
mangling it).
We then deep-scrubbed that pg, at which point
rados list-inconsistent
On Sun, Jul 15, 2018 at 12:46 PM Oliver Schulz
wrote:
>
> Dear all,
>
> we're planning a new Ceph-Cluster, with CephFS as the
> main workload, and would like to use erasure coding to
> use the disks more efficiently. Access pattern will
> probably be more read- than write-heavy, on average.
>
> I