Hi all,
I am trying to use Ceph with RGW to store lots (>300M) of small files (80%
2-15kB, 20% up to 500kB).
After some testing, I wonder if Ceph is the right tool for that.
Do any of you have experience with this use case?
Things I came across:
- EC pools: default stripe-width is 4kB. Doe
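For context on the stripe-width point above: the stripe unit is set per erasure-code profile, and the effective stripe width is stripe_unit * k. A minimal sketch of setting and checking it; the profile name "smallobj" and the k/m values are only examples, not a recommendation for this workload:

# Create a profile with an explicit stripe unit (all values here are examples)
ceph osd erasure-code-profile set smallobj k=4 m=2 stripe_unit=4K
# Verify what the profile ended up with
ceph osd erasure-code-profile get smallobj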
Hello Pavel,
I don't have all that much info (fairly new to Ceph) but we are facing a
similar issue. If the cluster is fairly idle we get slow requests - if I'm
backfilling a new node there are no slow requests. Same X540 network cards but
ceph 12.2.5 and Ubuntu 16.04. 4.4.0 kernel. LACP with
Hi Oliver,
We have several CephFS on EC pool deployments; one has been in production for a
while, the others are about to be, pending all the Bluestore+EC fixes in 12.2.7 😊
Firstly, as John and Greg have said, you don't need an SSD cache pool at all.
Secondly, regarding k/m, it depends on how many hosts or
Your issue is different since not only do the omap digests of all
replicas not match the omap digest from the auth object info, but they
are all different from each other.
What is min_size of pool 67 and what can you tell us about the events
leading up to this?
On Mon, Jul 16, 2018 at 7:06 PM, Matth
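For reference, min_size can be read straight off the pool; a quick sketch (substitute the name of the pool with id 67):

# Map the pool id to a name, then query min_size
ceph osd lspools
ceph osd pool get <poolname> min_size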
Dear cephers,
Could someone tell me how to check the RBD volumes' modification times in a Ceph
pool? I am currently in the process of trimming our Ceph pool and would like to
start with volumes which have not been modified for a long time. How do I get that
information?
Cheers
Andrei
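For what it's worth, one rough way to approximate this is to look at the mtime of the image's RADOS objects. A sketch, assuming a pool called "rbd", an image called "vm-disk" and a format-2 image; all names and the <id> are placeholders:

# Find the image's internal id / object prefix (rbd_data.<id>)
rbd info rbd/vm-disk | grep block_name_prefix
# The header object's mtime reflects the last metadata change
rados -p rbd stat rbd_header.<id>
# The newest mtime across the data objects approximates the last write;
# this lists every object, so it can be slow for large images
rados -p rbd ls | grep '^rbd_data.<id>' | xargs -n1 rados -p rbd stat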
Hi there,
I would just like to note that for some scenarios the defaults are not good enough.
Recently we upgraded one of our clusters from Jewel to Luminous; during the
upgrade we removed all the custom tuning we had done on it over the years from
ceph.conf - I was extremely excited to get rid of
Hi Pavel,
Any strange messages on dmesg, syslog, etc?
I would recommend profiling the kernel with perf and checking for the calls
that are consuming the most CPU.
We had several problems like the one you are describing; for example, one of
them was fixed by increasing vm.min_free_kbytes to 4GB.
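A sketch of both suggestions; the 4GB value is just what happened to work in the case above, not a general recommendation:

# Sample call stacks system-wide for ~30s, then look at the hottest paths
perf record -a -g -- sleep 30
perf report
# Raise the free-memory watermark to 4GB (value is in kB); persist via /etc/sysctl.d if it helps
sysctl -w vm.min_free_kbytes=4194304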
Hello folks,
We've been having issues with slow requests cropping up on practically
idle ceph clusters. From what I can tell the requests are hanging
waiting for subops, and the OSD on the other end receives requests
minutes later! Below it started waiting for subops at 12:09:51 and the
subop was
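For anyone debugging the same thing: the ops and their "waiting for subops from ..." events can be pulled from the OSD admin socket on both ends; a sketch (the osd id is an example):

# On the primary that reports the slow request
ceph daemon osd.12 dump_ops_in_flight
ceph daemon osd.12 dump_historic_ops
# Then compare timestamps with the same dumps on the replica OSD named in the subop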
Golden advice. Thank you Greg
On Mon, Jul 16, 2018 at 1:45 PM, Gregory Farnum wrote:
> On Fri, Jul 13, 2018 at 2:50 AM Robert Stanford
> wrote:
>
>>
>> This is what leads me to believe it's other settings being referred to
>> as well:
>> https://ceph.com/community/new-luminous-rados-improvem
On Mon, Jul 16, 2018 at 1:25 AM John Spray wrote:
> On Sun, Jul 15, 2018 at 12:46 PM Oliver Schulz
> wrote:
> >
> > Dear all,
> >
> > we're planning a new Ceph-Cluster, with CephFS as the
> > main workload, and would like to use erasure coding to
> > use the disks more efficiently. Access patte
It's *repeatedly* crashing and restarting? I think the other times we've
seen this it was entirely ephemeral and went away on restart, and I really
don't know what about this state *could* be made persistent, so that's
quite strange. If you can set "debug monc = 20", reproduce this, and post
the lo
On Fri, Jul 13, 2018 at 2:50 AM Robert Stanford
wrote:
>
> This is what leads me to believe it's other settings being referred to as
> well:
> https://ceph.com/community/new-luminous-rados-improvements/
>
> *"There are dozens of documents floating around with long lists of Ceph
> configurables t
We are in the process of building the 12.2.7 release now that will fix
this. (If you don't want to wait you can also install the autobuilt
packages from shaman.ceph.com... official packages are only a few hours
away from being ready though).
I would pause data migration for the time being (noreb
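The advice above is truncated, but pausing data movement is usually done with cluster-wide OSD flags; a sketch, assuming the goal is to stop rebalancing/backfill until the fixed packages are installed:

# Stop data migration for the time being
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover
# Clear them again afterwards with "ceph osd unset <flag>"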
hello guys,
unfortunately I missed the warning on Friday and upgraded my cluster on
Saturday to 12.2.6.
The cluster is in a migration state from filestore to bluestore (10/2)
and I constantly get inconsistent PGs, only on the two bluestore OSDs.
If I run a rados list-inconsistent-obj 2.17 --fo
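The full details of the inconsistency can be dumped per PG; a sketch (the PG id is the one mentioned above, <pool> is a placeholder):

# Show which shards disagree and on what (size, data digest, omap digest, ...)
rados list-inconsistent-obj 2.17 --format=json-pretty
# List every PG currently flagged inconsistent in a pool
rados list-inconsistent-pg <pool>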
I just ran a test on a Samsung 850 Pro 500GB (how do I interpret the result of
the following output?)
[root@compute-01 tmp]# fio --filename=/dev/sda --direct=1 --sync=1
--rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based
--group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=
I dunno, to me benchmark tests are only really useful to compare different
drives.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Paul
Emmerich
Sent: Monday, July 16, 2018 8:41 AM
To: Satish Patel
Cc: ceph-users
Subject: Re: [ceph-users] SSDs for data drives
This does
On 07/15/2018 08:08 AM, Wladimir Mutel wrote:
> Hi,
>
> I cloned a NTFS with bad blocks from USB HDD onto Ceph RBD volume
> (using ntfsclone, so the copy has sparse regions), and decided to clean
> bad blocks within the copy. I run chkdsk /b from Windows and it fails on
> free space verifi
This doesn't look like a good benchmark:
(from the blog post)
dd if=/dev/zero of=/mnt/rawdisk/data.bin bs=1G count=20 oflag=direct
1. It writes compressible data, which some SSDs might compress; you should
use urandom instead.
2. That workload does not look like something Ceph will do to your disk,
like n
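A sketch of something closer to what Ceph actually does to a journal/WAL device, essentially the sync-write fio test quoted earlier in the thread (the device path is an example, and the test writes to the device destructively):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test
# Look at the sustained IOPS and the completion-latency percentiles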
Hi,
We run 5 RADOS Gateways on Luminous 12.2.5 as upstream servers in an nginx
active-active setup, based on keepalived.
Cluster is 12x Ceph nodes (16x 10TB OSD(bluestore) per node, 2x 10Gb
network link shared by access and cluster networks), RGW pool is EC 9+3.
We recently noticed below entries in R
Yes, that suggestion worked for us, although we hit this when we upgraded to
10.2.10 from 10.2.7.
I guess this was fixed via http://tracker.ceph.com/issues/21440 and
http://tracker.ceph.com/issues/19404
Thanks,
-Pavan.
On 7/16/18, 5:07 AM, "ceph-users on behalf of Matthew Vernon"
wrote:
Yes, I have tasks in `radosgw-admin reshard list`.
And the object count in .rgw.buckets.index is increasing, slowly.
But I am a bit confused. I have one big bucket with 161 shards.
…
"max_marker":
"0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#,11#,12#,13#,14#,15#,16#,17#,18#,19#,20#,21#,22#,23#,24#,25#,26#,2
Hi Zhang,
There is no way to resize the DB while the OSD is running. There is a somewhat
shorter "unofficial" but risky way than redeploying the OSD, though. But
you'll need to take the specific OSD out for a while in any case. You will
also need either additional free partition(s), or the initial deployment had
to be d
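Before attempting anything, it can help to see how full the DB volume actually is; a sketch using the admin socket (the osd id is an example):

# BlueFS counters include db_total_bytes and db_used_bytes
ceph daemon osd.3 perf dump | grep -E 'db_total_bytes|db_used_bytes'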
https://blog.cypressxt.net/hello-ceph-and-samsung-850-evo/
On Thu, Jul 12, 2018 at 3:37 AM, Adrian Saul
wrote:
>
>
> We started our cluster with consumer (Samsung EVO) disks and the write
> performance was pitiful; they had periodic spikes in latency (average of
> 8ms, but much higher spikes) and
Hi,
Do you have ongoing resharding? 'radosgw-admin reshard list' should show you
the status.
Do you see the number of objects in the .rgw.buckets.index pool increasing?
I hit a lot of problems trying to use auto resharding in 12.2.5 - I have
disabled it for the moment.
Thanks
[1] https://tracker.cep
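For reference, a sketch of how disabling it and watching the index pool typically looks (the config section and pool name may differ per cluster):

# In ceph.conf on the RGW hosts (restart the radosgw instances afterwards):
#   rgw_dynamic_resharding = false
# Check for queued/ongoing reshards and watch the index pool's object count:
radosgw-admin reshard list
rados df | grep buckets.index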
Dear John,
On 16.07.2018 16:25, John Spray wrote:
Since Luminous, you can use an erasure coded pool (on bluestore)
directly as a CephFS data pool, no cache pool needed.
Great! I'll be happy to go without
a cache pool then.
Thanks for your help, John,
Oliver
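A sketch of what that looks like on Luminous; the pool name, PG count, profile name and mount path are examples only:

# EC data pool for CephFS; overwrites must be enabled to use it as a data pool
ceph osd pool create cephfs_data_ec 1024 1024 erasure myprofile
ceph osd pool set cephfs_data_ec allow_ec_overwrites true
ceph fs add_data_pool <fsname> cephfs_data_ec
# Direct a directory's new files to the EC pool via a file layout
setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/bulk-data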
On 18-07-16 01:40 PM, Wido den Hollander wrote:
On 07/15/2018 11:12 AM, Mehmet wrote:
hello guys,
in my production cluster I have many objects like this
"#> rados -p rbd ls | grep 'benchmark'"
... .. .
benchmark_data_inkscope.example.net_32654_object1918
benchmark_data_server_26414_object1990
On 18-07-15 11:12 AM, Mehmet wrote:
hello guys,
in my production cluster I have many objects like this
"#> rados -p rbd ls | grep 'benchmark'"
... .. .
benchmark_data_inkscope.example.net_32654_object1918
benchmark_data_server_26414_object1990
... .. .
Is it safe to run "rados -p rbd cleanup" or
On 07/15/2018 11:12 AM, Mehmet wrote:
> hello guys,
>
> in my production cluster I have many objects like this
>
> "#> rados -p rbd ls | grep 'benchmark'"
> ... .. .
> benchmark_data_inkscope.example.net_32654_object1918
> benchmark_data_server_26414_object1990
> ... .. .
>
> Is it safe to run
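For what it's worth, rados cleanup can be limited to the leftover bench objects by prefix; a sketch (check the listing first):

# See what would be matched
rados -p rbd ls | grep '^benchmark_data' | head
# Remove only the rados bench leftovers with that prefix
rados -p rbd cleanup --prefix benchmark_data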
I am upgrading my clusters to Luminous. We are already using rados
gateway, and index max shards has been set for the rgw data pools. Now we
want to use Luminous dynamic index resharding. How do we make this
transition?
Regards
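A sketch of the pieces involved on Luminous (the bucket name and shard count are examples; dynamic resharding itself is governed by rgw_dynamic_resharding and rgw_max_objs_per_shard):

# Which buckets exceed the per-shard object threshold
radosgw-admin bucket limit check
# Buckets queued for automatic resharding
radosgw-admin reshard list
# Manual resharding of a single bucket is still possible
radosgw-admin bucket reshard --bucket=mybucket --num-shards=64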
Hi, guys.
I use Luminous 12.2.5.
Automatic bucket index resharding had not been activated in the past.
A few days ago I activated automatic resharding.
Since then I see:
- very high Ceph read I/O (~300 IOPS before activating resharding, ~4k now),
- very high Ceph read bandwidth (50 MB/s befor
Hi,
Recently, about two weeks ago, something strange started happening with
one of the Ceph clusters I'm managing. It's running Ceph Jewel 10.2.10
with a cache tier. Some OSDs started crashing with a "too many open
files" error. From looking at the issue I have found that it keeps a
lot of links in /pro
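A sketch of how to confirm the descriptor usage on a running OSD (assumes a single ceph-osd process on the host):

# Open file descriptors vs. the process limit
ls /proc/$(pidof -s ceph-osd)/fd | wc -l
grep 'Max open files' /proc/$(pidof -s ceph-osd)/limits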
Hi,
Our cluster is running 10.2.9 (from Ubuntu; on 16.04 LTS), and we have a
pg that's stuck inconsistent; if I repair it, it logs "failed to pick
suitable auth object" (repair log attached, to try and stop my MUA
mangling it).
We then deep-scrubbed that pg, at which point
rados list-inconsistent
On Sun, Jul 15, 2018 at 12:46 PM Oliver Schulz
wrote:
>
> Dear all,
>
> we're planning a new Ceph-Cluster, with CephFS as the
> main workload, and would like to use erasure coding to
> use the disks more efficiently. Access pattern will
> probably be more read- than write-heavy, on average.
>
> I