[ceph-users] Re: RGW listing slower on nominally faster setup

2020-06-12 Thread James, GleSYS
Hi,

I’m experiencing the same symptoms as OP.

We’re running Ceph Octopus 15.2.1 with RGW, and have seen on multiple occasions 
the bucket index pool go up to 500MB/s read throughput / 100K read IOPS.

Our logs during this time are flooded with these entries:
2020-06-09T07:11:18.070+0200 7f2676efd700  1 
RGWRados::Bucket::List::list_objects_ordered INFO ordered bucket listing 
requires read #1

When I set the debug_rgw logs to "20/1", the issue disappears immediately, and 
the throughput for the index pool goes back down to normal levels.
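
For reference, I flip the level roughly like this (the daemon name below is just an example, and the exact mechanism may differ depending on how your RGWs are deployed):

   # per RGW daemon, via the mon config store
   ceph config set client.rgw.gateway1 debug_rgw 20/1

   # or at runtime on the gateway host, via the admin socket
   ceph daemon client.rgw.gateway1 config set debug_rgw 20/1

   # put it back afterwards (1/5 is the default)
   ceph config set client.rgw.gateway1 debug_rgw 1/5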

I’ve not actually tried reproducing the issue myself as I assumed the problem 
was with the S3 client, but maybe this is a bug on the RGW side…

Our bucket index pool is running on the same HDDs as the data pool, with no 
separate SSDs like the OP has.

Regards,
James


> On 12 Jun 2020, at 02:22, sw...@tiltworks.com wrote:
> 
> Not seeing anything in OSD logs after triggering a listing, just heartbeat 
> entries.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Upload speed slow for 7MB file cephfs+Samba

2020-06-12 Thread Amudhan P
Hi,

I have a 4-node Ceph Octopus cluster, each node with 12 disks, which is
configured with cephfs (replica 2) and exposed via Samba to a Windows client
over 10G.

When a user copies a folder containing thousands of 7MB files from the Windows 10
client, we get a speed of only 40MB/s.
The client and Ceph nodes are all connected at 10G. In the same setup, copying a 1GB
file from the Windows client to Samba gets 90 MB/s.

Is there any kernel or network tuning that needs to be done?

Any suggestions?

regards
Amudhan P
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: dealing with spillovers

2020-06-12 Thread Igor Fedotov

Hi Reed,

thanks for the log.

Nothing much of interest there though. Just a regular SST file that 
RocksDB instructed to put on the "slow" device. Presumably it belongs to a 
higher level, hence the desire to put it that "far". Or (which is less 
likely) RocksDB lacked free space when doing compaction at some point 
and spilled some data out. So I was wrong - the ceph-kvstore-tool stats 
command output might be helpful...



Thanks,

Igor

On 6/11/2020 5:14 PM, Reed Dier wrote:

Apologies for the delay Igor,

Hopefully you are still interested in taking a look.

Attached is the bluestore bluefs-log-dump output.
I gzipped it as the log was very large.
Let me know if there is anything else I can do to help track this down.

Thanks,

Reed



On Jun 8, 2020, at 8:04 AM, Igor Fedotov wrote:


Reed,

No, "ceph-kvstore-tool stats" isn't be of any interest.

For the sake of better issue understanding it might be interesting to 
have bluefs log dump obtained via ceph-bluestore-tool's 
bluefs-log-dump command. This will give some insight what RocksDB 
files are spilled over.  It's still not clear what's the root cause 
for the issue. It's not that frequent and dangerous though so no 
active investigation on that...
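
Something along these lines should do it (osd.36 and the default data path 
are just examples; the OSD has to be stopped while the tool runs):

   systemctl stop ceph-osd@36
   ceph-bluestore-tool bluefs-log-dump --path /var/lib/ceph/osd/ceph-36 > osd.36-bluefs-log.txt
   systemctl start ceph-osd@36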


Wondering if migration has helped though?


Thanks,

Igor

On 6/6/2020 8:00 AM, Reed Dier wrote:

The WAL/DB was part of the OSD deployment.

OSD is running 14.2.9.

Would grabbing the ceph-kvstore-tool bluestore-kv  
stats as in that ticket be of any usefulness to this?


Thanks,

Reed

On Jun 5, 2020, at 5:27 PM, Igor Fedotov wrote:


This might help - see comment #4 at 
https://tracker.ceph.com/issues/44509



And just for the sake of information collection - what Ceph version 
is used in this cluster?


Did you set up the DB volume along with the OSD deployment, or was it 
added later as was done in the ticket above?



Thanks,

Igor

On 6/6/2020 1:07 AM, Reed Dier wrote:

I'm going to piggyback on this somewhat.

I've battled RocksDB spillovers over the course of the life of the 
cluster since moving to bluestore; however, I have always been able 
to compact them well enough.


But now I am stumped at getting this to compact via $ceph tell 
osd.$osd compact, which has always worked in the past.


No matter how many times I compact it, I always spill over exactly 
192KiB.

BLUEFS_SPILLOVER BlueFS spillover detected on 1 OSD(s)
     osd.36 spilled over 192 KiB metadata from 'db' device (26 
GiB used of 34 GiB) to slow device
     osd.36 spilled over 192 KiB metadata from 'db' device (16 
GiB used of 34 GiB) to slow device
     osd.36 spilled over 192 KiB metadata from 'db' device (22 
GiB used of 34 GiB) to slow device
     osd.36 spilled over 192 KiB metadata from 'db' device (13 
GiB used of 34 GiB) to slow device


The multiple entries are from different attempts to compact it.

The OSD is a 1.92TB SATA SSD, the WAL/DB is a 36GB partition on NVMe.
I tailed and tee'd the OSD's logs during a manual compaction here: 
https://pastebin.com/bcpcRGEe

This is with the normal logging level.
I have no idea how to make heads or tails of that log data, but 
maybe someone can figure out why this one OSD just refuses to compact?


OSD is 14.2.9.
OS is U18.04.
Kernel is 4.15.0-96.

I haven't played with ceph-bluestore-tool or ceph-kvstore-tool but 
after seeing the above mention in this thread, I do see 
ceph-kvstore-tool  compact, which sounds 
like it may be the same thing that ceph tell compact does under 
the hood?

compact
Subcommand compact is used to compact all data of kvstore. It 
will open the database, and trigger a database's compaction. 
After compaction, some disk space may be released.
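
For reference, the two variants as I understand them would be roughly the 
following (osd id and path are mine / the defaults):

   # online, asking the OSD itself to compact its RocksDB
   ceph tell osd.36 compact

   # offline, with the OSD stopped, directly against the bluestore kv store
   systemctl stop ceph-osd@36
   ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-36 compact
   systemctl start ceph-osd@36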


Also, not sure if this is helpful:
osd.36 spilled over 192 KiB metadata from 'db' device (13 GiB 
used of 34 GiB) to slow device
ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
36  ssd   1.77879  1.0     1.8 TiB 1.2 TiB 1.2 TiB 6.2 GiB 7.2 GiB 603 GiB 66.88 0.94  85     up osd.36

You can see the breakdown between OMAP data and META data.

After compacting again:
osd.36 spilled over 192 KiB metadata from 'db' device (26 GiB 
used of 34 GiB) to slow device
ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META   AVAIL   %USE  VAR  PGS STATUS TYPE NAME
36  ssd   1.77879  1.0     1.8 TiB 1.2 TiB 1.2 TiB 6.2 GiB 20 GiB 603 GiB 66.88 0.94  85     up osd.36


So the OMAP size remained the same, while the metadata ballooned 
(while still conspicuously spilling over exactly 192 KiB).
These OSDs have a few RBD images, cephfs metadata, and librados 
objects (not RGW) stored.


The breakdown of OMAP size is pretty widely binned, but the GiB 
sizes are definitely the minority.

Looking at the breakdown with some simple bash-fu
KiB = 147
MiB = 105
GiB = 24

To further divide that, all of the GiB-sized OMAPs are SSD OSDs:

         SSD    HDD    TOTAL
KiB      0

[ceph-users] Re: Poor Windows performance on ceph RBD.

2020-06-12 Thread Frank Schilder
Hi,

I think you are hit by two different problems at the same time. The second 
problem might be the same that we also experience, namely that Windows VMs have 
very strange performance characteristics with libvirt, vd driver and RBD. With 
copy operations on very large files (>2GB) we see a sharp drop of bandwidth 
after ca. 1 to 1.5GB to a measly 25MB/s for as yet unknown reasons. We cannot 
reproduce this behaviour with Linux VMs, so chances that this is a Windows and 
not a ceph problem are rather high.

The first problem, however, has to do with how ceph uses disks. Bare spinning 
disks have very poor performance characteristics, and a lot of development since 
their invention has gone into smart controllers (internal and external) with 
volatile and persistent caches and OS file buffers that attempt to translate 
typical user workloads into something that works reasonably well with spinning 
drives. The main ideas are to re-order and merge I/O, cache hot data, and 
absorb I/O bursts for constant write-back. The SANs you are used to are almost 
certainly high-end products with all the magic money can currently buy.

Ceph forcefully bypasses all of this logic, and a rule of thumb I'm following is 
that with ceph and current hardware, using current-generation drives will 
provide the previous generation's drive performance. With NVMes you can achieve SSD 
performance, with SSDs you get good spinning SAS drive performance, and with SAS 
drives you get, well, floppy or zip drive performance. I'm afraid that's what 
you are seeing, with 15 VMs saturating the available aggregate performance of 
the spindles.

If you want to stick with spindles as a data store, what you need is a fast, 
reliable persistent cache. Reliable here means that the firmware is free of 
bugs with respect to power outages, which is quite a requirement in itself. 
Some expensive disk controllers claim to have that; they offer a persistent NVMe 
cache. How much you want to trust the firmware is a different story. 
Alternatively, you could consider a few TB of NVMe drives for a ceph cache pool. 
People report that they are happy with that. As long as the cache pool can hold 
all hot data plus write bursts, I would also expect this to work fine.

Instead of caching, we decided to go for a split. We use datacenter-grade 
low-cost SSDs for a small all-flash pool for OS RBD disks and a large HDD-only 
pool for data storage. This works quite well, since the major annoying 
simultaneous I/O workload of Windows VMs happens on the OS disks. For ordinary 
data access, an EC HDD pool is perfectly fine and we provision machines with a 
second large data disk on HDD. Our users are quite happy with that model.
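
To sketch what I mean with device classes (pool names, PG counts and the EC 
profile below are placeholders, not our exact settings):

   # small replicated all-flash pool for the OS disks
   ceph osd crush rule create-replicated rbd-ssd default host ssd
   ceph osd pool create rbd-os 512 512 replicated rbd-ssd

   # large EC pool on HDD for the data disks
   ceph osd erasure-code-profile set ec-hdd k=4 m=2 crush-device-class=hdd crush-failure-domain=host
   ceph osd pool create rbd-data 1024 1024 erasure ec-hdd
   ceph osd pool set rbd-data allow_ec_overwrites true

   # RBD images keep their metadata in the replicated pool and place data on the EC pool
   rbd create rbd-os/vm01-data --size 2T --data-pool rbd-data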

In any case, we are still stuck with the strange performance drop with Windows 
machines that you also seem to observe and are still looking for help with 
that. If you manage to figure out what is going on, I would like to hear about 
that. So far, we haven't found a clue.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: jchar...@provectio.fr 
Sent: 11 June 2020 12:38:32
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Poor Windows performance on ceph RBD.

Hello,

we are using the same environment, OpenNebula + Ceph.
Our ceph cluster is composed of 5 ceph OSD hosts with SSDs and 10k rpm and 
7.2k rpm spinning drives, with a 10Gb/s fiber network.
Each spinning OSD has its DB and WAL devices on SSD.

Nearly all our Windows VM RBD images are in a 10k rpm pool with erasure coding.
For the moment we are housing about 15 VMs (RDS and Exchange).

What we are noticing:
   - VMs are far from responding as well as on our old 10k SAN (less than 30%)
   - average RBD latency oscillates between 50 ms and 250 ms, with some peaks 
that can reach a second
   - some tests (CrystalDiskMark) from inside the VM can show performance up 
to 700 MB/s on read and 170 MB/s on write, but a single file copy barely reaches 
150 MB/s and stays at a poor 25 MB/s most of the time
   - 4K random tests show IOPS up to 4K read and 2K write, but seen from the 
RBD point of view, it's like the image can barely go over 500 IOPS (read+write)

Since we have to migrate our VMs from the old SAN to Ceph, I am really worried: 
there are more than 150 VMs on it, and our Ceph seems to have a hard time coping 
with 15 VMs.

I can't find accurate data or relevant calculation templates that would let me 
evaluate what I can expect.
All the documents I've read (and I read a lot ;) ) only report empirical 
observations like "it's better" or "it's worse".
There are a lot of parameters we can tweak, like block size, striping, stripe 
size, stripe count, ... but those are poorly documented, especially the relation 
between them.
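
For what it's worth, the striping parameters I mean are set per image at 
creation time, something like this (values purely illustrative):

   rbd create mypool/test-image --size 100G \
       --object-size 4M --stripe-unit 64K --stripe-count 8
   rbd info mypool/test-image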

I will be more than happy to work with some people who are in the same 
situation, to try to find solutions and methods which can help us to be sure 
of our design. And brea

[ceph-users] ceph grafana dashboards on git

2020-06-12 Thread Marc Roos


I was wondering if I can fork and do a pull request on the grafana 
dashboards at git[1].

Clean up the somewhat inconsistent naming and use of labels, etc.

[1]
https://github.com/ceph/ceph/tree/master/monitoring/grafana/dashboards
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: dealing with spillovers

2020-06-12 Thread Reed Dier
Thanks for sticking with me, Igor.

Attached is the ceph-kvstore-tool stats output.

Hopefully there is something interesting in here.

Thanks,
Reed

kvstoretool.log
Description: Binary data
On Jun 12, 2020, at 6:56 AM, Igor Fedotov  wrote:

Hi Reed,

thanks for the log.

Nothing much of interest there though. Just a regular SST file that 
RocksDB instructed to put at "slow" device. Presumably it belongs to a 
higher level hence the desire to put it that "far". Or (which is less 
likely) RocksDB lacked free space when doing compaction at some point 
and spilled some data out. So I was wrong - ceph-kvstore's stats command 
output might be helpful...

Thanks,

Igor

On 6/11/2020 5:14 PM, Reed Dier wrote:

Apologies for the delay Igor,

Hopefully you are still interested in taking a look.

Attached is the bluestore bluefs-log-dump output.
I gzipped it as the log was very large.
Let me know if there is anything else I can do to help track this down.

Thanks,

Reed

On Jun 8, 2020, at 8:04 AM, Igor Fedotov  wrote:

Reed,

No, "ceph-kvstore-tool stats" isn't be of any interest.

For the sake of better issue understanding it might be interesting to 
have bluefs log dump obtained via ceph-bluestore-tool's bluefs-log-dump 
command. This will give some insight what RocksDB files are spilled 
over.  It's still not clear what's the root cause for the issue. It's 
not that frequent and dangerous though so no active investigation on 
that...

Wondering if migration has helped though?

Thanks,

Igor

On 6/6/2020 8:00 AM, Reed Dier wrote:

The WAL/DB was part of the OSD deployment.

OSD is running 14.2.9.

Would grabbing the ceph-kvstore-tool bluestore-kv  stats as in that 
ticket be of any usefulness to this?

Thanks,

Reed

On Jun 5, 2020, at 5:27 PM, Igor Fedotov  wrote:

This might help -see comment #4 at https://tracker.ceph.com/issues/44509

And just for the sake of information collection - what Ceph version 
is used in this cluster?

Did you setup DB volume along with OSD deployment or they were added 
later as  was done in the ticket above?

Thanks,

Igor

On 6/6/2020 1:07 AM, Reed Dier wrote:

I'm going to piggy back on this somewhat.

I've battled RocksDB spillovers over the course of the life of the 
cluster since moving to bluestore, however I have always been able to 
compact it well enough.

But now I am stumped at getting this to compact via $ceph tell 
osd.$osd compact, which has always worked in the past.

No matter how many times I compact it, I always spill over exactly 
192KiB.

BLUEFS_SPILLOVER BlueFS spillover detected on 1 OSD(s)
     osd.36 spilled over 192 KiB metadata from 'db' device (26 GiB 
used of 34 GiB) to slow device

[ceph-users] ceph grafana dashboards: rbd overview empty

2020-06-12 Thread Marc Roos


The grafana dashboard 'rbd overview' is empty. Its queries use metrics such as 
'ceph_rbd_write_ops' that do not exist in prometheus (I think). Should I 
enable something more than just 'ceph mgr module enable prometheus'?

I am on Nautilus



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph grafana dashboards: osd device details keeps loading.

2020-06-12 Thread Marc Roos


I sometimes have the dashboard keep loading when switching to the 3-hour 
range. However, I do not see any load on the prometheus server. 
Is anyone seeing something similar?




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: dealing with spillovers

2020-06-12 Thread Igor Fedotov

hmm, RocksDB reports 13GB at L4:

 "": "Level    Files   Size Score Read(GB)  Rn(GB) Rnp1(GB) 
Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) 
CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop",
    "": 
"",
    "": "  L0  2/0   29.39 MB   0.5  0.0 0.0 0.0   
0.0  0.0   0.0   0.0  0.0  0.0 0.00  
0.00 0    0.000   0  0",
    "": "  L1  1/0   22.31 MB   0.6  0.0 0.0 0.0   
0.0  0.0   0.0   0.0  0.0  0.0 0.00  
0.00 0    0.000   0  0",
    "": "  L2  2/0   94.03 MB   0.3  0.0 0.0 0.0   
0.0  0.0   0.0   0.0  0.0  0.0 0.00  
0.00 0    0.000   0  0",
    "": "  L3 12/0   273.29 MB   0.3  0.0 0.0 0.0   
0.0  0.0   0.0   0.0  0.0  0.0 0.00  
0.00 0    0.000   0  0",
    "": "  L4    205/0   12.82 GB   0.1  0.0 0.0 0.0   
0.0  0.0   0.0   0.0  0.0  0.0 0.00  
0.00 0    0.000   0  0",
    "": " Sum    222/0   13.23 GB   0.0  0.0 0.0 0.0   
0.0  0.0   0.0   0.0  0.0  0.0 0.00  
0.00 0    0.000   0  0",


which is unlikely to be correct...

No more ideas but do data migration using ceph-bluestore-tool.
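
For reference, I mean the bluefs-bdev-migrate command, roughly like this (OSD 
stopped; the block/block.db links under the default OSD path are used here - 
adjust to your actual layout):

   systemctl stop ceph-osd@36
   # move BlueFS files that ended up on the slow (main) device back to the DB device
   ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-36 \
       --devs-source /var/lib/ceph/osd/ceph-36/block \
       --dev-target /var/lib/ceph/osd/ceph-36/block.db
   systemctl start ceph-osd@36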

I would appreciate if you share whether it helps in both short- and 
long-term. Will this reappear or not?



Thanks,

Igor


On 6/12/2020 5:17 PM, Reed Dier wrote:

Thanks for sticking with me Igor.

Attached is the ceph-kvstore-tool stats output.

Hopefully something interesting in here.

Thanks,

Reed




On Jun 12, 2020, at 6:56 AM, Igor Fedotov wrote:


Hi Reed,

thanks for the log.

Nothing much of interest there though. Just a regular SST file that 
RocksDB instructed to put at "slow" device. Presumably it belongs to 
a higher level hence the desire to put it that "far". Or (which is 
less likely) RocksDB lacked free space when doing compaction at some 
point and spilled some data out. So I was wrong - ceph-kvstore's 
stats command output might be helpful...



Thanks,

Igor

On 6/11/2020 5:14 PM, Reed Dier wrote:

Apologies for the delay Igor,

Hopefully you are still interested in taking a look.

Attached is the bluestore bluefs-log-dump output.
I gzipped it as the log was very large.
Let me know if there is anything else I can do to help track this down.

Thanks,

Reed



On Jun 8, 2020, at 8:04 AM, Igor Fedotov wrote:


Reed,

No, "ceph-kvstore-tool stats" isn't be of any interest.

For the sake of better issue understanding it might be interesting 
to have bluefs log dump obtained via ceph-bluestore-tool's 
bluefs-log-dump command. This will give some insight what RocksDB 
files are spilled over.  It's still not clear what's the root cause 
for the issue. It's not that frequent and dangerous though so no 
active investigation on that...


Wondering if migration has helped though?


Thanks,

Igor

On 6/6/2020 8:00 AM, Reed Dier wrote:

The WAL/DB was part of the OSD deployment.

OSD is running 14.2.9.

Would grabbing the ceph-kvstore-tool bluestore-kv  
stats as in that ticket be of any usefulness to this?


Thanks,

Reed

On Jun 5, 2020, at 5:27 PM, Igor Fedotov wrote:


This might help -see comment #4 at 
https://tracker.ceph.com/issues/44509



And just for the sake of information collection - what Ceph 
version is used in this cluster?


Did you setup DB volume along with OSD deployment or they were 
added later as  was done in the ticket above?



Thanks,

Igor

On 6/6/2020 1:07 AM, Reed Dier wrote:

I'm going to piggy back on this somewhat.

I've battled RocksDB spillovers over the course of the life of 
the cluster since moving to bluestore, however I have always 
been able to compact it well enough.


But now I am stumped at getting this to compact via $ceph tell 
osd.$osd compact, which has always worked in the past.


No matter how many times I compact it, I always spill over 
exactly 192KiB.

BLUEFS_SPILLOVER BlueFS spillover detected on 1 OSD(s)
     osd.36 spilled over 192 KiB metadata from 'db' device (26 
GiB used of 34 GiB) to slow device
     osd.36 spilled over 192 KiB metadata from 'db' device (16 
GiB used of 34 GiB) to slow device
     osd.36 spilled over 192 KiB metadata from 'db' device (22 
GiB used of 34 GiB) to slow device
     osd.36 spilled over 192 KiB metadata from 'db' device (13 
GiB used of 34 GiB) to slow device


The multiple entries are from different time trying to compact it.

The OSD is a 1.92TB SATA SSD, the WAL/DB is a 36GB partition on 
NVMe.
I tailed and tee'd the OSD's logs during a manual compaction 
here: https://pastebin.com/bcpcRGEe

This is wi

[ceph-users] ceph on rhel7 / centos7 till eol?

2020-06-12 Thread Marc Roos


Will there be a ceph release available on rhel7 until the eol of rhel7?




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph on rhel7 / centos7 till eol?

2020-06-12 Thread Dietmar Rieder
On 2020-06-12 16:35, Marc Roos wrote:
> 
> Will there be a ceph release available on rhel7 until the eol of rhel7?

much needed here as well
+1

Would be really great, Thanks a lot.

Dietmar

-- 
_
D i e t m a r  R i e d e r, Mag.Dr.
Innsbruck Medical University
Biocenter - Institute of Bioinformatics
Email: dietmar.rie...@i-med.ac.at
Web:   http://www.icbi.at




signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW listing slower on nominally faster setup

2020-06-12 Thread Stefan Wild
On 6/12/20, 5:40 AM, "James, GleSYS"  wrote:

> When I set the debug_rgw logs to "20/1", the issue disappears immediately, 
> and the throughput for the index pool goes back down to normal levels.

I can – somewhat happily – confirm that setting debug_rgw to "20/1" makes the 
issue disappear instantly. Even if the RGW is in the middle of a "stuck" 
listing, the debug level change causes the load to drop and results appear on 
the client almost instantly. After setting debug_rgw back to "5/1" the listings 
get stuck again, which in our case is not just occasionally, but always and for 
every bucket. Not exactly a solution, since we're already having some trouble 
keeping the docker container logs to a manageable size, but might be good 
enough as a workaround.

Not sure how we can find out steps to reproduce the issue in the first place. 
Happy to do some testing if anyone has suggestions. Also, I'm inclined to 
report this as a bug at this point unless there's opposing advice.

Thanks,
Stefan




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: dealing with spillovers

2020-06-12 Thread Reed Dier
Thanks Igor,

I did see that L4 sizing and thought it seemed suspicious.
Though after looking at a couple of other OSDs with this, I think the 
sum of L0-L4 appears to match a rounded-off version of the metadata size 
reported in ceph osd df tree.
So I'm not sure if that's actually showing the size of the level store, or just 
what is stored in each level?
> No more ideas but do data migration using ceph-bluestore-tool. 
> 
Would this imply backing up the current block.db, then re-creating the block.db 
and moving the backup to the new block.db?

Just asking because I have never touched moving the block.db/WAL, and was 
actually under the impression that it could not be done until the last few years, 
as more people kept running into spillovers.

Previously, when I was expanding my block.db, I was just re-paving the OSDs, 
which would be my likely course of action for this OSD if I am unsuccessful in 
clearing this as is.

Would that be bluefs-export and then bluefs-bdev-new-db?
Though that doesn't exactly look like it would work.

I don't think I could do migrate due to not having another block device to 
migrate from and to.

Should/could I try bluefs-bdev-expand to see if it sees a bigger partition and 
tries to use it?
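
(For clarity, what I have in mind is just something like the following, with 
the OSD stopped - though as I understand it, it only helps if the underlying 
DB partition was actually grown:)

   systemctl stop ceph-osd@36
   ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-36
   ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-36
   systemctl start ceph-osd@36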

Otherwise, at this point I feel like re-paving may be the best path forward; I 
just wanted to provide any possible data points before doing that.

Thanks again for the help,

Reed

> On Jun 12, 2020, at 9:34 AM, Igor Fedotov  wrote:
> 
> hmm, RocksDB reports 13GB at L4:
> 
>  "": "LevelFiles   Size Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) 
> Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) 
> Comp(cnt) Avg(sec) KeyIn KeyDrop",
> "": 
> "",
> "": "  L0  2/0   29.39 MB   0.5  0.0 0.0  0.0   0.0   
>0.0   0.0   0.0  0.0  0.0  0.00  0.00 
> 00.000   0  0",
> "": "  L1  1/0   22.31 MB   0.6  0.0 0.0  0.0   0.0   
>0.0   0.0   0.0  0.0  0.0  0.00  0.00 
> 00.000   0  0",
> "": "  L2  2/0   94.03 MB   0.3  0.0 0.0  0.0   0.0   
>0.0   0.0   0.0  0.0  0.0  0.00  0.00 
> 00.000   0  0",
> "": "  L3 12/0   273.29 MB   0.3  0.0 0.0  0.0   0.0  
> 0.0   0.0   0.0  0.0  0.0  0.00  0.00 
> 00.000   0  0",
> "": "  L4205/0   12.82 GB   0.1  0.0 0.0  0.0   0.0   
>0.0   0.0   0.0  0.0  0.0  0.00  0.00 
> 00.000   0  0",
> "": " Sum222/0   13.23 GB   0.0  0.0 0.0  0.0   0.0   
>0.0   0.0   0.0  0.0  0.0  0.00  0.00 
> 00.000   0  0",
> 
> which is unlikely to be correct...
> 
> No more ideas but do data migration using ceph-bluestore-tool. 
> 
> I would appreciate if you share whether it helps in both short- and 
> long-term. Will this reappear or not?
> 
> 
> Thanks,
> 
> Igor
> 
> 
> 
> On 6/12/2020 5:17 PM, Reed Dier wrote:
>> Thanks for sticking with me Igor.
>> 
>> Attached is the ceph-kvstore-tool stats output.
>> 
>> Hopefully something interesting in here.
>> 
>> Thanks,
>> 
>> Reed
>> 
>> 
>> 
>> 
>> 
>>> On Jun 12, 2020, at 6:56 AM, Igor Fedotov >> > wrote:
>>> 
>>> Hi Reed,
>>> 
>>> thanks for the log.
>>> 
>>> Nothing much of interest there though. Just a regular SST file that RocksDB 
>>> instructed to put at "slow" device. Presumably it belongs to a higher level 
>>> hence the desire to put it that "far". Or (which is less likely) RocksDB 
>>> lacked free space when doing compaction at some point and spilled some data 
>>> out. So I was wrong - ceph-kvstore's stats command output might be 
>>> helpful...
>>> 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Igor
>>> 
>>> On 6/11/2020 5:14 PM, Reed Dier wrote:
 Apologies for the delay Igor,
 
 Hopefully you are still interested in taking a look.
 
 Attached is the bluestore bluefs-log-dump output.
 I gzipped it as the log was very large.
 Let me know if there is anything else I can do to help track this down.
 
 Thanks,
 
 Reed
 
 
 
 
> On Jun 8, 2020, at 8:04 AM, Igor Fedotov  > wrote:
> 
> Reed,
> 
> No, "ceph-kvstore-tool stats" isn't be of any interest.
> 
> For the sake of better issue understanding it might be interesting to 
> have bluefs log dump obtained via ceph-bluestore-tool's bluefs-log-dump 
> command. This will give some insight what RocksDB files are spilled over. 
>  It's still not clear what's the root cause for the issue. It's not that 
> freq

[ceph-users] Re: help with failed osds after reboot

2020-06-12 Thread Eugen Block

Hi,

which ceph release are you using? You mention ceph-disk so your OSDs  
are not LVM based, I assume?


I've seen these messages a lot when testing in my virtual lab  
environment although I don't believe it's the cluster's fsid but the  
OSD's fsid that's in the error message (the OSDs have their own ID,  
too, take a look in /var/lib/ceph/osd/ceph-/fsid). When I did  
several re-installs of the whole cluster I had to make sure to  
properly wipe the disks but sometimes only a reboot did the trick. Of  
course, this is not an option in your situation.


If your OSDs are systemd units, check for orphaned units that need to  
be disabled before restarting the correct ones. Did you re-deploy  
some of those disks?
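
Something along these lines might help to compare (osd.0 is just an example id):

   # each OSD data dir records its own fsid and the cluster fsid it belongs to
   cat /var/lib/ceph/osd/ceph-0/fsid
   cat /var/lib/ceph/osd/ceph-0/ceph_fsid
   ceph fsid

   # look for stale or disabled osd units
   systemctl list-units 'ceph-osd@*' --all
   systemctl status ceph-osd@0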


Regards,
Eugen


Zitat von Seth Duncan :

I had 5 of 10 osds fail on one of my nodes; after a reboot the other 5  
osds failed to start.


I have tried running ceph-disk activate-all and get back an error  
message about the cluster fsid not matching in /etc/ceph/ceph.conf


Has anyone experienced an issue such as this?



***
IMPORTANT MESSAGE FOR RECIPIENTS IN THE U.S.A.:
This message may constitute an advertisement of a BD group's  
products or services or a solicitation of interest in them. If this  
is such a message and you would like to opt out of receiving future  
advertisements or solicitations from this BD group, please forward  
this e-mail to optoutbygr...@bd.com. [BD.v1.0]

***
This message (which includes any attachments) is intended only for  
the designated recipient(s). It may contain confidential or  
proprietary information and may be subject to the attorney-client  
privilege or other confidentiality protections. If you are not a  
designated recipient, you may not review, use, copy or distribute  
this message. If you received this in error, please notify the  
sender by reply e-mail and delete this message. Thank you.

***
Corporate Headquarters Mailing Address: BD (Becton, Dickinson and  
Company) 1 Becton Drive Franklin Lakes, NJ 07417 U.S.A.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: help with failed osds after reboot

2020-06-12 Thread Marc Roos
 
Maybe you have the same issue?
https://tracker.ceph.com/issues/44102#change-167531

In my case an update(?) disabled osd runlevels. 
systemctl is-enabled ceph-osd@0
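
If they come back as disabled, re-enabling and starting them is roughly 
(adjust the ids to yours):

systemctl enable ceph-osd@0
systemctl start ceph-osd@0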




-Original Message-
To: ceph-users@ceph.io
Subject: [ceph-users] Re: help with failed osds after reboot

Hi,

which ceph release are you using? You mention ceph-disk so your OSDs are 
not LVM based, I assume?

I've seen these messages a lot when testing in my virtual lab 
environment although I don't believe it's the cluster's fsid but the 
OSD's fsid that's in the error message (the OSDs have their own ID, too, 
take a look in /var/lib/ceph/osd/ceph-/fsid). When I did several 
re-installs of the whole cluster I had to make sure to properly wipe the 
disks but sometimes only a reboot did the trick. Of course, this is not 
an option in your situation.

If your OSDs are systemd units check for orphaned units that need to be 
to disabled before restarting the correct ones. Did you re-deploy some 
of those disks?

Regards,
Eugen


Zitat von Seth Duncan :

> I had 5 of 10 osds fail on one of my nodes, after reboot the other 5 
> osds failed to start.
>
> I have tried running ceph-disk activate-all and get back and error 
> message about the cluster fsid not matching in /etc/ceph/ceph.conf
>
> Has anyone experienced an issue such as this?
>
>
>
> ***
> IMPORTANT MESSAGE FOR RECIPIENTS IN THE U.S.A.:
> This message may constitute an advertisement of a BD group's products 
> or services or a solicitation of interest in them. If this is such a 
> message and you would like to opt out of receiving future 
> advertisements or solicitations from this BD group, please forward 
> this e-mail to optoutbygr...@bd.com. [BD.v1.0]
> ***
> This message (which includes any attachments) is intended only for the 

> designated recipient(s). It may contain confidential or proprietary 
> information and may be subject to the attorney-client privilege or 
> other confidentiality protections. If you are not a designated 
> recipient, you may not review, use, copy or distribute this message. 
> If you received this in error, please notify the sender by reply 
> e-mail and delete this message. Thank you.
> ***
> Corporate Headquarters Mailing Address: BD (Becton, Dickinson and
> Company) 1 Becton Drive Franklin Lakes, NJ 07417 U.S.A.
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is there a way froce sync metadata in a multisite cluster

2020-06-12 Thread pradeep8985
黄明友 wrote:
> Hi,all:
> 
>  the slave zone show  metadata is caught up with master ; but use
> radosgw-admin bucket list|wc diff  master and the slave zone , is not equal. 
> how can I force sync it?

I too face the same problem. I see new buckets getting created in the master 
zone; however, they are not getting replicated. I restart the radosgw service 
to work around it, but that's not the way it should work.
Can somebody please help resolve the issue?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] radosgw - how to grant read-only access to another user by default

2020-06-12 Thread Paul Choi
Hi,

I'm new to radosgw (learned more about the MDS than I care to...), and it
seems like the buckets and objects created by one user cannot be accessed
by another user.

Is there a way to make any content created by User A accessible (read-only)
by User B?
>From the documentation it looks like this is handled as an S3 permission
but I'm not finding an easy/obvious way to do this.

Any help would be appreciated. Thanks in advance!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw - how to grant read-only access to another user by default

2020-06-12 Thread Marc Roos
 
Yes, best done via bucket policies:
https://docs.ceph.com/docs/mimic/radosgw/bucketpolicy/
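
A minimal sketch of such a policy (bucket name and uid are placeholders; check 
the page above for the exact Principal format your setup needs, e.g. with 
tenants):

# policy.json - grants user "userB" read access to user A's bucket "mybucket"
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/userB"]},
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
  }]
}

# applied by the bucket owner, e.g. with s3cmd
s3cmd setpolicy policy.json s3://mybucket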






-Original Message-

To: ceph-users@ceph.io
Subject: [ceph-users] radosgw - how to grant read-only access to another 
user by default

Hi,

I'm new to radosgw (learned more about the MDS than I care to...), and 
it seems like the buckets and objects created by one user cannot be 
accessed by another user.

Is there a way to make any content created by User A accessible 
(read-only) by User B?
>From the documentation it looks like this is handled as an S3 permission 
but I'm not finding an easy/obvious way to do this.

Any help would be appreciated. Thanks in advance!
___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io