Well, it sounds like the pdcache setting may not be changeable for SSDs, which 
is the first I've ever heard of this.

I actually just checked another system that I forgot was behind a 3108 
controller with SSDs (not Ceph, so I wasn't considering it).
It looks like I ran into the same issue when configuring that one, as its VD's 
disk cache policy is set to "Default" and not "Disabled".

I think the best option is to try to configure it for "FastPath", which sadly 
has next to no documentation beyond material extolling its [purported] 
benefits.

> The per-virtual disk configuration required for Fast Path is:
> write-through cache policy
> no read-ahead
> direct (non-cache) I/O
> It provides the commands to set these (for adapter zero, logical disk zero): 
> megacli -LDSetProp WT Direct NORA L0 a0

I believe the storcli equivalents would be:
> storcli /cx/vx set wrcache=wt
> storcli /cx/vx set rdcache=nora
> storcli /cx/vx set iopolicy=direct
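
If it helps, here is a rough sketch for applying and then eyeballing those 
three settings across the SSD VDs. The controller/VD numbering (c0, v0-v15) is 
purely illustrative, so adjust it to your layout:

> for v in $(seq 0 15); do
>   storcli /c0/v$v set wrcache=wt
>   storcli /c0/v$v set rdcache=nora
>   storcli /c0/v$v set iopolicy=direct
> done
> storcli /c0/vall show all

In the VD list that the last command prints, the Cache column should end up 
reading something like NRWTD (no read-ahead, write-through, direct) once all 
three have taken effect.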

At this point that feels like the best option given the current constraints.

I don't know why they don't publish those settings more readily, but at least 
we have the Google machine for that.

Hopefully that may eke out a bit more performance.
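
To gauge whether it actually eked anything out, a quick before/after check at 
the RADOS level could look something like the below. The pool name "ssd" is 
just a placeholder for your SSD pool, and -t 1 roughly mirrors the 
single-thread, iodepth=1 numbers mentioned earlier in the thread:

> rados bench -p ssd 60 write -b 4096 -t 1 --no-cleanup
> rados bench -p ssd 60 rand -t 1
> rados -p ssd cleanup

It won't match fio inside a VM exactly, but it should be repeatable enough to 
show whether the cache policy change moved the needle.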

Reed

> On Sep 3, 2020, at 5:31 AM, VELARTIS Philipp Dürhammer 
> <p.duerham...@velartis.at> wrote:
> 
> In theory it should be possible to change the "Block SSD Write Disk Cache 
> Change = Yes" setting like this:
> Run MegaSCU -adpsettings -write -f mfc.ini -a0
> Edit the mfc.ini file, setting "blockSSDWriteCacheChange" to 0 instead of 1.
> Run MegaSCU -adpsettings -read -f mfc.ini -a0
> I tried it with MegaCLI, but I get an error; it is not working for me to save 
> the config file. No idea why…
>  
> My Config:
>  
> VD16 Properties :
> ===============
> Strip Size = 256 KB
> Number of Blocks = 3749642240
> VD has Emulated PD = Yes
> Span Depth = 1
> Number of Drives Per Span = 1
> Write Cache(initial setting) = WriteThrough
> Disk Cache Policy = Enabled
> Encryption = None
> Data Protection = Disabled
> Active Operations = None
> Exposed to OS = Yes
> Creation Date = 25-08-2020
> Creation Time = 12:05:41 PM
> Emulation type = None
>  
> Version :
> =======
> Firmware Package Build = 23.28.0-0010
> Firmware Version = 3.400.05-3175
> Bios Version = 5.46.02.0_4.16.08.00_0x06060900
> NVDATA Version = 2.1403.03-0128
> Boot Block Version = 2.05.00.00-0010
> Bootloader Version = 07.26.26.219
> Driver Name = megaraid_sas
> Driver Version = 07.703.05.00-rc1
>  
> Supported Adapter Operations :
> ============================
> Rebuild Rate = Yes
> CC Rate = Yes
> BGI Rate  = Yes
> Reconstruct Rate = Yes
> Patrol Read Rate = Yes
> Alarm Control = Yes
> Cluster Support = No
> BBU  = Yes
> Spanning = Yes
> Dedicated Hot Spare = Yes
> Revertible Hot Spares = Yes
> Foreign Config Import = Yes
> Self Diagnostic = Yes
> Allow Mixed Redundancy on Array = No
> Global Hot Spares = Yes
> Deny SCSI Passthrough = No
> Deny SMP Passthrough = No
> Deny STP Passthrough = No
> Support more than 8 Phys = Yes
> FW and Event Time in GMT = No
> Support Enhanced Foreign Import = Yes
> Support Enclosure Enumeration = Yes
> Support Allowed Operations = Yes
> Abort CC on Error = Yes
> Support Multipath = Yes
> Support Odd & Even Drive count in RAID1E = No
> Support Security = No
> Support Config Page Model = Yes
> Support the OCE without adding drives = Yes
> support EKM = No
> Snapshot Enabled = No
> Support PFK = Yes
> Support PI = Yes
> Support LDPI Type1 = No
> Support LDPI Type2 = No
> Support LDPI Type3 = No
> Support Ld BBM Info = No
> Support Shield State = Yes
> Block SSD Write Disk Cache Change = Yes -> this is not good, as it prevents 
> changing the SSD cache! Stupid!
> Support Suspend Resume BG ops = Yes
> Support Emergency Spares = Yes
> Support Set Link Speed = Yes
> Support Boot Time PFK Change = No
> Support JBOD = Yes
> Disable Online PFK Change = No
> Support Perf Tuning = Yes
> Support SSD PatrolRead = Yes
> Real Time Scheduler = Yes
> Support Reset Now = Yes
> Support Emulated Drives = Yes
> Headless Mode = Yes
> Dedicated HotSpares Limited = No
> Point In Time Progress = Yes
>  
> Supported VD Operations :
> =======================
> Read Policy = Yes
> Write Policy = Yes
> IO Policy = Yes
> Access Policy = Yes
> Disk Cache Policy = Yes (but only for HDDs in this case)
> Reconstruction = Yes
> Deny Locate = No
> Deny CC = No
> Allow Ctrl Encryption = No
> Enable LDBBM = No
> Support FastPath = Yes
> Performance Metrics = Yes
> Power Savings = No
> Support Powersave Max With Cache = No
> Support Breakmirror = No
> Support SSC WriteBack = No
> Support SSC Association = No
>  
> From: Reed Dier <reed.d...@focusvq.com>
> Sent: Wednesday, 02 September 2020 19:34
> To: VELARTIS Philipp Dürhammer <p.duerham...@velartis.at>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Can 16 server grade ssd's be slower then 60 hdds? 
> (no extra journals)
>  
> Just for the sake of curiosity, if you do a show all on /cX/vX, what is shown 
> for the VD properties?
> VD0 Properties :
> ==============
> Strip Size = 256 KB
> Number of Blocks = 1953374208
> VD has Emulated PD = No
> Span Depth = 1
> Number of Drives Per Span = 1
> Write Cache(initial setting) = WriteBack
> Disk Cache Policy = Disabled
> Encryption = None
> Data Protection = Disabled
> Active Operations = None
> Exposed to OS = Yes
> Creation Date = 17-06-2016
> Creation Time = 02:49:02 PM
> Emulation type = default
> Cachebypass size = Cachebypass-64k
> Cachebypass Mode = Cachebypass Intelligent
> Is LD Ready for OS Requests = Yes
> SCSI NAA Id = 600304801bb4c0001ef6ca5ea0fcb283
>  
> I'm wondering if the pdcache value must be set at VD creation, as it is a 
> creation option as well.
> If that's the case, maybe consider blowing away one of the SSD VDs, 
> recreating the VD and OSD, and seeing if you can measure a difference on that 
> disk specifically in testing.
>  
> It might also be helpful to document some of these values from /cX show all
>  
> Version :
> =======
> Firmware Package Build = 24.7.0-0026
> Firmware Version = 4.270.00-3972
> Bios Version = 6.22.03.0_4.16.08.00_0x060B0200
> Ctrl-R Version = 5.08-0006
> Preboot CLI Version = 01.07-05:#%0000
> NVDATA Version = 3.1411.00-0009
> Boot Block Version = 3.06.00.00-0001
> Driver Name = megaraid_sas
> Driver Version = 07.703.05.00-rc1
>  
> Supported Adapter Operations :
> ============================
> Support Shield State = Yes
> Block SSD Write Disk Cache Change = Yes
> Support Suspend Resume BG ops = Yes
> Support Emergency Spares = Yes
> Support Set Link Speed = Yes
> Support Boot Time PFK Change = No
> Support JBOD = Yes
>  
> Supported VD Operations :
> =======================
> Read Policy = Yes
> Write Policy = Yes
> IO Policy = Yes
> Access Policy = Yes
> Disk Cache Policy = Yes
> Reconstruction = Yes
> Deny Locate = No
> Deny CC = No
> Allow Ctrl Encryption = No
> Enable LDBBM = No
> Support FastPath = Yes
> Performance Metrics = Yes
> Power Savings = No
> Support Powersave Max With Cache = No
> Support Breakmirror = Yes
> Support SSC WriteBack = No
> Support SSC Association = No
> Support VD Hide = Yes
> Support VD Cachebypass = Yes
> Support VD discardCacheDuringLDDelete = Yes
>  
>  
> Advanced Software Option :
> ========================
>  
> ----------------------------------------
> Adv S/W Opt        Time Remaining  Mode
> ----------------------------------------
> MegaRAID FastPath  Unlimited       -
> MegaRAID RAID6     Unlimited       -
> MegaRAID RAID5     Unlimited       -
> ----------------------------------------
>  
>  
> Namely, on my 3108 controller, Block SSD Write Disk Cache Change = Yes 
> stands out to me.
> My controller has SAS HDDs behind it, though, so I may just not be hitting 
> the same issue, even though it may pertain to me as well.
> Also wondering if FastPath is enabled as well. I know on some of the older 
> controllers it was a paid feature, but they later opened it up for free, 
> though you may still need a software key to enable it.
>  
> Just looking to widen the net and hope we catch something.
>  
> Reed
> 
> 
> On Sep 2, 2020, at 7:38 AM, VELARTIS Philipp Dürhammer 
> <p.duerham...@velartis.at> wrote:
>  
> I assume you are referencing this parameter?
> 
> 
> storcli /c0/v0 set ssdcaching=<on|off>
> 
> 
> If so, this is for CacheCade, which is LSI's cache tiering solution, which 
> should both be off and not in use for ceph.
> 
> No, storcli /cx/vx set pdcache=off is denied because of the LSI setting 
> "Block SSD Write Disk Cache Change = Yes".
> I cannot find any firmware to upload, or any other way to change this.
> 
> Do you think that also disabling the write cache on the SSDs themselves 
> would help a lot? (Ceph is not aware of this, because smartctl -g wcache 
> /dev/sdX shows the cache as disabled - the cache on the LSI side is already 
> disabled.)
> The only way would be to buy some HBA cards and add them to the servers, but 
> that's a lot of work without knowing whether it will improve the speed a lot.
> 
> I am using RBD with hyperconverged nodes (4 at the moment); the pools are 2x 
> and 3x replicated. Actually, the performance of the Windows and Linux VMs on 
> the HDD OSD pool was OK, but it has been getting a little slower over time. I 
> just want to get ready for the future: we plan to put some bigger database 
> servers on the cluster (they are on local storage at the moment), and 
> therefore I want to increase the small random IOPS of the cluster a lot.
> 
> -----Original Message-----
> From: Reed Dier <reed.d...@focusvq.com>
> Sent: Tuesday, 01 September 2020 23:44
> To: VELARTIS Philipp Dürhammer <p.duerham...@velartis.at>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Can 16 server grade ssd's be slower then 60 hdds? 
> (no extra journals)
> 
> 
> there is an option set in the controller, "Block SSD Write Disk Cache Change 
> = Yes", which does not permit deactivating the SSD cache. I could not find 
> any solution on Google for this controller (LSI MegaRAID SAS 9271-8i) to 
> change this setting.
> 
> 
> I assume you are referencing this parameter?
> 
> storcli /c0/v0 set ssdcaching=<on|off>
> 
> If so, this is for CacheCade, which is LSI's cache tiering solution, which 
> should both be off and not in use for ceph.
> 
> Single thread and single iodepth benchmarks will tend to be underwhelming.
> Ceph shines with aggregate performance from lots of clients.
> And in an odd twist of fate, I typically see better performance on RBD for 
> random benchmarks rather than sequential benchmarks, as it distributes the 
> load across more OSDs.
> 
> It might also help others offer some pointers for tuning if you describe the 
> pool/application a bit more.
> 
> I.e., RBD vs. CephFS vs. RGW, 3x replicated vs. EC, etc.
> 
> At least things are trending in a positive direction.
> 
> Reed
> 
> 
> On Sep 1, 2020, at 4:21 PM, VELARTIS Philipp Dürhammer 
> <p.duerham...@velartis.at> wrote:
> 
> Thank you. I was working in this direction, and the situation is a lot 
> better. But I think I can still get far better.
> 
> I could set the controller to write-through, direct, and no read-ahead for 
> the SSDs.
> But I cannot disable the pdcache ☹ There is an option set in the controller, 
> "Block SSD Write Disk Cache Change = Yes", which does not permit deactivating 
> the SSD cache. I could not find any solution on Google for this controller 
> (LSI MegaRAID SAS 9271-8i) to change this setting.
> 
> I don't know how much performance gain deactivating the SSD cache will 
> bring. At least the Micron 5200 MAX has capacitors, so I hope it is safe 
> against data loss in case of power failure. I wrote a request to 
> LSI/Broadcom asking whether they know how I can change this setting. This is 
> really annoying.
> 
> I will check the CPU power settings. I also read somewhere that this can 
> improve IOPS a lot (if it is set badly).
> 
> At the moment I get 600 IOPS for 4k random writes with 1 thread and 1 
> iodepth. I get 40K 4k random IOPS across some instances with an iodepth of 
> 32. It's not the world, but a lot better than before. Reads are around 100k 
> IOPS. This is with 16 SSDs and 2x dual 10G NICs.
> 
> I was reading that with good tuning and hardware configuration you can get 
> more than 2000 single-thread IOPS out of the SSDs. I know that Ceph does not 
> shine with a single thread, but 600 IOPS is not very much...
> 
> philipp
> 
> -----Original Message-----
> From: Reed Dier <reed.d...@focusvq.com>
> Sent: Tuesday, 01 September 2020 22:37
> To: VELARTIS Philipp Dürhammer <p.duerham...@velartis.at>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Can 16 server grade ssd's be slower then 60 hdds? 
> (no extra journals)
> 
> If using storcli/perccli for manipulating the LSI controller, you can disable 
> the on-disk write cache with:
> storcli /cx/vx set pdcache=off
> 
> You can also ensure that you turn off write caching at the controller level 
> with 
> storcli /cx/vx set iopolicy=direct
> storcli /cx/vx set wrcache=wt
> 
> You can also tweak the readahead value for the vd if you want, though with an 
> ssd, I don't think it will be much of an issue.
> storcli /cx/vx set rdcache=nora
> 
> I'm sure the megacli alternatives are available with some quick searches.
> 
> You may also want to check your C-states and P-states to make sure there 
> aren't any aggressive power-saving features getting in the way.
> 
> Reed
> 
> 
> On Aug 31, 2020, at 7:44 AM, VELARTIS Philipp Dürhammer 
> <p.duerham...@velartis.at> wrote:
> 
> We have older LSI RAID controllers with no HBA/JBOD option, so we expose the 
> single disks as RAID0 devices. Ceph should not be aware of the cache status?
> But digging deeper into it, it seems that 1 out of 4 servers is performing a 
> lot better and has super low commit/apply latencies, while the others have a 
> lot more (20+) on heavy writes. This only applies to the SSDs; for the HDDs I 
> can't see a difference...
> 
> -----Original Message-----
> From: Frank Schilder <fr...@dtu.dk>
> Sent: Monday, 31 August 2020 13:19
> To: VELARTIS Philipp Dürhammer <p.duerham...@velartis.at>; 
> 'ceph-users@ceph.io' <ceph-users@ceph.io>
> Subject: Re: Can 16 server grade ssd's be slower then 60 hdds? (no extra 
> journals)
> 
> Yes, they can - if volatile write cache is not disabled. There are many 
> threads on this, also recent. Search for "disable write cache" and/or 
> "disable volatile write cache".
> 
> You will also find different methods of doing this automatically.
> 
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> 
> ________________________________________
> From: VELARTIS Philipp Dürhammer <p.duerham...@velartis.at>
> Sent: 31 August 2020 13:02:45
> To: 'ceph-users@ceph.io'
> Subject: [ceph-users] Can 16 server grade ssd's be slower then 60 hdds? (no 
> extra journals)
> 
> I have a production cluster with 60 OSDs and no extra journals. It is 
> performing okay. Now I added an extra SSD pool with 16 Micron 5100 MAX 
> drives, and the performance is slightly slower than or equal to the 60-HDD 
> pool, for 4K random as well as sequential reads. Everything is on a dedicated 
> 2x 10G network. The HDDs are still on Filestore; the SSDs are on BlueStore. 
> Ceph Luminous.
> What should be possible with 16 SSDs vs. 60 HDDs and no extra journals?
> 

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
