On Sun, 2019-04-28 at 15:39 +0800, Ming Lei wrote:
> Now scsi_mq_setup_tags() pre-allocates a big buffer for the IO sg list,
> and the buffer size is scsi_mq_sgl_size(), which depends on the smaller
> of shost->sg_tablesize and SG_CHUNK_SIZE.
> 
> Modern HBA's DMA is often capable of dealing with a very large number
> of segments, so scsi_mq_sgl_size() is often big. Suppose the max sg
> number of SG_CHUNK_SIZE is taken, scsi_mq_sgl_size() will be 4KB.
> 
> Then if one HBA has lots of queues, and each hw queue's depth is
> high, pre-allocation for sg list can consume huge memory.
> For example, with lpfc, nr_hw_queues can be 70 and each queue's depth
> can be 3781, so the pre-allocation for the data sg list is 70*3781*2k
> =517MB for a single HBA.
> 
> There is a Red Hat internal report that scsi_debug based tests can't
> be run any more since the legacy io path was killed, because the
> pre-allocation is too big.
> 
> So switch to runtime allocation for the sg list, and meanwhile
> pre-allocate 2 inline sg entries. This approach has been used by NVMe
> PCI for a while, so it should be fine for SCSI too. Also, runtime sg
> entry allocation has been verified and was always used in the original
> legacy io path.
> 
> No performance effect was seen in my big BS test on scsi_debug.

Reviewed-by: Bart Van Assche <bvanass...@acm.org>

