Hello Roch,

Monday, May 22, 2006, 3:42:41 PM, you wrote:
RBPE> Robert Says:
RBPE> Just to be sure - you did reconfigure the system to actually allow
RBPE> larger IO sizes?

RBPE> Sure enough, I messed up (I had no tuning in place for the data above),
RBPE> so 1 MB was my max transfer size. Using 8 MB I now see:

RBPE>   Bytes sent   Elapsed phys IO   Avg I/O size   Throughput
RBPE>     8 MB         3576 ms            16 KB         2 MB/s
RBPE>     9 MB         1861 ms            32 KB         4 MB/s
RBPE>    31 MB         3450 ms            64 KB         8 MB/s
RBPE>    78 MB         4932 ms           128 KB        15 MB/s
RBPE>   124 MB         4903 ms           256 KB        25 MB/s
RBPE>   178 MB         4868 ms           512 KB        36 MB/s
RBPE>   226 MB         4824 ms          1024 KB        46 MB/s
RBPE>   226 MB         4816 ms          2048 KB        54 MB/s  (was 46 MB/s)
RBPE>    32 MB          686 ms          4096 KB        58 MB/s  (was 46 MB/s)
RBPE>   224 MB         4741 ms          8192 KB        59 MB/s  (was 47 MB/s)
RBPE>   272 MB         4336 ms         16384 KB        58 MB/s  (new data)
RBPE>   288 MB         4327 ms         32768 KB        59 MB/s  (new data)

RBPE> The data was corrected after it was pointed out that physio will be
RBPE> throttled by maxphys. The new data was obtained after setting:

RBPE>   /etc/system:           set maxphys=8388608
RBPE>   /kernel/drv/sd.conf:   sd_max_xfer_size=0x800000
RBPE>   /kernel/drv/ssd.conf:  ssd_max_xfer_size=0x800000

RBPE> and setting un_max_xfer_size in "struct sd_lun". That address was
RBPE> figured out using dtrace and knowing that sdmin() calls
RBPE> ddi_get_soft_state (details available upon request).

RBPE> And of course disabling the write cache (using format -e).

RBPE> With this in place I verified that each sdwrite() of up to 8 MB leads
RBPE> to a single biodone interrupt, using this:

RBPE>   dtrace -n 'biodone:entry,sdwrite:[EMAIL PROTECTED], stack(20)]=count()}'

RBPE> Note that for 16 MB and 32 MB raw device writes, each default_physio
RBPE> will issue a series of 8 MB I/Os, so we don't expect any more
RBPE> throughput from those.

RBPE> The script used to measure the rates (phys.d) was also modified, since
RBPE> I was counting the bytes before the I/O had completed, and that made a
RBPE> big difference for the very large I/O sizes.

RBPE> If you take the 8 MB case, the rate above corresponds to the time it
RBPE> takes to issue and wait for a single 8 MB I/O to the sd driver. So this
RBPE> time certainly includes one seek and ~0.13 seconds of data transfer,
RBPE> then the time to respond to the interrupt, and finally the wakeup of
RBPE> the thread waiting in default_physio(). Given that the data transfer
RBPE> rate using 4 MB is very close to the one using 8 MB, I'd say that at
RBPE> 60 MB/s all the fixed-cost elements are well amortized. So I would
RBPE> conclude from this that the limiting factor is now the device itself
RBPE> or the data channel between the disk and the host.

RBPE> Now recall the throughput that ZFS gets during an spa_sync when
RBPE> submitted to a single dd, knowing that ZFS will work with 128K I/O:

RBPE>   1431 MB; 23723 ms of spa_sync; avg sz : 127 KB; throughput 60 MB/s
RBPE>   1387 MB; 23044 ms of spa_sync; avg sz : 127 KB; throughput 60 MB/s
RBPE>   2680 MB; 44209 ms of spa_sync; avg sz : 127 KB; throughput 60 MB/s
RBPE>   1359 MB; 24223 ms of spa_sync; avg sz : 127 KB; throughput 56 MB/s
RBPE>   1143 MB; 19183 ms of spa_sync; avg sz : 126 KB; throughput 59 MB/s

RBPE> My disk is <HITACHI-DK32EJ36NSUN36G-PQ08-33.92GB>.
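For anyone who wants to repeat this, my reading of the settings above is roughly the following (the sd.conf/ssd.conf property syntax and the example device path are my assumptions, and a reboot is needed for the driver .conf changes to take effect):

* /etc/system (comment lines in /etc/system start with '*'); 0x800000 = 8 MB
set maxphys=0x800000

# /kernel/drv/sd.conf (parallel SCSI) and the ssd_ variant in /kernel/drv/ssd.conf (FC)
sd_max_xfer_size=0x800000;
ssd_max_xfer_size=0x800000;

# raw-device write test at one block size (device path is just an example)
dd if=/dev/zero of=/dev/rdsk/c1t1d0s0 bs=8192k count=32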
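The dtrace one-liner above got mangled by the list's address scrubber (the aggregation's '@' was eaten). My guess at the original is:

dtrace -n 'biodone:entry,sdwrite:entry{@[probefunc, stack(20)] = count()}'

i.e. count biodone() and sdwrite() firings together with their stacks, so that one 8 MB sdwrite() showing exactly one biodone() confirms the transfer was not split.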
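I don't have Roch's phys.d, but a minimal script in the same spirit might look like the one below. The io provider probes and b_bcount are real; the device name, the 5-second interval and the bookkeeping are my assumptions, not his script:

#!/usr/sbin/dtrace -s
/*
 * Sum bytes and per-I/O elapsed time for one disk and report every
 * 5 seconds.  An I/O is only counted once it has completed (the fix
 * Roch mentions above).
 */
#pragma D option quiet

io:::start
/args[1]->dev_statname == "sd0"/        /* assumed device name */
{
        start[arg0] = timestamp;        /* keyed on the buf pointer */
}

io:::done
/start[arg0]/
{
        @bytes = sum(args[0]->b_bcount);
        @ms    = sum((timestamp - start[arg0]) / 1000000);
        start[arg0] = 0;
}

tick-5sec
{
        printa("%@d bytes, %@d ms of phys I/O\n", @bytes, @ms);
        clear(@bytes);
        clear(@ms);
}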
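Similarly, the spa_sync numbers could plausibly be gathered by bracketing spa_sync() with fbt probes and summing the I/O issued while a sync is in flight. The probe names exist, but the accounting below is my guess, not Roch's method, and it crudely attributes all I/O during the sync window to spa_sync:

#!/usr/sbin/dtrace -s
/*
 * Time each spa_sync() and sum the bytes of physical I/O issued while
 * it is running.  Crude: unrelated I/O in the window is counted too.
 */
#pragma D option quiet

fbt::spa_sync:entry
{
        ts = timestamp;
        syncing = 1;
}

io:::start
/syncing/
{
        @bytes = sum(args[0]->b_bcount);
        @ios   = count();
}

fbt::spa_sync:return
/ts/
{
        printf("%d ms of spa_sync: ", (timestamp - ts) / 1000000);
        printa("%@d bytes in %@d I/Os\n", @bytes, @ios);
        clear(@bytes);
        clear(@ios);
        syncing = 0;
        ts = 0;
}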
Is that disk attached over FC, or plain SCSI/SAS? I'll have to try again with SCSI/SAS - maybe the higher per-command overhead on FC is why the larger I/Os give better results there than they do on SCSI?

-- 
Best regards,
Robert                          mailto:[EMAIL PROTECTED]
                                http://milek.blogspot.com