@ross "If the write doesn't span the whole stripe width then there is a read of the parity chunk, write of the block and a write of the parity chunk which is the write hole penalty/vulnerability, and is 3 operations (if the data spans more then 1 chunk then it is written in parallel so you can think of it as one operation, if the data doesn't fill any given chunk then a read of the existing data chunk is necessary to fill in the missing data making it 4 operations). No other operation on the array can execute while this is happening."
I thought that with raid5, for a new FS block write, the previous block is read in, then the parity is read, the parity is updated and written, and then the new block is written (2 reads, 2 writes)?

"Yes, reads are exactly like writes on the raidz vdev, no other operation, read or write, can execute while this is happening. This is where the problem lies, and is felt hardest with random IOs."

Ah - so with a random read workload on raidz, a read IO cannot be executed in multiple streams or simultaneously until the current IO has completed. Was the thought process behind this to mitigate the write hole issue, or was it for performance (a write is a single IO instead of 3 or 4 IOs with raid5)?
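A rough back-of-the-envelope model of the random-read point, assuming (as stated above) that a raidz vdev serves one IO at a time because every block is striped across all of its disks; the numbers and names here are illustrative assumptions, not measured ZFS behaviour:

def random_read_iops(disks_per_vdev, per_disk_iops, layout):
    # Toy model for small random reads.
    if layout == "raidz":
        # Each block spans the whole vdev, so one read keeps every disk
        # busy: the vdev delivers roughly a single disk's IOPS.
        return per_disk_iops
    if layout == "raid5":
        # A small read typically touches only one disk, so independent
        # reads can be serviced by different disks in parallel.
        return disks_per_vdev * per_disk_iops
    raise ValueError(layout)

print(random_read_iops(5, 150, "raidz"))   # ~150
print(random_read_iops(5, 150, "raid5"))   # ~750

Under that model the raidz write is a single full-stripe operation (no read-modify-write, hence no write hole), but random-read throughput of the vdev is roughly that of one disk, which is why the penalty is felt hardest with random IOs.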