In 4.7.2, the kernel is acknowledging block writes that have not completed to disk. To reproduce: create an MD array, run FIO (direct + libaio), and pull all drives. FIO will continue to run without receiving I/O errors. I have also reproduced the bug using physical drives. With physical drives, only a limited number of I/Os are incorrectly acknowledged; FIO eventually receives an I/O error after the device reference is removed.
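For anyone who wants to check this without FIO, here is a minimal sketch of what a direct + AIO writer sees at completion time. It is not FIO's code: it assumes the raw Linux AIO syscalls instead of libaio so that it builds without extra libraries, and the function names aio_result_to_errno() and direct_aio_write_once() are made up for illustration. The target path is whatever device you pass in, and the write is destructive.

```c
#define _GNU_SOURCE               /* for O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Interpret io_event.res for a write of `want` bytes (hypothetical helper).
 * Returns 0 on a full write, the kernel's -errno on a reported error,
 * and -EIO for a short write (treated as a failure in this sketch).
 */
int aio_result_to_errno(long long res, size_t want)
{
    if (res < 0)
        return (int)res;
    if ((size_t)res != want)
        return -EIO;
    return 0;
}

/*
 * Submit one O_DIRECT AIO write of `len` bytes at offset `off` and wait
 * for its completion. `len` and `off` must satisfy the device's O_DIRECT
 * alignment rules; 4096 is assumed here for the buffer alignment.
 * Returns 0 on success or a negative errno.
 */
int direct_aio_write_once(const char *path, size_t len, long long off)
{
    aio_context_t ctx = 0;
    struct iocb cb, *list[1] = { &cb };
    struct io_event ev;
    void *buf = NULL;
    int fd, ret;

    fd = open(path, O_WRONLY | O_DIRECT);
    if (fd < 0)
        return -errno;
    if (posix_memalign(&buf, 4096, len) != 0) {
        close(fd);
        return -ENOMEM;
    }
    memset(buf, 0xA5, len);

    if (syscall(SYS_io_setup, 1, &ctx) < 0) {
        ret = -errno;
        goto out;
    }

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PWRITE;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;

    if (syscall(SYS_io_submit, ctx, 1, list) != 1) {
        ret = -errno;
        goto destroy;
    }
    if (syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) != 1) {
        ret = -errno;
        goto destroy;
    }
    /* This is the status the kernel hands back for the write. */
    ret = aio_result_to_errno(ev.res, len);
destroy:
    syscall(SYS_io_destroy, ctx);
out:
    free(buf);
    close(fd);
    return ret;
}
```

Calling direct_aio_write_once() in a loop against the MD device while pulling the drives should start returning a negative errno once the array is gone. If it keeps returning 0, the completion status is being reported as a full write even though nothing reached disk, which is the behavior described above.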
The root cause of the problem is that dio_complete() does not correctly propagate I/O errors in the is_async case. Specifically, generic_write_sync() appears to be overwriting the return status destined for ki_complete().

This bug appears to have been introduced by the following commit:

  Description: "fs: simplify the generic_write_sync prototype"
  Committed:   Apr 7, 2016
  Hash:        e259221763a40403d5bb232209998e8c45804ab8
  Affects:     4.7-rc1 - master

I have confirmed a fix for the AIO/Direct-IO failure condition but have not reviewed the rest of the changes associated with that commit. If you would like a small patch for direct-io.c, let me know.

Regards,
-Jonathan