Thank you very much, Peter! After I disabled the disk cache and changed the cache write mode from write-back to write-through, I saw the result I expected.
It seems that with the cache in write-back mode, fsync() only flushed the data to the disk cache, not to the storage devices themselves. I have another question, though: if I disable the disk cache but leave the cache write mode as write-back, how does the sync work? Is the data still written into a cache? This may be outside the scope of discussion here.

Thank you all!

2011/6/3 Peter Schuller <peter.schul...@infidyne.com>

> > I disable the disk cache of RAID controller, unfortunately it still lost
> > some data.
>
> Disabling caching shouldn't be necessary so much as ensuring that all
> layers honor write barriers properly. A battery backed cache that
> survives a power outage need not be disabled (and usually if you have
> battery backed caching you don't want to since it has a considerable
> performance impact).
>
> To re-address your original post: Yes, given QUORUM @ RF=2 (meaning
> that QUORUM is equivalent to ALL), any *successful* write is supposed
> to be guaranteed to be visible by a subsequent read. In this case even
> at CL.ONE since RF was 2 and QUORUM was equivalent to ALL.
>
> If this is not what you're seeing, likely causes are either (a) a
> problem with your test, (b) a cassandra bug, or (c) a kernel/hardware
> misconfiguration or bug that causes fsync() to be broken with respect
> to power outages.
>
> In order to eliminate (a), can you share the actual test? Even if (a)
> looks good, you'd be surprised as to how often (c) can be the case.
>
> If you are satisfied that the test is correct, one way to eliminate
> Cassandra as a cause for the problem may be to restart your server by
> a reset instead of cutting power, so that power supply never
> disappears from your storage device. If you are no longer able to
> reproduce the problem, it would indicate that fsync() is at least
> causing I/O to reach a device (exit the operating system). If it still
> fails, you're none the wiser.
>
> If you're running without battery backed cache, or with battery backed
> cache, one test you can do is run this (on a system which is otherwise
> idle):
>
> http://distfiles.scode.org/mlref/fsynctime.py
>
> The first argument is a filename which will be created/over-written.
> It will then start printing the number of milliseconds each fsync()
> takes. If you do not have battery backed caching, you should be seeing
> numbers in the 5-25 ms range depending on circumstances. If you see
> very low values, that indicates that fsync() is not working and the
> writes are not forced to persistent storage.
>
> (If battery backed caching exists, you will legitimately get very low
> values without it indicating anything is wrong.)
>
> --
> / Peter Schuller

--
by Preston Chang
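
For anyone following the thread: the remark that QUORUM is equivalent to ALL at RF=2 follows from a quorum being a majority of the replicas, i.e. floor(RF/2) + 1. Here is a minimal sketch of that arithmetic; the helper name `quorum` is made up for illustration and is not a Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge for a majority (quorum).

    Hypothetical helper for illustration only: majority = floor(RF / 2) + 1.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def quorum_is_all(replication_factor: int) -> bool:
    """True when a quorum requires every replica, as in the RF=2 case."""
    return quorum(replication_factor) == replication_factor
```

With RF=2 the quorum is 2, so every replica has to acknowledge the write. With RF=3 it is also 2, which is where QUORUM and ALL start to differ.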
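
The fsync() timing test described in the quoted message can be approximated with a short script. This is only a rough sketch of the idea, not the contents of fsynctime.py (which are not shown in this thread). The function name `time_fsyncs` and its `count` and `payload` parameters are assumptions made for illustration.

```python
import os
import time


def time_fsyncs(path, count=10, payload=b"x" * 512):
    """Append `payload` to `path` and fsync it, `count` times.

    Returns a list with the duration of each fsync() call in milliseconds.
    The file is created if missing and truncated if it already exists.
    """
    durations_ms = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(count):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # ask the OS to push the data to the device
            durations_ms.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return durations_ms
```

To try it, call `time_fsyncs()` with a file name on the volume under test, on an otherwise idle system, and print the returned values. As noted in the quoted message, very low values without a battery backed cache would suggest that fsync() is not forcing writes to persistent storage.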