In fact, I only write 4 KB per hsync. When I hsync data to a DataNode, the DataNode writes both a checksum file and a data file. Each of those takes nearly 25 ms, so one hsync call takes nearly 50 ms. hflush, by contrast, is very fast: writing the checksum and writing the data take about 1 ms each. If an hsync takes 50 ms, what is the point of using it? Or is my test method wrong?
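As a rough sanity check, the back-of-the-envelope model from Andrew's reply quoted below (one disk seek plus sequential write time) can be applied to a 4 KB write. This is only a sketch: the 10 ms seek and 100 MB/s throughput are the assumed figures from that reply, 1 MB is taken as 1,000,000 bytes, and the class and method names are hypothetical, not part of Hadoop.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not a Hadoop API.
class SyncCostEstimate {

    // Estimated cost in ms of one fsync: one seek plus sequential write time.
    // Assumes 1 MB = 1,000,000 bytes.
    static double estimateMs(long bytes, double seekMs, double mbPerSec) {
        double writeMs = bytes / (mbPerSec * 1_000_000.0) * 1000.0;
        return seekMs + writeMs;
    }

    public static void main(String[] args) {
        double seekMs = 10.0;    // assumed seek time from the quoted reply
        double mbPerSec = 100.0; // assumed sequential write speed from the quoted reply

        double perSync4Mb = estimateMs(4_000_000L, seekMs, mbPerSec);
        double perSync4Kb = estimateMs(4096L, seekMs, mbPerSec);

        // 4 MB per sync: 10 ms seek + 40 ms writing
        System.out.println(String.format(Locale.ROOT, "4 MB: %.2f ms", perSync4Mb));
        // 4 KB per sync: the write time is negligible next to the seek
        System.out.println(String.format(Locale.ROOT, "4 KB: %.2f ms", perSync4Kb));
        // Two files (checksum + data) are synced per hsync in the test above
        System.out.println(String.format(Locale.ROOT, "4 KB, two files: %.2f ms", 2 * perSync4Kb));
    }
}
```

Under these assumptions the model gives about 10 ms per fsync for a 4 KB write, so about 20 ms for the checksum file plus the data file. That is below the roughly 25 ms per file measured here, so the write volume alone does not account for the 50 ms; the seek figure is only a rough assumption.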

--
Best Regards,
Haosong Huang

On Monday, August 26, 2013 at 7:07 AM, Andrew Wang wrote:

> 50ms is believable. hsync makes each DN call fsync and wait for acks, so
> you'd expect at least a disk seek time (~10ms) with some extra time
> depending on how much unsync'd data is being written.
>
> So, just as some back of the envelope math, assuming a disk that can write
> at 100MB/s:
>
> 50ms - 10ms seek = 40ms writing time
> 100 MB/s * 40ms = 4MB
>
> If you're hsync'ing every 4MB, 50ms would be exactly what I'd expect.
>
> Best,
> Andrew
>
> On Sat, Aug 24, 2013 at 10:11 PM, haosdent <haosd...@gmail.com> wrote:
>
> > Hi, all. Hadoop has supported hsync, which calls the system's fsync,
> > since 2.0.2. I have tested the performance of hsync() and hflush() again
> > and again, but I found that every hsync() call takes nearly 50 ms while
> > an hflush() call takes just 2 ms. In this slide deck
> > (http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage,
> > page 18), the author mentions that hsync() is 2x slower than hflush().
> > So, is anything wrong? Thank you very much and looking forward to your
> > help.
> >
> > --
> > Best Regards,
> > Haosong Huang