Re: hsync is too slower than hflush

2013-08-26 Thread Andrew Wang
It's syncing the checksum file, so the disk head very likely has to move. There are rotational seek delays too.

On Mon, Aug 26, 2013 at 7:30 AM, lei liu wrote:
> Hi all,
>
> DataNode sequential write file, so I think the disk seek time should be
> very small. Why is disk seek time 10ms? I think

Re: hsync is too slower than hflush

2013-08-26 Thread lei liu
Hi all,

The DataNode writes files sequentially, so I think the disk seek time should be very small. Why is the disk seek time 10ms? I think that is too long. Can we tune the Linux system configuration to reduce the disk seek time?

2013/8/26 haosdent
> haha, thank you very much, I get it now.
>
> --
>

Re: hsync is too slower than hflush

2013-08-25 Thread haosdent
haha, thank you very much, I get it now.

--
Best Regards,
Haosong Huang

On Monday, August 26, 2013 at 11:18 AM, Andrew Wang wrote:
> Ah, I forgot the checksum fsync, so two seeks. Even with 4k writes, 50ms
> still feels in the right ballpa

Re: hsync is too slower than hflush

2013-08-25 Thread Andrew Wang
Ah, I forgot the checksum fsync, so two seeks. Even with 4k writes, 50ms still feels in the right ballpark. Best case it's ~20ms, still way slower than hflush. It's also worth asking if there's other dirty data waiting for writeback, since I believe it can also get written out on an fsync. hflush

Re: hsync is too slower than hflush

2013-08-25 Thread haosdent
In fact, I write just 4k per hsync. The DataNode writes both the checksum file and the data file when I hsync data to it. Each of those writes takes nearly 25ms, so an hsync call takes nearly 50ms. hflush, by contrast, is very fast: it takes about 1ms each for the checksum write and the data write. If a hsync would spent

Re: hsync is too slower than hflush

2013-08-25 Thread Andrew Wang
50ms is believable. hsync makes each DN call fsync and wait for acks, so you'd expect at least a disk seek time (~10ms) with some extra time depending on how much unsync'd data is being written. So, just as some back of the envelope math, assuming a disk that can write at 100MB/s: 50ms - 10ms see
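
The rest of that calculation is cut off in this listing. As a rough illustration only, the sketch below shows how such an estimate could be carried through using the figures quoted in the message (50ms per hsync, a ~10ms seek, a 100MB/s disk). The class and method names are hypothetical, and this is not a reconstruction of the original arithmetic.

```java
// Back-of-the-envelope estimate. All inputs are assumptions taken from the
// figures quoted in the message; the class and method names are hypothetical.
final class SyncEstimate {

    /** Megabytes a disk could write in the time left over after one seek. */
    static double writableMegabytes(double totalMs, double seekMs, double throughputMBps) {
        double transferMs = totalMs - seekMs;          // time not spent seeking
        return throughputMBps * transferMs / 1000.0;   // MB/s * seconds
    }

    public static void main(String[] args) {
        double mb = writableMegabytes(50.0, 10.0, 100.0);
        System.out.printf("Data writable in the remaining time: %.1f MB%n", mb);
    }
}
```

Varying `seekMs` (for example, doubling it to account for a second seek) shows how sensitive the estimate is to the number of seeks per sync.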

hsync is too slower than hflush

2013-08-24 Thread haosdent
Hi, all. Hadoop has supported hsync, which calls the system's fsync, since 2.0.2. I have tested the performance of hsync() and hflush() again and again, and I found that every hsync() call takes nearly 50ms, while an hflush() call takes just 2ms. In this slide (http://www.slideshare.net/
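
For readers who want to see the same kind of gap on a local disk, here is a minimal sketch that does not use HDFS at all. It assumes that `FileChannel.force(true)` is a reasonable local stand-in for the fsync that hsync triggers, and that a plain write with no force loosely resembles the no-fsync path of hflush. The class name, method name, and 4 KB chunk size are illustrative choices, and absolute numbers will differ by disk and file system.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Local-disk illustration only; it does not exercise HDFS hflush()/hsync().
final class LocalSyncTiming {

    /**
     * Writes `rounds` chunks of `chunkBytes` bytes to `file`. When `sync` is
     * true, each chunk is forced to the storage device before the next write.
     * Returns the average wall-clock nanoseconds per write.
     */
    static long averageNanosPerWrite(Path file, int rounds, int chunkBytes, boolean sync)
            throws IOException {
        ByteBuffer chunk = ByteBuffer.allocate(chunkBytes);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                chunk.clear();                 // rewind so the whole buffer is written again
                while (chunk.hasRemaining()) {
                    ch.write(chunk);
                }
                if (sync) {
                    ch.force(true);            // push data and metadata to the device
                }
            }
            return (System.nanoTime() - start) / rounds;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sync-timing", ".bin");
        try {
            long buffered = averageNanosPerWrite(file, 100, 4096, false);
            long synced = averageNanosPerWrite(file, 100, 4096, true);
            System.out.printf("buffered: %d us/write, synced: %d us/write%n",
                    buffered / 1000, synced / 1000);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

On a spinning disk the synced figure would be expected to sit in the millisecond range per write, while on an SSD or a file system with a write cache the gap may be much smaller.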