In message <516c71bc.4000...@freebsd.org>, Alexander Motin writes:
>On 15.04.2013 23:43, Poul-Henning Kamp wrote:
>> In message <516c515a.9090...@freebsd.org>, Alexander Motin writes:
>>
>> For tuning anything on a non-ridiculous SSD device or modern
>> harddisks, it will be useless, because the bias you introduce is
>> *not* one which averages out over many operations.
>
>Could you please explain why?
>
>> The fundamental problem is that on a busy system, getbinuptime()
>> does not get called at random times; it will be heavily affected
>> by the I/O traffic, because of the interrupts, the bus traffic
>> itself, the cache effects of I/O transfers, and the context switches
>> by the processes causing the I/O.
>
>I'm sorry, but I am not sure I understand above paragraphs.

That was the exact explanation you asked for, and I'm not sure I can
find a better way to explain it, but I'll try:

Your assumption that the error will cancel out implicitly assumes
that the timestamp returned from getbinuptime() is updated at times
which are totally independent of the I/O traffic whose latency you
are trying to measure.

That is not the case.  The interrupt which updates getbinuptime()'s
cached timestamp is heavily affected by the I/O traffic, for the
various reasons I mention above.

>Sure, getbinuptime() won't allow to answer how many requests completed
>within 0.5ms, but present API doesn't allow to calculate that any way,
>providing only total/average times. And why "_5-10_ timecounter
>interrupts"?

A: Yes, it actually does:  A userland application running on a
dedicated CPU core can poll the shared-memory devstat structure at
a very high rate and get very useful information about short
latencies.  Most people don't do that, because they don't care about
the difference between 0.5 and 0.45 milliseconds.

B: To get the systematic bias down to 10-20% of the measured interval.

>> Latency distribution:
>>
>>	<5msec:		92.12 %
>>	<10msec:	 0.17 %
>>	<20msec:	 1.34 %
>>	<50msec:	 6.37 %
>>	>50msec:	 0.00 %
>
>I agree that such functionality could be interesting. The only worry is
>which buckets should be there. For modern HDDs above buckets could be
>fine.
>For high-end SSD it may go about microseconds then milliseconds. I
>have doubt that 5 buckets will be universal enough, unless separated by
>factor of 5-10.

Remember what people use this for:  Answering the question "Does my
disk subsystem suck, and if so, how much?"

Buckets like the ones proposed will tell you that.

>> The %busy crap should be killed, all it does is confuse people.
>
>I agree that it heavily lies, especially for cached writes, but at least
>it allows to make some very basic estimates.

For rotating disks:  It always lies.

For SSD:  It almost always lies.

Kill it.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.