On Fri, Dec 19, 2014 at 09:53:24AM -0800, Andy Lutomirski wrote:
> On Fri, Dec 19, 2014 at 9:42 AM, Chris Mason <[email protected]> wrote:
> >
> >
> > On Fri, Dec 19, 2014 at 11:48 AM, Andy Lutomirski <[email protected]>
> > wrote:
> >>
> >> On Fri, Dec 19, 2014 at 3:23 AM, Peter Zijlstra <[email protected]>
> >> wrote:
> >>>
> >>> On Thu, Dec 18, 2014 at 04:22:59PM -0800, Andy Lutomirski wrote:
> >>>>
> >>>> Bad news: this patch is incorrect, I think.  Take a look at
> >>>> update_rq_clock -- it does fancy things involving irq time and
> >>>> paravirt steal time.  So this patch could result in extremely
> >>>> non-monotonic results.
> >>>
> >>>
> >>> Yeah, I'm not sure how (and if) we could make all that work :/
> >>
> >>
> >> I obviously can't comment on what Facebook needs, but if I were
> >> rigging something up to profile my own code*, I'd want a count of
> >> elapsed time, including user, system, and probably interrupt as well.
> >> I would probably not want to count time during which I'm not
> >> scheduled, and I would also probably not want to count steal time.
> >> The latter makes any implementation kind of nasty.
> >>
> >> The API presumably doesn't need to be any particular clock id for
> >> clock_gettime, and it may not even need to be clock_gettime at all.
> >>
> >> Is perf self-monitoring good enough for this?  If not, can we make it
> >> good enough?
> >>
> >> * I do this today using CLOCK_MONOTONIC
> >
> >
> > The clock_gettime calls are used for a wide variety of things, but usually
> > they are trying to instrument how much CPU the application is using.  So for
> > example with the HHVM interpreter they have a ratio of the number of hhvm
> > instructions they were able to execute in N seconds of cputime.  This gets
> > used to optimize the HHVM implementation and can be used as a push blocking
> > counter (code can't go in if it makes it slower).
> >
> > Wall time isn't a great representation of this because it includes factors
> > that might be outside a given HHVM patch, but it sounds like we're saying
> > almost the same thing.
> >
> > I'm not familiar with the perf self monitoring?
>
> You can call perf_event_open and mmap the result.  Then you can read
> the docs^Wheader file.
>
> On the good side, it's an explicit mmap, so all the nasty preemption
> issues are entirely moot.  And you can count cache misses and such if
> you want to be fancy.
>
> On the bad side, the docs are a bit weak, and the added context switch
> overhead might be higher.

I'll measure the overhead for sure. If the overhead isn't high, the perf
approach is very interesting. On the other hand, is it acceptable for
clock_gettime to fall back to the slow path if irq time accounting is
enabled (its overhead is high, so we don't actually enable it)?

Thanks,
Shaohua
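
For reference, a minimal sketch in C of the kind of measurement discussed
above: reading per-thread CPU time through clock_gettime and timing how much
each read costs. The helper names (clock_ns, thread_cpu_ns,
cputime_read_cost_ns) are made up for illustration and do not come from the
patch. Using CLOCK_THREAD_CPUTIME_ID as the clock of interest is also an
assumption; the thread only says "cputime".

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read the given clock and return it in nanoseconds, or -1 on failure. */
static int64_t clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* CPU time consumed so far by the calling thread (assumed clock id). */
static int64_t thread_cpu_ns(void)
{
    return clock_ns(CLOCK_THREAD_CPUTIME_ID);
}

/*
 * Average wall-clock cost, in nanoseconds, of one thread CPU-time read,
 * measured over `calls` back-to-back reads. Returns -1.0 if `calls` is
 * not positive or the monotonic clock cannot be read.
 */
static double cputime_read_cost_ns(long calls)
{
    volatile int64_t sink = 0; /* keeps the loop from being optimized out */
    int64_t start, end;
    long i;

    if (calls <= 0)
        return -1.0;

    start = clock_ns(CLOCK_MONOTONIC);
    for (i = 0; i < calls; i++)
        sink += thread_cpu_ns();
    end = clock_ns(CLOCK_MONOTONIC);
    (void)sink;

    if (start < 0 || end < 0)
        return -1.0;
    return (double)(end - start) / (double)calls;
}
```

Running something like cputime_read_cost_ns(1000000) on kernels built with
and without irq time accounting would give a rough per-call comparison of
the fast and slow paths. It says nothing about the cost of the perf mmap
approach, which would need its own measurement.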

