I think it would be great to get this patch committed. Beyond the reasons already mentioned, the significant overhead also tends to skew the reported runtimes in ways that make it difficult to compare them. For example, if two nodes are executed equally often but one needs twice the time to process its rows, EXPLAIN ANALYZE should report timings that are 2x apart. Currently, however, the high overhead of clock_gettime() adds a roughly constant cost to every measurement and thereby compresses the reported ratio.
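With made-up but illustrative numbers: if the true per-row costs are 100 ns and 200 ns, and every timed row pays roughly 50 ns of clock overhead, the reported timings come out at about 150 ns vs. 250 ns, i.e. a ratio of 1.67x instead of the true 2x.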

On 10/12/22 10:33, Michael Paquier wrote:
> No rebased version has been sent since this update, so this patch has
> been marked as RwF.

I've rebased the patch set on latest master and fixed a few compiler warnings. Beyond that, some findings and thoughts:

You're only using RDTSC if the clock source is 'tsc'. Great idea not to bother with all of the hairy TSC details. Looking at the kernel code, the kernel only selects 'tsc' as clock source if the TSC is frequency invariant. That does not imply, though, that Linux is not running under a hypervisor; which is good, because I assume PostgreSQL is used a lot in VMs. However, when running under a hypervisor (at least with VMware), CPUID leaf 0x16 is not available: in my tests __get_cpuid() indicated success, but the returned values were garbage. Instead of using leaf 0x16, we should then use the hypervisor interface to obtain the TSC frequency. Checking whether a hypervisor is active can be done via:

#include <cpuid.h>

bool
IsHypervisorActive(void)
{
    uint32 cpuinfo[4] = {0};

    /* CPUID leaf 0x1: the hypervisor-present bit is ECX bit 31 */
    int res = __get_cpuid(0x1, &cpuinfo[0], &cpuinfo[1], &cpuinfo[2], &cpuinfo[3]);

    return res > 0 && (cpuinfo[2] & (1U << 31)) != 0;
}
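As an aside: once the hypervisor bit is set, CPUID leaf 0x40000000 returns the hypervisor's vendor signature in EBX/ECX/EDX (e.g. "VMwareVMware" or "KVMKVMKVM") and the maximum supported hypervisor leaf in EAX, in case we ever want a stricter check.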

Obtaining the TSC frequency via the hypervisor interface can be done with the following code. See https://lwn.net/Articles/301888/ for more details.

// Under hypervisors (tested with VMware) leaf 0x16 is not available, even though __get_cpuid() succeeds.
// Hence, if running under a hypervisor, use the hypervisor timing leaf 0x40000010 instead, which reports the TSC frequency in kHz in EAX.
uint32 cpuinfo[4] = {0};
if (IsHypervisorActive() && __get_cpuid(0x40000010, &cpuinfo[0], &cpuinfo[1], &cpuinfo[2], &cpuinfo[3]) > 0)
    cycles_to_sec = 1.0 / ((double) cpuinfo[0] * 1000);
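One caveat: glibc's __get_cpuid() validates the requested leaf against the maximum basic leaf from leaf 0x0, so it can fail for hypervisor leaves in the 0x4000xxxx range even when they exist. A more defensive variant (just a sketch, using the raw __cpuid() macro from <cpuid.h>) could first check the maximum hypervisor leaf reported by leaf 0x40000000:

uint32 a, b, c, d;

/* Leaf 0x40000000: EAX holds the maximum supported hypervisor leaf */
__cpuid(0x40000000, a, b, c, d);
if (a >= 0x40000010)
{
    /* Timing leaf 0x40000010: EAX holds the TSC frequency in kHz */
    __cpuid(0x40000010, a, b, c, d);
    if (a > 0)
        cycles_to_sec = 1.0 / ((double) a * 1000);
}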

Given that we switch between RDTSC and clock_gettime() via a global variable anyway, what about exposing the clock source as a GUC? That way the user can switch back to a working clock source in case we miss a detail around activating or reading the TSC.
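Roughly along these lines, modeled on the existing enum GUC machinery (just a sketch; the enum, variable, and GUC names are invented):

/* Sketch only: CLOCK_SOURCE_* values and the clock_source variable are invented */
static const struct config_enum_entry clock_source_options[] = {
    {"clock_gettime", CLOCK_SOURCE_CLOCK_GETTIME, false},
    {"tsc", CLOCK_SOURCE_TSC, false},
    {NULL, 0, false}
};

/* ... plus an entry in ConfigureNamesEnum[]: */
{
    {"clock_source", PGC_SUSET, STATS_MONITORING,
        gettext_noop("Clock source used for instrumentation timings."),
        NULL
    },
    &clock_source,
    CLOCK_SOURCE_CLOCK_GETTIME, clock_source_options,
    NULL, NULL, NULL
},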

I'm happy to update the patches accordingly.

--
David Geier
(ServiceNow)


