Hi,

On 2020-03-01 22:16:06 +0100, Fabien COELHO wrote:
>
> Hello Andres,
>
> > FWIW, leaving windows, error handling, and other annoyances aside, this
> > can be implemented fairly simply. See below.
>
> Attached an attempt at improving things.

Awesome!


> I've put 2 barriers: one so that all threads are up, one when all
> connections are setup and the bench is ready to go.

I'd done similarly locally.

Slight aside: Have you ever looked at moving pgbench to non-blocking
connection establishment? It seems weird to use non-blocking everywhere
but connection establishment.


> I've done a blind attempt at implementing the barrier stuff on windows.

Neat.


> I've changed the performance calculations depending on -C or not. Ramp-up
> effects are smoothed.


> I've merged all time-related stuff (time_t, instr_time, int64) to use a
> unique type (pg_time_usec_t) and set of functions/macros, which simplifies
> the code somehow.

Hm.  I'm not convinced it's a good idea for pgbench to do its own thing
here.


>
>  #ifdef WIN32
> +#define PTHREAD_BARRIER_SERIAL_THREAD (-1)
> +
>  /* Use native win32 threads on Windows */
>  typedef struct win32_pthread *pthread_t;
>  typedef int pthread_attr_t;
> +typedef SYNCHRONIZATION_BARRIER pthread_barrier_t;
>
>  static int   pthread_create(pthread_t *thread, pthread_attr_t *attr, void 
> *(*start_routine) (void *), void *arg);
>  static int   pthread_join(pthread_t th, void **thread_return);
> +
> +static int   pthread_barrier_init(pthread_barrier_t *barrier, void *unused, 
> int nthreads);
> +static int   pthread_barrier_wait(pthread_barrier_t *barrier);
> +static int   pthread_barrier_destroy(pthread_barrier_t *barrier);

How about using 'struct unknown_type *unused' instead of "unused"?
Because the void *unused will accept everything...


> +/* Thread synchronization barriers */
> +static pthread_barrier_t
> +     start_barrier,          /* all threads are started */
> +     bench_barrier;          /* benchmarking ready to start */
> +

We don't really need two barriers here. The way that pthread barriers
are defined is that they 'reset' after all the threads have arrived. You
can argue we still want that, but ...



> @@ -5165,20 +5151,16 @@ printSimpleStats(const char *prefix, SimpleStats *ss)
>
>  /* print out results */
>  static void
> -printResults(StatsData *total, instr_time total_time,
> -                      instr_time conn_total_time, int64 latency_late)
> +printResults(StatsData *total,

Given that we're changing the output (for the better) of pgbench again,
I wonder if we should add the pgbench version to the benchmark
output. Otherwise it seems easy to end up e.g. seeing a performance
difference between pg12 and pg14, where all that's actually happening is
a different output because each run used the respective pgbench version.



> +                      pg_time_usec_t total_delay,            /* benchmarking 
> time */
> +                      pg_time_usec_t conn_total_delay,       /* is_connect */
> +                      pg_time_usec_t conn_elapsed_delay,     /* !is_connect 
> */
> +                      int64 latency_late)

I'm not a fan of naming these 'delay'. To me that doesn't sounds like
it's about the time the total benchmark has taken.


> @@ -5239,8 +5220,16 @@ printResults(StatsData *total, instr_time total_time,
>                          0.001 * total->lag.sum / total->cnt, 0.001 * 
> total->lag.max);
>       }
>
> -     printf("tps = %f (including connections establishing)\n", tps_include);
> -     printf("tps = %f (excluding connections establishing)\n", tps_exclude);
> +     if (is_connect)
> +     {
> +             printf("average connection time = %.3f ms\n", 0.001 * 
> conn_total_delay / total->cnt);
> +             printf("tps = %f (including reconnection times)\n", tps);
> +     }
> +     else
> +     {
> +             printf("initial connection time = %.3f ms\n", 0.001 * 
> conn_elapsed_delay);
> +             printf("tps = %f (without initial connection establishing)\n", 
> tps);
> +     }

Keeping these separate makes sense to me, they're just so wildly
different.


> +/*
> + * Simpler convenient interface
> + *
> + * The instr_time type is expensive when dealing with time arithmetic.
> + * Define a type to hold microseconds on top of this, suitable for
> + * benchmarking performance measures, eg in "pgbench".
> + */
> +typedef int64 pg_time_usec_t;
> +
> +static inline pg_time_usec_t
> +pg_time_get_usec(void)
> +{
> +     instr_time now;
> +
> +     INSTR_TIME_SET_CURRENT(now);
> +     return (pg_time_usec_t) INSTR_TIME_GET_MICROSEC(now);
> +}

For me the function name sounds like you're getting the usec out of a
pg_time. Not that it's getting a new timestamp.


> +#define PG_TIME_SET_CURRENT_LAZY(t)          \
> +     if ((t) == 0)                                           \
> +             (t) = pg_time_get_usec()
> +
> +#define PG_TIME_GET_DOUBLE(t) (0.000001 * (t))
>  #endif                                                       /* INSTR_TIME_H 
> */

I'd make it an inline function instead of this.

Greetings,

Andres Freund


Reply via email to