Re: Additional SQL metrics

Denis Magda Thu, 02 Mar 2017 14:01:22 -0800

Vovan,

When I speak of JOIN metrics, I simply mean that we need to add metrics 
relevant to queries with joins: metrics that will give us more insight into the 
execution flow of non-collocated and collocated joins.


> 1) Query exec count
> 2) Query exec time (first define what "time" means) - min, max, avg

Total query execution time might not be helpful in the trickiest cases. What if 
you have multiple joins in your query? How do I know which one contributes most 
to the execution time?

So, I would do a breakdown into total time, map time, per-join time, and reduce 
time. Hope it’s possible. If it’s unclear how to support everything from the 
start, that’s a different question: let’s create several tickets and implement 
everything step by step.
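
For illustration only, the breakdown could be carried by a small value object 
like the one below. This is a minimal sketch: QueryTimeBreakdown and all of its 
members are hypothetical names, not an existing Ignite or H2 API, and it 
assumes that the engine can attribute a duration to each physical join.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical timing breakdown for a single query execution (not an Ignite class). */
final class QueryTimeBreakdown {
    private final long totalNanos;
    private final long mapNanos;
    private final long reduceNanos;
    /** One entry per physical join, in plan order (assumes per-join timing is available). */
    private final List<Long> joinNanos;

    QueryTimeBreakdown(long totalNanos, long mapNanos, long reduceNanos, List<Long> joinNanos) {
        this.totalNanos = totalNanos;
        this.mapNanos = mapNanos;
        this.reduceNanos = reduceNanos;
        this.joinNanos = Collections.unmodifiableList(new ArrayList<>(joinNanos));
    }

    long totalNanos() { return totalNanos; }
    long mapNanos() { return mapNanos; }
    long reduceNanos() { return reduceNanos; }
    List<Long> joinNanos() { return joinNanos; }

    /** Index of the join that took the longest, or -1 if the query had no joins. */
    int slowestJoin() {
        int slowest = -1;
        for (int i = 0; i < joinNanos.size(); i++) {
            if (slowest < 0 || joinNanos.get(i) > joinNanos.get(slowest))
                slowest = i;
        }
        return slowest;
    }

    /** Fraction of the total time spent in the given join (0.0 when the total is zero). */
    double joinShare(int joinIdx) {
        return totalNanos == 0 ? 0.0 : (double) joinNanos.get(joinIdx) / totalNanos;
    }
}
```

With something like this, slowestJoin() answers the "which join contributes 
most" question directly, and joinShare() gives its relative weight.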

> 3) Number of bytes exchanged between nodes during query execution

It would be really helpful to have a breakdown showing the number of bytes 
exchanged per join (physical join). Again, if you believe it makes sense to do 
the breakdown later, then let’s create an additional ticket for these 
second-tier metrics.

> 4) Number of returned rows - min, max, avg


Plus, let’s add the following to the list:

5) Collocated: yes or no
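
As a rough sketch of how items 1) to 5) could be accumulated per query, see 
below. QueryMetricsSketch and its methods are hypothetical, not an existing 
Ignite class. Two points are assumptions of mine: "time" is taken to be 
whatever duration the caller passes in, and the collocated flag becomes "no" 
as soon as any execution was non-collocated.

```java
/** Hypothetical per-query aggregate for metrics 1) to 5) (not an Ignite class). */
final class QueryMetricsSketch {
    private long execCount;
    private long minTimeNanos = Long.MAX_VALUE, maxTimeNanos, totalTimeNanos;
    private long minRows = Long.MAX_VALUE, maxRows, totalRows;
    private long totalBytesExchanged;
    /** Assumption: stays true only while every recorded execution was collocated. */
    private boolean collocated = true;

    /** Records one finished execution of the query. */
    synchronized void onExecution(long timeNanos, long rows, long bytesExchanged, boolean wasCollocated) {
        execCount++;
        minTimeNanos = Math.min(minTimeNanos, timeNanos);
        maxTimeNanos = Math.max(maxTimeNanos, timeNanos);
        totalTimeNanos += timeNanos;
        minRows = Math.min(minRows, rows);
        maxRows = Math.max(maxRows, rows);
        totalRows += rows;
        totalBytesExchanged += bytesExchanged;
        collocated &= wasCollocated;
    }

    synchronized long execCount() { return execCount; }

    // Min values report 0 until the first execution has been recorded.
    synchronized long minTimeNanos() { return execCount == 0 ? 0 : minTimeNanos; }
    synchronized long maxTimeNanos() { return maxTimeNanos; }
    synchronized double avgTimeNanos() { return execCount == 0 ? 0.0 : (double) totalTimeNanos / execCount; }

    synchronized long minRows() { return execCount == 0 ? 0 : minRows; }
    synchronized long maxRows() { return maxRows; }
    synchronized double avgRows() { return execCount == 0 ? 0.0 : (double) totalRows / execCount; }

    synchronized long totalBytesExchanged() { return totalBytesExchanged; }
    synchronized boolean collocated() { return collocated; }
}
```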

—
Denis

> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:
> 
> Denis,
> 
> The main problem with the suggested metrics is that they imply that certain
> internal mechanics work in a predefined way. For example, what are JOIN
> metrics? There are no guarantees that a JOIN in the user's query will be
> translated to a real physical join. What if several different query
> execution pieces happen in parallel? What if we rework our distributed
> query engine from a pull to a push approach for performance reasons and
> there are no JOINs in the classical sense?
> 
> This is why I think that we should start with very basic things. Something
> like:
> 1) Query exec count
> 2) Query exec time (first define what "time" means) - min, max, avg
> 3) Number of bytes exchanged between nodes during query execution
> 4) Number of returned rows - min, max, avg
> 
> Once we have base numbers in place, we can think of carefully integrating
> and enhancing all pieces of query execution into more verbose formats,
> similar to query plans with relative weights in classical RDBMS systems.
> 
> Thoughts?
> 
> 
> 
> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <dma...@apache.org> wrote:
> 
>> Vovan,
>> 
>> Your metrics make perfect sense to me. However, I see a high demand for
>> JOIN-based metrics, especially from those who are trying non-collocated
>> joins in production and want to measure them somehow. This is why,
>> personally, I would prefer to see the metrics below on the top-priority
>> list as well:
>> 
>> * if a query was executed in the collocated or non-collocated mode. Three
>>   results are valid: collocated, non-collocated, simple query (no joins).
>> * non-collocated query: size of the data exchanged between the nodes to
>>   complete a specific join. If there are multiple joins in the query, we
>>   need to provide this metric for each of them.
>> * non-collocated and collocated query: the part of the time spent joining
>>   the data. If there are multiple joins in the query, we need to provide
>>   this metric for each of them.
>> 
>> As for “unicast” and “broadcast”, agreed, let’s ignore them for now.
>> 
>> In any case, can we include timing information (map phase, reduce phase,
>> join phase) into an execution plan produced by H2? Are there any
>> implementation hooks?
>> 
>> —
>> Denis
>> 
>> 
>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
>>> wrote:
>>> 
>>> I think some of the metrics specified by Denis also make sense, so I would
>>> add them as well. See below...
>>> 
>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <voze...@gridgain.com>
>>> wrote:
>>> 
>>>> Denis,
>>>> 
>>>> Query execution is a complex process involving different stages which are
>>>> not very easy to match with each other, especially given that any node can
>>>> leave the topology at any time. Another problem is that the engine evolves,
>>>> and metrics like "did a query do broadcast or unicast" may easily become
>>>> useless at some point because, for example, there will be neither unicast
>>>> nor broadcast, but something different. On the other hand, I completely
>>>> agree that performance monitoring is an essential part of any mature DBMS.
>>>> 
>>>> I would start with metrics which are both very basic and easy to implement
>>>> at the same time. For example, we can add a fingerprint (hash) to every
>>>> query, which will be used to join the "map" and "reduce" parts with each
>>>> other, and add the following basic metrics:
>>>> 1) Execution count for particular query
>>>> 2) Number of map nodes - min, max, avg
>>>> 
>>> 
>>> (1) and (2) make sense
>>> 
>>> 
>>>> 3) Map step duration (if applicable) - min, max, avg
>>>> 4) Reduce step duration (if applicable) - min, max, avg
>>>> 
>>> 
>>> Not sure if (3) and (4) are needed. I would only add them if they are easy
>>> to implement.
>>> 
>>> I would also add these:
>>> 
>>> 5) Collocated: yes/no
>>> 6) last execution time
>>> 7) min/max/average execution duration
>>> 
>>> 
>>>> 
>>>> Once done, users will be able to get statistics for particular queries.
>>>> 
>>>> Vladimir.
>>>> 
>>>> 
>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <dma...@apache.org> wrote:
>>>> 
>>>>> BTW,
>>>>> 
>>>>> What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
>>>>> Sergi, is this feasible?
>>>>> 
>>>>> —
>>>>> Denis
>>>>> 
>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <dma...@apache.org> wrote:
>>>>>> 
>>>>>> Igniters,
>>>>>> 
>>>>>> Let’s shed more light on SQL query execution internals by introducing a
>>>>>> set of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
>>>>>> 
>>>>>> Per-query metrics. Total history size is defined by
>>>>>> *CacheConfiguration.getQueryDetailMetricsSize*:
>>>>>> * if a query was executed in the collocated or non-collocated mode.
>>>>>>   Three results are valid: collocated, non-collocated, simple query (no
>>>>>>   joins).
>>>>>> * non-collocated query: size of the data exchanged between the nodes to
>>>>>>   complete a join.
>>>>>> * non-collocated query: did a query do broadcast or unicast to get data
>>>>>>   needed to complete a join.
>>>>>> * non-collocated and collocated query: a part of the time spent joining
>>>>>>   the data.
>>>>>> 
>>>>>> CacheMetrics:
>>>>>> * an average number of executed SQL queries (collocated, non-collocated,
>>>>>>   simple query (no joins)).
>>>>>> 
>>>>>> Please don’t hesitate to suggest other metrics or improve the proposed
>>>>>> ones.
>>>>>> 
>>>>>> —
>>>>>> Denis
>> 
>>
