Attila,

I would like to select only the most recent "targettime" within 1 minute and
only display only the rows that are "the latest" and print out all of the
stats as columns on a per toolname basis:

One way: a three-step:

1. There is a formula (see "Group data by time periods" at http://www.artfulsoftware.com/queries.php) for calculating periods that are evenly divisible into 60 mins: |((60/periodMinutes) * HOUR( timevalue ) + FLOOR( MINUTE( timevalue ) / periodMinutes )). |For your query, periodMinutes=1, so it simplifies to 60*HOUR(target_time) + FLOOR(MINUTE(target_time)).

2. Use an exclusion join (see "Within-group aggregates" at http://www.artfulsoftware.com/queries.php) to find the latest rows to the nearest minute (from the above formula) per scenario and toolname group:

SELECT a.*
FROM data a
LEFT JOIN data b
 ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname
AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) < 60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time))
WHERE b.id IS NULL;

3. Use the above as the inner query for an outer query which pivots (see steps 2 and 3 in "Group column statistics in rows" at http://www.artfulsoftware.com/queries.php) the last two columns of the above result:

SELECT
 target_time,
 scenarioname,
 toolname,
 SUM( CASE statname WHEN 'byte_count' THEN statvalue ELSE 0 END ) AS bytes,
 SUM( CASE statname WHEN 'udp_count'  THEN statvalue ELSE 0 END ) AS udps,
 SUM( CASE statname WHEN 'tcp_count'  THEN statvalue ELSE 0 END ) AS tcps
FROM (
 SELECT a.target_time,a.scenarioname,a.toolname,a.statname,a.statvalue
 FROM data a
 LEFT JOIN data b
   ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname
AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) < 60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time))
 WHERE b.id IS NULL
) AS stats
GROUP BY target_time,scenarioname,toolname;
+---------------------+--------------+----------+-------+------+------+
| target_time         | scenarioname | toolname | bytes | udps | tcps |
+---------------------+--------------+----------+-------+------+------+
| 2009-01-21 19:34:00 | scenario1    | tool1    |   100 |  200 |  300 |
| 2009-01-21 19:34:00 | scenario1    | tool2    |   400 |  500 |  600 |
+---------------------+--------------+----------+-------+------+------+

PB

-----

Attila wrote:
Hi,

I have the following table:

CREATE TABLE DATA (
  TARGET_TIME datetime NOT NULL,
  SCENARIONAME    varchar(20) NOT NULL,
  TOOLNAME varchar(20) NOT NULL,
  STATNAME varchar(100) NOT NULL,
  STATVALUE int(10) NOT NULL,
  PRIMARY KEY (TARGET_TIME, SCENARIONAME, SIMTOOLNAME, STATNAME)
);

I am trying to collect statistics from running tools and feeding them into
the DB.

If I have the following entries:

("2009-01-21 19:34:00", "scenario1", "tool1", "byte_count", 100),
("2009-01-21 19:34:00", "scenario1", "tool1", "udp_count", 200),
("2009-01-21 19:34:00", "scenario1", "tool1", "tcp_count", 300),
("2009-01-21 19:34:00", "scenario1", "tool2", "byte_count", 400),
("2009-01-21 19:34:00", "scenario1", "tool2", "udp_count", 500),
("2009-01-21 19:34:00", "scenario1", "tool2", "tcp_count", 600),

("2009-01-21 19:33:00", "scenario1", "tool1", "byte_count", 10),
("2009-01-21 19:33:00", "scenario1", "tool1", "udp_count", 20),
("2009-01-21 19:33:00", "scenario1", "tool1", "tcp_count", 30),
("2009-01-21 19:33:00", "scenario1", "tool2", "byte_count", 40),
("2009-01-21 19:33:00", "scenario1", "tool2", "udp_count", 50),
("2009-01-21 19:33:00", "scenario1", "tool2", "tcp_count", 60)
(Notice that the "old" targettime  will be ignored in this particular query
as we are only interested in the "latest")

I would like to select only the most recent "targettime" within 1 minute and
only display only the rows that are "the latest" and print out all of the
stats as columns on a per toolname basis:

Targettime                         Scenario        Toolname
byte_count   udp_count     tcp_count
2009-01-21 19:34:00           scenario1       tool1
100             200               300
2009-01-21 19:34:00           scenario1       tool2              400
           500               600

The purpose of the query is to display the "latest statistics" for each
scenario/toolname group.  It is important that the original data comes in
rows (statname may not be known ahead of time, so it must be stored as a
string in a field value).

I am using MySQL and have found some possibilities with SQL Server (Pivot,
crosstab, etc) ... but the requirements call for MySQL.
Is the above query possible?

Thanks a lot!
------------------------------------------------------------------------


No virus found in this incoming message.
Checked by AVG - http://www.avg.com Version: 8.0.176 / Virus Database: 270.10.12/1909 - Release Date: 1/22/2009 7:08 AM

Reply via email to