USING WHERE; USING TEMPORARY; USING filesort

Daevid Vincent Thu, 23 Feb 2012 14:51:18 -0800

Anyone have any thoughts on how I might optimize this query? It takes about
2 seconds. I know that seems quick, but we get nearly 30,000 hits per second
and so if we can shave this down, it helps. Also we do use memcache, but
even with that, we still see this in the slow-log sometimes. I have indexes
on everything used in this query and even a compound one as you see in the
EXPLAIN.


I'm not going to lose sleep over it, but I thought if there was something
simple or a way to refactor I'd give it a shot. I thought changing the query
to use JOIN ON syntax would have helped, but it didn't do anything really?!
Also, this is all being used with PHP, so I'm fine with pulling things out
into two or three queries if you suggest it will make a faster difference.

-- old query:

SELECT sg.`scene_id`, COUNT(*) AS num FROM `scenes_list` AS s, `dvds` AS d,
`scenes_genres` AS sg, `scenes_downloads_new` AS sd WHERE sg.`genre_id` IN
('1','8','10','19','38','58','65') AND d.`dvd_id`=s.`dvd_id` AND
sg.`scene_id`=s.`scene_id` AND d.`status`='ok' AND
d.`date_release`!='0000-00-00' AND sd.`scene_id`=s.`scene_id` GROUP BY
sg.`scene_id` ORDER BY num DESC, sd.`count_dl_monthly` DESC LIMIT 30;

-- refactored: is this correct order?? Smallest table and most filters first
right to narrow the dataset as quick as possible?

-- EXPLAIN
SELECT 
        s.`scene_id`,
        COUNT(*) AS num 
FROM    `dvds` AS d
        JOIN `scenes_list` AS s ON  d.`dvd_id` = s.`dvd_id`
                AND d.`date_release` != '0000-00-00' 
                AND d.`status` = 'ok' 
        JOIN `scenes_genres` AS sg ON sg.`scene_id` = s.`scene_id`
                AND sg.`genre_id` IN ('1', '8', '10', '19', '38', '58',
'65') 
        JOIN `scenes_downloads_new` AS sd ON sd.`scene_id` = s.`scene_id`
GROUP BY sg.`scene_id` 
ORDER BY num DESC, sd.`count_dl_monthly` DESC 
LIMIT 30
;

SELECT COUNT(*) FROM dvds; -- 12181
SELECT COUNT(*) FROM scenes_downloads_new; -- 66054
SELECT COUNT(*) FROM scenes_list; -- 67197
SELECT COUNT(*) FROM scenes_genres; -- 344272

-- why do I still hit the filesort and temp table? how can I get rid of
that?

    id  select_type  TABLE   TYPE    possible_keys
KEY             key_len  ref                          ROWS  Extra

------  -----------  ------  ------
------------------------------------------------------  --------------
-------  -------------------------  ------
----------------------------------------------
     1  SIMPLE       d       ref
PRIMARY,date_release,STATUS,status_release,dvd_release  status_release  1
const                        2436  USING WHERE; USING TEMPORARY; USING
filesort  
     1  SIMPLE       s       ref     PRIMARY,dvd_id_2,dvd_id
dvd_id          4        .d.dvd_id         6  USING WHERE

     1  SIMPLE       sd      eq_ref  PRIMARY
PRIMARY         3        .s.scene_id       1  USING WHERE

     1  SIMPLE       sg      ref     PRIMARY,scene_id,genre_id
scene_id        4        .s.scene_id       5  USING WHERE



If I take off the ORDER BY (just to see what happens), basically it's
exactly the same EXPLAIN output (wrong results of course), but it does add
"Using index" to the scene_id row (weird).

If I take off the GROUP BY (just to see what happens), basically it's
exactly the same EXPLAIN output (wrong results of course), but it does
remove "Using Filesort".

Taking them both off is optimal I suspect, and again, barring the fact the
results are wrong, it takes 1 second for the query. Should I read that to
mean, "it is what it is and that's the best I can expect from that
multi-join query"? Is there any benefit to splitting this up and if so, how
should I split it?

d


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

USING WHERE; USING TEMPORARY; USING filesort

Reply via email to