Anyone have any thoughts on how I might optimize this query? It takes about 2 seconds. I know that seems quick, but we get nearly 30,000 hits per second and so if we can shave this down, it helps. Also we do use memcache, but even with that, we still see this in the slow-log sometimes. I have indexes on everything used in this query and even a compound one as you see in the EXPLAIN.
I'm not going to lose sleep over it, but I thought if there was something simple or a way to refactor I'd give it a shot. I thought changing the query to use JOIN ON syntax would have helped, but it didn't do anything really?! Also, this is all being used with PHP, so I'm fine with pulling things out into two or three queries if you suggest it will make a faster difference. -- old query: SELECT sg.`scene_id`, COUNT(*) AS num FROM `scenes_list` AS s, `dvds` AS d, `scenes_genres` AS sg, `scenes_downloads_new` AS sd WHERE sg.`genre_id` IN ('1','8','10','19','38','58','65') AND d.`dvd_id`=s.`dvd_id` AND sg.`scene_id`=s.`scene_id` AND d.`status`='ok' AND d.`date_release`!='0000-00-00' AND sd.`scene_id`=s.`scene_id` GROUP BY sg.`scene_id` ORDER BY num DESC, sd.`count_dl_monthly` DESC LIMIT 30; -- refactored: is this correct order?? Smallest table and most filters first right to narrow the dataset as quick as possible? -- EXPLAIN SELECT s.`scene_id`, COUNT(*) AS num FROM `dvds` AS d JOIN `scenes_list` AS s ON d.`dvd_id` = s.`dvd_id` AND d.`date_release` != '0000-00-00' AND d.`status` = 'ok' JOIN `scenes_genres` AS sg ON sg.`scene_id` = s.`scene_id` AND sg.`genre_id` IN ('1', '8', '10', '19', '38', '58', '65') JOIN `scenes_downloads_new` AS sd ON sd.`scene_id` = s.`scene_id` GROUP BY sg.`scene_id` ORDER BY num DESC, sd.`count_dl_monthly` DESC LIMIT 30 ; SELECT COUNT(*) FROM dvds; -- 12181 SELECT COUNT(*) FROM scenes_downloads_new; -- 66054 SELECT COUNT(*) FROM scenes_list; -- 67197 SELECT COUNT(*) FROM scenes_genres; -- 344272 -- why do I still hit the filesort and temp table? how can I get rid of that? id select_type TABLE TYPE possible_keys KEY key_len ref ROWS Extra ------ ----------- ------ ------ ------------------------------------------------------ -------------- ------- ------------------------- ------ ---------------------------------------------- 1 SIMPLE d ref PRIMARY,date_release,STATUS,status_release,dvd_release status_release 1 const 2436 USING WHERE; USING TEMPORARY; USING filesort 1 SIMPLE s ref PRIMARY,dvd_id_2,dvd_id dvd_id 4 .d.dvd_id 6 USING WHERE 1 SIMPLE sd eq_ref PRIMARY PRIMARY 3 .s.scene_id 1 USING WHERE 1 SIMPLE sg ref PRIMARY,scene_id,genre_id scene_id 4 .s.scene_id 5 USING WHERE If I take off the ORDER BY (just to see what happens), basically it's exactly the same EXPLAIN output (wrong results of course), but it does add "Using index" to the scene_id row (weird). If I take off the GROUP BY (just to see what happens), basically it's exactly the same EXPLAIN output (wrong results of course), but it does remove "Using Filesort". Taking them both off is optimal I suspect, and again, barring the fact the results are wrong, it takes 1 second for the query. Should I read that to mean, "it is what it is and that's the best I can expect from that multi-join query"? Is there any benefit to splitting this up and if so, how should I split it? d -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql