On Tue, Sep 18, 2012 at 7:41 PM, Rick James <rja...@yahoo-inc.com> wrote: > SELECT ((the appropriate id)) -- <-- > FROM data_cst, rollup, data_target > WHERE data_target.name = rollup.Target > AND data_cst.ep = rollup.EP > AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 > 23:59:59' > AND data_target.id = data_cst.target_name_id > GROUP BY rollup.Target, rollup.EP > ORDER BY ABS(data_cst.bottom - rollup.`Avg Bottom`) > LIMIT 1 -- <--
rollup is the name of the temp table that I was testing with. It contains the output of the original query. I don't have that in the real world. That's my issue - I can't figure out how to join with the result set and apply the order by to the join condition. What would I use in place of rollup to get the 'Avg Bottom'? Here's what my query looks like now: SELECT data_target.name as Target, q1.ep as EP, COUNT(*) as Wafers, Lots, SUM(numonep)/(COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END)) as 'Sites/Wafer', MAX(LastRun) as "Last Run", SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as Rerun, COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as Runs, avgbottom as "Avg Bottom", 3*stdbottom as "3 Sig", maxbottom as Max, minbottom as Min, SUM(numonep) as Count, SUM(numonep) - SUM(numbottoms) as NAs, 100-((SUM(numonep) - SUM(numbottoms))/SUM(numonep)*100) as "% Success", 3*stdbottom/avgbottom as "3Sig/Avg", AVG(avgbottom) as 'Wafer Avg', AVG(Wafer3Sigma) as 'Wafer 3 Sigma', AVG(Ranges) as 'Avg Range', 3*STD(Ranges) as '3Sig of Ranges', MAX(Ranges) as 'Max Range', MIN(Ranges) as 'Min Range', (SUM(numonep) - SUM(numbottoms))/COUNT(*) as 'NAs/Wafer', mx.id as maxID, mn.id as minID, lr.id as lrID FROM (SELECT target_name_id, ep, wafer_id, COUNT(bottom) as numbottoms, AVG(bottom) as avgbottom, STD(bottom) as stdbottom, MAX(bottom) as maxbottom, MIN(bottom) as minbottom, MAX(date_time) as "LastRun", COUNT(*) as numonep, COUNT(DISTINCT target_name_id, ep, lot_id, data_file_id)-1 as reruns, COUNT(DISTINCT(lot_id)) as Lots, 3*STD(bottom) as Wafer3Sigma, MAX(bottom) - MIN(bottom) as Ranges FROM data_cst WHERE target_name_id IN (775, 776, 777, 778, 779, 780, 45, 44, 116, 117, 118, 119, 120, 121) AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 23:59:59' GROUP BY target_name_id, ep, wafer_id HAVING count(*) < 999) q1 join data_cst mn ON mn.bottom = q1.minbottom join data_cst mx on mx.bottom = q1.maxbottom join data_cst lr on lr.date_time = q1.LastRun, data_target WHERE data_target.id = q1.target_name_id GROUP BY q1.target_name_id, q1.ep; > >> -----Original Message----- >> From: Larry Martell [mailto:larry.mart...@gmail.com] >> Sent: Tuesday, September 18, 2012 2:57 PM >> To: Rick James >> Cc: peter.braw...@earthlink.net; mysql@lists.mysql.com >> Subject: Re: getting certain rows from a group by >> >> On Tue, Sep 18, 2012 at 4:01 PM, Rick James <rja...@yahoo-inc.com> >> wrote: >> > SELECT mn.* >> > mx.* >> >> > FROM ( SELECT @min = MIN(bottom), >> >> > @max = MAX(bottom), ... ) >> >> > JOIN data_cst mn ON bottom = @min >> >> > JOIN data_cst mx ON bottom = @max >> > >> > That is, the FROM finds the value for the desired row. >> > The JOIN then gets to the rest of the fields. >> > Caveat: If two rows have bottom = @min, it returns both. >> >> >> Thanks a lot Rick! This was super helpful. I've got it working for 3 of >> the 4 cases - min(bottom), max(bottom), and max(date_time). But I can't >> figure out how to work in the last case - where bottom is closest to >> avg(bottom). In an individual query I can get it with an order by, like >> this: >> >> SELECT rollup.Target, rollup.EP, rollup.`Avg Bottom`, data_cst.id, >> data_cst.bottom FROM data_cst, rollup, data_target WHERE >> data_target.name = rollup.Target AND data_cst.ep = rollup.EP AND >> data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 >> 23:59:59' >> AND data_target.id = data_cst.target_name_id GROUP BY rollup.Target, >> rollup.EP ORDER BY ABS(data_cst.bottom - rollup.`Avg Bottom`); >> >> Any way to work that into another join? >> >> Thanks! >> >> > >> >> -----Original Message----- >> >> From: Larry Martell [mailto:larry.mart...@gmail.com] >> >> Sent: Tuesday, September 18, 2012 12:54 PM >> >> To: Rick James >> >> Cc: peter.braw...@earthlink.net; mysql@lists.mysql.com >> >> Subject: Re: getting certain rows from a group by >> >> >> >> On Tue, Sep 18, 2012 at 2:05 PM, Rick James <rja...@yahoo-inc.com> >> >> wrote: >> >> > For single query... >> >> > >> >> > Plan A: >> >> > Repeat the SELECT as once for each (MIN, AVG, etc) as a subquery >> in >> >> > WHERE bottom = ( SELECT MIN(bottom) ... ) >> >> > >> >> > Plan B: >> >> > FROM ( SELECT @min = MIN(bottom), >> >> > @max = MAX(bottom), ... ) >> >> > JOIN data_cst mn ON bottom = @min >> >> > JOIN data_cst mx ON bottom = @max >> >> >> >> >> >> Can you elaborate on this? I don't see how this will give me what I >> >> need. I'm not looking for the min or max bottom - I already have >> that >> >> - I'm looking for the row from each group that has the min and max >> >> bottom. >> >> >> >> > Plan C: >> >> > Get rid of 3rd party packages that eventually get in your way >> >> > instead >> >> of 'helping'. >> >> >> >> It's not django that is getting in the way. django is a fantastic >> >> framework for web development. It's just changing requirements. The >> >> original developer (who no longer works here) created a few >> different >> >> classes, and then based on the original requirements, chose to use >> >> the one that supports a single query. That worked then, but may not >> >> work now. There are other classes that allow multiple queries, but >> >> then you have to parse the data and stuff it into the context that >> >> will be picked up by the browser yourself. The single query class >> >> does that for you. I may yet go that way, but I'm trying to avoid a >> >> big rewrite if I can. >> >> >> >> >> -----Original Message----- >> >> >> From: Larry Martell [mailto:larry.mart...@gmail.com] >> >> >> Sent: Tuesday, September 18, 2012 4:57 AM >> >> >> To: peter.braw...@earthlink.net >> >> >> Cc: mysql@lists.mysql.com >> >> >> Subject: Re: getting certain rows from a group by >> >> >> >> >> >> On Tue, Sep 18, 2012 at 7:52 AM, Peter Brawley >> >> >> <peter.braw...@earthlink.net> wrote: >> >> >> > On 2012-09-18 5:53 AM, Larry Martell wrote: >> >> >> >> >> >> >> >> I have this query: >> >> >> >> >> >> >> >> SELECT data_target.name as Target, >> >> >> >> q1.ep as EP, >> >> >> >> COUNT(*) as Wafers, >> >> >> >> Lots, >> >> >> >> SUM(numonep)/(COUNT(*)+SUM(CASE WHEN reruns > 0 THEN >> >> >> >> reruns ELSE 0 END)) as 'Sites/Wafer', >> >> >> >> MAX(LastRun) as "Last Run", >> >> >> >> SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as >> >> Rerun, >> >> >> >> COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 >> >> >> >> END) as >> >> >> Runs, >> >> >> >> avgbottom as "Avg Bottom", >> >> >> >> 3*stdbottom as "3 Sig", >> >> >> >> maxbottom as Max, >> >> >> >> minbottom as Min, >> >> >> >> SUM(numonep) as Count, >> >> >> >> SUM(numonep) - SUM(numbottoms) as NAs, >> >> >> >> 100-((SUM(numonep) - >> SUM(numbottoms))/SUM(numonep)*100) >> >> as >> >> >> "% >> >> >> >> Success", >> >> >> >> 3*stdbottom/avgbottom as "3Sig/Avg", >> >> >> >> AVG(avgbottom) as 'Wafer Avg', >> >> >> >> AVG(Wafer3Sigma) as 'Wafer 3 Sigma', >> >> >> >> AVG(Ranges) as 'Avg Range', >> >> >> >> 3*STD(Ranges) as '3Sig of Ranges', >> >> >> >> MAX(Ranges) as 'Max Range', >> >> >> >> MIN(Ranges) as 'Min Range', >> >> >> >> (SUM(numonep) - SUM(numbottoms))/COUNT(*) as >> 'NAs/Wafer' >> >> >> >> FROM (SELECT target_name_id, >> >> >> >> ep, >> >> >> >> wafer_id, >> >> >> >> COUNT(bottom) as numbottoms, >> >> >> >> AVG(bottom) as avgbottom, >> >> >> >> STD(bottom) as stdbottom, >> >> >> >> MAX(bottom) as maxbottom, >> >> >> >> MIN(bottom) as minbottom, >> >> >> >> MAX(date_time) as "LastRun", >> >> >> >> COUNT(*) as numonep, >> >> >> >> COUNT(DISTINCT target_name_id, ep, lot_id, >> >> >> >> data_file_id)-1 as reruns, >> >> >> >> COUNT(DISTINCT(lot_id)) as Lots, >> >> >> >> 3*STD(bottom) as Wafer3Sigma, >> >> >> >> MAX(bottom) - MIN(bottom) as Ranges >> >> >> >> FROM data_cst >> >> >> >> WHERE target_name_id IN (775, 776, 777, 778, 779, 780, >> >> >> >> 45, >> >> >> 44, >> >> >> >> 116, 117, 118, 119, 120, 121) >> >> >> >> AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' >> AND >> >> >> >> '2010-03-04 23:59:59' >> >> >> >> GROUP BY target_name_id, ep, wafer_id >> >> >> >> HAVING count(*) < 999) q1, >> >> >> >> data_target >> >> >> >> WHERE data_target.id = target_name_id GROUP BY >> >> >> >> q1.target_name_id, q1.ep; >> >> >> >> >> >> >> >> >> >> >> >> This works fine. But now I need to get a certain column >> >> >> >> (image_measurer_id) with each row returned that corresponds to >> >> the >> >> >> >> row from the group that has bottom = Min(bottom), bottom = >> >> >> >> Max(bottom), bottom closest to Avg(bottom), and bottom from >> the >> >> >> >> row where date_time = Max(date_time). >> >> >> >> >> >> >> >> Is this even possible from one query? >> >> >> > >> >> >> > >> >> >> > Might be, but what's the importance of doing it as one query? >> >> >> >> >> >> Because it's part of a django based web app, and the class that >> >> >> this is part of only supports having a single query. To use >> >> >> multiple queries will require a fairly major rewrite of the >> server >> >> >> side of >> >> that app. >> >> >> >> >> >> > I'd start by >> >> >> > saving this result to a temp table and developing the new >> query. >> >> >> > When that's running, see if you can to optimise a query built >> by >> >> >> > replacing the reference to the temp table with the original >> query. >> >> >> >> >> >> Thanks, I'll look into this. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql