I have the following SQL. The source table has 8.1 million rows, and the
resulting table will have about 7.9 million rows. I know that's not much of a
decrease, but the logic is correct; I've tested it on smaller sets. The
problem is that when I run it against the full set of 8.1 million rows, it
takes about two hours to complete. The source and target tables contain only
char and tinyint fields (i.e., fixed-length records).

insert into extract2
      select field1, field2, field3, field4,
            -- flag comes from the first or second character of field3,
            -- depending on the last character of field1
            if(right(field1,1)='N', mid(field3,2,1), mid(field3,1,1)) as flag,
            count(*) as count,
            -- note: val and val_cnt below are not aggregated, so MySQL
            -- takes their values from an arbitrary row within each group
            val - min(val_cnt) as cnt1,
            if(max(val) < val_cnt, 1, 0) as cnt2,
            if(max(val) = min(val) and max(val) = val_cnt, 1, 0) as last,
            if(min(val) = 1, 1, 0) as initial
      from extract1
      group by field1, field2, field3, field4;
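
For what it's worth, the plan for the SELECT portion can be checked with
EXPLAIN (which in 3.23 only works on a SELECT, so the INSERT has to be
dropped); the Extra column should say whether a temporary table and a
filesort are being used:

      explain select field1, field2, field3, field4, count(*) as count
      from extract1
      group by field1, field2, field3, field4;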

While this code is running, the temp table created to do the summary grows to
730,662,620 bytes, which works out to about 22 million records at 33 bytes
per record. Why is MySQL creating a temp table of 22 million records when the
SOURCE table is only 8.1 million records? Even if no summarizing were taking
place at all, I wouldn't expect the temp table to be almost three times the
size of the source.
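
If the temp table and sort behavior are governed by server variables, these
are the ones I believe are relevant (variable names as I understand them for
3.23, so they may be off):

      show variables like 'tmp_table_size';
      show variables like 'sort_buffer%';
      show variables like 'tmpdir';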

This is running on a dual 1 GHz Dell PowerEdge with Red Hat 7.2 and MySQL
3.23.41, and nothing else is running on the box at the same time.

Is there any way to optimize this GROUP BY so it runs faster and doesn't
create such a large temp table?
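
One idea I've been wondering about is adding a composite index that matches
the GROUP BY columns, on the theory that MySQL could then read the rows in
group order and avoid the sort. I'm not sure whether the 3.23 optimizer will
actually use it that way, and building the index on 8.1 million rows has its
own cost, but for example:

      -- idx_grp is just a placeholder name
      alter table extract1 add index idx_grp (field1, field2, field3, field4);

Would something like that be expected to help, or is there a better approach?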

Thanks,

-Henry

