I hace a fairly large table with two columns that I need to "de-normalize" (235 
million rows) There has got to be a better (i.e. faster) approach than what I 
am doing. I am using a MAX CASE on each of the 24 variables (column names 
variable and value) that I want to unstack.  Any suggestions would be most 
appreciated.  Here is the ANALYZE 

 GroupAggregate  (cost=107156950.16..174988710.12 rows=23511876 width=286)
   ->  Sort  (cost=107156950.16..107744747.04 rows=235118752 width=286)
         Sort Key: site, canopy, block, plot, measurement_interval, 
warming_treatment, treatment_code, treatment_abbr, water_treatment, 
variable_name, variable_channel, variable_id, rowid, time2, "timestamp", 
block_name, table_name, batt_volt, program, flag1, flag2, flag3, amb_a_avg, 
amb_closed_avg, airtc_avg, airtemp_avg, airtemp_max, airtemp_min, all_avgt, 
am25tref1, ptemp_avg, rh, s_all_avgt_avg, vp_avg, tabove_sdupper, 
tabove_sdlower, tabove_meantrim, tabove_mean_dc, tsoil_sdupper, tsoil_sdlower, 
tsoil_meantrim, tsoil_mean_dc
         ->  Seq Scan on derived_15min_joined  (cost=0.00..11782048.52 
rows=235118752 width=286)
(4 rows)


And here is the table:


b4warmed3=# \d derived_15min_joined
              Table "public.derived_15min_joined"
        Column        |            Type             | Modifiers 
----------------------+-----------------------------+-----------
 site                 | character varying(6)        | 
 canopy               | character varying(24)       | 
 block                | character(2)                | 
 plot                 | character(6)                | 
 measurement_interval | interval                    | 
 warming_treatment    | character varying(24)       | 
 treatment_code       | character varying(24)       | 
 treatment_abbr       | character varying(24)       | 
 water_treatment      | character varying(24)       | 
 variable_name        | character varying(24)       | 
 variable_channel     | character varying(24)       | 
 variable_id          | character varying(24)       | 
 rowid                | character varying(48)       | 
 time2                | timestamp without time zone | 
 timestamp            | timestamp without time zone | 
 block_name           | character varying(8)        | 
 table_name           | character varying(10)       | 
 batt_volt            | real                        | 
 program              | character varying(48)       | 
 flag1                | integer                     | 
 flag2                | integer                     | 
 flag3                | integer                     | 
 amb_a_avg            | real                        | 
 amb_closed_avg       | real                        | 
 airtc_avg            | real                        | 
 airtemp_avg          | real                        | 
 airtemp_max          | real                        | 
 airtemp_min          | real                        | 
 all_avgt             | real                        | 
 am25tref1            | real                        | 
 ptemp_avg            | real                        | 
 rh                   | real                        | 
 s_all_avgt_avg       | real                        | 
 vp_avg               | real                        | 
 tabove_sdupper       | real                        | 
 tabove_sdlower       | real                        | 
 tabove_meantrim      | real                        | 
 tabove_mean_dc       | real                        | 
 tsoil_sdupper        | real                        | 
 tsoil_sdlower        | real                        | 
 tsoil_meantrim       | real                        | 
 tsoil_mean_dc        | real                        | 
 variable             | text                        | 
 value                | real                        | 
Indexes:
    "derived_15min_joined_lower_idx" btree (lower(variable_name::text))
    "derived_15min_joined_time2_idx" btree (time2)

b4warmed3=# 

And here is my query. 

SELECT
        site,
        canopy,
        block,
        plot,
        measurement_interval,
        warming_treatment,
        treatment_code,
        treatment_abbr,
        water_treatment,
        variable_name,
        variable_channel,
        variable_id,
        rowid,
        time2,
        timestamp,
        block_name,
        table_name,
        batt_volt,
        program,
        flag1,
        flag2,
        flag3,
        amb_a_avg,
        amb_closed_avg,
        airtc_avg,
        airtemp_avg,
        airtemp_max,
        airtemp_min,
        all_avgt,
        am25tref1,
        ptemp_avg,
        rh,
        s_all_avgt_avg,
        vp_avg,
        tabove_sdupper,
        tabove_sdlower,
        tabove_meantrim,
        tabove_mean_dc,
        tsoil_sdupper,
        tsoil_sdlower,
        tsoil_meantrim,
        tsoil_mean_dc,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tabove' THEN

                value

                END
        ) AS tabove,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tabove_sc' THEN

                value

                END
        ) AS tabove_sc,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tabove_delta' THEN

                value

                END
        ) AS tabove_delta,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tsoil' THEN

                value

                END
        ) AS tsoil,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tsoil_sc' THEN

                value

                END
        ) AS tsoil_sc,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tsoil_delta' THEN

                value

                END
        ) AS tsoil_delta,
        MAX (
                CASE
                WHEN lower(variable_name) = 's_sdm_out' THEN

                value

                END
        ) AS s_sdm_out,
        MAX (
                CASE
                WHEN lower(variable_name) = 'sbtemp' THEN

                value

                END
        ) AS sbtemp,
        MAX (
                CASE
                WHEN lower(variable_name) = 'heat_a_avg' THEN

                value

                END
        ) AS heat_a_avg,
        MAX (
                CASE
                WHEN lower(variable_name) = 'b_dc_avg' THEN

                value

                END
        ) AS b_dc_avg,
        MAX (
                CASE
                WHEN lower(variable_name) = 'targettemp' THEN

                value

                END
        ) AS targettemp,
        MAX (
                CASE
                WHEN lower(variable_name) = 's_scldout' THEN

                value

                END
        ) AS s_scldout,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tmv' THEN

                value

                END
        ) AS tmv,
        MAX (
                CASE
                WHEN lower(variable_name) = 'a_dc' THEN

                value

                END
        ) AS a_dc,
        MAX (
                CASE
                WHEN lower(variable_name) = 'a_targettemp' THEN

                value

                END
        ) AS a_targettemp,
        MAX (
                CASE
                WHEN lower(variable_name) = 'scldout' THEN

                value

                END
        ) AS scldout,
        MAX (
                CASE
                WHEN lower(variable_name) = 'pid_lmt' THEN

                value

                END
        ) AS pid_lmt,
        MAX (
                CASE
                WHEN lower(variable_name) = 'targettemp_adj' THEN

                value

                END
        ) AS targettemp_adj,
        MAX (
                CASE
                WHEN lower(variable_name) = 'sdm_out' THEN

                value

                END
        ) AS sdm_out,
        MAX (
                CASE
                WHEN lower(variable_name) = 's_pid_lmt' THEN

                value

                END
        ) AS s_pid_lmt,
        MAX (
                CASE
                WHEN lower(variable_name) = 'tsoilr' THEN

                value

                END
        ) AS tsoilr,
        MAX (
                CASE
                WHEN lower(variable_name) = 's_pid_out' THEN

                value

                END
        ) AS s_pid_out,
        MAX (
                CASE
                WHEN lower(variable_name) = 'sctemp' THEN

                value

                END
        ) AS sctemp,
        MAX (
                CASE
                WHEN lower(variable_name) = 'amb_avg' THEN

                value

                END
        ) AS amb_avg
FROM
        derived_15min_joined
GROUP BY
        site,
        canopy,
        block,
        plot,
        measurement_interval,
        warming_treatment,
        treatment_code,
        treatment_abbr,
        water_treatment,
        variable_name,
        variable_channel,
        variable_id,
        rowid,
        time2,
        timestamp,
        block_name,
        table_name,
        batt_volt,
        program,
        flag1,
        flag2,
        flag3,
        amb_a_avg,
        amb_closed_avg,
        airtc_avg,
        airtemp_avg,
        airtemp_max,
        airtemp_min,
        all_avgt,
        am25tref1,
        ptemp_avg,
        rh,
        s_all_avgt_avg,
        vp_avg,
        tabove_sdupper,
        tabove_sdlower,
        tabove_meantrim,
        tabove_mean_dc,
        tsoil_sdupper,
        tsoil_sdlower,
        tsoil_meantrim,
        tsoil_mean_dc 
;




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to