Hi Jerome,

Yes it looks like you could stop using GET_SEMAINE and directly joining "calendrier_hebdo" with "calendrier" for example. For "FCALC_IDJOUR" you will have to make a udf so I hope you have some java skills :) The "calendrier" tables suggests you have star schema with a calendar table. If on oracle you partitioned on a date and use a subquery to get the dates you want from the fact table you can expect this to be a problem in hive. Partition pruning works during planning it will not know which partitioned to prune and thus run on all the data in the fact table and filter after it's done making partitioning useless. There are ways of working around this, it seems most people decide to use a "deterministic" udf which produces the dates and this causes the udfs to be run during planning and partition pruning magically works again.
Hope this helps.

Bennie.

Op 25-7-2013 09:50, Jérôme Verdier schreef:
Hi,

I need some help to translate a PL/SQL script in HiveQL.

Problem : my PL/SQL script is calling two functions.

you can see the script below :

    SELECT
      in_co_societe                             as co_societe,
      'SEMAINE'                                 as co_type_periode,
      a.type_entite                             as type_entite,
      a.code_entite                             as code_entite,
      a.type_rgrp_produits                      as type_rgrp_produits,
      a.co_rgrp_produits                        as co_rgrp_produits,
      SUM(a.MT_CA_NET_TTC)                      as MT_CA_NET_TTC,
      SUM(a.MT_OBJ_CA_NET_TTC)                  as MT_OBJ_CA_NET_TTC,
      SUM(a.NB_CLIENTS)                         as NB_CLIENTS,
      SUM(a.MT_CA_NET_TTC_COMP)                 as MT_CA_NET_TTC_COMP,
      SUM(a.MT_OBJ_CA_NET_TTC_COMP)             as MT_OBJ_CA_NET_TTC_COMP,
      SUM(a.NB_CLIENTS_COMP)                    as NB_CLIENTS_COMP
    from
kpi.thm_ca_rgrp_produits_jour/*@o_bi.match.eu <http://o_bi.match.eu>*/ a
    WHERE
        a.co_societe = in_co_societe
    AND a.dt_jour between
      (
        SELECT
          cal.dt_jour_deb
        FROM ods.calendrier_hebdo cal
        WHERE cal.co_societe = in_co_societe
        AND cal.co_an_semaine = ods.package_date.get_semaine(
          ods.package_date.fcalc_idjour(
            CASE
WHEN TO_CHAR(D_Dernier_Jour,'YYYY') = TO_CHAR(D_Dernier_Jour-364,'YYYY') THEN
                NEXT_DAY(D_Dernier_Jour-364,1)-7
              ELSE
                D_Dernier_Jour-364
            END
          )
        )
      )
      AND D_Dernier_Jour-364
    -- On ne calcule rien si la semaine est compl¿¿te
    AND (
          TO_CHAR(D_Dernier_Jour,'DDMM') <> '3112'
      AND TO_CHAR(D_Dernier_Jour,'D') <> '7'
    )
    GROUP BY
      a.type_entite,
      a.code_entite,
      a.type_rgrp_produits,
      a.co_rgrp_produits;

The function ods.package_date.get_semaine is :

FUNCTION GET_SEMAINE
       (ID_DEB  IN NUMBER)
  RETURN NUMBER
  IS
    SEMAINE  NUMBER(10);
  BEGIN
    SELECT CO_AN_SEMAINE
    INTO   SEMAINE
    FROM   CALENDRIER
    WHERE  ID_JOUR = ID_DEB;

    RETURN (SEMAINE);
  EXCEPTION
    WHEN NO_DATA_FOUND THEN
      RETURN (0);
    WHEN OTHERS THEN
      RETURN (0);
  END;

The function ods.package_date.fcalc_idjour is below :

FUNCTION FCALC_IDJOUR
       (DATE_REFERENCE  IN DATE)
  RETURN NUMBER
  IS
    NM_ANNEE        NUMBER := TO_NUMBER(TO_CHAR(DATE_REFERENCE,'YYYY'));
NM_MOIS NUMBER := TO_NUMBER(SUBSTR(TO_CHAR(DATE_REFERENCE,'YYYYMM'),5,2)); NM_JOUR NUMBER := TO_NUMBER(SUBSTR(TO_CHAR(DATE_REFERENCE,'YYYYMMDD'),7,2));
    IDJOUR_CALCULE  NUMBER := 0;
  BEGIN
    IF NM_ANNEE < 1998
        OR DATE_REFERENCE IS NULL THEN
      IDJOUR_CALCULE := 0;
    ELSE
IDJOUR_CALCULE := ((NM_ANNEE - 1998) * 600) + ((NM_MOIS - 01) * 50) + NM_JOUR;
    END IF;

    RETURN IDJOUR_CALCULE;
    DBMS_OUTPUT.PUT_LINE(IDJOUR_CALCULE);
  END FCALC_IDJOUR;

Is it possible to translate this in one  HiveQL script ?

Reply via email to