Hi Jerome,
Yes it looks like you could stop using GET_SEMAINE and directly joining
"calendrier_hebdo" with "calendrier" for example. For "FCALC_IDJOUR" you
will have to make a udf so I hope you have some java skills :)
The "calendrier" tables suggests you have star schema with a calendar
table. If on oracle you partitioned on a date and use a subquery to get
the dates you want from the fact table you can expect this to be a
problem in hive. Partition pruning works during planning it will not
know which partitioned to prune and thus run on all the data in the fact
table and filter after it's done making partitioning useless. There are
ways of working around this, it seems most people decide to use a
"deterministic" udf which produces the dates and this causes the udfs to
be run during planning and partition pruning magically works again.
Hope this helps.
Bennie.
Op 25-7-2013 09:50, Jérôme Verdier schreef:
Hi,
I need some help to translate a PL/SQL script in HiveQL.
Problem : my PL/SQL script is calling two functions.
you can see the script below :
SELECT
in_co_societe as co_societe,
'SEMAINE' as co_type_periode,
a.type_entite as type_entite,
a.code_entite as code_entite,
a.type_rgrp_produits as type_rgrp_produits,
a.co_rgrp_produits as co_rgrp_produits,
SUM(a.MT_CA_NET_TTC) as MT_CA_NET_TTC,
SUM(a.MT_OBJ_CA_NET_TTC) as MT_OBJ_CA_NET_TTC,
SUM(a.NB_CLIENTS) as NB_CLIENTS,
SUM(a.MT_CA_NET_TTC_COMP) as MT_CA_NET_TTC_COMP,
SUM(a.MT_OBJ_CA_NET_TTC_COMP) as MT_OBJ_CA_NET_TTC_COMP,
SUM(a.NB_CLIENTS_COMP) as NB_CLIENTS_COMP
from
kpi.thm_ca_rgrp_produits_jour/*@o_bi.match.eu
<http://o_bi.match.eu>*/ a
WHERE
a.co_societe = in_co_societe
AND a.dt_jour between
(
SELECT
cal.dt_jour_deb
FROM ods.calendrier_hebdo cal
WHERE cal.co_societe = in_co_societe
AND cal.co_an_semaine = ods.package_date.get_semaine(
ods.package_date.fcalc_idjour(
CASE
WHEN TO_CHAR(D_Dernier_Jour,'YYYY') =
TO_CHAR(D_Dernier_Jour-364,'YYYY') THEN
NEXT_DAY(D_Dernier_Jour-364,1)-7
ELSE
D_Dernier_Jour-364
END
)
)
)
AND D_Dernier_Jour-364
-- On ne calcule rien si la semaine est compl¿¿te
AND (
TO_CHAR(D_Dernier_Jour,'DDMM') <> '3112'
AND TO_CHAR(D_Dernier_Jour,'D') <> '7'
)
GROUP BY
a.type_entite,
a.code_entite,
a.type_rgrp_produits,
a.co_rgrp_produits;
The function ods.package_date.get_semaine is :
FUNCTION GET_SEMAINE
(ID_DEB IN NUMBER)
RETURN NUMBER
IS
SEMAINE NUMBER(10);
BEGIN
SELECT CO_AN_SEMAINE
INTO SEMAINE
FROM CALENDRIER
WHERE ID_JOUR = ID_DEB;
RETURN (SEMAINE);
EXCEPTION
WHEN NO_DATA_FOUND THEN
RETURN (0);
WHEN OTHERS THEN
RETURN (0);
END;
The function ods.package_date.fcalc_idjour is below :
FUNCTION FCALC_IDJOUR
(DATE_REFERENCE IN DATE)
RETURN NUMBER
IS
NM_ANNEE NUMBER := TO_NUMBER(TO_CHAR(DATE_REFERENCE,'YYYY'));
NM_MOIS NUMBER :=
TO_NUMBER(SUBSTR(TO_CHAR(DATE_REFERENCE,'YYYYMM'),5,2));
NM_JOUR NUMBER :=
TO_NUMBER(SUBSTR(TO_CHAR(DATE_REFERENCE,'YYYYMMDD'),7,2));
IDJOUR_CALCULE NUMBER := 0;
BEGIN
IF NM_ANNEE < 1998
OR DATE_REFERENCE IS NULL THEN
IDJOUR_CALCULE := 0;
ELSE
IDJOUR_CALCULE := ((NM_ANNEE - 1998) * 600) + ((NM_MOIS - 01) *
50) + NM_JOUR;
END IF;
RETURN IDJOUR_CALCULE;
DBMS_OUTPUT.PUT_LINE(IDJOUR_CALCULE);
END FCALC_IDJOUR;
Is it possible to translate this in one HiveQL script ?