Hi,

Since we saw that we have to pass arguments to the RANK() function, I'm trying to translate this expression (which works on Oracle 10g) into functionally equivalent Hive:
    RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
                 ORDER BY pnvente.dt_debut_commercial DESC,
                          COALESCE(pnvente.id_produit, dem.id_produit) DESC) as rang

I tried this:

    RANK(pnvente.dt_debut_commercial, COALESCE(pnvente.id_produit, dem.id_produit))
      OVER (PARTITION BY mag.co_magasin, dem.id_produit
            ORDER BY pnvente.dt_debut_commercial DESC,
                     COALESCE(pnvente.id_produit, dem.id_produit) DESC) as rang

and this:

    RANK(pnvente.dt_debut_commercial, pnvente.id_produit, dem.id_produit)
      OVER (PARTITION BY mag.co_magasin, dem.id_produit
            ORDER BY pnvente.dt_debut_commercial DESC,
                     COALESCE(pnvente.id_produit, dem.id_produit) DESC) as rang

But Hive gives me another error:

    FAILED: SemanticException Failed to breakup Windowing invocations into
    Groups. At least 1 group must only depend on input columns. Also check
    for circular dependencies.
    Underlying error: Ranking Functions can take no arguments

I don't understand this error: on the first try it said it cannot work without arguments, and now the RANK function is failing because of the arguments. What is wrong now?

2013/7/17 Richa Sharma <mailtorichasha...@gmail.com>

> My bad ... in relational databases we generally do not give a column name
> inside rank() ... the one in (PARTITION BY ... ORDER BY ...) is sufficient.
>
> But it looks like that's not the case in Hive.
>
> Jerome,
>
> Please look at the examples in the link below and see if you are able to
> make it work:
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-PARTITIONBYwithpartitioning%2CORDERBY%2Candwindowspecification
>
> Can't help you beyond this as I don't have Hive 0.11 :-(
>
> Richa
>
> On Wed, Jul 17, 2013 at 3:08 PM, Jérôme Verdier <verdier.jerom...@gmail.com> wrote:
>
>> Hi Richa,
>>
>> I have tried one query, with what I've understood of Vijay's tips.
>>
>>     SELECT code_entite,
>>            RANK(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe
>>                                        ORDER BY mag.me_vente_ht) AS rank
>>     FROM default.thm_renta_rgrp_produits_n_1 mag;
>>
>> This query works; it gives me results.
>>
>> You say that maybe I'm hitting the same bug as JIRA HIVE-4663, but the
>> query is also failing when I put the analytical columns in...
>>
>> 2013/7/17 Richa Sharma <mailtorichasha...@gmail.com>
>>
>>> Vijay,
>>>
>>> Jerome has already passed a column -> mag.co_societe for rank.
>>>
>>> syntax -> RANK() OVER (PARTITION BY mag.co_societe ORDER BY mag.me_vente_ht)
>>>
>>> This will generate a rank for column mag.co_societe based on column
>>> value me_vente_ht.
>>>
>>> Jerome,
>>>
>>> It's possible you are also hitting the same bug I mentioned in my
>>> earlier email.
>>>
>>> Richa
>>>
>>> On Wed, Jul 17, 2013 at 2:31 PM, Vijay <tec...@gmail.com> wrote:
>>>
>>>> As the error message states: "One or more arguments are expected," you
>>>> have to pass a column to the rank function.
>>>>
>>>> On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier <verdier.jerom...@gmail.com> wrote:
>>>>
>>>>> Hi Richa,
>>>>>
>>>>> I have tried a simple query without joins, etc.:
>>>>>
>>>>>     SELECT RANK() OVER (PARTITION BY mag.co_societe
>>>>>                         ORDER BY mag.me_vente_ht),
>>>>>            mag.co_societe, mag.me_vente_ht
>>>>>     FROM default.thm_renta_rgrp_produits_n_1 mag;
>>>>>
>>>>> Unfortunately, the error is the same as before:
>>>>>
>>>>>     Error: Query returned non-zero code: 40000, cause: FAILED:
>>>>>     SemanticException Failed to breakup Windowing invocations into
>>>>>     Groups. At least 1 group must only depend on input columns. Also
>>>>>     check for circular dependencies.
>>>>>     Underlying error:
>>>>>     org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or
>>>>>     more arguments are expected.
>>>>>     SQLState: 42000
>>>>>     ErrorCode: 40000
>>>>>
>>>>> 2013/7/17 Richa Sharma <mailtorichasha...@gmail.com>
>>>>>
>>>>>> Jerome,
>>>>>>
>>>>>> I would recommend that you try the RANK function with columns from
>>>>>> just one table first. Once it is established that rank is working
>>>>>> fine, then add all the joins.
>>>>>>
>>>>>> I am still on Hive 0.10 so cannot test it myself. However, I found a
>>>>>> similar issue at the following link - so it's possible you are facing
>>>>>> issues due to this reported bug:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/HIVE-4663
>>>>>>
>>>>>> Richa
>>>>>>
>>>>>> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <verdier.jerom...@gmail.com> wrote:
>>>>>>
>>>>>>> You can see my query below:
>>>>>>>
>>>>>>>     SELECT
>>>>>>>         mag.co_magasin,
>>>>>>>         dem.id_produit                               as id_produit_orig,
>>>>>>>         pnvente.dt_debut_commercial                  as dt_debut_commercial,
>>>>>>>         COALESCE(pnvente.id_produit, dem.id_produit) as id_produit,
>>>>>>>         RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>>>>>>                      ORDER BY pnvente.dt_debut_commercial DESC,
>>>>>>>                               COALESCE(pnvente.id_produit, dem.id_produit) DESC) as rang
>>>>>>>     FROM default.demarque_mag_jour dem
>>>>>>>     LEFT OUTER JOIN default.produit_norm pn
>>>>>>>         ON  pn.co_societe = dem.co_societe
>>>>>>>         AND pn.id_produit = dem.id_produit
>>>>>>>     LEFT OUTER JOIN default.produit_norm pnvente
>>>>>>>         ON  pnvente.co_societe = pn.co_societe
>>>>>>>         AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>>>>>     INNER JOIN default.kpi_magasin mag
>>>>>>>         ON  mag.id_magasin = dem.id_magasin
>>>>>>>     GROUP BY
>>>>>>>         mag.co_magasin,
>>>>>>>         dem.id_produit,
>>>>>>>         pnvente.dt_debut_commercial,
>>>>>>>         COALESCE(pnvente.id_produit, dem.id_produit);
>>>>>>>
>>>>>>> 2013/7/16 Richa Sharma <mailtorichasha...@gmail.com>
>>>>>>>
>>>>>>>> Can you share the query with just RANK()?
>>>>>>>>
>>>>>>>> Richa
>>>>>>>>
>>>>>>>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <verdier.jerom...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Richa,
>>>>>>>>>
>>>>>>>>> I tried to execute the rank function alone, but the result is
>>>>>>>>> the same.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> 2013/7/16 Richa Sharma <mailtorichasha...@gmail.com>
>>>>>>>>>
>>>>>>>>>> Hi Jerome,
>>>>>>>>>>
>>>>>>>>>> I think the problem is that you are trying to use the MIN, SUM
>>>>>>>>>> and RANK functions in a single query.
>>>>>>>>>>
>>>>>>>>>> Try to get the rank first in one query, and apply these
>>>>>>>>>> aggregate functions on top of it.
>>>>>>>>>>
>>>>>>>>>> Richa
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <verdier.jerom...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have a problem using the RANK() OVER (PARTITION BY ...)
>>>>>>>>>>> function with Hive.
>>>>>>>>>>>
>>>>>>>>>>> Hive is at version 0.11 and, as we can see here:
>>>>>>>>>>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>>>>>>>>>>> we can now use these functions in Hive.
>>>>>>>>>>>
>>>>>>>>>>> But when I use it, I encounter this error:
>>>>>>>>>>>
>>>>>>>>>>>     FAILED: SemanticException Failed to breakup Windowing
>>>>>>>>>>>     invocations into Groups. At least 1 group must only depend
>>>>>>>>>>>     on input columns. Also check for circular dependencies.
>>>>>>>>>>>     Underlying error:
>>>>>>>>>>>     org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException:
>>>>>>>>>>>     One or more arguments are expected.
>>>>>>>>>>>
>>>>>>>>>>> Here is my script:
>>>>>>>>>>>
>>>>>>>>>>>     SELECT
>>>>>>>>>>>         mag.co_magasin,
>>>>>>>>>>>         dem.id_produit                               as id_produit_orig,
>>>>>>>>>>>         pnvente.dt_debut_commercial                  as dt_debut_commercial,
>>>>>>>>>>>         COALESCE(pnvente.id_produit, dem.id_produit) as id_produit,
>>>>>>>>>>>         min(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END)
>>>>>>>>>>>                                                      as flg_demarque_valide,
>>>>>>>>>>>         sum(CASE WHEN dem.co_validation IS NULL THEN 0
>>>>>>>>>>>                  ELSE CAST(dem.mt_revient_ope AS INT) END)
>>>>>>>>>>>                                                      as me_dem_con_prx_cs,
>>>>>>>>>>>         0                                            as me_dem_inc_prx_cs,
>>>>>>>>>>>         0                                            as me_dem_prov_stk_cs,
>>>>>>>>>>>         sum(CASE WHEN dem.co_validation IS NULL THEN 0
>>>>>>>>>>>                  ELSE CAST(dem.qt_demarque AS INT) END)
>>>>>>>>>>>                                                      as qt_dem_con,
>>>>>>>>>>>         0                                            as qt_dem_inc,
>>>>>>>>>>>         0                                            as qt_dem_prov_stk, -- !!!!!!!! COMMA
>>>>>>>>>>>         RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>>>>>>>>>>                      ORDER BY pnvente.dt_debut_commercial DESC,
>>>>>>>>>>>                               COALESCE(pnvente.id_produit, dem.id_produit) DESC) as rang
>>>>>>>>>>>     FROM default.calendrier cal
>>>>>>>>>>>     INNER JOIN default.demarque_mag_jour dem
>>>>>>>>>>>         ON  CASE WHEN dem.co_societe = 1 THEN 1 ELSE 2 END = '${hiveconf:in_co_societe}' -- To be modified
>>>>>>>>>>>         AND dem.dt_jour = cal.dt_jour
>>>>>>>>>>>     LEFT OUTER JOIN default.produit_norm pn
>>>>>>>>>>>         ON  pn.co_societe = dem.co_societe
>>>>>>>>>>>         AND pn.id_produit = dem.id_produit
>>>>>>>>>>>     LEFT OUTER JOIN default.produit_norm pnvente
>>>>>>>>>>>         ON  pnvente.co_societe = pn.co_societe
>>>>>>>>>>>         AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>>>>>>>>>         AND pnvente.co_type_motif = '05'
>>>>>>>>>>>     INNER JOIN default.kpi_magasin mag
>>>>>>>>>>>         ON  mag.co_societe = '${hiveconf:in_co_societe}'
>>>>>>>>>>>         AND mag.id_magasin = dem.id_magasin
>>>>>>>>>>>     WHERE cal.dt_jour = '${hiveconf:in_dt_jour}'
>>>>>>>>>>>         AND NOT (dem.co_validation IS NULL
>>>>>>>>>>>                  AND cal.dt_jour > unix_timestamp() - 3*60*60*24) -- To verify
>>>>>>>>>>>         -- JYP 4.4
>>>>>>>>>>>         AND dem.co_operation_magasin IN ('13','14','32')
>>>>>>>>>>>     GROUP BY
>>>>>>>>>>>         mag.co_magasin,
>>>>>>>>>>>         dem.id_produit,
>>>>>>>>>>>         pnvente.dt_debut_commercial,
>>>>>>>>>>>         COALESCE(pnvente.id_produit, dem.id_produit)
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Jérôme VERDIER*
>>>>>>>>>>> 06.72.19.17.31
>>>>>>>>>>> verdier.jerom...@gmail.com

--
*Jérôme VERDIER*
06.72.19.17.31
verdier.jerom...@gmail.com
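For later readers, the two error messages fit together: in Hive 0.11 ranking functions take no arguments (the ORDER BY inside OVER supplies the ordering), and combining MIN/SUM aggregates with a windowing function in the same GROUP BY query triggers the "Failed to breakup Windowing invocations into Groups" failure discussed above (HIVE-4663). A minimal, untested sketch of the workaround Richa suggests - do the GROUP BY aggregation in an inner query, then rank the aggregated rows in an outer query - using the table and column names from the 6:41 PM message (simplified to two aggregates):

```sql
-- Sketch only: assumes the same tables as in the thread; not verified
-- against a running Hive 0.11 instance.
SELECT
    t.co_magasin,
    t.id_produit_orig,
    t.dt_debut_commercial,
    t.id_produit,
    t.flg_demarque_valide,
    t.qt_dem_con,
    -- RANK() takes no arguments; partition/order come from the OVER clause
    RANK() OVER (PARTITION BY t.co_magasin, t.id_produit_orig
                 ORDER BY t.dt_debut_commercial DESC,
                          t.id_produit DESC) as rang
FROM (
    -- Inner query: plain GROUP BY aggregation, no windowing functions
    SELECT
        mag.co_magasin,
        dem.id_produit                               as id_produit_orig,
        pnvente.dt_debut_commercial                  as dt_debut_commercial,
        COALESCE(pnvente.id_produit, dem.id_produit) as id_produit,
        min(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END)
                                                     as flg_demarque_valide,
        sum(CASE WHEN dem.co_validation IS NULL THEN 0
                 ELSE CAST(dem.qt_demarque AS INT) END)
                                                     as qt_dem_con
    FROM default.demarque_mag_jour dem
    LEFT OUTER JOIN default.produit_norm pn
        ON  pn.co_societe = dem.co_societe
        AND pn.id_produit = dem.id_produit
    LEFT OUTER JOIN default.produit_norm pnvente
        ON  pnvente.co_societe = pn.co_societe
        AND pnvente.co_produit_rfu = pn.co_produit_lip
    INNER JOIN default.kpi_magasin mag
        ON  mag.id_magasin = dem.id_magasin
    GROUP BY
        mag.co_magasin,
        dem.id_produit,
        pnvente.dt_debut_commercial,
        COALESCE(pnvente.id_produit, dem.id_produit)
) t;
```

The split sidesteps the planner's attempt to evaluate aggregates and the window function in one stage: the subquery produces plain input columns, and the outer windowing query depends only on those.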