Hi Team, Does anybody tried to have a look at it and would like to suggest any optimisations?
Thanks On Sat, Jun 11, 2022 at 12:48 AM Shubham Mittal <mittalshubha...@gmail.com> wrote: > Hi Team. > > *I have a use case to get the result as follows:* > > 1. asin_ymm is never null. > 2. If there is more than 1 entry for an asin_ymm with both null and > non-null submodelId, I should return rows with non-null submodelId only, > otherwise if there is no submodelid present for a asin_ymm, then return > that row with null submodelid. > 3. Also if submodelid is null , assuming fitment_key would always be null > in the table. > 4. Using that resultset, If there is more than 1 entry for an (asin_ymm > ,SubmodelID) with both null and non-null fitment_key, I should return rows > with non-null fitment_key only, otherwise if there is no fitment_key > present for a (asin_ymm,submodelId), then return that row with null > fitment_key. > 5. Using that resultset, i need to return those rows having maximum values > in fitment_key(ie. for eg.(out of these two 'A', 'D','1--3-4' and A', > 'D','1-2-3-4-7', i should return row having A', 'D','1-2-3-4-7) > > create table fitment_records(asin_ymm varchar(50), SubmodelID varchar(50), > fitment_key varchar(50)); > > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A', > null,null); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('B', > null,null); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key ) values('C', > null,null); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A', > 'D','1--3-4'); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('B', > 'E','2-3-4-5'); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('B', > 'E', null); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A', > 'F','2-3'); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A', > 'E', null); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A', > 'D', null); > insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A', > 'D','1-2-3-4-7'); > > output should be: > > asin_ymm | submodelid | fitment_key > ----------+------------+------------- > A | D | 1-2-3-4-7 > A | E | null > A | F | 2-3 > B | E | 2-3-4-5 > C | null | null > > Currently i have written these queries for this usecase. Can we optimise it > further? Considering data is in millions > > create temporary view tv1 as (SELECT * > FROM fitment_records fr_1 > WHERE fitment_key IS NOT NULL OR > (fitment_key IS NULL AND > NOT EXISTS (SELECT 1 FROM fitment_records fr_2 > WHERE fr_2.asin_ymm = fr_1.asin_ymm AND > fr_2.SubmodelID = fr_1.SubmodelID and > fr_2.fitment_key IS NOT NULL))); > > create temporary view tv2 as (select * > FROM tv1 fr_1 > WHERE SubmodelID IS NOT NULL OR > (SubmodelID IS NULL AND > NOT EXISTS (SELECT 1 FROM fitment_records fr_2 > WHERE fr_2.asin_ymm = fr_1.asin_ymm AND > fr_2.SubmodelID IS NOT NULL) )); > > create temporary view fitment_records_with_fitment_key_size as ( > select asin_ymm, SubmodelID, fitment_key, > Array_Length(string_to_array(fitment_key, '-'),1) as fitment_key_size > from tv2 > where SubmodelID is not null > and fitment_key is not null > group by asin_ymm, SubmodelID, fitment_key > ); > > create temporary view fitment_records_with_fitment_key_max_size as ( > select asin_ymm, SubmodelID, max(fitment_key_size) as max_fitment_key_size > from fitment_records_with_fitment_key_size > group by asin_ymm, SubmodelID > ); > > select * from tv2 > except > select f2.* > from fitment_records_with_fitment_key_size frws, > fitment_records_with_fitment_key_max_size frwms, > tv2 f2 > where frws.asin_ymm = frwms.asin_ymm > AND frws.SubmodelID = frwms.SubmodelID > AND frws.fitment_key_size < frwms.max_fitment_key_size > AND frws.SubmodelID = f2.SubmodelID > AND frws.asin_ymm = f2.asin_ymm > AND frws.fitment_key = f2.fitment_key; > > Thanks & Regards > > >