Using distributed join will set a baseline for correct result set. If you get different results without distributed join, then you know there's a mismatch between join conditions and affinity. If you get the same results, then data is distributed well for this join.
On Thu, Jan 12, 2023, 02:02 Айсина Роза Мунеровна <roza.ays...@sbermarket.ru> wrote: > Hi Jeremy! > > Thank you for your reply. > > Am I right that if JOIN is collocated then absence of affinity key will > lead to incorrect results as data will not be fetched from other nodes? > > So the correct way to check influence of affinity key is to enable > distributed JOIN? > > On 10 Jan 2023, at 10:48 PM, Jeremy McMillan <jeremy.mcmil...@gridgain.com> > wrote: > > Внимание: Внешний отправитель! > Если вы не знаете отправителя - не открывайте вложения, не переходите по > ссылкам, не пересылайте письмо! > > If you are only doing colocated joins, then there will be no runtime > overhead incurred by collecting distributed rows (colocated joins mean do > not try to join data that might be distributed across nodes), so there > might not be much difference in runtimes. > > The difference between different affinity keys, if any, will be seen in > the results returned, and unless there's a significant difference in row > count, it will be difficult to affect runtime performance using this > strategy. > > On Tue, Jan 10, 2023, 13:32 Айсина Роза Мунеровна < > roza.ays...@sbermarket.ru> wrote: > >> Hola! >> >> We want to optimize our SQL queries that make collocated JOINs on several >> tables (about ~8 tables). >> >> Some tables have column “product_id” on which these tables are joined. >> Business meaning is that the result are features for each product_id. >> >> So based on documentation we set “*product_id*” as affinity key >> so that all data required for join will be located on the same node. >> (Column “*product_id*” is always part of the primary key) >> >> But! After this we made experiments: >> - put affinity key to other part of primary key (for example, if primary >> key is "(product_id, store_id)", then affinity key is “store_id”); >> - didn't specify affinity key at all. >> >> The problem is that all our load testing results didn’t changed! >> >> So the question - is there any way to make more advanced *EXPLAIN*, >> that will show partition shuffling (if it happens) or data collocation? >> Some debug tool for this problem. Like query plan in Spark. >> >> >> Information about our setup: >> - Ignite cluster on 5 VMs; >> - all tables are partitioned or replicated; >> - all tables are created with DDL SQL and all interactions are made * >> only* through SQL API; >> - DDL example: >> >> CREATE TABLE IF NOT EXISTS PUBLIC.ProductFeatures >> ( >> product_id INT PRIMARY KEY, >> total_cnt_orders_with_sku INT >> ) >> WITH "CACHE_NAME=PUBLIC_ProductFeatures, >> KEY_TYPE=io.sbmt.ProductFeaturesKey, >> VALUE_TYPE=io.sbmt.ProductFeaturesValue, AFFINITY_KEY=product_id, >> TEMPLATE=PARTITIONED, BACKUPS=1 >> >> - our main SQL query: >> >> SELECT >> ProductFeatures.product_id, >> ProductFeatures.total_cnt_orders_with_sku, >> StoreProductFeatures.price, >> UserProductFeaturesOrder.num_prev_orders_with_sku, >> ... >> FROM ProductFeatures >> LEFT JOIN StoreProductFeatures >> ON ProductFeatures.product_id = StoreProductFeatures.product_id >> AND StoreProductFeatures.store_id = {store_id} >> ... (more joins) >> CROSS JOIN UserFeaturesDiscount >> WHERE UserFeaturesDiscount.user_id = {user_id} >> AND ProductFeatures.product_id IN {skus} >> … >> >> Looking forward for some help. >> *--* >> >> *Роза Айсина* >> Старший разработчик ПО >> *СберМаркет* | Доставка из любимых магазинов >> >> >> Email: roza.ays...@sbermarket.ru >> Mob: >> Web: sbermarket.ru >> App: iOS >> <https://apps.apple.com/ru/app/%D1%81%D0%B1%D0%B5%D1%80%D0%BC%D0%B0%D1%80%D0%BA%D0%B5%D1%82-%D0%B4%D0%BE%D1%81%D1%82%D0%B0%D0%B2%D0%BA%D0%B0-%D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82%D0%BE%D0%B2/id1166642457> >> и Android >> <https://play.google.com/store/apps/details?id=ru.instamart&hl=en&gl=ru> >> >> >> >> >> >> >> *УВЕДОМЛЕНИЕ О КОНФИДЕНЦИАЛЬНОСТИ:* это электронное сообщение и любые >> документы, приложенные к нему, содержат конфиденциальную информацию. >> Настоящим уведомляем Вас о том, что, если это сообщение не предназначено >> Вам, использование, копирование, распространение информации, содержащейся в >> настоящем сообщении, а также осуществление любых действий на основе этой >> информации, строго запрещено. Если Вы получили это сообщение по ошибке, >> пожалуйста, сообщите об этом отправителю по электронной почте и удалите это >> сообщение. >> *CONFIDENTIALITY NOTICE:* This email and any files attached to it are >> confidential. If you are not the intended recipient you are notified that >> using, copying, distributing or taking any action in reliance on the >> contents of this information is strictly prohibited. If you have received >> this email in error please notify the sender and delete this email. >> > > *--* > > *Роза Айсина* > > Старший разработчик ПО > > *СберМаркет* | Доставка из любимых магазинов > > > > Email: roza.ays...@sbermarket.ru > > Mob: > > Web: sbermarket.ru > > App: iOS > <https://apps.apple.com/ru/app/%D1%81%D0%B1%D0%B5%D1%80%D0%BC%D0%B0%D1%80%D0%BA%D0%B5%D1%82-%D0%B4%D0%BE%D1%81%D1%82%D0%B0%D0%B2%D0%BA%D0%B0-%D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82%D0%BE%D0%B2/id1166642457> > и Android > <https://play.google.com/store/apps/details?id=ru.instamart&hl=en&gl=ru> > > > > *УВЕДОМЛЕНИЕ О КОНФИДЕНЦИАЛЬНОСТИ:* это электронное сообщение и любые > документы, приложенные к нему, содержат конфиденциальную информацию. > Настоящим уведомляем Вас о том, что, если это сообщение не предназначено > Вам, использование, копирование, распространение информации, содержащейся в > настоящем сообщении, а также осуществление любых действий на основе этой > информации, строго запрещено. Если Вы получили это сообщение по ошибке, > пожалуйста, сообщите об этом отправителю по электронной почте и удалите это > сообщение. > *CONFIDENTIALITY NOTICE:* This email and any files attached to it are > confidential. If you are not the intended recipient you are notified that > using, copying, distributing or taking any action in reliance on the > contents of this information is strictly prohibited. If you have received > this email in error please notify the sender and delete this email. >