Hi Jeremy!

Thank you for your reply.

Am I right that if JOIN is collocated then absence of affinity key will lead to 
incorrect results as data will not be fetched from other nodes?

So the correct way to check influence of affinity key is to enable distributed 
JOIN?

On 10 Jan 2023, at 10:48 PM, Jeremy McMillan <jeremy.mcmil...@gridgain.com> 
wrote:

Внимание: Внешний отправитель!
Если вы не знаете отправителя - не открывайте вложения, не переходите по 
ссылкам, не пересылайте письмо!

If you are only doing colocated joins, then there will be no runtime overhead 
incurred by collecting distributed rows (colocated joins mean do not try to 
join data that might be distributed across nodes), so there might not be much 
difference in runtimes.

The difference between different affinity keys, if any, will be seen in the 
results returned, and unless there's a significant difference in row count, it 
will be difficult to affect runtime performance using this strategy.

On Tue, Jan 10, 2023, 13:32 Айсина Роза Мунеровна 
<roza.ays...@sbermarket.ru<mailto:roza.ays...@sbermarket.ru>> wrote:
Hola!

We want to optimize our SQL queries that make collocated JOINs on several 
tables (about ~8 tables).

Some tables have column “product_id” on which these tables are joined.
Business meaning is that the result are features for each product_id.

So based on documentation we set “product_id” as affinity key
so that all data required for join will be located on the same node.
(Column “product_id” is always part of the primary key)

But! After this we made experiments:
- put affinity key to other part of primary key (for example, if primary key is 
"(product_id, store_id)", then affinity key is “store_id”);
- didn't specify affinity key at all.

The problem is that all our load testing results didn’t changed!

So the question - is there any way to make more advanced EXPLAIN,
that will show partition shuffling (if it happens) or data collocation?
Some debug tool for this problem. Like query plan in Spark.


Information about our setup:
- Ignite cluster on 5 VMs;
- all tables are partitioned or replicated;
- all tables are created with DDL SQL and all interactions are made only 
through SQL API;
- DDL example:

CREATE TABLE IF NOT EXISTS PUBLIC.ProductFeatures
(
    product_id INT PRIMARY KEY,
    total_cnt_orders_with_sku INT
)
WITH "CACHE_NAME=PUBLIC_ProductFeatures, KEY_TYPE=io.sbmt.ProductFeaturesKey, 
VALUE_TYPE=io.sbmt.ProductFeaturesValue, AFFINITY_KEY=product_id, 
TEMPLATE=PARTITIONED, BACKUPS=1

- our main SQL query:

SELECT
    ProductFeatures.product_id,
    ProductFeatures.total_cnt_orders_with_sku,
    StoreProductFeatures.price,
    UserProductFeaturesOrder.num_prev_orders_with_sku,
    ...
FROM ProductFeatures
LEFT JOIN StoreProductFeatures
  ON ProductFeatures.product_id = StoreProductFeatures.product_id
  AND StoreProductFeatures.store_id = {store_id}
... (more joins)
CROSS JOIN UserFeaturesDiscount
WHERE UserFeaturesDiscount.user_id = {user_id}
  AND ProductFeatures.product_id IN {skus}
  …

Looking forward for some help.
--

Роза Айсина
Старший разработчик ПО
СберМаркет | Доставка из любимых магазинов



Email: roza.ays...@sbermarket.ru<mailto:roza.ays...@sbermarket.ru>
Mob:
Web: sbermarket.ru<https://sbermarket.ru/>
App: 
iOS<https://apps.apple.com/ru/app/%D1%81%D0%B1%D0%B5%D1%80%D0%BC%D0%B0%D1%80%D0%BA%D0%B5%D1%82-%D0%B4%D0%BE%D1%81%D1%82%D0%B0%D0%B2%D0%BA%D0%B0-%D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82%D0%BE%D0%B2/id1166642457>
 и 
Android<https://play.google.com/store/apps/details?id=ru.instamart&hl=en&gl=ru>








УВЕДОМЛЕНИЕ О КОНФИДЕНЦИАЛЬНОСТИ: это электронное сообщение и любые документы, 
приложенные к нему, содержат конфиденциальную информацию. Настоящим уведомляем 
Вас о том, что, если это сообщение не предназначено Вам, использование, 
копирование, распространение информации, содержащейся в настоящем сообщении, а 
также осуществление любых действий на основе этой информации, строго запрещено. 
Если Вы получили это сообщение по ошибке, пожалуйста, сообщите об этом 
отправителю по электронной почте и удалите это сообщение.
CONFIDENTIALITY NOTICE: This email and any files attached to it are 
confidential. If you are not the intended recipient you are notified that 
using, copying, distributing or taking any action in reliance on the contents 
of this information is strictly prohibited. If you have received this email in 
error please notify the sender and delete this email.


--

Роза Айсина

Старший разработчик ПО

СберМаркет | Доставка из любимых магазинов



Email: roza.ays...@sbermarket.ru<mailto:roza.ays...@sbermarket.ru>

Mob:

Web: sbermarket.ru<https://sbermarket.ru/>

App: 
iOS<https://apps.apple.com/ru/app/%D1%81%D0%B1%D0%B5%D1%80%D0%BC%D0%B0%D1%80%D0%BA%D0%B5%D1%82-%D0%B4%D0%BE%D1%81%D1%82%D0%B0%D0%B2%D0%BA%D0%B0-%D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82%D0%BE%D0%B2/id1166642457>
 и 
Android<https://play.google.com/store/apps/details?id=ru.instamart&hl=en&gl=ru>



УВЕДОМЛЕНИЕ О КОНФИДЕНЦИАЛЬНОСТИ: это электронное сообщение и любые документы, 
приложенные к нему, содержат конфиденциальную информацию. Настоящим уведомляем 
Вас о том, что, если это сообщение не предназначено Вам, использование, 
копирование, распространение информации, содержащейся в настоящем сообщении, а 
также осуществление любых действий на основе этой информации, строго запрещено. 
Если Вы получили это сообщение по ошибке, пожалуйста, сообщите об этом 
отправителю по электронной почте и удалите это сообщение.
CONFIDENTIALITY NOTICE: This email and any files attached to it are 
confidential. If you are not the intended recipient you are notified that 
using, copying, distributing or taking any action in reliance on the contents 
of this information is strictly prohibited. If you have received this email in 
error please notify the sender and delete this email.

Reply via email to