[ 
https://issues.apache.org/jira/browse/HIVE-28396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tanishqchugh updated HIVE-28396:
--------------------------------
    Attachment: mm_all_2.png

> Increase Tez container & AM memory size to address OOM issues
> -------------------------------------------------------------
>
>                 Key: HIVE-28396
>                 URL: https://issues.apache.org/jira/browse/HIVE-28396
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: tanishqchugh
>            Assignee: tanishqchugh
>            Priority: Major
>         Attachments: groupBy_3_map_multi_distinct_proof_128_vs_256.png, 
> mm_all_2.png
>
>
> Increase the Tez container and AM memory sizes to 256 MB to address the OOM 
> issues that have been occurring.
> The increase causes 3 qtests to fail: mm_all.q, mm_dp.q, and 
> groupby3_map_multi_distinct.q. Analyze the failures and fix them.
> Analysis:
> *groupby3_map_multi_distinct Analysis:*
> We increased the Tez container size from 128 MB to 256 MB to address OOM 
> errors. This qtest sets the property {{set hive.map.aggr=true;}}. When this 
> property is true, a check named {{checkMapSideAggregation()}} runs first to 
> verify that there is enough memory to hold the hash table needed for the 
> aggregation. The memory allotted for this aggregation is half of the 
> container size. Half of 128 MB (64 MB) was not enough to hold the generated 
> table, but half of 256 MB (128 MB) is, so map-side aggregation now happens. 
> With this aggregation, hashes are generated and stored only for the 307 
> distinct rows out of 500, and duplicate rows are mapped to these hashes. 
> This explains the change in statistics, which is expected.
> [embedded image]
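
A minimal sketch of the reasoning in the groupby3_map_multi_distinct analysis, not Hive's actual implementation. It assumes the hash-table budget is exactly half of the container size, as the analysis states, and uses a hypothetical required hash-table size of 100 MB, which is not a figure from the issue. The helper names are made up for illustration.

```python
from collections import Counter

# "Half of container size", as stated in the analysis above.
HASH_MEMORY_FRACTION = 0.5


def hash_budget_mb(container_mb: int) -> float:
    """Memory available to the map-side aggregation hash table."""
    return container_mb * HASH_MEMORY_FRACTION


def map_output_rows(rows, container_mb: int, required_mb: float) -> int:
    """Rows leaving the map side: one per distinct key if the hash
    table fits in the budget, otherwise every input row."""
    if required_mb <= hash_budget_mb(container_mb):
        return len(Counter(rows))  # map-side aggregation happens
    return len(rows)               # no map-side aggregation


# 500 input rows with 307 distinct keys, matching the counts in the analysis.
rows = [i % 307 for i in range(500)]
REQUIRED_MB = 100  # hypothetical hash-table size, not taken from the issue

print(map_output_rows(rows, 128, REQUIRED_MB))  # 500
print(map_output_rows(rows, 256, REQUIRED_MB))  # 307
```

With the 128 MB container the budget is 64 MB and the sketch passes all 500 rows through; with 256 MB the budget is 128 MB and only the 307 distinct keys are emitted, which is the kind of statistics change the analysis describes.
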
> *mm_all Analysis:*
> We increased the Tez container size from 128 MB to 256 MB to address OOM 
> errors. The total memory allocated to the LLAP daemon is 4096 MB, so with 
> each container at 256 MB, available slots = 4096/256 = 16.
> With the larger container size, the split size increases and each task has 
> more resources. As a result, each task processes a larger number of rows 
> and corresponds to one Hive-side file. The total amount of data processed 
> stays the same; only the amount handled by each task increases. Thus only 
> 16 Hive files are generated.
> [embedded image]
>  
> [embedded image]
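
The slot arithmetic in the mm_all analysis can be sketched as follows. This assumes one task per slot and one output file per task, as the analysis describes; the total row count is a hypothetical value chosen only to show that the per-task share grows while the total stays fixed. The issue does not state how many files the 128 MB setting produced, so the 128 MB line here is plain arithmetic, not a reported result.

```python
def available_slots(daemon_mb: int, container_mb: int) -> int:
    """Number of containers that fit into the LLAP daemon memory."""
    return daemon_mb // container_mb


def rows_per_task(total_rows: int, tasks: int) -> int:
    """Rows handled by each task when rows are spread evenly (rounded up)."""
    return -(-total_rows // tasks)  # ceiling division


DAEMON_MB = 4096   # from the analysis above
TOTAL_ROWS = 6400  # hypothetical, not taken from the issue

for container_mb in (128, 256):
    slots = available_slots(DAEMON_MB, container_mb)
    print(container_mb, slots, rows_per_task(TOTAL_ROWS, slots))
```
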
> *mm_dp Analysis:*
> The failure in this test case arose only from a difference in the random 
> numbers generated. Random number generation depends not only on the seed 
> value passed but also on the available task resources. As above, the task 
> resources have increased and each task processes more rows, so it generates 
> more random numbers. The random numbers generated therefore differ between 
> container sizes of 128 MB and 256 MB.
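
One way to picture the mm_dp effect is the sketch below. It assumes each task restarts its own generator from the same seed and draws one number per row; this is an assumption made for illustration, not something the issue confirms about Hive. The function name and row counts are hypothetical.

```python
import random


def assign_randoms(total_rows: int, rows_per_task: int, seed: int = 0):
    """Return the random value each row receives when rows are split
    into tasks of rows_per_task rows each."""
    values = []
    for start in range(0, total_rows, rows_per_task):
        rng = random.Random(seed)  # assumption: every task restarts from the same seed
        count = min(rows_per_task, total_rows - start)
        values.extend(rng.random() for _ in range(count))
    return values


small_tasks = assign_randoms(8, rows_per_task=2)  # 4 tasks of 2 rows
large_tasks = assign_randoms(8, rows_per_task=4)  # 2 tasks of 4 rows

print(small_tasks == large_tasks)  # False
```

The seed is identical in both runs, yet the third row gets the first draw of its task in one layout and the third draw in the other, so the per-row values differ even though each layout is reproducible on its own.
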



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
