[ https://issues.apache.org/jira/browse/HIVE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16845032#comment-16845032 ]
slim bouguerra commented on HIVE-21391:
---------------------------------------

[~prasanth_j] I took a look at this, and I think weak refs will not solve the memory-pressure issue. The root cause is the size of the blocking queue between the IO thread (the producer) and the pipeline-processor thread (the consumer).

*Why?* Note that as of now the number of buffered CVBs (column vector batches) is bounded by the size of the blocking queue, not by the size of the CVB pool. No matter how big the pool is, a fast IO thread will create as many CVBs as needed to fill the processor queue. Therefore, at any point in time, the number of live CVBs == queue size + 2 (one being consumed by the processor, one being filled by the IO thread). To conclude: setting the pool size to one, or using weak refs, will not change this equation, because the blocking queue needs to hold a strong reference to every queued CVB.

*How to fix this?* We need to make the blocking queue size more realistic:
* Currently the minimum queue size is 10 CVBs. This can fire back: scanning 2000 decimal columns already costs about 1 GB per executor.
* Currently a decimal is weighted as 4x a Decimal64 (see LlapRecordReader#COL_WEIGHT_HIVEDECIMAL), which is wrong. As the JOL layout below shows, a HiveDecimalWritable is at least 10x bigger than a long (a snippet to reproduce this dump is appended at the end of this message):
{code}
org.apache.hadoop.hive.serde2.io.HiveDecimalWritable object internals:
 OFFSET  SIZE     TYPE DESCRIPTION                                  VALUE
      0    16          (object header)                              N/A
     16     8     long FastHiveDecimal.fast2                        N/A
     24     8     long FastHiveDecimal.fast1                        N/A
     32     8     long FastHiveDecimal.fast0                        N/A
     40     4      int FastHiveDecimal.fastSignum                   N/A
     44     4      int FastHiveDecimal.fastIntegerDigitCount        N/A
     48     4      int FastHiveDecimal.fastScale                    N/A
     52     4      int FastHiveDecimal.fastSerializationScale       N/A
     56     1  boolean HiveDecimalWritable.isSet                    N/A
     57     7          (alignment/padding gap)
     64     8   long[] HiveDecimalWritable.internalScratchLongs     N/A
     72     8   byte[] HiveDecimalWritable.internalScratchBuffer    N/A
Instance size: 80 bytes
{code}
* We might also want the divisor to grow pseudo-linearly in the column count n, something like n*log(n) (or n*sqrt(n)), so that wide column scans get less buffered memory (see the sizing sketch appended below).
* To make the size more rational, I will try to express it in bytes instead of a unit-less count. If you guys are okay with this, I propose removing LLAP_IO_VRB_QUEUE_LIMIT_MIN and replacing it with a desired size in bytes, say 1 GB by default (a byte-budget sketch is also appended below).

> LLAP: Pool of column vector buffers can cause memory pressure
> -------------------------------------------------------------
>
>                 Key: HIVE-21391
>                 URL: https://issues.apache.org/jira/browse/HIVE-21391
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: HIVE-21391.1.patch
>
> When there are too many columns (in the order of 100s) with decimal or string types, the column vector pool of buffers created here [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/EncodedDataConsumer.java#L59] can cause memory pressure.
> Example:
> 128 (poolSize) * 300 (numCols) * 1024 (batchSize) * 80 (decimalSize) ~= 3GB
> The pool size keeps increasing when there is a slow consumer but fast LLAP IO (SSDs), leading to GC pressure when all LLAP IO threads read splits from the same table.
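To reproduce the layout dump above, here is a minimal sketch using OpenJDK JOL; the class name is illustrative, and it assumes jol-core is on the classpath:
{code}
import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
import org.openjdk.jol.info.ClassLayout;

public class DecimalLayoutDump {
    public static void main(String[] args) {
        // Prints the per-field offsets and the 80-byte instance size
        // quoted in the comment above.
        System.out.println(
            ClassLayout.parseClass(HiveDecimalWritable.class).toPrintable());
    }
}
{code}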
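To make the n*log(n) idea concrete, here is a hedged sketch of what the sizing function could look like; the method name and the floor of 1 are assumptions, not from any patch:
{code}
// Illustrative only: divide the base limit by n*log2(n+1) instead of plain n,
// so the queue depth shrinks super-linearly with the column count and wide
// scans end up buffering less total memory (~ depth * numCols * batch bytes).
static int queueLimit(int baseLimit, int numCols) {
    double scale = numCols * (Math.log(numCols + 1) / Math.log(2)); // n log n
    return Math.max(1, (int) (baseLimit / scale)); // n*sqrt(n) is an alternative
}
{code}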
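And here is a sketch of the byte-denominated limit proposed above; the helper name, the 1 GiB default, and the average-size parameter are assumptions for illustration:
{code}
// Illustrative only: derive the queue depth from a byte budget instead of a
// unit-less count. avgBytesPerValue would come from real per-type sizes,
// e.g. 80 bytes for HiveDecimalWritable per the JOL dump above.
static int queueLimitFromBudget(long budgetBytes, int numCols,
                                int batchSize, int avgBytesPerValue) {
    long bytesPerCvb = (long) numCols * batchSize * avgBytesPerValue;
    return (int) Math.max(1L, budgetBytes / bytesPerCvb);
}
{code}
With a 1 GiB budget, 300 decimal columns, 1024 rows per batch and 80 bytes per value, one CVB is ~24.6 MB, giving a queue depth of ~43 instead of a fixed, unit-less minimum.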