[ https://issues.apache.org/jira/browse/KUDU-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202732#comment-17202732 ]
wangningito commented on KUDU-3197:
-----------------------------------

More investigation into this unused memory:

I never actually wrote any data into the table that consumed so much unused memory, but I do sometimes alter its schema for the application layer. By dumping the contents of the WAL, I found that the 'AlterSchemaRequest' entries stay in the WAL for days. With only schema alterations and no data ingestion, the 'DMS' may never be flushed to blocks, so the copies of the schema structures accumulate in memory. The memory is never released, even after I restart the kudu-tserver several times: it replays all the 'AlterSchemaRequest' entries, which results in the same odd memory usage.

I saw a feature in the 1.13 branch that optimizes the 'FlushDMS' strategy, but we haven't applied it to our release version, so I set the affected version to 1.12.0.

> Tablet keeps all history schemas in memory may result in high memory consumption
> --------------------------------------------------------------------------------
>
>                 Key: KUDU-3197
>                 URL: https://issues.apache.org/jira/browse/KUDU-3197
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet
>    Affects Versions: 1.12.0
>            Reporter: wangningito
>            Assignee: wangningito
>            Priority: Minor
>         Attachments: image-2020-09-25-14-45-33-402.png, image-2020-09-25-14-49-30-913.png, image-2020-09-25-15-05-44-948.png
>
>
> When a table is updated at high frequency, the memory consumption of
> kudu-tserver may be very high, and the memory is not tracked in the memory
> page.
> This is the memory usage of a tablet: the peak memory consumption of
> tablet-xxx is 3.6G, but none of its children's memory comes close to that.
> !image-2020-09-25-14-45-33-402.png!
> So I used pprof to get a heap sample. The tserver had been running for a long
> time, but the memory was still being held by
> TabletBootstrap:PlayAlterSchemaRequest.
> !image-2020-09-25-14-49-30-913.png!
> I changed `old_schemas_` in tablet_metadata.h to a fixed-size vector:
>   // Previous values of 'schema_'.
>   // These are currently kept alive forever, under the assumption that
>   // a given tablet won't have thousands of "alter table" calls.
>   // They are kept alive so that callers of schema() don't need to
>   // worry about reference counting or locking.
>   std::vector<Schema*> old_schemas_;
> The heap sample then becomes
> !image-2020-09-25-15-05-44-948.png!
> So, to make the application layer more flexible, it would be better to make
> the size of old_schemas configurable.
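
The idea of capping the number of retained old schemas can be illustrated with a minimal, self-contained sketch. This is not Kudu's code: `ToySchema`, `BoundedSchemaHistory`, and the `max_old` parameter are hypothetical names used only for illustration. Because the original comment says old schemas are kept alive so that callers of schema() need no reference counting, the sketch assumes shared ownership instead, so that evicting an old schema cannot invalidate one a caller still holds.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a schema object; purely illustrative.
struct ToySchema {
  int version;
  std::string columns;
};

// Hypothetical bounded history of superseded schemas (not Kudu's API).
class BoundedSchemaHistory {
 public:
  explicit BoundedSchemaHistory(std::size_t max_old) : max_old_(max_old) {}

  // Install a new current schema. The previous one moves into the history,
  // and the oldest entries are dropped once the cap is exceeded.
  void SetSchema(ToySchema next) {
    if (current_) {
      old_.push_back(std::move(current_));
    }
    while (old_.size() > max_old_) {
      old_.pop_front();
    }
    current_ = std::make_shared<const ToySchema>(std::move(next));
  }

  // Callers share ownership, so an evicted schema stays valid while in use.
  std::shared_ptr<const ToySchema> schema() const { return current_; }

  // Number of superseded schemas currently retained.
  std::size_t old_count() const { return old_.size(); }

 private:
  std::size_t max_old_;
  std::shared_ptr<const ToySchema> current_;
  std::deque<std::shared_ptr<const ToySchema>> old_;
};
```

With a cap of 2, replaying a thousand schema changes leaves only two old schemas in the history, while a schema obtained earlier through schema() remains usable until its holder releases it.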