[ https://issues.apache.org/jira/browse/KUDU-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210744#comment-17210744 ]
wangningito commented on KUDU-3197:
-----------------------------------

After a deeper look into the existing code, I have some doubts about the scanners.

I saw two methods in tablet_service.cc, `HandleNewScanRequest` and `HandleContinueScanRequest`, which handle almost all scan requests. Both use a projection schema to serialize the scan result. `HandleNewScanRequest` creates the scanner's projection schema in the scanner manager and serializes the result with that projection. `HandleContinueScanRequest` gets the scanner by scanner_id from the scanner manager; the projection schema is kept in the scanner, so it may be independent of the schema kept by tablet_metadata. So I suspect that deleting entries from old_schemas_ may not have much impact on scanners.

BTW, it took me some time to think about the ref-counted approach. As I understand it, there are two options:
1. Change the schema_ kept by tablet_metadata to a shared_ptr-wrapped one, and add a lock in SetSchema() and Schema(), because the previous atomic swap no longer works. This may hurt performance a lot.
2. Add an atomic counter as a Schema field and decrease the ref count in the scanner dtor. This may also hurt performance when calling Schema().

> Tablet keeps all history schemas in memory may result in high memory
> consumption
> --------------------------------------------------------------------------------
>
>                 Key: KUDU-3197
>                 URL: https://issues.apache.org/jira/browse/KUDU-3197
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet
>    Affects Versions: 1.12.0
>            Reporter: wangningito
>            Assignee: wangningito
>            Priority: Minor
>         Attachments: image-2020-09-25-14-45-33-402.png,
> image-2020-09-25-14-49-30-913.png, image-2020-09-25-15-05-44-948.png
>
>
> When a table is updated at high frequency, the memory consumption of
> kudu-tserver may become very high, and that memory is not tracked on the
> memory page.
> This is the memory usage of a tablet: the peak memory consumption of
> tablet-xxx is 3.6G, but none of its children's memory comes close to that.
> !image-2020-09-25-14-45-33-402.png!
> So I used pprof to get a heap sample. The tserver had been running for a
> long time, but memory was still being consumed by
> TabletBootstrap:PlayAlterSchemaRequest.
> !image-2020-09-25-14-49-30-913.png!
> I changed `old_schemas_` in tablet_metadata.h to a fixed-size vector:
> // Previous values of 'schema_'.
> // These are currently kept alive forever, under the assumption that
> // a given tablet won't have thousands of "alter table" calls.
> // They are kept alive so that callers of schema() don't need to
> // worry about reference counting or locking.
> std::vector<Schema*> old_schemas_;
> The heap sample then becomes
> !image-2020-09-25-15-05-44-948.png!
> So, to make the application layer more flexible, it would be better to make
> the size of old_schemas_ configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
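
For illustration only, here is a minimal sketch of how option 1 from the comment (a shared_ptr-wrapped schema guarded by a lock) could be combined with the bounded old-schema history proposed in the issue. This is not Kudu code: `Schema` is a stand-in struct, and `SchemaHolder`, `max_old`, and `num_old_schemas()` are hypothetical names invented for this sketch.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for kudu::Schema; only a version tag is modeled.
struct Schema {
  explicit Schema(int v) : version(v) {}
  int version;
};

// Hypothetical holder (not Kudu's TabletMetadata) that keeps the current
// schema plus at most 'max_old' previous schemas.
class SchemaHolder {
 public:
  SchemaHolder(std::shared_ptr<const Schema> initial, std::size_t max_old)
      : schema_(std::move(initial)), max_old_(max_old) {}

  // Returns a ref-counted handle. A caller that keeps the handle (for
  // example a long-lived scanner) keeps that schema alive even after it
  // has been evicted from the history.
  std::shared_ptr<const Schema> schema() const {
    std::lock_guard<std::mutex> l(lock_);
    return schema_;
  }

  // Installs a new schema and trims the history to 'max_old' entries.
  // The lock replaces the lock-free pointer swap, which is the cost the
  // comment worries about.
  void SetSchema(std::shared_ptr<const Schema> next) {
    std::lock_guard<std::mutex> l(lock_);
    old_schemas_.push_back(std::move(schema_));
    schema_ = std::move(next);
    while (old_schemas_.size() > max_old_) {
      old_schemas_.pop_front();  // Freed once no caller references it.
    }
  }

  std::size_t num_old_schemas() const {
    std::lock_guard<std::mutex> l(lock_);
    return old_schemas_.size();
  }

 private:
  mutable std::mutex lock_;
  std::shared_ptr<const Schema> schema_;
  std::deque<std::shared_ptr<const Schema>> old_schemas_;
  const std::size_t max_old_;
};
```

In this sketch, memory held by old schemas is bounded by `max_old` plus whatever schemas are still pinned by callers. Whether the extra lock on every `schema()` call is acceptable would need to be measured, as the comment notes.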