[ https://issues.apache.org/jira/browse/HIVE-28638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-28638:
--------------------------------
    Description:
File system statistics in LLAP are handled as follows:
1. get all statistics before the task callable (contains thread local), using FileSystem.getAllStatistics()
2. if more than one entry belongs to the same scheme, they are merged
3. run the task
4. get all statistics again, and subtract the pre-state, considering the possibility of multiple Statistics objects for the same scheme

This logic is currently tightly coupled with StatsRecordingThreadPool, which calls into LlapUtil, so it's not testable at all, and it pollutes both StatsRecordingThreadPool and LlapUtil, not to mention that the exposed stat fields (bytesRead, bytesWritten, ...) are also present in different classes, which makes them harder to maintain.

The whole before/after logic can be moved to a separate class, which can hold the statistics as its state, making StatsRecordingThreadPool much cleaner.

  was:
File system statistics in LLAP are handled as follows:
1. get all statistics before the task callable (contains thread local), using FileSystem.getAllStatistics()
2. if more than one entry belongs to the same scheme, they are merged
3. run the task
4. get all statistics again, subtract the pre-state, considering the possibility of multiple Statistics objects for the same scheme

This logic is currently tightly coupled with StatsRecordingThreadPool, which calls into LlapUtil, so it's not testable at all, and it pollutes both StatsRecordingThreadPool and LlapUtil.

The whole before/after logic can be moved to a separate class, which can hold the statistics as its state, making StatsRecordingThreadPool much cleaner.


> Refactor stats handling in StatsRecordingThreadPool
> ---------------------------------------------------
>
>                 Key: HIVE-28638
>                 URL: https://issues.apache.org/jira/browse/HIVE-28638
>             Project: Hive
>          Issue Type: Improvement
>      Security Level: Public(Viewable by anyone)
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> File system statistics in LLAP are handled as follows:
> 1. get all statistics before the task callable (contains thread local), using
> FileSystem.getAllStatistics()
> 2. if more than one entry belongs to the same scheme, they are merged
> 3. run the task
> 4. get all statistics again, and subtract the pre-state, considering the
> possibility of multiple Statistics objects for the same scheme
>
> This logic is currently tightly coupled with StatsRecordingThreadPool,
> which calls into LlapUtil, so it's not testable at all, and it pollutes both
> StatsRecordingThreadPool and LlapUtil, not to mention that the exposed stat
> fields (bytesRead, bytesWritten, ...) are also present in different classes,
> which makes them harder to maintain.
>
> The whole before/after logic can be moved to a separate class, which can hold
> the statistics as its state, making StatsRecordingThreadPool much cleaner.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
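
The four steps in the description (snapshot, merge by scheme, run the task, subtract the pre-state) could be sketched roughly as below. This is only an illustration of the proposed separation, not the code from the pull request: the class names FsStat and FsStatsDiff, their methods, and the Supplier-based source are all hypothetical, and FsStat is a plain stand-in for a Hadoop Statistics entry so that the example runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for one file system statistics entry (scheme plus two counters). */
final class FsStat {
  final String scheme;
  final long bytesRead;
  final long bytesWritten;

  FsStat(String scheme, long bytesRead, long bytesWritten) {
    this.scheme = scheme;
    this.bytesRead = bytesRead;
    this.bytesWritten = bytesWritten;
  }

  /** Adds the counters of another entry; used to merge entries of the same scheme. */
  FsStat plus(FsStat other) {
    return new FsStat(scheme, bytesRead + other.bytesRead, bytesWritten + other.bytesWritten);
  }

  /** Subtracts the counters of an earlier snapshot of the same scheme. */
  FsStat minus(FsStat earlier) {
    return new FsStat(scheme, bytesRead - earlier.bytesRead, bytesWritten - earlier.bytesWritten);
  }
}

/** Hypothetical helper that keeps the pre-task snapshot as its own state. */
final class FsStatsDiff {
  private final Supplier<List<FsStat>> source;
  private Map<String, FsStat> before = new HashMap<>();

  FsStatsDiff(Supplier<List<FsStat>> source) {
    this.source = source;
  }

  /** Steps 1 and 2: read all entries and merge those that share a scheme. */
  void snapshotBefore() {
    before = mergeByScheme(source.get());
  }

  /** Step 4: read and merge again, then subtract the pre-state per scheme. */
  Map<String, FsStat> diffAfter() {
    Map<String, FsStat> result = new HashMap<>();
    for (Map.Entry<String, FsStat> e : mergeByScheme(source.get()).entrySet()) {
      FsStat pre = before.get(e.getKey());
      // A scheme first seen after the task has no pre-state to subtract.
      result.put(e.getKey(), pre == null ? e.getValue() : e.getValue().minus(pre));
    }
    return result;
  }

  /** Collapses multiple entries of the same scheme into one summed entry. */
  static Map<String, FsStat> mergeByScheme(List<FsStat> stats) {
    Map<String, FsStat> merged = new HashMap<>();
    for (FsStat s : stats) {
      merged.merge(s.scheme, s, FsStat::plus);
    }
    return merged;
  }
}
```

With a helper of this shape, the thread pool would only need to call snapshotBefore() before the task (step 3) and diffAfter() once it finishes, and the merge and subtract logic could be unit tested with a fake source. In Hive the supplier would presumably adapt the result of FileSystem.getAllStatistics(), but that wiring is an assumption here.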