[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535380#comment-16535380 ]
Sergey Shelukhin commented on HIVE-19820: ----------------------------------------- Trying to write analyze pre-committing stats state makes me think it's still going to be potentially racy, esp since most DB txns in metastore use read_committed instead of a higher concurrency level so it's possible for some write to sneak in that invalidates the stats without updating write ID, and analyze won't notice it. On top of that logic is also extremely ugly and hacky and requires a new API that we then cannot remove. Thinking of alternative solutions (proper one would be https://issues.apache.org/jira/browse/HIVE-20109 but there's no time for that... maybe we can make sure that any path that sets the flag also sets write ID, and any table alteration verifies that you cannot change the stats without setting write ID (and throws otherwise, so we can catch any missing paths in tests). I also found additional path in set table stats/set partition stats that doesn't appear to be updating stats state in the same DB txn as checking stats validity, when merging stats. It may still be valid but some tests will be needed for parallel updates and reads. This and general absence of any negative tests (parallel inserts/updates/deletes, or even reads with someone updating stats in parallel; parallel insert+analyze, various ways to invalidate the stats like truncate, etc. - we only have a test for a non-parallel insert without stats collection right now for the negatives) gives me pause. I think it would be better to push it to M05 and ideally fix HIVE-20109 (or at least as per above make sure every stats state change sets write ID so we can detect parallel changes), and add more tests. Right now I feel this is too hot and way too under-tested; I doubt system tests that are not specifically targeted at this will catch any subtle bugs, or even any glaring bugs e.g. for inserts and select count running in parallel. cc [~hagleitn] [~ekoifman] [~steveyeom2017] > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > ------------------------------------------------------------------------------------------------ > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task > Reporter: Sergey Shelukhin > Assignee: Sergey Shelukhin > Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, > HIVE-19820.04-master-txnstats.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)