[ https://issues.apache.org/jira/browse/HIVE-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190690#comment-16190690 ]
Sergey Shelukhin commented on HIVE-17674: ----------------------------------------- Removed some TODOs, fixed a couple, and marked others that are MM gaps as such. Remaining TODOs relevant to after the merge (although those with a "?" are very low pri, or just need to be verified): {noformat} $ git diff master | grep TODO | grep "^+" | grep -E "(--|MM gap)" + // TODO [MM gap?]: by design; no-one seems to use LB tables. They will work, but not convert. + // TODO [MM gap?]: this doesn't work, however this is MR only. + // TODO [MM gap]: CTAS may currently be broken. It used to work. See the old code, and why isCtas isn't used? + // TODO [MM gap]: for IOW, we also need to count in base dir, if any + // TODO [MM gap?]: bad merge; tblDesc is no longer CreateTableDesc, but ImportTableDesc. +-- TODO: doesn't work truncate table part_mm partition(key_mm=455); +-- TODO: need to include merge+union+DP, but it's broken for now {noformat} In the following TODOs, a couple are from moved lines (i.e. they exist in master), and the rest are for existing code. In the process of working on MM tables, I saw all of the legacy code in Hive and I was horrified, but it doesn't mean I'm going to fix it all... {noformat} $ git diff master | grep TODO | grep "^+" | grep -v -E "(--|MM gap)" + // TODO: why is it stored in both table and dpCtx? + // TODO: In a follow-up to HIVE-1361, we should refactor loadDynamicPartitions + // TODO: not clear why two if conditions are different. Preserve the existing logic for now. + // TODO: not clear why two if conditions are different. Preserve the existing logic for now. + // TODO: see HIVE-14886 - removeTempOrDuplicateFiles is broken for list bucketing, + // TODO: not having aliases for path usually means some bug. Should it give up? + String lbDirSuffix = lbDirName.replace(partDirName, ""); // TODO: should it rather do a prefix? + // TODO: we should really probably throw. Keep the existing logic for now. + // TODO: why is this in text formatter?!! + // TODO: if we are not dealing with concatenate DDL, we should not create a merge+move path + // TODO: wtf?!! why is this in this method? This has nothing to do with anything. + // TODO: this code is buggy - it relies on having one file per bucket; no MM support (by design). + // TODO: should this use getPartitionDescFromPathRecursively? That's what other code uses. + // TODO: we should probably block all ACID tables here. + // TODO: the below seem like they should just be combined into partitionDesc + // TODO: could we instead get FS from path here and add normal files for every UGI? + // TODO: move these test parameters to more specific places... there's no need to have them here {noformat} > grep TODO HIVE-15212.17.patch |wc - l = 49 > ------------------------------------------ > > Key: HIVE-17674 > URL: https://issues.apache.org/jira/browse/HIVE-17674 > Project: Hive > Issue Type: Sub-task > Components: Transactions > Reporter: Eugene Koifman > Assignee: Sergey Shelukhin > > grep TODO HIVE-15212.17.patch > + // TODO The following two utility methods can be moved to AcidUtils once > no class in metastore is relying on them, > + // TODO: CTAS for MM may currently be broken. It used to work. See the old > code and why isCtas isn't used? > + // TODO: Setting these separately is a very hairy issue in certain > combinations, since we > + // TODO: this doesn't handle list bucketing properly. Does the original > exim do that? > + // TODO: support this? we only bail because it's a PITA and hardly > anyone seems to care. > + // TODO: support this? > + // TODO: support this? > + // TODO: why is it stored in both? > + // TODO: this doesn't work... the path for writer and reader mismatch > - // TODO: In a follow-up to HIVE-1361, we should refactor > loadDynamicPartitions > + // TODO: In a follow-up to HIVE-1361, we should refactor > loadDynamicPartitions > + // TODO: this particular bit will not work for MM tables, as there can > be multiple > + // TODO: not clear why two if conditions are different. Preserve the > existing logic for now. > + // TODO: not clear why two if conditions are different. Preserve the > existing logic for now. > + // TODO: not clear why two if conditions are different. Preserve the > existing logic for now. > + // TODO: not clear why two if conditions are different. Preserve the > existing logic for now. > + // TODO: do we also need to do this when creating an empty partition > from select? > + // TODO: see HIVE-14886 - removeTempOrDuplicateFiles is broken for list > bucketing, > + // TODO: for IOW, we also need to count in base dir, if any > + // TODO: HiveFileFormatUtils.getPartitionDescFromPathRecursively for > MM tables? > + // TODO: not having aliases for path usually means some bug. Should > it give up? > updateMrWork(jobConf); // TODO: refactor this in HIVE-6366 > + // TODO: this assumes both paths are qualified; which they are, > currently. > + // TODO: just like the move path, we only do one level of recursion. > + // TODO: may be broken? no LB bugs for now but if any are found. > + String lbDirSuffix = lbDirName.replace(partDirName, ""); // TODO: wtf? > + // TODO: we should really probably throw. Keep the existing logic > for now. > + // TODO: this assumes both paths are qualified; which they are, > currently. > + // TODO: we assume lbLevels is 0 here. Same as old code for non-MM. > + // TODO: we could just find directories with any MM directories > inside? > // TODO: add test case and clean it up > + // Duplicates logic in TextMetaDataFormatter TODO: wtf?!! > + // TODO: why is this in text formatter? grrr > + // TODO# noop MoveWork to avoid q file changes in HIVE-14990. Remove > (w/the flag) after merge. > + // TODO: this should never happen for mm tables. > + // TODO: if we are not dealing with concatenate DDL, we should not > create a merge+move path > + // TODO: wtf? wtf?!! why is this in this method? > + // TODO: this relies a lot on having one file per bucket. No support for > MM tables for now. > + // TODO: should this use getPartitionDescFromPathRecursively? > + // TODO: we should probably block all ACID tables here. > + @SuppressWarnings("unused") // TODO: wtf? > + // TODO: due to the master merge, tblDesc is no longer CreateTableDesc, > but ImportTableDesc > + // TODO: ReplCopyTask is completely screwed. Need to support when > it's not as screwed. > + // TODO: ReplCopyTask is completely screwed. Need to support when > it's not as screwed. > // TODO: the below seems like they should just be combined into > partitionDesc > + // TODO: could we instead get FS from path here and add normal files for > every UGI? > +-- TODO: doesn't work truncate table part_mm partition(key_mm=455); > +-- TODO: need to include merge+union+DP, but it's broken for now > + // TODO: move these test parameters to more specific places... there's no > need to have them here -- This message was sent by Atlassian JIRA (v6.4.14#64029)