> On March 14, 2017, 9:35 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> > Line 219 (original), 223-228 (patched)
> > <https://reviews.apache.org/r/57503/diff/1/?file=1661202#file1661202line225>
> >
> >     Does this mean that if a user attempts to repair a table with more than
> >     500_000 partitions, the MSCK will fail?
> >
> >     If so, I think we're having the same problem as before. Users won't be
> >     able to discover all partitions from a table with more than 500_000.
> >
> >     Did we have problems before the patch which added this regression
> >     issue? If not, should we use PartitionIterable with unlimited number of
> >     partitions instead? The number of HMS transactions due to PartitionIterable
> >     shouldn't be a problem if the user increases the batch size. Also, as
> >     Vihang mentioned, we're just storing two values (partition name + table
> >     name), so that consumes less memory than using the hive.getPartition()
> >     method call.

This would limit the number of results the metastore check can return, so it would fail if more than 500_000 partitions are missing from the FS or from the MS.

I added this check because you expressed concerns in the Jira about an OOM situation when there are millions of partitions (fetched in batches) and none of them exist on the filesystem. If you have changed your mind and think this is not a great concern (because, as Vihang pointed out, these are only string pairs), then we should go with the previous version of my patch.


- Barna Zsombor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57503/#review168942
-----------------------------------------------------------


On March 10, 2017, 10:36 a.m., Barna Zsombor Klara wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57503/
> -----------------------------------------------------------
>
> (Updated March 10, 2017, 10:36 a.m.)
>
>
> Review request for hive, Peter Vary, Sergio Pena, Sahil Takiar, and Vihang Karajgaonkar.
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-16024: MSCK Repair Requires nonstrict hive.mapred.mode
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/common/FixedSizeCollection.java PRE-CREATION
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a479deb7c0c6b779277f1029009b7dfab6dcb9e3
>   common/src/test/org/apache/hadoop/hive/common/TestFixedSizeCollection.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 6805c17a116f5ef0febd36c59d454fa631ae0024
>   ql/src/test/queries/clientnegative/msck_repair_4.q PRE-CREATION
>   ql/src/test/queries/clientpositive/msck_repair_0.q ce8ef426a2a58845afc8333259d66725db416584
>   ql/src/test/results/clientnegative/msck_repair_4.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/msck_repair_0.q.out 3f2fe75b194f1248bd5c073dd7db6b71b2ffc2ba
>
>
> Diff: https://reviews.apache.org/r/57503/diff/1/
>
>
> Testing
> -------
>
> Tested locally and added qtests/unit tests.
>
>
> Thanks,
>
> Barna Zsombor Klara
>
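
For readers following the thread, the kind of cap being discussed can be sketched roughly as below. This is only an illustration under assumptions: `BoundedResultSet` is a hypothetical name and is not the `FixedSizeCollection` class from the patch, whose actual API is not shown in this thread. The sketch stores distinct entries (for example "table:partition" strings) and fails as soon as a new entry would exceed the configured limit.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not the FixedSizeCollection from the patch.
final class BoundedResultSet<T> {
    private final int maxSize;
    private final Set<T> items = new LinkedHashSet<>();

    BoundedResultSet(int maxSize) {
        if (maxSize < 0) {
            throw new IllegalArgumentException("maxSize must be non-negative");
        }
        this.maxSize = maxSize;
    }

    // Adds an item. Duplicates are ignored; a new distinct item that would
    // push the set past maxSize causes an IllegalStateException.
    boolean add(T item) {
        if (items.contains(item)) {
            return false;
        }
        if (items.size() >= maxSize) {
            throw new IllegalStateException(
                "Result limit of " + maxSize + " entries exceeded");
        }
        return items.add(item);
    }

    int size() {
        return items.size();
    }

    // Read-only view of the collected entries, in insertion order.
    Set<T> view() {
        return Collections.unmodifiableSet(items);
    }
}
```

With a limit of 500_000, a check that finds more mismatched partitions than that would abort instead of returning a partial result. That bounds memory use, but it is also the behaviour questioned in the review comment above: the alternative is to drop the cap and rely on batched fetching alone.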