> On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterable.java > > Lines 8 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725290#file1725290line8> > > > > it seems to me that this "iterable" is not a real iterable; as it can't > > restart the iteration
yes, you are rigth, I will remove this and use a simpler and not misleading interface. > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java > > Lines 33 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725291#file1725291line33> > > > > If committed is true - but this commitTransaction returns false... > > > > I think committed should stay true in that case I don't know. Why do you think that? I will check it. In fact I think here I didn't changed the original logic, see https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L6903-L6906 > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java > > Lines 42 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725291#file1725291line42> > > > > nextValue belongs to the 'current' transaction; I think this code may > > probably miss the last entry's changes in every transactional block My intetion was to commit the last change in the last hasNext() which returns false. And all the testcases of TestBlockRetrieverIterable passes, so I think this should be okay. > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java > > Lines 53 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725291#file1725291line53> > > > > I'm not sure...but I've seen some silent mode somewhere in the > > options... :) yes, you are right. I don't know what is the good approach here. I don't like processes that running for hours (days?) without any progress info. Probably that '\r' type things should be used... The silentMode optione is a bit different, with silentMode I would like to avoid OOM by not storing (and logging) all the updateRecords and badRecords, maybe silentMode is not a good name. > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java > > Lines 1 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725292#file1725292line1> > > > > could you add asf headers to the new files? sure > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java > > Lines 16 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725292#file1725292line16> > > > > RetrieverIterable's iterablity is only used here for a 'for' loop...I > > think the interface and the iterable just makes it a bit more > > trickier...standard hasNext() would be simpler Yes, I will use a simpler and clearer interface. > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java > > Lines 25 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725292#file1725292line25> > > > > if an error have happend..an exception is going out at this point...why > > should it be marked on the locationupdater? I mean...the exception already > > describes the situation > > > > It seems like errorHappened only controls a log message - or I might > > have missed something Yes, it controls only the log message, to inform the user whether the update in metastore was completely succesful or not. But I'm not sure that the only way that things can go wrong are handled by exceptions, ObjectSore.commitTransaction can return false and that's also probably indicates something wrong. Anyway, yes, I have to check whether error handling is right or not. I will get back to this problem after next next patch. And this is one more argument against using iterator instead of visitor... > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/EntityUpdaterTest.java > > Lines 1 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725304#file1725304line1> > > > > I think tests ending with 'Test' are also executed...however there is a > > note in the contribution guide about starting the name with Test :) > > > > https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-JavaUnitTest In fact I'm afraid with these naming these tests weren't executed. :) I will rename those. > On May 25, 2017, 11:51 p.m., Zoltan Haindrich wrote: > > metastore/src/test/org/apache/hadoop/hive/metastore/tools/HiveMetaToolTest.java > > Lines 124 (patched) > > <https://reviews.apache.org/r/59408/diff/2/?file=1725309#file1725309line124> > > > > there might be an alternative to this check by using Set<String>'s > > instead of the List...and use assertEquals Probably... My intention was to keep the test as readable (short?) as possible. On May 25, 2017, 11:51 p.m., Zsolt Fekete wrote: > > I'm not sure...but it might probably be less trickier to try with a > > dataProvider.runVisit(myLocationUpdater) pattern, this might be probably > > fit as objectStore.run??Visit() or not...and by doing it that way you may > > probably be able control the dry-run behaviour by just not committing the > > changes in the visit runner... > > > > I think it would be ok to run a dry-run prior to executing the real > > thing...I know it's double work; but it may probably reduce the chance of > > data corruption.. Yes, this inside-out logic with this iterator is not to reader-firendly. With the runVisit() pattern it might be much better. I will give it a try to refactor that way - maybe not in the next patch, first I would like to fix other problems... - Zsolt ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59408/#review176142 ----------------------------------------------------------- On May 19, 2017, 5:05 p.m., Zsolt Fekete wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59408/ > ----------------------------------------------------------- > > (Updated May 19, 2017, 5:05 p.m.) > > > Review request for hive. > > > Repository: hive-git > > > Description > ------- > > Currently HiveMetaTool reads full tables (as DataNucleus entities) into > memory by calling PersistenceManager's retrieveAll(). > > See these methods of ObjectStore: updateMDatabaseURI, updateTblPropURI, > updateMStorageDescriptorTblPropURI, updateMStorageDescriptorTblURI, > updateSerdeURI. > > This might cause failure when the affected tables (SDS, DBS, TABLE_PARAMS, > SD_PARAMS, SERDES, SERDE_PARAMS) are too big. > > > Diffs > ----- > > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > b28983f > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterable.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/IDataProvider.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/LocationEntity.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/LocationEntityImplementations.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/LocationUpdater.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/RetrieverIterable.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/ReturnValue.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/UpdateParams.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/metatool/UriUpdateChecker.java > PRE-CREATION > metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java > 22e246f > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterableTest.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/DataProviderStub.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/EntityUpdaterTest.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/LocationEntityImplementationsTest.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/LocationUpdaterTest.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/ReturnValueTest.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/metatool/UriUpdateCheckerTest.java > PRE-CREATION > > metastore/src/test/org/apache/hadoop/hive/metastore/tools/HiveMetaToolTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/59408/diff/2/ > > > Testing > ------- > > The new tests passed: > mvn test > -Dtest=BlockRetrieverIterableTest,EntityUpdaterTest,UriUpdateCheckerTest,LocationUpdaterTest,ReturnValueTest,LocationEntityImplementationsTest,IntegrationTest > > > Thanks, > > Zsolt Fekete > >