> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote: > > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java > > Lines 66-73 (original) > > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line66> > > > > where did these methods gone? > > Peter Vary wrote: > They are not used, and I did not foresee any usage of them, since the > QFile runs the diff itself, but I can be convinced :)
You got those back :) > On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote: > > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java > > Lines 78 (patched) > > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line87> > > > > I've not yet clearly understand the goal here...but this "take every > > second element" seems kinda odd.. > > > > would the following query's output correct? > > echo -e "select 'a\x00b';\nselect 1;" > asd.q > > > > anyway...I think it would be better to separate the different beeline > > outputs somehow instead of trying to workaround it externally... > > Peter Vary wrote: > Hi Zoltan, > > "take every second element" is *even* not *odd* :) :) > > Back to the topic: > I think, the final goal for the BeeLine tests would be the following: > - Run them against a real cluster > - Run them parallel to decrease the runtime > - Use the same .q and .q.out files as the CLI tests. > > Do we agree in the above points? Do you have any extra points we should > consider? If you have any thoughts, I would be happy to add! > > The current .q.out files contain data from 2 sources: > - Server side - Data from the Pre/PostExecuteHooks > - Client side - Query result data > > Since the data is from 2 sources, it could be only collected at the > client side effectively. I considered the following options: > 1. Generate only the required logs. For example add new > OperationLog.LoggingLevel, and modify LogDivertAppender, so only the relevant > logs are printed on server side > 2. Generate the full log, and cut it down to match the expected output > > I opted for the 2nd one, since the q.out result is very slim and if we > have any problem then we will have to dig into the cluster hive.log to find > any indication what caused the failure. This can be a real pain when multiple > tests are running in parallel. To avoid this problem, in the 2nd option we > have a very detailed log for every qfile test runs specifically for the test > (.q.out.raw), and we cut it down to the expected detail level (.q.out). > > Since these are only for tests, I think we can live with compromise where > there are restrictions on the .q and .q.out files which can be used with the > driver. If we choose this solution we can enhance it by: > - Checking the .q and the .q.out file not to contain the bounding > character > - When filtering we can check that the number of bounding characters are > even as expected > - Use a little more complicated bounding sequence to lower the likelihood > of collision > > What do you think? > Even after these enhancements, would you vote for using minimal logging, > or we could accept the restrictions above? > > Thanks for the review, > Appreciate that you take time to look through these changes! > > Peter Reworked the patch to use the proposed solution. I like this new one much better. And you? Any further suggestions? > On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote: > > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java > > Lines 79 (patched) > > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line88> > > > > i went around a few times...and I still don't see the point of > > inserting all these BOUNDING_CHARs...this line effectively discards all the > > extra information BOUNDING_CHAR may have added earlier (but it kills every > > second bucket; which content might be something...) > > > > I'm not sure...but I think that you might be able to install some > > streamdistributors or some other usefull gadget in SessionState.LogHelper - > > and possibly achieve something similar; without having to rely on a special > > separator... > > Peter Vary wrote: > See above. I think I understand your concern, and it is the same issue as > the comment above. Removed boundig chars alltogether - Peter ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/57693/#review169228 ----------------------------------------------------------- On April 7, 2017, 9:25 a.m., Peter Vary wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/57693/ > ----------------------------------------------------------- > > (Updated April 7, 2017, 9:25 a.m.) > > > Review request for hive, Zoltan Haindrich, Marta Kuczora, Miklos Csanady, > Naveen Gangam, Vihang Karajgaonkar, and Barna Zsombor Klara. > > > Bugs: HIVE-16146 > https://issues.apache.org/jira/browse/HIVE-16146 > > > Repository: hive-git > > > Description > ------- > > The goal was to generate '\0' markers around the raw log items we want to > keep in the golden files. > > To archive this I had to do a small functional change and some small refactor: > - Removed the immutability of the format map in BeeLine, so the test could > add the QFile specific OutputFromat as a possible format > - The PostExecutePrinter and the PreExecutePrinter got a common ancestor, > which was due anyway because PostExecutePrinter reused static methods from > PreExecutePrinter. This way I was able to create QFile specific printers > which are generating the desired markers. > - Moved the QFile test to the org.apache.hive.beeline package, so the test > classes can use the package private classes and methods > > For one reason or other BeeLine added an extra space character at the end of > the lines for multiline commands - I have removed this space - Will see if > this effects any unit test or not. > > With the above mentioned *OutputFromat*, *QFilePreExecutePrinter*, > *QFilePostExecutePrinter* we can mark the lines which are needed in the q.out > file, and during the filtering we can remove the unneeded parts - I prefer > to keep the log level high in the raw files, so in case of a test failure we > can have better understanding of what has happened. > > In the test files: > - Updated the beforeExecute, and afterExecute methods to set the new > outputformat and the new hooks > - Removed the query specific filters, since they are non exitstent in the CLI > tests > - Simplified the static filterset - currently only contains the filters which > are really neccessary for the actual tests - might grow to the same size than > in the QTestUtils - but if we do not want to run all of the test we would > like to keep this list as small as possible > - Removed unnecessary configurations from QFileBuilder which will be not > needed in case we want to mimic the CLI results > > > Diffs > ----- > > beeline/src/java/org/apache/hive/beeline/BeeLine.java 11526a7 > beeline/src/java/org/apache/hive/beeline/Commands.java 2578728 > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7d4a6a0 > itests/src/test/resources/testconfiguration.properties 7a70c9c > > itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java > 0d63f5d > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java ae5a349 > > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java > 760fde6 > itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java > fcd50ec > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java a5c0fcd > ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppenderForTest.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/session/OperationLog.java c37a633 > ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out > 385f9b7 > ql/src/test/results/clientpositive/beeline/escape_comments.q.out abc0fee > ql/src/test/results/clientpositive/beeline/select_dummy_source.q.out > PRE-CREATION > > service/src/java/org/apache/hive/service/cli/operation/OperationManager.java > f62ee4e > service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java > 418f453 > > > Diff: https://reviews.apache.org/r/57693/diff/2/ > > > Testing > ------- > > Added a new simple query file from CLI driver, and checked that the generated > output is the same > > > Thanks, > > Peter Vary > >