> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 66-73 (original)
> > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line66>
> >
> >     where did these methods gone?
> 
> Peter Vary wrote:
>     They are not used, and I did not foresee any usage of them, since the 
> QFile runs the diff itself, but I can be convinced :)

You got those back :)


> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 78 (patched)
> > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line87>
> >
> >     I've not yet clearly understand the goal here...but this "take every 
> > second element" seems kinda odd..
> >     
> >     would the following query's output correct?
> >     echo -e "select 'a\x00b';\nselect 1;" > asd.q
> >     
> >     anyway...I think it would be better to separate the different beeline 
> > outputs somehow instead of trying to workaround it externally...
> 
> Peter Vary wrote:
>     Hi Zoltan,
>     
>     "take every second element" is *even* not *odd* :) :)
>     
>     Back to the topic:
>     I think, the final goal for the BeeLine tests would be the following:
>     - Run them against a real cluster
>     - Run them parallel to decrease the runtime
>     - Use the same .q and .q.out files as the CLI tests.
>     
>     Do we agree in the above points? Do you have any extra points we should 
> consider? If you have any thoughts, I would be happy to add!
>     
>     The current .q.out files contain data from 2 sources:
>     - Server side - Data from the Pre/PostExecuteHooks
>     - Client side - Query result data
>     
>     Since the data is from 2 sources, it could be only collected at the 
> client side effectively. I considered the following options:
>     1. Generate only the required logs. For example add new 
> OperationLog.LoggingLevel, and modify LogDivertAppender, so only the relevant 
> logs are printed on server side
>     2. Generate the full log, and cut it down to match the expected output
>     
>     I opted for the 2nd one, since the q.out result is very slim and if we 
> have any problem then we will have to dig into the cluster hive.log to find 
> any indication what caused the failure. This can be a real pain when multiple 
> tests are running in parallel. To avoid this problem, in the 2nd option we 
> have a very detailed log for every qfile test runs specifically for the test 
> (.q.out.raw), and we cut it down to the expected detail level (.q.out).
>     
>     Since these are only for tests, I think we can live with compromise where 
> there are restrictions on the .q and .q.out files which can be used with the 
> driver. If we choose this solution we can enhance it by:
>     - Checking the .q and the .q.out file not to contain the bounding 
> character
>     - When filtering we can check that the number of bounding characters are 
> even as expected
>     - Use a little more complicated bounding sequence to lower the likelihood 
> of collision
>     
>     What do you think?
>     Even after these enhancements, would you vote for using minimal logging, 
> or we could accept the restrictions above?
>     
>     Thanks for the review,
>     Appreciate that you take time to look through these changes!
>     
>     Peter

Reworked the patch to use the proposed solution.
I like this new one much better. And you? Any further suggestions?


> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 79 (patched)
> > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line88>
> >
> >     i went around a few times...and I still don't see the point of 
> > inserting all these BOUNDING_CHARs...this line effectively discards all the 
> > extra information BOUNDING_CHAR may have added earlier (but it kills every 
> > second bucket; which content might be something...)
> >     
> >     I'm not sure...but I think that you might be able to install some 
> > streamdistributors or some other usefull gadget in SessionState.LogHelper - 
> > and possibly achieve something similar; without having to rely on a special 
> > separator...
> 
> Peter Vary wrote:
>     See above. I think I understand your concern, and it is the same issue as 
> the comment above.

Removed boundig chars alltogether


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57693/#review169228
-----------------------------------------------------------


On April 7, 2017, 9:25 a.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57693/
> -----------------------------------------------------------
> 
> (Updated April 7, 2017, 9:25 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Marta Kuczora, Miklos Csanady, 
> Naveen Gangam, Vihang Karajgaonkar, and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-16146
>     https://issues.apache.org/jira/browse/HIVE-16146
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The goal was to generate '\0' markers around the raw log items we want to 
> keep in the golden files.
> 
> To archive this I had to do a small functional change and some small refactor:
> - Removed the immutability of the format map in BeeLine, so the test could 
> add the QFile specific OutputFromat as a possible format
> - The PostExecutePrinter and the PreExecutePrinter got a common ancestor, 
> which was due anyway because PostExecutePrinter reused static methods from 
> PreExecutePrinter. This way I was able to create QFile specific printers 
> which are generating the desired markers.
> - Moved the QFile test to the org.apache.hive.beeline package, so the test 
> classes can use the package private classes and methods
> 
> For one reason or other BeeLine added an extra space character at the end of 
> the lines for multiline commands - I have removed this space - Will see if 
> this effects any unit test or not.
> 
> With the above mentioned *OutputFromat*, *QFilePreExecutePrinter*, 
> *QFilePostExecutePrinter* we can mark the lines which are needed in the q.out 
> file, and during the filtering we can remove the unneeded  parts - I prefer 
> to keep the log level high in the raw files, so in case of a test failure we 
> can have better understanding of what has happened.
> 
> In the test files:
> - Updated the beforeExecute, and afterExecute methods to set the new 
> outputformat and the new hooks
> - Removed the query specific filters, since they are non exitstent in the CLI 
> tests
> - Simplified the static filterset - currently only contains the filters which 
> are really neccessary for the actual tests - might grow to the same size than 
> in the QTestUtils - but if we do not want to run all of the test we would 
> like to keep this list as small as possible
> - Removed unnecessary configurations from QFileBuilder which will be not 
> needed in case we want to mimic the CLI results
> 
> 
> Diffs
> -----
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 11526a7 
>   beeline/src/java/org/apache/hive/beeline/Commands.java 2578728 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7d4a6a0 
>   itests/src/test/resources/testconfiguration.properties 7a70c9c 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java
>  0d63f5d 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java ae5a349 
>   
> itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java
>  760fde6 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java 
> fcd50ec 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java a5c0fcd 
>   ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppenderForTest.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/OperationLog.java c37a633 
>   ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out 
> 385f9b7 
>   ql/src/test/results/clientpositive/beeline/escape_comments.q.out abc0fee 
>   ql/src/test/results/clientpositive/beeline/select_dummy_source.q.out 
> PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> f62ee4e 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 418f453 
> 
> 
> Diff: https://reviews.apache.org/r/57693/diff/2/
> 
> 
> Testing
> -------
> 
> Added a new simple query file from CLI driver, and checked that the generated 
> output is the same
> 
> 
> Thanks,
> 
> Peter Vary
> 
>

Reply via email to