> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 66-73 (original)
> > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line66>
> >
> >     where did these methods gone?

They are not used, and I did not foresee any usage of them, since the QFile 
runs the diff itself, but I can be convinced :)


> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 78 (patched)
> > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line87>
> >
> >     I've not yet clearly understand the goal here...but this "take every 
> > second element" seems kinda odd..
> >     
> >     would the following query's output correct?
> >     echo -e "select 'a\x00b';\nselect 1;" > asd.q
> >     
> >     anyway...I think it would be better to separate the different beeline 
> > outputs somehow instead of trying to workaround it externally...

Hi Zoltan,

"take every second element" is *even* not *odd* :) :)

Back to the topic:
I think, the final goal for the BeeLine tests would be the following:
- Run them against a real cluster
- Run them parallel to decrease the runtime
- Use the same .q and .q.out files as the CLI tests.

Do we agree in the above points? Do you have any extra points we should 
consider? If you have any thoughts, I would be happy to add!

The current .q.out files contain data from 2 sources:
- Server side - Data from the Pre/PostExecuteHooks
- Client side - Query result data

Since the data is from 2 sources, it could be only collected at the client side 
effectively. I considered the following options:
1. Generate only the required logs. For example add new 
OperationLog.LoggingLevel, and modify LogDivertAppender, so only the relevant 
logs are printed on server side
2. Generate the full log, and cut it down to match the expected output

I opted for the 2nd one, since the q.out result is very slim and if we have any 
problem then we will have to dig into the cluster hive.log to find any 
indication what caused the failure. This can be a real pain when multiple tests 
are running in parallel. To avoid this problem, in the 2nd option we have a 
very detailed log for every qfile test runs specifically for the test 
(.q.out.raw), and we cut it down to the expected detail level (.q.out).

Since these are only for tests, I think we can live with compromise where there 
are restrictions on the .q and .q.out files which can be used with the driver. 
If we choose this solution we can enhance it by:
- Checking the .q and the .q.out file not to contain the bounding character
- When filtering we can check that the number of bounding characters are even 
as expected
- Use a little more complicated bounding sequence to lower the likelihood of 
collision

What do you think?
Even after these enhancements, would you vote for using minimal logging, or we 
could accept the restrictions above?

Thanks for the review,
Appreciate that you take time to look through these changes!

Peter


> On March 17, 2017, 12:37 a.m., Zoltan Haindrich wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 79 (patched)
> > <https://reviews.apache.org/r/57693/diff/1/?file=1666199#file1666199line88>
> >
> >     i went around a few times...and I still don't see the point of 
> > inserting all these BOUNDING_CHARs...this line effectively discards all the 
> > extra information BOUNDING_CHAR may have added earlier (but it kills every 
> > second bucket; which content might be something...)
> >     
> >     I'm not sure...but I think that you might be able to install some 
> > streamdistributors or some other usefull gadget in SessionState.LogHelper - 
> > and possibly achieve something similar; without having to rely on a special 
> > separator...

See above. I think I understand your concern, and it is the same issue as the 
comment above.


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57693/#review169228
-----------------------------------------------------------


On March 16, 2017, 5:15 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57693/
> -----------------------------------------------------------
> 
> (Updated March 16, 2017, 5:15 p.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Marta Kuczora, Miklos Csanady, 
> Naveen Gangam, Vihang Karajgaonkar, and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-16146
>     https://issues.apache.org/jira/browse/HIVE-16146
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The goal was to generate '\0' markers around the raw log items we want to 
> keep in the golden files.
> 
> To archive this I had to do a small functional change and some small refactor:
> - Removed the immutability of the format map in BeeLine, so the test could 
> add the QFile specific OutputFromat as a possible format
> - The PostExecutePrinter and the PreExecutePrinter got a common ancestor, 
> which was due anyway because PostExecutePrinter reused static methods from 
> PreExecutePrinter. This way I was able to create QFile specific printers 
> which are generating the desired markers.
> - Moved the QFile test to the org.apache.hive.beeline package, so the test 
> classes can use the package private classes and methods
> 
> For one reason or other BeeLine added an extra space character at the end of 
> the lines for multiline commands - I have removed this space - Will see if 
> this effects any unit test or not.
> 
> With the above mentioned *OutputFromat*, *QFilePreExecutePrinter*, 
> *QFilePostExecutePrinter* we can mark the lines which are needed in the q.out 
> file, and during the filtering we can remove the unneeded  parts - I prefer 
> to keep the log level high in the raw files, so in case of a test failure we 
> can have better understanding of what has happened.
> 
> In the test files:
> - Updated the beforeExecute, and afterExecute methods to set the new 
> outputformat and the new hooks
> - Removed the query specific filters, since they are non exitstent in the CLI 
> tests
> - Simplified the static filterset - currently only contains the filters which 
> are really neccessary for the actual tests - might grow to the same size than 
> in the QTestUtils - but if we do not want to run all of the test we would 
> like to keep this list as small as possible
> - Removed unnecessary configurations from QFileBuilder which will be not 
> needed in case we want to mimic the CLI results
> 
> 
> Diffs
> -----
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 3c8fccc 
>   beeline/src/java/org/apache/hive/beeline/Commands.java 99ee82c 
>   itests/src/test/resources/testconfiguration.properties 0c590c8 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java
>  acc02eb 
>   itests/util/src/main/java/org/apache/hive/beeline/QFileOutputFormat.java 
> PRE-CREATION 
>   
> itests/util/src/main/java/org/apache/hive/beeline/QFilePostExecutePrinter.java
>  PRE-CREATION 
>   
> itests/util/src/main/java/org/apache/hive/beeline/QFilePreExecutePrinter.java 
> PRE-CREATION 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java 49d6d24 
>   
> itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java
>  b6eac89 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java 
> fcd50ec 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecutePrinter.java b4fc125 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PreExecutePrinter.java 232c62d 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PrinterHook.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0732207 
>   ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out 
> d22c9ec 
>   ql/src/test/results/clientpositive/beeline/escape_comments.q.out 5f9df93 
>   ql/src/test/results/clientpositive/beeline/select_dummy_source.q.out 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/57693/diff/1/
> 
> 
> Testing
> -------
> 
> Added a new simple query file from CLI driver, and checked that the generated 
> output is the same
> 
> 
> Thanks,
> 
> Peter Vary
> 
>

Reply via email to