> On Aug. 18, 2016, 12:11 a.m., Sergio Pena wrote:
> > What about stopping the use of superCSV so that we can keep the 'dsv' format,
> > which could then support both single and multiple characters?
> > I don't like the use of another 'dsv2' format for multiple ones. It might
> > be confusing for users.

Sure, I can change the patch to use the "new logic" instead of superCSV. I considered this approach when I started working on this issue, but I was not sure which would be preferable: leaving the existing dsv format unchanged and creating a new one, or changing the existing one so that it no longer uses superCSV.

What exactly do you mean by "stop using superCSV"? Only for the dsv outputformat (tsv2 and csv2 would still use it), or removing it from the project completely?

- Marta


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50896/#review146056
-----------------------------------------------------------


On Aug. 17, 2016, 2:14 p.m., Marta Kuczora wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50896/
> -----------------------------------------------------------
>
> (Updated Aug. 17, 2016, 2:14 p.m.)
>
>
> Review request for hive, Naveen Gangam, Sergio Pena, Szehon Ho, and Xuefu
> Zhang.
>
>
> Bugs: HIVE-14404
>     https://issues.apache.org/jira/browse/HIVE-14404
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Introduced a new outputformat (dsv2) which supports multiple characters as
> delimiter.
> For generating the dsv, csv2 and tsv2 outputformats, the Super CSV library is
> used. This library doesn’t support multiple characters as delimiter. Since
> the same logic is used for generating csv2, tsv2 and dsv outputformats, I
> decided not to change this logic, rather introduce a new outputformat (dsv2)
> which supports multiple characters as delimiter.
> The new dsv2 outputformat has the same escaping logic as the dsv outputformat
> if the quoting is not disabled.
> Extended the TestBeeLineWithArgs tests with new test steps which are using
> multiple characters as delimiter.
>
> Main changes in the code:
> - Changed the SeparatedValuesOutputFormat class to be an abstract class and
> created two new child classes to separate the logic for single-character and
> multi-character delimiters: SingleCharSeparatedValuesOutputFormat and
> MultiCharSeparatedValuesOutputFormat
>
> - Kept the methods which are used by both children in the
> SeparatedValuesOutputFormat and moved the methods specific to the
> single-character case to the SingleCharSeparatedValuesOutputFormat class.
>
> - Didn’t change the logic which was in the SeparatedValuesOutputFormat, only
> moved some parts to the child class.
>
> - Implemented the value escaping and concatenation with the delimiter string
> in the MultiCharSeparatedValuesOutputFormat.
>
>
> Diffs
> -----
>
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java e0fa032
>   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java e6e24b1
>   beeline/src/java/org/apache/hive/beeline/MultiCharSeparatedValuesOutputFormat.java PRE-CREATION
>   beeline/src/java/org/apache/hive/beeline/SeparatedValuesOutputFormat.java 66d9fd0
>   beeline/src/java/org/apache/hive/beeline/SingleCharSeparatedValuesOutputFormat.java PRE-CREATION
>   beeline/src/main/resources/BeeLine.properties 95b8fa1
>   itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java 892c733
>
> Diff: https://reviews.apache.org/r/50896/diff/
>
>
> Testing
> -------
>
> - Tested manually in BeeLine.
> - Extended the TestBeeLineWithArgs tests with new test steps which are using
> multiple characters as delimiter.
>
>
> Thanks,
>
> Marta Kuczora
>
>
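
For readers following the thread, here is a minimal sketch of what "value escaping and concatenation with the delimiter string" could look like without Super CSV. It is not the code from the patch: the class name MultiCharDelimiterSketch and the methods escape and formatRow are hypothetical, and the CSV-style quoting rule (quote a value that contains the delimiter, a quote character or a line break, and double any embedded quotes) is an assumption about the escaping logic, which the description above does not spell out. A non-empty delimiter is also assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the MultiCharSeparatedValuesOutputFormat from the patch.
public class MultiCharDelimiterSketch {
    private static final char QUOTE = '"';

    // Assumed CSV-style rule: quote a value only if it contains the delimiter,
    // a quote character, or a line break; embedded quotes are doubled.
    static String escape(String value, String delimiter) {
        if (value == null) {
            return "";
        }
        boolean needsQuoting = value.contains(delimiter)
                || value.indexOf(QUOTE) >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return value;
        }
        String doubled = value.replace("\"", "\"\"");
        return QUOTE + doubled + QUOTE;
    }

    // Escape each value, then join the row with the (possibly multi-character) delimiter.
    static String formatRow(List<String> values, String delimiter) {
        List<String> escaped = new ArrayList<>();
        for (String value : values) {
            escaped.add(escape(value, delimiter));
        }
        return String.join(delimiter, escaped);
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("1", "a||b", "say \"hi\"");
        System.out.println(formatRow(row, "||"));
    }
}
```

Because the delimiter is handled as a plain String, the same code path would cover a single-character delimiter as well, which is the property the question about keeping one 'dsv' format depends on.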