[ https://issues.apache.org/jira/browse/HIVE-20917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-20917:
----------------------------------
    Labels: pull-request-available  (was: )

> OpenCSVSerde quotes all columns
> -------------------------------
>
>                 Key: HIVE-20917
>                 URL: https://issues.apache.org/jira/browse/HIVE-20917
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: nicolas paris
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The OpenCSVSerde produces a CSV with all of its columns quoted,
> regardless of their type and of whether the string columns contain a
> separator.
>
> The problem is that some readers (such as PostgreSQL) are not compatible
> with such CSV, in particular when bulk loading it through the COPY
> statement.
>
> I propose a new CsvSerde based on the Univocity parser (which is used by
> Apache Spark) and which has been described as 2 times faster than OpenCSV:
> [https://github.com/uniVocity/csv-parsers-comparison]. This new CsvSerde
> would only quote columns when needed.
>
> Regards,
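
The difference between quoting every column and quoting only when needed, as described in the issue above, can be sketched with Python's standard `csv` module. This is only an illustration of the two quoting policies, not the OpenCSVSerde or Univocity code; the helper `write_row`, the sample row, and the choice of `,` and `"` as delimiter and quote character are assumptions made for the example.

```python
import csv
import io


def write_row(row, quoting):
    """Serialize one row to a CSV line using the given quoting policy."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        quoting=quoting,
        lineterminator="\n",
    )
    writer.writerow(row)
    return buf.getvalue()


# Hypothetical sample row: an integer, a plain string, a string that
# contains the separator, and a string that contains the quote character.
row = [1, "plain", "has,comma", 'has "quote"']

# Every field is quoted, whatever its type or content.
quoted_all = write_row(row, csv.QUOTE_ALL)

# Only fields containing the separator, the quote character, or a line
# break are quoted.
quoted_minimal = write_row(row, csv.QUOTE_MINIMAL)

print(quoted_all, end="")      # "1","plain","has,comma","has ""quote"""
print(quoted_minimal, end="")  # 1,plain,"has,comma","has ""quote"""
```

With minimal quoting the integer and the plain string are written bare, while the two fields that would otherwise be ambiguous are still quoted and escaped.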