[ https://issues.apache.org/jira/browse/HIVE-20917?focusedWorklogId=822115&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-822115 ]
ASF GitHub Bot logged work on HIVE-20917:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 31/Oct/22 21:24
            Start Date: 31/Oct/22 21:24
    Worklog Time Spent: 10m
      Work Description: github-actions[bot] commented on PR #3718:
URL: https://github.com/apache/hive/pull/3718#issuecomment-1297703933

   # @check-spelling-bot Report

   ### :red_circle: Please review
   See the [files](3718/files/) view or the [action log](https://github.com/apache/hive/actions/runs/3364862660) for details.

   #### Unrecognized words (1)

   APPLYQUOTESTOALL
   <details><summary>Previously acknowledged words that are now absent
   </summary>aarry bytecode timestamplocal yyyy </details>
   <details><summary>To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words),
   run the following commands</summary>

   ... in a clone of the [g...@github.com:gigem/hive.git](https://github.com/gigem/hive.git) repository
   on the `HIVE-20917` branch:

   ```
   update_files() {
   perl -e '
   my @expect_files=qw('".github/actions/spelling/expect.txt"');
   @ARGV=@expect_files;
   my @stale=qw('"$patch_remove"');
   my $re=join "|", @stale;
   my $suffix=".".time();
   my $previous="";
   sub maybe_unlink { unlink($_[0]) if $_[0]; }
   while (<>) {
   if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
   next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
   }; maybe_unlink($previous);'
   perl -e '
   my $new_expect_file=".github/actions/spelling/expect.txt";
   use File::Path qw(make_path);
   use File::Basename qw(dirname);
   make_path (dirname($new_expect_file));
   open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
   my @add=qw('"$patch_add"');
   my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
   @words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
   open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
   close FILE;
   system("git", "add", $new_expect_file);
   '
   }

   comment_json=$(mktemp)
   curl -L -s -S \
     -H "Content-Type: application/json" \
     "https://api.github.com/repos/apache/hive/issues/comments/1297703933" > "$comment_json"
   comment_body=$(mktemp)
   jq -r ".body // empty" "$comment_json" > $comment_body
   rm $comment_json

   patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")

   patch_add=$(perl -e '$/=undef;
   $_=<>;
   if (m{Unrecognized words[^<]*</summary>\n*```\n*([^<]*)```\n*</details>$}m) {
   print "$1" }
   elsif (m{Unrecognized words[^<]*\n\n((?:\w.*\n)+)\n}m) {
   print "$1"
   };' < "$comment_body")

   update_files
   rm $comment_body
   git add -u
   ```
   </details>


Issue Time Tracking
-------------------

    Worklog Id:     (was: 822115)
    Time Spent: 20m  (was: 10m)

> OpenCSVSerde quotes all columns
> -------------------------------
>
>                 Key: HIVE-20917
>                 URL: https://issues.apache.org/jira/browse/HIVE-20917
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: nicolas paris
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The OpenCSVSerde produces a CSV with all of its columns quoted,
> regardless of their type and of whether the string columns contain a separator.
>
> The problem is that some readers (such as PostgreSQL) are not compatible with
> such CSV, in particular when bulk loading it through the COPY statement.
>
> I propose a new CsvSerde, based on the Univocity parser (which is used by Apache
> Spark)
> and which has been described as 2 times faster than OpenCSV:
> [https://github.com/uniVocity/csv-parsers-comparison]. This new CsvSerde
> would only quote columns when needed.
>
> Regards,

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
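
To make the difference described in the issue concrete, here is a minimal sketch that contrasts "quote every column" with "quote only when needed". It is not Hive, OpenCSV, or Univocity code: Python's standard `csv` module stands in for both behaviours, and the helper `to_csv` and the sample rows are hypothetical names chosen only for this illustration.

```python
import csv
import io


def to_csv(rows, quoting):
    """Serialize rows to a CSV string using the given csv quoting mode.

    Hypothetical helper for illustration; not part of Hive or any serde.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Sample data: an integer column, a plain string, a string containing the
# separator, a string containing a quote character, and an empty string.
rows = [[1, "plain", "has,comma"], [2, 'say "hi"', ""]]

# Behaviour reported in the issue: every column is quoted, whatever its type.
quote_all = to_csv(rows, csv.QUOTE_ALL)

# Behaviour proposed in the issue: quote only fields that require it
# (those containing the separator, the quote character, or a line break).
quote_minimal = to_csv(rows, csv.QUOTE_MINIMAL)

print(quote_all)
# "1","plain","has,comma"
# "2","say ""hi""",""

print(quote_minimal)
# 1,plain,"has,comma"
# 2,"say ""hi""",
```

In the second output the integer column and the plain strings are left bare, and only the fields containing a comma or a quote character are wrapped, which is the "quote when needed" behaviour the issue asks for.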