Hi all,

I’m exporting a table with the Hive CLI using hive -f query.hql > file.tsv. The
resulting tab-separated file won’t read in R because some of my fields seem to
contain the \t separator, and the Hive CLI is neither escaping those characters
nor putting the character (string) fields within quotes. The error I get is the
following; I include it because it describes the problem very clearly:

Error in fread(input = "/data3/webview_export.tsv", sep = "\t", header = TRUE,  :
  Expecting 54 cols, but line 520137 contains text after processing all cols.
  It is very likely that this is due to one or more fields having embedded
  sep=' ' and/or (unescaped) '\n' characters within unbalanced unescaped quotes.
  fread cannot handle such ambiguous cases and those lines may not have been
  read in as expected. Please read the section on quotes in ?fread.

Is there any clean way to instruct Hive CLI to

a) Escape special characters

b) Quote fields of type string

c) Use ^A as a separator

Another error I get when skipping this problematic 520137th line is

Read 50.9% of 4260134 rows

Error in fread(input = "/data3/webview_export.tsv", sep = "\t", header = TRUE,  :
  embedded nul in string: 'kaufmann\00400618'

Again, it looks like this character isn’t being escaped properly.

Maybe using an alternative SerDe could solve that?
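
In case it helps anyone reproduce this, here is a rough shell sketch for locating the affected rows in the export. The two function names are hypothetical helpers made up for illustration; the file path and the expected 54 columns are taken from the fread messages above. It only diagnoses the file and does not change what the Hive CLI writes.

```shell
#!/bin/sh
# Hypothetical helper: list lines whose tab-separated field count
# differs from the expected number of columns.
#   $1 = path to the TSV file, $2 = expected column count
count_bad_rows() {
  awk -F'\t' -v n="$2" 'NF != n { print NR ": " NF " fields" }' "$1"
}

# Hypothetical helper: count NUL bytes in a file by deleting every
# other byte and counting what is left.
#   $1 = path to the file
count_nul_bytes() {
  tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Example usage (path and column count assumed from the errors above):
#   count_bad_rows /data3/webview_export.tsv 54 | head
#   count_nul_bytes /data3/webview_export.tsv
```

If the first command prints line 520137 with more than 54 fields, that would be consistent with an embedded tab in one of the string columns; a non-zero result from the second would be consistent with the "embedded nul" error.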

Thanks for your help!

Thomas
