hive error: "Too many bytes before delimiter: 2147483648"

2019-11-29 Thread xuanhuang
Hello all, I encountered a problem while using Hive and would like to ask for your help. The specific situation is as follows. Platform: Hive on Spark. Error: java.io.IOException: Too many bytes before delimiter: 2147483648. Description: When using small files of the same format for testing, there is no problem; but

Re: Update Performance in Hive with data stored as Parquet, ORC

2019-11-29 Thread Peter Vary
Hi Shivam, There were a lot of changes around ACID with the Hive 3.0 release. I assume below that your question is about the Hive 3.x release. For performance reasons, Hive ACID v2 implements UPDATE as deleting the old row and creating a new one. See Eugene's nice presentation for the details: http

Re: ORC: duplicate record - rowid meaning ?

2019-11-29 Thread Peter Vary
Hi David, Not entirely sure what you are doing here :), my guess is that you are trying to write ACID tables outside of Hive. Am I right? What is the exact use case? There might be better solutions out there than writing the files by hand. As for your question below: Yes, the files should be or