Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Paul Lam
Thanks for your help anyway, Jingsong & Rui. I read the JIRA description, and I’m +1 on checking the lazy initialization first. It looks like the file creation is either skipped or doesn’t block the write, and I’ve seen a bucket writing to a file that was not supposed to exist, e.g. its parent dir w

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Jingsong Li
Hi, Sorry for this. This workaround only works in Hive 2+. We can only wait for 1.11.2. Best, Jingsong On Tue, Jul 21, 2020 at 6:15 PM Rui Li wrote: > Hi Paul, > > I believe Jingsong meant try using native writer, for which the option key > is `table.exec.hive.fallback-mapred-writer` and is b

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Rui Li
Hi Paul, I believe Jingsong meant that you should try the native writer. The option key is `table.exec.hive.fallback-mapred-writer`, which is set to true by default. You can set it to false like this: tableEnv.getConfig().getConfiguration().set(HiveOptions.TABLE_EXEC_HIVE_FALLBACK_MAPRED_WRITER, false)

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Paul Lam
Hi JingSong, Thanks for your advice! But IIUC, it seems that `table.exec.hive.fallback-mapred-reader` is false by default? Moreover, explicitly setting this option might cause a serialization issue. Wonder if I’m setting it in the right way? ``` tableEnv.getConfig().getConfiguration().setStr

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Paul Lam
Hi Rui, I reproduced the error with a minimal case; the SQL is similar to `insert into hive_table_x select simple_string from kafka_table_b`. I’m pretty sure it’s not related to the table schema. I also removed all the optional properties from the Hive table DDL, and the error still happened. Best,

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Jingsong Li
Hi Paul, If your ORC table has no complex (list, map, row) types, you can try setting `table.exec.hive.fallback-mapred-writer` to false in TableConfig. The Hive sink will then use the ORC native writer, which serves as a workaround. As for this error, I think it is a bug in Hive 1.1 ORC. I will try to reproduce
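
The condition above (no complex types in the table) could be checked before switching writers. The following is only a minimal sketch in plain Java, not part of Flink or Hive: the class `NativeWriterCheck`, the method `hasComplexType`, and the idea of passing column types as DDL-style strings are all hypothetical. It also assumes that "list" in the message corresponds to the `ARRAY` keyword in Flink SQL DDL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not a Flink or Hive API.
public class NativeWriterCheck {
    // Complex type keywords from the thread (list, map, row), assuming
    // Flink SQL DDL spelling where "list" is written as ARRAY.
    private static final List<String> COMPLEX_KEYWORDS = Arrays.asList("ARRAY", "MAP", "ROW");

    /** Returns true if any column type string starts with a complex type keyword. */
    public static boolean hasComplexType(List<String> columnTypes) {
        for (String type : columnTypes) {
            String normalized = type.trim().toUpperCase(Locale.ROOT);
            for (String keyword : COMPLEX_KEYWORDS) {
                // Match "MAP<...>", "ROW(...)", or the bare keyword.
                if (normalized.equals(keyword)
                        || normalized.startsWith(keyword + "<")
                        || normalized.startsWith(keyword + "(")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> simple = Arrays.asList("STRING", "BIGINT");
        List<String> nested = Arrays.asList("STRING", "MAP<STRING, INT>");
        System.out.println(hasComplexType(simple)); // false: native writer workaround may apply
        System.out.println(hasComplexType(nested)); // true: keep the default writer
    }
}
```

This only inspects top-level type names, which is enough here because any nested complex type must sit inside a top-level `ARRAY`, `MAP`, or `ROW`.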

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Rui Li
Hey Paul, Could you please share more about your job, e.g. the schema of your Hive table, whether it's partitioned, and the table properties you've set? On Tue, Jul 21, 2020 at 4:02 PM Paul Lam wrote: > Hi, > > I'm doing a POC on Hive connectors and find that when writing orc format > Hive tabl

FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Paul Lam
Hi, I'm doing a POC on Hive connectors and found that when writing ORC-format Hive tables, the job fails with a FileNotFoundException right after ingesting data (full stack trace at the bottom of the mail). The error can be reproduced consistently in my environment, which is Hadoop 2.6.5 (CDH-5.6.0),