Hi, There is a storage handler that allows a table in Hive created to load data into Google BigQuery table. There is relevant GitHub link here
GoogleCloudDataproc/hive-bigquery-storage-handler: Hive Storage Handler for interoperability between BigQuery and Apache Hive (github.com) <https://github.com/GoogleCloudDataproc/hive-bigquery-storage-handler> Ok so basically create a table in Hive with available jar file as below DROP TABLE IF EXISTS test.testmebq; CREATE TABLE test.testmebq( ID BIGINT ) STORED BY 'com.google.cloud.hadoop.io.bigquery.hive.HiveBigQueryStorageHandler' TBLPROPERTIES ( 'mapred.bq.input.dataset.id'='test', 'mapred.bq.input.table.id'='testme', 'mapred.bq.project.id'='xxx', 'mapred.bq.temp.gcs.path'='gs://tmp_storage_bucket/tmp', 'mapred.bq.gcs.bucket'='etcbucket/hive/test' ) ; Very simple table with one column called ID In Google BigQuery I have also created a table called testme in test dataset. The specific jar file added to hive is ADD JAR hive-bigquery-storage-handler-1.0-shaded.jar; as per document in that link Now if I start by adding the row in igQuery table test.testme, I can see the row in Hive table test.tesmebq insert into test.testme values(99) and let us check that Hive table (running Hive in DEBUG mode to console) hive> select * from test.testmebq; 2022-03-08 15:49:16,846 INFO [main] conf.HiveConf: Using the default value passed in for log id: ec1f7b38-7286-481a-b48b-44f7ed54c20a 2022-03-08 15:49:16,846 INFO [main] session.SessionState: Updating thread name to ec1f7b38-7286-481a-b48b-44f7ed54c20a main 2022-03-08 15:49:16,846 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Compiling command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b): select * from test.testmebq 2022-03-08 15:49:16,892 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Starting Semantic Analysis 2022-03-08 15:49:16,892 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Completed phase 1 of Semantic Analysis 2022-03-08 15:49:16,892 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for source tables 2022-03-08 15:49:16,909 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for subqueries 2022-03-08 15:49:16,909 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for destination tables 2022-03-08 15:49:16,913 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Completed getting MetaData in Semantic Analysis 2022-03-08 15:49:16,914 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == null 2022-03-08 15:49:16,969 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for source tables 2022-03-08 15:49:16,986 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for subqueries 2022-03-08 15:49:16,986 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for destination tables 2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == null 2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] common.FileUtils: Creating directory if it doesn't exist: file:/tmp/hive/ec1f7b38-7286-481a-b48b-44f7ed54c20a/hive_2022-03-08_15-49-16_847_7187167996261551164-1/-mr-10001/.hive-staging_hive_2022-03-08_15-49-16_847_7187167996261551164-1 2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: CBO Succeeded; optimized logical plan. 2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ppd.OpProcFactory: Processing for FS(2) 2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ppd.OpProcFactory: Processing for SEL(1) 2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ppd.OpProcFactory: Processing for TS(0) 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.HiveBigQueryStorageHandler: Configuring MapReduce Job Input Properties (if not provided) from Hive Table properties 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Completed plan generation 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Not eligible for results caching - no mr/tez/spark jobs 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Semantic Analysis Completed (retrial = false) 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testmebq.id, type:bigint, comment:null)], properties:null) 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == null 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.TableScanOperator: Initializing operator TS[0] 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.SelectOperator: Initializing operator SEL[1] 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.SelectOperator: SELECT struct<id:bigint> 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.ListSinkOperator: Initializing operator LIST_SINK[3] 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Completed compiling command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b); Time taken: 0.143 seconds 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] reexec.ReExecDriver: Execution #1 of query 2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbTxnManager: Setting lock request transaction to txnid:18396 for queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b 2022-03-08 15:49:16,990 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbLockManager: Requesting: queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE, dbname:test, tablename:testmebq, operationType:SELECT, isTransactional:false)], txnid:18396, user:hduser, hostname:rhes76, agentInfo:hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b) 2022-03-08 15:49:17,019 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbLockManager: Response to queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b LockResponse(lockid:116445, state:ACQUIRED) 2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Executing command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b): select * from test.testmebq 2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Completed executing command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b); Time taken: 0.0 seconds OK 2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: OK 2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbTxnManager: Stopped heartbeat for query: hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b 2022-03-08 15:49:17,061 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.WrappedBigQueryAvroInputFormat: Column projection:id and filter text:null and BQ Filter text: {} 2022-03-08 15:49:19,026 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.WrappedBigQueryAvroInputFormat: Split[0] = (name='projects/axial-glow-224522/locations/us/streams/EgxfT1MwU0FJMGtvUUoaAmpkGgJpcigB', schema='{ "type": "record", "name": "__root__", "fields": [ { "name": "id", "type": [ "null", "long" ] } ] }', limit='2') 2022-03-08 15:49:19,026 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.WrappedBigQueryAvroInputFormat: mapreduceInputSplit is (name='projects/axial-glow-224522/locations/us/streams/EgxfT1MwU0FJMGtvUUoaAmpkGgJpcigB', schema='{ "type": "record", "name": "__root__", "fields": [ { "name": "id", "type": [ "null", "long" ] } ] }', limit='2'), class is com.google.cloud.hadoop.io.bigquery.DirectBigQueryInputFormat$DirectBigQueryInputSplit 2022-03-08 15:49:19,434 WARN [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroDeserializer: Received different schemas. Have to re-encode: {"type":"record","name":"__root__","fields":[{"name":"id","type":["null","long"]}]} SIZE{null=org.apache.hadoop.hive.serde2.avro.AvroDeserializer$SchemaReEncoder@3077e4aa} ID null 2022-03-08 15:49:19,692 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.TableScanOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_TS_0:1, 2022-03-08 15:49:19,692 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.SelectOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_SEL_1:1, 2022-03-08 15:49:19,692 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.ListSinkOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_LIST_SINK_3:1, 99 Time taken: 0.215 seconds, Fetched: 1 row(s) OK I can see that id = 99 there Now let me insert a row to Hive table > insert into test.testmebq values(11); exec.MapOperator: Initializing operator MAP[0] 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] mr.ExecMapper: <MAP>Id =0 <Children>null <\Children> <Parent><\Parent> <\MAP> 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.TableScanOperator: Initializing operator TS[0] 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: Initializing operator SEL[1] 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: SELECT null 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.UDTFOperator: Initializing operator UDTF[2] 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: Initializing operator SEL[3] 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: SELECT struct<col1:int> 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Initializing operator FS[5] 2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == {"type":"record","name":"testmebq","namespace":"test","fields":[{"name":"id","type":["null","long"],"default":null}]} 2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Using serializer : org.apache.hadoop.hive.serde2.avro.AvroSerDe@619f4105 and formatter : com.google.cloud.hadoop.io.bigquery.hive.WrappedBigQueryAvroOutputFormat@2274f95f 2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] exec.Utilities: PLAN PATH = file:/tmp/hive/ec1f7b38-7286-481a-b48b-44f7ed54c20a/hive_2022-03-08_15-51-55_732_3190209878982735306-1/-mr-10002/160d3b10-e96a-4b60-bbf6-f2fbdef3936d/map.xml 2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: New Final Path: FS hdfs://rhes75:9000/user/hive/warehouse/test.db/testmebq 2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] hive.WrappedBigQueryAvroOutputFormat: Set temporary output file to test_testmebq_a85ef2ba8644 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[5]: records written - 1 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: MAP[0]: records read - 1 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: DESERIALIZE_ERRORS:0, RECORDS_OUT_INTERMEDIATE:0, RECORDS_IN:3, RECORDS_OUT_OPERATOR_MAP_0:0, 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.TableScanOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_TS_0:1, 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_SEL_1:1, 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.UDTFOperator: RECORDS_OUT_OPERATOR_UDTF_2:1, RECORDS_OUT_INTERMEDIATE:0, 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: RECORDS_OUT_OPERATOR_SEL_3:1, RECORDS_OUT_INTERMEDIATE:0, 2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[5]: records written - 1 2022-03-08 15:51:57,272 WARN [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 2022-03-08 15:51:57,272 Stage-2 map = 0%, reduce = 0% 2022-03-08 15:51:57,272 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.Task: 2022-03-08 15:51:57,272 Stage-2 map = 0%, reduce = 0% 2022-03-08 15:51:57,646 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: RECORDS_OUT_1_test.testmebq:1, RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_FS_5:1, 2022-03-08 15:51:57,647 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner: 2022-03-08 15:51:57,647 INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task:attempt_local1263701903_0004_m_000000_0 is done. And is in the process of committing 2022-03-08 15:51:57,649 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner: map 2022-03-08 15:51:57,649 INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task 'attempt_local1263701903_0004_m_000000_0' done. 2022-03-08 15:51:57,649 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner: Finishing task: attempt_local1263701903_0004_m_000000_0 2022-03-08 15:51:57,649 INFO [Thread-385] mapred.LocalJobRunner: map task executor complete. 2022-03-08 15:51:58,276 Stage-2 map = 100%, reduce = 0% 2022-03-08 15:51:58,276 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.Task: 2022-03-08 15:51:58,276 Stage-2 map = 100%, reduce = 0% Ended Job = job_local1263701903_0004 2022-03-08 15:51:58,276 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.Task: Ended Job = job_local1263701903_0004 MapReduce Jobs Launched: 2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: MapReduce Jobs Launched: Stage-Stage-2: HDFS Read: 1030549 HDFS Write: 0 SUCCESS 2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Stage-Stage-2: HDFS Read: 1030549 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec 2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Total MapReduce CPU Time Spent: 0 msec 2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Completed executing command(queryId=hduser_20220308155155_9c3bc173-6893-4a5b-a061-226e2ac9b42a); Time taken: 2.297 seconds OK 2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: OK 2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbTxnManager: Stopped heartbeat for query: hduser_20220308155155_9c3bc173-6893-4a5b-a061-226e2ac9b42a Time taken: 2.558 seconds 2022-03-08 15:51:58,296 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] CliDriver: Time taken: 2.558 seconds 2022-03-08 15:51:58,296 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] conf.HiveConf: Using the default value passed in for log id: ec1f7b38-7286-481a- So it shows OK and no errors But the record is NOT added to the local Hive table! So in summary we can read rows added to the BigQuery table through Hive but cannot add any rows to Hive table itself? view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> https://en.everybodywiki.com/Mich_Talebzadeh *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.