HariprasadAllaka1612 opened a new issue #1641:
URL: https://github.com/apache/incubator-hudi/issues/1641


   Parquet schema changes across successive writes to Hudi.
   
   With continuous writes to S3 in Hudi format, there are instances where the
schema of the Parquet files changes, and when writing/upserting to the same partition
we get a merge error. I am using the COW storage type.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Write the DataFrame multiple times to the same partition.
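   For reference, the write that triggers this looks roughly like the sketch below. The table name, key/partition fields, and base path are illustrative placeholders, not my exact production config; the option keys are the Hudi 0.5.x datasource write options.
   
   ```scala
   // Hedged sketch of the repeated upsert; field names are placeholders.
   import org.apache.spark.sql.{DataFrame, SaveMode}
   
   def writeToHudi(df: DataFrame, basePath: String): Unit = {
     df.write
       .format("org.apache.hudi")
       .option("hoodie.table.name", "login_reports")
       .option("hoodie.datasource.write.operation", "upsert")
       .option("hoodie.datasource.write.storage.type", "COPY_ON_WRITE")
       .option("hoodie.datasource.write.recordkey.field", "message_id")
       .option("hoodie.datasource.write.partitionpath.field", "partition_path")
       .option("hoodie.datasource.write.precombine.field", "ts")
       .mode(SaveMode.Append)
       .save(basePath) // e.g. s3a://bucket/reports/login
   }
   ```
   
   Calling this repeatedly against the same partition is what eventually produces the merge error below.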
   
   **Expected behavior**
   
   1. The same schema for all the Parquet files.
   
   **Environment Description**
   
   * Hudi version : 0.5.1
   
   * Spark version : 2.4.0
   
   * Hive version : 2.3.4
   
   * Hadoop version : 2.8.5
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : No
   
   
   2020-05-19 21:06:56 ERROR BoundedInMemoryExecutor:130 - error consuming 
records
   org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:496071269677683442614463247120938275415648800229692538900aa07220-d2a1-4f87-82ed-1348bf6df155
 from old file 
s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_1-213-8447_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_0-118-298_20200519210555.parquet
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be 
cast to java.lang.Number
        at 
org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at 
org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at 
org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at 
org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at 
org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   2020-05-19 21:06:59 ERROR HoodieCopyOnWriteTable:272 - Error upserting 
bucketType UPDATE for partition :1
   org.apache.hudi.exception.HoodieException: 
org.apache.hudi.exception.HoodieException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        at 
org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at 
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at 
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge 
old record into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be 
cast to java.lang.Number
        at 
org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at 
org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at 
org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at 
org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at 
org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   2020-05-19 21:06:59 ERROR Executor:91 - Exception in task 1.0 in stage 118.0 
(TID 299)
   org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType 
UPDATE for partition :1
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:273)
        at 
org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at 
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at 
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: 
org.apache.hudi.exception.HoodieException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        ... 30 more
   Caused by: org.apache.hudi.exception.HoodieException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge 
old record into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be 
cast to java.lang.Number
        at 
org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at 
org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at 
org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at 
org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at 
org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   2020-05-19 21:06:59 ERROR TaskSetManager:70 - Task 1 in stage 118.0 failed 1 
times; aborting job
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in 
stage 118.0 failed 1 times, most recent failure: Lost task 1.0 in stage 118.0 
(TID 299, localhost, executor driver): 
org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType 
UPDATE for partition :1
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:273)
        at 
org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at 
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at 
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: 
org.apache.hudi.exception.HoodieException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        ... 30 more
   Caused by: org.apache.hudi.exception.HoodieException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record 
into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge 
old record into new file for key 
message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d
 from old file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet
 to new file 
s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be 
cast to java.lang.Number
        at 
org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at 
org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at 
org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at 
org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at 
org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at 
org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   
   Driver stacktrace:
        at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1887)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1875)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1874)
        at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1874)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
        at scala.Option.foreach(Option.scala:257)
        at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2108)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2057)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2046)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
        at 
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1168)
        at 
org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:145)
        at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:91)
        at 
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
        at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
        at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
        at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
        at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
        at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
        at 
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
        at 
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
        at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
        at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
        at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
        at 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
        at 
org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
        at 
com.playngodataengg.dao.DataAccessS3.writeDataToRefinedHudiS3(DataAccessS3.scala:149)
        at 
com.playngodataengg.controller.LoginDataTransform.processData(LoginDataTransform.scala:368)
        at com.playngodataengg.action.LoginData$.main(LoginData.scala:16)
        at com.playngodataengg.action.LoginData.main(LoginData.scala)
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Error upserting 
bucketType UPDATE for partition :1
        at 
org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:273)
        at 
org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        ... 30 more
   Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49605000968507055240906198141791265848347950317232455682454c3447-193b-46df-9582-715f0ec61e4d from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_3-213-8449_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/18/f0f3b2c4-1864-4ee3-a16c-58cf1fd929f1-0_1-118-299_20200519210555.parquet
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.Number
        at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   2020-05-19 21:06:59 ERROR DataEngineering:12 - (writeDataToRefinedHudiS3) - There is an exception writing the data into data lake for login
   2020-05-19 21:06:59 ERROR HoodieCopyOnWriteTable:272 - Error upserting bucketType UPDATE for partition :3
   org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446357933851667324507459552012140546f6643052-e862-41dc-a4cc-22150ef7a240 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_0-213-8446_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_3-118-301_20200519210555.parquet
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        at org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446357933851667324507459552012140546f6643052-e862-41dc-a4cc-22150ef7a240 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_0-213-8446_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_3-118-301_20200519210555.parquet
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446357933851667324507459552012140546f6643052-e862-41dc-a4cc-22150ef7a240 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_0-213-8446_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_3-118-301_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446357933851667324507459552012140546f6643052-e862-41dc-a4cc-22150ef7a240 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_0-213-8446_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/19/4300efe2-6ae2-4474-b5a5-ad758a93afd6-0_3-118-301_20200519210555.parquet
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.Number
        at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   2020-05-19 21:07:00 ERROR HoodieCopyOnWriteTable:272 - Error upserting bucketType UPDATE for partition :2
   org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446359161842247114459510822055968770b4bc494b-e50a-4118-86d6-efe500d13270 from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-213-8448_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-118-300_20200519210555.parquet
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        at org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446359161842247114459510822055968770b4bc494b-e50a-4118-86d6-efe500d13270 from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-213-8448_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-118-300_20200519210555.parquet
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446359161842247114459510822055968770b4bc494b-e50a-4118-86d6-efe500d13270 from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-213-8448_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-118-300_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:49607126967768344261446359161842247114459510822055968770b4bc494b-e50a-4118-86d6-efe500d13270 from old file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-213-8448_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/png/2020/05/19/67705781-2f8a-4a39-969d-2256cacc2b20-0_2-118-300_20200519210555.parquet
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.Number
        at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   2020-05-19 21:07:00 ERROR HoodieCopyOnWriteTable:272 - Error upserting bucketType UPDATE for partition :0
   org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:496071269677683442614463247120938275415648800229692538900aa07220-d2a1-4f87-82ed-1348bf6df155 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_1-213-8447_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_0-118-298_20200519210555.parquet
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:208)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdate(HoodieCopyOnWriteTable.java:183)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpsertPartition(HoodieCopyOnWriteTable.java:265)
        at org.apache.hudi.HoodieWriteClient.lambda$upsertRecordsInternal$507693af$1(HoodieWriteClient.java:457)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
        at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:496071269677683442614463247120938275415648800229692538900aa07220-d2a1-4f87-82ed-1348bf6df155 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_1-213-8447_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_0-118-298_20200519210555.parquet
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:148)
        at org.apache.hudi.table.HoodieCopyOnWriteTable.handleUpdateInternal(HoodieCopyOnWriteTable.java:206)
        ... 32 more
   Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:496071269677683442614463247120938275415648800229692538900aa07220-d2a1-4f87-82ed-1348bf6df155 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_1-213-8447_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_0-118-298_20200519210555.parquet
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
        ... 33 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key message_id:496071269677683442614463247120938275415648800229692538900aa07220-d2a1-4f87-82ed-1348bf6df155 from old file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_1-213-8447_20200519162625.parquet to new file s3a://gat-datalake-refined-dev/reports/login/dat/2020/05/18/e9bc50d6-2720-46a6-8e3a-6b72e998be1e-0_0-118-298_20200519210555.parquet
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:299)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:452)
        at org.apache.hudi.table.HoodieCopyOnWriteTable$UpdateHandler.consumeOneRecord(HoodieCopyOnWriteTable.java:442)
        at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:126)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
   Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.Number
        at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:248)
        at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
        at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
        at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:103)
        at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:294)
        ... 8 more
   
   Process finished with exit code 0
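
   Every repetition of the trace above bottoms out in the same root cause: `java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.Number`. That is, a field that older Parquet files stored as a numeric type is arriving as a string (Avro `Utf8`) in a later batch, so the merge of the old file slice into the new one fails. A common remedy is to pin every incoming batch to one fixed schema, explicitly casting drift-prone columns before the upsert. The sketch below illustrates that normalization idea in plain Python with hypothetical field names (`login_count` is not from this report); it is not Hudi's API, just the concept.

   ```python
   # Sketch: coerce each incoming record to a fixed target schema before
   # writing, so a column that drifted to string ("42") is cast back to the
   # numeric type the existing files use. Field names are hypothetical.
   TARGET_SCHEMA = {"message_id": str, "login_count": int}

   def normalize(record):
       """Return a copy of `record` with every field cast to its target type."""
       out = {}
       for field, target_type in TARGET_SCHEMA.items():
           value = record.get(field)
           # Cast explicitly rather than letting the writer infer a string
           # type for a field that older files stored as a number.
           out[field] = None if value is None else target_type(value)
       return out

   # A batch where login_count drifted from int to string:
   batch = [{"message_id": "abc-123", "login_count": "42"}]
   fixed = [normalize(r) for r in batch]
   ```

   In a Spark job the equivalent pinning would be an explicit `df.withColumn("login_count", col("login_count").cast("long"))` (a real DataFrame API; the column name is again hypothetical) applied before `df.write.format("hudi")`, so every commit presents the same Avro schema to the merge handle.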
   
   
   

