danhuawang opened a new issue, #10443:
URL: https://github.com/apache/gravitino/issues/10443

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   The column type comes back as `"type": "unparsed"` when registering a Delta table that includes complex datatypes (array, map, struct) in Gravitino.
   
   <img width="838" height="838" alt="Image" src="https://github.com/user-attachments/assets/24b38285-09ba-4a2a-92ef-8daa8052dd3b" />
   
   ### Error message and/or stacktrace
   
   ```
   26/03/16 19:13:31 INFO SparkContext: Running Spark version 3.4.3
   26/03/16 19:13:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   26/03/16 19:13:31 INFO ResourceUtils: ==============================================================
   26/03/16 19:13:31 INFO ResourceUtils: No custom resources configured for spark.driver.
   26/03/16 19:13:31 INFO ResourceUtils: ==============================================================
   26/03/16 19:13:31 INFO SparkContext: Submitted application: Delta Table Test
   26/03/16 19:13:31 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
   26/03/16 19:13:31 INFO ResourceProfile: Limiting resource is cpu
   26/03/16 19:13:31 INFO ResourceProfileManager: Added ResourceProfile id: 0
   26/03/16 19:13:31 INFO SecurityManager: Changing view acls to: wangdanhua,hdfs
   26/03/16 19:13:31 INFO SecurityManager: Changing modify acls to: wangdanhua,hdfs
   26/03/16 19:13:31 INFO SecurityManager: Changing view acls groups to: 
   26/03/16 19:13:31 INFO SecurityManager: Changing modify acls groups to: 
   26/03/16 19:13:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: wangdanhua, hdfs; groups with view permissions: EMPTY; users with modify permissions: wangdanhua, hdfs; groups with modify permissions: EMPTY
   26/03/16 19:13:31 INFO Utils: Successfully started service 'sparkDriver' on port 53352.
   26/03/16 19:13:31 INFO SparkEnv: Registering MapOutputTracker
   26/03/16 19:13:31 INFO SparkEnv: Registering BlockManagerMaster
   26/03/16 19:13:31 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   26/03/16 19:13:31 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
   26/03/16 19:13:31 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
   26/03/16 19:13:31 INFO DiskBlockManager: Created local directory at /private/var/folders/wn/rgsz3fqj32x87q719rfmfh9r0000gn/T/blockmgr-3f6cb7f5-aea0-47b7-a050-738ff7a0a8da
   26/03/16 19:13:31 INFO MemoryStore: MemoryStore started with capacity 127.2 MiB
   26/03/16 19:13:31 INFO SparkEnv: Registering OutputCommitCoordinator
   26/03/16 19:13:31 INFO Executor: Starting executor ID driver on host wangdanhuadembp
   26/03/16 19:13:31 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
   26/03/16 19:13:31 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 53353.
   26/03/16 19:13:31 INFO NettyBlockTransferService: Server created on localhost 127.0.0.1:53353
   26/03/16 19:13:31 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
   26/03/16 19:13:31 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, localhost, 53353, None)
   26/03/16 19:13:31 INFO BlockManagerMasterEndpoint: Registering block manager localhost:53353 with 127.2 MiB RAM, BlockManagerId(driver, localhost, 53353, None)
   26/03/16 19:13:31 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, localhost, 53353, None)
   26/03/16 19:13:31 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, localhost, 53353, None)
   Spark session initialized for Delta table operations
   26/03/16 19:13:34 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
   Complex datatypes Delta table created at: /tmp/delta-datatypes/complex
     When create complex datatypes delta table with spark delta_datatype_complex in schema delta_datatype_schema catalog delta_datatype_test_catalog at location /tmp/delta-datatypes/complex # com.datastrato.test.steps.DeltaTableSteps.createComplexDatatypesDeltaTableWithSpark(java.lang.String,java.lang.String,java.lang.String,java.lang.String)
     Then verify delta table created successfully # com.datastrato.test.steps.DeltaTableSteps.verifyDeltaTableCreatedSuccessfully()
   Request method:      POST
   Request URI:         http://127.0.0.1:18090/api/metalakes/delta_test_metalake/catalogs/delta_datatype_test_catalog/schemas/delta_datatype_schema/tables
   Proxy:                       <none>
   Request params:      <none>
   Query params:        <none>
   Form params: <none>
   Path params: <none>
   Headers:             Accept=application/vnd.gravitino.v1+json
                                Authorization=Basic YW5vbnltb3VzOnRlc3Q=
                                Content-Type=application/json
   Cookies:             <none>
   Multiparts:          <none>
   Body:
   {
       "columns": [
           {
               "nullable": true,
               "name": "string_array",
               "comment": "array of strings",
               "type": "list<string>"
           },
           {
               "nullable": true,
               "name": "string_map",
               "comment": "map of string to int",
               "type": "map<string,integer>"
           },
           {
               "nullable": true,
               "name": "person_struct",
               "comment": "person struct",
               "type": "struct<name:string,age:integer>"
           }
       ],
       "name": "delta_datatype_complex",
       "comment": "Delta table with complex datatypes",
       "properties": {
           "external": "true",
           "format": "delta",
           "location": "/tmp/delta-datatypes/complex"
       }
   }
     When register complex datatypes delta table delta_datatype_complex at location /tmp/delta-datatypes/complex in schema delta_datatype_schema catalog delta_datatype_test_catalog # com.datastrato.test.steps.DeltaTableSteps.registerComplexDatatypesDeltaTable(java.lang.String,java.lang.String,java.lang.String,java.lang.String)
     Then check response code 200 message properties # com.datastrato.test.steps.MetalakeSteps.verifyResponseCodeMessage(int,java.lang.String)
     When load table delta_datatype_complex in schema delta_datatype_schema catalog delta_datatype_test_catalog # com.datastrato.test.steps.DeltaTableSteps.loadTable(java.lang.String,java.lang.String,java.lang.String)
   [DEBUG] Column 'string_array' Gravitino raw type JSON: {"type":"unparsed","unparsedType":"list<string>"}
   [DEBUG] Available Spark tables:
   +---------+---------+-----------+
   |namespace|tableName|isTemporary|
   +---------+---------+-----------+
   +---------+---------+-----------+
   
   [DEBUG] Spark schema for table at /tmp/delta-datatypes/complex:
   root
    |-- string_array: array (nullable = true)
    |    |-- element: string (containsNull = true)
    |-- string_map: map (nullable = true)
    |    |-- key: string
    |    |-- value: integer (valueContainsNull = true)
    |-- person_struct: struct (nullable = true)
    |    |-- name: string (nullable = true)
    |    |-- age: integer (nullable = true)
   
   [DEBUG] Spark column: string_array -> array<string>
   [DEBUG] >>> Target column 'string_array' Spark type: array<string> (catalogString: array<string>)
   [DEBUG] Spark column: string_map -> map<string,int>
   [DEBUG] Spark column: person_struct -> struct<name:string,age:int>
     Then verify gravitino table complex datatypes mapping: # com.datastrato.test.steps.DeltaTableSteps.verifyGravitinoTableComplexDatatypesMapping(io.cucumber.datatable.DataTable)
       | Gravitino Type                  | Delta/Spark Type        | Column Name   |
       | list<string>                    | ArrayType(StringType)   | string_array  |
       | map<string,integer>             | MapType(String,Integer) | string_map    |
       | struct<name:string,age:integer> | StructType              | person_struct |
         org.opentest4j.AssertionFailedError: Column string_array type mismatch: expected list<string>, got unparsed ==> expected: <true> but was: <false>
        at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
        at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
        at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:210)
        at com.datastrato.test.steps.DeltaTableSteps.verifyDatatypesMapping(DeltaTableSteps.java:803)
        at com.datastrato.test.steps.DeltaTableSteps.verifyGravitinoTableComplexDatatypesMapping(DeltaTableSteps.java:766)
   ```
   
   ### How to reproduce
   
   1. Create a Delta table with Spark:
   
   ```java
   String createTableSQL =
       String.format(
           "CREATE TABLE delta.`%s` ("
               + "string_array ARRAY<STRING>, "
               + "string_map MAP<STRING,INT>, "
               + "person_struct STRUCT<name:STRING,age:INT>"
               + ") USING DELTA",
           location);
   sparkSession.sql(createTableSQL);
   ```
   
   2. Register the table in Gravitino:
   
   POST {{host}}/api/metalakes/:metalake/catalogs/:catalog/schemas/:schema/tables
   ```
   {
       "columns": [
           {
               "nullable": true,
               "name": "string_array",
               "comment": "array of strings",
               "type": "list<string>"
           },
           {
               "nullable": true,
               "name": "string_map",
               "comment": "map of string to int",
               "type": "map<string,integer>"
           },
           {
               "nullable": true,
               "name": "person_struct",
               "comment": "person struct",
               "type": "struct<name:string,age:integer>"
           }
       ],
       "name": "delta_datatype_complex",
       "comment": "Delta table with complex datatypes",
       "properties": {
           "external": "true",
           "format": "delta",
           "location": "/tmp/delta-datatypes/complex"
       }
   }
   ```
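   For reference, the mapping the failing scenario expects can be sketched as a small lookup table. This is an illustrative helper only (the class and field names are hypothetical, not Gravitino APIs): after registration, loading the table should report the Spark catalog strings on the right instead of degrading to `{"type":"unparsed","unparsedType":"..."}`.
   
   ```java
   import java.util.Map;
   
   // Illustrative sketch, not Gravitino code: the Gravitino -> Spark
   // complex-type mapping that the Cucumber data table asserts.
   public class ComplexTypeMapping {
     static final Map<String, String> GRAVITINO_TO_SPARK =
         Map.of(
             "list<string>", "array<string>",
             "map<string,integer>", "map<string,int>",
             "struct<name:string,age:integer>", "struct<name:string,age:int>");
   
     public static void main(String[] args) {
       // Each Gravitino type string should round-trip to the Spark
       // catalogString rather than an "unparsed" placeholder.
       GRAVITINO_TO_SPARK.forEach((g, s) -> System.out.println(g + " -> " + s));
     }
   }
   ```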
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
