jerryshao opened a new pull request, #10451:
URL: https://github.com/apache/gravitino/pull/10451

   ### What changes were proposed in this pull request?
   
   - Fixed `batchSelectTableByIdentifier` in `TableMetaBaseSQLProvider` to 
include a LEFT JOIN with `table_version_info`, selecting all version-info 
fields (`format`, `properties`, `partitioning`, `sort_orders`, `distribution`, 
`indexes`, `comment`). Previously these fields were omitted, causing incomplete 
`TableEntity` objects to be stored in the entity cache.
   - Fixed `batchSelectJobByIdentifier` in `JobMetaBaseSQLProvider` to include 
`jtm.job_template_name` and proper camelCase column aliases required for 
MyBatis field mapping. Added a comment noting this method is currently unused 
by the service layer.
   - Added regression tests in `TestTableMetaService` verifying that all 
version-info fields are returned and documenting that batch results 
intentionally omit columns.
   - Added a regression test in `TestJobMetaService` verifying 
`job_template_name` is returned correctly.
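
The effect of the added LEFT JOIN can be sketched in plain Java. This is an illustration only, not the SQL in `TableMetaBaseSQLProvider`: the class `LeftJoinSketch`, the method `leftJoinFormat`, and the use of numeric table ids as join keys are all hypothetical. It shows the general LEFT JOIN behavior, where every table row is kept and the version-info value is attached when one exists.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LeftJoinSketch {
  // Hypothetical in-memory LEFT JOIN: every table id appears in the result,
  // paired with its format if a version-info row exists, or null otherwise.
  static Map<Long, String> leftJoinFormat(
      List<Long> tableIds, Map<Long, String> formatByTableId) {
    Map<Long, String> joined = new LinkedHashMap<>();
    for (Long id : tableIds) {
      // get() returns null for a missing key, which mirrors the NULL columns
      // a LEFT JOIN produces for an unmatched right-hand row.
      joined.put(id, formatByTableId.get(id));
    }
    return joined;
  }

  public static void main(String[] args) {
    Map<Long, String> joined = leftJoinFormat(List.of(1L, 2L), Map.of(1L, "delta"));
    System.out.println(joined);
  }
}
```

An inner join would instead drop table 2 from the result; whether such rows can occur in practice is not stated in this PR.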
   
   ### Why are the changes needed?
   
   `MetadataAuthzHelper.preloadToCache()` calls `entityStore().batchGet()` for 
TABLE entities on every authorized request. This invokes 
`batchSelectTableByIdentifier`, which was missing the LEFT JOIN with 
`table_version_info`. As a result, incomplete `TableEntity` objects (with no 
`format` or `properties`) were stored in the entity cache.
   
   After a server restart, `tableFormatCache` (an in-memory Guava cache in 
`GenericCatalogOperations`) is cleared. When loading a Delta/generic lakehouse 
table, `tableOps()` misses the `tableFormatCache` and falls back to 
`entityStore().get()`, which returns the incomplete cached entity. This causes 
`Preconditions.checkArgument(format != null, "Table format for %s is null...")` 
to throw an `IllegalArgumentException`.
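
The failure sequence described above can be reproduced in a minimal, self-contained sketch. Everything here is a simplified assumption rather than Gravitino code: `StaleCacheSketch`, `TableRow`, and the method names are hypothetical stand-ins, a plain `HashMap` replaces the entity cache, and the exception message is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

class StaleCacheSketch {
  // Hypothetical stand-in for TableEntity, reduced to the two fields the sketch needs.
  static final class TableRow {
    final String name;
    final String format; // null when the query skipped table_version_info

    TableRow(String name, String format) {
      this.name = name;
      this.format = format;
    }
  }

  // Stand-in for the entity cache shared by the preload and the later lookup.
  static final Map<String, TableRow> entityCache = new HashMap<>();

  // Models the pre-fix batch query: no version-info join, so format stays null.
  static TableRow batchSelectWithoutVersionInfo(String name) {
    return new TableRow(name, null);
  }

  // Models the fixed batch query: the joined version-info row supplies the format.
  static TableRow batchSelectWithVersionInfo(String name, String format) {
    return new TableRow(name, format);
  }

  // Models the preload step: whatever the batch query returned is cached as-is.
  static void preload(TableRow row) {
    entityCache.put(row.name, row);
  }

  // Models the fallback lookup after the in-memory format cache has been cleared.
  static String resolveFormat(String name) {
    TableRow cached = entityCache.get(name);
    if (cached == null || cached.format == null) {
      // Illustrative message only; not the wording used in the real check.
      throw new IllegalArgumentException("format is null for " + name);
    }
    return cached.format;
  }

  public static void main(String[] args) {
    preload(batchSelectWithoutVersionInfo("t1"));
    try {
      resolveFormat("t1");
    } catch (IllegalArgumentException e) {
      System.out.println("before fix: " + e.getMessage());
    }
    preload(batchSelectWithVersionInfo("t1", "delta"));
    System.out.println("after fix: " + resolveFormat("t1"));
  }
}
```

The point of the sketch is that the lookup itself is correct; the error originates earlier, when an incomplete row is written into the shared cache.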
   
   Fix: #10444
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   - Added `testBatchGetTableByIdentifierIncludesVersionInfoFields` to verify 
all version-info fields (`format`, `properties`, `comment`, `distribution`, 
`sortOrders`, `partitioning`, `indexes`) are returned by 
`batchGetTableByIdentifier`.
   - Added `testBatchGetTableByIdentifierDoesNotIncludeColumns` to document 
that columns are fetched via a separate path (`getTableByIdentifier`) and are 
intentionally not included in batch results.
   - Added `testBatchSelectJobByIdentifierIncludesJobTemplateName` to verify 
`job_template_name` and all column aliases are correctly returned by 
`batchSelectJobByIdentifier`.

