jerryshao opened a new pull request, #10451: URL: https://github.com/apache/gravitino/pull/10451
### What changes were proposed in this pull request?

- Fixed `batchSelectTableByIdentifier` in `TableMetaBaseSQLProvider` to include a LEFT JOIN with `table_version_info`, selecting all version-info fields (`format`, `properties`, `partitioning`, `sort_orders`, `distribution`, `indexes`, `comment`). Previously these fields were omitted, causing incomplete `TableEntity` objects to be stored in the entity cache.
- Fixed `batchSelectJobByIdentifier` in `JobMetaBaseSQLProvider` to include `jtm.job_template_name` and proper camelCase column aliases required for MyBatis field mapping. Added a comment noting this method is currently unused by the service layer.
- Added regression tests in `TestTableMetaService` covering all version-info fields and the columns limitation.
- Added a regression test in `TestJobMetaService` verifying `job_template_name` is returned correctly.

### Why are the changes needed?

`MetadataAuthzHelper.preloadToCache()` calls `entityStore().batchGet()` for TABLE entities on every authorized request. This invokes `batchSelectTableByIdentifier`, which was missing the LEFT JOIN with `table_version_info`. As a result, incomplete `TableEntity` objects (with no `format` or `properties`) were stored in the entity cache.

After a server restart, `tableFormatCache` (an in-memory Guava cache in `GenericCatalogOperations`) is cleared. When loading a Delta/generic lakehouse table, `tableOps()` misses the `tableFormatCache` and falls back to `entityStore().get()`, which returns the incomplete cached entity. This causes `Preconditions.checkArgument(format != null, "Table format for %s is null...")` to throw an `IllegalArgumentException`.

Fix: #10444

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?
- Added `testBatchGetTableByIdentifierIncludesVersionInfoFields` to verify all version-info fields (`format`, `properties`, `comment`, `distribution`, `sortOrders`, `partitioning`, `indexes`) are returned by `batchGetTableByIdentifier`.
- Added `testBatchGetTableByIdentifierDoesNotIncludeColumns` to document that columns are fetched via a separate path (`getTableByIdentifier`) and are intentionally not included in batch results.
- Added `testBatchSelectJobByIdentifierIncludesJobTemplateName` to verify `job_template_name` and all column aliases are correctly returned by `batchSelectJobByIdentifier`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
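
For readers who want the failure mode from "Why are the changes needed?" in miniature, here is a rough, self-contained sketch. It is not Gravitino code: `StaleCacheSketch`, `Table`, and the two maps are hypothetical stand-ins for the entity store, the entity cache, and `table_version_info`, and the boolean flag only imitates whether the batch query performs the join. It shows how a batch read that skips version info can leave an incomplete entity in a shared cache, so that a later single-entity read fails the format check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these are Gravitino classes.
class StaleCacheSketch {

  /** Minimal table entity: only the fields needed to show the failure. */
  static final class Table {
    final String name;
    final String format; // null when version info was never joined in

    Table(String name, String format) {
      this.name = name;
      this.format = format;
    }
  }

  // Stand-in for table_version_info: table name -> format.
  private final Map<String, String> versionInfo = new HashMap<>();
  // Entity cache shared by the batch and single-entity read paths.
  private final Map<String, Table> entityCache = new HashMap<>();
  // true imitates the fixed query (with the join), false the buggy one.
  private final boolean batchJoinsVersionInfo;

  StaleCacheSketch(boolean batchJoinsVersionInfo) {
    this.batchJoinsVersionInfo = batchJoinsVersionInfo;
  }

  void createTable(String name, String format) {
    versionInfo.put(name, format);
  }

  /** Batch path: builds the entity and stores it in the shared cache. */
  Table batchGet(String name) {
    String format = batchJoinsVersionInfo ? versionInfo.get(name) : null;
    Table table = new Table(name, format);
    entityCache.put(name, table);
    return table;
  }

  /** Single path: serves from the cache first, otherwise loads the full entity. */
  Table get(String name) {
    return entityCache.computeIfAbsent(name, n -> new Table(n, versionInfo.get(n)));
  }

  /** Imitates the precondition quoted in the description. */
  String resolveFormat(String name) {
    Table table = get(name);
    if (table.format == null) {
      throw new IllegalArgumentException("Table format for " + name + " is null");
    }
    return table.format;
  }

  public static void main(String[] args) {
    StaleCacheSketch fixed = new StaleCacheSketch(true);
    fixed.createTable("t1", "delta");
    fixed.batchGet("t1"); // preload caches a complete entity
    System.out.println("fixed: " + fixed.resolveFormat("t1"));

    StaleCacheSketch buggy = new StaleCacheSketch(false);
    buggy.createTable("t1", "delta");
    buggy.batchGet("t1"); // preload caches an entity without format
    try {
      buggy.resolveFormat("t1");
    } catch (IllegalArgumentException e) {
      System.out.println("buggy: " + e.getMessage());
    }
  }
}
```

In the sketch the row in the backing map is complete in both cases; only the cached copy differs, which is why the problem surfaces on a read path that trusts the cache.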
