gada121982 opened a new issue, #10574:
URL: https://github.com/apache/gravitino/issues/10574

   ### Search before asking
   
   - [x] I searched the issues and found no similar issues.
   
   ### Version
   
   1.2.0
   
   ### Describe the bug
   
   When running Gravitino with **multiple replicas** (e.g., a 3-pod Kubernetes 
deployment with a shared PostgreSQL backend), the `ownerRel` cache in 
`JcasbinAuthorizer` becomes stale across replicas. This causes the **metalake 
owner** to receive `403 ForbiddenException` when performing operations that 
require the `METALAKE::OWNER` privilege (e.g., `createRole`, `addUser`).
   
   ### Root cause
   
   `JcasbinAuthorizer.ownerRel` is a **local Caffeine cache** (per-JVM, no 
distributed invalidation). When `SetMetalakeOwner` is processed by replica A:
   
   1. Replica A updates the `owner_meta` table in PostgreSQL ✅
   2. Replica A invalidates its local `ownerRel` cache via 
`handleMetadataOwnerChange()` ✅
   3. **Replicas B and C still have the stale creator-as-owner cached** ❌
   
   Subsequent requests load-balanced to stale replicas fail authorization 
because `isOwner()` compares the requesting user against the **cached (stale) 
owner**, not the database value.
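
The sequence above can be modelled in a few lines. The following is a minimal sketch, not Gravitino code: `StaleOwnerCacheDemo`, `Replica`, and the IDs are hypothetical names, and a plain `ConcurrentHashMap` stands in for both the Caffeine cache and the `owner_meta` table. It only illustrates how a write that invalidates one replica's local cache leaves another replica's entry untouched.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative model only; none of these names come from the Gravitino codebase. */
final class StaleOwnerCacheDemo {

    /** Hypothetical stand-in for one replica holding a per-JVM owner cache. */
    static final class Replica {
        private final Map<Long, Long> sharedDb; // stands in for the shared owner_meta table
        private final Map<Long, Long> ownerRel = new ConcurrentHashMap<>(); // local cache

        Replica(Map<Long, Long> sharedDb) {
            this.sharedDb = sharedDb;
        }

        /** Reads the owner through the local cache; the DB is consulted only on a miss. */
        boolean isOwner(long userId, long metadataId) {
            Long owner = ownerRel.computeIfAbsent(metadataId, sharedDb::get);
            return owner != null && owner == userId;
        }

        /** Writes the new owner to the shared DB and invalidates only this replica's cache. */
        void setOwner(long metadataId, long newOwnerId) {
            sharedDb.put(metadataId, newOwnerId);
            ownerRel.remove(metadataId);
        }
    }

    public static void main(String[] args) {
        final long metalakeId = 42L, creator = 1L, newOwner = 2L;
        Map<Long, Long> db = new ConcurrentHashMap<>();
        db.put(metalakeId, creator);

        Replica a = new Replica(db);
        Replica c = new Replica(db);

        c.isOwner(creator, metalakeId);   // replica C caches the creator as owner
        a.setOwner(metalakeId, newOwner); // replica A updates the DB, clears only its own cache

        System.out.println("A sees new owner: " + a.isOwner(newOwner, metalakeId)); // true
        System.out.println("C sees new owner: " + c.isOwner(newOwner, metalakeId)); // false
    }
}
```

In this model, replica C keeps answering from its cached entry until that entry is evicted or the process restarts, which matches the 403 seen by the new owner.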
   
   ### Steps to reproduce
   
   **Environment:** 3 Gravitino replicas sharing one PostgreSQL database, 
OAuth2 authentication enabled.
   
   1. Create a metalake via the API (processed by replica A, creator = 
`admin-user`)
   2. Add a user to the metalake
   3. Set the new user as metalake owner via `PUT 
/api/metalakes/{name}/owners/METALAKE/{name}`  
      → This request may be processed by replica B
   4. Using the owner's token, call `POST /api/metalakes/{name}/roles` to 
create a role  
      → This request may be load-balanced to replica C (stale cache)
   
   **Expected:** Owner can create roles (authorization passes).  
   **Actual:** `403 ForbiddenException: User 'xxx' is not authorized to perform 
operation 'createRole'`
   
   ### Evidence from logs
   
   **Gravitino log (stale replica):**
   ```
   jcasbin:115 - Request: [1377586509491296807, METALAKE, 2642599110218138787, CREATE_ROLE] ---> false
   jcasbin:117 - Hit Policy: []
   GravitinoInterceptionService$MetadataAuthorizationMethodInterceptor:235 - 
     Authorization failed - User: 9daa3b35-..., Operation: createRole, 
     Metadata: 2000762_sample123, Expression: METALAKE::OWNER || METALAKE::CREATE_ROLE
   ```
   
   **Per-pod verification via port-forward:**
   
   | Pod | `getOwner` returns | `createRole` result |
   |-----|--------------------|---------------------|
   | Pod A (stale) | `950e2bea-...` (old creator) ❌ | 403 Forbidden |
   | Pod B (correct) | `9daa3b35-...` (actual owner) ✅ | Passed authorization |
   | Pod C (correct) | `9daa3b35-...` (actual owner) ✅ | Passed authorization |
   
   **Database (`owner_meta` table) shows correct owner:**
   ```sql
   SELECT owner_id, user_name FROM owner_meta WHERE deleted_at = 0;
   -- owner_id: 1377586509491296807, user_name: 9daa3b35-... (correct)
   ```
   
   ### Analysis
   
   The issue is in `JcasbinAuthorizer.isOwner()` → `loadOwnerPolicy()`:
   
   ```java
   // JcasbinAuthorizer.java:538
   private void loadOwnerPolicy(String metalake, MetadataObject metadataObject, Long metadataId) {
       if (ownerRel.getIfPresent(metadataId) != null) {
           // Cache hit: the DB load is skipped, so a stale entry on another
           // replica is never refreshed.
           return;
       }
       // ... loads from DB only on cache miss
   }
   
   The `ownerRel` cache is a local `Caffeine` cache with **no cross-replica 
invalidation mechanism**. `handleMetadataOwnerChange()` only invalidates the 
local JVM's cache.
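
One possible mitigation, shown here only as a hedged sketch and not as the project's chosen fix, is to confirm a negative ownership answer against the database before denying. `RecheckOnDenyDemo`, `OwnerCache`, and `loadOwnerFromDb` are hypothetical names introduced for illustration; a `ConcurrentHashMap` again stands in for the Caffeine cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative sketch only; names and structure are assumptions, not Gravitino code. */
final class RecheckOnDenyDemo {

    static final class OwnerCache {
        private final Map<Long, Long> ownerRel = new ConcurrentHashMap<>(); // local cache
        private final Function<Long, Long> loadOwnerFromDb; // hypothetical DB lookup

        OwnerCache(Function<Long, Long> loadOwnerFromDb) {
            this.loadOwnerFromDb = loadOwnerFromDb;
        }

        boolean isOwner(long userId, long metadataId) {
            Long cached = ownerRel.computeIfAbsent(metadataId, loadOwnerFromDb);
            if (cached != null && cached == userId) {
                return true; // fast path: the cache agrees that the user is the owner
            }
            // The cache says "not the owner": re-read the DB before denying,
            // because the local entry may predate a change made on another replica.
            Long fresh = loadOwnerFromDb.apply(metadataId);
            if (fresh == null) {
                ownerRel.remove(metadataId);
                return false;
            }
            ownerRel.put(metadataId, fresh);
            return fresh == userId;
        }
    }

    public static void main(String[] args) {
        Map<Long, Long> db = new ConcurrentHashMap<>();
        db.put(42L, 1L);
        OwnerCache replica = new OwnerCache(db::get);

        replica.isOwner(1L, 42L); // caches owner 1
        db.put(42L, 2L);          // ownership changed through another replica

        System.out.println("new owner allowed: " + replica.isOwner(2L, 42L)); // true
    }
}
```

This trades an extra database read on every negative check for correctness on the deny path. It does not stop a stale replica from still accepting the previous owner, so it would complement cross-replica invalidation or a bounded cache lifetime rather than replace them.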
   
   Additionally, `CacheFactory` supports only `caffeine` (in-memory); there is 
no distributed cache implementation:
   ```java
   public static final ImmutableMap<String, String> ENTITY_CACHES =
       ImmutableMap.of("caffeine", CaffeineEntityCache.class.getCanonicalName());
   ```
   
   ### Impact
   
   This affects **any multi-replica Gravitino deployment** where:
   - Metalake ownership is transferred (via `SetMetalakeOwner`)
   - Owner-dependent operations (`createRole`, `addUser`, `dropMetalake`) are 
load-balanced to a different replica
   
   **Workaround:** Scale Gravitino to 1 replica, or restart all replicas after 
ownership changes.
   
   ### Related
   
   - #10474 — TreeLock limitations for HA deployment (same class of per-JVM 
state problem)
   
   ### Are you willing to submit a PR?
   
   - [x] Yes, I am willing to submit a PR!


