youngyjd opened a new issue, #10472:
URL: https://github.com/apache/gravitino/issues/10472

   ### What would you like to be improved?
   
   - Today the Gravitino relational entity store (JDBCBackend) uses a single 
JDBC URL for all metadata access. Under load, reads and writes compete for the 
same primary DB connection pool.
   - Operators who run MySQL (or similar) with read replicas cannot offload 
read-heavy metadata traffic (list/get paths) to replicas without pointing the 
whole server at the replica, which would break writes.
   - We want optional read/write separation so:
     - Writes (and transactional work) stay on the primary.
     - Standalone reads can use a read replica when configured, improving 
scalability and reducing primary load.
   - Behavior must remain backward compatible when no read-replica settings are 
used.
   
   ### How should we improve?
   
   #### Proposal
   - New configuration (all optional; defaults preserve current behavior)
     - gravitino.entity.store.relational.jdbcReadOnlyUrl — replica URL; if 
unset, single pool for reads and writes.
     - gravitino.entity.store.relational.jdbcReadOnlyUser / 
jdbcReadOnlyPassword — optional; fall back to primary user/password.
     - gravitino.entity.store.relational.readOnlyMaxConnections / 
readOnlyMaxWaitMillis — optional read pool sizing; -1 means inherit from 
primary pool settings.
    
   - Separate read pool only when needed
     - If no read-only-related config is set, keep a single datasource/factory. 
     - If any read-only config is set, create a second pooled datasource for 
reads.
   
   - Session routing (convention-based)
     - Write path: doWithCommit, doWithCommitAndFetchResult, doWithoutCommit, 
doMultipleWithCommit, beginTransaction → primary pool.
     - Read path: getWithoutCommit → read pool when no write session is active 
on the thread; if a write transaction is already open (e.g. nested inside 
doMultipleWithCommit), use the primary session so DML/transactional steps stay 
correct and replicas are not used for in-txn work.
   
   - Operational visibility
     - Log at startup when a separate read replica pool is enabled (e.g. "Read 
replica JDBC pool enabled for entity store").
   
   - Metrics
     - Register a second datasource metrics source for the read pool when it is 
separate (e.g. gravitino-relational-store-read).
   
   - Docs / template
     - Document keys in gravitino.conf.template and (recommended) in 
docs/gravitino-server-config.md / relational backend how-to.
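
   As a rough illustration of the fallback rules above, the following Java 
sketch resolves read-pool settings from a flat key/value map. Only the five 
configuration keys come from this proposal; the class, method, and field names 
are hypothetical. The sketch also assumes that when only sizing keys are set, 
the read pool falls back to the primary URL, which the proposal does not spell 
out.

```java
import java.util.Map;

/** Hypothetical sketch; only the five config keys below come from the proposal. */
final class ReadPoolConfigSketch {
  static final String PREFIX = "gravitino.entity.store.relational.";
  static final String URL = PREFIX + "jdbcReadOnlyUrl";
  static final String USER = PREFIX + "jdbcReadOnlyUser";
  static final String PASSWORD = PREFIX + "jdbcReadOnlyPassword";
  static final String MAX_CONNECTIONS = PREFIX + "readOnlyMaxConnections";
  static final String MAX_WAIT_MILLIS = PREFIX + "readOnlyMaxWaitMillis";
  static final long INHERIT = -1L; // "-1 means inherit from primary pool settings"

  /** Illustrative holder for the settings of one connection pool. */
  static final class PoolSettings {
    final String url;
    final String user;
    final String password;
    final int maxConnections;
    final long maxWaitMillis;

    PoolSettings(String url, String user, String password,
                 int maxConnections, long maxWaitMillis) {
      this.url = url;
      this.user = user;
      this.password = password;
      this.maxConnections = maxConnections;
      this.maxWaitMillis = maxWaitMillis;
    }
  }

  /** A second pool is created only if at least one read-only key is set. */
  static boolean needsSeparateReadPool(Map<String, String> conf) {
    return conf.containsKey(URL)
        || conf.containsKey(USER)
        || conf.containsKey(PASSWORD)
        || conf.containsKey(MAX_CONNECTIONS)
        || conf.containsKey(MAX_WAIT_MILLIS);
  }

  /** Every unset (or -1) read-only value falls back to the primary's value. */
  static PoolSettings resolveReadPool(Map<String, String> conf, PoolSettings primary) {
    long maxConnections = numberOrInherit(conf.get(MAX_CONNECTIONS), primary.maxConnections);
    long maxWaitMillis = numberOrInherit(conf.get(MAX_WAIT_MILLIS), primary.maxWaitMillis);
    return new PoolSettings(
        conf.getOrDefault(URL, primary.url), // assumption: URL also falls back
        conf.getOrDefault(USER, primary.user),
        conf.getOrDefault(PASSWORD, primary.password),
        Math.toIntExact(maxConnections),
        maxWaitMillis);
  }

  private static long numberOrInherit(String raw, long primaryValue) {
    if (raw == null) {
      return primaryValue;
    }
    long parsed = Long.parseLong(raw.trim());
    return parsed == INHERIT ? primaryValue : parsed;
  }
}
```

   With an empty map, needsSeparateReadPool returns false and the server would 
keep its single datasource, which is the backward-compatible path.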
   
   
   
   #### Pros
   1. Offloads read-heavy metadata traffic to a replica, reducing primary load.
   2. Fully backward compatible; all new settings are optional.
   3. Replica can use its own URL, credentials, and pool size.
   4. Writes and in-transaction reads stay on the primary; no API changes for 
callers.
   
   #### Cons
   1. Standalone reads can see replica lag (eventual consistency).
   2. Routing is by API convention, not SQL analysis. Read vs write is inferred 
from getWithoutCommit vs commit/transaction APIs, not from SQL. Misuse (e.g. 
DML via getWithoutCommit at top level) could in theory hit the read pool; in 
this codebase those calls are under an active write txn, so they stay on 
primary.
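
   To make the convention-based routing (and the caveat in Con 2) concrete, 
here is a minimal Java sketch. The method names mirror two of those listed in 
the proposal, but the signatures are simplified, and the Pool enum and the 
thread-local depth counter are assumptions for illustration, not the actual 
Gravitino implementation.

```java
import java.util.function.Function;

/** Hypothetical sketch of routing by API convention rather than by SQL analysis. */
final class SessionRoutingSketch {
  enum Pool { PRIMARY, READ }

  // Number of write scopes currently open on this thread.
  private static final ThreadLocal<Integer> WRITE_DEPTH = ThreadLocal.withInitial(() -> 0);

  private final boolean readPoolEnabled;

  SessionRoutingSketch(boolean readPoolEnabled) {
    this.readPoolEnabled = readPoolEnabled;
  }

  /** Write-path entry point: always runs against the primary pool. */
  <T> T doWithCommitAndFetchResult(Function<Pool, T> work) {
    WRITE_DEPTH.set(WRITE_DEPTH.get() + 1);
    try {
      return work.apply(Pool.PRIMARY);
    } finally {
      // Restore the depth even if the work throws.
      WRITE_DEPTH.set(WRITE_DEPTH.get() - 1);
    }
  }

  /**
   * Read-path entry point: uses the read pool only when it is enabled and no
   * write scope is open on the current thread.
   */
  <T> T getWithoutCommit(Function<Pool, T> work) {
    boolean insideWrite = WRITE_DEPTH.get() > 0;
    Pool pool = (readPoolEnabled && !insideWrite) ? Pool.READ : Pool.PRIMARY;
    return work.apply(pool);
  }
}
```

   With the read pool enabled, a top-level getWithoutCommit call resolves to 
READ, while the same call nested inside doWithCommitAndFetchResult resolves to 
PRIMARY. Nothing in the sketch inspects SQL, so a top-level getWithoutCommit 
that issued DML would still be sent to the read pool.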
   
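   **Note:** If strong consistency is required (e.g. read-your-writes across 
requests), the replica must have applied a write before it serves a read that 
depends on it. Plain asynchronous replication does not guarantee this, and 
MySQL semi-sync only confirms that the replica received the transaction, not 
that it applied it. MySQL Group Replication with a stricter 
group_replication_consistency level (or an equivalent in other DBs) is one way 
to get this guarantee; otherwise metadata reads may be briefly stale.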

