makssent commented on issue #36972:
URL:
https://github.com/apache/shardingsphere/issues/36972#issuecomment-3569324697
I ran your test, but I was not able to reproduce the throughput gain from your example (×4.5) for OLTP Point Select. In my environment the cluster is indeed faster than the single node, but the speedup tops out at roughly 2×. I'm probably missing some important factor or misinterpreting the results, so I'd really appreciate any hints on what else to look at or adjust.
**Environment**
- NODE 1–5: 4 CPU, 8 GB RAM, SSD 65
- Client: 4 CPU, 8 GB RAM, SSD 65
- Proxy: 12 CPU, 8 GB RAM, SSD 65
- Single node: 4 CPU, 8 GB RAM, SSD 65
**Data volume**
- Single: 40M rows
- Cluster: 8M rows per node (5 nodes, total 40M rows, just distributed)
2 GB is allocated for the InnoDB buffer pool. I checked that the buffer is
empty → filled the table up to 8M rows → ran a warmup:
```sql
SELECT id FROM t_order_0 FORCE INDEX (PRIMARY) ORDER BY id LIMIT 1000000000;
...
SELECT id FROM t_order_9 FORCE INDEX (PRIMARY) ORDER BY id LIMIT 1000000000;
```
After that I ran a couple of trial OLTP Point Select runs to stabilize the buffer: it filled up and stopped growing, the page count no longer changed, and everything was served from memory.
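For reference, a minimal JDBC sketch of how these counters can be read before and after a run (connection details are placeholders; `Innodb_buffer_pool_pages_data`, `Innodb_buffer_pool_pages_total` and `Innodb_buffer_pool_reads` are the standard MySQL status variables for pages held in the buffer pool and for reads that had to go to disk):
```java
import java.sql.*;

// Reads the InnoDB buffer pool counters on one backend MySQL node.
public class BufferPoolCheck {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection details for one backend node.
        String url = "jdbc:mysql://localhost:3306/ds_0?useSSL=false";
        try (Connection conn = DriverManager.getConnection(url, "root", "pass");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SHOW GLOBAL STATUS WHERE Variable_name IN ("
                 + "'Innodb_buffer_pool_pages_data',"
                 + "'Innodb_buffer_pool_pages_total',"
                 + "'Innodb_buffer_pool_reads')")) {
            while (rs.next()) {
                System.out.println(rs.getString("Variable_name") + " = " + rs.getString("Value"));
            }
        }
    }
}
```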
In theory, the speedup should come from the fact that the cluster works
entirely from the buffer, while the single node constantly goes to disk, which
is slower. In practice, the cluster indeed does not perform any disk reads, and
the single node performs millions of disk reads, but the improvement turned out
to be much smaller than expected.
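(A back-of-the-envelope check of this, based on the numbers below: 131,072 pages × 16 KB ≈ 2 GB of buffer pool per instance. Each cluster node only has to cache its own 8M-row share, which fits with room to spare, while the single node can keep only roughly the same 8M-row fraction of the full 40M-row table in memory, so with uniformly random ids about 80% of point lookups should miss the buffer. That matches the single-node results: ~470k disk reads per 60-second run at ~9.8k QPS ≈ 588k queries.)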
**The proxy also does not look like a bottleneck — its CPU utilization does
not exceed 95%.**
Single and Cluster were tested with the same Java benchmark (OLTP Point Select); in one case the requests go directly to the single node, and in the other they go through the Proxy to the cluster, so the "Java factor" and the client side are identical. A simplified sketch of the benchmark loop is shown below, followed by the results of all runs.
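In essence it is N worker threads issuing `SELECT ... WHERE id = ?` with random ids through a HikariCP pool and counting completed queries. A minimal, illustrative sketch (class name, query shape and timing logic are simplified placeholders; the real benchmark is in the repository linked at the end):
```java
import com.zaxxer.hikari.HikariDataSource;
import java.sql.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.LongAdder;

// Simplified OLTP Point Select benchmark: fixed thread count, fixed duration,
// each thread repeatedly runs a single-row primary-key lookup.
public class PointSelectBenchmark {
    public static void main(String[] args) throws Exception {
        int threads = 200;
        long durationMs = 30_000;
        long maxId = 40_000_000L; // total rows in t_order

        HikariDataSource ds = new HikariDataSource();
        // Same JDBC URL/credentials as in the connection configs below.
        ds.setJdbcUrl("jdbc:mysql://localhost:3313/sharding_db?useServerPrepStmts=true&cachePrepStmts=true");
        ds.setUsername("sharding");
        ds.setPassword("sharding");
        ds.setMaximumPoolSize(threads);

        LongAdder completed = new LongAdder();
        long deadline = System.currentTimeMillis() + durationMs;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                try (Connection conn = ds.getConnection();
                     PreparedStatement ps = conn.prepareStatement("SELECT * FROM t_order WHERE id = ?")) {
                    while (System.currentTimeMillis() < deadline) {
                        ps.setLong(1, rnd.nextLong(1, maxId + 1));
                        try (ResultSet rs = ps.executeQuery()) {
                            rs.next();
                        }
                        completed.increment();
                    }
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(durationMs + 10_000, TimeUnit.MILLISECONDS);
        System.out.println("QPS ~ " + completed.sum() * 1000 / durationMs);
        ds.close();
    }
}
```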
### CLUSTER — runs
5 runs × 30 seconds (Cluster)
| Threads | QPS #1 | QPS #2 | QPS #3 | QPS #4 | QPS #5 | **QPS avg** | **avg latency** | disk_reads | pages_used |
|--------:|-------:|-------:|-------:|-------:|-------:|------------:|----------------:|------------------|------------------|
| 120 | 12557 | 8967 | 11823 | 10555 | 8659 | **10512** | **10.77 ms** | 122171 → 122171 | 122603 → 122603 |
| 160 | 12677 | 12116 | 14930 | 14930 | 11067 | **13144** | **12.18 ms** | 122171 → 122171 | 122603 → 122603 |
| 180 | 12318 | 13254 | 11200 | 17240 | 12288 | **13260** | **13.82 ms** | 122171 → 122171 | 122603 → 122603 |
| 200 | 14058 | 14721 | 13771 | 14044 | 11372 | **13593** | **14.78 ms** | 122171 → 122171 | 122603 → 122603 |
*(in the remaining cluster runs `pages_used` and `disk_reads` stay unchanged, so those columns are omitted below: `pages_used` reflects InnoDB buffer pool usage and never reaches full, and the `disk_reads` counter does not grow, meaning the cluster keeps serving all reads from the buffer)*
5 runs × 60 seconds (Cluster)
| Threads | QPS #1 | QPS #2 | QPS #3 | QPS #4 | QPS #5 | **QPS avg** | **avg latency** |
|--------:|-------:|-------:|-------:|-------:|-------:|------------:|----------------:|
| 120 | 16743 | 19928 | 21718 | 22012 | 20496 | **20179** | **5.99 ms** |
| 160 | 18968 | 22258 | 22408 | 20870 | 22067 | **21314** | **7.30 ms** |
| 180 | 18945 | 20349 | 20888 | 21413 | 22268 | **20773** | **8.38 ms** |
| 200 | 19581 | 20689 | 22349 | 22426 | 22810 | **21571** | **8.88 ms** |
5 runs × 120 seconds (Cluster)
| Threads | QPS #1 | QPS #2 | QPS #3 | QPS #4 | QPS #5 | **QPS avg** | **avg latency** |
|--------:|-------:|-------:|-------:|-------:|-------:|------------:|----------------:|
| 120 | 18910 | 14995 | 15874 | 12802 | 13599 | **15236** | **8.47 ms** |
| 160 | 17773 | 17849 | 16568 | 14175 | 14924 | **16258** | **11.71 ms** |
| 180 | 16472 | 18713 | 16073 | 12534 | 15560 | **15870** | **13.53 ms** |
| 200 | 16226 | 16706 | 14546 | 13671 | 16393 | **15508** | **14.72 ms** |
---
### SINGLE — runs
5 runs × 30 seconds (Single)
| Threads | QPS #1 | QPS #2 | QPS #3 | QPS #4 | QPS #5 | **QPS avg** | **avg latency** | disk_reads (total) | disk_reads_avg (per run) | pages_used |
|--------:|-------:|-------:|-------:|-------:|-------:|------------:|----------------:|-------------------:|-------------------------:|------------|
| 120 | 9416 | 9721 | 9525 | 9525 | 9441 | **9526** | **11.87 ms** | 1 144 157 | 228 831 | 131056 / 131072 |
| 160 | 9809 | 10412 | 9805 | 9788 | 9842 | **9951** | **16.09 ms** | 1 192 311 | 238 462 | 131056 / 131072 |
| 180 | 9755 | 10212 | 9748 | 9400 | 9510 | **9725** | **18.56 ms** | 1 167 719 | 233 544 | 131056 / 131072 |
| 200 | 10239 | 9942 | 10044 | 9719 | 9780 | **9925** | **20.24 ms** | 1 194 994 | 238 999 | 131056 / 131072 |
*(from here on, `pages_used` is always 131056/131072 — the buffer is fully
filled; further focus is on `disk_reads`)*
5 runs × 60 seconds (Single)
| Threads | QPS #1 | QPS #2 | QPS #3 | QPS #4 | QPS #5 | **QPS avg** | **avg latency** | disk_reads (total) | disk_reads_avg (per run) |
|--------:|-------:|-------:|-------:|-------:|-------:|------------:|----------------:|-------------------:|-------------------------:|
| 120 | 9651 | 9828 | 9775 | 9896 | 9835 | **9797** | **11.91 ms** | 2 352 894 | 470 579 |
| 160 | 9794 | 9799 | 9783 | 9888 | 9647 | **9782** | **16.33 ms** | 2 348 209 | 469 642 |
| 180 | 9900 | 9707 | 9921 | 9740 | 9903 | **9834** | **18.38 ms** | 2 360 695 | 472 139 |
| 200 | 9746 | 9971 | 9914 | 9717 | 9794 | **9828** | **20.53 ms** | 2 360 267 | 472 053 |
5 runs × 120 seconds (Single)
| Threads | QPS #1 | QPS #2 | QPS #3 | QPS #4 | QPS #5 | **QPS avg** | **avg latency** | disk_reads (total) | disk_reads_avg (per run) |
|--------:|-------:|-------:|-------:|-------:|-------:|------------:|----------------:|-------------------:|-------------------------:|
| 120 | 9888 | 8282 | 8833 | 10478 | 10135 | **9523** | **13.35 ms** | 4 072 067 | 814 413 |
| 160 | 9961 | 8372 | 8552 | 11245 | 9814 | **9589** | **17.47 ms** | 4 180 584 | 836 116 |
| 180 | 9901 | 8443 | 8931 | 12069 | 9435 | **9756** | **19.58 ms** | 4 319 040 | 863 808 |
| 200 | 9851 | 8215 | 8357 | 12035 | 5988 | **8889** | **24.88 ms** | 3 971 902 | 794 380 |
---
### CLUSTER vs SINGLE — comparison
5 runs × 30 seconds
| Threads | **QPS Cluster** | **QPS Single** | avg latency Cluster | avg latency Single | disk_reads Cluster | disk_reads Single | QPS gain (%) |
|--------:|----------------:|---------------:|--------------------:|-------------------:|-------------------:|------------------:|-------------:|
| 120 | 10512 | 9526 | 10.77 ms | 11.87 ms | 0 | 1 144 157 | +10.3% |
| 160 | 13144 | 9951 | 12.18 ms | 16.09 ms | 0 | 1 192 311 | +32.1% |
| 180 | 13260 | 9725 | 13.82 ms | 18.56 ms | 0 | 1 167 719 | +36.3% |
| 200 | 13593 | 9925 | 14.78 ms | 20.24 ms | 0 | 1 194 994 | +36.9% |
5 runs × 60 seconds
| Threads | **QPS Cluster** | **QPS Single** | avg latency Cluster | avg latency Single | disk_reads Cluster | disk_reads Single | QPS gain (%) |
|--------:|----------------:|---------------:|--------------------:|-------------------:|-------------------:|------------------:|-------------:|
| 120 | 20179 | 9797 | 5.99 ms | 11.91 ms | 0 | 2 352 894 | +106.0% |
| 160 | 21314 | 9782 | 7.30 ms | 16.33 ms | 0 | 2 348 209 | +117.9% |
| 180 | 20773 | 9834 | 8.38 ms | 18.38 ms | 0 | 2 360 695 | +111.3% |
| 200 | 21571 | 9828 | 8.88 ms | 20.53 ms | 0 | 2 360 267 | +119.5% |
5 runs × 120 seconds
| Threads | **QPS Cluster** | **QPS Single** | avg latency Cluster | avg latency Single | disk_reads Cluster | disk_reads Single | QPS gain (%) |
|--------:|----------------:|---------------:|--------------------:|-------------------:|-------------------:|------------------:|-------------:|
| 120 | 15236 | 9523 | 8.47 ms | 13.35 ms | 0 | 4 072 067 | +59.9% |
| 160 | 16258 | 9589 | 11.71 ms | 17.47 ms | 0 | 4 180 584 | +69.6% |
| 180 | 15870 | 9756 | 13.53 ms | 19.58 ms | 0 | 4 319 040 | +62.7% |
| 200 | 15508 | 8889 | 14.72 ms | 24.88 ms | 0 | 3 971 902 | +74.5% |
---
In summary: there is indeed a performance gain, but it tops out at about 100–120%, which is noticeably lower than the ×4.5 in your example. At the same time, the single node shows significantly more stable results: its QPS barely changes between runs, while the cluster numbers can vary noticeably from run to run.
I did not use Sysbench because I plan to run similar tests later on another DBMS for which there are no ready-made Sysbench scripts. To have a universal scenario from the start, I use the same Java benchmark (OLTP Point Select) for both Single and Cluster.
If I am misinterpreting the results, or there are MySQL/Proxy settings that
should be adjusted, I would really appreciate any recommendations. Maybe I
missed an important detail in the config or in the load scenario.
My Proxy config:
```yaml
# database-sharding.yaml
databaseName: sharding_db
dataSources:
  ds0:
    url: jdbc:mysql://localhost:3306/ds_0?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true
    username: root
    password: pass
    connectionTimeoutMilliseconds: 200000
    idleTimeoutMilliseconds: 120000
    maxLifetimeMilliseconds: 1800000
  ds1:
    url: jdbc:mysql://localhost:3306/ds_1?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true
    username: root
    password: pass
    connectionTimeoutMilliseconds: 200000
    idleTimeoutMilliseconds: 120000
    maxLifetimeMilliseconds: 1800000
  ds2:
    url: jdbc:mysql://localhost:3306/ds_2?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true
    username: root
    password: pass
    connectionTimeoutMilliseconds: 200000
    idleTimeoutMilliseconds: 120000
    maxLifetimeMilliseconds: 1800000
  ds3:
    url: jdbc:mysql://localhost:3306/ds_3?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true
    username: root
    password: pass
    connectionTimeoutMilliseconds: 200000
    idleTimeoutMilliseconds: 120000
    maxLifetimeMilliseconds: 1800000
  ds4:
    url: jdbc:mysql://localhost:3306/ds_4?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true
    username: root
    password: pass
    connectionTimeoutMilliseconds: 200000
    idleTimeoutMilliseconds: 120000
    maxLifetimeMilliseconds: 1800000
rules:
- !SHARDING
  tables:
    t_order:
      actualDataNodes: ds${0..4}.t_order_${0..9}
      databaseStrategy:
        standard:
          shardingColumn: id
          shardingAlgorithmName: db_by_id_inline
      tableStrategy:
        standard:
          shardingColumn: id
          shardingAlgorithmName: tbl_by_id_inline
  shardingAlgorithms:
    db_by_id_inline:
      type: INLINE
      props:
        algorithm-expression: ds${(int)(id / 8000000)}
    tbl_by_id_inline:
      type: INLINE
      props:
        algorithm-expression: t_order_${(int)((id % 8000000) / 800000)}
  keyGenerators:
    snowflake:
      type: SNOWFLAKE
```
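As a sanity check of the inline expressions above, here is a small sketch of how an id maps to a data source and table under this config (my own illustration, assuming ids fall within 0..39,999,999):
```java
// Mirrors the two inline sharding expressions from database-sharding.yaml:
//   database: ds${(int)(id / 8000000)}
//   table:    t_order_${(int)((id % 8000000) / 800000)}
public class RoutingCheck {
    public static void main(String[] args) {
        long[] sampleIds = {1L, 7_999_999L, 8_000_000L, 12_345_678L, 39_999_999L};
        for (long id : sampleIds) {
            long dbIdx = id / 8_000_000L;                // -> ds0..ds4
            long tblIdx = (id % 8_000_000L) / 800_000L;  // -> t_order_0..t_order_9
            System.out.printf("id=%d -> ds%d.t_order_%d%n", id, dbIdx, tblIdx);
        }
    }
}
```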
```yaml
# global.yaml
mode:
  type: Standalone
  repository:
    type: JDBC
authority:
  users:
    - user: root@%
      password: root
      admin: true
    - user: sharding@%
      password: sharding
  authenticators:
    native:
      type: NATIVE
  privilege:
    type: ALL_PERMITTED
transaction:
  defaultType: LOCAL
sqlParser:
  sqlStatementCache:
    initialCapacity: 2000
    maximumSize: 65535
  parseTreeCache:
    initialCapacity: 128
    maximumSize: 1024
sqlFederation:
  sqlFederationEnabled: false
  executionPlanCache:
    initialCapacity: 2000
    maximumSize: 65535
props:
  system-log-level: INFO
  sql-show: false
  check-table-metadata-enabled: false
  proxy-frontend-executor-size: 0
  proxy-frontend-max-connections: 0
  proxy-frontend-ssl-enabled: false
  proxy-netty-backlog: 1024
  cdc-server-port: 33071
  proxy-default-port: 3313
```
Proxy connection config:
```properties
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3313/sharding_db?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true&useCursorFetch=false
spring.datasource.username=sharding
spring.datasource.password=sharding
spring.datasource.hikari.maximum-pool-size=200
spring.datasource.hikari.minimum-idle=0
spring.datasource.hikari.connection-timeout=65000
spring.datasource.hikari.idle-timeout=120000
spring.datasource.hikari.max-lifetime=1800000
spring.datasource.hikari.keepalive-time=30000
spring.datasource.hikari.leak-detection-threshold=0
```
Single-node connection config:
```properties
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/single_db?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true&cachePrepStmts=true&prepStmtCacheSize=256&prepStmtCacheSqlLimit=2048&useServerPrepStmts=true&useCursorFetch=false
spring.datasource.username=root
spring.datasource.password=pass
spring.datasource.hikari.maximum-pool-size=200
spring.datasource.hikari.minimum-idle=0
spring.datasource.hikari.connection-timeout=65000
spring.datasource.hikari.idle-timeout=120000
spring.datasource.hikari.max-lifetime=1800000
spring.datasource.hikari.keepalive-time=30000
spring.datasource.hikari.leak-detection-threshold=0
```
**Project to reproduce the tests.** Full repository with the benchmark, configs and load code:
👉 https://github.com/makssent/sysbench_reposit
I would appreciate it if you could help clarify why the speedup is lower
than expected and what else I can check or adjust.