makssent commented on issue #36972: URL: https://github.com/apache/shardingsphere/issues/36972#issuecomment-3581536549
In total, I've run a large number of tests: I swapped nodes and tried different combinations (2, 3, and 4 nodes, etc.), and it's now clear that the issue is not in the nodes. They behave consistently and don't lose performance as the dataset grows. For example, 2 nodes with 8M + 8M rows produce roughly the same results as 3 nodes with 8M + 8M + 8M rows. So the nodes themselves operate correctly and stably.

Logically, this would imply that the single-node setup should slow down significantly as its dataset grows, but based on my calculations that's not happening. When the dataset increases from 16M to 24M rows, the single node doesn't degrade enough to justify a +300% performance gain from 3 nodes compared to +200% from 2 nodes. Because of this, it's unclear where such a large scaling boost is supposed to come from.

I'll also prepare a separate comparison showing how the single node slows down at 8, 16, 24, 32, and 40 million rows, to measure the drop per +8M rows (equivalent to adding one node to the cluster). Based on what I've seen, there likely won't be a strong regression there, unless the database were running on HDDs, which is not our case.

If you have any ideas or insights on this, I'd appreciate them.
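For the single-node comparison at 8, 16, 24, 32, and 40 million rows, the drop per +8M step could be computed along these lines. This is only a minimal sketch: the function name `slowdown_per_step`, the dictionary layout, and the throughput unit are assumptions for illustration, and the numbers in the usage block are placeholders, not measurements from this issue.

```python
from typing import Dict, List, Tuple


def slowdown_per_step(tps_by_rows: Dict[int, float]) -> List[Tuple[int, float, float]]:
    """Summarize single-node degradation as the dataset grows.

    tps_by_rows maps dataset size (millions of rows) to the measured
    single-node throughput in any consistent unit.
    Returns one tuple per size:
    (rows_in_millions, drop_vs_previous_step_pct, drop_vs_smallest_size_pct).
    """
    sizes = sorted(tps_by_rows)
    baseline = tps_by_rows[sizes[0]]  # throughput at the smallest dataset
    previous = baseline
    report = []
    for size in sizes:
        current = tps_by_rows[size]
        drop_prev = (previous - current) / previous * 100.0
        drop_base = (baseline - current) / baseline * 100.0
        report.append((size, drop_prev, drop_base))
        previous = current
    return report


if __name__ == "__main__":
    # Placeholder values only, NOT real benchmark results.
    placeholder = {8: 1000.0, 16: 900.0, 24: 810.0, 32: 800.0, 40: 790.0}
    for size, drop_prev, drop_base in slowdown_per_step(placeholder):
        print(f"{size}M rows: {drop_prev:.1f}% vs previous step, "
              f"{drop_base:.1f}% vs smallest dataset")
```

If the drop per step stays small, the single node's degradation alone would not account for the gap between the 2-node and 3-node gains, which is the point raised above.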
