Hi Jeff,

Regarding read ahead: how is this set for NVMe drives? I've sketched just below what I assume are the relevant commands; please correct me if that's wrong.

Also, below those commands are our compression settings for the table. They are the same as in the tests we are running against SAS SSDs, so I don't think the compression settings are the issue.
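For reference, here are the commands I am assuming apply, with /dev/nvme0n1 standing in for our actual drives, plus the cqlsh statement I would run if we do end up dropping the compression chunk size to 4 KB as you and Jon suggested:

# check current readahead (util-linux reports it in 512-byte sectors)
blockdev --getra /dev/nvme0n1
# same value in KB via sysfs
cat /sys/block/nvme0n1/queue/read_ahead_kb
# set readahead to 8 sectors (4 KB); 0 disables it entirely
blockdev --setra 8 /dev/nvme0n1

# drop the compression chunk size from 64 KB to 4 KB on the YCSB table
cqlsh -e "ALTER TABLE ycsb.usertable WITH compression = {'class': 'org.apache.cassandra.io.compress.LZ4Compressor', 'chunk_length_in_kb': 4};"

My understanding is that existing SSTables keep the old chunk size until they are rewritten (e.g. with nodetool upgradesstables -a), so I would plan on doing that before re-running the load.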
CREATE KEYSPACE ycsb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

CREATE TABLE ycsb.usertable (
    y_id text PRIMARY KEY,
    field0 text,
    field1 text,
    field2 text,
    field3 text,
    field4 text,
    field5 text,
    field6 text,
    field7 text,
    field8 text,
    field9 text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

Below is the TPS output from the YCSB benchmark:

DBWrapper: report latency for each error is false and specific error codes to track for latency are: []
2018-01-08 21:50:49:100 10 sec: 1048634 operations; 104863.4 current ops/sec; est completion in 15 minutes [INSERT: Count=1048634, Max=291071, Min=194, Avg=417.22, 90=463, 99=947, 99.9=5531, 99.99=136831]
2018-01-08 21:50:59:100 20 sec: 2159133 operations; 111049.9 current ops/sec; est completion in 15 minutes [INSERT: Count=1110545, Max=409087, Min=194, Avg=434.17, 90=450, 99=612, 99.9=3409, 99.99=294911]
2018-01-08 21:51:09:101 30 sec: 3092963 operations; 93383 current ops/sec; est completion in 15 minutes [INSERT: Count=933938, Max=460287, Min=193, Avg=511.4, 90=470, 99=750, 99.9=6055, 99.99=429823]
2018-01-08 21:51:19:100 40 sec: 4153712 operations; 106074.9 current ops/sec; est completion in 15 minutes [INSERT: Count=1060595, Max=388095, Min=194, Avg=434.08, 90=457, 99=604, 99.9=3261, 99.99=335103]
2018-01-08 21:51:29:100 50 sec: 5165150 operations; 101143.8 current ops/sec; est completion in 15 minutes [INSERT: Count=1011537, Max=419839, Min=189, Avg=488.41, 90=462, 99=666, 99.9=4057, 99.99=397823]
2018-01-08 21:51:39:100 60 sec: 6151282 operations; 98613.2 current ops/sec; est completion in 15 minutes [INSERT: Count=986033, Max=408575, Min=196, Avg=474.68, 90=467, 99=671, 99.9=5463, 99.99=375807]
2018-01-08 21:51:49:100 70 sec: 7171184 operations; 101990.2 current ops/sec; est completion in 15 minutes [INSERT: Count=1019962, Max=406783, Min=189, Avg=477.11, 90=468, 99=725, 99.9=4855, 99.99=364031]
2018-01-08 21:51:59:100 80 sec: 8154478 operations; 98329.4 current ops/sec; est completion in 15 minutes [INSERT: Count=983234, Max=391423, Min=188, Avg=473.52, 90=465, 99=653, 99.9=4751, 99.99=346623]
2018-01-08 21:52:09:100 90 sec: 9204270 operations; 104979.2 current ops/sec; est completion in 14 minutes [INSERT: Count=1049855, Max=366335, Min=194, Avg=465.83, 90=466, 99=690, 99.9=4207, 99.99=347391]
2018-01-08 21:52:19:100 100 sec: 10191251 operations; 98698.1 current ops/sec; est completion in 14 minutes [INSERT: Count=986982, Max=337663, Min=191, Avg=483.67, 90=466, 99=707, 99.9=5495, 99.99=323583]
2018-01-08 21:52:29:100 110 sec: 11118897 operations; 92764.6 current ops/sec; est completion in 14 minutes [INSERT: Count=927649, Max=324607, Min=195, Avg=514.77, 90=490, 99=798, 99.9=7939, 99.99=314111]
2018-01-08 21:52:39:100 120 sec: 12106226 operations; 98732.9 current ops/sec; est completion in 14 minutes [INSERT: Count=987327, Max=327423, Min=191, Avg=483.53, 90=475, 99=749, 99.9=6303, 99.99=291583]
2018-01-08 21:52:49:100 130 sec: 12406781 operations; 30055.5 current ops/sec; est completion in 15 minutes [INSERT: Count=300545, Max=2267135, Min=195, Avg=1594.21, 90=551, 99=1412, 99.9=268031, 99.99=2059263]
2018-01-08 21:52:59:100 140 sec: 12455737 operations; 4895.6 current ops/sec; est completion in 16 minutes [INSERT: Count=48901, Max=2637823, Min=208, Avg=8719.47, 90=570, 99=1775, 99.9=2435071, 99.99=2637823]
2018-01-08 21:53:09:100 150 sec: 12545132 operations; 8939.5 current ops/sec; est completion in 17 minutes [INSERT: Count=89395, Max=2040831, Min=196, Avg=5236.29, 90=603, 99=3103, 99.9=1419263, 99.99=2039807]
2018-01-08 21:53:19:100 160 sec: 12713856 operations; 16872.4 current ops/sec; est completion in 18 minutes [INSERT: Count=168724, Max=3260415, Min=201, Avg=3212.66, 90=505, 99=825, 99.9=1442815, 99.99=3256319]
2018-01-08 21:53:29:100 170 sec: 13014136 operations; 30028 current ops/sec; est completion in 18 minutes [INSERT: Count=300280, Max=3291135, Min=195, Avg=1398.45, 90=486, 99=722, 99.9=200575, 99.99=2809855]
2018-01-08 21:53:39:100 180 sec: 13212312 operations; 19817.6 current ops/sec; est completion in 19 minutes [INSERT: Count=198176, Max=1838079, Min=196, Avg=2409.91, 90=524, 99=841, 99.9=612863, 99.99=1628159]
2018-01-08 21:53:49:100 190 sec: 13498836 operations; 28652.4 current ops/sec; est completion in 20 minutes [INSERT: Count=286524, Max=2865151, Min=195, Avg=1851.54, 90=513, 99=824, 99.9=402175, 99.99=2654207]
2018-01-08 21:53:59:100 200 sec: 13616476 operations; 11764 current ops/sec; est completion in 21 minutes [INSERT: Count=117640, Max=1835007, Min=198, Avg=4156.37, 90=555, 99=1461, 99.9=1234943, 99.99=1829887]
2018-01-08 21:54:09:100 210 sec: 13810240 operations; 19376.4 current ops/sec; est completion in 21 minutes [INSERT: Count=193764, Max=1638399, Min=196, Avg=2159.51, 90=528, 99=1352, 99.9=814591, 99.99=1637375]
2018-01-08 21:54:19:100 220 sec: 14052024 operations; 24178.4 current ops/sec; est completion in 22 minutes [INSERT: Count=241784, Max=3465215, Min=192, Avg=2111.23, 90=479, 99=847, 99.9=221183, 99.99=2643967]
2018-01-08 21:54:29:100 230 sec: 14349241 operations; 29721.7 current ops/sec; est completion in 22 minutes [INSERT: Count=297272, Max=1814527, Min=197, Avg=1722.5, 90=541, 99=1400, 99.9=418815, 99.99=1233919]
2018-01-08 21:54:39:100 240 sec: 14495872 operations; 14663.1 current ops/sec; est completion in 23 minutes [INSERT: Count=146576, Max=2435071, Min=195, Avg=2692.02, 90=509, 99=881, 99.9=1121279, 99.99=2032639]
2018-01-08 21:54:49:100 250 sec: 14581928 operations; 8605.6 current ops/sec; est completion in 24 minutes [INSERT: Count=86056, Max=3651583, Min=203, Avg=6109.73, 90=578, 99=979, 99.9=1354751, 99.99=3651583]
2018-01-08 21:54:59:100 260 sec: 14647360 operations; 6543.2 current ops/sec; est completion in 25 minutes [INSERT: Count=65432, Max=2424831, Min=204, Avg=6781.02, 90=572, 99=2071, 99.9=1839103, 99.99=2422783]
2018-01-08 21:55:09:100 270 sec: 14688500 operations; 4114 current ops/sec; est completion in 26 minutes [INSERT: Count=41140, Max=3887103, Min=217, Avg=12254.66, 90=599, 99=6423, 99.9=2451455, 99.99=3678207]
2018-01-08 21:55:19:100 280 sec: 15060816 operations; 37231.6 current ops/sec; est completion in 26 minutes [INSERT: Count=372316, Max=2234367, Min=190, Avg=1347.23, 90=493, 99=833, 99.9=410111, 99.99=1830911]
2018-01-08 21:55:29:100 290 sec: 15148256 operations; 8744 current ops/sec; est completion in 27 minutes [INSERT: Count=87440, Max=2453503, Min=203, Avg=4990.8, 90=532, 99=820, 99.9=1429503, 99.99=2453503]
2018-01-08 21:55:39:100 300 sec: 15452601 operations; 30434.5 current ops/sec; est completion in 27 minutes [INSERT: Count=304345, Max=2049023, Min=191, Avg=1774.37, 90=497, 99=762, 99.9=400383, 99.99=2013183]
2018-01-08 21:55:49:100 310 sec: 15522064 operations; 6946.3 current ops/sec; est completion in 28 minutes [INSERT: Count=69463, Max=2836479, Min=209, Avg=6808.93, 90=617, 99=187519, 99.9=1024511, 99.99=2433023]
2018-01-08 21:55:59:100 320 sec: 15589351 operations; 6728.7 current ops/sec; est completion in 28 minutes [INSERT: Count=67367, Max=2637823, Min=200, Avg=7380.59, 90=574, 99=1152, 99.9=2209791, 99.99=2637823]
2018-01-08 21:56:09:100 330 sec: 15691979 operations; 10262.8 current ops/sec; est completion in 29 minutes [INSERT: Count=102601, Max=3438591, Min=205, Avg=4675.01, 90=560, 99=1179, 99.9=995839, 99.99=3438591]
2018-01-08 21:56:19:100 340 sec: 15762632 operations; 7065.3 current ops/sec; est completion in 30 minutes [INSERT: Count=70669, Max=3080191, Min=205, Avg=6789.43, 90=570, 99=1582, 99.9=2453503, 99.99=3078143]
2018-01-08 21:56:29:100 350 sec: 15864184 operations; 10155.2 current ops/sec; est completion in 30 minutes [INSERT: Count=101483, Max=2232319, Min=195, Avg=3720.85, 90=550, 99=967, 99.9=1419263, 99.99=1840127]
2018-01-08 21:56:39:101 360 sec: 15884476 operations; 2029.2 current ops/sec; est completion in 31 minutes [INSERT: Count=20292, Max=3004415, Min=234, Avg=24097.18, 90=669, 99=1005055, 99.9=2998271, 99.99=3004415]
2018-01-08 21:56:49:100 370 sec: 15904064 operations; 1958.8 current ops/sec; est completion in 32 minutes [INSERT: Count=19588, Max=3647487, Min=250, Avg=28360.33, 90=676, 99=1220607, 99.9=3485695, 99.99=3643391]
2018-01-08 21:56:59:100 380 sec: 15922500 operations; 1843.6 current ops/sec; est completion in 33 minutes [INSERT: Count=18436, Max=3223551, Min=230, Avg=25792.42, 90=675, 99=814079, 99.9=3215359, 99.99=3223551]
2018-01-08 21:57:09:100 390 sec: 15958808 operations; 3630.8 current ops/sec; est completion in 34 minutes [INSERT: Count=36308, Max=3637247, Min=211, Avg=13116.93, 90=631, 99=163967, 99.9=3219455, 99.99=3635199]
2018-01-08 21:57:19:100 400 sec: 16037208 operations; 7840 current ops/sec; est completion in 34 minutes [INSERT: Count=78400, Max=3844095, Min=205, Avg=4716.95, 90=533, 99=875, 99.9=1020415, 99.99=3839999]

com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)

Any insight would be very helpful.

Thank you,
Justin Sanciangco

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Friday, January 5, 2018 5:50 PM
To: user@cassandra.apache.org
Subject: Re: NVMe SSD benchmarking with Cassandra

Second the note about compression chunk size in particular.

--
Jeff Jirsa

On Jan 5, 2018, at 5:48 PM, Jon Haddad <j...@jonhaddad.com> wrote:

Generally speaking, disable readahead. After that, it's very likely the issue isn't in the disk settings you're using, but is actually in your Cassandra config or the data model. How are you measuring things? Are you saturating your disks? What resource is your bottleneck? *Every* single time I've handled a question like this, without exception, it ends up being a mix of incorrect compression settings (use 4K at most), some crazy readahead setting like 1MB, and terrible JVM settings that are the bulk of the problem.
Without knowing how you are testing things, or *any* metrics whatsoever (whether C* or OS), it's going to be hard to help you out.

Jon

On Jan 5, 2018, at 5:41 PM, Justin Sanciangco <jsancian...@blizzard.com> wrote:

Hello,

I am currently benchmarking NVMe SSDs with Cassandra and am getting very bad performance when my workload exceeds the memory size. What mount settings should be used for NVMe? Right now the SSD is formatted as XFS using the noop scheduler. Are there any additional mount options that should be used? Are there any specific kernel parameters that should be set in order to make the best use of the PCIe NVMe SSD?

Your insight would be much appreciated.

Thank you,
Justin Sanciangco
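To make the question concrete, these are the kinds of knobs I mean (the device name nvme0n1 and the mount point below are placeholders, not necessarily our actual layout):

# show the current I/O scheduler; on blk-mq kernels the choices are typically none, mq-deadline, kyber, bfq
cat /sys/block/nvme0n1/queue/scheduler
# "none" is the blk-mq equivalent of the legacy noop scheduler
echo none > /sys/block/nvme0n1/queue/scheduler
# example XFS mount with noatime; device and mount point are placeholders
mount -o noatime /dev/nvme0n1p1 /var/lib/cassandra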