Re: Digest mismatch

2020-12-02 Thread Erick Ramirez
> > Thank you Steve - once I have the key, how do I get to a node?

Run this command to determine which replicas own the partition:

$ nodetool getendpoints

> So if the propagation has not taken place and a node doesn't have the data
> and is the first to 'be asked' the client will get no data?
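For illustration, a minimal sketch of that lookup, assuming hypothetical keyspace/table names and using the hex key from the DecoratedKey line shown later in this thread (nodetool getendpoints takes the keyspace, table, and partition key):

import binascii
import subprocess

# Hex string from a DecoratedKey(...) log line, as seen in this thread.
key_hex = "48343732322d3838353032"
partition_key = binascii.unhexlify(key_hex).decode()   # -> 'H4722-88502'

# Hypothetical keyspace/table names -- substitute your own.
keyspace, table = "my_keyspace", "my_table"

# nodetool getendpoints <keyspace> <table> <key> prints the replica IPs.
result = subprocess.run(
    ["nodetool", "getendpoints", keyspace, table, partition_key],
    capture_output=True, text=True, check=True,
)
print(result.stdout)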

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
The IOPS dropped with the drop in read I/O throughput. The cassandra reads and network sent/recvd are the same. We also did not adjust our heap size at 2x. Between cachestats and thinking about how mmap works (probably doesn't formally access files in a way inotify monitors would detect) and given o

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Jeff Jirsa
What isn't clear is whether or not the slow IO is a result of doing far fewer IO and serving from RAM or if slow IO is slow IO on same number of reads/second. I assume you're doing far fewer IO, and the slowness is a sampling error. Do you know how many ops/second you're reading from each disk? Cl
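One rough way to get that number, sketched with the third-party psutil package (an assumption; iostat -x or CloudWatch would do the same job) by sampling per-disk read counters over an interval:

import time
import psutil   # third-party; pip install psutil

INTERVAL = 10  # seconds between samples

before = psutil.disk_io_counters(perdisk=True)
time.sleep(INTERVAL)
after = psutil.disk_io_counters(perdisk=True)

for disk, stats in after.items():
    reads = stats.read_count - before[disk].read_count
    read_bytes = stats.read_bytes - before[disk].read_bytes
    print(f"{disk}: {reads / INTERVAL:.1f} read ops/s, "
          f"{read_bytes / INTERVAL / 1024:.1f} KiB/s")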

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
Linux has crappy instrumentation on the file cache. I tried cachestats from perf-tools; it is producing negative numbers for cache hits on the 2x. If the files are mmap'd, would that bypass any inotify detection when a file access occurs, aka a page fault? I'm guessing yes. On Wed, Dec 2, 2020 at

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Erick Ramirez
From C* 2.2 onwards, SSTables get mapped to memory by mmap() so the hot data will be accessed much faster on systems with more RAM. On Thu, 3 Dec 2020 at 09:57, Carl Mueller wrote: > I agree in theory, I just want some way to confirm that file accesses in > the larger instance are being interce

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
I agree in theory, I just want some way to confirm that file accesses in the larger instance are being intercepted by the file cache, vs what is happening in the other case. I've tried Amy Tobey's pcstat. I'd assume the 2x would have a file cache with lots of partially cached files, churn in the
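For what it's worth, a pcstat-style residency check can be sketched directly against mincore(2) via ctypes (Linux-only; the SSTable paths passed on the command line are an assumption). It reports what fraction of each file's pages are currently in the page cache:

import ctypes
import ctypes.util
import os
import sys

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                         ctypes.POINTER(ctypes.c_ubyte)]

PAGE = os.sysconf("SC_PAGE_SIZE")
PROT_READ, MAP_SHARED = 0x1, 0x1
MAP_FAILED = ctypes.c_void_p(-1).value

def cached_fraction(path):
    """Fraction of the file's pages currently resident in the page cache."""
    size = os.path.getsize(path)
    if size == 0:
        return 1.0
    fd = os.open(path, os.O_RDONLY)
    try:
        addr = libc.mmap(None, size, PROT_READ, MAP_SHARED, fd, 0)
        if addr in (None, MAP_FAILED):
            raise OSError(ctypes.get_errno(), "mmap failed")
        pages = (size + PAGE - 1) // PAGE
        vec = (ctypes.c_ubyte * pages)()
        if libc.mincore(ctypes.c_void_p(addr), size, vec) != 0:
            raise OSError(ctypes.get_errno(), "mincore failed")
        libc.munmap(ctypes.c_void_p(addr), size)
        return sum(v & 1 for v in vec) / pages
    finally:
        os.close(fd)

for p in sys.argv[1:]:
    print(f"{p}: {cached_fraction(p):.1%} in page cache")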

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
Heap is the normal proportion, probably 1/2 of RAM. So there definitely will be more non-heap memory for file caching. The Amy Tobey utility does not show churn in the file cache, however; if there really were almost three orders of magnitude difference in the amount of disk access, I would expect churn in the OS

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Jeff Jirsa
This is exactly what I would expect when you double the memory and all of the data lives in page cache. On Wed, Dec 2, 2020 at 8:41 AM Carl Mueller wrote: > Oh, this is cassandra 2.2.13 (multi tenant delays) and ubuntu 18.04. > > On Wed, Dec 2, 2020 at 10:35 AM Carl Mueller > wrote: > >> We ha

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Elliott Sims
Is the heap larger on the M5.4x instance? Are you sure it's Cassandra generating the read traffic vs just evicting files read by other systems? In general, I'd call "more RAM means fewer drive reads" a very expected result regardless of the details, especially when it's the difference between fitt

Re: Digest mismatch

2020-12-02 Thread Joe Obernberger
Thank you Steve - once I have the key, how do I get to a node? After reading some of the documentation, it looks like the load-balancing-policy below *is* a token aware policy.  Perhaps writes need to be done with QUORUM; I don't know how long Cassandra will take to make sure replicas are cons
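On the QUORUM idea: if both writes and reads use QUORUM (so R + W > RF), every read overlaps at least one replica that acknowledged the write, and visibility no longer depends on read repair catching up. The thread uses the Java driver, but purely as an illustration, a sketch with the Python driver (keyspace, table, and values are hypothetical):

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["172.16.110.3"])        # one of the contact points from the thread
session = cluster.connect("my_keyspace")   # hypothetical keyspace

# Write and read at QUORUM so the read overlaps at least one replica
# that acknowledged the write (R + W > RF).
insert = SimpleStatement(
    "INSERT INTO my_table (id, val) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(insert, ("H4722-88502", "example value"))

select = SimpleStatement(
    "SELECT val FROM my_table WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
print(session.execute(select, ("H4722-88502",)).one())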

Re: Digest mismatch

2020-12-02 Thread Steve Lacerda
If you can determine the key, then you can determine which nodes do and do not have the data. You may be able to glean a bit more information like that, maybe one node is having problems, versus entire cluster. On Wed, Dec 2, 2020 at 9:32 AM Joe Obernberger wrote: > Clients are using an applicat

Re: Digest mismatch

2020-12-02 Thread Joe Obernberger
Clients are using an application.conf like:

datastax-java-driver {
  basic.request.timeout = 60 seconds
  basic.request.consistency = ONE
  basic.contact-points = ["172.16.110.3:9042", "172.16.110.4:9042", "172.16.100.208:9042", "172.16.100.224:9042", "172.16.100.225:9042", "172.16.100.253:9042

Re: Digest mismatch

2020-12-02 Thread Joe Obernberger
Python eh? What's that? Kidding. (Java guy over here...) I grepped the logs for mutations but only see messages like:

2020-09-14 16:15:19,963 CommitLog.java:149 - Log replay complete, 0 replayed mutations

and

2020-09-17 16:22:13,020 CommitLog.java:149 - Log replay complete, 291708 replayed

Re: Network Bandwidth and Multi-DC replication

2020-12-02 Thread Jens Fischer
Hi, I checked for all the other factors mentioned - anti-entropy repair, hints, read repair - and I still see outgoing cross-DC traffic of ~2x the “write size A” (as defined below). Given Jeff's answers this is not to be expected, i.e. there is something wrong here. Does anybody have an idea how to
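As a back-of-envelope check (all numbers hypothetical, and assuming the usual behavior where the local coordinator forwards each mutation once per remote DC and a node there fans it out locally), expected inter-DC traffic is roughly write volume times the number of remote DCs:

writes_per_sec = 10_000        # hypothetical
avg_mutation_bytes = 1_500     # hypothetical
remote_dcs = 1                 # hypothetical

expected_bytes_per_sec = writes_per_sec * avg_mutation_bytes * remote_dcs
print(f"expected cross-DC replication: {expected_bytes_per_sec / 1e6:.1f} MB/s")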

Re: Digest mismatch

2020-12-02 Thread Carl Mueller
Are you using a token aware policy for the driver? If your writes are at ONE and your reads are at ONE, the propagation may not have happened depending on the coordinator that is used. TokenAware will make that a bit better. On Wed, Dec 2, 2020 at 11:12 AM Joe Obernberger < joseph.obernber...@gmail.com>
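For illustration only, here is what token awareness looks like with the Python driver (the thread itself uses the Java driver, whose configured policy already appears to be token aware); the wrapper routes each request to a replica for the partition instead of an arbitrary coordinator:

from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

# Wrap the child policy so requests for a partition go directly to a replica.
profile = ExecutionProfile(
    load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy())
)
cluster = Cluster(["172.16.110.3"],
                  execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect()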

Re: Digest mismatch

2020-12-02 Thread Steve Lacerda
The digest mismatch typically shows the partition key info, with something like this:

DecoratedKey(-1671292413668442751, 48343732322d3838353032)

That refers to the partition key, which you can gather like so:

python
import binascii
binascii.unhexlify('48343732322d3838353032')
'H4722-88502'

My a

Re: Digest mismatch

2020-12-02 Thread Joe Obernberger
Hi Carl - thank you for replying. I am using Cassandra 3.11.9-1. Rows are not typically being deleted - I assume you're referring to tombstones. I don't think that should be the case here as I don't think we've deleted anything. This is a test cluster and some of the machines are small (he

Re: Digest mismatch

2020-12-02 Thread Carl Mueller
Why is one of your nodes only at 14.6% ownership? That's weird, unless you have a small rowcount. Are you frequently deleting rows? Are you frequently writing rows at ONE? What version of cassandra? On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger wrote: > Hi All - this is my first post here.

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
Oh, this is cassandra 2.2.13 (multi tenant delays) and ubuntu 18.04. On Wed, Dec 2, 2020 at 10:35 AM Carl Mueller wrote: > We have a cluster that is experiencing very high disk read I/O in the > 20-40 MB/sec range on m5.2x (gp2 drives). This is verified via VM metrics > as well as iotop. > > Whe

Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
We have a cluster that is experiencing very high disk read I/O in the 20-40 MB/sec range on m5.2x (gp2 drives). This is verified via VM metrics as well as iotop. When we switch to m5.4x it drops to 60 KB/sec. There is no difference in network send/recv or read/write request counts. The graph for read

Digest mismatch

2020-12-02 Thread Joe Obernberger
Hi All - this is my first post here.  I've been using Cassandra for several months now and am loving it.  We are moving from Apache HBase to Cassandra for a big data analytics platform. I'm using java to get rows from Cassandra and very frequently get a java.util.NoSuchElementException when it

Antwort: Increased read latency with Cassandra >= 3.11.7

2020-12-02 Thread Nicolai Lune Vest
Hi, we performed a test run with 3.11.9. Unfortunately we see no difference; we still experience increased read latency on some of our tables. Do others see the same or similar behavior and perhaps have a solution? Kind regards, Nico -"Nicolai Lune Vest" wrote: - An