Hello Rahul,
Great compilation of resources.
Maybe add this one to the Blogs category?
https://lostechies.com/ryansvihla/tags
This one is also quite good, I would say:
https://academy.datastax.com/support-blog/deeper-dive-diagnosing-dse-performance-issues-ttop-and-multidump
And since now ther
Thank you Jon, great article as usual!
One topic that was discussed in the article is filesystem cache which is
traditionally leveraged for data caching in Cassandra (with row-caching
disabled by default).
IIRC mmap() is used.
Some RDBMSs and NoSQL DBs also use direct I/O + async I/O + m
DataStax Enterprise 6.0 has a new bulk loader tool. DSE is a commercial
product, but it may be worth investigating for your needs.
Sean Durity
From: Rahul Singh
Sent: Tuesday, August 07, 2018 9:37 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: ETL options from Hive/Presto/s3 to cassa
Deflate instead of LZ4 will probably give you somewhat better compression
at the cost of a lot of CPU. Larger chunk length might also help, but in
most cases you probably won't see much benefit above 64K (and it will
increase I/O load).
On Wed, Aug 8, 2018 at 11:18 PM, Eunsu Kim wrote:
> Hi all
Agreed about deflate.
Also you can adjust your chunk size, which may help ratios as well,
especially if you expect your data to compress well - often larger chunks
will compress better, but it depends on the nature of your data.
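To get a rough feel for the chunk-size effect described above, here is a minimal sketch. It is not Cassandra's code: it uses Python's zlib (a Deflate implementation) to compress a made-up, repetitive sample in independent chunks, which only approximates how a chunked compressor behaves. The sample data, the function name and the chunk sizes are all assumptions for illustration; real ratios depend entirely on your data.

```python
import zlib


def chunked_ratio(data: bytes, chunk_size: int, level: int = 6) -> float:
    """Compress data in independent chunks; return original size / compressed size."""
    if not data:
        raise ValueError("data must not be empty")
    compressed = 0
    for offset in range(0, len(data), chunk_size):
        # Each chunk is compressed on its own, so no history is shared between chunks.
        compressed += len(zlib.compress(data[offset:offset + chunk_size], level))
    return len(data) / compressed


# Hypothetical sample: repetitive, log-like rows (invented for this sketch).
sample = b"".join(
    b"sensor-%04d,2018-08-08T23:18:00Z,status=OK,value=%d\n" % (i % 50, i * 7 % 1000)
    for i in range(40000)
)

if __name__ == "__main__":
    for kb in (4, 16, 64, 256):
        print(f"{kb:>3} KB chunks: ratio {chunked_ratio(sample, kb * 1024):.2f}")
```

Running it against a sample of your own data (instead of the invented rows) should show whether going above 64K buys anything. Keep in mind that a larger chunk also means more data to read and decompress per lookup.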
In the near future, look for work from Sushma @ Instagram to make av
There's a discussion about direct I/O here you might find interesting:
https://issues.apache.org/jira/browse/CASSANDRA-14466
I suspect the main reason is that O_DIRECT wasn't added till Java 10, and
while it could be used with some workarounds, there's a lot of entropy
around changing something li
I don't have any external process or planned repair in that time period.
As for the network, I can see outbound traffic on the Cassandra node's
network interface, but I couldn't find any way to check at the VPC level
that it is not leaving the network. Maybe the only way is to analyse the
VPC Flow Logs.
B.
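If it does come down to the flow logs, a small script can do the filtering. The sketch below assumes the default (version 2) flow log record layout with space-separated fields; a custom log format would need a different field list. The function name, CIDR, addresses and sample records are all made up for illustration.

```python
import ipaddress

# Field order of the default (version 2) flow log record; an assumption
# that does not hold if a custom log format was configured.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def bytes_leaving_vpc(lines, vpc_cidr, node_ip):
    """Sum bytes sent by node_ip to destinations outside vpc_cidr, per destination."""
    vpc = ipaddress.ip_network(vpc_cidr)
    totals = {}
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # not a default-format record
        rec = dict(zip(FIELDS, parts))
        if rec["log_status"] != "OK" or rec["srcaddr"] != node_ip:
            continue  # skip NODATA/SKIPDATA records and traffic from other sources
        if ipaddress.ip_address(rec["dstaddr"]) not in vpc:
            totals[rec["dstaddr"]] = totals.get(rec["dstaddr"], 0) + int(rec["bytes"])
    return totals


# Invented sample records, for illustration only.
sample_lines = [
    "2 123456789012 eni-0abc 10.0.1.15 10.0.2.20 7000 51234 6 10 8400 1533770000 1533770060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.15 203.0.113.9 44321 443 6 8 5200 1533770000 1533770060 ACCEPT OK",
    "2 123456789012 eni-0abc - - - - - - - 1533770000 1533770060 - NODATA",
    "2 123456789012 eni-0abc 10.0.2.20 10.0.1.15 51234 7000 6 9 7100 1533770000 1533770060 ACCEPT OK",
]

if __name__ == "__main__":
    print(bytes_leaving_vpc(sample_lines, "10.0.0.0/16", "10.0.1.15"))
```

An empty result for the time window in question would suggest the node's outbound traffic stayed inside the VPC range, at least as far as the captured records go.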