Re: Problems with large partitions and compaction

2017-02-15 Thread Dan Kinder
m to happening. And most of the failures > I am seeing are on reads, but for an entirely different table. Lastly, > has anyone had success switching to STCS in this situation as a > workaround? > > Thanks > > - John
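For reference, the STCS switch being asked about is a one-line schema change; a minimal sketch, assuming a hypothetical table myks.mytable:

    ALTER TABLE myks.mytable
      WITH compaction = {'class': 'SizeTieredCompactionStrategy'};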

Commitlog still replaying after drain && shutdown

2015-06-30 Thread Dan Kinder
Hi all, To quote Sebastian Estevez in one recent thread: "You said you ran a nodetool drain before the restart, but your logs show commitlogs replayed. That does not add up..." The docs seem to generally agree with this: if you did `nodetool drain` before restarting your node there shouldn't be an
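The sequence under discussion, sketched with an assumed service name (adjust for your init system):

    nodetool drain                  # flush memtables, stop accepting writes
    sudo service cassandra stop     # or: systemctl stop cassandra
    # after restarting, check whether any commitlog was replayed anyway:
    grep -i replay /var/log/cassandra/system.log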

Re: Overwhelming tombstones with LCS

2015-07-10 Thread Dan Kinder
h time and headroom (this is going to do some pretty serious compaction so be careful), alter your table to STCS, let it compact into one SSTable, then convert back to LCS. It's pretty heavy-handed but as long as your gc_grace is low enough it'll do the job. Definitely do NOT do this i
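The round trip described here, as a rough sketch (myks.mytable is a placeholder; requires a low enough gc_grace and plenty of disk headroom):

    ALTER TABLE myks.mytable WITH compaction = {'class': 'SizeTieredCompactionStrategy'};
    -- then, from a shell: nodetool compact myks mytable
    ALTER TABLE myks.mytable WITH compaction = {'class': 'LeveledCompactionStrategy'};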

future very wide row support

2015-08-31 Thread Dan Kinder
Hi, My understanding is that wide row support (i.e. many columns/CQL-rows/cells per partition key) has gotten much better in the past few years; even though the theoretical limit of 2 billion has long been much higher than what is practical, it seems like now Cassandra is able to handle these better

memtable flush size with LCS

2015-10-27 Thread Dan Kinder
Hi all, The docs indicate that memtables are triggered to flush when data in the commitlog is expiring or based on memtable_flush_period_in_ms. But LCS has a specified sstable size; when using LCS are memtables flushed when they hit the desired sstable size (default 160MB) or could L0 sstables be

Re: memtable flush size with LCS

2015-10-27 Thread Dan Kinder
reference/compactSubprop.html?scroll=compactSubprop__compactionSubpropertiesLCS > for more details. > > On Tue, Oct 27, 2015 at 3:42 PM, Dan Kinder wrote: > > > > Hi all, > > > > The docs indicate that memtables are triggered to flush when data in the > commit

Re: memtable flush size with LCS

2015-11-02 Thread Dan Kinder
er than sstable_size_in_mb? >> >> Yes, 'sstable_size_in_mb' plays no part in the flush process. Flushing >> is based solely on runtime activity, and the file size is determined by >> whatever was in the memtable at that time.
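For context, sstable_size_in_mb is an LCS compaction subproperty, separate from anything flush-related; a sketch with placeholder names:

    ALTER TABLE myks.mytable WITH compaction = {
      'class': 'LeveledCompactionStrategy',
      'sstable_size_in_mb': '160'   -- target size for compacted SSTables only
    };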

compression cpu overhead

2015-11-03 Thread Dan Kinder
Hey all, Just wondering if anyone has seen or done any benchmarking for the actual CPU overhead added by various compression algorithms in Cassandra (at least LZ4) vs no compression. Clearly this is going to be workload dependent but even a rough gauge would be helpful (ex. "Turning on LZ4 co
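For reference, compression is toggled per table; a sketch with a placeholder table (3.0+ uses 'class', earlier releases use 'sstable_compression'):

    ALTER TABLE myks.mytable WITH compression = {'class': 'LZ4Compressor'};
    -- and to disable it for comparison:
    ALTER TABLE myks.mytable WITH compression = {'enabled': 'false'};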

Re: compression cpu overhead

2015-11-03 Thread Dan Kinder
should make some difference, I didn’t immediately find perf numbers though. > > On Nov 3, 2015, at 5:42 PM, Dan Kinder wrote: > > Hey all, > > Just wondering if anyone has seen or done any benchmarking for the > actual CPU overhead added by various compression algorith

Re: compression cpu overhead

2015-11-04 Thread Dan Kinder
u > with LZ4 vs no compression. It would be different for different h/w > configurations. > > > Thanks, > Tushar > (Sent from iPhone) > > On Nov 3, 2015, at 5:51 PM, Dan Kinder wrote: > > Most concerned about write since that's where most of the cost is, but >

Re: Production with Single Node

2016-01-22 Thread Dan Kinder
such a configuration? >>> >>> The virtual disk would be running RAID 5 and the disk controller would >>> have a flash backed write-behind cache. >>> >>> What's the best way to configure Cassandra and/or respecify the hardware >>> for an all-in-one-box solution? >>> >>> Thanks-in-advance! >>> >>> --John

MemtableReclaimMemory pending building up

2016-03-02 Thread Dan Kinder
Hi y'all, I am writing to a cluster fairly fast and seeing this odd behavior happen, seemingly to single nodes at a time. The node starts to take more and more memory (instance has 48GB memory on G1GC). tpstats shows that MemtableReclaimMemory Pending starts to grow first, then later MutationStage
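One way to watch the buildup described here, assuming shell access to the affected node:

    watch -n 5 'nodetool tpstats | egrep "MemtableReclaimMemory|MutationStage"'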

Re: MemtableReclaimMemory pending building up

2016-03-02 Thread Dan Kinder
Also should note: Cassandra 2.2.5, Centos 6.7 On Wed, Mar 2, 2016 at 1:34 PM, Dan Kinder wrote: > Hi y'all, > > I am writing to a cluster fairly fast and seeing this odd behavior happen, > seemingly to single nodes at a time. The node starts to take more and more > memor

Re: MemtableReclaimMemory pending building up

2016-03-04 Thread Dan Kinder
management and also trying to push Cassandra limits by increasing default > values as you seem to have resources available, to make sure Cassandra can > cope with the high throughput. Pending operations = high memory pressure. > Reducing pending stuff somehow will probably get you out of troubl

Re: MemtableReclaimMemory pending building up

2016-03-08 Thread Dan Kinder
>> /var/log/cassandra/system.log" >> > > 435 incoming connections, only warning is compaction of some large > partitions. > > >> >> As a small conclusion I would have an eye on things related to the memory >> management and also trying to push Cassandr

Re: Cassandra Golang Driver and Support

2016-04-14 Thread Dan Kinder
Just want to put a plug in for gocql and the guys who work on it. I use it for production applications that sustain ~10,000 writes/sec on an 8 node cluster and in the few times I have seen problems they have been responsive on issues and pull requests. Once or twice I have seen the API change but o

[no subject]

2017-09-28 Thread Dan Kinder
Hi, I recently upgraded our 16-node cluster from 2.2.6 to 3.11 and see the following. The cluster does function, for a while, but then some stages begin to back up and the node does not recover and does not drain the tasks, even under no load. This happens both to MutationStage and GossipStage. I

Re:

2017-09-28 Thread Dan Kinder
I should also note, I also see nodes become locked up without seeing that Exception. But the GossipStage buildup does seem correlated with gossip activity, e.g. me restarting a different node. On Thu, Sep 28, 2017 at 9:20 AM, Dan Kinder wrote: > Hi, > > I recently upgraded our 16-nod

Re:

2017-09-28 Thread Dan Kinder
> Dan, > > > > do you see any major GC? We have been hit by the following memory leak in > our loadtest environment with 3.11.0. > > https://issues.apache.org/jira/browse/CASSANDRA-13754 > > > > So, depending on the heap size and uptime, you might get into

Re:

2017-09-28 Thread Dan Kinder
HINT              0
MUTATION          0
COUNTER_MUTATION  0
BATCH_STORE       0
BATCH_REMOVE      0
REQUEST_RESPONSE  0
PAGED_RANGE       0
READ_REPAIR       0
On Thu, Sep 28, 2017 at 2:08 PM, Dan Kinder

Re:

2017-10-02 Thread Dan Kinder
Right, I just meant that calling it at all results in holding a read lock, which unfortunately is blocking these read threads. On Mon, Oct 2, 2017 at 11:40 AM, Jeff Jirsa wrote: > > > On Mon, Oct 2, 2017 at 11:27 AM, Dan Kinder wrote: > >> (As a side note, it s

Re:

2017-10-02 Thread Dan Kinder
Sure will do. On Mon, Oct 2, 2017 at 11:48 AM, Jeff Jirsa wrote: > You're right, sorry I didn't read the full stack (gmail hid it from me) > > Would you open a JIRA with your stack traces, and note (somewhat loudly) > that this is a regression? > > > On Mon, Oct 2, 20

Re:

2017-10-02 Thread Dan Kinder
Created https://issues.apache.org/jira/browse/CASSANDRA-13923 On Mon, Oct 2, 2017 at 12:06 PM, Dan Kinder wrote: > Sure will do. > > On Mon, Oct 2, 2017 at 11:48 AM, Jeff Jirsa wrote: > >> You're right, sorry I didn't read the full stack (gmail hid it from me) >>

LCS major compaction on 3.2+ on JBOD

2017-10-05 Thread Dan Kinder
Hi, I am wondering how major compaction behaves for a table using LCS on JBOD with Cassandra 3.2+'s JBOD improvements. Before 3.2, as I understand it, major compaction would use a single thread, include all SSTables in a single compaction, and spit out a bunch of SSTables in the appropriate levels. Does 3.2+

Setting min_index_interval to 1?

2018-02-01 Thread Dan Kinder
Hi, I have an unusual case here: I'm wondering what will happen if I set min_index_interval to 1. Here's the logic. Suppose I have a table where I really want to squeeze as many reads/sec out of it as possible, and where the row data size is much larger than the keys. E.g. the keys are a few bytes
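The setting in question is a per-table schema option; a sketch with a placeholder table:

    ALTER TABLE myks.mytable WITH min_index_interval = 1;

With min_index_interval = 1 every partition key is sampled into the index summary, which is exactly the memory-for-latency trade the thread goes on to discuss.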

storing indexes on ssd

2018-02-10 Thread Dan Kinder
Hi, We're optimizing Cassandra right now for fairly random reads on a large dataset. In this dataset, the values are much larger than the keys. I was wondering, is it possible to have Cassandra write the *index* files (*-Index.db) to one drive (SSD), but write the *data* files (*-Data.db) to anoth

Re: Setting min_index_interval to 1?

2018-02-12 Thread Dan Kinder
like you understand the potential impact > on memory and startup time. If you have the data in such a way that you can > easily experiment, I would like to see a breakdown of the impact on > response time vs. memory usage as well as where the point of diminishing > returns is on turni

Re: storing indexes on ssd

2018-02-12 Thread Dan Kinder
Created https://issues.apache.org/jira/browse/CASSANDRA-14229 On Mon, Feb 12, 2018 at 12:10 AM, Mateusz Korniak < mateusz-li...@ant.gliwice.pl> wrote: > On Saturday 10 of February 2018 23:09:40 Dan Kinder wrote: > > We're optimizing Cassandra right now for fairly ran

Re: storing indexes on ssd

2018-02-13 Thread Dan Kinder
> On Tue, Feb 13, 2018 at 1:30 AM, Dan Kinder wrote: > >> Created https://issues.apache.org/jira/browse/CASSANDRA-14229 >> > > This is confusing. You've already started the conversation here... > > How big are your index files in the end? Even if Cassandra do

large range read in Cassandra

2014-11-24 Thread Dan Kinder
Hi, We have a web crawler project currently based on Cassandra ( https://github.com/iParadigms/walker, written in Go and using the gocql driver), with the following relevant usage pattern: - Big range reads over a CF to grab potentially millions of rows and dispatch new links to crawl - Fast inse

Re: large range read in Cassandra

2014-11-25 Thread Dan Kinder
we need to start using Hive/Spark/Pig etc. sooner, or page it manually using LIMIT and WHERE > [the last returned result]. On Mon, Nov 24, 2014 at 5:49 PM, Robert Coli wrote: > On Mon, Nov 24, 2014 at 4:26 PM, Dan Kinder wrote: > >> We have a web crawler project currently base
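The manual paging approach mentioned here, sketched with placeholder column names:

    -- first page
    SELECT thing, data FROM myks.mytable LIMIT 1000;
    -- each subsequent page starts after the last partition returned
    SELECT thing, data FROM myks.mytable
      WHERE token(thing) > token('last-thing-returned') LIMIT 1000;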

Re: large range read in Cassandra

2014-11-25 Thread Dan Kinder
Thanks, very helpful Rob, I'll watch for that. On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli wrote: > On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder wrote: > >> To be clear, I expect this range query to take a long time and perform >> relatively heavy I/O. What I expected

STCS limitation with JBOD?

2015-01-02 Thread Dan Kinder
Hi, Forcing a major compaction (using nodetool compact) with STCS will result in a single sstable (ignoring repair data). However this seems like it could be a problem for large JBOD setups. For example, if I have 1
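The command in question, with placeholder keyspace/table names:

    nodetool compact myks mytable    # major compaction: with STCS, one SSTable per node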

Re: STCS limitation with JBOD?

2015-01-06 Thread Dan Kinder
ur reason >>> for doing that? >>> >> >> I'd say "often" and not "usually". Lots of people have schema where they >> create way too much garbage, and major compaction can be a good response. >> The docs' historic incoherent FUD notwithstanding. >> >> =Rob >> >> > > -- > > Thanks, > Ryan Svihla

Re: large range read in Cassandra

2015-02-02 Thread Dan Kinder
number seems to have done the trick. On Tue, Nov 25, 2014 at 2:54 PM, Dan Kinder wrote: > Thanks, very helpful Rob, I'll watch for that. > > On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli > wrote: > >> On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder >> wrote: >>

Less frequent flushing with LCS

2015-02-27 Thread Dan Kinder
Hi all, We have a table in Cassandra where we frequently overwrite recent inserts. Compaction does a fine job with this but ultimately larger memtables would reduce compactions. The question is: can we make Cassandra use larger memtables and flush less frequently? What currently triggers the flus
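The flush-related knobs relevant here live in cassandra.yaml; a sketch with illustrative values, not recommendations:

    memtable_heap_space_in_mb: 4096       # larger memtables, fewer flushes
    commitlog_total_space_in_mb: 16384    # flushes are also forced as this fills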

Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Hey all, I had been having the same problem as in those older post: http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E To summarize it, on my local box with just one cassandra node I can update and then s

Re: Less frequent flushing with LCS

2015-03-02 Thread Dan Kinder
s and then not be touched for a while. On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli wrote: > On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder wrote: > >> Theoretically sstable_size_in_mb could be causing it to flush (it's at >> the default 160MB)... though we are flushing

Re: Less frequent flushing with LCS

2015-03-02 Thread Dan Kinder
es are flushed. > > Thanks, > Daniel > > On Mon, Mar 2, 2015 at 11:49 AM, Dan Kinder wrote: > >> I see, thanks for the input. Compression is not enabled at the moment, >> but I may try increasing that number regardless. >> >> Also I don't think in-memory t

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Done: https://issues.apache.org/jira/browse/CASSANDRA-8892 On Mon, Mar 2, 2015 at 3:26 PM, Robert Coli wrote: > On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder wrote: > >> I had been having the same problem as in those older post: >> http://mail-archives.apache.org/mod_mbox/cas

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
confirm that the gocql was in > fact sending the "write 100" query and then on the next read Cassandra > responded with "99". > > I'll be interested to see what the result of the jira ticket is. > > -psanford

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-03 Thread Dan Kinder
45840296000, ts:1425345840296000) > > > It looks like we're only getting millisecond precision instead of > microsecond for the column timestamps?! If you explicitly set the timestamp > value when you do the insert, you can get actual microsecond precision and > the issue should
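The workaround described, as a sketch (table and values are placeholders; the timestamp is microseconds since the epoch):

    INSERT INTO myks.mytable (k, v) VALUES ('key', 100)
      USING TIMESTAMP 1425345840296001;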

Finding nodes that own a given token/partition key

2015-03-26 Thread Dan Kinder
Hey all, In certain cases it would be useful for us to find out which node(s) have the data for a given token/partition key. The only solution I'm aware of is to select from system.local and/or system.peers to grab the host_id and tokens, do `SELECT token(thing) FROM myks.mytable WHERE thing = '

Re: Finding nodes that own a given token/partition key

2015-03-26 Thread Dan Kinder
urself. > > Adam Holmberg > > On Thu, Mar 26, 2015 at 3:53 PM, Roman Tkachenko > wrote: > >> Hi Dan, >> >> Have you tried using "nodetool getendpoints"? It shows you nodes that >> currently own the specific key. >> >> Roman >>
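Both approaches from the thread, with placeholder names:

    # nodes currently owning the partition for key 'thing-1'
    nodetool getendpoints myks mytable thing-1

    -- or compute the token and compare against system.local / system.peers
    SELECT token(thing) FROM myks.mytable WHERE thing = 'thing-1';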

Delete query range limitation

2015-04-15 Thread Dan Kinder
I understand that range deletes are currently not supported ( http://stackoverflow.com/questions/19390335/cassandra-cql-delete-using-a-less-than-operator-on-a-secondary-key ) Since Cassandra now does have range tombstones is there a reason why it can't be allowed? Is there a ticket for supporting
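Range deletes on clustering columns later landed in Cassandra 3.0 via CASSANDRA-6237; a sketch with placeholder names:

    DELETE FROM myks.mytable WHERE pk = 'x' AND ck >= 10 AND ck < 20;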

Multiple cassandra instances per physical node

2015-05-21 Thread Dan Kinder
Hi, I'd just like some clarity and advice regarding running multiple cassandra instances on a single large machine (big JBOD array, plenty of CPU/RAM). First, I am aware this was not Cassandra's original design, and doing this seems to unreasonably go against the "commodity hardware" intentions of

Re: Multiple cassandra instances per physical node

2015-05-21 Thread Dan Kinder
>>>> If you google it, you can find a few blogs that talk about how to do >>>> this. >>>> >>>> But it is less than ideal. We need to be able to do it by changing >>>> ports in cassandra.yaml. (The way it is done easily with Hadoop o
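The per-instance port changes being discussed are all in cassandra.yaml (values illustrative; each instance also needs its own data/commitlog directories and its own JMX port in cassandra-env.sh):

    listen_address: 10.0.0.2          # one address or alias per instance
    storage_port: 7001                # default 7000
    native_transport_port: 9043       # default 9042
    rpc_port: 9161                    # default 9160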

counters still inconsistent after repair

2015-06-15 Thread Dan Kinder
Currently on 2.1.6 I'm seeing behavior like the following:

    cqlsh:walker> select * from counter_table where field = 'test';

     field | value
    -------+-------
      test |    30

    (1 rows)

    cqlsh:walker> select * from counter_table where field = 'test';

     field | value
    -------+-------
      test |    90

    (1 rows)

c

Re: counters still inconsistent after repair

2015-06-19 Thread Dan Kinder
Thanks Rob, this was helpful. More counters will be added soon, I'll let you know if those have any problems. On Mon, Jun 15, 2015 at 4:32 PM, Robert Coli wrote: > On Mon, Jun 15, 2015 at 2:52 PM, Dan Kinder wrote: > >> Potentially relevant facts: >> - Recently upgrad