Re: Node failure Due To Very high GC pause time

2017-07-03 Thread Bryan Cheng
nodes your tables will have difficulty flushing. See http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_memtable_thruput_c.html . This could also be a heap/memory configuration issue or a GC tuning issue (although unlikely if you've left those at default) --Bryan On M

Re: Does Too many GC pauses can cause cassandra service DOWN.

2017-02-14 Thread Bryan Cheng
GC can absolutely cause a server to get marked down by a peer. See https://support.datastax.com/hc/en-us/articles/204226199-Common-Causes-of-GC-pauses As for tuning again we use CMS but this thread has some good G1 info that I looked at while evaluating it: https://issues.apache.org/jira/browse/CA

Re: inconsistent results

2017-02-14 Thread Bryan Cheng
Change your consistency levels in the cqlsh shell while you query, from ONE to QUORUM to ALL. If you see your results change that's a consistency issue. (Assuming these are simple inserts, if there's deletes, potentially update collections, etc. in the mix then things get a bit more complex.) To d

Re: Metric to monitor partition size

2017-01-13 Thread Bryan Cheng
We're on 2.X so this information may not apply to your version, but you should see: 1) A log statement upon compaction, like "Writing large partition", including the primary partition key (see https://issues.apache.org/jira/browse/CASSANDRA-9643). Configurable threshold in cassandra.yaml 2) Probl

Re: Backup restore with a different name

2016-11-02 Thread Bryan Cheng
Hi Jens, When you refer to restoring a snapshot for a developer to look at, do you mean restoring the cluster to that state, or just exposing that state for reference while keeping the (corrupt) current state in the live cluster? You may find these useful: https://docs.datastax.com/en/cassandra/2

Re: Incremental repairs in 3.0

2016-09-06 Thread Bryan Cheng
at 2:00 AM, Jean Carlo wrote: > Hi @Bryan > > When you said "sizable amount of data" you meant a huge amount of data > right? Our big table is in LCS and if we use the migration process we will > need to run a repair seq over this table for a long time. > > We are

Re: Corrupt SSTABLE over and over

2016-08-15 Thread Bryan Cheng
Hi Alaa, Sounds like you have problems that go beyond Cassandra- likely filesystem corruption or bad disks. I don't know enough about Windows to give you any specific advice but I'd try a run of chkdsk to start. --Bryan On Fri, Aug 12, 2016 at 5:19 PM, Alaa Zubaidi (PDF) wrote:

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Bryan Cheng
Should also add that if the scope of corruption is _very_ large, and you have a good, aggressive repair policy (read: you are confident in the consistency of the data elsewhere in the cluster), you may just want to decommission and rebuild that node. On Fri, Aug 12, 2016 at 11:55 AM, Bryan Cheng

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Bryan Cheng
Looks like you're doing the offline scrub- have you tried online? Here's my typical process for corrupt SSTables. With disk_failure_policy set to stop, examine the failing sstables. If they are very small (in the range of kbs), it is unlikely that there is any salvageable data there. Just delete

Re: Debugging high tail read latencies (internal timeout)

2016-07-07 Thread Bryan Cheng
system.log that should give you per-collection breakdowns. --Bryan On Wed, Jul 6, 2016 at 6:22 PM, Nimi Wariboko Jr wrote: > Hi, > > I've begun experiencing very high tail latencies across my clusters. While > Cassandra's internal metrics report <1ms read latencies, me

Re: Cluster not working after upgrade from 2.1.12 to 3.5.0

2016-06-21 Thread Bryan Cheng
Hi Oskar, I know this won't help you as quickly as you would like but please consider updating the JIRA issue with details of your environment as it may help move the investigation along. Good luck! On Tue, Jun 21, 2016 at 12:21 PM, Julien Anguenot wrote: > You could try to sstabledump that on

Re: Incremental repairs in 3.0

2016-06-20 Thread Bryan Cheng
Sorry, meant to say "therefore manual migration procedure should be UNnecessary" On Mon, Jun 20, 2016 at 3:21 PM, Bryan Cheng wrote: > I don't use 3.x so hopefully someone with operational experience can chime > in, however my understanding is: 1) Incremental repairs shoul

Re: Incremental repairs in 3.0

2016-06-20 Thread Bryan Cheng
I don't use 3.x so hopefully someone with operational experience can chime in, however my understanding is: 1) Incremental repairs should be the default in the 3.x release branch and 2) sstable repairedAt is now properly set in all sstables as of 2.2.x for standard repairs and therefore manual migr

Re: OOM under high write throughputs on 2.2.5

2016-05-24 Thread Bryan Cheng
Hi Zhiyan, Silly question but are you sure your heap settings are actually being applied? "697,236,904 (51.91%)" would represent a sub-2GB heap. What's the real memory usage for Java when this crash happens? Other thing to look into might be memtable_heap_space_in_mb, as it looks like you're usi
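The "sub-2GB heap" remark can be checked with a quick back-of-the-envelope calculation. This is only a sketch, and it assumes the percentage in the heap report is that object's share of the whole heap, which is how the reply reads it.

```python
# Back out the total heap implied by "697,236,904 (51.91%)".
# Assumption: 51.91% is the retained size as a share of the entire heap.
retained_bytes = 697_236_904
share_of_heap = 0.5191

implied_heap_bytes = retained_bytes / share_of_heap
implied_heap_gib = implied_heap_bytes / 2**30

# Prints a figure of roughly 1.25 GiB, well under 2 GB.
print(f"implied heap: {implied_heap_bytes:,.0f} bytes ({implied_heap_gib:.2f} GiB)")
```

If the configured heap is supposed to be larger than that, the settings are probably not being picked up.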

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread Bryan Cheng
in your cluster. --Bryan On Tue, May 24, 2016 at 3:49 PM, kurt Greaves wrote: > Not necessarily considering RF is 2 so both nodes should have all > partitions. Luke, are you sure the repair is succeeding? You don't have > other keyspaces/duplicate data/extra data in your cassandr

Re: Limit 1

2016-04-21 Thread Bryan Cheng
As far as I know, the answer is yes, however it is unlikely that the cursor will have to probe very far to find a valid row unless your data is highly bursty. The key cache (assuming you have it enabled) will allow the query to skip unrelated rows in its search. However I would caution against TTL

Re: Cassandra Golang Driver and Support

2016-04-13 Thread Bryan Cheng
Hi Yawei, While you're right that there's no first-party driver, we've had good luck using gocql (https://github.com/gocql/gocql) in production at moderate scale. What features in particular are you looking for that are missing? --Bryan On Tue, Apr 12, 2016 at 10:06 PM, Yawei L

Re: Unable to connect to CQLSH or Launch SparkContext

2016-04-11 Thread Bryan Cheng
Check your environment variables, looks like JAVA_HOME is not properly set On Mon, Apr 11, 2016 at 9:07 AM, Lokesh Ceeba - Vendor < lokesh.ce...@walmart.com> wrote: > Hi Team, > > Help required > > > > cassandra:/app/cassandra $ nodetool status > > > > Cassandra 2.0 and later requir

Re: Large primary keys

2016-04-11 Thread Bryan Cheng
While large primary keys (within reason) should work, IMO anytime you're doing equality testing you are really better off minimizing the size of the key. Huge primary keys will also have very negative impacts on your key cache. I would err on the side of the digest, but I've never had a need for la

Re: Cassandra sstable to Mysql

2016-04-02 Thread Bryan Cheng
You have SSTables and you want to get importable data? You could use a tool like sstabletojson to get json formatted data directly from the sstables; however, unless they've been perfectly compacted, there will be duplicates and updates interleaved that will need to be properly ordered. If this is a full

Re: cassandra disks cache on SSD

2016-04-02 Thread Bryan Cheng
Hi Vincent, have you already tried the more common tuning operations like row cache? I haven't done any disk level caching like this (we use SSD's exclusively), but you may see some benefit from putting your commitlog on a separate conventional HDD if you haven't tried this already. This may push

Re: Multi DC setup for analytics

2016-03-31 Thread Bryan Cheng
I'm jumping into this thread late, so sorry if this has been covered before. But am I correct in reading that you have two different Cassandra rings, not talking to each other at all, and you want to have a shared DC with a third Cassandra ring? I'm not sure what you want to do is possible. If I

Re: Cassandra Upgrade 3.0.x vs 3.x (Tick-Tock Release)

2016-03-14 Thread Bryan Cheng
ou're happy being a little closer to the bleeding edge. There was a bit of discussion elsewhere on this list, eg here: https://www.mail-archive.com/user@cassandra.apache.org/msg45990.html, searching may turn up some more recommendations. --Bryan On Mon, Mar 14, 2016 at 12:40 PM, Kathiresa

Re: Unexplainably large reported partition sizes

2016-03-07 Thread Bryan Cheng
Hi Tom, Do you use any collections on this column family? We've had issues in the past with unexpectedly large partitions reported on data models with collections, which can also generate tons of tombstones on UPDATE ( https://issues.apache.org/jira/browse/CASSANDRA-10547) --Bryan On Mon

Re: Modeling transactional messages

2016-03-04 Thread Bryan Cheng
I think most people will tell you what Sean did- queues are considered an anti-pattern for many reasons in Cassandra, and while it's possible, you may want to consider something more suited for the job (RabbitMQ, redis queues are just a few ideas that come to mind). If you're sold on the idea of u

Re: Lot of GC on two nodes out of 7

2016-03-03 Thread Bryan Cheng
Hi Anishek, In addition to the good advice others have given, do you notice any abnormally large partitions? What does cfhistograms report for 99% partition size? A few huge partitions will cause very disproportionate load on your cluster, including high GC. --Bryan On Wed, Mar 2, 2016 at 9:28

Re: Checking replication status

2016-03-01 Thread Bryan Cheng
they just differ in syntax. If you sustain loss of inter-dc connectivity for longer than max_hint_window_in_ms, you'll want to run a cross-dc repair, which is just the standard full repair (without specifying either). On Mon, Feb 29, 2016 at 7:38 PM, Jimmy Lin wrote: > hi Bryan, > I guess
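The rule of thumb about max_hint_window_in_ms can be written down as a small check. This is a sketch only: `needs_repair` is a hypothetical helper, not part of any Cassandra tooling, and the 3 hour default is the stock cassandra.yaml value, so confirm what your own nodes are configured with.

```python
# Stock cassandra.yaml ships max_hint_window_in_ms: 10800000 (3 hours).
DEFAULT_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def needs_repair(outage_ms: int,
                 max_hint_window_ms: int = DEFAULT_MAX_HINT_WINDOW_MS) -> bool:
    """Return True when the outage outlasted the hint window.

    Past that point coordinators stop storing hints for the unreachable
    replicas, so hinted handoff alone cannot bring them up to date.
    """
    return outage_ms > max_hint_window_ms


one_hour_ms = 60 * 60 * 1000
print(needs_repair(1 * one_hour_ms))  # short outage: hints should cover it
print(needs_repair(4 * one_hour_ms))  # longer than the window: plan a repair
```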

Re: Checking replication status

2016-02-26 Thread Bryan Cheng
Hi Jimmy, If you sustain a long downtime, repair is almost always the way to go. It seems like you're asking to what extent a cluster is able to recover/resync a downed peer. A peer will not attempt to reacquire all the data it has missed while being down. Recovery happens in a few ways: 1) Hin

Re: Cassandra Multi DC (Active-Active) Setup - Measuring latency & throughput performance

2016-02-26 Thread Bryan Cheng
Hi Chandra, For write latency, etc. the tools are still largely the same set of tools you'd use for single-DC- stuff like tracing, cfhistograms, cassandra-stress come to mind. The exact results are going to differ based on your consistency tuning (can you get away with LOCAL_QUORUM vs QUORUM?) and

Re: "Not enough replicas available for query" after reboot

2016-02-04 Thread Bryan Cheng
Hey Flavien! Did your reboot come with any other changes (schema, configuration, topology, version)? On Thu, Feb 4, 2016 at 2:06 PM, Flavien Charlon wrote: > I'm using the C# driver 2.5.2. I did try to restart the client > application, but that didn't make any difference, I still get the same >

Re: EC2 storage options for C*

2016-02-03 Thread Bryan Cheng
e at peak times, when >>>>>>> multiple AWS customers have spikes of demand? >>>>>>> >>>>>>> Is RAID much of a factor or help at all using EBS? >>>>>>> >>>>>>> How exactly is EBS provisioned in terms of its own H

Re: Any tips on how to track down why Cassandra won't cluster?

2016-02-03 Thread Bryan Cheng
> On Wed, 3 Feb 2016 at 11:49 Richard L. Burton III > wrote: > >> >> Any suggestions on how to track down what might trigger this problem? I'm >> not receiving any exceptions. >> > You're not getting "Unable to gossip with any seeds" on the second node? What does nodetool status show on both machi

Re: EC2 storage options for C*

2016-01-30 Thread Bryan Cheng
Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, it's hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid, battle tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache. Personally, on sm

Re: EC2 storage options for C*

2016-01-29 Thread Bryan Cheng
Do you have any idea what kind of disk performance you need? Cassandra with RAID 0 is a fairly common configuration (Al's awesome tuning guide has a blurb on it https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html), so if you feel comfortable with the operational overhead it seems lik

Re: Session timeout

2016-01-29 Thread Bryan Cheng
To throw my (unsolicited) 2 cents into the ring, Oleg, you work for a well-funded and fairly large company. You are certainly free to continue using the list and asking for community support (I am definitely not in any position to tell you otherwise, anyway), but that community support is by defini

Re: max connection per user

2016-01-13 Thread Bryan Cheng
Are you actively exposing your database to users outside of your organization, or are you just asking about security best practices? If you mean the former, this isn't really a common use case and there isn't a huge amount out of the box that Cassandra will do to help. If you're just asking about

Help debugging a very slow query

2016-01-13 Thread Bryan Cheng
perations. 2) What could cause the Read to take such an absurd amount of time when it's a pair of sstables and the memtable being examined, and it's just a single cell being read? We originally suspected just memory pressure from huge sstables, but without a corresponding GC this seems unlik

Re: Rebuilding a new Cassandra node at 100Mb/s

2015-12-03 Thread Bryan Cheng
Jonathan: Have you changed stream_throughput_outbound_megabits_per_sec in cassandra.yaml?
# Throttles all outbound streaming file transfers on this node to the
# given total throughput in Mbps. This is necessary because Cassandra does
# mostly sequential IO when streaming data during bootstrap or
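To get a feel for what the streaming throttle means for a rebuild, here is a rough estimate of transfer time at a given cap. It is a sketch under simplifying assumptions: a single sending node, no compression, no protocol overhead. `stream_hours` is a hypothetical helper and the 500 GB figure is purely illustrative.

```python
def stream_hours(data_bytes: float, throughput_megabits_per_sec: float) -> float:
    """Hours needed to push data_bytes through a link capped at the given Mbps."""
    bits = data_bytes * 8
    seconds = bits / (throughput_megabits_per_sec * 1_000_000)
    return seconds / 3600


example_bytes = 500e9  # illustrative node size, not taken from the thread
for mbps in (100, 200, 400):
    print(f"{mbps} Mbps -> {stream_hours(example_bytes, mbps):.1f} h")
```

Because the setting throttles each sending node's outbound traffic, a rebuild that pulls from several sources can finish faster than this single-sender figure suggests.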

Re: Issues on upgrading from 2.2.3 to 3.0

2015-12-02 Thread Bryan Cheng
Has your configuration changed? This is a new check- https://issues.apache.org/jira/browse/CASSANDRA-10242. It seems likely either your snitch changed, your properties changed, or something caused Cassandra to think one of the two happened... What's your node layout? On Fri, Nov 27, 2015 at 6:45

Re: Transitioning to incremental repair

2015-12-02 Thread Bryan Cheng
Ah Marcus, that looks very promising- unfortunately we have already switched back to full repairs and our test cluster has been re-purposed for other tasks atm. I will be sure to apply the patch/try a fixed version of Cassandra if we attempt to migrate to incremental repair again.

Re: Transitioning to incremental repair

2015-12-01 Thread Bryan Cheng
Sorry if I misunderstood, but are you asking about the LCS case? Based on our experience, I would absolutely recommend you continue with the migration procedure. Even if the compaction strategy is the same, the process of anticompaction is incredibly painful. We observed our test cluster running 2

Re: Repair Hangs while requesting Merkle Trees

2015-11-17 Thread Bryan Cheng
e around for TCP tuning/buffer tuning and you should find some good resources. On Mon, Nov 16, 2015 at 5:23 PM, Anuj Wadehra wrote: > Hi Bryan, > > Thanks for the reply !! > I didn't mean streaming_socket_timeout_in_ms. I meant when you run netstats > (Linux command) on node A in DC1

Re: Repair Hangs while requesting Merkle Trees

2015-11-16 Thread Bryan Cheng
Hi Anuj, Did you mean streaming_socket_timeout_in_ms? If not, then you definitely want that set. Even the best network connections will break occasionally, and in Cassandra < 2.1.10 (I believe) this would leave those connections hanging indefinitely on one end. How far away are your two DC's from

Generalized download link?

2015-11-16 Thread Bryan Cheng
Hey list, Is there a URL available for downloading Cassandra that abstracts away the mirror selection (eg. just 302's to a mirror URL?) We've got a few self-configuring Cassandras (for example, the Docker container our devs use), and using the same mirror for the containers or for any bulk provisi

Re: Insertion Delay Cassandra 2.1.9

2015-11-06 Thread Bryan Cheng
Your experience, then, is expected (although 20m delay seems excessive, and is a sign you may be overloading your cluster, which may be expected with an unthrottled bulk load like that). When you insert with consistency ONE on RF > 1, that means your query returns after one node confirms the write
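The arithmetic behind this is small enough to sketch. A read is only guaranteed to touch a replica that acknowledged the write when the write and read replica counts add up to more than RF. The helper names below are hypothetical and only cover the ONE, QUORUM and ALL levels.

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM at replication factor rf."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Replicas that must respond for a given consistency level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def read_sees_latest_write(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when the read set and write set must overlap (W + R > RF)."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf


rf = 3
for write_cl, read_cl in [("ONE", "ONE"), ("ONE", "ALL"), ("QUORUM", "QUORUM")]:
    print(write_cl, read_cl, read_sees_latest_write(write_cl, read_cl, rf))
```

With writes at ONE and reads at ONE on RF=3 there is no guaranteed overlap, which is why a freshly inserted row can appear to be missing until the other replicas catch up.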

Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Bryan Cheng
Is your compaction progressing as expected? If not, this may cause an excessive number of tiny db files. Had a node refuse to start recently because of this, had to temporarily remove limits on that process. On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis wrote: > I'm getting too many open files er

What are the repercussions of a restart during anticompaction?

2015-11-05 Thread Bryan Cheng
but did a full repair again after that before we decommissioned our old dc. Any guidance would be appreciated! Thanks, Bryan

Re: Two node cassandra cluster doubts

2015-11-04 Thread Bryan Cheng
I believe what's going on here is this step: Select Count (*) From MYTABLE;---> 15 rows Shut down Node B. Start Up Node B. Select Count (*) From MYTABLE;---> 15 rows To understand why this is an issue, consider the way that consistency is attempted within Cassandra. With RF=2, (You should re

Re: Doubt regarding consistency-level in Cassandra-2.1.10

2015-11-03 Thread Bryan Cheng
What Eric means is that SERIAL consistency is a special type of consistency that is only invoked for a subset of operations: those that use CAS/lightweight transactions, for example "IF NOT EXISTS" queries. The differences between CAS operations and standard operations are significant and there ar

Re: Maximum node decommission // bootstrap at once.

2015-10-06 Thread Bryan Cheng
Robert, I might be misinterpreting you but I *think* your link is talking about bootstrapping a new node by bulk loading replica data from your existing cluster? I was referring to using Cassandra's bootstrap to get the node to join and run (as a member of DC2 but with physical residence in DC1),

Re: Maximum node decommission // bootstrap at once.

2015-10-06 Thread Bryan Cheng
Honestly, we've had more luck bootstrapping in our old DC (defining topology properties as the new DC) and using rsync to migrate the data files to new machines in the new datacenter. We had 10gig within the datacenter but significantly less than this cross-DC, which led to a lot of broken streami

Re: broadcast address on EC2 without Elastic IPs.

2015-10-01 Thread Bryan Cheng
st, this is a somewhat common configuration. Hope this helps! --Bryan On Wed, Sep 30, 2015 at 7:24 AM, Renato Perini wrote: > Hello! > I have configured a small cluster composed of three nodes on Amazon EC2. > The 3 machines don't have an elastic IP (static address) so the publi

Re: Trace evidence for LOCAL_QUORUM ending up in remote DC

2015-09-08 Thread Bryan Cheng
Tom, I don't believe so; it seems the symptom would be an indefinite (or very long) hang. To clarify, is this issue restricted to LOCAL_QUORUM? Can you issue a LOCAL_ONE SELECT and retrieve the expected data back? On Tue, Sep 8, 2015 at 12:02 PM, Tom van den Berge < tom.vandenbe...@gmail.com> wro

Re: How to prevent queries being routed to new DC?

2015-09-03 Thread Bryan Cheng
been run, the application is not connecting to the new cluster, and all your queries are run at LOCAL_* quorum levels, I do not believe those queries should be routed to the new dc. On Thu, Sep 3, 2015 at 12:14 PM, Tom van den Berge < tom.vandenbe...@gmail.com> wrote: > Hi Bryan, &

Re: How to prevent queries being routed to new DC?

2015-09-03 Thread Bryan Cheng
levels. On Thu, Sep 3, 2015 at 11:53 AM, Tom van den Berge < tom.vandenbe...@gmail.com> wrote: > Hi Bryan, > > I'm using the PropertyFileSnitch, and it contains entries for all nodes in > the old DC, and all nodes in the new DC. The replication factor for both > DCs is 1. &

Re: How to prevent queries being routed to new DC?

2015-09-03 Thread Bryan Cheng
Hey Tom, What's your replication strategy look like? When your new nodes join the ring, can you verify that they show up under a new DC and not as part of the old? --Bryan On Thu, Sep 3, 2015 at 11:27 AM, Tom van den Berge < tom.vandenbe...@gmail.com> wrote: > I want to start u

Rebuild new DC nodes against new DC?

2015-08-31 Thread Bryan Cheng
ld make things faster and ease some headaches. Thanks for any help! --Bryan

Re: Incremental, Sequential repair?

2015-08-25 Thread Bryan Cheng
Aug 25, 2015 at 2:44 PM, Bryan Cheng > wrote: > >> [2015-08-25 21:36:43,433] It is not possible to mix sequential repair and >> incremental repairs. >> >> Is this a limitation around a specific configuration? Or is it generally >> true that incremental and sequ

Incremental, Sequential repair?

2015-08-25 Thread Bryan Cheng
Hey all, Got a question about incremental repairs, a quick google search turned up nothing conclusive. In the docs, in a few places, sequential, incremental repairs are mentioned. From http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_repair_nodes_c.html (indirectly): > You can

Re: Change from single region EC2 to multi-region

2015-08-11 Thread Bryan Cheng
broadcast_address to public ip should be the correct configuration. Assuming your firewall rules are all kosher, you may need to clear gossip state? http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html -- Forwarded message -- From: Asher Newcomer Da

Re: "SELECT *..." query times out on a particular table

2015-08-03 Thread Bryan Holladay
Check Cassandra logs for tombstone threshold error On Aug 3, 2015 7:32 PM, "Robert Coli" wrote: > On Mon, Aug 3, 2015 at 2:48 PM, Sid Tantia > wrote: > >> Any select all or select count query on a particular table is timing out >> with "Cassandra::Errors::TimeoutError: Timed out" >> >> A “SELECT

Re: Cassandra compaction appears to stall, node becomes partially unresponsive

2015-07-22 Thread Bryan Cheng
15 at 2:55 PM, Bryan Cheng > wrote: > >> nodetool still reports the node as being healthy, and it does respond to >> some local queries; however, the CPU is pegged at 100%. One common thread >> (heh) each time this happens is that there always seems to be one or more >&g

Re: Cassandra compaction appears to stall, node becomes partially unresponsive

2015-07-22 Thread Bryan Cheng
cluster. On Wed, Jul 22, 2015 at 3:35 PM, Bryan Cheng wrote: > Hi Aiman, > > We previously had issues with GC, but since upgrading to 2.1.7 things seem > a lot healthier. > > We collect GC statistics through collectd via the garbage collector mbean, > ParNew GC's report

Re: Cassandra compaction appears to stall, node becomes partially unresponsive

2015-07-22 Thread Bryan Cheng
about 300ms collection time when it runs. On Wed, Jul 22, 2015 at 3:22 PM, Aiman Parvaiz wrote: > Hi Bryan > How's GC behaving on these boxes? > > On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng > wrote: > >> Hi there, >> >> Within our Cassandra cluster, we'

Cassandra compaction appears to stall, node becomes partially unresponsive

2015-07-22 Thread Bryan Cheng
affected; the nodes thus far have been different instances on different physical machines and on different racks. Has anyone seen this before? Alternatively, when this happens again, what data can we collect that would help with the debugging process (in addition to tpstats)? Thanks in advance, Bryan

Re: Missing data

2015-06-15 Thread Bryan Holladay
ist for an example: https://gist.github.com/baholladay/21eb4c61ea8905302195 ) Just loop for every 100 million rows and make a new query "select * from TABLE where token(key) > lastToken" Thanks, Bryan On Mon, Jun 15, 2015 at 12:50 PM, Jean Tremblay < jean.tremb...@zen-innovations.com>
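The loop described here can be sketched without a live cluster. Everything below is a stand-in: `FAKE_TABLE`, `fetch_page` and `scan_all` are hypothetical names, the in-memory list plays the role of the table, and the sketch assumes each row has its own token (with wide partitions a page boundary inside a partition would need extra care).

```python
from typing import Iterator, List, Optional, Tuple

Row = Tuple[int, str]  # (token, key)

# Stand-in for the real table, already in token order as a full scan returns it.
FAKE_TABLE: List[Row] = [(i * 37 - 500, f"key-{i}") for i in range(25)]


def fetch_page(last_token: Optional[int], page_size: int) -> List[Row]:
    """Mimics: SELECT ... FROM table WHERE token(key) > last_token LIMIT page_size."""
    rows = [row for row in FAKE_TABLE if last_token is None or row[0] > last_token]
    return rows[:page_size]


def scan_all(page_size: int) -> Iterator[Row]:
    """Walk the whole table one page at a time, resuming after the last token seen."""
    last_token: Optional[int] = None
    while True:
        page = fetch_page(last_token, page_size)
        if not page:
            return
        yield from page
        last_token = page[-1][0]


print(sum(1 for _ in scan_all(page_size=10)))  # number of rows visited
```

In a real client `fetch_page` would issue the CQL query quoted above, and the page size would be whatever the cluster handles comfortably.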

Re: Cassandra crashes daily; nothing on the log

2015-06-08 Thread Bryan Holladay
It could be the linux kernel killing Cassandra b/c of memory usage. When this happens, nothing is logged in Cassandra. Check the system logs: /var/log/messages Look for a message saying "Out of Memory"... "kill process"... On Mon, Jun 8, 2015 at 1:37 PM, Paulo Motta wrote: > try checking your s
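A quick way to look for those kernel messages is to filter the system log. This is a minimal sketch: `find_oom_lines` is a hypothetical helper, the sample lines are made up for illustration, and the exact wording of OOM killer messages varies between kernel versions, so the pattern is deliberately loose.

```python
import re
from typing import Iterable, List

# Loose match on the two phrases the kernel OOM killer typically logs.
OOM_PATTERN = re.compile(r"out of memory|killed process", re.IGNORECASE)


def find_oom_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like OOM killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


# Illustrative lines only, not copied from a real system.
SAMPLE_LOG = [
    "Jun  8 03:12:01 host kernel: Out of memory: Kill process 4242 (java) score 901\n",
    "Jun  8 03:12:01 host kernel: Killed process 4242 (java)\n",
    "Jun  8 03:15:00 host cron[100]: job started\n",
]

for hit in find_oom_lines(SAMPLE_LOG):
    print(hit)

# Against a real host (path taken from the message above):
#   with open("/var/log/messages") as log:
#       print(find_oom_lines(log))
```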

Re: Read performance

2015-05-08 Thread Bryan Holladay
Try breaking it up into smaller chunks using multiple threads and token ranges. 86400 is pretty large. I found ~1000 results per query is good. This will spread the burden across all servers a little more evenly. On Thu, May 7, 2015 at 4:27 AM, Alprema wrote: > Hi, > > I am writing an applicatio
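Breaking an 86400-row read into roughly 1000-row pieces and fanning them out over threads might look like the sketch below. The assumption that rows are addressed by a 0..86400 offset (for example one per second of a day) is mine, and `run_query` is a hypothetical placeholder for whatever the driver call would be.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def chunk_ranges(total: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split [0, total) into half-open (start, end) ranges of at most chunk_size."""
    return [(start, min(start + chunk_size, total))
            for start in range(0, total, chunk_size)]


def run_query(bounds: Tuple[int, int]) -> int:
    """Placeholder for a real bounded query; returns how many rows it would cover."""
    start, end = bounds
    return end - start


chunks = chunk_ranges(86_400, 1_000)

# A handful of worker threads keeps several small queries in flight at once.
with ThreadPoolExecutor(max_workers=8) as pool:
    total_rows = sum(pool.map(run_query, chunks))

print(len(chunks), total_rows)
```

When the chunks are token ranges rather than offsets, each small query can be served by a different set of replicas, which is where the more even spread of load comes from.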

Re: How much disk is needed to compact Leveled compaction?

2015-04-06 Thread Bryan Holladay
" Are sstables smaller with leveled compaction making this a non issue? How can you determine what the new threshold for storage space is? Thanks, Bryan On Apr 6, 2015 6:19 PM, "DuyHai Doan" wrote: > If you have SSD, you may afford switching to leveled compaction strategy, > which

Re: tuning concurrent_reads param

2014-11-06 Thread Bryan Talbot
ttp://www.makelinux.net/books/lkd2/ch13lev1sec5 > What really happens if we increase it to too high a value? (maybe affecting > other read or write operations as it eats up all disk IO resources?) > Yes -Bryan

Re: new data not flushed to sstables

2014-11-03 Thread Bryan Talbot
e will contain approximately 75% of the data. From a quick eyeball check of the json-dump you provided, it looks like partition-key values are contained on 3 nodes and are absent from 1 which is exactly as expected. -Bryan

Re: OldGen saturation

2014-10-28 Thread Bryan Talbot
eptable for a unit test or playing around with but you can't actually expect it to be adequate for a load test can you? Every CF consumes some permanent heap space for its metadata. Too many CF are a bad thing. You probably have ten times more CF than would be recommended as an upper limit. -Bryan

Re: Repair taking long time

2014-09-26 Thread Bryan Talbot
With a 4.5 TB table and just 4 nodes, repair will likely take forever for any version. -Bryan On Fri, Sep 26, 2014 at 10:40 AM, Jonathan Haddad wrote: > Are you using Cassandra 2.0 & vnodes? If so, repair takes forever. > This problem is addressed in 2.1. > > On Fri, Sep 26,

updated num_tokens value while changing replication factor and getting a nodetool repair error

2014-08-19 Thread Bryan Holladay
repair works better when there are more tokens (my theory is that it's working in smaller chunks), is this a good reason to use 256 instead of one? 2) How can I fix the Repair Exception above? 3) Nodetool repair takes forever to run (5+ days). Is this because I have 1 token per node or is there a better way to run this? Should I set the start and end keys? I'm running Cassandra 2.0.2. Any help would be greatly appreciated. Thanks, Bryan

Re: Index with same Name but different keyspace

2014-05-19 Thread Bryan Talbot
ex on table2(flag); > > *Bad Request: Duplicate index name sversionindex* > > Since index name is optional in the create index statement, you could just omit it and let the system give it a unique name for you. -Bryan

Re: Best partition type for Cassandra with JBOD

2014-05-19 Thread Bryan Talbot
he guidelines don’t mention setting noatime and > nodiratime flags in the fstab for data volumes, but I wonder if that’s a > common practice. > > James > > > > > -- > > > Founder/CEO Spinn3r.com > Location: *San Francisco, CA* > Skype: *burtonator* > blog: http

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-19 Thread Bryan Talbot
nsert into posts(author,created_at,entry) values ('mike',now(),'This is a new entry by mike');
and then you can get posts by 'john' ordered by newest to oldest as:
cqlsh:test> select author, created_at, dateOf(created_at), entry from posts where author = 'john' limit 2 ;
 author | created_at                           | dateOf(created_at)       | entry
--------+--------------------------------------+--------------------------+------------------------------
   john | 7cb1ac30-df85-11e3-bb46-4d2d68f17aa6 | 2014-05-19 11:43:36-0700 | This is a new entry by john
   john | 74bb6750-df85-11e3-bb46-4d2d68f17aa6 | 2014-05-19 11:43:23-0700 | This is an old entry by john
-Bryan

Failed to mkdirs $HOME/.cassandra

2014-05-15 Thread Bryan Talbot
How should nodetool command be run as the user "nobody"? The nodetool command fails with an exception if it cannot create a .cassandra directory in the current user's home directory. I'd like to schedule some nodetool commands to run with least privilege as cron jobs. I'd like to run them as the

Re: Cassandra 2.0.7 always failes due to 'too may open files' error

2014-05-05 Thread Bryan Talbot
Running #> cat /proc/$(cat /var/run/cassandra.pid)/limits as root or your cassandra user will tell you what limits it's actually running with. On Sun, May 4, 2014 at 10:12 PM, Yatong Zhang wrote: > I am running 'repair' when the error occurred. And just a few days before > I changed the com
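If you want to check that limit from a script rather than by eye, the limits file is easy to parse. A minimal sketch follows: `max_open_files` is a hypothetical helper and the sample text is illustrative, laid out the way /proc/<pid>/limits is normally formatted.

```python
from typing import Tuple

# Illustrative excerpt in the usual /proc/<pid>/limits layout.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
"""


def max_open_files(limits_text: str) -> Tuple[str, str]:
    """Return (soft, hard) for 'Max open files'; values may be 'unlimited'."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return soft, hard
    raise ValueError("no 'Max open files' row found")


print(max_open_files(SAMPLE_LIMITS))

# Against a running node (pid file path taken from the message above):
#   pid = open("/var/run/cassandra.pid").read().strip()
#   print(max_open_files(open(f"/proc/{pid}/limits").read()))
```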

Re: using cssandra cql with php

2014-03-04 Thread Bryan Talbot
'll be stuck using old CQL features from unmaintained client drivers -- probably better to not be using CQL and PHP together since mixing them seems pretty bad right now. -Bryan On Sun, Jan 12, 2014 at 11:27 PM, Jason Wee wrote: > Hi, > > operating system should not be a matt

Re: Heap is not released and streaming hangs at 0%

2013-06-21 Thread Bryan Talbot
bloom_filter_fp_chance = 0.7 is probably way too large to be effective and you'll probably have issues compacting deleted rows and get poor read performance with a value that high. I'd guess that anything larger than 0.1 might as well be 1.0. -Bryan On Fri, Jun 21, 2013 at 5:58
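To see why such a high value buys very little, compare the space a textbook Bloom filter needs at different false-positive rates. This uses the standard optimal sizing formula, bits per key = ln(1/p) / (ln 2)^2, as an approximation; it is not Cassandra's internal calculation, and `bits_per_key` is a hypothetical helper.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Optimal Bloom filter size in bits per key for a target false-positive rate."""
    return math.log(1.0 / fp_chance) / (math.log(2) ** 2)


for p in (0.01, 0.1, 0.7, 1.0):
    print(f"fp_chance={p}: {bits_per_key(p):.2f} bits per key")
```

At 0.7 the filter has less than one bit per key to work with, so it can rule out only a small fraction of SSTables on a read.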

Re: Compaction not running

2013-06-18 Thread Bryan Talbot
Manual compaction for LCS doesn't really do much. It certainly doesn't compact all those little files into bigger files. What makes you think that compactions are not occurring? -Bryan On Tue, Jun 18, 2013 at 3:59 PM, Franc Carter wrote: > On Sat, Jun 15, 2013 at 11:49 AM,

Re: [Cassandra] Conflict resolution in Cassandra

2013-06-06 Thread Bryan Talbot
For generic questions like this, google is your friend: http://lmgtfy.com/?q=cassandra+conflict+resolution -Bryan On Thu, Jun 6, 2013 at 11:23 AM, Emalayan Vairavanathan < svemala...@yahoo.com> wrote: > Hi All, > > Can someone tell me about the conflict resolution mechani

Re: Multiple JBOD data directory

2013-06-05 Thread Bryan Talbot
sstables. This means you WILL see obsolete
# data at CL.ONE!
# ignore: ignore fatal errors and let requests fail, as in pre-1.2 Cassandra
disk_failure_policy: stop
On Wed, Jun 5, 2013 at 2:59 PM, Bryan Talbot wrote: > If you're using cassandra 1.2 then you have a choice spec

Re: Multiple JBOD data directory

2013-06-05 Thread Bryan Talbot
based on
# remaining available sstables. This means you WILL see obsolete
# data at CL.ONE!
# ignore: ignore fatal errors a
-Bryan On Wed, Jun 5, 2013 at 6:11 AM, Christopher Wirt wrote: > I would hope so. Just trying to get some confirmation from someone with >

Re: Cassandra performance decreases drastically with increase in data size.

2013-05-30 Thread Bryan Talbot
concurrency and throughput - upgrade to cassandra 1.2 which does some of these things for you -Bryan On Thu, May 30, 2013 at 2:31 PM, srmore wrote: > You are right, it looks like I am doing a lot of GC. Is there any > short-term solution for this other than bumping up the heap ? because, even

Re: data clean up problem

2013-05-28 Thread Bryan Talbot
I think what you're asking for (efficient removal of TTL'd write-once data) is already in the works but not until 2.0 it seems. https://issues.apache.org/jira/browse/CASSANDRA-5228 -Bryan On Tue, May 28, 2013 at 1:26 PM, Hiller, Dean wrote: > Oh and yes, astyanax uses client

Re: In a multiple data center setup, do all the data centers have complete data irrespective of RF?

2013-05-20 Thread Bryan Talbot
data centers and NTS, I am basically > mirroring the database. Right? > > Depending on how you've configured your placement strategy, but if you're using DC1:3 and DC2:3 like you have above, then yes, you'd expect to have 3 copies of every row in both data centers for that keyspace. -Bryan

Re: In a multiple data center setup, do all the data centers have complete data irrespective of RF?

2013-05-20 Thread Bryan Talbot
Option #3 since it depends on the placement strategy and not the partitioner. -Bryan On Mon, May 20, 2013 at 6:24 AM, Pinak Pani < nishant.has.a.quest...@gmail.com> wrote: > I just wanted to verify the fact that if I happen to setup a multi > data-center Cassandra setup, will each

Re: update does not apply to any replica if consistency = ALL and one replica is down

2013-05-17 Thread Bryan Talbot
empting the write, I don't think it will attempt the write. -Bryan On Fri, May 17, 2013 at 1:48 AM, Sergey Naumov wrote: > As described here ( > http://maxgrinev.com/2010/07/12/update-idempotency-why-it-is-important-in-cassandra-applications-2/), > if consistency level couldn't
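A simplified model of the check described above, assuming a single data center and that the coordinator compares the replicas it believes are alive against the number the consistency level requires before sending anything. The function names are hypothetical and this is not Cassandra's actual code.

```python
def required_replicas(consistency_level, replication_factor):
    """Number of replica acknowledgements a consistency level needs."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def coordinator_would_attempt(consistency_level, replication_factor, live_replicas):
    """True if enough replicas are believed alive for the write to be attempted."""
    return live_replicas >= required_replicas(consistency_level, replication_factor)


# RF=3 with one replica known to be down leaves two live replicas.
print(coordinator_would_attempt("ALL", 3, 2))     # not enough live replicas
print(coordinator_would_attempt("QUORUM", 3, 2))  # quorum of 2 is still reachable
```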

Re: SSTable size versus read performance

2013-05-16 Thread Bryan Talbot
512 sectors for read-ahead. Are your new fancy SSD drives using large sectors? If your read-ahead is really reading 512 x 4KB per random IO, then that 2 MB per read seems like a lot of extra overhead. -Bryan On Thu, May 16, 2013 at 12:35 PM, Keith Wright wrote: > We actually have it
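The arithmetic behind that 2 MB figure, as a small sketch. The two sector sizes are the ones discussed above; which one a given drive actually reports is an assumption to verify on the host.

```python
def readahead_bytes(sectors, sector_size_bytes):
    """Bytes pulled in per read-ahead for a given sector count and sector size."""
    return sectors * sector_size_bytes


for sector_size in (512, 4096):
    kib = readahead_bytes(512, sector_size) // 1024
    print(f"512 sectors x {sector_size} B = {kib} KiB per random read")
```

With 512-byte sectors the same setting reads 256 KiB; with 4 KB sectors it reads 2048 KiB, i.e. the 2 MB mentioned in the message.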

Re: index_interval

2013-05-13 Thread Bryan Talbot
, the process will be killed. Can the index sample storage be treated more like key cache or row cache where the total space used can be limited to something less than all available system ram, and space is recycled using an LRU (or configurable) algorithm? -Bryan On Mon, May 13, 2013 at 9:10 PM,

Re: index_interval

2013-05-13 Thread Bryan Talbot
data. -Bryan On Fri, May 10, 2013 at 7:44 PM, Edward Capriolo wrote: > If you use your off heap memory linux has an OOM killer, that will kill a > random task. > > > On Fri, May 10, 2013 at 11:34 AM, Bryan Talbot wrote: > >> If off-heap memory (for index samples, bloom filte

Re: index_interval

2013-05-10 Thread Bryan Talbot
paged-out / "cold" data read back in again on demand? -Bryan On Wed, May 8, 2013 at 4:24 PM, Jonathan Ellis wrote: > index_interval won't be going away, but you won't need to change it as > often in 2.0: https://issues.apache.org/jira/browse/CASSANDRA-5521 >

Re: Cassandra running High Load with no one using the cluster

2013-05-06 Thread Bryan Talbot
or the stack, a large number of threads will use a large amount of memory. -Bryan

Re: How much heap does Cassandra 1.1.11 really need ?

2013-05-03 Thread Bryan Talbot
It's true that a 16GB heap is generally not a good idea; however, it's not clear from the data provided what problem you're trying to solve. What is it that you don't like about the default settings? -Bryan On Fri, May 3, 2013 at 4:27 AM, Oleg Dulin wrote: > Here i

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-26 Thread Bryan Talbot
I believe that "nodetool rebuild" is used to add a new datacenter, not just a new host to an existing cluster. Is that what you ran to add the node? -Bryan On Fri, Apr 26, 2013 at 1:27 PM, John Watson wrote: > Small relief we're not the only ones that had this issue. >

Re: Cassandra services down frequently [Version 1.1.4]

2013-04-04 Thread Bryan Talbot
On Thu, Apr 4, 2013 at 1:27 AM, wrote: > > After some time (1 hour / 2 hour) cassandra shut services on one or two > nodes with following errors; > Wonder what the workload and schema is like ... We can see from below that you've tweaked and disabled many of the memory "safety valve" and other

Re: Timeseries data

2013-03-27 Thread Bryan Talbot
In the worst case, that is possible, but compaction strategies try to minimize the number of SSTables that a row appears in, so a row being in ALL SSTables is not likely in most cases. -Bryan On Wed, Mar 27, 2013 at 12:17 PM, Kanwar Sangha wrote: > Hi – I have a query on Read with Cassan

Re: old data / tombstones are not deleted after ttl

2013-03-04 Thread Bryan Talbot
not being compacted, the row will not be removed. -Bryan On Sun, Mar 3, 2013 at 11:07 PM, Matthias Zeilinger < matthias.zeilin...@bwinparty.com> wrote: > Hi, > > I'm running Cassandra 1.1.5 and have following issue. > > I'm using
