Re: Using cassandra a BLOB store / web cache.

2016-01-20 Thread Mohit Anchlia
The answer to this question depends very much on the throughput, desired latency and access patterns (R/W or R/O). In general, what I have seen work for high-throughput environments is to either use a distributed file system like Ceph/Gluster or an object store like S3 and keep the pointer in th

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 10:07 AM, Ertio Lew wrote: > My major concern is that is it too bad retrieving 300-500 rows (each for a > single column) in a single read query that I should store all these(around > a hundred million) columns in a single row? You could create multiple rows and each row

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 10:53 AM, Ertio Lew wrote: > Actually these columns are 1 for each entity in my application & I need to > query at any time columns for a list of 300-500 entities in one go. Can you describe your situation with small example?

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 11:00 AM, Ertio Lew wrote: > For each user in my application, I want to store a *value* that is queried > by using the userId. So there is going to be one column for each user > (userId as col Name & *value* as col Value). Now I want to store these > columns such that can

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 11:16 AM, Ertio Lew wrote: > I want to read columns for a randomly selected list of userIds(completely > random). I fetch the data using userIds(which would be used as column names > in case of single row or as rowkeys incase of 1 row for each user) for a > selected list o

Re: Decision Making- YCSB

2012-08-10 Thread Mohit Anchlia
I agree with Edward. We always develop our own stress tool that tests each use case of interest. Every use case is different in certain ways that can only be tested using a custom stress tool. On Fri, Aug 10, 2012 at 7:25 AM, Edward Capriolo wrote: > There are many YCSB forks on github that get opt

DSE solr HA

2012-08-12 Thread Mohit Anchlia
Going through this page and it looks like indexes are stored locally http://www.datastax.com/dev/blog/cassandra-with-solr-integration-details . My question is what happens if one of the solr nodes crashes? Is the data indexed again on those nodes? Also, if RF > 1 then is the same data being indexe

Re: Expanding cluster to include a new DR datacenter

2012-08-24 Thread Mohit Anchlia
That's interesting. Can you run describe cluster? On Fri, Aug 24, 2012 at 12:11 PM, Bryce Godfrey wrote: > So I’m at the point of updating the keyspaces from Simple to > NetworkTopology and I’m not sure if the changes are being accepted using > Cassandra-cli. > > I issue the change:

Re: help required to resolve super column family problems

2012-08-24 Thread Mohit Anchlia
If you are starting out new, use composite column names/values, or you could also use a JSON-style doc as a column value. On Fri, Aug 24, 2012 at 2:31 PM, Rob Coli wrote: > On Fri, Aug 24, 2012 at 4:33 AM, Amit Handa wrote: > > kindly help in resolving the following problem with respect to super >

Re: Decreasing the number of nodes in the ring

2012-08-26 Thread Mohit Anchlia
use nodetool decommission and nodetool removetoken On Sun, Aug 26, 2012 at 5:31 PM, Senthilvel Rangaswamy wrote: > We have a cluster of 9 nodes in the ring. We would like SSD backed boxes. > But we may not need 9 > nodes in that case. What is the best way to downscale the cluster to 6 or > 3 nod

Re: Expanding cluster to include a new DR datacenter

2012-08-27 Thread Mohit Anchlia
> On 25/08/2012, at 6:53 PM, Bryce Godfrey > wrote: > > > > > > Yes > > > > [default@unknown] describe cluster; > > Cluster Information: > > Snitch: org.apache.cassandra.locator.PropertyFileSnitch >

Re: Expanding cluster to include a new DR datacenter

2012-08-27 Thread Mohit Anchlia
strategy_options I should be using the DC name > from property file snitch right? Ours is “Fisher” and “TierPoint” so > that’s what I used. > > *From:* Mohit Anchlia [mailto:mohitanch...@gmail.com] > *Sent:* Monday, August 27, 2012 1:21 PM > > *To:* user@ca

Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
As far as I know, Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC and after that it's up to the tcp/ip stack to deal with buffering. If requests start to time out, Cassandra would use HH up to a certain time. For longer outage you would h

Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
of my back log? > > Although we know when a network is flaky, we are interested in knowing how > much data is piling up in local DC that needs to be transferred. > > Greatly appreciate your help. > > VR > > > On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia wrote: >

Re: nodetool connection refused

2012-09-08 Thread Mohit Anchlia
Are both running on the same host? On Fri, Sep 7, 2012 at 11:53 PM, Manu Zhang wrote: > When I run Cassandra-trunk in Eclipse, nodetool fail to connect with the > following error > "Failed to connect to '127.0.0.1:7199': Connection refused" > But if I run in terminal, all will be fine. >

Re: How to replace a dead *seed* node while keeping quorum

2012-09-12 Thread Mohit Anchlia
How can this be resolved in this case? On Wed, Sep 12, 2012 at 3:53 PM, Rob Coli wrote: > On Tue, Sep 11, 2012 at 4:21 PM, Edward Sargisson > wrote: > > If the downed node is a seed node then neither of the replace a dead node > > procedures work (-Dcassandra.replace_token and taking initial_to

Re: Row cache and counters

2012-12-29 Thread Mohit Anchlia
Can you post your GC settings? Also check the logs and see what they say. Also post how many writes and reads along with avg row size. Sent from my iPhone On Dec 29, 2012, at 12:28 PM, rohit bhatia wrote: > i assume u mean 8 seconds and not 8ms.. > thats pretty huge to be caused by gc. Is there lot of lo

Re: very confused by jmap dump of cassandra

2013-02-21 Thread Mohit Anchlia
Roughly how much data do you have per node? Sent from my iPhone On Feb 20, 2013, at 10:49 AM, "Hiller, Dean" wrote: > I took this jmap dump of cassandra(in production). Before I restarted the > whole production cluster, I had some nodes running compaction and it looked > like all memory had

Re: Cass 1.1.11 out of memory during compaction ?

2013-11-03 Thread Mohit Anchlia
Post your gc logs Sent from my iPhone On Nov 3, 2013, at 6:54 AM, Oleg Dulin wrote: > Cass 1.1.11 ran out of memory on me with this exception (see below). > > My parameters are 8gig heap, new gen is 1200M. > > ERROR [ReadStage:55887] 2013-11-02 23:35:18,419 AbstractCassandraDaemon.java > (l

Re: Commit log on USB flash disk?

2013-11-16 Thread Mohit Anchlia
In our testing USB tends to be slower. Something more integrated internally would give you better performance. Sent from my iPhone On Nov 16, 2013, at 8:30 AM, Dan Simpson wrote: > It doesn't seem like a great idea. The USB drives typically use dynamic wear > leveling. See this a

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
+1 I like the Hector client, which uses the thrift interface and exposes APIs that are similar to how Cassandra physically stores the values. On Thu, Feb 20, 2014 at 9:26 AM, Peter Lin wrote: > > I disagree with the sentiment that "thrift is not worth the trouble". > > CQL and all SQL inspired dialects li

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
On Thu, Feb 20, 2014 at 4:37 PM, Edward Capriolo wrote: > Recommendations in cassandra have a shelf life of about 1 to 2 years. If > you try to assert a recommendation from year ago you stand a solid chance of > someone telling you there is now a better way. > > Cassandra once loved being a schemale

Re: Cassandra blob storage

2014-03-18 Thread Mohit Anchlia
For large volume big data scenarios we don't recommend using Cassandra as blob storage, simply because of the intensive IO involved during compaction, repair etc. Cassandra is only well suited for metadata-type storage. However, if you are fairly low volume then it's a different story, but if you

Re: Cannot query secondary index

2014-06-13 Thread Mohit Anchlia
Some other ways to track old records are: 1) Use external queues - one queue per week or month, for instance, and pile up data on the queue cluster 2) Create one more table in C* to track the keys per week or month that you can scan to read the keys of the audit table. Make sure you delete the entir
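A rough sketch of option 2 above (a tracking table keyed by week), written in Python with a plain dict standing in for the Cassandra table. The names `week_bucket`, `track_key` and `keys_for_week` are hypothetical helpers for illustration, not part of any Cassandra client API, and the ISO-week bucket format is an assumption.

```python
from collections import defaultdict
from datetime import date


def week_bucket(day):
    """Return an ISO year-week label such as '2014-W24' for a date."""
    iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
    return f"{iso[0]}-W{iso[1]:02d}"


# Stand-in for the tracking table: one "row" per week holding audit keys.
# In Cassandra this would be a separate table (assumption, not from the thread).
tracking_table = defaultdict(set)


def track_key(audit_key, written_on):
    """Record an audit-table key under the week in which it was written."""
    tracking_table[week_bucket(written_on)].add(audit_key)


def keys_for_week(day):
    """Return the audit keys recorded for the week containing `day`."""
    return sorted(tracking_table.get(week_bucket(day), set()))


track_key("audit-001", date(2014, 6, 9))
track_key("audit-002", date(2014, 6, 13))
track_key("audit-003", date(2014, 6, 16))
```

Scanning one week's row then yields only the keys written in that week, so old records can be found without a secondary index on the audit table.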

Re: Cache layer in front of cassandra... any help / suggestions?

2011-07-15 Thread Mohit Anchlia
Is row cache not enough for this? Sent from my iPad On Jul 15, 2011, at 12:04 AM, Suman Ghosh wrote: > Hi, > > > > We’re presently trying to use Cassandra as a storage/retrieval system for > live data & composite counters (on the data). > > > > As we work on telecom data records (voic

Re: Best indexing solution for Cassandra

2011-09-28 Thread Mohit Anchlia
look at elasticsearch too. It shards differently. On Wed, Sep 28, 2011 at 1:45 PM, Rafael Almeida wrote: > From Anthony Ikeda : >> Well, we go live with our project very soon and we are now looking into what >> we will be doing for the next phase. One of the enhancements we would like >> to con

Re: cassandra performance degrades after 12 hours

2011-10-03 Thread Mohit Anchlia
On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan wrote: > I am running a cassandra cluster of  6 nodes running RHEL6 virtualized by > ESXi 5.0.  Each VM is configured with 20GB of ram and 12 cores. Our test > setup performs about 3000  inserts per second.  The cassandra data partition > is on a X

Re: cassandra performance degrades after 12 hours

2011-10-03 Thread Mohit Anchlia
>       Memtable thresholds: 0.4/1440/121 (millions of ops/minutes/MB) >       GC grace seconds: 3600 >       Compaction min/max thresholds: 4/32 >       Read repair chance: 1.0 >       Replicate on write: true >       Built indexes: [] > > > > > On Mon, Oct 3, 2011 at 12:2

Re: cassandra performance degrades after 12 hours

2011-10-03 Thread Mohit Anchlia
rs. > Can you elaborate more on reducing the heap space? Do you think it is a > problem with 17G RSS? > thanks > Ramesh > > > On Mon, Oct 3, 2011 at 1:33 PM, Mohit Anchlia > wrote: >> >> I am wondering if you are seeing issues because of more frequent >

Re: cassandra performance degrades after 12 hours

2011-10-03 Thread Mohit Anchlia
em, bigger files get created >> during compaction. You could be in a situation where you might be compacting >> at a higher bucket N level, and compactions build up at lower buckets. >> Run "nodetool -host localhost compactionstats" to get an idea of what's >> go

Re: Consistency level and ReadRepair

2011-10-05 Thread Mohit Anchlia
Do you see any errors in the logs? Is your HH enabled? On Wed, Oct 5, 2011 at 12:00 PM, Ramesh Natarajan wrote: > Lets assume we have 3 nodes all up and running at all times with no > failures or communication problems. > 1. If I have a RF=3 and writing with QUORUM,  2 nodes the change gets > com

Re: how to reduce disk read? (and bloom filter performance)

2011-10-07 Thread Mohit Anchlia
Check your disk utilization using iostat. Also, check if compactions are causing reads to be slow. Check GC too. You can look at cfhistograms output or post it here. On Fri, Oct 7, 2011 at 1:44 AM, Radim Kolar wrote: > Dne 7.10.2011 10:04, aaron morton napsal(a): >> >> Of the top of my head I it

Re: how to reduce disk read? (and bloom filter performance)

2011-10-07 Thread Mohit Anchlia
You'll see output like: Offset SSTables 1 8021 2 783 Which means 783 read operations accessed 2 SSTables On Fri, Oct 7, 2011 at 2:03 PM, Radim Kolar wrote: > Dne 7.10.2011 15:55, Mohit Anchlia napsal(a): >> >> Check your disk util
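To make the reading of that histogram concrete, here is a small Python sketch that parses the two columns quoted above and reports what fraction of reads touched more than one SSTable. It assumes only the "Offset / SSTables" layout shown in the message; the function names are made up for illustration, and real cfhistograms output has more columns than this.

```python
def parse_sstable_histogram(text):
    """Parse 'Offset SSTables' rows into {sstables_per_read: read_count}."""
    counts = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # Skip the header line and anything that is not two integers.
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            continue
        counts[int(parts[0])] = int(parts[1])
    return counts


def multi_sstable_fraction(counts):
    """Fraction of reads that had to consult more than one SSTable."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    multi = sum(reads for sstables, reads in counts.items() if sstables > 1)
    return multi / total


sample = """
Offset      SSTables
1           8021
2           783
"""

counts = parse_sstable_histogram(sample)
fraction = multi_sstable_fraction(counts)
print(f"{fraction:.1%} of reads touched more than one SSTable")
```

A growing share of reads at offsets above 1 would suggest that rows are spread over several SSTables.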

Re: how to reduce disk read? (and bloom filter performance)

2011-10-10 Thread Mohit Anchlia
need to compact more often. On Sun, Oct 9, 2011 at 7:09 AM, Radim Kolar wrote: > Dne 7.10.2011 23:16, Mohit Anchlia napsal(a): >> >> You'll see output like: >> >> Offset      SSTables >> 1                  8021 >> 2                  783 >>

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
You mentioned this happens only on one node? How many nodes do you have? Is it possible to turn off this node completely and run compactions on other nodes and see if this happens there too? Also, you mentioned this happens after compaction. Did you mean during compaction or right after it? What l

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
data/system/HintsColumnFamily-f-7961-Data.db')] >  INFO [FlushWriter:12] 2011-10-12 18:10:09,862 Memtable.java (line 172) > Completed flushing > /var/lib/cassandra/data/system/HintsColumnFamily-f-7962-Data.db (61 bytes) > Load, cpu and memory are nominal. The box is not stressed. ios

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
, Oct 12, 2011 at 1:13 PM, Mohit Anchlia > wrote: >> >> Yes. If you have exhausted all the options I think it will be good to >> see if this issue persists across other nodes after you decommission >> that node. >> >> If this is not production and issue is r

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Mohit Anchlia
Do you have the same seed node specified in cass-analysis-1 as cass-1,2,3? I am thinking that changing the seed node in cass-analysis-2 and following the directions in http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve the problem. Someone please correct me. On Thu, Oct 13, 2011 at 12

Re: how to reduce disk read? (and bloom filter performance)

2011-10-17 Thread Mohit Anchlia
On Sun, Oct 16, 2011 at 2:20 AM, Radim Kolar wrote: > Dne 10.10.2011 18:53, Mohit Anchlia napsal(a): >> >> Does it mean you are not updating a row or deleting them? > > yes. i have 350m rows and only about 100k of them are updated. >> >>  Can you look at JMX val

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-25 Thread Mohit Anchlia
On Tue, Oct 25, 2011 at 11:18 AM, Dan Hendry wrote: >> 2. ... So I am going to use rotational disk for the commit log and an SSD >> for data. Does this make sense? > > > > Yes, just keep in mind however that the primary characteristic of SSDs is > lower seek times which translates into faster rand

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-25 Thread Mohit Anchlia
) Is this intermediate SSD cache thing doable...or I should just stick to > the normal RAID array of disks and the indexes and in memory caching of > columns that Cassandra offers? > > Cheers, > Alex > > On Tue, Oct 25, 2011 at 9:06 PM, Todd Burruss wrote: >> >> T

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Mohit Anchlia
Why not use 2 CFs? On Fri, Oct 28, 2011 at 9:42 PM, Aditya Narayan wrote: > I need to keep the data of some entities in a single CF but split in two > rows for each entity. One row contains an overview information for the > entity & another row contains detailed information about entity. I am > w

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Mohit Anchlia
at 10:22 AM, Aditya Narayan wrote: > ..so that I can retrieve them through a single query. > > For reading cols from two CFs you need two queries, right ? > > > > > On Sat, Oct 29, 2011 at 9:53 PM, Mohit Anchlia > wrote: >> >> Why not use 2 CFs? >> >&

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Mohit Anchlia
On Sat, Oct 29, 2011 at 11:23 AM, Aditya Narayan wrote: > @Mohit: > I have stated the example scenarios in my first post under this heading. > Also I have stated above why I want to split that data in two rows & like > Ikeda below stated, I'm too trying out to prevent the frequently accessed > row

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Mohit Anchlia
On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet wrote: > > > On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean > wrote: >> >> Hey Chris, >> >>  Thanks for sharing all  the info. >>  I have few questions: >>  1. What are you doing with so much memory :) ? How much of it do you >> allocate for heap ? >

Re: Second Cassandra users survey

2011-11-03 Thread Mohit Anchlia
On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson wrote: > I'm using Cassandra as a big graph database, loading large volumes of data > live and linking on the fly. Not sure if Cassandra is right fit to model complex vertexes and edges. > The number of edges grow geometrically with data added, and

Re: Second Cassandra users survey

2011-11-06 Thread Mohit Anchlia
Transparent on disk encryption with pluggable keyprovider will also be really helpful to secure sensitive information. On Sun, Nov 6, 2011 at 9:42 AM, Aaron Turner wrote: > The intent was to have a lighter solution for common problems then > having to go with Hadoop or streaming large quantities

Re: security

2011-11-09 Thread Mohit Anchlia
We lock down ssh to root from any network. We also provide individual logins including sysadmin and they go through LDAP authentication. Anyone who does sudo su as root gets logged and alerted via trapsend. We use firewalls and also have a separate vlan for datastore servers. We then open only speci

Re: Help with Cassandra Row Caches

2011-11-11 Thread Mohit Anchlia
Can you temporarily increase the size of Heap and try? On Fri, Nov 11, 2011 at 5:21 PM, Oleg Tsvinev wrote: > Hi everybody, > > We set row cache too high, 1 or so and now all our 6 nodes fail > with OOM. I believe that high row cache causes OOMs. > > Now, we trying to change row cache sizes u

Re: Second Cassandra users survey

2011-11-14 Thread Mohit Anchlia
On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani wrote: > Re  Simpler "elasticity": > Latest opscenter will now rebalance cluster optimally > http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 > Does it cause any impact on reads and writes while re-balance is in progress? How is it handled

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne wrote: > On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss wrote: >> I'm using cassandra 1.0.  Been doing some testing on using cass's cache. >>  When I turn it on (using the CLI) I see ParNew jump from 3-4ms to >> 200-300ms.  This really screws with

Re: Data Model Design for Login Servie

2011-11-18 Thread Mohit Anchlia
Secondary indexes in Cassandra are not a good fit for high-cardinality values. On Fri, Nov 18, 2011 at 7:14 AM, Dan Hendry wrote: > I they are not limited to repeating values but the Datastax docs[1] on > secondary indexes certainly seem to indicate they would be a poor fit for > this case (high rea

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne wrote: > On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia wrote: >> On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne >> wrote: >>> On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss wrote: >>>> I'm using c

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 9:42 AM, Sylvain Lebresne wrote: > On Fri, Nov 18, 2011 at 6:31 PM, Mohit Anchlia wrote: >> On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne >> wrote: >>> On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia >>> wrote: >>>> On F

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 1:46 PM, Todd Burruss wrote: > Ok, I figured something like that.  Switching to > ConcurrentLinkedHashCacheProvider I see it is a lot better, but still > instead of the 25-30ms response times I enjoyed with no caching, I'm > seeing 500ms at 100% hit rate on the cache.  No o

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
f GC logs including ParNew and other major phases recorded in the logs. Are there any significant writes, memtable flushes etc. occurring during this time? How many reads/sec and writes/sec? What's the size of your row and columns that you are trying to retrieve? > > On 11/18/11 2:40 PM

Re: Efficiency of Cross Data Center Replication...?

2011-11-20 Thread Mohit Anchlia
On Sun, Nov 20, 2011 at 4:01 AM, Boris Yen wrote: > A quick question, what if DC2 is down, and after a while it comes back on. > how does the data get sync to DC2 in this case? (assume hint is disable) > Thanks in advance. Manually, use nodetool repair in rolling fashion on all the nodes of DC2

Re: What sort of load do the tombstones create on the cluster?

2011-11-21 Thread Mohit Anchlia
On Mon, Nov 21, 2011 at 11:47 AM, Edward Capriolo wrote: > > > On Mon, Nov 21, 2011 at 3:30 AM, Philippe wrote: >> >> I don't remember your exact situation but could it be your network >> connectivity? >> I know I've been upgrading mine because I'm maxing out fastethernet on a >> 12 node cluster.

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Mohit Anchlia
> bart@node1:~$ nodetool -h localhost getendpoints A UserDetails 4545027 > 192.168.81.5 > 192.168.81.2 > 192.168.81.3 Can you see what happens if you stop C* say on node .5 and write and read at quorum? On Wed, Dec 14, 2011 at 7:06 AM, Bart Swedrowski wrote: > > > On 14 December 2011 14:58, wro

Re: Garbage collection freezes cassandra node

2011-12-19 Thread Mohit Anchlia
Increasing memory in this case may not solve the problem. Share some information about your workload: cluster configuration, cache sizes etc. You can also try getting a Java heap histogram to get more info on what's on the heap. On Mon, Dec 19, 2011 at 7:35 AM, Rene Kochen wrote: > I recently se

Re: cassandra data to hadoop.

2011-12-24 Thread Mohit Anchlia
You could read using Cassandra client and write to HDFS using Hadoop FS Api. On Fri, Dec 23, 2011 at 11:20 PM, ravikumar visweswara wrote: > Jeremy, > > We use cloudera distribution for our hadoop cluster and may not be possible > to migrate to brisk quickly because of flume/hue dependencies. Did

Re: Pending on ReadStage

2012-01-06 Thread Mohit Anchlia
Are all your nodes equally balanced in terms of read requests? Are you using RandomPartitioner? Are you reading using indexes? First thing you can do is compare iostat -x output between the 2 nodes to rule out any io issues assuming your read requests are equally balanced. On Fri, Jan 6, 2012 at

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian wrote: > Hi Everyone, > > What's the best way to reliably have unique constraints like functionality > with Cassandra? I have the following (which I think should be very common) > use case. > > User CF > Row Key: user email > Columns: userId: UUID

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
the "tracker" CF too, no? > > > On Jan 6, 2012, at 10:38 AM, Mohit Anchlia wrote: > >> On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian wrote: >>> Hi Everyone, >>> >>> What's the best way to reliably have unique constraints like function

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
andra >> a month or so back on this list. >> >> -Jeremiah >> >> On 01/06/2012 02:42 PM, Bryce Allen wrote: >> > On Fri, 6 Jan 2012 10:38:17 -0800 >> > Mohit Anchlia  wrote: >> >> It could be as simple as reading before writing to make sure tha

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
ets it looks like this has been tried > before, and for various reasons was not added. It's definitely > non-trivial to get right. > > On Fri, 6 Jan 2012 13:33:02 -0800 > Mohit Anchlia wrote: >> This looks like right way to do it. But remember this still doesn't >

Installing C* on EC2

2012-01-12 Thread Mohit Anchlia
What's the best way to install C*? Any good links? Is it better to just create instances and install rpms on it first, just like regular cluster and then create image from it? I am assuming it's possible. Are there any known issues when running C* on EC2? How do other C* users deal with instance fa

Brisk with standard C* cluster

2012-01-16 Thread Mohit Anchlia
Is it possible to add Brisk only nodes to standard C* cluster? So if we have node A,B,C with standard C* then add Brisk node D,E,F for analytics?

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Mohit Anchlia
Have you tried running repair first on each node? Also, verify using df -h on the data dirs On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach wrote: > Hi, > > we're using RP and have each node assigned the same amount of the token > space. The cluster looks like that: > > Address         Status

Re: Max records per node for a given secondary index value

2012-01-18 Thread Mohit Anchlia
You need to shard your rows On Wed, Jan 18, 2012 at 5:46 PM, Kamal Bahadur wrote: > Anyone? > > > On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur > wrote: >> >> Hi All, >> >> It is great to know that Cassandra column family can accommodate 2 billion >> columns per row! I was reading about how Cas
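One common way to shard a wide row is to append a bucket number to the row key so that one logical row is spread over several physical rows. The Python sketch below shows the key arithmetic only; the shard count of 16, the `value:shard` key format and the function names are assumptions chosen for illustration, not something stated in the thread.

```python
import zlib


def sharded_row_key(index_value, item_id, num_shards=16):
    """Map an item to one of `num_shards` physical rows for an index value.

    crc32 is used because it is stable across processes, unlike hash().
    """
    shard = zlib.crc32(item_id.encode("utf-8")) % num_shards
    return f"{index_value}:{shard}"


def all_shard_keys(index_value, num_shards=16):
    """Row keys a reader must query to see every item for an index value."""
    return [f"{index_value}:{shard}" for shard in range(num_shards)]


key = sharded_row_key("status=active", "record-12345")
keys = all_shard_keys("status=active")
```

Writes go to the single row returned by `sharded_row_key`, while reads fan out over `all_shard_keys` and merge the results, which keeps any one row from growing without bound at the cost of more read requests.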

Re: Garbage collection freezes cassandra node

2012-01-19 Thread Mohit Anchlia
What version of Java do you use? Can you try reducing NewSize and increasing the old generation? If you are on an old version of Java I also recommend upgrading that version. On Thu, Jan 19, 2012 at 3:27 AM, Rene Kochen wrote: > Thanks for your comments. The application is indeed suffering from a

Re: Cassandra to Oracle?

2012-01-20 Thread Mohit Anchlia
I think the problem stems when you have data in a column that you need to run adhoc query on which is not denormalized. In most cases it's difficult to predict the type of query that would be required. Another way of solving this could be to index the fields in search engine. On Fri, Jan 20, 2012

Re: WARN [Memtable] live ratio

2012-01-30 Thread Mohit Anchlia
I have had the same experience and am wondering what's causing this. One thing I noticed is that when the server has been idle for some time and load then starts going high, I start to see these messages. On Mon, Jan 30, 2012 at 4:54 PM, Roshan wrote: > Hi All > > Time to time I am seen this below

Re: WARN [Memtable] live ratio

2012-01-31 Thread Mohit Anchlia
I guess this is not really a WARN in that case. On Tue, Jan 31, 2012 at 4:29 PM, aaron morton wrote: > The ratio is the ratio of serialised bytes for a memtable to actual JVM > allocated memory. Using a ratio below 1 would imply the JVM is using less > bytes to store the memtable in memory than i

Re: WARN [Memtable] live ratio

2012-02-03 Thread Mohit Anchlia
d on WARN and ERROR. But if there is nothing to do then it probably is just an INFO. > On Tue, Jan 31, 2012 at 9:41 PM, Mohit Anchlia wrote: >> I guess this is not really a WARN in that case. >> >> On Tue, Jan 31, 2012 at 4:29 PM, aaron morton >> wrote: >>> The r

Re: WARN [Memtable] live ratio

2012-02-03 Thread Mohit Anchlia
hen read from. > > On Fri, Feb 3, 2012 at 10:31 AM, Mohit Anchlia wrote: >> On Fri, Feb 3, 2012 at 7:32 AM, Jonathan Ellis wrote: >>> It's a warn because it's nonsense for the JVM to report that an column >>> + overhead, takes less space than just the col

Re: nodetool hangs and didn't print anything with firewall

2012-02-05 Thread Mohit Anchlia
Does it work with iptables disabled? You could add log to your firewall rules to see if firewall is dropping the packets. On Sun, Feb 5, 2012 at 5:35 PM, Roshan wrote: > Hi > > I have 2 node Cassandra cluster and each linux box configured with a > firewall. The ports 7000, 7199 and 9160 are open

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
In my opinion, if you are a busy site or application, keep blobs out of the database. On Wed, Feb 22, 2012 at 9:37 AM, Dan Retzlaff wrote: > Chunking is a good idea, but you'll have to do it yourself. A few of the > columns in our application got quite large (maybe ~150MB) and the failure > mode was

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
Outside on the file system and a pointer to it in C* On Wed, Feb 22, 2012 at 10:03 AM, Rafael Almeida wrote: > Keep them where? > > -- > *From:* Mohit Anchlia > *To:* user@cassandra.apache.org > *Cc:* potek...@bnl.gov > *Sent:* Wednesday, Febr
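A minimal Python sketch of that pointer pattern: the blob bytes go to the file system and only a small metadata record (path, size, checksum) is kept in the database. A plain dict stands in for the Cassandra column family here, and `store_blob` / `load_blob` are hypothetical names; the checksum field is an added assumption, not something the thread prescribes.

```python
import hashlib
import os
import tempfile


def store_blob(blob_dir, blob_id, data, metadata_table):
    """Write the blob to disk and keep only a pointer row in the 'database'."""
    path = os.path.join(blob_dir, blob_id)
    with open(path, "wb") as handle:
        handle.write(data)
    row = {
        "path": path,  # the pointer that would be stored in C*
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
    metadata_table[blob_id] = row
    return row


def load_blob(blob_id, metadata_table):
    """Follow the pointer, read the blob, and verify it against the checksum."""
    row = metadata_table[blob_id]
    with open(row["path"], "rb") as handle:
        data = handle.read()
    if hashlib.sha256(data).hexdigest() != row["sha256"]:
        raise ValueError(f"checksum mismatch for blob {blob_id}")
    return data


metadata_table = {}  # stand-in for a Cassandra column family
with tempfile.TemporaryDirectory() as blob_dir:
    payload = b"x" * 1024
    stored = store_blob(blob_dir, "object-1", payload, metadata_table)
    loaded = load_blob("object-1", metadata_table)
```

The database row stays a few hundred bytes regardless of blob size, so compaction and repair never have to move the large payload.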

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
> On 2/22/2012 1:34 PM, Mohit Anchlia wrote: > > Outside on the file system and a pointer to it in C* > > On Wed, Feb 22, 2012 at 10:03 AM, Rafael Almeida wrote: > >> Keep them where? >> >> -- >> *From:* Mohit Anchlia >

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Mohit Anchlia
On Sun, Feb 26, 2012 at 12:18 PM, aaron morton wrote: > Nathan Milford has a post about taking a node down > > http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/ > > The only thing I would do differently would be turn off thrift first. > > Cheers > Isn't decommission meant to do the sa

Performance overhead when using start and end columns

2012-03-24 Thread Mohit Anchlia
I have rows with around 2K-50K columns but when I do a query I only need to fetch few columns between start and end columns. I was wondering what performance overhead does it cause by using slice query with start and end columns? Looking at the code it looks like when you give start and end column

Re: Performance overhead when using start and end columns

2012-03-26 Thread Mohit Anchlia
/07/04/Cassandra-Query-Plans/ > > Tl;Dr; Select columns with no start, in the natural Comparator order. > > Cheers > > >- > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 25/03/2012, at 2:25 PM, Mohit Anchl

Re: Performance overhead when using start and end columns

2012-03-26 Thread Mohit Anchlia
ickle.com > > On 27/03/2012, at 6:21 AM, Mohit Anchlia wrote: > > Thanks but if I do have to specify start and end columns then how much > overhead roughly would that translate to since reading metadata should be > constant overall? > > On Mon, Mar 26, 2012 at 10:18 AM, aa

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-28 Thread Mohit Anchlia
We are currently using 1.0.0-2 version. Do we still need to migrate to the latest release of 1.0 before migrating to 1.1? Looks like incompatibility is only between 1.0.3-1.0.8. On Tue, Mar 27, 2012 at 6:42 AM, Benoit Perroud wrote: > Thanks for the quick feedback. > > I will drop the schema t

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
or any > details on the upgrade path for these versions). > The incompatibility here is only between 1.1.0-beta1 and 1.1.0-beta2. > > -- > Sylvain > > On Thu, Mar 29, 2012 at 2:50 AM, Mohit Anchlia > wrote: > > We are currently using 1.0.0-2 version. Do we still need

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
Any updates? On Thu, Mar 29, 2012 at 7:31 AM, Mohit Anchlia wrote: > This is from NEWS.txt. So my question is if we are on 1.0.0-2 release do > we still need to upgrade since this impacts releases between 1.0.3-1.0.5? > - > If you are running a multi datacenter setup, you shoul

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
.0.0 does not generate cross-dc forwarding message at all, so you're > safe on that side. > > Is cross-dc forwarding different than replication? > -- > Sylvain > > On Thu, Mar 29, 2012 at 9:33 PM, Mohit Anchlia > wrote: > > Any updates? > > > > > >

Re: cassandra gui

2012-03-30 Thread Mohit Anchlia
On Thu, Mar 29, 2012 at 10:08 PM, Markus Wiesenbacher | Codefreun.de < m...@codefreun.de> wrote: > Hi, > > yes you can insert data into cassandra with apollo, just try the demo > center: http://www.codefreun.de/apolloUI/ > > You can log in by just pressing the login button (autologin) and play around

Re: cassandra gui

2012-03-30 Thread Mohit Anchlia
at columns that falls outside of it > > *From:* Mohit Anchlia [mailto:mohitanch...@gmail.com] > *Sent:* Friday, 30 March 2012 16:57 > > *To:* user@cassandra.apache.org > *Subject:* Re: cassandra gui > > On Thu, Mar 29, 2012 at 10:08 PM, Mar

Re: Question regarding major compaction.

2012-05-01 Thread Mohit Anchlia
+1 On Tue, May 1, 2012 at 12:06 PM, Edward Capriolo wrote: > Also there are some tickets in JIRA to impose a max sstable size and > some other related optimizations that I think got stuck behind levelDB > in coolness factor. Not every use case is good for leveled so adding > more tools and optimi

Updating CF to reversed type

2012-05-04 Thread Mohit Anchlia
Is it possible to update a CF definition to use the "reversed" type? If it is possible, what happens to the old values: do they still remain in ascending order?

Re: Updating CF to reversed type

2012-05-05 Thread Mohit Anchlia
I thought so. Is there a way I can unload and load data after dropping the CF and re-creating it with the reversed type? On Sat, May 5, 2012 at 7:11 AM, Edward Capriolo wrote: > You cannot update comparators because they affect the on-disk ordering. > > On Sat, May 5, 2012 at 2:11 AM, Mohi

Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-14 Thread Mohit Anchlia
That's right. Create a class that implements the required interface, then drop that jar in the lib directory and start the cluster. On Mon, May 14, 2012 at 11:41 AM, Kirk True wrote: > Disclaimer: I've never tried, but I'd imagine you can drop a JAR > containing the class(es) into the lib directory

Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-15 Thread Mohit Anchlia
I agree with Brandon. We only use it for enhancing the authz and authn modules to use LDAP, which C* currently doesn't provide. On Mon, May 14, 2012 at 11:08 PM, Brandon Williams wrote: > On Tue, May 15, 2012 at 12:53 AM, Ertio Lew wrote: > > @Brandon : I just created a jira issue to request this typ

Re: Multi datacenter, WAN hiccups and replication

2012-06-26 Thread Mohit Anchlia
On Tue, Jun 26, 2012 at 7:52 AM, Karthik N wrote: > My Cassandra ring spans two DCs. I use local quorum with replication > factor=3. I do a write in DC1 with local quorum. Data gets written to > multiple nodes in DC1. For the same write to propagate to DC2 only one > copy is sent from the coordin
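
For reference, a quorum in Cassandra is floor(RF / 2) + 1 replicas, so a local quorum with replication factor 3 needs 2 acknowledgements from the local DC. A minimal sketch of that arithmetic follows; the function names are hypothetical, and it assumes the quoted replication factor of 3 applies per datacenter, which the post does not spell out.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge for a quorum."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


rf_per_dc = 3  # assumption: RF=3 in each datacenter
print(quorum(rf_per_dc))              # 2
print(tolerated_failures(rf_per_dc))  # 1
```

With these numbers a local-quorum write in DC1 succeeds once 2 of the 3 local replicas respond, regardless of how the copy forwarded to DC2 fares.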

Re: Multi datacenter, WAN hiccups and replication

2012-06-26 Thread Mohit Anchlia
estion. In general I don't think you can selectively decide on HH (hinted handoff). Besides, HH should only be used when the outage lasts minutes; for longer outages, using HH would only create memory pressure. > On Tuesday, June 26, 2012, Mohit Anchlia wrote: > >> >> On Tue, Jun 26, 2012 at 7:52

Re: Cassandra Authentication

2012-06-28 Thread Mohit Anchlia
On Jun 28, 2012, at 8:45 AM, Christof Bornhoevd wrote: > Hi, > > we are using Cassandra v1.0.8 with Hector v1.0-5 and would like to move our > current system to an operational setting based on Amazon AWS. What are best > practices for addressing security for Cassandra on A

Re: Reduce Cassandra GC

2013-06-15 Thread Mohit Anchlia
Can you paste your GC config? Also, can you take a heap dump at two different points so that we can compare them? A quick thing to do would be to run a histo live at two points and compare. On Jun 15, 2013, at 6:57 AM, Takenori Sato wrote: > > INFO [ScheduledTasks:1] 2013-04-15 14:00:02,74

Re: Reduce Cassandra GC

2013-06-18 Thread Mohit Anchlia
Is your young generation size set to 4GB? Can you paste the output of ps -ef|grep cassandra ? On Tue, Jun 18, 2013 at 8:48 AM, Joel Samuelsson wrote: > Yes, like I said, the only relevant output from that file was: > 2013-06-17T08:11:22.300+: 2551.288: [GC 870971K->216494K(4018176K), > 145.18
