I have a basic understanding of how Cassandra handles the file system (it flushes
Memtables out to SSTables, and SSTables get compacted) and I understand that old
files are only deleted when a node is restarted, when Java does a GC, or when
Cassandra feels like it is running out of space.
My question:
So, in summary, there is no way to predictably and efficiently tell Cassandra
to get rid of all of the extra space it is using on disk?
- Original Message -
From: "Jeffrey Kesselman"
To: user@cassandra.apache.org
Sent: Thursday, May 26, 2011 8:57:49 PM
Subject: Re: Forcing Cassandra to f
What is the ConsistencyLevel of your reads? A ConsistencyLevel.ONE remove
returns when it has deleted the record from at least 1 replica (and any other
ones will be deleted when they can be). It could be the case that you are deleting
the record off of one node and then reading it off of the other one.
Did you set the token values for your nodes? I remember having similar symptoms
when I had a token conflict.
- Original Message -
From: "David McNelis"
To: user@cassandra.apache.org
Sent: Friday, June 3, 2011 5:06:10 PM
Subject: Re: Setting up cluster and nodetool ring in 0.8.0
Edwa
You could try to roll your own. I managed to create a custom 0.8 RPM using the
spec file from the redhat directory. First check out the source. Then edit the
spec file with the following changes:
Set the Version and Release variables appropriately.
At the end of %install, add the following 2 lines:
The second error (the CQL select) is because you have different Key Validation
Class values for your two user columns. users is
org.apache.cassandra.db.marshal.BytesType, while users2 is
org.apache.cassandra.db.marshal.UTF8Type. The select is failing because you are
comparing a String to a byte buffer.
As I understand, it has to do with a node being up but missing the delete
message (remember, if you apply the delete at CL.QUORUM, you can have almost
half the replicas miss it and still succeed). Imagine that you have 3 nodes A,
B, and C, each of which has a column 'foo' with a value 'bar'. The
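A quick sketch of that quorum arithmetic (plain Python, not Cassandra code; the function names are mine):

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge a CL.QUORUM operation."""
    return replication_factor // 2 + 1

def max_replicas_that_can_miss(replication_factor: int) -> int:
    """Replicas that may miss a write or delete while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)

# With RF=3, a QUORUM delete succeeds once 2 replicas apply it,
# so 1 replica can miss the tombstone entirely. With RF=5, 2 of the
# 5 replicas (almost half) can miss it.
```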
Do you mean that it is using all of the available heap? That is the expected
behavior of most long running Java applications. The JVM will not GC until it
needs memory (or you explicitly ask it to) and will only free up a bit of
memory at a time. That is very good behavior from a performance sta
A ColumnPath can contain a super column, so you should be fine inserting a
super column family (in fact I do that). Quoting cassandra.thrift:
struct ColumnPath {
    3: required string column_family,
    4: optional binary super_column,
    5: optional binary column,
}
- Original Message -
"2. Trying to reduce disk occupation I deleted CF which used 90% of available
space. After issuing a "drop column family User;" command
no *User*.db files were deleted. "nodetool compact" didn't help either. How
can that deletion be triggered?"
You have to wait for a garbage collect (or do a rolling restart).
In the Cassandra CLI tutorial(http://wiki.apache.org/cassandra/CassandraCli),
there is an example of creating a secondary index.
Konstantin
- Original Message -
From: "CASSANDRA learner"
To: user@cassandra.apache.org
Sent: Wednesday, July 20, 2011 9:47:28 AM
Subject: best example of ind
As mentioned, there is an init.d script in the RPM package to start and stop
Cassandra (it is what we use). If you do not use the RPM and don't want to or
cannot install the full package, you can get just the script at:
https://svn.apache.org/repos/asf/cassandra/trunk/redhat/cassandra
- Original Message -
I believe that what would happen is that whichever data center has the later
clock will win. Every modification you make gets a time stamp (generally set by
your client to the current time, if you are using one). I believe that whatever
modification happened with the last time stamp is canonical
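The last-write-wins reconciliation described above can be sketched like this (illustrative Python, not the actual Cassandra code):

```python
def reconcile(a, b):
    """Last-write-wins: the modification with the later timestamp is kept.
    Each version is a (timestamp, value) pair, with the timestamp
    generally set by the client to its current time."""
    return a if a[0] >= b[0] else b

dc1 = (1000, "written in DC1")
dc2 = (1005, "written in DC2")  # the data center with the later clock

# The later timestamp wins, regardless of which DC it came from:
winner = reconcile(dc1, dc2)
```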
I have had similar issues when I generated Cassandra for Erlang. It seems that
Thrift 0.6.1 (the latest stable version) does not work with Cassandra. Using
Thrift 0.7 does.
I had issues where it would give me run time errors when trying to send an
insert (it would not serialize correctly).
- Original Message -
> From: Konstantin Naryshkin
> To: user@cassandra.apache.org
> Cc:
> Sent: Thursday, August 4, 2011 10:36 AM
> Subject: Re: Problems using Thrift API in C
>
> I have had similar issues when I generated Cassandra for Erlang. It seems
> that
When I build cassandra, I use:
#ant
#ant release
It does produce a working cassandra.jar, though I am not sure if it will
fulfill your needs since I make mine to create an RPM out of it.
- Original Message -
From: "Norman Maurer"
To: user@cassandra.apache.org
Sent: Monday, August 8, 201
Would you consider adding an RSS feed to the site for the benefit of those who
like to use feed readers to keep track of unread posts and what not?
- Original Message -
From: "Lynn Bender"
To: user@cassandra.apache.org
Sent: Friday, August 12, 2011 2:18:45 PM
Subject: Planet Cassandra is
Thanks. I did not see a link to it when I was sending my message.
- Original Message -
From: "Zhu Han"
To: user@cassandra.apache.org
Sent: Saturday, August 13, 2011 12:11:37 AM
Subject: Re: Planet Cassandra is now live
On Sat, Aug 13, 2011 at 4:35 AM, Konstantin
1. The 100 row limit is for listing (i.e. how many rows that the list command
will print). You can give list another limit:
list User limit 1000;
This limit has nothing to do with any internal Cassandra limitation. I am not
aware of any limitation on the number of rows that you can have.
2. I be
Why are you keeping all your indexes in the same row? We do a similar thing
(maintain several indexes over the same data) and we just have an index column
family with keys like "dest192.168.0.1" which means destination index of
192.168.0.1. You can do rows like User_Keys_By_Last_Name_adams and
look them up with a range query starting with "adams_".
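A toy sketch of that index layout (Python dictionaries standing in for the index column family; all names are illustrative):

```python
# Index rows are keyed by "<index-name><value>", as in the
# "dest192.168.0.1" example above.
index_cf = {}

def index_put(index_name: str, value: str, item: str) -> None:
    """Record an item under its own index row, e.g.
    'User_Keys_By_Last_Name_adams', instead of one giant shared row."""
    index_cf.setdefault(index_name + value, set()).add(item)

index_put("dest", "192.168.0.1", "flow-42")
index_put("User_Keys_By_Last_Name_", "adams", "user-7")
```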
Am I right?
I want to know: what's the cost difference between a range query and a slice query?
If I can use either composite key or composite column name, which one gives me
less query cost?
2011/8/25 Konstantin Naryshkin < konstant...@a-bb.net
Yeah, I believe that Yan has a typo in his post. A CF is not read in one go, a
row is. As for the scalability of having all the columns being read at once, I
do not believe that it was ever meant to be. All the columns in a row are
stored together, on the same set of machines. This means that if
I think that Oleg may have misunderstood how replicas are selected. If you have
3 nodes in your cluster and a RF of 2, Cassandra first selects which two nodes
out of the 3 will get the data, and only then does it write it out. The
selection is based on the row key, the token of the node, and y
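A rough sketch of that selection in Python (toy tokens, clockwise walk from the first node token at or after the row's token; not Cassandra's actual code):

```python
def replicas(ring, row_token, rf):
    """SimpleStrategy-style selection: the first replica is the node whose
    token is the smallest one >= the row token (wrapping around), and the
    remaining replicas are the next nodes clockwise on the ring."""
    tokens = sorted(ring)
    start = next((i for i, t in enumerate(tokens) if t >= row_token), 0)
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(rf)]

ring = {0: "A", 56: "B", 113: "C"}  # 3 nodes with toy tokens
# RF=2: a row with token 30 lands on B, then the next node clockwise, C.
# A row with token 120 wraps around and lands on A, then B.
```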
The ring wraps around, so the value before 0 is the max possible token. I
believe that it is 2**127 - 1.
- Original Message -
From: "Kyle Gibson"
To: user@cassandra.apache.org
Sent: Monday, September 12, 2011 3:30:20 PM
Subject: Re: Replace Live Node
What could you do if the initial_tok
Wait, his nodes are going SC, SC, AT, AT. Shouldn't they go SC, AT, SC, AT? By
which I mean that if he adds another node to the ring (or lowers the
replication factor), he will have a node that is under-utilized. The rings in
his data centers have the tokens:
SC: 0, 1
AT: 85070591730234615865843
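Evenly spaced, alternating token assignment can be sketched like this (plain Python; the ring size matches the 2**127 wrap-around of RandomPartitioner):

```python
RING_SIZE = 2 ** 127  # RandomPartitioner tokens wrap at 2**127

def balanced_tokens(n):
    """Evenly spaced tokens for n nodes on the full ring."""
    return [i * RING_SIZE // n for i in range(n)]

# Alternating data centers around the ring (SC, AT, SC, AT rather than
# SC, SC, AT, AT) keeps both DCs evenly loaded if a node is added or
# the replication factor changes:
assignment = list(zip(["SC", "AT", "SC", "AT"], balanced_tokens(4)))
```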
I believe that minor compactions work on SSTables of the same or similar size,
so unless your tables fall within a small size range of each other, Cassandra
does not see an opportunity to run a minor compaction.
- Original Message -
From: "myreasoner"
To: cassandra-u...
One thing you can do is search over the range from "username:" to "username;".
"username:" is the first possible string starting with "username:". "username;"
is the first possible string after all of the strings that start with
"username:". This works because ';' is the character right after ':' in ASCII.
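The trick can be sketched in Python (illustrative only; `prefix_bounds` is my name for it):

```python
def prefix_bounds(prefix: str):
    """Half-open range covering every string that starts with `prefix`:
    from the prefix itself up to (but not including) the prefix with its
    last character bumped to the next character in the ordering.
    Assumes the last character is not the maximal one."""
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper

lo, hi = prefix_bounds("username:")      # ("username:", "username;")
assert lo <= "username:alice" < hi       # matches the prefix
assert not (lo <= "usernamf" < hi)       # does not match the prefix
```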
Yes, they start Cassandra as a daemon in the background. It is running. You can
connect to it from the CLI or any other client. You can see what it is doing by
reading the logs. cassandra -f starts Cassandra in the foreground, that is why
it does not return a prompt when the server starts.
It picks sequentially (the two previous ones, I believe). So in your example it
would be 105.12 and 105.11.
- Original Message -
From: "Ramesh Natarajan"
To: user@cassandra.apache.org
Sent: Monday, October 3, 2011 5:06:10 PM
Subject: node selection for replication factor 3
I have 6 nod
Cassandra does not break apart a row. All of the columns of a row are kept on
the same nodes.
I believe that writing multiple columns of the same row is atomic, but not
isolated. By which I mean that if one column is written all the other ones
will be written as well, but if a read happens
Method 1 may also result in very wide rows if you have lots and lots of tags
and comments. This can be drastically inefficient in Cassandra (but again,
it depends on your data).
On Mon, Oct 17, 2011 at 05:40, Chintana Wilamuna wrote:
> Hi,
>
> Does anyone have an idea about the pros/cons with mod
We are setting up our application around Cassandra 0.8.0 (we will move to
Cassandra 1.0 in the near future). In production the application will
be running in a two (or more) node cluster with RF 2. In development,
we do not always have 2 machines to test on, so we may have to run a
Cassandra cluster con
You can do a column slice for columns between "image/" (the first
ASCII string that starts with that sub-string) and "image/~" (the last
printable ASCII string that starts with that sub-string).
On Thu, Oct 27, 2011 at 21:10, Jean-Nicolas Boulay Desjardins
wrote:
> Normally in SQL I would use "%"
I realize that it is not realistic to expect it, but it would be good
to have a Partitioner that supports both range slices and automatic
load balancing.
On Thu, Nov 3, 2011 at 13:57, Ertio Lew wrote:
> Provide an option to sort columns by timestamp i.e, in the order they have
> been added to the
I assume that Reports is the super column family; that the first 1: is the
report id, which in the topology is the row key; that the second 1: is the
report line, which in the Cassandra topology is the super column; and that
"value 1" is the column name. If this is not the case, maybe explain the
topology better.
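Under that assumption, the layout would look like this (a Python dict as an illustration; the names come from the post above):

```python
# row key -> super column -> column name -> column value
reports = {
    "1": {                        # report id = row key
        "1": {"value 1": None},   # report line = super column
    }
}
```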
It may be the case that your CL is the issue. You are writing it at
ONE, which means that out of the 4 replicas of that key (two in each
data center), you are only putting it on one of them. When you read at
CL ONE, it only looks at a single replica to see if the data is there.
In other words. If y
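The overlap rule behind this can be sketched in Python (counting replicas, not consistency-level names; a rule-of-thumb illustration, not Cassandra code):

```python
def reads_see_latest_write(write_replicas: int, read_replicas: int, rf: int) -> bool:
    """A read is guaranteed to overlap every write when R + W > RF."""
    return read_replicas + write_replicas > rf

# RF=4 (two replicas in each data center): writing and reading at ONE
# gives no overlap guarantee, while QUORUM (3) on both sides does,
# since 3 + 3 > 4.
assert not reads_see_latest_write(1, 1, 4)
assert reads_see_latest_write(3, 3, 4)
```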
Or just have two column families to do it: A CF idToName that has the
userIds as keys and the userName as the only column and a CF nameToId
that has the userNames as keys and the userId as the only column
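A toy sketch of that two-column-family scheme (Python dicts standing in for the CFs; the names follow the post):

```python
# CF idToName: userId -> userName; CF nameToId: userName -> userId
id_to_name = {}
name_to_id = {}

def add_user(user_id: str, user_name: str) -> None:
    """Write both directions so either value resolves the other."""
    id_to_name[user_id] = user_name
    name_to_id[user_name] = user_id

add_user("42", "adams")
```

Both lookups are then single-key reads, with no secondary index involved.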
On Mon, Nov 14, 2011 at 03:50, chovatia jaydeep
wrote:
> Check if Cassandra secondary index
I am running Cassandra 1.0.0. I am using cqlsh for inspecting my data
(very useful tool, thank you whoever wrote it). I notice that when I
query for the FIRST N REVERSED columns, it omits the column name
on the first column. For example,
cqlsh> SELECT FIRST 1 REVERSED * FROM netflow_raw;
'{"bo
The way that I understand it (and that seems to be consistent with what was
said in this discussion) is that each DC has its own data space. Using your
simplified 1-10 system:
   DC1   DC2
0  D1R1  D2R2
1  D1R1  D2R1
2  D1R1  D2R1
3  D1R1  D2R1
4  D1R1  D2R1
5  D1R2  D2R1
6  D1R2  D2R2
7  D1R2  D
I want to create a custom RPM of Cassandra (so I can deploy it pre-configured).
There is an RPM in the source tree, but it does not contain any details of the
setup required to create the RPM (what files should I have where). I have tried
to run rpmbuild -bi on the spec file and I am getting the
Subject: Making a custom Cassandra RPM
Your apache ant install is too old. The ant that comes with
rhel/centos 5.X isn't new enough to build cassandra. You will need to
install ant manually.
On Wed, May 4, 2011 at 2:01 PM, Konstantin Naryshkin
wrote:
> I want to create a custom RPM of Cassandra (
From: "Konstantin Naryshkin"
To: user@cassandra.apache.org
Sent: Friday, May 6, 2011 2:56:43 PM
Subject: Re: Making a custom Cassandra RPM
Sorry that I did not get back to you on the issue. Your suggestion worked and I
was able to get the RPM to build. Unfortunately, it still does not work for