Re: Beginner Assumptions

2010-06-13 Thread Torsten Curdt
or right now looks like this: > > One CF with LongType Keys which represent a day (e.g. 20100612, > 20100613, ...). Each value is a simple Time Series which is just a > list of 24 Integers (1 Counter for every Hour) packed into 96 bytes > (24 x 4 bytes). > > Then I have a lot of rows w
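
The value layout described above (24 hourly counters packed into 96 bytes) can be sketched roughly as follows. This is only an illustration: the thread gives the sizes but not the byte order or signedness, so big-endian signed 32-bit integers are an assumption, and the helper names `pack_day`, `unpack_day` and `increment_hour` are hypothetical. The increment is a plain read-modify-write on the client side and is not atomic.

```python
import struct

HOURS_PER_DAY = 24
# Assumption: big-endian signed 32-bit integers. The thread only states
# "24 x 4 bytes = 96 bytes", not the byte order or signedness.
DAY_FORMAT = ">24i"


def pack_day(counters):
    """Pack 24 hourly counters into a fixed 96-byte value."""
    if len(counters) != HOURS_PER_DAY:
        raise ValueError("expected 24 hourly counters")
    return struct.pack(DAY_FORMAT, *counters)


def unpack_day(value):
    """Unpack a 96-byte value back into a list of 24 counters."""
    return list(struct.unpack(DAY_FORMAT, value))


def increment_hour(value, hour, amount=1):
    """Return a new packed value with one hour's counter increased.

    This is a client-side read-modify-write, not an atomic increment.
    """
    counters = unpack_day(value)
    counters[hour] += amount
    return pack_day(counters)


empty_day = pack_day([0] * HOURS_PER_DAY)
updated_day = increment_hour(empty_day, hour=5)
```

Here `updated_day` would be the column value written back for a day key such as 20100612; two clients doing this concurrently could overwrite each other's update.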

Re: Beginner Assumptions

2010-06-13 Thread Paul Prescod
On Sun, Jun 13, 2010 at 12:53 AM, Torsten Curdt wrote: > > So you want to increment those counters per hit? I don't think there > is an atomic increment semantic in cassandra yet. (Some one else to > confirm?) > > True for Cassandra 0.6. Progress continued on it a week or so ago: * https://iss

Re: GC Storm

2010-06-13 Thread Peter Schuller
> No, I do not disable compaction during my inserts. It is weird that minor > compaction is triggered less often. If you were just inserting a lot of data fast, it may be that background compaction was unable to keep up with the insertion rate. Simply leaving the node(s) for a while after the insert

Re: GC Storm

2010-06-13 Thread Torsten Curdt
> If you were just inserting a lot of data fast, it may be that > background compaction was unable to keep up with the insertion rate. > Simply leaving the node(s) for a while after the insert storm will let > it catch up with compaction. > > (At least this was the behavior for me on a recent trunk

Re: GC Storm

2010-06-13 Thread Peter Schuller
> We've also seen similar problems > >  https://issues.apache.org/jira/browse/CASSANDRA-1177 To be clear though; un-*flushed* data is very different from un-*compacted* data and the above seems to be about unflushed data? In my test case there was no problem at all flushing data. But my test was

Re: Beginner Assumptions

2010-06-13 Thread Thomas Heller
ner workings of cassandra than how to use it. ;) > And the per hour counts are stored as json? No, they are stored as byte arrays with a fixed size (96 = 24x4byte integers). >  cassandra.get("/page/1", Slice("20100612"..."20100613")) I know how to do it in c

Re: Beginner Assumptions

2010-06-13 Thread Benjamin Black
On Sun, Jun 13, 2010 at 12:53 AM, Torsten Curdt wrote: > > TBH while we are using super columns, they somehow feel wrong to me. I > would be happier if we could move what we do with super columns into > the row key space. But in our case that does not seem to be so easy. > > I'd be quite interes

Re: GC Storm

2010-06-13 Thread Benjamin Black
On Sat, Jun 12, 2010 at 7:46 PM, Anty wrote: > Hi all, > I have a 10-node cluster. After inserting many records into the cluster, I > compact each node with nodetool compact. > During the compaction process, something went wrong with one of the 10 nodes: > when the size of the compacted temp file reach ne

Re: File Descriptor leak

2010-06-13 Thread Matthew Conway
Pretty sure, as the list of file descriptors below shows (at this point the client has exited, so doubly sure it's not open sockets): # lsof -p `ps ax | grep [C]assandraDaemon | awk '{print $1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5 2 /usr/local/apache-cassandra-2010-06

Re: File Descriptor leak

2010-06-13 Thread Jonathan Ellis
Can you open a new ticket, then? Preferably with the thrift code involved, I'm not sure what find_by_natural_key or find_all_by_service is translating into. (It looks like just one of those is responsible for the leak.) On Sun, Jun 13, 2010 at 12:11 PM, Matthew Conway wrote: > Pretty sure as th

Re: Data format stability

2010-06-13 Thread Matthew Conway
Not so much worried about temporary breakages, but more about design decisions that are made to enhance cassandra at the cost of a data format change. So long as the policy here is to preserve backwards compatibility with the on disk storage format (possibly with an automatic conversion), even

Re: Data format stability

2010-06-13 Thread Benjamin Black
What specifically is driving you to use trunk rather than the stable, 0.6 branch? On Sun, Jun 13, 2010 at 1:37 PM, Matthew Conway wrote: > Not so much worried about temporary breakages, but more about design > decisions that are made to enhance cassandra at the cost of a data format > change.  

Re: Beginner Assumptions

2010-06-13 Thread Mark Robson
On 13 June 2010 18:54, Benjamin Black wrote: > On Sun, Jun 13, 2010 at 12:53 AM, Torsten Curdt wrote: > > > > TBH while we are using super columns, they somehow feel wrong to me. I > > would be happier if we could move what we do with super columns into > > the row key space. But in our case tha

Re: Beginner Assumptions

2010-06-13 Thread Benjamin Black
On Sun, Jun 13, 2010 at 3:08 PM, Mark Robson wrote: > > Range queries I think make them less useful, Not to my knowledge. > but only work if you're using > OrderPreservingPartitioner. The OPP comes with its own caveats - your nodes > are likely to become badly unbalanced, particularly if you use

Re: Data format stability

2010-06-13 Thread Matthew Conway
The ability to dynamically add new column families. Our app is currently under heavy development, and we will be adding new column families at least once a week after we have shipped the initial production app. From the existing docs, it seemed to me that the procedure for changing schema in 0.

Re: Data format stability

2010-06-13 Thread Benjamin Black
On Sun, Jun 13, 2010 at 5:58 PM, Matthew Conway wrote: > The ability to dynamically add new column families.  Our app is currently > under heavy development, and we will be adding new column families at least > once a week after we have shipped the initial production app. From the > existing do

RE: read operation is slow

2010-06-13 Thread aaron
I'm not sure about the client you're using, but I've noticed in the past that an incorrect Thrift stack can make things run slow (like 40 times slower). Check that the network stack wraps the socket in a Transport, preferably the TBufferedTransport. I'm guessing the client you're using is doing the