Re: Beginner Assumptions

2010-06-13 Thread Torsten Curdt
> Anyway, I want to store some (a lot of) Time Series data in Cassandra
> and would like to check if my assumptions are correct so far. So if
> someone with operational experience could confirm these I'd really
> appreciate it.
>
> Basically the structure I'm going for right now looks like this:
>
> One CF with LongType Keys which represent a day (e.g. 20100612,
> 20100613, ...). Each value is a simple Time Series which is just a
> list of 24 Integers (1 Counter for every Hour) packed into 96 bytes
> (24x4byte).
>
> Then I have a lot of rows which each accumulate one column per day. Put
> in Web Analytics terms I might count the number of views a page gets:
>
> row:"/page/1" => cols:[20100612 => [12,34,...], 20100613 => [34,93,...], ...]
> row:"/page/2" => cols:[20100612 => [1,...], ...]

And the per hour counts are stored as json?

> Over a couple of years I would collect "millions" of rows, each with
> "hundreds" of columns.
>
> So, Assumption #1:
>
> Only the row key decides where the data lives (via consistent
> hashing)? So each tuple for a row lives on the same node(s) which in
> turn makes querying for slices of columns fast. I really need fast
> queries (It is fast in my tests but I'm working on a very small subset
> only).
>
> Assumption #2:
>
> Basically the only query for this CF will always look like "get <date range> of data for <row key>". I can actually just get a slice of
> columns using 'start' and 'count' and this would perform just as fast
> (or faster) than building my list of keys on the client and doing a
> multi get?
>
> Beware SQL! Translated to SQL (since this is what my brain does all the time):
> SELECT data FROM time_series WHERE key = '/page/1' ORDER BY day DESC LIMIT 90;
> vs
> SELECT data FROM time_series WHERE key = '/page/1' AND day IN
> ('2010-06-13', '2010-06-12', ...);
> vs
> memcache.get(['20100613:/page/1', '20100612:/page/1', ...])

I thought your row key is "/page/1" and the date is in the column? So
you want this?

 cassandra.get("/page/1", Slice("20100612"..."20100613"))

> Assumption #3:
>
> Since the data grows in a fixed rate per row and only the number of
> rows varies it should be simple enough to predict storage
> requirements. Rows are "equally" distributed on the cluster (using
> RandomPartitioner) and should a node reach its capacity limit the
> cluster will migrate rows to new nodes. Making it easy to scale out.
> That's the point, right? :P

I doubt your data will grow at a fixed rate per row (unless you always
have the same hit pattern for your pages). But you should be able to
calculate the maximum storage requirement. That said
- I am wondering... where are you aggregating the counts per hour?

> Assumption #4:
>
> I might update the current day data multiple times until the day
> passes and the data becomes immutable. It is ok for Clients to see old
> data but the data must be "correct" at some point (eventually
> consistent ha!). This seems to be solved, just something the SQL Devil
> on my shoulder keeps bugging me about.

So you want to increment those counters per hit? I don't think there
is an atomic increment semantic in cassandra yet. (Someone else to
confirm?)
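
The closest thing right now would be a client-side read-modify-write,
which is exactly what is not safe with concurrent writers. A rough
sketch, reusing the packed-counter layout from your post (keyspace, CF
name and the Ruby cassandra gem calls here are just illustrative):

  require 'cassandra'
  client = Cassandra.new('Keyspace1', '127.0.0.1:9160')

  # read the current day's 24 packed counters (the column may not exist yet)
  packed = client.get(:TimeSeries, '/page/1', '20100613')
  counts = packed ? packed.unpack('N24') : Array.new(24, 0)

  # bump this hour and write the whole 96-byte blob back; two clients
  # doing this at once will lose updates (last write wins)
  counts[Time.now.utc.hour] += 1
  client.insert(:TimeSeries, '/page/1', { '20100613' => counts.pack('N24') })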

> I think I "got" it and will get my hands dirty soon, just wanted to
> squash my last doubts. I've done this on Riak too but I wasn't too
> happy with it.

Just wondering what felt wrong about Riak.

> Cassandra feels "right" although it took some Jedi Mind
> Tricks to grasp SuperColumns.


TBH while we are using super columns, they somehow feel wrong to me. I
would be happier if we could move what we do with super columns into
the row key space. But in our case that does not seem to be so easy.


cheers
--
Torsten


Re: Beginner Assumptions

2010-06-13 Thread Paul Prescod
On Sun, Jun 13, 2010 at 12:53 AM, Torsten Curdt  wrote:

>
> So you want to increment those counters per hit? I don't think there
> is an atomic increment semantic in cassandra yet. (Someone else to
> confirm?)
>
True for Cassandra 0.6.

Progress continued on it a week or so ago:

 * https://issues.apache.org/jira/browse/CASSANDRA-1072

 Paul Prescod


Re: GC Storm

2010-06-13 Thread Peter Schuller
> No, I do not disable compaction during my inserts. It is weird that minor
> compaction is triggered less often.

If you were just inserting a lot of data fast, it may be that
background compaction was unable to keep up with the insertion rate.
Simply leaving the node(s) for a while after the insert storm will let
it catch up with compaction.

(At least this was the behavior for me on a recent trunk.)

-- 
/ Peter Schuller


Re: GC Storm

2010-06-13 Thread Torsten Curdt
> If you were just inserting a lot of data fast, it may be that
> background compaction was unable to keep up with the insertion rate.
> Simply leaving the node(s) for a while after the insert storm will let
> it catch up with compaction.
>
> (At least this was the behavior for me on a recent trunk.)

We've also seen similar problems

 https://issues.apache.org/jira/browse/CASSANDRA-1177

Sounds like this is about pushing back on inserts that come in too fast:

 https://issues.apache.org/jira/browse/CASSANDRA-685

cheers
--
Torsten


Re: GC Storm

2010-06-13 Thread Peter Schuller
> We've also seen similar problems
>
>  https://issues.apache.org/jira/browse/CASSANDRA-1177

To be clear though: un-*flushed* data is very different from
un-*compacted* data, and the above seems to be about unflushed data?

In my test case there was no problem at all flushing data. But my test
was sustained writes up to 200 million rows and ~200 GB of space, and
as the database grew larger the compaction cost went up (as expected).

For that particular workload it would probably have been beneficial to
have a configurable concurrency on compactions as well (similar to how
it is configurable for sstable flushing) because CPU was the
bottleneck in the compaction process (i.e., after stopping inserts and
letting compaction complete in the background, the compaction thread
was CPU-bound and there was plenty of available I/O capacity).

-- 
/ Peter Schuller


Re: Beginner Assumptions

2010-06-13 Thread Thomas Heller
Hey,

I'm sorry, I think I didn't make myself clear enough. I'm using
cassandra only to store the _results_ (the calculated time series),
not the source data. Also, using "Beginner Assumptions" as the Subject
probably wasn't the best choice since I'm more interested in the inner
workings of cassandra than how to use it. ;)

> And the per hour counts are stored as json?

No, they are stored as byte arrays with a fixed size (96 bytes = 24 x 4-byte integers).
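
In Ruby the packing is just this (sketch; 'N' is a 4-byte big-endian
unsigned int, so the byte order is well defined no matter which client
reads it back):

  hourly = Array.new(24, 0)      # one counter per hour
  packed = hourly.pack('N24')    # => 96-byte string, stored as the column value
  hourly = packed.unpack('N24')  # and back again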

>  cassandra.get("/page/1", Slice("20100612"..."20100613"))

I know how to do it in cassandra, I was just comparing it to the
others. I was interested to know if

cassandra.get("/page/1", :start => "20100612", :count => 90)
is actually just as fast as
cassandra.get("/page/1", Slice("20100612", "20100613", ...)) with 90 keys

>
>> Assumption #3:
> I doubt your data will grow at a fixed rate per row (unless you always
> have the same hit pattern for your pages). But you should be able to
> calculate the maximum storage requirement. That said
> - I am wondering... where are you aggregating the counts per hour?

The data is currently just stored in logfiles which are parsed once an
hour in a map/reduce-like fashion (not stored in cassandra). Even if
there are no values to be saved there will still be a column for this
row with [0, 0, 0, ...]. I also do not need to increment any of those
counters live. Hit patterns don't matter since 1 million views per hour
consume just the same space as 0 views (96 bytes fixed). I may at some
point remove the 0 values to save space, but right now there is always
one column per day per row.
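
So the write side is just the hourly log run rewriting the whole 96-byte
column for the current day, one insert per row (sketch; CF name, client
and the aggregate_for helper are made up):

  counts = aggregate_for('/page/1', '20100613')   # hypothetical: 24 ints from the log run
  client.insert(:TimeSeries, '/page/1', { '20100613' => counts.pack('N24') })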

>
> So you want to increment those counters per hit? I don't think there
> is an atomic increment semantic in cassandra yet. (Someone else to
> confirm?)

No, see above. Each view generates one entry in a logfile which is
append-only (much like the cassandra commitlog). Incrementing those
counters live is very unlikely to happen, since they are just one part
of the whole log map/reduce thing. The offline processing part is not
moving into cassandra anytime soon, I just want to put the results
somewhere. SQL is fine for that (atm) but I was interested in some
NoSQL and this seemed like a good use case (very structured data, only
accessed by keys or key ranges where the key is always known, i.e. no
dynamic queries).

Cheers,
/thomas


Re: Beginner Assumptions

2010-06-13 Thread Benjamin Black
On Sun, Jun 13, 2010 at 12:53 AM, Torsten Curdt  wrote:
> 
> TBH while we are using super columns, they somehow feel wrong to me. I
> would be happier if we could move what we do with super columns into
> the row key space. But in our case that does not seem to be so easy.
> 
>

I'd be quite interested to learn what you are doing with super columns
that cannot be replicated with composite keys and range queries.
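
To make it concrete, by composite keys I mean folding what would have
been the super column name into the row key, roughly like this (CF and
column names are just illustrative):

  # super column layout:  row "/page/1" -> super "20100612" -> sub columns
  # composite key layout: row "/page/1:20100612" -> plain columns
  packed_counters = Array.new(24, 0).pack('N24')
  client.insert(:PageDays, '/page/1:20100612', { 'counts' => packed_counters })

  # with OrderPreservingPartitioner the rows for one page then sort next to
  # each other, so e.g. all of June can be read back with a key range query
  # (Thrift get_range_slices) from "/page/1:20100601" to "/page/1:20100630"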


b


Re: GC Storm

2010-06-13 Thread Benjamin Black
On Sat, Jun 12, 2010 at 7:46 PM, Anty  wrote:
> Hi all,
> I have a 10-node cluster. After inserting many records into the cluster, I
> compact each node with nodetool compact.
> During the compaction process, something went wrong with one of the 10 nodes
> when the size of the compacted temp file reached nearly 100 GB (before
> compaction, the size was ~240 GB).

Compaction is not compression, it is merging of SSTables and tombstone
elimination.  If you are not doing many deletes or overwrites of
existing data, the compacted SSTable will be about the same size as
the total size of all the smaller SSTables that went into it.  It is
not clear to me how you ended up with 5000 SSTables (the *-data.db
files) of such small size if you have not disabled minor compactions.

Can you post your storage-conf.xml someplace (pastie or
gist.github.com, for example)?


b


Re: File Descriptor leak

2010-06-13 Thread Matthew Conway
Pretty sure, as the list of file descriptors below shows (at this point the
client has exited, so doubly sure it's not open sockets):

# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print $1}'` | \
    awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
      2 /usr/local/apache-cassandra-2010-06-11_12-30-33/lib/slf4j-log4j12-1.5.8.jar
      2 /usr/local/apache-cassandra-2010-06-11_12-30-33/lib/snakeyaml-1.6.jar
      2 /usr/share/java/gnome-java-bridge.jar
   1003 /mnt/cassandra/data/MyKeyspace/MySuperColumn-c-1-Data.db
   1003 /mnt/cassandra/data/MyKeyspace/MySuperColumn-c-2-Data.db

On Jun 11, 2010, at Fri Jun 11, 7:34 PM, Jonathan Ellis wrote:

> it goes up by exactly 2000, which is the number of loop iterations
> exactly?  are you sure this isn't just counting your open sockets?
> 
> On Fri, Jun 11, 2010 at 1:53 PM, Matthew Conway  wrote:
>> Thanks, I just tried apache-cassandra-2010-06-11_12-30-33 (hudson 462) but
>> my tests are still reporting a leak (though not as bad). I do the following
>> (ruby tests using cassandra_object/cassandra, but you should be able to get
>> the idea):
>> 
>>  should "not leak file descriptors" do
>>cassandra_pid = `ps ax | grep [C]assandraDaemon | awk '{print $1}'`
>>original_count = `lsof -p #{cassandra_pid}`.lines.to_a.size
>>assert original_count > 0
>>count = 1000
>>count.times do |n|
>>  ChildMetadatum.new(:service_id => 4, :child_id => "def#{n}", 
>> :updated => Time.now, :labels => ["label2", "label3"]).save!
>>end
>>count.times do |n|
>>  ChildMetadatum.find_by_natural_key(:service_id => 4, :child_id => 
>> "def#{n}")
>>  ChildMetadatum.find_all_by_service_id(3)
>>end
>>new_count = `lsof -p #{cassandra_pid}`.lines.to_a.size
>>assert new_count > 0
>>assert new_count < original_count * 1.1, "File descriptors leaked 
>> from #{original_count} to #{new_count}"
>>  end
>> 
>> Which reports: File descriptors leaked from 112 to 2112.
>> Should I reopen the bug or create a new one?
>> 
>> Matt
>> 
>> On Jun 10, 2010, at Thu Jun 10, 6:40 PM, Jonathan Ellis wrote:
>> 
>>> Fixed in https://issues.apache.org/jira/browse/CASSANDRA-1178
>>> 
>>> On Thu, Jun 10, 2010 at 9:01 AM, Matt Conway  wrote:
 Hi All,
 I'm running a small 4-node cluster with minimal load using
 the 2010-06-08_12-31-16 build from trunk, and its exhausting file
 descriptors pretty quickly (65K in less than an hour).  Here's a list of 
 the
 files I see it  leaking, I can do a more specific query if you'd like.  Am 
 I
 doing something wrong, is this a known problem, something being done wrong
 from the client side, or something else?  Any help appreciated, thanks,
 Matt
 r...@cassandra01:~# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print
 $1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
   3 /mnt/cassandra/data/system/Schema-c-2-Data.db
1278 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-7-Data.db
1405 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-9-Data.db
1895 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-5-Data.db
   26655 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-11-Data.db
 
 
>>> 
>>> 
>>> 
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com



Re: File Descriptor leak

2010-06-13 Thread Jonathan Ellis
Can you open a new ticket, then?  Preferably with the thrift code
involved; I'm not sure what find_by_natural_key or find_all_by_service_id
is translating into.  (It looks like just one of those is responsible
for the leak.)
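
One way to see what those helpers translate into is to turn on debug
logging for the Thrift service on one node and re-run the test, e.g. in
the server's log4j properties file (file and logger names here are
assumptions, adjust them to your build):

  # log each incoming Thrift call (get_slice, multiget_slice, ...) at DEBUG
  log4j.logger.org.apache.cassandra.thrift.CassandraServer=DEBUG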

On Sun, Jun 13, 2010 at 12:11 PM, Matthew Conway  wrote:
> Pretty sure, as the list of file descriptors below shows (at this point the
> client has exited, so doubly sure it's not open sockets):
>
> # lsof -p `ps ax | grep [C]assandraDaemon | awk '{print $1}'` | \
>     awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
>       2 /usr/local/apache-cassandra-2010-06-11_12-30-33/lib/slf4j-log4j12-1.5.8.jar
>       2 /usr/local/apache-cassandra-2010-06-11_12-30-33/lib/snakeyaml-1.6.jar
>       2 /usr/share/java/gnome-java-bridge.jar
>    1003 /mnt/cassandra/data/MyKeyspace/MySuperColumn-c-1-Data.db
>    1003 /mnt/cassandra/data/MyKeyspace/MySuperColumn-c-2-Data.db
>
> On Jun 11, 2010, at Fri Jun 11, 7:34 PM, Jonathan Ellis wrote:
>
>> it goes up by exactly 2000, which is the number of loop iterations
>> exactly?  are you sure this isn't just counting your open sockets?
>>
>> On Fri, Jun 11, 2010 at 1:53 PM, Matthew Conway  wrote:
>>> Thanks, I just tried apache-cassandra-2010-06-11_12-30-33 (hudson 462) but
>>> my tests are still reporting a leak (though not as bad). I do the
>>> following (ruby tests using cassandra_object/cassandra, but you should be
>>> able to get the idea):
>>>
>>>      should "not leak file descriptors" do
>>>        cassandra_pid = `ps ax | grep [C]assandraDaemon | awk '{print $1}'`
>>>        original_count = `lsof -p #{cassandra_pid}`.lines.to_a.size
>>>        assert original_count > 0
>>>        count = 1000
>>>        count.times do |n|
>>>          ChildMetadatum.new(:service_id => 4, :child_id => "def#{n}", 
>>> :updated => Time.now, :labels => ["label2", "label3"]).save!
>>>        end
>>>        count.times do |n|
>>>          ChildMetadatum.find_by_natural_key(:service_id => 4, :child_id => 
>>> "def#{n}")
>>>          ChildMetadatum.find_all_by_service_id(3)
>>>        end
>>>        new_count = `lsof -p #{cassandra_pid}`.lines.to_a.size
>>>        assert new_count > 0
>>>        assert new_count < original_count * 1.1, "File descriptors leaked 
>>> from #{original_count} to #{new_count}"
>>>      end
>>>
>>> Which reports: File descriptors leaked from 112 to 2112.
>>> Should I reopen the bug or create a new one?
>>>
>>> Matt
>>>
>>> On Jun 10, 2010, at Thu Jun 10, 6:40 PM, Jonathan Ellis wrote:
>>>
 Fixed in https://issues.apache.org/jira/browse/CASSANDRA-1178

 On Thu, Jun 10, 2010 at 9:01 AM, Matt Conway  wrote:
> Hi All,
> I'm running a small 4-node cluster with minimal load using
> the 2010-06-08_12-31-16 build from trunk, and its exhausting file
> descriptors pretty quickly (65K in less than an hour).  Here's a list of 
> the
> files I see it  leaking, I can do a more specific query if you'd like.  
> Am I
> doing something wrong, is this a known problem, something being done wrong
> from the client side, or something else?  Any help appreciated, thanks,
> Matt
> r...@cassandra01:~# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print
> $1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
>       3 /mnt/cassandra/data/system/Schema-c-2-Data.db
>    1278 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-7-Data.db
>    1405 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-9-Data.db
>    1895 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-5-Data.db
>   26655 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-11-Data.db
>
>



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com
>>>
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Data format stability

2010-06-13 Thread Matthew Conway
Not so much worried about temporary breakages, but more about design decisions
that are made to enhance cassandra at the cost of a data format change.  So
long as the policy here is to preserve backwards compatibility with the on-disk
storage format (possibly with an automatic conversion), even that shouldn't be
a problem.  I'll go ahead and follow commits/dev, but given my lack of
experience with cassandra, I'm worried I might not be able to tell an important
change from an unimportant one, so it would be helpful to know whether any more
are planned for 0.7, or whether an extra step could be taken to announce these
kinds of changes on one of the lists.  Thanks,

Matt

On Jun 11, 2010, at Fri Jun 11, 7:43 PM, Jonathan Ellis wrote:

> If you're comfortable following comm...@cassandra.apache.org, it
> should be pretty obvious which changes are going to break things
> temporarily or require a commitlog drain.  Otherwise, we recommend
> sticking with the stable branch until a beta is released.
> 
> On Fri, Jun 11, 2010 at 2:24 PM, Matthew Conway  wrote:
>> Hi All,
>> 
>> I'd like to start using trunk for something real, but am concerned about 
>> stability of the data format.  That is, will I be able to upgrade a running 
>> system to a newer version of trunk and eventually to the 0.7 release, or are 
>> there any changes planned to the format of the data stored on disk that 
>> would prevent this.  I'm ok with having to do a full shutdown to do 
>> upgrades, or even some form of export/import (would prefer to avoid this), 
>> but obviously would need to know when the format has changed so I can do the 
>> right thing (announcements to mailing list?).  What is the recommended 
>> procedure for dealing with upgrades?  Thanks,
>> 
>> Matt
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com



Re: Data format stability

2010-06-13 Thread Benjamin Black
What specifically is driving you to use trunk rather than the stable,
0.6 branch?

On Sun, Jun 13, 2010 at 1:37 PM, Matthew Conway  wrote:
> Not so much worried about temporary breakages, but more about design
> decisions that are made to enhance cassandra at the cost of a data format
> change.  So long as the policy here is to preserve backwards compatibility
> with the on-disk storage format (possibly with an automatic conversion), even
> that shouldn't be a problem.  I'll go ahead and follow commits/dev, but given
> my lack of experience with cassandra, I'm worried I might not be able to tell
> an important change from an unimportant one, so it would be helpful to know
> whether any more are planned for 0.7, or whether an extra step could be taken
> to announce these kinds of changes on one of the lists.  Thanks,
>
> Matt
>
> On Jun 11, 2010, at Fri Jun 11, 7:43 PM, Jonathan Ellis wrote:
>
>> If you're comfortable following comm...@cassandra.apache.org, it
>> should be pretty obvious which changes are going to break things
>> temporarily or require a commitlog drain.  Otherwise, we recommend
>> sticking with the stable branch until a beta is released.
>>
>> On Fri, Jun 11, 2010 at 2:24 PM, Matthew Conway  wrote:
>>> Hi All,
>>>
>>> I'd like to start using trunk for something real, but am concerned about 
>>> stability of the data format.  That is, will I be able to upgrade a running 
>>> system to a newer version of trunk and eventually to the 0.7 release, or 
>>> are there any changes planned to the format of the data stored on disk that 
>>> would prevent this.  I'm ok with having to do a full shutdown to do 
>>> upgrades, or even some form of export/import (would prefer to avoid this), 
>>> but obviously would need to know when the format has changed so I can do 
>>> the right thing (announcements to mailing list?).  What is the recommended 
>>> procedure for dealing with upgrades?  Thanks,
>>>
>>> Matt
>>>
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>


Re: Beginner Assumptions

2010-06-13 Thread Mark Robson
On 13 June 2010 18:54, Benjamin Black  wrote:

> On Sun, Jun 13, 2010 at 12:53 AM, Torsten Curdt  wrote:
> > 
> > TBH while we are using super columns, they somehow feel wrong to me. I
> > would be happier if we could move what we do with super columns into
> > the row key space. But in our case that does not seem to be so easy.
> > 
> >
>
> I'd be quite interested to learn what you are doing with super columns

that cannot be replicated with composite keys and range queries.
>

Range queries I think make them less useful, but only work if you're using
OrderPreservingPartitioner. The OPP comes with its own caveats - your nodes
are likely to become badly unbalanced, particularly if you use time-based
keys.

Mark


Re: Beginner Assumptions

2010-06-13 Thread Benjamin Black
On Sun, Jun 13, 2010 at 3:08 PM, Mark Robson  wrote:
>
> Range queries I think make them less useful,

Not to my knowledge.

> but only work if you're using
> OrderPreservingPartitioner. The OPP comes with its own caveats - your nodes
> are likely to become badly unbalanced, particularly if you use time-based
> keys.

I am aware of the need for careful token selection and cluster
management.  You can mitigate much of the problem using randomized
keys, in effect emulating RP.  My question stands.


b


Re: Data format stability

2010-06-13 Thread Matthew Conway
The ability to dynamically add new column families.  Our app is currently under 
heavy development, and we will be adding new column families at least once a 
week after we have shipped the initial production app. From the existing docs, 
it seemed to me that the procedure for changing schema in 0.6 is very manual in 
nature and thus error prone and likely to cause data corruption.  Feel free to 
correct me if I'm wrong :)

Matt

On Jun 13, 2010, at Sun Jun 13, 5:01 PM, Benjamin Black wrote:

> What specifically is driving you to use trunk rather than the stable,
> 0.6 branch?
> 
> On Sun, Jun 13, 2010 at 1:37 PM, Matthew Conway  wrote:
>> Not so much worried about temporary breakages, but more about design
>> decisions that are made to enhance cassandra at the cost of a data format
>> change.  So long as the policy here is to preserve backwards compatibility
>> with the on-disk storage format (possibly with an automatic conversion),
>> even that shouldn't be a problem.  I'll go ahead and follow commits/dev, but
>> given my lack of experience with cassandra, I'm worried I might not be able
>> to tell an important change from an unimportant one, so it would be helpful
>> to know whether any more are planned for 0.7, or whether an extra step could
>> be taken to announce these kinds of changes on one of the lists.  Thanks,
>> 
>> Matt
>> 
>> On Jun 11, 2010, at Fri Jun 11, 7:43 PM, Jonathan Ellis wrote:
>> 
>>> If you're comfortable following comm...@cassandra.apache.org, it
>>> should be pretty obvious which changes are going to break things
>>> temporarily or require a commitlog drain.  Otherwise, we recommend
>>> sticking with the stable branch until a beta is released.
>>> 
>>> On Fri, Jun 11, 2010 at 2:24 PM, Matthew Conway  wrote:
 Hi All,
 
 I'd like to start using trunk for something real, but am concerned about 
 stability of the data format.  That is, will I be able to upgrade a 
 running system to a newer version of trunk and eventually to the 0.7 
 release, or are there any changes planned to the format of the data stored 
 on disk that would prevent this.  I'm ok with having to do a full shutdown 
 to do upgrades, or even some form of export/import (would prefer to avoid 
 this), but obviously would need to know when the format has changed so I 
 can do the right thing (announcements to mailing list?).  What is the 
 recommended procedure for dealing with upgrades?  Thanks,
 
 Matt
 
 
>>> 
>>> 
>>> 
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>> 
>> 



Re: Data format stability

2010-06-13 Thread Benjamin Black
On Sun, Jun 13, 2010 at 5:58 PM, Matthew Conway  wrote:
> The ability to dynamically add new column families.  Our app is currently 
> under heavy development, and we will be adding new column families at least 
> once a week after we have shipped the initial production app. From the 
> existing docs, it seemed to me that the procedure for changing schema in 0.6 
> is very manual in nature and thus error prone and likely to cause data 
> corruption.  Feel free to correct me if I'm wrong :)
>

I do schema manipulations in 0.6 regularly.  The answer is automation.
 As for data corruption: what did you read that gave you that
impression?

If this is the only motivator and you are really only changing things
once/week or so, I suggest sticking with 0.6 and figuring out some
automation.  You should be using it, anyway.
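
By automation I don't mean anything fancy. In 0.6 a new column family is
just a new <ColumnFamily Name="..."/> element inside your keyspace in
storage-conf.xml on every node plus a rolling restart, so a trivial
script is enough. A rough sketch (hostnames, paths and init script are
illustrative):

  #!/usr/bin/env ruby
  # Push the updated storage-conf.xml (already containing the new
  # ColumnFamily element) to each node, then bounce the nodes one at a time.
  NODES = %w[cass01 cass02 cass03 cass04]

  NODES.each do |node|
    system("scp storage-conf.xml #{node}:/etc/cassandra/storage-conf.xml") or abort(node)
    system("ssh #{node} /etc/init.d/cassandra restart") or abort(node)
    sleep 30   # crude: give the node time to come back before touching the next one
  end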


b


RE: read operation is slow

2010-06-13 Thread aaron
I'm not sure about the client you're using, but I've noticed in the past
that an incorrect Thrift stack can make things run slow (like 40 times
slower).

Check that the network stack wraps the socket in a Transport, preferably
the TBufferedTransport. I'm guessing the client you're using is doing the
right thing, just a suggestion.
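
For reference, the difference is roughly this (Ruby shown; host/port and
the CassandraThrift module name are assumptions based on the generated
bindings shipped with the Ruby cassandra gem):

  require 'thrift'
  socket    = Thrift::Socket.new('127.0.0.1', 9160)
  transport = Thrift::BufferedTransport.new(socket)   # buffered, not the bare socket
  protocol  = Thrift::BinaryProtocol.new(transport)
  client    = CassandraThrift::Cassandra::Client.new(protocol)  # generated bindings
  transport.open

  # with the bare socket as the transport every small read/write goes straight
  # to the network, which is where the "40 times slower" numbers come from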

Aaron

On Fri, 11 Jun 2010 18:49:46 -0700, "caribbean410"

wrote:
> Thanks for the suggestion. For the test case, it is 1 key and 1 column. I
> once changed 10 to 1, and as I remember there was not much difference.
>
> I have 200k keys and each key is randomly generated. I will try the
> optimized query next week. But maybe you still have to face the case that
> each time a client just wants to query one key from the db.
> 
>  
> 
> From: Dop Sun [mailto:su...@dopsun.com] 
> Sent: Friday, June 11, 2010 6:05 PM
> To: user@cassandra.apache.org
> Subject: RE: read operation is slow
> 
>  
> 
> And also, you are only selecting 1 key and 10 columns?
> 
>  
> 
> criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst,
> nameFirst, 10);
> 
>  
> 
> Then, if you have 200k keys, you have 200k Thrift calls.  If this is the
> case, you may need to optimize the way you do the query (to combine
> multiple keys into a single query), and to reduce the number of calls.
> 
>  
> 
> From: Dop Sun [mailto:su...@dopsun.com] 
> Sent: Saturday, June 12, 2010 8:57 AM
> To: user@cassandra.apache.org
> Subject: RE: read operation is slow
> 
>  
> 
> You mean after you "I remove some unnecessary column family and change the
> size of rowcache and keycache, now the latency changes from 0.25ms to
> 0.09ms. In essence 0.09ms*200k=18s.", it still takes 400 seconds to
> return?
> 
>  
> 
> From: Caribbean410 [mailto:caribbean...@gmail.com] 
> Sent: Saturday, June 12, 2010 8:48 AM
> To: user@cassandra.apache.org
> Subject: Re: read operation is slow
> 
>  
> 
> Hi, do you mean this one should not introduce much extra delay? To read a
> record, I need select here; not sure where the extra delay comes from.
> 
> On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun  wrote:
> 
> Jassandra is used here:
> 
>  
> 
> Map<String, List<IColumn>> map = criteria.select();
> 
>  
> 
> The select here basically is a call to Thrift API: get_range_slices
> 
>  
> 
>  
> 
> From: Caribbean410 [mailto:caribbean...@gmail.com] 
> Sent: Saturday, June 12, 2010 8:00 AM
> 
> 
> To: user@cassandra.apache.org
> Subject: Re: read operation is slow
> 
>  
> 
> I remove some unnecessary column family and change the size of rowcache and
> keycache, now the latency changes from 0.25ms to 0.09ms. In essence
> 0.09ms*200k=18s. I don't know why it takes more than 400s total. Here is the
> client code and cfstats. There are not many operations here, why is the
> extra time so large?
> 
> 
> 
> long start = System.currentTimeMillis();
> for (int j = 0; j < 1; j++) {
>     for (int i = 0; i < numOfRecords; i++) {
>         int n = random.nextInt(numOfRecords);
>         ICriteria criteria = cf.createCriteria();
>         userName = keySet[n];
>         criteria.keyList(Lists.newArrayList(userName))
>                 .columnRange(nameFirst, nameFirst, 10);
>         Map<String, List<IColumn>> map = criteria.select();
>         List<IColumn> list = map.get(userName);
> //      ByteArray bloc = list.get(0).getValue();
> //      byte[] byteArrayloc = bloc.toByteArray();
> //      loc = new String(byteArrayloc);
> //      readBytes = readBytes + loc.length();
>         readBytes = readBytes + blobSize;
>     }
> }
>
> long finish = System.currentTimeMillis();
>
> float totalTime = (finish - start) / 1000;
> 
> 
> Keyspace: Keyspace1
> Read Count: 60
> Read Latency: 0.090530067 ms.
> Write Count: 20
> Write Latency: 0.01504989 ms.
> Pending Tasks: 0
> Column Family: Standard2
> SSTable count: 3
> Space used (live): 265990358
> Space used (total): 265990358
> Memtable Columns Count: 2615
> Memtable Data Size: 2667300
> Memtable Switch Count: 3
> Read Count: 60
> Read Latency: 0.091 ms.
> Write Count: 20
> Write Latency: 0.015 ms.
> Pending Tasks: 0
> Key cache capacity: 1000
> Key cache size: 187465
> Key cache hit rate: 0.0
> Row cache capacity: 1000
> Row cache size: 189990
> Row cache hit rate: 0.68335
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
> 
> 
> Keyspace: system
> Read Count: 1
> Read Latency: 10.954 ms.
> Write Count: 4
> Write Latency: 0.28075 ms.
> Pending Tasks: 0
> Column Family