Forgot to put this on the end: you could take that approach, but it's not what CFs are designed for. Deletes are relatively cheap compared to MySQL etc. because most of the work is done in the compaction. My first approach would be to use row keys with prefixes, switch at the application level,
AFAIK yes.
Until your row is column_index_size_in_kb in size (and in some circumstances a
compaction must have run) the code has to scan through all of the columns in
the row to find the 150-200 you want.
From the help in cassandra.yaml
# Add column indexes to a row after its contents reach t
I just watched some videos about performance tuning, and it looks like
most of the bottleneck could be on reads. Also, it looks like it's advisable
to put commit logs on a separate drive.
I was wondering if it makes sense to use NFS (if we can) with a NetApp array
which provides its own read an
Jonathan,
If I ask for around 150-200 columns (totally random not sequential) from a
very wide row that contains more than a million or even more columns then,
is the read performance of the SliceQuery operation affected by or "depends
on the length of the row" ?? (For my use case, I would use the
They can be, but it's not necessary.
Aaron
On 13/02/2011, at 9:04 PM, Xiaobo Gu wrote:
> Hi,
> If the cluster only has two nodes, should they both be in the seeds list?
>
> Regards,
>
> Xiaobo Gu
You can get consistency by using QUORUM for both reads and writes, or by
writing at ALL and reading at ONE, or by writing at ONE and reading at ALL.
Start with quorum.
If you read at one, then read repair will work in the background to fix the
data. But the result returned to your client may be inconsistent.
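As a rough illustration of the rule above (not Cassandra code): with N replicas, a read that waits for R replicas and a write that waits for W replicas are guaranteed to overlap on at least one replica when R + W > N. The helper names below are hypothetical, and QUORUM is taken to be floor(N/2) + 1.

```python
def replicas_for(level: str, n: int) -> int:
    """Replicas that must respond for a consistency level (illustrative helper)."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return n // 2 + 1
    if level == "ALL":
        return n
    raise ValueError(f"unknown consistency level: {level}")


def is_consistent(read_level: str, write_level: str, n: int) -> bool:
    """True when the read and write replica sets must overlap, i.e. R + W > N."""
    return replicas_for(read_level, n) + replicas_for(write_level, n) > n


if __name__ == "__main__":
    n = 3  # replication factor, chosen only for the example
    combos = [("QUORUM", "QUORUM"), ("ONE", "ALL"), ("ALL", "ONE"), ("ONE", "ONE")]
    for r, w in combos:
        print(f"read={r} write={w} N={n}: consistent={is_consistent(r, w, n)}")
```

Only the ONE/ONE combination fails the check here, which matches the note that a read at ONE may return inconsistent data until read repair catches up.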
Aaron
On 12/02/2011, at 7:
I should note up front that the JVM simply does not handle heap sizes above
20G very well because the GC starts to become problematic.
Do you read rows in a uniformly random way? If not, caching is your best
bet for reducing read latencies. You should have enough space to cache all
of your keys,
FWIW I would first try to reduce the number of columns before reducing their
name length. If you always pull back the same columns (e.g. user details),
consider packing them into a JSON dict and storing them in one column.
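A minimal sketch of the packing idea, assuming the user details fit in a plain dict. The column name `details` and the function names are made up for the example, and no Cassandra client is involved; the returned dict just stands in for the columns you would write.

```python
import json


def pack_user_details(details: dict) -> dict:
    """Collapse many small columns into one column holding a JSON string."""
    # "details" is a hypothetical column name chosen for this sketch.
    return {"details": json.dumps(details, sort_keys=True)}


def unpack_user_details(columns: dict) -> dict:
    """Reverse of pack_user_details: decode the single JSON column."""
    return json.loads(columns["details"])


if __name__ == "__main__":
    user = {"name": "Alice", "email": "alice@example.com", "city": "Auckland"}
    columns = pack_user_details(user)
    print(len(columns), "column instead of", len(user))
    print(unpack_user_details(columns) == user)
```

The trade-off is that you can no longer read or update one field without fetching and rewriting the whole packed value.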
Aaron
On 12/02/2011, at 5:22 AM, Chris Burroughs wrote:
> On 02/11/2011 05
The best way to store things depends on how you want to read them back.
You could use a compound key such as user/listtype and then store the items in
the lists as columns, where the column name is a timestamp and the column value
is a packed data structure such as JSON.
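The layout described above could be sketched like this. It uses a nested dict as an in-memory stand-in for a column family, so every name here (`row_key`, `add_item`, `read_items`) is hypothetical, and sorting by column name only imitates what a timestamp-ordered comparator would give you.

```python
import json
import time
from typing import Dict, List, Optional, Tuple

# row key -> {column name (timestamp in microseconds) -> packed JSON value}
Rows = Dict[str, Dict[int, str]]


def row_key(user: str, list_type: str) -> str:
    """Compound row key of the form user/listtype."""
    return f"{user}/{list_type}"


def add_item(rows: Rows, user: str, list_type: str, item: dict,
             timestamp: Optional[int] = None) -> None:
    """Store one list item as a column named by its timestamp."""
    if timestamp is None:
        timestamp = int(time.time() * 1_000_000)
    rows.setdefault(row_key(user, list_type), {})[timestamp] = json.dumps(item)


def read_items(rows: Rows, user: str, list_type: str) -> List[Tuple[int, dict]]:
    """Return the list's items ordered by column name (oldest first)."""
    columns = rows.get(row_key(user, list_type), {})
    return [(ts, json.loads(value)) for ts, value in sorted(columns.items())]


if __name__ == "__main__":
    rows: Rows = {}
    add_item(rows, "alice", "todo", {"text": "second"}, timestamp=2)
    add_item(rows, "alice", "todo", {"text": "first"}, timestamp=1)
    print(read_items(rows, "alice", "todo"))
```

Each user/list pair gets its own row, so reading one list back is a single-row slice rather than a scan across users.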
As bill says, don't create a CF per
There are functions on the Cassandra API to rename and drop column families;
see http://wiki.apache.org/cassandra/API. Dropping a CF does not immediately
free up the disk space; see the docs.
AFAIK the rename is not atomic across the cluster (that would require locks), so
your best bet would be t
> Excellent! How about adding Hinted Handoff enabled/disabled option?
Sure, once I understand it ;-)
/Janne
On 13/02/2011 13:49, Janne Jalkanen wrote:
Folks,
as it seems that wrapping the brain around the R+W>N concept is a big hurdle
for a lot of users, I made a simple web page that allows you to try out the
different parameters and see how they affect the system.
http://www.ecyrd.com/cassandracal
On Sun, 2011-02-13 at 15:49 +0200, Janne Jalkanen wrote:
> as it seems that wrapping the brain around the R+W>N concept is a big
> hurdle for a lot of users, I made a simple web page that allows you to
> try out the different parameters and see how they affect the system.
>
> http://www.ecyrd.com/
On Sun, Feb 13, 2011 at 1:39 AM, Xiaobo Gu wrote:
> multiple network paths for inner-cluster communication will boost performance
>
> Thanks.
>
> Xiaobo Gu
>
No. Each node has a single IP. You can boost performance in a similar
way with Ethernet bonding, or 10G
No.
On Sun, Feb 13, 2011 at 8:48 AM, Shay Assulin wrote:
> Hi,
>
> Is there a way to get only the keys of indexed rows (without getting
> columns) using get_indexed_slices method?
>
> I am using Hector to access Cassandra and I want to count rows with a
> specific index - so I need to get only th
http://wiki.apache.org/cassandra/FAQ#range_ghosts
On Sun, Feb 13, 2011 at 9:08 AM, Mark Zitnik wrote:
> Hi,
>
> I would like to delete a key permanently in Cassandra 0.7 and not receive it
> in the get range API.
> Is it possible?
> Thanks
>
>
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
c
On Sun, Feb 13, 2011 at 12:37 AM, E S wrote:
> I've gotten myself really confused by
> http://wiki.apache.org/cassandra/ArchitectureInternals and am hoping someone
> can
> help me understand what the io behavior of this operation would be.
>
> When I do a get_slice for a column range, will it see
Hi,
I would like to delete a key permanently in Cassandra 0.7 and not receive it
in the get range API.
Is it possible?
Thanks
Hi,
Is there a way to get only the keys of indexed rows (without getting
columns) using the get_indexed_slices method?
I am using Hector to access Cassandra and I want to count rows with a
specific index - so I need to get only the keys.
I am doing the following:
n = 0
while (true) {
i
Folks,
as it seems that wrapping the brain around the R+W>N concept is a big hurdle
for a lot of users, I made a simple web page that allows you to try out the
different parameters and see how they affect the system.
http://www.ecyrd.com/cassandracalculator/
Let me know if you have any suggest
> But when modeling the application I understand so far that ColumnFamily is
> sort of "table with objects". In typical application there are lot of tables
> so why is the mindset set towards having more or less 10 ColumnFamilies?
> Even in this trivial example there are already 7 CFs
> http://www.
On 13.2.2011 11:40, Peter Schuller wrote:
Reading the documentation (especially the tuning section), it is clear that
the number of Column Families affects performance, in particular the
amount of memory assigned to the heap.
My question is: What's the hard limit on the number of CFs?
Does a
> Some questions I have:
Answering two of them independently of your Java snippet; not sure
what you intend to be read into it.
> 1) Is partitioning based on CF.KEY or KEY of Column? From what I read it's
> based on column keys and not the CF keys but want to confirm.
Partitioning is based on ro
> Reading the documentation (especially the tuning section), it is clear that
> the number of Column Families affects performance, in particular the
> amount of memory assigned to the heap.
> My question is: What's the hard limit on the number of CFs?
> Does anybody implemented an application w
I agree, that is the way to go. Then each piece of new functionality will
not have to be implemented twice.
On Sat, Feb 12, 2011 at 9:41 AM, Stu Hood wrote:
> I would like to continue to support super columns, but to slowly convert
> them into "compound column names", since that is really all th
Hi,
If the cluster only has two nodes, should they both be in the seeds list?
Regards,
Xiaobo Gu