Thanks for the reply. My app uses 7-bit ASCII strings as row keys, so I assume
they can be used directly.
I'd like to fetch the whole row. I was able to dump the big row with sstable2json,
but neither my app nor the cli is able to read the row from Cassandra.
I see in the JSON dump that all columns are marked as "deletedAt":
-9223372036854775808, so SuperColumn::isMarkedForDelete() should return false.
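For reference, -9223372036854775808 is Long.MIN_VALUE. Below is a minimal sketch of the check, assuming the 0.7 logic amounts to comparing deletedAt against that sentinel. The class and method here are illustrative stand-ins, not Cassandra's actual code.

```java
// Illustrative only: mirrors an assumed "deletedAt > Long.MIN_VALUE" test,
// not the real SuperColumn implementation.
class DeletionCheck {
    // Returns true when deletedAt holds a real tombstone timestamp.
    static boolean isMarkedForDelete(long deletedAt) {
        return deletedAt > Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long fromDump = -9223372036854775808L; // value seen in the sstable2json output
        System.out.println(fromDump == Long.MIN_VALUE);
        System.out.println(isMarkedForDelete(fromDump));
    }
}
```

If that assumption holds, the columns in the dump are live and a tombstone would not explain the empty result.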
My cluster is running Cassandra 0.7.4, and its upgrade path was
0.7.0->0.7.2->0.7.3->0.7.4.
What's wrong? The bloom filters seem to be OK. I couldn't find a tool for
reading them, but the attached program does the job.
I'm sure that both my app and the cli refer to the proper keys. This big row
keeps getting bigger as my app appends new super columns and subcolumns to it,
but it can't read it:
get mycf[utf8('my-key')];
Returned 0 results.
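Since the keys are 7-bit ASCII, the bytes behind utf8('my-key') should be the same bytes an app writing plain ASCII would store. A quick sketch to confirm that for a given key, where 'my-key' is only the placeholder from the example above and the class name is made up:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyBytes {
    // True when the key encodes to identical bytes in US-ASCII and UTF-8,
    // which holds for any string made only of 7-bit characters.
    static boolean sameInAsciiAndUtf8(String key) {
        byte[] ascii = key.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        System.out.println(sameInAsciiAndUtf8("my-key"));
    }
}
```

A key for which this returns false would be one where the cli and the app could end up looking for different bytes.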
I'm really confused. I tried turning debug logging on, but I can't see anything
interesting in the output. Any ideas what to check next?
Regards,
Wojtek
From: aaron morton [mailto:[email protected]]
Sent: Wednesday, May 11, 2011 12:29 AM
To: [email protected]
Subject: Re: Finding big rows
I'm not aware of anything to find the row sizes, and your code looks like a
good approach. Converting the key bytes to a string only makes sense if your
app is doing the same thing.
In the cli, try using one of the data type functions to format the key the same
way your app does, e.g. get FooCF[utf8('my-key')]
The main limitation on Super Columns is that sub columns are not indexed; see
http://wiki.apache.org/cassandra/CassandraLimitations. If you have a huge row,
use the get_slice() API call to get back slices of columns. The cli does not
support slicing columns.
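The slicing idea can be sketched as a paging loop. This is only an illustration: getSlice() below is a hypothetical in-memory stand-in for the real get_slice() call, and the row is simulated with a sorted map rather than a Thrift client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class SlicePager {
    // Hypothetical stand-in for get_slice(): up to `count` column names
    // that sort at or after `start`, in order.
    static List<String> getSlice(NavigableMap<String, String> row, String start, int count) {
        List<String> names = new ArrayList<>();
        for (String name : row.tailMap(start, true).keySet()) {
            if (names.size() == count) break;
            names.add(name);
        }
        return names;
    }

    // Reads every column name page by page instead of in one huge request.
    static List<String> readAll(NavigableMap<String, String> row, int pageSize) {
        if (pageSize < 2) throw new IllegalArgumentException("pageSize must be at least 2");
        List<String> all = new ArrayList<>();
        String start = "";
        boolean first = true;
        while (true) {
            List<String> page = getSlice(row, start, pageSize);
            // Every page after the first begins with the last column already read.
            int from = first ? 0 : 1;
            if (page.size() <= from) break;
            all.addAll(page.subList(from, page.size()));
            start = page.get(page.size() - 1);
            first = false;
            if (page.size() < pageSize) break; // short page: end of row
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String n : new String[] {"a", "b", "c", "d", "e"}) row.put(n, "v");
        System.out.println(readAll(row, 2));
    }
}
```

Each request restarts from the last column name seen, so no single call has to materialise the whole row.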
Hope that helps.
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 10 May 2011, at 20:41, Meler Wojciech wrote:
Hello,
I've noticed the very nice stats exposed over JMX. I was quite shocked when I
saw that MaxRowSize was about 400MB (it was expected to be several MB).
What is the best way to find the keys of such big rows?
I couldn't find anything, so I've written a simple program to dump row sizes
from the Index files (see attachment),
and got the keys, but when I used cassandra-cli to get their columns it said
"Returned 0 results.".
I've realised that my app creates such big rows because it can't read them from
Cassandra and so recreates them every time.
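The index-based approach mentioned above can be sketched roughly as follows. The attachment itself is not reproduced here, so this assumes the index entries have already been parsed into keys and data-file offsets sorted by offset, and that a row's size is the gap to the next offset. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class RowSizes {
    // keys[i] starts at offsets[i] in the data file; offsets must be ascending.
    // The last row is assumed to end at dataFileLength.
    static List<String> keysLargerThan(String[] keys, long[] offsets,
                                       long dataFileLength, long threshold) {
        List<String> big = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            long end = (i + 1 < offsets.length) ? offsets[i + 1] : dataFileLength;
            long size = end - offsets[i];
            if (size > threshold) big.add(keys[i]);
        }
        return big;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c"};
        long[] offsets = {0L, 100L, 500L};
        // Row sizes are 100, 400 and 20 bytes for a 520-byte data file.
        System.out.println(keysLargerThan(keys, offsets, 520L, 300L));
    }
}
```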
Are there any tunable limits for getting a whole row? Any limits on
supercolumns?
Regards,
Wojtek
"WIRTUALNA POLSKA" Spolka Akcyjna z siedziba w Gdansku przy ul. Traugutta 115
C, wpisana do Krajowego Rejestru Sadowego - Rejestru Przedsiebiorcow
prowadzonego przez Sad Rejonowy Gdansk - Polnoc w Gdansku pod numerem KRS
0000068548, o kapitale zakladowym 67.980.024,00 zlotych oplaconym w calosci
oraz Numerze Identyfikacji Podatkowej 957-07-51-216.
<IdxDump.java>
"WIRTUALNA POLSKA" Spolka Akcyjna z siedziba w Gdansku przy ul. Traugutta 115
C, wpisana do Krajowego Rejestru Sadowego - Rejestru Przedsiebiorcow
prowadzonego przez Sad Rejonowy Gdansk - Polnoc w Gdansku pod numerem KRS
0000068548, o kapitale zakladowym 67.980.024,00 zlotych oplaconym w calosci
oraz Numerze Identyfikacji Podatkowej 957-07-51-216.
BFCheck.java
