On Wed, 2011-05-11 at 14:24 +1200, aaron morton wrote:
> What version and what were the values for RecentBloomFilterFalsePositives and
> BloomFilterFalsePositives ?
>
> The bloom filter metrics are updated in SSTableReader.getPosition() the only
> slightly odd thing I can see is that we do
Thanks for the reply. My app uses 7-bit ASCII string row keys, so I assume they
can be used directly.
I'd like to fetch the whole row. I was able to dump the big row with sstable2json,
but neither my app nor the cli can read the row from Cassandra.
I see in the json dump that all columns are marked
>
>
> Not sure I follow you. 4 sstables is the minimum compaction looks for
> (by default).
> If there are 30 sstables of ~20MB sitting there because compaction is
> behind, you will compact those 30 sstables together (unless there is
> not enough space for that and considering you haven't change
On Wed, May 11, 2011 at 8:06 AM, aaron morton wrote:
> For a reasonably large number of use cases (for me, 2 out of 3 at the
> moment) supercolumns will be units of data where the columns (attributes)
> will never change by themselves or where the data does not change anyway
> (archived data).
>
>
I finally found some time to get back to this issue.
I turned on DEBUG logging on the StorageProxy and it shows that all of these
requests are read from the other datacenter.
Shimi
On Tue, Apr 12, 2011 at 2:31 PM, aaron morton wrote:
> Something feels odd.
>
> From Peters nice write up of the dyn
What are the values for RecentBloomFilterFalsePositives and
BloomFilterFalsePositives, the non-ratio ones?
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 11 May 2011, at 19:53, Héctor Izquierdo Seliva wrote:
> On Wed, 2011-05-11 at
Couple of questions to ask. You may also get some value from the #cassandra
chat room where you can have a bit more of a conversation.
- checking: did you run nodetool scrub when upgrading to 0.7.3? (not related to
the current problem, just asking)
- what client library was used to write the data
I am currently working on a system with Cassandra that is written purely in
Java. I know our end solution will require other languages to access the
data in Cassandra (Python, C++ etc.). What is the best way to store data to
ensure I can do this? Should I serialize everything to strings/json/xml
pr
Sorry aaron, here are the values you requested
RecentBloomFilterFalsePositives = 5;
BloomFilterFalsePositives = 385260;
Uptime of the node is about three and a half days.
On Wed, 2011-05-11 at 22:05 +1200, aaron morton wrote:
> What are the values for RecentBloomFilterFalsePositiv
Hello,
This is a question about jconsole rather than Cassandra: how can I invoke
getNaturalEndpoints with jconsole?
org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
I want to run this method to find the nodes that are responsible for storing
data for a specific row key.
I can find thi
We are using this patch in our multi-region testing... yes this approach is
going to be integrated into
https://issues.apache.org/jira/browse/CASSANDRA-2491 once it is committed
(you might want to wait for that). Yes, this fixes the Amazon infrastructure
problems and it will automatically detect the D
I didn't run nodetool scrub. My app uses the C++ thrift client (0.5.0 and 0.6.1).
As this is a production environment I get a lot of "collecting %s of %s"
messages, but there is no row key.
I've matched it by uuid and thread - hope that is ok:
[ReadStage:3][org.apache.cassandra.db.filter.SliceQueryFilte
On 05/10/2011 10:24 PM, aaron morton wrote:
> What version and what were the values for RecentBloomFilterFalsePositives and
> BloomFilterFalsePositives ?
>
> The bloom filter metrics are updated in SSTableReader.getPosition() the only
> slightly odd thing I can see is that we do not count a key
As far as I know you cannot call getNaturalEndpoints from jconsole
because it takes a byte array as a parameter and jconsole doesn't
provide a way to input a byte array. You might be able to use the
thrift call 'describe_ring' to do what you want though. You will have
to manually hash your ke
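The suggestion above (hash the key yourself, then find its range in the describe_ring output) can be sketched in Python. The MD5-to-token step mirrors what RandomPartitioner does; the function names and the `(start, end, endpoints)` tuple shape are illustrative assumptions, not a real client API:

```python
import hashlib

def random_partitioner_token(row_key: bytes) -> int:
    """Approximate RandomPartitioner's token: the MD5 digest of the raw
    key, read as a signed 128-bit integer, absolute value taken."""
    digest = hashlib.md5(row_key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))

def owning_range(token: int, ranges):
    """Find the (start, end] token range containing the token.

    `ranges` is an iterable of (start_token, end_token, endpoints)
    tuples, conceptually what thrift's describe_ring returns.
    """
    for start, end, endpoints in ranges:
        if start < end:
            if start < token <= end:
                return endpoints
        else:  # range that wraps around the top of the ring
            if token > start or token <= end:
                return endpoints
    return None
```

With the token in hand, the endpoints of the matching range are the replicas for the key.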
Thanks,
So my options are:
1. Write a thrift client code to call describe_ring with hashed key
or
2. Write a JMX client code to call getNaturalEndpoints
right?
2011/5/11 Nick Bailey :
> As far as I know you can not call getNaturalEndpoints from jconsole
> because it takes a byte array as a param
Yes.
On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe wrote:
> Thanks,
>
> So my options are:
> 1. Write a thrift client code to call describe_ring with hashed key
> or
> 2. Write a JMX client code to call getNaturalEndpoints
>
> right?
>
> 2011/5/11 Nick Bailey :
>> As far as I know you can not ca
You are of course free to reduce the min per bucket to 2.
The fundamental idea of sstables + compaction is to trade disk space
for higher write performance. For most applications this is the right
trade to make on modern hardware... I don't think you'll get very far
trying to get the 2nd without t
> What is the best way to find keys of such big rows?
One, if not necessarily the best, way is to check system.log for large
row warnings that trigger for rows large enough to be compacted
lazily. Grep for 'azy' (or 'lazy', case-insensitively) and you should find it.
--
/ Peter Schuller
I keep reading that Hadoop/Brisk is not suitable for online querying, only
for offline/batch processing. What exactly are the reasons it is unsuitable?
My use case is a fairly high query load, and each query ideally would return
within about 20 seconds. The queries will use indexes to narrow down t
Add a new faq:
http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
2011/5/11 Nick Bailey :
> Yes.
>
> On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe
> wrote:
>> Thanks,
>>
>> So my options are:
>> 1. Write a thrift client code to call describe_ring with hashed key
>> or
>> 2. Write a JMX cli
Thanks!
On Wed, May 11, 2011 at 10:20 AM, Maki Watanabe wrote:
> Add a new faq:
> http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
>
> 2011/5/11 Nick Bailey :
>> Yes.
>>
>> On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe
>> wrote:
>>> Thanks,
>>>
>>> So my options are:
>>> 1. Write a thri
I wouldn't mind knowing how other people are approaching this problem too.
On 11 May 2011 11:27, Oliver Dungey wrote:
> I am currently working on a system with Cassandra that is written purely in
> Java. I know our end solution will require other languages to access the
> data in Cassandra (Pytho
Close: the problem is we don't count *any* true positives *unless*
cache is enabled.
Fix attached to https://issues.apache.org/jira/browse/CASSANDRA-2637.
On Wed, May 11, 2011 at 7:04 AM, Chris Burroughs
wrote:
> On 05/10/2011 10:24 PM, aaron morton wrote:
>> What version and what were the value
On 5/11/11 5:27 AM, Oliver Dungey wrote:
I am currently working on a system with Cassandra that is written
purely in Java. I know our end solution will require other languages
to access the data in Cassandra (Python, C++ etc.). What is the best
way to store data to ensure I can do this? Should
You should have no problems with byte conversion consistencies. For
the serialization test cases in Hector, we verify most of the results
with o.a.c.utils.ByteBufferUtil from the Cassandra source.
On Wed, May 11, 2011 at 10:23 AM, Luke Biddell wrote:
> I wouldn't mind knowing how other people are
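One common answer to the cross-language question above (a sketch, not the only option): store column values as UTF-8 encoded JSON bytes, so Java, Python and C++ clients all read and write the same byte sequence. The helper names here are hypothetical:

```python
import json

def encode_value(obj) -> bytes:
    # Deterministic JSON (sorted keys, no extra whitespace) encoded as
    # UTF-8: any client that can parse JSON can read these bytes back.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def decode_value(raw: bytes):
    # Inverse of encode_value: bytes -> UTF-8 text -> Python object.
    return json.loads(raw.decode("utf-8"))
```

The round-trip property (same bytes in, same value out, regardless of language) is the kind of thing the Hector tests mentioned above verify against ByteBufferUtil.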
> On Wed, May 11, 2011 at 10:23 AM, Luke Biddell wrote:
>> I wouldn't mind knowing how other people are approaching this problem too.
>>
>> On 11 May 2011 11:27, Oliver Dungey wrote:
>>> I am currently working on a system with Cassandra that is written purely in
>>> Java. I know our end solution
Hi all,
Any London-based people who are interested in Brisk should come along to the
Cassandra London meetup on Monday. There will be a talk and live demo.
http://www.meetup.com/Cassandra-London/events/16643691/
Dave
Hello -
I am using 0.8 Beta 2 and have a CF containing COMPANY, ACCOUNTNUMBER and
some account-related data. I have an index on both COMPANY and ACCOUNTNUMBER.
If I run a query -
SELECT FROM COMPANYCF WHERE COMPANY='XXX' AND ACCOUNTNUMBER = 'YYY'
Even though the ACCOUNTNUMBER-based index is a better
On Wed, May 11, 2011 at 11:19 AM, Ben Scholl wrote:
> I keep reading that Hadoop/Brisk is not suitable for online querying, only
> for offline/batch processing. What exactly are the reasons it is unsuitable?
> My use case is a fairly high query load, and each query ideally would return
> within ab
Greetings,
I'm experiencing some issues with 2 nodes (out of more than 10). Right
after startup (Listening for thrift clients...) the nodes create
objects at a high rate, using all available CPU cores:
INFO 18:13:15,350 GC for PS Scavenge: 292 ms, 494902976 reclaimed
leaving 2024909864 used; m
Hello,
I installed 0.7.5 to my Ubuntu 11.04 64 bit from package at
deb http://www.apache.org/dist/cassandra/debian 07x main
And I met really strange problem.
Any shell command that reads Cassandra's jsvc command line (for
example "ps -ef", or "top" with cmdline args) just hangs.
Using STRAC
No, Cassandra uses statistics to see which index will result in fewer
rows to check.
On Wed, May 11, 2011 at 12:42 PM, Baskar Duraikannu
wrote:
> Hello -
> I am using 0.8 Beta 2 and have a CF containing COMPANY, ACCOUNTNUMBER and
> some account related data. I have index on both Company and Accou
Thanks
A
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 12 May 2011, at 03:44, Jonathan Ellis wrote:
> Close: the problem is we don't count *any* true positives *unless*
> cache is enabled.
>
> Fix attached to https://issues.apache.org/
We use the Java Service Wrapper from Tanuki Software and are very happy
with it. It's a lot more robust than jsvc.
http://wrapper.tanukisoftware.com/doc/english/download.jsp
The free community version will be enough in most cases.
Jon
On May 11, 2011 10:30pm, Anton Belyaev wrote:
Hello,
I guess it is not trivial to modify the package to make it use JSW
instead of JSVC.
I am still not sure the JSVC itself is a culprit. Maybe something is
wrong in my setup.
2011/5/12 :
> We use the Java Service Wrapper from Tanuki Software and are very happy with
> it. It's a lot more robust than
When I run this from the Cassandra command line:
create keyspace MyKeySpace with placement_strategy =
'org.apache.cassandra.locator.SimpleStrategy' and strategy_options =
[{replication_factor:2}];
I get this error: Internal error processing system_add_keyspace
My syntax is correct for creating the ke
Hi All,
I am testing NetworkTopologyStrategy in Cassandra. I am using
two nodes, one in each of two different data centers.
Since the nodes are in different DCs I assigned token 0 to both nodes.
I added both nodes as seeds in cassandra.yaml and I am using
PropertyFileSnitch
On 5/9/11 9:49 PM, Jonathan Ellis wrote:
On Mon, May 9, 2011 at 5:58 PM, Alex Araujo wrote:
> How many replicas are you writing?
Replication factor is 3.
So you're actually spot on the predicted numbers: you're pushing
20k*3=60k "raw" rows/s across your 4 machines.
You might get another 10% or so fro
Anurag,
The Cassandra ring spans datacenters, so you can't use token 0 on both
nodes. Cassandra's ring runs from 0 to 2**127.
Try assigning one node the token 0 and the second node 8.50705917 × 10^37
(that is 2**126; input it as a single long number).
To add a new keyspace in 0.8, run this from the C
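The arithmetic above (0 and ~8.5 × 10^37 for two nodes) generalizes: evenly space the initial tokens around the 0..2**127 ring. A small sketch, assuming RandomPartitioner:

```python
def generate_tokens(node_count: int):
    """Evenly spaced initial tokens for RandomPartitioner's 0..2**127 ring."""
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]
```

For two nodes this yields 0 and 2**126 (85070591730234615865843651857942052864), the value quoted above.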
FYI - creating the keyspace with the syntax below works in beta1, just not
beta2.
jeromatron on the IRC channel commented that it looks like the java
classpath is using the wrong library dependency for commons lang in beta2.
- Sameer
On Wed, May 11, 2011 at 4:09 PM, Sameer Farooqui wrote:
> Wh
Thanks Jaydeep.
On the first insertion, I inserted data using the Thrift API programmatically, so I
could specify the timestamp, which is the current system time. However, for
deleting the columns I used the command-line client that comes with Cassandra. I
have no way to specify the delete timestamp in the command li
Thanks Sameer for your answer.
I am using two DCs, DC1 and DC2, with one node each. My
strategy_options values are DC1:1, DC2:1. I am not sure what my RF should be;
should it be 1 or 2?
Please advise.
Thanks
Anurag
On Wed, May 11, 2011 at 5:27 PM, Sameer Farooqui wrote:
> Anurag,
>
> The Ca
Thanks aaron. Here come the details:
1) Version: 0.7.4
2) It's a two node cluster with RF=2
3) It works perfectly up to the first get. Then I delete all the columns in a row.
Finally, I try to insert into the same row with the same row id. However, it's
not getting inserted programmatically.
Thanks,
Anuya
O
My understanding is that the replication factor is for the entire ring. Even
if you have 2 DCs the nodes are part of the same ring. What you get
additionally from NTS is that you can specify how many replicas to place in
each DC.
So RF = 1 and DC1:1, DC2:1 looks incorrect to me.
What is possible
Yeah, Narendra is correct.
If you have 2 nodes, one in each data center, use RF=2 and do reads and
writes with either level ONE or QUORUM (which means 2 in this case).
However, if you had 2 nodes in DC1 and 1 node in DC2, then you could use
RF=3 and use LOCAL_QUORUM for reads and writes.
For wri
Hi Alex,
This has been a useful thread, we've been comparing your numbers with
our own tests.
Why did you choose four big instances rather than more smaller ones?
For $8/hr you get four m2.4xl with a total of 8 disks.
For $8.16/hr you could have twelve m1.xl with a total of 48 disks, 3x
disk spa
I downloaded a fresh 0.8 beta2 and created keyspaces fine - including the ones
below.
I don't know if there are relics of a previous install somewhere or something
wonky about the classpath. You said that you might have /var/lib/cassandra
data left over, so one thing to try is starting fresh there
People have been using that sort of configuration in EC2 deployments to run the
listen_address through a VPN and rpc_address on the private IP.
Are you still having troubles connecting ?
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On
Let me know if you get anywhere, I'm on there as aaron_morton but I'm also way
over in New Zealand.
If you are using your own client and writing data you cannot read back check
that the byte encoding is always the same and that you are setting appropriate
timestamps for every call. In the log
I'm assuming the two nodes are the ones receiving the HH after they were down.
Were a lot of hints collected while they were down? You can check the
HintedHandOffManager MBean in JConsole.
What does TPStats look like on the nodes under pressure? And how many
nodes are delivering hints
Hey Adrian -
Why did you choose four big instances rather than more smaller ones?
Mostly to see the impact of additional CPUs on a write only load. The
portion of the application we're migrating from MySQL is very write
intensive. The other 8 core option was c1.xl with 7GB of RAM. I will
ve
How do you delete the data in the cli? Is it a row delete, e.g. del
MyCF['my-key'];
What client are you using to insert the row the second time? e.g. a custom
thrift wrapper or pycassa.
How is the second read done, via the cli?
Does the same test work when you only use your app?
Cassandr
When creating a multi DC deployment tokens should be evenly distributed in
*each* dc, see this recent discussion for an example
http://www.mail-archive.com/user@cassandra.apache.org/msg12975.html (I'll also
update the wiki when I get time, making a note now) But no two nodes in the
global ring c
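The per-DC scheme described above (tokens evenly spaced within *each* DC, with a small offset so no two nodes in the global ring share a token) could be computed like this; purely illustrative, using the DC index as the offset:

```python
def multi_dc_tokens(nodes_per_dc: int, dc_count: int):
    """Per-DC initial tokens: evenly spaced within each DC, with each
    DC's tokens offset by its index so no token repeats globally."""
    ring_size = 2 ** 127
    return {
        dc: [(i * ring_size // nodes_per_dc + dc) % ring_size
             for i in range(nodes_per_dc)]
        for dc in range(dc_count)
    }
```

For two DCs of two nodes each this gives DC0 the tokens 0 and 2**126, and DC1 the tokens 1 and 2**126 + 1: balanced within each DC, unique across the ring.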
Doesn't really look abnormal to me for a heavy write load situation
which is what "receiving hints" is.
On Wed, May 11, 2011 at 1:55 PM, Gabriel Tataranu wrote:
> Greetings,
>
> I'm experiencing some issues with 2 nodes (out of more than 10). Right
> after startup (Listening for thrift clients...