On Wed, 2011-05-11 at 14:24 +1200, aaron morton wrote:
> What version and what were the values for RecentBloomFilterFalsePositives and
> BloomFilterFalsePositives ?
>
> The bloom filter metrics are updated in SSTableReader.getPosition() the only
> slightly odd thing I can see is that we do
Thanks for the reply. My app uses 7-bit ASCII string row keys, so I assume they
can be used directly.
I'd like to fetch the whole row. I was able to dump the big row with sstable2json,
but neither my app nor the cli can read the row from Cassandra.
I see in the json dump that all columns are marked
>
>
> Not sure I follow you. 4 sstables is the minimum compaction looks for
> (by default).
> If there are 30 sstables of ~20MB sitting there because compaction is
> behind, you will compact those 30 sstables together (unless there is
> not enough space for that and considering you haven't change
On Wed, May 11, 2011 at 8:06 AM, aaron morton wrote:
> For a reasonably large number of use cases (for me, 2 out of 3 at the
> moment) supercolumns will be units of data where the columns (attributes)
> will never change by themselves or where the data does not change anyway
> (archived data).
>
>
I finally found some time to get back to this issue.
I turned on DEBUG logging on the StorageProxy and it shows that all of these
requests are read from the other datacenter.
Shimi
On Tue, Apr 12, 2011 at 2:31 PM, aaron morton wrote:
> Something feels odd.
>
> From Peters nice write up of the dyn
What are the values for RecentBloomFilterFalsePositives and
BloomFilterFalsePositives, the non-ratio ones?
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 11 May 2011, at 19:53, Héctor Izquierdo Seliva wrote:
> On Wed, 2011-05-11 at
Couple of questions to ask. You may also get some value from the #cassandra
chat room where you can have a bit more of a conversation.
- checking: did you run nodetool scrub when upgrading to 0.7.3? (not related to
the current problem, just asking)
- what client library was used to write the data
I am currently working on a system with Cassandra that is written purely in
Java. I know our end solution will require other languages to access the
data in Cassandra (Python, C++ etc.). What is the best way to store data to
ensure I can do this? Should I serialize everything to strings/json/xml
pr
Sorry aaron, here are the values you requested
RecentBloomFilterFalsePositives = 5;
BloomFilterFalsePositives = 385260;
Uptime of the node is about three and a half days.
On Wed, 2011-05-11 at 22:05 +1200, aaron morton wrote:
> What are the values for RecentBloomFilterFalsePositiv
Hello,
This is a question about jconsole rather than Cassandra: how can I invoke
getNaturalEndpoints with jconsole?
org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
I want to run this method to find the nodes that are responsible for storing
data for a specific row key.
I can find thi
We are using this patch in our multi-region testing... yes this approach is
going to be integrated into
https://issues.apache.org/jira/browse/CASSANDRA-2491 once it is committed
(you might want to wait for that). Yes, this fixes the Amazon infrastructure
problems and it will automatically detect the D
I didn't run nodetool scrub. My app uses the C++ thrift client (0.5.0 and 0.6.1).
As this is a production environment I get a lot of "collecting %s of %s"
messages, but there is no row key.
I've matched it by uuid and thread - hope that is ok:
[ReadStage:3][org.apache.cassandra.db.filter.SliceQueryFilte
On 05/10/2011 10:24 PM, aaron morton wrote:
> What version and what were the values for RecentBloomFilterFalsePositives and
> BloomFilterFalsePositives ?
>
> The bloom filter metrics are updated in SSTableReader.getPosition() the only
> slightly odd thing I can see is that we do not count a key
As far as I know you cannot call getNaturalEndpoints from jconsole
because it takes a byte array as a parameter and jconsole doesn't
provide a way to input a byte array. You might be able to use the
thrift call 'describe_ring' to do what you want though. You will have
to manually hash your ke
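The suggestion above (hash the key yourself, then find its range in the describe_ring output) can be sketched in Python. The MD5-to-token step mirrors what RandomPartitioner does; the function names and the `(start, end, endpoints)` tuple shape are illustrative assumptions, not a real client API:

```python
import hashlib

def random_partitioner_token(row_key: bytes) -> int:
    """Approximate RandomPartitioner's token: the MD5 digest of the raw
    key, read as a signed 128-bit integer, absolute value taken."""
    digest = hashlib.md5(row_key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))

def owning_range(token: int, ranges):
    """Find the (start, end] token range containing the token.

    `ranges` is an iterable of (start_token, end_token, endpoints)
    tuples, conceptually what thrift's describe_ring returns.
    """
    for start, end, endpoints in ranges:
        if start < end:
            if start < token <= end:
                return endpoints
        else:  # range that wraps around the top of the ring
            if token > start or token <= end:
                return endpoints
    return None
```

With the token in hand, the endpoints of the matching range are the replicas for the key.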
Thanks,
So my options are:
1. Write a thrift client code to call describe_ring with hashed key
or
2. Write a JMX client code to call getNaturalEndpoints
right?
2011/5/11 Nick Bailey :
> As far as I know you can not call getNaturalEndpoints from jconsole
> because it takes a byte array as a param
Yes.
On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe wrote:
> Thanks,
>
> So my options are:
> 1. Write a thrift client code to call describe_ring with hashed key
> or
> 2. Write a JMX client code to call getNaturalEndpoints
>
> right?
>
> 2011/5/11 Nick Bailey :
>> As far as I know you can not ca
You are of course free to reduce the min per bucket to 2.
The fundamental idea of sstables + compaction is to trade disk space
for higher write performance. For most applications this is the right
trade to make on modern hardware... I don't think you'll get very far
trying to get the 2nd without t
> What is the best way to find keys of such big rows?
One, if not necessarily the best, way is to check system.log for large
row warnings that trigger for rows large enough to be compacted
lazily. Grep for 'azy' (or 'lazy', case-insensitively) and you should find it.
--
/ Peter Schuller
I keep reading that Hadoop/Brisk is not suitable for online querying, only
for offline/batch processing. What exactly are the reasons it is unsuitable?
My use case is a fairly high query load, and each query ideally would return
within about 20 seconds. The queries will use indexes to narrow down t
Add a new faq:
http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
2011/5/11 Nick Bailey :
> Yes.
>
> On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe
> wrote:
>> Thanks,
>>
>> So my options are:
>> 1. Write a thrift client code to call describe_ring with hashed key
>> or
>> 2. Write a JMX cli
Thanks!
On Wed, May 11, 2011 at 10:20 AM, Maki Watanabe wrote:
> Add a new faq:
> http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
>
> 2011/5/11 Nick Bailey :
>> Yes.
>>
>> On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe
>> wrote:
>>> Thanks,
>>>
>>> So my options are:
>>> 1. Write a thri
I wouldn't mind knowing how other people are approaching this problem too.
On 11 May 2011 11:27, Oliver Dungey wrote:
> I am currently working on a system with Cassandra that is written purely in
> Java. I know our end solution will require other languages to access the
> data in Cassandra (Pytho
Close: the problem is we don't count *any* true positives *unless*
cache is enabled.
Fix attached to https://issues.apache.org/jira/browse/CASSANDRA-2637.
On Wed, May 11, 2011 at 7:04 AM, Chris Burroughs
wrote:
> On 05/10/2011 10:24 PM, aaron morton wrote:
>> What version and what were the value
On 5/11/11 5:27 AM, Oliver Dungey wrote:
I am currently working on a system with Cassandra that is written
purely in Java. I know our end solution will require other languages
to access the data in Cassandra (Python, C++ etc.). What is the best
way to store data to ensure I can do this? Should
You should have no problems with byte conversion consistencies. For
the serialization test cases in Hector, we verify most of the results
with o.a.c.utils.ByteBufferUtil from the Cassandra source.
On Wed, May 11, 2011 at 10:23 AM, Luke Biddell wrote:
> I wouldn't mind knowing how other people are
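One common answer to the cross-language question above (a sketch, not the only option): store column values as UTF-8 encoded JSON bytes, so Java, Python and C++ clients all read and write the same byte sequence. The helper names here are hypothetical:

```python
import json

def encode_value(obj) -> bytes:
    # Deterministic JSON (sorted keys, no extra whitespace) encoded as
    # UTF-8: any client that can parse JSON can read these bytes back.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def decode_value(raw: bytes):
    # Inverse of encode_value: bytes -> UTF-8 text -> Python object.
    return json.loads(raw.decode("utf-8"))
```

The round-trip property (same bytes in, same value out, regardless of language) is the kind of thing the Hector tests mentioned above verify against ByteBufferUtil.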
> On Wed, May 11, 2011 at 10:23 AM, Luke Biddell wrote:
>> I wouldn't mind knowing how other people are approaching this problem too.
>>
>> On 11 May 2011 11:27, Oliver Dungey wrote:
>>> I am currently working on a system with Cassandra that is written purely in
>>> Java. I know our end solution
Hi all,
Any London-based people who are interested in Brisk should come along to the
Cassandra London meetup on Monday. There will be a talk and live demo.
http://www.meetup.com/Cassandra-London/events/16643691/
Dave
Hello -
I am using 0.8 Beta 2 and have a CF containing COMPANY, ACCOUNTNUMBER and
some account-related data. I have an index on both COMPANY and ACCOUNTNUMBER.
If I run a query -
SELECT FROM COMPANYCF WHERE COMPANY='XXX' AND ACCOUNTNUMBER = 'YYY'
Even though the ACCOUNTNUMBER-based index is a better
On Wed, May 11, 2011 at 11:19 AM, Ben Scholl wrote:
> I keep reading that Hadoop/Brisk is not suitable for online querying, only
> for offline/batch processing. What exactly are the reasons it is unsuitable?
> My use case is a fairly high query load, and each query ideally would return
> within ab
Greetings,
I'm experiencing some issues with 2 nodes (out of more than 10). Right
after startup (Listening for thrift clients...) the nodes create
objects at a high rate, using all available CPU cores:
INFO 18:13:15,350 GC for PS Scavenge: 292 ms, 494902976 reclaimed
leaving 2024909864 used; m
Hello,
I installed 0.7.5 to my Ubuntu 11.04 64 bit from package at
deb http://www.apache.org/dist/cassandra/debian 07x main
And I met really strange problem.
Any shell command that reads Cassandra's jsvc command line (for
example "ps -ef", or "top" with cmdline args) just hangs.
Using STRAC
No, Cassandra uses statistics to see which index will result in fewer
rows to check.
On Wed, May 11, 2011 at 12:42 PM, Baskar Duraikannu
wrote:
> Hello -
> I am using 0.8 Beta 2 and have a CF containing COMPANY, ACCOUNTNUMBER and
> some account related data. I have index on both Company and Accou
Thanks
A
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 12 May 2011, at 03:44, Jonathan Ellis wrote:
> Close: the problem is we don't count *any* true positives *unless*
> cache is enabled.
>
> Fix attached to https://issues.apache.org/
We use the Java Service Wrapper from Tanuki Software and are very happy
with it. It's a lot more robust than jsvc.
http://wrapper.tanukisoftware.com/doc/english/download.jsp
The free community version will be enough in most cases.
Jon
On May 11, 2011 10:30pm, Anton Belyaev wrote:
Hello,
I guess it is not trivial to modify the package to make it use JSW
instead of JSVC.
I am still not sure the JSVC itself is a culprit. Maybe something is
wrong in my setup.
2011/5/12 :
> We use the Java Service Wrapper from Tanuki Software and are very happy with
> it. It's a lot more robust than
When I run this from the Cassandra command line:
create keyspace MyKeySpace with placement_strategy =
'org.apache.cassandra.locator.SimpleStrategy' and strategy_options =
[{replication_factor:2}];
I get this error: Internal error processing system_add_keyspace
My syntax is correct for creating the ke
Hi All,
I am testing NetworkTopologyStrategy in Cassandra. I am using
two nodes, one in each of two different data centers.
Since the nodes are in different DCs I assigned token 0 to both nodes.
I added both nodes as seeds in cassandra.yaml and I am using
PropertyFileSnitch
On 5/9/11 9:49 PM, Jonathan Ellis wrote:
On Mon, May 9, 2011 at 5:58 PM, Alex Araujo wrote:
> How many replicas are you writing?
Replication factor is 3.
So you're actually spot on the predicted numbers: you're pushing
20k*3=60k "raw" rows/s across your 4 machines.
You might get another 10% or so fro
Anurag,
The Cassandra ring spans datacenters, so you can't use token 0 on both
nodes. Cassandra's ring runs from 0 to 2**127.
Try assigning one node the token 0 and the second node 8.50705917 × 10^37
(that is 2**126; input it as a single long number).
To add a new keyspace in 0.8, run this from the C
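The arithmetic above (0 and ~8.5 × 10^37 for two nodes) generalizes: evenly space the initial tokens around the 0..2**127 ring. A small sketch, assuming RandomPartitioner:

```python
def generate_tokens(node_count: int):
    """Evenly spaced initial tokens for RandomPartitioner's 0..2**127 ring."""
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]
```

For two nodes this yields 0 and 2**126 (85070591730234615865843651857942052864), the value quoted above.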
FYI - creating the keyspace with the syntax below works in beta1, just not
beta2.
jeromatron on the IRC channel commented that it looks like the java
classpath is using the wrong library dependency for commons lang in beta2.
- Sameer
On Wed, May 11, 2011 at 4:09 PM, Sameer Farooqui wrote:
> Wh
Thanks Jaydeep.
On the first insertion, I inserted data using the Thrift API programmatically, so I
could specify the timestamp, which is the current system time. However, for
deleting the columns I used the command-line client that comes with Cassandra. I
have no way to specify the delete timestamp in the command li
Thanks Sameer for your answer.
I am using two DCs, DC1 and DC2, with one node each. My
strategy_options values are DC1:1, DC2:1. I am not sure what my RF should be;
should it be 1 or 2?
Please advise.
Thanks
Anurag
On Wed, May 11, 2011 at 5:27 PM, Sameer Farooqui wrote:
> Anurag,
>
> The Ca
Thanks aaron. Here come the details:
1) Version: 0.7.4
2) It's a two node cluster with RF=2
3) It works perfectly up to the first get. Then I delete all the columns in a row.
Finally, I try to insert into the same row with the same row id. However, it's
not getting inserted programmatically.
Thanks,
Anuya
O
My understanding is that the replication factor is for the entire ring. Even
if you have 2 DCs the nodes are part of the same ring. What you get
additionally from NTS is that you can specify how many replicas to place in
each DC.
So RF = 1 and DC1:1, DC2:1 looks incorrect to me.
What is possible
Yeah, Narendra is correct.
If you have 2 nodes, one in each data center, use RF=2 and do reads and
writes with either level ONE or QUORUM (which means 2 in this case).
However, if you had 2 nodes in DC1 and 1 node in DC2, then you could use
RF=3 and use LOCAL_QUORUM for reads and writes.
For wri
Hi Alex,
This has been a useful thread, we've been comparing your numbers with
our own tests.
Why did you choose four big instances rather than more smaller ones?
For $8/hr you get four m2.4xl with a total of 8 disks.
For $8.16/hr you could have twelve m1.xl with a total of 48 disks, 3x
disk spa
I downloaded a fresh 0.8 beta2 and created keyspaces fine - including the ones
below.
I don't know if there are relics of a previous install somewhere or something
wonky about the classpath. You said that you might have /var/lib/cassandra
data left over, so one thing to try is starting fresh there
People have been using that sort of configuration in EC2 deployments to run the
listen_address through a VPN and rpc_address on the private IP.
Are you still having troubles connecting ?
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On
Let me know if you get anywhere, I'm on there as aaron_morton but I'm also way
over in New Zealand.
If you are using your own client and writing data you cannot read back check
that the byte encoding is always the same and that you are setting appropriate
timestamps for every call. In the log
I'm assuming the two nodes are the ones receiving the HH after they were down.
Were a lot of hints collected while they were down? You can check the
HintedHandOffManager MBean in JConsole.
What does TPStats look like on the nodes under pressure? And how many
nodes are delivering hints
Hey Adrian -
Why did you choose four big instances rather than more smaller ones?
Mostly to see the impact of additional CPUs on a write only load. The
portion of the application we're migrating from MySQL is very write
intensive. The other 8 core option was c1.xl with 7GB of RAM. I will
ve
How do you delete the data in the cli? Is it a row delete, e.g. del
MyCF['my-key'];
What client are you using to insert the row the second time? e.g. a custom
thrift wrapper or pycassa.
How is the second read done, via the cli?
Does the same test work when you only use your app?
Cassandr
When creating a multi DC deployment tokens should be evenly distributed in
*each* dc, see this recent discussion for an example
http://www.mail-archive.com/user@cassandra.apache.org/msg12975.html (I'll also
update the wiki when I get time, making a note now) But no two nodes in the
global ring c
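The per-DC scheme described above (tokens evenly spaced within *each* DC, with a small offset so no two nodes in the global ring share a token) could be computed like this; purely illustrative, using the DC index as the offset:

```python
def multi_dc_tokens(nodes_per_dc: int, dc_count: int):
    """Per-DC initial tokens: evenly spaced within each DC, with each
    DC's tokens offset by its index so no token repeats globally."""
    ring_size = 2 ** 127
    return {
        dc: [(i * ring_size // nodes_per_dc + dc) % ring_size
             for i in range(nodes_per_dc)]
        for dc in range(dc_count)
    }
```

For two DCs of two nodes each this gives DC0 the tokens 0 and 2**126, and DC1 the tokens 1 and 2**126 + 1: balanced within each DC, unique across the ring.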
Doesn't really look abnormal to me for a heavy write load situation
which is what "receiving hints" is.
On Wed, May 11, 2011 at 1:55 PM, Gabriel Tataranu wrote:
> Greetings,
>
> I'm experiencing some issues with 2 nodes (out of more than 10). Right
> after startup (Listening for thrift clients...