Look into Cassandra's logs to see if JNA is really enabled (it
really should be, by default) and, more importantly, if JNA is loaded
correctly. You might find a surprising message there: if that is the
case, just install JNA with your distro's package manager and, if it
still doesn't work,
Hi Paolo,
Thanks for the hint - JNA indeed wasn't installed. However, now that
Cassandra is actually using it, there doesn't seem to be any change in
terms of speed - still 7 seconds with pycassa.
On Thu, Apr 19, 2012 at 12:14 AM, Paolo Bernardi wrote:
> Look into Cassandra's logs to see i
From: phuduc nguyen [mailto:duc.ngu...@pearson.com]
> How are you passing a blob or binary stream to the CLI? It sounds like
> you're passing in a representation of a binary stream as ascii/UTF8
> which will create the problems you describe.
So this is only a limitation of Cassandra-cli?
--
Marc
Hi,
I am interested in knowing what is the best way to create my Cassandra
client, bypassing the socket communication and directly interacting with the
'Storage Manager'. I checked the Cassandra wiki and some of the Hector
examples; mostly what I see is that Cassandra, when run in embedded mode,
requir
What version are you on?
AFAIK the SimpleAuthenticator, and to some degree authentication (?), has been
essentially deprecated as it was considered incomplete and was not under
development. This is why the SimpleAuthenticator was moved out to the examples
directory in 1.X. I doubt it will be
try this
http://www.datastax.com/docs/1.0/install/upgrading#upgrading-between-minor-releases-of-cassandra-1-0-x
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 18/04/2012, at 3:02 AM, Tamar Fraenkel wrote:
> Thanks!!!
> Two simple actions
For background:
http://www.datastax.com/docs/1.0/cluster_architecture/index
http://thelastpickle.com/2011/02/07/Introduction-to-Cassandra/
> Which mechanism is used to replicate the changes from one system to another:
> statement distribution or recording the changeset via triggers or storing the
You can get some idea from reading
org.apache.cassandra.thrift.CassandraServer.java, but I wonder what kind of use
case would justify such an effort.
From iPhone
On 2012/04/19, at 18:17, Tarun Gupta wrote:
> Hi,
>
> I am interested in knowing what is the best way to create my Cassandra
> Clie
I would suggest you build one cluster, using all your nodes, and create one
keyspace for all users.
There are lots of reasons; here are a few:
* many nodes in a single clusters spreads the load and gives you fault
tolerance.
* read and write requests can be distributed in a many node cluster.
* ca
At some point the gossip system on the node this log is from decided that
130.199.185.195 was DOWN. This was based on how often the node was gossiping to
the cluster.
The active repair session was informed, and to avoid failing the job
unnecessarily it tested that the errant node's phi value wa
As timestamps are set by clients, a common gotcha is having some or all
clients not synchronised via NTP.
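For example, pycassa lets you set the timestamp explicitly; if one client's
clock runs ahead, its writes silently win over later writes from
well-synchronised clients. A minimal sketch (keyspace and CF names invented):

import time
import pycassa

pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
cf = pycassa.ColumnFamily(pool, 'Users')

# Timestamps are microseconds since the epoch. A client whose clock is
# 5 minutes fast "wins" conflicts against correctly-synced clients for
# the next 5 minutes, regardless of the real order of the writes.
cf.insert('user1', {'name': 'first'}, timestamp=int(time.time() * 1e6))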
Hi,
One of the projects I am working on is going to need to store about 200TB
of data - generally in manageable binary chunks. However, after doing some
rough calculations based on rules of thumb I have seen for how much storage
should be on each node, I'm worried.
200TB with RF=3 is 600TB = 600
Thanks Aaron and Romain,
very useful information indeed; and yes there is no alternative to
personally trying out and dirtying our hands.
Regards,
Samba
Here's a test I did a while ago about creating column objects in python
http://www.mail-archive.com/user@cassandra.apache.org/msg06729.html
As Tyler said, the best approach is to limit the size of the slices.
If you are trying to load 125K super columns with 25 columns each you are
asking fo
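pycassa's xget is handy here: it iterates over one wide row lazily, issuing
a bounded slice per round trip instead of one huge get. A sketch (untested;
names invented, and buffer_size will need tuning):

import pycassa

pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
cf = pycassa.ColumnFamily(pool, 'MySuperCF')

# Walks the whole row, but only fetches 1000 (super) columns per
# underlying slice request rather than asking for everything at once.
for name, value in cf.xget('row_key', buffer_size=1000):
    pass  # process each column as it arrives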
Cassandra supports data compression and depending on your data, you can
gain a reduction in data size up to 4x.
600 TB is a lot, hence requires lots of servers...
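Enabling it is a one-liner in cassandra-cli, along these lines (the column
family name is just a placeholder; see the DataStax 1.0 compression blog
post linked elsewhere in this thread):

update column family Data
  with compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64};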
Franc Carter wrote on 19/04/2012 13:12:19:
> Hi,
>
> One of the projects I am working on is going to need to store about
> 2
On Thu, Apr 19, 2012 at 9:38 PM, Romain HARDOUIN
wrote:
>
> Cassandra supports data compression and depending on your data, you can
> gain a reduction in data size up to 4x.
>
The data is gzip'd already ;-)
> 600 TB is a lot, hence requires lots of servers...
>
>
> Franc Carter wrote on 19/
Thanks.
This was the one I followed :) Wonder if there is something more detailed...
*Tamar Fraenkel *
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
On Thu, Apr 19, 2012 at 1:06 PM, aaron mor
Franc Carter
> One of the projects I am working on is going to need to store about 200TB of
> data - generally in manageable binary chunks. However, after doing some rough
> calculations based on rules of thumb I have seen for how much storage should
> be on each node I'm worried.
> 200TB wit
600 TB is really a lot, even 200 TB is a lot. In our organization, storage
at such scale is handled by our storage team and they purchase specialized
(and very expensive) equipment from storage hardware vendors because at
this scale, performance and reliability are absolutely critical.
but it soun
On Thu, Apr 19, 2012 at 10:07 PM, John Doe wrote:
> Franc Carter
>
> > One of the projects I am working on is going to need to store about
> 200TB of data - generally in manageable binary chunks. However, after doing
> some rough calculations based on rules of thumb I have seen for how much
> st
On Thu, Apr 19, 2012 at 10:16 PM, Yiming Sun wrote:
> 600 TB is really a lot, even 200 TB is a lot. In our organization,
> storage at such scale is handled by our storage team and they purchase
> specialized (and very expensive) equipment from storage hardware vendors
> because at this scale, pe
Can you say more about how and how often these 200TB get used, queried,
updated? Is a different usage profile needed? What kind of column
families do you have in mind for them?
On Thu, Apr 19, 2012 at 8:24 AM, Franc Carter wrote:
> On Thu, Apr 19, 2012 at 10:16 PM, Yiming Sun wrote:
>
>> 600
I think your math is 'relatively' correct. It would seem to me you
should focus on reducing the amount of storage you are using per item,
if at all possible, if that node count is prohibitive.
On 04/19/2012 07:12 AM, Franc Carter wrote:
Hi,
One of the projects I am working on is go
Take a peek at cassandra-unit; maybe it could help you:
https://github.com/jsevellec/cassandra-unit
Well, I'm not sure exactly how you're passing a blob to the CLI. It would be
helpful if you pasted your commands/code and maybe there is a simple
oversight.
With that said, Cassandra can most definitely save blob/binary values. I
think most people use a high level client; we use Hector. If you're
> The bit I am trying to understand is whether my figure of 400GB/node in
practice for Cassandra is correct, or whether we can push the GB/node higher
and if so how high
Our cluster runs with up to 2TB/node (that's the compressed size) and an
RF=2. The figure of 400GB/node is by no means a maximum
PHPCassa does support binaries, so that should not be the problem.
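Same story with pycassa: raw bytes round-trip untouched over Thrift, which
is exactly what the CLI doesn't do for you. A quick sketch (keyspace/CF
names invented; the CF should validate values as BytesType):

import pycassa

pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
cf = pycassa.ColumnFamily(pool, 'Blobs')

with open('image.png', 'rb') as f:
    data = f.read()

# The bytes are sent as-is; no ascii/UTF8 re-encoding happens.
cf.insert('image-001', {'payload': data})
assert cf.get('image-001')['payload'] == data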
2012/4/19 phuduc nguyen
> Well, I'm not sure exactly how you're passing a blob to the CLI. It would
> be
> helpful if you pasted your commands/code and maybe there is a simple
> oversight.
>
> With that said, Cassandra can most d
Hi,
Is there any documentation on the procedure for migrating from
SimpleStrategy to NetworkTopologyStrategy?
thanks
Simon
Would there be any reason why I can't write more than 875 writes/sec to a
cluster of two Cassandra boxes? They are quad-core machines with 8GB of RAM
running RAID 10, so not huge servers... but certainly enough to handle a much
larger load than that.
We are feeding data into it through a Flume sin
Hi All,
I did a web search of the archives (hope I looked in the right place) and
could not find a request like this.
When Cassandra is running, it seems to create two random TCP listen ports.
For example: "50378 and 58692", "49952, 52792".
What are these for, and is there documentation regar
All the examples of cassandra-topology.properties that I have seen have a
default entry assigning unknown nodes to a specific data center and rack.
Is it possible to have Cassandra ignore unknown nodes for the purpose of
replication?
Bill
We'll try doing multithreaded requests today or tomorrow.
As for tuning down the number of supercolumns per slice, I tried doing
that, but I've noticed that the time was decreasing linearly with the
length of the slice. So, grabbing 1000 per slice would take 1/5 as long as
5000, but I'll have to make 5 times as many requests to the database.
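Roughly what we have in mind for the multithreaded version (just a sketch,
untested; pycassa's ConnectionPool is thread-safe, so the threads can share
it, and all the names and ranges below are invented):

import threading
import pycassa

pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'], pool_size=10)
cf = pycassa.ColumnFamily(pool, 'MySuperCF')

results = {}
lock = threading.Lock()

def fetch(start, finish):
    # Each thread slices a different range of super column names
    # (capped at 1000 columns per request here).
    part = cf.get('row_key', column_start=start,
                  column_finish=finish, column_count=1000)
    with lock:
        results.update(part)

ranges = [('a', 'f'), ('g', 'm'), ('n', 's'), ('t', 'z')]
threads = [threading.Thread(target=fetch, args=r) for r in ranges]
for t in threads:
    t.start()
for t in threads:
    t.join()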
I think it is enough to do an update on the keyspace, for example (cassandra-cli):
update keyspace KEYSPACE with placement_strategy =
'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options =
{datacenter1: 1};
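Note that if the change moves any replicas around, I believe you also need
to repair afterwards so data reaches its new owners (please verify on a
test cluster first):

nodetool repair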
On Thu, 19 Apr 2012 16:18:46 +0100
simojenki wrote:
> Hi,
>
> Is
I have a web application that generates multiple log files in a log file
directory. On a particularly chatty box, up to 2000 entries per second are
written to those log files. We are looking for a solution to tail that
directory and insert new entries into a Cassandra DB.
The fields in the log
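A bare-bones sketch of the tailing side with pycassa (all names invented; a
real version needs log-rotation handling, batching, and error handling):

import time
import uuid
import pycassa

pool = pycassa.ConnectionPool('Logs', ['localhost:9160'])
cf = pycassa.ColumnFamily(pool, 'Entries')

def tail_into_cassandra(path, host):
    with open(path) as f:
        f.seek(0, 2)  # start at the end of the file, like tail -f
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)
                continue
            # One row per (host, hour) bucket; one TimeUUID column per
            # entry, assuming the CF's comparator is TimeUUIDType so
            # columns sort by time within the row.
            row_key = '%s:%s' % (host, time.strftime('%Y%m%d%H'))
            cf.insert(row_key, {uuid.uuid1(): line.rstrip('\n')})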
Yes it is possible. Put the following as the last line of your topology file:
default=unknown:unknown
So long as you don't have any DC or rack with this name, your local node will
not be able to address any nodes that aren't explicitly given in its topology
file.
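i.e. a minimal cassandra-topology.properties along these lines (IPs invented):

# Nodes this node should talk to:
192.168.1.10=DC1:RAC1
192.168.1.11=DC1:RAC2
# Everything else lands in a DC/rack that no keyspace replicates to:
default=unknown:unknown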
However bear in mind that, whil
Try writing them through Kafka. It should handle that load.
Bill
Sent from my BlackBerry® wireless handheld
-Original Message-
From: Trevor Francis
Date: Thu, 19 Apr 2012 12:04:19
To:
Reply-To: user@cassandra.apache.org
Subject: High Log Storage
I have a web application that generates multip
I had thought that the topology file is used for replica placement only,
such that for the token range that the unknown node is responsible for,
data is still read and written there. It just won't be replicated, since
the replication factor is not defined.
Bill
On Thu, Apr 19, 2012 at 1:18 PM, Richard
Couple of ideas:
* take a look at compression in 1.X
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression
* is there repetition in the binary data? Can you save space by implementing
content-addressable storage?
Cheers
-
Aaron Morton
Freelance Developer
You should be able to get more than that.
Run nodetool cfstats, look at the Write Latency (this is the recent latency,
i.e. it is reset each time you run it). This will give you an idea of how long an
individual node is spending on a write.
Fire up JConsole, go to the StorageProxy MBean and look
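On the client side, it is also worth checking that you are batching
mutations rather than doing one Thrift call per row. With pycassa,
something like this (a sketch; names invented):

import pycassa

pool = pycassa.ConnectionPool('MyKeyspace',
                              ['node1:9160', 'node2:9160'],
                              pool_size=20)
cf = pycassa.ColumnFamily(pool, 'Events')

# Groups up to 100 inserts into each batch_mutate round trip.
b = cf.batch(queue_size=100)
for i in range(10000):
    b.insert('row-%d' % i, {'col': 'value'})
b.send()  # flush whatever is left in the queue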
> but I'll have to make 5 times as many requests to the database
5 times a small number can be less than 1 big number :)
see http://wiki.apache.org/cassandra/HadoopSupport
It's also covered in the O'Reilly Cassandra book; however, that book is
somewhat out of date.
Also search for posts from Jere
There is this; it's old:
http://wiki.apache.org/cassandra/Operations#Replication
There was also a discussion about it in the last month or so.
I *think* it's OK so long as you move to a single DC and single rack. But
please test.
Cheers
-
Aaron Morton
Freelance Developer
@a
Firefox version 3.6.10 on Ubuntu 10.10. Let me update it and try. Thanks
Nick! Will let you know.
-Original Message-
From: Nick Bailey [mailto:n...@datastax.com]
Sent: Wednesday, April 18, 2012 4:56 PM
To: user@cassandra.apache.org
Subject: Re: DataStax Opscenter 2.0 question
What versi
Thanks Nick, that was it. With Firefox 11, it works.
-Original Message-
From: Nick Bailey [mailto:n...@datastax.com]
Sent: Wednesday, April 18, 2012 4:56 PM
To: user@cassandra.apache.org
Subject: Re: DataStax Opscenter 2.0 question
What version of firefox? Someone has reported a similar
We tried this route previously. We did not run repair at all (our use cases
don't need repair), but while adding a secondary data center we were
forced to run repair. It ended up exploding the data.
We finally had to start afresh: we scrapped the cluster and re-imported the
data with NTS. Now, whethe