
Re: Best way to do a multi_get using CQL

2014-06-20 Thread DuyHai Doan

"The bad design part (just my opinion, no intention to offend) is not allowing the possibility of sending batches directly to the data nodes, without using a coordinator." Well, it's normal that it's not possible. What is a batch? It's a bunch of insert/update/delete statements put together. Now e

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Jonathan Haddad

I forgot to add that each connection can handle multiple simultaneous queries. This was part of the original protocol as of C* 1.2: http://www.datastax.com/dev/blog/binary-protocol Asynchronous: each connection can handle more than one active request at the same time. In practice, this means that

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Jeremy Jongsma

There is nothing preventing that in Cassandra; it's just a matter of how intelligent the driver API is. Submit a feature request to the Astyanax or DataStax driver projects. On Fri, Jun 20, 2014 at 2:27 PM, Marcelo Elias Del Valle < marc...@s1mbi0se.com.br> wrote: > The bad design part (just my opin

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Marcelo Elias Del Valle

The bad design part (just my opinion, no intention to offend) is not allowing the possibility of sending batches directly to the data nodes, without using a coordinator. I would choose that option. []s 2014-06-20 16:05 GMT-03:00 DuyHai Doan: > Well it's kind of a trade-off. > > Either you send da

Re: Best way to do a multi_get using CQL

2014-06-20 Thread DuyHai Doan

Well, it's kind of a trade-off. Either you send data directly to the primary replica nodes to take advantage of data locality using a token-aware strategy, and the price to pay is a high number of open connections on the client side. Or you just batch data to a random node playing the coordinator ro

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Marcelo Elias Del Valle

I am using Python + the CQL driver. I wonder how they do it... These things seem minor, but they are fundamental to getting good performance in Cassandra... I wish there was a simpler way to query in batches. Opening a large number of connections and sending 1 message at a time seems bad to me,

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Jeremy Jongsma

That depends on the connection pooling implementation in your driver. Astyanax will keep N connections open to each node (configurable) and route each query in a separate message over an existing connection, waiting until one becomes available if all are in use. On Fri, Jun 20, 2014 at 12:32 PM,
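The "wait until one becomes available" behaviour can be approximated on the client side by capping the number of requests in flight. The following is only a sketch of that idea, not how Astyanax implements its pool. It assumes a session object exposing `execute_async(query, parameters)` that returns a future with a `result()` method, as the DataStax Python driver's `Session` does; the helper name `multi_get_bounded` and the `max_in_flight` parameter are hypothetical.

```python
from collections import deque


def multi_get_bounded(session, query, keys, max_in_flight=100):
    """Run one single-key query per key, keeping at most
    `max_in_flight` requests outstanding at any time.
    Results are returned in the same order as `keys`."""
    pending = deque()  # futures for requests that were sent but not yet read
    results = []
    for key in keys:
        if len(pending) >= max_in_flight:
            # The window is full: wait for the oldest request before sending more.
            results.append(pending.popleft().result())
        pending.append(session.execute_async(query, (key,)))
    # Drain whatever is still outstanding.
    while pending:
        results.append(pending.popleft().result())
    return results
```

The Python driver also ships `cassandra.concurrent.execute_concurrent_with_args`, which takes a `concurrency` argument and serves a similar purpose.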

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Marcelo Elias Del Valle

A question, not sure if you guys know the answer: Suppose I async query 1000 rows using token aware and suppose I have 10 nodes. Suppose also each node would receive 100 row queries. How does async work in this case? Would it send each row query to each node in a different connection? Different

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Jeremy Jongsma

I've found that if you have any amount of latency between your client and nodes, and you are executing a large batch of queries, you'll usually want to send them together to one node unless execution time is of no concern. The tradeoff is resource usage on the connected node vs. time to complete al

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Laing, Michael

However, my extensive benchmarking this week of the python driver from master shows a performance *decrease* when using 'token_aware'. This is on a 12-node, 2-datacenter, RF-3 cluster in AWS. Also why do the work the coordinator will do for you: send all the queries, wait for everything to come back

Re: Best way to do a multi_get using CQL

2014-06-20 Thread Marcelo Elias Del Valle

Yes, I am using the CQL DataStax drivers. It was good advice, thanks a lot Jonathan. []s 2014-06-20 0:28 GMT-03:00 Jonathan Haddad: > The only case in which it might be better to use an IN clause is if > the entire query can be satisfied from that machine. Otherwise, go > async. > > The nati

Re: Best way to do a multi_get using CQL

2014-06-19 Thread Jonathan Haddad

The only case in which it might be better to use an IN clause is if the entire query can be satisfied from that machine. Otherwise, go async. The native driver reuses connections and intelligently manages the pool for you. It can also multiplex queries over a single connection. I am assuming yo

Re: Best way to do a multi_get using CQL

2014-06-19 Thread Marcelo Elias Del Valle

This is interesting, I didn't know that! It might make sense then to use select = + async + token aware, I will try to change my code. But would it be a "recommended solution" for these cases? Any other options? I still wonder if this is the right use case for Cassandra, to look for random keys in

Re: Best way to do a multi_get using CQL

2014-06-19 Thread Jonathan Haddad

If you use async and your driver is token aware, it will go to the proper node, rather than requiring the coordinator to do so. Realistically you're going to have a connection open to every server anyways. It's the difference between you querying for the data directly and using a coordinator as a
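A minimal sketch of the async approach, assuming a session object that exposes `execute_async(query, parameters)` and returns a future with a `result()` method, as the DataStax Python driver's `Session` does. The helper name `multi_get` is hypothetical, and token-aware routing is a driver setting on the cluster that is not shown here.

```python
def multi_get(session, query, keys):
    """Issue one single-key query per key without waiting in between,
    then collect the results in the same order as `keys`."""
    # Send every request first; each call returns a future immediately.
    futures = [session.execute_async(query, (key,)) for key in keys]
    # Block only while gathering the results.
    return [future.result() for future in futures]
```

A call would look like `multi_get(session, "SELECT entity_id FROM some_table WHERE name = %s", names)`, where the table and column are placeholders for illustration.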

Re: Best way to do a multi_get using CQL

2014-06-19 Thread Marcelo Elias Del Valle

But wouldn't using async queries be even worse than using SELECT IN? The justification in the docs is I could query many nodes, but I would still do it. Today, I use both async queries AND SELECT IN: SELECT_ENTITY_LOOKUP = "SELECT entity_id FROM " + ENTITY_LOOKUP + " WHERE name=%s and value in(%s
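For illustration, combining async queries with SELECT IN could look roughly like the sketch below. It is not the poster's actual code, which is cut off above. The helper names, the `table` argument, and the `chunk_size` default are hypothetical; the sketch assumes `%s`-style placeholders for non-prepared statements, a session with `execute_async(query, parameters)`, and a schema in which `IN` on `value` is allowed.

```python
def chunked(values, size):
    """Yield consecutive slices of `values`, each with at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def build_in_query(table, n_values):
    """Build a SELECT with one %s placeholder per value in the IN list."""
    placeholders = ", ".join(["%s"] * n_values)
    return ("SELECT entity_id FROM " + table +
            " WHERE name = %s AND value IN (" + placeholders + ")")


def lookup_in_chunks(session, table, name, values, chunk_size=20):
    """Send one async IN query per chunk, then flatten all returned rows."""
    futures = []
    for chunk in chunked(values, chunk_size):
        query = build_in_query(table, len(chunk))
        futures.append(session.execute_async(query, [name] + list(chunk)))
    return [row for future in futures for row in future.result()]
```

Smaller chunks spread the work over more requests; larger ones put more of the fan-out on whichever node coordinates each query.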

Re: Best way to do a multi_get using CQL

2014-06-19 Thread Jonathan Haddad

Your other option is to fire off async queries. It's pretty straightforward w/ the java or python drivers. On Thu, Jun 19, 2014 at 5:56 PM, Marcelo Elias Del Valle wrote: > I was taking a look at Cassandra anti-patterns list: > > http://www.datastax.com/documentation/cassandra/2.0/cassandra/arch

