Let's say we have a table with just an integer primary key named ID
and a text column named VALUE…
if we set a row to 0, "hello world"
… obviously, that's a normal value.
However, what happens if we update it with
0, null
… how is the 'null' stored?
I couldn't find any documentation fo
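Assuming this is about a CQL table (the question is from the Cassandra list), here is a minimal sketch of the scenario with the DataStax Java driver; the keyspace and table names are invented. In Cassandra, setting an existing cell to null does not store a literal null: it writes a tombstone for that cell.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class NullUpdateSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks"); // assumes the keyspace already exists
        session.execute("CREATE TABLE IF NOT EXISTS kv (id int PRIMARY KEY, value text)");
        session.execute("INSERT INTO kv (id, value) VALUES (0, 'hello world')");
        // Setting the column to null does not write a literal null; it writes a
        // tombstone for that cell, which hides the old value until compaction.
        session.execute("UPDATE kv SET value = null WHERE id = 0");
        cluster.close();
    }
}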
Hi,
I have a problem when inserting map-type data into a Cassandra table. I
tried all kinds of MapSerializer to serialize the Map data and did not
succeed.
My code is like this:
Column column = new Column();
column.name=columnSerializer.toByteBuffer(colname); // the
co
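For comparison, a minimal sketch of inserting a map through CQL with the DataStax Java driver instead of hand-rolled Thrift serializers; the keyspace, table, and column names here are invented, so adapt them to the actual schema.

import java.util.HashMap;
import java.util.Map;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class MapInsertSketch {
    public static void main(String[] args) {
        // Hypothetical schema: CREATE TABLE ks.users (id int PRIMARY KEY, attrs map<text, text>)
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks"); // assumes the keyspace already exists
        Map<String, String> attrs = new HashMap<String, String>();
        attrs.put("color", "blue");
        PreparedStatement ps = session.prepare("INSERT INTO users (id, attrs) VALUES (?, ?)");
        session.execute(ps.bind(1, attrs)); // the driver serializes the Java Map into the CQL map
        cluster.close();
    }
}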
Hi all,
I have a quick question on the unit of latency in the output of
cassandra-stress: is it milliseconds or seconds? I cannot find the answer in the
documentation:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/tools/toolsCStressOutput_c.html
Thanks,
Senhua
On Fri, Jun 20, 2014 at 3:09 PM, DuyHai Doan wrote:
> @Robert: do we still need to manually clean up snapshots when truncating?
> I remember that on the 1.2 branch, even though the auto_snapshot param
> was set to false, truncating led to snapshot creation that forced us to
> manually remove the snap
Thanks,
Is there any programmatic way to know when the schema has finished settling?
Can running with RF=2 and CL=ANY result in any consistency problems? I am
not sure I can have consistency problems if I don't do updates, only
writes and reads. Can I?
By the way, I am using Cassandra 2.0.8.
Pavel
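One way to tell when the schema has settled, sketched below under the assumption of a DataStax Java driver client: poll until every node reports the same schema_version in the system tables.

import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SchemaSettleCheck {
    // True when every node reports the same schema_version.
    static boolean schemaSettled(Session session) {
        Set<UUID> versions = new HashSet<UUID>();
        versions.add(session.execute("SELECT schema_version FROM system.local")
                .one().getUUID("schema_version"));
        for (Row peer : session.execute("SELECT schema_version FROM system.peers")) {
            versions.add(peer.getUUID("schema_version"));
        }
        return versions.size() == 1;
    }

    public static void main(String[] args) throws InterruptedException {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        while (!schemaSettled(session)) {
            Thread.sleep(500); // poll until all nodes converge on one schema version
        }
        cluster.close();
    }
}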
Thanks Robert,
Can you please explain what problems DROP/CREATE keyspace may cause?
It seems that truncate works per column family, and I have up to 10.
What should I delete from disk in that case? I can't delete the whole folder,
right? I need to delete all content under each cf folder, but not the folder
Schema propagation takes time:
https://issues.apache.org/jira/browse/CASSANDRA-5725
@Robert: do we still need to manually clean up snapshots when truncating? I
remember that on the 1.2 branch, even though the auto_snapshot param was
set to false, truncating led to snapshot creation that forced
On Fri, Jun 20, 2014 at 2:48 PM, Pavel Kogan
wrote:
> So what we did is create a new keyspace named _MM_dd_HH every hour, and
> when the disk becomes full, a script running in crontab on each node drops the
> keyspace with the "IF EXISTS" flag and deletes the whole keyspace folder. That way
> the whole process is
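A minimal sketch of the drop step with the DataStax Java driver (the keyspace name is a made-up example of the hourly pattern); note that with auto_snapshot enabled a drop can still leave snapshots on disk, which is presumably why the script also removes the folder.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class DropExpiredKeyspace {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // IF EXISTS makes the drop a no-op if another node's cron job got there first.
        session.execute("DROP KEYSPACE IF EXISTS blobs_2014_06_20_15"); // hypothetical name
        cluster.close();
    }
}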
On 20.06.2014 at 23:48, Pavel Kogan wrote:
> 1) When a new keyspace with its column families has just been created (every
> round hour), sometimes other modules fail to read/write data, and we lose the
> request. Can it be that the creation of a keyspace and column families is an async
> operation, or there
Hi,
In our project, many distributed modules send each other binary blobs,
up to 100-200 KB each on average. Small JSONs are sent over a message
queue, while Cassandra is used as temporary storage for the blobs. We are
using Cassandra instead of an in-memory distributed cache like Couch due t
OK, in my case it was straightforward. It is just a warning, which however
says that batches with a large data size (above 5 KB) can sometimes lead to
node instability (why?). This limit seems to be hard-coded; I didn't find
any way to configure it externally. Anyway, removing the batch and giving up
atomici
Logged batch.
On Fri, Jun 20, 2014 at 2:13 PM, DuyHai Doan wrote:
> I think some figures from "nodetool tpstats" and "nodetool
> compactionstats" may help in seeing things more clearly
>
> And Pavel, when you said batch, did you mean a LOGGED batch or an UNLOGGED
> batch?
>
> On Fri, Jun 20, 2014 at 8:02
"The bad design part (just my opinion, no intention to offend) is not allow
the possibility of sending batches directly to the data nodes, without
using a coordinator."
Well it's normal that it's not possible.
What is a batch ? It's a bunch of insert/update/delete statements put
together. Now e
I forgot to add that each connection can handle multiple simultaneous
queries. This was part of the original protocol as of C* 1.2:
http://www.datastax.com/dev/blog/binary-protocol
Asynchronous: each connection can handle more than one active request
at the same time. In practice, this means that
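A minimal sketch of what that multiplexing looks like from the DataStax Java driver (keyspace and table names invented): many requests in flight on one session without waiting for each response.

import java.util.ArrayList;
import java.util.List;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;

public class AsyncRequestsSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks"); // hypothetical keyspace
        List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
        for (int id = 0; id < 100; id++) {
            // All of these can be in flight at once; the protocol multiplexes
            // multiple outstanding requests over each open connection.
            futures.add(session.executeAsync("SELECT value FROM kv WHERE id = " + id));
        }
        for (ResultSetFuture f : futures) {
            f.getUninterruptibly(); // wait for each response
        }
        cluster.close();
    }
}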
There is nothing preventing that in Cassandra, it's just a matter of how
intelligent the driver API is. Submit a feature request to Astyanax or
Datastax driver projects.
On Fri, Jun 20, 2014 at 2:27 PM, Marcelo Elias Del Valle <
marc...@s1mbi0se.com.br> wrote:
> The bad design part (just my opin
The bad design part (just my opinion, no intention to offend) is not allowing
the possibility of sending batches directly to the data nodes, without
using a coordinator.
I would choose that option.
[]s
2014-06-20 16:05 GMT-03:00 DuyHai Doan :
> Well it's kind of a trade-off.
>
> Either you send da
Well, it's kind of a trade-off.
Either you send data directly to the primary replica nodes to take
advantage of data locality using a token-aware strategy, and the price to pay
is a high number of open connections from the client side.
Or you just batch data to a random node playing the coordinator ro
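For reference, a hedged sketch of the token-aware side of that trade-off with the 2.0-era DataStax Java driver; the other option is simply to keep the default load-balancing policy and let whichever node you contact act as coordinator.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class TokenAwareSetup {
    public static void main(String[] args) {
        // Token-aware routing: each statement is sent to a replica that owns the
        // partition, at the cost of the client keeping connections to (potentially)
        // every node instead of funneling everything through one coordinator.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
                .build();
        // ... create a Session and execute statements as usual ...
        cluster.close();
    }
}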
I am using Python + the CQL driver.
I wonder how they do it...
These things seem like small details, but they are fundamental to getting good
performance in Cassandra...
I wish there was a simpler way to query in batches. Opening a large number
of connections and sending 1 message at a time seems bad to me,
This is nice!
I was looking for something like this to implement a multi-DC cluster
between OVH and Amazon.
Thanks for sharing!
[]s
2014-06-20 15:35 GMT-03:00 Jeremy Jongsma :
> Sharing in case anyone else wants to use this:
>
>
> https://github.com/barchart/cassandra-plugins/blob/master/src/mai
Sharing in case anyone else wants to use this:
https://github.com/barchart/cassandra-plugins/blob/master/src/main/java/com/barchart/cassandra/plugins/snitch/GossipingPropertyFileWithEC2FallbackSnitch.java
Basically it is a proxy that attempts to use GossipingPropertyFileSnitch,
and if that fails
I think some figures from "nodetool tpstats" and "nodetool compactionstats"
may help in seeing things more clearly
And Pavel, when you said batch, did you mean a LOGGED batch or an UNLOGGED
batch?
On Fri, Jun 20, 2014 at 8:02 PM, Marcelo Elias Del Valle <
marc...@s1mbi0se.com.br> wrote:
> If you have 32 Gb RAM
The lib directory (where all the other jars are). bin/cassandra.in.sh does
this:
for jar in "$CASSANDRA_HOME"/lib/*.jar; do
    CLASSPATH="$CLASSPATH:$jar"
done
On Fri, Jun 20, 2014 at 12:58 PM, Jeremy Jongsma
wrote:
> Where do I add my custom snitch JAR to the Cassandra classpath so I can
>
If you have 32 GB RAM, the heap is probably 8 GB.
200 writes of 100 KB per second would be 20 MB/s in the worst case, supposing all
writes of a replica go to a single node.
I really don't see any reason why it should be filling up the heap.
Anyone else?
But did you check the logs for the GCInspector?
I
Where do I add my custom snitch JAR to the Cassandra classpath so I can use
it?
Hi Marcelo,
No pending write tasks. I am writing a lot, about 100-200 writes, each up to
100 KB, every 15 s.
It is running on a decent cluster of 5 identical nodes, quad-core i7 with
32 GB RAM and 480 GB SSD.
Regards,
Pavel
On Fri, Jun 20, 2014 at 12:31 PM, Marcelo Elias Del Valle <
marc...@s1mbi0
That depends on the connection pooling implementation in your driver.
Astyanax will keep N connections open to each node (configurable) and route
each query in a separate message over an existing connection, waiting until
one becomes available if all are in use.
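The DataStax Java driver exposes a similar knob through PoolingOptions; a rough sketch below (the connection count is an arbitrary example, not a recommendation).

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;

public class PoolingSetup {
    public static void main(String[] args) {
        PoolingOptions pooling = new PoolingOptions();
        // Keep up to 4 connections open per local node; requests wait for a free
        // stream on an existing connection when all of them are busy.
        pooling.setMaxConnectionsPerHost(HostDistance.LOCAL, 4);
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withPoolingOptions(pooling)
                .build();
        cluster.close();
    }
}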
On Fri, Jun 20, 2014 at 12:32 PM,
A question, not sure if you guys know the answer:
Suppose I async query 1000 rows using token-aware routing, and suppose I have 10
nodes. Suppose also that each node would receive 100 row queries.
How does async work in this case? Would it send each row query to each node
in a different connection? Different
On Fri, Jun 20, 2014 at 2:39 AM, Simon Chemouil wrote:
> OK, so Cassandra 2.1 now rejects writes it considers too big. It is
> possible to increase the value by changing commitlog_segment_size_in_mb
> in cassandra.yaml. It defaults to 32 MB, and the maximum size permitted for
> a single write is half that
Pavel,
In my case, the heap was filling up faster than it was draining. I am still
looking for the cause of it, as I could drain really fast with SSD.
However, in your case you could check (AFAIK) nodetool tpstats and see if
there are too many pending write tasks, for instance. Maybe you really a
I've found that if you have any amount of latency between your client and
nodes, and you are executing a large batch of queries, you'll usually want
to send them together to one node unless execution time is of no concern.
The tradeoff is resource usage on the connected node vs. time to complete
al
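As a concrete illustration of sending them together, a hedged sketch of an UNLOGGED batch with the DataStax Java driver (table and keyspace invented): the whole group goes to one coordinator in a single round trip, trading that node's resources for fewer network round trips.

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class UnloggedBatchSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks"); // hypothetical keyspace
        PreparedStatement ps = session.prepare("INSERT INTO kv (id, value) VALUES (?, ?)");
        // UNLOGGED: no batch log, so no atomicity guarantee across partitions,
        // but one request/response pair for the whole group of statements.
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        for (int id = 0; id < 50; id++) {
            batch.add(ps.bind(id, "value-" + id));
        }
        session.execute(batch);
        cluster.close();
    }
}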
Thanks Simon for the info. I didn't know that the maximum payload size is
related to commit log config, interesting ...
On Fri, Jun 20, 2014 at 11:39 AM, Simon Chemouil
wrote:
> OK, so Cassandra 2.1 now rejects writes it considers too big. It is
> possible to increase the value by changing comm
The cluster is new, so no updates were done. Version 2.0.8.
It happened when I did many writes (no reads). Writes are done in small
batches of 2 inserts (writing to 2 column families). The values are big
blobs (up to 100 KB).
Any clues?
Pavel
On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Va
However, my extensive benchmarking this week of the Python driver from
master shows a performance *decrease* when using 'token_aware'.
This is on a 12-node, 2-datacenter, RF=3 cluster in AWS.
Also, why do the work the coordinator will do for you: send all the queries,
wait for everything to come back
Thank you very much. I recompiled it with 2.0 and it works well; now I
will try to figure out which granularity works better.
Your example was really a boost, thanks again!
Regards,
Paolo
On 19/06/2014 22:42, Paulo Ricardo Motta Gomes wrote:
Hello Paolo,
I just published an open source
OK, so Cassandra 2.1 now rejects writes it considers too big. It is
possible to increase the value by changing commitlog_segment_size_in_mb
in cassandra.yaml. It defaults to 32 MB, and the maximum size permitted for
a single write is half that value:
from CommitLog.java:
// we only permit records HALF the s
For the record, I could reproduce the problem with blobs of size below 64MB.
Caused by: java.lang.IllegalArgumentException: Mutation of 32000122
bytes is too large for the maxiumum size of 16777216
32000122 is just ~30MB and fails on 2.1-rc1 while it works on 2.0.X for
even larger values (up to 6
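A quick worked check of those numbers against the half-segment rule quoted above:

32 MB segment / 2 = 16 MB = 16 * 1024 * 1024 = 16,777,216 bytes (the maximum in the error)
32,000,122 bytes ≈ 30.5 MB, i.e. roughly twice that cap, hence the rejection on 2.1-rc1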
So it looks like I was sending more than I expected. Still the question
stands: is CQL the best way to send BLOBs? Are there any remote
operations available on BLOBs?
Thanks,
Simon
On 20/06/2014 10:03, Simon Chemouil wrote:
> Hi,
>
> I read in Cassandra's FAQ that it is fine with BLOBs up to 64M
On 20/06/2014 10:41, Duncan Sands wrote:
> Hi Simon,
> 122880122 bytes is a lot more than 0.6MB... How are you sending your blob?
It turns out there was a mistake in my code. The blob in this case was
actually 122 MB!
Still, the same code works fine on Cassandra 2.0.x, so there might be a
bug lurkin
Hi Simon,
On 20/06/14 10:18, Simon Chemouil wrote:
> Hi,
> When I am sending BLOBs _below_ the max query size (blob size=0.6MB), on
> Cassandra 2.0, it works fine, but on 2.1-rc1 I get the following error
> within the Cassandra server (from the logs) and the query just dies:
> WARN [SharedPool-Worker-2
Hi,
When I am sending BLOBs _below_ the max query size (blob size=0.6MB), on
Cassandra 2.0, it works fine, but on 2.1-rc1 I get the following error
within the Cassandra server (from the logs) and the query just dies:
WARN [SharedPool-Worker-2] 2014-06-20 10:06:00,263
AbstractTracingAwareExecutor
Hi,
I read in Cassandra's FAQ that it is fine with BLOBs up to 64 MB. Here I
am trying to send a 1.6 MB BLOB using CQL, and Cassandra rejects my query
with the following message:
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException:
Request is too big: length 409600086 exceeds maximum
Yes, I am using the DataStax CQL drivers.
It was good advice, thanks a lot Jonathan.
[]s
2014-06-20 0:28 GMT-03:00 Jonathan Haddad :
> The only case in which it might be better to use an IN clause is if
> the entire query can be satisfied from that machine. Otherwise, go
> async.
>
> The nati