Hello,
We have a Cassandra database that is about 5 years old and has gone through
multiple upgrades. Today I noticed a very odd thing (current timestamp would
be around 1502957436214912):
cqlsh:siq_prod> select id,account_id,sweep_id from items where id=34681132;
id | account_i
It's a long, so you can't grab it with readInt: it's 8 bytes instead of 4.
You can delete it by issuing a delete with an explicit timestamp at least 1
higher than the timestamp on the cell:
DELETE FROM table USING TIMESTAMP ? WHERE
https://cassandra.apache.org/doc/latest/cql/dml.html#delete
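A minimal sketch of that approach (table and column names are hypothetical, based on the query earlier in the thread): read the cell's write time first, then delete with a strictly higher timestamp so the tombstone wins.

```cql
-- Inspect the write timestamp (microseconds since epoch) of the cell:
SELECT id, WRITETIME(account_id) FROM items WHERE id = 34681132;

-- Suppose WRITETIME returned 1502957436214912; delete with a timestamp
-- at least 1 higher so the tombstone shadows the existing cell:
DELETE FROM items USING TIMESTAMP 1502957436214913 WHERE id = 34681132;
```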
T
Dor,
Believe me, I tried it in many ways, and the result is quite disappointing.
I've run my scans on 3 different clusters, one of which was running on VMs,
and I was able to scale it up and down (3-5-7 VMs, 8 to 24 cores) to see
how this affects the performance.
I also generated the flow from Spark
Hi all,
I need to add a new node to my cluster, but this time the new node will
have double the disk space compared to the other nodes.
I'm using the default vnodes (num_tokens: 256). To fully use the disk
space in the new node, do I just have to configure num_tokens: 512?
Thanks in advance.
-
No.
Even if you doubled all the hardware on that node vs the others, it would
still be a bad idea.
Keep the cluster uniform, vnodes-wise.
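Concretely, that means leaving the relevant cassandra.yaml line on the new node identical to the rest of the ring (256 here matches the default the question mentions), even though the node has more disk:

```yaml
# cassandra.yaml on the new node: keep num_tokens identical to the
# existing nodes rather than doubling it for the bigger disk
num_tokens: 256
```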
Regards,
Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
Pythian - Love your data
rolo@pythian | Twitter: @cjrolo | Skype
On Thu, Aug 17, 2017 at 9:36 AM, Alex Kotelnikov <
alex.kotelni...@diginetica.com> wrote:
> Dor,
>
> I believe, I tried it in many ways and the result is quite disappointing.
> I've run my scans on 3 different clusters, one of which was using on VMs
> and I was able to scale it up and down (3-5-7
Yup, user_id is the primary key.
First of all, can you share how to "go to a node directly"?
Also, such an approach will retrieve all the data RF times; the coordinator
should have enough metadata to avoid that.
Shouldn't requesting via multiple coordinators provide a certain concurrency?
On 17 August 2017
Thanks for your help. I wrote a script to cycle through these early records and
try to update them (some columns were missing, but could be gleaned from
another db), then do the update, re-read, and if it's not correct, figure out
the write time and re-issue the update with a timestamp + 1. We’re
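That re-issue step can be sketched in CQL (table, column, and values are hypothetical, borrowed from the query earlier in the thread): read the bad cell's write time, then update with an explicit timestamp one higher.

```cql
-- Read the current write time of the suspect cell:
SELECT WRITETIME(account_id) FROM items WHERE id = 34681132;

-- Re-issue the update with an explicit timestamp one higher than the
-- value returned above, so the corrected cell wins:
UPDATE items USING TIMESTAMP 1502957436214913
SET account_id = 42
WHERE id = 34681132;
```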
There are certainly cases where corruption has happened in Cassandra (rare,
thankfully), but like I mentioned, I'm not aware of any that only corrupted
timestamps. It wouldn't surprise me to see a really broken clock, and it
wouldn't surprise me to see bit flips on bad hardware (even hardware with
Brian Hess has perhaps the best open source code example of the right way
to do this:
https://github.com/brianmhess/cassandra-loader/blob/master/src/main/java/com/datastax/loader/CqlDelimUnload.java
On Thu, Aug 17, 2017 at 10:00 AM, Alex Kotelnikov <
alex.kotelni...@diginetica.com> wrote:
> yu
Thanks Felipe and Erick,
Yes, your comment helped a lot, I was able to resolve that by:
ALTER KEYSPACE dse_system WITH replication = {'class': 'SimpleStrategy',
'replication_factor':'1'};
Another problem I had was with CentOS release 6.7 (Final):
I was getting a "glibc 2.14 not found" error.
Based on this <
Are you saying if a node had double the hardware capacity in every way it
would be a bad idea to up num_tokens? I thought that was the whole idea of
that setting though?
On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo wrote:
> No.
>
> If you would double all the hardware on that node vs the others
Hi Alex,
How do you generate your subrange set for running queries?
It may happen that some of your ranges intersect data-ownership range
borders (check by running 'nodetool describering [keyspace_name]').
Those range queries will be highly ineffective in that case, and that could
explain your result.
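A sketch of aligning scans to ownership (assuming a table `items` with partition key `user_id`, both hypothetical): take the token boundaries from `nodetool describering` and issue one query per owned range, so no query straddles a range border.

```cql
-- Suppose `nodetool describering my_keyspace` reports a range
-- (start_token: -9223372036854775808, end_token: -6148914691236517206);
-- scan exactly that range so the query hits a single set of replicas:
SELECT user_id FROM items
WHERE token(user_id) > -9223372036854775808
  AND token(user_id) <= -6148914691236517206;
```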
If you really double the hardware in every way, it's PROBABLY reasonable to
double num_tokens. It won't be quite the same as doubling all-the-things,
because you still have a single JVM, and you'll still have to deal with GC
as you're now reading twice as much and generating twice as much garbage,
So it is also terribly slow.
It does not work with materialized views (a quick hack for that is below) or
with UDTs; fixing that requires more time.
So I used it to retrieve the only built-in-type column, the key. To make
the task more time-consuming I extended the dataset a bit, to ~2.5M
records.
All of m
OK, I found a solution for this problem.
I deleted the system keyspace directory and restarted COSS, and it was
rebuilt.
rm -rf /var/lib/cassandra/data/system
A bit drastic but I'll test it also on a multi-node cluster.
On Thu, Aug 17, 2017 at 3:57 PM, Ioannis Zafiropoulos
wrote:
> Thanks Felip