Hi,
I created a table with the following schema:
CREATE TABLE profile_new.user_categories_1477899735 (
id bigint,
category int,
score double,
PRIMARY KEY (id, category)
) WITH CLUSTERING ORDER BY (category ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL',
I've collected some more data points, and I still see dropped
mutations with compaction_throughput_mb_per_sec set to 8.
The only notable thing about the current setup is that I have
another keyspace (not being repaired, though) with really wide rows
(100MB per partition), but that shouldn't have an impact here.
>Both nodes can be seeds.
Probably I misunderstood Raimund as setting each node as its only seed. If he
set both IPs on both nodes, it's OK.
Best regards, Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.
On Sun, 30 Oct 2016 14:48:00 -0400, Jonathan H
I would set rpc_address to 0.0.0.0 and broadcast_rpc_address to EACH_IP.
This allows connecting to 127.0.0.1 from inside and to the external IP from outside.
By the way, I see that port 7000 is bound to the external IP. Aren't both nodes in the
same network? If yes, use internal IPs.
Best regards, Vladimir
Hi,
Is it a good approach to make a boolean column with TTL and build a
secondary index on it?
(For example, I want to get rows which need to be updated after a certain
time, but I don't want, say, to add a field "update_date" as a clustering
column or to create another table.)
In what kind of trouble could this get me?
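For concreteness, here is a minimal sketch of the design being asked about; the keyspace, table, and column names are made up for illustration:

-- Hypothetical table: a boolean flag written with a TTL, plus a secondary index on it.
CREATE TABLE ks.items (
    id uuid PRIMARY KEY,
    payload text,
    needs_update boolean
);

CREATE INDEX items_needs_update_idx ON ks.items (needs_update);

-- The flag is set with a TTL so it expires on its own after a day.
UPDATE ks.items USING TTL 86400
   SET needs_update = true
 WHERE id = 123e4567-e89b-12d3-a456-426655440000;

-- Rows still flagged would then be fetched through the index.
SELECT id FROM ks.items WHERE needs_update = true;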
http://www.planetcassandra.org/blog/cassandra-native-secondary-index-deep-dive/
See section "E) Caveats", which applies to your boolean use case.
On Mon, Oct 31, 2016 at 2:19 PM, Oleg Krayushkin
wrote:
> Hi,
>
> Is it a good approach to make a boolean column with TTL and build a
> secondary index on
Cassandra's native secondary index does not perform very well with inequalities (<,
>, <=, >=). In your case, even if you provide the partition key (which is a
very good idea), Cassandra still needs to perform a full scan on the local
node to find any score matching the inequality, and that is pretty expensive.
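As an illustration of why (assuming an index on the score column of the user_categories table shown earlier; the index name is invented), CQL itself surfaces the cost by requiring ALLOW FILTERING for the inequality:

CREATE INDEX user_categories_score_idx
    ON profile_new.user_categories_1477899735 (score);

-- Even with the partition key restricted, the range predicate on the
-- indexed column cannot be answered by the index alone, so Cassandra
-- insists on ALLOW FILTERING and scans the matching rows.
SELECT category, score
  FROM profile_new.user_categories_1477899735
 WHERE id = 42 AND score > 0.5
 ALLOW FILTERING;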
Hi DuyHai, thank you.
I got the idea of the caveat about too-low cardinality, but I'm still wondering about
possible trouble with putting a TTL (of months) on an indexed column (not a
boolean; say, an int with 100 distinct values).
2016-10-31 16:33 GMT+03:00 DuyHai Doan :
> http://www.planetcassandra.org/blog/cassan
Hi,
I secured my C* cluster by setting "authenticator:
org.apache.cassandra.auth.PasswordAuthenticator" in cassandra.yaml. I know
it secures the CQL native interface running on port 9042 because my code
uses that interface. Does this also secure the Thrift API interface running
on port 9160? I sear
Technically TTL should be handled properly. However, be careful of expired
data turning into tombstones. For the original table it may be a tombstone
on a skinny partition, but for the secondary index it may be a tombstone set on a
wide partition, and you'll start getting into trouble when reading such a
partition.
Hi Guys,
I keep reading the articles below, but the biggest questions for me are as
follows:
1) What is the "data size" per request? Without the data size it's hard for me to
see anything sensible.
2) Is there batching here?
http://www.datastax.com/1-million-writes
http://techblog.netflix.com/2014/07/r
Blowing out to 1k SSTables seems a bit full on. What args are you passing
to repair?
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com
On 31 October 2016 at 09:49, Stefano Ortolani wrote:
> I've collected some more data-points, and I still see dropped
> mutations with compaction_throughput_
From the article:
java -jar stress.jar -d "144 node ids" -e ONE -n 2700 -l 3 -i 1 -t 200
-p 7102 -o INSERT -c 10 -r
The client is writing 10 columns per row key, row key randomly chosen from
27 million ids, each column has a key and 10 bytes of data. The total on
disk size for each write incl
The original article
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
On Mon, Oct 31, 2016 at 5:57 PM, Peter Reilly
wrote:
> From the article:
> java -jar stress.jar -d "144 node ids" -e ONE -n 2700 -l 3 -i 1 -t 200
> -p 7102 -o INSERT -c 10 -r
>
> The client i
Hi Users,
I am trying to find a migration guide from 2.1.* to 3.x. I figured I
should go through NEWS.txt, so I read that and found a few things that
I should be careful about / consider during the upgrade.
I'm curious whether there's any documentation with specific steps on how to do the
migration.
Anyone f
Should be the same as going to 3.0; there are no file format version bumps between 3.0
and 3.9.
(There was one format change in 3.6 – CASSANDRA-11206 should have probably
bumped the version identifier, but we didn’t, and there’s nothing special you’d
need to do for it anyway.)
From: Lahiru Ga
Hey Jeff,
Thanks a lot. The biggest change I have in mind is using
TimeWindowCompactionStrategy for our time-series tables (currently we use
SizeTieredCompactionStrategy).
We already have data in those tables (6 nodes, each with 250GB, including data
that has timed out but hasn't been deleted from disk), and do you t
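If it helps, the switch itself is a single ALTER; the table name and window settings below are only an example and would need to match how the data is written and expired:

ALTER TABLE ks.sensor_data
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
  };

The ALTER does not rewrite existing SSTables by itself; they only get regrouped into time windows as they are compacted afterwards.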
Hello,
Has anyone played around with the cassandra reaper (
https://github.com/spotify/cassandra-reaper)?
If so, can someone please help me with the setup? I can't get it working. I
used the steps below:
1. create jar file using maven
2. java -jar cassandra-reaper-0.2.3-SNAPSHOT.jar server
cassandr
Hi Peter,
Thanks for sending this over. I don't see how 100 bytes (10 bytes of data *
10 columns) can represent anything useful. These days it is better to
benchmark with payloads of around 1 KB.
Thanks!
On Mon, Oct 31, 2016 at 4:58 PM, Peter Reilly
wrote:
> The original article
> http://techblog.netflix
Hi, all
I have a problem. I created a table named "tblA" in C* and created a
materialized view named viewA on tblA. I run a Spark job to process data from
'viewA'.
In the beginning it worked well, but the next day the Spark job failed.
And when I select data from 'viewA' and 'tblA'
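For reference, a view like the one described is typically defined along these lines; the column names and keys here are invented, since the original schema isn't shown:

CREATE MATERIALIZED VIEW ks.viewA AS
    SELECT *
    FROM ks.tblA
    WHERE category IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (category, id);

-- The view maintains its own copy of the data keyed by (category, id),
-- so it can be queried (or read by Spark) like a normal table.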