We need to make a copy of a cluster. We’re going to do some testing against the
copy and then discard it. What’s the best way of doing that? I created another
datacenter, and then have tried to divorce it from the original datacenter, but
have had troubles doing so.
Suggestions?
Thanks in advance
In my opinion, this is not broken and “fixing” it would break existing code.
Consider a batch that includes multiple inserts, each of which inserts the
value returned by now(). Getting the same UUID for each insert would be a major
problem.
Cheers
Robert
On Nov 30, 2016, at 4:46 PM, Todd Fast
I used to think it was terrible as well. But it really isn’t. Just put your
non-counter columns in a separate table with the same primary key. If you want
to query both counter and non-counter columns at the same time, just query both
tables at the same time with asynchronous queries.
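The suggestion above can be sketched as a pair of hypothetical tables (names invented for illustration). A counter table may contain only counter columns besides its primary key, so the non-counter data lives in a companion table sharing the same key:

```sql
-- Hypothetical schema: counters isolated from regular columns,
-- both tables keyed identically so they can be queried together.
CREATE TABLE page_stats (
    page_id UUID,
    views COUNTER,
    PRIMARY KEY ((page_id))
);

CREATE TABLE page_meta (
    page_id UUID,
    title TEXT,
    PRIMARY KEY ((page_id))
);
```

The application issues one query against each table (asynchronously, as suggested) and merges the results by page_id.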
On Nov 1,
I had this problem, and it was caused by my retry policy. For reasons I don’t
remember (but it is documented in a C* Jira ticket), when onWriteTimeout() is
called, you cannot call RetryDecision.retry(cl), as it will be a CL that is
incompatible with LWT. After the fix (2.1.?), you can pass null, an
have found a similar bug.
The Java driver mailing list is the best place to follow up on this. It can be
found at
https://groups.google.com/a/lists.datastax.com/forum/#!forum/java-driver-user.
On Thu, May 19, 2016 at 12:11 AM, Robert Wille <rwi...@fold3.com> wrote:
When executing bulk CAS queries, I intermittently get the following error:
SERIAL is not supported as conditional update commit consistency. Use ANY if
you mean "make sure it is accepted but I don't care how many replicas commit it
for non-SERIAL reads"
This doesn’t make any sense. Obviously,
to the document given the document text.
-- Jack Krupansky
On Mon, Apr 11, 2016 at 7:12 PM, James Carman <ja...@carmanconsulting.com> wrote:
S3 maybe?
On Mon, Apr 11, 2016 at 7:05 PM Robert Wille <rwi...@fold3.com> wrote:
I do realize it’s kind of a weird use case, but it is
> separate table per day / hour or something like that, so you can quickly get all keys
> for a time range. A query without the partition key may be very slow.
>
> Jan
>
> On 11.04.2016 at 23:43, Robert Wille wrote:
>> I have a need to be able to use the text of a document
I have a need to be able to use the text of a document as the primary key in a
table. These texts are usually less than 1 KB, but can sometimes be tens of KB
in size. Would it be better to use a digest of the text as the key? I have a
background process that will occasionally need to do a full table scan
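A minimal sketch of the digest approach, assuming SHA-256: the key becomes a fixed 32 bytes regardless of document size, and the collision risk is negligible in practice. Class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DocKey {
    // Derive a fixed-size key from arbitrarily large document text.
    static String digestKey(String documentText) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(documentText.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }

    public static void main(String[] args) {
        // 64 hex characters, whether the document is 1 KB or 100 KB.
        System.out.println(digestKey("some document text").length());
    }
}
```

If the original text must still be retrievable, it can be stored as a regular column alongside the digest key.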
You still need compaction. Compaction is what organizes your data into levels.
Without compaction, every query would have to look at every SSTable.
Also, due to commit log rotation, your memtable may get flushed from time to
time before it is full, resulting in small SSTables that would benefit from
compaction.
Yes, there is memory overhead for each column family, effectively limiting the
number of column families. The general wisdom is that you should limit yourself
to a few hundred.
Robert
On Feb 29, 2016, at 10:30 AM, Fernando Jimenez <fernando.jime...@wealth-port.com> wrote:
Hi all
I ha
You shouldn’t be using IN anyway. It is better to issue multiple queries, each
for a single key, and issue them in parallel. Better performance. Less GC
pressure.
On Feb 4, 2016, at 7:54 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
That behavior has been changed in 2.2 and upwards
I disagree. I think that you can extrapolate very little information about RF>1
and CL>1 by benchmarking with RF=1 and CL=1.
On Jan 13, 2016, at 8:41 PM, Anurag Khandelwal <anur...@berkeley.edu> wrote:
Hi John,
Thanks for responding!
The aim of this benchmark was not to benchmark Cassandra
I would personally classify both of those use cases as light, and I wouldn’t
have any qualms about using a single cluster for both of those.
On Dec 23, 2015, at 3:06 PM, cass savy wrote:
> How do you determine whether we can share a cluster in prod for 2 different
> applications
>
> 1. Has anybody
The nulls in the original data created the tombstones. They won’t go away until
gc_grace_seconds has passed (default is 10 days).
On Dec 7, 2015, at 4:46 PM, Kai Wang wrote:
> I bulkloaded a few tables using CQLSSTableWriter/sstableloader. The data are
> large amount of wide rows with lots of
her if you
have TTL on all non static (clustering and data) columns, you don’t
(necessarily) want the static data to disappear when the other cells do -
though you can achieve this with statement wide TTL-ing on insertion of the
static data.
On Dec 3, 2015, at 6:31 PM, Robert Wille wrote:
With this schema:
CREATE TABLE roll (
    id INT,
    image BIGINT,
    data VARCHAR static,
    PRIMARY KEY ((id), image)
) WITH gc_grace_seconds = 3456000
  AND compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };
if I run SELECT image FROM roll WHERE id = X on 2.0, where partitio
On Mon, Nov 23, 2015 at 5:55 PM, Robert Wille <rwi...@fold3.com> wrote:
I’m wanting to upgrade from 2.0 to 2.1. The upgrade instructions at
http://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html
have the following, which leaves me with more questions than it answers:
If your cluster does not use vnodes, disable vnodes in each new cassandra.yaml
I’m planning an upgrade from 2.0 to 2.1, and was reading about counters, and
ended up with a question. I read that in 2.0, counters are implemented by
storing deltas, and in 2.1, read-before-write is used to store totals instead.
What does this mean for the following scenario?
Suppose we have a
I’m on 2.0.16 and want to upgrade to the latest 2.1.x. I’ve seen some comments
about issues with counters not migrating properly. I have a lot of counters.
Any concerns there? Do I need to run nodetool upgradesstables? Any other
gotchas?
Thanks
Robert
2015, at 2:33 PM, Robert Wille <rwi...@fold3.com> wrote:
It's a paging bug. I ALWAYS get a duplicated record every fetchSize records.
Easily duplicated 100% of the time.
I’ve logged a bug: https://issues.apache.org/jira/browse/CASSANDRA-10442
Robert
On Oct 3, 2015, at 10:59
We had some problems with a node, so we decided to rebootstrap it. My IT guy
screwed up, and when he added -Dcassandra.replace_address to cassandra-env.sh,
he forgot the closing quote. The node bootstrapped, and then refused to join
the cluster. We shut it down, and then noticed that nodetool st
On Oct 3, 2015, at 10:59 AM, Robert Wille <rwi...@fold3.com> wrote:
Oops, I was t
if (imageId == lastImageId)
{
    logger.warn("Cassandra duplicated " + imageId);
    continue;
}
total++;
lastImageId = imageId;
}
On Oct 3, 2015, at 10:54 AM, Robert Wille <rwi...@fold3.com> wrote:
I don’t think it’s an application problem. The following simple snippets produce
deviation. Especially
since you don't see the duplicates in cqlsh, I have a hunch this is an
application bug.
On Fri, Oct 2, 2015 at 4:58 PM Robert Wille <rwi...@fold3.com> wrote:
When I run the query "SELECT image FROM roll WHERE roll = :roll" against this
table
CREATE TABLE roll (
    roll INT,
    image BIGINT,
    data VARCHAR static,
    mid VARCHAR,
    imp_st VARCHAR,
    PRIMARY KEY ((roll), image)
) WITH gc_grace_seconds = 3456000 AND compaction = { 'class' :
'LeveledCompactionStrategy'
kes up into the thousands. In my case it's a LCS
table with fluctuating-but-sometimes-pretty-high write load and lots of
(intentional) overwrite, infrequent deletes. C* 2.1.7.
On Thu, Sep 24, 2015 at 12:59 PM, Robert Wille <rwi...@fold3.com> wrote:
I have some tables that have
It sounds like it’s probably GC. Grep for GC in system.log to verify. If it is
GC, there are a myriad of issues that could cause it, but at least you’ve
narrowed it down.
On Sep 9, 2015, at 11:05 PM, Roman Tkachenko wrote:
> Hey guys,
>
> We've been having issues in the past couple of days wit
Essentially, the ORDER BY clause has to specify the clustering columns in full,
in the declared order. It doesn’t by default know that you have already
effectively filtered by type.
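The limitation can be illustrated against the import_file table quoted earlier in the thread: CQL rejects an ORDER BY that skips a clustering column, but accepts the full clustering order or its full reversal (this is a sketch of the workaround, not a quote from the thread):

```sql
-- Rejected: ORDER BY skips the first clustering column (type)
SELECT data FROM import_file WHERE roll = 1 AND type = 'foo' ORDER BY id DESC;

-- Accepted: the full clustering order, reversed
SELECT data FROM import_file WHERE roll = 1 AND type = 'foo'
ORDER BY type DESC, id DESC;
```

Alternatively, declaring the table WITH CLUSTERING ORDER BY (type ASC, id DESC) makes newest-first the default ordering.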
Alec Collier | Workplace Service Design
Corporate Operations Group - Technology | Macquarie Group Limited •
From: Robert Wil
So to do an order by Id, C* will need to perform an in-memory re-ordering, not
sure how bad it is for performance. In any case currently it's not possible,
maybe you should create a JIRA to ask for lifting the limitation.
On Thu, Sep 3, 2015 at 10:27 PM, Robert Wille wrote:
Given this table:
CREATE TABLE import_file (
    roll int,
    type text,
    id timeuuid,
    data text,
    PRIMARY KEY ((roll), type, id)
)
This should be possible:
SELECT data FROM import_file WHERE roll = 1 AND type = 'foo' ORDER BY id DESC;
but it results in the following error:
Bad Request: Order
But it shouldn’t matter. I have missing data, and no errors, which shouldn’t be
possible except with CL=ANY.
FWIW, I’m working on some sample code so I can post a Jira.
Robert
On Aug 21, 2015, at 5:04 AM, Robert Wille <rwi...@fold3.com> wrote:
RF=1 with QUORUM consistency.
wrote:
What consistency level were the writes?
From: Robert Wille <rwi...@fold3.com>
Sent: 8/20/2015 18:25
To: user@cassandra.apache.org
Subject: Written data is lost and no exception thrown back to the client
I wrote a data migration application which I was testing, and I pushed it too
hard and the FlushWriter thread pool blocked, and I ended up with dropped
mutation messages. I compared the source data against what is in my cluster,
and as expected I have missing records. The strange thing is that m
The new materialized view feature of Cassandra 3.0 would make it an even
easier fit.
-- Jack Krupansky
On Thu, Jul 23, 2015 at 6:30 PM, Robert Wille <rwi...@fold3.com> wrote:
I obviously worded my original email poorly. I guess that’s what happens when
you post at the end of the day just bef
I have a database which has a fair amount of churn. When I need to update a
data structure, I create a new one, and when it is complete, I delete the old
one. I have gc_grace_seconds=0, so the space for the old data structures should
be reclaimed on the next compaction. This has been working fin
Krupansky <jack.krupan...@gmail.com> wrote:
Maybe you could explain in more detail what you mean by recently modified
documents, since that is precisely what I thought I suggested with descending
ordering.
-- Jack Krupansky
On Thu, Jul 23, 2015 at 3:40 PM, Robert Wille wrote:
ng taking care of the delete,
automatically.
-- Jack Krupansky
On Tue, Jul 21, 2015 at 12:37 PM, Robert Wille <rwi...@fold3.com> wrote:
The time series doesn’t provide the access pattern I’m looking for. No way to
query recently-modified documents.
On Jul 21, 2015, at 9:13
My guess is that you don’t understand what an atomic batch is, given that you
used the phrase “updated synchronously”. Atomic batches do not provide
isolation, and do not guarantee immediate consistency. The only thing an atomic
batch guarantees is that all of the statements in the batch will eve
se as, due to the specified clustering order, the latest
modification will always be the first record in the row.
Hope it helps.
Carlos Alonso | Software Engineer | @calonso<https://twitter.com/calonso>
On 21 July 2015 at 05:59, Robert Wille <rwi...@fold3.com> wrote:
Data structures that have a recently-modified access pattern seem to be a poor
fit for Cassandra. I’m wondering if any of you smart guys can provide
suggestions.
For the sake of discussion, lets assume I have the following tables:
CREATE TABLE document (
    docId UUID,
    doc TEXT,
I have two test clusters, both 2.0.15. One has a single node and one has three
nodes. Truncate on the three node cluster is really slow, but is quite fast on
the single-node cluster. My test cases truncate tables before each test, and >
95% of the time in my test cases is spent truncating tables
You can get tombstones from inserting null values. Not sure if that’s the
problem, but it is another way of getting tombstones in your data.
On Jun 15, 2015, at 10:50 AM, Jean Tremblay <jean.tremb...@zen-innovations.com> wrote:
Dear all,
I identified a bit more closely the root cause o
Internode messages which are received by a node but do not get processed
within rpc_timeout are dropped rather than processed, as the coordinator node
will no longer be waiting for a response. If the coordinator node does not
receive Consistency Level responses before the rpc_timeout
I meant to say I’m *not* overloading my cluster.
On Jun 12, 2015, at 6:52 PM, Robert Wille wrote:
> I am preparing to migrate a large amount of data to Cassandra. In order to
> test my migration code, I’ve been doing some dry runs to a test cluster. My
> test cluster is 2.0.15, 3 no
I am preparing to migrate a large amount of data to Cassandra. In order to test
my migration code, I’ve been doing some dry runs to a test cluster. My test
cluster is 2.0.15, 3 nodes, RF=1 and CL=QUORUM. I know RF=1 and CL=QUORUM is a
weird combination, but my production cluster that will eventu
I was wondering something about Cassandra’s internals.
Suppose I have CL > 1 and I read a partition with a bunch of tombstones. Those
tombstones have to be sent to the coordinator for consistency reasons so that
if another replica produces non-tombstone data that is older than the
tombstone, it
Have you cleared snapshots?
On May 15, 2015, at 2:24 PM, Analia Lorenzatto <analialorenza...@gmail.com> wrote:
The Replication Factor = 2. The RP is the default, but not sure how to check
it.
I am attaching the output of: nodetool ring
Thanks a lot!
On Fri, May 15, 2015 at 4:17 PM, Ki
Timestamps have millisecond granularity. If you make multiple writes within the
same millisecond, then the outcome is not deterministic.
Also, make sure you are running ntp. Clock skew will manifest itself similarly.
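A quick illustration of how coarse millisecond granularity is relative to write speed (a standalone sketch, not Cassandra code): in a tight loop, many successive calls observe the same millisecond, so back-to-back writes can easily carry identical timestamps.

```java
public class MillisGranularity {
    // Count how many of n successive System.currentTimeMillis() calls
    // return the same value as the call immediately before them.
    static long countCollisions(int n) {
        long collisions = 0;
        long prev = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            long now = System.currentTimeMillis();
            if (now == prev) {
                collisions++;
            }
            prev = now;
        }
        return collisions;
    }

    public static void main(String[] args) {
        // The vast majority of 100,000 tight-loop calls share a millisecond.
        System.out.println("same-millisecond calls: " + countCollisions(100_000));
    }
}
```

Two Cassandra writes to the same cell with equal timestamps are resolved by value comparison, not arrival order, which is why the outcome looks non-deterministic.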
On May 13, 2015, at 3:47 AM, Jared Rodriguez <jrodrig...@kitedesk.com>
batch update?
On Wed, May 13, 2015 at 5:48 PM, Ali Akhtar <ali.rac...@gmail.com> wrote:
The 6k is only the starting value; it’s expected to scale up to ~200 million
records.
On Wed, May 13, 2015 at 5:44 PM, Robert Wille <rwi...@fold3.com> wrote:
You could use lightweight transactions to update only if the record is newer.
It doesn’t avoid the read, it just happens under the covers, so it’s not really
going to be faster compared to a read-before-write pattern (which is an
anti-pattern, BTW). It is probably the easiest way to avoid gettin
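A sketch of the idea in CQL against a hypothetical table (table and column names are invented; inequality operators in IF conditions are supported since Cassandra 2.0.7):

```sql
-- Apply the write only if the stored record is older than the incoming one.
-- updated_at is a client-supplied version/timestamp column.
UPDATE document_store
SET body = 'new contents', updated_at = 1431532800000
WHERE doc_id = 42
IF updated_at < 1431532800000;
```

The conditional write returns an [applied] flag, so the client can tell whether its value won without issuing a separate read.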
Bag the IN clause and execute multiple parallel queries instead. It’s more
performant anyway.
On May 2, 2015, at 11:46 AM, Abhishek Singh Bailoo <abhishek.singh.bai...@gmail.com> wrote:
Hi
I have run into the following issue
https://issues.apache.org/jira/browse/CASSANDRA-6722 when run
I’ve come across the same thing. I have a table with at least half a dozen
columns that could be null, in any combination. Having a prepared statement for
each permutation of null columns just isn’t going to happen. I don’t want to
build custom queries each time because I have a really cool syst
From: Robert Wille <rwi...@fold3.com>
Sent: 22 April 2015 15:00
To: user@cassandra.apache.org
Subject: Re: OperationTimedOut in select count statement in cqlsh
I should have been more clear. What I meant was t
From: Robert Wille <rwi...@fold3.com>
Sent: 22 April 2015 14:44
To: user@cassandra.apache.org
Subject: Re: OperationTimedOut in select count statement in cqlsh
Keep in mind that "select count(l)" and "
From: Robert Wille <rwi...@fold3.com>
Sent: 22 April 2015
Keep in mind that "select count(l)" and "select l" amount to essentially the
same thing.
On Apr 22, 2015, at 3:41 AM, Tommy Stendahl <tommy.stend...@ericsson.com> wrote:
Hi,
Check out CASSANDRA-8899; my guess is that you have to increase the timeout in
cqlsh.
/Tommy
On 2015-04-22 11:1
Add more nodes to your cluster
On Apr 22, 2015, at 1:39 AM, John Anderson <son...@gmail.com> wrote:
Hey, I'm looking at querying around 500,000 rows that I need to pull into a
Pandas data frame for processing. Currently testing this on a single cassandra
node it takes around 21 seconds
I can readily reproduce the bug, and filed a JIRA ticket:
https://issues.apache.org/jira/browse/CASSANDRA-9194
I’m posting for posterity
On Apr 13, 2015, at 11:59 AM, Robert Wille <rwi...@fold3.com> wrote:
Unfortunately, I’ve switched email systems and don’t have my emails fro
lear. If so, what was
the JIRA #? Have you filed a JIRA for the new problem?
On Mon, Apr 13, 2015 at 12:21 PM, Robert Wille <rwi...@fold3.com> wrote:
Back in 2.0.4 or 2.0.5 I ran into a problem with delete-only workloads. If I
did lots of deletes and no upserts, Cassandra woul
Back in 2.0.4 or 2.0.5 I ran into a problem with delete-only workloads. If I
did lots of deletes and no upserts, Cassandra would report that the memtable
was 0 bytes because of an accounting error. The memtable would never flush and
Cassandra would eventually die. Someone was kind enough to create
I moved my site over to Cassandra a few months ago, and everything has been
just peachy until a few hours ago (yes, it would be in the middle of the night)
when my entire cluster suffered death by GC. By death by GC, I mean this:
[rwille@cas031 cassandra]$ grep GC system.log | head -5
INFO [Sch
Ben Bromhead sent an email to me directly and expressed an interest in seeing
some of my queries. I may as well post them for everyone. Here are my queries
for the part of my code that reads and cleans up browse trees.
@NamedCqlQueries({
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CHECK_TREE_
me to dig a little deeper, I’d be happy to. Just email me.
Robert
On Mar 27, 2015, at 5:35 PM, Ben Bromhead <b...@instaclustr.com> wrote:
+1 would love to see how you do it
On 27 March 2015 at 07:18, Jonathan Haddad <j...@jonhaddad.com> wrote:
I'd be interested
I have a cluster which stores tree structures. I keep several hundred unrelated
trees. The largest has about 180 million nodes, and the smallest has 1 node.
The largest fanout is almost 400K. Depth is arbitrary, but in practice is
probably less than 10. I am able to page through children and sib
I would also like to add that if you avoid IN and use async queries instead, it
is pretty trivial to use a semaphore or some other limiting mechanism to put a
ceiling on the amount on concurrent work you are sending to the cluster. If you
use a query with an IN clause with a thousand things, you
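The semaphore idea above can be sketched as follows. This is a standalone illustration (no driver involved): a Thread.sleep stands in for the asynchronous query, and in real code the permit would be released from the query's completion callback rather than at the end of a task. All names are invented.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledQueries {
    static final int MAX_IN_FLIGHT = 8; // ceiling on concurrent "queries"

    // Runs n simulated queries and returns the peak concurrency observed.
    static int run(int n) throws InterruptedException {
        Semaphore permits = new Semaphore(MAX_IN_FLIGHT);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(32);
        for (int i = 0; i < n; i++) {
            permits.acquire(); // block until one of the 8 slots frees up
            pool.execute(() -> {
                int cur = inFlight.incrementAndGet();
                peak.accumulateAndGet(cur, Math::max);
                try {
                    Thread.sleep(1); // stand-in for an in-flight query
                } catch (InterruptedException ignored) {
                } finally {
                    inFlight.decrementAndGet();
                    permits.release(); // free the slot for the next query
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrency: " + run(200));
    }
}
```

The acquire-before-submit pattern guarantees that no more than MAX_IN_FLIGHT requests are outstanding at once, regardless of how many keys you need to fetch.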
Our Cassandra database just rolled to live last night. I’m looking at our query
performance, and overall it is very good, but perhaps 1 in 10,000 queries takes
several hundred milliseconds (up to a full second). I’ve grepped for GC in the
system.log on all nodes, and there aren’t any recent GC e
nodetool repair has some options that I don’t understand. Reading the
documentation doesn’t exactly make things more clear. I’m running a 2.0.11
cluster with vnodes and a single data center.
The docs say "Use -pr to repair only the first range returned by the
partitioner”. What does this mean?
After bootstrapping a node, the node repeatedly compacts the same tables over
and over, even though my cluster is completely idle. I’ve noticed the same
behavior after extended periods of heavy writes. I realize that during
bootstrapping (or extended periods of heavy writes) that compaction coul
4, 2014, at 11:44 PM, Jens Rantil <jens.ran...@tink.se> wrote:
Hi Robert ,
Maybe you need to flush your memtables to actually see the disk usage increase?
This applies to both hosts.
Cheers,
Jens
On Sun, Dec 14, 2014 at 3:52 PM, Robert Wille <rwi...@fold3.com> wrote:
Tombstones have to be created. The SSTables are immutable, so the data cannot
be deleted. Therefore, a tombstone is required. The value you deleted will be
physically removed during compaction.
My workload sounds similar to yours in some respects, and I was able to get C*
working for me. I have
cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__hinted_handoff_enabled
>
> Rahul
>
>> On Dec 14, 2014, at 9:46 AM, Robert Wille wrote:
>>
>> I have a cluster with RF=3. If I shut down one node, add a bunch of data to
I have a cluster with RF=3. If I shut down one node and add a bunch of data to
the cluster, I don’t see a bunch of records added to system.hints. Also, du of
/var/lib/cassandra/data/system/hints of the nodes that are up shows that hints
aren’t being stored. When I start the down node, its data does
I have spent a lot of time working with single-node, RF=1 clusters in my
development. Before I deploy a cluster to our live environment, I have spent
some time learning how to work with a multi-node cluster with RF=3. There were
some surprises. I’m wondering if people here can enlighten me. I do
At the data modeling class at the Cassandra Summit, the instructor said that
lots of small partitions are just fine. I’ve heard on this list that that is
not true, and that it’s better to cluster small partitions into fewer, larger
partitions. Due to conflicting information on this issue, I’d be
This is a follow-up to my previous post “Cassandra taking snapshots
automatically?”. I’ve renamed the thread to better describe the new information
I’ve discovered.
I have a four node, RF=3, 2.0.11 cluster that was producing snapshots at a
prodigious rate. I let the cluster sit idle overnight t
14 at 10:25:12 AM Robert Wille <rwi...@fold3.com> wrote:
really bad shuffle. Did you run
removenode on the old host after you took it down (I assume so since all nodes
are in UN status)? Is the test node in its own seeds list?
On Tue Dec 02 2014 at 4:10:10 PM Robert Wille <rwi...@fold3.com> wrote:
I didn’t do anything except kill t
r.html#reference_ds_qfg_n1r_1k__snapshot_before_compaction
On Wed Dec 03 2014 at 10:25:12 AM Robert Wille <rwi...@fold3.com> wrote:
I built my first multi-node cluster and populated it with a bunch of data, and
ran out of space far more quickly than I expected. On one node, I ended up with
76 snapshots, consuming a total of 220 GB of space. I only have 40 GB of data.
It took several snapshots per hour, sometimes within a min
3:38 PM, Tyler Hobbs <ty...@datastax.com> wrote:
On Tue, Dec 2, 2014 at 2:21 PM, Robert Wille <rwi...@fold3.com> wrote:
As a test, I took down a node, deleted /var/lib/cassandra and restarted it.
Did you decommission or removenode it when you took it down? If you
at 12:21 PM, Robert Wille <rwi...@fold3.com> wrote:
As a test, I took down a node, deleted /var/lib/cassandra and restarted it.
After it joined the cluster, it’s about 75% the size of its neighbors (both in
terms of bytes and numbers of keys). Prior to my test it was approximately the
same size. I have no explanation for why that node would shr
I would suggest that dynamic table creation is, in general, not a great idea,
regardless of the database. I would seriously consider altering your approach
to use a fixed set of tables.
On Nov 28, 2014, at 1:53 AM, Marcus Olsson <marcus.ols...@ericsson.com> wrote:
Hi,
We encountered th
Is it possible to replicate a subset of the keyspaces to a data center? For
example, if I want to run reports without impacting my production nodes, can I
put the relevant column families in a keyspace and create a DC for reporting
that replicates only that keyspace?
Robert
Suppose I have the primary keys for 10,000 rows and I want them all. Is there a
rule of thumb for the maximum number of concurrent asynchronous queries I
should execute?
and aggregate at read time),
or you can make each row a rolling 24 hours (aggregating at write time),
depending on which use case fits your needs better.
On Sun Nov 23 2014 at 8:42:11 AM Robert Wille <rwi...@fold3.com> wrote:
I’m working on moving a bunch of counters out of our relati
I’m working on moving a bunch of counters out of our relational database to
Cassandra. For the most part, Cassandra is a very nice fit, except for one
feature on our website. We manage a time series of view counts for each
document, and display a list of the most popular documents in the last se
I’m wondering if there’s a best practice for an annoyance I’ve come across.
Currently all my environments (dev, staging and live) have a single DC. In the
future my live environment will most likely have a second DC. When that
happens, I’ll want to use LOCAL_* consistency levels. However, if I w
At the Cassandra Summit I became aware of that there are issues with deleting
counters. I have a few questions about that. What is the bad thing that happens
(or can possibly happen) when a counter is deleted? Is it safe to delete an
entire row of counters? Is there any 2.0.x version of Cassandr
he coordinator memory.
On Sat, Oct 4, 2014 at 3:09 PM, Robert Wille <rwi...@fold3.com> wrote:
I am architecting a solution for moving a large number of documents out of our
MySQL database to C*. We use Solr to index these documents. I’ve recently
become aware of a few different packages that integrate C* and Solr. At first
blush, this seems like the perfect fit, as it would eliminate a c
I have a table of small documents (less than 1K) that are often accessed
together as a group. The group size is always less than 50. Which produces less
load on the server, one query using an IN clause to get all 50 back together,
or 50 concurrent queries? Which one is fastest?
Thanks
Robert
I’m in a fairly unique position. Almost a year ago I developed code to migrate
part of our MySQL database to Cassandra. Shortly after 2.0.6 was released, I
was on the verge of rolling it to live when my project got shelved, and my team
got put on a completely different product. In a month or two
>
> 2) Are there any other recommended procedures for this?
0) stop writes to columnfamily
1) TRUNCATE columnfamily;
2) nodetool clearsnapshot # on the snapshot that results
3) DROP columnfamily;
My two cents here is that this process is extremely difficult to automate,
making testing that i
Password protection doesn’t protect against an engineer accidentally running
test cases using the live config file instead of the test config file. To
protect against that, our RDBMS system will only accept connections from
certain IP addresses. Is there an equivalent thing in Cassandra, or should
I use truncate between my test cases. Never had a problem with one test
case inheriting the data from the previous one. I’m using a single node,
so that may be why.
On 2/26/14, 9:27 AM, "Ben Hood" <0x6e6...@gmail.com> wrote:
>On Wed, Feb 26, 2014 at 3:58 PM, DuyHai Doan wrote:
>> Try truncate fo
Yeah, it’s called a rule. Set one up to delete everything from
user@cassandra.apache.org.
On 2/22/14, 10:32 AM, "Paul "LeoNerd" Evans"
wrote:
>A question about the mailing list itself, rather than Cassandra.
>
>I've re-subscribed simply because I have to be subscribed in order to
>send to the li