> …disk space (50% max for
> size-tiered compaction vs about 80% for leveled compaction)
> >
> > "Our usage pattern is write once, read once (export) and delete once! "
> >
> > In this case, I think that leveled compaction fits your needs.
> >
> > "Can anyone suggest which (if any) is better? Are there better
> solutions?"
> >
> > Are your sstables compressed? You have 2 types of built-in compression,
> and you may use them depending on the model of each of your CFs.
> >
> > see:
> http://www.datastax.com/docs/1.1/operations/tuning#configure-compression
> >
> > Alain
> >
> > 2012/11/22 Alexandru Sicoe
> > We are running a 3 node Cassandra 1.1.5 cluster with a 3TB Raid 0 disk
> per node for the data dir and separate disk for the commitlog, 12 cores, 24
> GB RAM (12GB to Cassandra heap).
> >
>
>
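The utilization rule of thumb quoted above can be turned into quick arithmetic. This is a back-of-envelope sketch (the ~50%/~80% caps and the 3 TB disk are taken from this thread; the function name is mine):

```python
# Back-of-envelope sketch of the utilization rule of thumb quoted above
# (assumed caps: ~50% of the disk for size-tiered, ~80% for leveled).
DISK_TB = 3.0  # the 3 TB RAID 0 data disk from this thread

def usable_tb(disk_tb, max_utilization):
    # Data you can safely keep per node while leaving compaction headroom.
    return disk_tb * max_utilization

size_tiered_tb = usable_tb(DISK_TB, 0.50)
leveled_tb = usable_tb(DISK_TB, 0.80)
print(size_tiered_tb, leveled_tb)  # ~1.5 vs ~2.4 usable TB per node
```

On these assumptions, leveled compaction buys roughly an extra 0.9 TB of usable space per 3 TB node.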
Hello everyone,
We are running a 3 node Cassandra 1.1.5 cluster with a 3TB Raid 0 disk per
node for the data dir and separate disk for the commitlog, 12 cores, 24 GB
RAM (12GB to Cassandra heap).
We now have 1.1 TB worth of data per node (RF = 2).
Our data input is between 20 and 30 GB per day.
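Combining the numbers in this message gives a rough time-to-full estimate. A sketch, assuming a 3 TB data disk per node, 1.1 TB already used, 20-30 GB/day ingest, and a conservative 50% utilization cap for size-tiered compaction (the cap is an assumption, not stated by the poster):

```python
# Back-of-envelope growth sketch for the numbers in this thread
# (assumed: 3 TB disk, 1.1 TB used, 20-30 GB/day, 50% utilization cap).
def days_until_full(disk_gb, used_gb, daily_gb, max_utilization=0.5):
    headroom = disk_gb * max_utilization - used_gb
    return headroom / daily_gb

print(days_until_full(3000, 1100, 30))  # worst case, ~13 days of headroom
print(days_until_full(3000, 1100, 20))  # best case, ~20 days of headroom
```

That headroom window is why the compaction-strategy choice above matters here.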
Hello,
We are planning to upgrade from version 1.0.7 to the 1.1 branch. Which is
the stable version that people are using? I see the latest release is 1.1.5,
but maybe it's not wise to use it yet. Is 1.1.4 the one to use?
Cheers,
Alex
> …the local network, or the nodes broadcast their
> internal IP, in which case the "outside" nodes are helpless in trying to
> connect to a local net. On DC2 nodes/the node you issue the repair on,
> check for any sockets being opened to the internal addresses of the nodes
> in
Hello everyone,
I have a 2 DC (DC1:3 and DC2:6) Cassandra 1.0.7 setup. I have about
300GB/node in the DC2.
The DCs are communicating over a gateway where I do NAT for ports 7000,
9160 and 7199.
I did a "nodetool repair" on a node in DC2 without any external load on
the system.
It took 5 hrs
> …methodology of the tests?
>
> (Here is the methodology I used to time queries previously
> http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/)
>
> Cheers
>
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
Hi guys,
I'm trying out DSE and looking for the best way to arrange the cluster. I
have 9 nodes: 3 behind a gateway taking in writes from my collectors and 6
outside the gateway that are supposed to take replicas from the other 3 and
serve reads and analytics jobs.
1. Is it ok to run the 3 nodes
Sender: adsi...@gmail.com
Subject: composite query performance depends on component ordering
Hi guys,
I am consistently seeing a 20% improvement in query retrieval times if I
use the composite comparator "Timestamp:ID" instead of "ID:Timestamp" where
Timestamp=Long and ID=~100 character strings. I am retrieving all columns
(~1 million) from a single row. Why is this happening?
Cheers,
Al
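One plausible explanation for the difference reported above is comparison cost: with "Timestamp:ID", the Long component is compared first and the ~100-character string is only examined on a timestamp tie. The sketch below is an illustration of that effect in plain Python tuples, not a model of Cassandra's actual comparator internals:

```python
# Illustrative sketch (not Cassandra internals): with unique timestamps,
# tuple comparison never needs to touch the second component, so a
# Long-first composite avoids long string comparisons entirely.
class CountingStr(str):
    compares = 0
    def __eq__(self, other):
        CountingStr.compares += 1
        return str.__eq__(self, other)
    def __lt__(self, other):
        CountingStr.compares += 1
        return str.__lt__(self, other)
    __hash__ = str.__hash__

ids = [CountingStr("x" * 100 + str(i)) for i in range(1000)]
timestamps = list(range(1000))

CountingStr.compares = 0
sorted((ts, i) for ts, i in zip(timestamps, ids))  # Timestamp:ID ordering
ts_first = CountingStr.compares

CountingStr.compares = 0
sorted((i, ts) for ts, i in zip(timestamps, ids))  # ID:Timestamp ordering
id_first = CountingStr.compares

print(ts_first, id_first)  # 0 string compares when Timestamp leads; many when ID leads
```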
> …In the meantime though OpsCenter
> will need to be able to hit the listen_address for each node.
>
> On Thu, Mar 29, 2012 at 12:47 PM, Alexandru Sicoe
> wrote:
> > Hello,
> > I am planning on testing OpsCenter to see how it can monitor a multi DC
> > cluster. Th
Hello,
I am planning on testing OpsCenter to see how it can monitor a multi DC
cluster. There are 2 DCs each on a different side of a firewall. I've
configured NAT on the firewall to allow the communication between all
Cassandra nodes on ports 7000, 7199 and 9160. The cluster works fine.
However w
Hello everyone,
How are people running multi DC Cassandra across remote locations? Are
VPNs used? Or some dedicated application proxies? What is the norm here?
Any advice is much appreciated,
Alex
> …single counter column family, where the column name is the row key and the
> value is the counter.) A naive solution would require reading the directory
> before every read and the counter before every write--caching could
> probably help with that. So this approach would probably le
> …don't think it would make too much difference.
> range slice used by map-reduce will find the first row in the batch and
> then step through them.
>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
Hi guys,
Based on what you are saying there seems to be a tradeoff that developers
have to handle between:
"keep your rows under a certain size" vs
"keep data that's queried together, on disk together"
How would you handle this tradeoff in my case:
I monitor about
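One common way to handle the tradeoff described above (a generic pattern, not necessarily what the list recommended here) is to bucket wide time-series rows by a time window, so each row stays bounded while a query window still touches only a few contiguous rows. The bucket size and key format below are illustrative assumptions:

```python
# Hypothetical sketch: time-bucketed row keys that cap row width while
# keeping a query window's columns in a handful of contiguous rows.
BUCKET_SECONDS = 3600  # assumed 1-hour buckets

def row_key(source_id, epoch_seconds, bucket=BUCKET_SECONDS):
    # One row per source per time bucket, e.g. "sensor42:2".
    return f"{source_id}:{epoch_seconds // bucket}"

def keys_for_range(source_id, start, end, bucket=BUCKET_SECONDS):
    # Row keys a reader must fetch to cover the window [start, end].
    return [f"{source_id}:{b}" for b in range(start // bucket, end // bucket + 1)]

print(row_key("sensor42", 7200))               # sensor42:2
print(keys_for_range("sensor42", 3000, 7300))  # spans 3 hourly buckets
```

Shrinking the bucket caps row size; growing it keeps more co-queried data on disk together.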
Hi everyone,
If you have 3 data centers (DC1,DC2 and DC3) with 3 nodes each and you have
a keyspace where the strategy options are such that each DC gets 2
replicas. If you only write to the nodes in DC1 what is the path the
replicas take? Assuming you've correctly interleaved the tokens of all the
nodes in DC1, DC2, and DC3?
Cheers,
Alex
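A simplified sketch of how NetworkTopologyStrategy places replicas may clarify the question: the coordinator walks the ring clockwise from the key's token and takes the first N distinct nodes in each DC, regardless of which DC received the write. This ignores rack awareness and uses the interleaved node ordering discussed in this thread; it is an illustration, not the actual Cassandra code:

```python
# Simplified NetworkTopologyStrategy sketch (racks ignored): walk the
# ring clockwise from the key's token and take the first `rf_per_dc`
# distinct nodes in each DC. Ring order is the interleaved layout from
# this thread: [1, 4, 7, 2, 5, 8, 3, 6, 9].
RING = [(1, "DC1"), (4, "DC2"), (7, "DC3"),
        (2, "DC1"), (5, "DC2"), (8, "DC3"),
        (3, "DC1"), (6, "DC2"), (9, "DC3")]

def replicas(token_index, rf_per_dc=2):
    picked, counts = [], {}
    n = len(RING)
    for step in range(n):
        node, dc = RING[(token_index + step) % n]
        if counts.get(dc, 0) < rf_per_dc:
            picked.append(node)
            counts[dc] = counts.get(dc, 0) + 1
    return picked

print(replicas(0))  # 2 replicas land in each DC, wherever the write entered
```

So a write arriving in DC1 is still forwarded to the chosen replicas in DC2 and DC3; the ring position, not the coordinator, decides placement.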
On Thu, Mar 15, 2012 at 11:26 PM, Alexandru Sicoe wrote:
> Sorry for that last message, I was confused because I thought I needed to
> use the DseSimpleSnitch but of course I can use the PropertyFileSnitch and
> that allows me to get the configuration with 3 data cen
Sorry for that last message, I was confused because I thought I needed to
use the DseSimpleSnitch but of course I can use the PropertyFileSnitch and
that allows me to get the configuration with 3 data centers explained.
Cheers,
Alex
On Thu, Mar 15, 2012 at 10:56 AM, Alexandru Sicoe wrote
> for m/r jobs is ONE, which would work.
>
> As far as tokens go, interleaving all three DCs and evenly spacing the
> tokens will work. For example, the ordering of your nodes might be [1, 4,
> 7, 2, 5, 8, 3, 6, 9].
>
>
> On Wed, Mar 14, 2012 at 12:05 PM, Alexandru Sicoe wrot
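The token assignment suggested above can be generated mechanically. A sketch, assuming the RandomPartitioner's 2**127 token space and the node naming used here (both assumptions for illustration):

```python
# Sketch (assumed RandomPartitioner, 2**127 token space): evenly space
# all 9 tokens and assign ring positions round-robin across the DCs, so
# the ring alternates DC1, DC2, DC3 as suggested above.
TOKEN_SPACE = 2 ** 127

def interleaved_tokens(dcs=3, nodes_per_dc=3):
    total = dcs * nodes_per_dc
    tokens = {}
    for i in range(total):
        dc = i % dcs + 1      # ring position i belongs to DC (i mod dcs) + 1
        node = i // dcs + 1   # node number within that DC
        tokens[f"DC{dc}-node{node}"] = i * TOKEN_SPACE // total
    return tokens

for name, tok in interleaved_tokens().items():
    print(name, tok)
```

With nodes numbered 1-3 in DC1, 4-6 in DC2, and 7-9 in DC3, walking these tokens in order reproduces the [1, 4, 7, 2, 5, 8, 3, 6, 9] layout.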
Hi everyone,
I want to test out the Datastax Enterprise software to have a mixed
workload setup with an analytics and a real time part.
However I am not sure how to configure it to achieve what I want: I will
have 3 real machines on one side of a gateway (1,2,3) and 6 VMs on
another (4,5,6).
1,2
…Since I will want to do a semi-real-time replication of just the latest
data added, this won't work because I will be copying over all the data in
the CF.
Cheers,
A
>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
Hello everyone,
I'm battling with this constraint that I have: I need to regularly ship
timeseries data out of a Cassandra cluster that sits within an enclosed
network, to a destination outside of that network.
I tried selecting all the data within a certain time window, writing it to a
file, and then copying the file out.
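One way to avoid re-exporting everything (a hypothetical sketch, not something proposed in the thread) is to keep a watermark of the newest timestamp already shipped and export only rows beyond it on each run:

```python
# Hypothetical incremental-export sketch (names are mine): keep a
# watermark of the newest timestamp already shipped and export only
# rows beyond it, instead of recopying the whole CF each time.
def export_increment(rows, watermark):
    # rows: iterable of (timestamp, payload). Returns (batch, new_watermark).
    batch = sorted((ts, p) for ts, p in rows if ts > watermark)
    new_watermark = batch[-1][0] if batch else watermark
    return batch, new_watermark

rows = [(10, "a"), (20, "b"), (30, "c")]
first, wm = export_increment(rows, 0)     # first run ships everything
rows.append((40, "d"))
second, wm = export_increment(rows, wm)   # later runs ship only new rows
print(first, second, wm)
```

This assumes timestamps are monotonic enough per source; late-arriving data would need an overlap window added to the cutoff.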
Hi Aaron and Martin,
Sorry about my previous reply, I thought you wanted to process only the
row keys in the CF.
I have a similar issue as Martin because I see myself being forced to hit
more than a million rows with a query (I only get a few columns from every
row). Aaron, we've talked about this
Hey Martin,
Have you tried CQL query: "SELECT FIRST 0 * FROM cfName" ?
Cheers,
Alex
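Besides the CQL trick above, the usual way to walk every key is paged range queries with a cursor. The sketch below uses a sorted list as a stand-in for the CF's keys; real code would issue get_range_slices (or CQL) with the same cursor logic, remembering that the range start is inclusive:

```python
import bisect

# Generic key-paging sketch: a sorted list stands in for the CF's keys.
def next_page(sorted_keys, start_key, page_size):
    # Mimics a range query: up to page_size keys >= start_key (inclusive).
    i = bisect.bisect_left(sorted_keys, start_key)
    return sorted_keys[i:i + page_size]

def all_keys(sorted_keys, page_size=100):
    result, start = [], ""
    while True:
        page = next_page(sorted_keys, start, page_size)
        # The range start is inclusive, so drop the key we already have.
        if result and page and page[0] == result[-1]:
            page = page[1:]
        if not page:
            break
        result.extend(page)
        start = result[-1]
    return result

print(all_keys([f"key{i:03d}" for i in range(10)], page_size=4))
```

Dropping the duplicated first key of each page is the step people most often forget with inclusive range starts.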
On Mon, Feb 13, 2012 at 11:00 PM, Martin Arrowsmith <
arrowsmith.mar...@gmail.com> wrote:
> Hi Experts,
>
> My program is such that it queries all keys on Cassandra. I want to do
> this as quick as possible, in o
Hi Maxim,
Why do you need to know this?
Cheers,
Alex
On Sat, Jan 7, 2012 at 10:03 AM, aaron morton wrote:
>
> http://www.datastax.com/docs/1.0/operations/tuning#tuning-compaction
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 7/
> Ok, thanks for these suggestions, I will have to investigate further.
> Also considering talking to Data Stax about DSE.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/01/2012, at 1:
can back up and empty the node in DC2 before the TTLs expire in the other 2
nodes.
Cheers,
Alex
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/01/2012, at 11:41 PM, Alexandru Sicoe wrote:
>
> Hi,
> …want to get the data out for an off-node backup, or is it for
> processing in another system ?
>
> You may get by using:
>
> * TTL to expire data via compaction
> * snapshots for backups
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
Hi everyone and Happy New Year!
I need advice for organizing data flow outside of my 3 node Cassandra 0.8.6
cluster. I am configuring my keyspace to use the NetworkTopologyStrategy. I
have 2 data centers, each with a replication factor of 1 (i.e. DC1:1; DC2:1) in
the configuration of the PropertyFileSnitch
Hi everyone,
I am currently in the process of writing a hardware proposal for a Cassandra
cluster for storing a lot of monitoring time series data. My workload is
write intensive and my data set is extremely varied in types of variables
and insertion rate for these variables (I will have to handle
Perfectly right. Sorry for not paying attention!
Thanks Eric,
Alex
On Tue, Oct 4, 2011 at 4:19 AM, Eric Evans wrote:
> On Mon, Oct 3, 2011 at 12:02 PM, Alexandru Sicoe
> wrote:
> > Hi,
> > I am using Cassandra 0.8.5, Hector 0.8.0-2 and cqlsh (cql 1.0.3). If I
> > def
Hi,
I am using Cassandra 0.8.5, Hector 0.8.0-2 and cqlsh (cql 1.0.3). If I
define a CF with comparator LongType like this:
BasicColumnFamilyDefinition columnFamilyDefinition =
    new BasicColumnFamilyDefinition();
columnFamilyDefinition.setKeyspaceName("XXX");
columnFamilyDefinition.setComparatorType(ComparatorType.LONGTYPE);