All nodes in the cluster need two-way communication. Nodes need to 
gossip with each other so they know which nodes are alive. 
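
As a rough illustration of what "two-way" means here, the sketch below 
checks whether a TCP connection can be opened to a node's storage port. 
It assumes the default storage_port of 7000 from cassandra.yaml; the 
function name and the host you pass in are hypothetical. It only tests 
one direction, from the machine it runs on.

```python
import socket

def can_connect(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    port=7000 assumes the default storage_port in cassandra.yaml;
    adjust it if your cluster uses a different value.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False
```

Run it from a node inside the network against the outside node, and 
again from the outside node against an inside node. With an 
outbound-only firewall the second check would be expected to fail, 
which is why a node outside cannot take part in gossip.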

If you need to dump a lot of data, consider the Hadoop integration 
(http://wiki.apache.org/cassandra/HadoopSupport). It can run a bit faster 
than going through the Thrift API.

Copying sstables may be another option depending on the data size. 
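
A minimal sketch of the sstable route, assuming you first take a 
snapshot with `nodetool snapshot` so the files are stable while you 
copy them. The function name and both directory arguments are 
hypothetical; it simply copies every file in a snapshot directory to a 
staging directory, keeping all components of each sstable (Data, Index, 
Filter, Statistics) together.

```python
import shutil
from pathlib import Path

def stage_sstables(snapshot_dir, staging_dir):
    """Copy every regular file from snapshot_dir into staging_dir.

    Both paths are placeholders for illustration. Returns the sorted
    list of file names that were copied.
    """
    src = Path(snapshot_dir)
    dst = Path(staging_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for f in sorted(src.iterdir()):
        if f.is_file():
            # copy2 also preserves modification times.
            shutil.copy2(f, dst / f.name)
            copied.append(f.name)
    return copied
```

The staging directory can then be pushed out with whatever outbound 
transfer is allowed (rsync, scp and so on), since that only needs a 
connection initiated from inside the network.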

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/02/2012, at 3:21 AM, Alexandru Sicoe wrote:

> Hello everyone,
> 
> I'm battling with a constraint: I need to regularly ship time series 
> data from a Cassandra cluster that sits within an enclosed network to a 
> destination outside of that network. 
> 
> I tried selecting all the data within a certain time window, writing it to 
> a file, and then copying the file out, but this hurts I/O performance because 
> even for a small time window (say 5 minutes) I am hitting more than a million 
> rows. 
> 
> It would really help if I could use Cassandra to replicate the data 
> automatically to the outside. The problem is that they will only allow 
> outbound traffic from the enclosed network (not inbound). Is there any way to configure the 
> cluster or have 2 data centers in such a way that the data center (node or 
> cluster) outside of the enclosed network only gets a replica of the data, 
> without ever needing to communicate anything back?
> 
> I appreciate the help,
> Alex
