you have 86400 seconds a day so 42T could take less than 12 hours on 10Gb link 19 lut 2013 02:01, "Hiller, Dean" <dean.hil...@nrel.gov> napisał(a):
> I thought about this more, and even with a 10Gbit network, it would take > 40 days to bring up a replacement node if mongodb did truly have a 42T / > node like I had heard. I wrote the below email to the person I heard this > from going back to basics which really puts some perspective on it….(and a > lot of people don't even have a 10Gbit network like we do) > > Nodes are hooked up by a 10G network at most right now where that is > 10gigabit. We are talking about 10Terabytes on disk per node recently. > > Google "10 gigabit in gigabytes" gives me 1.25 gigabytes/second (yes I > could have divided by 8 in my head but eh…course when I saw the number, I > went duh) > > So trying to transfer 10 Terabytes or 10,000 Gigabytes to a node that we > are bringing online to replace a dead node would take approximately 5 > days??? > > This means no one else is using the bandwidth too ;). 10,000Gigabytes * 1 > second/1.25 * 1hr/60secs * 1 day / 24 hrs = 5.555555 days. This is more > likely 11 days if we only use 50% of the network. > > So bringing a new node up to speed is more like 11 days once it is > crashed. I think this is the main reason the 1Terabyte exists to begin > with, right? > > From an ops perspective, this could sound like a nightmare scenario of > waiting 10 days…..maybe it is livable though. Either way, I thought it > would be good to share the numbers. ALSO, that is assuming the bus with > it's 10 disk can keep up with 10G???? Can it? What is the limit of > throughput on a bus / second on the computers we have as on wikipedia there > is a huge variance? > > What is the rate of the disks too (multiplied by 10 of course)? Will they > keep up with a 10G rate for bringing a new node online? > > This all comes into play even more so when you want to double the size of > your cluster of course as all nodes have to transfer half of what they have > to all the new nodes that come online(cassandra actually has a very data > center/rack aware topology to transfer data correctly to not use up all > bandwidth unecessarily…I am not sure mongodb has that). Anyways, just food > for thought. > > From: aaron morton <aa...@thelastpickle.com<mailto:aa...@thelastpickle.com > >> > Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" < > user@cassandra.apache.org<mailto:user@cassandra.apache.org>> > Date: Monday, February 18, 2013 1:39 PM > To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" < > user@cassandra.apache.org<mailto:user@cassandra.apache.org>>, Vegard > Berget <p...@fantasista.no<mailto:p...@fantasista.no>> > Subject: Re: cassandra vs. mongodb quick question > > My experience is repair of 300GB compressed data takes longer than 300GB > of uncompressed, but I cannot point to an exact number. Calculating the > differences is mostly CPU bound and works on the non compressed data. > > Streaming uses compression (after uncompressing the on disk data). > > So if you have 300GB of compressed data, take a look at how long repair > takes and see if you are comfortable with that. You may also want to test > replacing a node so you can get the procedure documented and understand how > long it takes. > > The idea of the soft 300GB to 500GB limit cam about because of a number of > cases where people had 1 TB on a single node and they were surprised it > took days to repair or replace. If you know how long things may take, and > that fits in your operations then go with it. > > Cheers > > ----------------- > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 18/02/2013, at 10:08 PM, Vegard Berget <p...@fantasista.no<mailto: > p...@fantasista.no>> wrote: > > > > Just out of curiosity : > > When using compression, does this affect this one way or another? Is 300G > (compressed) SSTable size, or total size of data? > > .vegard, > > ----- Original Message ----- > From: > user@cassandra.apache.org<mailto:user@cassandra.apache.org> > > To: > <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> > Cc: > > Sent: > Mon, 18 Feb 2013 08:41:25 +1300 > Subject: > Re: cassandra vs. mongodb quick question > > > If you have spinning disk and 1G networking and no virtual nodes, I would > still say 300G to 500G is a soft limit. > > If you are using virtual nodes, SSD, JBOD disk configuration or faster > networking you may go higher. > > The limiting factors are the time it take to repair, the time it takes to > replace a node, the memory considerations for 100's of millions of rows. If > you the performance of those operations is acceptable to you, then go crazy. > > Cheers > > ----------------- > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com<http://www.thelastpickle.com/> > > On 16/02/2013, at 9:05 AM, "Hiller, Dean" <dean.hil...@nrel.gov<mailto: > dean.hil...@nrel.gov>> wrote: > > So I found out mongodb varies their node size from 1T to 42T per node > depending on the profile. So if I was going to be writing a lot but rarely > changing rows, could I also use cassandra with a per node size of +20T or > is that not advisable? > > Thanks, > Dean > > >