We run on both ephemeral and persistent storage on AWS. Ephemeral storage is the local storage attached to the server host. We don't have extreme read and write loads, so EBS is fine.
If you ever shut down (stop) the EC2 instance, the data on ephemeral storage is gone, because a stopped instance is normally started again on a different host. Otherwise, you can be about 99% certain your data is safe across a soft reboot, UNTIL you mess up your OS. In that case, your system is down even though the instance has 100% uptime, and you are pretty much out of luck because you can no longer access your ephemeral storage. (You would have to contact AWS support to see if they can help you recover that data, but my bet is they would be hesitant to do that for security reasons.)

We run backups hourly by taking a snapshot and using rsync to copy the files to a separate server. Since snapshots are hard links, we use this to our benefit: if snapshot 1 and snapshot 2 differ by only one file, we transfer only that one file, while the other N files are shared via hard links.

As with every technology, whether it's EBS or local storage, data corruption is a risk you cannot rule out, but it is very rare for a customer to hit EBS corruption. There are a number of backup tools available that you can look up.

On Tue, May 23, 2017 at 6:55 PM, Gopal, Dhruva <dhruva.go...@aspect.com> wrote:

> By that do you mean it’s like bootstrapping a node if it fails or is
> shutdown and with a RF that is 2 or higher, data will get replicated when
> it’s brought up?
>
> *From: *Cogumelos Maravilha <cogumelosmaravi...@sapo.pt>
> *Date: *Tuesday, May 23, 2017 at 1:52 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: EC2 instance recommendations
>
> Yes we can only reboot.
>
> But using rf=2 or higher it's only a node fresh restart.
>
> EBS is a network attached disk. Spinning disk or SSD is almost the same.
>
> It's better take the "risk" and use type i instances.
>
> Cheers.
>
> On 23-05-2017 21:39, sfesc...@gmail.com wrote:
>
> I think this is overstating it. If the instance ever stops you'll lose the
> data.
> That means if the server crashes for example, or if Amazon decides
> your instance requires maintenance.
>
> On Tue, May 23, 2017 at 10:30 AM Gopal, Dhruva <dhruva.go...@aspect.com>
> wrote:
>
> Thanks! So, I assume that as long we make sure we never explicitly
> “shutdown” the instance, we are good. Are you also saying we won’t be able
> to snapshot a directory with ephemeral storage and that is why EBS is
> better? We’re just finding that to get a reasonable amount of IOPS (gp2)
> out of EBS at a reasonable rate, it gets more expensive than an I3.
>
> *From: *Jonathan Haddad <j...@jonhaddad.com>
> *Date: *Tuesday, May 23, 2017 at 9:42 AM
> *To: *"Gopal, Dhruva" <dhruva.go...@aspect.com> <dhruva.go...@aspect.com>,
> Matija Gobec <matija0...@gmail.com>, Bhuvan Rawal <bhu1ra...@gmail.com>
> *Cc: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: EC2 instance recommendations
>
> > Oh, so all the data is lost if the instance is shutdown or restarted
> > (for that instance)?
>
> When you restart the OS, you're technically not shutting down the
> instance. As long as the instance isn't stopped / terminated, your data is
> fine. I ran my databases on ephemeral storage for years without issue. In
> general, ephemeral storage is going to give you lower latency since there's
> no network overhead. EBS is generally cheaper than ephemeral, is
> persistent, and you can take snapshots easily.
>
> On Tue, May 23, 2017 at 9:35 AM Gopal, Dhruva <dhruva.go...@aspect.com>
> wrote:
>
> Oh, so all the data is lost if the instance is shutdown or restarted (for
> that instance)? If we take a naïve approach to backing up the directory,
> and restoring it, if we ever have to bring down the instance and back up,
> will that work as a strategy? Data is only kept around for 2 days and is
> TTL’d after.
>
> *From: *Matija Gobec <matija0...@gmail.com>
> *Date: *Tuesday, May 23, 2017 at 8:15 AM
> *To: *Bhuvan Rawal <bhu1ra...@gmail.com>
> *Cc: *"Gopal, Dhruva" <dhruva.go...@aspect.com> <dhruva.go...@aspect.com>,
> "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: EC2 instance recommendations
>
> We are running on I3s since they came out. NVMe SSDs are really fast and I
> managed to push them to 75k IOPs.
>
> As Bhuvan mentioned the i3 storage is ephemeral. If you can work around it
> and plan for failure recovery you are good to go.
>
> I ran Cassandra on m4s before and had no problems with EBS volumes (gp2)
> even in low latency use cases. With the cost of M4 instances and EBS
> volumes that make sense in IOPs, I would recommend going with more i3s and
> working around the ephemeral issue (if its an issue).
>
> Best,
>
> Matija
>
> On Tue, May 23, 2017 at 2:13 AM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>
> i3 instances will undoubtedly give you more meat for buck - easily 40K+
> iops whereas on the other hand EBS maxes out at 20K PIOPS which is highly
> expensive (at times they can cost you significantly more than cost of
> instance).
>
> But they have ephemeral local storage and data is lost once instance is
> stopped, you need to be prudent in case of i series, it is generally used
> for large persistent caches.
>
> Regards,
>
> Bhuvan
>
> On Tue, May 23, 2017 at 4:55 AM, Gopal, Dhruva <dhruva.go...@aspect.com>
> wrote:
>
> Hi –
>
> We’ve been running M4.2xlarge EC2 instances with 2-3 TB of storage and
> have been comparing this to I-3.2xlarge, which seems more cost effective
> when dealing with this amount of storage and from an IOPS perspective. Does
> anyone have any recommendations/ on the I-3s and how it performs overall,
> compared to the M4 equivalent? On the surface, without us having taken it
> through its paces performance-wise, it does seem to be pretty powerful.
> We just ran through an exercise with a RAIDed 200 TB volume (as opposed to
> a non RAIDed 3 TB volume) and were seeing a 20-30% improvement with the
> RAIDed setup, on a 6 node Cassandra ring. Just looking for any
> feedback/experience folks may have had with the I-3s.
>
> Regards,
>
> *DHRUVA GOPAL*
> *sr. MANAGER, ENGINEERING*
> *REPORTING, ANALYTICS AND BIG DATA*
> *+1 408.325.2011* *WORK*
> *+1 408.219.1094* *MOBILE*
> *UNITED STATES*
> *dhruva.go...@aspect.com*
> *aspect.com <http://www.aspect.com/>*
>
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.
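
P.S. To make the hard-link point from the top of this message concrete, here is a minimal Python sketch. It is not our actual backup script: the directory layout, the file names and the helper `files_to_transfer` are all made up for illustration. It only shows why two snapshots built from hard links share most of their files, so that an incremental copy has to move just the new ones.

```python
import os
import tempfile


def files_to_transfer(prev_snapshot, new_snapshot):
    """Return names in new_snapshot whose inode is not present in prev_snapshot.

    Hypothetical helper: files that are hard links to the same inode in both
    snapshots are treated as already backed up.
    """
    known = set()
    for name in os.listdir(prev_snapshot):
        st = os.stat(os.path.join(prev_snapshot, name))
        known.add((st.st_dev, st.st_ino))

    changed = []
    for name in sorted(os.listdir(new_snapshot)):
        st = os.stat(os.path.join(new_snapshot, name))
        if (st.st_dev, st.st_ino) not in known:
            changed.append(name)
    return changed


def take_snapshot(data_dir, snapshot_dir):
    """Snapshot a directory by hard-linking every file (no data is copied)."""
    os.mkdir(snapshot_dir)
    for name in os.listdir(data_dir):
        os.link(os.path.join(data_dir, name), os.path.join(snapshot_dir, name))


def demo():
    with tempfile.TemporaryDirectory() as root:
        data = os.path.join(root, "data")
        os.mkdir(data)

        # Three immutable data files exist when the first snapshot is taken.
        for name in ("a.db", "b.db", "c.db"):
            with open(os.path.join(data, name), "w") as f:
                f.write(name)
        snap1 = os.path.join(root, "snapshot1")
        take_snapshot(data, snap1)

        # One new file is written before the second snapshot.
        with open(os.path.join(data, "d.db"), "w") as f:
            f.write("d.db")
        snap2 = os.path.join(root, "snapshot2")
        take_snapshot(data, snap2)

        return files_to_transfer(snap1, snap2)


if __name__ == "__main__":
    print(demo())
```

The sketch assumes a filesystem that supports hard links and data files that are never rewritten in place. With rsync itself, the `--link-dest` and `-H` options are the usual starting points for this kind of incremental copy, but check the man page for your version before relying on them.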