All very good news!

I must correct my mistake.  I wrote RAID/0.  This should be RAID/1.  We're 
after mirroring with no stripes or parity.

As to the disk space requirements, we'll have to do some calculations.  In 
the simplest case we have a single VM running a single instance of an 
operating system.  The baseline for this is around 12GB.  Right off the bat 
we can complicate matters by not using pre-allocated storage.  I tend to 
value sanity over space optimization, so I just pre-allocate 12GB per VM.  
We then need some data storage.  Will the source tree be shared via a 
distributed filesystem?  Or cloned from a repo?  We'll need a build area to 
hold intermediate objects during builds.  We may need a test region for 
regressions.  I leave those estimates to the SAGE experts.  The important 
point is that shared areas count only once, whereas clones count once per 
VM; a rough sketch follows.
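
To make that accounting concrete, here is a back-of-the-envelope sketch in 
Python.  The instance count and the data sizes are pure placeholders until 
the SAGE folks supply real numbers:

    # Back-of-the-envelope storage sizing.  All figures are placeholders.
    OS_GB_PER_VM   = 12   # pre-allocated OS image per VM
    N_VMS          = 6    # hypothetical number of build VMs
    DATA_GB_PER_VM = 20   # source clone + build + test area per VM (a guess)
    SHARED_DATA_GB = 20   # the same area served once over NFS instead

    cloned = N_VMS * (OS_GB_PER_VM + DATA_GB_PER_VM)  # clones count per VM
    shared = N_VMS * OS_GB_PER_VM + SHARED_DATA_GB    # shared areas count once

    print("cloned layout: %d GB" % cloned)   # 6 * (12 + 20) = 192 GB
    print("shared layout: %d GB" % shared)   # 6 * 12 + 20   =  92 GB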

Now for the topic of snapshots.  VM snapshots are a very good thing.  In 
the simplest scenario, which I recommend, we use QEMU-style disk files on 
the master node as LUNs for the VMs.  Taking a snapshot initiates a 
copy-on-write policy for all filesystem blocks on the QEMU LUNs.  In the 
limiting case, the snapshot will grow to the same size as the source LUN, 
so to be perfectly safe we would have to double our storage spec.  I don't 
tend to do this, though.  We can just monitor the snapshots and alert at a 
threshold value.
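
A minimal monitoring sketch, assuming qcow2 overlay files under 
/var/lib/libvirt/images and an 80% alert threshold (the path, the threshold, 
and the JSON field names are my assumptions; adjust to the real layout):

    # Hedged sketch: warn when a qcow2 snapshot/overlay nears its virtual size.
    # Assumes qemu-img is on PATH; the glob and threshold are placeholders.
    import glob
    import json
    import subprocess

    THRESHOLD = 0.80

    for path in glob.glob("/var/lib/libvirt/images/*.qcow2"):
        out  = subprocess.check_output(["qemu-img", "info", "--output=json", path])
        info = json.loads(out)
        used = info["actual-size"]    # bytes actually allocated on the host
        cap  = info["virtual-size"]   # size the guest sees
        if cap and float(used) / cap >= THRESHOLD:
            print("WARNING: %s is at %.0f%% of its virtual size"
                  % (path, 100.0 * used / cap))

Run that from cron every hour or so and we have our alert.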

So let's go to the board.  (12GB * N) + (xGB * N) for N instances of size 
x, or N(12 + x)GB.  If the source trees and build areas are on distributed 
filesystems such as NFS, we are closer to (N * 12GB) + xGB.  Also, we can 
get away without backing up build & test regions (they're scratch paper) if 
that proves economically desirable.

So we use one of these formulae to find our size S, and then multiply by 
1.6 for snapshots.
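
In code (N = 6 and x = 20GB are the same illustrative guesses as in the 
sketch above, not real measurements):

    # Find S with one of the formulae, then apply the 1.6 snapshot factor.
    def size_cloned_gb(n, x):
        return n * (12 + x)        # per-VM clones: N(12 + x)

    def size_shared_gb(n, x):
        return n * 12 + x          # source/build areas shared over NFS

    for label, s in (("cloned", size_cloned_gb(6, 20)),
                     ("shared", size_shared_gb(6, 20))):
        print("%s: S = %d GB, with snapshot headroom = %.0f GB"
              % (label, s, s * 1.6))

    # cloned: S = 192 GB, with snapshot headroom = 307 GB
    # shared: S =  92 GB, with snapshot headroom = 147 GB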

Man, I hope I didn't mess up the factoring or arithmetic there.  This would 
not be a sympathetic crowd...

--Matt

On Wednesday, August 29, 2012 2:38:23 PM UTC-5, William wrote:
>
> On Wed, Aug 29, 2012 at 10:45 AM, Matthew Alton 
> <matthe...@gmail.com> 
> wrote: 
> > This is very powerful hardware.  It should do nicely. 
> > 
> > Here is my detailed assessment. 
> > 
> > Recoverability: 
> > 
> > The PDF specifies a single 1TB HDD connected to a non-redundant internal 
> > controller. 
>
> Sorry -- that was just the default when I visited the page.   I will 
> definitely want to change it for this project.... 
>
> >  The operation of the system is thus dependent upon the status 
> > of a single disk drive.  This is fine for a development environment as long 
> > as we don't mind being down for the time required to restore from backup. 
> > We must also be prepared to lose all changes to the system made between 
> > point-of-backup and point-of-failure.  If we're running a weekly full backup 
> > to tape, for instance, we are exposed to a potential loss of 7 days' work. 
> > This situation can be mitigated to a large extent by the addition of a 
> > redundant disk drive.  This would require the purchase of another HDD 
> > identical to the first one in conjunction with the use of drive mirroring 
> > capabilities (RAID/0).  Drive mirroring can be done in hardware by the HDD 
> > controller or in software by the OS LVM.  Hardware is more reliable and 
> > performant -- and correspondingly more expensive.  Your call, sir. 
>
> I'll get something with at least RAID/0. 
>
> Can you estimate the total backed up disk space we will need? 
>
> > We must formulate a recovery strategy.  Is there an existing solution 
> > on-site?  NetBackup?  Bacula?  Amanda?  If not, will we recover the system 
> > in its entirety from a boot image or will we re-install and reconfigure? 
> > The virtual images can be backed up via snapshots and archived.  Is there a 
> > data archival method in place?  Tape?  ETL?  EDL? 
>
> The university now offers a networked backup strategy that is not 
> insanely priced, but I'll need a disk space estimate to see if it is 
> viable or not. 
>
> > 
> > System Availability: 
> > 
> > Will the machine be housed in a data center with redundant power sources and 
> > line conditioners? 
>
> Yes. 
>
> >  The specified power supply is non-redundant.  If this 
> > box is living in a non-ventilated wire closet it's vulnerable to a single 
> > power surge.  We should have a replacement power supply on hand at a 
> > minimum.  A redundant power option is much better. 
>
> I can get that. 
>
> > 
> > Performance: 
> > 
> > The CPU and memory are easily sufficient to the intended deployment with 
> > sufficient headroom for expansion. 
> > 
> > 
> > I hate to be a pain about all of this, but if you've ever seen the look of 
> > horror and grief on the face of someone who is just realizing that their 
> > important data is irretrievably lost, I'm certain you would understand. 
>
> I'm happy to make this machine more highly robust to failure, since I 
> will have sufficient funds. 
>
> > I'm not familiar with the iDRAC6, but I always tend to err on the side of 
> > over-specification.  Get the big one.  It will probably help with frame 
> > flicker. 
> > 
> > Let me know your decisions and we'll firm up the HDD config.  We'll need to 
> > make a detailed network topology plan as well.  And other things. 
>
>
>
> > 
> > --Matt 
> > 
> > On Wednesday, August 29, 2012 11:50:09 AM UTC-5, William wrote: 
> >> 
> >> On Wed, Aug 29, 2012 at 9:39 AM, Matthew Alton <matthe...@gmail.com> 
> >> wrote: 
> >> > Excellent!  I am very happy to be of help.  I have an IBM xSeries AMD64 
> >> > server at home.  If you will tell me your choice of platform and 
> >> > virtualization technology for this project I can begin working this 
> >> > weekend. 
> >> 
> >> Nice.  See the attached PDF for what I'm currently planning to buy (I 
> >> can get it cheaper through my university than the price listed there). 
> >>  It's a 1U rack mounted 16-core 3Ghz AMD64 box, with 64GB of ECC RAM. 
> >>  Going 1U is very advantageous for me regarding longterm hosting costs 
> >> (e.g., it costs half as much as 2U). 
> >> 
> >> > If we coordinate our efforts well, we should be able to transfer my 
> >> > virtual machine images and definitions to your new hardware and then 
> >> > just fire them up.  I recommend CentOS/KVM on technical merit, but you 
> >> > must of course weigh many non-technical and site-specific factors into 
> >> > your ultimate choice.  I am highly flexible. 
> >> 
> >> I personally use VirtualBox for virtualization for my projects, and 
> >> Ubuntu as my main host OS.  However, the primary reason I use 
> >> VirtualBox is that I also have several OS X machines, and I want to be 
> >> able to share identical virtual machines on both Linux and OS X hosts. 
> >>   I think KVM would make a lot of sense for *this* project, since it's 
> >> the best at running Linux machines, and that's precisely what we're 
> >> planning to do.  Also, if you're doing the admin work, I'm fine with 
> >> CentOS as well.    Note that I'm planning to get an iDRAC6 Enterprise 
> >> management card, so you can easily get console access (with video) to 
> >> the computer when necessary.  Should I get the extra 8GB SD card 
> >> version of the iDRAC6?  (I can't see why, but maybe you would find it 
> >> useful for running VM's, since that's what it's evidently designed 
> >> for...) 
> >> 
> >> How should the hard drives be configured? 
> >> 
> >>  -- William 
> >> 
> >> > 
> >> > --Matt 
> >> > 
> >> > On Wednesday, August 29, 2012 10:12:03 AM UTC-5, William wrote: 
> >> >> 
> >> >> On Wed, Aug 29, 2012 at 7:30 AM, Matthew Alton <matthe...@gmail.com> 
> >> >> wrote: 
> >> >> > Hello, 
> >> >> > 
> >> >> >   I am a professional Unix sys admin.  My current title is Senior 
> >> >> > Systems 
> >> >> > Administrator at Magellan Health Services.  I have been working in 
> >> >> > Unix 
> >> >> > and 
> >> >> > Linux system administration for 23 years.  I currently work with a 
> >> >> > small 
> >> >> > team on the care and feeding of a few hundred AIX LPARs and RHEL 
> >> >> > VMware 
> >> >> > nodes.  Back in 2001 I was able to get Python onto the list of 
> >> >> > approved 
> >> >> > programming languages at Anheuser-Busch.  My colleague and I wrote 
> >> >> > several 
> >> >> > hundred thousand lines of Python code which are still in production on 
> >> >> > Solaris, AIX and Linux machines there.  I am also a master C 
> >> >> > programmer. 
> >> >> > I 
> >> >> > specialize in systems programming and software porting.  SAGE is a 
> >> >> > huge 
> >> >> > help 
> >> >> > to me in my studies at Saint Louis Community College where I am 
> >> >> > currently 
> >> >> > re-taking Calculus 2 after 25 years. 
> >> >> > 
> >> >> >   If you can provide the hardware and bandwidth for a build farm of 
> >> >> > VMs, I can help with the porting, testing, packaging, and 
> >> >> > documentation of SAGE.  I would recommend using a Linux machine as 
> >> >> > the VM host with KVM as the virtualization technology.  I can help 
> >> >> > with designing and building such a machine.  I would be happy to help 
> >> >> > with any other technology as well, though I might be less familiar 
> >> >> > with it.  I've worked with KVM, Xen, BSD Jails, AIX LPARs, Solaris 
> >> >> > Domains, VMware, VirtualBox, and several experimental systems.  As 
> >> >> > long as you're happy with purely simulated hardware devices KVM is an 
> >> >> > excellent choice. 
> >> >> > 
> >> >> >   We might start with RHEL/Fedora/CentOS/Scientific Linux (RPM), 
> >> >> > SUSE (RPM), Ubuntu/Debian (APT), FreeBSD (PKG), Gentoo and others.  
> >> >> > Once the farm is built the job of maintaining the ports and packages 
> >> >> > on each node could be shared.  We should assume the role of official 
> >> >> > package maintainer for SAGE with each of the OSes & distros. 
> >> >> > 
> >> >> >   I might be able to recruit a colleague to assist in the effort, 
> >> >> > though he is currently enrolled in Real Analysis at UMSL.  His time 
> >> >> > is severely limited. 
> >> >> > 
> >> >> >   Peace. 
> >> >> > 
> >> >> > --Matt 
> >> >> > 
> >> >> 
> >> >> Awesome -- the above is enough to convince me to plan to move forward 
> >> >> with this.  I will plan to have such a machine setup to be configured 
> >> >> in less than 2 months.  I want to wait to try out a similar machine, 
> >> >> which I ordered a week ago, but hasn't arrived yet. 
> >> >> 
> >> >>  -- William 
> >> > 
> >> 
> >> 
> >> 
> >> -- 
> >> William Stein 
> >> Professor of Mathematics 
> >> University of Washington 
> >> http://wstein.org 
> > 
>
>
>
> -- 
> William Stein 
> Professor of Mathematics 
> University of Washington 
> http://wstein.org 
>


