Hi Brian,

Good input, thanks for the clarifications!

Please remember that Nova also supports the OpenStack API, not just
the EC2 API, so we need a solution that covers more than the EC2 API.
Nothing wrong with your solution, of course! Just want to have
parity on the OpenStack API side too :)

-jay

On Fri, Feb 11, 2011 at 1:24 PM, Brian Schott <bfsch...@gmail.com> wrote:
> In the EC2 API, the field is:
> RunningInstancesSetType
> Field: instanceType, Type: xs:string, The instance type (e.g., m1.small).
> or InstanceType in the RunInstances Action.
> http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/
>
> I thought about the user_data field, but the problem is that so many 
> application frameworks use it to pass things like arbitrary shell scripts.  
> Both UserData and InstanceType are xs:string types in the XML schema; the 
> only difference is that UserData is required to be Base64-encoded MIME.
>
> Our team has verified that you can pass somewhat arbitrary strings using -t 
> with euca2ools and they appear unmolested in the create() function in 
> nova/compute/api.py.  We've already prototyped an "architecture" aware 
> scheduler using the instanceType field as the designator, so that -t 
> 'sh1.4xlarge' goes to our SGI UltraViolet machine and -t 'tp.8x8' goes to one 
> of the Tilera TileEmpower (tiled 64-core non-x86) boards.
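A minimal sketch of label-based routing like the prototype described above (the host names are made up for illustration; routing on the raw string is exactly the fragile part the rest of the thread tries to fix):

```python
# Hypothetical sketch: route special instance type labels to dedicated
# hardware, everything else to the default x86 pool.  Host names are
# invented; switching on the bare label is the "bad idea" noted below.
SPECIAL_HOSTS = {
    'sh1.4xlarge': 'sgi-ultraviolet-host',
    'tp.8x8': 'tilera-tileempower-host',
}

def pick_host(instance_type, default='x86-pool'):
    """Return the host pool for a given instance type label."""
    return SPECIAL_HOSTS.get(instance_type, default)
```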
>
> We've added a bunch of instance types to nova/compute/instance_types.py to 
> accommodate our target virtual machine types:
> - 'tp.8x8': dict(memory_mb=16384, vcpus=64, local_gb=500, flavorid=6),
> - 'sh1.4xlarge': dict(memory_mb=520192, vcpus=128, local_gb=500, flavorid=13)
>
> Right now, we're switching on the label, which I think is a bad idea.  I 
> need to capture this in a blueprint for Diablo, but what we're proposing is 
> to expand the instance_types dictionary to include additional fields and 
> possibly turn it into a fully editable table so that openstack deployments can 
> advertise additional capabilities as new instance_types beyond the Amazon 
> defaults.
>
> Examples:
> - 'tp.8x8': dict(memory_mb=16384, vcpus=64, local_gb=500, cpu_arch='tilemp', 
> cpu_geometry='8x8', net_gbps=5, flavorid=6), would advertise an 8x8 tile of 
> cores (the entire chip currently) on a TileraMP Pro with requested reserved 
> network bandwidth of 5 gbps.
> - 'p7.large+gpu': dict(memory_mb=32768, vcpus=8, local_gb=1024, cpu_arch='p7', 
> gpu_arch='fermi', gcpus=448, flavorid=7), would advertise a Power7 CPU with 
> GPU acceleration of 448 cores (like an nVidia c2050 board).
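As a rough sketch, the expanded dictionary could look like the following (the extra fields — cpu_arch, cpu_geometry, net_gbps, gpu_arch, gcpus — are the proposed additions, not anything nova ships today; the m1.small entry is only there as a baseline):

```python
# Hypothetical expansion of the nova/compute/instance_types.py
# dictionary.  The non-Amazon fields are proposed, not existing.
INSTANCE_TYPES = {
    'm1.small': dict(memory_mb=2048, vcpus=1, local_gb=20, flavorid=2),
    'tp.8x8': dict(memory_mb=16384, vcpus=64, local_gb=500,
                   cpu_arch='tilemp', cpu_geometry='8x8', net_gbps=5,
                   flavorid=6),
    'p7.large+gpu': dict(memory_mb=32768, vcpus=8, local_gb=1024,
                         cpu_arch='p7', gpu_arch='fermi', gcpus=448,
                         flavorid=7),
}

def instance_types_for_arch(cpu_arch):
    """Return the names of instance types advertising a CPU architecture."""
    return sorted(name for name, spec in INSTANCE_TYPES.items()
                  if spec.get('cpu_arch') == cpu_arch)
```

With a table like this a scheduler can match on explicit capability fields instead of parsing meaning out of the label.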
>
> What I like about Justin's proposal is that we can request to OVERRIDE these 
> default attributes or pass through additional specifications using the same 
> instance type field without breaking existing toolchains or even the 
> semantics of the InstanceType field.
>
> For example:
> euca-run-instances -n 10
>   -t "m1.small;net_gb=1,topology='cluster',local_gb=200,near_vol='vol0001'" 
> ...
> - is me asking for 10 small nodes with 1 gigabit network bandwidth, 200GB of 
> local disk per instance, and optimized for cluster computing (minimize hops) 
> preferably near my existing block storage volume.
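A sketch of how such an overloaded type string might be parsed (the `base;key=value,...` grammar is the proposal above, not anything euca2ools or nova implements):

```python
def parse_instance_type(type_str):
    """Split an overloaded -t string such as
    "m1.small;net_gb=1,topology='cluster',local_gb=200"
    into (base_type, overrides).  A plain type like "m1.small" passes
    through with no overrides, so existing toolchains keep working.
    """
    base, sep, extras = type_str.partition(';')
    overrides = {}
    if sep:
        for pair in extras.split(','):
            key, _, value = pair.partition('=')
            overrides[key.strip()] = value.strip().strip("'\"")
    return base, overrides
```

For the example above, `parse_instance_type("m1.small;net_gb=1,topology='cluster',local_gb=200,near_vol='vol0001'")` yields `('m1.small', {'net_gb': '1', 'topology': 'cluster', 'local_gb': '200', 'near_vol': 'vol0001'})`.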
>
> This is semantically consistent with what InstanceTypes are all about.  It 
> doesn't interfere with the default Amazon instance types, which are pretty 
> standard for better or worse, but it also doesn't lock us into only supporting 
> standard Amazon instance types.
>
> I threw a lot of concepts in this post.  There is a role for zones also, but 
> some of these scheduler hints don't really fit zones defined during 
> datacenter deployment.
>
> Justin, I'd be happy to work with you if you think your blueprint matches 
> above.  We're actually pretty close to having a bexar-branched prototype 
> working in nova-hpc branch.
>
> Brian Schott
> bfsch...@gmail.com
>
>
>
> On Feb 11, 2011, at 10:44 AM, Jay Pipes wrote:
>
>> On Thu, Feb 10, 2011 at 7:21 PM, Brian Schott <bfsch...@gmail.com> wrote:
>>> Justin,
>>>
>>> Our USC-ISI team is very interested in this.  We are implementing different 
>>> architecture types beyond x86_64.  We are also interested in suggesting 
>>> switch topology for MPI cluster jobs, passing in requests for GPU 
>>> accelerators, etc.  Currently, our approach has been to specify these 
>>> through instance_types. What you describe is more flexible, but I wonder if 
>>> for EC2 api we could stretch the -t flag.
>>
>> Just to make something clear, the EC2 API has nothing to do with the
>> -t flag. That is specific to the eucatools (or ec2 CLI tools). The
>> request goes through to the EC2 API controller (in nova, this is
>> nova.api.ec2.cloud.CloudController), passing the controller an XML
>> packet with a variety of fields that the controller then looks for
>> when populating the database with information about the instance to
>> spin up. Tacking something onto the -t flag would be a total hack that
>> wouldn't be particularly future-proof.
>>
>> I think that perhaps the user_data field in the XML might be a better
>> choice, since this has a more free-form capacity than a very specific
>> instance_type code that the EC2 API controller looks for.
>>
>> The root of the problem here, though, is how can the clients that
>> request new instances be spun up (or volumes be attached) send a set
>> of custom attributes that define the requirements that the request
>> entails? There are many, many attributes that should be able to attach
>> to a request:
>>
>> * What zone the instance/volume should be in
>> * What zone(s) the instance/volume should be *near*
>> * What hardware/architecture the instance should be placed on
>> * What service-level agreement a zone or group of volumes should be
>> running under
>>
>> etc, etc.
>>
>> We need to figure out a way of sending this type of request data
>> that a) doesn't break the existing API, and b) allows the scheduler
>> nodes to route requests more intelligently by looking at these
>> additional attributes of the request from the client.
>>
>> My initial thought is to make a simple Middleware class whose only
>> purpose is to look for certain fields in the HTTP request body or
>> headers (client_request?) and place those attributes in the wsgi
>> environ mapping so that middleware further down the "pipe" (such as
>> the Scheduler controllers) can easily pull this data out and more
>> intelligently route client requests to the zone scheduler controllers
>> that meet the requirements sent from the client.
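A rough sketch of that middleware idea (the header prefix and the environ key are invented for illustration; this is not existing nova code):

```python
# Hypothetical WSGI middleware that copies scheduler-hint request
# headers into the environ so middleware and controllers further down
# the pipe can read them.  Header names and the environ key are made up.
HINT_PREFIX = 'HTTP_X_SCHEDULER_HINT_'  # e.g. X-Scheduler-Hint-Cpu-Arch

class SchedulerHintMiddleware(object):
    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        hints = {}
        for key, value in environ.items():
            if key.startswith(HINT_PREFIX):
                # HTTP_X_SCHEDULER_HINT_CPU_ARCH -> 'cpu-arch'
                hint = key[len(HINT_PREFIX):].lower().replace('_', '-')
                hints[hint] = value
        environ['nova.scheduler_hints'] = hints
        return self.application(environ, start_response)
```

Downstream, a scheduler component would just read `environ['nova.scheduler_hints']` and route on whatever attributes are present, without the API controller having to know about any of them.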
>>
>> -jay
>
>

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
