vishesh92 opened a new pull request, #11143: URL: https://github.com/apache/cloudstack/pull/11143
### Description

This PR allows attaching GPU devices to an Instance on KVM via PCI, mdev, or VF.

<details><summary>Generated summary</summary>
<p>

This pull request introduces several changes across multiple files, focusing on enhancing GPU-related functionality, adding new properties for VM hooks, and updating resource management capabilities. The most significant updates include the addition of GPU properties and event types, the introduction of new VM shell script properties, and modifications to resource limits and types to support GPU devices.

### GPU-related enhancements:

* [`api/src/main/java/com/cloud/agent/api/VgpuTypesInfo.java`](diffhunk://#diff-e3e556189a550db67a31fc39bb45b6d55e39efe9336d9bf3d3f7af5877542d1aR18-R43): Added new fields such as `deviceType`, `busAddress`, `vendorId`, and `vmName` to support detailed GPU device information. Also included getter and setter methods for these fields and updated constructors to accommodate the new properties. [[1]](diffhunk://#diff-e3e556189a550db67a31fc39bb45b6d55e39efe9336d9bf3d3f7af5877542d1aR18-R43) [[2]](diffhunk://#diff-e3e556189a550db67a31fc39bb45b6d55e39efe9336d9bf3d3f7af5877542d1aR57-R92) [[3]](diffhunk://#diff-e3e556189a550db67a31fc39bb45b6d55e39efe9336d9bf3d3f7af5877542d1aL74-R235)
* [`api/src/main/java/com/cloud/agent/api/to/GPUDeviceTO.java`](diffhunk://#diff-53f797ddd2392a3f5efbc8fe81af591f7d63558c9d0eb55ba4cf1f93dfc21ba2R19-R45): Introduced new fields like `gpuCount` and `gpuDevices` to manage GPU device details and added corresponding getter/setter methods. Updated constructors to handle the new fields.
[[1]](diffhunk://#diff-53f797ddd2392a3f5efbc8fe81af591f7d63558c9d0eb55ba4cf1f93dfc21ba2R19-R45) [[2]](diffhunk://#diff-53f797ddd2392a3f5efbc8fe81af591f7d63558c9d0eb55ba4cf1f93dfc21ba2R67-R74) [[3]](diffhunk://#diff-53f797ddd2392a3f5efbc8fe81af591f7d63558c9d0eb55ba4cf1f93dfc21ba2R83-R89)
* [`api/src/main/java/com/cloud/event/EventTypes.java`](diffhunk://#diff-5ba116e51166a94b6d3bf3dddc7dfb6cfd25de3fe0e481bde84ac3702f943a56R382-R396): Added new GPU-related event types (`EVENT_GPU_CARD_CREATE`, `EVENT_VGPU_PROFILE_CREATE`, etc.) and mapped them to corresponding entities such as `GpuCard` and `VgpuProfile`. [[1]](diffhunk://#diff-5ba116e51166a94b6d3bf3dddc7dfb6cfd25de3fe0e481bde84ac3702f943a56R382-R396) [[2]](diffhunk://#diff-5ba116e51166a94b6d3bf3dddc7dfb6cfd25de3fe0e481bde84ac3702f943a56R1021-R1035)

### VM hook properties:

* [`agent/src/main/java/com/cloud/agent/properties/AgentProperties.java`](diffhunk://#diff-967488af285b7f13f46833e85a5bee82912ec7af150055f14a0b7e70a8efc3feR216-R224): Added new shell script properties (`AGENT_HOOKS_LIBVIRT_VM_XML_TRANSFORMER_SHELL_SCRIPT`, `AGENT_HOOKS_LIBVIRT_VM_ON_START_SHELL_SCRIPT`, etc.) for VM lifecycle hooks, enabling execution of shell scripts for VM state changes. [[1]](diffhunk://#diff-967488af285b7f13f46833e85a5bee82912ec7af150055f14a0b7e70a8efc3feR216-R224) [[2]](diffhunk://#diff-967488af285b7f13f46833e85a5bee82912ec7af150055f14a0b7e70a8efc3feR245-R253) [[3]](diffhunk://#diff-967488af285b7f13f46833e85a5bee82912ec7af150055f14a0b7e70a8efc3feR273-R281)

### Resource management updates:

* [`api/src/main/java/com/cloud/capacity/Capacity.java`](diffhunk://#diff-bdc24376c0a98242d02786dc78d88d5c12bacf296638c237e80942603b96f531L36-R36): Updated GPU capacity type ID from `19` to `11`.
* [`api/src/main/java/com/cloud/configuration/Resource.java`](diffhunk://#diff-3777e3ea77b2d29b104b0574cb987bcfb6b111446e7fab9c3eccf32758c0b274L40-R41): Added a new resource type for GPUs (`gpu`).
* [`api/src/main/java/com/cloud/user/ResourceLimitService.java`](diffhunk://#diff-ec9fda7965f679b5929dbf50b165bc1d2d293159d471545432532c9022a375ceL53-R60): Introduced new configuration keys for GPU limits at the account, domain, and project levels (`DefaultMaxAccountGpus`, `DefaultMaxDomainGpus`, etc.). Added methods to check, increment, and decrement GPU resource limits. [[1]](diffhunk://#diff-ec9fda7965f679b5929dbf50b165bc1d2d293159d471545432532c9022a375ceL53-R60) [[2]](diffhunk://#diff-ec9fda7965f679b5929dbf50b165bc1d2d293159d471545432532c9022a375ceR293-R296)

### Miscellaneous updates:

* [`.github/workflows/ci.yml`](diffhunk://#diff-b803fcb7f17ed9235f1e5cb1fcd2f5d3b2838429d4368ae4c57ce4436577f03fR140): Added a new smoke test for deploying VMs with vGPU enabled (`smoke/test_deploy_vgpu_enabled_vm`).
* [`api/src/main/java/org/apache/cloudstack/api/ApiConstants.java`](diffhunk://#diff-72d5bf21c12ffd0a3c2d3a6033ec70ae5ba6c31338e915fe65df1b7a67827a9eR71): Added constants for GPU-related attributes such as `BUS_ADDRESS` and `DEVICE_NAME`. [[1]](diffhunk://#diff-72d5bf21c12ffd0a3c2d3a6033ec70ae5ba6c31338e915fe65df1b7a67827a9eR71) [[2]](diffhunk://#diff-72d5bf21c12ffd0a3c2d3a6033ec70ae5ba6c31338e915fe65df1b7a67827a9eR161)

</p>
</details>

### Types of changes

- [x] Breaking change (fix or feature that would cause existing functionality to change)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] build/CI
- [x] test (unit or integration test code)

### Feature/Enhancement Scale or Bug Severity

#### Feature/Enhancement Scale

- [x] Major
- [ ] Minor

### Screenshots (if appropriate):

### How Has This Been Tested?

This was tested locally on my laptop with passthrough of a consumer graphics card. Due to unavailability of actual hardware, I wasn't able to test with vGPU profiles or mdev.

#### How did you try to break this feature and the system with this change?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org