gemini-code-assist[bot] commented on PR #18867: URL: https://github.com/apache/tvm/pull/18867#issuecomment-3992447219
## Summary of Changes

Hello, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Adreno GPU support by updating the CLML SDK version, introducing a dedicated Docker-based CI environment, and extending the CLML backend with new capabilities for Large Language Model (LLM) inference. The changes streamline Adreno development and testing, improve CLML runtime efficiency, and expand the range of supported operations, particularly for quantized matrix multiplications.

### Highlights

* **CLML SDK Version Update**: The CLML SDK major version has been updated from 2 to 4 in the CMake configuration, reflecting advancements in the underlying library.
* **Adreno CI Environment Setup**: A new Dockerfile and an associated installation script have been introduced to establish a dedicated Adreno CI environment, including the Android SDK and Gradle, streamlining continuous integration for Adreno targets.
* **Large Language Model (LLM) Support**: The CLML backend now includes support for dequantize-matmul patterns, specifically tailored for optimizing Large Language Model (LLM) inference on Adreno GPUs.
* **CLML Runtime Enhancements**: The CLML runtime has been significantly refactored to handle dynamic tensor dimensions, improve memory management, and introduce a new CLML function descriptor for more robust operator management.
* **Test Infrastructure Consolidation**: All Adreno-related tests have been consolidated under a new `tests/python/relax/backend/adreno` directory, involving renaming existing test files and updating the test infrastructure for better organization and maintainability.

🧠 **New Feature in Public Preview:** You can now enable **Memory** to help **Gemini Code Assist** learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. **Click [here](https://codeassist.google/code-review/login) to enable Memory in your admin console.**

<details>
<summary><b>Changelog</b></summary>

* **cmake/modules/contrib/CLML.cmake**
    * Updated the CLML major version from 2 to 4.
* **docker/Dockerfile.ci_adreno**
    * Added a new Dockerfile for Adreno CI, including necessary tools like clang-format, adb, and uv.
* **docker/install/ubuntu_install_androidsdk.sh**
    * Added a script to install the Android SDK, NDK, and CMake within the Docker environment.
* **python/tvm/relax/backend/adreno/clml.py**
    * Introduced GlobalVarPattern and TuplePattern imports.
    * Added _check_dequantize_matmul and dequantize_matmul_patterns functions.
    * Implemented a new OpenCLMLOffLoadForLLM pass to support dequantize-matmul operations.
* **python/tvm/relax/backend/adreno/pipeline.py**
    * Modified library_dispatch_passes to include the new OpenCLMLOffLoadForLLM pass when CLML is enabled.
* **src/relax/backend/contrib/clml/codegen.cc**
    * Removed the SaveGlobalAttributes(node) call from the OpenCLMLJSONSerializer.
* **src/runtime/contrib/clml/clml_memory_planner.cc**
    * Changed ssize_t to size_t for free_start, free_size, and last_found_size variables.
* **src/runtime/contrib/clml/clml_runtime.cc**
    * Implemented dynamic tensor dimension handling.
    * Updated memory release logic.
    * Modified clEnqueueMLOpQCOM calls to use function[i].op.
    * Added SetTensorMemDesc for dynamic tensor updates.
    * Introduced CreateDequantMatmulLayer for dequantize-matmul operations.
    * Refactored layer_.function to store CLMLFunctionDesc objects and removed layer_.layer_names.
* **src/runtime/contrib/clml/clml_runtime.h**
    * Added V5_API macro and new CLML_CALL macros for clSetMLTensorDimensionsQCOM, clUpdateMLOpQCOM, clGetMLOpDeviceMemoryRequirementsQCOM, clUpdateMLOpDeviceMemoryQCOM.
    * Updated CLML_CALL_clCreateMLTensorQCOM to support CLML v5.
    * Introduced CLMLFunctionDesc struct to store op, layer name, and properties.
    * Updated CachedLayer to use the new CLMLFunctionDesc struct.
* **src/runtime/contrib/clml/clml_utils.cc**
    * Modified DeviceMakeCLMLTensor and MakeCLMLTensor functions to accept tensorProps for dynamic tensor properties.
    * Adjusted GetTensorDims to handle dynamic shapes by using fmax and checking shape.size().
* **src/runtime/contrib/clml/clml_utils.h**
    * Updated function signatures for DeviceMakeCLMLTensor and MakeCLMLTensor to include tensorProps parameter.
* **tests/python/relax/backend/adreno/mod_utils.py**
    * Renamed from tests/python/relax/backend/clml/mod_utils.py.
    * Updated codegen attribute formats from list-of-lists to single values or lists.
    * Added get_dequant_matmul_module and get_dequant_vec_matmul_module for LLM tests.
* **tests/python/relax/backend/adreno/test_clml_ops.py**
    * Renamed from tests/python/relax/backend/clml/test_clml_codegen.py.
    * Updated test infrastructure to use utils.py.
    * Added CLML_VERSION and TARGET_CLML_VERSION checks.
    * Introduced verify_clml_codegen and a more flexible verify function.
    * Added new tests for dequant_matmul and dequant_vec_matmul.
    * Commented out some existing tests due to padding realization changes.
* **tests/python/relax/backend/adreno/test_texture_network.py**
    * Renamed from tests/python/relax/texture/test_network.py.
    * Updated test verification to use verify_results from the new utils.py.
* **tests/python/relax/backend/adreno/test_texture_ops.py**
    * Renamed from tests/python/relax/texture/test_ops.py.
    * Updated test verification to use verify_results from the new utils.py.
    * Adjusted some test modules by removing redundant add and relu operations.
* **tests/python/relax/backend/adreno/test_transform_annotate_custom_scope.py**
    * Renamed from tests/python/relax/adreno/test_transform_annotate_custom_scope.py.
* **tests/python/relax/backend/adreno/test_transform_fold_vdevice_scope_change.py**
    * Renamed from tests/python/relax/adreno/test_transform_fold_vdevice_scope_change.py.
* **tests/python/relax/backend/adreno/utils.py**
    * Added a new utility file containing run_time_check, SessionManager, run_local, build_and_run, and verify_results functions to standardize Adreno test execution and verification.
* **tests/python/relax/backend/clml/conftest.py**
    * Removed.
* **tests/python/relax/backend/clml/test_op_exec_clml_codegen.py**
    * Removed.
* **tests/python/relax/backend/clml/utils.py**
    * Removed.
* **tests/scripts/ci.py**
    * Modified the generate_command function signature to make the 'tests' parameter optional.
* **tests/scripts/task_python_integration_gpuonly.sh**
    * Removed the export of TVM_RELAY_OPENCL_TEXTURE_TARGETS.
* **tests/scripts/task_python_unittest_gpuonly.sh**
    * Updated TVM_TEST_TARGETS to include OpenCL for Mali and Adreno.
    * Removed old CLML texture test execution.

</details>

<details>
<summary><b>Using Gemini Code Assist</b></summary>
<br>

The full guide for Gemini Code Assist can be found on our [documentation page](https://developers.google.com/gemini-code-assist/docs/review-github-code). Here are some quick tips.

<b>Invoking Gemini</b>

You can request assistance from Gemini at any point by creating a comment using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands on the current page.

Feature | Command | Description
--- | --- | ---
Code Review | `/gemini review` | Performs a code review for the current pull request in its current state.
Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | `/gemini help` | Displays a list of available commands.
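
The commands in the table are ordinary pull request comments, so they can also be posted from a script. The following is a minimal sketch, assuming GitHub's REST endpoint for issue comments (which also serves pull requests) and a personal access token; the helper name `build_gemini_comment` and the placeholder token are hypothetical and not part of Gemini Code Assist. The request is only built, not sent.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_gemini_comment(owner, repo, pr_number, command, token):
    """Build (but do not send) a request that posts `/gemini <command>` on a PR.

    Hypothetical helper for illustration; pull request comments are created
    through the issue-comments endpoint of the GitHub REST API.
    """
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{pr_number}/comments"
    payload = json.dumps({"body": f"/gemini {command}"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# Example: request a review on the pull request this message refers to.
req = build_gemini_comment("apache", "tvm", 18867, "review", token="<your-token>")
# Passing `req` to urllib.request.urlopen would actually post the comment.
```

Building the request separately from sending it keeps the sketch safe to run and easy to inspect before any comment is posted.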

<b>Customization</b>

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found [here](https://developers.google.com/gemini-code-assist/docs/customize-gemini-behavior-github).

<b>Limitations & Feedback</b>

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with :thumbsup: and :thumbsdown: on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up [here](https://google.qualtrics.com/jfe/form/SV_2cyuGuTWsEw84yG).

<b>You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the [Gemini Code Assist IDE Extension](https://cloud.google.com/products/gemini/code-assist).</b>

</details>

[^1]: Review the [Privacy Notices](https://policies.google.com/privacy), [Generative AI Prohibited Use Policy](https://policies.google.com/terms/generative-ai/use-policy), [Terms of Service](https://policies.google.com/terms), and learn how to configure Gemini Code Assist in GitHub [here](https://developers.google.com/gemini-code-assist/docs/customize-gemini-behavior-github). Gemini can make mistakes, so double check it and [use code with caution](https://support.google.com/legal/answer/13505487).

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
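
For readers unfamiliar with the dequantize-matmul pattern named in the highlights, the idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the quantization scheme (one scale per output column and a single shared zero point) and the function names are hypothetical, and nothing here reflects how the CLML backend or this pull request actually implements the operation.

```python
def dequantize(q_weight, scales, zero_point):
    """Map integer weights back to floats: w = (q - zero_point) * scale.

    Assumes one scale per output column and a shared zero point
    (an illustrative scheme, not taken from the pull request).
    """
    return [
        [(q - zero_point) * scales[j] for j, q in enumerate(row)]
        for row in q_weight
    ]


def matmul(a, b):
    """Naive matrix product of two row-major lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def dequant_matmul(x, q_weight, scales, zero_point):
    """Dequantize the weights, then multiply: the pattern a backend can fuse."""
    return matmul(x, dequantize(q_weight, scales, zero_point))


# A 1x2 activation times a 2x2 quantized weight matrix.
result = dequant_matmul([[1.0, 2.0]], [[9, 7], [8, 10]], [0.5, 0.25], 8)
print(result)  # [[0.5, 0.75]]
```

Keeping the weights in integer form until the multiplication is what lets a backend handle the two steps as a single offloaded operation instead of materializing the float weights first.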
