Bucky789 commented on issue #62500:
URL: https://github.com/apache/airflow/issues/62500#issuecomment-4131106946

   Hi @jason810496 and @potiuk,
   
   I'm incredibly interested in taking on the **Airflow Contribution & 
Verification Agent Skills** project for GSoC.
   
   I recently merged **PR #60963** (adding the `max_mails` parameter to the 
IMAP hook), which took me through the complete contributor lifecycle: local 
setup → `prek` static checks → targeted `pytest` runs in Breeze. This gave me a very 
concrete mental model of the exact host vs. container workflows that agents 
currently struggle with. I've also built LLM-powered systems that require 
structured, machine-usable interfaces for tool-calling, so bridging this gap 
feels like a natural intersection of my experience.
   
   As I draft my proposal, my high-level approach centers on:
   
   - **Context Detection**: A two-tier strategy. Check existing environment 
signals (e.g., `AIRFLOW_HOME`, existing Docker env vars) first, falling back to 
injecting a lightweight `BREEZE_ENV` marker only if needed. This prioritizes 
reliability and backward compatibility.
   
   - **Sync Mechanism**: Parsing the Breeze CLI docstrings (via a `prek` hook) to 
auto-generate structured tool schemas (specifically the JSON Schema format 
expected by Claude/OpenAI tool-calling). This keeps agents aligned with a 
single source of truth, without manual updates.
   
   - **Workflow Skills**: Modeling both `prek` (host-side) and `pytest` 
(container-side) as first-class skills, with clear context-aware routing.
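
   As a first sketch, the two-tier context detection above could be as simple as 
the following (`BREEZE_ENV` comes from the bullet above; the other signal names 
and the function name are illustrative assumptions, not final):

```python
import os
from typing import Mapping, Optional

def detect_execution_context(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "container" or "host" using the two-tier strategy.

    Tier 1 checks environment signals that already exist; tier 2 falls
    back to the lightweight injected marker. Signal names here are
    illustrative placeholders.
    """
    env = os.environ if env is None else env
    # Tier 1: signals typically already present inside a Breeze container.
    if env.get("AIRFLOW_HOME") or env.get("AIRFLOW_CI"):
        return "container"
    # Tier 2: lightweight marker injected when the container starts.
    if env.get("BREEZE_ENV"):
        return "container"
    return "host"
```

   Passing the environment explicitly keeps the function trivially unit-testable, 
which matters for the evaluation harness discussed below.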
   
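   To make the sync mechanism concrete, here is a hand-written example of the 
kind of tool definition the generator would emit for one Breeze command (the 
shape follows the JSON-Schema-based Claude/OpenAI tool-calling format; the name, 
description, and parameters are illustrative, not actually generated):

```python
import json

# Illustrative hand-written target output for a single Breeze command;
# the real definition would be generated from the Breeze CLI itself.
breeze_static_checks_tool = {
    "name": "breeze_static_checks",
    "description": "Run static checks (prek) against the Airflow sources.",
    "input_schema": {
        "type": "object",
        "properties": {
            "check_id": {
                "type": "string",
                "description": "Optional single check to run, e.g. 'ruff'.",
            },
            "all_files": {
                "type": "boolean",
                "description": "Run against all files, not just staged ones.",
            },
        },
        "required": [],
    },
}

print(json.dumps(breeze_static_checks_tool, indent=2))
```
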
   **One Architectural Question for the Evaluation & Test Harness:**
   I'm currently envisioning two complementary layers for the "exam":
   
   1. **Unit tests**: Asserting that the skills output the correct commands for 
a given mocked host/container state.
   
   2. **E2E simulation**: Feeding a full PR workflow (stage → check → fix → 
re-check) and verifying the agent loop successfully navigates it.
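
   For layer 1, a unit test could look like this (the routing function and the 
command strings are stand-ins for the real skill API, just to illustrate the 
mocked host/container assertions):

```python
# Toy stand-in for the skill's context-aware routing (API hypothetical).
def route_command(context: str, task: str) -> str:
    if task == "static-checks":
        # prek runs on the host; inside the container we go through Breeze.
        return "prek run" if context == "host" else "breeze static-checks"
    raise ValueError(f"unknown task: {task}")

def test_static_checks_on_host():
    assert route_command("host", "static-checks") == "prek run"

def test_static_checks_in_container():
    assert route_command("container", "static-checks") == "breeze static-checks"
```
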
   
   Should I plan for both in the scope of this project, or are you leaning more 
heavily toward one specific evaluation approach?
   
   I'm finalizing my proposal draft now and would love your thoughts!
   
   Thanks,
   Manthan


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
