jason810496 commented on issue #62500:
URL: https://github.com/apache/airflow/issues/62500#issuecomment-4148027543
Hi all,
Here are my answers to the questions above so far. If you have any follow-up
questions or need further clarification, please comment in this thread, and
I'll be happy to provide more details.
---
**For @MichaelChenGithub's questions:**
> AGENTS.md vs contributing-docs: which takes precedence when they conflict?
The contributing-docs should be the source of truth, but you've also
highlighted the problem that arises when there is no source of truth: the
inconsistency between AGENTS.md and contributing-docs. That's why I always
advocate for having a single source of truth.
> When there's a conflict like this, which source should the skill body
follow?
> And is the correct fix to first update contributing-docs to include the
`--project` pattern (so AGENTS.md and contributing-docs are consistent) before
embedding the SKILL block?
We can raise a discussion in the `#contributor` Slack channel about which
command pattern is more accurate and update the other one accordingly. I
prefer the `--project` pattern, as those commands can be executed from the
project root without changing the cwd, but maybe someone else has more
thoughts on this.
> Single anchor vs multi-chunk: which design do you prefer for long-term
maintainability?
> Which approach do you think is better for long-term maintainability across
the codebase?
You're on the right track with the "anchor" approach.
From my perspective, this is a small detail that won't have a significant
impact on long-term maintainability, so I'd prefer to keep this question open
and defer the decision to the future.
> Tag-based rendering vs standalone duplicate block?
> When embedding skill content in the RST, should the approach be:
> Which approach aligns better with how you'd want contributors to maintain
these docs long-term?
I prefer the inline tag-based approach. The con you mentioned for the
standalone-duplicate approach ("Maintainers must remember to update both the
prose and the SKILL block when the workflow changes.") is the same problem we
already face across all the AI toolings.
I want to avoid the risk of an oversight where someone forgets to update the
corresponding SKILL block in the contributing-docs for all the AI toolings,
and I think the inline tag-based approach helps avoid exactly that problem.
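To make the inline tag-based idea concrete, here is a minimal sketch of how a
generator could pull a skill body out of an RST page, keeping the
contributing-docs as the single source of truth. The marker names
(`.. skill-start` / `.. skill-end`) and the function name are illustrative
assumptions, not an existing Airflow convention:

```python
# Hypothetical sketch: extract the text between marker comments in an RST
# document so a SKILL.md file can be regenerated from the contributing-docs.
# Marker names are assumptions for illustration only.

def extract_skill_block(rst_text: str,
                        start_marker: str = ".. skill-start",
                        end_marker: str = ".. skill-end") -> str:
    """Return the content between the start and end marker comments."""
    captured: list[str] = []
    inside = False
    for line in rst_text.splitlines():
        stripped = line.strip()
        if stripped == start_marker:
            inside = True
            continue
        if stripped == end_marker:
            inside = False
            continue
        if inside:
            captured.append(line)
    return "\n".join(captured).strip()
```

A pre-commit hook or CI step could then regenerate the SKILL file from this
extraction and fail when the two drift apart, which is what removes the
"remember to update both" burden.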
---
**For @Bucky789's questions:**
> Should I plan for both in the scope of this project, or are you leaning
more heavily toward one specific evaluation approach?
Unit tests are definitely important for any project. However, in this case, I
don't think we can verify the agent's behavior through unit tests alone, so
E2E simulation, manual testing, or agent evaluation would be more appropriate
for this project.
There are [skill
evaluations](https://claude.com/blog/improving-skill-creator-test-measure-and-refine-agent-skills)
in
[Skill-creator](https://github.com/anthropics/skills/blob/main/skills/skill-creator/SKILL.md),
which might be a handy resource for you if you have Claude Code. Manual
testing, like I mentioned in
https://github.com/apache/airflow/issues/62500#issuecomment-4060207948, would
also be helpful for verifying the agent's behavior.