timsaucer opened a new pull request, #1497:
URL: https://github.com/apache/datafusion-python/pull/1497

   ## Summary
   
   - Add `python/datafusion/AGENTS.md` — a comprehensive DataFrame API guide 
that ships with `pip install datafusion` (Maturin includes all files under 
`python-source = "python"`). Covers core abstractions, import conventions, data 
loading, all DataFrame operations, expression building, a SQL-to-DataFrame 
reference table, common pitfalls, idiomatic patterns, and a categorized 
function index.
   - Enrich the `__init__.py` module docstring from 2 lines to a full overview 
with core abstractions, a quick-start example, and a pointer to AGENTS.md.
   
   This is "PR 1a" from the plan in #1394 (see the comment at 
https://github.com/apache/datafusion-python/issues/1394#issuecomment-4252413645).
The goal is that any agent encountering `datafusion`, whether via pip, the docs 
site, or the repo, gets enough context to write idiomatic DataFrame code.
   
   ### What's in AGENTS.md
   
   1. What DataFusion is (in-process engine, not a database)
   2. Core abstractions (`SessionContext` → `DataFrame` → `Expr` → `functions`)
   3. Import conventions
   4. Data loading (files, Python objects, SQL)
   5. DataFrame operations quick reference (select, filter, join, aggregate, 
window, sort, limit, set operations, deduplication)
   6. Executing and collecting results
   7. Expression building (arithmetic, comparisons, boolean logic, null 
handling, CASE/WHEN, casting, aliasing, BETWEEN, IN)
   8. SQL-to-DataFrame reference table (~25 mappings)
   9. Common pitfalls (boolean operators, `lit()` wrapping, column quoting, 
immutable DataFrames, window frame defaults, HAVING pattern)
   10. Idiomatic patterns (fluent chaining, variables as CTEs, window functions 
for scalar subqueries, semi/anti joins for EXISTS/NOT EXISTS)
   11. Categorized function index (aggregate, window, string, math, date/time, 
conditional, array, struct/map, regex, hash, type)
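
To illustrate why a "boolean operators" pitfall (item 9) tends to come up in expression-building APIs, here is a minimal pure-Python sketch. The `Expr` class below is a hypothetical toy, not the `datafusion` class, and it is an assumption that the pitfall in AGENTS.md concerns `&`/`|` versus `and`/`or`. Python cannot overload `and`/`or`, so expression objects are usually combined with `&` and `|`, with parentheses around each comparison.

```python
class Expr:
    """Toy expression node (hypothetical stand-in, not datafusion's Expr)."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __gt__(self, other: object) -> "Expr":
        return Expr(f"({self.text} > {other})")

    def __lt__(self, other: object) -> "Expr":
        return Expr(f"({self.text} < {other})")

    def __and__(self, other: "Expr") -> "Expr":
        # `&` can be overloaded, so it can build a combined expression.
        return Expr(f"({self.text} AND {other.text})")

    def __or__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.text} OR {other.text})")

    def __bool__(self) -> bool:
        # `and`/`or` call bool() on their operands and cannot be overloaded.
        # This toy chooses to fail loudly; real libraries may behave differently.
        raise TypeError("combine expressions with & and |, not `and`/`or`")


a, b = Expr("a"), Expr("b")

# Parenthesize each comparison: `&` binds tighter than `>` and `<`.
combined = (a > 1) & (b < 10)
print(combined.text)

try:
    (a > 1) and (b < 10)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The toy only builds strings, but the operator mechanics are the same ones that any Python expression API has to work around.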
   
   ## Test plan
   
   - [x] All pre-commit hooks pass (ruff, ruff format, codespell)
   - [x] `pytest python/tests/test_imports.py` passes (5/5)
   - [x] `pytest python/tests/test_dataframe.py test_context.py test_expr.py 
test_functions.py test_imports.py` — 827 passed, 3 skipped (1 deselected is 
pre-existing #1492 fix)
   - [x] `pytest --doctest-modules python/datafusion/dataframe.py functions.py 
expr.py` — 243 passed
   - [x] `python -c "import datafusion; print(datafusion.__doc__[:80])"` shows 
new docstring
   - [x] AGENTS.md is in `python/datafusion/` alongside `py.typed`, confirming 
it will ship in the wheel
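
The last check above infers from the source layout that the file will ship. One way to confirm it against an installed package is sketched below, using only the standard library. The helper name `find_packaged_file` is hypothetical, and the sketch assumes the package is installed as regular files on disk (not zipped).

```python
from importlib import util
from pathlib import Path
from typing import Optional


def find_packaged_file(package: str, filename: str) -> Optional[Path]:
    """Return the path to `filename` inside an installed package, or None."""
    spec = util.find_spec(package)
    # No spec means the package is not installed; no search locations
    # means it is a plain module rather than a package directory.
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


path = find_packaged_file("datafusion", "AGENTS.md")
print(path or "AGENTS.md not found (is datafusion installed?)")
```

Running this in a fresh virtual environment after `pip install datafusion` would exercise the built wheel rather than the source tree.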
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

