timsaucer opened a new issue, #1394:
URL: https://github.com/apache/datafusion-python/issues/1394

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   Users are increasingly reaching for LLMs to generate code and solve 
problems. We should add instructions to our repository that help LLMs write 
`datafusion-python` code.
   
   **Describe the solution you'd like**
   
   According to my very quick research into the topic, an `llms.txt` file seems 
to be one emerging standard. I know some repositories have opted for a 
`CLAUDE.md` file as well. I think part of this issue will be to investigate 
what the emerging standards are and what we need to do to ensure the major 
agents out there are able to use these instructions.
   
   Additionally, since we will have users coming from different communities, it 
is probably helpful to have LLM-oriented instructions for how to rewrite 
queries from other dataframe APIs into DataFusion.
   
   We should cover:
   - Spark
   - Pandas
   - Polars
   
   Additionally, there are probably recommendations for how we should write our 
docstrings so that LLMs can use them easily.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

