I have a library that uses Python’s AST module to parse Python expressions and 
map them to Arrow Dataset expressions. I could extract the AST bits into a repo 
if you’re interested. It’s really simple but could serve as inspiration.

It allows us to do things like:

table = path.read_table("valid_from < date <= valid_to and security_id in [...]")

which is pretty handy when you’re in IPython or Jupyter.
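
To give a rough idea of how the AST mapping can work, here is a minimal sketch. It is not the library's actual code; the names to_expression, _convert and _compare are made up for illustration, and it assumes the target expression objects support the comparison operators, &, |, ~ and an isin() method, as pyarrow.dataset expressions do. The field argument is a factory that turns a column name into such an expression.

```python
import ast
import operator

# Map Python comparison nodes to the operator applied to the expression objects.
_COMPARE_OPS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def to_expression(source, field):
    """Translate a Python boolean expression string into a filter expression.

    `field` is a factory that turns a column name into an expression object
    (hypothetical interface; pyarrow.dataset.field is the intended candidate).
    """
    tree = ast.parse(source, mode="eval")
    return _convert(tree.body, field)


def _convert(node, field):
    if isinstance(node, ast.BoolOp):
        # "a and b and c" arrives as one node with several values.
        combine = operator.and_ if isinstance(node.op, ast.And) else operator.or_
        parts = [_convert(value, field) for value in node.values]
        result = parts[0]
        for part in parts[1:]:
            result = combine(result, part)
        return result
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return ~_convert(node.operand, field)
    if isinstance(node, ast.Compare):
        # A chained comparison "a < b <= c" becomes "(a < b) & (b <= c)".
        operands = [node.left] + node.comparators
        result = None
        for left, op, right in zip(operands, node.ops, operands[1:]):
            term = _compare(left, op, right, field)
            result = term if result is None else result & term
        return result
    if isinstance(node, ast.Name):
        # Bare names are treated as column references.
        return field(node.id)
    try:
        # Everything else must be a plain literal (number, string, list, ...).
        return ast.literal_eval(node)
    except (ValueError, TypeError):
        raise ValueError(f"unsupported syntax: {ast.dump(node)}") from None


def _compare(left, op, right, field):
    lhs = _convert(left, field)
    rhs = _convert(right, field)
    if isinstance(op, ast.In):
        return lhs.isin(rhs)
    if isinstance(op, ast.NotIn):
        return ~lhs.isin(rhs)
    if type(op) not in _COMPARE_OPS:
        raise ValueError(f"unsupported comparison: {type(op).__name__}")
    return _COMPARE_OPS[type(op)](lhs, rhs)
```

With pyarrow installed, something like to_expression("date == '2020-01-01' and value >= 1", pyarrow.dataset.field) should give an expression that can be passed as the filter argument when reading a dataset. Anything that is not a name, a literal, a comparison or a boolean operator (function calls, for example) is rejected with a ValueError rather than evaluated.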

> On Feb 8, 2021, at 15:23, Josh Mayer <joshuaama...@gmail.com> wrote:
> 
> It would be useful to be able to create a filter expression from a string,
> e.g. "date == '2020-01-01' and value >= 1" instead of (field("date") ==
> '2020-01-01') & (field("value") >= 1).
> 
> There are some existing libraries that make it pretty easy to do in Python
> (see here <https://gist.github.com/josham/e5a13a16e9f18d7b9056127ac522cf23>)
> though an old issue ARROW-3458
> <https://issues.apache.org/jira/browse/ARROW-3458> suggests using Antlr and
> C++.  If a Python only solution is OK I'd be happy to work on adding the
> feature.  If Antlr/C++ is preferred I can help with the grammar and testing
> but I'm probably not the best person to do the C++ work.
> 
> Josh
