I would also very much love to see a SQL++ to NL capability - one that
can explain what a query is doing. This could (1) help folks reverse
engineer complex (e.g., multi-page) queries, which occur with
surprisingly high frequency in practice, and (2) be used to help NL to
SQL++ users interact with the system. Regarding point (2), a user could
ask for a query to do some task, and the system could generate the query
(NL to SQL++), and then the SQL++ to NL capability could be used to let
the system tell the user what the generated query does - and the user
could use that to verify that the query they're about to use really
captures their intent. (One would want to use a different model, or at
least a non-polluted context, for the reverse query explanation.) This
could actually be two sub-projects over the summer - one going in each
direction - both are likely to be interesting and challenging projects.
(They could perhaps share a metadata model and code that populates it.)
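
The round trip described above could be sketched roughly as follows. This is only an illustration under assumptions: `Model`, `RoundTrip`, and `round_trip` are hypothetical names, the prompts are placeholders, and the two callables stand in for whatever LLM clients end up being used. The one point it tries to capture is that the explainer gets a fresh context that contains the schema and the generated query but not the user's original request.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical model interface: takes a prompt string, returns the model's text.
Model = Callable[[str], str]


@dataclass
class RoundTrip:
    request: str      # what the user asked for, in natural language
    query: str        # SQL++ produced by the generator
    explanation: str  # plain-language reading of that SQL++


def round_trip(request: str, schema: str, generate: Model, explain: Model) -> RoundTrip:
    """Generate a query, then explain it from a context that omits the request."""
    # Forward direction: the generator sees the schema and the user's request.
    query = generate(f"Schema:\n{schema}\n\nWrite a SQL++ query for: {request}")
    # Reverse direction: the explainer sees only the schema and the query,
    # so it cannot simply echo the user's wording back.
    explanation = explain(
        f"Schema:\n{schema}\n\nExplain what this SQL++ query does:\n{query}"
    )
    return RoundTrip(request, query, explanation)
```

The user would then compare `explanation` against what they meant before running `query`. Passing two different callables for `generate` and `explain` is one way to use a different model in each direction.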
Cheers,
Mike
On 3/19/26 11:23 PM, Suryaa Charan Shivakumar wrote:
Hello Tanya,
Hope you are doing well. In terms of the revised UI, we have a lot of APIs
available that are not properly utilized in the existing UI. The other
problem is that it is written in Angular, and the team feels it's outdated
and bloated.
So the goal is to utilize all the APIs and build a lightweight modern UI
that is maintainable and helps developers do everything in one place.
The code is not public yet, but I can share where you can look for all the
APIs and parameters -
asterix-app/src/main/java/org/apache/asterix/api/http/server
Query execution internals:
asterixdb/asterix-app/src/main/java/org/apache/asterix/app/translator/QueryTranslator.java
In the new UI/UX, ideally we'd like to have a section that
(1) translates a natural language query to SQL++ and runs it,
(2) helps users understand the plan and results, and
(3) offers visualization, which would be nice too.
Frontier LLMs are decent at generating SQL++, but we need to do better than
that. We have asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj and
lots of docs and test cases in the repo as a knowledge base.
To write good SQL++ queries, our tool should have a very good idea of the
metadata, SQL++ syntax, data stats, and samples. But the data should never
really be transferred outside (say, to Claude APIs), putting privacy and
security first.
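
One way to reconcile "the tool needs to know the data" with "the data never leaves" is to reduce local sample records to field names and type names before anything is put into a prompt. The sketch below is only an illustration: `type_name` and `summarize_fields` are hypothetical helpers, the type labels are chosen for readability rather than taken from the AsterixDB type system, and only top-level fields are inspected.

```python
from typing import Any


def type_name(value: Any) -> str:
    """Return an illustrative type label for a JSON-like value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def summarize_fields(records: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each top-level field name to the sorted type labels seen for it.

    Only names and type labels are kept; the sampled values are discarded.
    """
    seen: dict[str, set[str]] = {}
    for record in records:
        for field, value in record.items():
            seen.setdefault(field, set()).add(type_name(value))
    return {field: sorted(types) for field, types in sorted(seen.items())}
```

For example, `summarize_fields([{"id": 1, "name": "Ann"}, {"id": 2, "name": None}])` returns `{"id": ["bigint"], "name": ["null", "string"]}`. Field names can themselves be sensitive, so a real deployment would still need a policy for what is allowed into a prompt.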
Agent and LLM tooling have come a long way this year. Looking forward to
hearing your thoughts on how to make this possible and future ready.
Check out https://www.couchbase.com/blog/introducing-couchbase-capella-iq/
Best,
Suryaa
On Wed, Mar 18, 2026 at 11:06 AM Tanya Rai <[email protected]> wrote:
Hi Mike,
Thank you so much for the warm welcome and the guidance!
The 'middleware' approach makes perfect sense. Building it as a
service that interacts with AsterixDB via the REST API keeps the core
engine clean and allows for a more flexible LLM integration (like
using LangChain4j or direct Gemini API calls).
I am now focusing my research on the Metadata Catalog queries in SQL++
to understand how to best feed the schema context to the LLM.
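
A minimal sketch of that middleware step, using only the Python standard library: it builds a POST to the query service, asks the Metadata catalog for dataset names, and flattens the rows into text for a prompt. Several things here are assumptions to verify against the running instance: the URL and port are those of a default local install, the `Metadata.Dataset` field names may differ between AsterixDB versions, and `build_request`, `run_statement`, and `schema_context` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local AsterixDB instance with the query service on its default port.
QUERY_SERVICE_URL = "http://localhost:19002/query/service"

# Assumption: the Metadata dataverse exposes a `Dataset` dataset with these fields.
CATALOG_QUERY = (
    "SELECT d.DataverseName, d.DatasetName, d.DatatypeName "
    "FROM Metadata.`Dataset` d "
    "WHERE d.DataverseName != 'Metadata';"
)


def build_request(statement: str, url: str = QUERY_SERVICE_URL) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST carrying one SQL++ statement."""
    body = urllib.parse.urlencode({"statement": statement}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def run_statement(statement: str, url: str = QUERY_SERVICE_URL) -> list:
    """Send the statement and return the `results` array of the JSON response."""
    with urllib.request.urlopen(build_request(statement, url)) as response:
        payload = json.load(response)
    return payload.get("results", [])


def schema_context(rows: list) -> str:
    """Turn catalog rows into one 'dataverse.dataset: type' line per dataset."""
    lines = [
        f"{row['DataverseName']}.{row['DatasetName']}: {row['DatatypeName']}"
        for row in rows
    ]
    return "\n".join(sorted(lines))
```

With a server running, `schema_context(run_statement(CATALOG_QUERY))` would produce the text to place in the LLM prompt. Keeping `build_request` separate from the network call makes the request shape easy to check without a live instance.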
@Suryaa, I would love to hear more about the revised UI you are
working on. I'd like to ensure my proposal for the NL2SQL bridge
aligns perfectly with the new user experience you are envisioning.
I'll be drafting my formal proposal based on this 'REST-based
Middleware' architecture. Looking forward to more discussions!
Best regards, Tanya Rai
On Wed, 18 Mar 2026 at 21:40, Mike Carey <[email protected]> wrote:
Welcome! Entry point wise, this feature should probably live on the
outside of AsterixDB looking in, using its REST-based API to issue its
requests for catalog info (via SQL++ queries) and to submit queries and
get their results. Suryaa can probably comment more on this - he is
working on a revised UI that this would be envisioned as a part of.
Cheers,
Mike
On 3/18/26 5:16 AM, Tanya Rai wrote:
Hello AsterixDB Team,
I am Tanya Rai, a 2nd-year B.Tech CSE student. I am very interested in
the Natural Language to SQL (NL2SQL) project for GSoC 2026.
I have successfully set up the project on my Windows machine. To help
others, I have submitted my first PR which adds a Windows
Troubleshooting Guide: https://github.com/apache/asterixdb/pull/39.
I am now diving into QueryServiceServlet and the SQL++ grammar to
understand the best entry point for the NL2SQL layer. I look forward
to contributing more!