I've been meaning to write and publish something on this, since I had the same question and it seems to come up often.

There are essentially two important things you need to know about Calcite to work with federated data:

1. Calcite schemas work like filesystem hierarchies. You have a "rootSchema", which is like the system root. Datasources you want to query need to be added as child schemas of this rootSchema.

   In the case of a standard DB (Postgres/MySQL), what you get may not be one schema but a "tree" of schemas. What I mean by this is that there are wrapper/utility methods that give you back a Calcite representation of the database structure as schemas.

   For a Postgres DB that has a structure like:

         my_postgres_db1
          /          \
       public      schema_2

   you'll get a Schema that has two child schemas. To do this, you use JdbcCatalogSchema:

   JdbcCatalogSchema (Apache Calcite API)
   <https://calcite.apache.org/javadocAggregate/org/apache/calcite/adapter/jdbc/JdbcCatalogSchema.html>

   For a DB like MySQL, where a "database" and a "schema" are the same thing (so there is no separate level of schemas underneath a database), you'd instead use JdbcSchema:

   JdbcSchema (Apache Calcite API)
   <https://calcite.apache.org/javadocAggregate/org/apache/calcite/adapter/jdbc/JdbcSchema.html>

2. Don't close your Calcite Connection object. It's stateful, and whatever you've done to add new datasources to it will vanish if the connection is closed. This is especially important to keep in mind because it's AutoCloseable. Don't use try-with-resources blocks with it, or call .close(), for as long as you still need those datasources.

You can see an example of some of this stuff here:
https://github.com/GavinRay97/GraphQLCalcite/blob/92b18a850d5546403b9760d96dc5b3a59533f3ca/src/main/kotlin/calcite/CalciteSchemaManager.kt#L26-L60

Please ignore "addDatabase" manually iterating schemas and not using JdbcCatalogSchema. Also, the methods which run manual RelNode expressions instead of SQL aren't necessary. I asked about this, and Julian said that running SQL through a java.sql.Connection object amounts to the same thing.

Hope this overview helps, let me know if anything is not clear.
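
If a concrete picture of point 1 helps, here's a toy model of the "schemas are like a filesystem" idea in plain Java. To be clear, this is not Calcite's API: ToySchema, the table names (users, orders, events) and the csv_files schema are all made up for illustration. In real Calcite the rough analogue of add() below is SchemaPlus.add(name, schema) on the root schema, with JdbcCatalogSchema or JdbcSchema as the thing being added.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only, NOT Calcite's API: a root schema holds child schemas,
// which hold tables, the same way directories hold files.
final class ToySchema {
    final String name;
    final Map<String, ToySchema> children = new LinkedHashMap<>();
    // table name -> label of the backing datasource (hypothetical)
    final Map<String, String> tables = new LinkedHashMap<>();

    ToySchema(String name) {
        this.name = name;
    }

    // Add (or fetch) a child schema; loosely comparable to adding a
    // sub-schema under Calcite's root schema.
    ToySchema add(String childName) {
        return children.computeIfAbsent(childName, ToySchema::new);
    }

    ToySchema addTable(String tableName, String source) {
        tables.put(tableName, source);
        return this;
    }

    // Resolve "schema.subschema.table" by walking the tree, like a file path.
    String resolve(String qualifiedName) {
        String[] parts = qualifiedName.split("\\.");
        ToySchema current = this;
        for (int i = 0; i < parts.length - 1; i++) {
            current = current.children.get(parts[i]);
            if (current == null) {
                throw new IllegalArgumentException(
                    "No schema '" + parts[i] + "' in " + qualifiedName);
            }
        }
        String source = current.tables.get(parts[parts.length - 1]);
        if (source == null) {
            throw new IllegalArgumentException("No table for " + qualifiedName);
        }
        return source;
    }

    public static void main(String[] args) {
        ToySchema root = new ToySchema("root");

        // A Postgres-style database shows up as a subtree of schemas.
        ToySchema pg = root.add("my_postgres_db1");
        pg.add("public").addTable("users", "postgres");
        pg.add("schema_2").addTable("orders", "postgres");

        // A flat datasource shows up as a single child schema.
        root.add("csv_files").addTable("events", "csv");

        System.out.println(root.resolve("my_postgres_db1.public.users"));
        System.out.println(root.resolve("csv_files.events"));
    }
}
```

The point is just that one root can hold several unrelated datasources side by side, and a fully qualified table name picks out which one a given table lives in.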

Also, if anyone else has advice on this topic, I'd be keen to hear it too (or if I've made mistakes here).

On Tue, Aug 2, 2022 at 2:58 PM Pranav Deshpande <deshpande.v.pra...@gmail.com> wrote:

> Hi Team,
> Say that I have a query which references 5 tables, and 3 of these are in
> Cassandra and 1 is a CSV file. Out of these, the cassandra tables are split
> across 2 different cassandra DB instances.
>
> How does calcite federation work in this case?
>
> Do I have to instantiate multiple connections? Is there an example for
> this? How does calcite know which tablescan/filter etc. to call?
>
> Regards,
> Pranav
>