Hello,

Is there a way in Spark to define a data source once (say, a JDBC source) and then declare the list of tables to be used against that data source? Something like a JDBC connection, where we define the connection once and then execute statements against it. In the current external table implementation, each table requires the complete data source information (URL, credentials, etc.).
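To make the pain point concrete, this is roughly what per-table registration looks like for me today (a sketch in Scala; the hosts, credentials and table names are made up):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-tables").getOrCreate()

// Every table repeats the full connection details, even though both
// tables live on the same database.
val orders = spark.read.format("jdbc")
  .option("url", "jdbc:postgresql://db1-host:5432/database1")  // made-up URL
  .option("dbtable", "orders")                                 // made-up table
  .option("user", "spark")
  .option("password", "secret")
  .load()
orders.createOrReplaceTempView("orders")

val customers = spark.read.format("jdbc")
  .option("url", "jdbc:postgresql://db1-host:5432/database1")  // same URL again
  .option("dbtable", "customers")
  .option("user", "spark")
  .option("password", "secret")
  .load()
customers.createOrReplaceTempView("customers")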
The use case is something like this: I have n tables on database1 and m tables on database2, and a complex query that combines tables from both. Can Spark SQL decompose that complex query into data-source-specific queries, say a source-1 query (over the n tables) executed on database1 and a source-2 query (over the m tables) executed on database2, with the two result sets then joined/merged in the Spark layer to produce the final output? And will pushdown optimization work at the data source level, or only at the level of each separate external table? A rough sketch of the setup I have in mind is below.
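Sketch in Scala with made-up hosts, table and column names; the helper registerJdbcTables is just something I wrote for illustration, not an existing Spark API:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("federated-join").getOrCreate()

// Illustrative helper (not a Spark API): register several tables against one
// connection so the URL/credentials are written only once per database.
def registerJdbcTables(url: String, tables: Seq[String]): Unit =
  tables.foreach { t =>
    spark.read.format("jdbc")
      .option("url", url)
      .option("dbtable", t)
      .option("user", "spark")        // made-up credentials
      .option("password", "secret")
      .load()
      .createOrReplaceTempView(t)
  }

// n tables on database1, m tables on database2 (names made up)
registerJdbcTables("jdbc:postgresql://db1-host/database1", Seq("customers", "orders"))
registerJdbcTables("jdbc:mysql://db2-host/database2", Seq("products", "inventory"))

// A query combining tables from both databases. As far as I understand,
// Spark performs the join itself and only pushes filters / column pruning
// down into each individual JDBC relation, which is exactly my question.
val result = spark.sql("""
  SELECT c.name, p.title, o.amount
  FROM orders o
  JOIN customers c ON o.customer_id = c.id
  JOIN products  p ON o.product_id  = p.id
  WHERE o.amount > 100
""")
result.show()

Thanks,
Sathish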