Hm, actually that doesn't look like the queries that Spark uses to test existence, which will be "SELECT 1 ... LIMIT 1" or "SELECT * ... WHERE 1=0" depending on the dialect. What version, and are you sure something else is not sending those queries?
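If it really is Spark's table-existence probe, that SQL comes from the JDBC dialect matched to your connection URL, and you can swap it out by registering a custom dialect. Below is a minimal, untested sketch against the Spark 3.x Java API; the class name TeradataExistsDialect and the TOP 1 query text are my own illustration, not something Spark ships:

import org.apache.spark.sql.jdbc.JdbcDialect;
import org.apache.spark.sql.jdbc.JdbcDialects;

// Illustrative dialect: only the existence-check query is overridden.
public class TeradataExistsDialect extends JdbcDialect {
  @Override
  public boolean canHandle(String url) {
    // Claim only Teradata JDBC URLs so other sources keep their defaults.
    return url.toLowerCase().startsWith("jdbc:teradata");
  }

  @Override
  public String getTableExistsQuery(String table) {
    // Ask the database for at most one row; Teradata uses TOP rather than LIMIT.
    return "SELECT TOP 1 1 FROM " + table;
  }
}

Register it once before calling spark.read().jdbc(...):

JdbcDialects.registerDialect(new TeradataExistsDialect());

Either way, knowing the Spark version would tell us which default existence query you should expect to see.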
On Thu, Nov 17, 2022 at 11:02 AM Ramakrishna Rayudu <
ramakrishna560.ray...@gmail.com> wrote:

> Hi Sean,
>
> Thanks for your response. I think it does have a performance impact,
> because if the query touches one million rows then the response itself
> carries one million rows unnecessarily, like below:
>
> 1
> 1
> 1
> 1
> .
> .
> 1
>
> That impacts performance. Is there any alternate solution for this?
>
> Thanks,
> Rama
>
>
> On Thu, Nov 17, 2022, 10:17 PM Sean Owen <sro...@gmail.com> wrote:
>
>> This is a query to check the existence of the table upfront.
>> It is nearly a no-op query; can it have a perf impact?
>>
>> On Thu, Nov 17, 2022 at 10:42 AM Ramakrishna Rayudu <
>> ramakrishna560.ray...@gmail.com> wrote:
>>
>>> Hi Team,
>>>
>>> I am facing one issue. Can you please help me with it?
>>>
>>> We are connecting to Teradata from Spark SQL with the below API:
>>>
>>> Dataset<Row> jdbcDF = spark.read().jdbc(connectionUrl, tableQuery,
>>> connectionProperties);
>>>
>>> When we execute the above logic on a large table with a million rows,
>>> we see the extra query below executing every time, and it results in a
>>> performance hit on the DB.
>>>
>>> This information came from our DBA; we don't have any logs on the
>>> Spark SQL side.
>>>
>>> SELECT 1 FROM ONE_MILLION_ROWS_TABLE;
>>>
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>>
>>> Can you please clarify why this query is executing, or is there any
>>> chance that this type of query is executing from our code itself while
>>> checking the row count from the dataframe?
>>>
>>> Please provide me your inputs on this.
>>>
>>>
>>> Thanks,
>>>
>>> Rama
>>
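On the last question in the quoted mail: a plain count() on the dataframe is a plausible source of that statement. With the spark.read().jdbc() path, a count needs no columns from the table, and as far as I recall the generated JDBC scan then uses a literal "1" as its select list, so the database would see something like SELECT 1 FROM ONE_MILLION_ROWS_TABLE and stream one row per table row back. A quick way to separate the two effects (sketch only, reusing the same variables as in your snippet):

// The read itself should only trigger the cheap existence/schema probes.
Dataset<Row> jdbcDF = spark.read().jdbc(connectionUrl, tableQuery, connectionProperties);

// If the DBA sees the flood of 1s only after this line, it is the count,
// not the table-existence check, that scans the table.
long rows = jdbcDF.count();

If that turns out to be the cause, pushing the count down yourself (e.g. reading "(SELECT COUNT(*) AS CNT FROM ONE_MILLION_ROWS_TABLE) t" as the table) avoids shipping a million rows of 1s over the wire.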