We are using Spark version 2.4.4.
I can see two types of queries in the DB logs:

SELECT 1 FROM (INPUT_QUERY) SPARK_GEN_SUB_0

SELECT * FROM (INPUT_QUERY) SPARK_GEN_SUB_0 WHERE 1=0

The query that starts with `SELECT *` ends with `WHERE 1=0`, but the query
that starts with `SELECT 1` has no WHERE condition.
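
To make the two shapes concrete, here is a minimal sketch in plain Java (it is
not Spark's actual code) that builds both strings from an input query. The
class and method names and the columns `ID` and `NAME` are hypothetical; the
alias `SPARK_GEN_SUB_0` is copied from the DB logs above. It rests on one
assumption that would need to be confirmed: that the `SELECT 1` form is
produced when no columns are requested, for example for a plain row count.

```java
// Illustrative only: reproduces the two query shapes seen in the DB logs.
// QueryShapes, schemaProbe and scan are hypothetical names, not Spark APIs.
final class QueryShapes {
    // Alias copied from the DB logs.
    static final String ALIAS = "SPARK_GEN_SUB_0";

    // Zero-row query: 1=0 is never true, so only column metadata comes back.
    static String schemaProbe(String inputQuery) {
        return "SELECT * FROM (" + inputQuery + ") " + ALIAS + " WHERE 1=0";
    }

    // Scan query: selects the requested columns, or the constant 1 when none
    // are requested (assumption: this is what happens for a row count).
    static String scan(String inputQuery, String... columns) {
        String columnList = columns.length == 0 ? "1" : String.join(",", columns);
        return "SELECT " + columnList + " FROM (" + inputQuery + ") " + ALIAS;
    }

    public static void main(String[] args) {
        String q = "SELECT * FROM ONE_MILLION_ROWS_TABLE";
        System.out.println(schemaProbe(q));        // schema probe, returns no rows
        System.out.println(scan(q));               // one constant per row
        System.out.println(scan(q, "ID", "NAME")); // hypothetical columns
    }
}
```

If that assumption holds, the `SELECT 1` query returns one row per row of the
input query, which would match the one million `1` values reported by the DBA.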

Thanks,
Rama

On Thu, Nov 17, 2022, 10:39 PM Sean Owen <sro...@gmail.com> wrote:

> Hm, actually that doesn't look like the queries that Spark uses to test
> existence, which will be "SELECT 1 ... LIMIT 1" or "SELECT * ... WHERE 1=0"
> depending on the dialect. What version, and are you sure something else is
> not sending those queries?
>
> On Thu, Nov 17, 2022 at 11:02 AM Ramakrishna Rayudu <
> ramakrishna560.ray...@gmail.com> wrote:
>
>> Hi Sean,
>>
>> Thanks for your response. I think it has a performance impact, because if
>> the query returns one million rows, then the response itself will contain
>> one million rows unnecessarily, like below.
>>
>> 1
>> 1
>> 1
>> 1
>> .
>> .
>> 1
>>
>>
>> This impacts performance. Is there any alternative solution for this?
>>
>> Thanks,
>> Rama
>>
>>
>> On Thu, Nov 17, 2022, 10:17 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> This is a query to check the existence of the table upfront.
>>> It is nearly a no-op query; can it have a perf impact?
>>>
>>> On Thu, Nov 17, 2022 at 10:42 AM Ramakrishna Rayudu <
>>> ramakrishna560.ray...@gmail.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> I am facing an issue. Can you please help me with it?
>>>>
>>>> We are connecting to Teradata from Spark SQL with the API below:
>>>>
>>>> Dataset<Row> jdbcDF = spark.read().jdbc(connectionUrl, tableQuery, 
>>>> connectionProperties);
>>>>
>>>> When we execute the above logic on a large table with a million rows, we
>>>> see the extra query below executing every time, which results in a
>>>> performance hit on the DB.
>>>>
>>>> We got the information below from our DBA. We don't have any logs on the
>>>> Spark SQL side.
>>>>
>>>> SELECT 1 FROM ONE_MILLION_ROWS_TABLE;
>>>>
>>>> 1
>>>> 1
>>>> 1
>>>> 1
>>>> 1
>>>> 1
>>>> 1
>>>> 1
>>>> 1
>>>>
>>>> Can you please clarify why this query is executed? Is there any chance
>>>> that this type of query is issued by our own code, for example when
>>>> checking the row count of the DataFrame?
>>>>
>>>> Please share your inputs on this.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Rama
>>>>
>>>
