Hm, actually that doesn't look like the queries Spark uses to test
existence, which are "SELECT 1 ... LIMIT 1" or "SELECT * ... WHERE 1=0"
depending on the dialect. What version are you on, and are you sure
something else isn't sending those queries?
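
If it does turn out to be Spark's existence check, one possible workaround is
to register a custom JdbcDialect that overrides the probe query. The sketch
below uses Spark's public JdbcDialect API, but it is untested and illustrative
only; the anonymous dialect and the WHERE 1=0 probe are example choices, not
necessarily what the built-in Teradata dialect does.

import org.apache.spark.sql.jdbc.JdbcDialect;
import org.apache.spark.sql.jdbc.JdbcDialects;

// Illustrative only: route Teradata URLs through a dialect whose
// existence probe matches no rows instead of one row per table row.
JdbcDialect cheapProbeDialect = new JdbcDialect() {
  @Override
  public boolean canHandle(String url) {
    return url.toLowerCase().startsWith("jdbc:teradata");
  }

  @Override
  public String getTableExistsQuery(String table) {
    // WHERE 1=0 matches no rows, so the probe returns an empty result set.
    return "SELECT * FROM " + table + " WHERE 1=0";
  }
};
JdbcDialects.registerDialect(cheapProbeDialect);

A dialect registered this way takes precedence over the built-in one for any
URL its canHandle accepts, so subsequent spark.read().jdbc(...) calls against
Teradata would go through the cheaper probe.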

On Thu, Nov 17, 2022 at 11:02 AM Ramakrishna Rayudu <
ramakrishna560.ray...@gmail.com> wrote:

> Hi Sean,
>
> Thanks for your response. I think it has a performance impact because if
> the table has one million rows, then the response itself will contain one
> million rows unnecessarily, like below.
>
> 1
> 1
> 1
> 1
> .
> .
> 1
>
>
> It impacts the performance. Is there any alternate solution for this?
>
> Thanks,
> Rama
>
>
> On Thu, Nov 17, 2022, 10:17 PM Sean Owen <sro...@gmail.com> wrote:
>
>> This is a query to check the existence of the table upfront.
>> It is nearly a no-op query; can it have a perf impact?
>>
>> On Thu, Nov 17, 2022 at 10:42 AM Ramakrishna Rayudu <
>> ramakrishna560.ray...@gmail.com> wrote:
>>
>>> Hi Team,
>>>
>>> I am facing one issue. Can you please help me on this.
>>>
>>> We are connecting to Teradata from Spark SQL with the below API:
>>>
>>> Dataset<Row> jdbcDF = spark.read().jdbc(connectionUrl, tableQuery, connectionProperties);
>>>
>>> When we execute the above logic on a large table with a million rows, we
>>> see the below extra query executing every time, which results in a
>>> performance hit on the DB.
>>>
>>> We got the below information from the DBA. We don't have any logs on
>>> Spark SQL.
>>>
>>> SELECT 1 FROM ONE_MILLION_ROWS_TABLE;
>>>
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>> 1
>>>
>>> Can you please clarify why this query is executing, or is there any
>>> chance that this type of query is being issued from our own code while
>>> checking the row count of the DataFrame?
>>>
>>> Please provide me your inputs on this.
>>>
>>>
>>> Thanks,
>>>
>>> Rama
>>>
>>
