To add more detail about the problem:

We have 500+ queries/APIs built on 200+ vertex-based schemas. All the
queries were written optimally, with the right indexes, so that under
normal conditions their response time is under 50 ms. Most queries respond
in less than 20 ms. Please refer to the screenshot shared before.

All of a sudden, one of these queries freezes in the database indefinitely,
and all subsequent queries fired from the application also freeze
indefinitely. This increases the number of concurrent connections to the
database, with none of the queries responding. Eventually the database
reaches its maximum connection limit and stops accepting new connections.
On the database server, CPU and memory remain stable; there is only a very
slight increase in CPU (due to the high number of concurrent connections).
This suggests the queries are not being executed in the database and are
instead waiting for a resource or lock.
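
One way to confirm that threads are waiting on a lock is a thread dump of
the server JVM. Below is a minimal sketch using only the standard
java.lang.management API. It assumes the code runs inside the same JVM as
the database server; the class and method names (BlockedThreadReport,
waitingThreads, deadlockedThreadIds) are hypothetical and not part of
OrientDB. Running `jstack <pid>` against the server process gives similar
information without any code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an OrientDB class: lists threads stuck on a lock.
final class BlockedThreadReport {

    /** One line per thread that is blocked or waiting on a named lock. */
    static List<String> waitingThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // false, false: skip monitor/synchronizer details, keep the dump cheap
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            Thread.State state = info.getThreadState();
            boolean stuck = state == Thread.State.BLOCKED
                    || state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING;
            // getLockName() is null for sleeping threads, so those are skipped
            if (stuck && info.getLockName() != null) {
                report.add(info.getThreadName() + " is " + state
                        + " on " + info.getLockName()
                        + " (owner: " + info.getLockOwnerName() + ")");
            }
        }
        return report;
    }

    /** IDs of threads in a deadlock cycle; empty array if none is found. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        for (String line : waitingThreads()) {
            System.out.println(line);
        }
        System.out.println("deadlocked threads: " + deadlockedThreadIds().length);
    }
}
```

If many threads show the same lock and the same owner while the server is
frozen, the owner thread's stack trace is the place to look first.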

To bring the server back to normal, we have to stop the database (thus
killing the connections) and start it again to regain access. This happens
very frequently, and sometimes the index crashes during the restart, so we
have to restore the database from backup.

We log every query that is executed. After bouncing the server, we re-ran
the frozen queries (same query with the same parameters); they executed
normally and responded with the usual latency (10 - 20 ms). We tried the
first query that froze and some random queries from the set of frozen
queries; all executed as expected.

When the database goes into this frozen state, even a simple query that is
supposed to pick a single record by primary ID also freezes. We have no
clue why the database freezes all of a sudden.

We have been using OrientDB for the last 5 years and have never faced such
a situation.

We tried passing a timeout argument with every read query (with a timeout
of 5000 ms). We also lowered record.locktimeout, various network-level
timeouts, the session timeout, the connection timeout, etc. None of this
helped: the queries do not time out. The connection breaks and the
application gets a SocketTimeoutException, but the connection/query seems
to stay in a frozen/locked state on the database side and prevents new
connections.

We tried to kill the connections using the "Kill" and "interrupt" commands;
both failed. The command just hangs, waiting for a response from the server
for the first connections.


We are currently rebuilding the indexes for the entire database in one go
as a last resort.

We are a startup and built our entire product on OrientDB. Because of this
issue, our service has been down for the last 5 days; we are losing our
customers' trust and are facing a major crisis.

Please help us identify the root cause and overcome the issue.

Regards,
Ram







On Wed, Aug 21, 2019 at 7:36 PM Ram Karthik <ramkarthik.m...@gmail.com>
wrote:

> Hi  Luigi,
>
> Thanks for your reply.
>
> Database size 40GB
>
> Typical workload  - 50 TPS
>
> Server logs and a sample schema are added below.
> We have around 200+ schemas.
>
> Thanks in advance.
>
>
> On Wed, Aug 21, 2019 at 6:58 PM Luigi Dell'Aquila <
> luigi.dellaqu...@gmail.com> wrote:
>
>> Hi Ram
>>
>> It's hard to give you a quick solution with so little information.
>> V 2.0 is EOL, so we are unlikely to release a community patch, but we can
>> try to troubleshoot the problem and see if we can work around it.
>> Can you provide a bit more information, e.g. server logs, typical
>> workload, some information about DB size and schema?
>>
>> Thanks
>>
>> Luigi
>>
>>
>> On Wed, Aug 21, 2019 at 14:57, Ram Karthik <
>> ramkarthik.m...@gmail.com> wrote:
>>
>>> We are using OrientDB ver 2.0.18, and we have been facing a critical
>>> issue for the past 5 days. We are facing the following issues:
>>>
>>>    1. The OrientDB server is frequently unreachable.
>>>    2. We are not able to shut down the server. We are forced to kill the
>>>    DB.
>>>    3. Sometimes the index crashes.
>>>
>>> The above issues occur when we open up traffic to our application.
>>>
>>> This is a very critical issue; many users are unable to use the
>>> application because of it. We depend on OrientDB, so this is causing us
>>> many problems.
>>>
>>> Please help us to resolve this issue soon.
>>>
>>> Thanks,
>>> Ram
>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to orient-database+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/orient-database/80c7ea2d-4214-432a-9679-5e09a4a1cc99%40googlegroups.com
>>> <https://groups.google.com/d/msgid/orient-database/80c7ea2d-4214-432a-9679-5e09a4a1cc99%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
>
