OK, I'm not surprised that the SB-Tree insert cost increases: adding a key
to such a tree has O(log(n)) complexity.

For your first case, I see no other solution than building an index, but you
can do it with a UNIQUE_HASH_INDEX. If the implementation is good, adding a
key should take constant time on average (some insertions are occasionally
more expensive, when the index storage has to grow).

For the other cases, have you tried querying directly from the vertex?

Suppose we have this data:
create class Person extends V;
create property Person.name string;

create class Company extends V;
create property Company.name string;

create class WorkedAt extends E;

/* Add constraints on the edge. */
create property WorkedAt.out link Person;
create property WorkedAt.in link Company;

insert into Person (name) values ('jerome');
insert into Person (name) values ('john doe');

insert into Company (name) values ('Zeenea');
insert into Company (name) values ('Ippon Technologies');
insert into Company (name) values ('Klee Group');
insert into Company (name) values ('World Big Company');

create edge WorkedAt from (select from Person where name = 'jerome') to
(select from Company where name = 'Zeenea');
create edge WorkedAt from (select from Person where name = 'jerome') to
(select from Company where name = 'Ippon Technologies');
create edge WorkedAt from (select from Person where name = 'jerome') to
(select from Company where name = 'Klee Group');
create edge WorkedAt from (select from Person where name = 'john doe') to
(select from Company where name = 'World Big Company');
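
For the first case, a minimal sketch of the unique constraint on this sample
schema could look like the statement below. The index name WorkedAt_out_in is
only illustrative, and the statement assumes the out and in properties are
declared on the edge class, as done above.

```sql
/* Assumption: the index name is illustrative.
   Allows at most one WorkedAt edge per (out, in) pair. */
create index WorkedAt_out_in on WorkedAt (out, in) UNIQUE_HASH_INDEX;
```

With such an index in place, creating a second WorkedAt edge between the same
Person and Company should be rejected as a duplicate key.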

*Use case 2*
I can count the outgoing links from Person with this query:

orientdb {db=tdb}> select name, out('WorkedAt').size() from Person

+----+--------+----------------------+
|#   |name    |out('WorkedAt').size()|
+----+--------+----------------------+
|0   |jerome  |3                     |
|1   |john doe|1                     |
+----+--------+----------------------+

This can be further optimized (if the optimizer does not already do it) as:

orientdb {db=tdb}> select name, out_WorkedAt.size() from Person

+----+--------+-------------------+
|#   |name    |out_WorkedAt.size()|
+----+--------+-------------------+
|0   |jerome  |3                  |
|1   |john doe|1                  |
+----+--------+-------------------+

These queries use direct links and don't need an index; the last one
doesn't need the edges at all.
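
The same approach should work for incoming edges. As a sketch on the sample
data above, using the in() function in place of out():

```sql
/* Count incoming WorkedAt edges per Company, following direct links. */
select name, in('WorkedAt').size() from Company;
```

With the sample data, every company should report 1, since each one is the
target of a single edge.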

*Use case 3*
I can test whether a person works at a company with this query:

orientdb {db=tdb}> select count() from Person where name = 'jerome' and
out('WorkedAt') contains (name = 'Zeenea')

+----+-------+
|#   |count()|
+----+-------+
|0   |1      |
+----+-------+

If the count is one or more, the items are linked.
This query uses direct links and doesn't need an index.
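
The same check could probably also be written as a MATCH statement, which is
available in OrientDB 3.0. This is an untested sketch on the sample data; the
aliases p and c are arbitrary.

```sql
/* Returns one row if jerome has a WorkedAt edge to Zeenea, no row otherwise. */
match {class: Person, as: p, where: (name = 'jerome')}
  .out('WorkedAt') {class: Company, as: c, where: (name = 'Zeenea')}
return p.name as person, c.name as company;
```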

Of course, that is just a way to give you the idea. You have to adapt it to
your use case.

Last but not least, just don't trust me. Test!
I don't have billions of edges.
Give me some feedback if I'm wrong or if I missed something. (I am learning
while I respond to you.)

my 2 cents,

-- 
Jérôme Mainaud
jer...@mainaud.com


On Wed, May 8, 2019 at 23:37, Suhas <suhassumu...@gmail.com> wrote:

> Hey Jerome,
>
> Here are a few reasons why I needed an index:
>
> 1. Apply unique constraint on the edge. (no more than a single edge
> between a pair of vertices)
> 2. Compute incoming and outgoing edge count faster.
> 3. Whether two vertices are connected or not.
>
> Meanwhile, I'm using an SB-Tree Index
>
>
> On Wednesday, May 8, 2019 at 7:15:25 PM UTC, Jérôme Mainaud wrote:
>>
>> Hello,
>>
>> I don't know the exact implementation used by OrientDB, and it depends on
>> the type of index you choose.
>> But it's not a big surprise that the time to insert a key increases with
>> the number of entries in the index.
>> Hash indexes should be less sensitive to this cost increase.
>>
>> What is the purpose of indexing the in and out keys of your edge?
>> Queries won't benefit from them as they use links from vertex to the edge
>> to traverse the graph which is far more efficient.
>> Tell me if I'm wrong about that.
>>
>> --
>> Jérôme Mainaud
>> jer...@mainaud.com
>>
>>
>> On Wed, May 8, 2019 at 16:04, Suhas <suhass...@gmail.com> wrote:
>>
>>> I’m creating an index on the keys (in, out) of an Edge class containing
>>> about 500 million records. Index creation progressed well at the
>>> beginning, at about 20,000 items/sec, but after some time it decreased
>>> to <1000 items/sec.
>>>
>>>
>>> 2019-05-08 08:43:25:885 INFO  {db=cgraph} --> 37.00% progress, 177,405,476 
>>> indexed so far (855 items/sec) [OIndexRebuildOutputListener]
>>> 2019-05-08 08:43:35:899 INFO  {db=cgraph} --> 37.00% progress, 177,415,347 
>>> indexed so far (987 items/sec) [OIndexRebuildOutputListener]
>>> 2019-05-08 08:43:45:902 INFO  {db=cgraph} --> 37.00% progress, 177,427,464 
>>> indexed so far (1,211 items/sec) [OIndexRebuildOutputListener]
>>>
>>>
>>> At this speed, it’ll take like 3-4 days!!
>>> Settings used on 16GB RAM and 300GB SSD
>>> java -server -Xms2G -Xmx7G -Dstorage.diskCache.bufferSize=7200
>>>
>>>
>>> [image: Screenshot from 2019-05-08 09-06-47.png]
>>>
>>> Any idea why the speed of indexing decreased so drastically? And how can
>>> I increase the speed of indexing?
>>>
>>> Orientdb 3.0.15
>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to orient-...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/orient-database/95597c3e-632b-4570-af51-f07227dc1965%40googlegroups.com
>>> <https://groups.google.com/d/msgid/orient-database/95597c3e-632b-4570-af51-f07227dc1965%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
