Response inline.

On Thursday, May 9, 2019 at 3:21:47 PM UTC, Jérôme Mainaud wrote:
>
> OK, I'm not surprised by the SB-Tree insert cost increase, as the
> complexity of adding a key to such a tree is O(log(n)).
>
> For your first case, I see no solution other than building an index, but
> you can do it with a UNIQUE_HASH_INDEX. If the implementation is good,
> adding a key should take constant time on average (some insertions are
> occasionally more expensive, when the index storage has to grow).
>
>
Tried it. There is no difference. It started at 20,000 items/sec, and after about one and a half days the speed had dropped to 500 items/sec.
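For a sense of scale, here is a rough sketch of what those two rates would mean for the whole build. It assumes the roughly 500 million edge records mentioned in the original post and a constant rate for the entire run, which is a simplification since the observed rate decays over time. The function name is only illustrative.

```python
# Back-of-the-envelope projection: how long would a full index build take
# at a constant insert rate? Assumes ~500 million records (figure taken from
# the original post); real rates drift, so treat the results as rough bounds.

TOTAL_RECORDS = 500_000_000  # approximate edge count from the original post


def build_time_hours(total_records: int, items_per_sec: float) -> float:
    """Hours needed to index total_records at a constant items_per_sec."""
    return total_records / items_per_sec / 3600


for rate in (20_000, 500):  # initial and degraded rates reported above
    hours = build_time_hours(TOTAL_RECORDS, rate)
    print(f"{rate:>6} items/sec -> {hours:8.1f} h ({hours / 24:.1f} days)")
```

At the initial rate the whole build would take about 7 hours; at 500 items/sec it would need more than 11 days.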
> For other cases, have you tried querying directly from the vertex?
>
> Suppose we have this data:
> create class Person extends V;
> create property Person.name string;
>
> create class Company extends V;
> create property Company.name string;
>
> create class WorkedAt extends E;
>
> /* Add constraints on the edge. */
> create property WorkedAt.out link Person;
> create property WorkedAt.in link Company;
>
> insert into Person (name) values ('jerome');
> insert into Person (name) values ('john doe');
>
> insert into Company (name) values ('Zeenea');
> insert into Company (name) values ('Ippon Technologies');
> insert into Company (name) values ('Klee Group');
> insert into Company (name) values ('World Big Company');
>
> create edge WorkedAt from (select from Person where name = 'jerome') to
> (select from Company where name = 'Zeenea');
> create edge WorkedAt from (select from Person where name = 'jerome') to
> (select from Company where name = 'Ippon Technologies');
> create edge WorkedAt from (select from Person where name = 'jerome') to
> (select from Company where name = 'Klee Group');
> create edge WorkedAt from (select from Person where name = 'john doe') to
> (select from Company where name = 'World Big Company');
>
> *Use case 2*
> I can count outgoing links from Person with this query:
>
> orientdb {db=tdb}> select name, out('WorkedAt').size() from Person
>
> +----+--------+----------------------+
> |#   |name    |out('WorkedAt').size()|
> +----+--------+----------------------+
> |0   |jerome  |3                     |
> |1   |john doe|1                     |
> +----+--------+----------------------+
>
> Which can be further optimized as (if not already done by the optimizer):
>
> orientdb {db=tdb}> select name, out_WorkedAt.size() from Person
>
> +----+--------+-------------------+
> |#   |name    |out_WorkedAt.size()|
> +----+--------+-------------------+
> |0   |jerome  |3                  |
> |1   |john doe|1                  |
> +----+--------+-------------------+
>
> Those queries use direct links and don't need an index; the last one just
> doesn't need the edge at all.
>
> *Use case 3*
> I can test whether a person works in a company with this query:
>
> orientdb {db=tdb}> select count() from Person where name = 'jerome' and
> out('WorkedAt') contains (name = 'Zeenea')
>
> +----+-------+
> |#   |count()|
> +----+-------+
> |0   |1      |
> +----+-------+
>
> If the count is one or more, the items are linked.
> This query uses direct links and doesn't need an index.
>
> Of course, that's just a way to give you the idea. You have to adapt it to
> your use case.
>
> Last but not least, just don't trust me. Test!
> I don't have billions of edges.
> Give me some feedback if I'm wrong or if I missed something. (I am learning
> while I respond to you.)
>
> my 2 cents,
>
> --
> Jérôme Mainaud
> jer...@mainaud.com
>
>
> On Wed, May 8, 2019 at 23:37, Suhas <suhass...@gmail.com> wrote:
>
>> Hey Jerome,
>>
>> Here are a few reasons why I needed an index:
>>
>> 1. Apply a unique constraint on the edge (no more than a single edge
>> between a pair of vertices).
>> 2. Compute incoming and outgoing edge counts faster.
>> 3. Check whether two vertices are connected or not.
>>
>> Meanwhile, I'm using an SB-Tree index.
>>
>>
>> On Wednesday, May 8, 2019 at 7:15:25 PM UTC, Jérôme Mainaud wrote:
>>>
>>> Hello,
>>>
>>> I don't know the exact implementation used by OrientDB, and it depends
>>> on the type of index you choose.
>>> But it's not a big surprise that the time to insert a key increases with
>>> the number of entries in the index.
>>> Hash indexes should be less sensitive to this cost increase.
>>>
>>> What is the purpose of indexing the in and out keys of your edge?
>>> Queries won't benefit from them, as they use links from the vertex to the
>>> edge to traverse the graph, which is far more efficient.
>>> Tell me if I'm wrong about that.
>>>
>>> --
>>> Jérôme Mainaud
>>> jer...@mainaud.com
>>>
>>>
>>> On Wed, May 8, 2019 at 16:04, Suhas <suhass...@gmail.com> wrote:
>>>
>>>> I'm creating indexes for an Edge class containing about 500 million
>>>> records on keys (in, out). The index creation progressed well in the
>>>> beginning, at about 20,000 items/sec, but after some time it decreased
>>>> to <1000 items/sec.
>>>>
>>>>
>>>> 2019-05-08 08:43:25:885 INFO {db=cgraph} --> 37.00% progress, 177,405,476 indexed so far (855 items/sec) [OIndexRebuildOutputListener]
>>>> 2019-05-08 08:43:35:899 INFO {db=cgraph} --> 37.00% progress, 177,415,347 indexed so far (987 items/sec) [OIndexRebuildOutputListener]
>>>> 2019-05-08 08:43:45:902 INFO {db=cgraph} --> 37.00% progress, 177,427,464 indexed so far (1,211 items/sec) [OIndexRebuildOutputListener]
>>>>
>>>>
>>>> At this speed, it'll take like 3-4 days!!
>>>> Settings used on 16GB RAM and 300GB SSD:
>>>> java -server -Xms2G -Xmx7G -Dstorage.diskCache.bufferSize=7200
>>>>
>>>>
>>>> [image: Screenshot from 2019-05-08 09-06-47.png]
>>>>
>>>> Any idea why the indexing speed decreased so drastically? And how
>>>> can I increase it?
>>>>
>>>> OrientDB 3.0.15
>>>>
>>>> --
>>>>
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "OrientDB" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to orient-...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/orient-database/95597c3e-632b-4570-af51-f07227dc1965%40googlegroups.com
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
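The rebuild log lines in the original post carry enough information to sanity-check the "3-4 days" guess. The sketch below parses them and extrapolates the remaining time. It assumes the printed percentage is accurate to two decimals (it may well be rounded or truncated) and that the rate stays where it was at each sample; the helper names are hypothetical, not part of OrientDB.

```python
import re
from typing import Tuple

# Log lines copied from the original post (rejoined onto single lines).
LOG_LINES = [
    "2019-05-08 08:43:25:885 INFO {db=cgraph} --> 37.00% progress, 177,405,476 indexed so far (855 items/sec) [OIndexRebuildOutputListener]",
    "2019-05-08 08:43:35:899 INFO {db=cgraph} --> 37.00% progress, 177,415,347 indexed so far (987 items/sec) [OIndexRebuildOutputListener]",
    "2019-05-08 08:43:45:902 INFO {db=cgraph} --> 37.00% progress, 177,427,464 indexed so far (1,211 items/sec) [OIndexRebuildOutputListener]",
]

# Captures the progress percentage, the indexed count, and the reported rate.
PROGRESS_RE = re.compile(
    r"([\d.]+)% progress, ([\d,]+) indexed so far \(([\d,]+) items/sec\)"
)


def parse_progress(line: str) -> Tuple[float, int, int]:
    """Return (percent_done, records_indexed, items_per_sec) from one log line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    percent, indexed, rate = match.groups()
    return float(percent), int(indexed.replace(",", "")), int(rate.replace(",", ""))


def eta_days(percent: float, indexed: int, rate: int) -> float:
    """Estimate remaining days, extrapolating the total from the percentage."""
    estimated_total = indexed / (percent / 100)
    remaining = estimated_total - indexed
    return remaining / rate / 86_400  # 86,400 seconds per day


for line in LOG_LINES:
    percent, indexed, rate = parse_progress(line)
    print(f"{rate:>5} items/sec -> about {eta_days(percent, indexed, rate):.1f} days left")
```

With these three samples the estimate ranges from roughly 2.9 to 4.1 days, which is in line with the guess in the post.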