I think we can invite them to our virtual meetup and share details. Your thoughts?
Thu, Oct 28, 2021 at 10:15, Ivan Pavlukhin <vololo...@gmail.com>:

> Hi Maximiliano,
>
> Thank you for pointing this out, rather interesting. Have you tried to communicate with the hawkore team? I doubt that anyone in the community knows the implementation details of the hawkore additions.
>
> 2021-10-22 19:58 GMT+03:00, Maximiliano Gazquez <maximiliano....@gmail.com>:
>> Hello everyone!
>>
>> I wanted to add this to the discussion.
>> I've found this project https://github.com/hawkore/ignite-hk which promises to solve most of the issues that are being discussed here, like pagination, sorting and, most important, persisting the Lucene index.
>>
>> It does stuff like this to create indexes:
>>
>> CREATE INDEX PERSON_LUCENE_IDX ON "PUBLIC".PERSON(LUCENE)
>> FULLTEXT '{
>>   ''refresh_seconds'':''60'',
>>   ''directory_path'':'''',
>>   ''ram_buffer_mb'':''10'',
>>   ''max_cached_mb'':''-1'',
>>   ''partitioner'':''{"type":"token","partitions":10}'',
>>   ''optimizer_enabled'':''true'',
>>   ''optimizer_schedule'':''0 1 * * *'',
>>   ''version'':''0'',
>>   ''schema'':''{
>>     "default_analyzer":"english",
>>     "analyzers":{"my_custom_analyzer":{"type":"snowball","language":"Spanish","stopwords":"el,la,lo,loas,las,a,ante,bajo,cabe,con,contra"}},
>>     "fields":{
>>       "duration":{"type":"date_range","from":"start_date","to":"stop_date","validated":false,"pattern":"yyyy/MM/dd"},
>>       "place":{"type":"geo_point","latitude":"latitude","longitude":"longitude"},
>>       "date":{"type":"date","validated":true,"pattern":"yyyy/MM/dd"},
>>       "number":{"type":"integer","validated":false,"boost":1.0},
>>       "gender":{"type":"string","validated":true,"case_sensitive":true},
>>       "bool":{"type":"boolean","validated":false},
>>       "phrase":{"type":"text","validated":false,"analyzer":"my_custom_analyzer"},
>>       "name":{"type":"string","validated":false,"case_sensitive":true},
>>       "animal":{"type":"string","validated":false,"case_sensitive":true},
>>       "age":{"type":"integer","validated":false,"boost":1.0},
>>       "food":{"type":"string","validated":false,"case_sensitive":true}
>>     }
>>   }''
>> }';
>>
>> And this to use that Lucene index from inside SQL:
>>
>> SELECT * FROM "test".user
>> WHERE lucene = '{ query : {
>>     type : "boolean",
>>     must : [{type : "wildcard", field : "name", value : "J*"},
>>             {type : "wildcard", field : "food", value : "tu*"}]}}';
>>
>> More examples here:
>> https://github.com/hawkore/examples-apache-ignite-extensions/tree/master/examples-advanced-ignite-indexing
>>
>> I don't have anything to do with that company, but it would be great to know how they implemented this stuff.
>>
>> On Mon, Aug 9, 2021 at 3:00 AM Ivan Pavlukhin <vololo...@gmail.com> wrote:
>>> Hi Atri,
>>>
>>> Sorry for the late answer.
>>>
>>>> I didn't quite understand. Are you proposing that Ignite should not have FTS capabilities?
>>>
>>> It seems an option to me. IMHO it is better to have no FTS than something like the current Ignite TextQueries.
>>>
>>> 2021-08-03 12:45 GMT+03:00, Atri Sharma <a...@apache.org>:
>>>> Hi Ivan,
>>>>
>>>> I didn't quite understand. Are you proposing that Ignite should not have FTS capabilities?
>>>>
>>>> Atri
>>>>
>>>> On Tue, Aug 3, 2021 at 2:57 PM Ivan Pavlukhin <vololo...@gmail.com> wrote:
>>>>> Hi Atri,
>>>>>
>>>>> My main concern is non-maleficence. Every task has several solutions, e.g. straightforward ones:
>>>>> 1. Do not implement FTS.
>>>>> 2. Create own implementation.
>>>>>
>>>>> Some of the strongest ones live without FTS [1].
>>>>>
>>>>> [1] https://github.com/cockroachdb/cockroach/issues/7821
>>>>>
>>>>> 2021-08-02 11:33 GMT+03:00, Atri Sharma <a...@apache.org>:
>>>>>> Hi Ivan,
>>>>>>
>>>>>> Would you like to propose an alternative to Lucene?
>>>>>>
>>>>>> Atri
>>>>>>
>>>>>> On Mon, 2 Aug 2021, 13:48 Ivan Pavlukhin, <vololo...@gmail.com> wrote:
>>>>>>> Folks,
>>>>>>>
>>>>>>> Sorry if I have not read the thread thoroughly enough, but do we consider Lucene the obviously right choice? In my understanding, Ignite history has shown clearly that the "fastest feature implementation" is not usually the best. One example of this is text queries. Aren't we about to make the same mistake again? FTS is a huge feature; I do not believe there is an easy win for it.
>>>>>>>
>>>>>>> 2021-07-27 19:18 GMT+03:00, Atri Sharma <a...@apache.org>:
>>>>>>>> Andrey,
>>>>>>>>
>>>>>>>>> Per-partition Lucene index looks simple to implement, but it may require per-partition SQL to make full-text search expressions work correctly within the SQL query.
>>>>>>>>
>>>>>>>> I think that as long as we follow the map-reduce process that we already use for other queries, we should be fine.
>>>>>>>>
>>>>>>>>> Per-partition SQL index may kill the performance. We already tried to do that in Ignite 2. However, the QueryParallelism feature helps to speed up some data-intensive queries, but hits the performance in simple cases, and at some point (e.g. segments > number of CPUs) the performance rapidly degrades with the increasing number of segments.
>>>>>>>>
>>>>>>>> Yeah, that is always the case, but a global index will be a nightmare in terms of concurrency, and pessimistic concurrency control will anyway kill the benefits, coupled with the metadata requirements.
>>>>>>>> What were the specific issues with the per-partition index?
>>>>>>>>
>>>>>>>>> AFAIK, Lucene widely uses bitmap indices that are easy to merge. Maybe the map-reduce technique underneath FTS expressions and some hacks will add a minimal overhead.
>>>>>>>>
>>>>>>>> Lucene uses many types of indices, but the point here is that per-partition Lucene indices can return docIDs and we can merge them in the reduce phase. So we are abstracted from the specifics of the internal index being used to serve the query.
>>>>>>>>
>>>>>>>>>> As illustrated by Ilya, we can use Ignite's WAL records to rebuild Lucene indices. The important thing here is to not treat Lucene indices as source of truth.
>>>>>>>>>
>>>>>>>>> To use WAL we should either relay Lucene files to our Page memory or be aware of the Lucene file structure. The first looks tricky, as we should guarantee a contiguous address space in Page memory for reflecting a Lucene file. Maybe a separate managed memory segment with its own rules?
>>>>>>>>
>>>>>>>> Why not use Lucene's MMapDirectory and map it to our storage classes?
>>>>>>>>
>>>>>>>>>>> Transactions.
>>>>>>>>>>> * Will we support transactions?
>>>>>>>>>>
>>>>>>>>>> Lucene has no concept of transactions.
>>>>>>>>>
>>>>>>>>> Yes, but we have. The Lucene index may be non-transactional, but users never expect to see uncommitted data. How does this connect with transactional SQL?
>>>>>>>>
>>>>>>>> We could have the Lucene writes done as a part of transactions and ack back only when it succeeds/fails.
>>>>>>>> WDYT?
>>>>>>>>>
>>>>>>>>> On Tue, Jul 27, 2021 at 1:36 PM Atri Sharma <a...@apache.org> wrote:
>>>>>>>>>> Sorry, I planned on creating a Wiki page for this, but it makes more sense to reply here.
>>>>>>>>>>
>>>>>>>>>>> * How can the Lucene index be split among the nodes?
>>>>>>>>>>
>>>>>>>>>> We can have partition-level indices on each node.
>>>>>>>>>>
>>>>>>>>>>> * If we have a single index for all partitions on a particular node, then how will index records be aware of partitioning?
>>>>>>>>>>
>>>>>>>>>> Index records don't need to be aware of partitioning -- each Lucene index is independent.
>>>>>>>>>>
>>>>>>>>>>> This is important to filter out backup records from the results to avoid duplicates.
>>>>>>>>>>
>>>>>>>>>> We can merge documents from different nodes and remove duplicates as long as docIDs are globally unique.
>>>>>>>>>>
>>>>>>>>>>> * How can results from several nodes be merged on the Reduce stage?
>>>>>>>>>>
>>>>>>>>>> As long as documents have a globally unique docID, Lucene has merge functions that can merge multiple partial results.
>>>>>>>>>>
>>>>>>>>>>> * Does Lucene support something like a JOIN operation or others that may require data from another partition or index?
>>>>>>>>>>
>>>>>>>>>> As illustrated by Ilya, Block-Join works for us.
>>>>>>>>>>
>>>>>>>>>>> If so, then it looks like a multistep query with merging of results on intermediate stages and requires detailed investigation and design. It is ok if Ignite has some limitations here, but we would like to know about them at an early stage.
>>>>>>>>>>>
>>>>>>>>>>> * How to effectively map Lucene files to the page memory? Is it even possible?
>>>>>>>>>>
>>>>>>>>>> Lucene has Directory implementations which allow storing Lucene indices on different kinds of file structures. It has an MMapDirectory that we could use?
>>>>>>>>>>
>>>>>>>>>>> Otherwise, how to deal with potential OOM on large queries and memory capacity planning?
>>>>>>>>>>
>>>>>>>>>> We can use Lucene's MMapDirectory.
>>>>>>>>>>
>>>>>>>>>>> Persistence.
>>>>>>>>>>> * How and what consistency guarantees could we have/expect?
>>>>>>>>>>
>>>>>>>>>> Lucene does not have WAL logs but is append-only.
>>>>>>>>>>
>>>>>>>>>>> It seems we may not be able to write physical records for the Lucene index to our WAL. What can we do with this?
>>>>>>>>>>
>>>>>>>>>> As illustrated by Ilya, we can use Ignite's WAL records to rebuild Lucene indices. The important thing here is to not treat Lucene indices as source of truth.
>>>>>>>>>>
>>>>>>>>>>> Transactions.
>>>>>>>>>>> * Will we support transactions?
>>>>>>>>>>
>>>>>>>>>> Lucene has no concept of transactions.
>>>>>>>>>>
>>>>>>>>>>> * Should Lucene be aware of transactions and track mvcc (or whatever) versions for the records?
>>>>>>>>>>
>>>>>>>>>> No.
>>>>>>>>>>
>>>>>>>>>>> * What will be the consistency guarantees?
>>>>>>>>>>
>>>>>>>>>> We can acknowledge writes back only after the Lucene index is updated.
>>>>>>>>>>
>>>>>>>>>>> UX
>>>>>>>>>>> * How to add FullText search query syntax into Calcite?
>>>>>>>>>>
>>>>>>>>>> Postgres's FTS functions are a good reference.
>>>>>>>>>>
>>>>>>>>>>> * AFAIK, the Lucene index has many properties for tuning. How will the user configure the index?
>>>>>>>>>>
>>>>>>>>>> Most of those properties can be cluster level and exposed as a new sub config for Ignite.
>>>>>>>>>>
>>>>>>>>>>> * How and where to store the settings? Which are cluster-wide and which are local to a particular node?
>>>>>>>>>>
>>>>>>>>>> All can be cluster level.
>>>>>>>>>>
>>>>>>>>>>> * Will all the settings be immutable? Can they be changed on the fly? After node/grid restart?
>>>>>>>>>>
>>>>>>>>>> They should be applied post restart.
>>>>>>>>>>
>>>>>>>>>>> * Any limitations on query syntax?
>>>>>>>>>>
>>>>>>>>>> It depends on how we model our queries for text search.
>>>>>>>>>>
>>>>>>>>>>> SQL
>>>>>>>>>>> * Will we support FullText search in SQL?
>>>>>>>>>>
>>>>>>>>>> We need custom functions for it. See Postgres's FTS functions.
>>>>>>>>>>
>>>>>>>>>>> * How to integrate the Lucene index into Calcite? What is the cost model?
>>>>>>>>>>
>>>>>>>>>> There cannot be any cost model since there are no paths for a text query.
>>>>>>>>>> If we see a text query, we have to use the Lucene index or return an error. In this way, we need to model text search as a set of UDFs.
>>>>>>>>>>
>>>>>>>>>>> Splitting rules? Traits?
>>>>>>>>>>
>>>>>>>>>> Please see my reply above.
>>>>>>>>>>
>>>>>>>>>>> With all of this, you can go with the IEP (or even some short summary) and further POC and implementation.
>>>>>>>>>>> That's a big deal, so let's discuss what could be done here.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 23, 2021 at 12:58 PM Atri Sharma <a...@apache.org> wrote:
>>>>>>>>>>>> I am actually happy to drive the feature for Ignite 3. FTS is very important for me and I think Ignite users will benefit from it greatly.
>>>>>>>>>>>>
>>>>>>>>>>>> If it makes sense to be focusing on Ignite 3 for this capability, I am eager to contribute there and lead the development.
>>>>>>>>>>>>
>>>>>>>>>>>> Please share your thoughts.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 23, 2021 at 3:21 PM Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
>>>>>>>>>>>>> Hi Atri,
>>>>>>>>>>>>>
>>>>>>>>>>>>> All the Jira tickets we have on the Full-text search (FTS) thing are targeted to Ignite 2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> AFAIK, we want, but we have NOT committed to FTS support in Ignite 3, yet.
>>>>>>>>>>>>> By the way, we are getting requests for this thing from the user side, and definitely, FTS would be a valuable feature for Ignite.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It will be great if someone wants to drive it; any help will be appreciated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 23, 2021 at 12:12 PM Atri Sharma <a...@apache.org> wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> An update, please. I am working through persistence of Lucene index using Ignite Dictionary, and will be asking some questions soon.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I had one doubt -- where does this change go? Ignite 3?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, I know we want to build native support for text searches in Ignite 3. Is the work I am proposing here part of that, or will that be a separate effort?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, 28 Jun 2021, 19:20 Ilya Kasnacheev, <ilya.kasnach...@gmail.com> wrote:
>>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think that number one is the most important one; then maybe it will see more use and other deficiencies will become more apparent, leading to more tickets and visibility.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe 2. and 3. will even use a different approach when persistence is implemented.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Ilya Kasnacheev
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Mon, Jun 28, 2021 at 14:34, Atri Sharma <a...@apache.org>:
>>>>>>>>>>>>>>>> Hello Again!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have been looking into the aforementioned and here are my follow-up thoughts:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Support persistence of Lucene indexes.
>>>>>>>>>>>>>>>> 2. https://issues.apache.org/jira/browse/IGNITE-12401 (Needs fixing of moving partitions first)
>>>>>>>>>>>>>>>> 3. Figure out how to return scores from nodes and use them as sort parameters on the coordinator node (https://issues.apache.org/jira/browse/IGNITE-12291)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let me know if this looks ok to make text queries functional?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Atri
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jun 21, 2021 at 2:49 PM Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> Hi.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> One of the biggest issues with text queries is the lack of support for Lucene index persistence, which makes this functionality useless if persistence is enabled.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I would first take care of it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Mon, Jun 21, 2021 at 12:16, Maksim Timonin <timonin.ma...@gmail.com>:
>>>>>>>>>>>>>>>>>> Hi, Atri!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You're right, there is actually a lack of support for TextQueries. For the last ticket I'm doing I see some obvious issues with them (no page size support, for example). I'm glad that somebody wants to maintain this functionality. Thanks a lot!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For the MergeSort algorithm there is already a patch [1]. It's currently on review. This patch introduces an abstract reducer for CacheQueries with 2 implementations (unordered, merge-sort). Then TextQuery leverages MergeSort to order results from multiple nodes by score. This patch also fixes the pageSize issue I've mentioned before. Could you please check if it fully matches your idea? Any issues or comments are welcome.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've prepared this ticket because I need the MergeSort algorithm for the new type of queries I'm implementing (IndexQuery, it should also provide ordered results over multiple nodes). Currently I'm not planning to go further with TextQuery, so if you're going to support this it'll be a great contribution, I think.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-14703
>>>>>>>>>>>>>>>>>> [2] https://github.com/apache/ignite/pull/9081
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jun 21, 2021 at 11:11 AM Atri Sharma <a...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have been looking into our text queries support and see that it has limited community support.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Therefore, I volunteer to be the maintainer of the module and work on enhancing it further.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> First goal would be to move to Lucene 8.x, then work on sorted reduce-merge across nodes. Fundamentally, this is doable since Lucene ranks documents according to their score, and documents are returned in the order of their score. Since the scoring function is homogeneous, this means that across nodes, we can compare scores and merge sort.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please let me know if I can take this up.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Atri
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Atri
>>>>>>>>>>>>>>>>>>> Apache Concerted
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>> Alexei Scherbakov
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Atri
>>>>>>>>>>>>>>>> Apache Concerted
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Andrey V. Mashenkov
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Atri
>>>>>>>>>>>> Apache Concerted
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Andrey V. Mashenkov
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Atri
>>>>>>>>>> Apache Concerted
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Andrey V. Mashenkov
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Atri
>>>>>>>> Apache Concerted
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Ivan Pavlukhin
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Ivan Pavlukhin
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Atri
>>>> Apache Concerted
>>>
>>> --
>>> Best regards,
>>> Ivan Pavlukhin
>
> --
> Best regards,
> Ivan Pavlukhin
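
For reference, the reduce step discussed in this thread (each node returns hits already ordered by score, the coordinator merge-sorts them and drops duplicates coming from backup partitions via a globally unique ID) could look roughly like the sketch below. This is only an illustration under stated assumptions: ScoreMergeSketch, Hit, Cursor and mergeByScore are hypothetical names made up for this note. It is not the reducer from IGNITE-14703 and not a Lucene API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

/** Hypothetical sketch; none of these types exist in Ignite or Lucene. */
public class ScoreMergeSketch {
    /** One search hit: a globally unique document ID plus its relevance score. */
    static final class Hit {
        final String docId;
        final double score;

        Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Read position inside one node's result list (sorted by score, highest first). */
    private static final class Cursor {
        final List<Hit> hits;
        int pos;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit head() {
            return hits.get(pos);
        }
    }

    /**
     * K-way merge of per-node results by descending score.
     * Assumes every input list is already sorted by score, highest first,
     * and that scores are comparable across nodes.
     */
    static List<Hit> mergeByScore(List<List<Hit>> perNode, int limit) {
        // Max-heap keyed on the current head of each node's list.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparingDouble((Cursor c) -> c.head().score).reversed());

        for (List<Hit> hits : perNode) {
            if (!hits.isEmpty())
                heap.add(new Cursor(hits));
        }

        Set<String> seen = new HashSet<>();
        List<Hit> merged = new ArrayList<>();

        while (!heap.isEmpty() && merged.size() < limit) {
            Cursor c = heap.poll();
            Hit h = c.head();

            // Skip a document that was already returned by another node.
            if (seen.add(h.docId))
                merged.add(h);

            // Advance the cursor while it is outside the heap, then re-insert it.
            if (++c.pos < c.hits.size())
                heap.add(c);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> node1 = List.of(new Hit("k1", 0.9), new Hit("k3", 0.4));
        List<Hit> node2 = List.of(new Hit("k1", 0.9), new Hit("k2", 0.7));

        for (Hit h : mergeByScore(List.of(node1, node2), 10))
            System.out.println(h.docId + " " + h.score);
    }
}
```

The heap holds one cursor per node, so the coordinator only needs the current head of each node's page in memory at a time. That is also what would make page-by-page fetching possible, though paging is left out of this sketch.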