Re: Indexing time increase moving from Lucene 8 to 9

2024-04-24 Thread Matt Davis
Marc, We also ran into this problem when updating to Lucene 9.5. We found it sufficient in our use case to just bump up the LRU cache size in the constructor to a value high enough to not pose a performance problem. The default value of 4k was way too low for our use case with millions of unique facet valu
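
Why a 4k LRU cache can hurt when there are far more unique values than cache slots can be seen with a small, generic sketch. This is not the Lucene class discussed in the thread; `LruCacheSizing`, `LruCache`, and `countHits` are hypothetical names, and the cache is a plain `LinkedHashMap` in access-order mode. The numbers are scaled down (4 vs. 10 slots) purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSizing {

    /** Minimal LRU cache built on LinkedHashMap's access-order mode (illustrative only). */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxSize;

        LruCache(int maxSize) {
            super(16, 0.75f, true); // true = order entries by most recent access
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the least recently used entry once the cache is over capacity.
            return size() > maxSize;
        }
    }

    /** Cycles through uniqueValues keys for the given number of passes and counts cache hits. */
    public static int countHits(int cacheSize, int uniqueValues, int passes) {
        LruCache<Integer, Integer> cache = new LruCache<>(cacheSize);
        int hits = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int value = 0; value < uniqueValues; value++) {
                if (cache.get(value) != null) {
                    hits++;
                } else {
                    cache.put(value, value);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("cache smaller than working set: " + countHits(4, 10, 3) + " hits");
        System.out.println("cache covers working set:       " + countHits(10, 10, 3) + " hits");
    }
}
```

When the cache is smaller than the set of values being cycled through, each entry is evicted before it is needed again, so the cache does work without ever producing a hit. Sizing it to cover the working set trades memory for hits, which matches the fix described above.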

Re: Extending Directory Object

2022-12-19 Thread Matt Davis
A really old example (Lucene 6.x) is here: https://github.com/lumongo/lumongo/wiki/Distributed-Directory https://github.com/lumongo/lumongo/tree/master/lumongo-storage/src/main/java/org/lumongo/storage/lucene But the gist is to extend BaseDirectory and IndexOutput/IndexInput which I think still ap

Re: Best strategy migrate indexes

2022-10-29 Thread Matt Davis
Inside the Zulia search engine, the object being indexed is always a JSON/BSON object and we store the BSON as a stored byte field in the index. This allows easy internal reindexing when the searchable fields change but also allows us to update to the latest Lucene version. Combined with using luc

Re: How to filter KnnVectorQuery with multiple terms?

2022-08-31 Thread Matt Davis
If I understand correctly, I believe you would want to use a TermInSetQuery. An example usage can be found here: https://github.com/zuliaio/zuliasearch/blob/main/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L398. You can also check out the usage of KnnVectorQuery here: h

Re: Example / Demo re support for filtering in nearest-neighbor vector search (Lucene 9.1.0)

2022-05-22 Thread Matt Davis
Thanks Julie. I was able to implement vector search in Zulia with your pointers. The pull request might be helpful to others: https://github.com/zuliaio/zuliasearch/pull/70 Thanks, Matt On Fri, May 20, 2022 at 9:23 AM Michael Wechner wrote: > Hi Julie > > I got it running and it seems to work

Re: Taxonomy vs SSDVFF for faceted search

2021-04-29 Thread Matt Davis
stributed system around it with sharding/rebalancing? > > -- > Regards, > Alex > > > On Wed, Apr 28, 2021 at 11:18 AM Matt Davis > wrote: > > > Alex, > > > > With our lucene based implementation of Zulia ( > > https://github.com/zuliaio/zuliasearch

Re: Taxonomy vs SSDVFF for faceted search

2021-04-28 Thread Matt Davis
Alex, With our Lucene-based implementation of Zulia ( https://github.com/zuliaio/zuliasearch) we have gone back and forth. We started with Taxonomy and switched and then switched back to taxonomy. In our experience the Taxonomy based approach is more scalable and performant. We do large search

Re: best way (performance wise) to search for field without value?

2020-11-13 Thread Matt Davis
With Zulia we chose to rewrite fieldName:* queries to hiddenField:fieldName and to automatically add the names of all fields that are present to a hidden field, the alternative that Uwe described. It seems to work well. https://github.com/zuliaio/zuliasearch/blob/master/zulia-query-parser/src/main/java/io/zu
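
A minimal sketch of the rewrite idea, in plain Java with no Lucene dependency. It is not Zulia's code: `FieldExistsRewrite`, `rewrite`, `hiddenFieldValues`, and the hidden field name `_fieldNames` are all assumptions made for illustration, and the document is modeled as a simple `Map`.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldExistsRewrite {

    // Hypothetical name for the hidden field; the real name is not given in the thread.
    static final String HIDDEN_FIELD = "_fieldNames";

    // Matches a whole query of the form fieldName:* (and nothing more complex).
    private static final Pattern EXISTS_QUERY = Pattern.compile("^([A-Za-z_][\\w.]*):\\*$");

    /** Rewrites "fieldName:*" to "<hidden>:fieldName"; any other query is returned unchanged. */
    public static String rewrite(String query) {
        Matcher m = EXISTS_QUERY.matcher(query.trim());
        return m.matches() ? HIDDEN_FIELD + ":" + m.group(1) : query;
    }

    /** Collects the names of all fields that have a value, to be indexed into the hidden field. */
    public static Set<String> hiddenFieldValues(Map<String, ?> document) {
        Set<String> names = new TreeSet<>();
        for (Map.Entry<String, ?> entry : document.entrySet()) {
            if (entry.getValue() != null) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("title:*"));
        System.out.println(hiddenFieldValues(Map.of("title", "Dune", "year", 1965)));
    }
}
```

The point of the design is that an "exists" query becomes an ordinary term lookup on the hidden field, instead of a wildcard that has to enumerate every term in the target field.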

Re: Simultaneous Indexing and searching

2020-09-02 Thread Matt Davis
You can also check out https://github.com/zuliaio/zuliasearch for a thinner approach closer to native Lucene or just to see examples of using Lucene. On Wed, Sep 2, 2020, 11:24 AM Alex K wrote: > FWIW, I agree with Michael: this is not a simple problem and there's been a > lot of effort in Elasticse

Re: Port on iOS

2020-08-21 Thread Matt Davis
Never used it, but you could look at https://gluonhq.com/products/mobile/. Also see https://github.com/oracle/graal/issues/373#issuecomment-563435260. My guess is that Lucene would be a large ask to compile in Graal. REST services are probably the way to go. On Fri, Aug 21, 2020 at 10:25 AM Vince

Re: Searching number of tokens in text field

2020-01-02 Thread Matt Davis
> That is a clever idea. I would still prefer something cleaner but this > > > could work. Thanks! > > > > > > On Sat, Dec 28, 2019 at 10:11 PM Michael Sokolov > > wrote: > > > > > >> I don't know of any pre-existing thing that do

Re: Searching number of tokens in text field

2019-12-29 Thread Matt Davis
), and then > appends some special token encoding the length? > > On Sat, Dec 28, 2019, 9:36 AM Matt Davis wrote: > > > Hello, > > > > I was wondering if it is possible to search for the number of tokens in a > > text field. For example find book titles with 3
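
The suggestion quoted above (append a special token that encodes the length) can be sketched in plain Java. This is only an illustration under stated assumptions: tokenization is a naive whitespace split, and `LengthTokenSketch`, `tokensWithLength`, `lengthTerm`, and the `__len_` prefix are hypothetical names. In Lucene itself the same logic would more naturally live inside the analysis chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LengthTokenSketch {

    // Hypothetical marker prefix; chosen so it is unlikely to collide with real terms.
    static final String LENGTH_PREFIX = "__len_";

    /** Splits on whitespace, lowercases, and appends one extra token encoding the token count. */
    public static List<String> tokensWithLength(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        tokens.add(LENGTH_PREFIX + tokens.size());
        return tokens;
    }

    /** The term to search for when looking for fields with exactly n tokens. */
    public static String lengthTerm(int n) {
        return LENGTH_PREFIX + n;
    }

    public static void main(String[] args) {
        System.out.println(tokensWithLength("The Great Gatsby"));
        System.out.println(lengthTerm(3));
    }
}
```

Finding titles with three tokens then reduces to a single term lookup for `lengthTerm(3)` on the same field.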

Searching number of tokens in text field

2019-12-28 Thread Matt Davis
Thanks, Matt Davis

Re: Iterating Over All Documents On a Changing Index

2019-10-30 Thread Matt Davis
ou pulled a new reader and while you are in the process > of reindexing. > > On Sat, Oct 19, 2019 at 1:35 AM Matt Davis > wrote: > > > > Hi All, > > > > I am working on implementing an in-place reindex using Lucene. In my > > case, I have a BSON document sto

Iterating Over All Documents On a Changing Index

2019-10-18 Thread Matt Davis
Hi All, I am working on implementing an in-place reindex using Lucene. In my case, I have a BSON document stored in a binary field and have a set of rules that pull fields out of the BSON and index them into different Lucene fields with different analyzers. I would like to be able to change t