Re: Best strategy migrate indexes

2022-10-29 Thread Baris Kazar
It is always good practice to retain the original, non-indexed data: whenever Lucene changes version, even a minor version, I always reindex. Best regards From: Gus Heck Sent: Saturday, October 29, 2022 2:17 PM To: java-user@lucene.apache.org Subject: Re: Best strategy migrate

Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-06 Thread Baris Kazar
Thank You Thank You Best regards From: Michael McCandless Sent: Saturday, August 6, 2022 11:29:25 AM To: Baris Kazar Cc: java-user@lucene.apache.org Subject: Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in

Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-06 Thread Baris Kazar
I think so. Best regards From: Michael McCandless Sent: Saturday, August 6, 2022 10:12 AM To: java-user@lucene.apache.org Cc: Baris Kazar Subject: Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local

Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-06 Thread Baris Kazar
My github username is bmkazar can You please register me? Best regards From: Michael McCandless Sent: Saturday, August 6, 2022 6:05:51 AM To: d...@lucene.apache.org Cc: Lucene Users ; java-dev Subject: Re: [HELP] Link your Apache Lucene Jira and GitHub account i

Re: Using Lucene 8.5.1 vs 8.5.2

2022-07-26 Thread Baris Kazar
Great, 8.11 has gone further. Thanks Mike Best regards From: Mike Drob Sent: Tuesday, July 26, 2022 5:18 PM To: java-user@lucene.apache.org Cc: Baris Kazar Subject: Re: Using Lucene 8.5.1 vs 8.5.2 I would use 8.5.2 if possible when considering fuzzy queries

Lucene 9.1.0 has changed name of lucene-analysis-common-9.1.0.jar

2022-07-26 Thread Baris Kazar
Dear Folks,- I see that Lucene has changed one of the JAR files' name to lucene-analysis-common-9.1.0.jar in Lucene version 9.1.0. The name used to contain "analyzers" instead of "analysis". Can someone please confirm? Best regards

Re: Performance Comparison of Benchmarks by using Lucene 9.1.0 vs 8.5.1

2022-07-26 Thread Baris Kazar
Great, this was very helpful. This gives rough idea using the dates of the Lucene bugs/features added on those graphs. Best regards From: Michael Sokolov Sent: Tuesday, July 26, 2022 3:55 PM To: java-user@lucene.apache.org Cc: Baris Kazar Subject: Re

Performance Comparison of Benchmarks by using Lucene 9.1.0 vs 8.5.1

2022-07-26 Thread Baris Kazar
Dear Folks,- Similar question to my previous post: this time I wonder if there is a Lucene web site where benchmarks are run against these two versions of Lucene. I see many (44+16) api changes and (48+9) improvements and (16+15) Bug fixes, which sounds great. Best regards

Using Lucene 8.5.1 vs 8.5.2

2022-07-26 Thread Baris Kazar
Dear Folks,- May I please ask if using 8.5.1 is ok wrt 8.5.2? The only change was the following where fuzzy query was fixed for a major bug (?). How much does this affect the fuzzy query performance? Has Dev Team done a study to compare Lucene-9350 Bug vs Lucene-9068 Bug? https://lucene.apache.o

Re: How to handle corrupt Lucene index

2022-04-13 Thread Baris Kazar
yes that is a great point to look at first and that would eliminate any jdbc related issues that may lead to such problems. Best regards From: Tim Whittington Sent: Wednesday, April 13, 2022 9:17:44 PM To: java-user@lucene.apache.org Subject: Re: How to handle co

Re: How to handle corrupt Lucene index

2022-04-13 Thread Baris Kazar
, these indexes are created and read with the same Lucene version (7.3.0). Tim On Thu, 14 Apr 2022 at 12:45, Baris Kazar wrote: > In my experience that if you built index at version x then use index also > in version x. > I never encountered any problems this way with Lucene. > >

Re: How to handle corrupt Lucene index

2022-04-13 Thread Baris Kazar
In my experience, if you build an index with version x, you should also use it with version x. I have never encountered any problems this way with Lucene. Can you maybe recreate the Lucene index on 7.3.0? Also, how do you use the database in your scenario? Are you using JDBC-like operations, as in Oracle database?

Re: How to propose a new feature

2022-04-01 Thread Baris Kazar
This cache can work on different indexable fields or even maybe stored fields. But indexable fields is better i think. It can be configured to cache which fields, too. Probably most people may choose all indexable fields. Thanks From: Baris Kazar Sent: Friday

Re: How to propose a new feature

2022-04-01 Thread Baris Kazar
, April 1, 2022 12:58 PM To: Lucene Users Mailing List Cc: Baris Kazar Subject: Re: How to propose a new feature Just send an email with the problem that you want to solve and the approach that you are suggesting. On Fri, Apr 1, 2022 at 6:56 PM Baris Kazar wrote: > > Resent due to need fo

Re: How to propose a new feature

2022-04-01 Thread Baris Kazar
Resent due to need for help. Thanks From: Baris Kazar Sent: Wednesday, March 30, 2022 2:30 PM To: java-user@lucene.apache.org Cc: Baris Kazar Subject: How to propose a new feature Hi Everyone,- What is the process to propose a new feature for Core Lucene engine

How to propose a new feature

2022-03-30 Thread Baris Kazar
Hi Everyone,- What is the process to propose a new feature for Core Lucene engine? Best regards

Re: test

2022-02-20 Thread Baris Kazar
Yes, please. Welcome. Best regards From: Claude Lepere Sent: Sunday, February 20, 2022 1:32 PM To: java-user@lucene.apache.org Subject: test Am I subscribed, please? Claude Lepère claudelep...@gmail.com

Re: Log4j

2021-12-15 Thread Baris Kazar
Ok these are good to know. thanks From: Uwe Schindler Sent: Wednesday, December 15, 2021 5:18 PM To: java-user@lucene.apache.org ; Ali Akhtar Cc: Baris Kazar Subject: Re: Log4j Hi, It only has an abstract logging interface inside IndexWriter to track actions

Log4j

2021-12-15 Thread Baris Kazar
Hi Folks,- Lucene is not affected by the latest bug, right? I saw on Solr News page there are some fixes already made to Solr. Best regards

org.apache.lucene.index.memory.MemoryIndex

2021-10-06 Thread Baris Kazar
Hi,- Is there a project within Apache Lucene to extend this class to allow multiple results? Best regards

Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score()

2021-10-05 Thread Baris Kazar
From: Baris Kazar Sent: Tuesday, October 5, 2021 3:56 PM To: Adrien Grand ; Lucene Users Mailing List ; Baris Kazar Subject: Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score() Hi Adrien,- Thanks for taking a look at it and sure, that will be very nice to fix

Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score()

2021-10-05 Thread Baris Kazar
: Tuesday, October 5, 2021 3:18 PM To: Lucene Users Mailing List Cc: Baris Kazar Subject: Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score() Hmm we should fix these access$ accessors by fixing the visibility of some fields. These breakdowns do not necessarily signal

Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score()

2021-10-04 Thread Baris Kazar
.score() -->> Weight$DefaultBulkScorer.score() -->>-->> Weight$DefaultBulkScorer.scoreAll() -->>-->>-->> WANDScorer$1.nextDoc() -->>-->>-->>-->> WANDScorer$1.advance() -->>-->>-->>-->>-->> WANDScorer.access$300() (constitutes %65 of Bul

Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score()

2021-10-02 Thread Baris Kazar
: Adrien Grand Sent: Saturday, October 2, 2021 1:44:40 AM To: Lucene Users Mailing List Cc: Baris Kazar Subject: Re: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score() Is your profiler reporting inclusive or exclusive costs for each function? Ie. does it exclude time

org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score()

2021-10-01 Thread Baris Kazar
Hi,- I profiled my Java application with jvisualvm and saw that 75% of the time in org.apache.lucene.search.IndexSearcher.search() is spent in these methods: org.apache.lucene.search.BooleanWeight.bulkScorer() and BulkScorer.score(). Is there any study or project to speed u

Re: Potential bug

2021-06-14 Thread Baris Kazar
i was clear on what i wanted to do with Lucene experiments in this thread. (last part of first paragraph below) Best regards From: Baris Kazar Sent: Monday, June 14, 2021 10:28:47 AM To: Atri Sharma ; java-user@lucene.apache.org ; a.benede...@sease.io ; Baris

Re: Potential bug

2021-06-14 Thread Baris Kazar
, June 14, 2021 8:46 AM To: java-user@lucene.apache.org Cc: Baris Kazar Subject: Re: Potential bug +1 to Adrien. Let's keep the tone neutral. On Mon, 14 Jun 2021, 16:00 Adrien Grand, mailto:jpou...@gmail.com>> wrote: Baris, you called out an insult from Alessandro and your replies sug

Re: Potential bug

2021-06-11 Thread baris . kazar
Let me guide to a professional answer to the below email: Hi Baris, Since You mentioned You did all the performance study on your application and still believe that the bottleneck is the fuzzy search api from Lucene, it would be best to time the application for: * matching phase (identif

Re: Potential bug

2021-06-11 Thread baris . kazar
i expect the answers from this list to be more professional please. You dont have to answer to this list if you intend to insult. Best regards On 6/11/21 11:57 AM, Alessandro Benedetti wrote: Hi Bazir, this feels like an X Y problem [1

Re: Potential bug

2021-06-11 Thread baris . kazar
Lets start with writing my name correctly. Then we can talk Best regards On 6/11/21 11:57 AM, Alessandro Benedetti wrote: Hi Bazir, this feels like an X Y problem [1

Re: Potential bug

2021-06-09 Thread baris . kazar
Yes, i did those and i believe i am at the best level of performance now and it is not bad at all but i want to make it much better. i see like a linear drop in timings when i go lower number of words but let me do that quick study again. Fuzzy search  is always expensive but that seems to su

Re: Potential bug

2021-06-09 Thread baris . kazar
i cant reveal those details i am very sorry. but it is more than 1 million. let me tell that i have a lot of code that processes results from lucene but the bottle neck is lucene fuzzy search. Best regards On 6/9/21 1:53 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) wrote: How many documents do

Re: Potential bug

2021-06-09 Thread baris . kazar
I have only two fields: one is a string, the other is a number (stored as a string). I guess you can't go simpler than this. I retrieve the hits, and my major bottleneck is Lucene fuzzy search. I take each word from the string, which is usually at most around 10 words, and I build a fuzzy boolean query out o

Re: Potential bug

2021-06-09 Thread baris . kazar
Thanks Adrien, but the differences is too far apart. I think the algorithm needs to be revised. what if the user needs to limit the search process? that leaves no control. there should be a way to speedup lucene then if this is not possible, since for some simple queries it takes half a seco

Potential bug

2021-06-09 Thread baris . kazar
Hi,- I think this is a potential bug: this time I set totalHitsThreshold to 10 and totalHits is reported as 1655, but I get 10 results in total. I think this suggests that there might be a bug in the TopScoreDocCollector algorithm. Best regards

Re: TopScoreDocCollector class usage

2021-06-09 Thread baris . kazar
Ok i found it 300 times number of words in the search string but these needs to be precisely documented in the Javadocs i dont want to have trial and error and i guess nobody wants that, either please. Best regards On 6/9/21 12:11 PM, baris.ka...@oracle.com wrote: Hi,-  i used this cl

TopScoreDocCollector class usage

2021-06-09 Thread baris . kazar
Hi,- i used this class now before IndexSearcher.search api (with collector as 2nd arg) (Please see the "an interesting case" thread before this question) but this time i have a very weird behavior: i used to have 4000+ hits with default TopScoreDocCollector.create(int numHits,  ScoreDoc a

Re: An interesting case

2021-06-08 Thread baris . kazar
Tue, Jun 8, 2021 at 6:28 PM mailto:baris.ka...@oracle.com>> wrote: >> >>> i am currently happy with Lucene performance but i want to understand >>> and speedup further >>> >>> by limiting the results concretely. So i still don

Re: An interesting case

2021-06-08 Thread baris . kazar
ting the results concretely. So i still donot know why totalHits >>> and scoredocs report >>> >>> different number of hits. >>> >>> >>> Best regards >>> >>> >>> On 6/8/21 2:52 AM, Baris Kazar wrote:

Re: An interesting case

2021-06-08 Thread baris . kazar
not know why totalHits >>> and scoredocs report >>> >>> different number of hits. >>> >>> >>> Best regards >>> >>> >>> On 6/8/21 2:52 AM, Baris Kazar wrote: >>>> m

Re: An interesting case

2021-06-08 Thread baris . kazar
nt to understand and speedup further by limiting the results concretely. So i still donot know why totalHits and scoredocs report different number of hits. Best regards On 6/8/21 2:52 AM, Baris Kazar wrote: my worry is actually about the lucene's performance. if lucene collects thousan

Re: An interesting case

2021-06-08 Thread baris . kazar
d and speedup further by limiting the results concretely. So i still donot know why totalHits and scoredocs report different number of hits. Best regards On 6/8/21 2:52 AM, Baris Kazar wrote: my worry is actually about the lucene's performance. if lucene collects thousands of hits instead o

Re: On which field document is searched

2021-06-08 Thread baris . kazar
I guess you can setup an experiment like search your text against each field and then look at the score but you need to normalize the score in order to compare and normalization will include probably length of the field etc. Maybe there is an api in lucene for this but i dont know. Hope this

Re: An interesting case

2021-06-08 Thread baris . kazar
i am currently happy with Lucene performance but i want to understand and speedup further by limiting the results concretely. So i still donot know why totalHits and scoredocs report different number of hits. Best regards On 6/8/21 2:52 AM, Baris Kazar wrote: my worry is actually about

Re: An interesting case

2021-06-07 Thread Baris Kazar
Best regards From: Adrien Grand Sent: Tuesday, June 8, 2021 2:46 AM To: Lucene Users Mailing List Cc: Baris Kazar Subject: Re: An interesting case When you call IndexSearcher#search(Query query, int n), there are two cases: - either your query matches n hits or more, and the TopDocs ob

Re: An interesting case

2021-06-07 Thread baris . kazar
https://stackoverflow.com/questions/50368313/relation-between-topdocs-totalhits-and-parameter-n-of-indexsearcher-search looks like someone else also had this problem, too. Any suggestions please? Best regards On 6/8/21 1:36 AM, baris.ka...@oracle.com wrote: Hi,-  I use IndexSearcher.search

An interesting case

2021-06-07 Thread baris . kazar
Hi,-  I use IndexSearcher.search API with two parameters like Query and int number (i set as 20). However, when i look at the TopDocs object which is the result of this above API call i see thousands of hits from totalhits. Is this inaccurate or Lucene is doing actually search based on tha
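
A minimal sketch of why the two numbers in this question can differ, written in plain Java with no Lucene classes: a top-n collector can count every match while retaining only the n best-scoring ones. The names TopNSketch, Result and collect are hypothetical, and this is only an illustration of the idea, not Lucene's actual implementation.

```java
import java.util.PriorityQueue;

public class TopNSketch {
    /** Holds the number of matches seen and the scores that were actually kept. */
    static final class Result {
        final int totalHits;
        final float[] topScores; // best score first

        Result(int totalHits, float[] topScores) {
            this.totalHits = totalHits;
            this.topScores = topScores;
        }
    }

    /** Counts every score as a hit but keeps only the n highest ones. */
    static Result collect(float[] scores, int n) {
        // Min-heap: the weakest score that is still kept sits on top.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        int total = 0;
        for (float s : scores) {
            total++; // every match is counted
            if (heap.size() < n) {
                heap.add(s);
            } else if (n > 0 && s > heap.peek()) {
                heap.poll(); // evict the weakest kept score
                heap.add(s);
            }
        }
        float[] top = new float[heap.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = heap.poll(); // polls ascending, so fill from the back
        }
        return new Result(total, top);
    }

    public static void main(String[] args) {
        // 5000 made-up scores standing in for matching documents.
        float[] scores = new float[5000];
        for (int i = 0; i < scores.length; i++) {
            scores[i] = (i * 37 % 5000) / 5000f;
        }
        Result r = collect(scores, 20);
        System.out.println("totalHits=" + r.totalHits + ", kept=" + r.topScores.length);
    }
}
```

In this sketch the hit count and the number of returned entries are independent by design: asking for 20 results limits what is kept, not what is counted.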

Interface IndexReader.CacheHelper

2021-03-26 Thread baris . kazar
Hi,- https://lucene.apache.org/core/8_5_2/core/org/apache/lucene/index/IndexReader.CacheHelper.html?is-external=true It would be nice to have a more detailed explanation, and maybe an example, for this interesting interface. Best regards

MemoryIndex class

2021-03-26 Thread baris . kazar
Hi,- https://lucene.apache.org/core/8_5_2/memory/index.html what is meant by single document in this sentence? "High-performance single-document main memory Apache Lucene fulltext search index." The doc for this MemoryIndex still mentions about the deprecated class RAMDirectory. https://

NRTCachingDirectory class information

2021-03-26 Thread baris . kazar
Hi,-  Related to my previous thread: Warming up index files via cat to make it in memory index I found out about this class in the Book Lucene 4 Cookbook by Edwood Ng. May i please ask about any pointers, best practices paper or any Lucene documentation for comparing this NRTCachingDirec

Warming up index files via cat to make it in memory index

2021-03-25 Thread baris . kazar
Hi,-  This new thread is the continuation of previous thread back in Feb 2021: Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory) May i mention that i cat'ed *fdt files (largest index files among 98 index files generated) by directing to new files so that these files a
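
What cat does in this scenario can be sketched in Java: read every byte of a file once and discard it, so that the operating system's page cache may hold the file afterwards. This is a sketch under stated assumptions: PageCacheWarmer and warm are hypothetical names, this is not a Lucene API, and whether the pages stay cached depends on the OS and on how much memory is free.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PageCacheWarmer {
    /** Reads the whole file sequentially, discards the data, and returns the byte count. */
    static long warm(Path file) throws IOException {
        byte[] buffer = new byte[1 << 20]; // 1 MiB read buffer
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n; // the bytes themselves are not needed, only the read
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PageCacheWarmer <index file> [<index file> ...]
        for (String arg : args) {
            System.out.println(arg + ": " + warm(Paths.get(arg)) + " bytes read");
        }
    }
}
```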

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Baris Kazar
So, just cat will do this. Thanks From: Robert Muir Sent: Tuesday, February 23, 2021 4:45 PM To: Baris Kazar Cc: java-user Subject: Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory) The preload isn't magical. It only "reads in

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Thanks again, Robert. Could you please explain "preload"? Which functionality is that? we discussed in this thread before about a preload. Is there a Lucene url / site that i can look at for preload? Thanks for the explanations. This thread will be useful for many folks i believe. Best regar

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
(edited previous response) Thanks, but each different query at the first run i see some slowdown (not much though) with MMapDirectory and FSDirectory wrt second, third runs (due to cold start), though. Cold start slowdown is a little bit more with FSdirectory. So, MMapDirectory is slightly

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Thanks, but each different query i see some slowdown (not much though) with MMapDirectory and FSDirectory, though. It is a little bit more with FSdirectory. So, MMapDirectory is slightly better in that, too: ie, cold start. What i want to achieve: Problem statement: base case is disk based

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Thanks but then how will MMapDirectory help gain speedup? i will try tmpfs and see what happens. i was expecting to get on order of magnitude of speedup from already very fast on disk Lucene indexes. So i was expecting really really really fast response with MMapDirectory. Thanks On 2/23/21

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
As Uwe suggested some time ago, tmpfs file system usage with MMapDirectory is the only way to get high speedup wrt on disk Lucene index, right? Best regards On 2/23/21 1:44 PM, baris.ka...@oracle.com wrote: Ok, but how is this MMapDirectory used then? Best regards On 2/23/21 7:03 AM, Rob

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Ok, but how is this MMapDirectory used then? Best regards On 2/23/21 7:03 AM, Robert Muir wrote: On Tue, Feb 23, 2021 at 2:30 AM > wrote: Hi,-   I tried MMapDirectory and i allocated as big as index size on my J2EE Container but Don't alloc

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-22 Thread baris . kazar
Hi,-  I tried MMapDirectory and i allocated as big as index size on my J2EE Container but it only gives me at most 25% speedup and even sometimes a small amount of slowdown. How can i effectively use Lucene indexes in memory? Best regards On 12/14/20 6:35 PM, baris.ka...@oracle.com wrote

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Robert. I think these valuable comments need to be placed on javadocs for future references. i think i am getting enough info for making a decision: i will use MMapDirectory without setPreload and i hope my index will fit into the RAM. i plan to post a blog for findings. Best regar

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
e.com Sent: Sunday, December 13, 2020 10:18 PM To: java-user@lucene.apache.org Cc: BARIS KAZAR Subject: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory) Hi,- it would be nice to create a Lucene index in files and then effectively load it into memory once (since i use in read-only mode). I

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
l1TtPJMV80mkA-w$ eMail: u...@thetaphi.de -Original Message- From: baris.ka...@oracle.com Sent: Sunday, December 13, 2020 10:18 PM To: java-user@lucene.apache.org Cc: BARIS KAZAR Subject: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory) Hi,- it would be nice to

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
iek 19, D-28357 Bremen https://urldefense.com/v3/__https://www.thetaphi.de__;!!GqivPVa7Brio!Ll3PR4BZgqmgJNQ7MrnsXr27zNYgjsyXlMh9h6awmbZgSNW-yVLBCDuFHTogNnw9_Q$ eMail: u...@thetaphi.de -Original Message- From: baris.ka...@oracle.com Sent: Sunday, December 13, 2020 10:18 PM To: java-user@l

MMapDirectory usage during indexing and search

2020-12-14 Thread baris . kazar
Hi,-  are there some examples on how to use MMapDirectory during indexing (i used the constructor to create it) and search? what are the best practices? should i repeat during search what i did during indexing for MMapDirectory i.e, use the constructor to create the MMapDirectory object by

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Jigar, these are great notes, observations, experiments to know about and they are very very valuable, i also plan to write a blog on this topic to help Lucene advance. Best regards On 12/14/20 12:44 PM, Jigar Shah wrote: I used one of the Linux feature (ramfs, basically mounting ram

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Mike, appreciate the reply and the suggestions very much. And Your article link to concurrent search is amazing. Together with in memory and concurrent index (especially in read only mode) these will speed up Lucene queries very much. Happy Holidays Best regards On 12/14/20 10:12 AM,

MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-13 Thread baris . kazar
Hi,- it would be nice to create a Lucene index in files and then effectively load it into memory once (since i use in read-only mode). I am looking into if this is doable in Lucene. i wish there were an option to load whole Lucene index into memory: Both of below urls have links to the blog ur

Re: Fwd: org.apache.lucene.index.DirectoryReader Javadocs

2020-12-10 Thread baris . kazar
Thanks for the reply. Sure, i should have included the url since it already caused confusion. Here is the url: https://lucene.apache.org/core/8_5_2/core/org/apache/lucene/index/DirectoryReader.html Please see:     open(Directory directory, Map readerAttributes) Returns a IndexReader reading

org.apache.lucene.index.DirectoryReader Javadocs

2020-12-10 Thread baris . kazar
Hi,- May I request that more info be added to the Lucene org.apache.lucene.index.DirectoryReader Javadocs about the readOnly=true attribute, and more info on the readerAttributes parameter, please? I guess the default is read-only, right? Best regards

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-13 Thread baris . kazar
Great answer Thanks Michael. Yes the difference was too much > 1G Best regards > On Nov 13, 2020, at 1:49 PM, Michael Sokolov wrote: > > You can't directly compare disk usage across two indexes, even with > the same data. Try re-indexing one of your datasets, and you will see > that the disk s

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-13 Thread baris . kazar
Nothing changed between two index generations except the data changed a bit as i described. When Lucene is done generating index, that is what i am reporting as the size of the directory where all index files are stored. I dont know about deleted docs? How do you trace that? yes the queries

Re: Which Lucene 8.5.X is recommended?

2020-11-12 Thread baris . kazar
Thanks, i will use 8.5.2. i think saw some minor release switch on (z) without any issues but i will double check this. However, i will use 8.5.2 since the bug fixes in that release may result in better performance for Lucene index. Best regards > On Nov 12, 2020, at 11:09 PM, Erick Erickson

Re: Which Lucene 8.5.X is recommended?

2020-11-12 Thread baris . kazar
Thanks, i will use 8.5.2. i think saw some minor release (z) without any issues but i will double check this. However, i will use 8.5.2. The bug fixes in that release may result in better performance. Best regards > On Nov 12, 2020, at 11:09 PM, Erick Erickson wrote: > Always use the most r

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread baris . kazar
Hi,- Thanks. These are final finished sizes in both cases. Best regards > On Nov 12, 2020, at 11:12 PM, Erick Erickson wrote: > > Yes, that issue is fixed. The “Resolution” tag is the key, it’s marked > “fixed” and the version is 8.0 > > As for your other question, index size is a very impre

Which Lucene 8.5.X is recommended?

2020-11-12 Thread baris . kazar
Hi,-  is it best to use 8.5.2? Best regards Release 8.5.2 Bug Fixes   (1) LUCENE-9350: Partial reversion of LUCENE-9068; holding levenshtein automata on FuzzyQuery can end up blowing up query caches which use query objects as cache keys, so building the automata is now delayed to search ti

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread baris . kazar
On a related issue, with version 7.7.2 I experienced this: data is all lower case (same number of docs as the next case, though) vs data is camel case except the last word, which is always in capital letters, but I used the lowercase filter in the indexer in both cases, so indexing is done with al

https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread baris . kazar
https://issues.apache.org/jira/browse/LUCENE-8448 Hi,-  is this issue fixed please? Could You please help me figure it out? Best regards - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional comma

Re: Links to classes missing for BMW

2020-10-12 Thread baris . kazar
Hi Adrien,- Great, thanks. Best regards > On Oct 12, 2020, at 1:13 PM, Adrien Grand wrote: > > It's not the most visible place, but the paper is referenced in the source > code of the class that implements BM WAND > https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/907d1142

Re: Links to classes missing for BMW

2020-10-12 Thread baris . kazar
Hi Uwe,-  i see, thanks for the info, i wish the documentation mentions this new algorithm by referencing the papers (i have the papers). Best regards On 10/12/20 12:27 PM, Uwe Schindler wrote: There's not much new documentation, it works behind scenes, except that IndexSearcher.search and

Re: Links to classes missing for BMW

2020-10-12 Thread baris . kazar
Hi Uwe,-  Could You please point me to the class documentation please? Best regards On 10/12/20 12:16 PM, Uwe Schindler wrote: BMW support is in Lucene since version 8.0. Uwe Am October 12, 2020 4:08:42 PM UTC schrieb baris.ka...@oracle.com: Hi,-  Is BMW (Block Max Wand) support

Links to classes missing for BMW

2020-10-12 Thread baris . kazar
Hi,- Is BMW (Block Max Wand) support only for Solr? https://lucene.apache.org/solr/guide/8_6/solr-upgrade-notes.html This page says "also", so it implies support for Lucene, too, right? Best regards

Re: [VOTE] Lucene logo contest, here we go again

2020-09-01 Thread baris . kazar
bmitted by Baris Kazar. This entry has 8 variants. [C1] https://urldefense.com/v3/__https://issues.apache.org/jira/secure/attachment/13006392/lucene_logo1_full.pdf__;!!GqivPVa7Brio!JgXZ50SROMPIvwQUnc6YZqLl0mBhVxdDyqRU8SwN7lRfSROEh7KwzR18JgtoX1z6Yg$ [C2] https://urldefense.com/v3/__https://issues.

Re: [VOTE] Lucene logo contest

2020-06-18 Thread baris . kazar
Hi Ryan,- I very much appreciate this opportunity to submit my designs. Best regards On 6/18/20 1:29 AM, Ryan Ernst wrote: > IMHO this vote is invalid because... > it doesn’t include the red / orange variants submitted by Dustin Haver I considered the latest submission by Dustin Haver to be

Re: [VOTE] Lucene logo contest

2020-06-18 Thread baris . kazar
Hi Ryan,-  That sounds awesome, i found my designs and i am so excited. Even if i dont win, it is amazing to submit designs. Best regards On 6/18/20 1:32 AM, Ryan Ernst wrote: Hi Baris, Please see my latest reply on this thread. We will be restarting the vote next week, so you can submit

Re: [VOTE] Lucene logo contest

2020-06-16 Thread baris . kazar
Hello,- i would like to just say that i produced 3 more designs last Feb but forgot to submit them. I will need to look for where they are in my office. I drew them on my post-its and lot of folks liked them. Can there be some extension to this voting process please? I know this might confuse f

Re: How to tell Lucene index search to stop when it takes too long

2020-02-28 Thread baris . kazar
I have one more question on this, should i use Thread to use this class? The snippet did not have that. Best regards On 2/28/20 11:07 AM, baris.ka...@oracle.com wrote: Thanks Mikhail. I missed that constructor's first parameter. Best regards On 2/28/20 12:53 AM, Mikhail Khludnev wrote: Pa

Re: What is the Lucene 8.4.1 equivalent for StandardAnalyzer.STOP_WORDS_SET

2020-02-28 Thread baris . kazar
Thanks Michael. Best regards On 2/24/20 7:18 PM, Michael Froh wrote: Those words (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.3.1/lucene/core/src/java/org/apache/lucene/analysis/standard/StandardAnalyzer.java#L44-L49

maxMultiTermExpansions parameter of PhraseWildcardQuery class in Lucene 8.4.1 Sandbox

2020-02-28 Thread baris . kazar
Hi,- I hope everyone is doing great. I set this parameter to Integer.MAX_VALUE and it mostly works; only once did I have a memory issue. However, how will reducing this parameter affect the search time and the quality of the search results? Has anybody done such an experiment? The explanation

Re: How to tell Lucene index search to stop when it takes too long

2020-02-28 Thread baris . kazar
Thanks Mikhail. I missed that constructor's first parameter. Best regards On 2/28/20 12:53 AM, Mikhail Khludnev wrote: Pass TopDocsCollector as the first arg into TimeLimitingCollector. On Thu, Feb 27, 2020 at 2:31 PM wrote: Hi,- Sometimes the search takes too long even with PhraseWildcar

Re: How to tell Lucene index search to stop when it takes too long

2020-02-27 Thread baris . kazar
Hi,- Sometimes the search takes too long even with PhraseWildcardQuery, so i would like to limit the search time via TimeLimitingCollector API. Thanks to Mikhail and this Forum to inform me about this API. i checked this IndexSearcher API with Collector parameter but that API does not have

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

2020-02-26 Thread baris . kazar
Followup on this thread: i ended up using WildcardQuery with "*" at the end of last token for PhraseWildcardQuery class from the sandbox, i tested this class rigorously and i think it is ready to move it from sandbox jar to the appropriate release jar. Is there a plan for that? PhraseWil

Re: How to tell Lucene index search to stop when it takes too long

2020-02-24 Thread baris . kazar
Will do, Thanks > On Feb 25, 2020, at 1:34 AM, Mikhail Khludnev wrote: > > Hello. > > Meet org.apache.lucene.search.TimeLimitingCollector. > >> On Mon, Feb 24, 2020 at 2:51 PM wrote: >> >> Hi,- >> >> I hope everyone is doing great. >> >> >> i am trying to find an api to tell Lucene Inde

What is the Lucene 8.4.1 equivalent for StandardAnalyzer.STOP_WORDS_SET

2020-02-24 Thread baris . kazar
Hi,-  I hope everyone is doing great. What is the Lucene 8.4.1 equivalent for StandardAnalyzer.STOP_WORDS_SET? https://lucene.apache.org/core/7_3_1/core/org/apache/lucene/analysis/standard/StandardAnalyzer.html#STOP_WORDS_SET https://lucene.apache.org/core/8_4_1/core/org/apache/lucene/analysis

Lucene 7.7.2 Indexwriter.numDocs() replacement in Lucene 8.4.1

2020-02-24 Thread baris . kazar
Hi,-  I hope everyone is doing great. I think the Lucene 7.7.2  Indexwriter.numDocs() https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/index/IndexWriter.html#numDocs-- can be replaced by the following in Lucene 8.4.1, right? https://lucene.apache.org/core/8_4_1/core/org/apache/luc

Re: Lucene 7.7.2 Indexwriter.numDocs() replacement in Lucene 8.4.1

2020-02-24 Thread baris . kazar
A typo corrected below. Best regards On 2/24/20 5:54 PM, baris.ka...@oracle.com wrote: Hi,-  I hope everyone is doing great. I think the Lucene 7.7.2  Indexwriter.numDocs() https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/index/IndexWriter.html#numDocs-- can be replaced by t

How to tell Lucene index search to stop when it takes too long

2020-02-24 Thread baris . kazar
Hi,- I hope everyone is doing great. i am trying to find an api to tell Lucene Index Searcher to stop after 0.5 seconds (when it takes longer than this). Is there such an api or plan to implement one? Best regards - To
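
Replies in this thread point to org.apache.lucene.search.TimeLimitingCollector. The general idea behind a time budget can be sketched in plain Java without Lucene: the work loop checks a clock between units of work and stops once the budget is exceeded, keeping whatever was gathered so far. DeadlineSketch and processWithBudget are hypothetical names, and this is only an illustration of the pattern, not Lucene's implementation.

```java
import java.util.function.LongSupplier;

public class DeadlineSketch {
    /**
     * Processes up to `items` units of work and returns how many were done
     * before the time budget (in the clock's units) was exceeded.
     */
    static int processWithBudget(int items, long budget, LongSupplier clock) {
        long start = clock.getAsLong();
        int processed = 0;
        for (int i = 0; i < items; i++) {
            if (clock.getAsLong() - start > budget) {
                break; // budget exceeded: stop early and keep the partial result
            }
            processed++; // stand-in for collecting one hit
        }
        return processed;
    }

    public static void main(String[] args) {
        long halfSecondNanos = 500_000_000L; // the 0.5 s limit from the question
        int done = processWithBudget(1_000_000, halfSecondNanos, System::nanoTime);
        System.out.println("processed " + done + " items within the budget");
    }
}
```

Passing the clock in as a parameter is a design choice of this sketch: it lets the early-stop behaviour be exercised with a fake clock instead of real waiting.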

Re: Lucene download page

2020-02-24 Thread baris . kazar
Thanks Erick and the Forum. Best regards On 2/23/20 8:32 AM, Erick Erickson wrote: No, 7.7.2 was a patch fix that _was_ released after 8.1.1. On Feb 22, 2020, at 2:49 PM, baris.ka...@oracle.com wrote: Hi,- i hope everyone is doing great. Lucene 7.7.2 is listed as released after Lucene 8

Lucene download page

2020-02-22 Thread baris . kazar
Hi,- I hope everyone is doing great. Lucene 7.7.2 is listed as released after Lucene 8.1.1 on this page: https://lucene.apache.org/core/corenews.html#apache-lucenetm-841-available I think the order may need to change there. Best regards

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

2020-02-21 Thread baris . kazar
Hi,-  Looks like the only way to use and test the new PhraseWildCardQuery class in Lucene 8.4.0 sandbox is to switch to Lucene 8.4.0 from Lucene 7.7.2. I thought i could adapt it to Lucene 7.7.2 but so far i saw i needed to change heavily 20+ classes and it will be way more than this. So,

StandardFilter and StandardFilterFactory removed in Lucene 8.x

2020-02-21 Thread baris . kazar
Hi,- I hope everyone is doing great. What replaces these classes in Lucene 8.x? https://issues.apache.org/jira/browse/LUCENE-8356 says they presumably do nothing. Is that certain please? On the other hand: I see that (for example) the Query class has been changed quite a lot when someone
